Public Repo for "Towards Understanding Link Predictor Generalizability Under Distribution Shifts"

Results Replication

The code was run within a Miniconda virtual environment built from the environment.yml file.

Additional Dataset Details Included in the Article Appendix

Generate dataset splits

bash gen_synth.sh

Run GCN baseline

bash gcn.sh

Minimal Project Installation

# Necessary Packages for Minimal Result Replication
# All packages installed via Conda unless specified
Python 3.9
CUDA 11.6
PyTorch 1.13.1
PyTorch Geometric 2.5.2
Torch Scatter 2.1.0
Torch Sparse 0.6.15
Torch Cluster 1.6.0
OGB 1.3.6 # pip
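
As a quick sanity check, the pinned versions above can be compared against the active environment. The sketch below uses only the standard library. The distribution names (for example `torch-geometric`) are assumptions about how each package is registered in your environment, and `installed_version` and `check_versions` are hypothetical helpers, not part of this repository.

```python
from importlib import metadata

# Versions copied from the package list above. The distribution names are
# assumptions and may differ depending on how the packages were installed.
EXPECTED = {
    "torch": "1.13.1",
    "torch-geometric": "2.5.2",
    "torch-scatter": "2.1.0",
    "torch-sparse": "0.6.15",
    "torch-cluster": "1.6.0",
    "ogb": "1.3.6",
}


def installed_version(dist_name):
    """Return the installed version string, or None if the package is missing."""
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None


def check_versions(expected):
    """Return {name: (expected, installed)} for every missing or mismatched package."""
    problems = {}
    for name, want in expected.items():
        have = installed_version(name)
        # Local build tags such as "+cu116" are ignored in the comparison.
        if have is None or have.split("+")[0] != want:
            problems[name] = (want, have)
    return problems


for name, (want, have) in check_versions(EXPECTED).items():
    print(f"{name}: expected {want}, found {have}")
```

Python and CUDA versions are not covered by this check; `python --version` and `nvcc --version` report those separately.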

Dataset Naming Scheme

After gen_synth.py finishes running, 'forward' splits of the shifted dataset are available in the dataset/ folder under the name:

{data_name}_{split_type}_0_{valid_rat}_{test_rat}_seed1Dataset

'backward' splits swap 'test_rat' and 'valid_rat' parameters:

{data_name}_{split_type}_{test_rat}_{valid_rat}_0_seed1Dataset
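
The two naming templates can be sketched as a small helper. `lpshift_dataset_name` is a hypothetical function written for illustration and is not part of this repository. It assumes the seed is fixed at 1, as in the templates above, and that the folder on disk is the returned name followed by `Dataset`.

```python
def lpshift_dataset_name(data_name, split_type, valid_rat, test_rat, direction="forward"):
    """Build a dataset name following the forward/backward templates above.

    Hypothetical helper: the seed is hard-coded to 1, matching the templates.
    """
    if direction == "forward":
        parts = [data_name, split_type, 0, valid_rat, test_rat]
    elif direction == "backward":
        # Backward splits swap the positions of test_rat and valid_rat.
        parts = [data_name, split_type, test_rat, valid_rat, 0]
    else:
        raise ValueError("direction must be 'forward' or 'backward'")
    return "_".join(str(p) for p in parts) + "_seed1"


name = lpshift_dataset_name("ogbl-collab", "CN", valid_rat=1, test_rat=2, direction="backward")
print(name)                          # ogbl-collab_CN_2_1_0_seed1
print(f"dataset/{name}Dataset")      # dataset/ogbl-collab_CN_2_1_0_seed1Dataset
```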

Dataset Loading

LPShift datasets follow the OGB format for positive samples and the HeaRT format for negative validation and test samples.
We advise using different batch sizes for training, validation, and testing to keep run time efficient.

import numpy as np
import torch

from synth_dataset import SynthDataset

dataset_name = "ogbl-collab_CN_2_1_0_seed1"

data = SynthDataset(dataset_name=dataset_name).get()  # PyG graph object for training adjacency matrix
split_edge = SynthDataset(dataset_name=dataset_name).get_edge_split()

# Positive edges (OGB format)
pos_train_edge = split_edge['train']['edge']
pos_valid_edge = split_edge['valid']['edge']
pos_test_edge = split_edge['test']['edge']

# Negative validation and test edges (HeaRT format)
with open(f'dataset/{dataset_name}Dataset/heart_valid_samples.npy', "rb") as f:
    neg_valid_edge = torch.from_numpy(np.load(f))
with open(f'dataset/{dataset_name}Dataset/heart_test_samples.npy', "rb") as f:
    neg_test_edge = torch.from_numpy(np.load(f))
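
One way to use separate batch sizes per phase is to iterate over index slices, as in the minimal sketch below. `batch_slices` and the two batch-size constants are hypothetical and not taken from this repository. The commented usage also assumes that the first dimension of the HeaRT negative arrays lines up with the positive edges, which should be checked against your data.

```python
from typing import Iterator


def batch_slices(num_items: int, batch_size: int) -> Iterator[slice]:
    """Yield consecutive slices covering range(num_items) in chunks of batch_size."""
    if batch_size <= 0:
        raise ValueError("batch_size must be positive")
    for start in range(0, num_items, batch_size):
        yield slice(start, min(start + batch_size, num_items))


# Hypothetical values; tune them for your hardware and dataset.
TRAIN_BATCH_SIZE = 65536
EVAL_BATCH_SIZE = 1024

# The slices index tensors, arrays, and lists alike, for example:
#   for sl in batch_slices(pos_valid_edge.size(0), EVAL_BATCH_SIZE):
#       pos_batch = pos_valid_edge[sl]
#       neg_batch = neg_valid_edge[sl]  # assumes one row of negatives per positive edge
demo = list(range(10))
batches = [demo[sl] for sl in batch_slices(len(demo), 4)]
print(batches)  # [[0, 1, 2, 3], [4, 5, 6, 7], [8, 9]]
```

Evaluation typically scores many negatives per positive edge, so a smaller evaluation batch size than the training one may help keep memory use in check.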

If you use this code or find the article helpful, please cite:

@article{revolinsky2024understanding,
  title={Understanding the Generalizability of Link Predictors Under Distribution Shifts on Graphs},
  author={Revolinsky, Jay and Shomer, Harry and Tang, Jiliang},
  journal={arXiv preprint arXiv:2406.08788},
  year={2024}
}
