This repository contains the official implementation for the paper "A Scalable Pretraining Framework for Link Prediction with Efficient Adaptation" accepted at KDD 2025.
This repository contains the implementation for reproducing the results presented in Table 2 of our paper. The code currently supports experiments on two downstream datasets:
coraciteseer
- CUDA 11.8
- Python 3.8 or higher
- pip or conda
# Clone the repository
git clone https://github.com/yourusername/PALP.git
cd PALP
# Install dependencies
pip install -r requirements.txtThe pretraining of PALP was conducted on ogbn-papers100M. Due to the large scale of this graph, which takes up ~200G to store the node features and structures, we were unable to release the processed data in the repo. To reproduce the results from our paper, we provide the pretrained checkpoints directly in the ckpt-1 and ckpt-2 directories.
The repository contains the following key directories:
node_data/: Contains graph structures and node features. These files were originally processed using TSGFM.link_data/: Contains training data for link prediction, including:- Positive/negative edges for train, validation and test
- BUDDY features for edges (processed using subgraph-sketching)
ckpt-1/: Contains pretrained model checkpoints for the node moduleckpt-2/: Contains pretrained model checkpoints for the edge module
Both models were pretrained on ogbn-papers100M. The configuration files for these models are:
model_1_config.yamlmodel_2_config.yaml
To test the pretrained checkpoints with our proposed adaptation strategy, use the following command:
python test_link_merge_all.py --data_name 'cora' --train_ratio 0.4--data_name: Name of the dataset to use ('cora' or 'citeseer')--train_ratio: Training data ratio (default: 0.4)
This project is licensed under the MIT License - see the LICENSE file for details.
- We thank the authors of TSGFM for their data processing pipeline
- We thank the authors of subgraph-sketching for their BUDDY feature implementation
- We thank the authors of NAGphormer for their implementation of NAGphormer