DeSCo: Towards Generalizable and Scalable Deep Subgraph Counting

This repository is the official implementation of the paper "DeSCo: Towards Generalizable and Scalable Deep Subgraph Counting". Please consider starring the repository if you find it interesting.

The paper was accepted at WSDM'24. You can view our project page.

Figure: DeSCo workflow.

Code Structure

main.py is the implementation of DeSCo.

subgraph_counting contains all the modules needed by the Python scripts.

baseline.py is the implementation of the two neural baselines (DIAMNet and LRP) that are compared with DeSCo in the paper.

ablation_gnns.py is used for the ablation study of the expressive power of SHMP. It implements other expressive GNNs.

ablation_wo_canonical.py is used for the ablation study of canonical partition. It implements DeSCo's neighborhood counting stage without canonical partition.

Requirements

Python >= 3.9

To install requirements:

pip install -r requirements.txt

Pre-trained Models

The neighborhood counting and gossip propagation models in our paper are trained on our synthetic dataset. Users can download our pre-trained models from here.

Evaluation

To evaluate the trained models on real-world datasets, please run the following command:

python main.py --test_dataset COX2 --neigh_checkpoint ckpt/{checkpoint_path}/neigh/{model_name}.ckpt --gossip_checkpoint ckpt/{checkpoint_path}/gossip/{model_name}.ckpt --test_gossip

The above command evaluates the trained models on COX2. Replace {checkpoint_path} and {model_name} with the actual location and name of your trained model checkpoints.
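When evaluating on several datasets, it can help to assemble the command programmatically. The following is a minimal sketch that only reuses the flags shown above; the helper name `build_eval_command` and the example values `ckpt/my_run` and `my_model` are hypothetical placeholders, not names from this repository.

```python
import shlex


def build_eval_command(dataset, checkpoint_dir, model_name):
    """Return the argument list for evaluating one dataset.

    checkpoint_dir and model_name are placeholders for your own checkpoints.
    """
    return [
        "python", "main.py",
        "--test_dataset", dataset,
        "--neigh_checkpoint", f"{checkpoint_dir}/neigh/{model_name}.ckpt",
        "--gossip_checkpoint", f"{checkpoint_dir}/gossip/{model_name}.ckpt",
        "--test_gossip",
    ]


# Print one command per dataset; pass the list to subprocess.run to execute it.
for dataset in ["COX2", "MUTAG"]:
    print(shlex.join(build_eval_command(dataset, "ckpt/my_run", "my_model")))
```

Building the command as a list rather than a single string avoids quoting problems if a checkpoint path contains spaces.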

The code comes with analysis methods in subgraph_counting/workload.py, which output the counts inferred by the model. Users can compute any desired metric from these counts.
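As an illustration of turning counts into metrics, the sketch below computes the mean absolute error and a variance-normalized mean squared error. It assumes you have already collected the predicted and ground-truth counts as two equal-length sequences; the exact output format of workload.py is not described here, and the function names are hypothetical.

```python
def mean_absolute_error(pred, truth):
    """Average absolute difference between predicted and true counts."""
    return sum(abs(p - t) for p, t in zip(pred, truth)) / len(truth)


def normalized_mse(pred, truth):
    """Mean squared error divided by the variance of the true counts."""
    n = len(truth)
    mean_truth = sum(truth) / n
    variance = sum((t - mean_truth) ** 2 for t in truth) / n
    mse = sum((p - t) ** 2 for p, t in zip(pred, truth)) / n
    return mse / variance  # undefined if all true counts are identical


# Toy values for illustration only, not results from the paper.
predicted = [3.0, 0.0, 5.0, 2.0]
ground_truth = [2.0, 0.0, 6.0, 4.0]
print(mean_absolute_error(predicted, ground_truth))
print(normalized_mse(predicted, ground_truth))
```

Normalizing by the variance makes errors comparable across query patterns whose counts differ by orders of magnitude.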

Train from Scratch

Alternatively, if you wish to train your own model instead of using our pre-trained version, follow the instructions below.

Dataset

To benefit future research, we release the large synthetic dataset with ground-truth subgraph counts that we used to train our pre-trained models. Users can download the dataset zip file from here and move the unzipped folder under DeSCo/data/ to train from scratch.
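The unpacking step can also be scripted. This is a minimal sketch using only the standard library; the helper name `extract_dataset` and the archive name in the usage comment are hypothetical, and it assumes the archive contains the dataset folder at its top level.

```python
import zipfile
from pathlib import Path


def extract_dataset(zip_path, data_dir="DeSCo/data"):
    """Unpack a downloaded dataset archive under data_dir.

    Returns the sorted names of the entries now present in data_dir.
    """
    target = Path(data_dir)
    target.mkdir(parents=True, exist_ok=True)
    with zipfile.ZipFile(zip_path) as archive:
        archive.extractall(target)
    return sorted(p.name for p in target.iterdir())


# Usage (hypothetical archive name):
# print(extract_dataset("downloaded_dataset.zip"))
```

Check the returned folder names against the dataset name you pass to --train_dataset before starting a long training run.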

Code and configurations

To train with the official configuration of DeSCo, run this command:

python main.py --train_dataset Syn_1827 --valid_dataset Syn_1827 --test_dataset MUTAG --train_neigh --train_gossip --test_gossip

To train the model(s) in the paper with other configurations, please specify the parameters in the command.

The boolean parameters train_neigh, train_gossip, and test_gossip determine whether to train and test the neighborhood counting and gossip propagation models.
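In the commands above, these parameters are passed without values, which is how on/off switches usually behave. The sketch below shows that behavior, assuming the switches are declared as argparse `store_true` flags; this is an assumption for illustration, and the actual parser in main.py may define them differently.

```python
import argparse


def parse_stage_flags(argv):
    """Parse the three stage switches; each defaults to False when omitted."""
    parser = argparse.ArgumentParser()
    for name in ("train_neigh", "train_gossip", "test_gossip"):
        # Assumption: each switch is a store_true flag.
        parser.add_argument(f"--{name}", action="store_true")
    return parser.parse_args(argv)


# Train only the neighborhood counting model, then test.
args = parse_stage_flags(["--train_neigh", "--test_gossip"])
print(args.train_neigh, args.train_gossip, args.test_gossip)
```

Under this assumption, leaving a switch out of the command skips the corresponding stage.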

Please refer to the Appendix for the detailed training parameters.

Citation

If you find our work useful, please consider citing:

@inproceedings{fu2024desco,
  title={DeSCo: Towards Generalizable and Scalable Deep Subgraph Counting},
  author={Fu, Tianyu and Wei, Chiyue and Wang, Yu and Ying, Rex},
  booktitle={Proceedings of the 17th ACM International Conference on Web Search and Data Mining},
  pages={218--227},
  year={2024}
}

Contributing

You are welcome to use the code or contribute to the project!
