
tokengt's Introduction

Tokenized Graph Transformer (PyTorch)

Pure Transformers are Powerful Graph Learners
Jinwoo Kim, Tien Dat Nguyen, Seonwoo Min, Sungjun Cho, Moontae Lee, Honglak Lee, Seunghoon Hong
NeurIPS 2022


Setting up experiments

Using the provided Docker image (recommended)

docker pull jw9730/tokengt:latest
docker run -it --gpus=all --ipc=host --name=tokengt -v /home:/home jw9730/tokengt:latest bash
# upon completion, you should be at /tokengt inside the container

Using the provided Dockerfile

git clone --recursive https://github.com/jw9730/tokengt.git /tokengt
cd tokengt
docker build --no-cache --tag tokengt:latest .
docker run -it --gpus all --ipc=host --name=tokengt -v /home:/home tokengt:latest bash
# upon completion, you should be at /tokengt inside the container

Using pip

sudo apt-get update
sudo apt-get install python3.9
git clone --recursive https://github.com/jw9730/tokengt.git tokengt
cd tokengt
bash install.sh

Running experiments

Synthetic second-order equivariant basis approximation

cd equivariant-basis-approximation/scripts

# Train and save logs, ckpts, and attention maps (--save_display)
bash [INPUT]-[NODE_IDENTIFIER]-[TYPE_IDENTIFIER].sh

# Test and save attention maps (--save_display)
bash [INPUT]-[NODE_IDENTIFIER]-[TYPE_IDENTIFIER]-test.sh

# For the visualization of saved attention maps, please see viz_multi.ipynb

PCQM4Mv2 large-scale graph regression

cd large-scale-regression/scripts

# TokenGT (ORF)
bash pcqv2-orf.sh

# TokenGT (Lap)
bash pcqv2-lap.sh

# TokenGT (Lap) + Performer
bash pcqv2-lap-performer-finetune.sh

# TokenGT (ablated)
bash pcqv2-ablated.sh

# Attention distance plot for TokenGT (ORF)
bash visualize-pcqv2-orf.sh

# Attention distance plot for TokenGT (Lap)
bash visualize-pcqv2-lap.sh

Pre-Trained Models

We provide checkpoints of TokenGT (ORF) and TokenGT (Lap), both trained on PCQM4Mv2. Please download ckpts.zip from this link. Then unzip it and place the ckpts directory in large-scale-regression/scripts, so that each trained checkpoint is located at large-scale-regression/scripts/ckpts/pcqv2-tokengt-[NODE_IDENTIFIER]-trained/checkpoint_best.pt. After that, you can resume training from these checkpoints by adding the option --pretrained-model-name pcqv2-tokengt-[NODE_IDENTIFIER]-trained to the training scripts.

References

Our implementation uses code from the following repositories:

Citation

If you find our work useful, please consider citing it:

@article{kim2022pure,
  author    = {Jinwoo Kim and Tien Dat Nguyen and Seonwoo Min and Sungjun Cho and Moontae Lee and Honglak Lee and Seunghoon Hong},
  title     = {Pure Transformers are Powerful Graph Learners},
  journal   = {arXiv},
  volume    = {abs/2207.02505},
  year      = {2022},
  url       = {https://arxiv.org/abs/2207.02505}
}

Acknowledgements

The development of this open-sourced code was supported in part by the National Research Foundation of Korea (NRF) (No. 2021R1A4A3032834).

tokengt's People

Contributors

jw9730


tokengt's Issues

How to get the test results?

Could you provide test scripts, similar to the training scripts, to reproduce the results reported in the paper? Thanks a lot.

SEB in node classification

Hello,

First of all, great work!

I am implementing a node classification task using your model. It works perfectly with small tweaks and additions. However, I can't figure out how to implement SEB (sparse equivariant basis). Where do I need to add the torch.coalesce call?

Thanks in advance for your answer!

code

Hello, I'm particularly interested in your work. Can you publish your code?

Using for chess engine

Hello, I wanted to include this in my experimentation with augmenting a chess engine. I have prepared datasets in which each successive board state is a graph of 64 nodes (one node per square on the board), with edges connecting nodes that represent legal moves. I also have the Stockfish evaluation of each board (a positive/negative float indicating who is winning, based on the specific heuristics Stockfish uses). I first want to see whether I can train a graph network to predict this board evaluation without any custom heuristics. Any chance you can point me in the right direction? I am familiar with PyTorch Lightning, so I was hoping I could just import your model. The data I have prepared is arranged using the networkx graph library, by the way.
Thanks!
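
(For context, graph data like this is usually flattened to plain tensors before a Transformer-style model sees it. Below is a rough, illustrative sketch of turning one board state into an edge_index tensor and a regression target, following the networkx setup described above; all names and feature choices are hypothetical, not part of this repository.)

import networkx as nx
import torch

# 64 nodes, one per square; edges connect squares related by a legal move.
G = nx.Graph()
G.add_nodes_from(range(64))
G.add_edges_from([(12, 28), (12, 20), (6, 21)])  # toy legal moves
G.graph["eval"] = 0.35  # Stockfish evaluation as the regression target

# Hypothetical node features: one-hot piece type per square
# (13 classes: empty + 6 piece types x 2 colors).
x = torch.zeros(64, 13)

# Undirected edges expanded to both directions, in the usual (2, num_edges) layout.
edges = list(G.edges)
edge_index = torch.tensor(edges + [(v, u) for u, v in edges]).t().contiguous()
y = torch.tensor([G.graph["eval"]])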

Reasoning behind `convert_to_single_emb`

I am trying to apply TokenGT to the 2D_data_npj molecular dataset for property prediction. I am struggling to understand why the following function is applied during the preprocessing stage:

@torch.jit.script
def convert_to_single_emb(x, offset: int = 512):
    feature_num = x.size(1) if len(x.size()) > 1 else 1
    feature_offset = 1 + torch.arange(0, feature_num * offset, offset, dtype=torch.long)
    x = x + feature_offset
    return x

This increases the feature values and leads to errors in the downstream embedding lookup, since the table is much smaller than, say, 51200 (with a node feature dimension of 100).
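
(For context, the usual motivation for this pattern is to let several categorical feature columns share a single embedding table: shifting column i by 1 + i * offset gives every column its own disjoint ID range, and the table must then be sized to cover all shifted IDs. A minimal illustrative sketch with hypothetical sizes; with 100 feature columns, the table would indeed need 1 + 100 * 512 rows.)

import torch
import torch.nn as nn

def convert_to_single_emb(x, offset: int = 512):
    # Shift column i by 1 + i * offset so each categorical feature
    # column occupies a disjoint ID range in one shared table.
    feature_num = x.size(1) if len(x.size()) > 1 else 1
    feature_offset = 1 + torch.arange(0, feature_num * offset, offset, dtype=torch.long)
    return x + feature_offset

num_columns, offset = 3, 512  # hypothetical: 3 categorical features, values < 512
table = nn.Embedding(1 + num_columns * offset, 64, padding_idx=0)

x = torch.tensor([[5, 0, 17]])            # raw categorical values
ids = convert_to_single_emb(x, offset)    # -> [[6, 513, 1042]], disjoint ranges
emb = table(ids).sum(dim=-2)              # one 64-dim embedding per node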

Run this code without Fairseq

Is there any way to run this code without fairseq? I want to train tokengt with my own graph data and check the results.

Missing best_valid

In your visualize folder, the best_valid.pt file loaded by torch.load("best_valid.pt") is provided neither with the checkpoints nor in the repo.
Where can I find it?

Fairseq advantage

Hello,

Can I ask what the advantage of using Fairseq is?
I mean, if we constructed the dataset using the PyTorch Geometric constructor, added your wrapper for the eig_values calculation, and then pushed everything through a regular Transformer (our own working implementation, for example), would this work? (Let's set aside the Performer and its dependencies for now.)

Thanks!
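
(For context, the tokenization itself does not depend on Fairseq. Below is a rough sketch of the pipeline the question describes, with Laplacian-eigenvector node identifiers and a plain nn.TransformerEncoder; dimensions and projections are illustrative, and the [graph] token and trainable type identifiers from the paper are omitted for brevity.)

import torch
import torch.nn as nn

def lap_node_identifiers(edge_index, num_nodes, d_p):
    # Dense graph Laplacian L = D - A; its eigenvectors serve as node identifiers.
    A = torch.zeros(num_nodes, num_nodes)
    A[edge_index[0], edge_index[1]] = 1.0
    L = torch.diag(A.sum(dim=1)) - A
    _, vecs = torch.linalg.eigh(L)
    return vecs[:, :d_p]  # (num_nodes, d_p)

# Toy graph: 4 nodes on a path, undirected edges stored as directed pairs.
edge_index = torch.tensor([[0, 1, 1, 2, 2, 3], [1, 0, 2, 1, 3, 2]])
n, d_p, d_model = 4, 4, 32
P = lap_node_identifiers(edge_index, n, d_p)

x_node = torch.randn(n, 8)                   # node features
x_edge = torch.randn(edge_index.size(1), 8)  # edge features

# Node token: [X_v, P_v, P_v]; edge token for (u, v): [X_e, P_u, P_v].
node_tok = torch.cat([x_node, P, P], dim=-1)
edge_tok = torch.cat([x_edge, P[edge_index[0]], P[edge_index[1]]], dim=-1)
tokens = torch.cat([node_tok, edge_tok], dim=0)

proj = nn.Linear(8 + 2 * d_p, d_model)       # project tokens to model width
encoder = nn.TransformerEncoder(
    nn.TransformerEncoderLayer(d_model, nhead=4, batch_first=True), num_layers=2)
out = encoder(proj(tokens).unsqueeze(0))     # (1, n + num_edges, d_model)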

Weight sharing compatibility

In the Transformer, a weight sharing scheme between the input embedding and the output projection layer is used to improve efficiency. Is there a reason this is not implemented here, and how could it be done?
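
(For reference, weight tying in its usual form just aliases the two matrices, as in the minimal sketch below; note that it presumes the input and output share a vocabulary, which is not obviously the case for a graph regression model.)

import torch.nn as nn

class TiedLM(nn.Module):
    # Standard weight tying: the output projection reuses the input
    # embedding matrix, so both map between the same vocabulary and d_model.
    def __init__(self, vocab_size, d_model):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, d_model)
        self.out = nn.Linear(d_model, vocab_size, bias=False)
        self.out.weight = self.embed.weight  # alias, not a copy

    def forward(self, ids):
        return self.out(self.embed(ids))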

Are token features and node/type identifiers added or concatenated?

Thanks for providing this great implementation!

After looking into the code, I have a quick question about the formation of the input features to the TokenGT model. If I understood the paper correctly, the node features, token identifiers, and token type identifiers are concatenated (C + 2 * d_p + d_e dimensions, according to Section 2 - Main Transformer in the paper), while in the code here they seem to be added together rather than concatenated. Am I misunderstanding the paper or the code? Or are the two approaches actually equivalent, or do they achieve similar performance?

Thank you for any help on this!
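
(For reference, one common reason the two formulations can coincide: concatenating the pieces and then applying a single linear projection is algebraically identical to projecting each piece with the matching row block of that matrix and summing. A quick numerical check of the identity, with illustrative shapes only:)

import torch

C, d_p, d_e, d_model = 8, 4, 2, 16
x = torch.randn(5, C)        # token features
p = torch.randn(5, 2 * d_p)  # node identifiers (two of dimension d_p)
e = torch.randn(5, d_e)      # type identifier
W = torch.randn(C + 2 * d_p + d_e, d_model)

# Concatenate, then apply one projection ...
out_cat = torch.cat([x, p, e], dim=-1) @ W
# ... equals projecting each part with its row block of W and adding.
Wx, Wp, We = W.split([C, 2 * d_p, d_e], dim=0)
out_add = x @ Wx + p @ Wp + e @ We

print(torch.allclose(out_cat, out_add, atol=1e-5))  # True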

question

What is the difference between the model in equivariant-basis-approximation and the one in large-scale-regression?
