GreaseLM: Graph REASoning Enhanced Language Models for Question Answering

This repo provides the source code & data of our paper GreaseLM: Graph REASoning Enhanced Language Models for Question Answering (ICLR 2022 spotlight). If you use any of our code, processed data or pretrained models, please cite:

@inproceedings{zhang2021greaselm,
  title={GreaseLM: Graph REASoning Enhanced Language Models},
  author={Zhang, Xikun and Bosselut, Antoine and Yasunaga, Michihiro and Ren, Hongyu and Liang, Percy and Manning, Christopher D and Leskovec, Jure},
  booktitle={International Conference on Learning Representations},
  year={2022}
}

1. Dependencies

Run the following commands to create a conda environment (assuming CUDA 10.1):

conda create -y -n greaselm python=3.8
conda activate greaselm
pip install numpy==1.18.3 tqdm
pip install torch==1.8.0+cu101 torchvision -f https://download.pytorch.org/whl/torch_stable.html
pip install transformers==3.4.0 nltk spacy
pip install wandb
conda install -y -c conda-forge tensorboardx
conda install -y -c conda-forge tensorboard

# for torch-geometric
pip install torch-scatter==2.0.7 -f https://pytorch-geometric.com/whl/torch-1.8.0+cu101.html
pip install torch-cluster==1.5.9 -f https://pytorch-geometric.com/whl/torch-1.8.0+cu101.html
pip install torch-sparse==0.6.9 -f https://pytorch-geometric.com/whl/torch-1.8.0+cu101.html
pip install torch-spline-conv==1.2.1 -f https://pytorch-geometric.com/whl/torch-1.8.0+cu101.html
pip install torch-geometric==1.7.0 -f https://pytorch-geometric.com/whl/torch-1.8.0+cu101.html
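
After installation, it can help to confirm that the pinned versions were actually picked up. The following is a minimal sketch, not part of the repo: the helper name `check_versions` is hypothetical, and the exact string comparison is an assumption that may be too strict if a wheel reports a version with an extra suffix.

```python
from importlib import metadata

# Version pins copied from the install commands above.
EXPECTED = {
    "numpy": "1.18.3",
    "torch": "1.8.0+cu101",
    "transformers": "3.4.0",
    "torch-scatter": "2.0.7",
    "torch-cluster": "1.5.9",
    "torch-sparse": "0.6.9",
    "torch-spline-conv": "1.2.1",
    "torch-geometric": "1.7.0",
}


def check_versions(expected=EXPECTED):
    """Return {package: (expected, installed_or_None)} for every mismatch."""
    problems = {}
    for name, want in expected.items():
        try:
            have = metadata.version(name)
        except metadata.PackageNotFoundError:
            have = None  # package is not installed in this environment
        if have != want:
            problems[name] = (want, have)
    return problems


if __name__ == "__main__":
    for name, (want, have) in check_versions().items():
        print(f"{name}: expected {want}, found {have or 'not installed'}")
```

An empty result means every listed package matches its pin.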

2. Download data

Download and preprocess data yourself

Preprocessing the data yourself may take a long time, so if you want to directly download preprocessed data, please jump to the next subsection.

Download the raw ConceptNet, CommonsenseQA, OpenBookQA data by using

./download_raw_data.sh

You can preprocess these raw data by running

CUDA_VISIBLE_DEVICES=0 python preprocess.py -p <num_processes>

You can specify the GPU you want to use at the beginning of the command: CUDA_VISIBLE_DEVICES=.... The script will:

  • Set up ConceptNet (e.g., extract English relations from ConceptNet, merge the original 42 relation types into 17 types)
  • Convert the QA datasets into .jsonl files (e.g., stored in data/csqa/statement/)
  • Identify all mentioned concepts in the questions and answers
  • Extract subgraphs for each q-a pair

The script to download and preprocess the MedQA-USMLE data and the biomedical knowledge graph based on Disease Database and DrugBank is provided in utils_biomed/.

Directly download preprocessed data

For your convenience, if you don't want to preprocess the data yourself, you can download all the preprocessed data here. Download them into the top-level directory of this repo and unzip them. Move the medqa_usmle and ddb folders into the data/ directory.

Resulting file structure

The resulting file structure should look like this:

.
├── README.md
├── data/
    ├── cpnet/                 (preprocessed ConceptNet)
    ├── csqa/
        ├── train_rand_split.jsonl
        ├── dev_rand_split.jsonl
        ├── test_rand_split_no_answers.jsonl
        ├── statement/             (converted statements)
        ├── grounded/              (grounded entities)
        ├── graphs/                (extracted subgraphs)
        ├── ...
    ├── obqa/
    ├── medqa_usmle/
    └── ddb/
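
A quick way to confirm the layout before training is to check for the paths listed in the tree. This is only a sketch: `missing_paths` is a hypothetical helper, and because the tree does not list the contents of obqa/, medqa_usmle/ and ddb/, only the directories themselves are checked.

```python
from pathlib import Path

# Relative paths taken from the tree above.
EXPECTED_PATHS = [
    "data/cpnet",
    "data/csqa/train_rand_split.jsonl",
    "data/csqa/dev_rand_split.jsonl",
    "data/csqa/test_rand_split_no_answers.jsonl",
    "data/csqa/statement",
    "data/csqa/grounded",
    "data/csqa/graphs",
    "data/obqa",
    "data/medqa_usmle",
    "data/ddb",
]


def missing_paths(repo_root="."):
    """Return the expected paths that do not exist under repo_root."""
    root = Path(repo_root)
    return [rel for rel in EXPECTED_PATHS if not (root / rel).exists()]


if __name__ == "__main__":
    for rel in missing_paths():
        print("missing:", rel)
```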

3. Training GreaseLM

To train GreaseLM on CommonsenseQA, run

CUDA_VISIBLE_DEVICES=0 ./run_greaselm.sh csqa --data_dir data/

You can specify up to 2 GPUs to use at the beginning of the command: CUDA_VISIBLE_DEVICES=....
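
As an illustration of how the variable is interpreted, the sketch below parses CUDA_VISIBLE_DEVICES and enforces the 2-GPU limit stated above. Both function names are hypothetical and this is not how the repo's scripts read the variable. Note that an unset variable normally means all GPUs are visible; the sketch simply returns an empty list in that case.

```python
import os

MAX_GPUS = 2  # limit stated in the text above


def visible_gpu_ids(env=None):
    """Parse CUDA_VISIBLE_DEVICES into a list of device id strings."""
    env = os.environ if env is None else env
    raw = env.get("CUDA_VISIBLE_DEVICES", "")
    return [part.strip() for part in raw.split(",") if part.strip()]


def check_gpu_count(env=None):
    """Return the listed device ids, or raise if more than MAX_GPUS are given."""
    ids = visible_gpu_ids(env)
    if len(ids) > MAX_GPUS:
        raise ValueError(
            f"{len(ids)} GPUs listed, but at most {MAX_GPUS} are supported"
        )
    return ids
```

For example, `check_gpu_count({"CUDA_VISIBLE_DEVICES": "0,1"})` returns the two ids, while a value of "0,1,2" raises a ValueError.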

Similarly, to train GreaseLM on OpenBookQA, run

CUDA_VISIBLE_DEVICES=0 ./run_greaselm.sh obqa --data_dir data/

To train GreaseLM on MedQA-USMLE, run

CUDA_VISIBLE_DEVICES=0 ./run_greaselm__medqa_usmle.sh

4. Pretrained model checkpoints

You can download a pretrained GreaseLM model on CommonsenseQA here, which achieves an IH-dev acc. of 79.0 and an IH-test acc. of 74.0.

You can also download a pretrained GreaseLM model on OpenBookQA here, which achieves a test acc. of 84.8.

You can also download a pretrained GreaseLM model on MedQA-USMLE here, which achieves a test acc. of 38.5.

5. Evaluating a pretrained model checkpoint

To evaluate a pretrained GreaseLM model checkpoint on CommonsenseQA, run

CUDA_VISIBLE_DEVICES=0 ./eval_greaselm.sh csqa --data_dir data/ --load_model_path /path/to/checkpoint

Again, you can specify up to 2 GPUs to use at the beginning of the command: CUDA_VISIBLE_DEVICES=....

Similarly, to evaluate a pretrained GreaseLM model checkpoint on OpenBookQA, run

CUDA_VISIBLE_DEVICES=0 ./eval_greaselm.sh obqa --data_dir data/ --load_model_path /path/to/checkpoint

To evaluate a pretrained GreaseLM model checkpoint on MedQA-USMLE, run

INHERIT_BERT=1 CUDA_VISIBLE_DEVICES=0 ./eval_greaselm.sh medqa_usmle --data_dir data/ --load_model_path /path/to/checkpoint

6. Use your own dataset

  • Convert your dataset to {train,dev,test}.statement.jsonl in .jsonl format (see data/csqa/statement/train.statement.jsonl)
  • Create a directory in data/{yourdataset}/ to store the .jsonl files
  • Modify preprocess.py and perform subgraph extraction for your data
  • Modify utils/parser_utils.py to support your own dataset
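
The first step can be sketched as follows. The record layout used here (an `id`, an `answerKey`, a `question` with `stem` and `choices`, and one `statements` entry per choice) is an assumption about the statement format, and both helper names are hypothetical; compare the output against data/csqa/statement/train.statement.jsonl before relying on it.

```python
import json

CHOICE_LABELS = "ABCDEFGH"


def to_statement_record(qid, question, choices, answer_index):
    """Build one record in the ASSUMED statement layout (verify against csqa)."""
    return {
        "id": qid,
        "answerKey": CHOICE_LABELS[answer_index],
        "question": {
            "stem": question,
            "choices": [
                {"label": CHOICE_LABELS[i], "text": text}
                for i, text in enumerate(choices)
            ],
        },
        # One statement per answer choice; label marks the correct one.
        "statements": [
            {"label": i == answer_index, "statement": f"{question} {text}"}
            for i, text in enumerate(choices)
        ],
    }


def write_jsonl(records, path):
    """Write one JSON object per line, as .jsonl files expect."""
    with open(path, "w", encoding="utf-8") as fout:
        for record in records:
            fout.write(json.dumps(record) + "\n")
```

Joining the question and the answer text with a space is the simplest possible statement construction and is used here only for illustration.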

7. Acknowledgment

This repo is built upon the following work:

QA-GNN: Question Answering using Language Models and Knowledge Graphs
https://github.com/michiyasunaga/qagnn

Many thanks to the authors and developers!

greaselm's Issues

Cannot reshape array of size 0 into shape (0)

Hi Xikun @XikunZhang ,

Thanks for your great work. When I preprocessed csqa, I ran into this error:

multiprocessing.pool.RemoteTraceback: 
"""
Traceback (most recent call last):
  File "/data/xuanlong/anaconda3/envs/greaselm/lib/python3.8/multiprocessing/pool.py", line 125, in worker
    result = (True, func(*args, **kwds))
  File "/data/xuanlong/Graph2Text/GreaseLM/preprocess_utils/graph.py", line 337, in concepts_to_adj_matrices_2hop_all_pair__use_LM__Part3
    adj, concepts = concepts2adj(schema_graph)
  File "/data/xuanlong/Graph2Text/GreaseLM/preprocess_utils/graph.py", line 128, in concepts2adj
    adj = coo_matrix(adj.reshape(-1, n_node))
ValueError: cannot reshape array of size 0 into shape (0)
"""

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "preprocess.py", line 131, in <module>
    main()
  File "preprocess.py", line 125, in main
    rt_dic['func'](*rt_dic['args'])
  File "/data/xuanlong/Graph2Text/GreaseLM/preprocess_utils/graph.py", line 512, in generate_adj_data_from_grounded_concepts__use_LM
    res3 = list(tqdm(p.imap(concepts_to_adj_matrices_2hop_all_pair__use_LM__Part3, res2), total=len(res2)))
  File "/data/xuanlong/anaconda3/envs/greaselm/lib/python3.8/site-packages/tqdm/std.py", line 1180, in __iter__
    for obj in iterable:
  File "/data/xuanlong/anaconda3/envs/greaselm/lib/python3.8/multiprocessing/pool.py", line 868, in next
    raise value
ValueError: cannot reshape array of size 0 into shape (0)

I have tried to fix it by editing the line https://github.com/snap-stanford/GreaseLM/blob/803946bba3273556c1ff2be6ad8b02850fe5972d/preprocess_utils/graph.py#L128 to just ignore the reshape method if the array has size 0:

try:
        adj = coo_matrix(adj.reshape(-1, n_node))
except:
        print("FAIL concepts2adj")

I think my edit was incorrect, because when running evaluation I got this error:

points/csqa/csqa_model.pt
***** hyperparameters *****
dataset: csqa
******************************
wandb: WARNING W&B installed but not logged in.  Run `wandb login` or set the WANDB_API_KEY env variable.
ModelClass <class 'transformers.modeling_roberta.RobertaModel'>
NLP
pid: 74920
screen: 

gpu: 1

torch version: 1.8.0+cu101
torch cuda version: 10.1
cuda is available: True
cuda device count: 1
cudnn version: 7603
wandb id:  1ziiml5l
loading from checkpoint: ./checkpoints/csqa/csqa_model.pt
train_statement_path ./data//csqa/statement/train.statement.jsonl
num_choice 5
Loading sparse adj data...
loading adj matrices: 100%|███████████████████████████████████████████████████████████████████████| 48705/48705 [00:22<00:00, 2158.86it/s]
| ori_adj_len: mu 12.13 sigma 9.67 | adj_len: 13.13 | prune_rate: 0.00 | qc_num: 5.46 | ac_num: 1.54 |
Traceback (most recent call last):
  File "greaselm.py", line 606, in <module>
    main(args)
  File "greaselm.py", line 546, in main
    evaluate(args, has_test_split, devices, kg)
  File "greaselm.py", line 449, in evaluate
    dataset = load_data(args, devices, kg)
  File "greaselm.py", line 50, in load_data
    dataset = data_utils.GreaseLM_DataLoader(args.train_statements, args.train_adj,
  File "/data/xuanlong/Graph2Text/GreaseLM/utils/data_utils.py", line 121, in __init__
    assert all(len(self.train_qids) == len(self.train_adj_data[0]) == x.size(0) for x in [self.train_labels] + self.train_encoder_data + self.train_decoder_data)
AssertionError

Could you give me some advice on how to fix the first error?

Thank you & BR,
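
A note on the first error: the message is what NumPy raises when -1 has to be inferred in a reshape whose remaining dimension is 0, which happens here when a q-a pair has no grounded nodes. The sketch below assumes the array passed to reshape has shape (n_rel, n_node, n_node); that shape, the function name `flatten_adj`, and the value 17 (borrowed from the relation count in the README) are assumptions for illustration. Whether downstream code accepts an empty graph is not covered.

```python
import numpy as np


def flatten_adj(adj):
    """Flatten (n_rel, n_node, n_node) to (n_rel * n_node, n_node).

    The row count is spelled out instead of using -1, so the call also
    works when n_node is 0.
    """
    n_rel, n_node, _ = adj.shape
    return adj.reshape(n_rel * n_node, n_node)


empty = np.zeros((17, 0, 0), dtype=np.uint8)

try:
    empty.reshape(-1, 0)  # same kind of call as in the traceback
except ValueError as err:
    print("reshape(-1, 0) fails:", err)

print("explicit shape works:", flatten_adj(empty).shape)
```

Unlike the try/except above, this keeps one entry per q-a pair, so the number of adjacency entries would still match the number of questions.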

[Help] About the hyper-parameters to reproduce the result

@XikunZhang @michiyasunaga @roks

Hi,

Thanks for your great effort!

I've run the code in this repo with the hyper-parameters provided in the script run_greaselm.sh, which are the same as those reported in the paper, but the results are not as good as reported. For example, on csqa the reported dev_acc and test_acc are 78.5 (±0.5) and 74.2 (±0.4) respectively, but the model I trained only reaches 77.48 and 73.01.

I've tried several random seeds, but the problem persists. Could you please release the hyper-parameters (i.e., the random seeds) that you used to train the model?

Look forward to your response!
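
To put the gap in perspective, the numbers above can be expressed in units of the reported spread. This sketch assumes the ± values are standard deviations over seeds, which the issue does not state; the function name is hypothetical.

```python
def deviation_in_std(observed, reported_mean, reported_std):
    """Signed distance of an observed score from the reported mean, in std units."""
    return (observed - reported_mean) / reported_std


# Values quoted in the issue above.
dev = deviation_in_std(77.48, 78.5, 0.5)
test = deviation_in_std(73.01, 74.2, 0.4)

print(f"dev: {dev:.2f} std, test: {test:.2f} std")
```

Under that assumption, the dev result is about 2 and the test result about 3 standard deviations below the reported means.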

training time

Hi, thanks for your great work!

I'm trying to train this model with limited resources.

May I ask how many GPUs you used (were they V100s?) and how long training took for CommonsenseQA?

How to know other node's text information?

Hello,
Thank you for providing such an excellent paper.
I am a student with a lot of interest in this field. As I understand it, the OpenBookQA dataset has 4 answer choices per question, and accordingly 4 subgraphs are produced.
Each subgraph has 200 nodes.
There are four types of nodes: context, question, answer, and other nodes. How can I find the text information of the other nodes? Thanks for reading this long question. Have a good day.

Question about freezing LM parameter problem

Hi, I'm very interested in this model. I want to know why the LM parameters are frozen during the first epochs. To my knowledge, LM parameters are usually fine-tuned during the first epochs (1~3) and then frozen.
I would really appreciate your help.

Bert_large_model can't be solved

I am trying to solve this error:

File "D:\GreaseLM\modeling\modeling_greaselm.py", line 583, in from_pretrained
raise EnvironmentError(msg)

  • 'bert-large-uncased' is a correct model identifier listed on 'https://huggingface.co/models'
  • or 'bert-large-uncased' is the correct path to a directory containing a file named one of pytorch_model.bin, tf_model.h5, model.ckpt.

I downloaded all files from https://huggingface.co/google-bert/bert-large-uncased/tree/main, but I still don't know where to put them or which path to point to. Please help me with that.

RuntimeError: CUDA out of memory. Tried to allocate 144.00 MiB

When we run greaselm.py we get this error, even with a batch size as small as 8.

We tried batch sizes from 128 down to 8. Every time, after some epochs, it throws the error with a different amount of free memory. Can you help us solve this issue and run the code?

logits, _ = model(*[x[a:b] for x in input_data])
File "/home/t/tgudela/.conda/envs/greaselm2/lib/python3.8/site-packages/torch/nn/modules/module.py", line 889, in _call_impl
  result = self.forward(*input, **kwargs)
File "/scratch/users/tgudela/greaseLM/GreaseLM-main/modeling/modeling_greaselm.py", line 85, in forward
  logits, attn = self.lmgnn(lm_inputs, concept_ids,
File "/home/t/tgudela/.conda/envs/greaselm2/lib/python3.8/site-packages/torch/nn/modules/module.py", line 889, in _call_impl
  result = self.forward(*input, **kwargs)
File "/scratch/users/tgudela/greaseLM/GreaseLM-main/modeling/modeling_greaselm.py", line 217, in forward
  outputs, gnn_output = self.mp(input_ids, token_type_ids, attention_mask, output_mask, gnn_input, adj, node_type_ids, node_scores, special_nodes_mask, output_hidden_$
File "/home/t/tgudela/.conda/envs/greaselm2/lib/python3.8/site-packages/torch/nn/modules/module.py", line 889, in _call_impl
  result = self.forward(*input, **kwargs)
File "/scratch/users/tgudela/greaseLM/GreaseLM-main/modeling/modeling_greaselm.py", line 411, in forward
  encoder_outputs, _X = self.encoder(embedding_output,
File "/home/t/tgudela/.conda/envs/greaselm2/lib/python3.8/site-packages/torch/nn/modules/module.py", line 889, in _call_impl
  result = self.forward(*input, **kwargs)
File "/scratch/users/tgudela/greaseLM/GreaseLM-main/modeling/modeling_greaselm.py", line 815, in forward
  _X = self.gnn_layers[gnn_layer_index](_X, edge_index, edge_type, _node_type, _node_feature_extra)
File "/home/t/tgudela/.conda/envs/greaselm2/lib/python3.8/site-packages/torch/nn/modules/module.py", line 889, in _call_impl
  result = self.forward(*input, **kwargs)
File "/scratch/users/tgudela/greaseLM/GreaseLM-main/modeling/modeling_gnn.py", line 91, in forward
  aggr_out = self.propagate(edge_index, x=x, edge_attr=edge_embeddings) #[N, emb_dim]
File "/home/t/tgudela/.conda/envs/greaselm2/lib/python3.8/site-packages/torch_geometric/nn/conv/message_passing.py", line 261, in propagate
  coll_dict = self.__collect__(self.__user_args__, edge_index, size,
File "/home/t/tgudela/.conda/envs/greaselm2/lib/python3.8/site-packages/torch_geometric/nn/conv/message_passing.py", line 171, in __collect__
  data = self.__lift__(data, edge_index,
File "/home/t/tgudela/.conda/envs/greaselm2/lib/python3.8/site-packages/torch_geometric/nn/conv/message_passing.py", line 141, in __lift__
  return src.index_select(self.node_dim, index)
RuntimeError: CUDA out of memory. Tried to allocate 144.00 MiB (GPU 0; 15.78 GiB total capacity; 14.28 GiB already allocated; 133.50 MiB free; 14.39 GiB reserved in tot
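
The first line of the traceback slices every input as x[a:b], which suggests that each batch is fed to the model in smaller chunks. The sketch below shows how such (a, b) index pairs can be generated; `micro_batch_ranges` is a hypothetical helper, and how the repo actually chooses the chunk size is not shown in this trace.

```python
def micro_batch_ranges(batch_size, mini_batch_size):
    """Yield (a, b) index pairs that cover range(batch_size) in chunks."""
    if mini_batch_size <= 0:
        raise ValueError("mini_batch_size must be positive")
    for a in range(0, batch_size, mini_batch_size):
        yield a, min(a + mini_batch_size, batch_size)


if __name__ == "__main__":
    # A batch of 8 examples processed 3 at a time.
    for a, b in micro_batch_ranges(8, 3):
        print(a, b)
```

Only one chunk's activations are held on the GPU at a time, so a smaller chunk size lowers peak memory at the cost of more forward passes per batch.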

[Discussion] Relevance of GreaseLM results in light of 'GNN is a Counter?..' paper + dataset discussion

Hi

Very interesting work on the combination of LM + KG! This is something I am looking into myself as a research project (https://github.com/apoorvumang/transformer-kgc), and I thought this would be a good place to discuss what datasets such models should be used on.

In the very recently released paper GNN is a Counter? Revisiting GNN for Question Answering, (code at https://github.com/anonymousGSC/graph-soft-counter), they show that a 1-dim GNN + LM is able to achieve almost SOTA results on both OpenBookQA and CommonsenseQA. In fact according to their numbers it even outperforms GreaseLM on both these datasets.

I would like to discuss a few things regarding the dataset situation:

  1. CommonSenseQA leaderboard no longer accepts ConceptNet based submissions, which is quite a bummer, and OpenBookQA is extremely small (500 test and 500 dev questions only, around 5k train). Is it worth it (for me and others) to work with these datasets, given the findings of 'GNN...' paper?
  2. If not, could GreaseLM (and similar methods) be applied to regular KGQA datasets such as WebQuestionsSP, ComplexWebQuestions or GrailQA? This of course would be harder since it's no longer MCQ reasoning, but it might be more interesting and could give real evidence of LM + KG based reasoning.
  3. Are there any other datasets apart from the ones I mentioned that could be relevant in this area? (MedQA-USMLE is of course one, but I feel it is quite new, and having another older/more established dataset would be an advantage)

Looking forward to a healthy discussion! 😊

RoBERTa Baseline

Hello,
Thank you very much for providing the implementation of your model.
I have a question regarding the RoBERTa baseline.
Unfortunately, I could not find the implementation of RoBERTa baseline fine-tuning on the QA tasks in the repository.
Is it present in the repo and I have overlooked it?
If not, what parameters were used for fine-tuning and how was the classification layer implemented?
Many thanks in advance!

The experiments on complex questions with semantic nuance

I would also like to try similar experiments on different complex questions with semantic nuance, but I can't find your specific classification method (prepositional phrases, negation terms, hedging terms) for the complex questions. If you still have the code for handling these questions, could you share it with me? Thanks!

the model "roberta-large" doesn't exist

Hello, while replicating this project recently, I found that the 'roberta-large' model no longer exists on the Hugging Face website. Can this project be used with other models instead?

Inquiry Regarding Experiment with Aristo-RoBERTa Encoder on OBQA Dataset in GreaseLM Paper

I am reaching out to seek assistance regarding my attempts to reproduce the experimental results mentioned in the GreaseLM paper, specifically concerning the utilization of the Aristo-RoBERTa encoder on the OBQA dataset.

Despite multiple attempts, I have been unable to replicate the performance reported in the paper. In order to facilitate my efforts, I would greatly appreciate it if you could provide more comprehensive details regarding the hyperparameters used in this particular experiment.

Your guidance on this matter would be immensely valuable to me, and I am eager to hear from you at your earliest convenience. Thank you very much for your attention to this matter.

retrieve graph for single sentence input

Hi, thank you for introducing such an intriguing work.

Your proposed subgraph retrieval process is customized to Q-A pair input.

Which parts of the code should I modify to retrieve a graph for just a single sentence input?

The reason I ask is that I'd like to build a custom pipeline that samples a ConceptNet subgraph for a given custom string input.

FileNotFoundError: [Errno 2] No such file or directory: 'data/csqa/inhouse_split_qids.txt'

Hi Xikun,

Thanks for your great work. May I ask where I can get this inhouse_split_qids.txt file?

***** hyperparameters *****
dataset: csqa
******************************
wandb: WARNING W&B installed but not logged in.  Run `wandb login` or set the WANDB_API_KEY env variable.
ModelClass <class 'transformers.modeling_roberta.RobertaModel'>
NLP
pid: 493
screen: 

gpu: 1

torch version: 1.8.0+cu101
torch cuda version: 10.1
cuda is available: True
cuda device count: 1
cudnn version: 7603
wandb id:  1iygcifx
loading from checkpoint: ./checkpoints/csqa/csqa_model.pt
train_statement_path ./data//csqa/statement/train.statement.jsonl
num_choice 5
Loading sparse adj data...
| ori_adj_len: mu 12.13 sigma 9.67 | adj_len: 13.13 | prune_rate: 0.00 | qc_num: 5.46 | ac_num: 1.54 |
Finish loading training data.
Loading sparse adj data...
| ori_adj_len: mu 12.16 sigma 10.18 | adj_len: 13.16 | prune_rate: 0.00 | qc_num: 5.34 | ac_num: 1.54 |
Finish loading dev data.
Loading sparse adj data...
| ori_adj_len: mu 12.02 sigma 9.17 | adj_len: 13.02 | prune_rate: 0.00 | qc_num: 5.48 | ac_num: 1.53 |
Finish loading test data.
Traceback (most recent call last):
  File "greaselm.py", line 606, in <module>
    main(args)
  File "greaselm.py", line 546, in main
    evaluate(args, has_test_split, devices, kg)
  File "greaselm.py", line 449, in evaluate
    dataset = load_data(args, devices, kg)
  File "greaselm.py", line 50, in load_data
    dataset = data_utils.GreaseLM_DataLoader(args.train_statements, args.train_adj,
  File "/data/xuanlong/Graph2Text/GreaseLM/utils/data_utils.py", line 144, in __init__
    with open(inhouse_train_qids_path, 'r') as fin:
FileNotFoundError: [Errno 2] No such file or directory: 'data/csqa/inhouse_split_qids.txt'

BR,
