
cogqa's People

Contributors

dm-thu, qibinc, ronakice, sleepychord


cogqa's Issues

Meaning of "extract 1-hop nodes but do not calculate semantic vectors" in the paper

"And when extracting 1-hop nodes from question to initialize G, we do not calculate semantic vectors and only the Question part exists in the input."

How should this sentence be understood? In the paper, System 1's input when visiting node x is [CLS] Question [SEP] clues[x,G] [SEP] Para[x]. If semantic vectors are not calculated, does that mean the first pass is effectively [CLS] Question [SEP] [SEP] Para[x]?
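
A minimal sketch of the two input layouts being discussed (not the authors' code; tokenizer, question_tokens, clue_tokens and para_tokens are placeholder names):

    def build_system1_input(tokenizer, question_tokens, clue_tokens, para_tokens):
        # Full System 1 input when visiting node x: [CLS] Question [SEP] clues[x,G] [SEP] Para[x]
        tokens = ['[CLS]'] + question_tokens + ['[SEP]'] + clue_tokens + ['[SEP]'] + para_tokens
        return tokenizer.convert_tokens_to_ids(tokens)

    def build_1hop_init_input(tokenizer, question_tokens):
        # One reading of "only the Question part exists in the input": no clues and no Para[x],
        # so 1-hop entities are extracted from the question alone and no semantic vector is computed.
        tokens = ['[CLS]'] + question_tokens + ['[SEP]']
        return tokenizer.convert_tokens_to_ids(tokens)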

ans_loss and hop_loss became 'nan'

Hello, I followed all the instructions to train the System 1 model. Since I have only one RTX 2080 GPU, I reduced the batch size from 12 to 4, i.e. python train.py --batch_size 4. About 60% of the way through training, hop_loss and ans_loss became nan. On another server with multiple GPUs, training System 1 with the default batch_size=12 works fine. I wonder whether this is caused by the LogSoftmax in the loss function producing -inf. Do you have any solution for this problem? Thank you.
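
A generic guard against -inf log-probabilities, under the assumption that this is indeed the cause; logits and target are placeholder names, not the repo's variables:

    import torch
    import torch.nn.functional as F

    def safe_nll_loss(logits, target, min_log_prob=-1e4):
        # Clamp log-probabilities so a zero-probability class contributes a large
        # but finite penalty instead of -inf, which would turn the loss into nan.
        log_probs = F.log_softmax(logits, dim=-1)
        log_probs = torch.clamp(log_probs, min=min_log_prob)
        return F.nll_loss(log_probs, target)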

model1 = BertForMultiHopQuestionAnswering.from_pretrained(BERT_MODEL, cache_dir=PYTORCH_PRETRAINED_BERT_CACHE / 'distributed_{}'.format(-1))

Model name 'bert-base-uncased' was not found in model name list (bert-base-uncased, bert-large-uncased, bert-base-cased, bert-large-cased, bert-base-multilingual-uncased, bert-base-multilingual-cased, bert-base-chinese). We assumed 'https://s3.amazonaws.com/models.huggingface.co/bert/bert-base-uncased.tar.gz' was a path or url but couldn't find any file associated to this path or url.

Hello, authors!
Model name 'bert-base-uncased' was not found in model name list (bert-base-uncased,
This is the most absurd bug I have seen this year...
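
A possible workaround, assuming the root cause is simply a failed download: fetch bert-base-uncased.tar.gz manually and point from_pretrained at the local copy, since pytorch_pretrained_bert also accepts a local archive or directory path in place of a model name (the local path below is hypothetical):

    from model import BertForMultiHopQuestionAnswering  # assumed to be CogQA's model.py

    LOCAL_BERT = './bert-base-uncased.tar.gz'  # downloaded manually from the S3 URL in the error message
    model1 = BertForMultiHopQuestionAnswering.from_pretrained(LOCAL_BERT)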

model2 = CognitiveGNN(model1.config.hidden_size) 'NoneType' object has no attribute 'config'

Hello, authors! There is a bug that showed up when I first started reproducing the code; later it disappeared on its own, and now that I have switched models it has appeared again.
Traceback (most recent call last):
File "/home/shaoai/CogQA/train.py", line 337, in
fire.Fire(main)
File "/home/shaoai/anaconda3/envs/mypytorch/lib/python3.6/site-packages/fire/core.py", line 127, in Fire
component_trace = _Fire(component, args, context, name)
File "/home/shaoai/anaconda3/envs/mypytorch/lib/python3.6/site-packages/fire/core.py", line 366, in _Fire
component, remaining_args)
File "/home/shaoai/anaconda3/envs/mypytorch/lib/python3.6/site-packages/fire/core.py", line 542, in _CallCallable
result = fn(*varargs, **kwargs)
File "/home/shaoai/CogQA/train.py", line 323, in main
model2 = CognitiveGNN(model1.config.hidden_size)
AttributeError: 'NoneType' object has no attribute 'config'
This is the bug I hit the very first time I cloned and ran the code. Why does it happen, and how can I fix it?
Looking forward to your reply. Thanks!
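
A quick sanity check, assuming the NoneType comes from a failed from_pretrained call (pytorch_pretrained_bert logs download or parsing errors and returns None instead of raising):

    model1 = BertForMultiHopQuestionAnswering.from_pretrained(BERT_MODEL)
    if model1 is None:
        raise RuntimeError('BERT weights failed to load; check the model name, network access, or cache_dir')
    model2 = CognitiveGNN(model1.config.hidden_size)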

Hello, a question about paragraph extraction.

Hello, I did not quite understand: for paragraph selection, do you not pre-select a few candidate paragraphs (say 5 or 10) before reasoning, but instead run the iterative reasoning directly over all paragraphs?

About the framework of GNN

I looked at the code of the CognitiveGNN model, and it seems you do not use any GNN framework such as DGI or PyG. Why not? I think a framework might help with the speed problem.
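
For context, a framework-free graph layer in plain PyTorch can be as small as the sketch below; this is illustrative only and is not the repo's CognitiveGNN implementation:

    import torch
    import torch.nn as nn

    class SimpleGCNLayer(nn.Module):
        def __init__(self, hidden_size):
            super().__init__()
            self.linear = nn.Linear(hidden_size, hidden_size)

        def forward(self, node_states, adj):
            # node_states: (num_nodes, hidden); adj: (num_nodes, num_nodes) normalized adjacency
            messages = adj @ self.linear(node_states)  # aggregate transformed neighbor states
            return torch.relu(node_states + messages)  # residual update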

The answer in the dataset does not come from that example's context -- a question

I noticed that the answer in the dataset does not come from the example's own context. In that case, what does the multi-hop reasoning hop over, if not this example's context?

For example, for the question "Which magazine was started first Arthur's Magazine or First for Women?", the answer is Arthur's Magazine, but the answer is not mentioned in the "context".

I have already ruled out multi-hop over external Wikipedia data, because I did not run that part of the code.

The role of sem[x,Q,clues] in System 1 and System 2

Hello, I see that sem[x,Q,clues] appears in both System 1 and System 2 in your paper, and I would like to understand what sem means. Specifically:

  1. System 1: for an answer node x, Para[x] may be missing, so no span is extracted; instead sem[x,Q,clues] is calculated based on the "sentence A" part;

  2. System 2: to fully understand the relationship between entity x and question Q, analyzing sem[x,Q,clues] alone is far from enough;

Could you explain what these two sentences mean?
(I previously thought the workflow was: BERT extracts the next-hop entities, the next-hop entities walk and reason in the GNN, and the result is passed back to BERT; but then I noticed this extra sem.)

I hope I have expressed myself clearly.
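
As a very rough illustration of what a "semantic vector from the sentence A part" could look like (a guess at the general idea, not the paper's exact definition), one could take System 1's output at the [CLS] position as a summary vector:

    def semantic_vector(last_hidden_states):
        # last_hidden_states: (seq_len, hidden) from System 1 over "[CLS] Question [SEP] clues [SEP]"
        return last_hidden_states[0]  # the [CLS] position summarizes the question/clues segment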

Possibility of training on 1 GPU?

Running the training code causes repeated CUDA out-of-memory errors starting around epoch 34.
My GPU: an NVIDIA GTX 100ti.
I've tried offloading the GCN to the CPU and setting the batch size to 1.
Is there any way I could further optimize my code to prevent these errors?
Thanks.
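
One generic memory-saving pattern, not specific to this repo, is gradient accumulation: several small micro-batches share one optimizer step, so the effective batch size stays large while peak memory stays low. All names below are placeholders:

    def train_with_grad_accumulation(model, optimizer, loader, compute_loss, accumulation_steps=4):
        optimizer.zero_grad()
        for step, batch in enumerate(loader):
            loss = compute_loss(model, batch) / accumulation_steps  # scale so the total matches one big batch
            loss.backward()
            if (step + 1) % accumulation_steps == 0:
                optimizer.step()
                optimizer.zero_grad()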

The low results

The final results are very low, nearly 10 times worse than the results in the paper. Can you tell me the reason?

{'em': 0.03551654287643484, 'f1': 0.0478604024080054, 'prec': 0.051990450467830615, 'recall': 0.0491018037405613, 'sp_em': 0.0005401755570560432, 'sp_f1': 0.12093599476326915, 'sp_prec': 0.07244085040682292, 'sp_recall': 0.42420838558245866, 'joint_em': 0.0001350438892640108, 'joint_f1': 0.009409090423642906, 'joint_prec': 0.005848759671643607, 'joint_recall': 0.03384786494172044}

fullwiki data

Hello, could you tell me where I can download the fullwiki data (enwiki-20171001-pages-meta-current-withlinks-abstracts)? Thanks.

Where is the fullwiki_input_improved_by_cogqa1hop.zip?

I have completed training Task #1 and Task #2. When I wanted to evaluate the model, I couldn't find fullwiki_input_improved_by_cogqa1hop.zip, so I directly ran the command python cogqa.py --data_file='hotpot_dev_fullwiki_v1_merge.json'. But during answering it raises an error: ValueError: attempt to get argmin of an empty sequence.
Start Training... on 1 GPUs
17%|██████▊ | 1294/7405 [03:41<13:29, 7.55it/s]
Traceback (most recent call last):
File "cogqa.py", line 244, in
fire.Fire(main)
File "/home/zeyuzhang/anaconda3/lib/python3.7/site-packages/fire/core.py", line 127, in Fire
component_trace = _Fire(component, args, context, name)
File "/home/zeyuzhang/anaconda3/lib/python3.7/site-packages/fire/core.py", line 366, in _Fire
component, remaining_args)
File "/home/zeyuzhang/anaconda3/lib/python3.7/site-packages/fire/core.py", line 542, in _CallCallable
result = fn(*varargs, **kwargs)
File "cogqa.py", line 234, in main
gold, ans, graph_ret, ans_nodes = cognitive_graph_propagate(tokenizer, data, model1, model2, device, setting = setting, max_new_nodes=max_new_nodes)
File "cogqa.py", line 147, in cognitive_graph_propagate
l, r = find_start_end_before_tokenized(orig_text, [pred_slice])[0]
File "/home/zeyuzhang/Downloads/CogQA-master/utils.py", line 238, in find_start_end_before_tokenized
result = fuzzy_find([span], orig_text)
File "/home/zeyuzhang/Downloads/CogQA-master/utils.py", line 107, in fuzzy_find
r, score = dp(item, sentence)
File "/home/zeyuzhang/Downloads/CogQA-master/utils.py", line 86, in dp
r = np.argmin(f[len(a) - 1])
File "/home/zeyuzhang/anaconda3/lib/python3.7/site-packages/numpy/core/fromnumeric.py", line 1172, in argmin
return _wrapfunc(a, 'argmin', axis=axis, out=out)
File "/home/zeyuzhang/anaconda3/lib/python3.7/site-packages/numpy/core/fromnumeric.py", line 56, in _wrapfunc
return getattr(obj, method)(*args, **kwds)
ValueError: attempt to get argmin of an empty sequence
Can you tell me how to solve these? Thank you very much.
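
A hypothetical guard for the crash above: the cost row handed to argmin can be empty when the extracted span ends up empty after tokenization or cleaning, so one option is to bail out instead of calling argmin (illustrative names, not a patch to the repo's utils.py):

    import numpy as np

    def safe_argmin(row):
        if len(row) == 0:
            return None  # caller should treat the span as "not found" and skip it
        return int(np.argmin(row))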

Where can I get `hotpot_train_v1.1.json`?

When I run !python /content/CogQA/process_train.py, the following error occurs:

Traceback (most recent call last):
File "/content/CogQA/process_train.py", line 18, in <module>

with open('./hotpot_train_v1.1.json', 'r') as fin:

FileNotFoundError: [Errno 2] No such file or directory: './hotpot_train_v1.1.json'

I wonder where I can get this file; please help.
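
This file is part of the HotpotQA dataset rather than this repo. A minimal download sketch, assuming the URL from the official HotpotQA download script is still live:

    import urllib.request

    URL = 'http://curtis.ml.cmu.edu/datasets/hotpot/hotpot_train_v1.1.json'
    urllib.request.urlretrieve(URL, './hotpot_train_v1.1.json')  # large JSON file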

dump.rdb

Hi,
Why can't I find the file "dump.rdb"? Thank you very much.

About the training results

Hello, I recently tried running the code. Without changing any parameters, the accuracy I got does not seem to reach the roughly 55% reported in the paper. Do I need to change any other settings, or might I have gotten something wrong? There were no errors during training, so I have no clue what happened. Below are the results I got; I hope you can help.

{'f1': 0.08085961805557072, 'joint_recall': 0.015151681770718994, 'joint_prec': 0.031592446116416414, 'em': 0.052397029034436195, 'sp_f1': 0.11114176393041891, 'joint_f1': 0.019395676593513603, 'joint_em': 0.0, 'sp_em': 0.0, 'sp_recall': 0.08090929552104435, 'prec': 0.08201151088389438, 'recall': 0.08718660277950503, 'sp_prec': 0.18284942606347063}

An error when loading the input data after replacing BERT with ALBERT

Hello, authors! While modifying the model I tried to replace BERT with ALBERT.
I changed
BERT_MODEL = 'bert-base-uncased'
tokenizer = BertTokenizer.from_pretrained(BERT_MODEL, do_lower_case=True)
to
tokenizer = BertTokenizer.from_pretrained("./albert_base")
BERT_MODEL = BertModel.from_pretrained("./albert_base")

Then I got the following error:
File "train.py", line 158, in main
bundles.append(convert_question_to_samples_bundle(tokenizer, data))
File "/home/shao/CogQA/data.py", line 187, in convert_question_to_samples_bundle
ids.append(tokenizer.convert_tokens_to_ids(tokenized_all))
File "/home/shao/anaconda3/envs/cogqa/lib/python3.6/site-packages/pytorch_pretrained_bert/tokenization.py", line 121, in convert_tokens_to_ids
ids.append(self.vocab[token])
KeyError: '[CLS]'
What aspect of the data loading could be causing this? Looking forward to your reply!
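
A guess at the cause: ALBERT ships a SentencePiece model rather than a WordPiece vocab.txt, so the old BertTokenizer from pytorch_pretrained_bert builds a vocabulary with no '[CLS]' entry. One possible route, assuming you can switch to the newer transformers library and that ./albert_base is a local ALBERT checkpoint directory:

    from transformers import AlbertModel, AlbertTokenizer

    tokenizer = AlbertTokenizer.from_pretrained('./albert_base')
    model = AlbertModel.from_pretrained('./albert_base')
    ids = tokenizer.convert_tokens_to_ids(['[CLS]'])  # '[CLS]' is a registered special token here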

About dump.pkl

Hi,

Just to be precise: is the dump.pkl mentioned in the README the Redis dump file?

In my experiment, I didn't find dump.pkl in the project directory, but I found a dump.rdb of about 2.4 GB in the Redis directory.

Thank you!

Self-cycle in gold-only cognitive graph for comparison question

Hi,

I found that the following snippet may cause a self-cycle.

CogQA/process_train.py

Lines 91 to 93 in 217f0f1

if bundle['answer'] == 'yes' or bundle['answer'] == 'no' \
        or (question_type > 0 and bundle['type'] == 'comparison'):
    pool.add(title)

For example, after running process_train.py, I got a JSON object like this:

{
  "supporting_facts": [
    [
      "Arthur's Magazine",
      0,
      [
        [
          "Arthur's Magazine",
          "Arthur's Magazine",
          0,
          17
        ]
      ]
    ],
    [
      "First for Women",
      0,
      [
        [
          "First for Women",
          "First for Women",
          0,
          15
        ]
      ]
    ]
  ],
  "level": "medium",
  "question": "Which magazine was started first Arthur's Magazine or First for Women?",
  "context": ["..."],
  "answer": "Arthur's Magazine",
  "_id": "5a7a06935542990198eaf050",
  "type": "comparison",
  "Q_edge": [
    [
      "First for Women",
      "First for Women",
      54,
      69
    ],
    [
      "Arthur's Magazine",
      "Arthur's Magazine",
      33,
      50
    ]
  ]
}

However, I think it should look like what is shown in your examples:

{
  "supporting_facts": [
    [
      "Arthur's Magazine",
      0,
      []
    ],
    [
      "First for Women",
      0,
      []
    ]
  ],
  "level": "medium",
  "question": "Which magazine was started first Arthur's Magazine or First for Women?",
  "context": ["..."],
  "answer": "Arthur's Magazine",
  "_id": "5a7a06935542990198eaf050",
  "type": "comparison",
  "Q_edge": [
    [
      "Arthur's Magazine",
      "Arthur's Magazine",
      33,
      50
    ],
    [
      "First for Women",
      "First for Women",
      54,
      69
    ]
  ]
}

Could you explain what this snippet is for? By the way, my reproduced result on the dev set with 2 K80 GPUs is about 10% lower than the result in the paper; do you think this snippet could be a reason for the low result?

Thank you!

About process_train.py

Hi Ming,

If you load hotpot_dev_fullwiki_v1.json in your process_train.py, an error occurs.
(screenshot: Screen Shot 2019-07-05 17 49 01)

Explanation of GENERAL_WD in utils.py

When the data is processed in process_train.py, during the cognitive-graph construction in utils.py, what exactly does this variable mean?

It is defined as follows:
GENERAL_WD = ['is', 'are', 'am', 'was', 'were', 'have', 'has', 'had', 'can', 'could', 'shall', 'will', 'should', 'would', 'do', 'does', 'did', 'may', 'might', 'must', 'ought', 'need', 'dare']

Also, am I right to understand that what process_train.py does is add fuzzy-matched entities and construct the cognitive graph?
Thanks for your reply!

About process_train.py

Hi, Ming
According to the README, I ran the code in the following order:

  1. process_train.py with hotpot_train_v1.1.json -> get hotpot_train_v1.1_refined3.json
  2. read_fullwiki.py (from read_fullwiki.ipynb)
  3. run_cg.py (first time)
  4. run_cg.py (second time, set lr=4*1e-5) -> models/bert-base-uncased.bin & .bin.tmp
  5. eval_cg.py on hotpot_dev_fullwiki_v1.json -> hotpot_dev_fullwiki_v1_pred.json
  6. hotpot_evaluate_v1.py with hotpot_dev_fullwiki_v1_pred.json & hotpot_dev_fullwiki_v1.json
  The results are not ideal, but I cannot find any reason. Did I do anything wrong in the above process?

  Some other questions:
    1) In step 1, I got hotpot_train_v1.1_refined3.json. Using "_id", I selected from it the same 500 examples as hotpot_train_v1.1_500_refined.example.json, but there are some differences between the newly generated 500 examples and the 500 examples you provide.
    2) In read_fullwiki.py, what does "pages" in the last part represent?

  Could you please give me some advice? Or could you provide a download link for the refined training dataset? Thanks a lot!

NEG TOO LONG!

When I run python train.py to train Task #1, this message appears: "NEG TOO LONG! id: 5a72a3a25542994cef4bc3ab". Should I ignore this problem?
