I followed the steps in the notebook, but the code failed at the "train the model" step. This is the output from that cell:
Keyword arguments {'is_pretokenized': True} not recognized.
<message repeats 100 times>
Keyword arguments {'is_pretokenized': True} not recognized.
AttributeError Traceback (most recent call last)
<ipython-input-12-88f6aa8b6960> in <module>()
2 epoch_best = 0
3 for epoch in range(EPOCHS):
----> 4 acc_train = dev_function.train( seq2sql_model, roberta_model, model_optimizer, roberta_optimizer, tokenizer, configuration, path_wikisql, train_loader)
5 acc_dev, results_dev, cnt_list = dev_function.test(seq2sql_model, roberta_model, model_optimizer, tokenizer, configuration, path_wikisql, dev_loader, mode="dev")
6 print_result(epoch, acc_train, 'train')
2 frames
/content/RoBERTa-NL2SQL/dev_function.py in train(seq2sql_model, roberta_model, model_optimizer, roberta_optimizer, roberta_tokenizer, roberta_config, path_wikisql, train_loader)
61 = roberta_training.get_wemb_roberta(roberta_config, roberta_model, roberta_tokenizer,
62 natural_lang_utterance_tokenized, headers,max_seq_length= 222,
---> 63 num_out_layers_n=2, num_out_layers_h=2)
64 # natural_lang_embeddings: natural language embedding
65 # header_embeddings: header embedding
/content/RoBERTa-NL2SQL/roberta_training.py in get_wemb_roberta(roberta_config, model_roberta, tokenizer, nlu_t, hds, max_seq_length, num_out_layers_n, num_out_layers_h)
76 all_encoder_layer, i_nlu, i_headers,\
77 l_n, l_hpu, l_hs, \
---> 78 nlu_tt, t_to_tt_idx, tt_to_t_idx = get_roberta_output(model_roberta, tokenizer, nlu_t, hds, max_seq_length)
79 # all_encoder_layer: RoBERTa outputs from all layers.
80 # i_nlu: start and end indices of question in tokens
/content/RoBERTa-NL2SQL/roberta_training.py in get_roberta_output(model_roberta, tokenizer, nlu_t, headers, max_seq_length)
176 all_encoder_layer = list(all_encoder_layer)
177
--> 178 assert all((check == all_encoder_layer[-1]).tolist())
179
180 # 5. generate l_hpu from i_headers
AttributeError: 'bool' object has no attribute 'tolist'
I tried simplifying the assertion to

assert check == all_encoder_layer[-1]

This ran, but the assertion still failed, and printing the two compared values gave:

check: last_hidden_state, all_encoder_layer[-1]: s

I also tried removing the assertion entirely, but, as expected, the code then failed further on.
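For what it's worth, my guess (an assumption on my part, not something I've confirmed against the repo) is a transformers version mismatch: the `is_pretokenized` tokenizer argument was renamed to `is_split_into_words` in transformers 4.x, which matches the repeated warning, and newer versions also return a dict-like ModelOutput from the model's forward pass. Calling `list(...)` on such an output yields its key names as strings, which would explain why `check` prints as `last_hidden_state` and why the `==` comparison produces a plain Python bool with no `.tolist()`. A minimal sketch of that mechanism, using a plain dict to stand in for the ModelOutput:

```python
# Stand-in for a transformers ModelOutput, which is dict-like:
# iterating over it yields key names, not layer tensors.
fake_model_output = {
    "last_hidden_state": [[0.1, 0.2]],
    "hidden_states": [[0.3, 0.4]],
}

# What roberta_training.py effectively does in the failing code path:
all_encoder_layer = list(fake_model_output)   # ['last_hidden_state', 'hidden_states']
check = all_encoder_layer[0]                  # the string 'last_hidden_state'

comparison = check == all_encoder_layer[-1]   # str == str -> plain bool, not a tensor
print(type(comparison))                       # <class 'bool'>

try:
    comparison.tolist()                       # tensors have .tolist(); bool does not
except AttributeError as e:
    print(e)                                  # 'bool' object has no attribute 'tolist'
```

If that is the cause, pinning the transformers release the repo was written against (or unpacking the model outputs by attribute instead of iterating) might fix it, but I haven't verified either.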
Has anyone run this notebook successfully, or found a fix that lets training complete?