
sequicity's Introduction

Sequicity

Source code for the ACL 2018 paper entitled "Sequicity: Simplifying Task-oriented Dialogue Systems with Single Sequence-to-Sequence Architectures" by Wenqiang Lei et al.

@inproceedings{lei2018sequicity,
  title={Sequicity: Simplifying task-oriented dialogue systems with single sequence-to-sequence architectures},
  author={Lei, Wenqiang and Jin, Xisen and Kan, Min-Yen and Ren, Zhaochun and He, Xiangnan and Yin, Dawei},
  booktitle={ACL},
  year={2018}
}

Training with default parameters

python model.py -mode train -model [tsdf-camrest|tsdf-kvret]

(optional: configure hyperparameters on the command line)

python model.py -mode train -model [tsdf-camrest|tsdf-kvret] -cfg lr=0.003 batch_size=32
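The `-cfg` flag takes space-separated `key=value` pairs. A minimal sketch of how such overrides can be parsed into typed values (a hypothetical helper for illustration; the project's actual parser lives in config.py and may differ):

```python
def parse_cfg_overrides(pairs):
    """Parse space-separated key=value strings into typed config overrides.

    Hypothetical helper illustrating the -cfg syntax; not the project's
    actual implementation.
    """
    overrides = {}
    for pair in pairs:
        key, _, raw = pair.partition('=')
        try:
            value = int(raw)          # try integer first (e.g. batch_size=32)
        except ValueError:
            try:
                value = float(raw)    # then float (e.g. lr=0.003)
            except ValueError:
                value = raw           # fall back to a plain string
        overrides[key] = value
    return overrides

print(parse_cfg_overrides(['lr=0.003', 'batch_size=32']))
# {'lr': 0.003, 'batch_size': 32}
```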

Testing

python model.py -mode test -model [tsdf-camrest|tsdf-kvret]

Reinforcement fine-tuning

python model.py -mode rl -model [tsdf-camrest|tsdf-kvret] -cfg lr=0.0001

Before running

  1. Install the required Python packages. We used PyTorch 0.3.0 and Python 3.6 on Linux.
pip install -r requirements.txt
  2. Make directories under PROJECT_ROOT.
mkdir vocab
mkdir log
mkdir results
mkdir models
mkdir sheets
  3. Download pretrained GloVe word vectors and place them in PROJECT_ROOT/data/glove.
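The directory setup above can equivalently be done from Python, which is convenient in setup scripts (a sketch; directory names taken from the steps above):

```python
import os

# Create the working directories Sequicity expects under PROJECT_ROOT.
# exist_ok=True makes the script safe to re-run.
for d in ('vocab', 'log', 'results', 'models', 'sheets', 'data/glove'):
    os.makedirs(d, exist_ok=True)
```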

sequicity's People

Contributors

anonymousla, aucson, knmnyn, lwqlwq, shizhediao


sequicity's Issues

The content in the tokenized kvret files seems to be wrong

I have two questions about the kvret/*.tokenized.json files.

Why are all "requested" fields empty in these files?

Every requested field seems to contain only the EOS_Z2 token. This seems wrong: at least some of the dialogues in the kvret dataset have actual requested slots.

>>> import json
>>> print(frozenset([e3 for e1 in json.load(open('train.tokenized.json'))
...                     for e2 in e1 for e3 in e2['requested']]))
frozenset({'EOS_Z2'})

Why aren't the constraints changing as the dialogue progresses ?

Consider the first dialogue (dial_id: 0, about directions to a parking garage) in kvret/train.tokenized.json. For all turns in that dialogue the constraint set remains
"constraint": [ "parking", "garage", "EOS_Z1" ], but the user asks to avoid heavy traffic in the next turn. That should change the constraints, right?

About CopyNet

Thanks for your open-source code.
I have a question about your stable version of CopyNet.
I find that your version of CopyNet accounts for the generation probability of the source tokens, but the original CopyNet paper also adds other mechanisms, such as selective read, which I have not found in your code. Is this version of CopyNet good enough for Sequicity? Can you tell me the reason?
Thank you very much!

About Reinforcement Learning

First of all, thanks for your open-source code of this wonderful work.
I also have some questions about your reinforcement learning code. I found that in your version of reinforcement learning, you use the training dataset for policy-gradient fine-tuning of the parameters.
But in my opinion, a user simulator should be used as the environment for updating the parameters in an RL setup. Can you tell me the reason?
Thank you very much !

Regular expression question

What does ([ap]m) mean in re.sub('(\d+) ([ap]m)', lambda x: x.group(1) + x.group(2), u)?
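For illustration: [ap]m is a character class matching "am" or "pm", the surrounding parentheses make it a capture group, and the lambda joins the captured digits directly to the captured am/pm, removing the space between them:

```python
import re

u = 'meet me at 5 pm or 11 am'
# (\d+) captures the digits; ([ap]m) captures the literal "am" or "pm".
# The lambda concatenates group 1 and group 2, dropping the space:
# "5 pm" -> "5pm", "11 am" -> "11am".
result = re.sub(r'(\d+) ([ap]m)', lambda x: x.group(1) + x.group(2), u)
print(result)  # meet me at 5pm or 11am
```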

config error when fine-tuning with rl

python model.py -mode train -mode [tsdf-camrest|tsdf-kvret] -config lr=0.0001
should be
python model.py -mode rl -model [tsdf-camrest|tsdf-kvret]

When running model.py -mode rl -model tsdf-kvret, the error is as follows:
Traceback (most recent call last):
File "model.py", line 367, in
main()
File "model.py", line 363, in main
m.reinforce_tune()
File "model.py", line 222, in reinforce_tune
for epoch in range(self.base_epoch + cfg.rl_epoch_num + 1):
AttributeError: '_Config' object has no attribute 'rl_epoch_num'

Solution: in config.py, _kvret_tsdf_init() is missing the line:
self.rl_epoch_num = 2

decoder result is strange

I trained on the "camrest" dataset, ran the test, and generated the result file
camrest-rl.csv.zip
In this file, a generated response reads "there are no chinese restaurants in the area_SLOT part of town . would you like their phone number is phone_SLOT . would you like their phone number is phone_SLOT . would you like their phone number is phone_SLOT . would". It is too long and tends to repeat, and I found that fragments like "there are no *** matching your request" appear in many turns' responses. I ran "python3 model.py -mode train -model tsdf-camrest" and "python3 model.py -mode test -model tsdf-camrest". Is there anything wrong with these steps, or are these results normal for Sequicity?

Is it possible for decoding the real slot-value?

Hi, thanks for sharing your code.
I ran some experiments on several datasets, and the model replies with slot tokens like name_SLOT rather than the real slot values.
How can I turn the tokens into the real slot values?
Is it possible?
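As far as I can tell, relexicalization is not part of the released code. A minimal sketch of such a post-processing step, assuming you keep the KB entry retrieved for the turn (relexicalize and its entity dict are hypothetical names, not part of the project):

```python
def relexicalize(response, entity):
    """Replace *_SLOT placeholders with values from a retrieved KB entry.

    Hypothetical post-processing step; `entity` is assumed to be a dict
    like {'name': 'golden house', 'phone': '01223 123456'}.
    """
    tokens = []
    for tok in response.split():
        if tok.endswith('_SLOT'):
            slot = tok[:-len('_SLOT')]            # 'name_SLOT' -> 'name'
            tokens.append(entity.get(slot, tok))  # keep token if value unknown
        else:
            tokens.append(tok)
    return ' '.join(tokens)

print(relexicalize('name_SLOT is a nice place',
                   {'name': 'golden house'}))
# golden house is a nice place
```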

KeyError when using pytorch 0.4.0

I am using pytorch 0.4.0 and everything works fine except a KeyError when decoding. Here is the log:
...
INFO:root:Traning time: 17.127450942993164
INFO:root:avg training loss in epoch 0 sup:5.734404
DEBUG:root:bucket 5 instance 32
DEBUG:root:bucket 7 instance 6
DEBUG:root:bucket 4 instance 41
DEBUG:root:bucket 3 instance 37
DEBUG:root:bucket 6 instance 6
DEBUG:root:bucket 2 instance 13
model.py:207: UserWarning: invalid index of a 0-dim tensor. This will be an error in PyTorch 0.5. Use tensor.item() to convert a 0-dim tensor to a Python number
sup_loss += loss.data[0]
model.py:210: UserWarning: invalid index of a 0-dim tensor. This will be an error in PyTorch 0.5. Use tensor.item() to convert a 0-dim tensor to a Python number
'loss:{} pr_loss:{} m_loss:{}'.format(loss.data[0], pr_loss.data[0], m_loss.data[0]))
DEBUG:root:loss:4.876643657684326 pr_loss:0.9362046718597412 m_loss:3.940438985824585
DEBUG:root:loss:4.894651889801025 pr_loss:1.3755793571472168 m_loss:3.5190725326538086
DEBUG:root:loss:5.822731971740723 pr_loss:2.3301122188568115 m_loss:3.4926199913024902
DEBUG:root:loss:5.248071193695068 pr_loss:0.7262328267097473 m_loss:4.521838188171387
DEBUG:root:loss:6.336025238037109 pr_loss:2.1582326889038086 m_loss:4.177792549133301
DEBUG:root:loss:5.730924606323242 pr_loss:1.7581840753555298 m_loss:3.972740650177002
DEBUG:root:loss:6.293116092681885 pr_loss:2.5237410068511963 m_loss:3.7693750858306885
DEBUG:root:loss:5.5927534103393555 pr_loss:2.57609224319458 m_loss:3.0166609287261963
DEBUG:root:loss:6.472774982452393 pr_loss:2.7286953926086426 m_loss:3.74407958984375
DEBUG:root:loss:5.77323579788208 pr_loss:2.192009687423706 m_loss:3.581226110458374
DEBUG:root:loss:4.7413740158081055 pr_loss:0.7590047717094421 m_loss:3.9823691844940186
DEBUG:root:loss:5.294699192047119 pr_loss:1.5785752534866333 m_loss:3.7161238193511963
DEBUG:root:loss:4.766632556915283 pr_loss:1.44969642162323 m_loss:3.3169360160827637
DEBUG:root:loss:6.531876564025879 pr_loss:2.800320863723755 m_loss:3.731555461883545
DEBUG:root:loss:4.416685104370117 pr_loss:0.7400531768798828 m_loss:3.6766321659088135
DEBUG:root:loss:5.182220458984375 pr_loss:1.5286011695861816 m_loss:3.6536190509796143
DEBUG:root:loss:5.1157097816467285 pr_loss:0.5409567952156067 m_loss:4.5747528076171875
DEBUG:root:loss:5.2725019454956055 pr_loss:1.1703280210494995 m_loss:4.102173805236816
DEBUG:root:loss:5.290111541748047 pr_loss:1.7981904745101929 m_loss:3.4919209480285645
DEBUG:root:loss:4.592691421508789 pr_loss:1.4960440397262573 m_loss:3.096647262573242
DEBUG:root:loss:5.721318244934082 pr_loss:2.510892152786255 m_loss:3.2104263305664062
DEBUG:root:loss:5.942198753356934 pr_loss:2.6059837341308594 m_loss:3.336214780807495
DEBUG:root:loss:5.035677433013916 pr_loss:0.8081327676773071 m_loss:4.227544784545898
DEBUG:root:loss:5.868890762329102 pr_loss:1.861429214477539 m_loss:4.0074615478515625
DEBUG:root:loss:5.340059280395508 pr_loss:1.6080604791641235 m_loss:3.731998920440674
DEBUG:root:loss:5.04338264465332 pr_loss:1.624698281288147 m_loss:3.418684482574463
DEBUG:root:loss:6.647482872009277 pr_loss:2.955204486846924 m_loss:3.6922783851623535
result preview...
DEBUG:root:bucket 4 instance 55
DEBUG:root:bucket 5 instance 26
DEBUG:root:bucket 2 instance 13
DEBUG:root:bucket 3 instance 30
DEBUG:root:bucket 6 instance 11
DEBUG:root:bucket 7 instance 1
Traceback (most recent call last):
File "model.py", line 367, in
main()
File "model.py", line 354, in main
m.train()
File "model.py", line 149, in train
valid_sup_loss, valid_unsup_loss = self.validate()
File "model.py", line 216, in validate
self.eval()
File "model.py", line 181, in eval
self.reader.wrap_result(turn_batch, m_idx, z_idx, prev_z=prev_z)
File "/hiworld/myproject/sequicity-master/reader.py", line 242, in wrap_result
entry['generated_response'] = self.vocab.sentence_decode(gen_m[i], eos='EOS_M')
File "/hiworld/myproject/sequicity-master/reader.py", line 112, in sentence_decode
l = [self.decode() for _ in index_list]
File "/hiworld/myproject/sequicity-master/reader.py", line 112, in
l = [self.decode(
) for _ in index_list]
File "/hiworld/myproject/sequicity-master/reader.py", line 130, in decode
result=self._idx2item[idx]
KeyError: tensor(21)

It seems that the index in line 112 is sometimes an int and sometimes a tensor. What is wrong?
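For context: in PyTorch 0.4, indexing can yield 0-dim tensors rather than Python ints, and a tensor hashes differently from the int it wraps, so dict lookups like _idx2item[idx] fail. A torch-free sketch of the failure mode and the tensor.item()-style fix (the Scalar class below is a stand-in for a 0-dim tensor, not a real torch type):

```python
class Scalar:
    """Stand-in for a 0-dim tensor: wraps a number but is not hash-equal to it."""
    def __init__(self, value):
        self.value = value

    def item(self):
        # Mirrors torch.Tensor.item(): unwrap to a plain Python number.
        return self.value

idx2item = {21: 'hello'}
idx = Scalar(21)

# Looking up the wrapper directly fails: Scalar(21) hashes differently
# from the int 21, just as _idx2item[tensor(21)] fails under PyTorch 0.4.
assert idx not in idx2item

# Unwrapping first (like calling tensor.item()) makes the lookup succeed.
assert idx2item[idx.item()] == 'hello'
```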

Typo in metric.py

When I run the code, I encountered error like this:
KeyError: 'generated_bpsan'

Then I dove into the code and found that
in model.py and other files it is 'bspan',
however, in metric.py it is 'bpsan'.

After replacing all 'bpsan' with 'bspan', it runs as expected.

Poor performance of the trained model

Hi,

I ran the test script with the released trained model

python model.py -mode test -model [tsdf-camrest|tsdf-kvret]

and get the following results:

bleu_metric bleu	0.06684825751851149
match_metric match	(0.8059701491935842, 0.0)
success_f1_metric success	0.8063241056772681

The results seem to be much poorer than those listed in the paper. Is something wrong with the trained model, or did I make a mistake?

Another problem is that I ran the test script several times but got one of the following two results. Shouldn't it return the same result every time?

bleu_metric bleu	0.0667021664320532
match_metric match	(0.8059701491935842, 0.0)
success_f1_metric success	0.8039603910437605

bleu_metric bleu	0.06684825751851149
match_metric match	(0.8059701491935842, 0.0)
success_f1_metric success	0.8063241056772681

Why does function _extract_request() in metric.py use EOS_Z1 as end mark instead of EOS_Z2?

As processed in _get_tokenized_data() in reader.py, requested bspans end with 'EOS_Z2' and are appended to the informable bspans in _encode_data(). That means the generated_bspan ends with EOS_Z2.
However, in metric.py, the code in the _extract_request() function uses 'EOS_Z1' to truncate the generated bspan (e.g., in line 168).
Why? I am quite confused.

Data Processing Is Wrong

I find the kvret processing is wrong: all of the requests are left empty in the function, so the db_degree will always be 0.

Unlexicalized results

Hi guys,

is there a way to generate responses with the real entities instead of the delexicalized versions?

Thanks in advance

Andrea

KeyError: 'generated_latent'

Something went wrong in the function run_metrics() when I tried to train; it seems that the dict turn has no key 'generated_latent'.

+++++++++++++++++++++
Traceback (most recent call last):
File "model.py", line 368, in
main()
File "model.py", line 355, in main
m.train()
File "model.py", line 151, in train
valid_sup_loss, valid_unsup_loss = self.validate()
File "model.py", line 217, in validate
self.eval()
File "model.py", line 186, in eval
res = ev.run_metrics()
File "D:\AllCode\PyCode\NLP\Dialogue\sequicity\metric.py", line 219, in run_metrics
match = self.match_metric(data, 'match', raw_data=raw_data)
File "D:\AllCode\PyCode\NLP\Dialogue\sequicity\metric.py", line 111, in wrapper
res = func(*args, **kwargs)
File "D:\AllCode\PyCode\NLP\Dialogue\sequicity\metric.py", line 264, in match_metric
gen_latent = turn['generated_latent']
KeyError: 'generated_latent'

+++++++++++++++++++++
I have printed the turn, it looks like this:

OrderedDict([('dial_id', '545'), ('turn_num', '0'), ('user', 'i am looking for a restaurant in the west part of town .'), ('generated_bspan',
'west west EOS_Z1'), ('bspan', 'west EOS_Z1'), ('generated_response', ' name_SLOT is . '), ('response', ' there are 14 restaurants
in the area_SLOT . are you looking for a particular cuisine ? '), ('u_len', '14'), ('m_len', '17'), ('supervised', 'True')])
+++++++++++++++++++++
Thanks for your help!

Something about the data in kvret

Thanks for your open-source code.
I find it very strange that the second turn's input to the model is the agent's response from the previous turn, which confuses me.
Can you show me more information?
Thank you very much !

Something wrong while testing kvret; how to solve it?

Traceback (most recent call last):
File "model.py", line 366, in
main()
File "model.py", line 359, in main
m.eval()
File "model.py", line 184, in eval
res = ev.run_metrics()
File "/data/jquan/codes/sequicity-master/metric.py", line 332, in run_metrics
match_rate = self.match_rate_metric(data, 'match')
File "/data/jquan/codes/sequicity-master/metric.py", line 111, in wrapper
res = func(*args, **kwargs)
File "/data/jquan/codes/sequicity-master/metric.py", line 419, in match_rate_metric
bspan_data = pickle.load(open(bspans,'rb'))
FileNotFoundError: [Errno 2] No such file or directory: './data/kvret/test.bspan.pkl'

A Serious Problem In Your Model

There is a serious problem here. In your model, the generated reply is used as the input for the next turn. When testing, you should use the generated response, but you directly use the gold reply from the data. This is wrong.

two questions about reader.py

  1. In reader.py line 671, dial_turn['scenario']['kb']['items'] should be raw_dial['scenario']['kb']['items'], right?
  2. In some dialogues (e.g., lines 448 to 497 in data/kvret/kvret_train_public.json), raw_dial['scenario']['kb']['items'] is null, which causes errors in the method KvReader.db_degree.
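A defensive guard for the null items case described in point 2 might look like the sketch below (kb_items is a hypothetical helper, not the project's actual fix):

```python
def kb_items(raw_dial):
    """Return the KB items for a kvret dialogue, tolerating a null 'items'.

    Some dialogues in kvret_train_public.json carry no KB ('items' is null);
    returning [] instead of raising lets db-degree computation proceed.
    """
    kb = raw_dial.get('scenario', {}).get('kb') or {}
    return kb.get('items') or []

print(kb_items({'scenario': {'kb': {'items': None}}}))  # []
```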
