osmanio2 / multi-domain-belief-tracking


59.0 59.0 19.0 21 KB

Implementation of the model proposed in the paper "Large-Scale Multi-Domain Belief Tracking with Knowledge Sharing".

Python 100.00%


multi-domain-belief-tracking's Issues

How can I train and run the models on the WOZ2.0 dataset?

Confused about Figure 1 in the paper "Large-Scale Multi-Domain Belief Tracking with Knowledge Sharing"

I don't understand why multi-domain belief state tracking needs both the system text and the user text as input. In a dialogue system, the user text is something that needs to be generated and does not exist beforehand, and I think the NLG component usually comes after the belief tracking component. Could you explain this?

Some questions about the results in your paper

I ran your code and got the following result.

The first thing I want to confirm is which value in your output the accuracy in this table refers to. (Is it domain?)

The second thing is which value in your output the overall accuracy in your benchmarks refers to. (Also domain?)

SSNG and SMUL condition.

multi-domain-belief-tracking/preprocess.py

Line 71 in 1e9ec9a

if 'SSNG' not in filename and 'SMUL' not in filename:

I'm not able to understand the significance of this condition. There are 388 dialogs with SSNG in the filename and 0 with SMUL. Why are they not being included for belief tracking data?

@osmanio2 Any help would be appreciated.
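For reference, the quoted condition keeps a file only when neither substring appears in its name. Below is a minimal sketch of that filter. The helper name `is_kept` and the filenames are made up for illustration and are not taken from preprocess.py or the dataset.

```python
def is_kept(filename):
    """Mirror the quoted condition: keep a file whose name has neither marker."""
    return 'SSNG' not in filename and 'SMUL' not in filename


# Hypothetical filenames, for illustration only.
filenames = ['SNG0001.json', 'MUL0002.json', 'SSNG0003.json', 'SMUL0004.json']

kept = [f for f in filenames if is_kept(f)]
skipped = [f for f in filenames if not is_kept(f)]

print('kept:', kept)
print('skipped:', skipped)
```

Under this condition, any file with SSNG or SMUL in its name lands in `skipped`, which matches the observation that the 388 SSNG dialogs are left out.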

How to get a higher Joint acc?

This is what I got from the trained model:

The overall accuracies for domain is 0.9062543690625642, slot 0.9252206910510598, value 0.8295364510353754, f1_score 0.728794986556926, precision 0.7923424326326965, recall 0.6951882475770927, joint accuracy 0.13113459026678015

The joint accuracy is very low.
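One reason the gap between slot accuracy and joint accuracy can be large: under the usual definition, a turn counts toward joint accuracy only if every slot in that turn is predicted correctly, so a single wrong slot makes the whole turn wrong. The sketch below illustrates this on toy data. It assumes that definition. Whether main.py computes its metrics exactly this way is an assumption, and the function names and slot values are hypothetical.

```python
def slot_accuracy(gold_turns, pred_turns):
    """Fraction of individual slots predicted correctly, over all turns."""
    correct = 0
    total = 0
    for gold, pred in zip(gold_turns, pred_turns):
        for slot, value in gold.items():
            correct += int(pred.get(slot) == value)
            total += 1
    return correct / total


def joint_accuracy(gold_turns, pred_turns):
    """Fraction of turns in which every slot is predicted correctly."""
    hits = sum(int(gold == pred) for gold, pred in zip(gold_turns, pred_turns))
    return hits / len(gold_turns)


# Toy data (hypothetical): four turns, three slots each.
gold = [{'area': 'north', 'food': 'thai', 'price': 'cheap'} for _ in range(4)]
pred = [
    {'area': 'north', 'food': 'thai', 'price': 'cheap'},     # all correct
    {'area': 'south', 'food': 'thai', 'price': 'cheap'},     # area wrong
    {'area': 'north', 'food': 'indian', 'price': 'cheap'},   # food wrong
    {'area': 'north', 'food': 'thai', 'price': 'expensive'}, # price wrong
]

print(slot_accuracy(gold, pred))   # 9 of 12 slots correct
print(joint_accuracy(gold, pred))  # 1 of 4 turns fully correct
```

With many slots per turn, even a high per-slot accuracy leaves few turns fully correct, so a low joint accuracy alongside reasonable slot accuracy is not by itself evidence of a bug.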

The system response is future information in this dataset.

I think defining a "turn" as <current user utterance, following system response> is not right in this project. Dialogues in this dataset always start with a user utterance, so the following system response is not dialogue history for the current state but a kind of future information.

The right setup would be to use the previous system response, padding with an empty system response for the first user utterance.
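The proposed pairing can be sketched as follows. This is an illustration only, not code from the repository: the function name `pair_with_previous_system` and the sample utterances are hypothetical, and it assumes the dialogue is available as two parallel lists of user and system utterances.

```python
def pair_with_previous_system(user_utts, system_utts, pad=''):
    """Pair each user utterance with the system response that preceded it.

    The first user utterance has no preceding system response, so it is
    paired with `pad` (an empty string by default).
    """
    previous = [pad] + list(system_utts)
    # zip stops at the shorter sequence, so the final system response,
    # which follows the last user utterance, is dropped.
    return list(zip(previous, user_utts))


# Hypothetical dialogue that starts with a user utterance.
users = ['i need a cheap hotel', 'in the north please', 'book it for two nights']
systems = ['which area do you prefer?', 'i found one in the north.', 'booked.']

for system_text, user_text in pair_with_previous_system(users, systems):
    print(repr(system_text), '->', repr(user_text))
```

Each pair then contains only information that was available when the user spoke.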

The joint acc in test-set is very close to zero

Hi,

I always get a very low joint accuracy on the test set using your shared data and code; however, the domain and slot accuracies seem normal.

My training command is:

CUDA_VISIBLE_DEVICES=5 python main.py train --net_type=gru --batch_size=32

and my test command is:

CUDA_VISIBLE_DEVICES=5 python main.py test --net_type=gru --batch_size=1

Could you please give me some advice on model debugging?

Thanks.
