Permuted Dialog bAbI tasks data
-----------------------------------------------------------------------
Adaptation of the "Dialog bAbI tasks data" dataset released by Facebook, available at https://research.fb.com/downloads/babi/, under the CC BY 3.0 Unported license, available at https://creativecommons.org/licenses/by/3.0/legalcode.

This directory contains our proposed testbed for evaluating end-to-end dialog systems in the restaurant domain as described in the paper "Learning End-to-End Goal-Oriented Dialog with Multiple Answers" by Janarthanan Rajendran*, Jatin Ganhotra*, Satinder Singh and Lazaros Polymenakos (https://arxiv.org/abs/1808.09996), accepted at EMNLP 2018.
(*Equal Contribution)


Permuted-Slots-And-Restaurants
==========================================
This directory contains datasets in which the slot values have been permuted and multiple restaurants share the same rating.
The complete set of dialogs, generated from all possible permutations of slot values and a random permutation of restaurant options, is in the directory -
- all-permutations-dialog-bAbI-tasks/
The statistics for the total number of dialogs after permuting slots and restaurant options are given in -
- all-permutations-dialog-bAbI-tasks/info.txt

From the complete set of permuted dialogs, a random subset of 1000 dialogs is taken from each of the train, val, test and test-OOV sets.
These random subsets of 1000 dialogs are in the directory -
- permuted-dialog-bAbI-tasks/
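
The subsampling step described above could be reproduced along the following lines. This is only a sketch: the function name `sample_dialogs`, the default seed, and the choice of Python's `random` module are assumptions made for illustration, not the procedure actually used to build the released subsets.

```python
import random
from typing import List, Sequence, TypeVar

T = TypeVar("T")


def sample_dialogs(dialogs: Sequence[T], k: int = 1000, seed: int = 0) -> List[T]:
    """Return k dialogs drawn uniformly without replacement.

    The seed is a hypothetical default; the seed used for the released
    subsets is not documented in this README.
    """
    if k >= len(dialogs):
        # Fewer dialogs than requested: keep them all, in original order.
        return list(dialogs)
    rng = random.Random(seed)
    return rng.sample(list(dialogs), k)
```

The same call would be applied separately to each of the train, val, test and test-OOV dialog lists.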


Data format
==========================================
The file format for each task is similar to that of the original bAbI dialog tasks, with the following change -
ID user_utterance [tab] bot_utterance_1|{bot_utterance_2}|{bot_utterance_3}|{bot_utterance_4}
where, for a given user_utterance, there can be multiple correct bot utterances separated by "|": at least 1 and at most 4.

Similar to the original bAbI dialog tasks -
a) The IDs for a given dialog start at 1 and increase. When the IDs in a file reset back to 1, you can consider the following sentences as a new dialog.
b) When the bot speaks two times in a row, we used the special token "<SILENCE>" to fill in for the missing user utterance.
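
A minimal reader for this format might look like the sketch below. The names `parse_dialogs` and `Turn` are hypothetical, and the handling of lines without a tab (kept as turns with no bot utterances, as with knowledge-base lines in the original bAbI dialog tasks) is an assumption rather than part of this dataset's specification.

```python
from typing import List, Tuple

# (user_utterance, list of acceptable bot utterances)
Turn = Tuple[str, List[str]]


def parse_dialogs(text: str) -> List[List[Turn]]:
    """Split file contents into dialogs, keeping every valid bot utterance."""
    dialogs: List[List[Turn]] = []
    current: List[Turn] = []
    for line in text.splitlines():
        if not line.strip():
            continue  # blank lines carry no turn
        turn_id, _, rest = line.partition(" ")
        # An ID that resets to 1 marks the start of a new dialog.
        if int(turn_id) == 1 and current:
            dialogs.append(current)
            current = []
        user, _, bot = rest.partition("\t")
        # Multiple correct bot utterances are separated by "|".
        candidates = bot.split("|") if bot else []
        current.append((user, candidates))
    if current:
        dialogs.append(current)
    return dialogs
```

Given the contents of a task file as a string, `parse_dialogs` returns one list of turns per dialog; each turn pairs the user utterance (possibly the "<SILENCE>" token) with between 1 and 4 candidate bot utterances.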


For example (for task 1 utterances subset, full dialog below):
-------------------------------------------
1 hi	hello what can i help you with today
2 i'd like to book a table	i'm on it
3 <SILENCE>	where should it be|how many people would be in your party|which price range are looking for|any preference on a type of cuisine
4 madrid please	how many people would be in your party|which price range are looking for|any preference on a type of cuisine
5 for two people please	which price range are looking for|any preference on a type of cuisine
6 i am looking for a moderate restaurant	any preference on a type of cuisine
7 with italian cuisine	ok let me look into some options for you
8 <SILENCE>	api_call italian madrid two moderate

In the above example,
for turn ID=3, there are 4 possible bot_utterances -
where should it be|how many people would be in your party|which price range are looking for|any preference on a type of cuisine
for turn ID=4, there are 3 possible bot_utterances -
how many people would be in your party|which price range are looking for|any preference on a type of cuisine
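
The way the set of valid bot utterances shrinks as slots are filled can be illustrated with the sketch below. The slot names (`location`, `number`, `price`, `cuisine`) and the function `valid_next_questions` are hypothetical and chosen only for this illustration; this is not the code used to generate the dataset.

```python
from typing import Dict, Iterable, List

# Questions copied verbatim from the example dialog; the slot names are assumed.
SLOT_QUESTIONS: Dict[str, str] = {
    "location": "where should it be",
    "number": "how many people would be in your party",
    "price": "which price range are looking for",
    "cuisine": "any preference on a type of cuisine",
}


def valid_next_questions(filled: Iterable[str]) -> List[str]:
    """Return the questions for every slot the user has not yet provided."""
    filled_set = set(filled)
    return [q for slot, q in SLOT_QUESTIONS.items() if slot not in filled_set]


# Turn ID=3: nothing filled yet, so all 4 questions are valid.
print("|".join(valid_next_questions([])))
# Turn ID=4: the user has given the location, so 3 questions remain.
print("|".join(valid_next_questions(["location"])))
```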


For example (for task 3 utterances subset, dialog snippet below):
-------------------------------------------
28 <SILENCE>	what do you think of this option: resto_rome_expensive_french_8stars-1|what do you think of this option: resto_rome_expensive_french_8stars-2
29 do you have something else	sure let me find an other option for you
30 <SILENCE>	what do you think of this option: resto_rome_expensive_french_8stars-2
31 do you have something else	sure let me find an other option for you
32 <SILENCE>	what do you think of this option: resto_rome_expensive_french_1stars
33 that looks great	great let me do the reservation

In the above example,
for turn ID=28, there are 2 possible bot_utterances -
what do you think of this option: resto_rome_expensive_french_8stars-1|what do you think of this option: resto_rome_expensive_french_8stars-2

Similar to the original bAbI dialog tasks, the goal of the tasks is to predict the bot utterances, which can be sentences or API calls (sentences starting with the special token "api_call").
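
Because a turn can have several correct bot utterances, one natural way to score a prediction is to count it as correct when it matches any of the listed candidates. The sketch below takes that approach; it is an assumption for illustration, not a restatement of the metric reported in the paper, and the function names are hypothetical.

```python
from typing import List, Sequence


def is_api_call(utterance: str) -> bool:
    """True if the utterance starts with the special token "api_call"."""
    return utterance.split(" ", 1)[0] == "api_call"


def is_correct(prediction: str, candidates: Sequence[str]) -> bool:
    """A prediction is accepted if it equals any valid bot utterance."""
    return prediction.strip() in {c.strip() for c in candidates}


def per_turn_accuracy(predictions: Sequence[str],
                      references: Sequence[List[str]]) -> float:
    """Fraction of turns whose prediction matches one of its candidates."""
    if len(predictions) != len(references):
        raise ValueError("predictions and references must have equal length")
    if not predictions:
        return 0.0
    hits = sum(is_correct(p, refs) for p, refs in zip(predictions, references))
    return hits / len(predictions)
```

Here `references` holds, for each turn, the list of candidates obtained by splitting the bot side of the line on "|".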

For a Task 5 dialog, both of the above-mentioned changes (permuted slot questions and permuted restaurant options) are present in a single dialog.


License
==========================================
This dataset is released under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License.
A copy of this license is included with the data. You can also access the license at <http://creativecommons.org/licenses/by-nc-sa/4.0/>.


Contact
==========================================
For more details on the dataset and baselines, see the paper "Learning End-to-End Goal-Oriented Dialog with Multiple Answers" by Janarthanan Rajendran*, Jatin Ganhotra*, Satinder Singh and Lazaros Polymenakos (https://arxiv.org/abs/1808.09996).
(*Equal Contribution)

For more information, contact Jatin Ganhotra: jatinganhotra (at) us (dot) ibm (dot) com.
