Deep Q&A

Presentation

This work tries to reproduce the results of A Neural Conversational Model (aka the Google chatbot). It uses an RNN (seq2seq model) for sentence prediction and is implemented in Python with TensorFlow.

The program is inspired by the Torch neuralconvo from macournoyer, at least for the corpus-loading part.

For now, it uses the Cornell Movie Dialogs corpus, but one of the long-term goals is to test it on bigger corpora.

Installation

The program requires the following dependencies (easy to install using pip; see the example commands after this list):

  • python 3
  • tensorflow (tested with v0.9.0)
  • numpy
  • CUDA (for using the GPU; see the TensorFlow installation page for more details)
  • nltk (natural language toolkit, used to tokenize the sentences)
  • tqdm (for the nice progress bars)
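
For example, a minimal install could look like the following (exact TensorFlow install steps depend on your platform and version, so follow the TensorFlow installation page if a plain pip install doesn't match the tested v0.9.0; the nltk.downloader step fetches the punkt tokenizer data that nltk's tokenizer needs, assuming it isn't already on your system):

pip3 install numpy nltk tqdm
python3 -m nltk.downloader punkt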

The Cornell dataset is already included.

The web interface requires some additional packages (see the example command after this list):

  • django (tested with 1.10)
  • channels
  • Redis (see here)
  • asgi_redis
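
For example (the pinned version is an assumption based on the list above; Redis itself is a system service, not a pip package):

pip3 install django==1.10 channels asgi_redis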

Running

Chatbot

To train the model, simply run main.py. Once trained, you can test the results with main.py --test (results generated in 'save/model/samples_predictions.txt') or main.py --test interactive (more fun).
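
In other words, the basic commands are (taken directly from the options above):

python main.py
python main.py --test
python main.py --test interactive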

A small script (trainner.py) exists to launch multiple trainings with different parameters, but it's not complete yet.

To visualize the computational graph and the cost with TensorBoard, just run tensorboard --logdir save/. For more help and options, use python main.py -h.

By default, the network architecture is a standard encoder/decoder with two LSTM layers (hidden size of 256) and a vocabulary embedding size of 32. The network is trained with ADAM.
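
For reference, here is a minimal sketch of a comparable encoder/decoder in TensorFlow. This is not the project's actual code: vocab_size and max_length are placeholder values, and the legacy tf.nn.seq2seq / tf.nn.rnn_cell names may differ in your TensorFlow version.

import tensorflow as tf

vocab_size = 20000   # assumption: size of the word-to-id dictionary
max_length = 10      # maximum sentence length, in tokens

# Two stacked LSTM layers with a hidden size of 256, as described above.
cell = tf.nn.rnn_cell.MultiRNNCell(
    [tf.nn.rnn_cell.BasicLSTMCell(256) for _ in range(2)])

# One int32 placeholder of token ids per time step.
encoder_inputs = [tf.placeholder(tf.int32, [None]) for _ in range(max_length)]
decoder_inputs = [tf.placeholder(tf.int32, [None]) for _ in range(max_length)]

# Embedding seq2seq: token ids are embedded (size 32) before entering the RNN;
# the decoder returns one logit vector over the vocabulary per time step.
outputs, state = tf.nn.seq2seq.embedding_rnn_seq2seq(
    encoder_inputs, decoder_inputs, cell,
    num_encoder_symbols=vocab_size,
    num_decoder_symbols=vocab_size,
    embedding_size=32)

# Training then minimizes a cross-entropy sequence loss over `outputs`
# with tf.train.AdamOptimizer, as mentioned above.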

Web interface

Once the model is trained, it's possible to chat with it using a more user-friendly web interface. The server will look for the model at save/model-server/model.ckpt. The first time you want to use it, you'll need to configure it with:

cd chatbot_website/
python3 manage.py makemigrations
python3 manage.py migrate

Then, to launch the server locally, use the following commands:

cd chatbot_website/
redis-server &
python3 manage.py runserver

Results

To quickly get some results, the best way is to reduce the sentence length to 5. The network seems to learn correctly, but it's not really chatty:

Q: Hi
A: Hi.

Q: What is your name ?
A: Laura.

Q: What does that mean ?
A: I dunno.

Q: How old are you ?
A: thirty-five.

Q: Will Google hire me ?
A: No.

Q: Tell me the alphabet
A: Fuck you.

Q: That's not nice
A: Yeah.

Q: Do you love me ?
A: Of course.

Here are some cases where it fails:

Q: Who is Laura ?
A: My brother.

Q: Say 'goodbye'
A: Alright.

Q: What is cooking ?
A: A channel.

Q: Can you say no ?
A: No.

Q: Two plus two
A: manny...

To reproduce the results, first precompute the dataset with:

python main.py --createDataset --maxLength 5 --playDataset

The last argument will display some samples from the training set, so you can get an idea of what it looks like. Then you can train the model with:

python main.py --maxLength 5 --modelTag first-try --numEpochs 10

The --modelTag option is optional (it's just to avoid conflicts between different models). You can also use the --keepAll option if, during testing, you want to see the predictions at different steps (it's interesting to see the program change its name and age as the training progresses). Finally, show the results with one of the two commands:

python main.py --test --verbose --modelTag first-try
python main.py --test interactive --modelTag first-try

The results above are shown after 20000 iterations. With my state-of-the-art GeForce GT 740M, it took about 2 h to train the model with the previous arguments, which is still relatively fast (since then, I made some code modifications, so it should currently take longer; to reproduce the original results, use this version).

With longer sentences, the network is much slower to train. After 150000 iterations with maxLength=10, the network still mainly gives garbage answers (basically answering all open-ended questions with 'I don't know'). Among those, we can still find some interesting (or creepy) ones:

Q: What is the purpose of being intelligent ?
A: The man are obsolete.

Q: What is immoral ?
A: The truth.

Q: What is morality ?
A: The worst.

Q: How many legs does a cat have ?
A: Four.

Q: How many legs does a spider have ?
A: Four.

Q: I really like our discussion on morality and ethics...
A: You're not going anywhere.

Q: !
A: I'm not going to die !

Still not perfect but it seems to go in the right direction.
