slot_filling_and_intent_detection_of_slu's Introduction

Slot filling and intent detection tasks of spoken language understanding

  • Basic models for slot filling and intent detection:
    • An implementation for "focus" part of the paper "Encoder-decoder with focus-mechanism for sequence labelling based spoken language understanding".
    • An implementation of BLSTM-CRF based on jiesutd/NCRFpp
    • An implementation of joint training of slot filling and intent detection tasks (Bing Liu and Ian Lane, 2016).
  • Basic models + ELMo / BERT / XLNET
  • Tutorials on ATIS, SNIPS and MIT_Restaurant_Movie_corpus(w/o intent) datasets.

[Figure: data annotation]

Setup

About the evaluations of intent detection on ATIS and SNIPS datasets.

As the datasets show, an ATIS utterance may carry multiple intents, while a SNIPS utterance has exactly one. For example, "show me all flights and fares from denver to san francisco <=> atis_flight && atis_airfare". Therefore, a common trick is used in the training and evaluation stages of intent detection on the ATIS dataset.

NOTE: Following the paper "What is left to be understood in ATIS?", almost all work on ATIS takes the first intent as the label when training a "softmax" intent classifier. In the evaluation stage, a prediction is counted as correct if it matches any one of the utterance's intents.
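The convention in the note above can be sketched in a few lines. This is only an illustration: the function names and the toy intent lists are hypothetical, and the repository's own data loading and scoring code may be organised differently.

```python
def first_intent(gold_intents):
    """Training label: keep only the first annotated intent."""
    return gold_intents[0]


def intent_accuracy(predictions, gold_intent_sets):
    """Percentage of utterances whose predicted intent is among the gold intents."""
    correct = sum(
        1 for pred, gold in zip(predictions, gold_intent_sets) if pred in gold
    )
    return 100.0 * correct / len(predictions)


# Toy data (hypothetical): the first utterance has two gold intents.
gold = [["atis_flight", "atis_airfare"], ["atis_airline"], ["atis_flight"]]

# Labels a single-label softmax classifier would be trained on.
train_labels = [first_intent(g) for g in gold]

# Predicting "atis_airfare" for the first utterance still counts as correct.
predictions = ["atis_airfare", "atis_airline", "atis_ground_service"]
accuracy = intent_accuracy(predictions, gold)

print(train_labels)
print(round(accuracy, 2))
```

Under this scoring, the first and second predictions are correct and the third is not, so two of three utterances are counted as correct.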

TODO:

  • Add char-embeddings

Tutorials A: Slot filling and intent detection with pretrained word embeddings

  1. Pretrained word embeddings are taken from the CNN-BLSTM language model of ELMo, in which word embeddings are modelled by char-CNNs. We extract the pretrained word embeddings for the ATIS, SNIPS and MIT_Restaurant_Movie_corpus (w/o intent) datasets with:
  python3 scripts/get_ELMo_word_embedding_for_a_dataset.py \
          --in_files data/atis-2/{train,valid,test} \
          --output_word2vec local/word_embeddings/elmo_1024_cased_for_atis.txt
  python3 scripts/get_ELMo_word_embedding_for_a_dataset.py \
          --in_files data/snips/{train,valid,test} \
          --output_word2vec local/word_embeddings/elmo_1024_cased_for_snips.txt
  python3 scripts/get_ELMo_word_embedding_for_a_dataset.py \
          --in_files data/MIT_corpus/{movie_eng,movie_trivia10k13,restaurant}/{train,valid,test} \
          --output_word2vec local/word_embeddings/elmo_1024_cased_for_MIT_corpus.txt
  2. Run the training scripts, which evaluate at each epoch.
  • BLSTM model:
bash run/atis_with_pretrained_word_embeddings.sh slot_tagger
bash run/snips_with_pretrained_word_embeddings.sh slot_tagger
bash run/MIT_corpus_with_pretrained_word_embeddings.sh slot_tagger
  • BLSTM-CRF model:
bash run/atis_with_pretrained_word_embeddings.sh slot_tagger_with_crf
bash run/snips_with_pretrained_word_embeddings.sh slot_tagger_with_crf
bash run/MIT_corpus_with_pretrained_word_embeddings.sh slot_tagger_with_crf
  • Enc-dec focus model (BLSTM-LSTM), the same as Encoder-Decoder NN (with aligned inputs) (Liu and Lane, 2016):
bash run/atis_with_pretrained_word_embeddings.sh slot_tagger_with_focus
bash run/snips_with_pretrained_word_embeddings.sh slot_tagger_with_focus
bash run/MIT_corpus_with_pretrained_word_embeddings.sh slot_tagger_with_focus
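The extraction commands in step 1 write their output through `--output_word2vec`. Assuming that flag produces the plain-text word2vec layout (one `word v1 v2 ... vN` line per word, optionally preceded by a `vocab_size dim` header), a generated file could be inspected with a sketch like the following. The loader name and the in-memory sample are hypothetical, and the real file format should be checked against the script itself.

```python
import io


def load_word2vec_text(handle):
    """Parse 'word v1 v2 ... vN' lines into a dict of float lists.

    An optional first line of the form '<vocab_size> <dim>' is skipped.
    """
    embeddings = {}
    for line_no, line in enumerate(handle):
        parts = line.rstrip().split(" ")
        if line_no == 0 and len(parts) == 2 and all(p.isdigit() for p in parts):
            continue  # word2vec-style header, not an embedding
        if len(parts) < 2:
            continue  # blank or malformed line
        embeddings[parts[0]] = [float(x) for x in parts[1:]]
    return embeddings


# Hypothetical 3-dimensional sample standing in for a real embedding file.
sample = io.StringIO("2 3\nflights 0.1 0.2 0.3\ndenver -0.4 0.5 0.6\n")
vectors = load_word2vec_text(sample)
print(len(vectors), len(vectors["denver"]))
```

To read an actual file, pass an open handle instead of the `StringIO` sample, for example `open("local/word_embeddings/elmo_1024_cased_for_atis.txt", encoding="utf-8")`.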

Tutorials B: Slot filling and intent detection with ELMo

  1. Run the training scripts, which evaluate at each epoch.
  • ELMo + BLSTM / BLSTM-CRF / Enc-dec focus (BLSTM-LSTM) models:
slot_intent_model=slot_tagger # slot_tagger, slot_tagger_with_crf, slot_tagger_with_focus
bash run/atis_with_elmo.sh ${slot_intent_model}
bash run/snips_with_elmo.sh ${slot_intent_model}
bash run/MIT_corpus_with_elmo.sh ${slot_intent_model}

Tutorials C: Slot filling and intent detection with BERT

  1. Model architectures:

[Figure: bert_SLU_simple]

  • Our BERT + BLSTM (BLSTM-CRF\Enc-dec focus):

[Figure: bert_SLU_complex]

  2. Run the training scripts, which evaluate at each epoch.
  • Pure BERT (without or with crf) model:
slot_model=NN # NN, NN_crf
intent_input=CLS # none, CLS, max, CLS_max
bash run/atis_with_pure_bert.sh ${slot_model} ${intent_input}
bash run/snips_with_pure_bert.sh ${slot_model} ${intent_input}
bash run/MIT_corpus_with_pure_bert.sh ${slot_model} ${intent_input}
  • BERT + BLSTM / BLSTM-CRF / Enc-dec focus (BLSTM-LSTM) models:
slot_intent_model=slot_tagger # slot_tagger, slot_tagger_with_crf, slot_tagger_with_focus
bash run/atis_with_bert.sh ${slot_intent_model}
bash run/snips_with_bert.sh ${slot_intent_model}
bash run/MIT_corpus_with_bert.sh ${slot_intent_model}
  3. For the optimizer, you can try BertAdam or AdamW. In my experiments, I chose BertAdam.
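The `intent_input` values listed above (`none`, `CLS`, `max`, `CLS_max`) are not explained in this README. One plausible reading, offered only as an assumption, is that they select which sentence-level feature feeds the intent classifier: nothing, the `[CLS]` vector, a max-pool over token vectors, or the concatenation of the two. The sketch below illustrates that reading with plain Python lists; the function name is hypothetical, and whether the pooled positions include `[CLS]` in the actual code is not known.

```python
def intent_features(hidden_states, intent_input="CLS"):
    """Build a sentence-level feature from per-token vectors.

    hidden_states: list of equal-length vectors, with the [CLS] vector first.
    Assumed semantics (not confirmed by the source):
      none    -> no intent feature
      CLS     -> the [CLS] vector
      max     -> element-wise max over the non-[CLS] token vectors
      CLS_max -> concatenation of the two
    """
    cls_vector = list(hidden_states[0])
    # zip(*rows) iterates over dimensions, so max(column) pools over tokens.
    max_vector = [max(column) for column in zip(*hidden_states[1:])]
    if intent_input == "none":
        return None
    if intent_input == "CLS":
        return cls_vector
    if intent_input == "max":
        return max_vector
    if intent_input == "CLS_max":
        return cls_vector + max_vector
    raise ValueError("unknown intent_input: " + intent_input)


# Toy 2-dimensional encoder outputs: [CLS] followed by two word pieces.
hidden = [[0.1, 0.9], [0.5, -0.2], [0.3, 0.4]]
cls_max = intent_features(hidden, "CLS_max")
print(cls_max)
```

Under this reading, `CLS_max` doubles the feature width, so the intent classifier's input layer would have to be sized accordingly.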

Tutorials D: Slot filling and intent detection with XLNET

  1. Run the training scripts, which evaluate at each epoch.
  • Pure XLNET (without or with crf) model:
slot_model=NN # NN, NN_crf
intent_input=CLS # none, CLS, max, CLS_max
bash run/atis_with_pure_xlnet.sh ${slot_model} ${intent_input}
bash run/snips_with_pure_xlnet.sh ${slot_model} ${intent_input}
bash run/MIT_corpus_with_pure_xlnet.sh ${slot_model} ${intent_input}
  • XLNET + BLSTM / BLSTM-CRF / Enc-dec focus (BLSTM-LSTM) models:
slot_intent_model=slot_tagger # slot_tagger, slot_tagger_with_crf, slot_tagger_with_focus
bash run/atis_with_xlnet.sh ${slot_intent_model}
bash run/snips_with_xlnet.sh ${slot_intent_model}
bash run/MIT_corpus_with_xlnet.sh ${slot_intent_model}
  2. For the optimizer, you can try BertAdam or AdamW.

Results:

  • For "NLU + BERT/XLNET" models, hyper-parameters have not been tuned carefully.
  1. Results of ATIS:

    | models | intent Acc (%) | slot F1-score (%) |
    |---|---|---|
    | Atten. enc-dec NN with aligned inputs (Liu and Lane, 2016) | 98.43 | 95.87 |
    | Atten.-BiRNN (Liu and Lane, 2016) | 98.21 | 95.98 |
    | Enc-dec focus (Zhu and Yu, 2017) | - | 95.79 |
    | Slot-Gated (Goo et al., 2018) | 94.1 | 95.2 |
    | Intent Gating & self-attention | 98.77 | 96.52 |
    | BLSTM-CRF + ELMo | 97.42 | 95.62 |
    | Joint BERT | 97.5 | 96.1 |
    | Joint BERT + CRF | 97.9 | 96.0 |
    | BLSTM (A. Pre-train word emb.) | 98.10 | 95.67 |
    | BLSTM-CRF (A. Pre-train word emb.) | 98.54 | 95.39 |
    | Enc-dec focus (A. Pre-train word emb.) | 98.43 | 95.78 |
    | BLSTM (B. +ELMo) | 98.66 | 95.52 |
    | BLSTM-CRF (B. +ELMo) | 98.32 | 95.62 |
    | Enc-dec focus (B. +ELMo) | 98.66 | 95.70 |
    | BLSTM (C. +BERT) | 99.10 | 95.94 |
    | BLSTM (D. +XLNET) | 98.77 | 96.08 |
  2. Results of SNIPS:

  • The cased BERT-base model gives better results than the uncased model.

    | models | intent Acc (%) | slot F1-score (%) |
    |---|---|---|
    | Slot-Gated (Goo et al., 2018) | 97.0 | 88.8 |
    | BLSTM-CRF + ELMo | 99.29 | 93.90 |
    | Joint BERT | 98.6 | 97.0 |
    | Joint BERT + CRF | 98.4 | 96.7 |
    | BLSTM (A. Pre-train word emb.) | 99.14 | 95.75 |
    | BLSTM-CRF (A. Pre-train word emb.) | 99.00 | 96.92 |
    | Enc-dec focus (A. Pre-train word emb.) | 98.71 | 96.22 |
    | BLSTM (B. +ELMo) | 98.71 | 96.32 |
    | BLSTM-CRF (B. +ELMo) | 98.57 | 96.61 |
    | Enc-dec focus (B. +ELMo) | 99.14 | 96.69 |
    | BLSTM (C. +BERT) | 98.86 | 96.92 |
    | BLSTM-CRF (C. +BERT) | 98.86 | 97.00 |
    | Enc-dec focus (C. +BERT) | 98.71 | 97.17 |
    | BLSTM (D. +XLNET) | 98.86 | 97.05 |
  3. Slot F1-scores of MIT_Restaurant_Movie_corpus (w/o intent):

    | models | Restaurant | Movie_eng | Movie_trivia10k13 |
    |---|---|---|---|
    | Dom-Gen-Adv | 74.25 | 83.03 | 63.51 |
    | Joint Dom Spec & Gen-Adv | 74.47 | 85.33 | 65.33 |
    | Data Augmentation via Joint Variational Generation | 73.0 | 82.9 | 65.7 |
    | BLSTM (A. Pre-train word emb.) | 77.54 | 85.37 | 67.97 |
    | BLSTM-CRF (A. Pre-train word emb.) | 79.77 | 87.36 | 71.83 |
    | Enc-dec focus (A. Pre-train word emb.) | 78.77 | 86.68 | 70.85 |
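For readers unfamiliar with the slot F1-score reported in these results, the sketch below shows one common way to compute it: chunk-level F1 with exact span and label match, in the style of the CoNLL evaluation. It assumes BIO-style tags, and the function names and toy sequences are hypothetical; the repository's own scorer may handle edge cases differently.

```python
def extract_chunks(tags):
    """Return the set of (start, end, label) spans in a BIO tag sequence."""
    chunks = set()
    start, label = None, None
    # A trailing "O" sentinel flushes a chunk that ends at the last token.
    for i, tag in enumerate(list(tags) + ["O"]):
        begins_new = tag.startswith("B-") or (
            tag.startswith("I-") and tag[2:] != label
        )
        if tag == "O" or begins_new:
            if label is not None:
                chunks.add((start, i - 1, label))
            start, label = (i, tag[2:]) if tag != "O" else (None, None)
    return chunks


def slot_f1(gold_sequences, pred_sequences):
    """Micro-averaged chunk-level F1 (in %) over a list of sentences."""
    tp = n_gold = n_pred = 0
    for gold, pred in zip(gold_sequences, pred_sequences):
        g, p = extract_chunks(gold), extract_chunks(pred)
        tp += len(g & p)  # a chunk counts only if span and label both match
        n_gold += len(g)
        n_pred += len(p)
    if tp == 0:
        return 0.0
    precision, recall = tp / n_pred, tp / n_gold
    return 100.0 * 2 * precision * recall / (precision + recall)


# Toy example: the second slot is predicted with a truncated span.
gold = [["O", "B-fromloc.city_name", "O", "B-toloc.city_name", "I-toloc.city_name"]]
pred = [["O", "B-fromloc.city_name", "O", "B-toloc.city_name", "O"]]
f1 = slot_f1(gold, pred)
print(f1)
```

Because a truncated span earns no partial credit, one of the two gold slots is matched here, giving precision and recall of 0.5 each.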

Reference

  • Su Zhu and Kai Yu, "Encoder-decoder with focus-mechanism for sequence labelling based spoken language understanding," in IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), 2017, pp. 5675-5679.
