intent_detection_and_slot_filling's Introduction

intent detection and slot filling

A joint model for intent detection and slot filling in conversational systems.

Data

  • The data comes from ATIS, an English-language airline ticket booking dataset (in the atis directory).
  • Mixed-precision training is done with apex.

Model

Training for longer and tuning the hyperparameters may yield higher accuracy.

model1

model2

model3

model4

model5

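This model is my own improvement on model4. The changes are:
    1. Only the Encoder part of model4 is used.
    2. Convolutions with several kernel sizes are added to capture more features, and their outputs are concatenated.
    3. Multi-head self-attention is applied after the embedding layer.
    4. Finally, the convolutional features and the self-attention features are concatenated.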
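
The four changes above can be sketched roughly as follows. This is a minimal illustration, not the repository's actual model5 code: the class names `ConvAttentionEncoder` and `JointModel`, all layer sizes, the kernel sizes (1, 3, 5), and the mean pooling used for the intent head are assumptions made for the example.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class ConvAttentionEncoder(nn.Module):
    """Hypothetical encoder: multi-size convolutions plus self-attention."""

    def __init__(self, vocab_size, emb_dim=64, n_heads=4,
                 kernel_sizes=(1, 3, 5), n_filters=32):
        super().__init__()
        self.embedding = nn.Embedding(vocab_size, emb_dim)
        # Multi-head self-attention applied right after the embedding layer.
        self.self_attn = nn.MultiheadAttention(emb_dim, n_heads)
        # One 1-D convolution per kernel size. Odd sizes with k // 2 padding
        # keep the sequence length unchanged.
        self.convs = nn.ModuleList(
            [nn.Conv1d(emb_dim, n_filters, k, padding=k // 2) for k in kernel_sizes]
        )
        self.out_dim = emb_dim + n_filters * len(kernel_sizes)

    def forward(self, tokens):
        emb = self.embedding(tokens)                # [batch, seq_len, emb_dim]
        # nn.MultiheadAttention expects [seq_len, batch, emb_dim] by default.
        x = emb.transpose(0, 1)
        attn_out, _ = self.self_attn(x, x, x)
        attn_out = attn_out.transpose(0, 1)         # [batch, seq_len, emb_dim]
        # nn.Conv1d expects [batch, channels, seq_len].
        conv_in = emb.transpose(1, 2)
        conv_outs = [F.relu(conv(conv_in)).transpose(1, 2) for conv in self.convs]
        # Concatenate all convolutional features with the attention features.
        return torch.cat(conv_outs + [attn_out], dim=-1)


class JointModel(nn.Module):
    """Hypothetical joint heads: one intent per sentence, one slot per token."""

    def __init__(self, vocab_size, n_intents, n_slots):
        super().__init__()
        self.encoder = ConvAttentionEncoder(vocab_size)
        self.intent_head = nn.Linear(self.encoder.out_dim, n_intents)
        self.slot_head = nn.Linear(self.encoder.out_dim, n_slots)

    def forward(self, tokens):
        feats = self.encoder(tokens)                # [batch, seq_len, out_dim]
        # Assumption: mean-pool over tokens to get a sentence vector.
        intent_logits = self.intent_head(feats.mean(dim=1))
        slot_logits = self.slot_head(feats)
        return intent_logits, slot_logits


model = JointModel(vocab_size=100, n_intents=5, n_slots=9)
tokens = torch.randint(0, 100, (2, 7))              # 2 sentences, 7 tokens each
intent_logits, slot_logits = model(tokens)
print(intent_logits.shape, slot_logits.shape)
```

Sharing one encoder between the two heads is what makes the model "joint": the intent loss and the slot loss can be summed and backpropagated through the same features.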

model6

    note: BERT is used for intent detection and slot filling

Note

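Apex can be added to speed up training. Problems encountered when using Apex:

The loss becomes larger overall and very unstable, results get worse, and gradient overflow occurs.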
Gradient overflow.  Skipping step, loss scaler 0 reducing loss scale to 32768.0
Gradient overflow.  Skipping step, loss scaler 0 reducing loss scale to 16384.0
Gradient overflow.  Skipping step, loss scaler 0 reducing loss scale to 8192.0
Gradient overflow.  Skipping step, loss scaler 0 reducing loss scale to 4096.0
Gradient overflow.  Skipping step, loss scaler 0 reducing loss scale to 2048.0
...
ZeroDivisionError: float division by zero
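
The halving pattern in the log above is what a dynamic loss scaler does on overflow. The toy class below is a sketch of that mechanism in plain Python, not apex's implementation: the name `DynamicLossScaler`, the initial scale of 2**16, and the growth interval of 2000 are assumptions chosen so that the first reductions match the log.

```python
class DynamicLossScaler:
    """Toy dynamic loss scaler: halve on overflow, grow after clean steps."""

    def __init__(self, init_scale=2.0 ** 16, backoff=0.5,
                 growth=2.0, growth_interval=2000):
        self.scale = init_scale
        self.backoff = backoff
        self.growth = growth
        self.growth_interval = growth_interval
        self.clean_steps = 0

    def update(self, overflow):
        """Return True if the optimizer step should run, False if skipped."""
        if overflow:
            # inf/NaN gradients: skip this step and shrink the scale.
            self.scale *= self.backoff
            self.clean_steps = 0
            return False
        self.clean_steps += 1
        if self.clean_steps % self.growth_interval == 0:
            # A long run without overflow: try a larger scale again.
            self.scale *= self.growth
        return True

    def unscale(self, grad):
        # Gradients of the scaled loss are divided by the scale before use.
        return grad / self.scale


scaler = DynamicLossScaler()
scales = []
for _ in range(5):
    scaler.update(overflow=True)
    scales.append(scaler.scale)
print(scales)  # [32768.0, 16384.0, 8192.0, 4096.0, 2048.0]
```

If every step overflows (for example because the loss itself is already NaN), the scale in this sketch keeps halving until it underflows to 0.0, and the division in `unscale` then raises ZeroDivisionError. That is one plausible reading of the traceback above, not a confirmed diagnosis.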

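The following steps help prevent gradient overflow:

1. In amp.initialize(model, optimizer, opt_level='O0'), change opt_level from O2 to O1. If that still fails, switch to O0 (the letter O followed by zero).
2. Reducing the batch size from 32 to 16 noticeably mitigates the problem. Switching to O0 can also cause out-of-memory errors, and a smaller batch size helps there too.
3. Lower the learning rate.
4. Adding ReLU activations helps preserve gradients and guards against vanishing gradients.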
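
The first remedy can be sketched as below. This assumes NVIDIA apex and a CUDA GPU are available. `amp.initialize` and `amp.scale_loss` are apex calls, while `next_opt_level`, `init_amp`, and `backward_step` are hypothetical helper names introduced for this example only.

```python
# Ordered from most aggressive mixed precision to plain FP32.
OPT_LEVELS = ["O2", "O1", "O0"]


def next_opt_level(current):
    """Return the next, more conservative opt_level, or None after O0."""
    i = OPT_LEVELS.index(current)
    return OPT_LEVELS[i + 1] if i + 1 < len(OPT_LEVELS) else None


def init_amp(model, optimizer, opt_level="O1"):
    """Wrap model and optimizer with apex amp at the given opt_level."""
    from apex import amp  # imported lazily: needs apex and a CUDA build
    return amp.initialize(model, optimizer, opt_level=opt_level)


def backward_step(loss, optimizer):
    """Backpropagate through the scaled loss, then take an optimizer step."""
    from apex import amp
    with amp.scale_loss(loss, optimizer) as scaled_loss:
        scaled_loss.backward()
    optimizer.step()
    optimizer.zero_grad()


print(next_opt_level("O2"))  # O1
```

A run that keeps overflowing at O2 would be restarted with `next_opt_level("O2")`, optionally combined with a smaller batch size and learning rate as in remedies 2 and 3.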

Requirements

  • GPU & CUDA
  • Python 3.6.5
  • PyTorch 1.5
  • torchtext 0.6
  • apex 0.1

References

Based on the following implementations

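Contact

For questions or collaboration on search, recommendation, NLP, or big data mining, you can reach me here:

1. My GitHub projects: https://github.com/jiangnanboy

2. My technical blog on cnblogs: https://www.cnblogs.com/little-horse/

3. My QQ number: 2229029156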

intent_detection_and_slot_filling's Issues

What does pf_dim mean?

Both Encoder and EncoderLayer take a pf_dim parameter. What does this parameter mean?
Does it represent the dimension of the linear layer?
