ELMoNER

Chinese Named Entity Recognition Based on ELMo

Model Overview

Reference: "Deep contextualized word representations"
[Figure: ELMo model diagram]

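Advantages of ELMo

  • ELMo captures complex characteristics of word use, such as syntax and semantics
  • ELMo generates word vectors dynamically, so it can model polysemy, where the same word takes different meanings in different contexts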

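For learning purposes, I built the network following the structure shown in the figure above, implementing a multi-layer bidirectional LSTM.
The model works on single characters only and cannot handle long words. During training, the generated vectors are passed directly through a single fully connected layer for entity tagging, so the learned vectors are not general-purpose; adapting them to other NLP tasks would require more elaborate training.
LSTM cell dimension (hidden_size): 200
Vocabulary size (vocab_size): 5000
Maximum length of a single sample (max_length): 128
Entity classes (entity_class): 7 in total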
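
As a rough illustration of what character-level input with these settings could look like, the sketch below builds a character vocabulary capped at vocab_size and pads or truncates each sentence to max_length. The reserved ids (0 for padding, 1 for unknown characters) and the helper names `build_vocab` and `encode` are assumptions made for this example; the original preprocessing code is not shown here.

```python
from collections import Counter

PAD_ID, UNK_ID = 0, 1   # assumed reserved ids, not taken from the original project
VOCAB_SIZE = 5000       # vocab_size from the settings above
MAX_LENGTH = 128        # max_length from the settings above


def build_vocab(sentences, vocab_size=VOCAB_SIZE):
    """Map the most frequent characters to ids, keeping 0 and 1 reserved."""
    counts = Counter(ch for sentence in sentences for ch in sentence)
    most_common = counts.most_common(vocab_size - 2)
    return {ch: i + 2 for i, (ch, _) in enumerate(most_common)}


def encode(sentence, vocab, max_length=MAX_LENGTH):
    """Turn a sentence into a fixed-length list of character ids."""
    ids = [vocab.get(ch, UNK_ID) for ch in sentence[:max_length]]
    return ids + [PAD_ID] * (max_length - len(ids))


# Tiny hypothetical corpus, only to exercise the two helpers.
vocab = build_vocab(["李克强来到江西省", "周杰伦的演唱会"])
ids = encode("李克强来到北京", vocab)
```

Characters that never appear in the corpus (here 北 and 京) fall back to the unknown id, and every encoded sentence has exactly max_length entries.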

Entity        Class name
B-PER/I-PER   Person name
B-LOC/I-LOC   Location name
B-ORG/I-ORG   Organization name
O             Non-entity
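
The B-/I- tags mark the first character of an entity and its continuation. A minimal sketch of turning per-character tags into entity spans is given below. The function name `decode_entities` is hypothetical, and so is the policy for malformed sequences (an I- tag with no open entity, or with a different type, simply starts a new span); the original project may handle these cases differently.

```python
def decode_entities(chars, tags):
    """Group per-character BIO tags into (text, type) entity spans."""
    entities = []
    current_text, current_type = "", None

    def flush():
        # Close the entity being built, if there is one.
        nonlocal current_text, current_type
        if current_type is not None:
            entities.append((current_text, current_type))
        current_text, current_type = "", None

    for ch, tag in zip(chars, tags):
        if tag == "O":
            flush()
            continue
        prefix, entity_type = tag.split("-", 1)
        # Assumed policy: a B- tag, a change of type, or an I- tag with no
        # open entity all start a new span.
        if prefix == "B" or entity_type != current_type:
            flush()
            current_type = entity_type
        current_text += ch
    flush()
    return entities


spans = decode_entities(
    "李克强来到江西省",
    ["B-PER", "I-PER", "I-PER", "O", "O", "B-LOC", "I-LOC", "I-LOC"],
)
```

With this policy, two consecutive B-LOC tags yield two separate one-character locations rather than a single merged entity.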

Per-layer weights used when generating the word vectors:

Layer              Weight
embedding layer    0.0002138
forward_1 layer    0.0284242
forward_2 layer    0.5043797
backward_1 layer   0.0282486
backward_2 layer   0.4387336
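
The five weights sum to roughly 1, which is consistent with the softmax-normalized layer weights described in the ELMo paper. The sketch below shows one way such weights could combine per-layer vectors into a single vector for a character. It assumes all five layers produce vectors of the same size and that the combination is a plain weighted sum; the function `mix_layers` and the toy vectors are made up for illustration and are not taken from the original code.

```python
import math

# Weights copied from the table above.
LAYER_WEIGHTS = {
    "embedding": 0.0002138,
    "forward_1": 0.0284242,
    "forward_2": 0.5043797,
    "backward_1": 0.0282486,
    "backward_2": 0.4387336,
}


def mix_layers(layer_vectors, weights=LAYER_WEIGHTS):
    """Weighted sum of same-sized per-layer vectors for one character."""
    dim = len(next(iter(layer_vectors.values())))
    mixed = [0.0] * dim
    for name, vector in layer_vectors.items():
        for i, value in enumerate(vector):
            mixed[i] += weights[name] * value
    return mixed


# Toy 3-dimensional vectors (the real model uses hidden_size 200).
toy_vectors = {name: [1.0, 2.0, 3.0] for name in LAYER_WEIGHTS}
mixed = mix_layers(toy_vectors)
total = math.fsum(LAYER_WEIGHTS.values())
```

Because forward_2 and backward_2 together carry about 94% of the total weight, the mixed vector is dominated by the second LSTM layer in each direction.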

Results

See test.ipynb for details

  1. Example: 李克强来到位于江西省赣州市于都县的梓山镇潭头村看望慰问群众
    ('李', 'B-PER'),('克', 'I-PER'),('强', 'I-PER')
    ('江', 'B-LOC'),('西', 'I-LOC'),('省', 'I-LOC')
    ('赣', 'B-LOC'),('州', 'I-LOC'),('市', 'I-LOC')
    ('于', 'B-LOC'),('都', 'I-LOC'),('县', 'I-LOC')
    ('梓', 'B-LOC'),('山', 'I-LOC'),('镇', 'I-LOC')
    ('潭', 'I-LOC'),('头', 'I-LOC'),('村', 'I-LOC')

  2. Example: 多名白宫官员对媒体表示,美国总统特朗普不准备续签即将到期的美俄《新削减战略武器条约》,而是想推动达成一项包括**在内的新的《削减战略武器条约》
    ('白', 'B-ORG'),('宫', 'I-ORG')
    ('美', 'B-LOC'),('国', 'I-LOC')
    ('特', 'B-PER'),('朗', 'I-LOC'),('普', 'I-PER') (the character 朗 is tagged incorrectly here)
    ('美', 'B-LOC'),('俄', 'B-LOC')
    ('中', 'B-LOC'),('国', 'I-LOC')

  3. Example: 小罗伯特·唐尼专门为漫威总裁凯文费奇颁发特别大奖,他在介绍凯文费奇的时候说道“我要感谢凯文,他在我的低谷期认可我“
    ('小', 'B-PER'),('罗', 'B-PER'),('伯', 'I-PER'),('特', 'I-PER'),('·', 'I-PER'),('唐', 'I-PER'),('尼', 'I-PER'),('专', 'I-PER')
    ('凯', 'B-PER'),('文', 'I-PER'),('费', 'I-PER'),('奇', 'I-PER')
    ('凯', 'B-PER'),('文', 'I-PER')

  4. Example: 马苏、周冬雨、林更新、霍建华都被拍到出现在周杰伦的演唱会上,此外,林俊杰和陈奕迅等超级巨星也都作为演唱会嘉宾出现在演唱会上
    ('马', 'B-LOC'),('苏', 'I-LOC') (the entity 马苏 is detected here, but it is tagged with the wrong type)
    ('周', 'B-PER'),('冬', 'I-PER'),('雨', 'I-PER')
    ('林', 'B-PER'),('更', 'I-PER'),('新', 'I-PER')
    ('周', 'B-PER'),('杰', 'I-PER'),('伦', 'I-PER')
    ('林', 'B-PER'),('俊', 'I-PER'),('杰', 'I-PER')
    ('陈', 'B-PER'),('奕', 'I-PER'),('迅', 'I-PER')

Author

Feel free to get in touch
wechat:dengxiuqi007
2018.6

Contributors

dengxiuqi
