Giter Club home page Giter Club logo

awesome-chinese-nlp's Introduction

awesome-chinese-nlp

Awesome

A curated list of resources for NLP (Natural Language Processing) for Chinese

中文自然语言处理相关资料

图片来自复旦大学邱锡鹏教授

Contents 列表



Chinese NLP Toolkits 中文NLP工具

Toolkits 综合NLP工具包

  • THULAC 中文词法分析工具包 by 清华 (C++/Java/Python)

  • NLPIR by 中科院 (Java)

  • LTP 语言技术平台 by 哈工大 (C++) pylyp LTP的python封装

  • FudanNLP by 复旦 (Java)

  • BaiduLac by 百度 Baidu's open-source lexical analysis tool for Chinese, including word segmentation, part-of-speech tagging & named entity recognition.

  • HanLP (Java)

  • FastNLP (Python) 一款轻量级的 NLP 处理套件。

  • SnowNLP (Python) Python library for processing Chinese text

  • YaYaNLP (Python) 纯python编写的中文自然语言处理包,取名于“牙牙学语”

  • 小明NLP (Python) 轻量级中文自然语言处理工具

  • DeepNLP (Python) Deep Learning NLP Pipeline implemented on Tensorflow with pretrained Chinese models.

  • chinese_nlp (C++ & Python) Chinese Natural Language Processing tools and examples

  • lightNLP (Python) 基于Pytorch和torchtext的自然语言处理深度学习框架

  • Chinese-Annotator (Python) Annotator for Chinese Text Corpus 中文文本标注工具

  • Poplar (Typescript) A web-based annotation tool for natural language processing (NLP)

  • Jiagu (Python) Jiagu以BiLSTM等模型为基础,使用大规模语料训练而成。将提供中文分词、词性标注、命名实体识别、情感分析、知识图谱关系抽取、关键词抽取、文本摘要、新词发现等常用自然语言处理功能。

Popular NLP Toolkits for English/Multi-Language 常用的英文或支持多语言的NLP工具包

  • CoreNLP by Stanford (Java) A Java suite of core NLP tools.

  • NLTK (Python) Natural Language Toolkit

  • spaCy (Python) Industrial-Strength Natural Language Processing with a online course

  • textacy (Python) NLP, before and after spaCy

  • OpenNLP (Java) A machine learning based toolkit for the processing of natural language text.

  • gensim (Python) Gensim is a Python library for topic modelling, document indexing and similarity retrieval with large corpora.

  • Kashgari - Simple and powerful NLP framework, build your state-of-art model in 5 minutes for named entity recognition (NER), part-of-speech tagging (PoS) and text classification tasks. Includes BERT and word2vec embedding.

Chinese Word Segment 中文分词

Information Extraction 信息提取

QA & Chatbot 问答和聊天机器人



Corpus 中文语料



Organizations 中文NLP学术组织及竞赛



Industry 中文NLP商业服务

  • 华为云NLP 针对各类企业及开发者提供的用于文本分析及挖掘的云服务,旨在帮助用户高效的处理文本

  • 百度云NLP 提供业界领先的自然语言处理技术,提供优质文本处理及理解技术

  • 阿里云NLP 为各类企业及开发者提供的用于文本分析及挖掘的核心工具

  • 腾讯云NLP 基于并行计算、分布式爬虫系统,结合独特的语义分析技术,一站满足NLP、转码、抽取、数据抓取等需求

  • 讯飞开放平台 以语音交互为核心的人工智能开放平台

  • 搜狗实验室 分词和词性标注

  • 玻森数据 上海玻森数据科技有限公司,专注中文语义分析技术

  • 云孚科技 NLP工具包、知识图谱、文本挖掘、对话系统、舆情分析等

  • 智言科技 专注于深度学习和知识图谱技术突破的人工智能公司

  • 追一科技 主攻深度学习和自然语言处理



Learning Materials 学习资料

awesome-chinese-nlp's People

Contributors

brikerman avatar crownpku avatar fernandosailing avatar grzhan avatar hailiang-wang avatar hoiy avatar jinhengzhang avatar

Watchers

 avatar

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. 📊📈🎉

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google ❤️ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.