awesome-chinese-nlp

A curated list of resources for NLP (Natural Language Processing) for Chinese

中文自然语言处理相关资料

Image credit: Professor Xipeng Qiu (邱锡鹏), Fudan University

Contents 列表

1. Chinese NLP Toolkits 中文NLP工具

2. Corpus 中文语料

3. Organizations 中文NLP学术组织及竞赛

4. Industry 中文NLP商业服务

5. Learning Materials 学习资料



Chinese NLP Toolkits 中文NLP工具

Toolkits 综合NLP工具包

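  • THULAC Chinese lexical analysis toolkit by Tsinghua University (C++/Java/Python)

  • NLPIR by the Chinese Academy of Sciences (Java)

  • LTP Language Technology Platform by Harbin Institute of Technology (C++); pyltp is a Python wrapper for LTP

  • FudanNLP by Fudan University (Java)

  • BaiduLac by Baidu. An open-source lexical analysis tool for Chinese, including word segmentation, part-of-speech tagging, and named entity recognition.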

  • HanLP (Java)

  • FastNLP (Python) A lightweight NLP processing suite.

  • SnowNLP (Python) Python library for processing Chinese text

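  • YaYaNLP (Python) A Chinese natural language processing package written in pure Python, named after the idiom 牙牙学语 (a baby babbling while learning to speak)

  • 小明NLP (Python) A lightweight Chinese natural language processing tool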

  • DeepNLP (Python) Deep Learning NLP Pipeline implemented on Tensorflow with pretrained Chinese models.

  • chinese_nlp (C++ & Python) Chinese Natural Language Processing tools and examples

  • lightNLP (Python) A deep learning framework for NLP based on PyTorch and torchtext

  • Chinese-Annotator (Python) Annotation tool for Chinese text corpora

  • Poplar (Typescript) A web-based annotation tool for natural language processing (NLP)

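  • Jiagu (Python) Built on models such as BiLSTM and trained on large-scale corpora. Provides common NLP functions including Chinese word segmentation, part-of-speech tagging, named entity recognition, sentiment analysis, knowledge-graph relation extraction, keyword extraction, text summarization, and new word discovery.

  • SmoothNLP (Python & Java) Focused on interpretable NLP techniques

  • FoolNLTK (Python & Java) A Chinese Natural Language Toolkit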

Popular NLP Toolkits for English/Multi-Language 常用的英文或支持多语言的NLP工具包

  • CoreNLP by Stanford (Java) A Java suite of core NLP tools.

  • Stanza by Stanford (Python) A Python NLP Library for Many Human Languages

  • NLTK (Python) Natural Language Toolkit

  • spaCy (Python) Industrial-Strength Natural Language Processing with an online course

  • textacy (Python) NLP, before and after spaCy

  • OpenNLP (Java) A machine learning based toolkit for the processing of natural language text.

  • gensim (Python) Gensim is a Python library for topic modelling, document indexing and similarity retrieval with large corpora.

  • Kashgari (Python) Simple and powerful NLP framework; build a state-of-the-art model in 5 minutes for named entity recognition (NER), part-of-speech tagging (PoS), and text classification tasks. Includes BERT and word2vec embeddings.

Chinese Word Segmentation 中文分词
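
For readers new to the task, the following is a minimal sketch of dictionary-based forward maximum matching, one classic baseline for Chinese word segmentation. It is not taken from any toolkit in this list; the function name `forward_max_match`, the `max_len` parameter, and the toy vocabulary are all assumptions made for illustration. Real segmenters typically rely on statistical or neural models and far larger lexicons.

```python
def forward_max_match(text, vocab, max_len=4):
    """Greedy forward maximum matching over a (hypothetical) toy vocabulary."""
    tokens = []
    i = 0
    while i < len(text):
        # Try the longest candidate first and shrink until the vocabulary matches.
        for size in range(min(max_len, len(text) - i), 0, -1):
            piece = text[i:i + size]
            # A single character is always accepted as a fallback token.
            if size == 1 or piece in vocab:
                tokens.append(piece)
                i += size
                break
    return tokens


# Toy vocabulary, assumed purely for illustration.
TOY_VOCAB = {"中文", "自然", "语言", "自然语言", "处理"}

print(forward_max_match("中文自然语言处理", TOY_VOCAB))
```

Because the matcher is greedy, it prefers the longer entry 自然语言 over 自然 + 语言, and characters missing from the vocabulary fall back to single-character tokens.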

Information Extraction 信息提取

QA & Chatbot 问答和聊天机器人

Multi-Modal Representation & Retrieval 多模态表征与检索

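  • Chinese-CLIP (Python) A pretrained Chinese multimodal image-text representation model. Based on OpenAI's CLIP architecture, it is pretrained on a large-scale corpus of native Chinese image-text pairs. Models at several scales have been open-sourced, along with a technical report and a retrieval demo.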


Corpus 中文语料



Organizations 中文NLP学术组织及竞赛



Industry 中文NLP商业服务

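  • 华为云NLP (Huawei Cloud NLP) A cloud service for text analysis and mining for enterprises and developers, designed to help users process text efficiently

  • 百度云NLP (Baidu Cloud NLP) Natural language processing services for text processing and understanding, described by the vendor as industry-leading

  • 阿里云NLP (Alibaba Cloud NLP) A core tool for text analysis and mining for enterprises and developers

  • 腾讯云NLP (Tencent Cloud NLP) Built on parallel computing and a distributed crawler system, combined with proprietary semantic analysis technology; a one-stop service for NLP, transcoding, extraction, and data crawling

  • 讯飞开放平台 (iFLYTEK Open Platform) An open AI platform centered on voice interaction

  • 搜狗实验室 (Sogou Labs) Word segmentation and part-of-speech tagging

  • 玻森数据 (Boson Data) Shanghai Boson Data Technology Co., Ltd., focused on Chinese semantic analysis technology

  • 云孚科技 (Yunfu Technology) NLP toolkits, knowledge graphs, text mining, dialogue systems, public opinion analysis, and more

  • 智言科技 (Zhiyan Technology) An AI company focused on advances in deep learning and knowledge graph technology

  • 追一科技 (Zhuiyi Technology) Focused on deep learning and natural language processing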



Learning Materials 学习资料

Contributors

brikerman, crownpku, fernandosailing, grzhan, hailiang-wang, hoiy, jinhengzhang, ricky9123, yangapku
