Giter Club home page Giter Club logo

Gege Sun's Projects

- icon -

搜索所有中文NLP数据集,附常用英文NLP数据集

awesome-llm icon awesome-llm

Awesome-LLM: a curated list of Large Language Model

coldataset icon coldataset

The official repository of the paper: COLD: A Benchmark for Chinese Offensive Language Detection

competition-baseline icon competition-baseline

数据挖掘、计算机视觉、自然语言处理、推荐系统竞赛知识、代码、思路

daily-interview icon daily-interview

Datawhale成员整理的面经,内容包括机器学习,CV,NLP,推荐,开发等,欢迎大家star

fairseq icon fairseq

Facebook AI Research Sequence-to-Sequence Toolkit written in Python.

flask-zh icon flask-zh

翻译自Miguel Grinberg的blog https://blog.miguelgrinberg.com 的2017年新版The Flask Mega-Tutorial教程

handson-ml icon handson-ml

⛔️ DEPRECATED – See https://github.com/ageron/handson-ml3 instead.

hanlp icon hanlp

中文分词 词性标注 命名实体识别 依存句法分析 成分句法分析 语义依存分析 语义角色标注 指代消解 风格转换 语义相似度 新词发现 关键词短语提取 自动摘要 文本分类聚类 拼音简繁转换 自然语言处理

harvesttext icon harvesttext

文本挖掘和预处理工具(文本清洗、新词发现、情感分析、实体识别链接、关键词抽取、知识抽取、句法分析等),无监督或弱监督方法

hfl-anthology icon hfl-anthology

Collections of resources from Joint Laboratory of HIT and iFLYTEK Research (HFL)

jionlp icon jionlp

中文 NLP 预处理、解析工具包,准确、高效、易用 A Chinese NLP Preprocessing & Parsing Package www.jionlp.com

keyword_extraction icon keyword_extraction

利用Python实现中文文本关键词抽取,分别采用TF-IDF、TextRank、Word2Vec词聚类三种方法。

lit icon lit

The Learning Interpretability Tool: Interactively analyze ML models to understand their behavior in an extensible and framework agnostic interface.

ml-visuals icon ml-visuals

🎨 ML Visuals contains figures and templates which you can reuse and customize to improve your scientific writing.

mnbvc icon mnbvc

MNBVC(Massive Never-ending BT Vast Chinese corpus)超大规模中文语料集。对标chatGPT训练的40T数据。MNBVC数据集不但包括主流文化,也包括各个小众文化甚至火星文的数据。MNBVC数据集包括新闻、作文、小说、书籍、杂志、论文、台词、帖子、wiki、古诗、歌词、商品介绍、笑话、糗事、聊天记录等一切形式的纯文本中文数据。

nlp icon nlp

中英文敏感词、语言检测、中外手机/电话归属地/运营商查询、名字推断性别、手机号抽取、身份证抽取、邮箱抽取、中日文人名库、中文缩写库、拆字词典、词汇情感值、停用词、反动词表、暴恐词表、繁简体转换、英文模拟中文发音、汪峰歌词生成器、职业名称词库、同义词库、反义词库、否定词库、汽车品牌词库、汽车零件词库、连续英文切割、各种中文词向量、公司名字大全、古诗词库、IT词库、财经词库、成语词库、地名词库、历史名人词库、诗词词库、医学词库、饮食词库、法律词库、汽车词库、动物词库、中文聊天语料、中文谣言数据、百度中文问答数据集、句子相似度匹配算法集合、bert资源、文本生成&摘要相关工具、cocoNLP信息抽取工具、国内电话号码正则匹配、清华大学XLORE:中英文跨语言百科知识图谱、清华大学人工智能技术系列报

nlp- icon nlp-

主要是我是日常看过的不错的文章的资源汇总,方便自己也分享给大家。有些我看过的,就会做简单的解读,没看过的,就先罗列一下,然后之后看了把解读更新上;涉及到搜索/推荐/自然语言处理。

nlp-tutorial icon nlp-tutorial

Natural Language Processing Tutorial for Deep Learning Researchers

nlp_all_tasks icon nlp_all_tasks

【NLP菜鸟逆袭】分享 自然语言处理(文本分类、信息抽取、知识图谱、机器翻译、问答系统、文本生成、Text-to-SQL、文本纠错、文本挖掘、知识蒸馏、模型加速、OCR、TTS、Prompt、embedding等)等 实战与经验。

opencc icon opencc

Conversion between Traditional and Simplified Chinese

paddlespeech icon paddlespeech

Easy-to-use Speech Toolkit including Self-Supervised Learning model, SOTA/Streaming ASR with punctuation, Streaming TTS with text frontend, Speaker Verification System, End-to-End Speech Translation and Keyword Spotting. Won NAACL2022 Best Demo Award.

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. 📊📈🎉

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google ❤️ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.