leileibama
Name: Leo Lei Lei
Type: User
Company: Shanghai International Studies University
Location: Shanghai, China
Blog: http://corpus.shisu.edu.cn/68/06/c13533a157702/page.htm
Chinese science fiction corpus for NLP: the text of about 4,675 Chinese science fiction novels.
Alpha Readability Calculator is a wrapper of "readability", which helps calculate nine readability indices as well as 29 measures at lexical and syntactic levels.
AlphaReadabilityChinese is a tool that calculates the readability of Chinese texts, which includes indices at lexical, syntactic, and semantic levels.
COCA, Top 5000 Word Frequency List
Basic classical texts of ancient China
Open-source low-code data preparation library in Python. Collect, clean, and visualize your data with a few lines of code.
The English Language Learner Insight, Proficiency and Skills Evaluation (ELLIPSE) Corpus
Stylometry library for Burrows' Delta method
Open Chinese Dictionary - a database of modern Chinese character pronunciations
Jiayan (甲言), the first NLP toolkit designed for Classical Chinese. It supports lexicon construction, tokenization, POS tagging, sentence segmentation, and punctuation.
Lei Lei Homepage
LeoColloSharp, a tool to search collocates.
leoDDcalculator, a package that calculates the mean dependency distance (MDD) and normalized dependency distance (NDD) of texts in a folder
leolemmatizer, a package that POS-tags and lemmatizes text files in a folder
Data and code for Corpus Data Processing with Python
Linguistic features calculation for quantitative/corpus linguistics study
The first Chinese metaphor corpus for metaphor identification and generation. Presented at COLING 2022.
MorphyNet: a Large Multilingual Database of Derivational and Inflectional Morphology (+morpheme segmentation)
:rocket: State-of-the-art syntactic/semantic parsers, with pretrained models for more than 19 languages.
This is the data associated with the PERSUADE Corpus 2.0 version
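pycorrector is a toolkit for text error correction. It applies models such as Kenlm, T5, MacBERT, ChatGLM3, and LLaMA to error correction and works out of the box.
An experimental implementation of Burrows' Delta in Python 3
Chinese sentiment classification tool. Sentiment polarity classification based on the HowNet, Tsinghua, and BosonNLP sentiment lexicons; an easily extensible baseline method that works out of the box.
Possibly the best Chinese-language PySide6 tutorial! Explains PySide6 through code examples, with quality demos, icon libraries, QSS skins, and related articles.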
QuitaUp: A tool for quantitative stylometric analysis
Code & Data for the paper "RedditBias: A Real-World Resource for Bias Evaluation and Debiasing of Conversational Language Models"
People's Daily (人民日报), 1946-2003
S2ORC: The Semantic Scholar Open Research Corpus: https://www.aclweb.org/anthology/2020.acl-main.447/
Create dependency tree plots from SpaCy Doc objects
Base code for the Tool for Automatic Measurement of Morphological Information (TAMMI)