leileibama

followers: 38 following: 13 repos: 30 gists: 7

Name: Leo Lei Lei

Type: User

Company: Shanghai International Studies University

Location: Shanghai, China

Blog: http://corpus.shisu.edu.cn/68/06/c13533a157702/page.htm

Leo Lei Lei's Projects

4675-scifi

Chinese science fiction corpus for NLP: about 4675 Chinese science fiction novels, usable as a text corpus or text database of Chinese science fiction.

alphareadabilitycalculator

Alpha Readability Calculator is a wrapper of "readability", which helps calculate nine readability indices as well as 29 measures at lexical and syntactic levels.

alphareadabilitychinese

AlphaReadabilityChinese is a tool that calculates the readability of Chinese texts using indices at the lexical, syntactic, and semantic levels.

dataprep

Open-source low-code data preparation library in Python. Collect, clean, and visualize your data in Python with a few lines of code.

ellipse-corpus

The English Language Learner Insight, Proficiency and Skills Evaluation (ELLIPSE) Corpus

faststylometry

Stylometry library for Burrows' Delta method

jiayan

Jiayan (甲言), the first NLP toolkit designed for Classical Chinese, supports lexicon construction, tokenizing, POS tagging, sentence segmentation, and punctuation.

leileibama.github.io

Lei Lei Homepage

leocollosharp

LeoColloSharp, a tool to search collocates.

leoddcalculator

leoDDcalculator, a package that calculates the MDD (mean dependency distance) and NDD (normalized dependency distance) values of texts in a folder

leolemmatizer

leolemmatizer, a package for POS tagging and lemmatizing text files in a folder

leopythonbookdata

Data and code for Corpus Data Processing with Python

linguisticfeatures

Linguistic feature calculation for quantitative and corpus linguistics studies

metaphor_generator

The first Chinese metaphor corpus for metaphor identification and generation (a Chinese metaphor dataset). Presented at COLING 2022.

morphynet

MorphyNet: a Large Multilingual Database of Derivational and Inflectional Morphology (+morpheme segmentation)

parser

State-of-the-art syntactic/semantic parsers, with pretrained models for more than 19 languages.

persuade_corpus_2.0

This is the data associated with the PERSUADE Corpus 2.0 version

pycorrector

pycorrector is a toolkit for text error correction. It applies models such as Kenlm, T5, MacBERT, ChatGLM3, and LLaMA to error-correction tasks and works out of the box.

pydelta

An experimental implementation of Burrows' Delta in Python 3

pysenti

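Chinese Sentiment Classification Tool. Sentiment polarity classification based on the HowNet, Tsinghua, and BosonNLP sentiment lexicons; an easily extensible baseline method that works out of the box.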

pyside6-code-tutorial

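Possibly the best Chinese-language PySide6 tutorial! Explains PySide6 through code examples, with high-quality demos, icon libraries, QSS skins, related articles, and more.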

quitaup

QuitaUp: A tool for quantitative stylometric analysis

redditbias

Code & Data for the paper "RedditBias: A Real-World Resource for Bias Evaluation and Debiasing of Conversational Language Models"

rmrb

People's Daily (人民日报), 1946-2003

s2orc

S2ORC: The Semantic Scholar Open Research Corpus: https://www.aclweb.org/anthology/2020.acl-main.447/

spacy-visualise-tree

Create dependency tree plots from SpaCy Doc objects

tammi

Base code for the Tool for Automatic Measurement of Morphological Information (TAMMI)
