lexicon

Project Status: Active - The project has reached a stable, usable state and is being actively developed.

Description

lexicon is a collection of lexical hash tables, dictionaries, and word lists. The data prefixes help to categorize the data types:

Prefix      Meaning
key_        A data.frame with a lookup and return value
hash_       A keyed data.table hash table
freq_       A data.table of terms with frequencies
profanity_  A profane words vector
pos_        A part of speech vector
pos_df_     A part of speech data.frame
sw_         A stopword vector
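
As a rough illustration of how these prefixes map onto R data structures, the sketch below builds small stand-in objects instead of loading the packaged data. The names `toy_hash`, `toy_key`, and `toy_sw`, along with their column names, are hypothetical; the real data sets may use different columns. It assumes the data.table package is installed.

```r
library(data.table)

# Hypothetical stand-in for a hash_ data set:
# a data.table keyed on its lookup column
toy_hash <- data.table(
    x = c("bad", "good", "great"),
    y = c(-1, 0.5, 1),
    key = "x"
)

# Keyed join on a character vector; terms missing from the table give NA
words <- c("good", "unknown", "bad")
scores <- toy_hash[words, y]

# Hypothetical stand-in for a key_ data set:
# a plain data.frame with a lookup column and a return value
toy_key <- data.frame(
    lookup = c("can't", "won't"),
    value = c("cannot", "will not"),
    stringsAsFactors = FALSE
)
expanded <- toy_key$value[match("won't", toy_key$lookup)]

# Hypothetical stand-in for a sw_ data set: a character vector of stopwords
toy_sw <- c("the", "a", "of")
tokens <- c("the", "lexicon", "of", "words")
content_tokens <- tokens[!tokens %in% toy_sw]
```

The keyed table supports fast joins against many terms at once, while the vector types are meant for simple membership tests with `%in%`.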

Data

Data                                 Description
cliches                              Common Cliches
common_names                         First Names (U.S.)
constraining_loughran_mcdonald       Loughran-McDonald Constraining Words
emojis_sentiment                     Emoji Sentiment Data
freq_first_names                     Frequent U.S. First Names
freq_last_names                      Frequent U.S. Last Names
function_words                       Function Words
grady_augmented                      Augmented List of Grady Ward's English Words and Mark Kantrowitz's Names List
hash_emojis                          Emoji Description Lookup Table
hash_emojis_identifier               Emoji Identifier Lookup Table
hash_emoticons                       Emoticons
hash_grady_pos                       Grady Ward's Moby Parts of Speech
hash_internet_slang                  List of Internet Slang and Corresponding Meanings
hash_lemmas                          Lemmatization List
hash_nrc_emotions                    NRC Emotion Table
hash_sentiment_emojis                Emoji Sentiment Polarity Lookup Table
hash_sentiment_huliu                 Hu Liu Polarity Lookup Table
hash_sentiment_jockers               Jockers Sentiment Polarity Table
hash_sentiment_jockers_rinker        Combined Jockers & Rinker Polarity Lookup Table
hash_sentiment_loughran_mcdonald     Loughran-McDonald Polarity Table
hash_sentiment_nrc                   NRC Sentiment Polarity Table
hash_sentiment_senticnet             Augmented SenticNet Polarity Table
hash_sentiment_sentiword             Augmented Sentiword Polarity Table
hash_sentiment_slangsd               SlangSD Sentiment Polarity Table
hash_sentiment_socal_google          SO-CAL Google Polarity Table
hash_valence_shifters                Valence Shifters
key_contractions                     Contraction Conversions
key_corporate_social_responsibility  Nadra Pencle and Irina Malaescu's Corporate Social Responsibility Dictionary
key_grade                            Grades Data Set
key_rating                           Ratings Data Set
key_regressive_imagery               Colin Martindale's English Regressive Imagery Dictionary
key_sentiment_jockers                Jockers Sentiment Data Set
modal_loughran_mcdonald              Loughran-McDonald Modal List
nrc_emotions                         NRC Emotions
pos_action_verb                      Action Word List
pos_df_irregular_nouns               Irregular Nouns Word Dataframe
pos_df_pronouns                      Pronouns
pos_interjections                    Interjections
pos_preposition                      Preposition Words
profanity_alvarez                    Alejandro U. Alvarez's List of Profane Words
profanity_arr_bad                    Stackoverflow user2592414's List of Profane Words
profanity_banned                     bannedwordlist.com's List of Profane Words
profanity_racist                     Titus Wormer's List of Racist Words
profanity_zac_anger                  Zac Anger's List of Profane Words
sw_dolch                             Leveled Dolch List of 220 Common Words
sw_fry_100                           Fry's 100 Most Commonly Used English Words
sw_fry_1000                          Fry's 1000 Most Commonly Used English Words
sw_fry_200                           Fry's 200 Most Commonly Used English Words
sw_fry_25                            Fry's 25 Most Commonly Used English Words
sw_jockers                           Matthew Jockers's Expanded Topic Modeling Stopword List
sw_loughran_mcdonald_long            Loughran-McDonald Long Stopword List
sw_loughran_mcdonald_short           Loughran-McDonald Short Stopword List
sw_lucene                            Lucene Stopword List
sw_mallet                            MALLET Stopword List
sw_python                            Python Stopword List

Installation

To install the development version of lexicon, download the zip ball or tar ball, decompress it, and run R CMD INSTALL on it. Alternatively, use the pacman package:

if (!require("pacman")) install.packages("pacman")
pacman::p_load_gh("trinker/lexicon")
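
Once the package is installed, a quick way to confirm that the data is available is to list the bundled data sets and inspect one or two of them. The snippet below is a minimal sketch that assumes the data sets can be reached by name through the `lexicon::` namespace, as the names in the Data section suggest; `has_lexicon` is a helper variable introduced here for illustration.

```r
# Check whether the package is installed before touching its data
has_lexicon <- requireNamespace("lexicon", quietly = TRUE)

if (has_lexicon) {
    # List the data sets that ship with the installed version
    print(data(package = "lexicon")$results[, "Item"])

    # Peek at a stopword vector and a lookup table named in the Data section
    print(head(lexicon::sw_fry_25))
    str(lexicon::hash_sentiment_huliu)
} else {
    message("lexicon is not installed; see the installation commands above")
}
```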

Contact

You are welcome to:
