ARElight 0.22.0

👉 DEMO 👈

Supported Languages: Russian

ARElight is an application that provides a granular view of the sentiments between named entities mentioned in mass-media texts written in Russian.

This project is powered by the AREkit framework.

For Named Entity Recognition in text sentences, we adopt DeepPavlov (BertOntoNotes model).

Dependencies

  • arekit == 0.22.0
  • gensim == 3.2.0
  • deeppavlov == 0.11.0
  • rusenttokenize
  • brat-v1.3 [github]
  • CUDA
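The pinned versions above can be checked against the local environment before running anything. The following is a minimal sketch, not part of ARElight: the helper names `installed_version` and `find_mismatches` are hypothetical, and it assumes the distribution names match the names in the list above.

```python
"""Compare installed package versions against the pins listed above."""

# Pins copied from the Dependencies list; unpinned entries are omitted.
PINS = {
    "arekit": "0.22.0",
    "gensim": "3.2.0",
    "deeppavlov": "0.11.0",
}


def installed_version(name):
    """Return the installed version of `name`, or None if it is missing."""
    try:
        # Available on Python 3.8 and later.
        from importlib.metadata import PackageNotFoundError, version
    except ImportError:
        # Older interpreters (e.g. python3.6): fall back to setuptools.
        import pkg_resources
        try:
            return pkg_resources.get_distribution(name).version
        except pkg_resources.DistributionNotFound:
            return None
    try:
        return version(name)
    except PackageNotFoundError:
        return None


def find_mismatches(pins, lookup=installed_version):
    """Return {name: (expected, found)} for every pin that is not satisfied."""
    mismatches = {}
    for name, expected in pins.items():
        found = lookup(name)
        if found != expected:
            mismatches[name] = (expected, found)
    return mismatches


if __name__ == "__main__":
    for name, (expected, found) in find_mismatches(PINS).items():
        print(f"{name}: expected {expected}, found {found or 'not installed'}")
```

Run as a script, it prints one line per package whose installed version differs from the pin, and prints nothing when all pins are satisfied.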

Installation

Docker version (Quick)

Supported Languages: Russian

Other Requirements: NVidia-docker

docker import nicolay-r-arelight-0.1.1.tar 
docker run --name arelight -itd --gpus all nicolay-r/arelight:0.1.1
docker attach arelight
service apache2 start

Supported Languages: Russian

Full

  • ARElight:
# Install the required dependencies
pip install -r dependencies.txt
# Download the required resources
python3.6 download.py
  • BRAT: download and install the library, then run the standalone server as follows:
./install.sh -u
python standalone.py

Usage: see the examples folder.

Inference

Supported Languages: Russian

Infer sentiment attitudes from one or more mass-media documents.

Using the BERT fine-tuned model version:

python3.6 infer_texts_bert.py --from-files data/texts-inosmi-rus/e1.txt \
    --labels-count 3 \
    --terms-per-context 50 \
    --tokens-per-context 128 \
    --text-b-type nli_m \
    -o output/brat_inference_output
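To process several documents with the same settings, the command above can be assembled and launched from Python. This is a rough sketch rather than an ARElight API: `build_bert_infer_command` and `run_batch` are hypothetical helpers, the flag values are copied from the command above, and giving each input file its own `-o` value (derived from the file name) is an assumption about how the output target is interpreted.

```python
import os
import subprocess


def build_bert_infer_command(text_path, output_target, python="python3.6"):
    """Build the argv list for one infer_texts_bert.py run."""
    return [
        python, "infer_texts_bert.py",
        "--from-files", text_path,
        "--labels-count", "3",
        "--terms-per-context", "50",
        "--tokens-per-context", "128",
        "--text-b-type", "nli_m",
        "-o", output_target,
    ]


def run_batch(text_paths, output_dir, dry_run=True):
    """Build one command per document; execute them unless dry_run is set."""
    commands = []
    for path in text_paths:
        # Assumption: one output target per document, named after the input.
        stem = os.path.splitext(os.path.basename(path))[0]
        cmd = build_bert_infer_command(path, os.path.join(output_dir, stem))
        commands.append(cmd)
        if not dry_run:
            # Stop at the first failing document.
            subprocess.run(cmd, check=True)
    return commands


if __name__ == "__main__":
    for cmd in run_batch(["data/texts-inosmi-rus/e1.txt"], "output"):
        print(" ".join(cmd))
```

With `dry_run=True` (the default here) nothing is executed; the commands are only returned so they can be inspected first.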

Supported Languages: Russian

Using the pretrained PCNN model (including frames annotation):

python3.6 infer_texts_nn.py --from-files data/texts-inosmi-rus/e1.txt \
    --model-name pcnn \
    --model-state-dir models/ \
    --terms-per-context 50 \
    --stemmer mystem \
    --entities-parser bert-ontonotes \
    --frames ruattitudes-20 \
    --labels-count 3 \
    --bags-per-minibatch 2 \
    --model-input-type ctx \
    --entity-fmt hidden-simple-eng \
    --emb-filepath data/news_mystem_skipgram_1000_20_2015.bin.gz \
    --synonyms-filepath data/synonyms.txt \
    -o output/brat_inference_output

Serialization

Supported Languages: Any

For the BERT model:

python3.6 serialize_texts_bert.py --from-files data/texts-inosmi-rus/e1.txt \
    --entities-parser bert-ontonotes \
    --terms-per-context 50

Supported Languages: Russian by default (depends on embedding)

For the other neural networks (including embedding and other features):

python3.6 serialize_texts_nn.py --from-files data/texts-inosmi-rus/e1.txt \
    --entities-parser bert-ontonotes \
    --stemmer mystem \
    --terms-per-context 50 \
    --emb-filepath data/news_mystem_skipgram_1000_20_2015.bin.gz \
    --synonyms-filepath data/synonyms.txt \
    --frames ruattitudes-20 

Other Examples

  • Serialize RuSentRel collection for BERT [code]
  • Serialize RuSentRel collection for Neural Networks [code]
  • Finetune BERT on samples [code]
  • Finetune Neural Networks on RuSentRel [code]

Papers

Powered by
