SelfAttentiveSentenceEmbedding

A TensorFlow implementation of "A Structured Self-attentive Sentence Embedding", applied to the sentiment analysis task.

The code organization is similar to NeuroNER.

Requirements

SelfSent relies on Python 3.5 and TensorFlow 1.0+. It also requires the stanford_corenlp_pywrapper package, which can be installed from https://github.com/mpagli/stanford_corenlp_pywrapper.

Data

You need to create a data folder next to the src folder. Then, for each dataset, create a separate folder inside it. In that dataset folder, put a file named all.json in which each line corresponds to one JSON sample with its attributes.

Here is a sample from the Yelp dataset:

{"review_id":"IYE_M_cRsk-AhVYeYvnADg","user_id":"r-zUIQPaHzvIyL93wQaoiQ","business_id":"HE23DlZWAO_JF1VIHA60TQ","stars":3,"date":"2012-10-09","text":"This is the Capitol Square branch.","useful":0,"funny":0,"cool":0,"type":"review"}

In the case of review star prediction, the needed attributes are text and stars.

{"review_id":"IYE_M_cRsk-AhVYeYvnADg","stars":3,"date":"2012-10-09","text":"This is the Capitol Square branch.","user_id":"r-zUIQPaHzvIyL93wQaoiQ","business_id":"HE23DlZWAO_JF1VIHA60TQ","useful":0,"funny":0,"cool":0,"type":"review"}
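Because each line of all.json must be a standalone JSON object carrying the needed attributes, it can help to check the file before training. The following is a minimal sketch, not part of SelfSent: the helper name find_invalid_lines is hypothetical, and it only verifies that each line parses and contains text and stars.

```python
import json

# Attributes named above as required for review star prediction.
REQUIRED_ATTRIBUTES = ("text", "stars")


def find_invalid_lines(lines, required=REQUIRED_ATTRIBUTES):
    """Return (line_number, reason) pairs for lines that cannot be used."""
    problems = []
    for number, line in enumerate(lines, start=1):
        line = line.strip()
        if not line:
            continue  # blank lines are ignored
        try:
            sample = json.loads(line)
        except json.JSONDecodeError as error:
            problems.append((number, "invalid JSON: " + error.msg))
            continue
        if not isinstance(sample, dict):
            problems.append((number, "not a JSON object"))
            continue
        missing = [name for name in required if name not in sample]
        if missing:
            problems.append((number, "missing attributes: " + ", ".join(missing)))
    return problems


# Usage on in-memory lines; an open file handle works the same way.
sample_lines = [
    '{"stars": 3, "text": "This is the Capitol Square branch."}',
    '{"stars": 3}',
]
print(find_invalid_lines(sample_lines))  # [(2, 'missing attributes: text')]
```

Passing an open file object for all.json instead of the list checks the whole dataset line by line without loading it into memory.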

The parameters do_split, training, valid, and test in parameters.ini control how the dataset is split. Setting do_split forces a new split even if the pickle files already exist.
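To illustrate what a three-way split of the samples can look like, here is a minimal sketch. It is not SelfSent's own splitting code: it assumes that training, valid, and test are fractions that sum to 1, which the text above does not state, and the function name split_samples and the seed argument are hypothetical.

```python
import random


def split_samples(samples, training=0.8, valid=0.1, test=0.1, seed=0):
    """Shuffle samples and cut them into training, validation and test lists.

    Assumption: the three values are fractions of the dataset summing to 1.
    """
    if abs(training + valid + test - 1.0) > 1e-9:
        raise ValueError("training, valid and test must sum to 1")
    shuffled = list(samples)
    random.Random(seed).shuffle(shuffled)  # fixed seed gives a repeatable split
    n_train = int(len(shuffled) * training)
    n_valid = int(len(shuffled) * valid)
    return (
        shuffled[:n_train],
        shuffled[n_train:n_train + n_valid],
        shuffled[n_train + n_valid:],  # the test set takes the remainder
    )


train_set, valid_set, test_set = split_samples(range(100))
print(len(train_set), len(valid_set), len(test_set))  # 80 10 10
```

Giving the remainder to the test set means no sample is dropped when the fractions do not divide the dataset size evenly.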

Word Embedding

SelfSent also needs word embeddings, which should be downloaded from http://neuroner.com/data/word_vectors/glove.6B.100d.zip, unzipped, and placed in /data/word_vectors. On Ubuntu and Mac OS X, this can be done with:

# Download some word embeddings
mkdir -p SelfSent-master/data/word_vectors
cd SelfSent-master/data/word_vectors
wget http://neuroner.com/data/word_vectors/glove.6B.100d.zip
unzip glove.6B.100d.zip
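Once unzipped, the embeddings can be inspected from Python. The sketch below assumes the usual GloVe text layout (one token per line followed by its space-separated vector components) and that the archive unpacks to a file such as glove.6B.100d.txt. The function load_word_vectors is hypothetical and is not how SelfSent itself reads the file.

```python
def load_word_vectors(path, expected_dim=100):
    """Read GloVe-style text embeddings into a dict of token -> list of floats."""
    vectors = {}
    with open(path, encoding="utf-8") as handle:
        for line in handle:
            parts = line.rstrip().split(" ")
            if len(parts) != expected_dim + 1:
                continue  # skip lines that do not match the expected layout
            vectors[parts[0]] = [float(value) for value in parts[1:]]
    return vectors
```

For example, load_word_vectors("SelfSent-master/data/word_vectors/glove.6B.100d.txt") should return one 100-dimensional vector per token, provided the file name assumption holds.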

Training

Make sure that use_pretrained_model = false and that the data folder contains at least all.json.

Deployment

You need to have a pretrained model, which is composed of:

  • dataset.pickle
  • model.ckpt.data-00000-of-00001
  • model.ckpt.index
  • model.ckpt.meta
  • parameters.ini

Don't forget to set use_pretrained_model = true and to provide the path to the pretrained model folder.
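Before deploying, the pretrained model folder can be checked against the file list above. This is a minimal sketch and not part of SelfSent: the function name missing_model_files is hypothetical, and it only tests that the listed files exist, not that their contents are valid.

```python
import os

# Files listed above as making up a pretrained model.
REQUIRED_FILES = (
    "dataset.pickle",
    "model.ckpt.data-00000-of-00001",
    "model.ckpt.index",
    "model.ckpt.meta",
    "parameters.ini",
)


def missing_model_files(model_folder):
    """Return the required file names that are absent from model_folder."""
    return [
        name
        for name in REQUIRED_FILES
        if not os.path.isfile(os.path.join(model_folder, name))
    ]
```

An empty list means that every expected file is present in the folder.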

Issues/Pull Requests/Feedback

Don't hesitate to get in touch with any feedback, or to create issues or pull requests.
