This project forked from urigoren/nlp_classification_workshop
PyData Tel Aviv NLP Workshop
Home Page: http://goren.ml/pdnlp
NLP classification workshop for beginners
- Python 3.6 installed
- Pip (`curl https://bootstrap.pypa.io/get-pip.py -o get-pip.py && python get-pip.py`)
- Jupyter notebook
Recommended software for Windows users
- Anaconda: https://www.anaconda.com/download/#windows
- cmder: https://github.com/cmderdev/cmder/releases/download/v1.3.6/cmder.zip
- Clone this repository
- Download the training data from: http://goren.ml/pdnlp
- Extract it to `data/`
- Make sure all the requirements are installed: `pip3 install -r requirements.txt`, or `conda install --yes --file requirements.txt` if you're using Anaconda
- Launch Jupyter by running `cd notebooks; jupyter notebook` in your terminal
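If you prefer to script the extraction step rather than unzip by hand, a minimal sketch along these lines should work. It assumes the archive was already downloaded manually into the repository root as `data.zip`; the helper name `extract_archive` is made up for illustration and is not part of the workshop code.

```python
import zipfile
from pathlib import Path


def extract_archive(archive_path, target_dir="data"):
    """Extract a zip archive into target_dir and return the member names.

    Hypothetical helper, not part of the workshop repository.
    """
    target = Path(target_dir)
    # Create data/ (or any other target) if it does not exist yet.
    target.mkdir(parents=True, exist_ok=True)
    with zipfile.ZipFile(archive_path) as zf:
        zf.extractall(target)
        return zf.namelist()


if __name__ == "__main__":
    # Assumption: data.zip sits next to this script after a manual download.
    archive = Path("data.zip")
    if archive.exists():
        names = extract_archive(archive)
        print(f"Extracted {len(names)} entries into data/")
    else:
        print("data.zip not found; download it first")
```

The same function can be pointed at the other archives by passing a different `archive_path`.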
- `data.zip`: The raw contracts, classified by their filename
- `stemmed.zip`: The contracts after preprocessing and stemming (here to save you time)
- `w2v.pickle`: Word2Vec model trained on the data (gensim model)
- `test_data.zip`: Unlabeled contracts, for those who would like to participate in the competition ( http://pydata.org.il/pdnlp/ )
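Since the raw contracts carry their class in the filename, the labels can be recovered while reading the files. The sketch below is only an illustration: the README does not spell out the naming scheme, so it assumes a hypothetical `<label>_<id>.txt` pattern and a `data/` directory of plain-text files. Adjust `sep` and `pattern` to whatever the extracted files actually look like.

```python
from collections import Counter
from pathlib import Path


def label_from_filename(path, sep="_"):
    """Return the part of the file stem before the first `sep`.

    Assumption: files are named like '<label>_<id>.txt'. This naming
    scheme is hypothetical; check the extracted data and adapt.
    """
    return Path(path).stem.split(sep, 1)[0]


def load_labeled_documents(data_dir="data", pattern="*.txt"):
    """Yield (text, label) pairs for every matching file under data_dir."""
    for path in sorted(Path(data_dir).rglob(pattern)):
        text = path.read_text(encoding="utf-8", errors="ignore")
        yield text, label_from_filename(path)


if __name__ == "__main__":
    docs = list(load_labeled_documents())
    # Quick sanity check: how many documents fall into each class.
    print(Counter(label for _, label in docs))
```

Printing the class counts first is a cheap way to confirm that the assumed naming scheme matches the data before training anything.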