Sentiment analysis is the process of computationally identifying and categorizing opinions expressed in a piece of text, especially in order to determine whether the writer's attitude toward a specific subject is positive or negative.
We used the IMDB dataset, in which each review is labelled 0 (negative) or 1 (positive). We also checked multiple resources, such as this one or this, in order to gain an overview of how other people solve this problem.
We used a bag-of-words (BoW) classifier for our first model. For it we created a class BoWClassifier that inherits from PyTorch's nn.Module. We also implemented early stopping in order to avoid overfitting and to reduce the time needed to train the model.
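The model and the early-stopping logic described above can be sketched roughly as follows. This is a minimal illustration, not the project's actual code: the vocabulary size, the patience value, and the `train_step`/`dev_loss_fn` callbacks are all assumptions made for the example.

```python
import torch
import torch.nn as nn

VOCAB_SIZE = 1000  # illustrative; the real IMDB vocabulary is much larger
NUM_LABELS = 2     # 0 - negative, 1 - positive

class BoWClassifier(nn.Module):
    def __init__(self, num_labels, vocab_size):
        super().__init__()
        # A single linear layer maps the bag-of-words count vector to label scores.
        self.linear = nn.Linear(vocab_size, num_labels)

    def forward(self, bow_vector):
        # bow_vector: (batch, vocab_size) word counts for each review
        return torch.log_softmax(self.linear(bow_vector), dim=1)

def train_with_early_stopping(model, train_step, dev_loss_fn,
                              max_epochs=50, patience=3):
    # Stop once the dev loss has not improved for `patience` consecutive epochs.
    best_loss, epochs_without_improvement = float("inf"), 0
    for epoch in range(max_epochs):
        train_step(model)
        dev_loss = dev_loss_fn(model)
        if dev_loss < best_loss:
            best_loss, epochs_without_improvement = dev_loss, 0
        else:
            epochs_without_improvement += 1
            if epochs_without_improvement >= patience:
                break
    return best_loss

model = BoWClassifier(NUM_LABELS, VOCAB_SIZE)
scores = model(torch.zeros(1, VOCAB_SIZE))
print(scores.shape)  # torch.Size([1, 2])
```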
Because of early stopping, the dev accuracy in the last epoch is 87.92% and the dev loss is 0.418.
Nevertheless, we can see a fairly large discrepancy between the dev and train loss and accuracy, which might indicate that the model is slightly overfit. Since we provided a lot of data for training, we suspect that the real problem is that this model is too simple. This is why we also tried training an LSTM classifier.
- Hard to improve the accuracy
- Memory issues
For the second model we used sequence classification with an LSTM recurrent neural network. For it we also created a separate class LSTMClassifier, which also inherits from nn.Module, and we tokenized the sentences using the Tokenizer module from Keras. We also introduced downsampling in order to speed up training and to check how this would affect our neural network.
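A model of this shape can be sketched as below. This is a hedged illustration rather than the project's actual class: the embedding, hidden, and vocabulary sizes are assumptions, and the integer token ids stand in for the output of the Keras Tokenizer mentioned above.

```python
import torch
import torch.nn as nn

class LSTMClassifier(nn.Module):
    def __init__(self, vocab_size, embed_dim=100, hidden_dim=128, num_labels=2):
        super().__init__()
        self.embedding = nn.Embedding(vocab_size, embed_dim)
        self.lstm = nn.LSTM(embed_dim, hidden_dim, batch_first=True)
        self.fc = nn.Linear(hidden_dim, num_labels)

    def forward(self, token_ids):
        # token_ids: (batch, seq_len) integer ids produced by the tokenizer
        embedded = self.embedding(token_ids)
        _, (hidden, _) = self.lstm(embedded)
        # Use the final hidden state as the sentence representation.
        return self.fc(hidden[-1])

model = LSTMClassifier(vocab_size=10000)
logits = model(torch.randint(0, 10000, (4, 20)))  # a toy batch of 4 sequences
print(logits.shape)  # torch.Size([4, 2])
```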
Global Vectors for Word Representation, or GloVe, is an “unsupervised learning algorithm for obtaining vector representations for words.” Training is performed on aggregated global word-word co-occurrence statistics from a corpus. It was developed at Stanford. We used the glove.6B.100d embeddings for the model.
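Loading pretrained GloVe vectors into an embedding matrix typically looks like the sketch below. The function and the `word_index` mapping are illustrative assumptions (in practice the mapping would come from the tokenizer, and the lines would be read from the glove.6B.100d.txt file).

```python
import numpy as np

def load_glove_embeddings(lines, word_index, dim=100):
    # Each line of a GloVe file is: word v1 v2 ... v_dim
    embeddings = {}
    for line in lines:
        parts = line.split()
        embeddings[parts[0]] = np.asarray(parts[1:], dtype="float32")
    # Rows for words missing from GloVe stay zero.
    matrix = np.zeros((len(word_index) + 1, dim))
    for word, idx in word_index.items():
        if word in embeddings:
            matrix[idx] = embeddings[word]
    return matrix

# Toy demonstration with 2-dimensional vectors instead of 100:
lines = ["good 0.5 0.25", "bad -0.5 -0.25"]
matrix = load_glove_embeddings(lines, {"good": 1, "bad": 2}, dim=2)
print(matrix[1])  # the vector stored for "good"
```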
We got only 46% dev accuracy and a dev loss of 0.824, which we consider very poor results.
Test accuracy: 51.5%
We decided to use other metrics to understand the poor results.
Precision
When evaluating the sentiment (positive, negative, neutral) of a given text document, the baseline precision lies around 80-85%. This is the baseline we try to meet or beat when training a sentiment scoring system. Test precision: 75.9%, which is lower than the baseline.
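Precision here is the share of reviews predicted positive that really are positive. A minimal sketch of the computation, using toy labels rather than the project's actual predictions:

```python
def precision(y_true, y_pred, positive=1):
    # precision = true positives / all predicted positives
    true_pos = sum(1 for t, p in zip(y_true, y_pred) if p == positive and t == positive)
    pred_pos = sum(1 for p in y_pred if p == positive)
    return true_pos / pred_pos if pred_pos else 0.0

y_true = [1, 0, 1, 1, 0, 1]  # toy ground-truth labels
y_pred = [1, 1, 1, 0, 0, 1]  # toy model predictions
print(precision(y_true, y_pred))  # 0.75
```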
Confusion Matrix
A confusion matrix is a method of visualizing classification results. It shows whether your predictions match reality, and how they match, in more detail. The confusion matrix helps us understand how many correct predictions the model makes.
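For the binary case described above, the matrix can be built with a few lines of Python. The labels below are toy values for illustration; rows are the true label (0 = negative, 1 = positive) and columns the predicted label, so the diagonal holds the correct predictions.

```python
from collections import Counter

def confusion_matrix(y_true, y_pred):
    counts = Counter(zip(y_true, y_pred))
    # [[true negatives, false positives],
    #  [false negatives, true positives]]
    return [[counts[(0, 0)], counts[(0, 1)]],
            [counts[(1, 0)], counts[(1, 1)]]]

y_true = [1, 0, 1, 1, 0, 1]  # toy ground-truth labels
y_pred = [1, 1, 1, 0, 0, 1]  # toy model predictions
print(confusion_matrix(y_true, y_pred))  # [[1, 1], [1, 3]]
```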
- Random kernel stops when using pandas
- Memory issues
- When doing the sanity check, the notebook just stops because it requires too much memory, so we commented out this part
- use Python libraries to create models
- improve the accuracy by testing models with different parameters
- solve an NLP problem from scratch
- overcome memory issues by decreasing the dataset or batch size