Sentence boundary detection, language modelling, and Naïve Bayes sentiment polarity classifier.
Part 1 of this assignment creates a sentence boundary detector without using any existing NLP or machine-learning libraries. It is written in Python 3 and uses regular expressions.
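As a rough illustration of the regex approach, here is a minimal sketch and not the code in this repository. The abbreviation list, the boundary pattern, and the function name `split_sentences` are all assumptions made for the example.

```python
import re

# Hypothetical abbreviation list; the rules used in the repository may differ.
ABBREVIATIONS = {"mr.", "mrs.", "dr.", "prof.", "e.g.", "i.e.", "etc."}

# Candidate boundary: '.', '!' or '?' followed by whitespace and then
# an uppercase letter or an opening quote.
BOUNDARY = re.compile(r'(?<=[.!?])\s+(?=[A-Z"\'])')


def split_sentences(text):
    """Split text into sentences, skipping boundaries after known abbreviations."""
    sentences = []
    start = 0
    for match in BOUNDARY.finditer(text):
        candidate = text[start:match.start()]
        last_token = candidate.split()[-1].lower()
        if last_token in ABBREVIATIONS:
            # The full stop belongs to an abbreviation, so keep scanning.
            continue
        sentences.append(candidate.strip())
        start = match.end()
    tail = text[start:].strip()
    if tail:
        sentences.append(tail)
    return sentences


result = split_sentences("Dr. Smith arrived late. He apologised! Was it enough?")
```

A pattern like this treats every full stop, exclamation mark, or question mark as a candidate boundary and then filters out the ones that follow an abbreviation, which is one common way to reduce false splits without a trained model.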
Part 2 builds an unsmoothed bigram language model, trains it on a training data set, and then uses it to calculate probabilities for a test data set. The model uses log probabilities.
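The idea behind an unsmoothed bigram model with log probabilities can be sketched as follows. This is an illustrative example only: the whitespace tokenisation, the `<s>` and `</s>` boundary markers, and the function names are assumptions, not necessarily what the repository does.

```python
import math
from collections import Counter


def train_bigram_model(sentences):
    """Count bigrams and their left-hand contexts over the training sentences."""
    context_counts = Counter()
    bigram_counts = Counter()
    for sentence in sentences:
        tokens = ["<s>"] + sentence.lower().split() + ["</s>"]
        # Every token except the last one acts as a bigram context.
        context_counts.update(tokens[:-1])
        bigram_counts.update(zip(tokens, tokens[1:]))
    return context_counts, bigram_counts


def sentence_log_prob(sentence, context_counts, bigram_counts):
    """Sum log P(word | previous word) using maximum-likelihood estimates."""
    tokens = ["<s>"] + sentence.lower().split() + ["</s>"]
    total = 0.0
    for prev, word in zip(tokens, tokens[1:]):
        count = bigram_counts[(prev, word)]
        if count == 0:
            # Without smoothing, an unseen bigram has probability zero.
            return float("-inf")
        total += math.log(count / context_counts[prev])
    return total


contexts, bigrams = train_bigram_model(["the cat sat", "the dog sat"])
seen = sentence_log_prob("the cat sat", contexts, bigrams)
unseen = sentence_log_prob("the cat ran", contexts, bigrams)
```

Summing logs rather than multiplying raw probabilities avoids numerical underflow on long sentences. Because the model is unsmoothed, any test sentence containing a bigram that never appeared in training receives a log probability of negative infinity.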
Part 3 creates a Naïve Bayes sentiment polarity classifier to determine the sentiment of 200 movie reviews. The model uses log likelihoods and add-one smoothing, and is trained on 1800 movie reviews.
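A multinomial Naïve Bayes classifier with add-one smoothing and log likelihoods might look roughly like the sketch below. The toy reviews, the `pos`/`neg` labels, the whitespace tokenisation, and the function names are all made up for illustration and are not taken from the repository.

```python
import math
from collections import Counter, defaultdict


def train_naive_bayes(documents):
    """Train on (text, label) pairs; return log priors, log likelihoods, vocabulary."""
    doc_counts = Counter()
    word_counts = defaultdict(Counter)
    vocab = set()
    for text, label in documents:
        tokens = text.lower().split()
        doc_counts[label] += 1
        word_counts[label].update(tokens)
        vocab.update(tokens)

    total_docs = sum(doc_counts.values())
    log_prior = {c: math.log(n / total_docs) for c, n in doc_counts.items()}

    log_likelihood = {}
    for c in doc_counts:
        # Add-one smoothing: add 1 to every count and |V| to the denominator.
        denom = sum(word_counts[c].values()) + len(vocab)
        log_likelihood[c] = {
            w: math.log((word_counts[c][w] + 1) / denom) for w in vocab
        }
    return log_prior, log_likelihood, vocab


def classify(text, log_prior, log_likelihood, vocab):
    """Return the class with the highest summed log score."""
    scores = {}
    for c in log_prior:
        score = log_prior[c]
        for w in text.lower().split():
            if w in vocab:  # words never seen in training are ignored
                score += log_likelihood[c][w]
        scores[c] = score
    return max(scores, key=scores.get)


# Toy training data, invented for this example.
train = [
    ("great fun film", "pos"),
    ("wonderful great acting", "pos"),
    ("boring dull film", "neg"),
    ("dull awful plot", "neg"),
]
log_prior, log_likelihood, vocab = train_naive_bayes(train)
prediction = classify("great fun", log_prior, log_likelihood, vocab)
```

Add-one smoothing ensures that a word seen in only one class does not force the other class's probability to zero, and working in log space keeps the scores stable when reviews are long.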
To clone this repository, run the following command in the command line:
git clone https://github.com/dockreg/language_modelling_and_sentiment_classification.git