A project to analyse and visualize the sentiment of tweets in real time on a world map using the Apache Spark ecosystem [Spark MLlib + Spark Streaming].
At a very high level, this project covers the following broad topics:
- Distributed Stream Processing » Apache Spark
- Machine Learning » Naive Bayes Classifier [Apache Spark MLlib implementation]
- Visualization » Sentiment visualization on a world map using Datamaps
- DevOps » Docker Hub and Docker Image
AFINN is a list of English words rated for valence with an integer between minus five (negative) and plus five (positive). The words have been manually labeled by Finn Årup Nielsen in 2009-2011. The file is tab-separated. There are two versions:
AFINN-111: Newest version with 2477 words and phrases.
I will use AFINN together with Stanford NLP to perform sentiment analysis.
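
As a rough illustration of the AFINN half of that approach, the sketch below parses tab-separated `term<TAB>score` lines into a lookup table and sums the scores of the words in a piece of text. It is a minimal sketch under stated assumptions, not this project's implementation: the function names `load_afinn`, `score_text` and `label` are hypothetical, the inline sample lexicon is a placeholder for the real file, and multi-word phrases and the Stanford NLP step are left out.

```python
import re

# Tiny inline sample in the "term<TAB>score" layout described above.
# Illustrative placeholder only; in practice the lines would come from the
# AFINN file itself, e.g. open("AFINN-111.txt", encoding="utf-8")
# (file name assumed here).
SAMPLE_LEXICON = "good\t3\nbad\t-3\nawesome\t4\nterrible\t-3\n"


def load_afinn(lines):
    """Parse tab-separated 'term<TAB>score' lines into a dict of term -> int."""
    lexicon = {}
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        # Split on the last tab so a term containing spaces stays intact.
        term, score = line.rsplit("\t", 1)
        lexicon[term] = int(score)
    return lexicon


def score_text(text, lexicon):
    """Sum the valence of every known single-word token; unknown words count as 0."""
    tokens = re.findall(r"[a-z']+", text.lower())
    return sum(lexicon.get(token, 0) for token in tokens)


def label(score):
    """Map a summed score to a coarse sentiment label (thresholds are an assumption)."""
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"


lexicon = load_afinn(SAMPLE_LEXICON.splitlines())
tweet = "Good movie, awesome cast"
print(score_text(tweet, lexicon), label(score_text(tweet, lexicon)))
```

Summing per-word scores ignores negation and context ("not good" still scores as positive), which is one reason to pair the lexicon with a parser-based tool such as Stanford NLP.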