This project analyzed over 10,000 Reddit posts and 7,700 tweets to uncover insights into user engagement, emotion trends, and discussion themes around the Israel-Palestine conflict on two major social platforms: Reddit and Twitter.
- Applied natural language processing techniques to interpret unstructured textual data from posts and tweets
- Leveraged pre-trained deep learning models (DistilRoBERTa) to classify text by emotion: anger, joy, fear, sadness, etc.
- Analyzed temporal patterns and user influence metrics with statistical analysis and data visualization using Python
- Scraped thousands of relevant social media posts using the Apify web data extraction platform
- Preprocessed the data (cleaning and language translation) using Python (Pandas, Googletrans)
- Detected emotions in text with 86% accuracy via a fine-tuned Transformer model
- Identified top influencers by engagement and reach metrics using ranked composite scoring
- Visualized emotion timelines and user impact trends with interactive plots (Matplotlib, Seaborn)
- Quantified emotions and conversational themes surrounding a major sociopolitical conflict
- Compared emotive responses and discussion patterns across two platforms
- Identified key players driving reach and public discourse in niche online communities
- Demonstrated applicability of deep learning and NLP in understanding collective human sentiments
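
The ranked composite scoring and emotion timeline steps listed above can be illustrated with a minimal sketch. It is not the code used in this project: the function names (`rank_desc`, `composite_ranking`, `emotion_timeline`), the field names, the equal weighting of the two metrics, and the sample records are all assumptions made for illustration. It ranks users on each metric separately, averages the ranks into one composite score, and counts emotion labels per day.

```python
from collections import Counter, defaultdict


def rank_desc(values):
    """Return 1-based ranks (1 = largest); tied values share the average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i], reverse=True)
    ranks = [0.0] * len(values)
    pos = 0
    while pos < len(order):
        end = pos
        # Extend the window over all entries tied with the current value.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[pos]]:
            end += 1
        avg_rank = (pos + end) / 2 + 1
        for k in range(pos, end + 1):
            ranks[order[k]] = avg_rank
        pos = end + 1
    return ranks


def composite_ranking(users, metrics=("engagement", "reach")):
    """Sort users by the mean of their per-metric ranks (lower is better)."""
    per_metric = [rank_desc([u[m] for u in users]) for m in metrics]
    scored = []
    for i, user in enumerate(users):
        score = sum(r[i] for r in per_metric) / len(metrics)
        scored.append({**user, "composite": score})
    return sorted(scored, key=lambda u: u["composite"])


def emotion_timeline(posts):
    """Count predicted emotion labels per calendar day."""
    timeline = defaultdict(Counter)
    for post in posts:
        timeline[post["date"]][post["emotion"]] += 1
    return {day: dict(counts) for day, counts in sorted(timeline.items())}


# Hypothetical sample data, not taken from the project.
users = [
    {"name": "user_a", "engagement": 120, "reach": 5000},
    {"name": "user_b", "engagement": 300, "reach": 2000},
    {"name": "user_c", "engagement": 50, "reach": 800},
]
posts = [
    {"date": "2024-01-01", "emotion": "anger"},
    {"date": "2024-01-01", "emotion": "fear"},
    {"date": "2024-01-01", "emotion": "anger"},
    {"date": "2024-01-02", "emotion": "sadness"},
]

ranking = composite_ranking(users)
timeline = emotion_timeline(posts)
```

Averaging ranks rather than raw values keeps a metric with a large numeric range, such as reach, from dominating the score. A weighted average would be a natural variation if one metric matters more.
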
This project exemplified core data science techniques for gathering, interpreting, and learning from unstructured user-generated data at scale to understand collective behavior, communication, and aggregate psychological states in online public forums. The approaches can provide actionable insights for crisis response, targeted communication, and tracking of movement growth and public opinion trends.
Some additional code used for this project is not included in this notebook. Let me know if you want more details on this project.