
tfg_analisis_reputacion_online_marcas's Introduction

Online reputation analysis of several brands using transformers

This project applies two transformer-based NLP techniques, topic modeling and sentiment analysis, to the online reputation of five brands (Apple, Tesla, Amazon, Google and Microsoft), using content published on X (Twitter) between 01-06-2019 and 01-01-2020. Topic modeling uses BERTopic (based on BERT), a model designed specifically for this task. Sentiment analysis uses a BERTweet model (based on RoBERTa) hosted on Hugging Face, which is suited to analyzing the sentiment of English tweets.

The analysis methodology was as follows:

  1. Data selection
  2. Cleaning and pre-processing
  3. Descriptive analysis of N-grams (unigrams, bigrams, trigrams) using the TF-IDF algorithm
  4. Topic modeling
  5. Sentiment analysis
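
Step 3 can be illustrated with a minimal, standard-library sketch. This is not the code from the repository's notebooks: the function names (`ngrams`, `top_ngrams_tfidf`), the tokenizer regex, the smoothed IDF formula and the sample tweets are all assumptions made for illustration.

```python
import math
import re
from collections import Counter


def ngrams(tokens, n):
    """Return the n-grams of a token list as space-joined strings."""
    return [" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def top_ngrams_tfidf(docs, n=1, k=5):
    """Rank n-grams by their TF-IDF weight summed over all documents."""
    # Hypothetical tokenizer: lowercase runs of letters, digits and apostrophes.
    tokenized = [re.findall(r"[a-z0-9']+", doc.lower()) for doc in docs]
    grams_per_doc = [Counter(ngrams(tokens, n)) for tokens in tokenized]

    # Document frequency: number of documents that contain each n-gram.
    df = Counter()
    for counts in grams_per_doc:
        df.update(counts.keys())

    n_docs = len(docs)
    scores = Counter()
    for counts in grams_per_doc:
        total = sum(counts.values())
        for gram, count in counts.items():
            tf = count / total
            # Smoothed IDF (an assumption; other variants are equally valid).
            idf = math.log((1 + n_docs) / (1 + df[gram])) + 1
            scores[gram] += tf * idf
    return scores.most_common(k)


# Invented sample tweets, for illustration only.
tweets = [
    "Tesla stock is up again today",
    "Apple announces a new phone today",
    "Tesla stock falls after earnings",
]
print(top_ngrams_tfidf(tweets, n=2, k=3))
```

Setting `n` to 1, 2 or 3 yields unigrams, bigrams or trigrams respectively. A real pipeline would normally also remove stop words, URLs and mentions during the cleaning step before ranking.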

The repository contains the following files:

  • Descriptive data analysis: this file loads the initial data, filters it by date and company, and analyzes, among other things, the distribution of content by company and the evolution of the number of tweets over time.
  • N-Grams Analysis Apple and Tesla: this file performs the N-gram analysis of the tweet sets about Apple and Tesla, applying the TF-IDF algorithm to obtain the most relevant unigrams, bigrams and trigrams. The most frequent terms are also visualized using word clouds.
  • Amazon, Google and Microsoft N-Grams analysis: this file repeats the same N-gram analysis for the Amazon-Google-Microsoft set.
  • Apple topic modeling: this file performs topic modeling with the BERTopic model, obtaining the optimal number of relevant topics about Apple. It also includes several of the visualizations built into the model, such as intertopic distance, hierarchical clustering, the similarity matrix and the evolution of the topics over the time span.
  • Tesla topic modeling: this file repeats the same topic modeling procedure for the Tesla set.
  • Amazon-Google-Microsoft topic modeling: this file repeats the same topic modeling procedure for the Amazon-Google-Microsoft set.
  • Sentiment analysis: this file contains the sentiment analysis of the three sets using the BERTweet model, which assigns each tweet a positive (POS), negative (NEG) or neutral (NEU) label together with the corresponding confidence score.
  • Sentiment Analysis - Graphs: this file contains the code for the graphs: the overall distribution and temporal evolution of sentiment across the sets, the evolution of the model's confidence score, and the distribution and temporal evolution of sentiment for a set of relevant topics.
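
The aggregation behind the sentiment graphs can be sketched as follows, assuming each classified tweet has already been reduced to a `(month, label, score)` tuple. The function name `sentiment_summary`, the tuple layout and the sample rows are hypothetical and are not taken from the repository; only the POS/NEG/NEU labels come from the description above.

```python
from collections import Counter, defaultdict
from statistics import mean

LABELS = ("POS", "NEU", "NEG")


def sentiment_summary(rows):
    """Return, per month, the share of each label and the mean confidence score.

    rows: iterable of (month, label, score) tuples, e.g. ("2019-06", "POS", 0.93).
    """
    by_month = defaultdict(list)
    for month, label, score in rows:
        by_month[month].append((label, score))

    summary = {}
    # Sorting the keys keeps months in chronological order for "YYYY-MM" strings.
    for month in sorted(by_month):
        items = by_month[month]
        counts = Counter(label for label, _ in items)
        summary[month] = {
            "share": {label: counts[label] / len(items) for label in LABELS},
            "mean_score": mean(score for _, score in items),
        }
    return summary


# Invented rows, for illustration only.
rows = [
    ("2019-06", "POS", 0.9),
    ("2019-06", "NEG", 0.7),
    ("2019-07", "NEU", 0.8),
]
for month, stats in sentiment_summary(rows).items():
    print(month, stats)
```

The per-month shares feed a plot of sentiment over time, and `mean_score` gives the evolution of the model's confidence. Grouping by topic instead of month would only require a different first element in each tuple.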

The initial data comes from the Kaggle dataset "Tweets about the Top Companies from 2015 to 2020".

Contributors

teeterls
