Giter Club home page Giter Club logo

emotion_topology's Introduction

๐ŸŒŸ Emotion Decomposition via Word Embeddings ๐ŸŒŸ


โœจ Description:

Welcome to the programming core of our groundbreaking research! Dive deep into the realm of human emotions, unraveled through the sophisticated power of word embeddings and extracted from the pulsating world of Reddit.

๐Ÿ” Highlights:

  • ๐ŸŒ Dataset Source: Utilized a fascinating Reddit dataset GoEmotions developed by Google https://huggingface.co/datasets/go_emotions

  • ๐Ÿง  Methodology: Our state-of-the-art approach involves meticulously crafting emotion vectors, offering a window into the multifaceted dimensions of emotional experiences hidden within the tapestry of human language.

  • ๐Ÿ“š Key Findings: Breaking boundaries, our research champions word embeddings' ability to parallel emotional components previously documented, but without the constraints of age-old theoretical frameworks.

  • ๐Ÿš€ Implications & Future Directions: Embarking on this pioneering odyssey, we spotlight the boundless potential of word embeddings in decoding human emotions. Envision a future where psychology intertwines seamlessly with cutting-edge tech, deepening our collective grasp on emotional constructs in real-world, data-driven scenarios.


A plot of the first two PCA dimensions


๐Ÿ”— Dive into our code, and immerse yourself in transformative findings. Together, let's navigate this riveting confluence of human emotion and advanced NLP! ๐ŸŒŒ

Your feedback? It's not just welcomed; it's celebrated. Let's collaborate, innovate, and redefine the frontiers of psychological research. ๐ŸŽ‰๐Ÿš€


๐Ÿ“‚ Repository Structure & Description:

  • data: Houses the datasets and phrasers used in the project.

    • goemotions.csv: The primary dataset containing emotions.
    • phraser_bigrams: Contains bigram phrasers used in Emo2Vec training.
    • phraser_trigrams: Contains trigram phrasers, representing three-word combinations.
    • splits: Contains different splits or subsets of the dataset for robust testing and validation.
      • corpus1.csv and corpus2.csv: Split datasets.
      • phraser_bigrams1/2 and phraser_trigrams1/2: Split phrasers.
  • models: Contains the results and distance metrics of models trained on the dataset.

    • distance.csv: A file with the analyzed distances between Emotional Vectors under different model hyperparameters.
    • models1 and models2: Different model configurations used in the naalysis.
  • plots: Contains visual representations of the emotional vectors.

    • Various pca*.png files: Visual plots representing data projections or clusters.
  • scripts: Contains Python scripts for various tasks throughout the project.

    • check_allignment.py: Checks the alignment of the Emotional Vectors with known emotional dimensions.
    • creating_plots.py: Script to generate plots for visual analysis.
    • exploration.py: Script for exploratory data analysis.
    • horns_analysis.py: Performs Horn's analysis, possibly related to factor analysis.
    • stemming_multi_threading.py: A script that applies stemming to text data using multi-threading for efficiency.
    • training_embeddings.py: Script to train word embeddings on the dataset.
    • robustness_test: Contains scripts specific to testing the robustness of the embeddings or models.
      • creating_plots_robustness.py: Generates plots for robustness testing.
      • training_embeddings_robustness.py: Trains embeddings under various conditions for robustness checks.

emotion_topology's People

Contributors

hplisiecki avatar

Watchers

Kostas Georgiou avatar  avatar

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    ๐Ÿ–– Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. ๐Ÿ“Š๐Ÿ“ˆ๐ŸŽ‰

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google โค๏ธ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.