Giter Club home page Giter Club logo

twitter_hateful_users's Introduction

Detecting Hateful Users on Twitter with Graph Machine Learning

Hateful speech on online platforms is a prominent and widespread issue that can lead to real-world violence and discrimination, making its mitigation crucial for creating safe online environments. In this project, we approach this problem by implementing automated systems that detect hateful users with the help of Graph Machine Learning. We focus on a retweet graph collected in October 2017, with a sample of users being labeled as hateful or normal. We study the network by analyzing its degree and feature distributions, their correlations with labels, and the occurrence of triadic motifs, and we compare the graph with theoretical random models. Subsequently, we train various predictive models on the task of node prediction, including logistic regression, label propagation, GraphSAGE, and Graph Attention Networks (GAT). Our results indicate that GraphSAGE performs the best in terms of F1 score, outperforming all baselines and demonstrating its effectiveness in generalizing to unseen nodes and running on large graphs. Overall, our findings highlight the potential of Graph Machine Learning for fighting hateful speech on social media platforms.

Authors

This project was carried out by Francesco Salvi and Giacomo Orsi, in the context of the course "Network Machine learning" at EPFL. Data is provided by the paper Characterizing and Detecting Hateful Users on Twitter by Ribeiro et al. (2018) [1].

Report

Here you can find a detailed report where we:

  • analyze the degree distribution, the distribution of features, and their correlation with the labels. We also extract additional insights through the use of motifs, studying the occurrence of triads in the network
  • discuss prediction models in the task of binary node classification. We explore simple baselines and more sophisticated models, including logistic regression, label propagation, GraphSAGE, and Graph Attention Networks (GAT)

Code

The results of the report are generated by the following files:

Getting Started with Code


  1. Create virtual environment
conda env create -f environment.yml
  1. Download the data from Kaggle and place it into the data/ folder

Regarding the code for predictive models, we recommend you run the notebook exploitation.ipynb on Google Colab, as it requires a GPU to run the GraphSAGE and GAT models. The notebook contains cells with the packages that have to be installed on Colab.

Media

The folder media contains the images used in the report.

twitter_hateful_users's People

Contributors

frasalvi avatar giacomoorsi avatar

Watchers

 avatar  avatar

Forkers

giacomoorsi

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    ๐Ÿ–– Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. ๐Ÿ“Š๐Ÿ“ˆ๐ŸŽ‰

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google โค๏ธ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.