hatespeech's People

Contributors

zeeraktalat

hatespeech's Issues

Provide dataset

Could you make a copy of the data available? I get errors when trying to retrieve it.

Labels

Hey, the paper mentions that the data will be uploaded as IDs and labels, but there are only IDs here. Most actual hate tweets have been taken down by now, and the IDs that are left are misclassified more often than not. Do you still have the labelled versions?

Missing Tweets in dataset

Hi, I'm doing a research project and I got only a few thousand tweets using the Twitter API. I think this is because users deleted those tweets or Twitter removed them. So, where can I get the file with the downloaded tweets?
Thanks in advance

Regarding file NLP+CSS_2016.csv

Hey,

Thanks for providing the datasets!
For the file NLP+CSS_2016.csv, opening it in Excel or with any default delimiter seems to put most or all of the labels from "amateur" participants under the first three columns. Were most of the posts labelled by the participants represented by these columns, or is there a specific delimiter I need to use to see which participant labelled which post?

Thanks,
Vijay
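
One way to check the delimiter question locally is to count candidate separators per line before opening the file in a spreadsheet. The following is a minimal sketch, not a description of the actual file: the sample text, the column names, and the helper names `guess_delimiter` and `read_rows` are all hypothetical, since the thread does not describe the layout of NLP+CSS_2016.csv.

```python
import csv
import io
from collections import Counter


def guess_delimiter(text, candidates=(",", ";", "\t", "|")):
    """Pick the candidate whose per-line count is non-zero and most consistent."""
    lines = [line for line in text.splitlines() if line.strip()]
    best, best_score = None, 0
    for delim in candidates:
        if not lines:
            break
        # Most common number of occurrences of this candidate per line.
        count, freq = Counter(line.count(delim) for line in lines).most_common(1)[0]
        if count > 0 and freq > best_score:
            best, best_score = delim, freq
    return best


def read_rows(text, delimiter):
    """Parse the text into a list of rows using the given delimiter."""
    return list(csv.reader(io.StringIO(text), delimiter=delimiter))


# Hypothetical sample; the real column layout of the file may differ.
sample = (
    "tweet_id\texpert\tamateur_1\tamateur_2\n"
    "1001\tnone\tsexism\tnone\n"
    "1002\tracism\tracism\tnone\n"
)

delimiter = guess_delimiter(sample)
rows = read_rows(sample, delimiter)
print(repr(delimiter), rows[0])
```

If the guessed delimiter still leaves rows with differing numbers of fields, the file may simply have a variable number of annotations per tweet rather than a different separator.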

Tweets not available through API, some marked as both "racist" and "none"

Hello,

I am currently working on a similar project at university, using your data and paper as a comparison. I have tried to fetch all tweets from NAACL_SRW_2016.csv but a lot of tweets, in particular the racist ones, have been removed. Is there perhaps an offline version containing the tweets?

Another problem is that some tweets in the file are marked as both sexism and none, or as both racism and none. There are not many of these, but they do cause issues. An example is ID 572340476503724032. Do you have any solution for these?

Thank you!

Best,
Filip
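
The conflicting labels described above can be listed before training so they can be dropped or resolved by hand. This is a minimal sketch under the assumption that the file can be read as (tweet ID, label) pairs; the IDs and labels in `rows` are made up for illustration, and `find_conflicts` is a hypothetical helper.

```python
from collections import defaultdict


def find_conflicts(rows):
    """Return tweet IDs that appear with more than one distinct label."""
    labels_by_id = defaultdict(set)
    for tweet_id, label in rows:
        # Normalise case and whitespace so "None" and "none " count as one label.
        labels_by_id[tweet_id].add(label.strip().lower())
    return {
        tweet_id: sorted(labels)
        for tweet_id, labels in labels_by_id.items()
        if len(labels) > 1
    }


# Made-up rows; real IDs and labels come from the CSV file.
rows = [
    ("1001", "racism"),
    ("1001", "none"),
    ("1002", "sexism"),
    ("1003", "None"),
    ("1003", "none"),
]

conflicts = find_conflicts(rows)
print(conflicts)
```

How to treat the conflicting IDs (drop them, keep the non-"none" label, or re-annotate) is a modelling decision that the thread leaves open.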

Was this training data set labeled manually?

Hi,

I am working on an assignment which involves multiple topics, e.g. racism, profanity, alcohol abuse, etc. Creating positive and negative training data sets for binary classification is very time-consuming for each topic. Is there any way to minimize the human effort needed to label tweets when creating training data sets?

Can't access the tweets

Hi,
I am doing a postgraduate project and I want to use this dataset, but I couldn't access the full dataset. Can you please provide the full dataset with the tweets? That would be very helpful for my project.
Thank you.

annotations.tsv file missing

Dear researchers/developers

I cannot find the annotations.tsv file which you guys had mentioned in your readme.md. Could you provide me the link to download the annotations?

.csv file dataset can't be downloaded

@zeeraktalat I am a final-year student of Computer Science at Jadavpur University, and I have chosen hate speech as my dissertation topic. It is therefore very important for me to get the dataset of fully hydrated English tweets from this repo, but I am not able to download it. If you could provide me with the CSV file, I would be grateful.

Full Data

Hello,

Can you please share the full dataset? I need the data for my research so that I can compare against the deleted tweets.

Most tweets are not accessible

While trying to fetch tweets from both files, this is what I receive as a response for most tweets:

Twitter Error [200] : [{"errors":[{"value":"551659627872415744","parameter":"ids","resource_type":"tweet","section":"data","title":"Authorization Error","detail":"Sorry, you are not authorized to see the Tweet with ids: [551659627872415744].","type":"https://api.twitter.com/2/problems/not-authorized-for-resource"}]}]

Is there any other way to get access to the data?

Thanks
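
To see how many IDs are affected, the error body quoted above can be parsed and the unauthorized IDs collected. This is a minimal sketch that assumes every response has the same shape as the one pasted in this issue (a JSON list of objects with an `errors` array); `unauthorized_ids` is a hypothetical helper, not part of any Twitter client library.

```python
import json

NOT_AUTHORIZED = "https://api.twitter.com/2/problems/not-authorized-for-resource"


def unauthorized_ids(payload):
    """Collect tweet IDs reported with a not-authorized problem type."""
    ids = []
    for response in json.loads(payload):
        for error in response.get("errors", []):
            if error.get("type") == NOT_AUTHORIZED:
                ids.append(error["value"])
    return ids


# The error body quoted in the issue, split across lines for readability.
payload = (
    '[{"errors":[{"value":"551659627872415744","parameter":"ids",'
    '"resource_type":"tweet","section":"data","title":"Authorization Error",'
    '"detail":"Sorry, you are not authorized to see the Tweet with ids: '
    '[551659627872415744].",'
    '"type":"https://api.twitter.com/2/problems/not-authorized-for-resource"}]}]'
)

blocked = unauthorized_ids(payload)
print(blocked)
```

Collecting these IDs only measures how much of the dataset is unreachable; it does not recover the tweet text.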

Same tweetids but with different labels

Hi,

I have found many tweet IDs that appear in both datasets (NAACL_SRW_2016 and NLP+CSS_2016). Some of these tweet IDs are

572342978255048705
572341498827522049

Both tweet IDs are labelled "Racism" in NAACL_SRW_2016 but have a different label in NLP+CSS_2016 ("neither" or "sexism").

Please advise on how to handle this issue when using this dataset for classification.

Thanks,
Piush Aggarwal
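
The overlap described in this issue can be enumerated by intersecting the ID sets of the two files and comparing labels. This is a minimal sketch, assuming each file has already been loaded into a dictionary from tweet ID to a single label; the IDs and labels below are made up, and `label_disagreements` is a hypothetical helper.

```python
def label_disagreements(first, second):
    """Return IDs present in both mappings whose labels differ."""
    shared = first.keys() & second.keys()
    return {
        tweet_id: (first[tweet_id], second[tweet_id])
        for tweet_id in sorted(shared)
        if first[tweet_id] != second[tweet_id]
    }


# Made-up mappings standing in for the two CSV files.
naacl = {"2001": "racism", "2002": "sexism", "2003": "none"}
nlpcss = {"2001": "neither", "2002": "sexism", "2004": "racism"}

disagreements = label_disagreements(naacl, nlpcss)
print(disagreements)
```

Whether to trust one file over the other for the disagreeing IDs, or to exclude them, is not settled in the thread.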
