varshithacvasireddy / fake_news_detection

This project classifies news articles as fake or real. A pretrained GPT2 transformer model is used to predict the sentiment and genre of the news, and both classical machine learning classifiers and Hugging Face transformer-based language models are used to classify the articles.

License: MIT License

Language: Jupyter Notebook 100.00%
Topics: nltk, python, regex, distilbert, gpt2, huggingface, knn-classification, linearsvc, random-forest, roberta-large


Fake News Detection

Author: Varshitha Choudary Vasireddy

Description of the project:

 Fake news has become a major issue in today’s society: by influencing public opinion and affecting elections, it can cause significant harm to individuals and communities. Machine learning has emerged as a promising tool for combating fake news, and this project aims to develop a model that accurately identifies potentially fake news articles using only their titles. A dataset was collected and preprocessed, and various machine learning and transfer learning techniques were applied to find patterns and features that indicate the authenticity of an article. The best model, RoBERTa, reached 96.9% accuracy using titles alone, which highlights the potential of machine learning in the fight against fake news.
 Stopping the spread of fake news requires a multifaceted approach that includes fact-checking, critical thinking, and media literacy education. Machine learning can support this effort by providing automated tools that detect fake news quickly and accurately, allowing for timely interventions and corrections. The results of this study can benefit the general public, journalists and news organizations, and social media companies. Limiting the spread of fake news on social media platforms gives the public access to more reliable and accurate information, which improves civic engagement. For journalists and news organizations, the study offers a tool for fact-checking and verifying information, which can improve the quality and credibility of their reporting. Social media companies can use machine learning algorithms to detect and remove fake news articles from their platforms, which enhances user trust and the integrity of their services. Overall, the project demonstrates that machine learning can identify potentially fake news articles and contribute to a more informed and trustworthy information ecosystem.

  • This project was done as part of the coursework for the Professional Practicum in Data Science and Analytics course at the University of Oklahoma.
  • The project report can be found here.
  • Used a pretrained GPT2 transformer-based language model during data exploration to predict the sentiment and genre of the news. This can be found in the "Data" section of the report.
  • Extracted click-bait words from the titles of the news articles and used them as features to predict the authenticity of the articles. This can be found in the "Data" section of the report.
  • Used transformer-based language models to classify fake news articles.
  • Specifically, used the RoBERTa and DistilBERT models to classify fake news articles.
  • Achieved 96.9% accuracy with the RoBERTa model.
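
The click-bait feature step mentioned above could look roughly like the following. This is only a minimal sketch and not the project's actual code: the word list `CLICKBAIT_WORDS`, the function `clickbait_features`, and the extra punctuation and capitalization features are all hypothetical and chosen for illustration. The real word list and features are described in the "Data" section of the report.

```python
import re

# Hypothetical click-bait vocabulary (an assumption for illustration only;
# the project's actual list is described in its report).
CLICKBAIT_WORDS = ("shocking", "unbelievable", "you won't believe", "breaking", "secret")

# Match whole words/phrases only, ignoring case, so that "secret" does not
# match inside "Secretary".
_PATTERN = re.compile(
    r"\b(?:" + "|".join(re.escape(w) for w in CLICKBAIT_WORDS) + r")\b",
    re.IGNORECASE,
)


def clickbait_features(title: str) -> dict:
    """Turn a news title into a small dictionary of numeric/boolean features."""
    matches = [m.lower() for m in _PATTERN.findall(title)]
    return {
        # Number of click-bait words or phrases found in the title.
        "clickbait_count": len(matches),
        # Whether at least one click-bait word or phrase is present.
        "has_clickbait": bool(matches),
        # Exclamation marks are a common click-bait signal (assumed feature).
        "exclamation_count": title.count("!"),
        # Words written entirely in capitals (assumed feature).
        "all_caps_words": sum(1 for w in title.split() if len(w) > 1 and w.isupper()),
    }


if __name__ == "__main__":
    print(clickbait_features("SHOCKING: You won't believe this secret!"))
```

Feature dictionaries like this can be converted to a numeric matrix and passed to a classical classifier such as a random forest or a linear SVC.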

Results:

model results

In fake news detection, the main goal is to minimize false positives, i.e., fake articles that are predicted as real (this treats "real" as the positive class). The model’s precision should therefore be high: the fewer fake articles that pass as real, the better the model helps to stop the spread of misinformation. Judged by precision, DistilBERT and RoBERTa are the best models. Judged by accuracy, RoBERTa is the best model. By ROC AUC score, the random forest (RF) and SVC classifiers performed better than the language models. Taking all metrics together rather than any single one, the language models performed better overall, and RoBERTa can be considered the best model.
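
To make the metrics discussed above concrete, here is a minimal pure-Python sketch of accuracy, precision, and ROC AUC. It assumes that "real" is the positive class, so a false positive is a fake article predicted as real. The function names and the toy labels are hypothetical and are not taken from the project's notebooks, which may well use a library such as scikit-learn instead.

```python
def confusion_counts(y_true, y_pred, positive="real"):
    """Return (tp, fp, tn, fn) with `positive` as the positive class."""
    tp = fp = tn = fn = 0
    for truth, pred in zip(y_true, y_pred):
        if pred == positive:
            if truth == positive:
                tp += 1
            else:
                fp += 1  # fake article predicted as real
        else:
            if truth == positive:
                fn += 1
            else:
                tn += 1
    return tp, fp, tn, fn


def accuracy(y_true, y_pred, positive="real"):
    """Fraction of all predictions that are correct."""
    tp, fp, tn, fn = confusion_counts(y_true, y_pred, positive)
    return (tp + tn) / (tp + fp + tn + fn)


def precision(y_true, y_pred, positive="real"):
    """Fraction of predicted-positive items that are truly positive."""
    tp, fp, _, _ = confusion_counts(y_true, y_pred, positive)
    return tp / (tp + fp) if (tp + fp) else 0.0


def roc_auc(y_true, scores, positive="real"):
    """Probability that a random positive gets a higher score than a random
    negative (ties count as one half). Quadratic time, fine for small data."""
    pos = [s for t, s in zip(y_true, scores) if t == positive]
    neg = [s for t, s in zip(y_true, scores) if t != positive]
    if not pos or not neg:
        raise ValueError("ROC AUC needs at least one positive and one negative")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


if __name__ == "__main__":
    # Toy data (made up for illustration).
    y_true = ["real", "real", "fake", "fake", "real"]
    y_pred = ["real", "fake", "real", "fake", "real"]
    scores = [0.9, 0.4, 0.6, 0.2, 0.8]  # model's score for the "real" class
    print(accuracy(y_true, y_pred), precision(y_true, y_pred), roc_auc(y_true, scores))
```

Accuracy and precision depend on the chosen decision threshold, while ROC AUC is computed from the raw scores across all thresholds. This is one reason a model can rank first on one metric and not on another.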
