detox's Introduction

DeTox

A web-app that detects toxicity (curse, insult, threat, hate, etc.) in YouTube comments and deletes them.

Contents:

  • notebooks - Jupyter notebooks containing the exploratory data analysis, training, and evaluation of the machine learning model.
  • app - Implementation of the web-app, which connects to a YouTube account, displays channel data, analyzes the latest three videos, and deletes (rejects) toxic comments found in them.

Dataset:

Summary of Implementation:

  • Performed exploratory data analysis on the data: counted the number of instances per class, checked for null values, and determined the maximum comment length.
  • Split the data into train and test sets using stratified sampling to maintain the class ratio in both sets.
  • Fine-tuned the BERT model using the PyTorch and Hugging Face transformers libraries, evaluated its performance, and saved the model. The fine-tuned toxicity detection model is available as a .pth file.
  • Developed a web-app with the FastAPI framework that connects to YouTube, accesses the comments of videos, and deletes (rejects) toxic comments.
  • Used Google OAuth 2.0 and the YouTube Data API to authorize and get access to the YouTube channel.
  • Created various views (web pages) that help the user navigate through the web-app.
  • Integrated the trained ML model with the web-app to classify and delete comments of the selected video.
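
The stratified train/test split mentioned in the summary can be illustrated with a small standard-library sketch. This is not the code from the notebooks, which may well use a library routine such as scikit-learn's `train_test_split` with its `stratify` argument. The function name `stratified_split`, the single-label `"clean"`/`"toxic"` data, and the 80/20 ratio are all assumptions made for illustration.

```python
import random
from collections import defaultdict


def stratified_split(labels, test_fraction=0.2, seed=42):
    """Return (train_indices, test_indices) that keep per-class ratios.

    Hypothetical helper for illustration; not taken from the DeTox notebooks.
    """
    # Group the row indices by their class label.
    by_class = defaultdict(list)
    for index, label in enumerate(labels):
        by_class[label].append(index)

    rng = random.Random(seed)
    train, test = [], []
    for label in sorted(by_class):
        indices = by_class[label]
        rng.shuffle(indices)
        # Take the same fraction of every class for the test set.
        n_test = round(len(indices) * test_fraction)
        test.extend(indices[:n_test])
        train.extend(indices[n_test:])
    return sorted(train), sorted(test)


# Made-up, imbalanced toy labels: 80 clean comments and 20 toxic ones.
labels = ["clean"] * 80 + ["toxic"] * 20
train_idx, test_idx = stratified_split(labels, test_fraction=0.2)

toxic_in_test = sum(1 for i in test_idx if labels[i] == "toxic")
toxic_in_train = sum(1 for i in train_idx if labels[i] == "toxic")
print(len(train_idx), len(test_idx), toxic_in_train, toxic_in_test)
```

Because each class is sampled separately, the toxic share is 20% in both the train and the test indices, which is the property the summary refers to as maintaining the class ratio.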

Web-App Demo:

detox_working.mp4

detox's People

Contributors

jaydeepjethwa

detox's Issues

path error - toxic_model.pth

Hello,

I am trying to run your web app and I keep getting errors. As I couldn't figure it out by myself I decided to ask you.

I have downloaded the pretrained models and set their paths, but I get an error saying that the toxic_model.pth file is missing from the fine_tuned folder.

FileNotFoundError: [Errno 2] No such file or directory: 'machine_learning/model_hub/fine_tuned/toxic_model.pth'

I tried adding a blank file named toxic_model.pth, but then it gives this error:

EOFError: Ran out of input

Please guide me on what data is missing from toxic_model.pth and how to proceed further.

I am attaching both errors.

ERROR when toxic_model.pth is missing

(venv) PS D:\DeTox-main\DeTox-main\app> uvicorn main:app --reload
INFO:     Will watch for changes in these directories: ['D:\\DeTox-main\\DeTox-main\\app']
INFO:     Uvicorn running on http://127.0.0.1:8000 (Press CTRL+C to quit)
INFO:     Started reloader process [8104] using StatReload
  File "D:\DeTox-main\venv\lib\site-packages\starlette\routing.py", line 530, in __aenter__
    await self._router.startup()
  File "D:\DeTox-main\venv\lib\site-packages\starlette\routing.py", line 614, in startup
    handler()
  File "D:\DeTox-main\DeTox-main\app\.\main.py", line 30, in startup_event
    load_model()
  File "D:\DeTox-main\DeTox-main\app\.\machine_learning\make_predictions.py", line 23, in load_model
    model.load_state_dict(torch.load(fine_tuned_path, map_location=device))
  File "D:\DeTox-main\venv\lib\site-packages\torch\serialization.py", line 791, in load
    with _open_file_like(f, 'rb') as opened_file:
  File "D:\DeTox-main\venv\lib\site-packages\torch\serialization.py", line 271, in _open_file_like
    return _open_file(name_or_buffer, mode)
  File "D:\DeTox-main\venv\lib\site-packages\torch\serialization.py", line 252, in __init__
    super().__init__(open(name, mode))
FileNotFoundError: [Errno 2] No such file or directory: 'machine_learning/model_hub/fine_tuned/toxic_model.pth'

ERROR:    Application startup failed. Exiting.

ERROR with a blank toxic_model.pth file

(venv) PS D:\DeTox-main\DeTox-main\app> uvicorn main:app --reload
INFO:     Will watch for changes in these directories: ['D:\\DeTox-main\\DeTox-main\\app']
INFO:     Uvicorn running on http://127.0.0.1:8000 (Press CTRL+C to quit)
INFO:     Started reloader process [12256] using StatReload
INFO:     Started server process [15368]
INFO:     Waiting for application startup.
ERROR:    Traceback (most recent call last):
  File "D:\DeTox-main\venv\lib\site-packages\starlette\routing.py", line 635, in lifespan
    async with self.lifespan_context(app):
  File "D:\DeTox-main\venv\lib\site-packages\starlette\routing.py", line 530, in __aenter__
    await self._router.startup()
  File "D:\DeTox-main\venv\lib\site-packages\starlette\routing.py", line 614, in startup
    handler()
  File "D:\DeTox-main\DeTox-main\app\.\main.py", line 30, in startup_event
    load_model()
  File "D:\DeTox-main\DeTox-main\app\.\machine_learning\make_predictions.py", line 23, in load_model
    model.load_state_dict(torch.load(fine_tuned_path, map_location=device))
  File "D:\DeTox-main\venv\lib\site-packages\torch\serialization.py", line 815, in load
    return _legacy_load(opened_file, map_location, pickle_module, **pickle_load_args)
  File "D:\DeTox-main\venv\lib\site-packages\torch\serialization.py", line 1033, in _legacy_load
    magic_number = pickle_module.load(f, **pickle_load_args)
EOFError: Ran out of input

ERROR:    Application startup failed. Exiting.
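
A note on the two tracebacks: the FileNotFoundError means nothing exists at the relative path machine_learning/model_hub/fine_tuned/toxic_model.pth, and the EOFError appears because torch.load tries to unpickle the blank placeholder file and finds no data in it. A blank file therefore cannot stand in for the weights; the real fine-tuned .pth file has to be placed at that path. The sketch below shows one way such a problem could be reported more clearly before loading. The helper name `check_checkpoint` is hypothetical and is not part of the DeTox code.

```python
import os
import tempfile


def check_checkpoint(path):
    """Raise a descriptive error if a checkpoint file is missing or empty.

    Hypothetical helper for illustration; not part of the DeTox repository.
    """
    if not os.path.isfile(path):
        raise FileNotFoundError(
            f"Checkpoint not found: {path}. Place the fine-tuned weights "
            "at this path (it is resolved relative to the working directory)."
        )
    if os.path.getsize(path) == 0:
        # An empty file is what makes the unpickler fail with
        # 'EOFError: Ran out of input'.
        raise ValueError(
            f"Checkpoint is empty: {path}. A blank placeholder cannot be "
            "loaded; the real .pth file is required."
        )
    return path


# Demonstration with temporary files instead of the real model path.
with tempfile.TemporaryDirectory() as tmp:
    blank = os.path.join(tmp, "toxic_model.pth")
    open(blank, "wb").close()  # reproduces the blank-file case
    try:
        check_checkpoint(blank)
    except ValueError as error:
        print("blank file:", type(error).__name__)

    try:
        check_checkpoint(os.path.join(tmp, "missing.pth"))
    except FileNotFoundError as error:
        print("missing file:", type(error).__name__)
```

In the app, a check like this would presumably run just before the `torch.load(fine_tuned_path, map_location=device)` call shown in the traceback, so that startup fails with a message that says what to fix.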
