BlogSummarizer

Aim:

We are developing a web-based tool that summarizes blog posts published on Medium.

Motivation:

We live in an era where few people have time to read lengthy content, so short, explicit content is preferred. However, shorter content should not drop the essential points, and preserving all of them in a summary can be difficult. We aim to address this with a summarization tool that generates summaries to help readers quickly understand any topic they wish to learn.

Dataset:

We will build the dataset from Medium articles published by blogs such as Towards Data Science and HackerNoon. We will scrape the articles from the last 5 months, which should give us 500 or more articles.
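
As a rough illustration of how the dataset could be assembled, the sketch below parses one article page with Beautiful Soup and checks whether it falls inside the five-month window. The tag choices (`h1`, `p`, `time` with a `datetime` attribute), the 150-day approximation of five months, and the function names are assumptions made for illustration; the markup of real Medium pages may differ and should be inspected first.

```python
from datetime import datetime, timedelta


def is_recent(published, now, days=150):
    """Return True if `published` lies within the last `days` days before `now`."""
    return timedelta(0) <= now - published <= timedelta(days=days)


def parse_article(html):
    """Extract title, publication time, and body text from one article page."""
    # Imported here so the date helper above works without bs4 installed.
    from bs4 import BeautifulSoup

    soup = BeautifulSoup(html, "html.parser")
    title = soup.find("h1")
    # Assumption: the page exposes its date as <time datetime="ISO-8601 string">.
    stamp = soup.find("time", attrs={"datetime": True})
    published = None
    if stamp:
        # Older Python versions do not accept a trailing "Z" in fromisoformat().
        published = datetime.fromisoformat(stamp["datetime"].replace("Z", "+00:00"))
    paragraphs = [p.get_text(" ", strip=True) for p in soup.find_all("p")]
    return {
        "title": title.get_text(strip=True) if title else None,
        "published": published,
        "text": "\n".join(p for p in paragraphs if p),
    }
```

An article would be kept only when `parse_article(html)["published"]` is present and `is_recent(...)` returns True for it.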

Tech stack:

  • Development: Python
  • Web scraping: Beautiful Soup
  • Models: Transformer, T5, T5 Long
  • Model deployment: Streamlit
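
To make the model entries concrete, here is a minimal sketch of generating a summary with a T5 checkpoint through the Hugging Face `transformers` library. The checkpoint name `t5-small`, the generation settings, and the helper names are assumptions chosen for illustration, not the project's final configuration.

```python
def build_prompt(article_text):
    """T5 is a text-to-text model; the 'summarize: ' prefix selects the task."""
    return "summarize: " + " ".join(article_text.split())


def summarize(article_text, model_name="t5-small", max_new_tokens=128):
    """Generate a summary with a seq2seq checkpoint (downloads weights on first use)."""
    # Imported lazily because loading transformers and torch is slow.
    from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_name)
    model = AutoModelForSeq2SeqLM.from_pretrained(model_name)
    # In this sketch, input beyond 512 tokens is simply truncated.
    inputs = tokenizer(
        build_prompt(article_text),
        return_tensors="pt",
        truncation=True,
        max_length=512,
    )
    output_ids = model.generate(**inputs, max_new_tokens=max_new_tokens, num_beams=4)
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)
```

Passing a different checkpoint name through `model_name` is how the other candidate models could be tried with the same code.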

Experimental Plan:

We will follow the complete life cycle of a data science project: gathering data through web scraping, cleaning and tokenizing it, and using Hugging Face Transformers, T5, and T5 Long models to generate summaries. We will then compare the results and deploy the best-performing model. The web application that displays the summarized text will be built with Streamlit.
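
The plan does not name a metric for comparing the models. One common choice for summarization is ROUGE, so the sketch below implements a simplified ROUGE-1 F1 (unigram overlap on lowercased whitespace tokens, no stemming) and picks the model with the highest mean score. The metric choice, the function names, and the shape of the `outputs` dictionary are assumptions for illustration, and the approach presumes reference summaries are available for the evaluation articles.

```python
from collections import Counter


def rouge1_f1(reference, candidate):
    """Simplified ROUGE-1 F1: unigram overlap on lowercased whitespace tokens."""
    ref = Counter(reference.lower().split())
    cand = Counter(candidate.lower().split())
    overlap = sum((ref & cand).values())  # clipped count of shared unigrams
    if overlap == 0:
        return 0.0
    precision = overlap / sum(cand.values())
    recall = overlap / sum(ref.values())
    return 2 * precision * recall / (precision + recall)


def best_model(references, outputs):
    """Return (model name, mean score) for the highest mean ROUGE-1 F1.

    `outputs` maps a model name to its summaries, aligned with `references`.
    """
    means = {
        name: sum(rouge1_f1(r, s) for r, s in zip(references, summaries)) / len(references)
        for name, summaries in outputs.items()
    }
    winner = max(means, key=means.get)
    return winner, means[winner]
```

For a real evaluation, an established ROUGE implementation with stemming and ROUGE-2/ROUGE-L variants would be a more robust choice than this hand-rolled version.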

Anticipated challenges:

  • Web scraping will be a challenging task.
  • Larger models might need more resources to run.
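
For the resource concern, one possible mitigation in the Streamlit app is to load the model once and reuse it across reruns instead of reloading it on every interaction. The sketch below does this with `st.cache_resource`; the checkpoint name `t5-small`, the word cap, and the helper names are assumptions for illustration.

```python
def cap_words(text, max_words=400):
    """Trim very long input so a single request stays cheap to process."""
    words = text.split()
    return " ".join(words[:max_words])


def main():
    # Imported inside main() so the helper above can be used without Streamlit.
    import streamlit as st
    from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

    @st.cache_resource  # keep the loaded weights in memory across reruns
    def load_model(name="t5-small"):
        tokenizer = AutoTokenizer.from_pretrained(name)
        model = AutoModelForSeq2SeqLM.from_pretrained(name)
        return tokenizer, model

    st.title("BlogSummarizer")
    article = st.text_area("Paste a blog post")
    if st.button("Summarize") and article.strip():
        tokenizer, model = load_model()
        inputs = tokenizer(
            "summarize: " + cap_words(article),
            return_tensors="pt",
            truncation=True,
            max_length=512,
        )
        ids = model.generate(**inputs, max_new_tokens=128)
        st.write(tokenizer.decode(ids[0], skip_special_tokens=True))
```

Saved as a script (for example `app.py`, a hypothetical file name) with a final `main()` call, this would be started with `streamlit run app.py`.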

Contributors:

  • rohit-chandra
