Sentiment-analysis-of-financial-news-data

This is part of a study-oriented project for the 6th semester, 2016-2017.

Currently, it fetches URLs and scrapes data from Google search results and from the news archives of:

  • economictimes.indiatimes.com
  • reuters.com
  • ndtv.com
  • thehindubusinessline.com
  • moneycontrol.com
  • thehindu.com

You have to specify the starting date, ending date, entity/company name, and webpage URL. The company name needs to be specified only in search_scrape.py; archive_scraper.py instead iterates over all the names specified in regexList. After the results are fetched, the sentiment for each day is calculated by concatenating all the news articles for that day and taking the average.
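The per-day aggregation described above can be sketched roughly as follows. This is a minimal illustration, not the code in sentiment.py: the README does not say which sentiment scorer the project uses, so the tiny word lexicon, `score_text`, and `daily_sentiment` are all hypothetical stand-ins.

```python
import re
from collections import defaultdict
from datetime import date

# Hypothetical toy lexicon; the project's actual scorer is not described here.
POSITIVE = {"gain", "gains", "profit", "surge", "growth", "rally"}
NEGATIVE = {"loss", "losses", "fall", "drop", "decline", "slump"}


def score_text(text):
    """Average polarity of the lexicon words found in text, in [-1.0, 1.0]."""
    words = re.findall(r"[a-z]+", text.lower())
    pos = sum(w in POSITIVE for w in words)
    neg = sum(w in NEGATIVE for w in words)
    total = pos + neg
    # No lexicon word matched: treat the text as neutral.
    return (pos - neg) / total if total else 0.0


def daily_sentiment(articles):
    """articles: iterable of (date, text) pairs.

    Concatenates all texts that share a date, then scores the combined text.
    """
    by_day = defaultdict(list)
    for day, text in articles:
        by_day[day].append(text)
    return {day: score_text(" ".join(texts)) for day, texts in sorted(by_day.items())}


if __name__ == "__main__":
    sample = [
        (date(2017, 1, 2), "Quarterly profit and revenue surge"),
        (date(2017, 1, 2), "Shares fall after the announcement"),
        (date(2017, 1, 3), "Company reports a loss"),
    ]
    for day, score in daily_sentiment(sample).items():
        print(day, round(score, 3))
```

Any scorer that maps a text to a number could replace `score_text` without changing the grouping step.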

File Structure

USEFUL_FILES
  -> regexList - Contains the regexes for the company names, used to get more relevant search results.
  -> search_scrape.py - Scrapes URLs from the Google search results.
  -> filter.py - Filters the relevant URLs collected from the Google search results.
  -> archive_scraper.py - Scrapes URLs from the archives of various websites.
  -> scrape_with_bs4.py - Scrapes the content from the scraped URLs.
  -> quick_scraper.py - Scrapes content in parallel, and therefore faster, by sending multiple requests per second.
  -> merger.py - Merges the data collected from the Google search results and the news archives.
  -> sentiment.py - Calculates the sentiment of the collected data.
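
As an illustration of the parallel scraping idea behind quick_scraper.py, the sketch below fetches several URLs concurrently with a thread pool from the Python standard library. It is only an assumption about how such a step could look, not the script's actual code; `fetch`, `fetch_all`, and the `max_workers` default are hypothetical names and values.

```python
from concurrent.futures import ThreadPoolExecutor
from urllib.request import urlopen


def fetch(url, timeout=10):
    """Download one page and return its body as text."""
    with urlopen(url, timeout=timeout) as resp:
        return resp.read().decode("utf-8", errors="replace")


def fetch_all(urls, fetcher=fetch, max_workers=8):
    """Fetch all urls concurrently.

    Returns {url: page text}, with None for any URL whose fetch raised.
    """
    results = {}
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # Submit every request up front so they run in parallel.
        futures = {url: pool.submit(fetcher, url) for url in urls}
        for url, future in futures.items():
            try:
                results[url] = future.result()
            except Exception:
                # A failed page should not abort the whole batch.
                results[url] = None
    return results
```

Passing the fetch function as a parameter keeps the concurrency logic separate from the HTTP details, which also makes it possible to try the function without network access. When scraping real sites, a modest `max_workers` helps avoid overloading them.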

Note

  • regexList is used to fetch more relevant URLs from the archives, not from the Google search results. The company names used for the Google search (search_scrape.py) should therefore be consistent with those listed in regexList.
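
To make the note above concrete, here is a minimal sketch of matching archive URLs or headlines against per-company regexes. The format of regexList is not shown in this README, so the `COMPANY_PATTERNS` mapping, its example entries, and `match_companies` are hypothetical.

```python
import re

# Hypothetical patterns; the real entries live in regexList.
COMPANY_PATTERNS = {
    "Infosys": r"infosys|infy",
    "Tata Motors": r"tata[\s\-_]*motors",
}


def match_companies(url_or_title, patterns=COMPANY_PATTERNS):
    """Return the company names whose regex matches the given URL or headline."""
    return [
        name
        for name, pattern in patterns.items()
        if re.search(pattern, url_or_title, re.IGNORECASE)
    ]
```

Using the same company name as the dictionary key and as the search term in search_scrape.py is one way to keep the two sources consistent.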

Contributors

gyanesh-m, greater, codefanatic23
