
stock-price-prediction-with-bert's Introduction

This post describes how I used a BERT model with news articles to predict the DJIA's adjusted close prices from 2008 to 2016.

There are 1,989 samples in total, one per day, and each day contains the top 25 news articles. The dataset was gathered by Aaron7sun and has the following structure.

import pandas as pd
djia = pd.read_csv('data/Combined_News_DJIA.csv')
djia.head()
Date Label Top1 Top2 Top3 Top4 Top5 Top6 Top7 Top8 ... Top16 Top17 Top18 Top19 Top20 Top21 Top22 Top23 Top24 Top25
0 2008-08-08 0 b"Georgia 'downs two Russian warplanes' as cou... b'BREAKING: Musharraf to be impeached.' b'Russia Today: Columns of troops roll into So... b'Russian tanks are moving towards the capital... b"Afghan children raped with 'impunity,' U.N. ... b'150 Russian tanks have entered South Ossetia... b"Breaking: Georgia invades South Ossetia, Rus... b"The 'enemy combatent' trials are nothing but... ... b'Georgia Invades South Ossetia - if Russia ge... b'Al-Qaeda Faces Islamist Backlash' b'Condoleezza Rice: "The US would not act to p... b'This is a busy day: The European Union has ... b"Georgia will withdraw 1,000 soldiers from Ir... b'Why the Pentagon Thinks Attacking Iran is a ... b'Caucasus in crisis: Georgia invades South Os... b'Indian shoe manufactory - And again in a se... b'Visitors Suffering from Mental Illnesses Ban... b"No Help for Mexico's Kidnapping Surge"
1 2008-08-11 1 b'Why wont America and Nato help us? If they w... b'Bush puts foot down on Georgian conflict' b"Jewish Georgian minister: Thanks to Israeli ... b'Georgian army flees in disarray as Russians ... b"Olympic opening ceremony fireworks 'faked'" b'What were the Mossad with fraudulent New Zea... b'Russia angered by Israeli military sale to G... b'An American citizen living in S.Ossetia blam... ... b'Israel and the US behind the Georgian aggres... b'"Do not believe TV, neither Russian nor Geor... b'Riots are still going on in Montreal (Canada... b'China to overtake US as largest manufacturer' b'War in South Ossetia [PICS]' b'Israeli Physicians Group Condemns State Tort... b' Russia has just beaten the United States ov... b'Perhaps *the* question about the Georgia - R... b'Russia is so much better at war' b"So this is what it's come to: trading sex fo...
2 2008-08-12 0 b'Remember that adorable 9-year-old who sang a... b"Russia 'ends Georgia operation'" b'"If we had no sexual harassment we would hav... b"Al-Qa'eda is losing support in Iraq because ... b'Ceasefire in Georgia: Putin Outmaneuvers the... b'Why Microsoft and Intel tried to kill the XO... b'Stratfor: The Russo-Georgian War and the Bal... b"I'm Trying to Get a Sense of This Whole Geor... ... b'U.S. troops still in Georgia (did you know t... b'Why Russias response to Georgia was right' b'Gorbachev accuses U.S. of making a "serious ... b'Russia, Georgia, and NATO: Cold War Two' b'Remember that adorable 62-year-old who led y... b'War in Georgia: The Israeli connection' b'All signs point to the US encouraging Georgi... b'Christopher King argues that the US and NATO... b'America: The New Mexico?' b"BBC NEWS | Asia-Pacific | Extinction 'by man...
3 2008-08-13 0 b' U.S. refuses Israel weapons to attack Iran:... b"When the president ordered to attack Tskhinv... b' Israel clears troops who killed Reuters cam... b'Britain\'s policy of being tough on drugs is... b'Body of 14 year old found in trunk; Latest (... b'China has moved 10 *million* quake survivors... b"Bush announces Operation Get All Up In Russi... b'Russian forces sink Georgian ships ' ... b'Elephants extinct by 2020?' b'US humanitarian missions soon in Georgia - i... b"Georgia's DDOS came from US sources" b'Russian convoy heads into Georgia, violating... b'Israeli defence minister: US against strike ... b'Gorbachev: We Had No Choice' b'Witness: Russian forces head towards Tbilisi... b' Quarter of Russians blame U.S. for conflict... b'Georgian president says US military will ta... b'2006: Nobel laureate Aleksander Solzhenitsyn...
4 2008-08-14 1 b'All the experts admit that we should legalis... b'War in South Osetia - 89 pictures made by a ... b'Swedish wrestler Ara Abrahamian throws away ... b'Russia exaggerated the death toll in South O... b'Missile That Killed 9 Inside Pakistan May Ha... b"Rushdie Condemns Random House's Refusal to P... b'Poland and US agree to missle defense deal. ... b'Will the Russians conquer Tblisi? Bet on it,... ... b'Bank analyst forecast Georgian crisis 2 days... b"Georgia confict could set back Russia's US r... b'War in the Caucasus is as much the product o... b'"Non-media" photos of South Ossetia/Georgia ... b'Georgian TV reporter shot by Russian sniper ... b'Saudi Arabia: Mother moves to block child ma... b'Taliban wages war on humanitarian aid workers' b'Russia: World "can forget about" Georgia\'s... b'Darfur rebels accuse Sudan of mounting major... b'Philippines : Peace Advocate say Muslims nee...

5 rows × 27 columns

I started by preprocessing the articles: each one is stored as a string and carries an unnecessary leading b' or b".
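A minimal sketch of that cleanup step, assuming each cell is the text form of a Python bytes literal as shown in the table above. The function name `clean_headline` is hypothetical and is not taken from the original posts; it also unescapes `\'` and `\"`, which is an extra assumption about what the cleanup should do.

```python
import re

# Matches a whole string of the form b'...' or b"..." with matching quotes.
_BYTES_LITERAL = re.compile(r"""b(['"])(.*)\1""", flags=re.DOTALL)


def clean_headline(text):
    """Remove a leading b' or b" (and the closing quote) from one headline."""
    if not isinstance(text, str):
        # Missing values (e.g. NaN in a pandas column) become empty strings.
        return ""
    text = text.strip()
    match = _BYTES_LITERAL.fullmatch(text)
    if match:
        text = match.group(2)
    # Undo escaped quotes left over from the bytes-literal representation.
    text = text.replace("\\'", "'").replace('\\"', '"')
    return text.strip()


print(clean_headline("b'BREAKING: Musharraf to be impeached.'"))
print(clean_headline('b"No Help for Mexico\'s Kidnapping Surge"'))
```

With the `djia` frame loaded earlier, the same function could be applied to each `Top1` ... `Top25` column, for example with `djia[col].map(clean_headline)`.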

Once that was done, I passed the articles to BERT to obtain numeric vectors, which were then fed into a neural network to compute a coefficient indicating how much the news articles affect a day's open price, along with the predicted close values.
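The data flow can be illustrated with a small, framework-free sketch. Everything in it is an assumption made for illustration: the random vectors stand in for BERT outputs, the mean pooling over a day's 25 articles, the single linear layer with a `tanh` bound, and the formula `close = open * (1 + coefficient)` are not confirmed by the README, and the names `mean_pool` and `predict_close` are hypothetical. The actual models are described in Part 1 and Part 2.

```python
import math
import random


def mean_pool(vectors):
    """Average a day's article vectors into one fixed-size day vector."""
    dim = len(vectors[0])
    return [sum(v[i] for v in vectors) / len(vectors) for i in range(dim)]


def predict_close(open_price, day_vector, weights, bias, max_move=0.05):
    """Turn a day vector into a bounded coefficient and scale the open price.

    Assumed form: close = open * (1 + coefficient), with the coefficient
    limited to the range [-max_move, max_move] by tanh.
    """
    score = sum(w * x for w, x in zip(weights, day_vector)) + bias
    coefficient = max_move * math.tanh(score)
    return open_price * (1.0 + coefficient), coefficient


random.seed(0)
dim = 8  # small stand-in; BERT-base vectors have 768 dimensions
# 25 random vectors standing in for the BERT vectors of one day's articles.
articles = [[random.gauss(0, 1) for _ in range(dim)] for _ in range(25)]
# Untrained placeholder weights; a real model would learn these from data.
weights = [random.gauss(0, 0.1) for _ in range(dim)]

day_vector = mean_pool(articles)
open_price = 100.0  # hypothetical open price, not taken from the dataset
close, coefficient = predict_close(open_price, day_vector, weights, bias=0.0)
print(f"coefficient={coefficient:+.4f}, predicted close={close:.2f}")
```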

I used five different models; the following plots show their predicted values.

The predictions have high variance, but the models capture the general trends. One way to improve them is to set a threshold so that the close price does not change drastically overnight.

The posts include all the code I used, with explanations, and are divided into two separate pages.
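One way such a threshold might look is sketched below. The 3% limit, the choice to measure each change against the previous clamped value, and the function names `clamp_close` and `smooth_predictions` are all assumptions for illustration; the README does not specify how the threshold would be applied.

```python
def clamp_close(predicted_close, previous_close, max_change=0.03):
    """Limit a predicted close to within +/- max_change of the previous close."""
    lower = previous_close * (1 - max_change)
    upper = previous_close * (1 + max_change)
    return min(max(predicted_close, lower), upper)


def smooth_predictions(predictions, first_previous_close, max_change=0.03):
    """Clamp a sequence of predictions, each against the previous clamped value."""
    smoothed = []
    previous = first_previous_close
    for predicted in predictions:
        value = clamp_close(predicted, previous, max_change)
        smoothed.append(value)
        previous = value
    return smoothed


# Hypothetical predictions with one large overnight jump.
raw = [101.0, 120.0, 118.0]
print(smooth_predictions(raw, first_previous_close=100.0))
```

Clamping against the previous clamped value keeps the whole series smooth, but it also means a genuine large move takes several days to show up in the output; clamping against the previous actual close is an alternative when that value is available.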

  1. Part 1
  2. Part 2

All models are implemented using Gluon and TensorFlow.

stock-price-prediction-with-bert's People

Contributors

hsong1101
