Giter Club home page Giter Club logo

dbg-pds-tensorflow-demo's Introduction

Stock Price Movement Prediction Using The Deutsche Börse Public Dataset & Machine Learning

Introduction

We use neural networks applied to stock market data from the Deutsche Börse Public Dataset (PDS) to make predictions about future price movements for each stock.

Specifically, we make a prediction on the direction of the next minute's price change using information from the previous ten minutes. We use this to power a simplified trading strategy to show potential returns.

This is intended as a demonstrate of the applications on this data set.

The Deutsche Börse Public Dataset

The Deutsche Börse PDS project provides minute-by-minute statistics over trading data from the XETRA and EUREX engines.

We focus on XETRA only. It is comprised of a variety of equities, funds and derivative securities. The PDS contains details for on a per security level, detailing trading activity by minute including the high, low, first and last prices within the time period.

Getting Started

Ensure you have Docker installed before completing the following steps.

  1. Run ./build.sh in the main repo folder to build the Docker image.
  2. Run ./run-notebook.sh to receive the notebook URL. Copy/paste this into your browser to access the notebook.
  3. Start with the notebooks in order. Notebook 02- prepared the data for the other notebooks.

Additionally, you should run step 1 (./build.sh) after each pull where the Dockerfile has been updated to rebuild your local version against the latest update.

Project Structure

The work here is divided across three notebooks:

Additional notebooks

  • What prices are predictable
    • We find out that it matters weather you predict an EndPrice, a MeanPrice or a MedianPrice in the next interval. We show how one can normalize the prices to improve the prediction.
  • Clustering Stocks
    • We cluster 100 stocks from the dataset using data from 60 days.
  • Simpler Linear Model
    • We show a well-performing linear model with hand-engineered features on a single stock. We predict the average price of the next day for a single stock. This is intended to get started easily with the dataset and price modeling.
  • Large-scale linear model predicting 20 minutes ahead
    • We run a linear model on the 50 most liquid stocks with proper training and test sets. We predict the direction of the average price in the next 20 minutes.

Documentation

General project documentation can be found in the wiki here.

Authors

  • Stefan Savev (Originate)
  • Rey Farhan (Originate)

dbg-pds-tensorflow-demo's People

Contributors

stefansavev-o avatar y-f-a avatar

Stargazers

 avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar

Watchers

 avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar

dbg-pds-tensorflow-demo's Issues

error


KeyError Traceback (most recent call last)
~/anaconda3/envs/tensorflow_p36/lib/python3.6/site-packages/pandas/core/indexes/base.py in get_loc(self, key, method, tolerance)
2524 try:
-> 2525 return self._engine.get_loc(key)
2526 except KeyError:

pandas/_libs/index.pyx in pandas._libs.index.IndexEngine.get_loc()

pandas/_libs/index.pyx in pandas._libs.index.IndexEngine.get_loc()

pandas/_libs/hashtable_class_helper.pxi in pandas._libs.hashtable.PyObjectHashTable.get_item()

pandas/_libs/hashtable_class_helper.pxi in pandas._libs.hashtable.PyObjectHashTable.get_item()

KeyError: <filter object at 0x7f09edf03390>

During handling of the above exception, another exception occurred:

KeyError Traceback (most recent call last)
in ()
38
39 combined_training_set_df = pd.concat(combined_training_set, axis=0)
---> 40 training_set = TrainingSetBuilder().transform(combined_training_set_df)
41
42 combined_valid_set_df = pd.concat(combined_valid_set, axis=0)

in transform(self, single_stock)
9 def transform(self, single_stock):
10 x_features = filter(lambda name: name.startswith('x(') or name.startswith('x:'), list(single_stock.dtypes.index))
---> 11 X = single_stock[x_features].values
12 y = single_stock[['pseudo_y(SignReturn)']].values
13 return TrainingSet(X, y, single_stock)

~/anaconda3/envs/tensorflow_p36/lib/python3.6/site-packages/pandas/core/frame.py in getitem(self, key)
2137 return self._getitem_multilevel(key)
2138 else:
-> 2139 return self._getitem_column(key)
2140
2141 def _getitem_column(self, key):

~/anaconda3/envs/tensorflow_p36/lib/python3.6/site-packages/pandas/core/frame.py in _getitem_column(self, key)
2144 # get column
2145 if self.columns.is_unique:
-> 2146 return self._get_item_cache(key)
2147
2148 # duplicate columns & possible reduce dimensionality

~/anaconda3/envs/tensorflow_p36/lib/python3.6/site-packages/pandas/core/generic.py in _get_item_cache(self, item)
1840 res = cache.get(item)
1841 if res is None:
-> 1842 values = self._data.get(item)
1843 res = self._box_item_values(item, values)
1844 cache[item] = res

~/anaconda3/envs/tensorflow_p36/lib/python3.6/site-packages/pandas/core/internals.py in get(self, item, fastpath)
3841
3842 if not isna(item):
-> 3843 loc = self.items.get_loc(item)
3844 else:
3845 indexer = np.arange(len(self.items))[isna(self.items)]

~/anaconda3/envs/tensorflow_p36/lib/python3.6/site-packages/pandas/core/indexes/base.py in get_loc(self, key, method, tolerance)
2525 return self._engine.get_loc(key)
2526 except KeyError:
-> 2527 return self._engine.get_loc(self._maybe_cast_indexer(key))
2528
2529 indexer = self.get_indexer([key], method=method, tolerance=tolerance)

pandas/_libs/index.pyx in pandas._libs.index.IndexEngine.get_loc()

pandas/_libs/index.pyx in pandas._libs.index.IndexEngine.get_loc()

pandas/_libs/hashtable_class_helper.pxi in pandas._libs.hashtable.PyObjectHashTable.get_item()

pandas/_libs/hashtable_class_helper.pxi in pandas._libs.hashtable.PyObjectHashTable.get_item()

KeyError: <filter object at 0x7f09edf03390>

cannot generate the download.sh script in notebook-02

Hi thanks for putting together such a fantastic use case.

Curious how the download script i.e.
download_script = '~/data/deutsche-boerse-xetra-pds/download.sh'
is generated? I cannot get this line to run. I'm trying to grab a range of data, around a year, from the data store. Thanks!

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. 📊📈🎉

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google ❤️ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.