In this repository, we provide a standard pipeline to help you kick off our hackathon. The pipeline includes:
- data import and exploratory analysis,
- the model build-up for both the generator and discriminator using LSTM modules,
- training algorithm design,
- offline evaluation module.
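As a rough illustration of the second bullet, the generator and discriminator might be wired together as below. This is a minimal sketch, not the repo's actual architecture: the dimensions (`noise_dim=4`, `hidden_dim=64`) and layer layout are assumptions for illustration only.

```python
import torch
import torch.nn as nn

class LSTMGenerator(nn.Module):
    """Maps a noise sequence to a fake path of the same length.
    Hypothetical sketch; dimensions are illustrative assumptions."""
    def __init__(self, noise_dim=4, hidden_dim=64, feature_dim=4):
        super().__init__()
        self.lstm = nn.LSTM(noise_dim, hidden_dim, batch_first=True)
        self.proj = nn.Linear(hidden_dim, feature_dim)

    def forward(self, z):
        h, _ = self.lstm(z)          # (batch, steps, hidden_dim)
        return self.proj(h)          # (batch, steps, feature_dim)

class LSTMDiscriminator(nn.Module):
    """Scores a path as real/fake from the last LSTM hidden state."""
    def __init__(self, feature_dim=4, hidden_dim=64):
        super().__init__()
        self.lstm = nn.LSTM(feature_dim, hidden_dim, batch_first=True)
        self.proj = nn.Linear(hidden_dim, 1)

    def forward(self, x):
        h, _ = self.lstm(x)
        return torch.sigmoid(self.proj(h[:, -1]))  # (batch, 1)
```

See `example_pipeline.ipynb` for how the actual models are defined and trained.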
The data used for training and testing comes from the public data on the main hackathon website.
The code has been tested with Python 3.8 and PyTorch 1.11.0. A typical setup involves creating a new Python virtual environment and then installing the required packages:
```
conda install pytorch==1.11.0 torchvision==0.12.0 torchaudio==0.11.0 cudatoolkit=10.2 -c pytorch
pip install cupy-cuda102
pip install -r requirements.txt
```
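After installation, a quick sanity check (a hypothetical helper, not part of the repo) confirms that PyTorch imports correctly and reports whether a CUDA device is visible:

```python
import torch

# Verify the install and pick a device; falls back to CPU
# if no CUDA-capable GPU (or cudatoolkit) is available.
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
print(f"PyTorch {torch.__version__}, running on {device}")
```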
For a code walkthrough, please take a closer look at the Jupyter notebook we created, example_pipeline.ipynb.
For this challenge, the training data is located at data/ref_data.pkl. It contains 20000 sample paths representing the price and volatility processes of 2 correlated assets. Each sample path is sampled uniformly from [0, 1] with 20 time steps. Since the dataset provides trajectories for both the price and volatility processes, each time step carries the features [price_asset_1, volatility_asset_1, price_asset_2, volatility_asset_2]. The data is stored in a .pkl file with shape [20000, 20, 4].
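The layout above can be sketched as follows. A synthetic zero array stands in for the real file here; the commented-out lines show how the actual pickle would be loaded (the feature ordering is taken from the description above).

```python
import numpy as np
# import pickle
# For the real data:
# with open("data/ref_data.pkl", "rb") as f:
#     data = np.asarray(pickle.load(f))

# Synthetic stand-in with the documented shape [n_paths, n_steps, n_features]:
data = np.zeros((20000, 20, 4))

FEATURES = ["price_asset_1", "volatility_asset_1",
            "price_asset_2", "volatility_asset_2"]

# Slice out a single process, e.g. the price of asset 1:
price_1 = data[:, :, FEATURES.index("price_asset_1")]  # shape (20000, 20)
```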
We also provide a sample submission bundle at sample_submission_bundle which includes:
- model_dict.pkl: dictionary of model parameters used to generate the samples.
- model.py: script defining your model architecture, model loading, and data generation.
- fake.pkl: fake data generated by the trained model.
Finally, we wish you good luck during the competition and, most importantly, have fun!