
defi-liquidity-events-effect's Introduction

📈 Impact of Liquidity Pool Size on Trading Volume in BTC-ETH Pools

Welcome to the official repository for the MSc research project titled "Impact of Liquidity Pool Size on Trading Volume in BTC-ETH Pools", conducted during the Summer 2023 semester. The repository contains all data engineering scripts, models, and additional resources used to support the findings and discussions presented in the final paper.

For full details of the project, see the research paper: Impact of Liquidity Pool Size on Trading Volume in BTC-ETH Pools.

๐Ÿƒ Quick Start

To source and clean the data, engineer the features, and fit the models, follow these steps:

  1. Set Up Environment:

    • Ensure your environment/local-secrets.env file contains an ETHERSCAN_KEY (you can obtain this for free from Etherscan).
    • Verify your configuration settings in environment/time_spans.py and run-config.env.
  2. Data Sourcing and Cleaning:

    • Follow the instructions in the dedicated README of the Data Sourcing and Cleaning directory.
  3. Feature Engineering:

    • Run the feature engineering processes in bulk using the script:
      ./Code/feature_engineering/run_feature_engineering.sh
      
  4. Predictive Modeling:

    • The Code/predictive_models directory contains the Ordinary Least Squares (OLS) models. These scripts can be run individually for detailed analysis and experimentation.

๐Ÿ“ Structure

This repository is organized into several directories:

  • 💻 Code:

    For detailed script descriptions, see the Script Descriptions section.

    1. โณ Data Sourcing and Cleaning
      • Contains a dedicated README with instructions and descriptions.
      • Includes three folders, one per source (binance, etherscan, uniswap).
      • Handles data sourcing via APIs/web calls (raw data stored in Data).
      • Processes data cleaning (results stored in Data/cleansed).
    2. โš™๏ธ Feature Engineering
      • To build the block interval chains and the analysis features, run the following files in order:
        1. 1_dex_interval_dataframes.py
        2. 2_dex_direct_pool.py
        3. 3_cex_spillover.py
        4. main.py
    3. 📊 EDA:
      • Produces basic EDA and analysis on the horizons and features.
    4. 🔮 Predictive Models
      • utils/ - Contains functions for data cleaning, model construction, and visualization.
        1. ols_all_horizons.py
        2. ols_best_horizon.py
  • Other Resources: Contains self-created diagrams, graphs, and screenshots.
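
The run order listed under Feature Engineering can be sketched as a small Python driver. This is a minimal sketch and not the repository's run_feature_engineering.sh: it assumes the four scripts live in Code/feature_engineering (the directory named in the Quick Start), and the helper names build_commands and run_pipeline are hypothetical.

```python
import subprocess
import sys
from pathlib import Path

# Order copied from the Feature Engineering list in this README.
FEATURE_SCRIPTS = [
    "1_dex_interval_dataframes.py",
    "2_dex_direct_pool.py",
    "3_cex_spillover.py",
    "main.py",
]


def build_commands(script_dir, python=sys.executable):
    """Return one command (argument list) per script, in run order."""
    base = Path(script_dir)
    return [[python, str(base / name)] for name in FEATURE_SCRIPTS]


def run_pipeline(script_dir="Code/feature_engineering", dry_run=True):
    """Run the scripts in order and stop at the first failure.

    The default directory is an assumption. With dry_run=True the
    commands are only printed, so paths can be checked before a long run.
    """
    for cmd in build_commands(script_dir):
        print(" ".join(cmd))
        if not dry_run:
            # check=True raises CalledProcessError if a script exits non-zero
            subprocess.run(cmd, check=True)


if __name__ == "__main__":
    run_pipeline(dry_run=True)
```

Calling run_pipeline(dry_run=False) from the repository root would execute the scripts for real, provided the assumed directory layout holds.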

โš ๏ธ Data is not versioned controlled in git. Please [for limited time] access data resources (raw, cleansed, processed) in the team's google drive: https://drive.google.com/drive/folders/1y5ZwLZK9GQYsCNYSY--4VQMg80dnuwuU?usp=sharing

  • ๐Ÿ—„๏ธ Data

    • FeatureDescription.md: Breaks down the features used in the reference paper and indicates which are in scope for this project.
    • processed/ Includes the processed cex_spillovers and direct_pool features when running locally.
    • processed/features/ Includes final data model writes when running locally.
  • Data - Holds interim data writes when running locally.

📜 Script Descriptions

| Script Name | Description |
| --- | --- |
| 1_dex_interval_dataframes 📄 | This script performs a range of data processing operations on Uniswap and Etherscan data, conducting cleaning, preprocessing, merging, and analysis tasks. It consolidates mint transactions for optimal interval analysis, infers block intervals, and logs counts of different transaction types. The interim results, including the consolidated DEX data, block data, and interval-based dataframes, are saved for subsequent analysis. The outcome enables a deeper understanding of transaction distributions across pools and types, facilitating further analysis and model building. |
| 2_dex_direct_pool 📄 | This script processes data from different pools of a decentralized exchange (DEX) and computes various metrics per transaction hash and interval. It loads previously computed interval dataframes, applies various calculations to extract insights, and consolidates the results into a single dataframe per pool. If run directly, the processed data can be saved into a CSV file. It is part of a larger system for analyzing DEX data. |
| 3_cex_spillover 📄 | This script loads, processes, and calculates metrics from the data of a Centralized Exchange (CEX), specifically Binance, relating to transactions' spillover effects on different pools of a decentralized exchange. The script organizes the data into intervals, calculates various metrics for each interval and each pool, and then consolidates the results. If the script is run directly, it can optionally save the processed data to a CSV file. The script is part of a larger system for analyzing CEX and DEX data. |
| main 📄 | This Python script serves as the final step of the feature engineering process. It accumulates the volume of each pool (the target variable) and merges this with both Centralized Exchange (CEX) and Decentralized Exchange (DEX) feature sets. It then organizes the data into Horizon tables for each pool. The script transforms raw and intermediate data into a form that's ready for use in building predictive models. The output data is saved as a set of CSV files, with each file representing the processed data for a particular pool. |
| ols_all_horizons 📄 | This script performs Ordinary Least Squares (OLS) regression analyses for predicting target variables related to cumulative volumes in different pools of a decentralized exchange. It tests different feature sets and makes predictions for all specified horizons. It stores the metrics of model performance, and finally, it visualizes the results and identifies the top performing models. The script is intended to help understand which features are most predictive of the target variables under different conditions, and can be used to inform feature and model selection in machine learning tasks related to decentralized exchange data. |
| ols_best_horizon 📄 | This script conducts Ordinary Least Squares (OLS) regression analyses in three primary steps: initial analysis, individual run OLS, and all horizons run OLS. The analyses are carried out on datasets from a specific pool of a decentralized exchange to predict cumulative volume. The script consists of the following steps: first_analysis function, individual_run_ols function, and all_horizons_run_ols function. These functions perform various tasks such as replacing nulls, aggregating columns, stepwise selection, and conducting OLS regression, and provide visual representation of data and results. |
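
The horizon comparison described for ols_all_horizons and ols_best_horizon can be illustrated with a deliberately reduced sketch. It assumes a single feature and closed-form simple OLS, whereas the project's scripts work with several features and stepwise selection. The function names fit_simple_ols and best_horizon and the toy numbers are hypothetical; they are not project data or results.

```python
from statistics import fmean


def fit_simple_ols(x, y):
    """Least-squares fit of y = intercept + slope * x.

    Returns (intercept, slope, r2), where r2 is the coefficient of
    determination on the fitted data.
    """
    mx, my = fmean(x), fmean(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = my - slope * mx
    ss_res = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))
    ss_tot = sum((yi - my) ** 2 for yi in y)
    return intercept, slope, 1.0 - ss_res / ss_tot


def best_horizon(x, targets):
    """Fit one model per horizon and pick the one with the highest R^2.

    targets maps a horizon label to that horizon's target values.
    Returns (best_label, {label: r2}).
    """
    scores = {h: fit_simple_ols(x, y)[2] for h, y in targets.items()}
    return max(scores, key=scores.get), scores


if __name__ == "__main__":
    # Made-up toy values, for illustration only.
    feature = [1.0, 2.0, 3.0, 4.0, 5.0]
    targets = {
        "h1": [2.1, 3.9, 6.2, 7.8, 10.1],
        "h2": [1.0, 5.0, 4.0, 9.0, 8.0],
    }
    best, scores = best_horizon(feature, targets)
    for label, r2 in scores.items():
        print(f"{label}: R^2 = {r2:.3f}")
    print("best horizon:", best)
```

Ranking horizons by in-sample R^2 is only one possible criterion; the source does not say which metric the scripts use to identify the top performing models.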

๐Ÿ› ๏ธ Requirements

An ETHERSCAN_KEY is required for API downloads and should be stored in the environment/local-secrets.env file.
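
Reading the key can be sketched as follows. This is a minimal sketch that assumes local-secrets.env holds plain KEY=VALUE lines; the repository may load the file differently, and the helper names load_env_file and get_etherscan_key are hypothetical.

```python
import os
from pathlib import Path


def load_env_file(path):
    """Parse KEY=VALUE lines into a dict, skipping blanks and # comments."""
    values = {}
    for raw in Path(path).read_text(encoding="utf-8").splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop optional surrounding quotes around the value.
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def get_etherscan_key(env_path="environment/local-secrets.env"):
    """Prefer an exported ETHERSCAN_KEY, then fall back to the env file."""
    key = os.environ.get("ETHERSCAN_KEY")
    if not key and Path(env_path).is_file():
        key = load_env_file(env_path).get("ETHERSCAN_KEY")
    if not key:
        raise RuntimeError(f"ETHERSCAN_KEY not found in environment or {env_path}")
    return key
```

Failing early with a clear error avoids starting a long download run that would only fail at the first API call.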

๐Ÿ† Results & Analysis

🌟 How to Contribute

We welcome contributions from the community. If you're interested in enhancing the analysis, refining models, or adding new features, please feel free to fork the repository and submit a pull request. For major changes, please open an issue first to discuss what you would like to change.

Please update tests as appropriate and follow the existing coding style.

Contributors

mvc8, evanj3, datagero
