t-taniai / symbolicgpt

This project forked from mojivalipour/symbolicgpt

Symbolic regression is the task of identifying a mathematical expression that best fits a provided dataset of input and output values. In this work, we present SymbolicGPT, a novel transformer-based language model for symbolic regression.

Home Page: https://git.uwaterloo.ca/data-analytics-lab/symbolicgpt2

License: MIT License

Python 98.03% Cython 0.93% Shell 1.04%

symbolicgpt's Introduction

This is the code for our proposed method, SymbolicGPT. We tried to keep the implementation as simple and clean as possible so that it is understandable and easy to reuse. Please feel free to add features and submit a pull request.

Notes

This repository has been modified by @t-taniai to fix bugs in the authors' original code. The fixes change the dataset generation, so results obtained with this repository will differ from those in the authors' original report.

Results/Models/Datasets

  • Download via link
  • The data for dataset v2 are not available yet.

Mirror Repository:

If you want to pull, open an issue, or follow this repository, use this GitHub repo link, which is a mirror of this repository. The UWaterloo GitLab instance is limited to users with @uwaterloo email addresses, so outside users cannot contribute to it directly.

Why do I not use GitHub directly? You can find the answer here. In short, I no longer trust GitHub as my primary repository: once, I was adversely affected for no good reason.

Original Repo: link

Abstract:

Symbolic regression is the task of identifying a mathematical expression that best fits a provided dataset of input and output values. Due to the richness of the space of mathematical expressions, symbolic regression is generally a challenging problem. While conventional approaches based on genetic evolution algorithms have been used for decades, deep learning-based methods are relatively new and an active research area. In this work, we present SymbolicGPT, a novel transformer-based language model for symbolic regression. This model exploits the advantages of probabilistic language models like GPT, including strength in performance and flexibility. Through comprehensive experiments, we show that our model performs strongly compared to competing models with respect to the accuracy, running time, and data efficiency.

Paper: link
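
To make the task described in the abstract concrete, here is a toy sketch of symbolic regression as "pick the expression that best fits the data". It is not the SymbolicGPT method: the data, the hidden target `x**2 + 1`, the candidate list, and the helper names are all assumptions made for illustration, and a real system searches a far larger expression space.

```python
import math

# Toy dataset sampled from a hidden target, y = x**2 + 1 (assumed for this example).
xs = [-2.0, -1.0, 0.0, 1.0, 2.0]
ys = [x**2 + 1 for x in xs]

# Hypothetical candidate expressions, keyed by their printed form.
candidates = {
    "x + 1": lambda x: x + 1,
    "x**2 + 1": lambda x: x**2 + 1,
    "sin(x)": lambda x: math.sin(x),
    "2*x": lambda x: 2 * x,
}


def mean_squared_error(f, xs, ys):
    """Average squared difference between f(x) and the observed y."""
    return sum((f(x) - y) ** 2 for x, y in zip(xs, ys)) / len(xs)


def best_expression(candidates, xs, ys):
    """Return the name of the candidate with the lowest mean squared error."""
    return min(
        candidates,
        key=lambda name: mean_squared_error(candidates[name], xs, ys),
    )


best = best_expression(candidates, xs, ys)
print(best)
```

Exhaustively scoring a fixed candidate list does not scale, which is why search-based approaches (such as genetic programming) and learned approaches (such as the transformer model presented here) exist.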

Set up the environment

  • Install Anaconda
  • Create the Conda environment from environment.yml (a requirements.txt is also provided as an alternative):
conda env create -f environment.yml
  • Alternatively, install the following packages with pip:
pip install numpy
pip install torch
pip install matplotlib
pip install scipy
pip install tqdm
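
After either installation route, a quick check that the packages listed above are importable can save a failed training run. The following is a minimal sketch using only the Python standard library; the helper name `missing_packages` is hypothetical and not part of this repository.

```python
import importlib.util

# Package names taken from the pip commands above.
REQUIRED = ["numpy", "torch", "matplotlib", "scipy", "tqdm"]


def missing_packages(names):
    """Return the names that cannot be found on the current import path."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_packages(REQUIRED)
    if missing:
        print("Missing packages:", ", ".join(missing))
    else:
        print("All required packages are importable.")
```

The check only confirms that each package can be located; it does not verify versions or GPU support.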

Dataset Generation

Run the following script to generate datasets.

$ . make_dataset.sh

Train/Test the model

Specify the configuration file with the --config argument to train a model on a dataset.

$ python symbolicGPT.py --config configs/train_1-9var.json
$ python symbolicGPT.py --config configs/train_1-5var.json
$ python symbolicGPT.py --config configs/train_3var.json
$ python symbolicGPT.py --config configs/train_2var.json
$ python symbolicGPT.py --config configs/train_1var.json
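
If you want to run the five configurations above one after another, the commands can be driven from a small script. This is a sketch, not part of the repository: `build_commands` and `run_all` are hypothetical helpers, and the script assumes it is started from the repository root so that `symbolicGPT.py` and `configs/` resolve. By default it only prints the commands (dry run).

```python
import subprocess
import sys

# Config paths copied from the commands above.
CONFIGS = [
    "configs/train_1-9var.json",
    "configs/train_1-5var.json",
    "configs/train_3var.json",
    "configs/train_2var.json",
    "configs/train_1var.json",
]


def build_commands(configs, script="symbolicGPT.py"):
    """Build one argument list per config, using the current Python interpreter."""
    return [[sys.executable, script, "--config", cfg] for cfg in configs]


def run_all(configs, dry_run=True):
    """Print each command; execute it only when dry_run is False."""
    for cmd in build_commands(configs):
        print(" ".join(cmd))
        if not dry_run:
            # check=True stops the loop if a run exits with a non-zero status.
            subprocess.run(cmd, check=True)


run_all(CONFIGS)  # dry run: prints the commands without launching training
```

Call `run_all(CONFIGS, dry_run=False)` to launch the runs sequentially.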

Directories

symbolicGPT
│   README.md --- This file
│   .gitignore --- Ignore tracking of large/unnecessary files in the repo
│   environment.yml --- Conda environment file
│   requirements.txt --- Pip environment file
│   models.py --- Class definitions of GPT and T-Net
│   trainer.py --- The code for training PyTorch-based models
│   data_loader.py --- Dataset class for data IO
│   utils.py --- Useful functions
│   symbolicGPT.py --- Main script to train and test our proposed method
│   dataset.py --- Main script to generate data.
│   make_dataset.sh --- Script to generate all datasets
│
├───configs
│   │   json config files for generating datasets and training models
│
├───generator
│   └───treeBased --- equation generator based on expression trees
│       │   generateData.py --- Base class for data generation
│
└───results
    │   symbolicGPT --- reported results for our proposed method
    │   DSR --- reported results for the Deep Symbolic Regression paper
    │   GP --- reported results for GPLearn
    │   MLP --- reported results for a simple black-box multilayer perceptron

System Spec:

  • Intel(R) Core(TM) i9-9900K CPU @ 3.60GHz
  • Single NVIDIA GeForce RTX 2080 11 GB
  • 32.0 GB RAM

Citation:

@inproceedings{
    SymbolicGPT2021,
    title={SymbolicGPT: A Generative Transformer Model for Symbolic Regression},
    author={Mojtaba Valipour and Maysum Panju and Bowen You and Ali Ghodsi},
    booktitle={Preprint Arxiv},
    year={2021},
    url={https://arxiv.org/abs/2106.14131},
    note={Under Review}
}

License:

MIT

Contributors

mhpanju, mojivalipour, taniai-osx
