pedrojuanbj / mltsa

Machine Learning Transition State Analysis (MLTSA) suite, with analytical models to generate data on demand and test the approach on different types of data and ML models.

Home Page: https://mltsa.readthedocs.io/en/latest/

License: MIT License

Python 2.63% Jupyter Notebook 97.37%
tensorflow machine-learning deep-learning molecular-dynamics-analysis molecular-dynamics sklearn-compatible tensorflow-compatible sklearn time-series time-series-analysis time-series-classification enhanced-sampling

mltsa's Introduction

MLTSA: Machine Learning Transition State Analysis repository

Introduction

This is a Python package for applying the MLTSA approach to identify relevant collective variables (CVs) in Molecular Dynamics data, using both Scikit-Learn and TensorFlow modules. It also includes a 1D analytical potential module that generates features for light testing, and a suite of 2D potential shapes (Spiral, Z-shaped) for data generation, with features subsequently generated from 1D projections of the 2D data. In this package you will find:

  • Data Generation Module (MLTSA_datasets): Contains easy-to-call 1D/2D/MD examples for generating data or experimenting with it to test the approach.
  • Scikit-Learn-based ML models and Feature Reduction module (MLTSA_sklearn) : Contains the Scikit-Learn integrated functions to apply MLTSA on data.
  • TensorFlow-based ML models and Feature Reduction module (MLTSA_tensorflow): Contains the set of functions and different models built on TensorFlow to apply MLTSA on data.
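
The idea of generating features from 1D projections of 2D data can be illustrated with a minimal sketch. This is not the MLTSA_datasets API: the function names, the choice of random projection angles, and the toy trajectory are all assumptions made for illustration. Each feature is the coordinate of a 2D frame along one direction in the plane.

```python
import math
import random


def make_projection_angles(n_features, seed=0):
    """Draw one random direction (angle in radians) per 1D feature.

    Hypothetical helper: how MLTSA picks its projection lines is not
    stated in this document.
    """
    rng = random.Random(seed)
    return [rng.uniform(0.0, math.pi) for _ in range(n_features)]


def project_trajectory(trajectory, angles):
    """Project every (x, y) frame onto the unit vector of each angle.

    Returns one row per frame and one column per angle.
    """
    directions = [(math.cos(a), math.sin(a)) for a in angles]
    return [[x * dx + y * dy for dx, dy in directions] for x, y in trajectory]


# Toy 2D trajectory: a short straight walk along the x axis.
trajectory = [(0.1 * i, 0.0) for i in range(5)]

# Two fixed directions (the x and y axes) plus three random ones.
angles = [0.0, math.pi / 2] + make_projection_angles(3)
features = project_trajectory(trajectory, angles)
```

Here the first feature column reproduces the x coordinate and the second column stays at zero, while the random directions mix both coordinates to varying degrees.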

Usage

  • Example OneD
  • Example TwoD
  • Example Train
  • Example MLTSA

Installation

To use MLTSA, first install it using pip:

(.venv) $ pip install MLTSA

mltsa's People

Contributors

pedrojuanbj, zwei21

mltsa's Issues

Test all Notebooks for bugs

If anyone would be so kind, please run all available notebooks, find any bugs, and list them on the project page for bugs so that we are aware of them and someone can fix them.

2D Models (Z-shaped and Spiral) data gen implementation on datagen.py module

We need to implement the 2D data generation classes and the suite for projection to 1D in a single module inside MLTSA_datasets, under the 2D folder. The module can have any name, but something descriptive such as datagen.py or datagen_2D.py is preferred.

For this implementation to be shipped, we need to add the latest data generation code for the models. We could include optional plots of the free energy surface of the potentials. It should also be implemented as a module call similar to the 1D one, so that data can be generated on demand, since the code is now really fast and optimized. Additionally, the 1D projection code should be added to it so that training-ready data is also available on demand.
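
As a rough sketch of what an on-demand generation call could look like, the snippet below runs overdamped Langevin dynamics (Euler-Maruyama integration) on a toy 2D potential. Everything here is an assumption for illustration: the double-well potential stands in for the Z-shaped and Spiral potentials, whose functional forms are not given in this issue, and the function names and default parameters are hypothetical rather than taken from the MLTSA code.

```python
import math
import random


def force(x, y):
    """Negative gradient of the toy potential V(x, y) = (x**2 - 1)**2 + y**2."""
    return -4.0 * x * (x * x - 1.0), -2.0 * y


def generate_trajectory(n_steps, dt=0.01, kT=0.5, start=(0.0, 0.0), seed=None):
    """Integrate overdamped Langevin dynamics with the Euler-Maruyama scheme."""
    rng = random.Random(seed)
    noise = math.sqrt(2.0 * kT * dt)  # standard deviation of the random kick
    x, y = start
    frames = [(x, y)]
    for _ in range(n_steps):
        fx, fy = force(x, y)
        x += fx * dt + noise * rng.gauss(0.0, 1.0)
        y += fy * dt + noise * rng.gauss(0.0, 1.0)
        frames.append((x, y))
    return frames


def generate_dataset(n_trajectories, n_steps, seed=0):
    """Generate independent trajectories, one seed per trajectory."""
    return [generate_trajectory(n_steps, seed=seed + i) for i in range(n_trajectories)]


# Single call that returns data on demand.
dataset = generate_dataset(n_trajectories=10, n_steps=200)
```

Starting every trajectory at the barrier top (0, 0) means each one falls into either well, which is the kind of outcome a later labelling step would record.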

2D Models data labelling and clustering module

We need to implement the functions/classes for labelling the 2D data (for both Z-shaped and Spiral), and add functions for visualizing the data and for projections on demand.
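
One simple way to label 2D trajectories is by the region in which they end. The sketch below is only an assumption about how such a function might look: the dividing line at x = 0, the averaging over the last frames, and the integer labels are illustrative choices, not the labelling scheme used for the Z-shaped or Spiral data.

```python
def label_trajectory(trajectory, n_last=10, boundary=0.0):
    """Label a trajectory by the mean x position of its last frames.

    Returns 1 if the tail lies to the right of the boundary, else 0.
    Averaging over several frames makes the label less sensitive to noise
    in the final frame.
    """
    tail = trajectory[-n_last:]
    mean_x = sum(x for x, _ in tail) / len(tail)
    return 1 if mean_x > boundary else 0


def label_dataset(trajectories, **kwargs):
    """Apply label_trajectory to every trajectory in a dataset."""
    return [label_trajectory(traj, **kwargs) for traj in trajectories]


# Two toy trajectories that stay on opposite sides of the boundary.
left = [(-1.0 + 0.01 * i, 0.0) for i in range(20)]
right = [(1.0 - 0.01 * i, 0.0) for i in range(20)]
labels = label_dataset([left, right])
```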

Restructure of MLTSA repo

Per Pedro's idea, the project management of the MLTSA repository should be strengthened by restructuring the files in the MLTSA repo.
Following the basic blueprint of aim analysis, this issue discusses the following points:

  1. What: Improve the management of the MLTSA project, yielding a better layout for users and for readers of the MLTSA paper who have been linked to this repo from the paper.
  • For package use: The import structure should stay shallow, since a deep structure is bad API design. For example, use from MLTSA import dataset and then dataset.functions, and avoid deep imports such as from MLTSA.dataset.twoDdata.generator import ...
  • For package development: The change log and readme file should be updated regularly, so that users and other developers are clearly aware of the project's status.
  • For paper reference: A notebook folder would be an ideal container for example code, providing proper supporting examples for readers who arrive at this repo from the published paper. If possible, the example code used in the paper should be kept unchanged, either with a GitHub archive (which yields a new repo) or in a new branch created to store it.
  2. Who: Pedro and zwei21 will discuss and work on this issue together.
  3. Where: The file structure of the whole project repo should be considered for restructuring.
  4. How:
  • Pedro has suggested a Python package called "cookiecutter", which automatically generates a file directory from defined templates. Pedro has recommended a few templates suitable for deep learning projects. However, this method generates a brand new directory, which means the original MLTSA files would have to be moved into the new directory; it is unclear whether this would cost more time or effort.
  • zwei21 suggests amending the current file directory according to the given repo templates. This would be less difficult, since a few folders could be reused in the new structure. However, it requires reconsidering the current structure of the MLTSA repo and rearranging code that is already built, and problems such as broken imports in current files would occur when this plan is merged. This plan requires a good understanding of the whole MLTSA structure; for this reason, zwei21 suggests commenting and documenting all current code and files before the reconstruction, like making an inventory report before reorganizing a warehouse.
  5. When: This should be done no later than the end of August 2022, when zwei21 leaves UCL, after which any incomplete projects would be difficult to finish.
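
The shallow-import point above can be illustrated with a small, self-contained sketch. It builds a throwaway package on disk (named demo_pkg here, a hypothetical stand-in for MLTSA, with a hypothetical make_data function) whose implementation lives in a deeply nested module, and uses __init__.py re-exports so that users only need a shallow import.

```python
import importlib
import sys
import tempfile
from pathlib import Path

# Build a throwaway package: demo_pkg/dataset/twoDdata/generator.py
root = Path(tempfile.mkdtemp())
pkg = root / "demo_pkg"
deep = pkg / "dataset" / "twoDdata"
deep.mkdir(parents=True)

# Deeply nested implementation module (hypothetical function).
(deep / "generator.py").write_text("def make_data(n):\n    return list(range(n))\n")
(deep / "__init__.py").write_text("")

# The subpackage re-exports the function, so users never see the deep path.
(pkg / "dataset" / "__init__.py").write_text(
    "from .twoDdata.generator import make_data\n__all__ = ['make_data']\n"
)
(pkg / "__init__.py").write_text("from . import dataset\n")

# Make the temporary package importable.
sys.path.insert(0, str(root))
importlib.invalidate_caches()

# Shallow import: no knowledge of the internal layout is required.
from demo_pkg import dataset

print(dataset.make_data(3))
```

With this layout the internal folders can be rearranged later without breaking user code, as long as the re-exports in the __init__.py files are kept up to date.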
