ocbe-uio / trajpy
Trajpy - empowering feature engineering for trajectory analysis across domains.
Home Page: https://ocbe-uio.github.io/trajpy/
License: GNU General Public License v3.0
./home/runner/work/trajpy/trajpy/trajpy/traj_generator.py:127:
DeprecationWarning: Conversion of an array with ndim > 0
to a scalar is deprecated, and will error in future.
Ensure you extract a single element from your array before performing this operation. (Deprecated NumPy 1.25.)
y[i_step, i_sample] = sub_y[-1]
Affected code:
trajpy/trajpy/traj_generator.py
Line 127 in d86094c
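A minimal sketch of one way to silence this warning, assuming `sub_y[-1]` is a one-element array rather than a scalar (the actual shape in `traj_generator.py` is not shown here). The helper name `last_value` is hypothetical.

```python
import numpy as np


def last_value(sub_y):
    """Return the last entry of sub_y as a plain Python scalar.

    Assumes the last entry holds exactly one element; ndarray.item()
    raises ValueError otherwise, which makes a shape problem visible.
    """
    return np.asarray(sub_y[-1]).item()


# Usage, mirroring the assignment on the affected line
y = np.zeros((3, 2))
sub_y = np.array([[0.1], [0.2], [0.3]])  # each row is a 1-element array
y[2, 1] = last_value(sub_y)
```

If `sub_y[-1]` can hold more than one element, the indexing itself needs to change instead (for example `sub_y[-1, 0]`).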
Implement default parameters for opening csv files to make loading data easier.
params = { 'skip_header':1, 'delimiter':',' }
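A possible sketch of how these defaults could be applied, assuming the file is read with `numpy.genfromtxt` (whose `skip_header` and `delimiter` keywords match the dictionary above). The names `DEFAULT_CSV_PARAMS` and `read_csv` are hypothetical.

```python
import io

import numpy as np

# Defaults proposed in the issue; callers can override any of them.
DEFAULT_CSV_PARAMS = {"skip_header": 1, "delimiter": ","}


def read_csv(source, **overrides):
    """Read a csv file (path or file-like object) into a numpy array."""
    params = {**DEFAULT_CSV_PARAMS, **overrides}
    return np.genfromtxt(source, **params)


# Usage: defaults for a comma-separated file, override for a ';' file
data = read_csv(io.StringIO("t,x\n0,1.5\n1,2.5\n"))
other = read_csv(io.StringIO("t;x\n0;1.5\n"), delimiter=";")
```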
Scikit-learn is only used for fitting the mean squared displacements (MSD) in order to obtain the anomalous exponent. This is easily replaced by scipy.stats.linregress (OLS) or by a custom function. Fewer external dependencies keep the project easier to maintain.
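A sketch of the "custom function" route, assuming the anomalous exponent is the slope of log(MSD) against log(time lag). This is not the current trajpy implementation, and the function name is hypothetical. `scipy.stats.linregress(log_t, log_msd).slope` should give the same number.

```python
import numpy as np


def anomalous_exponent(msd, time_lag):
    """Ordinary least-squares slope of log(msd) versus log(time_lag).

    Both inputs must be strictly positive, so a zero time lag
    has to be dropped before calling this function.
    """
    log_t = np.log(np.asarray(time_lag, dtype=float))
    log_msd = np.log(np.asarray(msd, dtype=float))
    x = log_t - log_t.mean()
    y = log_msd - log_msd.mean()
    return float(np.sum(x * y) / np.sum(x * x))


# Usage: a synthetic sub-diffusive MSD, msd = 4 * t**0.5
t = np.array([1.0, 2.0, 4.0, 8.0])
alpha = anomalous_exponent(4.0 * t**0.5, t)
```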
This function should be removed:
Line 446 in 0de9be8
We must not forget to remove every reference to the function from the tests and from compute_features().
When processing several trajectories, show the progress status.
Simple: write "n/len(trajectories)" in the text box.
Advanced: implement a progress bar.
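The simple variant could look roughly like this. `process_all` and the `report` callback are hypothetical names; in the GUI, `report` would be whatever updates the text box.

```python
def process_all(trajectories, process, report=print):
    """Apply `process` to every trajectory and report "n/total" after each one."""
    total = len(trajectories)
    results = []
    for n, trajectory in enumerate(trajectories, start=1):
        results.append(process(trajectory))
        report(f"{n}/{total}")
    return results


# Usage: collect the status messages in a list instead of printing them
messages = []
doubled = process_all([1, 2, 3], lambda t: 2 * t, report=messages.append)
```

Keeping the reporting behind a callback means the same loop can later drive a progress bar without changes.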
Previously I left the dependencies that are only required to run the GUI out of the requirements list, because the main intended usage for trajpy was via scripting. However, now that the GUI has matured and is minimally functional, I think it makes sense to deliver the GUI as the main method for using the package.
The packages ttkthemes and Pillow are required to run the GUI and are not listed in requirements.txt.
ttkthemes>=2.4.0
Pillow>=8.1.0
We should write in the documentation how to generate the synthetic data and refer to the dataset available at https://zenodo.cern.ch/record/3627650 .
The calculation of the diffusion coefficient should be moved to a new @staticmethod (see below).
https://github.com/phydev/trajpy/blob/562415b0f35976c726df7fc8daa773dacca7fb2c/trajpy/trajpy.py#L74
@staticmethod
def diffusivity_(msd, timelag, ndim):
    """
    :param msd: ensemble averaged mean squared displacement
    :param timelag: time-lag
    :param ndim: number of dimensions
    :return diffusivity: short-time diffusion coefficient D
    """
    # fit over the first 10 points, normalised by 2 * ndim
    diffusivity = Trajectory.anomalous_exponent_(msd[:10], timelag[:10]) / (2 * ndim)
    return diffusivity
When we started coding TrajPy we only thought about processing spatial trajectories like
Therefore, the early implementation for initialising Trajectory(), where time is stored in the variable Trajectory()._t, is not a good design approach for user experience. We need to deprecate this variable and store all components of the trajectory in Trajectory()._r.
Considerations for the back-end:
Variables in Portuguese in the file trajpy.py should be translated to English: poder, limite, etc...
There is not a single comment in this file. Please add at least one small description for each function and comments about the input and output.
Line 5 in 141b617
Currently the class trajpy accepts either a csv file or a numpy array for initialising the object. However, we can improve this by implementing a parser with functools.singledispatch.
Since trajpy aims to be a general framework for trajectory analysis, it is critical to put more work into the parser to provide broad support for different file formats. singledispatch offers an elegant way to implement this.
Lines 27 to 35 in 8381bed
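A rough sketch of what a singledispatch-based loader might look like, assuming arrays are taken as they are and strings or paths point to csv files. `load_trajectory` is a hypothetical name, not an existing trajpy function.

```python
from functools import singledispatch
from pathlib import Path

import numpy as np


@singledispatch
def load_trajectory(source):
    """Fallback for input types that have no registered parser."""
    raise TypeError(f"unsupported input type: {type(source).__name__}")


@load_trajectory.register(np.ndarray)
def _(source):
    # Arrays are used directly, only converted to float
    return source.astype(float)


@load_trajectory.register(str)
@load_trajectory.register(Path)
def _(source):
    # Assumption: paths point to csv files with one header row
    return np.genfromtxt(source, delimiter=",", skip_header=1)
```

Support for a new input type then only needs one more `register` call, without touching the existing branches.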
A parser for different file formats is needed, especially for processing csv files that contain several trajectories.
file format | description | priority |
---|---|---|
.xyz | file with several atoms and time points | 1 |
.csv | csv file with several trajectories | 2 |
LAMMPS | Molecular dynamics | 2 |
.pdb | protein data bank | 3 |
Nparticles [integer]
comment [character]
X Y Z [repeat Nparticles]
[repeat Nframes]
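A minimal parser sketch for the layout above, assuming every frame repeats the particle count and the comment line. The last three whitespace-separated fields of each particle line are read as coordinates, so lines that start with an element symbol are tolerated as well. `parse_xyz` is a hypothetical name.

```python
def parse_xyz(text):
    """Return a list of frames; each frame is a list of (x, y, z) tuples."""
    lines = text.splitlines()
    frames = []
    i = 0
    while i < len(lines):
        if not lines[i].strip():
            i += 1  # skip blank lines between frames
            continue
        n_particles = int(lines[i])
        start = i + 2  # lines[i + 1] is the comment line
        block = lines[start:start + n_particles]
        if len(block) < n_particles:
            raise ValueError("file ends in the middle of a frame")
        frames.append([tuple(float(v) for v in line.split()[-3:]) for line in block])
        i = start + n_particles
    return frames


# Usage: two frames with two particles each
sample = "2\nframe 0\n0 0 0\n1 1 1\n2\nframe 1\n0.5 0 0\n1 1.5 1\n"
frames = parse_xyz(sample)
```

The trajectory of particle `k` is then `[frame[k] for frame in frames]`.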
The csv should contain 5 columns: time t, 3 spatial (x, y, z) components and the trajectory identifier id.
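A sketch of how such a file could be split into individual trajectories, assuming a header row that names the columns exactly `t`, `x`, `y`, `z` and `id`. `read_trajectories` is a hypothetical name.

```python
import csv
import io
from collections import defaultdict


def read_trajectories(handle):
    """Group rows by the `id` column; each point is a (t, x, y, z) tuple."""
    trajectories = defaultdict(list)
    for row in csv.DictReader(handle):
        point = tuple(float(row[name]) for name in ("t", "x", "y", "z"))
        trajectories[row["id"]].append(point)
    return dict(trajectories)


# Usage: two interleaved trajectories, "a" and "b"
sample = io.StringIO(
    "t,x,y,z,id\n"
    "0,0,0,0,a\n"
    "0,1,1,1,b\n"
    "1,0.5,0,0,a\n"
)
grouped = read_trajectories(sample)
```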
LAMMPS (Large-scale Atomic/Molecular Massively Parallel Simulator) is a molecular dynamics program from Sandia National Laboratories.
More details about the file format: https://docs.lammps.org/read_data.html
The LAMMPS data dump file format is written in yaml with the following structure:
---
creator: LAMMPS
timestep: 0
units: lj
time: 0
natoms: 3
boundary: [ p, p, p, p, p, p, ]
thermo:
- keywords: [ Step, Temp, E_pair, E_mol, TotEng, Press, ]
- data: [ 0, 0, -27093.472213010766, 0, 0, 0, ]
box:
- [ 0, 16.795961913825074 ]
- [ 0, 16.795961913825074 ]
- [ 0, 16.795961913825074 ]
- [ 0, 0, 0 ]
keywords: [ id, type, x, y, z, vx, vy, vz, ix, iy, iz, ]
data:
- [ 1 , 1 , 0.000000e+00 , 0.000000e+00 , 0.000000e+00 , -1.841579e-01 , -9.710036e-01 , -2.934617e+00 , 0 , 0 , 0, ]
- [ 2 , 1 , 8.397981e-01 , 8.397981e-01 , 0.000000e+00 , -1.799591e+00 , 2.127197e+00 , 2.298572e+00 , 0 , 0 , 0, ]
- [ 3 , 1 , 8.397981e-01 , 0.000000e+00 , 8.397981e-01 , -1.807682e+00 , -9.585130e-01 , 1.605884e+00 , 0 , 0 , 0, ]
---
timestep: 100
...
---
A parser for this file format is straightforward with the yaml.load_all() function.
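A sketch of such a parser, assuming PyYAML is installed and every frame carries the `keywords` and `data` entries shown in the example. It uses `yaml.safe_load_all()`, the variant of `load_all()` that needs no explicit loader. `read_lammps_yaml` is a hypothetical name.

```python
import yaml


def read_lammps_yaml(text):
    """Return {atom id: [(x, y, z), ...]} with one position per frame."""
    trajectories = {}
    for frame in yaml.safe_load_all(text):
        if not frame:
            continue  # empty document, e.g. after a trailing '---'
        columns = frame["keywords"]
        idx = [columns.index(name) for name in ("id", "x", "y", "z")]
        for row in frame["data"]:
            atom_id, x, y, z = (row[i] for i in idx)
            position = (float(x), float(y), float(z))
            trajectories.setdefault(int(atom_id), []).append(position)
    return trajectories
```

Reading the column positions from `keywords` rather than hard-coding them keeps the parser working when the dump is written with a different column order.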
Standard file format for protein structures. Each file contains several atoms, and different files can hold different time steps. A pdb file can contain a single snapshot of the system or several trajectories, so we need to process several pdb files at once to extract trajectories.
A possible workflow would be:
id is the atom identifier.
More information about the pdb file format: https://en.wikipedia.org/wiki/Protein_Data_Bank_(file_format)
Currently we have a plotting feature in the GUI. This feature is unnecessary and increases the number of dependencies.
Code that should be removed:
in trajpy/gui.py
Lines 6 to 10 in 2b74234
Line 34 in 2b74234
Line 110 in 2b74234
Lines 266 to 271 in 2b74234
in requirements.txt
Line 3 in 2b74234
After removal we should organise the GUI buttons in a better way. We can reduce the window size and increase the text size for improved accessibility.
Write documentation about the supported file formats.
python3 -m trajpy.gui
The eigenvalues must be sorted in descending order (eigen[0] > eigen[1] > eigen[2]); otherwise the values for anisotropy and asymmetry are not computed correctly.
Just a friendly post-meeting reminder to address this.
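One way to guarantee the ordering, sketched under the assumption that the eigenvalues come from a symmetric tensor. `numpy.linalg.eigvalsh` returns them in ascending order for symmetric matrices, so reversing the result is enough. `sorted_eigenvalues` is a hypothetical helper name.

```python
import numpy as np


def sorted_eigenvalues(tensor):
    """Eigenvalues of a symmetric tensor, largest first."""
    eigenvalues = np.linalg.eigvalsh(tensor)  # ascending order
    return eigenvalues[::-1]


# Usage: a diagonal tensor whose entries are deliberately out of order
eigen = sorted_eigenvalues(np.diag([1.0, 3.0, 2.0]))
```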
Rewrite the function compute_features() with the possibility to pass a list of features to be computed.
def compute_features(self, features_list):
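A possible shape for this, sketched as a standalone function with a name-to-function registry. The three features and their names are placeholders, not the features trajpy actually computes.

```python
from math import dist

# Hypothetical registry: feature name -> function of the trajectory
FEATURES = {
    "n_points": len,
    "path_length": lambda r: sum(dist(a, b) for a, b in zip(r, r[1:])),
    "net_displacement": lambda r: dist(r[0], r[-1]),
}


def compute_features(trajectory, features_list=None):
    """Compute only the requested features; all of them if none are given."""
    names = list(FEATURES) if features_list is None else list(features_list)
    unknown = [name for name in names if name not in FEATURES]
    if unknown:
        raise ValueError(f"unknown features: {unknown}")
    return {name: FEATURES[name](trajectory) for name in names}


# Usage
trajectory = [(0.0, 0.0), (3.0, 4.0), (3.0, 0.0)]
selected = compute_features(trajectory, ["path_length", "net_displacement"])
```

Rejecting unknown names up front avoids silently returning an incomplete feature set.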
Travis CI is not working due to a policy change regarding free plans. Therefore we should implement a GitHub Actions workflow to test, build and deploy.
The following new features need to be included in the GUI:
I've made a mess in the latest commit. The equations in the docstrings must be fixed.
The bot created this issue to inform you that pyup.io has been set up on this repo.
Once you have closed it, the bot will open pull requests for updates as soon as they are available.
Type hints improve documentation, usability and maintainability. We should use this feature whenever possible.
Python documentation:
https://docs.python.org/3/library/typing.html
Fix the docs modindex.
The functions are wrongly defined and they must be switched:
msd_ensemble_averaged <-> msd_time_averaged
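For reference, a sketch of the two quantities using their usual textbook definitions, which may differ in detail from the trajpy signatures. The time average runs over all pairs of points of one trajectory separated by a fixed lag; the ensemble average runs over several trajectories at each time step, measured from the starting point.

```python
import numpy as np


def msd_time_averaged(r, lag):
    """Time-averaged MSD of one trajectory r (n_steps, n_dim) at a lag >= 1."""
    displacements = r[lag:] - r[:-lag]
    return float(np.mean(np.sum(displacements**2, axis=1)))


def msd_ensemble_averaged(trajectories):
    """Ensemble-averaged MSD per time step for an array (n_traj, n_steps, n_dim)."""
    r = np.asarray(trajectories, dtype=float)
    displacements = r - r[:, :1, :]  # displacement from each starting point
    return np.mean(np.sum(displacements**2, axis=2), axis=0)


# Usage: ballistic motion along x, one unit per step
steps = np.arange(4.0)
single = np.stack([steps, np.zeros(4)], axis=1)  # shape (4, 2)
ensemble = np.stack([single, single])            # shape (2, 4, 2)
```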
There are several warnings regarding the PEP 8 standard that should be fixed.
See deployment report:
https://github.com/ocbe-uio/trajpy/actions/runs/4413645266/jobs/7734356182#step:5:1
We should start using black.
Several new functions were implemented for object tracking and analysis of live animal trajectories.
We need to write unit tests for these functions.