mdil-snu / simple-nn_v2

License: GNU General Public License v3.0

Python 48.84% C++ 49.42% C 1.57% Shell 0.17%

simple-nn_v2's Introduction

SIMPLE-NN (SNU Interatomic Machine-learning PotentiaL packagE – version Neural Network)

SIMPLE-NN is an open package that constructs Behler-Parrinello-type neural-network interatomic potentials from ab initio data. The package provides an interfacing module to LAMMPS for MD simulations.

Main features

Training over total energies, forces, and stresses.
Symmetry function vectors for atomic features.
Supports LAMMPS for MD simulations.
PCA matrix transformation and whitening of training data for fast and accurate learning.
Supports GPU via PyTorch.
CPU parallelization of preprocessing training data via MPI for Python.
Uniform training to rectify sample bias (W. Jeong et al. J. Phys. Chem. C 122, 22790 (2018)).
Replica ensemble for uncertainty estimation (W. Jeong et al. J. Phys. Chem. Lett. 11, 6090 (2020)).
Compatible with the results of most ab initio codes, such as Quantum ESPRESSO and VASP, via the ASE module.
Dropout technique for regularizing neural networks.
Requires Python 3.6-3.9 and LAMMPS (23Jun2022 or newer).

Installation, manual, and full details: https://simple-nn-v2.readthedocs.io

If you use SIMPLE-NN, please cite:
K. Lee, D. Yoo, W. Jeong, and S. Han, "SIMPLE-NN: An efficient package for training and executing neural-network interatomic potentials", Comp. Phys. Comm. 242, 95 (2019) https://doi.org/10.1016/j.cpc.2019.04.014.

simple-nn_v2's Issues

PCA issue

I am facing the following error during preprocessing.
ValueError: Number of [Pt] feature point[72] is less than input size[332]. This cause error during calculate PCA matrix

I am attaching the files here so that it's easier to recreate the error.
input.zip

I'm not sure what I am doing wrong.
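
The error message suggests that preprocessing compares the number of feature vectors collected for an element (72 for Pt) with the length of each feature vector (332) before it builds the PCA matrix. The following is a minimal sketch of such a check, not SIMPLE-NN's actual code; the function name check_pca_feasible is hypothetical.

```python
def check_pca_feasible(element, n_points, input_size):
    """Raise ValueError when there are fewer feature points than feature dimensions."""
    if n_points < input_size:
        raise ValueError(
            f"Number of [{element}] feature points [{n_points}] is less than "
            f"input size [{input_size}]; the PCA matrix cannot be computed reliably."
        )
    return True


# Values taken from the error message in this issue.
try:
    check_pca_feasible("Pt", 72, 332)
except ValueError as err:
    print(err)
```

Under this reading, the check would presumably pass with more Pt atoms in the dataset or with fewer symmetry functions per element. Turning off calc_pca in the preprocessing section may also avoid it.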

Lost atom error - on trained data

This follows up on my previous issue, where I was trying to train an HDNNP with 5 elements. I was able to generate a preliminary, rough NNP, but I am facing a small issue when validating the potential (on both the training set and the validation set).

A few of the structures in the training set give me a "Lost atoms" error when I try to validate them with LAMMPS. Here is the parallel script I use with ASE and joblib to quickly run single-point calculations against the DFT reference data stored in train.traj.

from ase.io import read
from ase.calculators.lammpslib import LAMMPSlib
import numpy as np
from joblib import Parallel, delayed

cmds = ["pair_style nn",
        "pair_coeff * * potential_saved_bestmodel Al H Pt C O"]
# keep_alive must be the boolean False; the string 'False' is truthy
lammps = LAMMPSlib(lmpcmds=cmds, atom_types={'Al': 1, 'H': 2, 'Pt': 3, 'C': 4, 'O': 5}, keep_alive=False, log_file='test.log')

# Take every fifth frame of the training trajectory
traj = read('train.traj', ':')
collect = [atoms for i, atoms in enumerate(traj) if i % 5 == 0]
# DFT reference energies per atom
edft = [atoms.get_potential_energy() / len(atoms) for atoms in collect]
np.savetxt('edft.txt', edft)

def energy(i, atoms):
    # Swap in the NNP calculator and return the NNP energy per atom
    atoms.calc = lammps
    en = atoms.get_potential_energy() / len(atoms)
    return en

enn = Parallel(n_jobs=24, backend='multiprocessing')(delayed(energy)(i, atoms) for i, atoms in enumerate(collect))
np.savetxt('enn.txt', enn)

Initially, my script was getting stuck, which I later realized was due to the lost atoms error from LAMMPS. I was able to correct that, but the error on the structures that show this lost atoms error is extremely high, on the order of 1 eV/atom.

Can you help me understand why a structure in the training set would give a lost atoms error during validation?
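
To narrow down which structures are affected, the two per-atom energy lists written by the script (edft.txt and enn.txt) can be compared entry by entry. This is a minimal sketch, assuming both lists have the same order; the function name flag_outliers, the 0.1 eV/atom threshold, and the sample numbers are illustrative and not taken from the issue.

```python
def flag_outliers(e_dft, e_nn, threshold=0.1):
    """Return (index, error) pairs where |E_NN - E_DFT| exceeds threshold (eV/atom)."""
    if len(e_dft) != len(e_nn):
        raise ValueError("energy lists must have the same length")
    return [
        (i, abs(nn - dft))
        for i, (dft, nn) in enumerate(zip(e_dft, e_nn))
        if abs(nn - dft) > threshold
    ]


# Illustrative per-atom energies, not taken from the issue's data.
e_dft = [-4.10, -4.25, -3.90]
e_nn = [-4.12, -3.20, -3.91]
outliers = flag_outliers(e_dft, e_nn)
print(outliers)
```

Because the script keeps every fifth frame, index k in these lists corresponds to frame 5*k of train.traj.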

Install Problem

I can't install it with setup.py because of the following error: simple_nn\features\symmetry_function\symmetry_functions.h(76): error C3861: 'sincos': identifier not found. I want to know what 'sincos' is and how to solve the problem. Thank you.

And this is the relevant part of symmetry_functions.h.

Segmentation fault during preprocessing (xyz)

Hello, I'm testing SIMPLE-NN, but unfortunately I can't get past the preprocessing step. I always encounter a "Segmentation fault (core dumped)" error, and I'm certain that it's not due to a lack of memory: I have tested it on a very small subset of the dataset and with a small (test) number of symmetry functions. The tutorial works fine for me. Could you please help me identify the problem?

params_X
2 1 0 6.0 0.003214 0.0 0.0
2 2 0 6.0 0.003214 0.0 0.0
2 3 0 6.0 0.003214 0.0 0.0
2 4 0 6.0 0.003214 0.0 0.0
4 1 1 6.0 0.000357 1.0 -1.0
4 2 2 6.0 0.000357 1.0 -1.0
4 3 3 6.0 0.000357 1.0 -1.0
4 4 4 6.0 0.000357 1.0 -1.0

input.yaml
generate_features: True
preprocess: True
train_model: False
random_seed: 123

params:
    H: params_H
    N: params_N
    C: params_C
    O: params_O

data:
    type: symmetry_function
    absolute_path: False
    refdata_format: xyz
    read_stress: False

preprocessing:
    valid_rate: 0.1
    calc_scale: True
    calc_pca: False

dataset
https://pubs.acs.org/doi/suppl/10.1021/acs.jctc.1c00647/suppl_file/ct1c00647_si_002.zip

structure_list
./train_300K.xyz :
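
One thing worth ruling out is a malformed parameter file. The sketch below parses lines like those in the params_X listing above and checks the column count and the neighbor indices. It assumes that each line holds a symmetry-function type, two neighbor type indices (0 meaning unused), a cutoff, and three coefficients. That reading of the columns is an assumption, and parse_params is a hypothetical helper, not part of SIMPLE-NN.

```python
def parse_params(text, n_species):
    """Parse symmetry-function parameter lines and validate neighbor indices."""
    rows = []
    for lineno, line in enumerate(text.strip().splitlines(), start=1):
        fields = line.split()
        if len(fields) != 7:
            raise ValueError(f"line {lineno}: expected 7 columns, got {len(fields)}")
        sf_type, n1, n2 = (int(x) for x in fields[:3])
        cutoff, c1, c2, c3 = (float(x) for x in fields[3:])
        # Assumed convention: indices run from 1 to n_species, 0 means "unused".
        for idx in (n1, n2):
            if not 0 <= idx <= n_species:
                raise ValueError(f"line {lineno}: neighbor index {idx} out of range")
        rows.append((sf_type, n1, n2, cutoff, c1, c2, c3))
    return rows


# Two lines copied from the params_X listing in this issue.
params_text = """
2 1 0 6.0 0.003214 0.0 0.0
4 4 4 6.0 0.000357 1.0 -1.0
"""
rows = parse_params(params_text, n_species=4)
print(len(rows), "symmetry functions parsed")
```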

Code Typo

File destination: https://github.com/MDIL-SNU/SIMPLE-NN_v2/tree/main/simple_nn/features/symmetry_function/generating.py
Location: Line 113, Line 118.
Change: logfie --> logfile

[New feature] Better to add early stopping metrics

If I am not mistaken, SIMPLE-NN currently does not implement early stopping. Since it relies on PyTorch, the ReduceLROnPlateau class could be a ready starting point. After all, people do not want training to run forever (up to a large predefined max_epoch).

Thanks for your effort in developing this package, excellent work :)
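
Note that PyTorch's ReduceLROnPlateau lowers the learning rate when a metric stops improving; it does not stop training by itself. A patience counter such as the one sketched below could be added to a training loop for that purpose. The class name EarlyStopping and its parameters are illustrative and are not part of SIMPLE-NN or PyTorch.

```python
class EarlyStopping:
    """Stop training when the validation loss has not improved for `patience` epochs."""

    def __init__(self, patience=10, min_delta=0.0):
        self.patience = patience
        self.min_delta = min_delta
        self.best = float("inf")
        self.bad_epochs = 0

    def step(self, val_loss):
        """Record one epoch's validation loss; return True if training should stop."""
        if val_loss < self.best - self.min_delta:
            self.best = val_loss
            self.bad_epochs = 0
        else:
            self.bad_epochs += 1
        return self.bad_epochs >= self.patience


# Illustrative validation losses; the last value is never reached.
stopper = EarlyStopping(patience=3)
losses = [1.0, 0.8, 0.7, 0.71, 0.72, 0.73, 0.5]
stopped_at = None
for epoch, loss in enumerate(losses):
    if stopper.step(loss):
        stopped_at = epoch
        break
print("stopped at epoch", stopped_at)
```

In a real loop, step would be called once per epoch with the validation loss, and the best model would be saved whenever the loss improves.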

Evaluation/Inference with simple-NN

Hi all,

I've greatly enjoyed using simple-NN and now I'm looking for a way to use the NNP in an external program different from lammps. I think an ASE calculator would be relatively easy to develop, but I don't see a function within the code that can take an ASE atoms object and do a forward pass through the network to obtain the NNP energies and forces (and maybe stress). The calc_result variable in the test_model(inputs, logfile, model, optimizer, criterion, device, test_loader) function returns the desired values, but I'm not familiar enough with the code to generate the symmetry function values for a given structure and apply the scaling factor, etc. I'm more than happy to develop the function, but I'm at a loss on where to start.

Parameter total_iteration and save_criteria not working!

I am working with SIMPLE-NN. The options save_criteria and total_iteration produce the warnings "Warning: Unidentified option in neural_network: total_iteration" and "Warning: Unidentified option in neural_network: save_criteria". My input.yaml file is included below. Please let me know if I have done something wrong.

generate_features: False
preprocess: False
train_model: True
random_seed: 123
params:
   Ti: params_Ti
   Al: params_Al
   C:  params_C

neural_network:
    nodes: 30-30
    batch_size: 64
    optimizer:  
       method: Adam 
    total_epoch: 4000

    learning_rate: 0.001
    use_scale: True
    use_pca: True
    use_stress: False
    use_force : True
    save_interval : 0
    show_interval : 100
    break_max : 10
    total_iteration : -2000000    
    save_criteria : v_F,v_E
    
    acti_func : tanh
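
The warnings read as if each key in the neural_network section is checked against a set of recognized options. Here is a minimal sketch of that kind of check. The recognized set is only inferred from the keys in the input above that did not trigger a warning, assuming the two quoted warnings were the only ones; it is not taken from SIMPLE-NN's source, and find_unidentified is a hypothetical helper.

```python
def find_unidentified(section, known_keys):
    """Return the keys of `section` that are not in `known_keys`, in input order."""
    return [key for key in section if key not in known_keys]


# Keys from the neural_network block above that did not trigger a warning (assumed).
known = {
    "nodes", "batch_size", "optimizer", "total_epoch", "learning_rate",
    "use_scale", "use_pca", "use_stress", "use_force", "save_interval",
    "show_interval", "break_max", "acti_func",
}
# A shortened copy of the neural_network section from this issue.
neural_network = {
    "nodes": "30-30",
    "total_epoch": 4000,
    "total_iteration": -2000000,
    "save_criteria": "v_F,v_E",
}
for key in find_unidentified(neural_network, known):
    print(f"Warning: Unidentified option in neural_network: {key}")
```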

Enabling installation in Windows (Following up issue #88)

Problem

python setup.py install fails on Windows.
Error message: error C3861: 'sincos': identifier not found (already reported in issue #88)

Cause

The function "sincos" is a GNU extension that is available on Linux but is not provided on Windows.

Is it okay if I modify the code to enable installation on Windows?
I would define the "sincos" function in the files below after detecting the OS.

symmetry_functions.h
pair_nn_simd_function.h

Structure files

Is it possible to use extended XYZ or ASE trajectory files, instead of OUTCAR, to read the structural energy and force data?

In data_processing.py, your load_structures function creates an ASE Atoms object, so there should be a way to read files other than OUTCAR directly with ASE for the energy, force, and position data, right?

No module named 'simple_nn.utils._libgdf'

I think I have followed the whole procedure.
In a new Anaconda env, I installed python=3.8 and torch=1.10.1.

python setup.py install fails because it cannot find numpy==1.21. After I installed numpy with pip, setup skips it and finishes, but shows dependency errors for the matplotlib and scikit-learn versions.

In the test:
File "/home/joonho/pymod/SIMPLE-NN/simple_nn/utils/__init__.py", line 14, in <module>
from ._libgdf import lib, ffi
ModuleNotFoundError: No module named 'simple_nn.utils._libgdf'

There is no /home/joonho/pymod/SIMPLE-NN/simple_nn/utils/_libgdf.py, only libgdf_builder.py.

LAMMPS compile failed

System: Linux (Google Colab)
Compiler: mpicxx -g -O3 -std=c++11 -DLAMMPS_GZIP -DLAMMPS_MEMALIGN=64 -DMPICH_SKIP_MPICXX -DOMPI_SKIP_MPICXX=1
Issue:
../pair_nn_replica.cpp: In member function ‘virtual void LAMMPS_NS::PairREPLICA::init_style()’:
../pair_nn_replica.cpp:850:35: error: ‘int LAMMPS_NS::NeighRequest::half’ is protected within this context
neighbor->requests[irequest]->half = 0;
^~~~
In file included from ../pair_nn_replica.cpp:30:0:
../neigh_request.h:55:7: note: declared protected here
int half; // half neigh list (set by default)
^~~~
../pair_nn_replica.cpp:851:35: error: ‘int LAMMPS_NS::NeighRequest::full’ is protected within this context
neighbor->requests[irequest]->full = 1;
^~~~
In file included from ../pair_nn_replica.cpp:30:0:
../neigh_request.h:56:7: note: declared protected here
int full; // full neigh list
^~~~
Makefile:114: recipe for target 'pair_nn_replica.o' failed
make[1]: *** [pair_nn_replica.o] Error 1
make[1]: Leaving directory '/content/drive/MyDrive/lammps/src/Obj_mpi'
Makefile:378: recipe for target 'mpi' failed
make: *** [mpi] Error 2

Exception: In params ./params file not exist for O

Hi, I'm having a problem following the tutorials. Could you tell me how I can solve it?

Here is the exception message when I try to run run.py.
Exception: In params ./params file not exist for O

This happens in the directories below (all that contain run.py).

test_installation
tutorials/Evaluation
tutorials/GDF_weighting
tutorials/Preprocess
tutorials/Training
tutorials/Uncertainty_estimation_answer/1.Atomic_energy_extraction
tutorials/Uncertainty_estimation_answer/2.Training_with_atomic_energy

Things I did

Cloned SIMPLE-NN_v2 and LAMMPS
Installed SIMPLE-NN_v2 (python setup.py install)
Copied 'symmetry_functions.h' and 'pair_nn.*' files to LAMMPS/src folder
Compiled LAMMPS package
Tried run.py

Things I noticed

The paths of the two params files, 'params_O' and 'params_Si' (or just params), are specified in input.yaml.
Tested run.py after changing the paths in the input.yaml files to absolute paths. (The same exception occurred.)
The provided files are identical. (Are these files supposed to be identical?)

Environments

Windows 10 Enterprise
PyTorch (CUDA 11.1)
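
The exception names the element whose parameter file could not be found, so a quick check is to resolve each path from the params section the way a relative path would be resolved against the run directory, and see which files are actually missing. This is a minimal sketch; missing_param_files is a hypothetical helper, and the temporary directory only stands in for a tutorial folder.

```python
import os
import tempfile


def missing_param_files(params, base_dir):
    """Return {element: resolved_path} for parameter files that do not exist."""
    missing = {}
    for element, path in params.items():
        resolved = path if os.path.isabs(path) else os.path.join(base_dir, path)
        if not os.path.isfile(resolved):
            missing[element] = resolved
    return missing


# Demonstration in a temporary directory: only params_Si is created.
with tempfile.TemporaryDirectory() as workdir:
    open(os.path.join(workdir, "params_Si"), "w").close()
    params = {"Si": "params_Si", "O": "params_O"}
    missing = missing_param_files(params, workdir)
    print(sorted(missing))
```

Running the same function on the real params mapping, with base_dir set to the directory that run.py is started from, shows whether the files are visible from that location.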