materialsvirtuallab / maml
Python for Materials Machine Learning, Materials Descriptors, Machine Learning Force Fields, Deep Learning, etc.
License: BSD 3-Clause "New" or "Revised" License
@YunxingZuo
Please change the model name and all related modules to Gaussian approximation potential (GAP) instead of using its feature name, smooth overlap of atomic positions (SOAP).
Hi all!
Congrats on the great work!
I am considering implementing BOWSR in a materials discovery pipeline, but I am unsure about the actual inputs it requires.
From Section 2.2 (Properties Prediction) in the paper, it seems that the underlying idea of the algorithm is to skip the computationally expensive DFT structural relaxation via the elemental substitution "trick", which is basically a smart way to get an unrelaxed structure for a compound whose crystal structure is unknown. ((1) Is this right?)
I am referring to the lines
"For each crystal in the dataset (e.g., rock salt GeTe), another crystal with the
same prototype but a different composition (e.g., rock salt NaCl) was selected
at random and multi-element substitutions (Na→Ge, Cl→Te) were performed
to arrive at an “unrelaxed” structure."
If this is the case, the algorithm would be able to get a reasonably relaxed structure for any given input formula (and only a formula), but in the example notebooks provided, the algorithm is only used as a structure relaxer, skipping the very relevant structure-guessing step.
(2) Is my question well posed, or did I not catch something?
(3) If so, how can I see the algorithm at work assigning an unrelaxed structure to a formula?
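For reference, here is how I picture the substitution step, as a minimal sketch assuming pymatgen's replace_species (NaCl.cif is a hypothetical template file):
from pymatgen.core import Structure

template = Structure.from_file("NaCl.cif")  # rock salt prototype with known geometry
unrelaxed = template.copy()
unrelaxed.replace_species({"Na": "Ge", "Cl": "Te"})  # multi-element substitution Na -> Ge, Cl -> Te
# `unrelaxed` would then be the starting structure handed to the BOWSR optimizer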
Thank you!
Dear Developers,
I ran into a problem while training a test snap model.
For a supercell smaller than 12 Å everything is fine, but for a larger supercell the correlation between the predicted forces and the forces in the training set is zero.
The structures for the training set were obtained using VASP. I attach two samples in JSON format, produced by identical VASP scripts, the only difference being that one cell was 11.9 Å and the other 12.1 Å. I also attach text files with the original and predicted forces to visualize the difference.
What could be the reason for this effect?
My Python script is the following:
# Imports assumed from maml and scikit-learn, as in the maml SNAP examples;
# rcutfac, n_threads, en_weight, r1, w1 and the training data are defined elsewhere.
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error
from maml.describers import BispectrumCoefficients
from maml.utils import pool_from, convert_docs

element_profile = {'Al': {'r': 0.5, 'w': 1.0}}  # note: the element key must be the string 'Al'
per_force_describer = BispectrumCoefficients(rcutfac=rcutfac, twojmax=6,
                                             element_profile=element_profile,
                                             quadratic=False,
                                             pot_fit=True,
                                             include_stress=False,
                                             n_jobs=n_threads, verbose=False)
elem_features = per_force_describer.transform(train_structures)
train_pool = pool_from(train_structures, train_energies, train_forces)
_, elem_df = convert_docs(train_pool)
y = elem_df['y_orig'] / elem_df['n']
x = elem_features
weights = np.ones(len(elem_df['dtype']))
weights[elem_df['dtype'] == 'energy'] = en_weight
weights[elem_df['dtype'] == 'force'] = 1
weighted_model = LinearRegression()
weighted_model.fit(x, y, sample_weight=weights)
energy_indices = np.argwhere(np.array(elem_df["dtype"]) == "energy").ravel()
forces_indices = np.argwhere(np.array(elem_df["dtype"]) == "force").ravel()
weighted_predict_y = weighted_model.predict(x)
original_energy = y[energy_indices]
original_forces = y[forces_indices]
weighted_predict_energy = weighted_predict_y[energy_indices]
weighted_predict_forces = weighted_predict_y[forces_indices]
with open('forces_linear.txt', 'w') as file_fl:
    file_fl.write("orig_force, predict_force\n")
    for index in forces_indices:
        file_fl.write(str(y[index]) + " " + str(weighted_predict_y[index]) + "\n")
with open('energies_linear.txt', 'w') as file_el:
    file_el.write("orig_en, predict_en\n")
    for index in energy_indices:
        file_el.write(str(y[index]) + " " + str(weighted_predict_y[index]) + "\n")
# squared=False gives a true RMSE; mean_squared_error alone returns the MSE
RMSE = mean_squared_error(original_forces, weighted_predict_forces, squared=False)
print("Parameters = " + str([en_weight, r1, w1, rcutfac]) + " /// RMSE = " + str(RMSE))
return RMSE  # this line implies the snippet sits inside a function; the enclosing def is elided
I have VASP output OUTCAR/CONTCAR/XDATCAR files of my required structures obtained from molecular dynamics. Can you help me convert them to a format that can be input to the transform() function of BispectrumCoefficients?
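A minimal sketch of one way to build these inputs, assuming the MD runs also produced vasprun.xml files that pymatgen can parse (the file path is a placeholder):
from pymatgen.io.vasp.outputs import Vasprun

vr = Vasprun("vasprun.xml", parse_dos=False, parse_eigen=False)
train_structures, train_energies, train_forces = [], [], []
for step in vr.ionic_steps:  # one entry per ionic/MD step
    train_structures.append(step["structure"])
    train_energies.append(step["e_fr_energy"])
    train_forces.append(step["forces"])
# these lists match what BispectrumCoefficients.transform() and pool_from() expect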
from maml.apps.pes._snap import SNAPotential
from pymatgen.core import Structure, Element
from maml.apps.pes._lammps import EnergyForceStress, ElasticConstant, DefectFormation
snap = SNAPotential.from_config(coeff_file='SNAPotential.snapcoeff', param_file='SNAPotential.snapparam')
Ni_conventional_cell = Structure.from_file('Ni_conventional.cif')
efs_calculator = EnergyForceStress(ff_settings=snap)
energy, forces, stresses = efs_calculator.calculate([Ni_conventional_cell])[0]
print('The predicted energy of Ni conventional cell is {} eV'.format(energy))
print('The predicted forces of Ni conventional cell is \n {} eV/Angstrom'.format(forces))
elastic_calculator = ElasticConstant(ff_settings=snap, lattice='fcc', alat=3.508)
C11, C12, C44, bulkmodulus = elastic_calculator.calculate()
print('The predicted C11, C12, C44, bulkmodulus are {}, {}, {}, {} GPa'.format(C11, C12, C44, bulkmodulus))
defect_calculator = DefectFormation(ff_settings=snap, specie='Ni', lattice='fcc', alat=3.508)
defect_formation_energy = defect_calculator.calculate()
print('The predicted defect formation energy is {} eV'.format(defect_formation_energy))
When I run this code, I get this error:
Traceback (most recent call last):
File "/home/sdb/zzhen/2021/materialsvirtuallab/maml-2021.10.14/mvl_models/pes/Ni/snap/Ni_snap.py", line 13, in <module>
elastic_calculator = ElasticConstant(ff_settings=snap, lattice='fcc', alat=3.508)
File "/home/sdb/zzhen/2021/materialsvirtuallab/maml-2021.10.14/maml/apps/pes/_lammps.py", line 417, in __init__
super().__init__(**kwargs)
File "/home/sdb/zzhen/2021/materialsvirtuallab/maml-2021.10.14/maml/apps/pes/_lammps.py", line 87, in __init__
raise TypeError("%s not in supported kwargs %s" % (str(i), str(self.allowed_kwargs)))
TypeError: lattice not in supported kwargs ['lmp_exe']
Hello!
I've got a problem at the initial stage while importing potentials:
from maml.apps.pes import NNPotential
ImportError: cannot import name 'NNPotential' from 'maml.apps.pes' (//anaconda3/lib/python3.7/site-packages/maml/apps/pes/__init__.py)
The same thing happens with other potentials.
Thanks in advance!
I want to ensure that all functionality originally in mlearn is completely ported over, including the Jupyter notebooks. Also, we should deprecate mlearn completely and point people to maml.
Looks like it still refers to arXiv instead of Materials Today.
There are two describers in general.py, namely MultiDescriber and FuncGenerator.
MultiDescriber works similarly to an sklearn Pipeline. I think we can redo this using the sklearn Pipeline to get a more robust version.
FuncGenerator relies on the eval function to deserialize a function. It requires the function to be defined somewhere in the script or in the general.py module, which is not very robust. It would be nice to rewrite it using the utilities in maml.utils to deserialize the functions.
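As a minimal sketch of the Pipeline idea, with stock sklearn transformers standing in for chained describers (any describer implementing fit/transform could take their place):
import numpy as np
from sklearn.decomposition import PCA
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

pipe = Pipeline([("scale", StandardScaler()), ("reduce", PCA(n_components=2))])
X = np.random.rand(10, 5)  # toy feature matrix
features = pipe.fit_transform(X)  # each step's output feeds the next step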
Hi,
I was trying to run one of the notebooks and received this error while importing:
ImportError: cannot import name 'export_saved_model' from 'tensorflow.python.keras.saving.saved_model' (/home/vishank-hp/miniconda3/lib/python3.7/site-packages/tensorflow/python/keras/saving/saved_model/__init__.py)
For installation, I tried both methods:
python setup.py develop
pip install maml
Would you please let me know if I have to install some dependencies manually, or if I am missing a step?
Thanks
Hi, all!
I was running the SNAP model recently and found a missing command in the SpectralNeighborAnalysis() function in the current version:
When running the notebook example of SNAP, training snap (the codes in the 3rd block) gives an error:
"ValueError: Shape of passed values is (144, 64), indices imply (144, 30)"
which means the bispectrum values calculated by LAMMPS differ from the intended output.
I found that in the previous mlearn package, where SNAP works fine, the compute argument is:
compute_args += ' diagonal {} rmin0 {} quadraticflag {}'.format(self.diagonalstyle, self.rmin0, qflag)
in 'calcs.py' line 584. Note that the default diagonalstyle was previously 3.
In the current maml, 'diagonalstyle' is removed and the compute argument is:
compute_args += " rmin0 0 quadraticflag {}".format(int(self.quadratic))
in '_lammps.py' line 342.
As I tested, the default 'diagonal' in the compute command is 0, which gives 64 values (diagonal 1 gives 22, and diagonal 2 gives 7). Changing the argument to " diagonal 3 rmin0 0 quadraticflag {}" seems to solve the problem. Alternatively, simply add back the 'diagonalstyle' parameter.
I'm not sure whether I've got this right?
Thanks a lot and have a nice day!
Best regards,
Dear Developers,
I am training SNAP for a ternary system containing Ti, Si, and C.
It seems the fit (via the maml code) reproduces the total energy of my system but with non-physical pe/atom values: two elements come out more negative and one positive.
I have tried many different combinations of the training data set, looking for pe/atom values that are physically meaningful (negative).
Is there any way to constrain the SNAP coefficients during training that will at least ensure a negative pe/atom?
Or how can I resolve this issue for multi-component systems in general?
I will be waiting to hear from you.
Best regards,
Rana
Hello!
Recently, I encountered NaN values in the first column of the weights.XXX.data file during NNP training.
The following are my NNP training parameters:
"nnp.train(train_structures=train_structures,
train_energies=train_energies,
train_forces=train_forces,
cutoff_type=1,
r_etas=[0.5,2.0],
a_etas=[0.5,2.0],
r_shift=[0.0],
zetas=[1.0,4.0],
r_cut=4.2,
hidden_layers=[4,4],
epochs=5)"
Very strangely, running the example (https://github.com/materialsvirtuallab/maml/blob/master/notebooks/pes/nnp/example.ipynb) on my computer works fine!
Dear all, how can I get the snapcoeff and snapparam files from within maml to be used with LAMMPS?
Also, if I want to calculate properties using a developed SNAP model for a multi-element system, how do I proceed? Suppose I want to get elastic constants for NbMoTaW with the potential object named "NMTW"; in that case, how do I initialize the ElasticConstant() calculator? I want to get the constants for bulk NbMoTaW and not the individual elements.
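For the export question, a hedged sketch, assuming the trained potential object is NMTW and that write_param() behaves as in the maml PES notebooks:
NMTW.write_param()  # should write SNAPotential.snapcoeff and SNAPotential.snapparam to the working directory
# In a LAMMPS input, these are then referenced with the standard snap pair style:
#   pair_style snap
#   pair_coeff * * SNAPotential.snapcoeff SNAPotential.snapparam Nb Mo Ta W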
The pre-trained MEGNet cannot handle materials containing atoms isolated by more than 5 Å. When the GP in BOWSR selects such structures as candidates, what happens? I cannot find any error handling in the app code presented here for the MEGNet variant.
Hi @chc273,
Again, nice work on BOWSR! I've been recommending BOWSR to a few people, but I'm realizing the implementation described in the Materials Today paper is not immediately obvious to me. IIRC, this involves swapping out the "correct" atoms for a similar chemical-formula template (e.g., we have the CIF file for Al2O3, we want a crystal structure for V2O3, so we "swap" Al atoms with V atoms) and then running the optimizer.
Assuming my understanding is correct, do you mind sharing a MWE for the use-case described in the paper (i.e. create a relaxed structure using only a chemical formula)?
Sterling
A database is a good idea, but I think we should try to use something widely supported. We can even support a few options. Any recommendations? The obvious ones are HDF5, JSON and MySQL. MongoDB is probably too heavy-duty, though it can be an option since the translation to JSON is easy.
Hi,
I had some problems running the notebooks cgcnn_example.ipynb and megnet_example.ipynb.
ModuleNotFoundError Traceback (most recent call last)
/var/folders/0c/rcgf90kd6z9c7yc0p_tpffl81l1jxs/T/ipykernel_90244/3049958724.py in <module>
----> 1 from bowsr.model.cgcnn import CGCNN
2 from bowsr.optimizer import BayesianOptimizer
3 from pymatgen.core.periodic_table import get_el_sp
4 model = CGCNN()
5
ModuleNotFoundError Traceback (most recent call last)
/var/folders/0c/rcgf90kd6z9c7yc0p_tpffl81l1jxs/T/ipykernel_90245/1238206353.py in <module>
2 os.environ["CUDA_VISIBLE_DEVICES"] = "-1"
3 import tensorflow as tf
----> 4 from bowsr.model.megnet import MEGNet
5 from bowsr.optimizer import BayesianOptimizer
6 from pymatgen.core.periodic_table import get_el_sp
I followed the instructions to install all the libraries. Please let me know if you can help.
Hi!
I wonder if there's functionality in maml to extract an NNP potential file (in the LAMMPS format) after training within the maml framework, in order to run further LAMMPS simulations.
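A hedged sketch, assuming a trained NNPotential object nnp and that write_param() works as for the other maml PES potentials:
nnp.write_param()  # should write the n2p2-style files (input.nn, scaling.data, weights.XXX.data) consumed by the LAMMPS NNP pair style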
Thanks in advance!
After pip install maml on Windows through VS Code, in a fresh conda env with Python 3.8:
>>> import tensorflow
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "C:\Users\sterg\AppData\Roaming\Python\Python38\site-packages\tensorflow-2.5.0rc1-py3.8-win-amd64.egg\tensorflow\__init__.py", line 41, in <module>
from tensorflow.python.tools import module_util as _module_util
File "C:\Users\sterg\AppData\Roaming\Python\Python38\site-packages\tensorflow-2.5.0rc1-py3.8-win-amd64.egg\tensorflow\python\__init__.py", line 40, in <module>
from tensorflow.python.eager import context
File "C:\Users\sterg\AppData\Roaming\Python\Python38\site-packages\tensorflow-2.5.0rc1-py3.8-win-amd64.egg\tensorflow\python\eager\context.py", line 28, in <module>
from absl import logging
ModuleNotFoundError: No module named 'absl'
A quick search reveals the suggestion:
pip install absl-py
Then I get a ModuleNotFoundError for gast, then astunparse, and decided to stop there.
In the end, I resolved it by following the instructions from my PR #371. I think I needed to explicitly pip install tensorflow.
Hi!
I recently found that the train() function may have some issues in the NNP model:
I changed lines 618 and 621 in _nnp.py (also nnp.py in the previous mlearn package):
p_scaling = subprocess.Popen(['nnp-scaling', input_filename]) --> p_scaling = subprocess.Popen(['nnp-scaling', '{}'.format(bin_num)])
p_train = subprocess.Popen(['nnp-train', input_filename], --> p_train = subprocess.Popen(['nnp-train'],
And here are some of my questions:
a. Are my changes correct? (I made these changes according to my own understanding of the n2p2 website.)
b. The role of the bin number in the scaling process seems not very clear (even on the original website).
c. Is it possible to capture the error messages and show them? The 'gsl histogram' error message can be seen when executing Python in a terminal, but is missing in the ipynb or in the program output.
d. It seems that nnp-scaling and nnp-train support parallel computing, such as using 'mpirun -np ', which may decrease the training time a lot.
Thank you very much!
Best regards,
Hi, I'm fairly new to using the SNAP potential in maml. I'd like to know the meaning of w and r in element_profile. Also, is the rcutfac in BispectrumCoefficients() the same as the r_c parameter discussed in the seminal SNAP papers?
I am trying to use the BispectrumCoefficients-based SNAP potential for my training with ~7500 structures, but end up with a memory issue:
"Some of your processes may have been killed by the cgroup out-of-memory handler"
I am using parallel descriptor construction with the n_jobs tag.
Any advice on what I might be doing wrong?
Hi, while running the pes/gap example, I encountered the following error:
INFO:maml.utils._lammps:Structure index 0 is rotated.
INFO:maml.utils._lammps:Structure index 1 is rotated.
INFO:maml.utils._lammps:Structure index 2 is rotated.
INFO:maml.utils._lammps:Structure index 3 is rotated.
INFO:maml.utils._lammps:Structure index 4 is rotated.
INFO:maml.utils._lammps:Structure index 5 is rotated.
INFO:maml.utils._lammps:Structure index 6 is rotated.
INFO:maml.utils._lammps:Structure index 7 is rotated.
INFO:maml.utils._lammps:Structure index 8 is rotated.
INFO:maml.utils._lammps:Structure index 9 is rotated.
Fortran runtime error: Incorrect extent in VALUE argument to DATE_AND_TIME intrinsic: is -2, should be >=8
Error termination. Backtrace:
#0 0x5599f40f562a in ???
#1 0x5599f3cf11ea in ???
#2 0x5599f3cf0cae in ???
#3 0x7fd4f1abc0b2 in ???
#4 0x5599f3cf0ced in ???
#5 0xffffffffffffffff in ???
Traceback (most recent call last):
File "/home/xinglong/anaconda3/envs/ml/lib/python3.8/site-packages/maml/apps/pes/_gap.py", line 343, in train
error_line = [i for i, m in enumerate(msg) if m.startswith("ERROR")][0]
IndexError: list index out of range
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "example.py", line 17, in
gap.train(train_structures=train_structures, train_energies=train_energies,
File "/home/xinglong/anaconda3/envs/ml/lib/python3.8/site-packages/maml/apps/pes/_gap.py", line 346, in train
error_msg += msg[-1]
IndexError: list index out of range
From the limited answers online, this suggests that the gfortran used to compile the program (QUIP/gap_fit) is different from the one used during the run. However, the gfortran I have on the machine is the same one used for compiling and running. The machine architecture is linux_x86_64, the gfortran version is GNU Fortran (Ubuntu 9.3.0-17ubuntu1~20.04) 9.3.0, and the gcc version is gcc (Ubuntu 9.3.0-17ubuntu1~20.04) 9.3.0.
This may be due to the gap_fit program of QUIP; I wonder if you are able to advise on possible solutions?
Thank you for your kind attention.
Is there a way to generate descriptors for the "validation set" data in parallel to predict the energy/forces/stress? I am currently using EnergyForceStress, but have to loop over each structure individually to make these predictions.
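One hedged observation, based on the Ni SNAP snippet earlier on this page: EnergyForceStress.calculate() takes a list of structures, so the per-structure loop may be avoidable:
results = efs_calculator.calculate(val_structures)  # one (energy, forces, stresses) tuple per structure
for energy, forces, stresses in results:
    print(energy)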
Dependabot couldn't authenticate with https://pypi.python.org/simple/.
You can provide authentication details in your Dependabot dashboard by clicking into the account menu (in the top right) and selecting 'Config variables'.
@chc273 @JiQi535 @w6ye The unittests as they are written are very fragile. Look at the recent runs. When using ISIS, the results fluctuate and the tests fail randomly. The L-BFGS optimization also sometimes goes out of bounds. While I understand that some of these algorithms are numerical in nature, tests have to be written on model systems and data where you can be very certain of the outcome. Otherwise, they are not proper tests. Fix this ASAP.
Hi, all!
It seems LAMMPS has updated its pair_style command for hdnnp. Running MAML with NNP gives errors on pair_style:
"""
pair_style hdnnp cutoff keyword value ...
pair_coeff * * elements
"""
which is different from previous:
"""
pair_style nnp keyword value ...
pair_coeff * * elements cutoff
"""
https://docs.lammps.org/pair_hdnnp.html
I'm not sure whether LAMMPS has packages other than ML-HDNNP implementing NNPs.
If ML-HDNNP is the LAMMPS package for NNP, I guess _nnp.py may need to be updated as well?
For pair_style and pair_coeff variables:
"""
pair_style = (
'pair_style hdnnp {} dir "./" showew no showewsum 0 '
"maxew 10000000 resetew yes cflength 1.8897261328 cfenergy 0.0367493254"
)
pair_coeff = "pair_coeff * * {}"
""" (from line 37)
For write_param():
"""
ff_settings = [self.pair_style.format(self.param.get("r_cut") + 1e-2), self.pair_coeff.format(" ".join(self.elements))]
""" (from line 704)
I'm not sure if I missed anything on this part?
Thanks a lot and have a nice day!
Best,
Hi!
I keep getting this error message when trying to run NNP.train(). The input structures are a list of pymatgen structures, with the corresponding lists of energies and forces (n_atoms, 3).
Error message:
--> 685 self.train_forces_rmse = errors[0]
686 self.validation_forces_rmse = errors[1]
IndexError: list index out of range
Thanks in advance!
Hi,
I was trying to run the PES example with GAP fitting, and would like to know how I can interface the maml code with GAP. I have installed quippy with GAP already, but do not know how to point maml at the correct file/location to look for the GAP capabilities.
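A hedged sketch, assuming maml launches the gap_fit binary via subprocess and resolves it from PATH (the QUIP build directory below is hypothetical):
import os

# prepend the directory containing the compiled gap_fit binary to PATH
os.environ["PATH"] = "/opt/QUIP/build/linux_x86_64_gfortran:" + os.environ["PATH"]
from maml.apps.pes import GAPotential  # training should now be able to locate gap_fit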
Thanks
The method parameter 'model_fname' of SNAPotential.model.save(model_fname=filename) does not seem to match sklearn's signature; when it is called, an error is raised. Should it be 'filename'?
Hi there, big fan of the program,
I am struggling with figuring out how to create a .json database from which to train the model. Most of the example Jupyter notebooks start by loading data stored in .json format, e.g. using "loadfn('./data/Mo/AIMD_NVT.json')".
I have tried simply using the to_json method from pymatgen.io.vasp.Vasprun, but when I try using loadfn() it keeps throwing errors like
"__init__() got an unexpected keyword argument 'vasp_version'"
Can you please take me through how to create such a data file from VASP output?
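A hedged sketch of one way to build such a file, assuming monty's dumpfn and pymatgen's Vasprun; the exact document schema the example notebooks expect is an assumption here:
from monty.serialization import dumpfn
from pymatgen.io.vasp.outputs import Vasprun

vr = Vasprun("vasprun.xml")
docs = []
for step in vr.ionic_steps:
    docs.append({
        "structure": step["structure"].as_dict(),  # plain dict so loadfn can round-trip it
        "outputs": {"energy": step["e_fr_energy"], "forces": step["forces"]},
    })
dumpfn(docs, "my_training_data.json")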
Cheers
New versions of pylint failed because there was a bad call in the matminer wrapper that says
super(new_class, self).__init__(**base_kwargs)
when new_class is not even defined. Why are there no unittests for all these?
Why are there two abstract base classes for Potential? What is the difference between them?
Hello,
I'm trying to recreate the results in this paper using the data given in the mlearn repo. The tutorial on nanoHUB assigns 10000 and 1 as the weights for energy and force, respectively. The supplementary material of the paper gives a list of optimized hyperparameters for each group of data (e.g., energy weight of the elastic group, force weight of the elastic group, etc.) for each element. When using convert_docs after pooling the structures, energies and forces, the resulting dataframe does not specify the group. How can I assign the optimized weights corresponding to each specific group?
I think we should move towards a flatter organizational structure, similar to sklearn.
Basically, all implementations should be in separate files, but preceded by _. E.g.,
pes
- _snap
- _mtp
- ...
- __init__.py
The __init__.py will then import the relevant things. See how sklearn implements things, e.g., scikit-learn/ensemble. I like this implementation because imports are a lot simpler, and we still retain the good organization of separate files and full flexibility to move things around if needed.
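For example, with this layout user-facing imports stay flat, as already seen elsewhere on this page:
from maml.apps.pes import SNAPotential, NNPotential  # re-exported by pes/__init__.py from _snap.py and _nnp.py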
Hello,
Recently I have developed property-prediction and spectra-matching deep learning algorithms using the site-averaged K-edge XANES spectrum database from the Materials Project. I found that site-wise spectra might improve my model, but unfortunately I couldn't download the site-wise K-edge spectra via MPRester. So I have to download site-wise spectra from the legacy website by clicking the download button, and downloading all XANES spectra this way is impractical. The L-edge data can be downloaded from the paper website (via the figshare link). Is there any way to download the site-wise K-edge XANES spectra?
I want to compare my models with your group's excellent results, but without the database the comparison might be wrong because my model would not be trained on the same database you used.
Thank you!
Dear Developers,
I am trying to optimize the element profile for a multicomponent system.
I am a beginner in Python and am doing this manually with nested for loops.
I am afraid it will take 15 years to finish (200 × 200 × 200 grid points).
I see that the authors previously did this for several multicomponent systems.
Could you suggest some efficient and faster way to do it?
####################
rcut_grid = []
for rc_1 in np.arange(4, 6, 0.01):
    for rc_2 in np.arange(4, 6, 0.01):
        for rc_3 in np.arange(4, 6, 0.01):
            element_profile = {'Ti': {'r': rc_1, 'w': Ti}, 'Si': {'r': rc_2, 'w': Si},
                               'C': {'r': rc_3, 'w': C}}
            describer = BispectrumCoefficients(rcutfac=0.5, twojmax=6,
                                               element_profile=element_profile, quadratic=False,
                                               pot_fit=True, include_stress=False, n_jobs=4)
            tsc_features = describer.transform(tsc_train_structures)
            # tsc_df and weights are assumed to be built beforehand via pool_from/convert_docs
            y = tsc_df['y_orig'] / tsc_df['n']
            x = tsc_features
            simple_model = LinearRegression(n_jobs=4)
            simple_model.fit(x, y, sample_weight=weights)
            energy_indices = np.argwhere(np.array(tsc_df["dtype"]) == "energy").ravel()
            forces_indices = np.argwhere(np.array(tsc_df["dtype"]) == "force").ravel()
            simple_predict_y = simple_model.predict(x)
            original_energy = y[energy_indices]
            original_forces = y[forces_indices]
            simple_predict_energy = simple_predict_y[energy_indices]
            simple_predict_forces = simple_predict_y[forces_indices]
            e_e = mean_absolute_error(original_energy, simple_predict_energy) * 10000
            e_f = mean_absolute_error(original_forces, simple_predict_forces)
            rcut_grid.append((rc_1, rc_2, rc_3, e_e, e_f))
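Not from the maml docs, but one common alternative to an exhaustive grid is a black-box optimizer over the three cutoffs, e.g. scipy's differential evolution. A runnable sketch with a stand-in objective (in practice, the loop body above, returning e_f, would take its place):
import numpy as np
from scipy.optimize import differential_evolution

def objective(rcuts):
    # stand-in: build element_profile from rcuts, featurize, fit, and return the error
    return float(np.sum((np.asarray(rcuts) - 5.0) ** 2))

result = differential_evolution(objective, bounds=[(4.0, 6.0)] * 3, maxiter=30, seed=0)
print(result.x, result.fun)  # hundreds of evaluations instead of 8 million grid points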
Hi,
I'm trying to train MTP models with the previous example data and notebook from the mlearn package (the current maml does not seem to have that notebook), but the training fails, with the configuration file (.mtp file) containing multiple '-nan' values:
"""
MTP
version = 1.1.0
potential_name = MTP1m
scaling = 1.438492177533894e-04
species_count = 1
potential_tag =
radial_basis_type = RBChebyshev
min_dist = 4.000000000000000e+00
max_dist = 4.800000000000000e+00
radial_basis_size = 8
radial_funcs_count = 2
radial_coeffs
0-0
{-nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan}
{-nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan}
alpha_moments_count = 18
alpha_index_basic_count = 11
alpha_index_basic = {{0, 0, 0, 0}, {0, 1, 0, 0}, {0, 0, 1, 0}, {0, 0, 0, 1}, {0, 2, 0, 0}, {0, 1, 1, 0}, {0, 1, 0, 1}, {0, 0, 2, 0}, {0, 0, 1, 1}, {0, 0, 0, 2}, {1, 0, 0, 0}}
alpha_index_times_count = 14
alpha_index_times = {{0, 0, 1, 11}, {1, 1, 1, 12}, {2, 2, 1, 12}, {3, 3, 1, 12}, {4, 4, 1, 13}, {5, 5, 2, 13}, {6, 6, 2, 13}, {7, 7, 1, 13}, {8, 8, 2, 13}, {9, 9, 1, 13}, {0, 10, 1, 14}, {0, 11, 1, 15}, {0, 12, 1, 16}, {0, 15, 1, 17}}
alpha_scalar_moments = 9
alpha_moment_mapping = {0, 10, 11, 12, 13, 14, 15, 16, 17}
species_coeffs = {-nan}
moment_coeffs = {-nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan}
"""
Also the training output is weird:
"""
WARNING:
!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
!!! WARNING WARNING WARNING !!!
!!! Read a configuration with (negative) Stress. !!!
!!! This feature will be removed soon! !!!
!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
BFGS iterations count set to 500
BFGS convergence tolerance set to 1e-08
Energy weight: 1
Force weight: 0.01
Stress weight: 0
MTPR parallel training started
BFGS iter 0: f=-nan
BFGS iter 1: f=-nan
BFGS iter 2: f=-nan
BFGS iter 3: f=-nan
BFGS iter 4: f=-nan
BFGS iter 5: f=-nan
......
BFGS iter 499: f=-nan
step limit reached
MTPR training ended
Rescaling...
scaling = 0.000119874348127824, condition number = -nan
scaling = 0.000130772016139445, condition number = -nan
scaling = 0.000143849217753389, condition number = -nan
scaling = 0.000158234139528728, condition number = -nan
scaling = 0.000172619061304067, condition number = -nan
Rescaling to 0.000143849217753389... done
* * * TRAIN ERRORS * * *
Errors report
Energy:
Errors checked for 10 configurations
Maximal absolute difference = nan
Average absolute difference = nan
RMS absolute difference = nan
Energy per atom:
Errors checked for 10 configurations
Maximal absolute difference = nan
Average absolute difference = nan
RMS absolute difference = nan
Forces:
Errors checked for 540 atoms
Maximal absolute difference = -nan
Average absolute difference = -nan
RMS absolute difference = -nan
Max(ForceDiff) / Max(Force) = -nan
RMS(ForceDiff) / RMS(Force) = -nan
Stresses (in eV):
Errors checked for 10 configurations
Maximal absolute difference = -nan
Average absolute difference = -nan
RMS absolute difference = -nan
Max(StresDiff) / Max(Stres) = -nan
RMS(StresDiff) / RMS(Stres) = -nan
Virial stresses (in GPa):
Errors checked for 10 configurations
Maximal absolute difference = -nan
Average absolute difference = -nan
RMS absolute difference = -nan
Max(StresDiff) / Max(Stres) = -nan
RMS(StresDiff) / RMS(Stres) = -nan
"""
It seems the problem is caused by the '-nan' values in the .mtp file.
Thanks a lot and have a nice day!
The second line in garnet_formation_energy.ipynb should be:
from pymatgen.core import Structure
rather than
from pymatgen import Structure
I do not understand the design of this method.
Why does it return a list of str?
That is a silly format.
Either make it a simple string, or return a well-structured object like a pybtex entry.
Alternatively, there is no need for the citation to be returned by a method. It can just be in the documentation.
I wonder if the MPI parallelization of n2p2 is carried over and can be used in the maml package.
Because of the BOWSR package/paper
https://github.com/topics/materials-discovery
Why does the function convert_docs() in /maml/utils/_data_conversion.py not handle virial_stress data?
And how can I train with virial_stress data?
Thank you!
Instead of using CircleCI, we will be moving to github actions henceforth for testing and linting.
We will disable CircleCI once all tests pass.
Hello,
I was trying to run the GAP example jupyter notebook in the mlearn repository. I modified it to use the maml package:
How can I fix this error?
Can someone please explain to me what the _sanity_check function (line 99) in _lammps.py is doing?
I am getting "Incompatible structure found" while trying to train on some of the structures.