zincware / ipsuite
Machine Learned Interatomic Potential Tools
Home Page: https://ipsuite.readthedocs.io
License: Eclipse Public License 2.0
As mentioned in #104 the SOAP parameters currently contain the periodic key. This should not be a parameter but defined by the data.
ConfigurationSelection_1_kernel:
  soap:
    _type: soap_parameter_dataclass
    value:
      l_max: 7
      n_jobs: -1
      n_max: 7
      periodic: true
      r_cut: 9.0
      rbf: gto
      sigma: 1.0
      weighting: null
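One way to read "defined by the data" is to derive the flag from the periodic boundary conditions stored with each configuration. The following is a minimal sketch under that assumption; `Config` and `infer_periodic` are hypothetical names used for illustration and are not part of IPSuite.

```python
from dataclasses import dataclass
from typing import Sequence, Tuple


@dataclass
class Config:
    """Hypothetical stand-in for an atoms object carrying a pbc flag triple."""

    pbc: Tuple[bool, bool, bool]


def infer_periodic(configs: Sequence[Config]) -> bool:
    """Return True if every configuration is periodic in at least one
    direction, False if none is.

    Raises ValueError for empty or mixed input, because a single global
    flag cannot describe such a dataset.
    """
    if not configs:
        raise ValueError("cannot infer periodicity from an empty dataset")
    flags = {any(c.pbc) for c in configs}
    if len(flags) != 1:
        raise ValueError("dataset mixes periodic and non-periodic configurations")
    return flags.pop()
```

Raising on mixed datasets is a design choice of this sketch; a per-configuration flag would be the alternative.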
IPSuite/ipsuite/configuration_selection/index.py
Lines 22 to 23 in 6e541be
Can't do IndexSelection(index=[-1])
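A possible way to support negative indices is to resolve them against the dataset length before selecting. This is only a sketch; `resolve_indices` is a hypothetical helper, not the current IndexSelection implementation.

```python
from typing import List, Sequence


def resolve_indices(indices: Sequence[int], n_items: int) -> List[int]:
    """Map possibly negative indices onto the range [0, n_items)."""
    resolved = []
    for idx in indices:
        # Negative indices count from the end, as in Python sequences.
        pos = idx + n_items if idx < 0 else idx
        if not 0 <= pos < n_items:
            raise IndexError(f"index {idx} out of range for {n_items} items")
        resolved.append(pos)
    return resolved
```

With five configurations, `resolve_indices([-1], 5)` gives `[4]`.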
#47 is missing tests
If you switch from a MLModel (or any Node) that uses params.yaml to one that does not use params.yaml at all, the parameters in params.yaml remain and can cause issues (ZnTrack problem).
forces are already per atom - check the analyseprediction node for force errors
Replace the prediction dataclass by a fields.Atoms
Check if uncertainties are available and then use them in the plots
Hello Maintainers,
I am working with a dataset of around 5000 molecular configurations, which are not necessarily generated through MD simulations. I am interested in employing the "Atomic Energy Selection" methodology, as outlined in this paper, to assemble a dataset for training my machine learning model.
In addition to this, the other methods available in this repository, such as the kernel selection method, seem to align well with my use-case requirements.
To better understand the implementation and to ensure I am utilizing these methodologies correctly, it would be greatly beneficial if you could provide an example code for the Atomic Energy Selection and Kernel Selection methodologies.
I have also spoken about this issue to Samuel Tovey via email.
It should be easy to have a Benchmark Dataset, QM9, MD17, ... as a repository that one can clone and then work with.
Currently the limiting factors are ZnH5MD periodic pbc and a changing number of atoms.
As discussed:
self.status = f"Temperature Check {temperature} < {self.max}"
# error
self.status = f"Temperature Check failed: last {temperature} > {self.max}"
# and then
def __str__(self):
    return self.status
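Put together, the snippet above could look like the following self-contained sketch. The class name `TemperatureCheck`, its constructor, and the `check` method are assumptions made for illustration; only the status strings and `__str__` come from the discussion, and the passing message uses `<=` so that it stays accurate when the temperature equals the limit.

```python
class TemperatureCheck:
    """Hypothetical check that records a human-readable status."""

    def __init__(self, max_temperature: float) -> None:
        self.max = max_temperature
        self.status = "Temperature Check not run"

    def check(self, temperature: float) -> bool:
        """Return True if the limit is exceeded and the run should stop."""
        if temperature > self.max:
            self.status = (
                f"Temperature Check failed: last {temperature} > {self.max}"
            )
            return True
        self.status = f"Temperature Check {temperature} <= {self.max}"
        return False

    def __str__(self) -> str:
        return self.status
```

Printing the check object after a run then reports why the simulation stopped.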
Line 144 in 4228e0b
Add an energy + uncertainty plot to the uncertainty selection node where selected points are marked
In addition to #81 or in general for all checks allow:
IPSuite/ipsuite/models/mace_model.py
Line 46 in aa21501
Default to None and do dynamically in post_init + post_load.
Have a minimal amount of configurations that must pass to trigger ThresholdSelection
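A sketch of one possible reading: the selection only returns anything once at least a given number of values exceed the threshold. The function name and the `min_selected` parameter are hypothetical and not part of ThresholdSelection.

```python
from typing import List, Sequence


def select_above_threshold(
    values: Sequence[float], threshold: float, min_selected: int = 1
) -> List[int]:
    """Return indices of values above the threshold, or an empty list if
    fewer than min_selected values exceed it."""
    selected = [i for i, value in enumerate(values) if value > threshold]
    if len(selected) < min_selected:
        return []  # not enough configurations passed, so do not trigger
    return selected
```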
The IPS Analysis module starts to get really crowded.
Because these Nodes are some of the most important Nodes of IPS I think we should put them in some sort of order.
IPSuite/ipsuite/analysis/__init__.py
Lines 1 to 20 in 7de8327
It's also important to mention that if the Node location is changed inside IPS, the respective projects using these Nodes will have a changed cmd zntrack run ipsuite.analysis.PredictWithModel --name PredictWithModel in the dvc.yaml file and will therefore be recomputed.
I'm open to suggestions on how to organize these Nodes inside IPS.
Make all nodes to be imported from ips.nodes.<name>
They can still be in their respective modules but use module overwrite and lazy loading.
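Lazy loading of this kind can be built on a module-level `__getattr__` (PEP 562). The sketch below constructs a namespace object at runtime so that it is self-contained; the mapping uses standard-library classes as stand-ins for real nodes, and `make_lazy_namespace` is a hypothetical helper.

```python
import importlib
import types

# Hypothetical mapping: public name -> (module path, attribute name).
# Standard-library targets stand in for real node classes so the sketch runs.
_LAZY = {
    "OrderedDict": ("collections", "OrderedDict"),
    "Fraction": ("fractions", "Fraction"),
}


def make_lazy_namespace(name: str, mapping: dict) -> types.ModuleType:
    """Create a module whose attributes are imported on first access."""
    module = types.ModuleType(name)

    def __getattr__(attr: str):
        try:
            module_path, obj_name = mapping[attr]
        except KeyError:
            raise AttributeError(
                f"module {name!r} has no attribute {attr!r}"
            ) from None
        obj = getattr(importlib.import_module(module_path), obj_name)
        setattr(module, attr, obj)  # cache so later lookups skip __getattr__
        return obj

    # PEP 562: a __getattr__ in the module namespace handles missing names.
    module.__getattr__ = __getattr__
    return module


nodes = make_lazy_namespace("nodes", _LAZY)
```

In a real package the same `__getattr__` would be defined directly in the package's `__init__.py` instead of being attached to a generated module.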
IPSuite/ipsuite/calculators/cp2k.py
Lines 184 to 191 in 6e541be
We could provide Node Groups, that define multiple nodes together. E.g. metrics could be predict, metrics, force decomp all together.
Discussion: Furthermore, we don't really need PredictWithModel, as it is basically Singlepoint where the calc is the model. Then we pass true/pred to the metrics node. This would probably make uncertainty easier as well, because we don't need the prediction class anymore. This would be easier with Node Groups.
test_data = []
train_data = []
test_data.append(
    ips.configuration_selection.RandomSelection(data=geopt, n_configurations=50)
)
train_data.append(
    ips.configuration_selection.RandomSelection(data=geopt, n_configurations=50, exclude=test_data)
)
support something like this via:
if self.exclude is not None:
    if self.exclude_configurations is None:
        self.exclude_configurations = {}
    if not isinstance(self.exclude, list):
        self.exclude = [self.exclude]
    for exclude in self.exclude:
        for key in exclude.selected_configurations:
            if key in self.exclude_configurations:
                self.exclude_configurations[key].extend(
                    exclude.selected_configurations[key]
                )
            else:
                self.exclude_configurations[key] = exclude.selected_configurations[
                    key
                ]
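The same merge can be written as a standalone function, which makes it easier to test. This is a sketch only; `merge_excluded` is a hypothetical name, and the inputs are plain dictionaries standing in for the `selected_configurations` attribute. Unlike the snippet above, it copies the lists, so extending the merged result never modifies the lists owned by the excluded nodes.

```python
from typing import Dict, Iterable, List, Optional


def merge_excluded(
    sources: Iterable[Dict[str, List[int]]],
    initial: Optional[Dict[str, List[int]]] = None,
) -> Dict[str, List[int]]:
    """Merge per-key index lists from several selections into one dict."""
    # Copy the initial lists so the caller's data is left untouched.
    merged = {key: list(ids) for key, ids in (initial or {}).items()}
    for selected in sources:
        for key, ids in selected.items():
            merged.setdefault(key, []).extend(ids)
    return merged
```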
Maybe a thermostat that can do that?
Would replace
IPSuite/ipsuite/analysis/model/dynamics.py
Line 162 in 5b02f8d
Should we address this in another PR?
Originally posted by @Tetracarbonylnickel in #81 (comment)
Line 12 in 39e5698
Have the following scheme:
MD with UncertaintyCheck starting from different configurations
UncertaintyCheck stops, then use ItemSelection(item=-1) and run DFT.
Retrain Model on all the data points
Repeat as often as you wish.
Have a PyPack Node as a pure-Python drop-in replacement for the Packmol Node.
Maybe we can add http://m3g.iqm.unicamp.br/packmol/nmols.shtml to get an approximate density?
Clone and run them in the test suite
We should check why it is so slow:
is it ZnTrack or other modules?
gap seems to write train_atoms.extxyz.idx which is not in the .gitignore
Instead of hard coding the inputs for mace create a yaml input file:
e.g. convert --hidden_irreps='128x0e + 128x1o'
to
hidden_irreps: 128x0e + 128x1o
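Once the parameters live in a mapping (for example after loading the yaml file), the command-line arguments can be generated instead of hard coded. A minimal sketch, with `to_cli_args` as a hypothetical helper and the `--key=value` form taken from the example above:

```python
from typing import Dict, List


def to_cli_args(params: Dict[str, object]) -> List[str]:
    """Turn a parameter mapping into a list of --key=value arguments."""
    return [f"--{key}={value}" for key, value in params.items()]


# The mapping that the yaml line above would load into.
params = {"hidden_irreps": "128x0e + 128x1o"}
args = to_cli_args(params)
```

Passing `args` as a list to `subprocess.run` avoids the shell quoting that the hard-coded string needs for values containing spaces.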
IPSuite/ipsuite/calculators/ase_md.py
Line 104 in 23192c2
This should be zntrack.zn.outs
We could add a Node that, on a condition (e.g. AL doesn't produce new configurations), stops the graph by raising an Error.
Would this be required in some scenarios?
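A sketch of what the core of such a Node could do, assuming the condition is "too few new configurations". The exception class and function name are hypothetical.

```python
from typing import Sequence


class NoNewConfigurationsError(RuntimeError):
    """Raised to stop the graph when no new configurations were produced."""


def require_new_configurations(selected: Sequence, minimum: int = 1) -> Sequence:
    """Pass the selection through, or raise if it holds too few items."""
    if len(selected) < minimum:
        raise NoNewConfigurationsError(
            f"expected at least {minimum} new configurations, got {len(selected)}"
        )
    return selected
```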
Lines 116 to 123 in 052fad9
You can't import XTB
ImportError: xtb C extension unimportable, cannot use C-API
if nequip (and I guess torch) is imported.
-> We need lazy importing of modules
IPSuite/ipsuite/calculators/__init__.py
Line 17 in ef12234
We need lazy loading here, because apax might not be installed. This breaks IPS at the moment.
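For optional dependencies, one option is to check whether the package can be found before importing it. This is a sketch using only `importlib`; `optional_import` is a hypothetical helper and is meant for top-level package names.

```python
import importlib
import importlib.util
from types import ModuleType
from typing import Optional


def optional_import(module_name: str) -> Optional[ModuleType]:
    """Import a top-level module if it is installed, else return None."""
    if importlib.util.find_spec(module_name) is None:
        return None
    return importlib.import_module(module_name)
```

Code that depends on the optional package can then raise a clear error only when the corresponding Node is actually used.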
IPS doesn't have docs.
support:
data: HasAtoms | List[HasAtoms] | Atoms | list[Atoms]
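Supporting all of these input types could be done by normalizing them to one flat list first. The sketch below assumes that "HasAtoms" means an object with an `atoms` attribute holding a list of configurations; `flatten_atoms` is a hypothetical helper.

```python
from typing import Any, List


def flatten_atoms(data: Any) -> List[Any]:
    """Normalize a node, a configuration, or (nested) lists of either
    into one flat list of configurations."""
    if isinstance(data, (list, tuple)):
        result: List[Any] = []
        for item in data:
            result.extend(flatten_atoms(item))
        return result
    if hasattr(data, "atoms"):
        # Assumed HasAtoms interface: the node exposes its configurations.
        return flatten_atoms(data.atoms)
    return [data]  # a single configuration
```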
DVC actually supports comparing images
Just like with data we can also provide trained models for faster tests
The ModelAnalysis Node only analyses the force magnitude. We should add a metric and plot for force components (all in one plot should be sufficient)
Sometimes you want to run a ProcessSingleAtom for multiple configurations.
Currently this is very difficult so I'd propose:
method = ips.analysis.BoxScaleAnalysis(data=None)
with project:
    data = ips.AddData()
    volume_scan = ips.Map(data=data, method=method)
project.run()
Some calculators, e.g. CP2K or XTB require a conda setup for tests.
Maybe we can parametrize and run these tests in a dedicated runner.
If you change AddData the ConfigurationSelection will rerun, but because the output is just a list of indices, e.g. [1, 2, 3], the UpdateAtoms might not be rerun automatically!
flowchart TD
    node1["AddData"]
    node2["ConfigurationSelection"]
    node3["UpdateAtoms"]
    node1-->node2
    node2-->node3
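The effect can be illustrated without any pipeline tool, assuming a stage is rerun only when the content hash of one of its dependencies changes, as DVC does. The data and the `select_indices` function are hypothetical.

```python
import hashlib
import json
from typing import List, Sequence


def content_hash(obj) -> str:
    """Hash the JSON serialization of obj, standing in for a file checksum."""
    return hashlib.sha256(json.dumps(obj).encode("utf-8")).hexdigest()


def select_indices(data: Sequence[str]) -> List[int]:
    """Hypothetical selection that always picks positions 1, 2 and 3."""
    return [i for i in (1, 2, 3) if i < len(data)]


old_data = ["H2O", "CO2", "CH4", "NH3"]
new_data = ["H2O", "CO2", "C2H6", "NH3"]  # same length, different content

# The input data changed, so the selection stage is rerun ...
data_changed = content_hash(old_data) != content_hash(new_data)
# ... but it writes the same indices, so the next stage sees no change.
indices_changed = content_hash(select_indices(old_data)) != content_hash(
    select_indices(new_data)
)
```

Here `data_changed` is true while `indices_changed` is false, which is why a downstream stage that depends only on the index list would be skipped.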