kinetica-jl / kinetica.jl
Automated chemical reaction networking with long-timescale kinetic simulations in Julia
Home Page: https://kinetica-jl.github.io/Kinetica.jl/
License: Other
When restarting an IterativeExplore and loading the current state of a network through import_network(), the imported network lacks knowledge of inert species, even though the network was initially constructed with these species in mind. This is because 'raw' networks (the underlying directory tree being explored) are never made aware of any inert species, so these species are not added unless a calculator modifies the network later on.
This leads to inconsistencies and crashes when handling solution objects based on networks that were not initially set up with inert species: inert species added by calculators are appended to the end of the active SpeciesData, while earlier solutions have them placed just after the initial reactants.
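The index mismatch described above can be illustrated with a minimal sketch. It uses plain vectors and a hypothetical `index_map` helper as stand-ins for species ordering; it does not use Kinetica's SpeciesData type, and the species chosen are purely illustrative.

```julia
# Hypothetical stand-in for species ordering; this is NOT Kinetica's SpeciesData.
# Builds a SMILES => index map from an ordered list of species.
index_map(species::Vector{String}) = Dict(s => i for (i, s) in enumerate(species))

reactants = ["CC"]
inert = ["N#N"]
explored = ["C=C", "[H][H]"]

# Ordering seen by earlier solutions: inert species sit just after the reactants.
original_order = vcat(reactants, inert, explored)
# Ordering after a restart, if a calculator appends the inert species at the end.
imported_order = vcat(reactants, explored, inert)

original_idx = index_map(original_order)
imported_idx = index_map(imported_order)

# Species whose index differs between the two orderings.
mismatched = sort([s for s in original_order if original_idx[s] != imported_idx[s]])
println(mismatched)
```

Any solution array indexed with the original ordering would be misread for every species listed in `mismatched`.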
This could be handled in a few different ways:
- Write an inert.in file in rdir_head on network initialisation, which is then read in and always placed first during import_network().
- Pass inert species to import_network() as an optional argument, inserting them before importing the rest of the network.
Kinetica.jl/src/exploration/molecule_system.jl, lines 249 to 263 in 2ebced5
The docstring for system_from_smiles incorrectly states that it takes a String input, leading users to believe that this can be a single SMILES string containing multiple species. The method actually requires a vector of individual SMILES strings.
The docstring needs to be updated, but this method could also be made to work with a single SMILES string.
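One way to accept both input forms is to normalise the input before building the system. The sketch below assumes a hypothetical helper, `split_smiles`, which is not part of Kinetica.jl; it relies on `.` being the SMILES separator for disconnected components.

```julia
# Hypothetical helper (not part of Kinetica.jl): normalise SMILES input into a
# vector of single-component SMILES strings.

# A single SMILES string is split on '.', the SMILES component separator.
split_smiles(smi::AbstractString) = String.(split(smi, '.'))

# A vector of SMILES strings is split element-wise and flattened.
split_smiles(smis::AbstractVector{<:AbstractString}) =
    reduce(vcat, split_smiles.(smis); init=String[])

println(split_smiles("C.O"))
println(split_smiles(["CC", "C.O"]))
```

Note that splitting on `.` also separates ionic pairs such as `[Na+].[Cl-]` into two species, which may or may not be the desired behaviour.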
When constructing reactant/product systems for input into TS-finding algorithms like NEB, atom indices are required to be consistent between all provided geometries. This requires that the endpoint XYZs are consistently atom mapped.
Currently, when creating an XYZ molecule system, atom indices will always be determined by the order that the component molecules are given in - atoms in molecule 2 are concatenated in after those of molecule 1, etc. Since reactant and product systems are not guaranteed to have the same atoms in each molecule, this can lead to atom mapping consistency being broken.
To resolve this, we need to introduce a procedure for atom mapping that can be used to reorder the atoms in reactant/product systems to ensure consistency. Atom maps can be constructed purely from SMILES using an approach like RXNMapper, but this relies on a predictive approach that is not always guaranteed to be accurate. Instead, we could construct atom maps directly from sampled reactions when they are read in, since CDE ensures correct atom mapping and this is only broken when separating molecules in CRN ingest.
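As a rough sketch of the reordering step, suppose each atom carries an atom-map number taken from the sampled reaction. The function name `reorder_to_match` and the data layout below are assumptions made for illustration, not existing Kinetica functionality.

```julia
# Hypothetical sketch: compute the permutation that puts product atoms into the
# reactant atom ordering. reac_map[i] and prod_map[j] are the atom-map numbers
# of reactant atom i and product atom j respectively.
function reorder_to_match(reac_map::Vector{Int}, prod_map::Vector{Int})
    sort(reac_map) == sort(prod_map) ||
        throw(ArgumentError("atom maps do not cover the same atoms"))
    # Position of each atom-map number within the product system.
    where_in_prod = Dict(m => j for (j, m) in enumerate(prod_map))
    # Permutation `perm` such that prod_map[perm] == reac_map.
    return [where_in_prod[m] for m in reac_map]
end

reac_map = [1, 2, 3, 4]
reac_elements = ["C", "H", "O", "H"]
prod_map = [3, 1, 4, 2]
prod_elements = ["O", "C", "H", "H"]

perm = reorder_to_match(reac_map, prod_map)
println(perm)
println(prod_elements[perm])
```

The same permutation would be applied to the product coordinates, so that atom `i` refers to the same physical atom in both endpoint geometries.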
The RxData.reacs and RxData.prods fields have been redundant for a while now, as RxData always requires a SpeciesData to function when species properties are required anyway. These could be good places to put the new atom-mapped reactants and products respectively. This should be a minimal overhaul, but requires careful handling in CRN I/O.
Currently, networks explored via IterativeExplore that are terminated early (due to an error, exceeding walltime, etc.) can be restarted directly from the contents of their rdir_head. However, this is only useful as long as rdir_head is always available.
If running on distributed resources like HPC, network exploration should be performed within a scratch space to allow for the currently heavy IO requirements of CDE runs. However, these scratch spaces are usually semi-volatile and in many cases cease to exist once a job is finished. This wipes the entire rdir_head directory tree, preventing restarts.
While rdir_head could be periodically backed up to non-volatile storage, this would be incredibly expensive and would nullify many of the benefits of performing exploration on a scratch space. Instead, we could use the already implemented incomplete network saves (which can be saved into a non-scratch directory) as checkpoints and allow for partial (or full) network restoration from them when rdir_head is not present (e.g. when it has been wiped at the end of a job). This would work as follows:
1. Check whether rdir_head exists. If it does, the network within may either be full (present in the directory tree from the initial level) or partial (present in the directory tree only from a certain point, as it has been loaded from a checkpoint before).
In step 1, when there is a full network it can be directly loaded. However, when there is only a partial network, a checkpoint file corresponding to the exploration progress made in the level(s) before those that exist in rdir_head MUST be available for exploration to continue without error.
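The decision described in step 1 could be sketched roughly as follows. Everything here is an assumption made for illustration: the function name `restart_source`, the returned symbols, the checkpoint being a single file, and the first level directory being called `level_001`. None of this is existing Kinetica API.

```julia
# Hypothetical sketch of choosing a restart source; names are illustrative only.
# Assumes the first exploration level lives in a subdirectory named `first_level`.
function restart_source(rdir_head::AbstractString;
                        checkpoint::Union{Nothing,AbstractString}=nothing,
                        first_level::AbstractString="level_001")
    has_tree = isdir(rdir_head) && !isempty(readdir(rdir_head))
    has_ckpt = checkpoint !== nothing && isfile(checkpoint)

    if has_tree && isdir(joinpath(rdir_head, first_level))
        return :full_tree              # full network: load directly from rdir_head
    elseif has_tree
        # Partial network: earlier levels must come from a checkpoint.
        has_ckpt || error("partial network in rdir_head but no checkpoint available")
        return :checkpoint_then_tree
    elseif has_ckpt
        return :checkpoint             # rdir_head was wiped: restore from checkpoint
    else
        return :fresh                  # nothing to restart from
    end
end

# A directory that does not exist yet, with no checkpoint, gives a fresh start.
println(restart_source(joinpath(mktempdir(), "rdir_head")))
```

Raising an error in the partial-network case reflects the requirement above that a matching checkpoint must be available for exploration to continue.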
Calling the KPM calculator can result in the following:
ERROR: DimensionMismatch: arrays could not be broadcast to a common size; got a dimension with lengths 9724 and 9722
where the first length is rd.nr and the second is that of calc.Ea. This is due to KPM identifying 2 duplicate reactions and removing them (N.B. this needs an assertion in KineticaKPM). These reactions should have been previously identified by rhash in push!(::RxData...; unique_rxns=true) but are not, so some duplicate reactions must somehow be generating unique hashes. As this also applies to fresh RxData objects reconstructed from the same reaction tree, the offending input reactions must differ enough to always generate different hashes. The logic for rhash construction needs to be checked.
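One common cause of duplicates slipping past a hash check is sensitivity to species ordering. The sketch below shows an order-independent reaction key together with an explicit length check; it is not Kinetica's rhash implementation, and `canonical_key`, `unique_reactions` and the example data are hypothetical.

```julia
# Hypothetical order-independent reaction key; NOT Kinetica's rhash.
# Sorting each side makes the key insensitive to the order species are listed in.
canonical_key(reacs::Vector{String}, prods::Vector{String}) =
    hash((sort(reacs), sort(prods)))

# Keep only the first occurrence of each reaction, judged by its canonical key.
function unique_reactions(rxns::Vector{Tuple{Vector{String},Vector{String}}})
    seen = Set{UInt}()
    kept = eltype(rxns)[]
    for (reacs, prods) in rxns
        key = canonical_key(reacs, prods)
        if !(key in seen)
            push!(seen, key)
            push!(kept, (reacs, prods))
        end
    end
    return kept
end

rxns = [
    (["C=C", "[H][H]"], ["CC"]),
    (["[H][H]", "C=C"], ["CC"]),   # same reaction, reactants listed in another order
    (["CC"], ["C=C", "[H][H]"]),   # reverse reaction, kept as distinct
]
kept = unique_reactions(rxns)

# Explicit check of the kind suggested above, failing with a clear message
# instead of a DimensionMismatch later on.
Ea = [1.0, 2.0]   # illustrative values only
length(Ea) == length(kept) ||
    error("got $(length(Ea)) activation energies for $(length(kept)) reactions")
println(length(kept))
```

Whether species ordering is the actual cause here is not established; the sketch only shows one way to rule it out.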