Just collecting some initial brainstorming ideas on validation in advance of Monday's meeting:
Validating the overall MultiTargetDesign
sampler
Comparing alchemical hydration free energies with relative hydration free energies
We can test the overall sampler by having the sampler compute relative solvation free energies between linear alkanes (methane, ethane, propane, n-butane, ...) with something like
target_samplers = { solvent_sampler : 1.0, vacuum_sampler : -1.0 }
designer = MultiTargetDesign(target_samplers)
designer.run()
and comparing the relative free energies to differences in absolute solvation free energies. The reference values could either be computed with YANK or could be computed inside perses
using the NCMC module to switch bidirectionally between solvent and vacuum.
Validating alchemical NCMC
There are already two tests in there:
Alchemical deletion / reinsertion
Deleting and reinserting the same sets of atoms should give a free energy difference (computed via EXP
) within a few statistical errors of zero. I have this threshold set to 6*sigma
for right now.
NCMC switching for a harmonic oscillator
The NCMC integrator currently drags a harmonic well around in a bidirectional switching experiment and checks that the BAR
estimate of free energy differences is zero.
Validating AbsoluteAlchemicalFactory
There are already a battery of tests in alchemy
, but we could add a test to perses
which makes sure that when atoms are alchemically eliminated, the only contributions to the energy when these atoms are moved comes from valence energy terms.
Validating the geometry engine
Null transformations
Null transformations---such as napthalene-to-napthalene with manually-encoded atom mappings---will be useful for making sure that there is no unexpected asymmetry. We should get equal samplings of both napthalenes. While a cool demonstration (and one we should include int he paper), this is unfortunately not the most demanding test since there could still be bugs and yet the result will still be symmetric because some contribution cancel out for both states.
Marginal distributions of internal coordinates
If we just run a SAMSSampler
or properly-weighted ExpandedEnsembleSampler
without using dynamics to propagate in between chemical switches and using instantaneous Monte Carlo (rather than NCMC) in the chemical switches, provided we randomly choose from the list of potential atomic alignments, we should get ergodic sampling of a series of small alkanes in vacuum. The marginal distributions of internal coordinates (bonds, angles, torsions) for a given molecule should match up with marginal distributions sampled for that molecule using GHMC. I'm unclear on whether or not this should also result in equal visits all chemical species---I think it should if we get all of the contributions correct, but I'm not convinced that I can prove it.
Analytical forward and reverse probabilities
For a linear chain of beads W-X-Y-Z with a forcefield that includes only bonds, angles, and torsions with a single Fourier component, placing or deleting Z should have an analytically computable Cartesian probability density function (which we already have code to compute) that can be compared to the numerical probability density function we actually use.
ExpandedEnsembleSampler
Harmonic oscillators
If we have a series of states that differ in Hamiltonian by the spring constant of a harmonic oscillator, we can impose analytically-computed weights and make sure we sample each state uniformly. We can use openmmtools.testsystems.HarmonicOscillator
for this purpose; I can add the analytical free energy and normalizing constant to the list of analytically-available properties.
SAMSSampler
Harmonic oscillators
We can initialize the SAMS sampler with all zeros and make sure that multiple simulations converge to the analytically-computed free energies of harmonic oscillators.