Comments (5)
The reason comes down to deciding where to spend time: developing code, testing it, and computational cost.
Pretraining is just to create some initial state that is vaguely close to the ground state (i.e. within tens of Hartrees). There's a deliberate choice for pretraining to be both simple and quite crude (so the optimisation doesn't need to break symmetry, for example). This is also why we use a small basis set by default. Better pretraining may or may not improve convergence -- I think there are (quickly) diminishing returns.
Note that this is not the only simplification we make during pretraining. There's also some discussion in (e.g.) #14 and maybe also in our papers.
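The pretraining idea described above (fit the network's orbital matrix to a Hartree-Fock target by mean-squared error) can be sketched as follows. Everything here is a made-up stand-in for illustration -- the toy `hf_orbitals` and `network_orbitals` functions and the 1D "electrons" are not the actual ferminet API:

```python
import numpy as np

rng = np.random.default_rng(0)

def hf_orbitals(positions):
    # Hypothetical stand-in for Hartree-Fock orbitals evaluated at the
    # electron positions: returns an (n_electrons, n_orbitals) matrix.
    return np.sin(positions[:, None] * np.arange(1, 3)[None, :])

def network_orbitals(positions, params):
    # Hypothetical stand-in for the neural-network orbital outputs.
    return np.tanh(params * positions[:, None] * np.arange(1, 3)[None, :])

def pretraining_loss(params, positions):
    # MSE between the network's orbital matrix and the HF target matrix.
    diff = network_orbitals(positions, params) - hf_orbitals(positions)
    return np.mean(diff ** 2)

positions = rng.normal(size=2)  # two "electrons" in 1D, for illustration
loss = pretraining_loss(1.0, positions)
```

Since the loss only needs to pull the network somewhere numerically sensible (not reproduce HF accurately), a crude target and a small basis are enough.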
from ferminet.
Thanks for the response! I see. Though I was still wondering, since padding everything with zeros seems more involved than training against the full matrix.
There seems to be a small misunderstanding. Maybe this figure helps clear things up.
In the top-left corner is the RHF Slater determinant. In pretrain.py:91 we then select only the two left blocks. Then, in pretrain.py:155, we put these two blocks on the diagonal and fit to the FermiNet determinant.
My question is then: why not compute the MSE directly between the RHF Slater matrix and the FermiNet Slater matrix?
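The block-diagonal construction described above can be sketched with plain NumPy. The sizes and random matrices are toy stand-ins for illustration, not the actual pretrain.py code:

```python
import numpy as np

rng = np.random.default_rng(0)
n_alpha, n_beta = 2, 2

# Toy spin blocks: alpha electrons evaluated at alpha spin-orbitals,
# and likewise for beta (stand-ins for the two left blocks of the
# RHF/UHF orbital matrix).
alpha_block = rng.normal(size=(n_alpha, n_alpha))
beta_block = rng.normal(size=(n_beta, n_beta))

# Place the two spin blocks on the diagonal and zero the off-diagonal
# blocks, giving the padded target matrix the network is fit against.
target = np.zeros((n_alpha + n_beta, n_alpha + n_beta))
target[:n_alpha, :n_alpha] = alpha_block
target[n_alpha:, n_alpha:] = beta_block
```

The off-diagonal zeros are what makes this target consistent with a spin-factored wavefunction, rather than with the dense RHF Slater matrix.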
Note that we train against the UHF state by default rather than the RHF state. I am not convinced by your spin labels on the right-hand side -- line 90 selects the values of the alpha spin-orbitals evaluated at the positions of the alpha electrons, and similarly for the beta spin-orbitals and electrons.
The full_det option is experimental. Again, I don't think pretraining is particularly important, and there was a choice to pretrain analogous elements for both full_det=True and full_det=False (i.e. a block-diagonal, spin-factored wavefunction). Pretraining against a dense matrix instead also works (though a quick test on neon showed that it resulted in a network with a much higher initial energy).
More importantly, training a neural network as a function predictor by matching outputs behaves quite poorly (e.g. arXiv:1706.04859) -- the pretraining algorithm is pretty crude and is really designed only to give a starting point which doesn't encounter numerical problems. FermiNet orbitals can be quite different from Hartree-Fock orbitals, so accurately representing Hartree-Fock from pretraining isn't a priority.
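One reason the block-diagonal target is the natural analogue of the full_det=False case: the determinant of a block-diagonal matrix factorises into the product of the block determinants, which is exactly the spin-factored form. A quick numerical check (plain NumPy, illustrative only):

```python
import numpy as np

rng = np.random.default_rng(1)
a = rng.normal(size=(2, 2))  # toy alpha block
b = rng.normal(size=(3, 3))  # toy beta block

# Assemble the block-diagonal matrix with zero off-diagonal blocks.
full = np.zeros((5, 5))
full[:2, :2] = a
full[2:, 2:] = b

# det(block_diag(A, B)) == det(A) * det(B), i.e. the determinant of the
# padded matrix equals the product of the spin-block determinants.
lhs = np.linalg.det(full)
rhs = np.linalg.det(a) * np.linalg.det(b)
```

So pretraining against the padded block-diagonal matrix targets the same wavefunction value as the spin-factored product of determinants, whereas a dense target does not factorise this way.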