
Comments (5)

jsspencer commented on June 20, 2024

The reason comes down to deciding where to spend time -- in developing code, in testing it, and in computational cost.

Pretraining just creates an initial state that is vaguely close to the ground state (i.e. within tens of Hartrees). There's a deliberate choice for pretraining to be both simple and quite crude (so the optimisation doesn't need to break symmetry, for example). This is also why we use a small basis by default. Better pretraining may or may not improve convergence -- I think there are (quickly) diminishing returns.

Note that this is not the only simplification we make during pre-training. There's also some discussion in (e.g.) #14 and maybe also in our papers.
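To make the "simple and crude" point concrete, here is a minimal numpy sketch of this style of pretraining objective: match the network's orbital matrix to Hartree-Fock orbital values at sampled electron configurations via a plain MSE. All names here (`hf_orbitals`, `network_orbitals`) are hypothetical stand-ins, not the ferminet API.

```python
import numpy as np

rng = np.random.default_rng(0)
n_electrons, n_samples = 4, 8

def hf_orbitals(positions):
    # Stand-in for HF orbitals evaluated at electron positions:
    # returns an (n_electrons, n_electrons) Slater matrix per sample.
    return np.sin(positions[:, :, None] * np.arange(1, n_electrons + 1))

def network_orbitals(positions, scale):
    # Stand-in for the neural network's orbital matrix; here just a
    # scaled copy of the HF matrix so the loss is easy to inspect.
    return scale * hf_orbitals(positions)

positions = rng.normal(size=(n_samples, n_electrons))
target = hf_orbitals(positions)
pred = network_orbitals(positions, scale=0.9)

# The pretraining loss is a simple mean-squared error between matrices;
# it only needs to land "vaguely close", not reproduce HF exactly.
mse = np.mean((pred - target) ** 2)
```

The point of the sketch is that nothing in the objective demands an accurate HF representation; minimising this MSE over a modest number of samples is enough to give the variational optimisation a numerically sane starting point.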

from ferminet.

n-gao commented on June 20, 2024

Thanks for the response! I see. Though I was wondering because padding everything with zeros seemed more involved than simply training against the full matrix.


dpfau commented on June 20, 2024


n-gao commented on June 20, 2024

There seems to be a small misunderstanding; maybe this figure helps clear things up.
In the top-left corner is the RHF Slater determinant. In pretrain.py:91 we then select only the two left blocks. Then, in pretrain.py:155, we place these two blocks on the diagonal and fit to the FermiNet determinant.
My question is: why not just compute the MSE directly between the RHF Slater matrix and the FermiNet Slater matrix?
[figure: block structure of the RHF Slater matrix and the block-diagonal pretraining target]
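The block selection being described can be sketched in a few lines of numpy. This is a toy illustration with 2 alpha and 2 beta electrons, not the ferminet code: take a full matrix of orbital values (rows = electron positions, columns = occupied orbitals), keep the alpha-rows/alpha-columns and beta-rows/beta-columns blocks, and place them on a block diagonal, zero-padding the rest.

```python
import numpy as np

n_alpha, n_beta = 2, 2
n = n_alpha + n_beta

# Toy "full" matrix of orbital values: rows index electron positions,
# columns index occupied orbitals. Values are arbitrary.
full = np.arange(n * n, dtype=float).reshape(n, n)

# The "two left blocks": each spin channel's occupied orbitals
# evaluated at that spin channel's electron positions.
alpha_block = full[:n_alpha, :n_alpha]
beta_block = full[n_alpha:, :n_beta]

# Place the blocks on the diagonal; the off-diagonal spin blocks are
# zero-padded, giving the block-diagonal target discussed above.
target = np.zeros((n, n))
target[:n_alpha, :n_alpha] = alpha_block
target[n_alpha:, n_alpha:] = beta_block
```

The determinant of the zero-padded `target` factorises into the product of the two spin-block determinants, which is exactly the spin-factored wavefunction structure.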


jsspencer commented on June 20, 2024

Note that, by default, we train against the UHF state rather than the RHF state. I am not convinced by your spin labels on the right-hand side -- line 90 selects the values of the alpha spin-orbitals evaluated at the positions of the alpha electrons, and similarly for the beta spin-orbitals and beta electrons.

The full_det option is experimental. Again, I don't think pretraining is particularly important, and there was a choice to pretrain analogous elements for both full_det=True and full_det=False (i.e. a block-diagonal, spin-factored wavefunction). Pretraining against a dense matrix instead also works (though a quick test on neon showed that it resulted in a network with a much higher initial energy).
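The two alternatives being compared here can be sketched as two losses against the same toy matrices (hypothetical stand-ins, not the ferminet implementation): matching the dense Hartree-Fock matrix directly, versus matching a block-diagonal target built from its spin-diagonal blocks.

```python
import numpy as np

rng = np.random.default_rng(1)
n_alpha = 2
n = 4

network_det = rng.normal(size=(n, n))  # network's full_det-style matrix
hf_full = rng.normal(size=(n, n))      # dense HF orbital-value matrix

# Option A: match the dense HF matrix directly.
loss_dense = np.mean((network_det - hf_full) ** 2)

# Option B (analogous to the block-diagonal case): keep only the
# spin-diagonal blocks of the HF matrix and zero the off-diagonal
# spin blocks, then match against that target.
target = np.zeros((n, n))
target[:n_alpha, :n_alpha] = hf_full[:n_alpha, :n_alpha]
target[n_alpha:, n_alpha:] = hf_full[n_alpha:, n_alpha:]
loss_block = np.mean((network_det - target) ** 2)
```

Both losses are valid pretraining targets in this sketch; which one gives a better starting energy is an empirical question, as the neon test mentioned above suggests.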

More importantly, training a neural network as a function predictor by matching outputs behaves quite poorly (e.g. arXiv:1706.04859) -- the pretraining algorithm is pretty crude and is really designed only to give a starting point which doesn't run into numerical problems. FermiNet orbitals can be quite different from Hartree-Fock orbitals, so accurately representing Hartree-Fock during pretraining isn't a priority.

