Comments (5)
The reason comes down to deciding where to spend time: developing code, testing it, and computational cost.
Pretraining is just to create some initial state that is vaguely close to the ground state (i.e. within tens of Hartrees). There's a deliberate choice for pretraining to be both simple and quite crude (so the optimisation doesn't need to break symmetry, for example). This is also why we use a small basis set by default. Better pretraining may or may not improve convergence -- I think there are (quickly) diminishing returns.
Note that this is not the only simplification we make during pretraining. There's also some discussion in (e.g.) #14 and maybe also in our papers.
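The pretraining idea described above (fit the network's orbital matrix to a Hartree-Fock target by mean-squared error) can be sketched as follows. Everything here is a made-up stand-in for illustration -- the toy `hf_orbitals` and `network_orbitals` functions and the 1D "electrons" are not the actual ferminet API:

```python
import numpy as np

rng = np.random.default_rng(0)

def hf_orbitals(positions):
    # Hypothetical stand-in for Hartree-Fock orbitals evaluated at the
    # electron positions: returns an (n_electrons, n_orbitals) matrix.
    return np.sin(positions[:, None] * np.arange(1, 3)[None, :])

def network_orbitals(positions, params):
    # Hypothetical stand-in for the neural-network orbital outputs.
    return np.tanh(params * positions[:, None] * np.arange(1, 3)[None, :])

def pretraining_loss(params, positions):
    # MSE between the network's orbital matrix and the HF target matrix.
    diff = network_orbitals(positions, params) - hf_orbitals(positions)
    return np.mean(diff ** 2)

positions = rng.normal(size=2)  # two "electrons" in 1D, for illustration
loss = pretraining_loss(1.0, positions)
```

Since the loss only needs to pull the network somewhere numerically sensible (not reproduce HF accurately), a crude target and a small basis are enough.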
from ferminet.
Thanks for the response! I see. Though I was still wondering, since padding everything with zeros seems more involved than training against the full matrix.
There seems to be a small misunderstanding. Maybe this figure helps clear things up.
In the top-left corner is the RHF Slater determinant. In pretrain.py:91 we then select only the two left blocks. Then, in pretrain.py:155, we put these two blocks on the diagonal and fit to the FermiNet determinant.
My question is then: why not compute the MSE directly between the RHF Slater matrix and the FermiNet Slater matrix?
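The block-diagonal construction described above can be sketched with plain NumPy. The sizes and random matrices are toy stand-ins for illustration, not the actual pretrain.py code:

```python
import numpy as np

rng = np.random.default_rng(0)
n_alpha, n_beta = 2, 2

# Toy spin blocks: alpha electrons evaluated at alpha spin-orbitals,
# and likewise for beta (stand-ins for the two left blocks of the
# RHF/UHF orbital matrix).
alpha_block = rng.normal(size=(n_alpha, n_alpha))
beta_block = rng.normal(size=(n_beta, n_beta))

# Place the two spin blocks on the diagonal and zero the off-diagonal
# blocks, giving the padded target matrix the network is fit against.
target = np.zeros((n_alpha + n_beta, n_alpha + n_beta))
target[:n_alpha, :n_alpha] = alpha_block
target[n_alpha:, n_alpha:] = beta_block
```

The off-diagonal zeros are what makes this target consistent with a spin-factored wavefunction, rather than with the dense RHF Slater matrix.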
Note that we train against the UHF state by default rather than the RHF state. I am not convinced by your spin labels on the right-hand side -- line 90 selects the values of the alpha spin-orbitals evaluated at the positions of the alpha electrons, and similarly for the beta spin-orbitals and electrons.
The full_det option is experimental. Again, I don't think pretraining is particularly important, and there was a choice to pretrain analogous elements for both full_det=True and full_det=False (i.e. a block-diagonal, spin-factored wavefunction). Pretraining against a dense matrix instead also works (though a quick test on neon showed that it resulted in a network with a much higher initial energy).
More importantly, training a neural network as a function predictor by matching outputs behaves quite poorly (e.g. arXiv:1706.04859) -- the pretraining algorithm is pretty crude and is really designed only to give a starting point which doesn't encounter numerical problems. FermiNet orbitals can be quite different from Hartree-Fock orbitals, so accurately representing Hartree-Fock from pretraining isn't a priority.
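One reason the block-diagonal target is the natural analogue of the full_det=False case: the determinant of a block-diagonal matrix factorises into the product of the block determinants, which is exactly the spin-factored form. A quick numerical check (plain NumPy, illustrative only):

```python
import numpy as np

rng = np.random.default_rng(1)
a = rng.normal(size=(2, 2))  # toy alpha block
b = rng.normal(size=(3, 3))  # toy beta block

# Assemble the block-diagonal matrix with zero off-diagonal blocks.
full = np.zeros((5, 5))
full[:2, :2] = a
full[2:, 2:] = b

# det(block_diag(A, B)) == det(A) * det(B), i.e. the determinant of the
# padded matrix equals the product of the spin-block determinants.
lhs = np.linalg.det(full)
rhs = np.linalg.det(a) * np.linalg.det(b)
```

So pretraining against the padded block-diagonal matrix targets the same wavefunction value as the spin-factored product of determinants, whereas a dense target does not factorise this way.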