
Comments (3)

fsherry avatar fsherry commented on August 26, 2024

Hi Beliz,

Thank you for your interest in this work and for taking a detailed look at the code.

Regarding your first question, it is good to keep in mind what it means for a vector to transform according to the regular representation of a (finite) group: we identify each coordinate of the vector with an element of the group, and the group acts by permuting the coordinates. Hence, each such vector has dimensionality equal to the order of the group, and the total width of the network is the multiplicity (16 in the code you linked) times the order of the group (6 in the code you linked). In fact, in our experiments in the paper we used a multiplicity of 24 and a group of order 4, so the total width (96) is kept the same. This choice was made to ensure that the comparison with the "ordinary" networks is fair: in the background, the equivariant convolutions are performed using ordinary 2d convolutions with constrained kernels, so the "ordinary" network with total width 96 is capable of encoding the "equivariant" network too. A final question might remain regarding the choice of this total width. This is somewhat arbitrary; we found that this number is large enough to give good reconstructions without causing any memory headaches when training, but there is no particular reason to believe it is the only number that could work.
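To make the width arithmetic above concrete, here is a small sketch (not from the repository; the cyclic group and function names are illustrative) of the regular representation acting by coordinate permutation, and of the total-width computation:

```python
import numpy as np

# Illustrative sketch: the regular representation of the cyclic group C6
# acts on a 6-dimensional vector by cyclically permuting its coordinates.
def regular_action(g: int, v: np.ndarray) -> np.ndarray:
    """Act with group element g of C6 on v by rolling coordinates."""
    return np.roll(v, g)

order = 6          # order of the group in the linked code
multiplicity = 16  # number of copies of the regular representation
total_width = multiplicity * order
print(total_width)  # 96, the total width discussed above

# Acting with g=3 and then g=2 equals acting with (3 + 2) mod 6 = 5.
v = np.arange(order)
assert np.array_equal(regular_action(2, regular_action(3, v)),
                      regular_action(5, v))
```

The same total width is obtained with multiplicity 24 and a group of order 4, which is the configuration used in the paper's experiments.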

As for the second question, in the end we did not initialise the proximal blocks as the identity. This was something we initially toyed with a bit, because using the standard initialisations for the convolutional filters in the proximal blocks resulted in the proximal blocks having an expansive action on their inputs. Unfortunately, initialising the proximal blocks as the identity (at least in the way that is shown in the code) does not work well, since large parts of the network remain dead (zero weights and no involvement in the forward pass, so zero gradients). Instead, we ended up with something that can be seen as a compromise between the two ideas: in the formula below, K_lift,i and K_project,i were initialised with a standard initialisation for convolutional filters, while K_intermediate,i was initialised to zero.

(Screenshot of the proximal block formula, composed of K_lift,i, K_intermediate,i and K_project,i as described above.)
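The "dead network" failure mode mentioned above can be illustrated with a minimal sketch (not the paper's code, and simplified to a single non-residual conv + ReLU): when weights and biases are zeroed, the output is identically zero, and since PyTorch's ReLU has zero derivative at zero, the weights receive no gradient either.

```python
import torch

# Zero-initialise a conv layer followed by a ReLU (non-residual).
conv = torch.nn.Conv2d(4, 4, 3, padding=1)
torch.nn.init.zeros_(conv.weight)
torch.nn.init.zeros_(conv.bias)

x = torch.randn(1, 4, 8, 8)
out = torch.relu(conv(x))  # identically zero
out.sum().backward()
print(conv.weight.grad.abs().max())  # tensor(0.) -- no learning signal
```

This is why the compromise above zeroes only K_intermediate,i while leaving K_lift,i and K_project,i with a standard initialisation: the residual structure keeps the forward pass alive.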

I hope this helps clear some things up.

Ferdia

from equivariant_image_recon.

belizgunel avatar belizgunel commented on August 26, 2024

Thanks a lot for your thorough explanation, Ferdia -- very much appreciate it. It does clear things up. Just to make sure I understood you correctly about initialization: first_part and final_part in this code were initialized with the standard initialization for convolutional filters, as they correspond to the lift and project operations, while middle_part -- corresponding to the K_intermediate operation -- was initialized to zero?

self.middle_part = torch.nn.Sequential(*[ ... ])
Assuming that's what you meant, this initialization change gave us a large performance boost while trying to replicate your results.


fsherry avatar fsherry commented on August 26, 2024

Hi Beliz, that's right. To match the code you reference above to what we did in the paper, n_res_blocks should equal 1, in which case middle_part will just be (id + phi ∘ K_intermediate,i). By initialising K_intermediate,i to zero, middle_part equals the identity at initialisation. The following two snippets show exactly how we did this in the experiments:

def proximal_constructor():
    return PrimalProximalEquivariant(conv_params,
                                     in_channels=2,
                                     n_memory=args.n_memory,
                                     feat_type_intermed=feat_type,
                                     init_as_id=False)

for block in model.lfb.prox_blocks:
    for res_block in block.middle_part:
        res_block._set_equal_to_identity()
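The effect of the snippets above can be sketched as follows (class and attribute names here are hypothetical stand-ins, not the repository's actual classes): first_part and final_part keep PyTorch's default conv initialisation, while each residual middle block is set equal to the identity by zeroing its conv, analogous to calling _set_equal_to_identity.

```python
import torch

class ResBlock(torch.nn.Module):
    """Residual block x -> x + phi(K_intermediate x)."""
    def __init__(self, channels):
        super().__init__()
        self.conv = torch.nn.Conv2d(channels, channels, 3, padding=1)
        self.phi = torch.nn.ReLU()

    def forward(self, x):
        return x + self.phi(self.conv(x))

    def set_equal_to_identity(self):
        # Zeroing the conv makes the block the identity at initialisation.
        torch.nn.init.zeros_(self.conv.weight)
        torch.nn.init.zeros_(self.conv.bias)

class ProxBlockSketch(torch.nn.Module):
    def __init__(self, in_channels=2, width=96, n_res_blocks=1):
        super().__init__()
        self.first_part = torch.nn.Conv2d(in_channels, width, 3, padding=1)   # K_lift,i
        self.middle_part = torch.nn.Sequential(
            *[ResBlock(width) for _ in range(n_res_blocks)])
        self.final_part = torch.nn.Conv2d(width, in_channels, 3, padding=1)   # K_project,i

    def forward(self, x):
        return self.final_part(self.middle_part(self.first_part(x)))

prox = ProxBlockSketch()
for res_block in prox.middle_part:
    res_block.set_equal_to_identity()

x = torch.randn(1, 96, 8, 8)
# middle_part is now the identity; first_part/final_part stay standard-initialised.
assert torch.allclose(prox.middle_part(x), x)
```

With n_res_blocks=1 this matches the (id + phi ∘ K_intermediate,i) structure described above.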
