
Comments (3)

fsherry avatar fsherry commented on August 26, 2024

Hi Beliz,

Thank you for your interest in this work and for taking a detailed look at the code.

Regarding your first question, it is good to keep in mind what it means for a vector to transform according to the regular representation of a (finite) group: we identify each coordinate of the vector with an element of the group, and the group acts by permuting the coordinates. Hence, each such vector has dimensionality equal to the order of the group, and the total width of the network is the multiplicity (16 in the code you linked) times the order of the group (6 in the code you linked). In fact, in our experiments in the paper we used a multiplicity of 24 and a group of order 4, so the total width (96) is kept the same. This choice was made to ensure that the comparison with the "ordinary" networks is fair: in the background, the equivariant convolutions are performed using ordinary 2d convolutions with constrained kernels, so the "ordinary" network with total width 96 is capable of encoding the "equivariant" network too. A final question might remain regarding the choice of this total width. This is somewhat arbitrary; we found that this number is large enough to give good reconstructions without causing any memory headaches when training, but there is no particular reason to believe it is the only number that could work.
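To make the width arithmetic above concrete, here is a small sketch (not from the repository; the cyclic group and function names are illustrative) of the regular representation acting by coordinate permutation, and of the total-width computation:

```python
import numpy as np

# Illustrative sketch: the regular representation of the cyclic group C6
# acts on a 6-dimensional vector by cyclically permuting its coordinates.
def regular_action(g: int, v: np.ndarray) -> np.ndarray:
    """Act with group element g of C6 on v by rolling coordinates."""
    return np.roll(v, g)

order = 6          # order of the group in the linked code
multiplicity = 16  # number of copies of the regular representation
total_width = multiplicity * order
print(total_width)  # 96, the total width discussed above

# Acting with g=3 and then g=2 equals acting with (3 + 2) mod 6 = 5.
v = np.arange(order)
assert np.array_equal(regular_action(2, regular_action(3, v)),
                      regular_action(5, v))
```

The same total width is obtained with multiplicity 24 and a group of order 4, which is the configuration used in the paper's experiments.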

As for the second question, in the end we did not initialise the proximal blocks as the identity. This was something we initially toyed with a bit, because using the standard initialisations for the convolutional filters in the proximal blocks resulted in the proximal blocks having an expansive action on their inputs. Unfortunately, initialising the proximal blocks as the identity (at least in the way that is shown in the code) does not work well, since large parts of the network remain dead (zero weights and no involvement in the forward pass, so zero gradients). Instead, we ended up with something that can be seen as a compromise between the two ideas: in the formula below, K_lift,i and K_project,i were initialised with a standard initialisation for convolutional filters, while K_intermediate,i was initialised to zero.

(Screenshot of the proximal block formula, composed of K_lift,i, K_intermediate,i and K_project,i as described above.)
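The "dead network" failure mode mentioned above can be illustrated with a minimal sketch (not the paper's code, and simplified to a single non-residual conv + ReLU): when weights and biases are zeroed, the output is identically zero, and since PyTorch's ReLU has zero derivative at zero, the weights receive no gradient either.

```python
import torch

# Zero-initialise a conv layer followed by a ReLU (non-residual).
conv = torch.nn.Conv2d(4, 4, 3, padding=1)
torch.nn.init.zeros_(conv.weight)
torch.nn.init.zeros_(conv.bias)

x = torch.randn(1, 4, 8, 8)
out = torch.relu(conv(x))  # identically zero
out.sum().backward()
print(conv.weight.grad.abs().max())  # tensor(0.) -- no learning signal
```

This is why the compromise above zeroes only K_intermediate,i while leaving K_lift,i and K_project,i with a standard initialisation: the residual structure keeps the forward pass alive.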

I hope this helps clear some things up.

Ferdia

from equivariant_image_recon.

belizgunel avatar belizgunel commented on August 26, 2024

Thanks a lot for your thorough explanation, Ferdia -- very much appreciate it. It does clear things up. Just to make sure I understood you correctly about initialization: first_part and final_part in this code were initialized with the standard initialization for convolutional filters, as they correspond to the lift and project operations, while middle_part -- corresponding to the K_intermediate operation -- was initialized to zero?

self.middle_part = torch.nn.Sequential(*[ ... ])
Assuming that's what you meant, this initialization change gave us a large performance boost while trying to replicate your results.


fsherry avatar fsherry commented on August 26, 2024

Hi Beliz, that's right. To match the code you reference above to what we did in the paper, n_res_blocks should equal 1, in which case middle_part will just be (id + phi ∘ K_intermediate,i). By initialising K_intermediate,i to zero, middle_part equals the identity at initialisation. The following two snippets show exactly how we did this in the experiments:

def proximal_constructor():
    return PrimalProximalEquivariant(conv_params,
                                     in_channels=2,
                                     n_memory=args.n_memory,
                                     feat_type_intermed=feat_type,
                                     init_as_id=False)

for block in model.lfb.prox_blocks:
    for res_block in block.middle_part:
        res_block._set_equal_to_identity()
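The effect of the snippets above can be sketched as follows (class and attribute names here are hypothetical stand-ins, not the repository's actual classes): first_part and final_part keep PyTorch's default conv initialisation, while each residual middle block is set equal to the identity by zeroing its conv, analogous to calling _set_equal_to_identity.

```python
import torch

class ResBlock(torch.nn.Module):
    """Residual block x -> x + phi(K_intermediate x)."""
    def __init__(self, channels):
        super().__init__()
        self.conv = torch.nn.Conv2d(channels, channels, 3, padding=1)
        self.phi = torch.nn.ReLU()

    def forward(self, x):
        return x + self.phi(self.conv(x))

    def set_equal_to_identity(self):
        # Zeroing the conv makes the block the identity at initialisation.
        torch.nn.init.zeros_(self.conv.weight)
        torch.nn.init.zeros_(self.conv.bias)

class ProxBlockSketch(torch.nn.Module):
    def __init__(self, in_channels=2, width=96, n_res_blocks=1):
        super().__init__()
        self.first_part = torch.nn.Conv2d(in_channels, width, 3, padding=1)   # K_lift,i
        self.middle_part = torch.nn.Sequential(
            *[ResBlock(width) for _ in range(n_res_blocks)])
        self.final_part = torch.nn.Conv2d(width, in_channels, 3, padding=1)   # K_project,i

    def forward(self, x):
        return self.final_part(self.middle_part(self.first_part(x)))

prox = ProxBlockSketch()
for res_block in prox.middle_part:
    res_block.set_equal_to_identity()

x = torch.randn(1, 96, 8, 8)
# middle_part is now the identity; first_part/final_part stay standard-initialised.
assert torch.allclose(prox.middle_part(x), x)
```

With n_res_blocks=1 this matches the (id + phi ∘ K_intermediate,i) structure described above.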
