greedy_infomax's Issues

ResNet Encoder Layer Numbers

With the pre-activation ResNet encoders that are used, my understanding of the layer counts doesn't align with how you've labelled them.

block_dims = [3, 4, 6, 6, 6, 6, 6]
num_channels = [64, 128, 256, 256, 256, 256, 256]
full_model = nn.ModuleList([])
encoder = nn.ModuleList([])
if opt.resnet == 34:
    self.block = Resnet_Encoder.PreActBlockNoBN
elif opt.resnet == 50:
    self.block = Resnet_Encoder.PreActBottleneckNoBN

The total number of blocks in block_dims is 37, and there is also the initial conv1. When using PreActBlockNoBN (2 layers per block), does this not result in a ResNet-75? When using PreActBottleneckNoBN (3 layers per block), does this not result in a ResNet-112?
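
For reference, the arithmetic behind those counts (a quick standalone check using the block_dims above):

block_dims = [3, 4, 6, 6, 6, 6, 6]

total_blocks = sum(block_dims)            # 37 residual blocks in total
layers_basic = total_blocks * 2 + 1       # PreActBlockNoBN: 2 conv layers per block, +1 for conv1 -> 75
layers_bottleneck = total_blocks * 3 + 1  # PreActBottleneckNoBN: 3 conv layers per block, +1 for conv1 -> 112

print(layers_basic, layers_bottleneck)    # 75 112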

Please let me know if I've misunderstood something.

The following problem occurred when I was building the model following the README.md. Is there a problem with the InfoNCE_Loss function?

The problem: RuntimeError: one of the variables needed for gradient computation has been modified by an inplace operation: [torch.cuda.FloatTensor [512, 6144]], which is output 0 of TBackward, is at version 2; expected version 1 instead. Hint: enable anomaly detection to find the operation that failed to compute its gradient, with torch.autograd.set_detect_anomaly(True).
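
As the hint says, anomaly detection will point at the forward operation that produced the failing tensor. A minimal way to enable it (the model and batch below are stand-ins, not the repository's code):

import torch
import torch.nn as nn

torch.autograd.set_detect_anomaly(True)  # global switch; slow, debugging only

model = nn.Linear(8, 1)    # stand-in for the actual model
batch = torch.randn(4, 8)  # stand-in for a data batch

# With anomaly mode on, the RuntimeError is augmented with a traceback
# of the forward op whose gradient computation failed.
loss = model(batch).mean()
loss.backward()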

Problem downloading the audio data

bash download_audio_data.sh fails with the error "You don't have permission to access this resource."

I would like to know why. Thanks!

Audio training time

What should be the training time (as in time per step, per epoch, or time to convergence) for the audio experiment? I'm running it on a Tesla K80 and it seems to be taking ~5.7 seconds per step, which I'm assuming is much slower than expected.

InfoNCE_Loss skipping

In your paper the following is stated:

For each patch x_{i,j} in row i and column j of this grid, we predict up to K patches x_{i+K,j} in the rows underneath, skipping the first overlapping patch x_{i+1,j}

However, in the implementation of InfoNCE_Loss, skip_step is applied for all k predictions. This means that x_{i+k,j} is being skipped for all k instead of only when k = 1, so nearby non-overlapping patches are also being skipped when k > 1.

for k in range(1, self.k_predictions + 1):
    ### compute log f(c_t, x_{t+k}) = z^T_{t+k} W_k c_t
    # compute z^T_{t+k} W_k:
    ztwk = (
        self.W_k[k - 1]
        .forward(z[:, :, (k + skip_step):, :])  # B, C, H, W

By simply changing skip_step to 0 after the first iteration (as sketched below), I have seen an improvement. I haven't run it for long enough to compare this improvement to the results stated in your paper.
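
A sketch of that change, reusing the names from the excerpt above (not the repository's exact code; it assumes skip_step is 1 on entry):

for k in range(1, self.k_predictions + 1):
    # Apply the overlap skip only on the first prediction step (k = 1),
    # as described in the paper; later steps use no extra offset.
    cur_skip = skip_step if k == 1 else 0
    ztwk = (
        self.W_k[k - 1]
        .forward(z[:, :, (k + cur_skip):, :])  # B, C, H, W
    )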

Training time and memory usage

Hi, GIM looks super cool, and thanks for the code!
For the vision and audio tasks, I would like to ask how long it takes for the network to converge,
and how much GPU memory I would need for training (I only have one RTX 3080).
I would appreciate a response! :)

Same permutation for all audio samples?

Hello,

In the 3rd sampling strategy (sampling from the same sequence) for the audio subtask, I noticed that the permutation of negative samples is the same for all audio sequences in the batch. This is not necessarily incorrect, but it can introduce some sort of bias based on the locations of the negative samples.

I think it would be better to have random permutations for all audio samples, and the fix is easy (see the sketch below) :)

Reference:

elif self.opt.sampling_method == 2:
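
For illustration, drawing an independent permutation per sample could look something like this (a standalone sketch with made-up shapes, not the repository's code):

import torch

B, L, C = 8, 128, 512     # example batch size, sequence length, channels
z = torch.randn(B, L, C)  # stand-in for the batch of encodings

# One random permutation of the time axis per sample, instead of a
# single permutation shared across the whole batch:
perm = torch.argsort(torch.rand(B, L), dim=1)                     # (B, L)
z_neg = torch.gather(z, 1, perm.unsqueeze(-1).expand(-1, -1, C))  # (B, L, C)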

Code for Vision Experiment

Hello there. I'd like to ask if there is any plan to release the code for the vision experiment mentioned in the paper. Thanks!

Failure to compute gradient

Hi,

I have found your paper and code extremely interesting!

I am trying to run the vision training, but am coming across an error:

RuntimeError: one of the variables needed for gradient computation has been modified by an inplace operation: [torch.cuda.FloatTensor [256, 256, 1, 1]] is at version 2; expected version 1 instead. Hint: enable anomaly detection to find the operation that failed to compute its gradient, with torch.autograd.set_detect_anomaly(True).

When I do torch.autograd.set_detect_anomaly(True), the issue is traced to line 57 in InfoNCE_Loss:

ztwk = (
    self.W_k[k - 1]
    .forward(z[:, :, (k + skip_step):, :])  # B, C, H, W

Any idea why this is happening?

Pre-trained vision model?

Hi! Is there a pre-trained vision model available somewhere? It would be really helpful to avoid having to re-train the model from scratch.

Thanks!
Nikhil

CUDNN_STATUS_NOT_SUPPORTED error caused by non-contiguous variable

Thanks for making the code available.

When running the command python -m GreedyInfoMax.vision.main_vision --download_dataset --save_dir vision_experiment, I ran into the following error:

CUDNN_STATUS_NOT_SUPPORTED. This error may appear if you passed in a non-contiguous input.

I traced the problem to the forward pass in Resnet_Encoder:

out = F.adaptive_avg_pool2d(z, 1)
out = out.reshape(-1, n_patches_x, n_patches_y, out.shape[1])
out = out.permute(0, 3, 1, 2)

The permute() operation seems to make the variable out non-contiguous (see https://stackoverflow.com/questions/48915810/pytorch-contiguous for discussion).

The issue is solved by making it contiguous again:

out = F.adaptive_avg_pool2d(z, 1)
out = out.reshape(-1, n_patches_x, n_patches_y, out.shape[1])
out = out.permute(0, 3, 1, 2).contiguous()
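
A quick standalone way to see the effect (not repository code):

import torch

x = torch.randn(2, 4, 3, 3)
y = x.permute(0, 3, 1, 2)

print(y.is_contiguous())               # False: permute only changes strides
print(y.contiguous().is_contiguous())  # True: contiguous() copies to a dense layout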

I hope this helps if someone else runs into a similar problem.

System specifics:
Ubuntu 16.04.6 LTS
Python 3.6
CUDA 10.0
cuDNN 7.6.4
PyTorch 1.4.0

How to speed up training

Nice work! I wonder how to speed up training and reduce memory usage. As far as I can see, the released code uses .detach() to prevent backpropagation between modules, but I didn't find that this speeds up training or reduces memory usage. Are there any other operations? Looking forward to your reply.
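
For context, the module-wise pattern in question looks roughly like this (a simplified sketch of greedy training with detached inputs, using placeholder modules and losses rather than the repository's code):

import torch
import torch.nn as nn

# Hypothetical stand-ins for the greedily trained modules and their objectives.
modules = nn.ModuleList([nn.Linear(64, 64) for _ in range(3)])
optimizers = [torch.optim.Adam(m.parameters()) for m in modules]

x = torch.randn(32, 64)  # dummy input batch

for module, opt in zip(modules, optimizers):
    out = module(x)
    loss = out.pow(2).mean()  # placeholder for the module's InfoNCE loss
    opt.zero_grad()
    loss.backward()           # gradients never reach earlier modules
    opt.step()
    x = out.detach()          # cut the graph before feeding the next module

The detach keeps each module's computation graph local; on a single GPU in one process the speed and memory gains are limited, and the main payoff seems to be that it makes training modules on separate devices possible.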

Question on parallel training

Hi, your work is very interesting, thanks for sharing the code!

As I understand it, the losses for the different sub-networks are calculated in a for loop in https://github.com/loeweX/Greedy_InfoMax/blob/master/GreedyInfoMax/vision/models/FullModel.py#L102,
therefore it is "asynchronous".

However, I have one question on parallel training:
This great blog says: "This reduces the amount of communication needed between modules tremendously and allows us to train modules on separate devices."
Does that mean you put the three submodules on three GPUs and perform the gradient updates simultaneously?

Besides, is it possible to train different modules on one GPU in parallel?
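
For concreteness, the kind of placement asked about above might look roughly like this (a hypothetical sketch assuming three GPUs; true simultaneity would additionally require pipelining activations across batches):

import torch
import torch.nn as nn

devices = ["cuda:0", "cuda:1", "cuda:2"]
# Hypothetical stand-ins for the three sub-networks, one per GPU:
modules = [nn.Linear(64, 64).to(d) for d in devices]

x = torch.randn(32, 64)
for module, device in zip(modules, devices):
    x = x.to(device)  # only detached activations cross the device boundary
    out = module(x)
    # ... compute this module's loss and step its optimizer here ...
    x = out.detach()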

Thanks very much for your time!
