Comments (9)
Hi,
It seems that the encoder in the actor is never updated, either by a loss or by soft update (EMA), except at initialisation:
# tie encoders between actor and critic, and CURL and critic
self.actor.encoder.copy_conv_weights_from(self.critic.encoder)
Only the encoder in the critic/critic_target is updated, by the critic loss and the CPC loss.
Is there any insight into why the encoder in the actor is not updated?
from curl.
Hi @wassname
Thank you for pointing it out; I believe you are correct. Based on my very limited observation, optimizing the encoder once per call didn't significantly affect performance on cheetah-run. But it would be very helpful if someone could test it in more environments.
Thanks.
Hi @IpadLi, I wondered about this a while back and emailed @MishaLaskin about it.
This is the question I asked:
Why are you not updating the shared encoder with the actor loss? Is there any specific reason for this?
@MishaLaskin 's reply:
I found that doing this resulted in more stable learning from pixels but it is also an empirical design choice and can be changed
Hi @tejassp2002 Thanks a lot.
Hi, can we integrate the update_critic function and the update_cpc function by adding the critic_loss and cpc_loss together?
Then we would only need two optimizers.
Is it feasible?
self.cpc_optimizer = torch.optim.Adam([self.CURL.W], lr=encoder_lr)
self.critic_optimizer = torch.optim.Adam(self.critic.parameters(), lr=critic_lr, betas=(critic_beta, 0.999))
# both optimizers need their gradients cleared before the shared backward pass
self.critic_optimizer.zero_grad()
self.cpc_optimizer.zero_grad()
loss = critic_loss + cpc_loss
loss.backward()
self.critic_optimizer.step()
self.cpc_optimizer.step()
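In principle this should work: summing the two losses and calling backward() once accumulates gradients for both parameter groups, and each optimizer then steps its own group. A minimal self-contained sketch, using toy stand-in modules (the shapes, names, and losses here are hypothetical, not the repo's):

```python
import torch
import torch.nn as nn

# Toy stand-ins for the critic network and the CURL bilinear matrix W.
critic = nn.Linear(8, 1)
W = nn.Parameter(torch.rand(8, 8))

critic_optimizer = torch.optim.Adam(critic.parameters(), lr=1e-3)
cpc_optimizer = torch.optim.Adam([W], lr=1e-3)

obs = torch.randn(4, 8)
target = torch.randn(4, 1)

critic_loss = ((critic(obs) - target) ** 2).mean()
cpc_loss = (obs @ W @ obs.t()).diag().mean()  # stand-in for the CURL logits loss

# One backward pass populates gradients for both parameter groups,
# but each optimizer still needs its own zero_grad() beforehand.
critic_optimizer.zero_grad()
cpc_optimizer.zero_grad()
loss = critic_loss + cpc_loss
loss.backward()
critic_optimizer.step()
cpc_optimizer.step()
```

One caveat: the repo updates the CPC objective on its own schedule (cpc_update_freq), so merging the two updates also forces them onto the same schedule.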
Hi,
It seems that the encoder in the actor is never updated, either by a loss or by soft update (EMA), except at initialisation:
# tie encoders between actor and critic, and CURL and critic
self.actor.encoder.copy_conv_weights_from(self.critic.encoder)
Only the encoder in the critic/critic_target is updated, by the critic loss and the CPC loss.
Is there any insight into why the encoder in the actor is not updated?
The work on SAC+AE (https://arxiv.org/pdf/1910.01741.pdf) suggests using the gradient from the critic only (not the actor) to update the encoder. Since this repo is based on the SAC+AE implementation (as stated in the README), I think CURL simply follows it.
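The mechanism behind that convention is a detach in the actor's forward pass: the actor loss then cannot produce gradients for the shared encoder. A minimal sketch with hypothetical stand-in modules (the real code passes a detach_encoder-style flag instead):

```python
import torch
import torch.nn as nn

encoder = nn.Linear(16, 8)    # stand-in for the shared conv encoder
actor_head = nn.Linear(8, 2)  # stand-in for the MLP part of the actor

obs = torch.randn(4, 16)
h = encoder(obs).detach()     # gradient flow into the encoder is cut here
actor_loss = actor_head(h).pow(2).mean()
actor_loss.backward()

# Only the actor MLP receives gradients; the encoder is untouched by the actor loss.
```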
Hi @IpadLi, I wondered about this a while back and emailed @MishaLaskin about it. This is the question I asked:
Why are you not updating the shared encoder with the actor loss? Is there any specific reason for this?
@MishaLaskin 's reply:
I found that doing this resulted in more stable learning from pixels but it is also an empirical design choice and can be changed
Hi, thanks for posting the reply from the author!
Yet I don't think the reply answers the question: even if we don't update the encoder with the actor loss, why shouldn't the actor encoder weights be copied from the critic encoder weights after each update to the critic encoder by the critic loss and the CPC loss?
It is a bit strange to me that two different encoders are used for the actor and the critic, whereas the paper seems to indicate there is only one shared encoder. Moreover, the weights of the actor encoder are never updated after initialization, so essentially only the MLP part of the actor is being trained/updated.
Update: sorry, the tie_weights function actually makes the actor encoder and critic encoder share the same weights.
Hi @IpadLi, I wondered about this a while back and emailed @MishaLaskin about it. This is the question I asked:
Why are you not updating the shared encoder with the actor loss? Is there any specific reason for this?
@MishaLaskin 's reply:
I found that doing this resulted in more stable learning from pixels but it is also an empirical design choice and can be changed
Hi, thanks for posting the reply from the author! Yet I don't think the reply answers the question: even if we don't update the encoder with the actor loss, why shouldn't the actor encoder weights be copied from the critic encoder weights after each update to the critic encoder by the critic loss and the CPC loss? It is a bit strange to me that two different encoders are used for the actor and the critic, whereas the paper seems to indicate there is only one shared encoder. Moreover, the weights of the actor encoder are never updated after initialization, so essentially only the MLP part of the actor is being trained/updated.
Update: sorry, the tie_weights function actually makes the actor encoder and critic encoder share the same weights.
Hello! Does it mean the weights of the actor encoder are still the same as the critic encoder's after the critic encoder is updated?
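Yes, that is the point of tying rather than copying: the two encoders share the same underlying parameter objects, so any in-place update to the critic's weights is immediately visible through the actor. A pure-Python sketch of the idea (class and attribute names here are illustrative, not the repo's):

```python
class Encoder:
    def __init__(self):
        # one "weight" per conv layer, stored as a mutable list
        self.convs = [[0.0], [0.0]]

    def copy_conv_weights_from(self, source):
        # tie, not copy: rebind to the same parameter objects
        self.convs = source.convs

critic_enc = Encoder()
actor_enc = Encoder()
actor_enc.copy_conv_weights_from(critic_enc)

# "update" the critic encoder in place (as an optimizer step would)
critic_enc.convs[0][0] = 1.0

print(actor_enc.convs[0][0])  # prints 1.0: the actor encoder sees the update
```

In the repo, tie_weights does the analogous rebinding on the conv layers' weight and bias tensors, which is why the actor encoder tracks every critic-side update without ever being optimized itself.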