Motivation In my project, I set all env's initial state to the <co

[Feature Request] async_reset supports multiple calls until getting obs from all envs about envpool HOT 4 CLOSED

sail-sg commented on July 16, 2024

[Feature Request] async_reset supports multiple calls until getting obs from all envs

Comments (4)

Trinkle23897 commented on July 16, 2024

Actually, the first version of EnvPool had two separate thread pools, one for reset and one for step, which would satisfy the requirement above. @mavenlin decided to change the pipeline to the current style, which uses a single thread pool for all executions.

In my opinion, it's not very user-friendly for downstream applications. For example, there's a similar issue, thu-ml/tianshou#573. Many people (including me at that time) were not in favor of this approach.

@mavenlin could you please explain the reasoning in detail here? I believe it was for speed.

Trinkle23897 commented on July 16, 2024

the first version of EnvPool had two separate thread pools, one for reset and one for step, which would satisfy the requirement above.

This is a design choice. Two thread pools generalize to different reset/step calling patterns but give slightly lower performance, and the reset/step distinction is mainly introduced by gym. However, the internal EnvPool is mainly built for dm_control API compatibility. In dm_control there is no such big difference, so we can use a single thread pool to reach maximum performance.

Regarding gym: there's also a solution on both the upstream (env) and downstream (training) sides. On the env side, you can keep a flag that indicates whether the env was reset in the last step; if it was, a further reset does nothing and the environment returns the last step's result. On the downstream side, if the application introduces (or assumes) an auto-reset wrapper in the vectorized env, this design makes no difference.
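
The env-side idea above can be sketched roughly as follows. This is only an illustration, not EnvPool code: `ResetOnceEnv` and `CountingEnv` are hypothetical names, and the wrapped env is assumed to expose a plain `reset()` returning an observation and a `step(action)` returning a tuple whose first element is the observation.

```python
class CountingEnv:
    """Hypothetical toy env that counts how often it is really reset."""

    def __init__(self):
        self.reset_calls = 0
        self.t = 0

    def reset(self):
        self.reset_calls += 1
        self.t = 0
        return self.t

    def step(self, action):
        self.t += 1
        # (obs, reward, done, info) in the classic gym layout
        return self.t, 0.0, False, {}


class ResetOnceEnv:
    """Hypothetical wrapper: a reset right after a reset is a no-op."""

    def __init__(self, env):
        self.env = env
        self._just_reset = False  # True if no step happened since the last reset
        self._last_obs = None

    def reset(self):
        if self._just_reset:
            # Already reset in the last call: do nothing, return the last result.
            return self._last_obs
        self._last_obs = self.env.reset()
        self._just_reset = True
        return self._last_obs

    def step(self, action):
        self._just_reset = False
        result = self.env.step(action)
        self._last_obs = result[0]
        return result


env = ResetOnceEnv(CountingEnv())
first = env.reset()
second = env.reset()  # no-op: the underlying env is not reset again
```

With a flag like this, calling reset several times in a row costs only one real reset, so a caller does not need to track which envs have already been reset.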

mavenlin commented on July 16, 2024

It was designed this way because it simplifies the API on both the user end and the implementation end, at the cost of some customization. But as far as I can see, the case described in this issue is achievable with the current design. It should look like the following (pseudocode):

# Async reset; it should return nothing.
envpool.async_reset()

# Empty dict that stores the initial state of each env.
init_states = {}

while True:
  state = envpool.recv()
  # The first time an env is ever received, record its initial state.
  for env_id in state.env_ids:
    if env_id not in init_states:
      init_states[env_id] = state
  # do something else with the state
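
To make the pattern concrete without depending on an installed EnvPool, here is a minimal self-contained sketch. `MockAsyncPool` is a hypothetical toy stand-in, not the EnvPool API: its `recv()` returns `(obs, env_ids)` and its `send()` takes `(actions, env_ids)`, which are assumptions made for illustration only. The loop stops once every env has been seen, rather than running forever.

```python
from collections import deque


class MockAsyncPool:
    """Hypothetical toy async pool; each obs is (env_id, steps_since_reset)."""

    def __init__(self, num_envs, batch_size):
        self.num_envs = num_envs
        self.batch_size = batch_size
        self._ready = deque()          # env ids whose results are waiting
        self._steps = [0] * num_envs   # steps taken since reset, per env

    def async_reset(self):
        # Queue a reset result for every env; returns nothing.
        for env_id in range(self.num_envs):
            self._steps[env_id] = 0
            self._ready.append(env_id)

    def recv(self):
        # Return up to batch_size finished results.
        n = min(self.batch_size, len(self._ready))
        env_ids = [self._ready.popleft() for _ in range(n)]
        obs = [(env_id, self._steps[env_id]) for env_id in env_ids]
        return obs, env_ids

    def send(self, actions, env_ids):
        # Step the given envs; their results become available again.
        for env_id in env_ids:
            self._steps[env_id] += 1
            self._ready.append(env_id)


def collect_initial_obs(pool):
    """Record the first observation ever received from each env."""
    pool.async_reset()
    init_obs = {}
    while len(init_obs) < pool.num_envs:
        obs, env_ids = pool.recv()
        for o, env_id in zip(obs, env_ids):
            # setdefault keeps only the first observation per env.
            init_obs.setdefault(env_id, o)
        # Keep the pipeline moving with dummy actions.
        pool.send([0] * len(env_ids), env_ids)
    return init_obs


init_obs = collect_initial_obs(MockAsyncPool(num_envs=5, batch_size=2))
```

With 5 envs and a batch size of 2, env 0 comes back a second time (already stepped once) before env 4 has been seen at all, and the dictionary still keeps env 0's reset-time observation. That is the case the original request was concerned with.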

LuciusMos commented on July 16, 2024

@mavenlin @Trinkle23897
Thank you both so much for your responses. My problem can be solved this way~
