Comments (4)
Actually, the first version of EnvPool had two separate thread pools, one for reset and one for step, which would satisfy the requirement above. @mavenlin decided to change the pipeline to the current design, which uses a single thread pool for all executions.
In my opinion, this is not very user-friendly for downstream applications. For example, there is a similar issue, thu-ml/tianshou#573. Many people (including me, at the time) are not in favor of this approach.
@mavenlin, could you please explain the reasoning and details here? I believe it was for speed.
from envpool.
the first version of EnvPool has two separated threadpool, one for reset and one for step, and that can satisfy the aforementioned requirement.
This is a design choice. Two thread pools generalize to different reset/step calling patterns but give slightly lower performance, and the reset/step distinction is mainly introduced by gym. However, the internal EnvPool is mainly built for dm_control API compatibility. In dm_control there is no such big difference, so we can use a single thread pool to reach maximum performance.
Regarding gym: there is also a solution for both upstream (env) and downstream (training) applications. For the env, you can keep a flag that indicates whether it was reset in the last step; if it was, reset does nothing and the environment returns the last step's result. For the downstream application, if it introduces (or assumes) an auto-reset wrapper around the vectorized env, this design makes no difference.
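The env-side idea can be sketched roughly as follows. This is only an illustration under assumptions: `CountingEnv` and `AutoResetEnv` are hypothetical names made up for this example, not EnvPool or gym classes, and the flag logic is one possible reading of the suggestion.

```python
class CountingEnv:
    """Toy environment (hypothetical): an episode ends after `horizon` steps."""

    def __init__(self, horizon=3):
        self.horizon = horizon
        self.t = 0
        self.num_resets = 0  # counts real resets, for inspection only

    def reset(self):
        self.t = 0
        self.num_resets += 1
        return self.t

    def step(self, action):
        self.t += 1
        done = self.t >= self.horizon
        return self.t, 1.0, done


class AutoResetEnv:
    """Auto-resets when an episode ends; a reset right after that is a no-op."""

    def __init__(self, env):
        self.env = env
        self.just_reset = False  # True if the env was reset in the last call
        self.last_obs = None

    def reset(self):
        # If the env was already reset in the last call, do nothing and
        # return the observation produced by that reset.
        if not self.just_reset:
            self.last_obs = self.env.reset()
            self.just_reset = True
        return self.last_obs

    def step(self, action):
        obs, reward, done = self.env.step(action)
        if done:
            # Auto-reset: the returned observation starts the next episode.
            obs = self.env.reset()
            self.just_reset = True
        else:
            self.just_reset = False
        self.last_obs = obs
        return obs, reward, done
```

With this flag, downstream code that calls `reset()` after seeing `done` and code that relies on auto-reset both observe the same initial state, and the inner env is reset only once per episode.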
from envpool.
It was designed this way because it simplifies the API for both the user and the implementation, at the cost of some customization. As far as I can see, though, the case described in this issue is achievable with the current design, roughly as follows (pseudocode):
# Async reset, which returns nothing.
envpool.async_reset()
# Empty dict that stores the initial state of each env.
init_states = {}
while True:
    state = envpool.recv()
    # The first time an env is received, record its initial state.
    for env_id in state.env_ids:
        if env_id not in init_states:
            init_states[env_id] = state
    # Do something else with the state.
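The pseudocode above can be made runnable with a small stand-in pool. This is a sketch under assumptions: `FakeAsyncPool` and `collect_init_states` are hypothetical names invented here, the `(env_ids, observations)` return shape of `recv()` is a simplification and not the real EnvPool return value, and the sketch stores each env's own observation rather than the whole batch.

```python
from collections import deque


class FakeAsyncPool:
    """Hypothetical stand-in for an async vectorized env (not the EnvPool class)."""

    def __init__(self, num_envs, batch_size):
        self.num_envs = num_envs
        self.batch_size = batch_size
        self.ready = deque()  # (env_id, obs) results waiting to be received
        self.steps = [0] * num_envs

    def async_reset(self):
        # Queue a reset result for every env; returns nothing.
        for env_id in range(self.num_envs):
            self.steps[env_id] = 0
            self.ready.append((env_id, f"init-{env_id}"))

    def recv(self):
        # Return up to batch_size finished results.
        count = min(self.batch_size, len(self.ready))
        batch = [self.ready.popleft() for _ in range(count)]
        env_ids = [env_id for env_id, _ in batch]
        observations = [obs for _, obs in batch]
        return env_ids, observations

    def send(self, actions, env_ids):
        # Step the given envs and queue their results.
        for env_id in env_ids:
            self.steps[env_id] += 1
            self.ready.append((env_id, f"obs-{env_id}-{self.steps[env_id]}"))


def collect_init_states(pool, num_iterations):
    """Record the first observation ever received from each env."""
    pool.async_reset()
    init_states = {}
    for _ in range(num_iterations):
        env_ids, observations = pool.recv()
        for env_id, obs in zip(env_ids, observations):
            if env_id not in init_states:
                init_states[env_id] = obs
        # Placeholder policy: send a dummy action to every env just received.
        pool.send([0] * len(env_ids), env_ids)
    return init_states
```

For example, `collect_init_states(FakeAsyncPool(num_envs=4, batch_size=2), num_iterations=4)` records one initial observation per env even though the envs arrive in separate batches.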
from envpool.
@mavenlin @Trinkle23897
Thank you both so much for your responses. My problem can be solved this way.
from envpool.