Comments (3)
Do you want it inside envpool package or in the example folder?
I think it should be in example. I don't want to put tianshou's code in envpool/ either, so that the library's code is clean enough.
because I cannot step in a particular env... (env.send() does exist but env.recv() does not guarantee the result to be from the same env).
I'm not quite sure about your approach, but envpool now supports this feature when num_envs == batch_size:
envpool/envpool/atari/atari_envpool_test.py, lines 80 to 108 in 3375b13
Line 30 in 3375b13
Indeed, this feature lacks documentation; I'll add it later.
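For later readers, here is a toy sketch in plain Python of the problem discussed above. `ToyAsyncPool` and `step_all` are hypothetical names invented for this illustration; they are not envpool's API or implementation. The sketch only assumes that results come back in arbitrary order and are tagged with the id of the env that produced them, so a full batch (batch_size == num_envs) can be put back into env order.

```python
import random


class ToyAsyncPool:
    """Hypothetical stand-in for an async env pool (NOT envpool's API)."""

    def __init__(self, num_envs, batch_size, seed=0):
        self.num_envs = num_envs
        self.batch_size = batch_size
        self._rng = random.Random(seed)
        self._finished = []  # (env_id, observation) pairs ready to be received

    def send(self, actions, env_ids):
        # Each env "steps" and produces a fake observation string.
        for action, env_id in zip(actions, env_ids):
            self._finished.append((env_id, f"obs(env={env_id}, action={action})"))
        # Completion order is arbitrary in an async pool; simulate by shuffling.
        self._rng.shuffle(self._finished)

    def recv(self):
        # Hand back at most batch_size finished results, tagged with env ids.
        batch = self._finished[: self.batch_size]
        self._finished = self._finished[self.batch_size:]
        env_ids = [env_id for env_id, _ in batch]
        obs = [o for _, o in batch]
        return obs, env_ids


def step_all(pool, actions):
    """Step every env once and return observations in env order."""
    assert pool.batch_size == pool.num_envs, "needs a full batch"
    pool.send(actions, list(range(pool.num_envs)))
    obs, env_ids = pool.recv()
    ordered = [None] * pool.num_envs
    for o, env_id in zip(obs, env_ids):
        ordered[env_id] = o
    return ordered
```

With batch_size smaller than num_envs, a single recv() only returns part of the pool, which is why a VecEnv-style step() over all envs is not possible in that setting without extra bookkeeping.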
I can also make a PR if you think it makes sense to integrate it directly into envpool (would make it easier for people already using gym / SB3 to adopt envpool ;))
Awesome! Looking forward to that.
Thanks for the heads up =)
My updated code, I'll try to make a PR tomorrow ;). Do you want it inside envpool package or in the example folder?
from typing import Optional

import envpool
import gym
import numpy as np
import torch as th
from envpool.python.protocol import EnvPool
from gym.envs.registration import EnvSpec
from stable_baselines3 import PPO
from stable_baselines3.common.env_util import make_vec_env
from stable_baselines3.common.vec_env import VecMonitor
from stable_baselines3.common.vec_env.base_vec_env import (
    VecEnv,
    VecEnvObs,
    VecEnvStepReturn,
    VecEnvWrapper,
)
from stable_baselines3.common.evaluation import evaluate_policy

# Force PyTorch to use only one thread
# (makes things faster for simple envs)
th.set_num_threads(1)

num_envs = 4
env_id = "Pendulum-v0"
seed = 0
use_env_pool = True


class VecAdapter(VecEnvWrapper):
    """
    Convert EnvPool object to a Stable-Baselines3 (SB3) VecEnv.

    :param venv: The envpool object.
    """

    def __init__(self, venv: EnvPool):
        # SB3 expects a num_envs attribute on the wrapped env
        venv.num_envs = venv.spec.config.num_envs
        super().__init__(venv=venv)

    def step_async(self, actions: np.ndarray) -> None:
        # Store the actions; the actual step happens in step_wait()
        self.actions = actions

    def reset(self) -> VecEnvObs:
        return self.venv.reset()

    def seed(self, seed: Optional[int] = None) -> None:
        # You can only seed EnvPool env by calling envpool.make()
        pass

    def step_wait(self) -> VecEnvStepReturn:
        obs, rewards, dones, info_dict = self.venv.step(self.actions)
        infos = []
        # Convert dict to list of dict
        # and add terminal observation
        for i in range(self.num_envs):
            infos.append(
                {
                    key: info_dict[key][i]
                    for key in info_dict.keys()
                    if isinstance(info_dict[key], np.ndarray)
                }
            )
            if dones[i]:
                # Keep the last observation, then reset only the finished env
                infos[i]["terminal_observation"] = obs[i]
                obs[i] = self.venv.reset(np.array([i]))
        return obs, rewards, dones, infos


if use_env_pool:
    env = envpool.make(env_id, env_type="gym", num_envs=num_envs, seed=seed)
    env.spec.id = env_id
    env = VecAdapter(env)
    env = VecMonitor(env)
else:
    env = make_vec_env(env_id, n_envs=num_envs)

# Tuned hyperparams for Pendulum-v0
model = PPO(
    "MlpPolicy",
    env,
    n_steps=1024,
    learning_rate=1e-3,
    use_sde=True,
    sde_sample_freq=4,
    gae_lambda=0.95,
    gamma=0.9,
    verbose=1,
    seed=seed,
)

# model = PPO(
#     "MlpPolicy",
#     env,
#     learning_rate=1e-3,
#     gae_lambda=0.95,
#     gamma=0.9,
#     verbose=1,
#     seed=seed,
# )

try:
    model.learn(100_000)
except KeyboardInterrupt:
    pass

# Agent trained on envpool version should also perform well on regular Gym env
test_env = gym.make(env_id)

# Test with EnvPool
mean_reward, std_reward = evaluate_policy(model, env, n_eval_episodes=20)
print(f"EnvPool - {env_id}")
print(f"Mean Reward: {mean_reward:.2f} +/- {std_reward:.2f}")

# Test with Gym
mean_reward, std_reward = evaluate_policy(model, test_env, n_eval_episodes=20, warn=False)
print(f"Gym - {env_id}")
print(f"Mean Reward: {mean_reward:.2f} +/- {std_reward:.2f}")
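The info handling in step_wait() above can be tried in isolation. Below is a minimal sketch of the same idea, with plain Python lists standing in for the NumPy arrays so it runs without any dependencies. `to_sb3_infos` is a hypothetical helper name and the info keys in the usage example are made up; only the conversion pattern (one dict of per-env values to a list of per-env dicts, plus "terminal_observation" for finished envs) comes from the code above.

```python
from typing import Any, Dict, List, Sequence


def to_sb3_infos(
    info_dict: Dict[str, Any],
    obs: Sequence[Any],
    dones: Sequence[bool],
) -> List[Dict[str, Any]]:
    """Turn one dict of per-env sequences into a list of per-env dicts."""
    num_envs = len(dones)
    infos = []
    for i in range(num_envs):
        # Keep only entries that hold one value per env
        # (the adapter above checks for np.ndarray instead).
        info = {
            key: value[i]
            for key, value in info_dict.items()
            if isinstance(value, (list, tuple)) and len(value) == num_envs
        }
        if dones[i]:
            # SB3 looks for the last observation of a finished episode here.
            info["terminal_observation"] = obs[i]
        infos.append(info)
    return infos


# Usage with made-up keys and values:
example = to_sb3_infos(
    {"env_id": [0, 1], "elapsed_step": [7, 200], "not_per_env": 3},
    obs=[[0.1], [0.2]],
    dones=[False, True],
)
print(example)
```

Note that the sketch does not reset the finished env; in the adapter that is done right after storing the terminal observation.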