agi-brain / xuance

XuanCe: A Comprehensive and Unified Deep Reinforcement Learning Library

Home Page: https://xuance.readthedocs.io/

License: MIT License

Python 100.00%

multi-agent-reinforcement-learning reinforcement-learning reinforcement-learning-library decision-making mindspore pytorch tensorflow2 atari mujoco mpe

xuance's Issues

COMA didn't work

I don't know what's wrong; the other algorithms run normally.

gym.error.NameNotFound: Environment Pong doesn't exist in namespace ALE.

When I run a demo of Atari after correctly installing xuance, it raises an error:

raise error.NameNotFound( gym.error.NameNotFound: Environment Pong doesn't exist in namespace ALE.

AttributeError: accessing private attribute '_max_episode_steps' is prohibited

Hello, is this a version incompatibility problem?

A.L.E: Arcade Learning Environment (version 0.8.1+53f58b7)
[Powered by Stella]
D:\Anaconda\envs\pycarla\lib\site-packages\gym\utils\passive_env_checker.py:32: UserWarning: WARN: A Box observation space has an unconventional shape (neither an image, nor a 1D vector). We recommend flattening the observation to have only a 1D vector or use a custom policy to properly process the data. Actual observation shape: (210, 160)
"A Box observation space has an unconventional shape (neither an image, nor a 1D vector). "
Traceback (most recent call last):
  File "D:\CODE\TrafficBigData\ReinforcementLearning\xuance-master\test.py", line 7, in <module>
    is_test=False)
  File "D:\CODE\TrafficBigData\ReinforcementLearning\xuance-master\xuance\common\common_tools.py", line 166, in get_runner
    runner = run_REGISTRY[args[0].runner](args) if type(args) == list else run_REGISTRY[args.runner](args)
  File "D:\CODE\TrafficBigData\ReinforcementLearning\xuance-master\xuance\torch\runners\runner_drl.py", line 21, in __init__
    super(Runner_DRL, self).__init__(self.args)
  File "D:\CODE\TrafficBigData\ReinforcementLearning\xuance-master\xuance\torch\runners\runner_basic.py", line 11, in __init__
    self.envs = make_envs(args)
  File "D:\CODE\TrafficBigData\ReinforcementLearning\xuance-master\xuance\environment\__init__.py", line 68, in make_envs
    return REGISTRY_VEC_ENV[config.vectorize]([thunk for _ in range(config.parallels)])
  File "D:\CODE\TrafficBigData\ReinforcementLearning\xuance-master\xuance\environment\gym\gym_vec_env.py", line 233, in __init__
    super(DummyVecEnv_Atari, self).__init__(env_fns)
  File "D:\CODE\TrafficBigData\ReinforcementLearning\xuance-master\xuance\environment\gym\gym_vec_env.py", line 158, in __init__
    self.envs = [fn() for fn in env_fns]
  File "D:\CODE\TrafficBigData\ReinforcementLearning\xuance-master\xuance\environment\gym\gym_vec_env.py", line 158, in <listcomp>
    self.envs = [fn() for fn in env_fns]
  File "D:\CODE\TrafficBigData\ReinforcementLearning\xuance-master\xuance\environment\__init__.py", line 58, in _thunk
    config.obs_type, config.frame_skip, config.num_stack, config.img_size, config.noop_max)
  File "D:\CODE\TrafficBigData\ReinforcementLearning\xuance-master\xuance\environment\gym\gym_env.py", line 123, in __init__
    self.max_episode_length = self.env._max_episode_steps
  File "D:\Anaconda\envs\pycarla\lib\site-packages\gym\core.py", line 240, in __getattr__
    raise AttributeError(f"accessing private attribute '{name}' is prohibited")
AttributeError: accessing private attribute '_max_episode_steps' is prohibited
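
The last frame of the traceback above shows gym's wrapper `__getattr__` refusing to forward a name that starts with an underscore. A minimal sketch of reading the step limit through public attributes instead, assuming the environment exposes `spec.max_episode_steps`; the helper name `get_max_episode_steps` and its default value are hypothetical and not part of xuance or gym:

```python
from types import SimpleNamespace


def get_max_episode_steps(env, default=1000):
    """Return the episode step limit via the public `spec` attribute.

    Falls back to `default` (an arbitrary illustrative value) when the
    environment has no spec or the spec carries no limit.
    """
    spec = getattr(env, "spec", None)
    limit = getattr(spec, "max_episode_steps", None)
    return default if limit is None else int(limit)


# Stand-in objects so the sketch runs without gym installed.
fake_env = SimpleNamespace(spec=SimpleNamespace(max_episode_steps=500))
print(get_max_episode_steps(fake_env))           # 500
print(get_max_episode_steps(SimpleNamespace()))  # 1000
```

With a real environment the call would look like `get_max_episode_steps(env)`; whether this matches what xuance expects at `gym_env.py` line 123 is not confirmed by the issue.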

FileNotFoundError

FileNotFoundError: Could not find module 'E:\PycharmProjects\xuanpolicy-master\xuanpolicy\environment\magent2\magent.dll' (or one of its dependencies). Try using the full path with constructor syntax.

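Are there more tutorials on custom environments? How do I bind a custom environment to an algorithm and run it?

Sorry to bother you. So far I have only found a few short tutorials in the official documentation. Is there a more detailed tutorial on custom environments? Also, could you explain how to bind a custom environment to an algorithm and run it? Thank you very much!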

https://xuance.readthedocs.io/zh/latest/documents/api/environments.html#/id2

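How to convert action values to one-hot encoding

I defined a custom multi-agent environment, and my reward function needs the concrete value of each action. My action space is discrete, so I need concrete values such as 0, 1, 2. However, when I run the MAPPO algorithm, the actions I receive are not concrete values. How should I change this?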
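
One possible way to recover a discrete index inside the reward function is an argmax over the received vector. This is only a sketch under the assumption that the environment receives one vector per agent (one-hot or probability-like); the issue does not confirm what MAPPO actually passes, and the agent names and helper names below are hypothetical:

```python
def one_hot_to_index(action_vector):
    """Return the index of the largest entry (argmax) of a 1-D sequence."""
    return max(range(len(action_vector)), key=lambda i: action_vector[i])


def decode_actions(actions_by_agent):
    """Map each agent's action vector to a plain integer action."""
    return {agent: one_hot_to_index(vec) for agent, vec in actions_by_agent.items()}


joint_action = {"agent_0": [0.0, 1.0, 0.0], "agent_1": [0.1, 0.2, 0.7]}
print(decode_actions(joint_action))  # {'agent_0': 1, 'agent_1': 2}
```

With NumPy arrays, `int(np.argmax(vec))` gives the same result.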

Failed building wheel for mpi4py

When installing xuance via pip install xuance from PyPI or pip install -e . from GitHub, the following errors are raised:

o: undefined reference to `opal_bitmap_t_class'
      /home/wzliu/anaconda3/envs/xuance_py39/compiler_compat/ld: /usr/lib/x86_64-linux-gnu/openmpi/lib/libmpi.so: undefined reference to `opal_info_set_value_enum'
      /home/wzliu/anaconda3/envs/xuance_py39/compiler_compat/ld: /usr/lib/x86_64-linux-gnu/openmpi/lib/libmpi.so: undefined reference to `opal_hash_table_get_first_key_uint32'
      /home/wzliu/anaconda3/envs/xuance_py39/compiler_compat/ld: /usr/lib/x86_64-linux-gnu/openmpi/lib/libmpi.so: undefined reference to `mca_base_component_list_item_t_class'
      /home/wzliu/anaconda3/envs/xuance_py39/compiler_compat/ld: /usr/lib/x86_64-linux-gnu/openmpi/lib/libmpi.so: undefined reference to `opal_infosubscribe_change_info'
      /home/wzliu/anaconda3/envs/xuance_py39/compiler_compat/ld: /usr/lib/x86_64-linux-gnu/openmpi/lib/libmpi.so: undefined reference to `orte_info_register_framework_params'
      /home/wzliu/anaconda3/envs/xuance_py39/compiler_compat/ld: /usr/lib/x86_64-linux-gnu/openmpi/lib/libmpi.so: undefined reference to `mca_base_framework_open'
      /home/wzliu/anaconda3/envs/xuance_py39/compiler_compat/ld: /usr/lib/x86_64-linux-gnu/openmpi/lib/libmpi.so: undefined reference to `orte_session_dir_cleanup'
      /home/wzliu/anaconda3/envs/xuance_py39/compiler_compat/ld: /usr/lib/x86_64-linux-gnu/openmpi/lib/libmpi.so: undefined reference to `opal_info_show_opal_version'
      /home/wzliu/anaconda3/envs/xuance_py39/compiler_compat/ld: /usr/lib/x86_64-linux-gnu/openmpi/lib/libmpi.so: undefined reference to `opal_datatype_add'
      /home/wzliu/anaconda3/envs/xuance_py39/compiler_compat/ld: /usr/lib/x86_64-linux-gnu/openmpi/lib/libmpi.so: undefined reference to `opal_class_finalize'
      /home/wzliu/anaconda3/envs/xuance_py39/compiler_compat/ld: /usr/lib/x86_64-linux-gnu/openmpi/lib/libmpi.so: undefined reference to `mca_base_var_group_get_count'
      /home/wzliu/anaconda3/envs/xuance_py39/compiler_compat/ld: /usr/lib/x86_64-linux-gnu/openmpi/lib/libmpi.so: undefined reference to `opal_datatype_resize'
      /home/wzliu/anaconda3/envs/xuance_py39/compiler_compat/ld: /usr/lib/x86_64-linux-gnu/openmpi/lib/libmpi.so: undefined reference to `opal_hash_table_set_value_uint64'
      /home/wzliu/anaconda3/envs/xuance_py39/compiler_compat/ld: /usr/lib/x86_64-linux-gnu/openmpi/lib/libmpi.so: undefined reference to `MPIR_being_debugged'
      /home/wzliu/anaconda3/envs/xuance_py39/compiler_compat/ld: /usr/lib/x86_64-linux-gnu/openmpi/lib/libmpi.so: undefined reference to `opal_list_t_class'
      collect2: error: ld returned 1 exit status
      failure.
      removing: _configtest.c _configtest.o
      error: Cannot link MPI programs. Check your configuration!!!
      [end of output]
  
  note: This error originates from a subprocess, and is likely not a problem with pip.
  ERROR: Failed building wheel for mpi4py
  Building wheel for pathtools (setup.py) ... done
  Created wheel for pathtools: filename=pathtools-0.1.2-py3-none-any.whl size=8791 sha256=8dadb41a5d290b4f741fe5ecb16f6cb7b6d0cc7686410acdae435ce4c66f92b6
  Stored in directory: /home/wzliu/.cache/pip/wheels/b7/0a/67/ada2a22079218c75a88361c0782855cc72aebc4d18d0289d05
Successfully built gym moviepy pathtools
Failed to build mpi4py
ERROR: Could not build wheels for mpi4py, which is required to install pyproject.toml-based projects

There is no 'SubprocVecEnv_MAgent', only 'DummyVecEnv_MAgent'? When will 'SubprocVecEnv_MAgent' be added?

Many thanks. I just found 'SubprocVecEnv_Pettingzoo', and there is no code referencing 'SubprocVecEnv_Pettingzoo'.

from mpi4py import MPI ImportError: DLL load failed: 找不到指定的模块。 (The specified module could not be found.)

I followed installation steps 1-3 exactly, then ran pip install xuanpolicy[torch]. But when I import xuanpolicy, it reaches "from mpi4py import MPI" and fails with "ImportError: DLL load failed: 找不到指定的模块。" (The specified module could not be found.) What is going on?

Problem encountered when running pip install xuance[all]

As shown in the screenshot, how can I solve this?

MPE env not working.

FileNotFoundError Traceback (most recent call last)
in <cell line: 1>()
----> 1 runner = xp.get_runner(method='maddpg',
2 env='mpe',
3 env_id='simple_spread',
4 is_test=False)

2 frames
/usr/local/lib/python3.10/dist-packages/xuanpolicy/common/common_tools.py in get_config(file_name)
22
23 def get_config(file_name):
---> 24 with open(file_name, "r") as f:
25 try:
26 config_dict = yaml.load(f, Loader=yaml.FullLoader)

FileNotFoundError: [Errno 2] No such file or directory: '/usr/local/lib/python3.10/dist-packages/xuanpolicy/configs/maddpg/mpe/simple_spread.yaml'
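
The traceback above fails inside `get_config` because the requested YAML file is absent from the installed package. A small sketch for listing which config files are actually present under a given root, assuming the `<root>/<method>/<env>/<env_id>.yaml` layout seen in the error path; the function name and the `simple_push.yaml` file used in the demo are illustrative only:

```python
from pathlib import Path
import tempfile


def list_env_configs(config_root, method, env):
    """Return the sorted stems of YAML files under <config_root>/<method>/<env>/."""
    folder = Path(config_root) / method / env
    if not folder.is_dir():
        return []
    return sorted(p.stem for p in folder.glob("*.yaml"))


# Demo on a temporary directory that mimics the layout from the error path.
with tempfile.TemporaryDirectory() as root:
    mpe_dir = Path(root) / "maddpg" / "mpe"
    mpe_dir.mkdir(parents=True)
    (mpe_dir / "simple_push.yaml").write_text("agent: MADDPG\n")
    print(list_env_configs(root, "maddpg", "mpe"))    # ['simple_push']
    print(list_env_configs(root, "maddpg", "atari"))  # []
```

Pointing `config_root` at the installed `configs` directory would show whether `simple_spread` is among the shipped files for that version.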

Simple_Reference solving cleanly, but Simple_Speaker_Listener not being solved by MADDPG

Hello, I like this library quite a bit - it's been very useful so far. I've been aiming to produce a set of benchmark solutions to the MPE environments using it, but I've run into an issue with asymmetrical environments struggling to perform right under MPE. When testing Simple-Speaker-Listener, the listener moves to the center of the landmarks, rather than to the landmark indicated.

Compounding the strangeness is the fact that simple_reference is solved quite cleanly. I'm currently running on version 1.7, with the following training code:

import xuanpolicy as xp
import torch
# Reference, SSL, World_Comm
from pettingzoo.mpe import simple_reference_v3, simple_speaker_listener_v4, simple_world_comm_v3, simple_spread_v3, simple_push_v3
import imageio
from IPython import display
import types
import os
import numpy as np
from collections import defaultdict
device = 'cuda' if torch.cuda.is_available() else 'cpu'

env_type = simple_speaker_listener_v4
env = env_type.parallel_env(max_cycles=25, continuous_actions=True, render_mode="rgb_array")

asymmetric = True # We train one agent if symmetrical, multiple otherwise.
agent_name = 'maddpg' if not asymmetric else ['maddpg'] * len(env.observation_spaces)

env_class_name = env_type.__name__.split('.')[-1]
env_name = f"{'_'.join(env_class_name.split('_')[:-1])}"
cpath = f'/usr/local/lib/python3.10/dist-packages/xuanpolicy/configs/maddpg/mpe/{env_name}.yaml'
runner = xp.get_runner(agent_name=agent_name, env_name=f"mpe/{env_name}", is_test=False)

The configuration I'm using is as follows - it's identical to the config file used to train simple_reference successfully, save for the name of the environment.

agent: "MADDPG"  # the learning algorithms_marl
env_name: "mpe"
env_id: "simple_speaker_listener_v4"
continuous_action: True
policy: "MADDPG_policy"
representation: "Basic_Identical"
vectorize: "Dummy_MAS"
runner: "MARL"

representation_hidden_size: [32, ]  # the units for each hidden layer
actor_hidden_size: [256, ]
critic_hidden_size: [256, ]
activation: 'ReLU'
activation_action: 'sigmoid'

lr_a: 0.01  # learning rate for actor
lr_c: 0.001  # learning rate for critic
tau: 0.001  # soft update for target networks
sigma: 0.1  # random noise for continuous actions
clip_grad: 0.5

buffer_size: 200000
batch_size: 256
gamma: 0.95  # discount factor

training_steps: 30000
training_frequency: 1

n_tests: 5
test_period: 100
consider_terminal_states: False  # if consider the terminal states when calculate target Q-values.

use_obsnorm: False
use_rewnorm: False
obsnorm_range: 5
rewnorm_range: 5

logdir: "./logs/maddpg/"
modeldir: "./models/maddpg/"

Is there something that I've configured incorrectly? Similar settings seemed to work in the original MADDPG paper.

Encountering problems during installation

Building wheel for box2d-py (setup.py) ... error
error: subprocess-exited-with-error

× python setup.py bdist_wheel did not run successfully.
│ exit code: 1
╰─> [16 lines of output]
Using setuptools (version 68.0.0).
running bdist_wheel
running build
running build_py
creating build
creating build\lib.win-amd64-cpython-310
creating build\lib.win-amd64-cpython-310\Box2D
copying library\Box2D\Box2D.py -> build\lib.win-amd64-cpython-310\Box2D
copying library\Box2D\__init__.py -> build\lib.win-amd64-cpython-310\Box2D
creating build\lib.win-amd64-cpython-310\Box2D\b2
copying library\Box2D\b2\__init__.py -> build\lib.win-amd64-cpython-310\Box2D\b2
running build_ext
building 'Box2D._Box2D' extension
swigging Box2D\Box2D.i to Box2D\Box2D_wrap.cpp
swig.exe -python -c++ -IBox2D -small -O -includeall -ignoremissing -w201 -globals b2Globals -outdir library\Box2D -keyword -w511 -D_SWIG_KWARGS -o Box2D\Box2D_wrap.cpp Box2D\Box2D.i
error: command 'swig.exe' failed: None
[end of output]

note: This error originates from a subprocess, and is likely not a problem with pip.
ERROR: Failed building wheel for box2d-py
Running setup.py clean for box2d-py
Failed to build box2d-py
ERROR: Could not build wheels for box2d-py, which is required to install pyproject.toml-based projects

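OSError: [WinError 126] 找不到指定的模块。 (The specified module could not be found.)

This error occurs and I don't know how to solve it.

Could you add implementations of independently trained MARL algorithms such as IDDPG or IQL to the examples folder? I would be very grateful.

Custom multi-agent environment

In new_env_mas.py, the line self.state_space = Box(low=0, high=1, shape=[self.dim_state, ], dtype=np.float32, seed=self.seed) defines the agents' state space. But that means every agent's observation is the same, right? Suppose I have two agents and the observations are two given columns of data, so that each agent's observation corresponds to one column. In that case the observation ranges of the agents differ. For example:
obs_space1 = Box(low=self.data1.min(), high=self.data1.max(), shape=(self.dim_obs,), dtype=np.float32, seed=self.seed)
obs_space2 = Box(low=self.data2.min(), high=self.data2.max(), shape=(self.dim_obs,), dtype=np.float32, seed=self.seed)
How should the observation space be defined in new_env_mas.py for agents like these?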

NameError: name 'Toy_Env' is not defined

Hello Prof. Liu, I am a beginner in multi-agent reinforcement learning. When trying to run your xuance framework, I chose tensorflow + GPU and hit the following error:
Traceback (most recent call last):
  File "C:\Users\50\Desktop\MADRLTEST\main.py", line 5, in <module>
    is_test=False)
  File "F:\anaconda\envs\xpolicy\lib\site-packages\xuanpolicy\common\common_tools.py", line 121, in get_runner
    from xuanpolicy.tensorflow.runners import REGISTRY as run_REGISTRY
  File "F:\anaconda\envs\xpolicy\lib\site-packages\xuanpolicy\tensorflow\runners\__init__.py", line 1, in <module>
    from .runner_drl import Runner_DRL as DRL_runner
  File "F:\anaconda\envs\xpolicy\lib\site-packages\xuanpolicy\tensorflow\runners\runner_drl.py", line 2, in <module>
    from xuanpolicy.tensorflow.agents import get_total_iters
  File "F:\anaconda\envs\xpolicy\lib\site-packages\xuanpolicy\tensorflow\agents\__init__.py", line 36, in <module>
    from .policy_gradient.pdqn_agent import PDQN_Agent
  File "F:\anaconda\envs\xpolicy\lib\site-packages\xuanpolicy\tensorflow\agents\policy_gradient\pdqn_agent.py", line 6, in <module>
    class PDQN_Agent(Agent):
  File "F:\anaconda\envs\xpolicy\lib\site-packages\xuanpolicy\tensorflow\agents\policy_gradient\pdqn_agent.py", line 12, in PDQN_Agent
    device: str = 'cpu'):
NameError: name 'Toy_Env' is not defined
The error says that Toy_Env is not defined. I don't really understand what the code itself means. Seeing device: str = 'cpu' in the traceback, I also tried changing it to gpu, but that had no effect. How should I modify the code? Or did something go wrong at some step of my environment installation?
I would be very grateful if you could take the time to look at my problem. Many thanks!

How can I solve the following problem when running the test case on Windows?

Test code:
import xuance
runner = xuance.get_runner(method='dqn',
                           env='classic_control',
                           env_id='CartPole-v1',
                           is_test=False)
runner.run()
Error:
test_dqn.py:None (test_dqn.py)
test_dqn.py:6: in <module>
runner.run()
xuance\torch\runners\runner_drl.py:93: in run
self.agent.save_model("final_train_model.pth")
xuance\torch\agents\agent.py:90: in save_model
self.learner.save_model(model_path)
xuance\torch\learners\learner.py:25: in save_model
torch.save(self.policy.state_dict(), model_path)
C:\Users\G11.conda\envs\xuance_env\lib\site-packages\torch\serialization.py:422: in save
with _open_zipfile_writer(f) as opened_zipfile:
C:\Users\G11.conda\envs\xuance_env\lib\site-packages\torch\serialization.py:309: in _open_zipfile_writer
return container(name_or_buffer)
C:\Users\G11.conda\envs\xuance_env\lib\site-packages\torch\serialization.py:287: in __init__
super(_open_zipfile_writer_file, self).__init__(torch._C.PyTorchFileWriter(str(name)))
E RuntimeError: Parent directory E:\JYC_CODE\强化学习\xuance./models/dqn does not exist.
collected 0 items / 1 error

Is this caused by the difference between Windows and Linux file paths? How should I fix it? Thanks!
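
The `RuntimeError` in the traceback comes from `torch.save` being handed a path whose parent directory (`./models/dqn`) does not exist. One possible workaround, sketched here with the standard library only, is to create that directory before saving; `ensure_parent_dir` is a hypothetical helper and not part of xuance:

```python
import os
import tempfile


def ensure_parent_dir(file_path):
    """Create the parent directory of `file_path` if missing; return the absolute path."""
    file_path = os.path.abspath(file_path)
    os.makedirs(os.path.dirname(file_path), exist_ok=True)
    return file_path


# Demo in a temporary directory; the file name is taken from the traceback.
with tempfile.TemporaryDirectory() as root:
    target = ensure_parent_dir(os.path.join(root, "models", "dqn", "final_train_model.pth"))
    print(os.path.isdir(os.path.dirname(target)))  # True
```

Calling the helper on the model path before `runner.run()` (or simply creating `./models/dqn` by hand) should let the save step find its directory, assuming nothing else about the path is wrong.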

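Hello, is there a tutorial or template for adding a new algorithm?

As the title says.
If I want to add my own algorithm on top of this framework, what exactly do I need to modify?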

Is there any examples/tutorials for the magent2 environment?

Many thanks.

Parent directory does not exist

I ran the command shown in the screenshot

and then hit this error. Do I need to change the save path in the config file?

AttributeError: partially initialized module 'xuance' has no attribute 'get_runner' (most likely due to a circular import)
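
A common cause of this kind of message in Python is a local script that has the same name as the package (for example a file called `xuance.py` next to the script being run), so the script ends up importing itself. The issue gives no details, so this is only a guess; the check below is a hypothetical helper for ruling that case out:

```python
from pathlib import Path
import tempfile


def is_shadowed_by_script(package_name, directory="."):
    """True if `<directory>/<package_name>.py` exists.

    Such a file is imported instead of the installed package when
    `directory` comes first on sys.path (e.g. the script's own folder).
    """
    return (Path(directory) / f"{package_name}.py").is_file()


with tempfile.TemporaryDirectory() as workdir:
    print(is_shadowed_by_script("xuance", workdir))  # False
    (Path(workdir) / "xuance.py").write_text("import xuance\n")
    print(is_shadowed_by_script("xuance", workdir))  # True
```

If the check returns True for the folder containing the script, renaming that file is usually enough.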

Communication based MARL algorithms?

Hello, are you considering adding some communication based MARL algorithms to xuance? For example, CommNet, IC3Net, I2C, etc.

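from mpi4py import MPI ImportError: DLL load failed: 找不到指定的模块。 (The specified module could not be found.)

So is it currently impossible to use this on Windows? I ran into the same problem and also ran pip install magent2, but it still does not work.

          Hello, this error occurs because the current version does not account for the dynamic link library file of the magent2 environment on Windows. Could you try installing this environment and check whether the error still occurs?

pip install magent2
We will fix this issue in a later release. Thank you for your feedback!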

Originally posted by @wenzhangliu in #7 (comment)

pygame.error: video system not initialized

When running demo_marl.py, training works normally, but rendering raises an error.

Problem with the test case

Running the test case:

import xuance
runner = xuance.get_runner(method='dqn',
                           env='classic_control',
                           env_id='CartPole-v1',
                           is_test=False)
runner.run()

Training completes normally, but after changing is_test to True and running again, it raises:

Traceback (most recent call last):
  File "E:\Reinforcement Learning\xuance\cartpole.py", line 7, in <module>
    runner.run()
  File "D:\Anaconda\envs\xuance\lib\site-packages\xuance\torch\runners\runner_drl.py", line 85, in run
    self.agent.load_model(self.agent.model_dir_load, self.args.seed)
  File "D:\Anaconda\envs\xuance\lib\site-packages\xuance\torch\agents\agent.py", line 93, in load_model
    self.learner.load_model(path, seed)
  File "D:\Anaconda\envs\xuance\lib\site-packages\xuance\torch\learners\learner.py", line 38, in load_model
    model_path = os.path.join(path, model_names[-1])
IndexError: list index out of range

Environment: Windows 10, xuance 1.0.10, torch 1.13.0+cu117
Thanks in advance for your help!
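
The `IndexError` above is raised at `model_names[-1]`, which suggests that the list of saved models found in the load directory was empty. A sketch of a guard that reports this case with a clearer message; it is not xuance's actual `load_model`, and how xuance builds `model_names` is not shown in the issue:

```python
import os
import tempfile


def latest_model_path(model_dir):
    """Return the path of the last entry (sorted by name) in `model_dir`.

    Raises FileNotFoundError with a readable message when the directory
    is missing or contains nothing to load.
    """
    if not os.path.isdir(model_dir):
        raise FileNotFoundError(f"Model directory not found: {model_dir}")
    names = sorted(os.listdir(model_dir))
    if not names:
        raise FileNotFoundError(f"No saved models in: {model_dir}")
    return os.path.join(model_dir, names[-1])


with tempfile.TemporaryDirectory() as model_dir:
    try:
        latest_model_path(model_dir)
    except FileNotFoundError as err:
        print("empty directory:", type(err).__name__)
    open(os.path.join(model_dir, "final_train_model.pth"), "w").close()
    print(os.path.basename(latest_model_path(model_dir)))  # final_train_model.pth
```

Checking the directory that `model_dir_load` points to after training would show whether a model was written where the test run looks for it.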

I ran into some runtime problems on Windows 10; has anyone solved them?

Error when running qmix_sc2.py from examples

Running qmix_sc2.py raises:
Traceback (most recent call last):
  File "qmix_sc2.py", line 377, in <module>
    runner = Runner(args)
  File "qmix_sc2.py", line 70, in __init__
    self.envs = make_envs(args)
  File "/home/ustc-lc1/miniconda3/envs/env_ywk/lib/python3.7/site-packages/xuance/environment/__init__.py", line 66, in make_envs
    raise NotImplementedError
NotImplementedError

How to use the QMIX algorithm?

While using the MADDPG example, I got an error after replacing MADDPG_Agents with QMIX_Agents:
mixer = QMIX_mixer(config.dim_state[0], config.hidden_dim_mixing_net, config.hidden_dim_hyper_net,
TypeError: 'NoneType' object is not subscriptable
I would like to know how to use the QMIX algorithm correctly. I hope the author can provide an example. Thank you.