
db-football's Introduction


DB-Football


This repo provides a simple, distributed and asynchronous multi-agent reinforcement learning framework for the Google Research Football environment. It currently targets the Google Research Football environment only, with the cooperative part implemented in IPPO/MAPPO and the competitive part implemented in PSRO/Simple League. In the future, we will also release code for other related algorithms and environments.

Our code is based on Light-MALib, a simplified version of MALib with a restricted set of algorithms and environments but several enhancements, such as distributed asynchronous training, league-like multi-population training, and detailed TensorBoard logging. If you are also interested in other multi-agent learning algorithms and environments, refer to MALib for more details.

Citation

Song, Y., Jiang, H., Tian, Z. et al. An Empirical Study on Google Research Football Multi-agent Scenarios. Mach. Intell. Res. (2024). https://doi.org/10.1007/s11633-023-1426-8

@article{song2024empirical,
  title={An Empirical Study on Google Research Football Multi-agent Scenarios},
  author={Song, Yan and Jiang, He and Tian, Zheng and Zhang, Haifeng and Zhang, Yingping and Zhu, Jiangcheng and Dai, Zonghong and Zhang, Weinan and Wang, Jun},
  journal={Machine Intelligence Research},
  pages={1--22},
  year={2024},
  publisher={Springer}
}

For experiments on academy scenarios, please see our new repository: GRF_MARL

Contents

  1. Install
  2. Run Experiments
  3. Benchmark 11_vs_11 1.0 hard bot
  4. GRF toolkits
  5. Benchmark policy
  6. Tensorboard tags
  7. Documentation
  8. Contact
  9. Join Us

Install

You can use any tool to manage your python environment. Here, we use conda as an example.

  1. Install conda/miniconda.
  2. Run conda create -n light-malib python==3.9 to create a new conda env.
  3. Activate the env with conda activate light-malib whenever you want to use it, or add this line to your .bashrc file to activate it every time you log in to bash.

Install Light-MALib, PyTorch and Google Research Football

  1. In the root folder of this repo (with the setup.py file), run pip install -r requirement.txt to install dependencies of Light-MALib.
  2. In the root folder of this repo (with the setup.py file), run pip install . or pip install -e . to install Light-MALib.
  3. Follow the instructions on the official website https://pytorch.org/get-started/locally/ to install PyTorch (for example, version 1.13.0+cu116).
  4. Follow the instructions in the official repo https://github.com/google-research/football to install the Google Research Football environment.

Add a New Football Game Scenario

  1. You may use python -c "import gfootball;print(gfootball.__file__)" or other methods to locate the gfootball package.
  2. Go to the directory of the gfootball package, for example, /home/username/miniconda3/envs/light-malib/lib/python3.9/site-packages/gfootball/.
  3. Copy the .py files under the scenarios folder in our repo to the scenarios folder in the gfootball package.
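
The three steps above can also be scripted. The following is a minimal sketch, not part of this repo: the helper names find_package_dir and copy_scenarios are hypothetical, and the source folder you pass in is assumed to be the scenarios folder of your own checkout.

```python
import importlib.util
import shutil
from pathlib import Path


def find_package_dir(package_name):
    """Return the install directory of a package, or None if it cannot be found."""
    spec = importlib.util.find_spec(package_name)
    if spec is None or spec.origin is None:
        return None
    return Path(spec.origin).parent


def copy_scenarios(src_dir, dst_dir):
    """Copy every .py file from src_dir into dst_dir and return the copied file names."""
    src_dir, dst_dir = Path(src_dir), Path(dst_dir)
    copied = []
    for path in sorted(src_dir.glob("*.py")):
        shutil.copy2(path, dst_dir / path.name)
        copied.append(path.name)
    return copied
```

With gfootball installed, a call such as copy_scenarios("scenarios", find_package_dir("gfootball") / "scenarios") would mirror step 3; adjust the first argument to wherever the scenario files live in your checkout.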

Run Experiments

  1. If you want to run experiments on a small cluster, follow Ray's official instructions to start a cluster. For example, run ray start --head on the master, then connect the other machines to the master following the hints in the command-line output.
  2. Run python light_malib/main_pbt.py --config <config_file_path> to start a training experiment. An example is given in train_light_malib.sh.
  3. Run python light_malib/scripts/play_gr_football.py to run a competition between two models.

Benchmark 11_vs_11 1.0 hard bot

Beats the 1.0 hard bot in the multi-agent 11v11 full-game scenario within 10 hours using IPPO, taking advantage of glitches in the built-in logic.

Google Research Football Toolkit

Currently, we provide the following tools to support research on football AI.

  1. Google Football Game Graph: A data structure representing a game as a tree, with branches marking important events such as goals or interceptions.

  2. Google Football Game Debugger: A single-step graphical debugger showing both 3D and 2D frames with detailed frame data, such as the movements of the players and the ball.

Benchmark Policy

At this stage, we release some of our trained models for use as initializations or opponents. Model files are available on Google Drive and Baidu Wangpan.

Tensorboard tags explained

DataServer:

  1. alive_usage_mean/std: mean/std usage of data samples in buffer;
  2. mean_wait_time: total read waiting time divided by the number of reads;
  3. sample_per_minute_read: number of samples read per minute;
  4. sample_per_minute_write: number of samples written per minute;
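
As a rough illustration of how these DataServer tags relate to raw counters, here is a minimal sketch. The function names and the zero-division guards are assumptions made for this example, not the repo's code.

```python
def mean_wait_time(total_wait_seconds, read_count):
    """Average waiting time per read; 0.0 when nothing has been read yet."""
    if read_count == 0:
        return 0.0
    return total_wait_seconds / read_count


def samples_per_minute(num_samples, elapsed_seconds):
    """Throughput in samples per minute over an elapsed time window."""
    if elapsed_seconds <= 0:
        return 0.0
    return num_samples * 60.0 / elapsed_seconds
```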

PSRO:

  1. Elo: Elo rating during PBT;
  2. Payoff Table: plot of payoff table;
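
For readers unfamiliar with Elo ratings, the textbook update rule is sketched below. This is the standard formula, not necessarily the one used in this repo; the K-factor of 16 and the function names are assumptions for illustration.

```python
def expected_score(rating_a, rating_b):
    """Expected score of player A against player B under the standard Elo model."""
    return 1.0 / (1.0 + 10.0 ** ((rating_b - rating_a) / 400.0))


def update_elo(rating_a, rating_b, score_a, k=16.0):
    """Return updated (rating_a, rating_b) after one game.

    score_a is 1.0 for a win of A, 0.5 for a draw and 0.0 for a loss.
    """
    delta = k * (score_a - expected_score(rating_a, rating_b))
    # The update is zero-sum: what A gains, B loses.
    return rating_a + delta, rating_b - delta
```

Because the update is zero-sum, the mean rating of the population stays constant, which is why ratings are only meaningful relative to each other.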

Rollout:

  1. bad_pass,bad_shot,get_intercepted,get_tackled,good_pass,good_shot,interception,num_pass,num_shot,tackle, total_move,total_pass,total_possession,total_shot: detailed football statistics;
  2. goal_diff: goal difference of the training agent (positive indicates more goals);
  3. lose/win: expected lose/win rate during rollout;
  4. score: expected score during rollout; a single game scores 0 for a loss, 1 for a win and 0.5 for a draw;
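
The win, lose, score and goal_diff tags can be read as simple averages over rollout games. The sketch below shows one way to compute them from final scorelines; the function names and the input format (a list of goal pairs) are hypothetical.

```python
def game_score(my_goals, opponent_goals):
    """Score of a single game: 1 for a win, 0 for a loss, 0.5 for a draw."""
    if my_goals > opponent_goals:
        return 1.0
    if my_goals < opponent_goals:
        return 0.0
    return 0.5


def summarize_rollouts(results):
    """Average statistics over a list of (my_goals, opponent_goals) pairs."""
    if not results:
        raise ValueError("results must contain at least one game")
    n = len(results)
    scores = [game_score(mine, theirs) for mine, theirs in results]
    return {
        "win": sum(s == 1.0 for s in scores) / n,
        "lose": sum(s == 0.0 for s in scores) / n,
        "score": sum(scores) / n,
        # Positive goal_diff means the training agent scored more goals.
        "goal_diff": sum(mine - theirs for mine, theirs in results) / n,
    }
```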

RolloutTimer:

  1. batch: timer for getting a rollout batch;
  2. env_core_step: timer for simulator stepping time;
  3. env_step: total timer for an environment step;
  4. feature: timer for feature encoding;
  5. inference: timer for policy inference;
  6. policy_update: timer for pulling policies from remote;
  7. reward: timer for reward calculation;
  8. rollout: total timer for one rollout;
  9. sample: timer for policy sampling;
  10. stats: timer for collecting statistics;
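
Named timers like the ones above are commonly collected with a small context manager that accumulates elapsed time per tag. The sketch below illustrates the idea only; NamedTimer is a hypothetical class and does not reflect how this repo implements its timers.

```python
import time
from collections import defaultdict
from contextlib import contextmanager


class NamedTimer:
    """Accumulate wall-clock time and call counts per named section."""

    def __init__(self):
        self.totals = defaultdict(float)
        self.counts = defaultdict(int)

    @contextmanager
    def record(self, name):
        start = time.perf_counter()
        try:
            yield
        finally:
            # Record the time even if the timed section raises.
            self.totals[name] += time.perf_counter() - start
            self.counts[name] += 1

    def mean(self, name):
        """Mean duration of a section, or 0.0 if it was never recorded."""
        count = self.counts.get(name, 0)
        return self.totals.get(name, 0.0) / count if count else 0.0
```

Usage would look like `with timer.record("env_step"): ...` around the code to be measured, followed by `timer.mean("env_step")` when logging.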

Training:

  1. Old_V_max/min/mean/std: value estimate at rollout;
  2. V_max/min/mean/std: current value estimate;
  3. advantage_max/min/mean/std: Advantage value;
  4. approx_kl: KL divergence between old and new action distributions;
  5. clip_ratio: proportion of clipped entries;
  6. delta_max/min/mean/std: TD error;
  7. entropy: entropy value;
  8. imp_weights_max/min/mean/std: importance weights;
  9. kl_diff: variation of approx_kl;
  10. lower_clip_ratio: proportion of up-clipping entries;
  11. upper_clip_ratio: proportion of down-clipping entries;
  12. policy_loss: policy loss;
  13. training_epoch: number of training epochs at each iteration;
  14. value_loss: value loss
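
To make the PPO-related tags more concrete, the sketch below computes importance weights, an approximate KL divergence and clipping fractions from old and new action log-probabilities. These are common PPO definitions, assumed here for illustration; the repo's exact formulas may differ, and the keys frac_above and frac_below are hypothetical names that are deliberately not mapped onto lower_clip_ratio or upper_clip_ratio.

```python
import math


def ppo_diagnostics(old_log_probs, new_log_probs, clip_param=0.2):
    """Compute simple PPO diagnostics from per-sample action log-probabilities."""
    if not old_log_probs or len(old_log_probs) != len(new_log_probs):
        raise ValueError("inputs must be non-empty and of equal length")
    n = len(old_log_probs)
    pairs = list(zip(old_log_probs, new_log_probs))
    # Importance weight: ratio of new to old action probability.
    imp_weights = [math.exp(new - old) for old, new in pairs]
    # Sample-based estimate of KL(old || new).
    approx_kl = sum(old - new for old, new in pairs) / n
    frac_above = sum(w > 1.0 + clip_param for w in imp_weights) / n
    frac_below = sum(w < 1.0 - clip_param for w in imp_weights) / n
    return {
        "imp_weights": imp_weights,
        "approx_kl": approx_kl,
        "frac_above": frac_above,
        "frac_below": frac_below,
        "clip_ratio": frac_above + frac_below,
    }
```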

TrainingTimer:

  1. compute_return: timer for GAE computation;
  2. data_copy: timer for data copy when processing data;
  3. data_generator: timer for generating data;
  4. loss: total timer for loss computing;
  5. move_to_gpu: timer for sending data to GPU;
  6. optimize: total timer for an optimization step;
  7. push_policy: timer for pushing trained policies to the remote;
  8. train_step: total timer for a training step;
  9. trainer_data: timer for getting data from local_queue;
  10. trainer_optimize: timer for an optimization step in the trainer;
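
For context on what compute_return measures, here is a minimal sketch of Generalized Advantage Estimation (GAE) over a single uninterrupted trajectory segment. It is a textbook version, assuming no episode boundary inside the segment; the function name and argument layout are hypothetical and the repo's implementation may differ.

```python
def compute_gae(rewards, values, last_value, gamma=1.0, gae_lambda=0.95):
    """Return (advantages, returns) for one trajectory segment.

    rewards[t] and values[t] are per-step lists; last_value is the value
    estimate of the state following the final step.
    """
    advantages = [0.0] * len(rewards)
    next_value = last_value
    running = 0.0
    for t in reversed(range(len(rewards))):
        # One-step TD error.
        delta = rewards[t] + gamma * next_value - values[t]
        # Exponentially weighted sum of future TD errors.
        running = delta + gamma * gae_lambda * running
        advantages[t] = running
        next_value = values[t]
    returns = [adv + val for adv, val in zip(advantages, values)]
    return advantages, returns
```

Setting gae_lambda to 0 reduces the advantage to the one-step TD error, while gae_lambda of 1 recovers the Monte Carlo return minus the value estimate.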

Documentation

Under construction, stay tuned :)

Contact

If you have any questions about this repo, feel free to open an issue. You can also contact the current maintainers, YanSong97 and DiligentPanda, by email.

Join Us

Interested in our project? Or passionate about:

  1. Multi-Agent Learning and Game AI
  2. Operation Research and Optimization
  3. Robotics and Control
  4. Visual and Graphic Intelligence
  5. Data Mining and so on

Welcome! Why not take a look at https://digitalbrain.cn/talents?

With leading scientists, engineers and field experts, we are going to provide Better Decisions for Better World!

Digital Brain Laboratory

Digital Brain Laboratory, Shanghai, is co-founded by the founding partner and chairman of CMC Capital, Mr. Ruigang Li, and the world-renowned scientist in the field of decision intelligence, Prof. Jun Wang.

Recruitment

Recruitment for Students & Internships


db-football's Issues

Running the training example raises the error: "Box" object is not iterable

Running main_pbt.py raises TypeError: "Box" object is not iterable.
Note: I changed gym to gymnasium.

The traceback is as follows:
Traceback (most recent call last):
File "/home/zhp/code/DB-Football/light_malib/main_pbt.py", line 126, in
main()
File "/home/zhp/code/DB-Football/light_malib/main_pbt.py", line 114, in main
runner.run()
File "/home/zhp/code/DB-Football/light_malib/framework/pbt_runner.py", line 95, in run
self.scheduler.initialize(self.cfg.populations)
File "/home/zhp/code/DB-Football/light_malib/framework/scheduler/psro_scheduler.py", line 81, in initialize
self.agent_manager.gen_new_policy(agent_id, self.population_id)
File "/home/zhp/code/DB-Football/light_malib/agent/agent_manager.py", line 108, in gen_new_policy
policy_id, policy = self.agents[agent_id].gen_new_policy(population_id)
File "/home/zhp/code/DB-Football/light_malib/agent/agent.py", line 70, in gen_new_policy
policy_id, policy = population.gen_new_policy()
File "/home/zhp/code/DB-Football/light_malib/agent/agent.py", line 131, in gen_new_policy
policy_id, policy = self.policy_factory.gen_new_policy()
File "/home/zhp/code/DB-Football/light_malib/agent/policy_factory.py", line 73, in gen_new_policy
policy_id, policy = self.init(self.new_policy_ctr)
File "/home/zhp/code/DB-Football/light_malib/agent/policy_factory.py", line 117, in init
pid, policy = self.init_from_random(
File "/home/zhp/code/DB-Football/light_malib/agent/policy_factory.py", line 219, in init_from_random
policy = policy_cls(
File "/home/zhp/code/DB-Football/light_malib/algorithm/mappo/policy.py", line 127, in init
actor = model.Actor(
File "/home/zhp/code/DB-Football/light_malib/model/gr_football/basic_5/init.py", line 31, in init
super().init(
File "/home/zhp/code/DB-Football/light_malib/algorithm/common/rnn_net.py", line 35, in init
self.base = get_model(model_config)(
File "/home/zhp/code/DB-Football/light_malib/algorithm/common/model.py", line 208, in builder
model = handler(
File "/home/zhp/code/DB-Football/light_malib/algorithm/common/model.py", line 90, in init
self._feature_norm = nn.LayerNorm(self.input_dim)
File "/home/zhp/miniconda3/envs/football/lib/python3.8/site-packages/torch/nn/modules/normalization.py", line 155, in init
self.normalized_shape = tuple(normalized_shape) # type: ignore[arg-type]
TypeError: 'Box' object is not iterable
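
Reading the last two frames, nn.LayerNorm calls tuple(normalized_shape) on the value passed in as self.input_dim, and that value is a Box space rather than an integer or a shape. One possible workaround, offered as an assumption and not confirmed by the maintainers, is to convert the space to its shape before it reaches LayerNorm. The sketch below uses a stand-in class so that it runs without gym or gymnasium installed; to_normalized_shape is a hypothetical helper, and it relies only on the fact that Box spaces expose a shape attribute.

```python
import numbers


class FakeBox:
    """Stand-in for a gym/gymnasium Box; only the shape attribute is modelled."""

    def __init__(self, shape):
        self.shape = tuple(shape)


def to_normalized_shape(input_dim):
    """Turn an int, a space-like object with .shape, or a sequence into a tuple of ints."""
    if isinstance(input_dim, numbers.Integral):
        return (int(input_dim),)
    shape = getattr(input_dim, "shape", None)
    if shape is not None:
        return tuple(int(d) for d in shape)
    return tuple(int(d) for d in input_dim)
```

The resulting tuple could then be passed to nn.LayerNorm in place of the raw space object. Whether this is the right fix depends on why a space ends up in input_dim after the switch to gymnasium, which the traceback alone does not show.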

ValueError in training

Hi, when I tried to reproduce your code, I ran into an issue. I cannot find where the problem is or how to solve it. Could you help me?
My environment is built as you recommend; the system is Ubuntu 18.04 LTS.
There are 2 GPUs: a 1080 Ti and a Titan X.
In the code, I only modified 'num_workers' and 'batch_size' in the YAML file to match my hardware.
When I run python light_malib/main_pbt.py --config light_malib/expr/gr_football/expr_10_vs_10_psro.yaml, it generates the following error message:
`
(/media/lxd/880AA9210AA90CEE/anaconda_envs/lzw_GRF) lxd@lxd-T630:/media/lxd/0A7AE0627AE04BCF/lzw/football_game/DB-Football$ python light_malib/main_pbt.py --config light_malib/expr/gr_football/expr_10_vs_10_psro.yaml
[2023-09-28 09:18:34,036][WARNING] No active cluster detected, will create local ray instance.
[2023-09-28 09:18:44,991][WARNING] ============== Cluster Info ==============
{'node_ip_address': '192.168.1.109', 'raylet_ip_address': '192.168.1.109', 'redis_address': '192.168.1.109:6379', 'object_store_address': '/tmp/ray/session_2023-09-28_09-18-34_037912_47469/sockets/plasma_store', 'raylet_socket_name': '/tmp/ray/session_2023-09-28_09-18-34_037912_47469/sockets/raylet', 'webui_url': None, 'session_dir': '/tmp/ray/session_2023-09-28_09-18-34_037912_47469', 'metrics_export_port': 55494, 'node_id': 'a8211a7e16deb107246a6dfd4b68c7d43f1a31ddb9fdba7c482c3b64'}
[2023-09-28 09:18:44,993][WARNING] * cluster resources:
{'accelerator_type:G': 1.0, 'GPU': 2.0, 'object_store_memory': 17054784307.0, 'memory': 34109568615.0, 'node:192.168.1.109': 1.0, 'CPU': 48.0}
[2023-09-28 09:18:44,993][WARNING] this worker ip: 192.168.1.109
[2023-09-28 09:18:44,994][WARNING] Automatically set master ip to local ip address: 192.168.1.109
[2023-09-28 09:18:46,480][INFO] AgentManager initialized
[2023-09-28 09:18:46,514][WARNING] use meta solver type: nash
[2023-09-28 09:18:46,991][INFO] PBTRunner psro initialized
[2023-09-28 09:18:46,991][INFO] PolicyFactory_agent_0_default new policy ctr starts at -1
[2023-09-28 09:18:46,995][WARNING] use model type: gr_football.built_in_11
(pid=47592) [2023-09-28 09:18:49,787][INFO] DataServer initialized
(pid=47595) [2023-09-28 09:18:49,798][INFO] PolicyServer initialized
[2023-09-28 09:18:50,411][INFO] Load initial policy built_in_11 from light_malib/trained_models/gr_football/11_vs_11/built_in
[2023-09-28 09:18:50,426][WARNING] use model type: gr_football.basic_11
[2023-09-28 09:18:50,479][WARNING] agent_0: agent_0-default-0 is initialized from random
[2023-09-28 09:18:50,479][WARNING] policy agent_0-default-0 uses custom_config: {'gamma': 1.0, 'use_cuda': False, 'use_dueling': False, 'preprocess_mode': 'flatten', 'use_q_head': False, 'ppo_epoch': 5, 'num_mini_batch': 1, 'return_mode': 'new_gae', 'gae': {'gae_lambda': 0.95}, 'vtrace': {'clip_rho_threshold': 1.0, 'clip_pg_rho_threshold': 100.0}, 'use_rnn': False, 'rnn_layer_num': 1, 'rnn_data_chunk_length': 16, 'use_feature_normalization': True, 'use_popart': True, 'popart_beta': 0.99999, 'entropy_coef': 0.0, 'clip_param': 0.2, 'use_modified_mappo': False}
[2023-09-28 09:18:50,523][WARNING] after initialization:

policy_ids: ['built_in_11', 'agent_0-default-0'] populations:

policy_ids:['built_in_11', 'agent_0-default-0']

policy_ids:['built_in_11', 'agent_0-default-0']

[2023-09-28 09:18:50,524][WARNING] Evaluation rollouts (num: 50) for 3 policy combinations: [{'agent_0': {'built_in_11': 1.0}, 'agent_1': {'built_in_11': 1.0}}, {'agent_0': {'built_in_11': 1.0}, 'agent_1': {'agent_0-default-0': 1.0}}, {'agent_0': {'agent_0-default-0': 1.0}, 'agent_1': {'agent_0-default-0': 1.0}}]
(pid=47611) [2023-09-28 09:18:51,072][INFO] TrainingManager initialized
(pid=47610) [2023-09-28 09:18:51,149][INFO] RolloutManager initialized
(pid=47606) [2023-09-28 09:19:02,415][INFO] DataPrefetcher initialized
(pid=47599) [2023-09-28 09:19:02,593][INFO] trainer_1 (local rank: 1) initialized
(pid=47609) [2023-09-28 09:19:02,603][INFO] trainer_0 (local rank: 0) initialized
Elo = dict_items([('built_in_11', 1015.631846603239), ('agent_0-default-0', 984.368153396761)])
[2023-09-28 09:30:57,920][INFO] policy_data: [('built_in_11', 'built_in_11'):{'payoff': 5.551115123125783e-17, 'score': 0.5, 'win': 0.28, 'lose': 0.28, 'my_goal': 0.43, 'goal_diff': 0.0}],[('built_in_11', 'agent_0-default-0'):{'payoff': 1.0, 'score': 1.0, 'win': 1.0, 'lose': 0.0, 'my_goal': 3.883116883116883, 'goal_diff': 3.883116883116883}],[('agent_0-default-0', 'built_in_11'):{'payoff': -1.0, 'score': 0.0, 'win': 0.0, 'lose': 1.0, 'my_goal': 0.0, 'goal_diff': -3.883116883116883}],[('agent_0-default-0', 'agent_0-default-0'):{'payoff': 0.0, 'score': 0.5, 'win': 0.25, 'lose': 0.25, 'my_goal': 0.42, 'goal_diff': 0.0}],
(pid=47605) /media/lxd/0A7AE0627AE04BCF/lzw/football_game/DB-Football/light_malib/monitor/monitor.py:59: UserWarning: Starting a Matplotlib GUI outside of the main thread will likely fail.
(pid=47605) fig = plt.figure()
(pid=47605) /media/lxd/0A7AE0627AE04BCF/lzw/football_game/DB-Football/light_malib/monitor/monitor.py:63: UserWarning: FixedFormatter should only be used together with FixedLocator
(pid=47605) ax.set_xticklabels([""] + xpid, rotation=90)
(pid=47605) /media/lxd/0A7AE0627AE04BCF/lzw/football_game/DB-Football/light_malib/monitor/monitor.py:64: UserWarning: FixedFormatter should only be used together with FixedLocator
(pid=47605) ax.set_yticklabels([""] + ypid)
[2023-09-28 09:30:58,519][INFO] payoff table:
+-------------+---------------+-------------+
| | built_in_11 | default-0 |
+=============+===============+=============+
| built_in_11 | +0 | +100 |
+-------------+---------------+-------------+
| default-0 | -100 | +0 |
+-------------+---------------+-------------+
[2023-09-28 09:30:58,520][INFO] default-0's top 10 worst opponents are:
+-------------+----------+
| policy_id | payoff |
+=============+==========+
| built_in_11 | -100.00 |
+-------------+----------+
| default-0 | +0.00 |
+-------------+----------+
[2023-09-28 09:31:10,202][WARNING] agent_0: agent_0-default-1 is initialized from last best policy agent_0-default-0
[2023-09-28 09:31:10,203][WARNING] policy agent_0-default-1 uses custom_config: {'gamma': 1.0, 'use_cuda': False, 'use_dueling': False, 'preprocess_mode': 'flatten', 'use_q_head': False, 'ppo_epoch': 5, 'num_mini_batch': 1, 'return_mode': 'new_gae', 'gae': {'gae_lambda': 0.95}, 'vtrace': {'clip_rho_threshold': 1.0, 'clip_pg_rho_threshold': 100.0}, 'use_rnn': False, 'rnn_layer_num': 1, 'rnn_data_chunk_length': 16, 'use_feature_normalization': True, 'use_popart': True, 'popart_beta': 0.99999, 'entropy_coef': 0.0, 'clip_param': 0.2, 'use_modified_mappo': False}
[2023-09-28 09:31:10,223][WARNING] ********** Generation[0] Agent[agent_0] START **********
[2023-09-28 09:31:10,223][INFO] training_desc: TrainingDesc(agent_id='agent_0', policy_id='agent_0-default-1', policy_distributions={'agent_0': {'agent_0-default-1': 1.0}, 'agent_1': OrderedDict([('built_in_11', 0.99999), ('agent_0-default-0', 1e-05)])}, share_policies=True, sync=False, stopper=<light_malib.framework.scheduler.stopper.common.win_rate_stopper.WinRateStopper object at 0x7f815d04d790>, kwargs={})
(pid=47592) [2023-09-28 09:31:10,243][WARNING] table_cfgs:DataServer uses {'capacity': 1000, 'sampler_type': 'lumrf', 'sample_max_usage': 10000, 'rate_limiter_cfg': {'min_size': 8}}
(pid=47592) [2023-09-28 09:31:10,248][INFO] DataServer created data table agent_0-default-1
(pid=47610) [2023-09-28 09:31:10,281][INFO] Rollout 1
(pid=47599) [2023-09-28 09:31:10,431][INFO] local_rank: 1 cuda_visible_devices:1
(pid=47609) [2023-09-28 09:31:10,405][INFO] local_rank: 0 cuda_visible_devices:0
(pid=47599) [2023-09-28 09:31:12,242][WARNING] trainer_1 reset to training_task TrainingDesc(agent_id='agent_0', policy_id='agent_0-default-1', policy_distributions={'agent_0': {'agent_0-default-1': 1.0}, 'agent_1': OrderedDict([('built_in_11', 0.99999), ('agent_0-default-0', 1e-05)])}, share_policies=True, sync=False, stopper=<light_malib.framework.scheduler.stopper.common.win_rate_stopper.WinRateStopper object at 0x7fd2166e3e20>, kwargs={'cfg': {'distributed': {'resources': {'num_cpus': 1, 'num_gpus': 1, 'resources': {'node:192.168.1.109': 0.01}}}, 'optimizer': 'Adam', 'actor_lr': 0.0005, 'critic_lr': 0.0005, 'opti_eps': 1e-05, 'weight_decay': 0.0, 'lr_decay': False, 'lr_decay_epoch': 2000}})
(pid=47609) [2023-09-28 09:31:12,229][WARNING] trainer_0 reset to training_task TrainingDesc(agent_id='agent_0', policy_id='agent_0-default-1', policy_distributions={'agent_0': {'agent_0-default-1': 1.0}, 'agent_1': OrderedDict([('built_in_11', 0.99999), ('agent_0-default-0', 1e-05)])}, share_policies=True, sync=False, stopper=<light_malib.framework.scheduler.stopper.common.win_rate_stopper.WinRateStopper object at 0x7f9099940400>, kwargs={'cfg': {'distributed': {'resources': {'num_cpus': 1, 'num_gpus': 1, 'resources': {'node:192.168.1.109': 0.01}}}, 'optimizer': 'Adam', 'actor_lr': 0.0005, 'critic_lr': 0.0005, 'opti_eps': 1e-05, 'weight_decay': 0.0, 'lr_decay': False, 'lr_decay_epoch': 2000}})
(pid=47609) /media/lxd/0A7AE0627AE04BCF/lzw/football_game/DB-Football/light_malib/algorithm/mappo/trainer.py:53: UserWarning: The given NumPy array is not writable, and PyTorch does not support non-writable tensors. This means writing to this tensor will result in undefined behavior. You may want to copy the array to protect its data or make it writable before converting it to a tensor. This type of warning will be suppressed for the rest of this program. (Triggered internally at /opt/conda/conda-bld/pytorch_1682343964576/work/torch/csrc/utils/tensor_numpy.cpp:206.)
(pid=47609) value = torch.FloatTensor(value)
(pid=47599) /media/lxd/0A7AE0627AE04BCF/lzw/football_game/DB-Football/light_malib/algorithm/mappo/trainer.py:53: UserWarning: The given NumPy array is not writable, and PyTorch does not support non-writable tensors. This means writing to this tensor will result in undefined behavior. You may want to copy the array to protect its data or make it writable before converting it to a tensor. This type of warning will be suppressed for the rest of this program. (Triggered internally at /opt/conda/conda-bld/pytorch_1682343964576/work/torch/csrc/utils/tensor_numpy.cpp:206.)
(pid=47599) value = torch.FloatTensor(value)
(pid=47610) [2023-09-28 09:32:56,022][WARNING] save the best model(average reward:-5092.5,average win:0.0)
(pid=47610) [2023-09-28 09:32:56,081][INFO] Rollout 2
(pid=47610) [2023-09-28 09:34:40,549][WARNING] save the best model(average reward:-3465.0,average win:0.0)
(pid=47610) [2023-09-28 09:34:40,601][INFO] Rollout 3
(pid=47611) 2023-09-28 09:35:41,233 ERROR worker.py:79 -- Unhandled error (suppress with RAY_IGNORE_UNHANDLED_ERRORS=1): ray::DistributedTrainer.optimize() (pid=47599, ip=192.168.1.109, repr=<light_malib.training.distributed_trainer.DistributedTrainer object at 0x7fd2166e3d60>)
(pid=47611) File "/media/lxd/0A7AE0627AE04BCF/lzw/football_game/DB-Football/light_malib/training/distributed_trainer.py", line 200, in optimize
(pid=47611) training_info = self.trainer.optimize(batch)
(pid=47611) File "/media/lxd/0A7AE0627AE04BCF/lzw/football_game/DB-Football/light_malib/algorithm/mappo/trainer.py", line 94, in optimize
(pid=47611) tmp_opt_result = self.loss(mini_batch)
(pid=47611) File "/media/lxd/0A7AE0627AE04BCF/lzw/football_game/DB-Football/light_malib/algorithm/common/loss_func.py", line 70, in call
(pid=47611) return tensor_cast(
(pid=47611) File "/media/lxd/0A7AE0627AE04BCF/lzw/football_game/DB-Football/light_malib/utils/general.py", line 110, in wrap
(pid=47611) rets = func(*new_args, **kwargs)
(pid=47611) File "/media/lxd/0A7AE0627AE04BCF/lzw/football_game/DB-Football/light_malib/algorithm/mappo/loss.py", line 143, in loss_compute
(pid=47611) values, action_log_probs, dist_entropy = self._evaluate_actions(
(pid=47611) File "/media/lxd/0A7AE0627AE04BCF/lzw/football_game/DB-Football/light_malib/algorithm/mappo/loss.py", line 270, in _evaluate_actions
(pid=47611) dist = torch.distributions.Categorical(logits=logits)
(pid=47611) File "/media/lxd/880AA9210AA90CEE/anaconda_envs/lzw_GRF/lib/python3.9/site-packages/torch/distributions/categorical.py", line 66, in init
(pid=47611) super().init(batch_shape, validate_args=validate_args)
(pid=47611) File "/media/lxd/880AA9210AA90CEE/anaconda_envs/lzw_GRF/lib/python3.9/site-packages/torch/distributions/distribution.py", line 62, in init
(pid=47611) raise ValueError(
(pid=47611) ValueError: Expected parameter logits (Tensor of shape (40000, 19)) of distribution Categorical(logits: torch.Size([40000, 19])) to satisfy the constraint IndependentConstraint(Real(), 1), but found invalid values:
(pid=47611) tensor([[nan, nan, nan, ..., nan, nan, nan],
(pid=47611) [nan, nan, nan, ..., nan, nan, nan],
(pid=47611) [nan, nan, nan, ..., nan, nan, nan],
(pid=47611) ...,
(pid=47611) [nan, nan, nan, ..., nan, nan, nan],
(pid=47611) [nan, nan, nan, ..., nan, nan, nan],
(pid=47611) [nan, nan, nan, ..., nan, nan, nan]], device='cuda:0',
(pid=47611) grad_fn=)
(pid=47610) [2023-09-28 09:35:41,283][INFO] Saving model agent_0 agent_0-default-1 3 to /media/lxd/0A7AE0627AE04BCF/lzw/football_game/DB-Football/./logs/gr_football/10_vs_10_psro/2023-09-28-09-18-44/agent_0/agent_0-default-1/3
Traceback (most recent call last):
File "/media/lxd/0A7AE0627AE04BCF/lzw/football_game/DB-Football/light_malib/main_pbt.py", line 126, in
main()
File "/media/lxd/0A7AE0627AE04BCF/lzw/football_game/DB-Football/light_malib/main_pbt.py", line 114, in main
runner.run()
File "/media/lxd/0A7AE0627AE04BCF/lzw/football_game/DB-Football/light_malib/framework/pbt_runner.py", line 106, in run
ray.get(training_task_ref)
File "/media/lxd/880AA9210AA90CEE/anaconda_envs/lzw_GRF/lib/python3.9/site-packages/ray/_private/client_mode_hook.py", line 105, in wrapper
return func(*args, **kwargs)
File "/media/lxd/880AA9210AA90CEE/anaconda_envs/lzw_GRF/lib/python3.9/site-packages/ray/worker.py", line 1625, in get
raise value.as_instanceof_cause()
ray.exceptions.RayTaskError(ValueError): ray::TrainingManager.train() (pid=47611, ip=192.168.1.109, repr=<light_malib.training.training_manager.TrainingManager object at 0x7f2ff2ba04f0>)
File "/media/lxd/0A7AE0627AE04BCF/lzw/football_game/DB-Football/light_malib/utils/decorator.py", line 22, in wrapper
return func(self, *args, **kwargs)
File "/media/lxd/0A7AE0627AE04BCF/lzw/football_game/DB-Football/light_malib/training/training_manager.py", line 146, in train
statistics_list = ray.get(
ray.exceptions.RayTaskError(ValueError): ray::DistributedTrainer.optimize() (pid=47609, ip=192.168.1.109, repr=<light_malib.training.distributed_trainer.DistributedTrainer object at 0x7f8bbeab0d60>)
File "/media/lxd/0A7AE0627AE04BCF/lzw/football_game/DB-Football/light_malib/training/distributed_trainer.py", line 200, in optimize
training_info = self.trainer.optimize(batch)
File "/media/lxd/0A7AE0627AE04BCF/lzw/football_game/DB-Football/light_malib/algorithm/mappo/trainer.py", line 94, in optimize
tmp_opt_result = self.loss(mini_batch)
File "/media/lxd/0A7AE0627AE04BCF/lzw/football_game/DB-Football/light_malib/algorithm/common/loss_func.py", line 70, in call
return tensor_cast(
File "/media/lxd/0A7AE0627AE04BCF/lzw/football_game/DB-Football/light_malib/utils/general.py", line 110, in wrap
rets = func(*new_args, **kwargs)
File "/media/lxd/0A7AE0627AE04BCF/lzw/football_game/DB-Football/light_malib/algorithm/mappo/loss.py", line 143, in loss_compute
values, action_log_probs, dist_entropy = self._evaluate_actions(
File "/media/lxd/0A7AE0627AE04BCF/lzw/football_game/DB-Football/light_malib/algorithm/mappo/loss.py", line 270, in _evaluate_actions
dist = torch.distributions.Categorical(logits=logits)
File "/media/lxd/880AA9210AA90CEE/anaconda_envs/lzw_GRF/lib/python3.9/site-packages/torch/distributions/categorical.py", line 66, in init
super().init(batch_shape, validate_args=validate_args)
File "/media/lxd/880AA9210AA90CEE/anaconda_envs/lzw_GRF/lib/python3.9/site-packages/torch/distributions/distribution.py", line 62, in init
raise ValueError(
ValueError: Expected parameter logits (Tensor of shape (40000, 19)) of distribution Categorical(logits: torch.Size([40000, 19])) to satisfy the constraint IndependentConstraint(Real(), 1), but found invalid values:
tensor([[nan, nan, nan, ..., nan, nan, nan],
[nan, nan, nan, ..., nan, nan, nan],
[nan, nan, nan, ..., nan, nan, nan],
...,
[nan, nan, nan, ..., nan, nan, nan],
[nan, nan, nan, ..., nan, nan, nan],
[nan, nan, nan, ..., nan, nan, nan]], device='cuda:0',
grad_fn=)
`

I am not sure whether it is a hardware issue, so I tried training with just one Titan X, but it still generated the following error message:

`
(/media/lxd/880AA9210AA90CEE/anaconda_envs/lzw_GRF) lxd@lxd-T630:/media/lxd/0A7AE0627AE04BCF/lzw/football_game/DB-Football$ python light_malib/main_pbt.py --config light_malib/expr/gr_football/expr_10_vs_10_psro.yaml
[2023-09-28 09:55:44,004][WARNING] No active cluster detected, will create local ray instance.
[2023-09-28 09:55:52,920][WARNING] ============== Cluster Info ==============
{'node_ip_address': '192.168.1.109', 'raylet_ip_address': '192.168.1.109', 'redis_address': '192.168.1.109:6379', 'object_store_address': '/tmp/ray/session_2023-09-28_09-55-44_005995_37830/sockets/plasma_store', 'raylet_socket_name': '/tmp/ray/session_2023-09-28_09-55-44_005995_37830/sockets/raylet', 'webui_url': None, 'session_dir': '/tmp/ray/session_2023-09-28_09-55-44_005995_37830', 'metrics_export_port': 58593, 'node_id': '0b4c8573ddd5462ff763c6db9c7b0cd22dbe01d81d14b7398a7e5ece'}
[2023-09-28 09:55:52,923][WARNING] * cluster resources:
{'object_store_memory': 17818028851.0, 'GPU': 2.0, 'accelerator_type:G': 1.0, 'node:192.168.1.109': 1.0, 'memory': 35636057703.0, 'CPU': 48.0}
[2023-09-28 09:55:52,923][WARNING] this worker ip: 192.168.1.109
[2023-09-28 09:55:52,924][WARNING] Automatically set master ip to local ip address: 192.168.1.109
[2023-09-28 09:55:54,333][INFO] AgentManager initialized
[2023-09-28 09:55:54,366][WARNING] use meta solver type: nash
[2023-09-28 09:55:54,844][INFO] PBTRunner psro initialized
[2023-09-28 09:55:54,845][INFO] PolicyFactory_agent_0_default new policy ctr starts at -1
[2023-09-28 09:55:54,849][WARNING] use model type: gr_football.built_in_11
(pid=37950) [2023-09-28 09:55:57,624][INFO] PolicyServer initialized
(pid=37956) [2023-09-28 09:55:57,675][INFO] DataServer initialized
[2023-09-28 09:55:58,195][INFO] Load initial policy built_in_11 from light_malib/trained_models/gr_football/11_vs_11/built_in
[2023-09-28 09:55:58,210][WARNING] use model type: gr_football.basic_11
[2023-09-28 09:55:58,257][WARNING] agent_0: agent_0-default-0 is initialized from random
[2023-09-28 09:55:58,257][WARNING] policy agent_0-default-0 uses custom_config: {'gamma': 1.0, 'use_cuda': False, 'use_dueling': False, 'preprocess_mode': 'flatten', 'use_q_head': False, 'ppo_epoch': 5, 'num_mini_batch': 1, 'return_mode': 'new_gae', 'gae': {'gae_lambda': 0.95}, 'vtrace': {'clip_rho_threshold': 1.0, 'clip_pg_rho_threshold': 100.0}, 'use_rnn': False, 'rnn_layer_num': 1, 'rnn_data_chunk_length': 16, 'use_feature_normalization': True, 'use_popart': True, 'popart_beta': 0.99999, 'entropy_coef': 0.0, 'clip_param': 0.2, 'use_modified_mappo': False}
[2023-09-28 09:55:58,286][WARNING] after initialization:

policy_ids: ['built_in_11', 'agent_0-default-0'] populations:

policy_ids:['built_in_11', 'agent_0-default-0']

policy_ids:['built_in_11', 'agent_0-default-0']

[2023-09-28 09:55:58,287][WARNING] Evaluation rollouts (num: 50) for 3 policy combinations: [{'agent_0': {'built_in_11': 1.0}, 'agent_1': {'built_in_11': 1.0}}, {'agent_0': {'built_in_11': 1.0}, 'agent_1': {'agent_0-default-0': 1.0}}, {'agent_0': {'agent_0-default-0': 1.0}, 'agent_1': {'agent_0-default-0': 1.0}}]
(pid=37940) [2023-09-28 09:55:58,899][INFO] TrainingManager initialized
(pid=37954) [2023-09-28 09:55:58,891][INFO] RolloutManager initialized
(pid=37970) [2023-09-28 09:56:08,109][INFO] trainer_0 (local rank: 0) initialized
(pid=37957) [2023-09-28 09:56:08,385][INFO] DataPrefetcher initialized
Elo = dict_items([('built_in_11', 1015.3241542955467), ('agent_0-default-0', 984.6758457044533)])
[2023-09-28 10:07:43,192][INFO] policy_data: [('built_in_11', 'built_in_11'):{'payoff': 0.0, 'score': 0.5, 'win': 0.27, 'lose': 0.27, 'my_goal': 0.5, 'goal_diff': 0.0}],[('built_in_11', 'agent_0-default-0'):{'payoff': 0.9807692307692307, 'score': 0.9903846153846154, 'win': 0.9807692307692308, 'lose': 0.0, 'my_goal': 4.035256410256411, 'goal_diff': 4.035256410256411}],[('agent_0-default-0', 'built_in_11'):{'payoff': -0.9807692307692308, 'score': 0.009615384615384616, 'win': 0.0, 'lose': 0.9807692307692308, 'my_goal': 0.0, 'goal_diff': -4.035256410256411}],[('agent_0-default-0', 'agent_0-default-0'):{'payoff': 5.551115123125783e-17, 'score': 0.5, 'win': 0.29000000000000004, 'lose': 0.29000000000000004, 'my_goal': 0.44, 'goal_diff': 0.0}],
(pid=37960) /media/lxd/0A7AE0627AE04BCF/lzw/football_game/DB-Football/light_malib/monitor/monitor.py:59: UserWarning: Starting a Matplotlib GUI outside of the main thread will likely fail.
(pid=37960) fig = plt.figure()
(pid=37960) /media/lxd/0A7AE0627AE04BCF/lzw/football_game/DB-Football/light_malib/monitor/monitor.py:63: UserWarning: FixedFormatter should only be used together with FixedLocator
(pid=37960) ax.set_xticklabels([""] + xpid, rotation=90)
(pid=37960) /media/lxd/0A7AE0627AE04BCF/lzw/football_game/DB-Football/light_malib/monitor/monitor.py:64: UserWarning: FixedFormatter should only be used together with FixedLocator
(pid=37960) ax.set_yticklabels([""] + ypid)
[2023-09-28 10:07:43,815][INFO] payoff table:
+-------------+---------------+-------------+
|             | built_in_11   | default-0   |
+=============+===============+=============+
| built_in_11 | +0            | +98         |
+-------------+---------------+-------------+
| default-0   | -98           | +0          |
+-------------+---------------+-------------+
[2023-09-28 10:07:43,816][INFO] default-0's top 10 worst opponents are:
+-------------+----------+
| policy_id   | payoff   |
+=============+==========+
| built_in_11 | -98.08   |
+-------------+----------+
| default-0   | +0.00    |
+-------------+----------+
[2023-09-28 10:07:56,080][WARNING] agent_0: agent_0-default-1 is initialized from last best policy agent_0-default-0
[2023-09-28 10:07:56,081][WARNING] policy agent_0-default-1 uses custom_config: {'gamma': 1.0, 'use_cuda': False, 'use_dueling': False, 'preprocess_mode': 'flatten', 'use_q_head': False, 'ppo_epoch': 5, 'num_mini_batch': 1, 'return_mode': 'new_gae', 'gae': {'gae_lambda': 0.95}, 'vtrace': {'clip_rho_threshold': 1.0, 'clip_pg_rho_threshold': 100.0}, 'use_rnn': False, 'rnn_layer_num': 1, 'rnn_data_chunk_length': 16, 'use_feature_normalization': True, 'use_popart': True, 'popart_beta': 0.99999, 'entropy_coef': 0.0, 'clip_param': 0.2, 'use_modified_mappo': False}
[2023-09-28 10:07:56,107][WARNING] ********** Generation[0] Agent[agent_0] START **********
[2023-09-28 10:07:56,107][INFO] training_desc: TrainingDesc(agent_id='agent_0', policy_id='agent_0-default-1', policy_distributions={'agent_0': {'agent_0-default-1': 1.0}, 'agent_1': OrderedDict([('built_in_11', 0.99999), ('agent_0-default-0', 1e-05)])}, share_policies=True, sync=False, stopper=<light_malib.framework.scheduler.stopper.common.win_rate_stopper.WinRateStopper object at 0x7fd043b98ac0>, kwargs={})
(pid=37956) [2023-09-28 10:07:56,125][WARNING] table_cfgs:DataServer uses {'capacity': 1000, 'sampler_type': 'lumrf', 'sample_max_usage': 10000, 'rate_limiter_cfg': {'min_size': 8}}
(pid=37956) [2023-09-28 10:07:56,129][INFO] DataServer created data table agent_0-default-1
(pid=37954) [2023-09-28 10:07:56,159][INFO] Rollout 1
(pid=37970) [2023-09-28 10:07:56,375][INFO] local_rank: 0 cuda_visible_devices:0
(pid=37970) [2023-09-28 10:07:57,988][WARNING] trainer_0 reset to training_task TrainingDesc(agent_id='agent_0', policy_id='agent_0-default-1', policy_distributions={'agent_0': {'agent_0-default-1': 1.0}, 'agent_1': OrderedDict([('built_in_11', 0.99999), ('agent_0-default-0', 1e-05)])}, share_policies=True, sync=False, stopper=<light_malib.framework.scheduler.stopper.common.win_rate_stopper.WinRateStopper object at 0x7fb385f97460>, kwargs={'cfg': {'distributed': {'resources': {'num_cpus': 1, 'num_gpus': 1, 'resources': {'node:192.168.1.109': 0.01}}}, 'optimizer': 'Adam', 'actor_lr': 0.0005, 'critic_lr': 0.0005, 'opti_eps': 1e-05, 'weight_decay': 0.0, 'lr_decay': False, 'lr_decay_epoch': 2000}})
(pid=37970) /media/lxd/0A7AE0627AE04BCF/lzw/football_game/DB-Football/light_malib/algorithm/mappo/trainer.py:53: UserWarning: The given NumPy array is not writable, and PyTorch does not support non-writable tensors. This means writing to this tensor will result in undefined behavior. You may want to copy the array to protect its data or make it writable before converting it to a tensor. This type of warning will be suppressed for the rest of this program. (Triggered internally at /opt/conda/conda-bld/pytorch_1682343964576/work/torch/csrc/utils/tensor_numpy.cpp:206.)
(pid=37970) value = torch.FloatTensor(value)
(pid=37954) [2023-09-28 10:09:29,829][WARNING] save the best model(average reward:-5103.75,average win:0.0)
(pid=37954) [2023-09-28 10:09:29,896][INFO] Rollout 2
(pid=37954) [2023-09-28 10:11:04,900][WARNING] save the best model(average reward:-3472.5,average win:0.0)
(pid=37954) [2023-09-28 10:11:04,950][INFO] Rollout 3
(pid=37954) [2023-09-28 10:12:38,904][WARNING] save the best model(average reward:-2661.875,average win:0.0)
(pid=37954) [2023-09-28 10:12:38,938][INFO] Rollout 4
(pid=37954) [2023-09-28 10:14:12,399][WARNING] save the best model(average reward:-2166.5,average win:0.0)
(pid=37954) [2023-09-28 10:14:12,440][INFO] Rollout 5
(pid=37960) Exception ignored in: <function Image.__del__ at 0x7f7c80696550>
(pid=37960) Traceback (most recent call last):
(pid=37960) File "/media/lxd/880AA9210AA90CEE/anaconda_envs/lzw_GRF/lib/python3.9/tkinter/__init__.py", line 4016, in __del__
(pid=37960) self.tk.call('image', 'delete', self.name)
(pid=37960) RuntimeError: main thread is not in main loop
(pid=37960) Exception ignored in: <function Variable.__del__ at 0x7f7c806dec10>
(pid=37960) Traceback (most recent call last):
(pid=37960) File "/media/lxd/880AA9210AA90CEE/anaconda_envs/lzw_GRF/lib/python3.9/tkinter/__init__.py", line 351, in __del__
(pid=37960) if self._tk.getboolean(self._tk.call("info", "exists", self._name)):
(pid=37960) RuntimeError: main thread is not in main loop
(pid=37970) [2023-09-28 10:15:54,407][WARNING] queue is full. May have bugs in training.
(pid=37960) Exception ignored in: <function Variable.__del__ at 0x7f7c806dec10>
(pid=37960) Traceback (most recent call last):
(pid=37960) File "/media/lxd/880AA9210AA90CEE/anaconda_envs/lzw_GRF/lib/python3.9/tkinter/__init__.py", line 351, in __del__
(pid=37960) if self._tk.getboolean(self._tk.call("info", "exists", self._name)):
(pid=37960) RuntimeError: main thread is not in main loop
(pid=37960) Exception ignored in: <function Variable.__del__ at 0x7f7c806dec10>
(pid=37960) Traceback (most recent call last):
(pid=37960) File "/media/lxd/880AA9210AA90CEE/anaconda_envs/lzw_GRF/lib/python3.9/tkinter/__init__.py", line 351, in __del__
(pid=37960) if self._tk.getboolean(self._tk.call("info", "exists", self._name)):
(pid=37960) RuntimeError: main thread is not in main loop
(pid=37960) Exception ignored in: <function Variable.__del__ at 0x7f7c806dec10>
(pid=37960) Traceback (most recent call last):
(pid=37960) File "/media/lxd/880AA9210AA90CEE/anaconda_envs/lzw_GRF/lib/python3.9/tkinter/__init__.py", line 351, in __del__
(pid=37960) if self._tk.getboolean(self._tk.call("info", "exists", self._name)):
(pid=37960) RuntimeError: main thread is not in main loop
(pid=37954) [2023-09-28 10:15:57,987][WARNING] save the best model(average reward:-1838.75,average win:0.0)
(pid=37954) [2023-09-28 10:15:58,037][INFO] Rollout 6
(pid=37954) [2023-09-28 10:17:20,960][WARNING] save the best model(average reward:-1609.642857142857,average win:0.0)
(pid=37954) [2023-09-28 10:17:21,004][INFO] Rollout 7
(pid=37954) [2023-09-28 10:18:54,245][WARNING] save the best model(average reward:-1433.125,average win:0.0)
(pid=37954) [2023-09-28 10:18:54,289][INFO] Rollout 8
(pid=37954) [2023-09-28 10:20:04,518][INFO] Saving model agent_0 agent_0-default-1 8 to /media/lxd/0A7AE0627AE04BCF/lzw/football_game/DB-Football/./logs/gr_football/10_vs_10_psro/2023-09-28-09-55-52/agent_0/agent_0-default-1/8
Traceback (most recent call last):
File "/media/lxd/0A7AE0627AE04BCF/lzw/football_game/DB-Football/light_malib/main_pbt.py", line 126, in <module>
main()
File "/media/lxd/0A7AE0627AE04BCF/lzw/football_game/DB-Football/light_malib/main_pbt.py", line 114, in main
runner.run()
File "/media/lxd/0A7AE0627AE04BCF/lzw/football_game/DB-Football/light_malib/framework/pbt_runner.py", line 106, in run
ray.get(training_task_ref)
File "/media/lxd/880AA9210AA90CEE/anaconda_envs/lzw_GRF/lib/python3.9/site-packages/ray/_private/client_mode_hook.py", line 105, in wrapper
return func(*args, **kwargs)
File "/media/lxd/880AA9210AA90CEE/anaconda_envs/lzw_GRF/lib/python3.9/site-packages/ray/worker.py", line 1625, in get
raise value.as_instanceof_cause()
ray.exceptions.RayTaskError(ValueError): ray::TrainingManager.train() (pid=37940, ip=192.168.1.109, repr=<light_malib.training.training_manager.TrainingManager object at 0x7efa6f4cd4c0>)
File "/media/lxd/0A7AE0627AE04BCF/lzw/football_game/DB-Football/light_malib/utils/decorator.py", line 22, in wrapper
return func(self, *args, **kwargs)
File "/media/lxd/0A7AE0627AE04BCF/lzw/football_game/DB-Football/light_malib/training/training_manager.py", line 146, in train
statistics_list = ray.get(
ray.exceptions.RayTaskError(ValueError): ray::DistributedTrainer.optimize() (pid=37970, ip=192.168.1.109, repr=<light_malib.training.distributed_trainer.DistributedTrainer object at 0x7fae7d95fd90>)
File "/media/lxd/0A7AE0627AE04BCF/lzw/football_game/DB-Football/light_malib/training/distributed_trainer.py", line 200, in optimize
training_info = self.trainer.optimize(batch)
File "/media/lxd/0A7AE0627AE04BCF/lzw/football_game/DB-Football/light_malib/algorithm/mappo/trainer.py", line 94, in optimize
tmp_opt_result = self.loss(mini_batch)
File "/media/lxd/0A7AE0627AE04BCF/lzw/football_game/DB-Football/light_malib/algorithm/common/loss_func.py", line 70, in __call__
return tensor_cast(
File "/media/lxd/0A7AE0627AE04BCF/lzw/football_game/DB-Football/light_malib/utils/general.py", line 110, in wrap
rets = func(*new_args, **kwargs)
File "/media/lxd/0A7AE0627AE04BCF/lzw/football_game/DB-Football/light_malib/algorithm/mappo/loss.py", line 143, in loss_compute
values, action_log_probs, dist_entropy = self._evaluate_actions(
File "/media/lxd/0A7AE0627AE04BCF/lzw/football_game/DB-Football/light_malib/algorithm/mappo/loss.py", line 270, in _evaluate_actions
dist = torch.distributions.Categorical(logits=logits)
File "/media/lxd/880AA9210AA90CEE/anaconda_envs/lzw_GRF/lib/python3.9/site-packages/torch/distributions/categorical.py", line 66, in __init__
super().__init__(batch_shape, validate_args=validate_args)
File "/media/lxd/880AA9210AA90CEE/anaconda_envs/lzw_GRF/lib/python3.9/site-packages/torch/distributions/distribution.py", line 62, in __init__
raise ValueError(
ValueError: Expected parameter logits (Tensor of shape (80000, 19)) of distribution Categorical(logits: torch.Size([80000, 19])) to satisfy the constraint IndependentConstraint(Real(), 1), but found invalid values:
tensor([[nan, nan, nan, ..., nan, nan, nan],
[nan, nan, nan, ..., nan, nan, nan],
[nan, nan, nan, ..., nan, nan, nan],
...,
[nan, nan, nan, ..., nan, nan, nan],
[nan, nan, nan, ..., nan, nan, nan],
[nan, nan, nan, ..., nan, nan, nan]], device='cuda:0',
grad_fn=)
`
Do you know why this happened?
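The traceback shows that the logits were already NaN when `Categorical` validated them, which suggests the non-finite values arose upstream, in the network parameters or in the input batch. A minimal sketch for narrowing that down is given here, assuming you can dump the values you want to inspect as flat lists of floats. The helper `find_non_finite` and the names in `snapshot` are hypothetical and not part of DB-Football; the sketch does not claim to identify the cause in this run.

```python
import math
from typing import Dict, Iterable, List


def find_non_finite(named_values: Dict[str, Iterable[float]]) -> List[str]:
    """Return the names whose values contain NaN or +/-inf, in input order."""
    bad = []
    for name, values in named_values.items():
        # math.isfinite is False for NaN, +inf and -inf.
        if any(not math.isfinite(v) for v in values):
            bad.append(name)
    return bad


# Hypothetical snapshot; the names are made up for illustration only.
snapshot = {
    "actor.weight": [0.1, -0.3, 0.7],
    "actor.bias": [float("nan"), 0.0],
    "obs_batch": [1.0, 2.0, float("inf")],
}
print(find_non_finite(snapshot))  # ['actor.bias', 'obs_batch']
```

With PyTorch, one way to build such a snapshot would be `{name: p.detach().flatten().tolist() for name, p in model.named_parameters()}`, run after each optimizer step until the first non-finite entry appears. This is slow, so it is only suitable for debugging.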

AttributeError: module 'tensorflow' has no attribute 'set_random_seed'

Trying to run:
python3 -m gfootball.examples.run_ppo2 --level=10_vs_10_kaggle

Gives the following error:
AttributeError: module 'tensorflow' has no attribute 'set_random_seed'

My understanding is that this happens because I am not using TensorFlow 1.15.* as instructed by GRF. However, TensorFlow 1.15.* isn't available for Python versions above 3.7, which contradicts the instructions to use Python 3.9!

Which versions of Python and TensorFlow should be used to run the scenarios?
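For context on the error itself: `tf.set_random_seed` is the TensorFlow 1.x name. TensorFlow 2.x removed it from the top-level namespace and provides `tf.compat.v1.set_random_seed` and `tf.random.set_seed` instead. The sketch below shows how a script could pick whichever one exists. The helper `resolve_seed_fn` is hypothetical, and the two stand-in modules only imitate the attribute layout so that the example runs without TensorFlow installed.

```python
from types import SimpleNamespace


def resolve_seed_fn(tf):
    """Pick a seed setter that exists on the given TensorFlow module."""
    if hasattr(tf, "set_random_seed"):
        return tf.set_random_seed  # TensorFlow 1.x name
    compat_v1 = getattr(getattr(tf, "compat", None), "v1", None)
    if compat_v1 is not None and hasattr(compat_v1, "set_random_seed"):
        return compat_v1.set_random_seed  # TF 2.x compatibility alias
    return tf.random.set_seed  # native TF 2.x API


# Stand-ins that mimic only the attributes used above (not real TensorFlow).
fake_tf1 = SimpleNamespace(set_random_seed=lambda seed: ("tf1", seed))
fake_tf2 = SimpleNamespace(
    compat=SimpleNamespace(
        v1=SimpleNamespace(set_random_seed=lambda seed: ("compat.v1", seed))
    ),
    random=SimpleNamespace(set_seed=lambda seed: ("tf2", seed)),
)

print(resolve_seed_fn(fake_tf1)(0))  # ('tf1', 0)
print(resolve_seed_fn(fake_tf2)(0))  # ('compat.v1', 0)
```

In a real script you would pass the imported `tensorflow` module. This only addresses the missing attribute. Other TensorFlow 1.x calls in the same example may still fail under TensorFlow 2.x, so it does not settle which version combination is supported.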
