Topic: muzero

Something interesting about MuZero

👇 Here are 39 public repositories matching this topic...

abrahamabel / genesiszero

GenesisZERO: potential applications for MCTS agents with LLMs for sequential decision-making

stochastic-muzero mcts-algorithm monte-carlo-tree-search deep-reinforcement-learning alphazero gym gym-environment muzero muzero-stochastic reinforcement-learning

abrahamabel / muzero-gdm_pseudo_code

A notebook implementation of the pseudocode from the original MuZero paper

User: abrahamabel

jupyter-notebook mcts muzero muzero-pseudocode python

alexzajac / muzero_experiments

A set of experiments and human-playing comparisons with the MuZero agent from Google DeepMind, made as part of a research project with École Polytechnique.

User: alexzajac

Home Page: https://github.com/alexZajac/muzero_experiments/blob/master/README.md

reinforcement-learning-algorithms artificial-intelligence deepmind muzero python

antoniovandijck / blackjackrl

Deep Q-learning black-box strategies for casino games

User: antoniovandijck

blackjack deep-learning deep-neural-networks deep-q-network deep-reinforcement-learning machine-learning mlx muzero q-learning-algorithm reinforcement-learning

atze00 / muzero-cartpole


User: atze00

cartpole muzero reinforcement-learning replay

bellerb / chappie.ai

Generalized AI, written in Python 3, to perform a multitude of tasks

User: bellerb

ml ai muzero mcts chess-ai pytorch attention-mechanism transformer perceiverio perceiver

benborder / drla

C++ Deep Reinforcement Learning Agent library

User: benborder

cpp libtorch deep-reinforcement-learning pytorch reinforcement-learning dreamer muzero ppo

benborder / drla-atari

Trains deep reinforcement learning agents in Atari environments via the DRLA library.

User: benborder

atari cpp deep-reinforcement-learning dreamer libtorch muzero ppo pytorch reinforcement-learning

benborder / drla-sim

Trains a deep reinforcement learning agent in simulation testbed environments with the DRLA library.

User: benborder

cartpole connect4 cpp deep-reinforcement-learning dreamer libtorch muzero ppo pytorch reinforcement-learning

bigballon / toward-agz

Materials for AlphaGo

User: bigballon

alphago alphago-zero muzero deep deep-learning artificial-intelligence

chukwumachukwuma / enyimba_ai

Applying AlphaZero Self-Play Tactics to LLaMA for Enhanced Chatbot Interaction

User: chukwumachukwuma

ai alphazero artificial-intelligence chatbot generative-ai llama2 llms machine-learning muzero natural-language-processing policy-evaluation prompt-engineering reinforcement-learning rlhf strategy

cogitontnu / muzero

An implementation of the MuZero algorithm by Google DeepMind. Research paper here: https://arxiv.org/abs/1911.08265

Organization: cogitontnu

muzero

dhdev0 / muzero

PyTorch implementation of MuZero for Gym environments. It supports any Discrete, Box, and Box2D configuration for the action space and observation space.

User: dhdev0

arxiv arxiv-papers deep-learning deep-reinforcement-learning machine-learning monte-carlo-tree-search muzero neural-network python3 pytorch reinforcement-learning resnetv1 resnetv2 rl gym gym-environments lstm transformer

PyTorch implementation of MuZero Unplugged for Gym environments. This algorithm supports a wide range of action and observation spaces, both discrete and continuous.

User: dhdev0

deep-learning deep-reinforcement-learning gym lstm machine-learning neural-network python3 pytorch reinforcement-learning transformer

dhdev0 / stochastic-muzero

PyTorch implementation of Stochastic MuZero for Gym environments. This algorithm supports a wide range of action and observation spaces, both discrete and continuous.

User: dhdev0

arxiv-papers machine-learning offline-reinforcement-learning online-reinforcement-learning muzero-stochastic stochastic-muzero deep-reinforcement-learning gym-environments lstm monte-carlo-tree-search

fpga-tom / pyzero


User: fpga-tom

muzero

hayashimasa / robust_muzero

A robust variant of MuZero

User: hayashimasa

deep-reinforcement-learning muzero robust-control pytorch

hr0nix / omega

A number of agents (PPO, MuZero) with a Perceiver-based NN architecture that can be trained to achieve goals in NetHack/MiniHack environments.

User: hr0nix

flax jax mcts minihack model-based-reinforcement-learning model-based-rl muzero nethack reinforcement-learning rlax

huawei-noah / xingtian

XingTian is a componentized library for the development and verification of reinforcement learning algorithms

Organization: huawei-noah

impala dqn ppo muzero qmix reinforcement-learning-algorithms

hwhitetooth / jax_muzero

An implementation of MuZero in JAX.

User: hwhitetooth

reinforcement-learning deep-learning deep-reinforcement-learning model-based-reinforcement-learning muzero jax

itomigna2 / muesli-cartpole

Simple Muesli RL algorithm implementation (PyTorch)

User: itomigna2

cartpole-v1 colab model-based-rl muesli deep-learning muzero reinforcement-learning

itomigna2 / muesli-lunarlander

Muesli RL algorithm implementation (PyTorch) (LunarLander-v2)

User: itomigna2

colab deep-learning lunarlander-v2 model-based-rl muesli muzero reinforcement-learning

jianzhnie / rlzero

A clean and easy implementation of MuZero, AlphaZero, and self-play reinforcement learning algorithms for any game.

User: jianzhnie

alpha-zero mcts muzero reinforcement-learning self-play multi-agent

johan-gras / muzero

A structured implementation of MuZero

User: johan-gras

muzero world-models reinforcement-learning tensorflow

kaesve / muzero

A clean implementation of MuZero and AlphaZero following the AlphaZero General framework. Train and pit both algorithms against each other, and investigate the reliability of learned MuZero MDP models.

User: kaesve

muzero alphazero reinforcement-learning tensorflow tensorflow2 mcts tf2 deep-learning deep-reinforcement-learning