Hi @Zeta36
As I promised I've done (just in one day, I had no more time) an adaptation of the reversi-zero project @mokemokechicken did into a chess version: https://github.com/Zeta36/chess-alpha-zero
It's great!! And you made it in just one day!
Then I executed the evaluation process and it worked fine. The overfitted model was able to defeat the original random model from the beginning 100% of the time (coincidence??).
Oh, it is similar to my case.
I also changed the resign function a little. Chess is not like Go or Reversi, where a game always finishes in more or less the same number of moves. In chess the game can end in many ways (checkmate, stalemate, etc.) and self-play could take more than 200 moves before reaching a final position (normally a draw). So I decided to cut off play once one player has more than 13 points of advantage (this score is computed as usual from the value of the pieces: the queen is worth 10, rooks 5.5, etc.).
I agree with your method of using another trusted judgement.
The threshold is always a difficult problem...
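For readers who want to see what such a cut-off might look like, here is a minimal sketch, not the project's actual code. It assumes the position is available as a FEN string. Only the queen (10) and rook (5.5) values and the 13-point threshold come from the comment above; the bishop, knight and pawn values and the names `PIECE_VALUES`, `material_balance` and `should_cut_off` are hypothetical placeholders.

```python
# Queen = 10 and rook = 5.5 are taken from the comment above.
# Bishop, knight and pawn values are ASSUMPTIONS for illustration only.
PIECE_VALUES = {"q": 10.0, "r": 5.5, "b": 3.5, "n": 3.0, "p": 1.0}

# "More than 13 points of advantage", as described in the comment above.
CUTOFF_THRESHOLD = 13.0


def material_balance(fen: str) -> float:
    """Return White's material minus Black's, read from the board field of a FEN."""
    board_field = fen.split()[0]
    balance = 0.0
    for ch in board_field:
        value = PIECE_VALUES.get(ch.lower())
        if value is None:
            # Digits (empty squares), '/' separators and kings score nothing.
            continue
        # Uppercase letters are White pieces, lowercase are Black pieces.
        balance += value if ch.isupper() else -value
    return balance


def should_cut_off(fen: str, threshold: float = CUTOFF_THRESHOLD) -> bool:
    """True when either side is ahead by strictly more than `threshold` points."""
    return abs(material_balance(fen)) > threshold
```

A self-play loop could call `should_cut_off` after each move and score the game for the side that is ahead. Whether 13 is a good threshold is exactly the open question raised above.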
As you can imagine, with my poor machine I could not fully test the project beyond these tiny functionality tests. So I'd really appreciate it if you could spare some free time on your GPUs to test this implementation in a more serious way. Both of you can of course be collaborators on the project if you wish.
Also, I don't know if I introduced any theoretical bugs in this adaptation to chess, and I'd appreciate any comments from your side on that too.
Unfortunately, I don't have free GPU time now...
However, I would like to run it if my GPU has free time.
FYI:
Also, I applied to https://www.tensorflow.org/tfrc/. I do not know if it will be accepted...
from reversi-alpha-zero.
@Zeta36 good job!
Just one thought: maybe Connect4 is easy enough that it doesn't even need an accurate NN? Would MCTS alone make the AI strong enough?
Besides, Connect4 is a very unbalanced game. Whoever plays first should always win, which makes AI training much easier. So how about Connect5 with a specific opening pattern, such as "11D (瑞星): equal" in https://en.wikipedia.org/wiki/Renju_opening_pattern? That seems to be a more persuasive game for the AlphaGoZero algorithm.
Hi, guys. I refactored my APV_MCTS.py; it now uses a hash table the way you guys did. But I found it confusing that var_u is initialized but never used in the ReversiPlayer. Plus, my hash table will initialize a big chunk of numpy array [5x362] for N,W,Q,U,P. The code runs much faster because of the data structure advantage in comparison to the previous tree data structure. Thanks for the inspiration! It is also easy to prune the hash table if the key includes the depth of the game. Have you guys run into memory exhaustion?
Hi @yhyu13
Thanks for the message!
However, I would appreciate it if you could split issues by topic.
Hi, guys. I refactored my APV_MCTS.py; it now uses a hash table the way you guys did. But I found it confusing that var_u is initialized but never used in the ReversiPlayer.
Yes, that's right. It is unnecessary.
I thought about refactoring but I forgot...
Plus, my hash table will initialize a big chunk of numpy array [5x362] for N,W,Q,U,P. The code runs much faster because of the data structure advantage in comparison to the previous tree data structure. Thanks for the inspiration!
Oh, it is wonderful!
Have you guys run into memory exhaustion?
No, I haven't.
The amount of memory consumed is proportional to the number of expanded nodes.
In my Reversi implementation, the number of explorations per move is small (~500), and since the number of moves is also small, it was hardly a problem (up to about 500 * 64 nodes).
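To make the memory discussion concrete, here is a rough sketch of a hash-table node store keyed by game depth, along the lines described above. It is not the code from either project: the class `NodeTable`, its methods, the `(depth, state_key)` key layout and the `float32` dtype are all assumptions for illustration. Only the [5x362] shape for N, W, Q, U, P and the idea of pruning by depth come from the discussion.

```python
import numpy as np

NUM_STATS = 5      # rows for N, W, Q, U, P, as in the [5x362] array above
NUM_ACTIONS = 362  # 19x19 board points plus pass (Go-sized, per the shape above)


class NodeTable:
    """Hypothetical MCTS statistics store: one array per expanded node."""

    def __init__(self):
        # Key is (depth, state_key); including depth makes pruning cheap.
        self.table = {}

    def get(self, depth, state_key):
        """Return the stats array for a node, allocating it on first visit."""
        key = (depth, state_key)
        stats = self.table.get(key)
        if stats is None:
            # float32 is an assumption; float64 would double the memory.
            stats = np.zeros((NUM_STATS, NUM_ACTIONS), dtype=np.float32)
            self.table[key] = stats
        return stats

    def prune(self, current_depth):
        """Drop nodes from moves already played; return how many were removed."""
        stale = [key for key in self.table if key[0] < current_depth]
        for key in stale:
            del self.table[key]
        return len(stale)

    def nbytes(self):
        """Total bytes held by the stats arrays (excludes dict overhead)."""
        return sum(stats.nbytes for stats in self.table.values())
```

Under these assumptions each node costs 5 * 362 * 4 = 7,240 bytes, so memory grows linearly with the number of expanded nodes, as noted above. Calling `prune` after every real move keeps only the nodes at or below the current position's depth.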
Hello, @mokemokechicken and @yhyu13 .
I've done today a new version of the Reversi Zero project. This time I adapted it to the game Connect4: https://github.com/Zeta36/connect4-alpha-zero
@mokemokechicken I'm really in love with your implementation. You did it (and DeepMind designed it) in a way that lets me apply it easily to any new environment I can imagine.
Moreover, Connect4 is an easier game and I could train the model without a GPU. The results are amazing. The model is able to learn to play well in only 3 generations in a couple of hours (with just an Intel i5 CPU).
I insist, you've done GREAT work, friend.
It's a pity I don't have a powerful enough machine to check whether the chess version is able to learn to play well.
Great!
I was thinking about trying it with that game too!
However, this is the first time I've heard the name Connect4 (^^;
I am happy to see your implementation of various games!!
Moreover, Connect4 is an easier game and I could train the model without a GPU. The results are amazing. The model is able to learn to play well in only 3 generations in a couple of hours (with just an Intel i5 CPU).
That's interesting and a nice property.
Just thinking:
As @gooooloo and @yhyu13 said,
I am interested in "the version using only MCTS" and "the version using only the policy move (simulation_num_per_move=1)".