
Comments (7)

mokemokechicken commented on July 23, 2024

@gooooloo

Now you draw with NTest:13. Good job!

No, my model couldn't draw with NTest:13.
My record of (0, 10, 0) means (win, loss, draw).
It is confusing...

I noticed the design of share_mtcs_info_in_self_play. It shares MCTS info among different games of the same model. This is different from the AlphaGoZero/AlphaZero paper, but I imagine it would improve self-play quality a lot. How does it work out in real practice?

Although I cannot see the effect clearly, I feel that it has good effects, especially with a small sim_per_move (~100).
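
For illustration only (hypothetical names, not the repository's actual classes), a minimal sketch of what "sharing MCTS info among games of the same model" can look like: one table of per-position statistics that every game reads and updates, cleared when the model changes.

```python
from collections import defaultdict


class SharedMCTSInfo:
    """Hypothetical sketch: per-move visit counts (N) and total values (W),
    keyed by board position and shared by all games of the same model."""

    def __init__(self):
        self.N = defaultdict(lambda: [0] * 64)    # visit counts per square/move
        self.W = defaultdict(lambda: [0.0] * 64)  # accumulated values per move

    def reset(self):
        # Cleared when a new model is loaded, so stale statistics from an
        # older model do not steer the new model's self-play.
        self.N.clear()
        self.W.clear()


if __name__ == "__main__":
    shared = SharedMCTSInfo()
    key = "initial-position"      # in practice a hashable board encoding
    shared.N[key][19] += 1        # MCTS of game 1 visits move 19 once...
    print(shared.N[key][19])      # ...and game 2 starts from that count: 1
```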

And how much memory usage does it bring?

I think it doesn't increase very much.
An expanded node consumes about 200 bytes (64*3 + α) of memory.
When sim_per_move is 400, 400 * 200 B = 80 kB per move, so 60 moves/game * 80 kB ≈ 4.8 MB per game.
Actually, I can't see the increase.
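
To make that estimate explicit, a quick back-of-the-envelope sketch using the numbers from this thread (it is an upper bound, since it treats every simulation as expanding a brand-new node, while shared nodes are in fact reused across moves and games):

```python
# Rough memory estimate for the shared MCTS info (upper bound).
bytes_per_node = 200     # ~64*3 bytes of arrays + overhead, as estimated above
sim_per_move = 400
moves_per_game = 60
games_per_model = 1000   # figure used later in this thread

per_move = sim_per_move * bytes_per_node    # 80,000 B  = 80 kB
per_game = moves_per_game * per_move        # 4,800,000 B ≈ 4.8 MB
per_model = games_per_model * per_game      # ≈ 4.8 GB before the buffer is cleared

print(f"{per_move / 1e3:.0f} kB/move, "
      f"{per_game / 1e6:.1f} MB/game, "
      f"{per_model / 1e9:.1f} GB/model")
```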

gooooloo commented on July 23, 2024

Thanks for the explanation! @mokemokechicken

My record of (0, 10, 0) means (win, loss, draw).

I see. Sorry, I didn't read your description carefully. Actually, you have that in the .md file; I only checked the diff of the commit.

An expanded node consumes about 200 bytes (64*3 + α) of memory.
When sim_per_move is 400, 400 * 200 B = 80 kB per move, so 60 moves/game * 80 kB ≈ 4.8 MB per game.

Good analysis! And you clear that buffer every time a new model is loaded. Suppose you play 1000 games per model (at 7.2 seconds per game, that is 2 hours); then it is at most about 4.8 GB.

I feel that it has good effects, especially with a small sim_per_move (~100)

Interesting. I was thinking it would also help with a big sim_per_move (e.g. 800), unless the NN-predicted node values are quite accurate.

Another thought of mine: maybe you could share that info across self-play processes. I imagine a simple (imperfect) way: when a game is done, compute the delta part, send it to a "shared MCTS info manager", wait for the manager to apply this delta, and pull the new MCTS info for the new game.
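
A minimal sketch of that delta idea, with hypothetical names (`info_manager`, the queue/pipe layout) rather than anything from the repository: each worker sends only the (ΔN, ΔW) increments of a finished game to one manager process, which folds them into a master table and can hand a fresh snapshot to the next game.

```python
import multiprocessing as mp
from collections import defaultdict


def info_manager(delta_queue, snapshot_conn):
    """Hypothetical 'shared MCTS info manager': applies deltas from workers."""
    master_n = defaultdict(int)    # key: (position, move) -> visit count N
    master_w = defaultdict(float)  # key: (position, move) -> total value W
    while True:
        delta = delta_queue.get()
        if delta is None:          # shutdown signal
            break
        for key, (dn, dw) in delta.items():
            master_n[key] += dn
            master_w[key] += dw
        # A worker starting a new game could now pull a fresh snapshot.
        snapshot_conn.send((dict(master_n), dict(master_w)))


if __name__ == "__main__":
    delta_q = mp.Queue()
    parent_conn, child_conn = mp.Pipe()
    proc = mp.Process(target=info_manager, args=(delta_q, child_conn))
    proc.start()

    # One finished game's delta: (position, move) keys with (ΔN, ΔW) values.
    delta_q.put({("pos-A", 19): (3, 1.5), ("pos-A", 26): (1, -0.5)})
    print(parent_conn.recv())      # snapshot for the next game

    delta_q.put(None)
    proc.join()
```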

mokemokechicken commented on July 23, 2024

@gooooloo

Another thought of mine: maybe you could share that info across self-play processes. I imagine a simple (imperfect) way: when a game is done, compute the delta part, send it to a "shared MCTS info manager", wait for the manager to apply this delta, and pull the new MCTS info for the new game.

Yes, it is possible and interesting.
I am also concerned that sharing MCTS info brings bad effects; for example, if it is carried across model versions, updating the model contributes less.
So your idea is better, because sharing MCTS info only among games of the same model avoids that problem.

By the way, I noticed that the current implementation is not enough.
The "value" of tree-search results is not propagated to the parent (and ancestor) nodes.
It is necessary to keep all nodes up to the root (initial state) and add newly searched values (N and W) to them after simulations.
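
In other words, the search statistics should be backed up along the whole path to the initial position, not only inside the subtree of the current move. A generic sketch of that backup step (standard MCTS backpropagation with sign flipping, not code taken from the repository):

```python
class Node:
    def __init__(self, parent=None, move=None):
        self.parent = parent   # kept all the way up to the root (initial state)
        self.move = move
        self.n = 0             # visit count N
        self.w = 0.0           # total value W


def backup(leaf, value):
    """Add the searched value to the leaf and every ancestor up to the root,
    flipping the sign at each ply because the players alternate."""
    node, v = leaf, value
    while node is not None:
        node.n += 1
        node.w += v
        node = node.parent
        v = -v


if __name__ == "__main__":
    root = Node()
    child = Node(parent=root, move="C4")
    leaf = Node(parent=child, move="E3")
    backup(leaf, +1.0)
    print(root.n, root.w, child.w)   # 1 1.0 -1.0
```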

gooooloo commented on July 23, 2024

It is necessary to keep all nodes up to the root (initial state) and add newly searched values (N and W) to them after simulations.

If I understand correctly, that will increase memory usage a lot? Well, it seems to depend on how fast self-play is compared to model updating...

mokemokechicken commented on July 23, 2024

@gooooloo

If I understand correctly, that will increase memory usage a lot? Well, it seems to depend on how fast self-play is compared to model updating...

It will not increase memory usage at all, because it just updates the N and W of the parent (and ancestor) nodes.
However, on reflection, simply adding to N is not good, because almost the same moves would be selected in the next game.
I will try to implement the concept to make it clear and test it.

AranKomat commented on July 23, 2024

Doesn't sharing MCTS info discourage exploration? The initial values of n, w and q are inherited from the shared MCTS info, and then the process/thread runs simulations to add further values to these three quantities and decides on a move based on them. But this last step of moving forward seems to be heavily influenced by the shared info. If the exploration of the first turn is poorly performed, then the second turn's outcome becomes similar, and so does the n-th turn's. On the other hand, if enough variation is achieved while using shared info, the shared info is not useful except for the first and possibly second turns. Thoughts?

mokemokechicken commented on July 23, 2024

@AranKomat

As you pointed out, I think that sharing MCTS info discourages exploration.

The aim of sharing MCTS info is to encourage searching other moves at difficult positions.
A difficult position means one where several moves have almost the same (N, W).
By sharing it, in the next game the losing side is encouraged to select another move, because it knows the previous move was bad and another one seems good.
For example, it is like a "post mortem" in chess.
※ I fixed the first move of black to "C4" for effective sharing.

I investigated some series of games. They always played different moves in the middle of the game. So my aim is achieved to a certain extent.
However, it is true that sharing it discourages exploration.
Sharing info is reset every N games (currently N=5).
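
A minimal sketch of those two measures in a self-play loop, with hypothetical names (the repository's actual loop is more involved): black's first move is fixed to C4, and the shared table is cleared every 5 games.

```python
RESET_EVERY = 5          # sharing info is reset every N games (N=5 here)
FIRST_BLACK_MOVE = "C4"  # fixed first move so shared statistics line up

shared_info = {}         # hypothetical (position -> per-move N, W) table


def self_play(num_games):
    for game_index in range(num_games):
        if game_index % RESET_EVERY == 0:
            shared_info.clear()      # limit how long old lines persist
        moves = [FIRST_BLACK_MOVE]   # every game starts from C4
        # ... remaining moves would be chosen by MCTS using `shared_info` ...
        yield moves


if __name__ == "__main__":
    for game in self_play(3):
        print(game)
```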
