Comments (8)

gcp commented on August 15, 2024

Which temperature parameter do you mean?

  1. In AZ, the temperature for move selection in self-play is fixed at 1, so the engine picks moves in proportion to their visit counts. I have no idea why you think the temperature should decrease with the length of the game. If the goal is only to ensure divergence (rather than to increase exploration), that would be reasonable, and it would match AlphaGo Zero, which used t=1 for the first 30 moves only. But exploration is good!

  2. There is a cfg_softmax_temp that acts as an operator on the network outputs. Its main use is to allow some further tuning after the best network has been established. It also interacts with the UCT parameter.
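
As a rough illustration of the second parameter, here is a minimal sketch of a softmax with a temperature divisor applied to raw network outputs. The function name `softmax_with_temperature` and the example values are hypothetical, and this is an assumption about how such a parameter typically works, not a copy of how cfg_softmax_temp is implemented in the engine.

```python
import math

def softmax_with_temperature(logits, temp=1.0):
    """Turn raw outputs into probabilities, dividing by `temp` first.

    Hypothetical helper for illustration only; not the engine's code.
    temp < 1 sharpens the distribution, temp > 1 flattens it.
    """
    if temp <= 0:
        raise ValueError("temp must be positive")
    scaled = [x / temp for x in logits]
    peak = max(scaled)  # subtract the maximum for numerical stability
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]

logits = [2.0, 1.0, 0.0]  # made-up network outputs for three moves
sharp = softmax_with_temperature(logits, temp=0.5)
flat = softmax_with_temperature(logits, temp=2.0)
print(sharp, flat)
```

If these probabilities are used as priors in a PUCT-style search, a flatter prior spreads the exploration bonus over more moves, which is one plausible reading of the interaction with the UCT parameter mentioned above.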

from leela-chess.

will-iam commented on August 15, 2024

Thank you, I mixed up the two parameters.
Now, referring to the first one, the paper says: "For the first 30 moves of each game, the temperature is set to τ = 1; this selects moves proportionally to their visit count in MCTS, and ensures a diverse set of positions are encountered. For the remainder of the game, an infinitesimal temperature is used, τ→0."
From this I understood that, deep in the search, the temperature should decay.
Sorry for being a beginner, but what do you mean when you say that it ensures divergence? And why would it be reasonable?

gcp commented on August 15, 2024

> I understood that deep in the search, the temperature should decay.

This parameter has nothing to do with the search or the search depth. It is applied to the final search output. (And it is constant at 1 in AZ, whereas it varies in AlphaGo Zero.)

> what do you mean when you say that it ensures divergence? and why would it be reasonable?

The idea is that generating more self-play games only helps if the games are different. In AlphaGo Zero, there was additional randomness from rotating the board randomly, which is not present in chess. If you are only interested in playing different games (rather than also exploring moves the current network considers less good), it is reasonable to randomize only early on. At a certain point, the game will have diverged already.
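
The "randomize only early on" idea can be sketched as follows. This is a minimal illustration, assuming a hypothetical `pick_move` helper, a made-up visit-count table, and a cutoff of 30 moves as in the quoted paper passage; it is not the project's actual self-play code.

```python
import random

def pick_move(visits, move_number, random_moves=30, rng=random):
    """Choose a move from a dict mapping move -> visit count.

    Hypothetical helper: for the first `random_moves` moves, sample in
    proportion to visit counts (temperature 1); afterwards, play the
    most-visited move (temperature -> 0).
    """
    moves = list(visits)
    if move_number < random_moves:
        weights = [visits[m] for m in moves]
        return rng.choices(moves, weights=weights, k=1)[0]
    return max(moves, key=lambda m: visits[m])

visits = {"e4": 70, "d4": 25, "a3": 5}  # made-up search result
early = pick_move(visits, move_number=3)   # random, weighted by visits
late = pick_move(visits, move_number=45)   # always the most-visited move
print(early, late)
```

With a constant temperature of 1, as described for AZ, the cutoff simply never applies and every move is sampled.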

will-iam commented on August 15, 2024

Thanks a lot!

jkiliani commented on August 15, 2024

The infinitesimal temperature τ→0 refers to a formula in the AlphaGo Zero paper, which sets the move probability (before normalization) to N^(1/τ), where N is the move's visit count. For τ=1, the move probability is proportional to the visit count. For τ→0, selection becomes greedy, i.e. the move with the highest visit count is always selected. Writing τ→0 rather than τ=0 is a mathematical convention, since you're not allowed to divide by zero.
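
A minimal sketch of that formula, with the τ→0 limit handled as an explicit special case. The function name `move_probabilities` and the visit counts are hypothetical; dividing by the largest count before exponentiating is an implementation choice made here to avoid overflow, and it does not change the normalized result.

```python
def move_probabilities(visit_counts, tau):
    """Return normalized probabilities proportional to N^(1/tau).

    Hypothetical helper for illustration. tau == 0 is treated as the
    limit tau -> 0: all probability goes to the most-visited move(s).
    """
    if tau < 0:
        raise ValueError("tau must be non-negative")
    best = max(visit_counts)
    if best <= 0:
        raise ValueError("at least one move needs a positive visit count")
    if tau == 0:
        weights = [1.0 if n == best else 0.0 for n in visit_counts]
    else:
        # (N / best)^(1/tau) stays in [0, 1], so small tau cannot overflow.
        weights = [(n / best) ** (1.0 / tau) for n in visit_counts]
    total = sum(weights)
    return [w / total for w in weights]

counts = [70, 25, 5]  # made-up visit counts
proportional = move_probabilities(counts, tau=1.0)
nearly_greedy = move_probabilities(counts, tau=0.01)
greedy = move_probabilities(counts, tau=0)
print(proportional, nearly_greedy, greedy)
```

Lowering τ between 1 and 0 moves the distribution smoothly from proportional sampling toward always picking the most-visited move.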

will-iam commented on August 15, 2024

In the current self-play implementation, every chosen move is the best one, right? How do we ensure divergence then? Is Dirichlet noise enough to keep us from producing the same game over and over? Maybe there is another random part in the search, but I don't see it.

jkiliani commented on August 15, 2024

In Leela Zero, additional randomisation is provided by applying a random symmetry (rotation/reflection) to the board before network evaluation. That is harder to do in chess, but may be possible, see #25. If not, a temperature larger than 0 will provide some degree of randomness. AlphaZero actually uses τ=1 for self-play, so there's plenty of divergence there.
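
For readers unfamiliar with the symmetry trick, here is a minimal sketch of picking one of the eight rotations/reflections of a square board. The `apply_symmetry` helper, its index encoding, and the 3x3 example board are all hypothetical and are not taken from Leela Zero's source.

```python
import random

def apply_symmetry(board, index):
    """Return a transformed copy of a square board (list of rows).

    Hypothetical encoding: index 0..7, where indices 4..7 mirror the
    board left-right first, and index % 4 gives the number of
    clockwise 90-degree rotations applied afterwards.
    """
    if not 0 <= index < 8:
        raise ValueError("index must be in 0..7")
    result = [list(row) for row in board]
    if index >= 4:
        result = [row[::-1] for row in result]  # mirror left-right
    for _ in range(index % 4):
        # Reverse the row order, then transpose: a clockwise quarter turn.
        result = [list(row) for row in zip(*result[::-1])]
    return result

board = [[1, 2, 3],
         [4, 5, 6],
         [7, 8, 9]]  # made-up board contents
symmetry = random.randrange(8)  # a fresh random symmetry per evaluation
print(symmetry, apply_symmetry(board, symmetry))
```

In a real engine the move probabilities returned by the network would also have to be mapped back through the inverse symmetry, which this sketch leaves out.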

will-iam commented on August 15, 2024

Thanks, and #28 answers my question too.
