Maybe a stupid and useless question but what's the point of the temperature parameter


Temperature parameter (leela-chess issue, 8 comments, closed)

glinscott commented on August 15, 2024

Temperature parameter

from leela-chess.

Comments (8)

gcp commented on August 15, 2024

Which temperature parameter do you mean?

In AZ, they set the temperature to a fixed 1 for move selection in self-play, so the engine picks moves in proportion to their visit counts. I have no idea why you think the temperature should decrease with the length of the game. If the goal is only to ensure divergence (rather than to increase exploration), that would be reasonable (and it would match AlphaGo Zero, which originally had t=1 for the first 30 moves only). But exploration is good!
There is also cfg_softmax_temp, which is applied to the network outputs. Its main use is to allow some further tuning once the best network has been established. It also interacts with the UCT parameter.
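
For illustration, here is a minimal sketch of what a softmax temperature does, assuming it divides the raw policy outputs before the softmax is taken. The function name and the example values are hypothetical and are not taken from the leela-chess code.

```python
import math

def softmax_with_temperature(logits, temperature=1.0):
    """Turn raw network outputs into probabilities, scaled by a temperature.

    Hypothetical helper: assumes the temperature divides the raw outputs.
    """
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the maximum for numerical stability
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]

logits = [2.0, 1.0, 0.0]                          # made-up policy outputs
sharp = softmax_with_temperature(logits, 0.5)     # more peaked on the top move
neutral = softmax_with_temperature(logits, 1.0)   # plain softmax
flat = softmax_with_temperature(logits, 2.0)      # closer to uniform
```

A temperature below 1 concentrates probability on the network's preferred move, and a temperature above 1 spreads it out, which is why this kind of knob can be tuned after training.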


will-iam commented on August 15, 2024

Thank you, I mixed up the two parameters.
Now referring to the first one: "For the first 30 moves of each game, the temperature is set to τ = 1; this selects moves proportionally to their visit count in MCTS, and ensures a diverse set of positions are encountered. For the remainder of the game, an infinitesimal temperature is used, τ→0". From this I understood that deep in the search, the temperature should decay.
Sorry for being a beginner, but what do you mean when you say that it ensures divergence? And why would it be reasonable?


gcp commented on August 15, 2024

I understood that deep in the search, the temperature should decay.

This parameter has nothing to do with the search or the search depth. It is applied to the final search output. (And it is a constant 1 in AZ, rather than variable as in AlphaGo Zero.)

what do you mean when you say that it ensures divergence? And why would it be reasonable?

The idea is that generating more self-play games only helps if the games are different. In AlphaGo Zero, there was additional randomness from randomly rotating the board, which is not available in chess. If you are only interested in playing different games (rather than also exploring moves the current network considers less good), it is reasonable to randomize only early on. At a certain point, the game will already have diverged.
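
A rough sketch of "randomize only early on", assuming a hypothetical cutoff of 30 moves like the AlphaGo Zero setting quoted above. The function and parameter names are made up for illustration and do not come from the leela-chess code.

```python
import random

def select_move(visit_counts, move_number, random_moves=30, rng=random):
    """Sample by visit count early in the game, then play the most-visited move.

    visit_counts: root visit counts, one per legal move (hypothetical input).
    random_moves: assumed cutoff after which selection becomes greedy.
    """
    indices = range(len(visit_counts))
    if move_number < random_moves:
        # Temperature 1: probability proportional to visit count.
        return rng.choices(indices, weights=visit_counts, k=1)[0]
    # Temperature -> 0: always the most-visited move.
    return max(indices, key=lambda i: visit_counts[i])

counts = [10, 60, 30]
late_choice = select_move(counts, move_number=40)
early_choices = {select_move(counts, 0, rng=random.Random(0)) for _ in range(1)}
```

Setting random_moves larger than any game length would give the AZ behaviour of a constant temperature of 1 for the whole game.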


will-iam commented on August 15, 2024

Thanks a lot!


jkiliani commented on August 15, 2024

The infinitesimal temperature τ→0 refers to a formula in the AlphaGo Zero paper, which sets the move probability (before normalization) to N^(1/τ), where N is the move's visit count. For τ=1, the move probability is proportional to the visit count. For τ→0, selection becomes greedy, i.e. the move with the highest visit count is always selected. It is written as the limit τ→0 rather than τ=0 because 1/τ is undefined at zero.
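
As a small sketch of that formula (the function name and the visit counts are made up for illustration), the limit τ→0 can be handled as an explicit greedy case:

```python
def move_probabilities(visit_counts, tau):
    """Probabilities proportional to N^(1/tau); tau == 0 means the greedy limit."""
    if tau < 0:
        raise ValueError("tau must be non-negative")
    peak = max(visit_counts)
    if peak <= 0:
        raise ValueError("at least one move needs a positive visit count")
    if tau == 0:
        # Limit tau -> 0: all probability on the most-visited move(s).
        winners = [n == peak for n in visit_counts]
        return [1.0 / sum(winners) if w else 0.0 for w in winners]
    # Dividing by the peak first keeps large exponents from overflowing.
    weights = [(n / peak) ** (1.0 / tau) for n in visit_counts]
    total = sum(weights)
    return [w / total for w in weights]

counts = [60, 30, 10]
proportional = move_probabilities(counts, 1.0)   # roughly [0.6, 0.3, 0.1]
nearly_greedy = move_probabilities(counts, 0.1)  # almost all mass on the first move
greedy = move_probabilities(counts, 0)           # [1.0, 0.0, 0.0]
```

Small positive values of τ already behave almost like the greedy limit, because the exponent 1/τ magnifies any difference in visit counts.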


will-iam commented on August 15, 2024

In the current self-play implementation, every chosen move is the best one, right? How do we ensure divergence then? Is Dirichlet noise enough to avoid producing the same game over and over? Maybe there is another random part in the search, but I don't see it.


jkiliani commented on August 15, 2024

In Leela Zero, additional randomisation is provided by applying a random symmetry (rotation/reflection) to the board before network evaluation. That is harder to do in chess, but may be possible, see #25. If not, a temperature larger than 0 will provide some degree of randomness. AlphaZero actually uses τ=1 for self-play, so there's plenty of divergence there.


will-iam commented on August 15, 2024

Thanks, and #28 answers my question too.

