Emil check-list suggestions

Concurrency

At the moment my main problem is the way the bot runs.
It starts streaming HTTP messages, which sometimes report events such as a new challenge, a game start, etc.
The starting point is the start() function in state.rs.

To let multiple threads spawn and access/modify the same resources, I've used Arc<Mutex>. When stream_incoming_events receives something, it clones the whole bot struct and lets a thread do something with it.
The two main entry points are event_stream_handler() and game_stream_handler().

Now it bottlenecks when several threads want to play on games. I've done this ugly thing at src/bot/state.rs:35:

  // TODO: Find if we can improve this.
  pub games: Arc<Mutex<Vec<Arc<Mutex<BotGame>>>>>,

I've let the bot accept up to 3 games concurrently, but I see it time out on games whenever the time control is slow, because it really plays on 1 game at a time, where I wish it could just spawn threads and play each game concurrently. There is one engine instance per game, and the engine needs to block in the go() function while it searches.

When trying to play on games, in play_on_game() at src/bot/state.rs:437, we typically take a mutable reference to a bot game, and that blocks the other games.

Perhaps one way to solve it is to do as in engine_go_and_stop() in src/chess/engine/test/engine_test.rs, and then just poll the thread handle with is_finished(), so that we do not need to hold a mutable reference and block the rest of the games.
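A minimal sketch of that polling idea, using only std. The Engine type, its go() body and the SearchState enum are hypothetical stand-ins, not the bot's real types: the search thread takes ownership of the engine and hands it back together with the result, so nothing has to stay locked while go() blocks.

```rust
use std::thread::{self, JoinHandle};
use std::time::Duration;

/// Hypothetical stand-in for the real engine; `go()` blocks while it searches.
struct Engine {
    depth: u32,
}

impl Engine {
    fn go(&mut self) -> String {
        // Simulate a blocking search.
        thread::sleep(Duration::from_millis(20 * self.depth as u64));
        format!("bestmove-at-depth-{}", self.depth)
    }
}

/// A game either owns its idle engine or a handle to a running search.
enum SearchState {
    Idle(Engine),
    Searching(JoinHandle<(Engine, String)>),
}

/// Moves the engine into a new thread and starts the blocking search there.
fn start_search(mut engine: Engine) -> SearchState {
    SearchState::Searching(thread::spawn(move || {
        let best = engine.go();
        // Hand the engine back so it can be reused for the next move.
        (engine, best)
    }))
}

/// Non-blocking poll: returns the move only once the search thread is done.
fn poll(state: SearchState) -> (SearchState, Option<String>) {
    match state {
        SearchState::Searching(handle) if handle.is_finished() => {
            let (engine, best) = handle.join().expect("search thread panicked");
            (SearchState::Idle(engine), Some(best))
        }
        // Still searching, or already idle: nothing new to report.
        other => (other, None),
    }
}
```

The game loop would call poll() for each game in turn; a slow search in one game then never delays the others, because poll() returns immediately either way.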

Ideally, I would like another layout where each element of a Vec/slice can be locked for mutation on its own, without blocking access to the other elements.
Perhaps I could Box a pre-allocated slice/array of games to work around this.
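One possible layout, sketched under assumptions: the existing Arc<Mutex<Vec<Arc<Mutex<BotGame>>>>> can already give per-game locking if the outer lock is held only long enough to clone the inner Arc handles. The BotGame field, snapshot() and play_on_all_games() below are hypothetical names for illustration.

```rust
use std::sync::{Arc, Mutex};
use std::thread;

/// Hypothetical stand-in for the real BotGame.
struct BotGame {
    moves_played: u32,
}

type Games = Arc<Mutex<Vec<Arc<Mutex<BotGame>>>>>;

/// Clones the per-game handles; the outer lock is released when this returns.
fn snapshot(games: &Games) -> Vec<Arc<Mutex<BotGame>>> {
    games.lock().unwrap().clone()
}

/// Spawns one thread per game; each thread locks only its own game.
fn play_on_all_games(games: &Games) {
    let handles: Vec<_> = snapshot(games)
        .into_iter()
        .map(|game| {
            thread::spawn(move || {
                // Only this game's mutex is held here, so a slow search in
                // one game does not block the others or the outer Vec.
                let mut game = game.lock().unwrap();
                game.moves_played += 1; // the blocking engine call would go here
            })
        })
        .collect();
    for handle in handles {
        handle.join().unwrap();
    }
}
```

With this shape the outer Mutex only protects adding and removing games, which are short operations, while the long-running work happens under the inner per-game locks.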

Memory consumption

The bot seems to keep memory allocated after games, sometimes.
If I just fire it up and play games, memory allocated by the bot gets released at the end of the game. All good.

But if it plays for several days, I see that it sometimes keeps around 25 GB of RAM allocated. (I configured the cache size so that a game should not use much more than 2 GB.)

It looks like Drop is being called and the list of games in the bot state is empty (vector length = 0), but the engine cache tables (see EngineCache in src/chess/engine/cache/engine_cache.rs) do not seem to be freed.

If I run tools like heaptrack, it tells me that the only leaked memory is from reqwest related to SSL stuff. I have no idea why this happens.
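One cheap way to narrow this down is to count live cache instances directly, so that the bot can log the number after each game. This is only a debugging sketch: the EngineCache below is a hypothetical stand-in with a single table field, and LIVE_CACHES and live_caches() are names made up for the example.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};

/// Number of EngineCache values currently alive in the process.
static LIVE_CACHES: AtomicUsize = AtomicUsize::new(0);

/// Hypothetical stand-in for the real EngineCache.
struct EngineCache {
    table: Vec<u64>,
}

impl EngineCache {
    fn new(entries: usize) -> Self {
        LIVE_CACHES.fetch_add(1, Ordering::Relaxed);
        EngineCache { table: vec![0; entries] }
    }

    /// Approximate heap size of the table, useful for a log line.
    fn table_bytes(&self) -> usize {
        self.table.capacity() * std::mem::size_of::<u64>()
    }
}

impl Drop for EngineCache {
    fn drop(&mut self) {
        // Runs exactly once per cache, when its last owner goes away.
        LIVE_CACHES.fetch_sub(1, Ordering::Relaxed);
    }
}

/// Current number of caches that have been created but not yet dropped.
fn live_caches() -> usize {
    LIVE_CACHES.load(Ordering::Relaxed)
}
```

If the counter stays above zero while the games vector is empty, something else still owns a cache. If it does drop to zero while the process still holds the memory, that would point away from the engine and towards the allocator keeping freed memory around, which would need to be checked separately.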

Benches

I've set up some benches to see how fast we can iterate over positions and assess them. I used the divan crate because it looked nicely suited.

For example, I get the following output:

  Running benches/chess_library.rs (target/release/deps/chess_library-73fdce3e15a9c9b3)
Pinned thread to core 0
chess_library                   fastest       │ slowest       │ median        │ mean          │ samples │ iters
├─ apply_moves_on_a_game_state  135 ns        │ 2.082 µs      │ 180.1 ns      │ 197.9 ns      │ 10000   │ 80000
├─ apply_moves_on_the_board     92.45 ns      │ 2.794 µs      │ 114.9 ns      │ 117.4 ns      │ 10000   │ 160000
├─ compute_legal_moves          368 ns        │ 3.804 µs      │ 543.2 ns      │ 546 ns        │ 10000   │ 40000
├─ determine_board_pins         2.15 ns       │ 20.2 ns       │ 2.327 ns      │ 2.333 ns      │ 10000   │ 10240000
╰─ find_attackers               6.709 ns      │ 30.97 ns      │ 6.769 ns      │ 6.817 ns      │ 10000   │ 5120000

     Running benches/engine.rs (target/release/deps/engine-4ba7a83b91aa9c70)
Pinned thread to core 0
engine                                   fastest       │ slowest       │ median        │ mean          │ samples │ iters
├─ board_evaluation                      429.7 ns      │ 5.77 µs       │ 440.7 ns      │ 449.9 ns      │ 10000   │ 10000
├─ board_evaluation_2                    341.6 ns      │ 2.226 µs      │ 348 ns        │ 350.1 ns      │ 10000   │ 80000
├─ board_generic_evaluation              386.6 ns      │ 3.174 µs      │ 390.5 ns      │ 393.8 ns      │ 10000   │ 80000
├─ board_material_score                  9.306 ns      │ 60.61 ns      │ 9.388 ns      │ 9.463 ns      │ 10000   │ 2560000
├─ cache_for_evals                       10.91 ns      │ 93.49 ns      │ 11.38 ns      │ 11.44 ns      │ 10000   │ 2560000
├─ cache_for_game_status                 9.974 ns      │ 75.29 ns      │ 10.59 ns      │ 10.63 ns      │ 10000   │ 2560000
├─ cache_for_moves                       51.41 ns      │ 452 ns        │ 52.04 ns      │ 52.46 ns      │ 10000   │ 640000
├─ compute_game_phase                    1.418 ns      │ 8.056 ns      │ 1.423 ns      │ 1.431 ns      │ 10000   │ 20480000
├─ detect_board_game_over                49.76 ns      │ 3.837 µs      │ 49.76 ns      │ 52.01 ns      │ 10000   │ 10000
├─ endgame_evaluation                    349.7 ns      │ 5.069 µs      │ 360.7 ns      │ 365 ns        │ 10000   │ 10000
├─ endgame_piece_square_table_lookup     40.29 ns      │ 412.5 ns      │ 41.4 ns       │ 42.45 ns      │ 10000   │ 640000
├─ file_half_open_detection              0.943 ns      │ 8.707 ns      │ 0.948 ns      │ 0.954 ns      │ 10000   │ 20480000
├─ file_open_detection                   1.183 ns      │ 11.52 ns      │ 1.183 ns      │ 1.194 ns      │ 10000   │ 20480000
├─ file_state_detection                  1.031 ns      │ 13.33 ns      │ 1.046 ns      │ 1.057 ns      │ 10000   │ 20480000
├─ middlegame_evaluation                 420.7 ns      │ 23.67 µs      │ 430.7 ns      │ 439.4 ns      │ 10000   │ 10000
├─ middlegame_piece_square_table_lookup  38.1 ns       │ 338.5 ns      │ 39.83 ns      │ 40.12 ns      │ 10000   │ 640000
├─ nnue_board_evaluation                 36.33 µs      │ 364.4 µs      │ 62.8 µs       │ 62.19 µs      │ 10000   │ 10000
├─ nnue_input_layer_conversion           169.4 ns      │ 3.121 µs      │ 172.5 ns      │ 175.1 ns      │ 10000   │ 160000
├─ opening_evaluation                    409.7 ns      │ 6.952 µs      │ 420.7 ns      │ 427.8 ns      │ 10000   │ 10000
├─ opening_piece_square_table_lookup     32.32 ns      │ 433.6 ns      │ 36.08 ns      │ 36.54 ns      │ 10000   │ 640000
╰─ passed_pawn_detection                 6.627 ns      │ 69.93 ns      │ 6.724 ns      │ 6.77 ns       │ 10000   │ 5120000

I tend to run in release mode because the optimization level really makes a difference to the search speed. I am not sure my benches are set up properly, but I know that a lot of speed can be gained by pre-computing all possibilities and then doing a lookup in an array rather than computing things each time, e.g. for rook moves. If we start here: src/chess/model/tables/rook_destinations.rs, I have this 2-dimensional array: ROOK_DESTINATION_TABLE

We initialize it once with the values; then any time we need to compute where the rook can move from a position we do a lookup, see get_rook_destinations() at rook_destinations.rs:340.

If we do

  unsafe {
      ROOK_DESTINATION_TABLE
          .get_unchecked(square)
          .get_unchecked(blockers_key)
          & !same_side_pieces
  }

Instead of

 ROOK_DESTINATION_TABLE[square][blockers_key] & !same_side_pieces

I can see a speed increase. Great! However, if I start replacing array lookups with .get_unchecked everywhere, it does not always speed things up; sometimes get_unchecked is slower... I think it has to do with de-referencing, but I am not sure, and it's kind of frustrating not to be able to predict it.
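A safe alternative worth measuring next to get_unchecked is to mask the index with a power-of-two table size, which typically lets the compiler prove the index is in range and drop the bounds check by itself. The sketch below uses a small hypothetical TABLE (64 x 16 entries with made-up contents) instead of the real ROOK_DESTINATION_TABLE; whether it is actually faster still has to be benchmarked case by case.

```rust
const SQUARES: usize = 64;
const KEYS: usize = 16; // hypothetical size; must be a power of two for masking

/// Fills the table at compile time with dummy values.
const fn build_table() -> [[u64; KEYS]; SQUARES] {
    let mut table = [[0u64; KEYS]; SQUARES];
    let mut s = 0;
    while s < SQUARES {
        let mut k = 0;
        while k < KEYS {
            table[s][k] = ((s as u64) << 8) | k as u64;
            k += 1;
        }
        s += 1;
    }
    table
}

static TABLE: [[u64; KEYS]; SQUARES] = build_table();

/// Plain indexing: panics on an out-of-range index.
fn lookup_checked(square: usize, key: usize, same_side: u64) -> u64 {
    TABLE[square][key] & !same_side
}

/// Masked indexing: the index can never exceed the table size, so the
/// bounds check is usually optimized away without any unsafe code.
fn lookup_masked(square: usize, key: usize, same_side: u64) -> u64 {
    TABLE[square & (SQUARES - 1)][key & (KEYS - 1)] & !same_side
}

/// Unchecked indexing: the caller must guarantee both indices are in range.
fn lookup_unchecked(square: usize, key: usize, same_side: u64) -> u64 {
    debug_assert!(square < SQUARES && key < KEYS);
    // SAFETY: relies on the caller upholding the range requirement above.
    unsafe { *TABLE.get_unchecked(square).get_unchecked(key) & !same_side }
}
```

For in-range inputs all three variants return the same value, so they can be swapped freely inside a bench to compare them.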

Perhaps it is also that I did not set up my benchmarks very well.
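One common reason for surprising numbers on nanosecond-scale functions is that the optimizer sees constant inputs or unused results and removes part of the work. A rough, std-only sketch of guarding against that with std::hint::black_box is shown here; time_per_call() is a hypothetical helper, not part of divan, and it is far less careful than a real benchmark harness.

```rust
use std::hint::black_box;
use std::time::{Duration, Instant};

/// Calls `f` `iterations` times and returns the average time per call.
/// `iterations` must be greater than zero.
fn time_per_call<F: FnMut() -> u64>(iterations: u32, mut f: F) -> Duration {
    let start = Instant::now();
    for _ in 0..iterations {
        // black_box keeps the result "used", so the call is not optimized out.
        black_box(f());
    }
    start.elapsed() / iterations
}

/// Tiny example workload: counts the set bits of a bitboard.
fn count_pieces(bitboard: u64) -> u64 {
    bitboard.count_ones() as u64
}
```

A call would look like time_per_call(1_000_000, || count_pieces(black_box(0xFF00))), where the black_box around the argument stops the compiler from folding the whole computation into a constant.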

Multi-threaded search

I think one of the next things I'll try to implement is making the bot's search function multi-threaded.
I've read that the most efficient way to do this is to implement it like this: The APHID Parallel alpha beta Search Algorithm

So here the search function (src/chess/engine/mod.rs:943), which is recursive, would decide at a certain depth to split the search into N threads (N being a pre-configured number).

They need to share an alpha/beta table that they read and update regularly. My guess is to implement it a little like the engine cache, which is defined here: src/chess/engine/cache/engine_cache.rs

Any input on how you would do it?
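As a starting point, here is a much simpler scheme than APHID: split only at the root, one thread per root move, with a single shared alpha bound kept in an atomic. It is a sketch on a hypothetical toy tree (the Node enum), not the engine's real search, and a real version would cap the thread count at N and re-read the shared bound during the search.

```rust
use std::sync::atomic::{AtomicI32, Ordering};
use std::thread;

const INF: i32 = 1_000_000;

/// Toy game tree standing in for real positions (hypothetical type).
enum Node {
    Leaf(i32),         // static score from the side to move's point of view
    Branch(Vec<Node>), // one child per legal move
}

/// Sequential negamax with alpha-beta pruning (fail-soft).
fn negamax(node: &Node, mut alpha: i32, beta: i32) -> i32 {
    match node {
        Node::Leaf(score) => *score,
        Node::Branch(children) => {
            let mut best = -INF;
            for child in children {
                let score = -negamax(child, -beta, -alpha);
                best = best.max(score);
                alpha = alpha.max(score);
                if alpha >= beta {
                    break; // cutoff: the opponent will avoid this line
                }
            }
            best
        }
    }
}

/// Searches every root move on its own thread; all threads share one alpha.
fn search_root_parallel(root: &Node) -> i32 {
    let children = match root {
        Node::Leaf(score) => return *score,
        Node::Branch(children) => children,
    };
    let shared_alpha = AtomicI32::new(-INF);
    thread::scope(|scope| {
        for child in children {
            let shared_alpha = &shared_alpha;
            scope.spawn(move || {
                // Read the best score found so far by any thread.
                let alpha = shared_alpha.load(Ordering::Relaxed);
                let score = -negamax(child, -INF, -alpha);
                // Raise the shared bound if this move is better.
                shared_alpha.fetch_max(score, Ordering::Relaxed);
            });
        }
    });
    shared_alpha.load(Ordering::Relaxed)
}
```

A thread that starts late benefits from the bound raised by earlier threads and prunes more; a thread that starts early just searches with a wider window, which costs time but not correctness.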

nobriot / schnecken_bot
