Giter Club home page Giter Club logo

concurrent-data-structure's Introduction

Concurrent Data Structure for Rust

Goal & Status

Implement sequential, lock-based and lock-free concurrent data structures below:

Stack Queue Linked List AVL Tree HashTable
Sequential Done Done Done Done
Lock-based Done Done Done
Lock-free Done Done

Benchmark

You can run bench like this:

cargo install cargo-criterion
# default feature has accumulating stats on available structure.
cargo criterion --bench {bench_name} --no-default-features

Available Benches:

  • stack
  • queue
  • avltree
  • btree

Profile

Use CDS stats

Several cds has its own statistics. Use it by printing on test.

Flamegraph

cargo install flamegraph
sudo cargo flamegraph --no-default-features --test tests -- {test_name}

Detail

Lock

  • common spin lock and sequece lock(SeqLock)
  • flat combining lock

Stack

  • lock stack(based on std::sync::Mutex and spin lock)
  • Treiber's Stack
  • Elimination-Backoff Stack

Queue

  • lock queue(based on std::sync::Mutex and spin lock)
  • two lock queue
  • FCQueue(use flat combining lock)
  • Michael-Scott queue

Linked List

  • TODO: implement Harris linked list

AVL Tree

  • SeqLockAVLTree, RwLockAVLTree(use crossbeam_utils::sync::ShardedLock)

HashTable

  • TODO: ?

Reference

General

Lock

Stack

Queue

Binary Search Tree

concurrent-data-structure's People

Contributors

codingskynet avatar

Stargazers

 avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar

Watchers

 avatar  avatar

concurrent-data-structure's Issues

Benchmark 자동 작성을 위한 GitHub Bot 만들기

적당히 테스트용 EC2 이미지(Mac Studio 도착하면 이거 걍 서버 뚫어서 쓸까...) 올리고, GitHub 봇 하나 만들어서 performance 체크해서 그래프로 보여주는 거 만드는 게 좋을 거 같다. 손수 벤치마킹하는 거 기본 2시간은 걸리는지라...

RwLockAVLTree concurrent 테스트 deadlock 관찰

  • Env
    • HW: Macbook Pro(16inch, 2021) M1 Pro(P 8 cores + E 2 cores), 16GiB
    • SW: macOS Monterey 12.3.1
  • Execute
    cargo test avltree::rwlock::assert -- --nocapture

stress_concurrent에 있는 op 출력하는 것을 켰음에도 명백하게 데드락이 관찰됨.

Michael-Scott Queue 원본 논문의 오류

원본 논문에는 다음과 같이 알고리즘이 작성되어 있다.

dequeue(Q: pointer to queue t, pvalue: pointer to data type): boolean
D01:     loop # Keep trying until Dequeue is done
D02:         head = Q–>Head # Read Head
D03:         tail = Q–>Tail # Read Tail
D04:         next = head–>next # Read Head.ptr–>next
D05:         if head == Q–>Head # Are head, tail, and next consistent?
D06:             if head.ptr == tail.ptr # Is queue empty or Tail falling behind?
D07:                 if next.ptr == NULL # Is queue empty?
D08:                     return FALSE # Queue is empty, couldn’t dequeue
D09:                 endif
D10:                 CAS(&Q–>Tail, tail, <next.ptr, tail.count+1>) # Tail is falling behind. Try to advance it
D11:             else # No need to deal with Tail
                            # Read value before CAS, otherwise another dequeue might free the next node
D12:                 *pvalue = next.ptr–>value
D13:                 if CAS(&Q–>Head, head, <next.ptr, head.count+1>) # Try to swing Head to the next node
D14:                     break # Dequeue is done. Exit loop
D15:                 endif
D16:             endif
D17:         endif
D18:     endloop
D19:     free(head.ptr) # It is safe now to free the old dummy node
D20:     return TRUE # Queue was not empty, dequeue succeeded

근데 이거 if next.ptr == NULL이 D06보다 먼저 test되어야 에러가 안 난다. 이와 같게 작성하면 D13에서 next.ptr이 NULL일 수도 있다는 것이 관찰되었다. 왜징

SeqLockAVLTree removal 병목

1b862d45까지 내려가면 removal이 빠르지만, 그 이후에 추가된 fence-fence synchronization을 위한 value의 Atomic 적용이 병목의 원인으로 작용하고 있는 듯함. 해결책을 강구해봐야 겠다.

재현 방법: avl_tree(최신에선 avltree)::seqlock::bench_large_seqlock_avl_tree 실행해보면 removal이 압도적으로 느려지는 경우를 관찰할 수 있음
재현 환경: M1 Macbook Air, Big Sur 11.4

Sequential Tree들과 SeqLockAVLTree 비교

Benchmark: cargo criterion --bench sequential (only run bench_vs_btreemap)
Envirnment:

  • SW: Windows 10 Pro 10.0.19043 Build 19043, WSL 2 Ubuntu 20.04(5.10.16.3-microsoft-standard-WSL2), w/ other minor programs running (Discord, Chrome, etc...)
  • HW: Ryzen 5 3600 (6c 12t, static clock 4.2GHz), DDR4 3600MHz 16GiB

The branch main(9b211c6) result(constants are differed from the branch saved):

Detail
std::BTreeMap vs BTree: Inserted +5e5, Ops (I: 100%, L: 0%, R: 0%, total: +1.92e5)/std::BTreeMap
                        time:   [23.003 ms 23.260 ms 23.580 ms]
                        thrpt:  [8.1425 Melem/s 8.2546 Melem/s 8.3466 Melem/s]
std::BTreeMap vs BTree: Inserted +5e5, Ops (I: 100%, L: 0%, R: 0%, total: +1.92e5)/BTree
                        time:   [24.846 ms 25.120 ms 25.494 ms]
                        thrpt:  [7.5313 Melem/s 7.6434 Melem/s 7.7275 Melem/s]

std::BTreeMap vs BTree: Inserted +5e5, Ops (I: 0%, L: 100%, R: 0%, total: +1.92e5)/std::BTreeMap
                        time:   [28.187 ms 28.552 ms 29.015 ms]
                        thrpt:  [6.6172 Melem/s 6.7245 Melem/s 6.8117 Melem/s]
std::BTreeMap vs BTree: Inserted +5e5, Ops (I: 0%, L: 100%, R: 0%, total: +1.92e5)/BTree
                        time:   [33.296 ms 33.722 ms 34.213 ms]
                        thrpt:  [5.6118 Melem/s 5.6936 Melem/s 5.7664 Melem/s]

std::BTreeMap vs BTree: Inserted +5e5, Ops (I: 0%, L: 0%, R: 100%, total: +1.92e5)/std::BTreeMap
                        time:   [37.202 ms 38.062 ms 39.035 ms]
                        thrpt:  [4.9186 Melem/s 5.0444 Melem/s 5.1611 Melem/s]
std::BTreeMap vs BTree: Inserted +5e5, Ops (I: 0%, L: 0%, R: 100%, total: +1.92e5)/BTree
                        time:   [40.924 ms 41.618 ms 42.381 ms]
                        thrpt:  [4.5303 Melem/s 4.6134 Melem/s 4.6916 Melem/s]

std::BTreeMap vs BTree: Inserted +5e5, Ops (I: 5%, L: 90%, R: 5%, total: +1.92e5)/std::BTreeMap
                        time:   [30.831 ms 31.439 ms 32.140 ms]
                        thrpt:  [5.9739 Melem/s 6.1070 Melem/s 6.2275 Melem/s]
std::BTreeMap vs BTree: Inserted +5e5, Ops (I: 5%, L: 90%, R: 5%, total: +1.92e5)/BTree
                        time:   [34.686 ms 35.394 ms 36.158 ms]
                        thrpt:  [5.3100 Melem/s 5.4246 Melem/s 5.5354 Melem/s]

std::BTreeMap vs BTree: Inserted +5e5, Ops (I: 30%, L: 50%, R: 20%, total: +1.92e5)/std::BTreeMap
                        time:   [32.917 ms 33.440 ms 34.044 ms]
                        thrpt:  [5.6398 Melem/s 5.7417 Melem/s 5.8329 Melem/s]
std::BTreeMap vs BTree: Inserted +5e5, Ops (I: 30%, L: 50%, R: 20%, total: +1.92e5)/BTree
                        time:   [35.898 ms 36.209 ms 36.566 ms]
                        thrpt:  [5.2507 Melem/s 5.3025 Melem/s 5.3485 Melem/s]

std::BTreeMap vs BTree: Inserted +5e5, Ops (I: 40%, L: 20%, R: 40%, total: +1.92e5)/std::BTreeMap
                        time:   [34.634 ms 35.121 ms 35.660 ms]
                        thrpt:  [5.3842 Melem/s 5.4669 Melem/s 5.5437 Melem/s]
std::BTreeMap vs BTree: Inserted +5e5, Ops (I: 40%, L: 20%, R: 40%, total: +1.92e5)/BTree
                        time:   [37.758 ms 38.022 ms 38.371 ms]
                        thrpt:  [5.0038 Melem/s 5.0497 Melem/s 5.0850 Melem/s]

std::BTreeMap vs BTree: Inserted +5e5, Ops (I: 50%, L: 0%, R: 50%, total: +1.92e5)/std::BTreeMap
                        time:   [36.850 ms 37.488 ms 38.154 ms]
                        thrpt:  [5.0322 Melem/s 5.1216 Melem/s 5.2103 Melem/s]
std::BTreeMap vs BTree: Inserted +5e5, Ops (I: 50%, L: 0%, R: 50%, total: +1.92e5)/BTree
                        time:   [38.675 ms 39.182 ms 39.743 ms]
                        thrpt:  [4.8310 Melem/s 4.9002 Melem/s 4.9645 Melem/s]

Benchmark: cargo criterion --bench concurrent (only run bench_mixed_total_seqlockavltree)

Detail
SeqLockAVLTree/Inserted +5e5 SeqlockAVLTree Ops (I: 100%, L: 0%, R: 0%, per: +1.92e5) by 1 threads
                        time:   [200.10 ms 204.57 ms 209.83 ms]
                        thrpt:  [915.05 Kelem/s 938.58 Kelem/s 959.52 Kelem/s]
SeqLockAVLTree/Inserted +5e5 SeqlockAVLTree Ops (I: 100%, L: 0%, R: 0%, per: +9.6e4) by 2 threads
                        time:   [122.30 ms 126.74 ms 132.83 ms]
                        thrpt:  [1.4455 Melem/s 1.5149 Melem/s 1.5699 Melem/s]
SeqLockAVLTree/Inserted +5e5 SeqlockAVLTree Ops (I: 100%, L: 0%, R: 0%, per: +4.8e4) by 4 threads
                        time:   [72.743 ms 75.622 ms 79.524 ms]
                        thrpt:  [2.4144 Melem/s 2.5390 Melem/s 2.6394 Melem/s]
SeqLockAVLTree/Inserted +5e5 SeqlockAVLTree Ops (I: 100%, L: 0%, R: 0%, per: +2.4e4) by 8 threads
                        time:   [60.798 ms 62.065 ms 63.351 ms]
                        thrpt:  [3.0307 Melem/s 3.0935 Melem/s 3.1580 Melem/s]
SeqLockAVLTree/Inserted +5e5 SeqlockAVLTree Ops (I: 100%, L: 0%, R: 0%, per: +1.6e4) by 12 threads
                        time:   [51.383 ms 53.083 ms 55.021 ms]
                        thrpt:  [3.4896 Melem/s 3.6170 Melem/s 3.7366 Melem/s]

SeqLockAVLTree/Inserted +5e5 SeqlockAVLTree Ops (I: 0%, L: 100%, R: 0%, per: +1.92e5) by 1 threads
                        time:   [132.85 ms 133.18 ms 133.52 ms]
                        thrpt:  [1.4380 Melem/s 1.4417 Melem/s 1.4453 Melem/s]
SeqLockAVLTree/Inserted +5e5 SeqlockAVLTree Ops (I: 0%, L: 100%, R: 0%, per: +9.6e4) by 2 threads
                        time:   [64.178 ms 65.188 ms 66.492 ms]
                        thrpt:  [2.8876 Melem/s 2.9453 Melem/s 2.9917 Melem/s]
SeqLockAVLTree/Inserted +5e5 SeqlockAVLTree Ops (I: 0%, L: 100%, R: 0%, per: +4.8e4) by 4 threads
                        time:   [33.729 ms 33.890 ms 34.068 ms]
                        thrpt:  [5.6358 Melem/s 5.6653 Melem/s 5.6924 Melem/s]
SeqLockAVLTree/Inserted +5e5 SeqlockAVLTree Ops (I: 0%, L: 100%, R: 0%, per: +2.4e4) by 8 threads
                        time:   [18.494 ms 18.574 ms 18.660 ms]
                        thrpt:  [10.289 Melem/s 10.337 Melem/s 10.382 Melem/s]
SeqLockAVLTree/Inserted +5e5 SeqlockAVLTree Ops (I: 0%, L: 100%, R: 0%, per: +1.6e4) by 12 threads
                        time:   [14.196 ms 14.658 ms 15.176 ms]
                        thrpt:  [12.652 Melem/s 13.099 Melem/s 13.525 Melem/s]

SeqLockAVLTree/Inserted +5e5 SeqlockAVLTree Ops (I: 0%, L: 0%, R: 100%, per: +1.92e5) by 1 threads
                        time:   [175.70 ms 176.13 ms 176.58 ms]
                        thrpt:  [1.0873 Melem/s 1.0901 Melem/s 1.0928 Melem/s]
SeqLockAVLTree/Inserted +5e5 SeqlockAVLTree Ops (I: 0%, L: 0%, R: 100%, per: +9.6e4) by 2 threads
                        time:   [88.666 ms 90.187 ms 91.823 ms]
                        thrpt:  [2.0910 Melem/s 2.1289 Melem/s 2.1654 Melem/s]
SeqLockAVLTree/Inserted +5e5 SeqlockAVLTree Ops (I: 0%, L: 0%, R: 100%, per: +4.8e4) by 4 threads
                        time:   [56.011 ms 57.361 ms 59.114 ms]
                        thrpt:  [3.2480 Melem/s 3.3472 Melem/s 3.4279 Melem/s]
SeqLockAVLTree/Inserted +5e5 SeqlockAVLTree Ops (I: 0%, L: 0%, R: 100%, per: +2.4e4) by 8 threads
                        time:   [32.342 ms 33.061 ms 33.872 ms]
                        thrpt:  [5.6684 Melem/s 5.8074 Melem/s 5.9366 Melem/s]
SeqLockAVLTree/Inserted +5e5 SeqlockAVLTree Ops (I: 0%, L: 0%, R: 100%, per: +1.6e4) by 12 threads
                        time:   [24.813 ms 25.069 ms 25.354 ms]
                        thrpt:  [7.5729 Melem/s 7.6588 Melem/s 7.7378 Melem/s]

SeqLockAVLTree/Inserted +5e5 SeqlockAVLTree Ops (I: 5%, L: 90%, R: 5%, per: +1.92e5) by 1 threads
                        time:   [145.53 ms 149.50 ms 155.65 ms]
                        thrpt:  [1.2335 Melem/s 1.2843 Melem/s 1.3193 Melem/s]
SeqLockAVLTree/Inserted +5e5 SeqlockAVLTree Ops (I: 5%, L: 90%, R: 5%, per: +9.6e4) by 2 threads
                        time:   [75.120 ms 79.658 ms 84.691 ms]
                        thrpt:  [2.2671 Melem/s 2.4103 Melem/s 2.5559 Melem/s]
SeqLockAVLTree/Inserted +5e5 SeqlockAVLTree Ops (I: 5%, L: 90%, R: 5%, per: +4.8e4) by 4 threads
                        time:   [40.794 ms 41.208 ms 41.616 ms]
                        thrpt:  [4.6136 Melem/s 4.6592 Melem/s 4.7066 Melem/s]
SeqLockAVLTree/Inserted +5e5 SeqlockAVLTree Ops (I: 5%, L: 90%, R: 5%, per: +2.4e4) by 8 threads
                        time:   [25.172 ms 25.874 ms 26.693 ms]
                        thrpt:  [7.1930 Melem/s 7.4204 Melem/s 7.6274 Melem/s]
SeqLockAVLTree/Inserted +5e5 SeqlockAVLTree Ops (I: 5%, L: 90%, R: 5%, per: +1.6e4) by 12 threads
                        time:   [18.940 ms 19.330 ms 19.764 ms]
                        thrpt:  [9.7147 Melem/s 9.9329 Melem/s 10.137 Melem/s]

SeqLockAVLTree/Inserted +5e5 SeqlockAVLTree Ops (I: 30%, L: 50%, R: 20%, per: +1.92e5) by 1 threads
                        time:   [175.46 ms 177.71 ms 180.23 ms]
                        thrpt:  [1.0653 Melem/s 1.0804 Melem/s 1.0942 Melem/s]
SeqLockAVLTree/Inserted +5e5 SeqlockAVLTree Ops (I: 30%, L: 50%, R: 20%, per: +9.6e4) by 2 threads
                        time:   [92.361 ms 93.314 ms 94.419 ms]
                        thrpt:  [2.0335 Melem/s 2.0576 Melem/s 2.0788 Melem/s]
SeqLockAVLTree/Inserted +5e5 SeqlockAVLTree Ops (I: 30%, L: 50%, R: 20%, per: +4.8e4) by 4 threads
                        time:   [57.761 ms 58.356 ms 59.054 ms]
                        thrpt:  [3.2513 Melem/s 3.2902 Melem/s 3.3241 Melem/s]
SeqLockAVLTree/Inserted +5e5 SeqlockAVLTree Ops (I: 30%, L: 50%, R: 20%, per: +2.4e4) by 8 threads
                        time:   [40.063 ms 40.305 ms 40.577 ms]
                        thrpt:  [4.7317 Melem/s 4.7636 Melem/s 4.7924 Melem/s]
SeqLockAVLTree/Inserted +5e5 SeqlockAVLTree Ops (I: 30%, L: 50%, R: 20%, per: +1.6e4) by 12 threads
                        time:   [32.397 ms 33.408 ms 34.868 ms]
                        thrpt:  [5.5065 Melem/s 5.7471 Melem/s 5.9264 Melem/s]

SeqLockAVLTree/Inserted +5e5 SeqlockAVLTree Ops (I: 40%, L: 20%, R: 40%, per: +1.92e5) by 1 threads
                        time:   [193.10 ms 197.05 ms 201.29 ms]
                        thrpt:  [953.83 Kelem/s 974.36 Kelem/s 994.31 Kelem/s]
SeqLockAVLTree/Inserted +5e5 SeqlockAVLTree Ops (I: 40%, L: 20%, R: 40%, per: +9.6e4) by 2 threads
                        time:   [102.11 ms 104.04 ms 106.22 ms]
                        thrpt:  [1.8076 Melem/s 1.8455 Melem/s 1.8804 Melem/s]
SeqLockAVLTree/Inserted +5e5 SeqlockAVLTree Ops (I: 40%, L: 20%, R: 40%, per: +4.8e4) by 4 threads
                        time:   [62.983 ms 63.647 ms 64.382 ms]
                        thrpt:  [2.9822 Melem/s 3.0166 Melem/s 3.0485 Melem/s]
SeqLockAVLTree/Inserted +5e5 SeqlockAVLTree Ops (I: 40%, L: 20%, R: 40%, per: +2.4e4) by 8 threads
                        time:   [46.276 ms 46.715 ms 47.173 ms]
                        thrpt:  [4.0701 Melem/s 4.1101 Melem/s 4.1490 Melem/s]
SeqLockAVLTree/Inserted +5e5 SeqlockAVLTree Ops (I: 40%, L: 20%, R: 40%, per: +1.6e4) by 12 threads
                        time:   [38.970 ms 39.644 ms 40.358 ms]
                        thrpt:  [4.7574 Melem/s 4.8431 Melem/s 4.9269 Melem/s]

SeqLockAVLTree/Inserted +5e5 SeqlockAVLTree Ops (I: 50%, L: 0%, R: 50%, per: +1.92e5) by 1 threads
                        time:   [194.74 ms 198.45 ms 202.61 ms]
                        thrpt:  [947.63 Kelem/s 967.51 Kelem/s 985.91 Kelem/s]
SeqLockAVLTree/Inserted +5e5 SeqlockAVLTree Ops (I: 50%, L: 0%, R: 50%, per: +9.6e4) by 2 threads
                        time:   [109.27 ms 111.71 ms 114.52 ms]
                        thrpt:  [1.6766 Melem/s 1.7188 Melem/s 1.7572 Melem/s]
SeqLockAVLTree/Inserted +5e5 SeqlockAVLTree Ops (I: 50%, L: 0%, R: 50%, per: +4.8e4) by 4 threads
                        time:   [68.425 ms 69.296 ms 70.455 ms]
                        thrpt:  [2.7251 Melem/s 2.7707 Melem/s 2.8060 Melem/s]
SeqLockAVLTree/Inserted +5e5 SeqlockAVLTree Ops (I: 50%, L: 0%, R: 50%, per: +2.4e4) by 8 threads
                        time:   [52.074 ms 52.885 ms 53.797 ms]
                        thrpt:  [3.5690 Melem/s 3.6305 Melem/s 3.6871 Melem/s]
SeqLockAVLTree/Inserted +5e5 SeqlockAVLTree Ops (I: 50%, L: 0%, R: 50%, per: +1.6e4) by 12 threads
                        time:   [42.036 ms 42.523 ms 43.059 ms]
                        thrpt:  [4.4590 Melem/s 4.5153 Melem/s 4.5675 Melem/s]

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. 📊📈🎉

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google ❤️ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.