
Comments (4)

kblomdahl commented on June 15, 2024

I've attached to one of the deadlocked processes, so I am going to dump some backtraces here, in no particular order:

#0  0x00007f7b7647bb7d in __pthread_join (threadid=140166649870080, thread_return=0x0) at pthread_join.c:90
#1  0x0000557a20de3910 in std::sys::unix::thread::Thread::join () at libstd/sys/unix/thread.rs:176
#2  0x0000557a20d53ea2 in <std::thread::JoinInner<T>>::join (self=<optimized out>)
    at /checkout/src/libstd/thread/mod.rs:1200
#3  <std::thread::JoinHandle<T>>::join (self=...) at /checkout/src/libstd/thread/mod.rs:1322
#4  0x0000557a20d6293a in dream_go::mcts::predict_aux (server=<optimized out>, num_workers=64, starting_tree=..., 
    starting_point=<optimized out>, starting_color=<optimized out>) at src/mcts/mod.rs:347
#5  0x0000557a20d62e99 in dream_go::mcts::predict (server=0x7ffc3587ae70, num_workers=..., starting_tree=..., 
    starting_point=0x7f7b69abb440, starting_color=dream_go::go::Color::Black) at src/mcts/mod.rs:388
#6  0x0000557a20d009bb in dream_go::gtp::Gtp::generate_move (self=0x7ffc35895910, id=..., 
    color=dream_go::go::Color::Black) at src/gtp/mod.rs:266
#7  0x0000557a20d013ba in dream_go::gtp::Gtp::process (self=0x7ffc35895910, id=..., cmd=...) at src/gtp/mod.rs:478
#8  0x0000557a20d05697 in dream_go::gtp::run () at src/gtp/mod.rs:549
#9  0x0000557a20cfdb03 in dream_go::main () at src/main.rs:89

#0  0x00007f7b75f92297 in accept4 (fd=11, addr=..., addr_len=0x7f7b6cc12718, flags=524288)
    at ../sysdeps/unix/sysv/linux/accept4.c:32
#1  0x00007f7b73044216 in ?? () from /usr/lib/x86_64-linux-gnu/libcuda.so.1
#2  0x00007f7b7303880d in ?? () from /usr/lib/x86_64-linux-gnu/libcuda.so.1
#3  0x00007f7b73044e80 in ?? () from /usr/lib/x86_64-linux-gnu/libcuda.so.1
#4  0x00007f7b7647a7fc in start_thread (arg=0x7f7b6cc13700) at pthread_create.c:465
#5  0x00007f7b75f90b5f in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:95

#0  0x00007f7b75f84951 in __GI___poll (fds=0x7f7b6ba10000, nfds=10, timeout=100) at ../sysdeps/unix/sysv/linux/poll.c:29
#1  0x00007f7b7304348b in ?? () from /usr/lib/x86_64-linux-gnu/libcuda.so.1
#2  0x00007f7b730a878f in ?? () from /usr/lib/x86_64-linux-gnu/libcuda.so.1
#3  0x00007f7b73044e80 in ?? () from /usr/lib/x86_64-linux-gnu/libcuda.so.1
#4  0x00007f7b7647a7fc in start_thread (arg=0x7f7b6c412700) at pthread_create.c:465
#5  0x00007f7b75f90b5f in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:95

#0  0x00007f7b76481786 in futex_abstimed_wait_cancelable (private=<optimized out>, abstime=0x7f7b6b7fe740, expected=0, 
    futex_word=0x7f7b6aa0d028) at ../sysdeps/unix/sysv/linux/futex-internal.h:205
#1  __pthread_cond_wait_common (abstime=0x7f7b6b7fe740, mutex=0x7f7b75094368, cond=0x7f7b6aa0d000) at pthread_cond_wait.c:539
#2  __pthread_cond_timedwait (cond=0x7f7b6aa0d000, mutex=0x7f7b75094368, abstime=0x7f7b6b7fe740) at pthread_cond_wait.c:667
#3  0x00007f7b73045a57 in ?? () from /usr/lib/x86_64-linux-gnu/libcuda.so.1
#4  0x00007f7b72ffe2c7 in ?? () from /usr/lib/x86_64-linux-gnu/libcuda.so.1
#5  0x00007f7b73044e80 in ?? () from /usr/lib/x86_64-linux-gnu/libcuda.so.1
#6  0x00007f7b7647a7fc in start_thread (arg=0x7f7b6b7ff700) at pthread_create.c:465
#7  0x00007f7b75f90b5f in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:95

    at ../sysdeps/unix/sysv/linux/futex-internal.h:88
#1  __pthread_cond_wait_common (abstime=0x0, mutex=0x7f7b73d07780, cond=0x7f7b73d077b0) at pthread_cond_wait.c:502
#2  __pthread_cond_wait (cond=0x7f7b73d077b0, mutex=0x7f7b73d07780) at pthread_cond_wait.c:655
#3  0x0000557a20d28837 in std::sys::unix::condvar::Condvar::wait (self=<optimized out>, mutex=<optimized out>)
    at /checkout/src/libstd/sys/unix/condvar.rs:78
#4  std::sys_common::condvar::Condvar::wait (mutex=0x7f7b73d07780, self=<optimized out>)
    at /checkout/src/libstd/sys_common/condvar.rs:51
#5  std::sync::condvar::Condvar::wait (self=<optimized out>, guard=...) at /checkout/src/libstd/sync/condvar.rs:212
#6  0x0000557a20d1195c in dream_go::parallel::service::worker_thread (is_running=..., state=..., queue=...)
    at src/parallel/service.rs:87
#7  0x0000557a20d28000 in std::thread::Builder::spawn::{{closure}}::{{closure}} () at /checkout/src/libstd/thread/mod.rs:406
#8  <std::panic::AssertUnwindSafe<F> as core::ops::function::FnOnce<()>>::call_once (self=..., _args=<optimized out>)
    at /checkout/src/libstd/panic.rs:300
#9  0x0000557a20d1e8a0 in std::panicking::try::do_call (data=<optimized out>) at /checkout/src/libstd/panicking.rs:479
#10 0x0000557a20def70f in __rust_maybe_catch_panic () at libpanic_unwind/lib.rs:102
#11 0x0000557a20d1e4c3 in std::panicking::try (f=...) at /checkout/src/libstd/panicking.rs:458
#12 0x0000557a20d28124 in std::panic::catch_unwind (f=...) at /checkout/src/libstd/panic.rs:365
#13 0x0000557a20d554fd in std::thread::Builder::spawn::{{closure}} () at /checkout/src/libstd/thread/mod.rs:405
#14 <F as alloc::boxed::FnBox<A>>::call_box (self=0x7f7b73d07ab0, args=<optimized out>) at /checkout/src/liballoc/boxed.rs:817
#15 0x0000557a20dde0b8 in _$LT$alloc..boxed..Box$LT$alloc..boxed..FnBox$LT$A$C$$u20$Output$u3d$R$GT$$u20$$u2b$$u20$$u27$a$GT$$u20$as$u20$core..ops..function..FnOnce$LT$A$GT$$GT$::call_once::hb0b36c038cd2d960 () at /checkout/src/liballoc/boxed.rs:827
#16 std::sys_common::thread::start_thread () at libstd/sys_common/thread.rs:24
#17 0x0000557a20de38c9 in std::sys::unix::thread::Thread::new::thread_start () at libstd/sys/unix/thread.rs:90
#18 0x00007f7b7647a7fc in start_thread (arg=0x7f7b5d3fc700) at pthread_create.c:465
#19 0x00007f7b75f90b5f in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:95

#0  0x00007f7b76481072 in futex_wait_cancelable (private=<optimized out>, expected=0, futex_word=0x7f7aee670058)
    at ../sysdeps/unix/sysv/linux/futex-internal.h:88
#1  __pthread_cond_wait_common (abstime=0x0, mutex=0x7f7aee670000, cond=0x7f7aee670030) at pthread_cond_wait.c:502
#2  __pthread_cond_wait (cond=0x7f7aee670030, mutex=0x7f7aee670000) at pthread_cond_wait.c:655
#3  0x0000557a20d28787 in std::sys::unix::condvar::Condvar::wait (self=<optimized out>, mutex=<optimized out>)
    at /checkout/src/libstd/sys/unix/condvar.rs:78
#4  std::sys_common::condvar::Condvar::wait (mutex=0x7f7aee670000, self=<optimized out>)
    at /checkout/src/libstd/sys_common/condvar.rs:51
#5  std::sync::condvar::Condvar::wait (self=<optimized out>, guard=...) at /checkout/src/libstd/sync/condvar.rs:212
#6  0x0000557a20d45739 in <dream_go::parallel::one_shot_channel::OneReceiver<T>>::recv (this=...)
    at src/parallel/one_shot_channel.rs:83
#7  0x0000557a20d12275 in <dream_go::parallel::service::ServiceGuard<'a, I>>::send (self=<optimized out>, req=...)
    at src/parallel/service.rs:205
#8  0x0000557a20d1d762 in dream_go::mcts::forward::{{closure}} () at src/mcts/mod.rs:104
#9  dream_go::mcts::global_cache::get_or_insert (board=0x7f7b175f6d20, color=<optimized out>, supplier=...)
    at src/mcts/global_cache.rs:196
#10 0x0000557a20d43a60 in dream_go::mcts::forward (server=<optimized out>, board=<optimized out>, color=<optimized out>)
    at src/mcts/mod.rs:96
#11 0x0000557a20d623a7 in dream_go::mcts::predict_worker (context=..., server=...) at src/mcts/mod.rs:259
#12 0x0000557a20d3046e in dream_go::mcts::predict_aux::{{closure}}::{{closure}} () at src/mcts/mod.rs:343
#13 std::sys_common::backtrace::__rust_begin_short_backtrace (f=...) at /checkout/src/libstd/sys_common/backtrace.rs:133
#14 0x0000557a20d2803e in std::thread::Builder::spawn::{{closure}}::{{closure}} () at /checkout/src/libstd/thread/mod.rs:406
#15 <std::panic::AssertUnwindSafe<F> as core::ops::function::FnOnce<()>>::call_once (self=..., _args=<optimized out>)
    at /checkout/src/libstd/panic.rs:300
#16 0x0000557a20d1e83e in std::panicking::try::do_call (data=<optimized out>) at /checkout/src/libstd/panicking.rs:479
#17 0x0000557a20def70f in __rust_maybe_catch_panic () at libpanic_unwind/lib.rs:102
#18 0x0000557a20d1e75c in std::panicking::try (f=...) at /checkout/src/libstd/panicking.rs:458
#19 0x0000557a20d28190 in std::panic::catch_unwind (f=...) at /checkout/src/libstd/panic.rs:365
#20 0x0000557a20d5533f in std::thread::Builder::spawn::{{closure}} () at /checkout/src/libstd/thread/mod.rs:405
#21 <F as alloc::boxed::FnBox<A>>::call_box (self=0x7f7b73aa4200, args=<optimized out>) at /checkout/src/liballoc/boxed.rs:817
#22 0x0000557a20dde0b8 in _$LT$alloc..boxed..Box$LT$alloc..boxed..FnBox$LT$A$C$$u20$Output$u3d$R$GT$$u20$$u2b$$u20$$u27$a$GT$$u20$as$u20$core..ops..function..FnOnce$LT$A$GT$$GT$::call_once::hb0b36c038cd2d960 () at /checkout/src/liballoc/boxed.rs:827
#23 std::sys_common::thread::start_thread () at libstd/sys_common/thread.rs:24
#24 0x0000557a20de38c9 in std::sys::unix::thread::Thread::new::thread_start () at libstd/sys/unix/thread.rs:90
#25 0x00007f7b7647a7fc in start_thread (arg=0x7f7b175ff700) at pthread_create.c:465
#26 0x00007f7b75f90b5f in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:95

This looks suspicious...


#0  0x00007f7b76481072 in futex_wait_cancelable (private=<optimized out>, expected=0, futex_word=0x7f7b2aa63088)
    at ../sysdeps/unix/sysv/linux/futex-internal.h:88
#1  __pthread_cond_wait_common (abstime=0x0, mutex=0x7f7b2aa63090, cond=0x7f7b2aa63060) at pthread_cond_wait.c:502
#2  __pthread_cond_wait (cond=0x7f7b2aa63060, mutex=0x7f7b2aa63090) at pthread_cond_wait.c:655
#3  0x0000557a20d28787 in std::sys::unix::condvar::Condvar::wait (self=<optimized out>, mutex=<optimized out>)
    at /checkout/src/libstd/sys/unix/condvar.rs:78
#4  std::sys_common::condvar::Condvar::wait (mutex=0x7f7b2aa63090, self=<optimized out>)
    at /checkout/src/libstd/sys_common/condvar.rs:51
#5  std::sync::condvar::Condvar::wait (self=<optimized out>, guard=...) at /checkout/src/libstd/sync/condvar.rs:212
#6  0x0000557a20d45739 in <dream_go::parallel::one_shot_channel::OneReceiver<T>>::recv (this=...)
    at src/parallel/one_shot_channel.rs:83
#7  0x0000557a20d12275 in <dream_go::parallel::service::ServiceGuard<'a, I>>::send (self=<optimized out>, req=...)
    at src/parallel/service.rs:205
#8  0x0000557a20d62211 in dream_go::mcts::predict_worker (context=..., server=...) at src/mcts/mod.rs:267
#9  0x0000557a20d3046e in dream_go::mcts::predict_aux::{{closure}}::{{closure}} () at src/mcts/mod.rs:343
#10 std::sys_common::backtrace::__rust_begin_short_backtrace (f=...) at /checkout/src/libstd/sys_common/backtrace.rs:133
#11 0x0000557a20d2803e in std::thread::Builder::spawn::{{closure}}::{{closure}} () at /checkout/src/libstd/thread/mod.rs:406
#12 <std::panic::AssertUnwindSafe<F> as core::ops::function::FnOnce<()>>::call_once (self=..., _args=<optimized out>)
    at /checkout/src/libstd/panic.rs:300
#13 0x0000557a20d1e83e in std::panicking::try::do_call (data=<optimized out>) at /checkout/src/libstd/panicking.rs:479
#14 0x0000557a20def70f in __rust_maybe_catch_panic () at libpanic_unwind/lib.rs:102
#15 0x0000557a20d1e75c in std::panicking::try (f=...) at /checkout/src/libstd/panicking.rs:458
#16 0x0000557a20d28190 in std::panic::catch_unwind (f=...) at /checkout/src/libstd/panic.rs:365
#17 0x0000557a20d5533f in std::thread::Builder::spawn::{{closure}} () at /checkout/src/libstd/thread/mod.rs:405
#18 <F as alloc::boxed::FnBox<A>>::call_box (self=0x7f7b744fa600, args=<optimized out>) at /checkout/src/liballoc/boxed.rs:817
#19 0x0000557a20dde0b8 in _$LT$alloc..boxed..Box$LT$alloc..boxed..FnBox$LT$A$C$$u20$Output$u3d$R$GT$$u20$$u2b$$u20$$u27$a$GT$$u20$as$u20$core..ops..function..FnOnce$LT$A$GT$$GT$::call_once::hb0b36c038cd2d960 () at /checkout/src/liballoc/boxed.rs:827
#20 std::sys_common::thread::start_thread () at libstd/sys_common/thread.rs:24
#21 0x0000557a20de38c9 in std::sys::unix::thread::Thread::new::thread_start () at libstd/sys/unix/thread.rs:90
#22 0x00007f7b7647a7fc in start_thread (arg=0x7f7b17bfe700) at pthread_create.c:465
#23 0x00007f7b75f90b5f in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:95

This also looks suspicious...


#0  0x00007f7b76481072 in futex_wait_cancelable (private=<optimized out>, expected=0, futex_word=0x7f7b2a40d088)
    at ../sysdeps/unix/sysv/linux/futex-internal.h:88
#1  __pthread_cond_wait_common (abstime=0x0, mutex=0x7f7b2a40d090, cond=0x7f7b2a40d060) at pthread_cond_wait.c:502
#2  __pthread_cond_wait (cond=0x7f7b2a40d060, mutex=0x7f7b2a40d090) at pthread_cond_wait.c:655
#3  0x0000557a20d28787 in std::sys::unix::condvar::Condvar::wait (self=<optimized out>, mutex=<optimized out>)
    at /checkout/src/libstd/sys/unix/condvar.rs:78
#4  std::sys_common::condvar::Condvar::wait (mutex=0x7f7b2a40d090, self=<optimized out>)
    at /checkout/src/libstd/sys_common/condvar.rs:51
#5  std::sync::condvar::Condvar::wait (self=<optimized out>, guard=...) at /checkout/src/libstd/sync/condvar.rs:212
#6  0x0000557a20d45739 in <dream_go::parallel::one_shot_channel::OneReceiver<T>>::recv (this=...)
    at src/parallel/one_shot_channel.rs:83
#7  0x0000557a20d12275 in <dream_go::parallel::service::ServiceGuard<'a, I>>::send (self=<optimized out>, req=...)
    at src/parallel/service.rs:205
#8  0x0000557a20d1d762 in dream_go::mcts::forward::{{closure}} () at src/mcts/mod.rs:104
#9  dream_go::mcts::global_cache::get_or_insert (board=0x7f7b17df6d20, color=<optimized out>, supplier=...)
    at src/mcts/global_cache.rs:196
#10 0x0000557a20d43a60 in dream_go::mcts::forward (server=<optimized out>, board=<optimized out>, color=<optimized out>)
    at src/mcts/mod.rs:96
#11 0x0000557a20d623a7 in dream_go::mcts::predict_worker (context=..., server=...) at src/mcts/mod.rs:259
#12 0x0000557a20d3046e in dream_go::mcts::predict_aux::{{closure}}::{{closure}} () at src/mcts/mod.rs:343
#13 std::sys_common::backtrace::__rust_begin_short_backtrace (f=...) at /checkout/src/libstd/sys_common/backtrace.rs:133
#14 0x0000557a20d2803e in std::thread::Builder::spawn::{{closure}}::{{closure}} () at /checkout/src/libstd/thread/mod.rs:406
#15 <std::panic::AssertUnwindSafe<F> as core::ops::function::FnOnce<()>>::call_once (self=..., _args=<optimized out>)
    at /checkout/src/libstd/panic.rs:300
#16 0x0000557a20d1e83e in std::panicking::try::do_call (data=<optimized out>) at /checkout/src/libstd/panicking.rs:479
#17 0x0000557a20def70f in __rust_maybe_catch_panic () at libpanic_unwind/lib.rs:102
#18 0x0000557a20d1e75c in std::panicking::try (f=...) at /checkout/src/libstd/panicking.rs:458
#19 0x0000557a20d28190 in std::panic::catch_unwind (f=...) at /checkout/src/libstd/panic.rs:365
#20 0x0000557a20d5533f in std::thread::Builder::spawn::{{closure}} () at /checkout/src/libstd/thread/mod.rs:405
#21 <F as alloc::boxed::FnBox<A>>::call_box (self=0x7f7b744fb400, args=<optimized out>) at /checkout/src/liballoc/boxed.rs:817
#22 0x0000557a20dde0b8 in _$LT$alloc..boxed..Box$LT$alloc..boxed..FnBox$LT$A$C$$u20$Output$u3d$R$GT$$u20$$u2b$$u20$$u27$a$GT$$u20$as$u20$core..ops..function..FnOnce$LT$A$GT$$GT$::call_once::hb0b36c038cd2d960 () at /checkout/src/liballoc/boxed.rs:827
#23 std::sys_common::thread::start_thread () at libstd/sys_common/thread.rs:24
#24 0x0000557a20de38c9 in std::sys::unix::thread::Thread::new::thread_start () at libstd/sys/unix/thread.rs:90
#25 0x00007f7b7647a7fc in start_thread (arg=0x7f7b17dff700) at pthread_create.c:465
#26 0x00007f7b75f90b5f in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:95

This also looks suspicious...


#0  0x00007f7b76481072 in futex_wait_cancelable (private=<optimized out>, expected=0, futex_word=0x7f7b2a327058)
    at ../sysdeps/unix/sysv/linux/futex-internal.h:88
#1  __pthread_cond_wait_common (abstime=0x0, mutex=0x7f7b2a327000, cond=0x7f7b2a327030) at pthread_cond_wait.c:502
#2  __pthread_cond_wait (cond=0x7f7b2a327030, mutex=0x7f7b2a327000) at pthread_cond_wait.c:655
#3  0x0000557a20d28787 in std::sys::unix::condvar::Condvar::wait (self=<optimized out>, mutex=<optimized out>)
    at /checkout/src/libstd/sys/unix/condvar.rs:78
#4  std::sys_common::condvar::Condvar::wait (mutex=0x7f7b2a327000, self=<optimized out>)
    at /checkout/src/libstd/sys_common/condvar.rs:51
#5  std::sync::condvar::Condvar::wait (self=<optimized out>, guard=...) at /checkout/src/libstd/sync/condvar.rs:212
#6  0x0000557a20d45739 in <dream_go::parallel::one_shot_channel::OneReceiver<T>>::recv (this=...)
    at src/parallel/one_shot_channel.rs:83
#7  0x0000557a20d12275 in <dream_go::parallel::service::ServiceGuard<'a, I>>::send (self=<optimized out>, req=...)
    at src/parallel/service.rs:205
#8  0x0000557a20d62211 in dream_go::mcts::predict_worker (context=..., server=...) at src/mcts/mod.rs:267
#9  0x0000557a20d3046e in dream_go::mcts::predict_aux::{{closure}}::{{closure}} () at src/mcts/mod.rs:343
#10 std::sys_common::backtrace::__rust_begin_short_backtrace (f=...) at /checkout/src/libstd/sys_common/backtrace.rs:133
#11 0x0000557a20d2803e in std::thread::Builder::spawn::{{closure}}::{{closure}} () at /checkout/src/libstd/thread/mod.rs:406
#12 <std::panic::AssertUnwindSafe<F> as core::ops::function::FnOnce<()>>::call_once (self=..., _args=<optimized out>)
    at /checkout/src/libstd/panic.rs:300
#13 0x0000557a20d1e83e in std::panicking::try::do_call (data=<optimized out>) at /checkout/src/libstd/panicking.rs:479
#14 0x0000557a20def70f in __rust_maybe_catch_panic () at libpanic_unwind/lib.rs:102
#15 0x0000557a20d1e75c in std::panicking::try (f=...) at /checkout/src/libstd/panicking.rs:458
#16 0x0000557a20d28190 in std::panic::catch_unwind (f=...) at /checkout/src/libstd/panic.rs:365
#17 0x0000557a20d5533f in std::thread::Builder::spawn::{{closure}} () at /checkout/src/libstd/thread/mod.rs:405
#18 <F as alloc::boxed::FnBox<A>>::call_box (self=0x7f7b744fc200, args=<optimized out>) at /checkout/src/liballoc/boxed.rs:817
#19 0x0000557a20dde0b8 in _$LT$alloc..boxed..Box$LT$alloc..boxed..FnBox$LT$A$C$$u20$Output$u3d$R$GT$$u20$$u2b$$u20$$u27$a$GT$$u20$as$u20$core..ops..function..FnOnce$LT$A$GT$$GT$::call_once::hb0b36c038cd2d960 () at /checkout/src/liballoc/boxed.rs:827
#20 std::sys_common::thread::start_thread () at libstd/sys_common/thread.rs:24
#21 0x0000557a20de38c9 in std::sys::unix::thread::Thread::new::thread_start () at libstd/sys/unix/thread.rs:90
#22 0x00007f7b7647a7fc in start_thread (arg=0x7f7b183ff700) at pthread_create.c:465
#23 0x00007f7b75f90b5f in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:95

#0  0x00007f7b76481072 in futex_wait_cancelable (private=<optimized out>, expected=0, futex_word=0x7f7b29c14118)
    at ../sysdeps/unix/sysv/linux/futex-internal.h:88
#1  __pthread_cond_wait_common (abstime=0x0, mutex=0x7f7b29c140c0, cond=0x7f7b29c140f0) at pthread_cond_wait.c:502
#2  __pthread_cond_wait (cond=0x7f7b29c140f0, mutex=0x7f7b29c140c0) at pthread_cond_wait.c:655
#3  0x0000557a20d28787 in std::sys::unix::condvar::Condvar::wait (self=<optimized out>, mutex=<optimized out>)
    at /checkout/src/libstd/sys/unix/condvar.rs:78
#4  std::sys_common::condvar::Condvar::wait (mutex=0x7f7b29c140c0, self=<optimized out>)
    at /checkout/src/libstd/sys_common/condvar.rs:51
#5  std::sync::condvar::Condvar::wait (self=<optimized out>, guard=...) at /checkout/src/libstd/sync/condvar.rs:212
#6  0x0000557a20d45739 in <dream_go::parallel::one_shot_channel::OneReceiver<T>>::recv (this=...)
    at src/parallel/one_shot_channel.rs:83
#7  0x0000557a20d12275 in <dream_go::parallel::service::ServiceGuard<'a, I>>::send (self=<optimized out>, req=...)
    at src/parallel/service.rs:205
#8  0x0000557a20d62211 in dream_go::mcts::predict_worker (context=..., server=...) at src/mcts/mod.rs:267
#9  0x0000557a20d3046e in dream_go::mcts::predict_aux::{{closure}}::{{closure}} () at src/mcts/mod.rs:343
#10 std::sys_common::backtrace::__rust_begin_short_backtrace (f=...) at /checkout/src/libstd/sys_common/backtrace.rs:133
#11 0x0000557a20d2803e in std::thread::Builder::spawn::{{closure}}::{{closure}} () at /checkout/src/libstd/thread/mod.rs:406
#12 <std::panic::AssertUnwindSafe<F> as core::ops::function::FnOnce<()>>::call_once (self=..., _args=<optimized out>)
    at /checkout/src/libstd/panic.rs:300
#13 0x0000557a20d1e83e in std::panicking::try::do_call (data=<optimized out>) at /checkout/src/libstd/panicking.rs:479
#14 0x0000557a20def70f in __rust_maybe_catch_panic () at libpanic_unwind/lib.rs:102
#15 0x0000557a20d1e75c in std::panicking::try (f=...) at /checkout/src/libstd/panicking.rs:458
#16 0x0000557a20d28190 in std::panic::catch_unwind (f=...) at /checkout/src/libstd/panic.rs:365
#17 0x0000557a20d5533f in std::thread::Builder::spawn::{{closure}} () at /checkout/src/libstd/thread/mod.rs:405
#18 <F as alloc::boxed::FnBox<A>>::call_box (self=0x7f7b73a9e000, args=<optimized out>) at /checkout/src/liballoc/boxed.rs:817
#19 0x0000557a20dde0b8 in _$LT$alloc..boxed..Box$LT$alloc..boxed..FnBox$LT$A$C$$u20$Output$u3d$R$GT$$u20$$u2b$$u20$$u27$a$GT$$u20$as$u20$core..ops..function..FnOnce$LT$A$GT$$GT$::call_once::hb0b36c038cd2d960 () at /checkout/src/liballoc/boxed.rs:827
#20 std::sys_common::thread::start_thread () at libstd/sys_common/thread.rs:24
#21 0x0000557a20de38c9 in std::sys::unix::thread::Thread::new::thread_start () at libstd/sys/unix/thread.rs:90
#22 0x00007f7b7647a7fc in start_thread (arg=0x7f7b18bfe700) at pthread_create.c:465
#23 0x00007f7b75f90b5f in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:95

#0  0x00007f7b76481072 in futex_wait_cancelable (private=<optimized out>, expected=0, futex_word=0x7f7b28c6a028)
    at ../sysdeps/unix/sysv/linux/futex-internal.h:88
#1  __pthread_cond_wait_common (abstime=0x0, mutex=0x7f7b28c6a030, cond=0x7f7b28c6a000) at pthread_cond_wait.c:502
#2  __pthread_cond_wait (cond=0x7f7b28c6a000, mutex=0x7f7b28c6a030) at pthread_cond_wait.c:655
#3  0x0000557a20d28787 in std::sys::unix::condvar::Condvar::wait (self=<optimized out>, mutex=<optimized out>)
    at /checkout/src/libstd/sys/unix/condvar.rs:78
#4  std::sys_common::condvar::Condvar::wait (mutex=0x7f7b28c6a030, self=<optimized out>)
    at /checkout/src/libstd/sys_common/condvar.rs:51
#5  std::sync::condvar::Condvar::wait (self=<optimized out>, guard=...) at /checkout/src/libstd/sync/condvar.rs:212
#6  0x0000557a20d45739 in <dream_go::parallel::one_shot_channel::OneReceiver<T>>::recv (this=...)
    at src/parallel/one_shot_channel.rs:83
#7  0x0000557a20d12275 in <dream_go::parallel::service::ServiceGuard<'a, I>>::send (self=<optimized out>, req=...)
    at src/parallel/service.rs:205
#8  0x0000557a20d62211 in dream_go::mcts::predict_worker (context=..., server=...) at src/mcts/mod.rs:267
#9  0x0000557a20d3046e in dream_go::mcts::predict_aux::{{closure}}::{{closure}} () at src/mcts/mod.rs:343
#10 std::sys_common::backtrace::__rust_begin_short_backtrace (f=...) at /checkout/src/libstd/sys_common/backtrace.rs:133
#11 0x0000557a20d2803e in std::thread::Builder::spawn::{{closure}}::{{closure}} () at /checkout/src/libstd/thread/mod.rs:406
#12 <std::panic::AssertUnwindSafe<F> as core::ops::function::FnOnce<()>>::call_once (self=..., _args=<optimized out>)
    at /checkout/src/libstd/panic.rs:300
#13 0x0000557a20d1e83e in std::panicking::try::do_call (data=<optimized out>) at /checkout/src/libstd/panicking.rs:479
#14 0x0000557a20def70f in __rust_maybe_catch_panic () at libpanic_unwind/lib.rs:102
#15 0x0000557a20d1e75c in std::panicking::try (f=...) at /checkout/src/libstd/panicking.rs:458
#16 0x0000557a20d28190 in std::panic::catch_unwind (f=...) at /checkout/src/libstd/panic.rs:365
#17 0x0000557a20d5533f in std::thread::Builder::spawn::{{closure}} () at /checkout/src/libstd/thread/mod.rs:405
#18 <F as alloc::boxed::FnBox<A>>::call_box (self=0x7f7b73a9ee00, args=<optimized out>) at /checkout/src/liballoc/boxed.rs:817
#19 0x0000557a20dde0b8 in _$LT$alloc..boxed..Box$LT$alloc..boxed..FnBox$LT$A$C$$u20$Output$u3d$R$GT$$u20$$u2b$$u20$$u27$a$GT$$u20$as$u20$core..ops..function..FnOnce$LT$A$GT$$GT$::call_once::hb0b36c038cd2d960 () at /checkout/src/liballoc/boxed.rs:827
#20 std::sys_common::thread::start_thread () at libstd/sys_common/thread.rs:24
#21 0x0000557a20de38c9 in std::sys::unix::thread::Thread::new::thread_start () at libstd/sys/unix/thread.rs:90
#22 0x00007f7b7647a7fc in start_thread (arg=0x7f7b18dff700) at pthread_create.c:465
#23 0x00007f7b75f90b5f in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:95

#0  0x00007f7b76481072 in futex_wait_cancelable (private=<optimized out>, expected=0, futex_word=0x7f7b29998058)
    at ../sysdeps/unix/sysv/linux/futex-internal.h:88
#1  __pthread_cond_wait_common (abstime=0x0, mutex=0x7f7b29998000, cond=0x7f7b29998030) at pthread_cond_wait.c:502
#2  __pthread_cond_wait (cond=0x7f7b29998030, mutex=0x7f7b29998000) at pthread_cond_wait.c:655
#3  0x0000557a20d28787 in std::sys::unix::condvar::Condvar::wait (self=<optimized out>, mutex=<optimized out>)
    at /checkout/src/libstd/sys/unix/condvar.rs:78
#4  std::sys_common::condvar::Condvar::wait (mutex=0x7f7b29998000, self=<optimized out>)
    at /checkout/src/libstd/sys_common/condvar.rs:51
#5  std::sync::condvar::Condvar::wait (self=<optimized out>, guard=...) at /checkout/src/libstd/sync/condvar.rs:212
#6  0x0000557a20d45739 in <dream_go::parallel::one_shot_channel::OneReceiver<T>>::recv (this=...)
    at src/parallel/one_shot_channel.rs:83
#7  0x0000557a20d12275 in <dream_go::parallel::service::ServiceGuard<'a, I>>::send (self=<optimized out>, req=...)
    at src/parallel/service.rs:205
#8  0x0000557a20d1d762 in dream_go::mcts::forward::{{closure}} () at src/mcts/mod.rs:104
#9  dream_go::mcts::global_cache::get_or_insert (board=0x7f7b195f6d20, color=<optimized out>, supplier=...)
    at src/mcts/global_cache.rs:196
#10 0x0000557a20d43a60 in dream_go::mcts::forward (server=<optimized out>, board=<optimized out>, color=<optimized out>)
    at src/mcts/mod.rs:96
#11 0x0000557a20d623a7 in dream_go::mcts::predict_worker (context=..., server=...) at src/mcts/mod.rs:259
#12 0x0000557a20d3046e in dream_go::mcts::predict_aux::{{closure}}::{{closure}} () at src/mcts/mod.rs:343
#13 std::sys_common::backtrace::__rust_begin_short_backtrace (f=...) at /checkout/src/libstd/sys_common/backtrace.rs:133
#14 0x0000557a20d2803e in std::thread::Builder::spawn::{{closure}}::{{closure}} () at /checkout/src/libstd/thread/mod.rs:406
#15 <std::panic::AssertUnwindSafe<F> as core::ops::function::FnOnce<()>>::call_once (self=..., _args=<optimized out>)
    at /checkout/src/libstd/panic.rs:300
#16 0x0000557a20d1e83e in std::panicking::try::do_call (data=<optimized out>) at /checkout/src/libstd/panicking.rs:479
#17 0x0000557a20def70f in __rust_maybe_catch_panic () at libpanic_unwind/lib.rs:102
#18 0x0000557a20d1e75c in std::panicking::try (f=...) at /checkout/src/libstd/panicking.rs:458
#19 0x0000557a20d28190 in std::panic::catch_unwind (f=...) at /checkout/src/libstd/panic.rs:365
#20 0x0000557a20d5533f in std::thread::Builder::spawn::{{closure}} () at /checkout/src/libstd/thread/mod.rs:405
#21 <F as alloc::boxed::FnBox<A>>::call_box (self=0x7f7b73a9fc00, args=<optimized out>) at /checkout/src/liballoc/boxed.rs:817
#22 0x0000557a20dde0b8 in _$LT$alloc..boxed..Box$LT$alloc..boxed..FnBox$LT$A$C$$u20$Output$u3d$R$GT$$u20$$u2b$$u20$$u27$a$GT$$u20$as$u20$core..ops..function..FnOnce$LT$A$GT$$GT$::call_once::hb0b36c038cd2d960 () at /checkout/src/liballoc/boxed.rs:827
#23 std::sys_common::thread::start_thread () at libstd/sys_common/thread.rs:24
#24 0x0000557a20de38c9 in std::sys::unix::thread::Thread::new::thread_start () at libstd/sys/unix/thread.rs:90
#25 0x00007f7b7647a7fc in start_thread (arg=0x7f7b195ff700) at pthread_create.c:465
#26 0x00007f7b75f90b5f in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:95

#0  0x00007f7b76481072 in futex_wait_cancelable (private=<optimized out>, expected=0, futex_word=0x7f7b29251148)
    at ../sysdeps/unix/sysv/linux/futex-internal.h:88
#1  __pthread_cond_wait_common (abstime=0x0, mutex=0x7f7b29251150, cond=0x7f7b29251120) at pthread_cond_wait.c:502
#2  __pthread_cond_wait (cond=0x7f7b29251120, mutex=0x7f7b29251150) at pthread_cond_wait.c:655
#3  0x0000557a20d28787 in std::sys::unix::condvar::Condvar::wait (self=<optimized out>, mutex=<optimized out>)
    at /checkout/src/libstd/sys/unix/condvar.rs:78
#4  std::sys_common::condvar::Condvar::wait (mutex=0x7f7b29251150, self=<optimized out>)
    at /checkout/src/libstd/sys_common/condvar.rs:51
#5  std::sync::condvar::Condvar::wait (self=<optimized out>, guard=...) at /checkout/src/libstd/sync/condvar.rs:212
#6  0x0000557a20d45739 in <dream_go::parallel::one_shot_channel::OneReceiver<T>>::recv (this=...)
    at src/parallel/one_shot_channel.rs:83
#7  0x0000557a20d12275 in <dream_go::parallel::service::ServiceGuard<'a, I>>::send (self=<optimized out>, req=...)
    at src/parallel/service.rs:205
#8  0x0000557a20d62211 in dream_go::mcts::predict_worker (context=..., server=...) at src/mcts/mod.rs:267
#9  0x0000557a20d3046e in dream_go::mcts::predict_aux::{{closure}}::{{closure}} () at src/mcts/mod.rs:343
#10 std::sys_common::backtrace::__rust_begin_short_backtrace (f=...) at /checkout/src/libstd/sys_common/backtrace.rs:133
#11 0x0000557a20d2803e in std::thread::Builder::spawn::{{closure}}::{{closure}} () at /checkout/src/libstd/thread/mod.rs:406
#12 <std::panic::AssertUnwindSafe<F> as core::ops::function::FnOnce<()>>::call_once (self=..., _args=<optimized out>)
    at /checkout/src/libstd/panic.rs:300
#13 0x0000557a20d1e83e in std::panicking::try::do_call (data=<optimized out>) at /checkout/src/libstd/panicking.rs:479
#14 0x0000557a20def70f in __rust_maybe_catch_panic () at libpanic_unwind/lib.rs:102
#15 0x0000557a20d1e75c in std::panicking::try (f=...) at /checkout/src/libstd/panicking.rs:458
#16 0x0000557a20d28190 in std::panic::catch_unwind (f=...) at /checkout/src/libstd/panic.rs:365
#17 0x0000557a20d5533f in std::thread::Builder::spawn::{{closure}} () at /checkout/src/libstd/thread/mod.rs:405
#18 <F as alloc::boxed::FnBox<A>>::call_box (self=0x7f7b73aa0a00, args=<optimized out>) at /checkout/src/liballoc/boxed.rs:817
#19 0x0000557a20dde0b8 in _$LT$alloc..boxed..Box$LT$alloc..boxed..FnBox$LT$A$C$$u20$Output$u3d$R$GT$$u20$$u2b$$u20$$u27$a$GT$$u20$as$u20$core..ops..function..FnOnce$LT$A$GT$$GT$::call_once::hb0b36c038cd2d960 () at /checkout/src/liballoc/boxed.rs:827
#20 std::sys_common::thread::start_thread () at libstd/sys_common/thread.rs:24
#21 0x0000557a20de38c9 in std::sys::unix::thread::Thread::new::thread_start () at libstd/sys/unix/thread.rs:90
#22 0x00007f7b7647a7fc in start_thread (arg=0x7f7b19bff700) at pthread_create.c:465
#23 0x00007f7b75f90b5f in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:95

I think this is enough to see what is going on. The Monte Carlo search threads are blocked in OneReceiver::recv, waiting for the service to respond to their requests, while all of the service workers are asleep in worker_thread, so there is a race condition somewhere in the service.
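For reference, the waiting pattern visible in the backtraces (a receiver parked on a Condvar until a worker delivers a value) only stays free of lost wake-ups if the value is checked and set under the same mutex that the condition variable uses. The following is a minimal sketch of that pattern, not dream-go's actual one_shot_channel; the names OneShot, send, recv and round_trip are hypothetical and exist only for illustration.

```rust
use std::sync::{Arc, Condvar, Mutex};
use std::thread;

/// Hypothetical one-shot slot (an assumption for illustration,
/// not the `one_shot_channel` from dream-go).
struct OneShot<T> {
    value: Mutex<Option<T>>,
    ready: Condvar,
}

impl<T> OneShot<T> {
    fn new() -> Arc<Self> {
        Arc::new(OneShot { value: Mutex::new(None), ready: Condvar::new() })
    }

    /// Store the value and notify while holding the lock, so a receiver
    /// cannot check the slot and go to sleep in between.
    fn send(&self, v: T) {
        let mut slot = self.value.lock().unwrap();
        *slot = Some(v);
        self.ready.notify_one();
    }

    /// Re-check the slot in a loop: this handles both spurious wake-ups
    /// and a value that arrived before we started waiting.
    fn recv(&self) -> T {
        let mut slot = self.value.lock().unwrap();
        loop {
            if let Some(v) = slot.take() {
                return v;
            }
            slot = self.ready.wait(slot).unwrap();
        }
    }
}

/// Send `x * 2` from a worker thread and receive it on the caller.
fn round_trip(x: u32) -> u32 {
    let chan = OneShot::new();
    let tx = Arc::clone(&chan);
    let worker = thread::spawn(move || tx.send(x * 2));
    let got = chan.recv();
    worker.join().unwrap();
    got
}
```

If the sender never runs (for example because every worker is itself asleep, or because it panicked), recv blocks forever, which matches the shape of the stuck search threads above.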

from dream-go.

kblomdahl commented on June 15, 2024

This might be a duplicate of #20, since the following shows up in the console when logging stderr:

thread '<unnamed>' panicked at 'called `Option::unwrap()` on a `None` value', libcore/option.rs:335:21
note: Run with `RUST_BACKTRACE=1` for a backtrace.

This is a fairly general error message, so it might be a different problem, but I would not complain if I could kill two birds with one stone.


kblomdahl commented on June 15, 2024

Setting the number of service threads to one seems to work around this issue. This suggests the race condition is limited to either worker_thread or the process method.

The current hypothesis is that, when using more than one service thread, the has_more = false thread is not always executed after the previous thread. To fix this we need to use the same lock inside of the server as inside, or provide another way to determine whether there are more requests pending.
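One way to read the hypothesis above: if the "are there more requests?" answer is computed under a different lock than the one guarding the queue, two service threads can each see a stale answer. The sketch below is an assumption about the general technique, not the actual Service code. It pops a request and computes has_more in the same critical section, so exactly one thread observes has_more = false. The names Queue, pop and count_last_seen are hypothetical.

```rust
use std::collections::VecDeque;
use std::sync::{Arc, Mutex};
use std::thread;

/// Hypothetical request queue (illustration only, not dream-go's Service).
struct Queue {
    pending: Mutex<VecDeque<u32>>,
}

impl Queue {
    /// Pop the next request and report, from the same critical section,
    /// whether any requests remain behind it.
    fn pop(&self) -> Option<(u32, bool)> {
        let mut pending = self.pending.lock().unwrap();
        let req = pending.pop_front()?;
        let has_more = !pending.is_empty();
        Some((req, has_more))
    }
}

/// Drain `n` pre-filled requests with `workers` threads and count how
/// many times a thread observed `has_more == false`.
fn count_last_seen(n: u32, workers: usize) -> usize {
    let queue = Arc::new(Queue { pending: Mutex::new((0..n).collect()) });
    let handles: Vec<_> = (0..workers)
        .map(|_| {
            let queue = Arc::clone(&queue);
            thread::spawn(move || {
                let mut last: usize = 0;
                while let Some((_req, has_more)) = queue.pop() {
                    if !has_more {
                        last += 1;
                    }
                }
                last
            })
        })
        .collect();
    handles.into_iter().map(|h| h.join().unwrap()).sum()
}
```

This only makes the flag consistent with the queue. It does not by itself guarantee that the thread holding the last request finishes after the others, which is the ordering part of the hypothesis.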


kblomdahl commented on June 15, 2024

As suggested in the previous post, enforcing an order on the requests in the Service seems to have made this issue disappear.
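The thread does not show how the ordering was enforced, so the following is only a sketch of one common technique and not the actual fix: each request carries a ticket number, and a thread may only run its step once all earlier tickets have finished. The names Sequencer, run_in_order and ordered_log are hypothetical.

```rust
use std::sync::{Arc, Condvar, Mutex};
use std::thread;

/// Hypothetical ticket sequencer (illustration only).
struct Sequencer {
    next: Mutex<usize>,
    turn: Condvar,
}

impl Sequencer {
    /// Block until it is `ticket`'s turn, run `f`, then admit the next ticket.
    fn run_in_order<F: FnOnce()>(&self, ticket: usize, f: F) {
        let mut next = self.next.lock().unwrap();
        while *next != ticket {
            next = self.turn.wait(next).unwrap();
        }
        f();
        *next += 1;
        // Wake every waiter; each one re-checks whether it is its turn.
        self.turn.notify_all();
    }
}

/// Spawn `n` threads in reverse ticket order and record the order in
/// which their steps actually ran.
fn ordered_log(n: usize) -> Vec<usize> {
    let seq = Arc::new(Sequencer { next: Mutex::new(0), turn: Condvar::new() });
    let log = Arc::new(Mutex::new(Vec::new()));
    let handles: Vec<_> = (0..n)
        .rev()
        .map(|ticket| {
            let seq = Arc::clone(&seq);
            let log = Arc::clone(&log);
            thread::spawn(move || {
                seq.run_in_order(ticket, || log.lock().unwrap().push(ticket));
            })
        })
        .collect();
    for h in handles {
        h.join().unwrap();
    }
    let out = log.lock().unwrap().clone();
    out
}
```

Even though the threads are started in reverse order, the recorded log comes out ascending, because each step waits for its predecessor. The trade-off is that the ordered step is fully serialized, so it should be kept short.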

