Comments (23)

benoit-germain avatar benoit-germain commented on August 21, 2024

I am afraid someone more knowledgeable about pthreads than I am will have to look into this, as I don't have the slightest idea what the problem could be.

from lanes.

davidm avatar davidm commented on August 21, 2024

I notice this on Ubuntu as well when trying to set up tests for LuaDist (LuaDist/Repository#76). The problem does not seem to occur on OS X or Cygwin.

benoit-germain avatar benoit-germain commented on August 21, 2024

Just curious if 32aa701 fixes it?

benoit-germain avatar benoit-germain commented on August 21, 2024

Forget it, I believe the fix is more complex. In fact, I'm not even sure I can avoid resorting to a dirty hack to fix it.

benoit-germain avatar benoit-germain commented on August 21, 2024

v3.1.6 contains a fix. Let's hope there is only one such bug :-).

hinrik avatar hinrik commented on August 21, 2024

I tried the latest git version and this issue is still present for me (LuaJIT 2.0.0 beta9 and Lua 5.1.5 on Linux 3.2.0, Debian Squeeze).

benoit-germain avatar benoit-germain commented on August 21, 2024

I've been trying to set up a bootable Debian USB drive to check this, but without success so far (the network doesn't work yet). But I haven't forgotten :-).

benoit-germain avatar benoit-germain commented on August 21, 2024

It looks like this crash at application shutdown occurs when the main thread invokes atexit_close_keepers(). The crash disappears when I no longer register it. It is as if the function pointer is invoked after the lanes SO has been unloaded.
I suppose the unload happens when the main Lua state is closed, and therefore before the handler is called.
I don't know why this doesn't crash on Windows. Maybe the handling differs there and the DLL doesn't actually get unloaded, although the respective documentation says the behavior is basically the same.

I'll do some tests with this behavior removed and see how things fare.
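
To make the ordering problem above concrete, here is a minimal sketch in plain C, with no Lua and no real shared object. All names (module_load, module_unload, host_exit_handler, close_keepers) are hypothetical and are not Lanes' actual code. It assumes the exit-time hook lives in code that stays loaded (the host), while the module only publishes a cleanup pointer and clears it when it goes away, so that no stale pointer is left for the handler to call.

```c
#include <assert.h>
#include <stdlib.h>

typedef void (*cleanup_fn)(void);

/* Slot owned by the host program (assumption: the host outlives the module). */
static cleanup_fn g_cleanup = NULL;
static int g_cleanup_runs = 0;

/* Hypothetical stand-in for the module's keeper cleanup. */
static void close_keepers(void)
{
    g_cleanup_runs++;
}

/* Called when the module is loaded: publish the cleanup function. */
void module_load(void)
{
    g_cleanup = close_keepers;
}

/* Called when the module is about to go away (e.g. when the main state
 * is closed): run the cleanup now and clear the slot. */
void module_unload(void)
{
    if (g_cleanup != NULL) {
        g_cleanup();
        g_cleanup = NULL;
    }
}

/* Exit-time hook living in the host: it only calls what is still published. */
void host_exit_handler(void)
{
    if (g_cleanup != NULL) {
        g_cleanup();
        g_cleanup = NULL;
    }
}

/* Registers the hook with atexit(); returns 0 on success, like atexit(). */
int install_exit_handler(void)
{
    assert(g_cleanup == NULL || g_cleanup == close_keepers);
    return atexit(host_exit_handler);
}

/* Number of times the cleanup actually ran. */
int cleanup_run_count(void)
{
    return g_cleanup_runs;
}
```

A host would call install_exit_handler() once, module_load() when the module is opened, and module_unload() before the module's code can disappear. If the process exits after the unload, host_exit_handler() finds an empty slot and does nothing, instead of jumping into code that is no longer mapped.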

mkottman avatar mkottman commented on August 21, 2024

I'd like to comment on this:

Starting program: /usr/bin/lua5.1 tests/errhangtest.lua
[Thread debugging using libthread_db enabled]
Cannot find new threads: generic error

This is because Lua is not compiled with pthread support. When Lua is loaded into gdb, gdb sees there is no pthread support and sets itself up accordingly. When you later load a pthread-enabled module into Lua, gdb gets confused and spits out this error.

I usually handle it by compiling a custom version of Lua with pthread support and debug symbols enabled, which I call luad so it does not mess with my system Lua:

--- lua-5.2.1/src/Makefile  2012-03-09 17:32:16.000000000 +0100
+++ lua-5.2.1-pthread/src/Makefile  2012-11-21 10:03:55.284051778 +0100
@@ -103,7 +103,7 @@
 generic: $(ALL)

 linux:
-   $(MAKE) $(ALL) SYSCFLAGS="-DLUA_USE_LINUX" SYSLIBS="-Wl,-E -ldl -lreadline -lncurses"
+   $(MAKE) $(ALL) SYSCFLAGS="-DLUA_USE_LINUX -pthread -ggdb" SYSLIBS="-Wl,-E -ldl -lreadline -lncurses -pthread -ggdb"

 macosx:
    $(MAKE) $(ALL) SYSCFLAGS="-DLUA_USE_MACOSX" SYSLIBS="-lreadline"

benoit-germain avatar benoit-germain commented on August 21, 2024

Another option is to add this to your .gdbinit so that gdb always loads pthread by itself even if the debugged image isn't pthread-enabled:
set env LD_PRELOAD /lib/libpthread.so.0

benoit-germain avatar benoit-germain commented on August 21, 2024

fixed by f154e1f

hinrik avatar hinrik commented on August 21, 2024

I hate to be the bearer of bad news, but this didn't fix the issue for me. errhangtest.lua still hangs (or segfaults) at exit sometimes. It happens on two different x86-64 Debian machines I have. If you can't reproduce it, I can give you shell access to one of them (just drop me a line at [email protected] or literal on irc.freenode.net) so you can troubleshoot it.

benoit-germain avatar benoit-germain commented on August 21, 2024

OK, I reproduced it with Debian Squeeze amd64 as well. However, it doesn't hang with Debian Squeeze i386.

benoit-germain avatar benoit-germain commented on August 21, 2024

Here is the callstack I get:

(gdb) bt
#0  0x00007ffff7bce1fc in pthread_cond_wait@@GLIBC_2.3.2 () from /lib/libpthread.so.0
#1  0x00007ffff6d4da5b in THREAD_WAIT_IMPL () from /usr/local/lib/lua/5.1/lanes/core.so
#2  0x00007ffff6d44940 in selfdestruct_gc () from /usr/local/lib/lua/5.1/lanes/core.so
#3  0x0000000000408696 in ?? ()
#4  0x0000000000408ba9 in ?? ()
#5  0x000000000040a39f in ?? ()
#6  0x000000000040aa28 in ?? ()
#7  0x0000000000408287 in ?? ()
#8  0x000000000040e14e in lua_close ()
#9  0x00000000004041c1 in main ()

If I change the selfdestruct_gc code at line 1213 to process the full selfdestruct chain as on Windows, the application hangs much less often, but I still get an occasional crash:

(gdb) r errhangtest.lua
Starting program: /usr/bin/lua errhangtest.lua
[Thread debugging using libthread_db enabled]
[New Thread 0x7ffff6b11700 (LWP 7242)]
true    true
false   tried to copy unsupported types
oh boy

Program received signal SIGSEGV, Segmentation fault.
[Switching to Thread 0x7ffff6b11700 (LWP 7242)]
0x00007ffff6d43ccf in ?? ()
(gdb) bt
#0  0x00007ffff6d43ccf in ?? ()
#1  0x00007ffff6b11700 in ?? ()
#2  0x000000000065d720 in ?? ()
#3  0x000000000065d720 in ?? ()
#4  0x0000000000000002 in ?? ()
#5  0x00000000006503f0 in ?? ()
#6  0x00007ffff6b11700 in ?? ()
#7  0x00007ffff6b11700 in ?? ()
#8  0x00007ffff6b11700 in ?? ()
#9  0x0000000000000000 in ?? ()

It looks like the crash occurs inside the timer lane's thread (which is the only one the test creates).
But I really don't know why the 32-bit and 64-bit builds would differ in that regard.
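
For reference, the first backtrace above shows the main thread blocked in pthread_cond_wait under selfdestruct_gc. One general way to keep a shutdown path from blocking forever on a condition variable is to bound the wait with pthread_cond_timedwait. The sketch below is only an illustration of that POSIX pattern, not what THREAD_WAIT_IMPL actually does; the done_signal type and its functions are hypothetical names.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <time.h>

/* Hypothetical "thread has finished" signal shared with a worker thread. */
typedef struct {
    pthread_mutex_t mutex;
    pthread_cond_t  cond;
    bool            done;
} done_signal;

void done_signal_init(done_signal *s)
{
    pthread_mutex_init(&s->mutex, NULL);
    pthread_cond_init(&s->cond, NULL);
    s->done = false;
}

/* Called by the worker when it has finished. */
void done_signal_set(done_signal *s)
{
    pthread_mutex_lock(&s->mutex);
    s->done = true;
    pthread_cond_broadcast(&s->cond);
    pthread_mutex_unlock(&s->mutex);
}

/* Waits at most timeout_ms milliseconds.
 * Returns true if the worker signalled completion, false otherwise. */
bool done_signal_wait(done_signal *s, long timeout_ms)
{
    struct timespec deadline;
    int rc = 0;
    bool finished;

    assert(timeout_ms >= 0);

    /* A default condition variable measures its deadline on CLOCK_REALTIME. */
    clock_gettime(CLOCK_REALTIME, &deadline);
    deadline.tv_sec  += timeout_ms / 1000;
    deadline.tv_nsec += (timeout_ms % 1000) * 1000000L;
    if (deadline.tv_nsec >= 1000000000L) {
        deadline.tv_sec  += 1;
        deadline.tv_nsec -= 1000000000L;
    }

    pthread_mutex_lock(&s->mutex);
    /* Loop to absorb spurious wakeups; stop on timeout or any error. */
    while (!s->done && rc == 0)
        rc = pthread_cond_timedwait(&s->cond, &s->mutex, &deadline);
    finished = s->done;
    pthread_mutex_unlock(&s->mutex);
    return finished;
}
```

A caller that gets false back can then decide what to do with the thread that did not finish in time (for example cancel it or report it) rather than hanging at exit.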

benoit-germain avatar benoit-germain commented on August 21, 2024

I caused the program to crash while inside gdb and inspected the process. Here is what I see:

benoit@benoit-germain-debian64:~$ lsof -p 10238
COMMAND   PID   USER   FD   TYPE DEVICE SIZE/OFF   NODE NAME
lua     10238 benoit  cwd    DIR    8,1     4096  82584 /home/benoit/lanes-master/tests
lua     10238 benoit  rtd    DIR    8,1     4096      2 /
lua     10238 benoit  txt    REG    8,1   167904 463577 /usr/bin/lua5.1
lua     10238 benoit  mem    REG    8,1   286776 434221 /lib/libncurses.so.5.7
lua     10238 benoit  mem    REG    8,1  1437064 434187 /lib/libc-2.11.3.so
lua     10238 benoit  mem    REG    8,1   273840 434307 /lib/libreadline.so.6.1
lua     10238 benoit  mem    REG    8,1    14696 434199 /lib/libdl-2.11.3.so
lua     10238 benoit  mem    REG    8,1   530736 434200 /lib/libm-2.11.3.so
lua     10238 benoit  mem    REG    8,1   131258 434182 /lib/libpthread-2.11.3.so
lua     10238 benoit  mem    REG    8,1   128744 434183 /lib/ld-2.11.3.so
lua     10238 benoit    0u   CHR  136,0      0t0      3 /dev/pts/0
lua     10238 benoit    1u   CHR  136,0      0t0      3 /dev/pts/0
lua     10238 benoit    2u   CHR  136,0      0t0      3 /dev/pts/0
lua     10238 benoit    3r  FIFO    0,8      0t0  29125 pipe
lua     10238 benoit    4w  FIFO    0,8      0t0  29125 pipe

As you can see, lanes/core.so is no longer loaded, so it was unloaded before all objects were garbage collected, including the one Lanes registers so that its __gc metamethod performs thread cleanup. This seems to be related to a known Lua issue that has existed since Lua 5.1 and is fixed in Lua 5.2.1. But again, why doesn't it happen 100% of the time, and why should it work fine on 32-bit flavors?

mwild1 avatar mwild1 commented on August 21, 2024

I can reproduce this crash on 32-bit. Do you have any ideas for how this might be fixed on the Lanes side?

benoit-germain avatar benoit-germain commented on August 21, 2024

A simple repro case would help answer this. If I can debug it, I should be able to see what's wrong. But so far I haven't had any issue (though I don't work on Linux, and even a few tries in VirtualBox didn't crash).

benoit-germain avatar benoit-germain commented on August 21, 2024

Just in case: is it fixed in version 3.6.4?

hinrik avatar hinrik commented on August 21, 2024

Not for me. Here's what happens on both 3.6.4 and 3.6.6:

$ lua errhangtest.lua 
true    true
false   tried to copy unsupported types
oh boy
Segmentation fault

Got the same result on both Lua 5.1.5 and 5.2.2 (Debian jessie x86-64)

benoit-germain avatar benoit-germain commented on August 21, 2024

I am somewhat stumped. I have a 64-bit Debian 6.0.6 VirtualBox VM where it works just fine:

$ lua errhangtest.lua
3.6.6
true    true
false   tried to copy unsupported types
oh boy

benoit-germain avatar benoit-germain commented on August 21, 2024

938ee19 fixes a possible shutdown sequence crash. I don't really think this could be the actual issue, since the fix concerns the protect_allocator feature, but who knows :-). Can someone give it a try?

hinrik avatar hinrik commented on August 21, 2024

It works now. On both 5.1.5 and 5.2.2.

benoit-germain avatar benoit-germain commented on August 21, 2024

w00t!
