Update: This only happens after keras has been imported.
from guppy3.
I just tried, Python 3.8 on Linux:
$ cat issue25.py
import keras
import guppy
h = guppy.hpy()
he = h.heap()
print(he)
$ python issue25.py
2021-01-16 22:06:26.697870: W tensorflow/stream_executor/platform/default/dso_loader.cc:60] Could not load dynamic library 'libcudart.so.11.0'; dlerror: libcudart.so.11.0: cannot open shared object file: No such file or directory
2021-01-16 22:06:26.697921: I tensorflow/stream_executor/cuda/cudart_stub.cc:29] Ignore above cudart dlerror if you do not have a GPU set up on your machine.
Partition of a set of 592677 objects. Total size = 78785975 bytes.
Index Count % Size % Cumulative % Kind (class / dict of class)
0 150998 25 23972538 30 23972538 30 str
1 136448 23 10686424 14 34658962 44 tuple
2 41460 7 7348668 9 42007630 53 types.CodeType
3 82657 14 7000740 9 49008370 62 bytes
4 41609 7 5658824 7 54667194 69 function
5 4039 1 3950792 5 58617986 74 type
6 2471 0 3917936 5 62535922 79 dict of module
7 7476 1 3280376 4 65816298 84 dict (no owner)
8 4039 1 2036536 3 67852834 86 dict of type
9 8052 1 1716128 2 69568962 88 dict of function
<762 more rows. Type e.g. '_.more' to view.>
Looks like importing keras by itself plus the given guppy code doesn't cause a crash.
Is it possible for you to get a core dump or a stack trace of the crash? In the meantime, I'll try to find a Windows 7 install that I can test on. I'm not very familiar with Windows debuggers though.
from guppy3.
- I booted a test VM from https://developer.microsoft.com/en-us/microsoft-edge/tools/vms/ ("IE11 on Win7")
- Installed Python from https://www.python.org/downloads/release/python-387/ with default settings
- Created a venv with python -m venv venv, then ran python -m pip install -U pip wheel setuptools
- Then proceeded to install keras via pip install keras. A dependency, h5py, failed to install with "Unable to load dependency HDF5"
I'm not a Windows person. Would you be willing to share some steps on how to get a test environment set up that is able to reproduce this crash?
from guppy3.
Thanks for the considerable effort. You can find a binary of h5py for Windows, and many other packages, here: https://www.lfd.uci.edu/~gohlke/pythonlibs/
Then run the following:
import keras
import guppy
h = guppy.hpy()
he = h.heap()
from guppy3.
Ok, the wheel published there (h5py‑2.10.0‑cp38‑cp38‑win32.whl) does allow me to install h5py, and I was able to successfully install keras. However, upon importing keras it says "Keras requires TensorFlow 2.2 or higher" (this also happened on Linux), and I went to check the page. Only amd64 wheels are available, both from that page and PyPI.
I googled around a bit and it seems that TensorFlow for x86 32-bit is probably very complicated and completely unsupported (https://stackoverflow.com/q/44449972). Are you using amd64 Win7? Let me try to find a test VM image for that.
from guppy3.
If I'm not mistaken, when people say "amd64" it's just a silly way to say
64 bit processor, regardless of whether it's Intel or AMD. In other words,
if your Windows computer is from the last 10 years and it's not a weird
netbook or something, it's likely "amd64" :)
If the win32 wheel worked for you, I guess your VM is 32bit. You should do
a 64bit VM just because that's what 99% of Windows users do.
Yes, the host machine is x86 64-bit. I don't have a licensed Win 7 to test with, hence I'm looking for a VM to download. Microsoft's test VM images from https://developer.microsoft.com/en-us/microsoft-edge/tools/vms/ are only x86 32-bit images.
Also, you're going the reproduction route. Another way to tackle this would
be to get more logging output from my machine to let you figure out the
bug. If there's anything you want me to run, as long as it's not something
that requires a lot of setup and work, I'll be happy to do that.
I'm guessing this is a segfault in some of the C code. Is it possible for you to use faulthandler or get a C stack trace somehow? Is it possible to get a core dump somehow?
I'm also testing on a Win 10 amd64 install (licensed copy on bare metal) and am unable to reproduce the crash (he = h.heap() runs successfully and he prints successfully).
from guppy3.
Got a pirated copy of Windows 7. Will try to reproduce on that later.
from guppy3.
LOL, GitHub is owned by Microsoft, hope they won't notice ;) They don't even let people buy legal copies of Windows 7, and I tried multiple times.
I've never used these tools you mentioned. I don't want to spend time researching, but if you'll give me lines to run, I'll run them.
from guppy3.
Cannot reproduce on 64-bit Win7. The packages were installed with python -m venv venv, venv\Scripts\activate, python -m pip install -U pip wheel setuptools, pip install keras tensorflow, pip install guppy3.
I've never used these tools you mentioned. I don't want to spend time researching, but if you'll give me lines to run, I'll run them.
The one I suggested is faulthandler. faulthandler runs passively; you just need to enable it:
import faulthandler
faulthandler.enable()
The problem is that faulthandler is only able to dump a stack trace for the interpreted python code. The fault probably happens in some native C code and it would be helpful to pinpoint the native function that faulted.
I googled around a bit and found https://stackoverflow.com/a/49050274 regarding Windows Error Reporting which might be helpful in that.
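A minimal sketch of enabling faulthandler so that the traceback is also written to a file, which may help when the console output is lost with the crash. The helper name enable_crash_log and the log location are hypothetical; faulthandler.enable(file=..., all_threads=True) is the standard-library call.

```python
import faulthandler
import os
import tempfile


def enable_crash_log(path):
    """Enable faulthandler and send fatal-error tracebacks to `path`.

    The returned file object must stay open for as long as faulthandler
    is enabled, because faulthandler writes to its file descriptor.
    """
    log_file = open(path, "w")
    faulthandler.enable(file=log_file, all_threads=True)
    return log_file


# Hypothetical log location; any writable path works.
crash_log_path = os.path.join(tempfile.gettempdir(), "guppy_crash.log")
crash_log = enable_crash_log(crash_log_path)

# From here on, run the code that crashes, for example:
#   import keras, guppy
#   guppy.hpy().heap()
```

As noted above, this still only shows Python frames; the native frame that faulted has to come from a dump.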
from guppy3.
This is the output from faulthandler:
$ python fluff.py
2021-01-19 17:57:03.684490: W tensorflow/stream_executor/platform/default/dso_loader.cc:59] Could not load dynamic library 'cudart64_101.dll'; dlerror: cudart64_101.dll not found
2021-01-19 17:57:03.696491: I tensorflow/stream_executor/cuda/cudart_stub.cc:29] Ignore above cudart dlerror if you do not have a GPU set up on your machine.
Windows fatal exception: access violation
Current thread 0x0000865c (most recent call first):
File "C:\Program Files\Python38\lib\site-packages\guppy\heapy\View.py", line 479 in referrers
File "C:\Program Files\Python38\lib\site-packages\guppy\heapy\UniSet.py", line 556 in <lambda>
File "C:\Program Files\Python38\lib\site-packages\guppy\etc\Descriptor.py", line 32 in __get__
File "C:\Program Files\Python38\lib\site-packages\guppy\heapy\RefPat.py", line 613 in relimg
File "C:\Program Files\Python38\lib\site-packages\guppy\heapy\RefPat.py", line 487 in get_children
File "C:\Program Files\Python38\lib\site-packages\guppy\heapy\RefPat.py", line 467 in linegenerator
File "C:\Program Files\Python38\lib\site-packages\guppy\heapy\RefPat.py", line 423 in generate
File "C:\Program Files\Python38\lib\site-packages\guppy\heapy\RefPat.py", line 438 in get_row_indexed
File "C:\Program Files\Python38\lib\site-packages\guppy\heapy\RefPat.py", line 456 in iterlines
File "C:\Program Files\Python38\lib\site-packages\guppy\heapy\RefPat.py", line 417 in _oh_get_line_iter
File "C:\Program Files\Python38\lib\site-packages\guppy\heapy\OutputHandling.py", line 212 in line_at
File "C:\Program Files\Python38\lib\site-packages\guppy\heapy\OutputHandling.py", line 232 in lines_from
File "C:\Program Files\Python38\lib\site-packages\guppy\heapy\OutputHandling.py", line 307 in f
File "C:\Program Files\Python38\lib\site-packages\guppy\heapy\OutputHandling.py", line 339 in <lambda>
File "C:\Program Files\Python38\lib\site-packages\guppy\heapy\View.py", line 256 in enter
File "C:\Program Files\Python38\lib\site-packages\guppy\heapy\OutputHandling.py", line 339 in get_str
File "C:\Program Files\Python38\lib\site-packages\guppy\heapy\OutputHandling.py", line 272 in get_str_of_top
File "C:\Program Files\Python38\lib\site-packages\guppy\heapy\OutputHandling.py", line 386 in reprfunc
File "C:\Program Files\Python38\lib\site-packages\guppy\heapy\View.py", line 356 in heap
File "C:\Program Files\Python38\lib\site-packages\guppy\heapy\Use.py", line 192 in heap
File "fluff.py", line 6 in <module>
Segmentation fault
from guppy3.
I was able to create the WER dump only when the debugger was on, not sure why. When it was off, the crash still happens just without the Windows dialog. You can download the dump here but I have no idea how you would read it.
from guppy3.
This is the output from faulthandler:
Looks like the last Python frame is View.py#L479, which would call into hv.c#L1518. This is a rather complex C function to work around issue #7.
You can download the dump here but I have no idea how you would read it.
Looks like a minidump file:
zhuyifei1999@zhuyifei1999-ThinkPad-T480 ~/guppy3/issue25 $ file python.exe.13268.dmp
python.exe.13268.dmp: Mini DuMP crash report, 12 streams, Tue Jan 19 16:04:33 2021, 0x1826 type
Searching on Google turned up a tool called Breakpad that works with this format, and I'm looking into it.
from guppy3.
zhuyifei1999@zhuyifei1999-ThinkPad-T480 ~/guppy3/issue25 $ breakpad/src/src/processor/minidump_dump python.exe.13268.dmp > python.exe.13268.dmp.info
The dumped information reports exception
MDException
thread_id = 0x4cb8
exception_record.exception_code = 0xc0000005
exception_record.exception_flags = 0x0
exception_record.exception_record = 0x0
exception_record.exception_address = 0x7fedd47feaf
exception_record.number_parameters = 2
exception_record.exception_information[ 0] = 0x0
exception_record.exception_information[ 1] = 0xd0
thread_context.data_size = 1232
thread_context.rva = 0xeb50
and the context:
MDRawContextAMD64
p1_home = 0x2b9370
p2_home = 0x2b8e80
p3_home = 0x0
p4_home = 0x0
p5_home = 0x3ff00000000
p6_home = 0x0
context_flags = 0x10005f
mx_csr = 0x1fa9
cs = 0x33
ds = 0x2b
es = 0x2b
fs = 0x53
gs = 0x2b
ss = 0x2b
eflags = 0x10246
dr0 = 0x0
dr1 = 0x0
dr2 = 0x0
dr3 = 0x0
dr6 = 0x0
dr7 = 0x0
rax = 0x1d2975e0
rcx = 0x1e16e580
rdx = 0x83c3270
rbx = 0x1d2975e0
rsp = 0x2b95b0
rbp = 0x1e16e580
rsi = 0x1e16e580
rdi = 0x29e7e20
r8 = 0x0
r9 = 0x83c3270
r10 = 0x1c146350
r11 = 0x2b95f8
r12 = 0x0
r13 = 0x0
r14 = 0x0
r15 = 0x1
rip = 0x7fedd47feaf
This address maps into python core
module[4]
MDRawModule
base_of_image = 0x7fedd450000
size_of_image = 0x42c000
checksum = 0x40703a
time_date_stamp = 0x5dfab24b 2019-12-18 23:12:11
module_name_rva = 0x68c0
version_info.signature = 0xfeef04bd
version_info.struct_version = 0x10000
version_info.file_version = 0x30008:0x47e03f5
version_info.product_version = 0x30008:0x47e03f5
version_info.file_flags_mask = 0x3f
version_info.file_flags = 0x0
version_info.file_os = 0x4
version_info.file_type = 0x2
version_info.file_subtype = 0x0
version_info.file_date = 0x0:0x0
cv_record.data_size = 57
cv_record.rva = 0x11745
misc_record.data_size = 0
misc_record.rva = 0x0
(code_file) = "C:\Program Files\Python38\python38.dll"
(code_identifier) = "5DFAB24B42c000"
(cv_record).cv_signature = 0x53445352
(cv_record).signature = 07725f2f-6ae8-46c5-955b-103f10b1c445
(cv_record).age = 1
(cv_record).pdb_file_name = "C:\A\27\b\bin\amd64\python38.pdb"
(misc_record) = (null)
(debug_file) = "C:\A\27\b\bin\amd64\python38.pdb"
(debug_identifier) = "07725F2F6AE846C5955B103F10B1C4451"
(version) = "3.8.1150.1013"
The offset matches the original description
Exception Offset: 000000000002feaf
>>> hex(0x7fedd47feaf - 0x7fedd450000)
'0x2feaf'
Let me see if I can locate which function is at 0x2feaf.
from guppy3.
time_date_stamp = 0x5dfab24b 2019-12-18 23:12:11
This matches the release date of Python 3.8.1... hmm
Downloaded the Python 3.8 DLL from https://www.python.org/ftp/python/3.8.1/python-3.8.1-embed-amd64.zip, and readpe says:
zhuyifei1999@zhuyifei1999-ThinkPad-T480 ~/guppy3/issue25/py3.8.1/win $ readpe python38.dll
[...]
COFF/File header
[...]
Date/time stamp: 1576710731 (Wed, 18 Dec 2019 23:12:11 UTC)
[...]
Optional/Image header
[...]
Checksum: 0x40703a
[...]
Nice.
I was under the assumption that you are running under latest Python 3.8 (3.8.7). Let me see if I can reproduce it by using 3.8.1. If not I'll look deeper into the symbols.
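The timestamp comparison above can be reproduced in a few lines. This is only a sketch that decodes the time_date_stamp value from the minidump module record as a Unix timestamp, which is how readpe presents the same COFF header field.

```python
from datetime import datetime, timezone

# TimeDateStamp of python38.dll as reported in the minidump module record.
time_date_stamp = 0x5DFAB24B

# Interpret it as seconds since the Unix epoch, in UTC.
build_time = datetime.fromtimestamp(time_date_stamp, tz=timezone.utc)
print(time_date_stamp, build_time.isoformat())
```

The decimal value and the date agree with the readpe output for the 3.8.1 DLL, which is what identifies the Python build in the crashing process.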
from guppy3.
Not on Linux
(venv.py3.8.1) zhuyifei1999@zhuyifei1999-ThinkPad-T480 ~/guppy3/issue25/py3.8.1/Python-3.8.1 $ python
Python 3.8.1 (default, Jan 19 2021, 21:35:33)
[GCC 10.2.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import keras
2021-01-19 21:40:24.925223: W tensorflow/stream_executor/platform/default/dso_loader.cc:60] Could not load dynamic library 'libcudart.so.11.0'; dlerror: libcudart.so.11.0: cannot open shared object file: No such file or directory
2021-01-19 21:40:24.925262: I tensorflow/stream_executor/cuda/cudart_stub.cc:29] Ignore above cudart dlerror if you do not have a GPU set up on your machine.
>>> import guppy
>>> h = guppy.hpy()
>>> he = h.heap()
>>> he
Partition of a set of 590742 objects. Total size = 78772366 bytes.
Index Count % Size % Cumulative % Kind (class / dict of class)
0 149976 25 24049357 31 24049357 31 str
1 135555 23 10635376 14 34684733 44 tuple
2 41449 7 7346732 9 42031465 53 types.CodeType
3 82631 14 6998144 9 49029609 62 bytes
4 41603 7 5658008 7 54687617 69 function
5 4038 1 3949728 5 58637345 74 type
6 2473 0 3919752 5 62557097 79 dict of module
7 7475 1 3242320 4 65799417 84 dict (no owner)
8 4038 1 2036176 3 67835593 86 dict of type
9 8052 1 1716128 2 69551721 88 dict of function
<765 more rows. Type e.g. '_.more' to view.>
>>>
Not on Win 7 either
AFAICT, the python38.dll does not have a symbol table
zhuyifei1999@zhuyifei1999-ThinkPad-T480 ~/guppy3/issue25/py3.8.1/win $ nm python38.dll
nm: python38.dll: no symbols
And the nearest functions to 0x2feaf, AFAICT, are:
zhuyifei1999@zhuyifei1999-ThinkPad-T480 ~/guppy3/issue25/py3.8.1/win $ readpe python38.dll | grep -A 1 Function | grep 0x | sort | less
[...]
0x2e140: PyObject_RichCompareBool
0x2e384: PySet_Contains
0x2e98: _PyUnicode_EncodeUTF32
0x2f010: PyNumber_InPlaceAdd
0x2f160: PyNumber_Add
0x2fae4: PyLong_AsDouble
0x2fe90: PyWeakref_NewRef
0x302060: _Py_ascii_whitespace
0x30554: PyLong_FromLongLong
0x30718: _PyLong_Frexp
0x30eb0: PyObject_GC_Track
[...]
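Because plain sort orders the addresses as text, entries such as 0x2e98 and 0x302060 land in the middle of the 0x2e... and 0x30... rows. A numeric lookup avoids reading the list by eye. The sketch below uses only the excerpt shown above (so the answer is only as good as the excerpt), and the helper name nearest_export is made up for illustration.

```python
import bisect

# Excerpt of (RVA, exported name) pairs copied from the readpe listing.
EXPORTS = [
    (0x2E140, "PyObject_RichCompareBool"),
    (0x2E384, "PySet_Contains"),
    (0x2E98, "_PyUnicode_EncodeUTF32"),
    (0x2F010, "PyNumber_InPlaceAdd"),
    (0x2F160, "PyNumber_Add"),
    (0x2FAE4, "PyLong_AsDouble"),
    (0x2FE90, "PyWeakref_NewRef"),
    (0x302060, "_Py_ascii_whitespace"),
    (0x30554, "PyLong_FromLongLong"),
    (0x30718, "_PyLong_Frexp"),
    (0x30EB0, "PyObject_GC_Track"),
]


def nearest_export(rva, exports=EXPORTS):
    """Return the (rva, name) of the last export at or before `rva`."""
    table = sorted(exports)  # numeric order, unlike the text sort
    addresses = [addr for addr, _ in table]
    index = bisect.bisect_right(addresses, rva) - 1
    if index < 0:
        return None
    return table[index]


print(nearest_export(0x2FEAF))
```

Only exported symbols are covered, so a non-exported function starting between 0x2fe90 and 0x2feaf would be missed; the disassembly is what confirms the match.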
This is the disassembly of PyWeakref_NewRef, until its return (yes, I'm aware that 2ff27 jumps further down, but it's more effort than necessary to track precisely where the function ends):
zhuyifei1999@zhuyifei1999-ThinkPad-T480 ~/guppy3/issue25/py3.8.1/win $ objdump -d python38.dll | less
[...]
18002fe90: 4c 8b dc mov %rsp,%r11
18002fe93: 49 89 5b 10 mov %rbx,0x10(%r11)
18002fe97: 55 push %rbp
18002fe98: 56 push %rsi
18002fe99: 57 push %rdi
18002fe9a: 41 56 push %r14
18002fe9c: 41 57 push %r15
18002fe9e: 48 83 ec 20 sub $0x20,%rsp
18002fea2: 4c 8b 41 08 mov 0x8(%rcx),%r8
18002fea6: 45 33 f6 xor %r14d,%r14d
18002fea9: 4c 8b ca mov %rdx,%r9
18002feac: 48 8b e9 mov %rcx,%rbp
18002feaf: 49 8b 80 d0 00 00 00 mov 0xd0(%r8),%rax
18002feb6: 48 85 c0 test %rax,%rax
18002feb9: 0f 8e bd d2 12 00 jle 0x18015d17c
18002febf: 48 8d 34 08 lea (%rax,%rcx,1),%rsi
18002fec3: 4d 89 73 08 mov %r14,0x8(%r11)
18002fec7: 48 8b 06 mov (%rsi),%rax
18002feca: 4c 8d 3d 5f 71 37 00 lea 0x37715f(%rip),%r15 # 0x1803a7030
18002fed1: 4d 89 73 18 mov %r14,0x18(%r11)
18002fed5: 48 8b d0 mov %rax,%rdx
18002fed8: 41 8b de mov %r14d,%ebx
18002fedb: 48 85 c0 test %rax,%rax
18002fede: 74 2c je 0x18002ff0c
18002fee0: 4c 39 70 18 cmp %r14,0x18(%rax)
18002fee4: 75 26 jne 0x18002ff0c
18002fee6: 41 8b ce mov %r14d,%ecx
18002fee9: 4c 39 78 08 cmp %r15,0x8(%rax)
18002feed: 75 0b jne 0x18002fefa
18002feef: 49 89 43 08 mov %rax,0x8(%r11)
18002fef3: 48 8b ca mov %rdx,%rcx
18002fef6: 48 8b 40 30 mov 0x30(%rax),%rax
18002fefa: 48 8b d9 mov %rcx,%rbx
18002fefd: 48 85 c0 test %rax,%rax
18002ff00: 74 0a je 0x18002ff0c
18002ff02: 4c 39 70 18 cmp %r14,0x18(%rax)
18002ff06: 0f 84 8e d2 12 00 je 0x18015d19a
18002ff0c: 48 8d 05 6d 49 37 00 lea 0x37496d(%rip),%rax # 0x1803a4880
18002ff13: 49 8b fe mov %r14,%rdi
18002ff16: 4c 3b c8 cmp %rax,%r9
18002ff19: 49 0f 45 f9 cmovne %r9,%rdi
18002ff1d: 48 85 ff test %rdi,%rdi
18002ff20: 49 0f 45 de cmovne %r14,%rbx
18002ff24: 48 85 db test %rbx,%rbx
18002ff27: 74 17 je 0x18002ff40
18002ff29: 48 ff 03 incq (%rbx)
18002ff2c: 48 8b c3 mov %rbx,%rax
18002ff2f: 48 8b 5c 24 58 mov 0x58(%rsp),%rbx
18002ff34: 48 83 c4 20 add $0x20,%rsp
18002ff38: 41 5f pop %r15
18002ff3a: 41 5e pop %r14
18002ff3c: 5f pop %rdi
18002ff3d: 5e pop %rsi
18002ff3e: 5d pop %rbp
18002ff3f: c3 retq
2feaf is mov 0xd0(%r8),%rax, which is indeed an instruction that could fault.
Questions for myself:
- how does reference graph generation end up creating a new weak reference?
- why would creating new weak reference fault?
from guppy3.
@cool-RR can you check, if you create a new virtual environment with the latest packages, it still faults inside the virtual environment? Like:
python -m venv venv
venv\Scripts\activate
python -m pip install -U pip wheel setuptools
pip install -U keras tensorflow guppy3
python fluff.py
from guppy3.
Under the Microsoft x64 calling convention, the first four integer or pointer arguments are passed in RCX, RDX, R8, and R9.
PyObject *
PyWeakref_NewRef(PyObject *ob, PyObject *callback)
ob is in RCX, callback is in RDX.
18002fea2: 4c 8b 41 08 mov 0x8(%rcx),%r8
gef➤ ptype /o PyObject
type = struct _object {
/* 0 | 8 */ Py_ssize_t ob_refcnt;
/* 8 | 8 */ struct _typeobject *ob_type;
/* total size (bytes): 16 */
}
struct _typeobject *r8 = ob->ob_type
18002feaf: 49 8b 80 d0 00 00 00 mov 0xd0(%r8),%rax
gef➤ ptype /o struct _typeobject
/* offset | size */ type = struct _typeobject {
/* 0 | 24 */ PyVarObject ob_base;
/* 24 | 8 */ const char *tp_name;
[...]
/* 208 | 8 */ Py_ssize_t tp_weaklistoffset;
[...]
/* 408 | 8 */ int (*tp_print)(PyObject *, FILE *, int);
/* total size (bytes): 416 */
}
Py_ssize_t rax = r8->tp_weaklistoffset
That's weakrefobject.c#L801,
if (!PyType_SUPPORTS_WEAKREFS(Py_TYPE(ob))) {
#define PyType_SUPPORTS_WEAKREFS(t) ((t)->tp_weaklistoffset > 0)
Then compare the context (#25 (comment))
rcx = 0x1e16e580
r8 = 0x0
We have an object whose type is NULL... how does that happen?
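As a cross-check, the exception record from the dump fits this reading. For an access violation (0xc0000005), the first exception_information entry says whether the access was a read or a write, and the second is the address that was touched. The sketch below is only arithmetic on values already quoted in the thread; the helper name is hypothetical.

```python
# Offset of tp_weaklistoffset inside struct _typeobject, from the ptype output.
TP_WEAKLISTOFFSET_OFFSET = 208


def describe_access_violation(exception_information):
    """Decode the two parameters of an EXCEPTION_ACCESS_VIOLATION record."""
    kind_code, address = exception_information
    kinds = {0: "read", 1: "write", 8: "execute (DEP)"}
    return kinds.get(kind_code, "unknown"), address


# exception_information[0] and [1] from the minidump.
kind, fault_address = describe_access_violation((0x0, 0xD0))

# r8 held ob->ob_type at the time of the fault.
ob_type = 0x0
expected_address = ob_type + TP_WEAKLISTOFFSET_OFFSET

print(kind, hex(fault_address), hex(expected_address))
```

A read at address 0xd0 is exactly what loading tp_weaklistoffset through a NULL type pointer would produce.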
from guppy3.
Looking at stack:
rsp = 0x2b95b0
After patching Breakpad like:
diff --git a/src/tools/linux/md2core/minidump-2-core.cc b/src/tools/linux/md2core/minidump-2-core.cc
index aade82c9..7d64bbef 100644
--- a/src/tools/linux/md2core/minidump-2-core.cc
+++ b/src/tools/linux/md2core/minidump-2-core.cc
@@ -630,7 +630,7 @@ ParseSystemInfo(const Options& options, CrashedProcess* crashinfo,
"Linux") &&
sysinfo->platform_id != MD_OS_NACL) {
fprintf(stderr, "This minidump was not generated by Linux or NaCl.\n");
- exit(1);
+ // exit(1);
}
if (options.verbose) {
I'm able to convert it into a core dump:
zhuyifei1999@zhuyifei1999-ThinkPad-T480 ~/guppy3/issue25 $ breakpad/src/src/tools/linux/md2core/minidump-2-core -v python.exe.13268.dmp > python.exe.13268.dmp.core
[...]
zhuyifei1999@zhuyifei1999-ThinkPad-T480 ~/guppy3/issue25 $ readelf -eW python.exe.13268.dmp.core
[...]
LOAD 0x007000 0x00000000002b8000 0x0000000000000000 0x008000 0x008000 RW 0x1000
[...]
Ok we have the stack contents in core dump, just need to load it into gdb with a dummy executable:
zhuyifei1999@zhuyifei1999-ThinkPad-T480 ~/guppy3/issue25 $ cat null.S
.globl _start
_start:
zhuyifei1999@zhuyifei1999-ThinkPad-T480 ~/guppy3/issue25 $ gcc null.S -nostartfiles -o null
zhuyifei1999@zhuyifei1999-ThinkPad-T480 ~/guppy3/issue25 $ gdb ./null python.exe.13268.dmp.core
[...]
gef➤ info reg
rax 0x58 0x58
rbx 0x2b8490 0x2b8490
rcx 0x2 0x2
rdx 0x2b8400 0x2b8400
rsi 0x0 0x0
rdi 0x2 0x2
rbp 0x2 0x2
rsp 0x2b8358 0x2b8358
r8 0x0 0x0
r9 0x40 0x40
r10 0x0 0x0
r11 0x286 0x286
r12 0x0 0x0
r13 0x2b8400 0x2b8400
r14 0x0 0x0
r15 0x0 0x0
rip 0x77089d5a 0x77089d5a
eflags 0x246 [ PF ZF IF ]
cs 0x33 0x33
ss 0x2b 0x2b
ds 0x2b 0x2b
es 0x2b 0x2b
fs 0x53 0x53
gs 0x2b 0x2b
gef➤ x/10x 0x2b8358
0x2b8358: 0x504d444d 0x61b1a793 0x0000000c 0x00000020
0x2b8368: 0x00000000 0x60070311 0x00001826 0x00000000
0x2b8378: 0x00000003 0x00000184
Nice.
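One observation, offered as an inference rather than something stated in the thread: the words printed at 0x2b8358 line up with the layout of a minidump file header, not with stack contents. The sketch below unpacks them using the MINIDUMP_HEADER field order (Signature, Version, NumberOfStreams, StreamDirectoryRva, CheckSum, TimeDateStamp, Flags).

```python
import struct
from datetime import datetime, timezone

# The eight 32-bit words gdb printed at 0x2b8358, in memory order.
words = [0x504D444D, 0x61B1A793, 0x0000000C, 0x00000020,
         0x00000000, 0x60070311, 0x00001826, 0x00000000]
raw = struct.pack("<8I", *words)

# Six 32-bit fields (the first kept as raw bytes) followed by 64-bit Flags.
(signature, version, n_streams, directory_rva,
 checksum, timestamp, flags) = struct.unpack("<4s5IQ", raw)

created = datetime.fromtimestamp(timestamp, tz=timezone.utc)
print(signature, n_streams, hex(flags), created.isoformat())
```

The stream count and flags agree with what file reported for python.exe.13268.dmp (12 streams, type 0x1826), so the memory image loaded at this point may not have been placed where the real stack was.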
from guppy3.
For future reference, gdb's thread listing is not useful (these are all ntdll.dll + 0x69d5a):
gef➤ info threads
Id Target Id Frame
* 1849 LWP 19640 0x0000000077089d5a in ?? ()
1850 LWP 24520 0x0000000077089d5a in ?? ()
1851 LWP 33360 0x0000000077089d5a in ?? ()
1852 LWP 27624 0x0000000077089d5a in ?? ()
1853 LWP 19032 0x0000000077089d5a in ?? ()
1854 LWP 14564 0x0000000077089d5a in ?? ()
1855 LWP 15584 0x0000000077089d5a in ?? ()
1856 LWP 22348 0x0000000077089d5a in ?? ()
18002fe90: 4c 8b dc mov %rsp,%r11
r11 should point to the saved return address.
rsp = 0x2b95b0
r11 = 0x2b95f8
gef➤ x/10xg 0x2b95f8
0x2b95f8: 0x0000000000000000 0x0000000000000000
0x2b9608: 0xf680000000000000 0x00005000000007fe
0x2b9618: 0x5caeb94d0000e00b 0xfeef04bd0000719e
0x2b9628: 0x000a000000010000 0x000a000038390bae
0x2b9638: 0x0000003f38390bae 0x0004000400000000
This makes no sense to me. The return address is NULL?
Looking at the object that's passed in
rcx = 0x1e16e580
it's mapped
LOAD 0x000000 0x000000001e0f0000 0x0000000000000000 0x000000 0x09a000 R 0x1000
but the core dump does not contain the data (I'll see if I can figure out how to get it)
However, the second argument
rdx = 0x83c3270
LOAD 0x007000 0x00000000002b8000 0x0000000000000000 0x008000 0x008000 RW 0x1000
LOAD 0x000000 0x0000000009970000 0x0000000000000000 0x000000 0x1c88000 R 0x1000
This is not mapped at all.
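The three cases above (bytes present, mapped without bytes, not listed) can be told apart mechanically from the readelf columns VirtAddr, FileSiz, and MemSiz. A rough sketch, using only the LOAD entries quoted here (the full program header table is longer) and a made-up helper name:

```python
from collections import namedtuple

Segment = namedtuple("Segment", "vaddr file_size mem_size")

# LOAD entries quoted in the thread: VirtAddr, FileSiz, MemSiz.
SEGMENTS = [
    Segment(0x002B8000, 0x008000, 0x008000),
    Segment(0x09970000, 0x000000, 0x1C88000),
    Segment(0x1E0F0000, 0x000000, 0x09A000),
]


def classify(address, segments=SEGMENTS):
    """Say whether the core file holds bytes for `address`."""
    for seg in segments:
        if seg.vaddr <= address < seg.vaddr + seg.mem_size:
            offset = address - seg.vaddr
            # FileSiz of 0 means the range is described but has no contents.
            return "data" if offset < seg.file_size else "mapped-no-data"
    return "absent"


for name, value in [("r11", 0x2B95F8), ("rcx", 0x1E16E580), ("rdx", 0x83C3270)]:
    print(name, hex(value), classify(value))
```

"absent" here only means absent from the listed entries of this converted core, which is a statement about the conversion and not necessarily about the original process.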
from guppy3.
I stand corrected. I found another tool (https://github.com/skelsec/minidump) to look at dumps and it is in fact actually mapped:
(venv) zhuyifei1999@zhuyifei1999-ThinkPad-T480 ~/guppy3/issue25/py-minidump/minidump $ minidump ../../python.exe.13268.dmp --memory
[...]
0x8390000 | 0x8390000 | 4 | 0x40000 | MEM_COMMIT | PAGE_READWRITE | MEM_PRIVATE
[...]
I guess I should write a tool myself to convert a minidump into a core dump.
from guppy3.
Performed this patch to Breakpad: https://gist.github.com/zhuyifei1999/ff2094d04b91c8ef704e79ab816993aa
gef➤ x/10xg 0x2b95f8
0x2b95f8: 0x000007fec43ccf98 0x0000000000000000
0x2b9608: 0x000000001d2975e0 0x000000001d18a100
0x2b9618: 0x00000000002b96a9 0x000000001e16e580
0x2b9628: 0x000007fec43ccd1b 0x0000000004e08140
0x2b9638: 0x000007fec43da8a0 0x0000000008380b30
gef➤ x/10xg 0x83c3270
0x83c3270: 0x00000000000002fc 0x000007fedd7f41f0
0x83c3280: 0x000007fec43d7850 0x0000000008380b30
0x83c3290: 0x0000000000000000 0x0000000000000000
0x83c32a0: 0x000007fedd4a4940 0x0000000000000000
0x83c32b0: 0x0000000000000002 0x000007fedd7f6cf0
gef➤ x/10xg 0x1e16e580
0x1e16e580: 0x0000000000000001 0x0000000000000000
0x1e16e590: 0x0000000000000000 0x000000001e13cf68
0x1e16e5a0: 0x00000000000001a0 0x0000000000000000
0x1e16e5b0: 0x000000001e0f6800 0x0000000000000000
0x1e16e5c0: 0x0000000000000000 0x0000000000000000
gef➤ x/10xg 0x83c3270
0x83c3270: 0x00000000000002fc 0x000007fedd7f41f0
0x83c3280: 0x000007fec43d7850 0x0000000008380b30
0x83c3290: 0x0000000000000000 0x0000000000000000
0x83c32a0: 0x000007fedd4a4940 0x0000000000000000
0x83c32b0: 0x0000000000000002 0x000007fedd7f6cf0
gef➤ x/10xg 0x000007fedd7f41f0 + 24
0x7fedd7f4208: 0x000007fedd780f50 0x0000000000000038
0x7fedd7f4218: 0x0000000000000000 0x000007fedd47ab90
0x7fedd7f4228: 0x0000000000000030 0x0000000000000000
0x7fedd7f4238: 0x0000000000000000 0x0000000000000000
0x7fedd7f4248: 0x000007fedd58e300 0x0000000000000000
gef➤ p (char *)0x000007fedd780f50
$2 = 0x7fedd780f50 "builtin_function_or_method"
I'm guessing it is working?
from guppy3.
gef➤ x/10xg 0x2b95f8
0x2b95f8: 0x000007fec43ccf98 0x0000000000000000
Return address is 0x000007fec43ccf98. This belongs to heapyc of guppy (offset cf98):
0x7fec43c0000-0x7fec43e1000, ChkSum: 0x00000000, GUID: 61B1A793-000C-0000-2000-000000000000, "C:\Program Files\Python38\Lib\site-packages\guppy\heapy\heapyc.cp38-win_amd64.pyd"
Assuming a wheel install,
zhuyifei1999@zhuyifei1999-ThinkPad-T480 ~/guppy3/issue25/guppy $ objdump -d heapy/heapyc.cp38-win_amd64.pyd | less
[...]
18000cf73: 33 d2 xor %edx,%edx
18000cf75: 48 8b cb mov %rbx,%rcx
18000cf78: 44 8d 42 68 lea 0x68(%rdx),%r8d
18000cf7c: e8 81 4e ff ff callq 0x180001e02
18000cf81: 48 89 1f mov %rbx,(%rdi)
18000cf84: 48 8b ce mov %rsi,%rcx
18000cf87: 48 89 6b 40 mov %rbp,0x40(%rbx)
18000cf8b: 48 89 33 mov %rsi,(%rbx)
18000cf8e: 48 8b 55 30 mov 0x30(%rbp),%rdx
18000cf92: ff 15 e8 25 00 00 callq *0x25e8(%rip) # 0x18000f580
18000cf98: 48 89 43 48 mov %rax,0x48(%rbx)
18000cf9c: 48 85 c0 test %rax,%rax
18000cf9f: 75 0b jne 0x18000cfac
18000cfa1: 48 8b cb mov %rbx,%rcx
18000cfa4: ff 15 06 23 00 00 callq *0x2306(%rip) # 0x18000f2b0
18000cfaa: 33 db xor %ebx,%ebx
18000cfac: 48 8b 6c 24 38 mov 0x38(%rsp),%rbp
18000cfb1: 48 8b c3 mov %rbx,%rax
18000cfb4: 48 8b 5c 24 30 mov 0x30(%rsp),%rbx
18000cfb9: 48 8b 74 24 40 mov 0x40(%rsp),%rsi
18000cfbe: 48 83 c4 20 add $0x20,%rsp
18000cfc2: 5f pop %rdi
18000cfc3: c3 retq
[...]
What could this function be?
from guppy3.
Educated guess: hv.c#L384, hv_new_xt_for_type_at_xtp
It's certain that the "something" it is creating a weak reference to is a type object... let's check its name
gef➤ x/10xg 0x1e16e580
0x1e16e580: 0x0000000000000001 0x0000000000000000
0x1e16e590: 0x0000000000000000 0x000000001e13cf68
0x1e16e5a0: 0x00000000000001a0 0x0000000000000000
0x1e16e5b0: 0x000000001e0f6800 0x0000000000000000
0x1e16e5c0: 0x0000000000000000 0x0000000000000000
gef➤ x/10xg 0x1e16e580 + 24
0x1e16e598: 0x000000001e13cf68 0x00000000000001a0
0x1e16e5a8: 0x0000000000000000 0x000000001e0f6800
0x1e16e5b8: 0x0000000000000000 0x0000000000000000
0x1e16e5c8: 0x0000000000000000 0x0000000000000000
0x1e16e5d8: 0x0000000000000000 0x0000000000000000
gef➤ p (char *)0x000000001e13cf68
$4 = 0x1e13cf68 "PyOleNothing"
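The same fields can be read out of the dump words with struct, which makes the broken header explicit. This sketch assumes the layout from the ptype output earlier in the thread (ob_refcnt at 0, ob_type at 8, tp_name at 24) plus ob_size at offset 16 from PyVarObject.

```python
import struct

# First four 64-bit words gdb printed at 0x1e16e580.
words = [0x0000000000000001, 0x0000000000000000,
         0x0000000000000000, 0x000000001E13CF68]
raw = struct.pack("<4Q", *words)

# PyVarObject header (ob_refcnt, ob_type, ob_size) followed by tp_name.
ob_refcnt, ob_type, ob_size, tp_name = struct.unpack("<qQqQ", raw)

print("ob_refcnt =", ob_refcnt)
print("ob_type   =", hex(ob_type))
print("tp_name   =", hex(tp_name))
```

So the type object named "PyOleNothing" is itself live (reference count 1) but has ob_type set to NULL, which is what PyWeakref_NewRef trips over.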
Googling around I see pympler/pympler#80
Looking at the code of pywin32 I see mhammond/pywin32@daeb5f2
This was released in the latest pywin32 (https://pypi.org/project/pywin32/#history), pywin32==300. I have no idea why pywin32 would be imported by an older version of keras, but we can check that it is imported and mapped into memory:
0x1e0f0000-0x1e18a000, ChkSum: 0x00000000, GUID: D6AFCF3D-D19E-4AC0-920B-1413D161983D, "C:\Program Files\Python38\Lib\site-packages\pywin32_system32\pythoncom38.dll"
0x5b950000-0x5b978000, ChkSum: 0x00000000, GUID: 58BDE457-928B-43D9-B645-19979EA39325, "C:\Program Files\Python38\Lib\site-packages\pywin32_system32\pywintypes38.dll"
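To summarize the address arithmetic used throughout this analysis, here is a small sketch that resolves an address to a module and offset. The ranges are the ones quoted in the thread (python38.dll from its base and image size, the others from the module list); the helper name resolve is hypothetical.

```python
# (start, end, name) ranges taken from the minidump module information.
MODULES = [
    (0x7FEDD450000, 0x7FEDD450000 + 0x42C000, "python38.dll"),
    (0x7FEC43C0000, 0x7FEC43E1000, "heapyc.cp38-win_amd64.pyd"),
    (0x1E0F0000, 0x1E18A000, "pythoncom38.dll"),
    (0x5B950000, 0x5B978000, "pywintypes38.dll"),
]


def resolve(address, modules=MODULES):
    """Return (module name, offset into module) or None if unmatched."""
    for start, end, name in modules:
        if start <= address < end:
            return name, address - start
    return None


for label, address in [
    ("rip (faulting instruction)", 0x7FEDD47FEAF),
    ("return address", 0x7FEC43CCF98),
    ("rcx (ob argument)", 0x1E16E580),
]:
    print(label, resolve(address))
```

Read together: the fault is in python38.dll, the caller is heapyc, and the object passed in lives inside pythoncom38.dll, consistent with a statically allocated type object from pywin32.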
Also successfully reproduced this:
Considering that it is not valid for an object to have a NULL as its type and be passed around to the python interpreter, I don't think this is something we should work around.
@cool-RR could you please confirm that the crash is resolved with an upgrade to pywin32==300?
from guppy3.
"I'm not a Windows person." You are now 😆 I've been using Windows for years and doing some development for it, and I never got as deep as you now did.
That was amazing. Yes, the problem was fixed by upgrading pywin32, both in my test example and in my actual application. Thank you very much.
One question that can be asked now is whether to treat this as something that could be improved in guppy3. You could maybe show a warning when someone tries to use guppy3 with an old pywin32 installation, so people who get this crash wouldn't be as confused as we were. But I don't know how prevalent this problem is, and whether that's worth the code. Your decision.
from guppy3.
"I'm not a Windows person." You are now 😆 I've been using Windows for years and doing some development for it, and I never got as deep as you now did.
All I did was figure out how to convert a minidump into a core dump; the rest is my usual GDB process, just complicated by a lack of symbols 😉
You could maybe show a warning when someone tries to use guppy3 with an old pywin32 installation, so people who get this crash wouldn't be as confused as we were. But I don't know how prevalent this problem is, and whether that's worth the code.
Good idea. Hmm
from guppy3.
Wdyt of something like:
import sys

# Only check when pythoncom has actually been imported by something.
if 'pythoncom' in sys.modules:
    try:
        import pkg_resources
        pywin32_ver = (pkg_resources.get_distribution('pywin32')
                       .parsed_version)
    except Exception:
        pass
    else:
        if pywin32_ver.major < 300:
            import warnings
            warnings.warn(
                'pythoncom in pywin32 < 300 may cause crashes. '
                'See https://github.com/zhuyifei1999/guppy3/issues/25')
Should it be more visible?
from guppy3.
This is probably what I'd do. I might have the warning say "You probably want to upgrade to the newest version of pywin32 by running pip install pywin32 --upgrade".
@mhammond @kxrob Does this look like a good way to test the pywin32 version?
from guppy3.
IIUC, that will only work when installed via pip, but some users install via a bdist_wininst executable. If you care about that case, then you can probably look for site-packages\pywin32.version.txt as a fallback (it's just the build number with a trailing newline), e.g. https://github.com/mhammond/pywin32/blob/f3f55abf528902f3b98c37b0e661d8b52dff7f94/Pythonwin/pywin/framework/app.py#L338-L344
from guppy3.
import os
import sys
import warnings

if 'pythoncom' in sys.modules:
    def get_pywin32_ver():
        # Preferred: ask the packaging metadata (works for pip installs).
        try:
            import pkg_resources
            return pkg_resources.get_distribution('pywin32').version
        except Exception:
            pass

        # Fallback: the version file left by the bdist_wininst installer.
        try:
            import distutils.sysconfig
            site_pkg = distutils.sysconfig.get_python_lib(plat_specific=1)
            with open(os.path.join(site_pkg, 'pywin32.version.txt')) as f:
                return f.read().strip()
        except Exception:
            pass

        return None

    pywin32_ver = get_pywin32_ver()
    if pywin32_ver:
        try:
            pywin32_ver = int(pywin32_ver)
        except ValueError:
            pass
        else:
            if pywin32_ver < 300:
                warnings.warn(
                    'pythoncom in pywin32 < 300 may cause crashes. See '
                    'https://github.com/zhuyifei1999/guppy3/issues/25. '
                    'You may want to upgrade to the newest version of '
                    'pywin32 by running "pip install pywin32 --upgrade"')
Wdyt?
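For readers on newer Python versions, the same check can be sketched with importlib.metadata and sysconfig, since pkg_resources is deprecated and distutils was removed in Python 3.12. This is an illustrative variant, not the code that guppy3 ships; the function names get_pywin32_build and is_affected are invented for the example.

```python
import os
import sysconfig
from importlib import metadata  # available since Python 3.8


def get_pywin32_build():
    """Return the installed pywin32 build number as an int, or None."""
    try:
        version = metadata.version("pywin32")
    except metadata.PackageNotFoundError:
        version = None

    if version is None:
        # Fallback for installer-based setups: a one-line version file
        # in site-packages, as described earlier in the thread.
        path = os.path.join(sysconfig.get_paths()["platlib"],
                            "pywin32.version.txt")
        try:
            with open(path) as f:
                version = f.read().strip()
        except OSError:
            return None

    try:
        return int(version)
    except ValueError:
        return None


def is_affected(build, fixed_in=300):
    """True when a known pywin32 build predates the fix."""
    return build is not None and build < fixed_in


print(is_affected(get_pywin32_build()))
```

Keeping the lookup separate from the comparison makes the threshold easy to test without pywin32 installed.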
from guppy3.
Looks good.
from guppy3.
For my future reference, lldb can natively work with minidumps:
$ lldb -c python.exe.13268.dmp
(lldb) target create --core "python.exe.13268.dmp"
Core file '/home/zhuyifei1999/guppy3/python.exe.13268.dmp' (x86_64) was loaded.
(lldb) x/10xg "0x1e16e580 + 24"
0x1e16e598: 0x000000001e13cf68 0x00000000000001a0
0x1e16e5a8: 0x0000000000000000 0x000000001e0f6800
0x1e16e5b8: 0x0000000000000000 0x0000000000000000
0x1e16e5c8: 0x0000000000000000 0x0000000000000000
0x1e16e5d8: 0x0000000000000000 0x0000000000000000
(lldb) p (char *)0x000000001e13cf68
(char *) $1 = 0x000000001e13cf68 "PyOleNothing"
from guppy3.