
Comments (102)

svenil avatar svenil commented on July 18, 2024 2

Anyway, I forgot to say, Good Work!

from guppy3.

svenil avatar svenil commented on July 18, 2024 1

Looks good!

from guppy3.

svenil avatar svenil commented on July 18, 2024

Would be cool! If it is worth the effort and it works. I am slightly inclined to byprod.
But I don't see how it works. I tried to get the traceback on a simple list allocated at the top level but get strange results. I get no mention of the tracetest.py module in the traceback. Can you explain what I am doing wrong?

sverker@sverker-HP-Pavilion-g6-Notebook-PC:~/git/guppy-pe/bugs$ cat tracetest.py
import tracemalloc
x=[]
for f in tracemalloc.get_object_traceback(x).format(): print(f)
sverker@sverker-HP-Pavilion-g6-Notebook-PC:~/git/guppy-pe/bugs$ python3 -X tracemalloc=10
Python 3.6.8 (default, Jan 14 2019, 11:02:34) 
[GCC 8.0.1 20180414 (experimental) [trunk revision 259383]] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import tracetest
  File "/usr/lib/python3.6/sre_compile.py", line 416
    prefix = []
  File "/usr/lib/python3.6/sre_compile.py", line 498
    prefix, prefix_skip, got_all = _get_literal_prefix(pattern)
  File "/usr/lib/python3.6/sre_compile.py", line 548
    _compile_info(code, p, flags)
  File "/usr/lib/python3.6/sre_compile.py", line 566
    code = _code(p, flags)
  File "/usr/lib/python3.6/re.py", line 301
    p = sre_compile.compile(pattern, flags)
  File "/usr/lib/python3.6/re.py", line 233
    return _compile(pattern, flags)
  File "/usr/lib/python3.6/tokenize.py", line 37
    cookie_re = re.compile(r'^[ \t\f]*#.*?coding[:=][ \t]*([-\w.]+)', re.ASCII)
  File "<frozen importlib._bootstrap>", line 219
  File "<frozen importlib._bootstrap_external>", line 678
  File "<frozen importlib._bootstrap>", line 665
>>> 

from guppy3.

svenil avatar svenil commented on July 18, 2024

File "/usr/lib/python3.6/sre_compile.py", line 416
prefix = []

Seems it doesn't use pointer equality but compares and hashes on the value in some way.
Arguably a bug? How does it work for you?

Even if I do:

x=['abc1234']

I get the same sre_compile.py traceback.
Something seems to be really wrong, or I just don't get it ;-)

from guppy3.

zhuyifei1999 avatar zhuyifei1999 commented on July 18, 2024

Same thing here. If I reverse the order of:

import tracemalloc
x=[]

I get:

Traceback (most recent call last):
  File "tracetest.py", line 3, in <module>
    for f in tracemalloc.get_object_traceback(x).format(): print(f)
AttributeError: 'NoneType' object has no attribute 'format'

That is so weird.

Also, the printed traceback from tracemalloc should be a most-recent-call-last traceback. If I increase the trace size I get:

  File "tracetest.py", line 1
    import tracemalloc
  File "<frozen importlib._bootstrap>", line 983
  File "<frozen importlib._bootstrap>", line 967
  File "<frozen importlib._bootstrap>", line 677
  File "<frozen importlib._bootstrap_external>", line 728
  File "<frozen importlib._bootstrap>", line 219
  File "/usr/lib/python3.7/tracemalloc.py", line 4
    import linecache
  File "<frozen importlib._bootstrap>", line 983
  File "<frozen importlib._bootstrap>", line 967
  File "<frozen importlib._bootstrap>", line 677
  File "<frozen importlib._bootstrap_external>", line 728
  File "<frozen importlib._bootstrap>", line 219
  File "/usr/lib/python3.7/linecache.py", line 11
    import tokenize
  File "<frozen importlib._bootstrap>", line 983
  File "<frozen importlib._bootstrap>", line 967
  File "<frozen importlib._bootstrap>", line 677
  File "<frozen importlib._bootstrap_external>", line 728
  File "<frozen importlib._bootstrap>", line 219
  File "/usr/lib/python3.7/tokenize.py", line 37
    cookie_re = re.compile(r'^[ \t\f]*#.*?coding[:=][ \t]*([-\w.]+)', re.ASCII)
  File "/usr/lib/python3.7/re.py", line 234
    return _compile(pattern, flags)
  File "/usr/lib/python3.7/re.py", line 286
    p = sre_compile.compile(pattern, flags)
  File "/usr/lib/python3.7/sre_compile.py", line 764
    p = sre_parse.parse(p, flags)
  File "/usr/lib/python3.7/sre_parse.py", line 930
    p = _parse_sub(source, pattern, flags & SRE_FLAG_VERBOSE, 0)
  File "/usr/lib/python3.7/sre_parse.py", line 426
    not nested and not items))
  File "/usr/lib/python3.7/sre_parse.py", line 646
    item = subpattern[-1:]
  File "/usr/lib/python3.7/sre_parse.py", line 166
    return SubPattern(self.pattern, self.data[index])

from guppy3.

svenil avatar svenil commented on July 18, 2024

You reversed the printout, right?
I would think a depth of 1 would be enough to pinpoint the actual allocation site. Trying that, I get the sre_compile.py site even with a non-empty list allocation.

sverker@sverker-HP-Pavilion-g6-Notebook-PC:~/git/guppy-pe/bugs$ cat tracetest.py
import tracemalloc
x=['abc1234']
for f in tracemalloc.get_object_traceback(x).format(): print(f)
sverker@sverker-HP-Pavilion-g6-Notebook-PC:~/git/guppy-pe/bugs$ python3 -X tracemalloc=1
Python 3.6.8 (default, Jan 14 2019, 11:02:34) 
[GCC 8.0.1 20180414 (experimental) [trunk revision 259383]] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import tracetest
  File "/usr/lib/python3.6/sre_compile.py", line 416
    prefix = []
>>> 

from guppy3.

zhuyifei1999 avatar zhuyifei1999 commented on July 18, 2024

You reversed the printout, right?

Ah, looks like Python 3.7 reversed it.
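
For reference, a minimal sketch of picking the allocation site in a way that does not depend on that ordering. It relies on the documented change in Python 3.7, where `tracemalloc.Traceback` frames became sorted oldest-first. The helper name `allocation_site` and the `Marker` class are made up for this illustration.

```python
import sys
import tracemalloc


def allocation_site(obj):
    """Return (filename, lineno) of the innermost traced frame, or None."""
    tb = tracemalloc.get_object_traceback(obj)
    if tb is None:
        # The object was allocated while tracing was off.
        return None
    # Since Python 3.7 the frames run oldest-first, so the allocation
    # site is the last frame; before that it was the first one.
    frame = tb[-1] if sys.version_info >= (3, 7) else tb[0]
    return frame.filename, frame.lineno


class Marker:
    """Plain instance type, used only as something to allocate."""


tracemalloc.start(10)
marker = Marker()
site = allocation_site(marker)   # must run before stop(), which drops traces
tracemalloc.stop()
print(site)
```

With a frame limit of 1 the two orderings coincide, which is why the depth-1 experiment above is unaffected by the reversal.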

from guppy3.

svenil avatar svenil commented on July 18, 2024

Looks like this may be related to the issue linked below. Presumably the list is not freshly allocated; instead a list from a free list is reused, and the traced malloc site is where that list was first allocated.

https://bugs.python.org/issue35053
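
A small sketch that makes the reuse visible without tracemalloc, assuming CPython, where `id()` is the object's address. Whether the address is actually reused is an implementation detail and may differ between versions and builds, so the result is only reported, not relied upon.

```python
def list_storage_reused():
    """Report whether a new list landed at a just-freed list's address."""
    first = []
    first_address = id(first)   # CPython: id() is the memory address
    del first                   # CPython may keep the freed list for reuse
    second = []
    return id(second) == first_address


reused = list_storage_reused()
print("list object reused:", reused)
```

If this prints True, the second list did not go through a fresh allocation, which is the situation in which tracemalloc on Python < 3.8 keeps reporting the original site.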

from guppy3.

zhuyifei1999 avatar zhuyifei1999 commented on July 18, 2024

Ah I see. I'll see if I can build Python 3.8 and test that.

from guppy3.

zhuyifei1999 avatar zhuyifei1999 commented on July 18, 2024

Yep, that definitely is the fix:

zhuyifei1999@zhuyifei1999-ThinkPad-T480 ~/cpython $ ./python --version
Python 3.8.0rc1
zhuyifei1999@zhuyifei1999-ThinkPad-T480 ~/cpython $ cat ~/guppy3/tracetest.py
import tracemalloc
x=[]
for f in tracemalloc.get_object_traceback(x).format(): print(f)
zhuyifei1999@zhuyifei1999-ThinkPad-T480 ~/cpython $ ./python -X tracemalloc=40 ~/guppy3/tracetest.py 
  File "/home/zhuyifei1999/guppy3/tracetest.py", line 2
    x=[]
zhuyifei1999@zhuyifei1999-ThinkPad-T480 ~/cpython $ cat > ~/guppy3/tracetest.py << 'EOF'
> x=[]
> import tracemalloc
> for f in tracemalloc.get_object_traceback(x).format(): print(f)
> EOF
zhuyifei1999@zhuyifei1999-ThinkPad-T480 ~/cpython $ cat ~/guppy3/tracetest.py
x=[]
import tracemalloc
for f in tracemalloc.get_object_traceback(x).format(): print(f)
zhuyifei1999@zhuyifei1999-ThinkPad-T480 ~/cpython $ ./python -X tracemalloc=40 ~/guppy3/tracetest.py 
  File "/home/zhuyifei1999/guppy3/tracetest.py", line 1
    x=[]

Thanks for the pointer.

I guess I'd better add a warning that this 'classifier' will be inaccurate for certain objects for Python < 3.8 :(
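
A minimal sketch of what such a warning could look like. The function names and the message text are hypothetical, not what guppy3 ships.

```python
import sys
import warnings


def producer_sites_reliable(version_info=None):
    """True when tracemalloc tracks reused objects (Python 3.8 and later)."""
    if version_info is None:
        version_info = sys.version_info
    return tuple(version_info[:2]) >= (3, 8)


def warn_if_unreliable(version_info=None):
    """Emit a warning on older interpreters; return True if one was emitted."""
    if producer_sites_reliable(version_info):
        return False
    warnings.warn(
        "producer sites may be inaccurate for objects reused from "
        "free lists on Python < 3.8",
        RuntimeWarning,
        stacklevel=2,
    )
    return True
```

The version is passed in as a parameter only so that the check can be exercised without switching interpreters.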

from guppy3.

zhuyifei1999 avatar zhuyifei1999 commented on July 18, 2024

The also-very-annoying thing is that you need to get the object's PyGC_Head if it's a GC type, and sizeof(PyGC_Head) is not to be trusted even across Python minor releases, as seen in issue #1.

I'm guessing the most reliable way to get the size at runtime is via _testcapi.SIZEOF_PYGC_HEAD, which predates the Git history of CPython, and hopefully it is installed everywhere...
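
A hedged sketch of that lookup with a fallback for installs where the test modules are missing. Trying `_testinternalcapi` as a second location is an assumption about newer CPython builds, and the helper name is made up.

```python
def sizeof_pygc_head():
    """Return sizeof(PyGC_Head) from CPython's test helpers, or None."""
    # "_testinternalcapi" is an assumed alternative location; only
    # "_testcapi" is mentioned in the comment above.
    for module_name in ("_testcapi", "_testinternalcapi"):
        try:
            module = __import__(module_name)
        except ImportError:
            continue    # distributions often omit the test modules
        size = getattr(module, "SIZEOF_PYGC_HEAD", None)
        if size is not None:
            return size
    return None


gc_head_size = sizeof_pygc_head()
print("sizeof(PyGC_Head):", gc_head_size)
```

A caller would still need a plan for the `None` case, since the test modules are not guaranteed to be installed.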

from guppy3.

zhuyifei1999 avatar zhuyifei1999 commented on July 18, 2024
(venv.py38) zhuyifei1999@zhuyifei1999-ThinkPad-T480 ~/guppy3 $ python -X tracemalloc=10 -c 'x = []; import guppy.heapy.heapyc; print(guppy.heapy.heapyc.HeapView(guppy.heapy.heapyc.RootState, ()).cli_prod({}).classify(x))'
('<string>', 1)

from guppy3.

zhuyifei1999 avatar zhuyifei1999 commented on July 18, 2024

I think the extra overhead in 4a07064 is gonna make it really slow for Python 3.5.

from guppy3.

zhuyifei1999 avatar zhuyifei1999 commented on July 18, 2024

Well, it works!

(venv.py38) zhuyifei1999@zhuyifei1999-ThinkPad-T480 ~/guppy3 $ python -X tracemalloc=10 -ic 'hp = __import__("guppy").hpy()'
>>> hp.heap()
Partition of a set of 35171 objects. Total size = 4070238 bytes.
 Index  Count   %     Size   % Cumulative  % Kind (class / dict of class)
     0  10088  29   899768  22    899768  22 str
     1   6847  19   477048  12   1376816  34 tuple
     2   2419   7   427567  11   1804383  44 types.CodeType
     3    450   1   351664   9   2156047  53 type
     4   4839  14   343483   8   2499530  61 bytes
     5   2225   6   302600   7   2802130  69 function
     6    450   1   244968   6   3047098  75 dict of type
     7     95   0   172504   4   3219602  79 dict of module
     8    508   1   149800   4   3369402  83 dict (no owner)
     9   1101   3    79272   2   3448674  85 types.WrapperDescriptorType
<156 more rows. Type e.g. '_.more' to view.>
>>> _.byprod
Partition of a set of 35171 objects. Total size = 4070750 bytes.
 Index  Count   %     Size   % Cumulative  % Producer (line of allocation)
     0  11196  32  1185274  29   1185274  29 <frozen importlib._bootstrap>:671
     1   8740  25  1000458  25   2185732  54 None
     2   8604  24   953865  23   3139597  77 /home/zhuyifei1999/guppy3/guppy/etc/Glue.py:50
     3   2593   7   339106   8   3478703  85 <frozen importlib._bootstrap_external>:783
     4   1190   3   166996   4   3645699  90 /home/zhuyifei1999/guppy3/guppy/etc/Glue.py:209
     5     88   0    90739   2   3736438  92 <frozen importlib._bootstrap>:975
     6    744   2    87957   2   3824395  94 <frozen importlib._bootstrap>:991
     7    290   1    38386   1   3862781  95 <frozen importlib._bootstrap>:219
     8    157   0    23059   1   3885840  95 /home/zhuyifei1999/cpython/Lib/site.py:580
     9     98   0    13928   0   3899768  96 /home/zhuyifei1999/cpython/Lib/site.py:410
<166 more rows. Type e.g. '_.more' to view.>
>>> h = _
>>> h[0].kind
hp.Prod(('<frozen importlib._bootstrap>', 671))
>>> h & h[0].kind
Partition of a set of 11196 objects. Total size = 1185522 bytes.
 Index  Count   %     Size   % Cumulative  % Kind (class / dict of class)
     0   3694  33   356046  30    356046  30 str
     1   2692  24   194968  16    551014  46 tuple
     2    993   9   176015  15    727029  61 types.CodeType
     3   1928  17   158179  13    885208  75 bytes
     4     59   1    62432   5    947640  80 type
     5    102   1    51440   4    999080  84 dict of type
     6    327   3    44472   4   1043552  88 function
     7     54   0    23304   2   1066856  90 dict (no owner)
     8    264   2    19008   2   1085864  92 types.WrapperDescriptorType
     9     14   0    16000   1   1101864  93 dict of module
<27 more rows. Type e.g. '_.more' to view.>
>>> _.byprod
Partition of a set of 11196 objects. Total size = 1185522 bytes.
 Index  Count   %     Size   % Cumulative  % Producer (line of allocation)
     0  11196 100  1185522 100   1185522 100 <frozen importlib._bootstrap>:671
>>> h & hp.Prod(('<frozen importlib._bootstrap>', 671))
Partition of a set of 11196 objects. Total size = 1185522 bytes.
 Index  Count   %     Size   % Cumulative  % Kind (class / dict of class)
     0   3694  33   356046  30    356046  30 str
     1   2692  24   194968  16    551014  46 tuple
     2    993   9   176015  15    727029  61 types.CodeType
     3   1928  17   158179  13    885208  75 bytes
     4     59   1    62432   5    947640  80 type
     5    102   1    51440   4    999080  84 dict of type
     6    327   3    44472   4   1043552  88 function
     7     54   0    23304   2   1066856  90 dict (no owner)
     8    264   2    19008   2   1085864  92 types.WrapperDescriptorType
     9     14   0    16000   1   1101864  93 dict of module
<27 more rows. Type e.g. '_.more' to view.>
>>> h & hp.Prod(None)
Partition of a set of 8740 objects. Total size = 1000458 bytes.
 Index  Count   %     Size   % Cumulative  % Kind (class / dict of class)
     0   2568  29   219742  22    219742  22 str
     1    243   3   130104  13    349846  35 type
     2    182   2    99504  10    449350  45 dict of type
     3   1238  14    85752   9    535102  53 tuple
     4    780   9    56160   6    591262  59 types.WrapperDescriptorType
     5    779   9    55232   6    646494  65 bytes
     6    312   4    54952   5    701446  70 types.CodeType
     7     62   1    52168   5    753614  75 dict (no owner)
     8     20   0    43664   4    797278  80 dict of module
     9    309   4    42024   4    839302  84 function
<48 more rows. Type e.g. '_.more' to view.>
>>> 

Now I gotta figure out if the other methods should be defined, and write the warnings, and the docs and tests.

By the way, it is seeing objects produced in Glue.py. I don't think that should happen, right?

from guppy3.

zhuyifei1999 avatar zhuyifei1999 commented on July 18, 2024

Another not-yet-implemented feature that could probably be useful is to classify by the filename only.
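
Roughly what that grouping would compute, sketched with plain tracemalloc rather than a heapy classifier. `count_by_filename` and `Sample` are hypothetical names for this illustration.

```python
import collections
import sys
import tracemalloc


def count_by_filename(objects):
    """Count objects per allocating file; untraced objects go under None."""
    counts = collections.Counter()
    for obj in objects:
        tb = tracemalloc.get_object_traceback(obj)
        if tb is None:
            counts[None] += 1
            continue
        # Innermost frame: last since Python 3.7, first before that.
        frame = tb[-1] if sys.version_info >= (3, 7) else tb[0]
        counts[frame.filename] += 1
    return counts


class Sample:
    """Plain instance type, used only as something to allocate."""


tracemalloc.start(1)
samples = [Sample() for _ in range(5)]
by_file = count_by_filename(samples)
tracemalloc.stop()
print(by_file)
```

Dropping the line number collapses the many per-line producer rows into one row per module, at the cost of no longer pointing at the exact statement.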

from guppy3.

zhuyifei1999 avatar zhuyifei1999 commented on July 18, 2024

Oh, I messed up. I was testing that Python 3.5 fix in Python 3.8 by commenting out the fast path, and it produces the trace in the wrong order.

Recompiled guppy and now it looks saner:

(venv.py38) zhuyifei1999@zhuyifei1999-ThinkPad-T480 ~/guppy3 $ python -X tracemalloc=10 -ic 'hp = __import__("guppy").hpy()'
>>> hp.heap()
Partition of a set of 35164 objects. Total size = 4069658 bytes.
 Index  Count   %     Size   % Cumulative  % Kind (class / dict of class)
     0  10086  29   899642  22    899642  22 str
     1   6846  19   476992  12   1376634  34 tuple
     2   2418   7   427391  11   1804025  44 types.CodeType
     3    450   1   351664   9   2155689  53 type
     4   4837  14   343397   8   2499086  61 bytes
     5   2224   6   302464   7   2801550  69 function
     6    450   1   244968   6   3046518  75 dict of type
     7     95   0   172504   4   3219022  79 dict of module
     8    508   1   149800   4   3368822  83 dict (no owner)
     9   1101   3    79272   2   3448094  85 types.WrapperDescriptorType
<156 more rows. Type e.g. '_.more' to view.>
>>> _.byprod
Partition of a set of 35164 objects. Total size = 4070099 bytes.
 Index  Count   %     Size   % Cumulative  % Producer (line of allocation)
     0  17038  48  1539746  38   1539746  38 <frozen importlib._bootstrap_external>:580
     1   8739  25  1000271  25   2540017  62 None
     2   1338   4   172631   4   2712648  67 <frozen importlib._bootstrap>:219
     3    116   0   114840   3   2827488  69 <frozen importlib._bootstrap>:36
     4    266   1    76640   2   2904128  71 /home/zhuyifei1999/cpython/Lib/abc.py:85
     5     77   0    19800   0   2923928  72
                                             /home/zhuyifei1999/cpython/Lib/collections/__init__.py:
                                             456
     6     10   0    17008   0   2940936  72 <frozen importlib._bootstrap_external>:1491
     7    225   1    15849   0   2956785  73 <frozen importlib._bootstrap_external>:1483
     8     26   0    15824   0   2972609  73 /home/zhuyifei1999/guppy3/guppy/etc/Glue.py:109
     9     26   0    14808   0   2987417  73 /home/zhuyifei1999/guppy3/guppy/etc/Glue.py:185
<2479 more rows. Type e.g. '_.more' to view.>
>>> h = _
>>> h[0].kind
hp.Prod(('<frozen importlib._bootstrap_external>', 580))
>>> h & h[0].kind
Partition of a set of 17038 objects. Total size = 1540230 bytes.
 Index  Count   %     Size   % Cumulative  % Kind (class / dict of class)
     0   6310  37   549312  36    549312  36 str
     1   2093  12   370692  24    920004  60 types.CodeType
     2   4525  27   330008  21   1250012  81 tuple
     3   4045  24   287502  19   1537514 100 bytes
     4     62   0     1748   0   1539262 100 int
     5      2   0      944   0   1540206 100 frozenset
     6      1   0       24   0   1540230 100 float
>>> _.byprod
Partition of a set of 17038 objects. Total size = 1540230 bytes.
 Index  Count   %     Size   % Cumulative  % Producer (line of allocation)
     0  17038 100  1540230 100   1540230 100 <frozen importlib._bootstrap_external>:580
>>> h & hp.Prod(('<frozen importlib._bootstrap_external>', 580))
Partition of a set of 17038 objects. Total size = 1540230 bytes.
 Index  Count   %     Size   % Cumulative  % Kind (class / dict of class)
     0   6310  37   549312  36    549312  36 str
     1   2093  12   370692  24    920004  60 types.CodeType
     2   4525  27   330008  21   1250012  81 tuple
     3   4045  24   287502  19   1537514 100 bytes
     4     62   0     1748   0   1539262 100 int
     5      2   0      944   0   1540206 100 frozenset
     6      1   0       24   0   1540230 100 float
>>> h & hp.Prod(None)
Partition of a set of 8739 objects. Total size = 1000271 bytes.
 Index  Count   %     Size   % Cumulative  % Kind (class / dict of class)
     0   2567  29   219683  22    219683  22 str
     1    243   3   130104  13    349787  35 type
     2    182   2    99504  10    449291  45 dict of type
     3   1238  14    85752   9    535043  53 tuple
     4    780   9    56160   6    591203  59 types.WrapperDescriptorType
     5    779   9    55232   6    646435  65 bytes
     6    312   4    54952   5    701387  70 types.CodeType
     7     62   1    52040   5    753427  75 dict (no owner)
     8     20   0    43664   4    797091  80 dict of module
     9    309   4    42024   4    839115  84 function
<48 more rows. Type e.g. '_.more' to view.>
>>> 

from guppy3.

svenil avatar svenil commented on July 18, 2024

Cool! Interesting examples!

from guppy3.

svenil avatar svenil commented on July 18, 2024

By the way, it is seeing objects produced in Glue.py. I don't think that should happen, right?

That shouldn't happen but I have seen something like that before. I can see it after a number of _.more on the heap().byprod partition. I also found a number of objects that had been allocated at Classifiers.py:35

I got a great many rows when I first used .byprod. How could there be 6339 producer sites? But it may be because of the priming with apport in View.py... Yes, removing that I get just 2651 producer sites on the heap.

The objects that come from internal Heapy modules like Classifiers.py, Glue.py and also View.py deserve some more investigation; they shouldn't be included in heap().

from guppy3.

svenil avatar svenil commented on July 18, 2024

It just occurred to me: at least some, or perhaps all, of the suspicious producer sites may be caused by the reuse tricks we talked about earlier, where Python < 3.8 doesn't really allocate things again. It should be fixed in 3.8, right? I am still using 3.6.8, so that may be the reason I see those producers.

from guppy3.

zhuyifei1999 avatar zhuyifei1999 commented on July 18, 2024

That shouldn't happen but I have seen something like that before. I can see it after a number of _.more on the heap().byprod partition. I also found a number of objects that had been allocated at Classifiers.py:35

I checked their traces:

$ python -X tracemalloc=10 -ic 'hp = __import__("guppy").hpy()'
>>> h = hp.heap().byprod
>>> h
Partition of a set of 35167 objects. Total size = 4070756 bytes.
 Index  Count   %     Size   % Cumulative  % Producer (line of allocation)
     0  17038  48  1540193  38   1540193  38 <frozen importlib._bootstrap_external>:580
     1   8741  25  1000408  25   2540601  62 None
     2   1338   4   172631   4   2713232  67 <frozen importlib._bootstrap>:219
     3    116   0   114840   3   2828072  69 <frozen importlib._bootstrap>:36
     4    266   1    76640   2   2904712  71 /home/zhuyifei1999/cpython/Lib/abc.py:85
     5     77   0    19800   0   2924512  72
                                             /home/zhuyifei1999/cpython/Lib/collections/__init__.py:
                                             456
     6     10   0    17008   0   2941520  72 <frozen importlib._bootstrap_external>:1491
     7    225   1    15849   0   2957369  73 <frozen importlib._bootstrap_external>:1483
     8     26   0    15824   0   2973193  73 /home/zhuyifei1999/guppy3/guppy/etc/Glue.py:109
     9     26   0    14808   0   2988001  73 /home/zhuyifei1999/guppy3/guppy/etc/Glue.py:185
<2479 more rows. Type e.g. '_.more' to view.>
>>> h[8].byclodo
Partition of a set of 26 objects. Total size = 15824 bytes.
 Index  Count   %     Size   % Cumulative  % Kind (class / dict of class)
     0     26 100    15824 100     15824 100 dict of guppy.etc.Glue.Share
>>> for i in __import__('tracemalloc').get_object_traceback(_.byid[0].theone).format(): print(i)
... 
  File "/home/zhuyifei1999/guppy3/guppy/heapy/UniSet.py", line 852
    return self.c_str(a)
  File "/home/zhuyifei1999/guppy3/guppy/heapy/UniSet.py", line 1605
    ob = self.mod._parent.OutputHandling.output_buffer()
  File "/home/zhuyifei1999/guppy3/guppy/heapy/OutputHandling.py", line 295
    return OutputBuffer(self)
  File "/home/zhuyifei1999/guppy3/guppy/heapy/OutputHandling.py", line 53
    self.strio = mod._root.io.StringIO()
  File "/home/zhuyifei1999/guppy3/guppy/etc/Glue.py", line 50
    return self._share.getattr(self, name)
  File "/home/zhuyifei1999/guppy3/guppy/etc/Glue.py", line 209
    d = self.getattr2(inter, dct, owner, name)
  File "/home/zhuyifei1999/guppy3/guppy/etc/Glue.py", line 225
    x = self.getattr_package(inter, name)
  File "/home/zhuyifei1999/guppy3/guppy/etc/Glue.py", line 271
    x = self.makeModule(x, name)
  File "/home/zhuyifei1999/guppy3/guppy/etc/Glue.py", line 331
    return Share(module, self, module.__name__, Clamp)
  File "/home/zhuyifei1999/guppy3/guppy/etc/Glue.py", line 109
    self.module = module
>>> h[9].byclodo
Partition of a set of 26 objects. Total size = 14808 bytes.
 Index  Count   %     Size   % Cumulative  % Kind (class / dict of class)
     0     26 100    14808 100     14808 100 dict (no owner)
>>> for i in __import__('tracemalloc').get_object_traceback(_.byid[0].theone).format(): print(i)
... 
  File "/home/zhuyifei1999/guppy3/guppy/etc/Glue.py", line 227
    x = self.getattr3(inter, name)
  File "/home/zhuyifei1999/guppy3/guppy/etc/Glue.py", line 321
    x = f()
  File "/home/zhuyifei1999/guppy3/guppy/heapy/View.py", line 171
    hv = self.new_hv(_hiding_tag_=self._hiding_tag_,
  File "/home/zhuyifei1999/guppy3/guppy/heapy/View.py", line 409
    hv.register__hiding_tag__type(self._parent.UniSet.Kind)
  File "/home/zhuyifei1999/guppy3/guppy/etc/Glue.py", line 50
    return self._share.getattr(self, name)
  File "/home/zhuyifei1999/guppy3/guppy/etc/Glue.py", line 209
    d = self.getattr2(inter, dct, owner, name)
  File "/home/zhuyifei1999/guppy3/guppy/etc/Glue.py", line 225
    x = self.getattr_package(inter, name)
  File "/home/zhuyifei1999/guppy3/guppy/etc/Glue.py", line 271
    x = self.makeModule(x, name)
  File "/home/zhuyifei1999/guppy3/guppy/etc/Glue.py", line 331
    return Share(module, self, module.__name__, Clamp)
  File "/home/zhuyifei1999/guppy3/guppy/etc/Glue.py", line 185
    self.data = {}

It just occurred to me: at least some, or perhaps all, of the suspicious producer sites may be caused by the reuse tricks we talked about earlier, where Python < 3.8 doesn't really allocate things again. It should be fixed in 3.8, right? I am still using 3.6.8, so that may be the reason I see those producers.

Yeah, that is likely. I'm doing my testing under self-compiled 3.8.0 from Git. Under 3.7.5:

(venv) zhuyifei1999@zhuyifei1999-ThinkPad-T480 ~/guppy3 $ python -X tracemalloc=10 -ic 'hp = __import__("guppy").hpy()'
>>> h = hp.heap().byprod
>>> h
Partition of a set of 37443 objects. Total size = 4401800 bytes.
 Index  Count   %     Size   % Cumulative  % Producer (line of allocation)
     0  13674  37  1242345  28   1242345  28 <frozen importlib._bootstrap_external>:525
     1   8460  23  1005737  23   2248082  51 None
     2   7150  19   753563  17   3001645  68 <frozen importlib._bootstrap>:219
     3    256   1    77392   2   3079037  70 /usr/lib/python-
                                             exec/python3.7/../../../lib/python3.7/abc.py:126
     4    107   0    64552   1   3143589  71 <frozen importlib._bootstrap_external>:606
     5    505   1    32952   1   3176541  72 <frozen importlib._bootstrap_external>:1408
     6     11   0    27128   1   3203669  73 <frozen importlib._bootstrap_external>:1416
     7     37   0    22784   1   3226453  73 <frozen importlib._bootstrap_external>:916
     8    165   0    19024   0   3245477  74 /usr/lib/python-
                                             exec/python3.7/../../../lib/python3.7/abc.py:127
     9     79   0    16066   0   3261543  74 /usr/lib/python3.7/collections/__init__.py:397
<2503 more rows. Type e.g. '_.more' to view.>

I still don't get as many as 6339 sites though. Might be related to that apport thing on your side.

from guppy3.

svenil avatar svenil commented on July 18, 2024

I still don't get as many as 6339 sites though. Might be related to that apport thing on your side.

Yes, I realized.
We don't get the same sites from Glue with Python < 3.8. Maybe it is masked by not being presented at the real producer sites. I'll see if I can try with 3.8 later.

from guppy3.

svenil avatar svenil commented on July 18, 2024

I have managed to install Python 3.9, but I can also reproduce an error with Python 2 and guppy-pe.
It seems to have to do with two things:

  1. When calculating heap(), we traverse all objects but only afterwards clean up the result for the hidden objects with hv_cleanup_mutset(). So the objects in Glue.py that have no hiding tag are not cleaned up.
  2. Even after checking when traversing, via hv_is_obj_hidden, I was only checking the contents of the dict for hiding_tag if the object was an Py_Instance_Check(obj). Objects that inherited from object were not checked.

I see that in guppy3 the Py_Instance_Check was removed from hv_is_obj_hidden. I managed to fix it in guppy-pe and Python 2 by using _PyObject_GetDictPtr. This was done on the recursive variant; I still have to introduce it in the simulated recursive variant. And I didn't manage to fix it in guppy3 with the simulated recursion. I'll see tomorrow...
Before the fix I could reproduce it in guppy-pe (and also in guppy3) as follows. (Maybe even better example would be to test with Glue.Interface which inherits from object.)

>>> from guppy import hpy
>>> hp=hpy()
>>> from guppy.etc import Glue
>>> h=hp.heap()
>>> g=h&hp.Clodo(dictof=Glue.Share)
>>> g
Partition of a set of 23 objects. Total size = 11960 bytes.
 Index  Count   %     Size   % Cumulative  % Kind (class / dict of class)
     0     23 100    11960 100     11960 100 dict of guppy.etc.Glue.Share

After the fix in hv.c the result was hp.Nothing
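
As a rough Python model of the check described in point 2 (the real logic is C code in hv.c): an object counts as hidden when its instance dict carries the tag under `_hiding_tag_`, whatever its class inherits from. `HIDING_TAG`, `is_hidden` and `visible_objects` are illustrative names, and dropping the owner's dict together with the owner is an assumption of this sketch.

```python
HIDING_TAG = object()   # hypothetical stand-in for the heap view's tag


def is_hidden(obj, tag=HIDING_TAG):
    """True if obj's instance dict holds the tag under '_hiding_tag_'."""
    instance_dict = getattr(obj, "__dict__", None)
    return isinstance(instance_dict, dict) and instance_dict.get("_hiding_tag_") is tag


def visible_objects(objects, tag=HIDING_TAG):
    """Drop hidden objects and (by assumption) their instance dicts."""
    hidden_ids = set()
    for obj in objects:
        if is_hidden(obj, tag):
            hidden_ids.add(id(obj))
            hidden_ids.add(id(obj.__dict__))
    return [obj for obj in objects if id(obj) not in hidden_ids]


class Tagged:
    """Internal-style object that asks to be hidden."""

    def __init__(self):
        self._hiding_tag_ = HIDING_TAG


class Plain:
    """Ordinary user object."""


tagged, plain = Tagged(), Plain()
remaining = visible_objects([tagged, tagged.__dict__, plain, plain.__dict__])
print(len(remaining))
```

In this model the dict of the tagged instance disappears along with its owner, which corresponds to the `dict of guppy.etc.Glue.Share` row no longer showing up.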

from guppy3.

zhuyifei1999 avatar zhuyifei1999 commented on July 18, 2024

I see that in guppy3 the Py_Instance_Check was removed from hv_is_obj_hidden.

In Python 3 everything is an instance of a type, there are no old style classes anymore. It is supposed to hide the instance's dict if it contains the hiding tag, but in my altered code it no longer hides the owner of the dict. I guess I should change that.

from guppy3.

zhuyifei1999 avatar zhuyifei1999 commented on July 18, 2024

It is supposed to hide the instance's dict if it contains the hiding tag

Hmm. That might not be the case.

from guppy3.

svenil avatar svenil commented on July 18, 2024

In guppy3, we were missing _hiding_tag_ in the dict of Interface because of this commit:
3c6c392

When enabling caching again, I got rid of some of the spurious objects in heap() after some more changes in hv.c

Testing your test in the mentioned commit in guppy-pe, I get just RootState even though I have caching enabled in Glue.py. But in guppy3 I get the strange root like in the example now that caching is enabled. I see that this is because of the change to root in View.py for priming.

And not all spurious objects are gone when paging down h.byprod with a number of _.more
So something is still going wrong.

from guppy3.

svenil avatar svenil commented on July 18, 2024

The caching in Glue.py in Share.getattr only occurred if the name was not in the _chgable_ tuple of the glueclamp. If the value is not changed, it shouldn't be a problem that the cache occurs in multiple Interface dicts. So when I add 'root' to the _chgable_ tuple in View.py, the root will not be cached and h.fam.Path.View.root is RootState correctly.

How about enabling caching in Glue.py again (perhaps only in Share.getattr) and adding 'root' to _chgable_ in View.py?
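
A simplified model of that rule, to make the proposal concrete. This is not the actual Glue.py code: `CachingNamespace`, its `get` method and the `calls` counter are hypothetical; only the `_chgable_` name comes from the discussion above.

```python
import collections


class CachingNamespace:
    """Caches computed attributes, except names listed in _chgable_."""

    _chgable_ = ("root",)   # changeable names: always recomputed

    def __init__(self, factories):
        self._factories = factories           # name -> zero-argument callable
        self._cache = {}
        self.calls = collections.Counter()    # factory runs per name

    def get(self, name):
        if name in self._cache:
            return self._cache[name]
        value = self._factories[name]()
        self.calls[name] += 1
        if name not in self._chgable_:
            self._cache[name] = value         # stable names are cached
        return value


ns = CachingNamespace({"root": object, "io": object})
for _ in range(2):
    ns.get("root")
    ns.get("io")
print(dict(ns.calls))
```

Because 'root' is never stored, a later change to root is always picked up, while everything else still benefits from the cache.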

from guppy3.

zhuyifei1999 avatar zhuyifei1999 commented on July 18, 2024

I see that this is because of the change to root in View.py for priming.

Yeah, that should be exactly the cause.

And not all spurious objects are gone when paging down h.byprod with a number of _.more
So something is still going wrong.

I don't think I understand this. Which test is this?

from guppy3.

zhuyifei1999 avatar zhuyifei1999 commented on July 18, 2024

The caching in Glue.py only occurs if the name was not in the _chgable_ tuple of the glueclamp. If the value is not changed, it shouldn't be a problem that the cache occurs in multiple Interface dicts. So when I add 'root' to the _chgable_ tuple in View.py, the root will not be cached and h.fam.Path.View.root is rootstate correctly.

How about enabling caching in Glue.py again and adding 'root' to _chgable_ in View.py?

Ah, ok. I didn't realize this part of the logic. Thanks

from guppy3.

zhuyifei1999 avatar zhuyifei1999 commented on July 18, 2024

Nice!

(venv.py38) zhuyifei1999@zhuyifei1999-ThinkPad-T480 ~/guppy3 $ python -X tracemalloc=10 -ic 'hp = __import__("guppy").hpy()'
>>> hp.heap().byprod
Partition of a set of 34080 objects. Total size = 3943039 bytes.
 Index  Count   %     Size   % Cumulative  % Producer (line of allocation)
     0  17032  50  1539478  39   1539478  39 <frozen importlib._bootstrap_external>:580
     1   8740  26  1000362  25   2539840  64 None
     2   1338   4   172631   4   2712471  69 <frozen importlib._bootstrap>:219
     3    116   0   114840   3   2827311  72 <frozen importlib._bootstrap>:36
     4    266   1    76640   2   2903951  74 /home/zhuyifei1999/cpython/Lib/abc.py:85
     5     77   0    19800   1   2923751  74
                                             /home/zhuyifei1999/cpython/Lib/collections/__init__.py:
                                             456
     6     10   0    17008   0   2940759  75 <frozen importlib._bootstrap_external>:1491
     7    225   1    15849   0   2956608  75 <frozen importlib._bootstrap_external>:1483
     8    130   0    13748   0   2970356  75 <frozen importlib._bootstrap_external>:64
     9     92   0    13288   0   2983644  76 /home/zhuyifei1999/cpython/Lib/abc.py:86
<2424 more rows. Type e.g. '_.more' to view.>
>>> _.more
 Index  Count   %     Size   % Cumulative  % Producer (line of allocation)
    10     42   0    13000   0   2996644  76 /home/zhuyifei1999/cpython/Lib/enum.py:164
    11    127   0    11648   0   3008292  76 /home/zhuyifei1999/cpython/Lib/_collections_abc.py:73
    12     75   0    10800   0   3019092  77 <frozen importlib._bootstrap>:344
    13    137   0     9518   0   3028610  77 /home/zhuyifei1999/cpython/Lib/opcode.py:37
    14     52   0     7908   0   3036518  77 <unknown>:0
    15     74   0     7696   0   3044214  77 /home/zhuyifei1999/cpython/Lib/sre_constants.py:59
    16     24   0     6204   0   3050418  77 /home/zhuyifei1999/cpython/Lib/ctypes/__init__.py:495
    17     26   0     6032   0   3056450  78 /home/zhuyifei1999/cpython/Lib/abc.py:24
    18     58   0     6032   0   3062482  78 <frozen importlib._bootstrap_external>:942
    19      1   0     4696   0   3067178  78 /home/zhuyifei1999/cpython/Lib/opcode.py:36
<2414 more rows. Type e.g. '_.more' to view.>
>>> _.more
 Index  Count   %     Size   % Cumulative  % Producer (line of allocation)
    20      1   0     4696   0   3071874  78 /home/zhuyifei1999/cpython/Lib/tokenize.py:138
    21     56   0     4586   0   3076460  78
                                             /home/zhuyifei1999/cpython/Lib/collections/__init__.py:
                                             394
    22     73   0     4524   0   3080984  78 /home/zhuyifei1999/cpython/Lib/sre_constants.py:68
    23     66   0     4451   0   3085435  78
                                             /home/zhuyifei1999/cpython/Lib/collections/__init__.py:
                                             434
    24     48   0     4416   0   3089851  78 /home/zhuyifei1999/guppy3/guppy/heapy/Path.py:494
    25     16   0     4136   0   3093987  78 /home/zhuyifei1999/cpython/Lib/ctypes/__init__.py:99
    26     15   0     4086   0   3098073  79 /home/zhuyifei1999/cpython/Lib/ctypes/__init__.py:160
    27     15   0     4086   0   3102159  79 /home/zhuyifei1999/cpython/Lib/ctypes/__init__.py:164
    28     15   0     4086   0   3106245  79 /home/zhuyifei1999/cpython/Lib/ctypes/__init__.py:168
    29     15   0     4086   0   3110331  79 /home/zhuyifei1999/cpython/Lib/ctypes/__init__.py:172
<2404 more rows. Type e.g. '_.more' to view.>
>>> _.more
 Index  Count   %     Size   % Cumulative  % Producer (line of allocation)
    30     15   0     4086   0   3114417  79 /home/zhuyifei1999/cpython/Lib/ctypes/__init__.py:181
    31     15   0     4086   0   3118503  79 /home/zhuyifei1999/cpython/Lib/ctypes/__init__.py:185
    32     15   0     4086   0   3122589  79 /home/zhuyifei1999/cpython/Lib/ctypes/__init__.py:189
    33     15   0     4086   0   3126675  79 /home/zhuyifei1999/cpython/Lib/ctypes/__init__.py:193
    34      6   0     3968   0   3130643  79 /home/zhuyifei1999/guppy3/guppy/heapy/UniSet.py:1500
    35      7   0     3784   0   3134427  79 /home/zhuyifei1999/guppy3/guppy/heapy/Use.py:4
    36      8   0     3672   0   3138099  80 /home/zhuyifei1999/cpython/Lib/_weakrefset.py:35
    37      8   0     3672   0   3141771  80 /home/zhuyifei1999/guppy3/guppy/heapy/UniSet.py:694
    38     10   0     3632   0   3145403  80 /home/zhuyifei1999/guppy3/guppy/heapy/UniSet.py:6
    39      9   0     3568   0   3148971  80 /home/zhuyifei1999/guppy3/guppy/heapy/UniSet.py:327
<2394 more rows. Type e.g. '_.more' to view.>
>>> h = _
>>> h[24].byclodo
Partition of a set of 48 objects. Total size = 4416 bytes.
 Index  Count   %     Size   % Cumulative  % Kind (class / dict of class)
     0     12  25     2784  63      2784  63 dict (no owner)
     1     12  25      864  20      3648  83 builtins.weakref
     2     24  50      768  17      4416 100 int
>>> h[34].byclodo
Partition of a set of 6 objects. Total size = 3968 bytes.
 Index  Count   %     Size   % Cumulative  % Kind (class / dict of class)
     0      1  17     2272  57      2272  57 dict of type
     1      1  17     1472  37      3744  94 type
     2      2  33      120   3      3864  97 tuple
     3      1  17       72   2      3936  99 builtins.weakref
     4      1  17       32   1      3968 100 int
>>> h[35].byclodo
Partition of a set of 7 objects. Total size = 3784 bytes.
 Index  Count   %     Size   % Cumulative  % Kind (class / dict of class)
     0      1  14     2272  60      2272  60 dict of type
     1      1  14     1064  28      3336  88 type
     2      1  14      232   6      3568  94 dict (no owner)
     3      2  29      112   3      3680  97 tuple
     4      1  14       72   2      3752  99 builtins.weakref
     5      1  14       32   1      3784 100 int

from guppy3.

zhuyifei1999 avatar zhuyifei1999 commented on July 18, 2024

These are class creations:

>>> for i in __import__('tracemalloc').get_object_traceback(h[24].byid[0].theone).format(): print(i)
... 
  File "/home/zhuyifei1999/guppy3/guppy/heapy/Path.py", line 330
    return PathsIter(self, start, stop)
  File "/home/zhuyifei1999/guppy3/guppy/heapy/Path.py", line 171
    self.reset(start)
  File "/home/zhuyifei1999/guppy3/guppy/heapy/Path.py", line 204
    sr = self.mod.sortedrels(self.paths.IG, src)
  File "/home/zhuyifei1999/guppy3/guppy/heapy/Path.py", line 514
    for rel in self.relations(src, dst):
  File "/home/zhuyifei1999/guppy3/guppy/heapy/Path.py", line 545
    tab.append(self.rel_table[i](r))
  File "/home/zhuyifei1999/guppy3/guppy/etc/Glue.py", line 50
    return self._share.getattr(self, name)
  File "/home/zhuyifei1999/guppy3/guppy/etc/Glue.py", line 209
    d = self.getattr2(inter, cache, owner, name)
  File "/home/zhuyifei1999/guppy3/guppy/etc/Glue.py", line 229
    x = self.getattr3(inter, name)
  File "/home/zhuyifei1999/guppy3/guppy/etc/Glue.py", line 323
    x = f()
  File "/home/zhuyifei1999/guppy3/guppy/heapy/Path.py", line 494
    class r(c, self.RelationBase):
>>> for i in __import__('tracemalloc').get_object_traceback(h[34].byid[0].theone).format(): print(i)
... 
  File "/home/zhuyifei1999/guppy3/guppy/etc/Glue.py", line 50
    return self._share.getattr(self, name)
  File "/home/zhuyifei1999/guppy3/guppy/etc/Glue.py", line 209
    d = self.getattr2(inter, cache, owner, name)
  File "/home/zhuyifei1999/guppy3/guppy/etc/Glue.py", line 227
    x = self.getattr_package(inter, name)
  File "/home/zhuyifei1999/guppy3/guppy/etc/Glue.py", line 262
    x = __import__(self.makeName(name))
  File "<frozen importlib._bootstrap>", line 991
  File "<frozen importlib._bootstrap>", line 975
  File "<frozen importlib._bootstrap>", line 671
  File "<frozen importlib._bootstrap_external>", line 783
  File "<frozen importlib._bootstrap>", line 219
  File "/home/zhuyifei1999/guppy3/guppy/heapy/UniSet.py", line 1500
    class IdentitySetFamily(AtomFamily):
>>> for i in __import__('tracemalloc').get_object_traceback(h[35].byid[0].theone).format(): print(i)
... 
  File "/home/zhuyifei1999/guppy3/guppy/etc/Glue.py", line 50
    return self._share.getattr(self, name)
  File "/home/zhuyifei1999/guppy3/guppy/etc/Glue.py", line 209
    d = self.getattr2(inter, cache, owner, name)
  File "/home/zhuyifei1999/guppy3/guppy/etc/Glue.py", line 227
    x = self.getattr_package(inter, name)
  File "/home/zhuyifei1999/guppy3/guppy/etc/Glue.py", line 262
    x = __import__(self.makeName(name))
  File "<frozen importlib._bootstrap>", line 991
  File "<frozen importlib._bootstrap>", line 975
  File "<frozen importlib._bootstrap>", line 671
  File "<frozen importlib._bootstrap_external>", line 783
  File "<frozen importlib._bootstrap>", line 219
  File "/home/zhuyifei1999/guppy3/guppy/heapy/Use.py", line 4
    class _GLUECLAMP_(guppy.etc.Glue.Interface):
>>> h[24].sp
 0: hp.Root.??.f_globals['R_ATTRIBUTE']->tp_subclasses
 1: hp.Root.??.f_globals['R_CELL']->tp_subclasses
 2: hp.Root.??.f_globals['R_HASATTR']->tp_subclasses
 3: hp.Root.??.f_globals['R_IDENTITY']->tp_subclasses
 4: hp.Root.??.f_globals['R_INDEXKEY']->tp_subclasses
 5: hp.Root.??.f_globals['R_INDEXVAL']->tp_subclasses
 6: hp.Root.??.f_globals['R_INTERATTR']->tp_subclasses
 7: hp.Root.??.f_globals['R_LIMIT']->tp_subclasses
 8: hp.Root.??.f_globals['R_LOCAL_VAR']->tp_subclasses
 9: hp.Root.??.f_globals['R_NORELATION']->tp_subclasses
>>> h[34].sp
 0: hp.Root.i0_modules['guppy.heapy.UniSet'].__dict__['IdentitySetFamily']
>>> h[35].sp
 0: hp.Root.i0_modules['guppy.heapy.Use'].__dict__['_GLUECLAMP_']
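
For readers who want to repeat the traceback lookups above outside of heapy, here is a minimal sketch using only the standard `tracemalloc` module. The helper name `allocation_traceback` and the `payload` object are made up for illustration; `tracemalloc.get_object_traceback` returns `None` for objects that were not traced, which the helper maps to an empty list.

```python
import tracemalloc


def allocation_traceback(obj):
    """Return the formatted allocation traceback of obj, or [] if it was not traced."""
    tb = tracemalloc.get_object_traceback(obj)
    return tb.format() if tb is not None else []


tracemalloc.start(10)      # keep up to 10 frames per allocation, like -X tracemalloc=10
payload = bytes(100_000)   # hypothetical object, allocated while tracing is on
lines = allocation_traceback(payload)
tracemalloc.stop()

for line in lines:
    print(line)
```

A large `bytes` object is used here only because it is certain to be a fresh allocation; small objects recycled from an interpreter free list may have no trace.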

It sounds painful to somehow hide all the 'static' stuff (that is, objects that are not associated with a session). Should it be hidden?

from guppy3.

zhuyifei1999 avatar zhuyifei1999 commented on July 18, 2024

Ah, dicts that have the hiding tag are simply not traversed, like a frame:

https://github.com/zhuyifei1999/guppy3/blob/master/src/heapy/stdtypes.c#L68

This would only apply to the dicts, though. I have no idea why the owners of the dicts became hidden:

$ python -X tracemalloc=10 -ic 'hp = __import__("guppy").hpy()'
>>> from guppy.etc import Glue
>>> h = hp.heap()
>>> h&hp.Type(Glue.Share)
hp.Nothing
>>> h&hp.Type(Glue.Interface)
hp.Nothing

from guppy3.

svenil avatar svenil commented on July 18, 2024

Chances are the only entry point to the internals of Heapy in this case is the top-level hp object, the dict of which is not traversed. But the hp object and its dict are included in heap().

>>> h=hp.heap()
>>> h&hp.iso(hp)
Partition of a set of 1 object. Total size = 24 bytes.
 Index  Count   %     Size   % Cumulative  % Kind (class / dict of class)
     0      1 100       24 100        24 100 guppy.heapy.Use._GLUECLAMP_
>>> h&hp.iso(hp.__dict__)
Partition of a set of 1 object. Total size = 192 bytes.
 Index  Count   %     Size   % Cumulative  % Kind (class / dict of class)
     0      1 100      192 100       192 100 dict of guppy.heapy.Use._GLUECLAMP_
>>> h&hp.iso(hp.View)
hp.Nothing
>>> 

But creating another entry point:

>>> x=hp.View
>>> h=hp.heap()
>>> h&hp.iso(hp.View)
Partition of a set of 1 object. Total size = 24 bytes.
 Index  Count   %     Size   % Cumulative  % Kind (class / dict of class)
     0      1 100       24 100        24 100 guppy.etc.Glue.Interface

I think this could be fixed in hv.c, in hv_is_obj_hidden by checking the dict of the object, and by checking hv_is_obj_hidden in hv_heap_rec.

static int
hv_heap_rec(PyObject *obj, HeapTravArg *ta) {
    int r;
    if (hv_is_obj_hidden(ta->hv, obj) && obj->ob_type != &NyRootState_Type)
        return 0;
    r = NyNodeSet_setobj(ta->visited, obj);
    if (r)
        return r < 0 ? r : 0;
    else
        return PyList_Append(ta->to_visit, obj);
}
int
hv_is_obj_hidden(NyHeapViewObject *hv, PyObject *obj)
{
    PyTypeObject *type = Py_TYPE(obj);
    ExtraType *xt = hv_extra_type(hv, type);
    if (xt->xt_trav_code == XT_HE) {
        Py_ssize_t offs = xt->xt_he_offs;
        PyObject **phe = (PyObject **)((char *)obj + offs);
        if (*phe == hv->_hiding_tag_) {
            return 1;
        }
    } else if (xt->xt_trav_code == XT_HI) {
        return 1;
    } else if (type == &NyRootState_Type) {
        /* Fixes a dominos confusion; see Notes Apr 20 2005 */
        return 1;
    } else {
        PyObject **dp = 0;
        dp = _PyObject_GetDictPtr(obj);
        if (dp && *dp && (PyDict_GetItem(*dp, _hiding_tag__name) == hv->_hiding_tag_)) {
            return 1;
        }
    }
    return 0;
}

Checking this fix...

>>> x=hp.View
>>> h=hp.heap()
>>> h&hp.iso(hp.View)
hp.Nothing
>>> 

from guppy3.

zhuyifei1999 avatar zhuyifei1999 commented on July 18, 2024

Chances are the only entry point to the internals of Heapy in this case is the top-level hp object, the dict of which is not traversed.

Ah I see. By the way, I see in hv_cli_dictof_dictptr that if (PyType_Check(obj)) /* Doesnt work generally; Note Apr 8 2005 */. What does that refer to?

from guppy3.

svenil avatar svenil commented on July 18, 2024

Yeah, I have seen that too and I wondered also. Unfortunately I lost the Note file several years ago when I got a new computer:-( Is the type check superfluous in that _PyObject_GetDictPtr would work always?

from guppy3.

svenil avatar svenil commented on July 18, 2024

The fix may have some performance issues to investigate. BTW, now hp and hp.__dict__ are not included, if it matters.

>>> h&hp.iso(hp)
hp.Nothing
>>> h&hp.iso(hp.__dict__)
hp.Nothing
>>> 

from guppy3.

zhuyifei1999 avatar zhuyifei1999 commented on July 18, 2024

The fix may have some performance issues to investigate. BTW, now hp and hp.__dict__ are not included, if it matters.

Test added, so I won't make the same mistake again ;)

from guppy3.

zhuyifei1999 avatar zhuyifei1999 commented on July 18, 2024

Is the type check superfluous in that _PyObject_GetDictPtr would work always?

The guppy tests pass, and a simple test for an object included in a dict works just fine for me:

$ python -X tracemalloc=10 -ic 'hp = __import__("guppy").hpy()'
>>> class A:
...  def b():
...   pass
... 
>>> hp.iso(A.b).sp
 0: hp.Root.i0_modules['__main__'].__dict__['A'].__dict__['b']

If we ever discover something is wrong with this then we will need a test case.

from guppy3.

svenil avatar svenil commented on July 18, 2024

Testing with -X tracemalloc makes it 'hang' or take a very long time in test_secondary_interpreter:

python3 -X tracemalloc -ic 'hp = __import__("guppy").hpy()'
>>> hp.test()
imported: guppy.heapy.test.test_dependencies
Testing sets
Test #0
...
test_1 (guppy.heapy.test.test_Path.RootTestCase) ... ok
test_secondary_interpreter (guppy.heapy.test.test_Path.RootTestCase) ... 

This is on a checkout from the byprod branch with no other changes, tested with python3.9.0a0 as well as python3.6.8.
I'll let it continue, but it never seems to return.

from guppy3.

zhuyifei1999 avatar zhuyifei1999 commented on July 18, 2024

o.O It works fine on Travis and on my side, with 3.8.0 & 3.7.5. I'll see if I can get python3.9.0a0 and test that. If that doesn't work then I'll get Ubuntu, probably in a docker container or something.

from guppy3.

svenil avatar svenil commented on July 18, 2024

I'm still waiting for the 3.6.8 run to complete. Are you sure you test with -X tracemalloc in Travis?

from guppy3.

zhuyifei1999 avatar zhuyifei1999 commented on July 18, 2024

Oh right. Oops. Good point.

from guppy3.

zhuyifei1999 avatar zhuyifei1999 commented on July 18, 2024

Ok, can reproduce. Happens also on master branch.

from guppy3.

zhuyifei1999 avatar zhuyifei1999 commented on July 18, 2024

There is a deadlock:

gef➤  info threads 
  Id   Target Id                                  Frame 
* 1    Thread 0x7ffff7c1d740 (LWP 70402) "python" futex_abstimed_wait_cancelable (private=0x0, abstime=0x0, clockid=0x0, expected=0x0, futex_word=0x5555561958f0) at /var/tmp/portage/sys-libs/glibc-2.30-r2/work/glibc-2.30/nptl/../sysdeps/unix/sysv/linux/futex-internal.h:208
  2    Thread 0x7ffff5d9d700 (LWP 70413) "python" futex_abstimed_wait_cancelable (private=0x0, abstime=0x7ffff5d9cd20, clockid=<optimized out>, expected=0x0, futex_word=0x5555558dbd48 <_PyRuntime+1224>) at /var/tmp/portage/sys-libs/glibc-2.30-r2/work/glibc-2.30/nptl/../sysdeps/unix/sysv/linux/futex-internal.h:208
gef➤  thread apply 1 bt

Thread 1 (Thread 0x7ffff7c1d740 (LWP 70402)):
#0  futex_abstimed_wait_cancelable (private=0x0, abstime=0x0, clockid=0x0, expected=0x0, futex_word=0x5555561958f0) at /var/tmp/portage/sys-libs/glibc-2.30-r2/work/glibc-2.30/nptl/../sysdeps/unix/sysv/linux/futex-internal.h:208
#1  do_futex_wait (sem=sem@entry=0x5555561958f0, abstime=0x0, clockid=0x0) at /var/tmp/portage/sys-libs/glibc-2.30-r2/work/glibc-2.30/nptl/sem_waitcommon.c:112
#2  0x00007ffff7f4fa38 in __new_sem_wait_slow (sem=sem@entry=0x5555561958f0, abstime=0x0, clockid=0x0) at /var/tmp/portage/sys-libs/glibc-2.30-r2/work/glibc-2.30/nptl/sem_waitcommon.c:184
#3  0x00007ffff7f4faed in __new_sem_wait (sem=sem@entry=0x5555561958f0) at /var/tmp/portage/sys-libs/glibc-2.30-r2/work/glibc-2.30/nptl/sem_wait.c:42
#4  0x00005555556c64c4 in PyThread_acquire_lock_timed (intr_flag=<optimized out>, microseconds=<optimized out>, lock=<optimized out>) at /home/zhuyifei1999/cpython/Python/thread_pthread.h:459
#5  PyThread_acquire_lock (lock=lock@entry=0x5555561958f0, waitflag=waitflag@entry=0x1) at /home/zhuyifei1999/cpython/Python/thread_pthread.h:697
#6  0x00007ffff6d2125b in hp_interpreter (self=<optimized out>, args=<optimized out>) at /home/zhuyifei1999/guppy3/src/heapy/interpreter.c:181
#7  0x00005555555c81e8 in cfunction_call_varargs (func=0x7ffff6d57a90, args=<optimized out>, kwargs=<optimized out>) at /home/zhuyifei1999/cpython/Objects/call.c:757
[...]
gef➤  thread apply 2 bt

Thread 2 (Thread 0x7ffff5d9d700 (LWP 70413)):
#0  futex_abstimed_wait_cancelable (private=0x0, abstime=0x7ffff5d9cd20, clockid=<optimized out>, expected=0x0, futex_word=0x5555558dbd48 <_PyRuntime+1224>) at /var/tmp/portage/sys-libs/glibc-2.30-r2/work/glibc-2.30/nptl/../sysdeps/unix/sysv/linux/futex-internal.h:208
#1  __pthread_cond_wait_common (abstime=0x7ffff5d9cd20, clockid=<optimized out>, mutex=0x5555558dbd50 <_PyRuntime+1232>, cond=0x5555558dbd20 <_PyRuntime+1184>) at /var/tmp/portage/sys-libs/glibc-2.30-r2/work/glibc-2.30/nptl/pthread_cond_wait.c:520
#2  __pthread_cond_timedwait (cond=cond@entry=0x5555558dbd20 <_PyRuntime+1184>, mutex=mutex@entry=0x5555558dbd50 <_PyRuntime+1232>, abstime=abstime@entry=0x7ffff5d9cd20) at /var/tmp/portage/sys-libs/glibc-2.30-r2/work/glibc-2.30/nptl/pthread_cond_wait.c:656
#3  0x00005555556755fd in PyCOND_TIMEDWAIT (us=<optimized out>, mut=0x5555558dbd50 <_PyRuntime+1232>, cond=0x5555558dbd20 <_PyRuntime+1184>) at /home/zhuyifei1999/cpython/Python/condvar.h:73
#4  take_gil (ceval=0x5555558dbac8 <_PyRuntime+584>, tstate=tstate@entry=0x7ffff0000b60) at /home/zhuyifei1999/cpython/Python/ceval_gil.h:206
#5  0x0000555555676966 in PyEval_RestoreThread (tstate=tstate@entry=0x7ffff0000b60) at /home/zhuyifei1999/cpython/Python/ceval.c:399
#6  0x00005555556b466b in PyGILState_Ensure () at /home/zhuyifei1999/cpython/Python/pystate.c:1298
#7  0x0000555555739d68 in tracemalloc_raw_alloc (elsize=0xa98, nelem=0x1, ctx=0x5555558debc8 <allocators+40>, use_calloc=0x0) at /home/zhuyifei1999/cpython/Modules/_tracemalloc.c:833
#8  tracemalloc_raw_alloc (elsize=0xa98, nelem=0x1, ctx=0x5555558debc8 <allocators+40>, use_calloc=0x0) at /home/zhuyifei1999/cpython/Modules/_tracemalloc.c:815
#9  tracemalloc_raw_malloc (ctx=0x5555558debc8 <allocators+40>, size=0xa98) at /home/zhuyifei1999/cpython/Modules/_tracemalloc.c:845
#10 0x00005555556b2914 in PyInterpreterState_New () at /home/zhuyifei1999/cpython/Python/pystate.c:200
#11 0x00005555556b1697 in new_interpreter (tstate_p=<synthetic pointer>) at /home/zhuyifei1999/cpython/Python/pylifecycle.c:1416
#12 Py_NewInterpreter () at /home/zhuyifei1999/cpython/Python/pylifecycle.c:1565
#13 0x00007ffff6d21350 in t_bootstrap (boot_raw=boot_raw@entry=0x7ffff60f24f0) at /home/zhuyifei1999/guppy3/src/heapy/interpreter.c:54
#14 0x00005555556c5cb7 in pythread_wrapper (arg=<optimized out>) at /home/zhuyifei1999/cpython/Python/thread_pthread.h:232
#15 0x00007ffff7f453d7 in start_thread (arg=<optimized out>) at /var/tmp/portage/sys-libs/glibc-2.30-r2/work/glibc-2.30/nptl/pthread_create.c:479
#16 0x00007ffff7d29f4f in clone () at /usr/src/debug/sys-libs/glibc-2.30-r2/glibc-2.30/sysdeps/unix/sysv/linux/x86_64/clone.S:95

from guppy3.

zhuyifei1999 avatar zhuyifei1999 commented on July 18, 2024

This seems to be an issue of Python core itself:

zhuyifei1999@zhuyifei1999-ThinkPad-T480 ~/cpython $ ./python -c 'import _xxsubinterpreters; _xxsubinterpreters.run_string(_xxsubinterpreters.create(), "print(\"hi\")")'
hi
zhuyifei1999@zhuyifei1999-ThinkPad-T480 ~/cpython $ ./python -X tracemalloc -c 'import _xxsubinterpreters; _xxsubinterpreters.run_string(_xxsubinterpreters.create(), "print(\"hi\")")'
^C^\Quit (core dumped)

tracemalloc hangs while creating a new interpreter: the current thread state is NULL at that point, which makes PyGILState_Ensure think the GIL is not held, so it waits on the GIL forever.

from guppy3.

zhuyifei1999 avatar zhuyifei1999 commented on July 18, 2024

This seems to be (or at least used to be) a known issue: https://bugs.python.org/issue18874#msg196720 I wonder how it is worked around in the Python core tests.

from guppy3.

zhuyifei1999 avatar zhuyifei1999 commented on July 18, 2024

https://github.com/python/cpython/blob/65444cf7fe84d8ca1f9b51c7f5992210751e08bb/Lib/test/support/__init__.py#L3007-L3024

It skips it. Okay...

from guppy3.

svenil avatar svenil commented on July 18, 2024

The _tracemalloc.c file uses plenty of things associated with the GIL. Why does that matter if it doesn't work with multi-threading at all anyway?

If we can provide a patch, we would get some fame :-)

from guppy3.

svenil avatar svenil commented on July 18, 2024
grep -c -i gil ~/git/cpython/Modules/_tracemalloc.c 
39

One would think they were there for a reason, but it doesn't work with multi-threading anyway.

from guppy3.

zhuyifei1999 avatar zhuyifei1999 commented on July 18, 2024

Why does that matter if it doesn't work with multi-threading at all anyway?

It supports multi-threading, but not multi-interpreter. It uses the PyGILState_*() API, which doesn't work well with multiple interpreters (1 2 3). The discussions don't indicate a way to fix it without changes to the C API, and they haven't made much progress for years.

I'd hope that PEP 554 gives multi interpreter more priority.

from guppy3.

svenil avatar svenil commented on July 18, 2024

Oh, I confused the terms and didn't know the difference. Thanks for the explanation. So it would be hard to fix for tracemalloc, even in future versions of Python?

from guppy3.

zhuyifei1999 avatar zhuyifei1999 commented on July 18, 2024

I can't say how hard it is. I don't really understand the thread state and GIL internals. AFAICT if they fix this they will probably introduce new C API or backwards-incompatible changes.

from guppy3.

zhuyifei1999 avatar zhuyifei1999 commented on July 18, 2024

I wonder whether only starting an interpreter has an issue with tracemalloc (so that tracemalloc runs fine once the new interpreter is started), or whether tracemalloc breaks with multiple interpreters in general.
If it's the former, we can temporarily disable tracemalloc while we start new interpreters.

from guppy3.

zhuyifei1999 avatar zhuyifei1999 commented on July 18, 2024

I wonder if it would be a good idea to report it to https://github.com/ericsnowcurrently/multi-core-python

from guppy3.

svenil avatar svenil commented on July 18, 2024

I wonder whether only starting an interpreter has an issue with tracemalloc (so that tracemalloc runs fine once the new interpreter is started), or whether tracemalloc breaks with multiple interpreters in general.
If it's the former, we can temporarily disable tracemalloc while we start new interpreters.

I think that is a good idea and I think it can work. Though I have not tested tracemalloc in the new interpreter and there may be some missing decrefs in this code. But it passes the current test. I am providing this code as is and as an unfinished example that I may look at tomorrow if you don't fix it up or you have already implemented it.

diff --git a/src/heapy/interpreter.c b/src/heapy/interpreter.c
index 5a9b993..98613bf 100644
--- a/src/heapy/interpreter.c
+++ b/src/heapy/interpreter.c
@@ -31,6 +31,7 @@ static char hp_set_async_exc_doc[] =
 ;
 
 #include "pythread.h"
+#include "pymem.h"
 #include "eval.h"
 
 struct bootstate {
@@ -38,8 +39,31 @@ struct bootstate {
     PyObject *locals;
     // used by child to signal parent that thread has started
     PyThread_type_lock evt_ready;
+    PyObject *start;
 };
 
+static PyObject *
+trace_stop()
+{
+  if (!_Py_tracemalloc_config.tracing)
+    return 0;
+  else {
+    PyObject *t =    PyImport_ImportModule("tracemalloc");
+    printf("trace_stop called\n");
+    if (t) {
+      PyObject *start = PyObject_GetAttrString(t, "start");
+      PyObject *stop = PyObject_GetAttrString(t, "stop");
+      if (!(start && stop && PyObject_CallFunction(stop, 0)))
+	return 0;
+      else {
+	printf("stop succeeded\n");
+	return start;
+      }
+    } else
+      return 0;
+  }
+}
+
 static void
 t_bootstrap(void *boot_raw)
 {
@@ -67,6 +91,9 @@ t_bootstrap(void *boot_raw)
         PyThread_exit_thread();
     }
 
+    if (boot->start)
+      PyObject_CallFunction(boot->start, "");
+
     // return GIL to parent, wait for it to unlock
     PyThread_release_lock(boot->evt_ready);
     PyEval_RestoreThread(tstate);
@@ -159,6 +186,8 @@ hp_interpreter(PyObject *self, PyObject *args)
     Py_INCREF(cmd);
     Py_XINCREF(locals);
 
+    boot->start = trace_stop();
+
     PyEval_InitThreads(); // Start the interpreter's thread-awareness
 
     evt_ready = PyThread_allocate_lock();

from guppy3.

svenil avatar svenil commented on July 18, 2024

Humm, one may consider checking for tracing via the module function is_tracing instead of directly on the bare C metal in the struct. It's possible the struct _Py_tracemalloc_config is not always available in all versions. I'll see what we can come up with tomorrow.

from guppy3.

svenil avatar svenil commented on July 18, 2024

And the start call should be moved up before the !tstate check... good night

from guppy3.

svenil avatar svenil commented on July 18, 2024

And instead of importing tracemalloc in trace_stop, we might consider using some function to find it if it was already imported. But I couldn't find that function right now; I can look later. This matters especially if we were to use the module's is_tracing function, and only if it was imported. So there is some code to fix.

from guppy3.

zhuyifei1999 avatar zhuyifei1999 commented on July 18, 2024

find it if it was already imported

Unfortunately, I don't think whether it is imported can indicate if it is enabled:

$ python -X tracemalloc -c 'print("tracemalloc" in __import__("sys").modules, __import__("tracemalloc").is_tracing())'
False True
$ python -X tracemalloc -c 'print("_tracemalloc" in __import__("sys").modules, __import__("tracemalloc").is_tracing())'
False True
$ python -c 'print("_tracemalloc" in __import__("sys").modules, __import__("tracemalloc").is_tracing())'
False False
$ python -c 'print(__import__("tracemalloc").is_tracing(), "tracemalloc" in __import__("sys").modules)'
False True
$ python -X tracemalloc -c 'print(__import__("tracemalloc").is_tracing(), "tracemalloc" in __import__("sys").modules)'
True True
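
A small sketch that reproduces the comparison above from a single script: it launches a child interpreter with and without `-X tracemalloc` and reports whether `tracemalloc` was already in `sys.modules` before being imported, and what `is_tracing()` says afterwards. The `probe` helper and the `PROBE` snippet are made up for illustration, and the `PYTHONTRACEMALLOC` variable is removed from the child environment so that only the command-line flag decides.

```python
import os
import subprocess
import sys

# Code run in the child: check sys.modules first, then ask tracemalloc itself.
PROBE = (
    "import sys\n"
    "was_imported = 'tracemalloc' in sys.modules\n"
    "import tracemalloc\n"
    "print(was_imported, tracemalloc.is_tracing())\n"
)


def probe(*interpreter_flags):
    """Run PROBE in a fresh interpreter and return its two printed tokens."""
    env = {k: v for k, v in os.environ.items() if k != "PYTHONTRACEMALLOC"}
    result = subprocess.run(
        [sys.executable, *interpreter_flags, "-c", PROBE],
        stdout=subprocess.PIPE, universal_newlines=True, check=True, env=env,
    )
    return result.stdout.split()


flagged = probe("-X", "tracemalloc")
plain = probe()
print("with -X tracemalloc:", flagged)
print("without the flag:   ", plain)
```

The second token is the one that matters: only `is_tracing()` reflects whether tracing is active, whatever the import state was.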

from guppy3.

zhuyifei1999 avatar zhuyifei1999 commented on July 18, 2024

_Py_tracemalloc_config is exposed in python/cpython#10063 so that only works for Python >= 3.8. Let me see if I can make is_tracing() work.

from guppy3.

zhuyifei1999 avatar zhuyifei1999 commented on July 18, 2024

I'm thinking of this sort of pseudocode:

import tracemalloc

def interpreter(...):
    do_all_the_stuffs_until_new_thread()

    trace_depth = None

    if tracemalloc.is_tracing():
        trace_depth = tracemalloc.get_traceback_limit()
        tracemalloc.stop()

    try:
        return start_thread_and_wait_for_ready()
    finally:
        if trace_depth is not None:
            tracemalloc.start(trace_depth)
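
The pseudocode above could be made runnable roughly as follows, as a context manager built only on the public `tracemalloc` API. The name `tracemalloc_paused` is made up, and the `with` body stands in for the real thread start; this is a sketch of the idea, not what guppy3 ships.

```python
import contextlib
import tracemalloc


@contextlib.contextmanager
def tracemalloc_paused():
    """Stop tracemalloc for the duration of the block, then restart it with the same depth."""
    depth = None
    if tracemalloc.is_tracing():
        depth = tracemalloc.get_traceback_limit()
        tracemalloc.stop()
    try:
        yield
    finally:
        if depth is not None:
            tracemalloc.start(depth)


tracemalloc.start(5)
with tracemalloc_paused():
    # Placeholder for starting the new interpreter thread.
    tracing_inside = tracemalloc.is_tracing()
tracing_after = tracemalloc.is_tracing()
depth_after = tracemalloc.get_traceback_limit()
tracemalloc.stop()

print(tracing_inside, tracing_after, depth_after)
```

Note that `tracemalloc.stop()` is documented to discard the traces collected so far, so restarting restores the frame limit but not the earlier allocation records.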

from guppy3.

svenil avatar svenil commented on July 18, 2024

I had missed that tracemalloc.start took a parameter.
However, I think the start call should be done by the new thread before it releases the ready lock as in my code. Otherwise the new thread could release the lock and continue running without the tracing enabled until the original thread gets scheduled again, which for all one knows would be an indeterminate time later (say, until the new thread is blocked by waiting for something).

from guppy3.

svenil avatar svenil commented on July 18, 2024

Oops, tracemalloc.stop() clears the already gathered tracebacks. That's no good I suppose. Using .byprod on heap() afterwards even after tracemalloc.start() again shows None as producer on all the objects:-(
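
The effect can be reproduced with plain `tracemalloc`, without heapy. In this sketch `payload` is just a hypothetical object allocated while tracing is on; its traceback is looked up once before and once after a stop/start cycle.

```python
import tracemalloc

tracemalloc.start(10)
payload = bytes(100_000)  # allocated while tracing is enabled
before = tracemalloc.get_object_traceback(payload)

# Pause and resume tracing through the public API.
tracemalloc.stop()
tracemalloc.start(10)
after = tracemalloc.get_object_traceback(payload)
tracemalloc.stop()

print("traceback known before restart:", before is not None)
print("traceback known after restart: ", after is not None)
```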

from guppy3.

zhuyifei1999 avatar zhuyifei1999 commented on July 18, 2024

Oops, tracemalloc.stop() clears the already gathered tracebacks.

Oh no :( So temporarily stopping tracemalloc via the public API is a no-go then.

from guppy3.

zhuyifei1999 avatar zhuyifei1999 commented on July 18, 2024

Side note: Looking at how tracemalloc registers its hooks (https://github.com/python/cpython/blob/master/Modules/_tracemalloc.c#L1090), it gets the previous allocator and installs its own allocator, which chains into the previous one. But unregistration only sets the allocator back to the previous setting. I just thought of a rare non-ideal case: if there is another module that messes with the allocator settings, say fooalloc, and fooalloc is enabled after tracemalloc, and then tracemalloc is disabled, then fooalloc's allocator would go poof along with tracemalloc's.

By the same logic, if we were to reinvent tracemalloc, and someone uses both their tracemalloc and our reinvented tracemalloc, then disables theirs, ours would malfunction... so this is probably also a no-go.
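
The chaining problem described above can be illustrated with a toy model in pure Python. Nothing here touches the real allocator: `AllocatorSlot` and `Hook` are hypothetical stand-ins for the process-wide allocator setting and for modules such as tracemalloc or the imagined fooalloc, each of which saves the previous setting on install and restores it on uninstall.

```python
class AllocatorSlot:
    """Toy stand-in for the single process-wide allocator setting."""

    def __init__(self):
        self.current = "system"


class Hook:
    """Toy hook that chains to whatever was installed before it."""

    def __init__(self, name):
        self.name = name
        self.previous = None

    def install(self, slot):
        self.previous = slot.current
        slot.current = f"{self.name} -> {self.previous}"

    def uninstall(self, slot):
        # Restores the saved setting, ignoring anything installed later.
        slot.current = self.previous


slot = AllocatorSlot()
tracer = Hook("tracemalloc")
other = Hook("fooalloc")

tracer.install(slot)
other.install(slot)
chain_with_both = slot.current   # fooalloc sits on top of tracemalloc

tracer.uninstall(slot)
chain_after = slot.current       # fooalloc has been dropped as well

print(chain_with_both)
print(chain_after)
```

In this model, uninstalling in anything other than reverse install order silently discards every hook installed later.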

from guppy3.

zhuyifei1999 avatar zhuyifei1999 commented on July 18, 2024

Hmm... how about using PyMem_SetAllocator to temporarily set a bypassing allocator, so we go straight to the default allocator?

from guppy3.

zhuyifei1999 avatar zhuyifei1999 commented on July 18, 2024

We actually have a function for this. The downside? It's a Py_BUILD_CORE function.

from guppy3.

zhuyifei1999 avatar zhuyifei1999 commented on July 18, 2024

The other downside: it was added in Python 3.7

from guppy3.

zhuyifei1999 avatar zhuyifei1999 commented on July 18, 2024

AFAICT, the default malloc is always the system one (https://docs.python.org/3/c-api/memory.html#default-memory-allocators). I think we could just temporarily set it to the system one instead of using _PyMem_SetDefaultAllocator :/

from guppy3.

zhuyifei1999 avatar zhuyifei1999 commented on July 18, 2024

How does that look?

from guppy3.

svenil avatar svenil commented on July 18, 2024

Can confirm that tests pass with Python3.9.0a0 and Python3.6.8 and byprod worked after the tests. (I merged master with the byprod branch)

from guppy3.

zhuyifei1999 avatar zhuyifei1999 commented on July 18, 2024

I think I should do a release before getting the producer profile fully working and polishing it

from guppy3.

zhuyifei1999 avatar zhuyifei1999 commented on July 18, 2024

To describe what this was stuck on (before I moved on to work on something else): how can you make a classifier that classifies by only the filename, or only the line number? Or make a classifier that only matches unknowns?

The problem with byprod is that the information comes in pairs (filename, line number), and if I naively reproduce what happens in the AND classifier, I can't make a classifier that matches only the unknown producer.
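
For comparison only, the standard library's own snapshots already offer both groupings: `Snapshot.statistics` accepts the key types `'filename'` and `'lineno'`. The sketch below is a plain-tracemalloc analogue, not a heapy classifier, and the `blocks` list is a made-up workload.

```python
import tracemalloc

tracemalloc.start()
blocks = [bytes(1000 + i) for i in range(50)]  # hypothetical workload
snapshot = tracemalloc.take_snapshot()
tracemalloc.stop()

# Group the same traces by file only, and by (file, line).
by_file = snapshot.statistics("filename")
by_line = snapshot.statistics("lineno")

for stat in by_file[:3]:
    frame = stat.traceback[0]
    print(f"{frame.filename}: {stat.count} blocks, {stat.size} bytes")

for stat in by_line[:3]:
    frame = stat.traceback[0]
    print(f"{frame.filename}:{frame.lineno}: {stat.count} blocks, {stat.size} bytes")
```

Both groupings partition the same set of traces, so their total sizes agree; the per-file view simply merges all lines of one file into a single row.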

from guppy3.

zhuyifei1999 avatar zhuyifei1999 commented on July 18, 2024

Oh...

>>> (h & hp.Prod('<', 1).alt('<=')).byprod
Partition of a set of 64 objects. Total size = 9117 bytes.
 Index  Count   %     Size   % Cumulative  % Producer (line of allocation)
     0     52  81     7389  81      7389  81 <unknown>:0
     1     12  19     1728  19      9117 100 <string>:1
>>> (h & hp.Prod('<', None).alt('<=')).byprod
Partition of a set of 21995 objects. Total size = 2158223 bytes.
 Index  Count   %     Size   % Cumulative  % Producer (line of allocation)
     0  19131  87  1681571  78   1681571  78 <frozen importlib._bootstrap_external>:525
     1   1250   6   178304   8   1859875  86 <frozen importlib._bootstrap>:219
     2    131   1    85288   4   1945163  90 <frozen importlib._bootstrap_external>:606
     3    510   2    33283   2   1978446  92 <frozen importlib._bootstrap_external>:1408
     4     11   0    27128   1   2005574  93 <frozen importlib._bootstrap_external>:1416
     5     33   0    25696   1   2031270  94 <frozen importlib._bootstrap_external>:916
     6     30   0    16856   1   2048126  95 <frozen importlib._bootstrap_external>:800
     7    128   1    13132   1   2061258  96 <frozen importlib._bootstrap_external>:59
     8      7   0    10536   0   2071794  96 <frozen importlib._bootstrap>:751
     9     29   0     8952   0   2080746  96 <frozen importlib._bootstrap>:308
<46 more rows. Type e.g. '_.more' to view.>
>>> (h & hp.Prod('/', 10).alt('<')).byprod
Partition of a set of 145 objects. Total size = 36328 bytes.
 Index  Count   %     Size   % Cumulative  % Producer (line of allocation)
     0      7   5     3880  11      3880  11 /home/zhuyifei1999/guppy3/guppy/heapy/Use.py:4
     1     10   7     3136   9      7016  19 /usr/lib/python3.7/_weakrefset.py:10
     2     10   7     2600   7      9616  26 /home/zhuyifei1999/guppy3/guppy/heapy/Doc.py:1
     3     11   8     2504   7     12120  33 /home/zhuyifei1999/guppy3/guppy/heapy/Part.py:1
     4     10   7     2320   6     14440  40 /home/zhuyifei1999/guppy3/guppy/heapy/Path.py:6
     5      9   6     2240   6     16680  46 /home/zhuyifei1999/guppy3/guppy/heapy/View.py:1
     6      8   6     2168   6     18848  52
                                             /home/zhuyifei1999/guppy3/guppy/heapy/OutputHandling.py
                                             :7
     7      9   6     1984   5     20832  57 /home/zhuyifei1999/guppy3/guppy/etc/__init__.py:1
     8      7   5     1800   5     22632  62 /home/zhuyifei1999/guppy3/guppy/heapy/ImpSet.py:1
     9      6   4     1720   5     24352  67 /home/zhuyifei1999/guppy3/guppy/heapy/Target.py:5
<26 more rows. Type e.g. '_.more' to view.>
>>> (h & hp.Prod('/home/zhuyifei1999/guppy3/guppy/heapy/Use.py', 3).alt('>=') & hp.Prod('/home/zhuyifei1999/guppy3/guppy/heapy/Use.py', 10).alt('<=')).byprod
Partition of a set of 7 objects. Total size = 3880 bytes.
 Index  Count   %     Size   % Cumulative  % Producer (line of allocation)
     0      7 100     3880 100      3880 100 /home/zhuyifei1999/guppy3/guppy/heapy/Use.py:4
>>> (h & hp.Prod('/home/zhuyifei1999/guppy3/guppy/heapy/Use.py', None)).byprod
hp.Nothing
>>> (h & hp.Prod('/home/zhuyifei1999/guppy3/guppy/heapy/Use.py', None).alt('<')).byprod
Partition of a set of 28 objects. Total size = 6904 bytes.
 Index  Count   %     Size   % Cumulative  % Producer (line of allocation)
     0      7  25     3880  56      3880  56 /home/zhuyifei1999/guppy3/guppy/heapy/Use.py:4
     1      1   4      144   2      4024  58 /home/zhuyifei1999/guppy3/guppy/heapy/Use.py:118
     2      1   4      144   2      4168  60 /home/zhuyifei1999/guppy3/guppy/heapy/Use.py:126
     3      1   4      144   2      4312  62 /home/zhuyifei1999/guppy3/guppy/heapy/Use.py:164
     4      1   4      144   2      4456  65 /home/zhuyifei1999/guppy3/guppy/heapy/Use.py:186
     5      1   4      144   2      4600  67 /home/zhuyifei1999/guppy3/guppy/heapy/Use.py:24
     6      1   4      144   2      4744  69 /home/zhuyifei1999/guppy3/guppy/heapy/Use.py:27
     7      1   4      144   2      4888  71 /home/zhuyifei1999/guppy3/guppy/heapy/Use.py:287
     8      1   4      144   2      5032  73 /home/zhuyifei1999/guppy3/guppy/heapy/Use.py:294
     9      1   4      144   2      5176  75 /home/zhuyifei1999/guppy3/guppy/heapy/Use.py:299
<12 more rows. Type e.g. '_.more' to view.>
>>> (h & hp.Prod()).byprod
Partition of a set of 8447 objects. Total size = 1005157 bytes.
 Index  Count   %     Size   % Cumulative  % Producer (line of allocation)
     0   8447 100  1005157 100   1005157 100 unknown
>>> (h & hp.Prod(None, 1).alt('>=') & hp.Prod(None, 1).alt('<=')).byprod
Partition of a set of 65 objects. Total size = 14720 bytes.
 Index  Count   %     Size   % Cumulative  % Producer (line of allocation)
     0     10  15     2600  18      2600  18 /home/zhuyifei1999/guppy3/guppy/heapy/Doc.py:1
     1     11  17     2504  17      5104  35 /home/zhuyifei1999/guppy3/guppy/heapy/Part.py:1
     2      9  14     2240  15      7344  50 /home/zhuyifei1999/guppy3/guppy/heapy/View.py:1
     3      9  14     1984  13      9328  63 /home/zhuyifei1999/guppy3/guppy/etc/__init__.py:1
     4      7  11     1800  12     11128  76 /home/zhuyifei1999/guppy3/guppy/heapy/ImpSet.py:1
     5     12  18     1728  12     12856  87 <string>:1
     6      6   9     1720  12     14576  99 /home/zhuyifei1999/guppy3/guppy/heapy/__init__.py:1
     7      1   2      144   1     14720 100 /home/zhuyifei1999/guppy3/guppy/heapy/RefPat.py:1

Patch incoming soon (TM).

from guppy3.

zhuyifei1999 avatar zhuyifei1999 commented on July 18, 2024
>>> hp.Prod(guppy)
(hp.Prod(None, 1).alt('>=') & hp.Prod('/home/zhuyifei1999/guppy3/guppy/__init__.py', None).alt('<') & hp.Prod(None, 37).alt('<'))
>>> h & _
Partition of a set of 1 object. Total size = 144 bytes.
 Index  Count   %     Size   % Cumulative  % Kind (class / dict of class)
     0      1 100      144 100       144 100 function
>>> _.byprod
Partition of a set of 1 object. Total size = 144 bytes.
 Index  Count   %     Size   % Cumulative  % Producer (line of allocation)
     0      1 100      144 100       144 100 /home/zhuyifei1999/guppy3/guppy/__init__.py:19
>>> _.theone
<function hpy at 0x7f20f393e050>
>>> _ is guppy.hpy
True

from guppy3.

zhuyifei1999 avatar zhuyifei1999 commented on July 18, 2024
>>> hp.iso(guppy.hpy).prod
Traceback (most recent call first):
  File "/home/zhuyifei1999/guppy3/guppy/__init__.py", line 19
    def hpy(ht=None):
  File "<frozen importlib._bootstrap>", line 219
  File "<frozen importlib._bootstrap_external>", line 728
  File "<frozen importlib._bootstrap>", line 677
  File "<frozen importlib._bootstrap>", line 967
  File "<frozen importlib._bootstrap>", line 983
  File "<string>", line 1

from guppy3.

zhuyifei1999 avatar zhuyifei1999 commented on July 18, 2024

I'm thinking of adding two warnings because of its less-intuitive use compared to the rest of heapy:

  • If Python version < 3.8: warn on results being inaccurate
  • If everything in an identityset is classified as unknown and tracemalloc is disabled, warn that tracemalloc should be enabled for byprod to do anything
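
A minimal sketch of those two checks, using only the standard library. The function name `warn_if_byprod_unreliable` and both of its parameters are hypothetical, not guppy's actual API; `all_unknown` stands in for "every object in the set was classified as unknown".

```python
import sys
import tracemalloc
import warnings


def warn_if_byprod_unreliable(all_unknown, version_info=sys.version_info):
    """Hypothetical helper that emits the two proposed warnings."""
    # Check 1: tracemalloc traces may be inaccurate before Python 3.8.
    if version_info < (3, 8):
        warnings.warn(
            "tracemalloc on Python 3.7 and below may not record an "
            "accurate producer trace")
    # Check 2: nothing was classified and tracing is off, so say why.
    if all_unknown and not tracemalloc.is_tracing():
        warnings.warn(
            "tracemalloc is not tracing; byprod has no producer "
            "information to report")
```

Passing `version_info` explicitly is only there to make the version branch easy to exercise on any interpreter.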

How do these sound?

This also needs docs and tests before merge.

from guppy3.

svenil avatar svenil commented on July 18, 2024

from guppy3.

zhuyifei1999 avatar zhuyifei1999 commented on July 18, 2024

Formal documentation is difficult :(

Anyways, I think this is done. Anything to be fixed before merge?

from guppy3.

svenil avatar svenil commented on July 18, 2024

I get an unknown producer on iso() objects as well as on the entire heap().
Maybe it's something wrong in my installation, but I am using Python 3.9 that I compiled myself.

from guppy3.

zhuyifei1999 avatar zhuyifei1999 commented on July 18, 2024

tracemalloc enabled?

I haven't tested this branch on 3.9 support yet but multi-interpreter completely hangs.

from guppy3.

zhuyifei1999 avatar zhuyifei1999 commented on July 18, 2024

497ef41 - My WIP patch on Py3.9

from guppy3.

svenil avatar svenil commented on July 18, 2024

I had not enabled tracemalloc. Should we have a warning for that?

from guppy3.

zhuyifei1999 avatar zhuyifei1999 commented on July 18, 2024

6f35d5f#diff-8c727fb63492de2479412159060b42d6R371

This should have it.

from guppy3.

zhuyifei1999 avatar zhuyifei1999 commented on July 18, 2024
(venv) zhuyifei1999@zhuyifei1999-ThinkPad-T480 ~/guppy3 $ python -ic 'from guppy import hpy; hp=hpy(); h=hp.heap()'
>>> h.byprod
/home/zhuyifei1999/guppy3/guppy/heapy/Use.py:369: UserWarning: Python 3.7 and below tracemalloc may not record accurate producer trace. See https://bugs.python.org/issue35053
  "Python 3.7 and below tracemalloc may not record accurate "
/home/zhuyifei1999/guppy3/guppy/heapy/Use.py:373: UserWarning: Tracemalloc is not tracing. No producer profile available
  "Tracemalloc is not tracing. No producer profile available")
Partition of a set of 36573 objects. Total size = 4247366 bytes.
 Index  Count   %     Size   % Cumulative  % Producer (line of allocation)
     0  36573 100  4247366 100   4247366 100 unknown
>>> 
(venv) zhuyifei1999@zhuyifei1999-ThinkPad-T480 ~/guppy3 $ python -X tracemalloc -ic 'from guppy import hpy; hp=hpy(); h=hp.heap()'
>>> h.byprod
/home/zhuyifei1999/guppy3/guppy/heapy/Use.py:369: UserWarning: Python 3.7 and below tracemalloc may not record accurate producer trace. See https://bugs.python.org/issue35053
  "Python 3.7 and below tracemalloc may not record accurate "
Partition of a set of 36577 objects. Total size = 4247795 bytes.
 Index  Count   %     Size   % Cumulative  % Producer (line of allocation)
     0  19238  53  1691619  40   1691619  40 <frozen importlib._bootstrap_external>:525
     1   8446  23  1005233  24   2696852  63 unknown
     2   1251   3   178368   4   2875220  68 <frozen importlib._bootstrap>:219
     3    130   0    85496   2   2960716  70 <frozen importlib._bootstrap_external>:606
     4    260   1    79408   2   3040124  72 /usr/lib/python-
                                             exec/python3.7/../../../lib/python3.7/abc.py:126
     5    513   1    33462   1   3073586  72 <frozen importlib._bootstrap_external>:1408
     6     11   0    27128   1   3100714  73 <frozen importlib._bootstrap_external>:1416
     7     33   0    25696   1   3126410  74 <frozen importlib._bootstrap_external>:916
     8     98   0    23402   1   3149812  74 /usr/lib/python3.7/collections/__init__.py:397
     9    176   0    19816   0   3169628  75 /usr/lib/python-
                                             exec/python3.7/../../../lib/python3.7/abc.py:127
<2462 more rows. Type e.g. '_.more' to view.>

from guppy3.

svenil avatar svenil commented on July 18, 2024

I realise I got a warning the first time actually, but didn't see it, and there was only one warning. Maybe one could consider having a warning each time we use byprod, not just the first, but I don't know.
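
For context, seeing the warning only once matches how Python's `warnings` module behaves under the "default" filter action: a given warning is shown once per source location. The sketch below compares that with the "always" action; `byprod_like` and `count_warnings` are hypothetical stand-ins, not guppy code.

```python
import warnings


def byprod_like():
    # Stand-in for a call that warns on every use.
    warnings.warn("Tracemalloc is not tracing")


def count_warnings(action, calls=3):
    """Return how many warnings get through under the given filter action."""
    with warnings.catch_warnings(record=True) as caught:
        warnings.simplefilter(action)
        for _ in range(calls):
            byprod_like()
    return len(caught)


shown_by_default = count_warnings("default")  # once per source location
shown_always = count_warnings("always")       # once per call
```

With three calls, the "default" action lets one warning through and "always" lets all three through, which is the trade-off being discussed here.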

from guppy3.

zhuyifei1999 avatar zhuyifei1999 commented on July 18, 2024

I'd rather make an error than make a warning that emits every single time. Warnings get annoying fast :/

from guppy3.

svenil avatar svenil commented on July 18, 2024

May consider an error then!

from guppy3.

svenil avatar svenil commented on July 18, 2024

Consider giving a hint on how to enable tracemalloc. I didn't find a clue with python -h but had to look into how we did it the last time.

from guppy3.

zhuyifei1999 avatar zhuyifei1999 commented on July 18, 2024

I put a link to the docs:

(venv) zhuyifei1999@zhuyifei1999-ThinkPad-T480 ~/guppy3 $ python -ic 'from guppy import hpy; hp=hpy(); h=hp.heap()'
>>> h.byprod
/home/zhuyifei1999/guppy3/guppy/heapy/Use.py:122: UserWarning: Python 3.7 and below tracemalloc may not record accurate producer trace. See https://bugs.python.org/issue35053
  "Python 3.7 and below tracemalloc may not record accurate "
Traceback (most recent call last):
  File "/home/zhuyifei1999/guppy3/guppy/heapy/UniSet.py", line 1703, in get_partition
    p = a._partition
  File "/home/zhuyifei1999/guppy3/guppy/heapy/UniSet.py", line 74, in __getattr__
    return self.fam.mod.View.enter(lambda: self.fam.c_getattr(self, other))
  File "/home/zhuyifei1999/guppy3/guppy/heapy/View.py", line 256, in enter
    retval = func()
  File "/home/zhuyifei1999/guppy3/guppy/heapy/UniSet.py", line 74, in <lambda>
    return self.fam.mod.View.enter(lambda: self.fam.c_getattr(self, other))
  File "/home/zhuyifei1999/guppy3/guppy/heapy/UniSet.py", line 799, in c_getattr
    return self.c_getattr2(a, b)
  File "/home/zhuyifei1999/guppy3/guppy/heapy/UniSet.py", line 802, in c_getattr2
    raise AttributeError(b)
AttributeError: _partition

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/home/zhuyifei1999/guppy3/guppy/heapy/UniSet.py", line 172, in __repr__
    return self.fam.c_repr(self)
  File "/home/zhuyifei1999/guppy3/guppy/heapy/UniSet.py", line 862, in c_repr
    return self.c_str(a)
  File "/home/zhuyifei1999/guppy3/guppy/heapy/UniSet.py", line 1615, in c_str
    return str(self.get_more(a).at(-1))
  File "/home/zhuyifei1999/guppy3/guppy/heapy/UniSet.py", line 1693, in get_more
    return self.mod.OutputHandling.basic_more_printer(a, a.partition)
  File "/home/zhuyifei1999/guppy3/guppy/etc/Descriptor.py", line 32, in __get__
    return super().__get__(instance, owner)
  File "/home/zhuyifei1999/guppy3/guppy/heapy/UniSet.py", line 526, in <lambda>
    partition = property_exp(lambda self: self.fam.get_partition(self), doc="""\
  File "/home/zhuyifei1999/guppy3/guppy/heapy/UniSet.py", line 1706, in get_partition
    p = a.fam.Part.partition(a, a.er)
  File "/home/zhuyifei1999/guppy3/guppy/heapy/Part.py", line 783, in partition
    return SetPartition(self, set, er)
  File "/home/zhuyifei1999/guppy3/guppy/heapy/Part.py", line 692, in __init__
    for (kind, part) in classifier.partition(set.nodes)]
  File "/home/zhuyifei1999/guppy3/guppy/heapy/Classifiers.py", line 105, in partition
    for k, v in self.partition_cli(iterable):
  File "/home/zhuyifei1999/guppy3/guppy/heapy/Classifiers.py", line 114, in partition_cli
    self.cli.epartition)
  File "/home/zhuyifei1999/guppy3/guppy/etc/Descriptor.py", line 12, in __get__
    return self.fget(instance)
  File "/home/zhuyifei1999/guppy3/guppy/heapy/Classifiers.py", line 40, in _get_cli
    return self.get_cli()
  File "/home/zhuyifei1999/guppy3/guppy/heapy/Classifiers.py", line 1188, in get_cli
    self.mod.Use._check_tracemalloc()
  File "/home/zhuyifei1999/guppy3/guppy/heapy/Use.py", line 126, in _check_tracemalloc
    "Tracemalloc is not tracing. No producer profile available. "
RuntimeError: Tracemalloc is not tracing. No producer profile available. See https://docs.python.org/3/library/tracemalloc.html

from guppy3.

svenil avatar svenil commented on July 18, 2024

Maybe even "Do you want to enable it? (y/n)"

from guppy3.

zhuyifei1999 avatar zhuyifei1999 commented on July 18, 2024

It would be too late to enable it by then. If I run tracemalloc.start() then only the objects created after that would have their trace available.
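
A small standard-library sketch of that limitation. The variable names are only illustrative; `bytes(10_000)` is used because it is a fresh allocation each time.

```python
import tracemalloc

# Start the demonstration from a non-tracing state.
if tracemalloc.is_tracing():
    tracemalloc.stop()

early = bytes(10_000)   # allocated before tracing starts
tracemalloc.start()
late = bytes(10_000)    # allocated while tracing is active

# Only the allocation made after start() has a recorded traceback.
early_tb = tracemalloc.get_object_traceback(early)  # no trace was recorded
late_tb = tracemalloc.get_object_traceback(late)    # a Traceback object
tracemalloc.stop()
```

Here `early_tb` is `None`, so an object like `early` would end up in the "unknown" producer row.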

from guppy3.

svenil avatar svenil commented on July 18, 2024

I know, that's unfortunate but then you have the option to create new objects...

from guppy3.

zhuyifei1999 avatar zhuyifei1999 commented on July 18, 2024

I know, that's unfortunate but then you have the option to create new objects...

But then there would be more gotchas... I'd have to explain why, after answering "y" to "Do you want to enable it? (y/n)", one needs to rerun everything to get good information.

Not to mention this partition being called from a repr for a MorePrinter, so I'd somehow have to add an interactive prompt to it... and then what if stdin isn't even interactive? Someone calling guppy from a long-running server process... argh I don't want to think about user interface.

from guppy3.

svenil avatar svenil commented on July 18, 2024

"Tracing is not enabled. Type hp.tm to enable"

where hp is determined magically to be the hpy() object and tm is an attribute with a side effect to enable tracemalloc. Or if you prefer hp.tm()

from guppy3.

zhuyifei1999 avatar zhuyifei1999 commented on July 18, 2024

Hmm. Good idea.

from guppy3.

svenil avatar svenil commented on July 18, 2024

But maybe you want to explain that it doesn't apply to already allocated objects.

from guppy3.

zhuyifei1999 avatar zhuyifei1999 commented on July 18, 2024

Yeah, I still think the best idea would be to start tracemalloc as early as possible with -X tracemalloc or PYTHONTRACEMALLOC=1. I can't think of how to make that clear, and explain the reasoning, in one error message.
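
As a rough check of those two switches, the sketch below launches fresh interpreters and asks each one whether tracemalloc is already tracing at startup. The helper name `child_is_tracing` is hypothetical, and it assumes `sys.executable` points at a usable interpreter.

```python
import os
import subprocess
import sys

CHECK = "import tracemalloc; print(tracemalloc.is_tracing())"


def child_is_tracing(args=(), env_extra=None):
    """Run a fresh interpreter and report whether tracemalloc is tracing."""
    # Drop any inherited setting so each case starts from a clean slate.
    env = {k: v for k, v in os.environ.items() if k != "PYTHONTRACEMALLOC"}
    env.update(env_extra or {})
    result = subprocess.run(
        [sys.executable, *args, "-c", CHECK],
        env=env, stdout=subprocess.PIPE, universal_newlines=True, check=True)
    return result.stdout.strip() == "True"


plain = child_is_tracing()
with_flag = child_is_tracing(args=["-X", "tracemalloc"])
with_env = child_is_tracing(env_extra={"PYTHONTRACEMALLOC": "1"})
```

Both the flag and the environment variable enable tracing before any user code runs, which is why they catch allocations that a later `tracemalloc.start()` would miss.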

How about let's just keep it an error? If people went as far as finding the producer classifier from the obscure docs / code of guppy, I'm sure they won't mind reading some highly-readable official Python docs.

We could document all those APIs and how-tos, and especially how in the world guppy even works and how to extend guppy, with some more user-friendly, less mathematical documentation, but I'm a pretty bad doc writer. All I'm good at is emitting code 😂

from guppy3.

svenil avatar svenil commented on July 18, 2024

How about let's just keep it an error?

Yeah, that seems to be a good option.

Another idea I was contemplating was to have an argument to hpy() that enabled tracemalloc, so we didn't have to read the Python docs. Or even a special constructor. But I don't know if it's useful enough, and we would have to document it ourselves... and I agree, I prefer coding over writing docs myself too, although I have to write docs at work.
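
One way such an opt-in could look, as a standard-library sketch only. `ensure_tracing` is a hypothetical name, not an existing guppy function or hpy() argument.

```python
import tracemalloc


def ensure_tracing(nframe=1):
    """Hypothetical opt-in helper: start tracemalloc if it is not running.

    Returns True if this call started tracing and False if tracing was
    already active. Objects allocated before the call still have no trace.
    """
    if tracemalloc.is_tracing():
        return False
    tracemalloc.start(nframe)
    return True
```

The return value lets a caller tell the user that earlier allocations will still show up as unknown when tracing was only just started.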

from guppy3.

svenil avatar svenil commented on July 18, 2024

BTW x.byprod is missing from the formal documentation of IdentitySet

from guppy3.

zhuyifei1999 avatar zhuyifei1999 commented on July 18, 2024

BTW x.byprod is missing from the formal documentation of IdentitySet

Fixed

from guppy3.
