
Comments (14)

cpoppema commented on August 21, 2024

@gi0baro Right. I've re-uploaded them as .tar.xz with just 30 seconds of siege. Looks like there are only two files without the --reload flag 🙂


cpoppema commented on August 21, 2024

> @cpoppema this commit might be worth a test round, at least to see if @sciyoshi is right about the root cause.

That looks very promising indeed! I'm not sure about the implications of that change, but comparing observed memory usage it seems to have fixed it 💪, see graphs below 🎉🎉🎉. To generate the graphs I used mprof run --include-children --multiprocess <server> and mprof plot (pip install memory-profiler matplotlib), which show memory usage over time during 5 minutes of sending requests with siege.

baseline granian 1.2.3:

[mprof graph: granian-1-2-3-siege-5m]

in contrast with uvicorn:

[mprof graph: uvicorn-siege-5m]

with granian master:

[mprof graph: granian-1-3-1-siege-5m]
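
For reference, the same sampling used for the graphs above can also be driven from Python through memory_profiler's memory_usage() API instead of the mprof CLI; a rough sketch, with the granian command line only a placeholder for the real service:

# rough Python equivalent of `mprof run --include-children <server>`;
# the granian command line below is only a placeholder for the real service.
import subprocess

from memory_profiler import memory_usage  # pip install memory-profiler

server = subprocess.Popen(
    ["granian", "--interface", "asgi", "service.server:app", "--host", "0.0.0.0"]
)
try:
    # sample total RSS (including child workers) once per second for 5 minutes
    samples = memory_usage(server, interval=1, timeout=300, include_children=True)
finally:
    server.terminate()

print(f"{len(samples)} samples, peak ≈ {max(samples):.1f} MiB")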


gi0baro commented on August 21, 2024

Closing this as per the changes in 1.3.
Thanks to everybody who participated in the debugging!


gi0baro commented on August 21, 2024

> Apologies in advance for commenting on an already closed issue; however, I am wondering if this should address the memory leak that is being caused by memory not being released by Python itself, described here w.r.t. malloc:

Both of those seem unrelated to this issue (I also participated in the uvicorn discussion you mentioned). I also frankly doubt the discussion in uvicorn has anything to do with the CPython one, given no SSL/TLS is used. But I definitely have no time to investigate those.

> Edit: Updated to 1.3 and still see the memory leak issue described above

Seems a bit unrelated as well. Can you please open a new issue with numbers, tested alternatives, and an MRE?


gi0baro commented on August 21, 2024

@cpoppema jfyi I did some investigation on this during the weekend using https://github.com/bloomberg/memray, but I wasn't able to reproduce the memory increase trend you had on my machine (macOS). I need to schedule a similar test on a Linux machine, as it might be a target-specific issue.


cpoppema commented on August 21, 2024

> @cpoppema jfyi I did some investigation on this during the weekend using https://github.com/bloomberg/memray, but I wasn't able to reproduce the memory increase trend you had on my machine (macOS). I need to schedule a similar test on a Linux machine, as it might be a target-specific issue.

If you tell me what flags to run, I can run memray no problem. As a quick example, if I wrap my service in memray:

docker compose run --rm -p 8000:8000 server python -m memray run --native --trace-python-allocators --follow-fork -o /app/memray.bin  /usr/local/bin/granian --interface asgi service.server:app --host=0.0.0.0 --loop uvloop --reload

and then use siege to run for 5 minutes.

$ siege -b -t 5m -H "Content-Type: application/json" -H "Authorization: Bearer $TOKEN" http://127.0.0.1:8000/some_endpoint
{	"transactions":			        2833,
	"availability":			      100.00,
	"elapsed_time":			      299.22,
	"data_transferred":		       10.96,
	"response_time":		        2.63,
	"transaction_rate":		        9.47,
	"throughput":			        0.04,
	"concurrency":			       24.88,
	"successful_transactions":	        2833,
	"failed_transactions":		           0,
	"longest_transaction":		       33.97,
	"shortest_transaction":		        0.81
}

For 1.2.2 this prints 1.0 GiB

docker-compose run --rm --no-deps server python -m memray stats /app/memray.bin
┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━┓
┃ Location                                                                                          ┃  <Total Memory> ┃  Total Memory % ┃  Own Memory ┃  Own Memory % ┃  Allocation Count ┃
┑━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━┩
β”‚ _PyEval_Vector at Python/ceval.c                                                                  β”‚         1.000GB β”‚          99.40% β”‚      0.000B β”‚         0.00% β”‚                 3 β”‚
...

For 1.2.3 this prints only 5.x MiB

docker compose run --rm --no-deps server python -m memray table /app/memray.bin
┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━┓
┃ Location                                                                                          ┃  <Total Memory> ┃  Total Memory % ┃  Own Memory ┃  Own Memory % ┃  Allocation Count ┃
┑━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━┩
β”‚ _PyEval_Vector at Python/ceval.c                                                                  β”‚         5.662MB β”‚          90.83% β”‚      0.000B β”‚         0.00% β”‚             46325 β”‚
...

More stats:

docker compose run --rm --no-deps server python -m memray stats /app/memray.bin
πŸ“ Total allocations:
	452844

πŸ“¦ Total memory allocated:
	78.369MB

πŸ“Š Histogram of allocation size:
	min: 0.000B
	---------------------------------------------
	< 6.000B   :   1410 β–‡
	< 18.000B  :  30607 β–‡β–‡β–‡β–‡
	< 57.000B  : 195094 β–‡β–‡β–‡β–‡β–‡β–‡β–‡β–‡β–‡β–‡β–‡β–‡β–‡β–‡β–‡β–‡β–‡β–‡β–‡β–‡β–‡β–‡β–‡β–‡β–‡
	< 174.000B : 133406 β–‡β–‡β–‡β–‡β–‡β–‡β–‡β–‡β–‡β–‡β–‡β–‡β–‡β–‡β–‡β–‡β–‡β–‡
	< 533.000B :  82120 β–‡β–‡β–‡β–‡β–‡β–‡β–‡β–‡β–‡β–‡β–‡
	< 1.593KB  :   7292 β–‡
	< 4.869KB  :   1131 β–‡
	< 14.886KB :    999 β–‡
	< 45.502KB :    696 β–‡
	<=139.087KB:     89 β–‡
	---------------------------------------------
	max: 139.088KB

πŸ“‚ Allocator type distribution:
	 PYMALLOC_MALLOC: 375169
	 PYMALLOC_CALLOC: 37466
	 PYMALLOC_REALLOC: 24259
	 MALLOC: 9738
	 REALLOC: 6012
	 MMAP: 192
	 CALLOC: 6
	 POSIX_MEMALIGN: 2

πŸ₯‡ Top 5 largest allocating locations (by size):
	- _call_with_frames_removed:<frozen importlib._bootstrap>:241 -> 12.287MB
	- <stack trace unavailable> -> 10.930MB
	- _compile_bytecode:<frozen importlib._bootstrap_external>:729 -> 5.457MB
	- _create_fn:/usr/local/lib/python3.11/dataclasses.py:433 -> 5.451MB
	- get_data:<frozen importlib._bootstrap_external>:1131 -> 3.858MB

πŸ₯‡ Top 5 largest allocating locations (by number of allocations):
	- _call_with_frames_removed:<frozen importlib._bootstrap>:241 -> 81023
	- _compile_bytecode:<frozen importlib._bootstrap_external>:729 -> 41467
	- _create_fn:/usr/local/lib/python3.11/dataclasses.py:433 -> 38391
	- <stack trace unavailable> -> 15700
	- _path_stat:<frozen importlib._bootstrap_external>:147 -> 14385                                                                      

but RES memory in htop did keep growing: starting at 134M and ending up at around 612M after the 5-minute siege run finished.

[htop screenshots: 2024-04-22-1713792792, 2024-04-22-1713792930, 2024-04-22-1713793031]


gi0baro commented on August 21, 2024

@cpoppema the difference between 1.2.2 and 1.2.3 should be due to jemalloc in place of mimalloc (the latter reserves a bigger amount of memory for its arenas, probably).

Running memray that way is perfectly fine. It is quite strange to me that it doesn't detect the difference in memory reported by htop.
Can you attach the .bin files so I can take a look at the data?
Also, current master uses PyO3 0.21, which probably changed some memory-related stuff. It might be worth trying with that as well and dumping the relevant memray output.
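
One way to reconcile the two numbers would be to sample the worker's RSS independently of memray while siege runs; a minimal sketch, assuming psutil is installed and the worker PID is taken from htop:

# sample a worker's RSS so the htop trend can be compared against what memray
# accounts for (illustrative sketch only, not part of granian or memray)
import sys
import time

import psutil  # pip install psutil

pid = int(sys.argv[1])  # worker PID, e.g. read from htop
proc = psutil.Process(pid)

while True:
    rss_mib = proc.memory_info().rss / (1024 * 1024)
    print(f"{time.strftime('%H:%M:%S')} rss={rss_mib:.1f} MiB", flush=True)
    time.sleep(5)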

Thank you for your patience 🙏


cpoppema commented on August 21, 2024

It generates 3 .bin files per run (~3M, ~88M, 1.9G); here are the smallest from each run (1.2.3 and 1.3.0).


gi0baro commented on August 21, 2024

@cpoppema I'm afraid those files only contain data from the main process, not the actual workers. It would be nice to have the other dumps as well (maybe compressed?).
It would also be helpful to remove the --reload option when running with memray, to avoid all the watchfiles allocations and related threads; probably just 30s/1m of data would be enough to spot leaks.


gi0baro commented on August 21, 2024

@cpoppema so, given that we trust memray data:

 ❯ memray stats -n 25 ~/Downloads/memray.bin.8
πŸ“ Total allocations:
	21511327

πŸ“¦ Total memory allocated:
	4.234GB

πŸ“Š Histogram of allocation size:
	min: 0.000B
	-----------------------------------------------
	< 4.000B   :    49702 β–‡
	< 21.000B  :   439303 β–‡
	< 96.000B  : 12719302 β–‡β–‡β–‡β–‡β–‡β–‡β–‡β–‡β–‡β–‡β–‡β–‡β–‡β–‡β–‡β–‡β–‡β–‡β–‡β–‡β–‡β–‡β–‡β–‡β–‡
	< 445.000B :  7996012 β–‡β–‡β–‡β–‡β–‡β–‡β–‡β–‡β–‡β–‡β–‡β–‡β–‡β–‡β–‡β–‡
	< 1.998KB  :   247030 β–‡
	< 9.184KB  :    24465 β–‡
	< 42.195KB :    23474 β–‡
	< 193.859KB:    12005 β–‡
	< 890.660KB:       24 β–‡
	<=3.996MB  :       10 β–‡
	-----------------------------------------------
	max: 3.996MB

πŸ“‚ Allocator type distribution:
	 PYMALLOC_MALLOC: 18461721
	 MALLOC: 1872067
	 PYMALLOC_REALLOC: 783705
	 PYMALLOC_CALLOC: 366234
	 MMAP: 17411
	 REALLOC: 10163
	 CALLOC: 26

πŸ₯‡ Top 25 largest allocating locations (by size):
	- __init__:/usr/local/lib/python3.11/site-packages/pydantic/main.py:171 -> 1.344GB
	- _iter_file_finder_modules:/usr/local/lib/python3.11/pkgutil.py:168 -> 127.991MB
	- digest:/usr/local/lib/python3.11/hmac.py:159 -> 105.208MB
	- _spawn_asgi_lifespan_worker:/usr/local/lib/python3.11/site-packages/granian/server.py:230 -> 104.763MB
	- _compile_bytecode:<frozen importlib._bootstrap_external>:729 -> 70.211MB
	- validate_core_schema:/usr/local/lib/python3.11/site-packages/pydantic/_internal/_core_utils.py:570 -> 47.243MB
	- _init_hmac:/usr/local/lib/python3.11/hmac.py:67 -> 44.418MB
	- _walk:/usr/local/lib/python3.11/site-packages/pydantic/_internal/_core_utils.py:202 -> 44.399MB
	- <stack trace unavailable> -> 44.268MB
	- get_data:<frozen importlib._bootstrap_external>:1131 -> 43.663MB
	- dump_python:/usr/local/lib/python3.11/site-packages/pydantic/type_adapter.py:333 -> 37.683MB
	- _call_with_frames_removed:<frozen importlib._bootstrap>:241 -> 34.613MB
	- search:/usr/local/lib/python3.11/re/__init__.py:176 -> 33.504MB
	- model_validate:/usr/local/lib/python3.11/site-packages/pydantic/main.py:509 -> 30.636MB
	- walk:/usr/local/lib/python3.11/site-packages/pydantic/_internal/_core_utils.py:199 -> 30.465MB
	- digest:/usr/local/lib/python3.11/hmac.py:158 -> 28.764MB
	- _create_fn:/usr/local/lib/python3.11/dataclasses.py:433 -> 27.915MB
	- _gen_cache_key:/usr/local/lib/python3.11/site-packages/sqlalchemy/sql/cache_key.py:246 -> 25.330MB
	- _gen_cache_key:/usr/local/lib/python3.11/site-packages/sqlalchemy/sql/cache_key.py:272 -> 23.599MB
	- sub:/usr/local/lib/python3.11/re/__init__.py:185 -> 23.234MB
	- <listcomp>:/usr/local/lib/python3.11/site-packages/sqlalchemy/orm/loading.py:226 -> 22.577MB
	- new:/usr/local/lib/python3.11/hmac.py:184 -> 22.522MB
	- create_schema_validator:/usr/local/lib/python3.11/site-packages/pydantic/plugin/_schema_validator.py:49 -> 21.853MB
	- __init__:/usr/local/lib/python3.11/typing.py:830 -> 21.201MB
	- <genexpr>:/usr/local/lib/python3.11/site-packages/sqlalchemy/sql/cache_key.py:732 -> 20.580MB

πŸ₯‡ Top 25 largest allocating locations (by number of allocations):
	- digest:/usr/local/lib/python3.11/hmac.py:159 -> 1475280
	- _spawn_asgi_lifespan_worker:/usr/local/lib/python3.11/site-packages/granian/server.py:230 -> 1117338
	- _init_hmac:/usr/local/lib/python3.11/hmac.py:67 -> 656003
	- _gen_cache_key:/usr/local/lib/python3.11/site-packages/sqlalchemy/sql/cache_key.py:246 -> 531626
	- _compile_bytecode:<frozen importlib._bootstrap_external>:729 -> 468663
	- _gen_cache_key:/usr/local/lib/python3.11/site-packages/sqlalchemy/sql/cache_key.py:272 -> 347222
	- __init__:/usr/local/lib/python3.11/site-packages/pydantic/main.py:171 -> 261995
	- _call_with_frames_removed:<frozen importlib._bootstrap>:241 -> 229360
	- _create_fn:/usr/local/lib/python3.11/dataclasses.py:433 -> 196035
	- <genexpr>:/usr/local/lib/python3.11/site-packages/sqlalchemy/sql/cache_key.py:732 -> 167120
	- new:/usr/local/lib/python3.11/hmac.py:184 -> 164001
	- digest:/usr/local/lib/python3.11/hmac.py:158 -> 163921
	- _walk:/usr/local/lib/python3.11/site-packages/pydantic/_internal/_core_utils.py:202 -> 159390
	- get:/usr/local/lib/python3.11/site-packages/sqlalchemy/orm/identity.py:222 -> 159268
	- validate_core_schema:/usr/local/lib/python3.11/site-packages/pydantic/_internal/_core_utils.py:570 -> 155330
	- <listcomp>:/usr/local/lib/python3.11/site-packages/sqlalchemy/orm/loading.py:226 -> 140308
	- _populate_full:/usr/local/lib/python3.11/site-packages/sqlalchemy/orm/loading.py:1323 -> 131341
	- _filter_on_values:/usr/local/lib/python3.11/site-packages/sqlalchemy/engine/row.py:176 -> 119448
	- search:/usr/local/lib/python3.11/re/__init__.py:176 -> 116985
	- from_model:/app/service/core/entities.py:57 -> 113077
	- _op:/usr/local/lib/python3.11/site-packages/sqlalchemy/engine/row.py:206 -> 108072
	- visit_has_cache_key_list:/usr/local/lib/python3.11/site-packages/sqlalchemy/sql/cache_key.py:732 -> 107191
	- __eq__:/usr/local/lib/python3.11/site-packages/sqlalchemy/engine/row.py:236 -> 102385
	- _path_stat:<frozen importlib._bootstrap_external>:147 -> 99823
	- isdir:<frozen genericpath>:42 -> 94415

my suspicion for this happening only in Granian

__init__:/usr/local/lib/python3.11/site-packages/pydantic/main.py:171 -> 1.344GB

is that Python is probably not deallocating objects between requests. Which is super weird, but at least now I have a starting point for some additional checks. I'll post updates as soon as I make new discoveries.
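
One cheap way to check that theory, sketched here purely for illustration: count live objects by type inside a worker between siege runs and see whether e.g. pydantic model instances keep accumulating.

import gc
from collections import Counter

def live_object_counts(top: int = 10) -> list[tuple[str, int]]:
    """Return the most common live object types in this process."""
    gc.collect()  # drop objects that are merely waiting for a collection
    counts = Counter(type(obj).__name__ for obj in gc.get_objects())
    return counts.most_common(top)

# Calling this (e.g. from a debug-only endpoint) before and after a siege run
# and diffing the output shows whether objects survive between requests.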


Zerotask commented on August 21, 2024

Interestingly, gunicorn also has a max-requests option (https://docs.gunicorn.org/en/stable/settings.html#max-requests):

(...) This is a simple method to help limit the damage of memory leaks.

We also noticed memory leaks in our Django application (with both WSGI and ASGI), but I haven't had time to investigate yet. Our Kubernetes pods just restart from time to time; restarting only granian instead would be faster and more efficient.
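
For reference, this is roughly what the gunicorn mitigation looks like in a gunicorn.conf.py (the numbers are arbitrary); it only bounds the damage of a leak, it doesn't fix it:

# gunicorn.conf.py -- worker recycling as a leak mitigation (values are arbitrary)
max_requests = 1000        # recycle a worker after it has handled this many requests
max_requests_jitter = 50   # randomize the limit so workers don't all restart at once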

Related to #34


sciyoshi commented on August 21, 2024

Possibly related to #252 (comment).


gi0baro commented on August 21, 2024

@cpoppema this commit might be worth a test round, at least to see if @sciyoshi is right about the root cause.


Faolain commented on August 21, 2024

Apologies in advance for commenting on an already closed issue; however, I am wondering if this should address the memory leak (Edit: updated to 1.3 and still see the memory leak issue described above) that is being caused by memory not being released by Python itself, described here w.r.t. malloc:

  • python/cpython#109534 (the "solution" some use is to do a malloc trim, or to set an env var, which seems hacky), and
  • encode/uvicorn#2078, where this is also discussed (ctrl-F "granian"; granian is also compared to other implementations there, with similar albeit older results)

Should I create a new issue for this in case there is something that can be done in granian?
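
For completeness, the malloc trim workaround referenced above typically looks something like the sketch below; it assumes glibc on Linux and is not a granian feature:

import ctypes

# glibc-only workaround: ask malloc to return freed arena memory to the OS.
# This is a mitigation, not a fix for whatever is holding on to the objects.
libc = ctypes.CDLL("libc.so.6")

def trim_malloc() -> None:
    # malloc_trim(0) releases free memory from the heap back to the kernel
    libc.malloc_trim(0)

# e.g. call trim_malloc() periodically from a background task after bursts of requests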

