pythonspeed / profila

A profiler for Numba

License: Apache License 2.0

Emacs Lisp 0.19% Python 91.98% Jupyter Notebook 7.83%

profila's People

evelynmitchell said-hadjout

profila's Issues

Initial release, after adding setuptools-scm for the version

Limit search depth

Right now, profiling appears to go to a practically unlimited depth, so my stacks usually end up much deeper than I care about (i.e. I don't need to see the deepest NumPy source code). Is it possible to limit the profiling depth, either by keeping only x levels away from the executed file or (possibly even better) by only profiling calls in files contained in a certain directory/namespace?
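
To make the two options in this request concrete, here is a minimal sketch of post-filtering a captured stack. It assumes a stack is a list of (filename, line) tuples ordered from the executed file outward; that representation and the `trim_stack` helper are hypothetical and are not Profila's actual data model or API.

```python
from pathlib import PurePath


def _is_under(filename, root):
    """Return True if filename lies inside the root directory."""
    try:
        PurePath(filename).relative_to(root)
        return True
    except ValueError:
        return False


def trim_stack(frames, root=None, max_depth=None):
    """Filter a stack of (filename, line) tuples.

    Hypothetical helper: `root` keeps only frames whose file is under
    that directory, and `max_depth` keeps only the first N frames.
    """
    if root is not None:
        frames = [frame for frame in frames if _is_under(frame[0], root)]
    if max_depth is not None:
        frames = frames[:max_depth]
    return frames
```

For example, `trim_stack(frames, root="/proj")` would drop frames that come from an installed NumPy, while `trim_stack(frames, max_depth=3)` would keep only the three frames closest to the executed file.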

Profila occasionally fails with message that is None

  File "/home/itamarst/devel/low-level-performance-book/venv/lib/python3.11/site-packages/profila/__main__.py", line 54, in get_stats
    async for sample in read_samples(process):
  File "/home/itamarst/devel/low-level-performance-book/venv/lib/python3.11/site-packages/profila/_gdb.py", line 100, in read_samples
    async for sample in _sample(process):
  File "/home/itamarst/devel/low-level-performance-book/venv/lib/python3.11/site-packages/profila/_gdb.py", line 53, in _sample
    if "stack" not in message["payload"]:  # type: ignore
       ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
TypeError: argument of type 'NoneType' is not iterable
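
The error text suggests that `message["payload"]` was None at this point (a None `message` would instead fail as "not subscriptable"). A defensive check along the following lines would skip such messages rather than crash. This is only a sketch: `has_stack` is a hypothetical helper, and the message shape is inferred from the traceback above, not from Profila's source.

```python
def has_stack(message):
    """Return True only if message carries a payload containing "stack".

    Hypothetical helper: tolerates a None message and a None payload.
    """
    if not isinstance(message, dict):
        return False
    payload = message.get("payload")
    return isinstance(payload, dict) and "stack" in payload
```

Whether silently skipping such messages is the right behavior depends on why the debugger emitted a payload-less message in the first place.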

See if Jupyter support is feasible

Release requirements

Tests
Documentation

UX requirements

Fail if loaded after Numba is imported
Fail if gdb can't be found
Fail if not on Linux
After loading warn that it shouldn't be on by default for performance reasons
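
The three failure checks above could be sketched roughly as follows. The function names are hypothetical, and the checks rely only on standard-library calls (`sys.platform`, `sys.modules`, `shutil.which`); how Profila actually implements them is not shown here.

```python
import shutil
import sys


def preflight_problems(platform, loaded_modules, gdb_path):
    """Return a list of human-readable problems (empty if all is well)."""
    problems = []
    if not platform.startswith("linux"):
        problems.append("only Linux is supported")
    if "numba" in loaded_modules:
        problems.append("must be loaded before Numba is imported")
    if gdb_path is None:
        problems.append("gdb was not found on PATH")
    return problems


def check_environment():
    """Raise RuntimeError if the current process fails any check."""
    problems = preflight_problems(sys.platform, sys.modules, shutil.which("gdb"))
    if problems:
        raise RuntimeError("; ".join(problems))
```

Separating the pure check from the code that reads the real environment keeps the logic easy to test.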

Implementation notes

On Linux, prctl(PR_SET_PTRACER, pid_of_subprocess) or prctl(PR_SET_PTRACER, PR_SET_PTRACER_ANY) might allow gdb to attach to the current process without needing root. Call prctl(PR_SET_PTRACER, 0) to reset.
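
From Python, the prctl call could be made through ctypes, roughly as sketched below. The constant values are the ones defined in the Linux headers; `set_ptracer` is a hypothetical helper name, and the call can fail with EINVAL on kernels where the Yama security module is not enabled.

```python
import ctypes
import os
import sys

PR_SET_PTRACER = 0x59616D61  # value from <linux/prctl.h>
PR_SET_PTRACER_ANY = ctypes.c_ulong(-1).value  # (unsigned long)-1


def set_ptracer(pid):
    """Allow `pid` to ptrace this process; pass 0 to reset.

    Hypothetical helper. Raises OSError if the call is unsupported
    or rejected by the kernel.
    """
    if not sys.platform.startswith("linux"):
        raise OSError("prctl(PR_SET_PTRACER) is Linux-only")
    libc = ctypes.CDLL(None, use_errno=True)
    libc.prctl.argtypes = [ctypes.c_int] + [ctypes.c_ulong] * 4
    libc.prctl.restype = ctypes.c_int
    if libc.prctl(PR_SET_PTRACER, pid, 0, 0, 0) != 0:
        err = ctypes.get_errno()
        raise OSError(err, os.strerror(err))
```

A caller would use `set_ptracer(controller_pid)` before the debugger attaches and `set_ptracer(0)` afterwards.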

The controller would probably need to be an additional Python process to prevent deadlocks.

The controlling Python subprocess would need to communicate with the parent process:

Command-line args can tell subprocess where to attach.
Subprocess needs to know when to stop profiling (via stdin closing?).
Subprocess needs to send results to Jupyter (via stdout?).
Subprocess needs to tell Jupyter when it's ready so Jupyter can proceed with code execution.
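
The four communication points above can be sketched with a toy controller that does no real profiling. Everything here is an illustration under assumptions: the "ready" line, the JSON result format, and the `run_controller` name are hypothetical choices, not a settled design.

```python
import json
import os
import subprocess
import sys

# Stand-in for the controller subprocess: it receives the target pid as a
# command-line argument, announces readiness on stdout, waits for stdin to
# close, then writes its (empty) results to stdout as JSON.
CHILD = r"""
import json, sys
target_pid = int(sys.argv[1])
print("ready", flush=True)
sys.stdin.read()  # returns once the parent closes stdin
print(json.dumps({"pid": target_pid, "samples": []}), flush=True)
"""


def run_controller():
    """Start the controller, run the 'profiled' section, collect results."""
    proc = subprocess.Popen(
        [sys.executable, "-c", CHILD, str(os.getpid())],
        stdin=subprocess.PIPE,
        stdout=subprocess.PIPE,
        text=True,
    )
    if proc.stdout.readline().strip() != "ready":
        proc.kill()
        raise RuntimeError("controller did not report readiness")
    # The code being profiled would run here.
    proc.stdin.close()  # signals the controller to stop profiling
    results = json.loads(proc.stdout.readline())
    proc.stdout.close()
    proc.wait()
    return results
```

Using stdin closure as the stop signal also means the controller exits on its own if the parent process dies.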

UPDATE: This may take a while...

The problem

Profila currently uses GDB to get stack traces, using the Machine Interface ("MI") protocol.
GDB doesn't work on macOS ARM; only lldb currently works.
lldb-mi is a third-party tool that is supposedly compatible with GDB MI, but it is not readily available on macOS. E.g. VS Code plugins will ship it themselves.

Potential solutions

Switch to a non-debugger method of getting stack traces; this needs something that relies on debuginfo in the way debuggers do. Initial experiments have failed, which is why there hasn't been profiling support for Numba until now! But will keep trying.
Ship lldb-mi in the wheel.
Convince some kind soul to package lldb-mi for Homebrew.
GDB adds support for macOS ARM.
Figure out a different way than MI to control lldb. The remote server protocol (same as GDB?) might work. Or there's https://pypi.org/project/hilda/ which either might be usable or might be a reasonable example. https://lib.rs/crates/lldb is another potential solution.

Workaround

On macOS, use Docker, Podman, or a VM to get a Linux environment.

Use `sys.executable` instead of `python`?

I installed this into a venv and ran into this error:

{'type': 'result', 'message': 'error', 'payload': {'msg': '"/home/jje/.pyenv/shims/python": not in executable format: file format not recognized'}, 'token': None}

But /home/jje/.pyenv/shims/python is executable, so I'm not sure what the issue is.

ls /home/jje/.pyenv/shims/python -l
# -rwxr-xr-x 1 jje jje 176 Sep  8 22:12 /home/jje/.pyenv/shims/python*

Does it need to be a proper binary?

cat /home/jje/.pyenv/shims/python
# #!/usr/bin/env bash
# set -e
# [ -n "$PYENV_DEBUG" ] && set -x
# 
# program="${0##*/}"
# 
# export PYENV_ROOT="/home/jje/.pyenv"
# exec "/usr/share/pyenv/libexec/pyenv" exec "$program" "$@"

But it's not the executable I wanted anyway. I wanted the one from the venv I was calling python -m profila in. Changing this line got it working the way I needed in my particular case:

-    process.stdin.write(b"-file-exec-file python\n")
+    process.stdin.write(bytes(f"-file-exec-file {sys.executable}\n", encoding="utf-8"))
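
As a sketch of why the pyenv shim is rejected and how the command from the diff can be built: GDB expects a native binary, and on Linux that means an ELF file, which starts with the bytes `\x7fELF`, whereas the shim is a bash script starting with `#!`. The helper names below are hypothetical; only the `-file-exec-file` command and `sys.executable` come from the change above.

```python
import sys

ELF_MAGIC = b"\x7fELF"


def is_elf(path):
    """Return True if the file at `path` starts with the ELF magic bytes."""
    with open(path, "rb") as f:
        return f.read(4) == ELF_MAGIC


def file_exec_command(executable=sys.executable):
    """Build the MI command that points the debugger at `executable`.

    Assumes the path contains no whitespace; quoting is not handled here.
    """
    return f"-file-exec-file {executable}\n".encode("utf-8")
```

On Linux, `is_elf(sys.executable)` would normally be true for the interpreter inside a venv, while it is false for a shell-script shim like the one shown above.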