stratus3d / eflambe
A tool for rapid profiling of Erlang and Elixir applications
License: Apache License 2.0
@jj1bdx you are right, I see I am not handling all the trace messages as I should. I don't think I really need those gc start and end events, so I will update the code accordingly so those events aren't logged.
For the issue you linked to, I do see the exception:
** exception exit: {timeout,
{gen_server,call,
[eflambe_server,
{stop_trace,#Ref<0.4056665575.1064304643.107444>}]}}
in function gen_server:call/2 (gen_server.erl, line 239)
in call from eflambe:stop_trace/1 (/Users/kenji/src/sfmt-erlang/_build/default/lib/eflambe/src/eflambe.erl, line 125)
in call from eflambe:apply/2 (/Users/kenji/src/sfmt-erlang/_build/default/lib/eflambe/src/eflambe.erl, line 87)
2> =ERROR REPORT==== 19-Oct-2021::12:49:34.479003 ===
** Generic server eflambe_server terminating
** Last message in was {stop_trace,#Ref<0.4056665575.1064304643.107444>}
** When Server state == {state,
[{trace,#Ref<0.4056665575.1064304643.107444>,1,1,
true,<0.249.0>,
[{open,speedscope}]}]}
** Reason for termination ==
** {{timeout,{gen_server,call,[<0.249.0>,finish]}},
[{gen_server,call,2,[{file,"gen_server.erl"},{line,239}]},
{eflambe_server,handle_call,3,
[{file,"/Users/kenji/src/sfmt-erlang/_build/default/lib/eflambe/src/eflambe_server.erl"},
{line,131}]},
{gen_server,try_handle_call,4,[{file,"gen_server.erl"},{line,721}]},
{gen_server,handle_msg,6,[{file,"gen_server.erl"},{line,750}]},
{proc_lib,init_p_do_apply,3,[{file,"proc_lib.erl"},{line,226}]}]}
** Client <0.238.0> is dead
And it looks like that gen server was executing this line when the timeout occurred - https://github.com/Stratus3D/eflambe/blob/master/src/eflambe_brendan_gregg.erl#L178. It's not clear to me why it timed out, but it could have been because the formatting was taking a while, which could have been caused by a large amount of trace data being collected. Does the sfmt_pure_tests:test_short_speed/0 function run for a long time? Either way I need to get this fixed. If formatting of trace data takes longer than the gen server timeout, you'll get this error. I need to change the code so that an unlimited amount of time is permitted for trace formatting.
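To illustrate the timeout behaviour described above: gen_server:call/2 gives up after 5000 ms by default, while gen_server:call/3 accepts the atom infinity to wait as long as the callee needs. The sketch below is not eflambe's code; the module name slow_formatter and the finish message are hypothetical stand-ins for a process doing slow trace formatting.

```erlang
%% Hypothetical module for illustration only; not part of eflambe.
-module(slow_formatter).
-behaviour(gen_server).

-export([start_link/0, finish/1]).
-export([init/1, handle_call/3, handle_cast/2]).

start_link() ->
    gen_server:start_link(?MODULE, [], []).

%% Passing `infinity` as the third argument removes the default
%% 5000 ms limit that gen_server:call/2 applies.
finish(Pid) ->
    gen_server:call(Pid, finish, infinity).

init([]) ->
    {ok, #{}}.

handle_call(finish, _From, State) ->
    %% Stand-in for formatting a large amount of trace data.
    timer:sleep(100),
    {reply, ok, State}.

handle_cast(_Msg, State) ->
    {noreply, State}.
```

The trade-off is that a caller using infinity blocks forever if the callee never replies, so this only makes sense when the work is known to finish eventually.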
Originally posted by @Stratus3D in #1 (comment)
Hi @Stratus3D
I'm trying this very simple example using eflambé from master:
> application:ensure_all_started(eflambe).
{ok,[eflambe]}
> eflambe:capture({lists, seq, [1, 10]}, 1, [{open, speedscope}]).
Nothing happens after that. The call is stuck forever. This happens on both Linux and macOS.
My config:
%% Mac
$ erl
Erlang/OTP 24 [erts-12.1] [source] [64-bit] [smp:8:8] [ds:8:8:10] [async-threads:1] [jit]
$ sw_vers
ProductName: macOS
ProductVersion: 11.6
BuildVersion: 20G165
%% Linux
$ erl %% Linux
Erlang/OTP 22 [erts-10.6.2] [source] [64-bit] [smp:8:8] [ds:8:8:10] [async-threads:1]
$ lsb_release -a
Distributor ID: Ubuntu
Description: Ubuntu 20.04.3 LTS
Release: 20.04
Codename: focal
Help appreciated.
It occurred to me today that a set of usage patterns or a "cookbook" for eflambe might be useful.
For example, today I wanted to profile a whole Phoenix pipeline. This did the trick:
:eflambe.capture({Phoenix.Endpoint.Cowboy2Handler, :init, 2}, 1, [open: :speedscope])
Hello!
I noticed that on the hex package site the newest version, 0.2.3, doesn't include the newest changes that were merged about two months ago.
Would it be possible to update the package to 0.2.4 and include those changes?
I opened a PR, #19, that should bump the versions to the next one.
Thanks!
We need a workflow to run on every commit and every PR update to validate:
If any of these checks fails, the workflow should fail for that commit.
Despite my diving through the eflambe codebase and erlang:trace functionality, it's not really possible to say with confidence what unit the generated data is actually in. (Is it reductions?) It would be clearer to state, as part of the initial description, what the fundamental unit of measurement is.
From ElixirConf 2022: Someone asked about using flamegraph data for statistical analysis. I'm not really sure this is something that would be useful, but I can see how exposing the raw data itself could make the tool a little more flexible.
Add a homepage link, fix the docs link.
Currently eflambe fails to compile on older Erlang/OTP versions; I've personally tested on 18 and 19. I've identified the usage of the new logger and new gen_server callbacks like handle_continue to be the problem. This tool is incredibly valuable and would be handy to have for people working on legacy codebases. I can contribute this change but wanted to check whether this is something the authors would be interested in, since it will add some workarounds that are not so beautiful.
From ElixirConf 2022: Sometimes I see eflambe calls themselves in the generated flamegraphs. This is confusing to end users and should be fixed.
@Stratus3D I'm really enjoying eflambé and I'm wondering if it's possible to add these two innovations from flame_prof:
High scheduler load
The reductions heatmap and corresponding flamegraph below show what callstacks contributed most to scheduler load in the example profile. Colour saturation in the heatmap and widths in the flamegraph are scaled to reductions. Scaling flamegraph widths by metrics measured during sampling is the first of flame_prof's two main innovations.
Crucially, in the flamegraph, the callstacks are categorised by the process status of up to 100 processes running that particular code (the example profile auto-selected the top 100). Such callstack categorisation/grouping is flame_prof's other main innovation. It has proven extremely useful, especially in identifying code causing concurrency bottlenecks or memory issues.
Armed with them, eflambé could be the best profiling tool for Erlang.
I'm using eflambe on my Phoenix project (Elixir 1.14.2, Erlang 25.1.2) and I'm getting an endless spew of unexpected trace events:
[info] Received unexpected trace event: {:trace_ts, #PID<0.1691.0>, :return_to, {:erl_lint, :head, 4}, {1669, 921967, 417470}}
[info] Received unexpected trace event: {:trace_ts, #PID<0.1691.0>, :return_to, {:erl_lint, :"-remove_non_visible/1-lc$^0/1-0-", 1}, {1669, 921967, 417558}}
[info] Received unexpected trace event: {:trace_ts, #PID<0.1691.0>, :return_to, {:erl_lint, :"-remove_non_visible/1-lc$^0/1-0-", 1}, {1669, 921967, 417559}}
[info] Received unexpected trace event: {:trace_ts, #PID<0.1691.0>, :return_to, {:erl_lint, :"-remove_non_visible/1-lc$^0/1-0-", 1}, {1669, 921967, 417560}}
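The log lines above are timestamped return_to trace messages, which the Erlang tracer delivers as {trace_ts, Pid, return_to, MFA, Timestamp}. As a rough sketch of how a trace consumer could recognise such messages and silently drop anything it does not care about, consider the following. The module trace_filter and the function classify/1 are hypothetical and are not eflambe's actual implementation.

```erlang
%% Hypothetical helper for illustration only; not part of eflambe.
-module(trace_filter).
-export([classify/1]).

%% A traced function was called.
classify({trace_ts, _Pid, call, MFA, _Ts}) ->
    {call, MFA};
%% Execution returned to the given function.
classify({trace_ts, _Pid, return_to, MFA, _Ts}) ->
    {return_to, MFA};
%% Anything else is ignored rather than logged.
classify(_Other) ->
    ignore.
```

A catch-all clause like the last one keeps unhandled event types from flooding the log, at the cost of hiding events that might later turn out to matter.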
Currently everything is kept in memory. In some scenarios this may not be ideal.