beam-telemetry / telemetry
Dynamic dispatching library for metrics and instrumentations.
Home Page: https://hexdocs.pm/telemetry
License: Apache License 2.0
Right now an event name might be an arbitrary list of atoms. I think that an event with an empty-list name should not be allowed, since it doesn't convey any information about the source of the event.
Thoughts?
Currently the codebase has a behaviour and some indirection, and it could all be inlined into a single module (although I would keep the current function boundaries, since they decouple the events from the API). I know @arkgil wanted to keep the impl behaviour for a bit, but we need to eventually clean it up.
This would allow for example using Telemetry events as events in traces (not only as start-end events).
My original use-case was a little bit different, though. I wanted to add support for the Server-Timing header in Plug, with timings for different parts of the query. It would be great to gather all events that happened within a single request and put them in the response header. However, right now there is no way to detect which events belong to the "current request", as there is no way to set metadata that would be populated through all Telemetry events. It would be great to have either telemetry:set_process_metadata/1 or to use Erlang's logger metadata, added to the metadata provided in event dispatch. With such metadata I could just do set_process_metadata(#{?MODULE => ref}) and use that to detect events that happened within a single process.
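The proposal could look roughly like this. To be clear, this is a sketch only: set_process_metadata/1 does not exist in telemetry today, and MyPlug is a hypothetical module name.

```elixir
defmodule MyPlug do
  # Sketch of the *proposed* API: tag every telemetry event emitted while
  # this process handles the request, so a Server-Timing handler can later
  # filter out the events belonging to this request only.
  def call(conn, _opts) do
    ref = make_ref()
    # Proposed function, not part of telemetry today:
    :telemetry.set_process_metadata(%{MyPlug => ref})
    # Any :telemetry.execute/3 in this process would now have
    # %{MyPlug => ref} merged into its metadata.
    conn
  end
end
```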
Hey y'all!
Apologies if this is the wrong place to be posting this, but I am at my wits' end. If there is a better place to be asking this question, please point me in that direction if you have the time. I am new to Elixir/Erlang and I'm not sure about a lot of things at this point, but here's where I'm at:
Been playing around with Elixir / Phoenix / LiveView, etc. Recently I got a LiveView testing course and the sample project is running Erlang 23, and I'm only (very loosely) familiar with 24. Anyhoo, I downloaded asdf and started to get things sorted. It was a BEAR getting 23 correctly installed with the right version of openssl and blah blah blah. Anyhoo, I finally got 23 to build correctly, but when I try running any tests at all, I see this a BUNCH of times:
22:52:06.135 [error] Loading of {project_dir}/_build/test/lib/telemetry/ebin/telemetry.beam failed: :badfile
22:52:06.135 [error] beam/beam_load.c(1886): Error loading module telemetry:
This BEAM file was compiled for a later version of the run-time system than 23.
To fix this, please recompile this module with an 23 compiler.
(Use of opcode 172; this emulator supports only up to 170.)
And then eventually the running of the test fails with this error message:
could not compile dependency :phoenix_live_dashboard, "mix compile" failed. You can recompile this dependency with "mix deps.compile phoenix_live_dashboard", update it with "mix deps.update phoenix_live_dashboard" or clean it with "mix deps.clean phoenix_live_dashboard"
** (MatchError) no match of right hand side value: {:error, {:phoenix, {:bad_return, {{Phoenix, :start, [:normal, []]}, {:EXIT, {:undef, [{:telemetry, :attach, [{Phoenix.Logger, [:phoenix, :channel_handled_in]}, [:phoenix, :channel_handled_in], #Function<8.82939286/4 in Phoenix.Logger."-fun.phoenix_channel_handled_in/4-">, :ok], []}, {Phoenix.Logger, :"-install/0-fun-7-", 2, [file: 'lib/phoenix/logger.ex', line: 111]}, {:maps, :fold_1, 3, [file: 'maps.erl', line: 233]}, {Phoenix.Logger, :install, 0, [file: 'lib/phoenix/logger.ex', line: 110]}, {Phoenix, :start, 2, [file: 'lib/phoenix.ex', line: 46]}, {:application_master, :start_it_old, 4, [file: 'application_master.erl', line: 277]}]}}}}}}
(phoenix 1.5.6) lib/mix/tasks/compile.phoenix.ex:11: Mix.Tasks.Compile.Phoenix.run/1
(mix 1.11.2) lib/mix/task.ex:394: Mix.Task.run_task/3
(mix 1.11.2) lib/mix/tasks/compile.all.ex:90: Mix.Tasks.Compile.All.run_compiler/2
(mix 1.11.2) lib/mix/tasks/compile.all.ex:70: Mix.Tasks.Compile.All.compile/4
(mix 1.11.2) lib/mix/tasks/compile.all.ex:57: Mix.Tasks.Compile.All.with_logger_app/2
(mix 1.11.2) lib/mix/tasks/compile.all.ex:35: Mix.Tasks.Compile.All.run/1
(mix 1.11.2) lib/mix/task.ex:394: Mix.Task.run_task/3
(mix 1.11.2) lib/mix/tasks/compile.ex:119: Mix.Tasks.Compile.run/1
So...
Thank you so much for your time!
Right now a handler is detached whenever it fails. Additionally, handlers can be manually removed. However, there is no way to get notified when this happens, which means that one cannot attempt to attach the handler once again in the hope that it won't fail this time (similarly to how we restart processes in the hope that this time everything will work fine).
I believe that such a mechanism should be added to the library. I imagine processes attaching handlers on spawn, listening for "detached" events and crashing when they're received, to ultimately be restarted and start the cycle over again.
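In the absence of such a notification, one workaround today is a supervised process that owns the handler and periodically re-attaches it if it has been detached. This is only a sketch under that assumption; the handler id, event name, and MyHandler module are illustrative.

```elixir
defmodule HandlerKeeper do
  use GenServer

  @handler_id "my-handler"

  def start_link(opts), do: GenServer.start_link(__MODULE__, opts)

  def init(_opts) do
    send(self(), :ensure_attached)
    # Polling approximates the "detached notification" proposed above.
    :timer.send_interval(5_000, :ensure_attached)
    {:ok, %{}}
  end

  def handle_info(:ensure_attached, state) do
    case :telemetry.attach(@handler_id, [:my_app, :event], &MyHandler.handle_event/4, nil) do
      :ok -> :ok                       # was detached (or never attached); attached now
      {:error, :already_exists} -> :ok # still attached, nothing to do
    end

    {:noreply, state}
  end
end
```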
A number of integrations are using monotonic_time for calculating durations, and that's great, since it's the only way to avoid bad measurements when "clock skew" happens.
http://erlang.org/doc/apps/erts/time_correction.html
However, monotonic_time does not mark a specific point in time! There are numerous integrations now that are using that time directly as part of a :start event, which means that they are providing an unusable start timestamp. Knowing the accurate start time of a span is critical for tracing.
I suggest we decide on a set of recommendations for marking & tracking time.
My initial suggestion is this:
Start: start_time via System.system_time()
Stop / Failure: duration via System.monotonic_time()
start_time = System.system_time()
start_time_mono = System.monotonic_time()

:telemetry.execute(
  [:event, :start],
  %{start_time: start_time},
  metadata
)

# do work ...

end_time_mono = System.monotonic_time()

:telemetry.execute(
  [:event, :stop],
  %{duration: end_time_mono - start_time_mono},
  metadata
)
Why not rely on each individual handler to mark time?
I was reading the code and, as far as I can tell, it looks like a copy of gen_event: it's able to register handlers for every event propagated, and if a handler fails it's detached. Maybe I'm missing something, but I'm curious: what's the difference from gen_event?
Howdy folks, I'm excited about this project!
I'm the author of the New Relic Elixir Agent (https://github.com/newrelic/elixir_agent/)...
One feature that is really powerful, but a little aside from the Telemetry use cases I'm seeing, is Distributed Tracing. This kind of feature differs in that, in addition to callbacks that fire when events happen, we need to make some kind of mutation.
For HTTP, this looks like adding an additional header to a request. In the New Relic agent, the header addition pretty much has to be done manually by the developer.
headers = headers ++ NewRelic.create_distributed_trace_payload(:http)
On the receiving end, the server needs access to read this header and take some action. We do this with a simple plug that users must add to the top of their pipeline. You could imagine reading the header and wanting to set some kind of value in conn.private, for example.
If we provided a clear convention or mechanism for this kind of thing, libraries might be willing to build this in along with the Telemetry events. Basically it boils down to enabling the Telemetry system to have a category of event that can transform some select pieces of data.
Here's a very rough sketch of what the interface could be like, just to explain the idea:
# Inside an HTTP client
# ... this library wants to allow transforming headers & query params
%{headers: headers, params: params} =
  Telemetry.transform([:http, :client, :pre_request], %{params: params, headers: headers, url: url, body: body})
# Inside an HTTP server / Plug
%{conn: conn} =
Telemetry.transform([:http, :server, :receive], %{conn: conn})
Hopefully this is interesting to other folks too!
It would be nice if we had an API that collected/logged all events published in a certain time period. This would make it easy to discover events. For example, in a Phoenix application I could say: call IO.inspect/1 for all events in the next 5 seconds.
I am not sure if this functionality belongs here. In order to not slow down telemetry, we could implement this using :erlang.trace. The goal here is debugging, so it is important to not slow down the main telemetry engine. We could also add this to telemetry_sampler.
It would be useful to debug poorly configured handlers or having the list of all events emitted in an application.
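As a sketch of the :erlang.trace idea, the stock :dbg tracer can already print every :telemetry.execute/3 call on the node for a few seconds, without touching telemetry itself. This is debug-only tooling, not something to leave running in production.

```elixir
# Debug-only: trace all calls to :telemetry.execute/3 on this node for 5s.
:dbg.tracer()
:dbg.p(:all, :call)
:dbg.tp(:telemetry, :execute, 3, [])
Process.sleep(5_000)
:dbg.stop()
```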
I initially thought about always persisting unhandled events to an ETS table, but I realize that the execute function could be called many times per second so we need to keep the overhead there to a minimum.
So I was thinking of doing it conditionally if the telemetry application is run in debug mode.
For now we could keep things simple and only persist unhandled event names.
Later we could expand the debug mode to do things that are no-nos when running in production.
The downside of this approach is that right now telemetry is so simple that it doesn't need configuration (and that is a good thing) and a debug mode would make it slightly more complex. It also means a possible foot gun for users that could unwittingly set debug mode on production.
I have a use case where I want to include span_id and trace_id, which come from a "distributed tracing" library of choice, in logs. Whether it's OpenTelemetry or Spandex doesn't really matter; in both cases I am facing the same possible issue.
I want to have span_id and trace_id attached to the log lines generated already when the web request starts, but in order to achieve that, the handlers which start the OpenTelemetry/Spandex span/trace need to be executed before the handlers attached by Phoenix.Logger to the same event.
This seems to work, as I can disable logger initialization on :phoenix application start, and then attach it manually after I have attached the OpenTelemetry/Spandex handlers, for example:
# config/config.exs
config :phoenix, :logger, false
# lib/application.ex
SpandexPhoenix.Telemetry.install()
Phoenix.Logger.install()
# or
OpenTelemetry.register_application_tracer(:my_app)
OpentelemetryPhoenix.setup()
Phoenix.Logger.install()
This works as I'd expect it to work but, the documentation of this project clearly says:
Note that you should not rely on the order in which handlers are invoked.
Line 72 in d3ba8b2
And this is precisely what I am relying on (and others too, without a doubt).
It seems that the handlers are executed in the order they were inserted into the underlying ETS table, as they are fetched with :ets.lookup, which does return them in order of insertion.
https://github.com/beam-telemetry/telemetry/blob/main/src/telemetry_handler_table.erl#L48
Should we make this behavior official? I think we don't have to change anything in the project code itself; maybe we should just add a couple of tests to make sure the order is preserved, and remove the warning from the documentation saying that the order is not guaranteed?
Since the project is now public, I think it would be nice to have some CI set up to run tests on each PR. I see that Phoenix itself is using TravisCI, should we use it for this project as well?
Opening this really quick, can't expand on it right this second. But the span function needs to be able to take some result from the start event to pass to the stop event, so that the right span can be ended.
That, or span needs to be able to work another way that isn't based on events.
Trying to figure out how to integrate with OpenTelemetry 1.0 without a lot of overhead for the user, and so that it doesn't have the issue of not knowing which span you are ending when end is called in the stop event.
Proposal:
Telemetry is a dynamic dispatching library for metrics and instrumentations. It is lightweight, small and can be used in any Elixir project.
In a nutshell, you register a custom module and function to be invoked for certain events, and it is executed whenever such an event is emitted. Event names are a list of atoms. Each event is composed of a numeric value and can have metadata attached to it. Let's see an example:
...
It would be great to have built-in support for tracing events alongside metrics. It would provide a nice API to digest by OpenCensus, Spandex, and possibly other tracers out there. While this can currently be "improvised" by creating events in the form of [:event, :start] and [:event, :end], there are a few problems with that approach, for example correlating a :start event with its corresponding :end event.
I have some code that attaches Telemetry handlers on startup of some supervisor. I start this supervisor in tests with start_supervised!, but the problem is that the telemetry handlers are never detached when the supervisor dies.
The code below shows the problem: test "two" will fail with the error {:error, :already_exists}. I don't really like to just ignore the outcome of Telemetry.attach; I could of course handle both :ok and {:error, :already_exists} as valid outcomes.
Is that the way to handle it, or is there a way to clean up the Telemetry handlers after a test?
defmodule TestTelemetry do
  use ExUnit.Case, async: false

  test "one" do
    :ok =
      Telemetry.attach_many(
        "log-response-handler",
        [[:external_message, :consumed], [:external_message, :received]],
        LogResponseHandler,
        :handle_event,
        nil
      )
  end

  test "two" do
    :ok =
      Telemetry.attach_many(
        "log-response-handler",
        [[:external_message, :consumed], [:external_message, :received]],
        LogResponseHandler,
        :handle_event,
        nil
      )
  end
end
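One way to handle this today, rather than tolerating {:error, :already_exists}, is to detach the handler in an on_exit callback so each test starts clean. This sketch assumes the same Telemetry module as above also exposes detach/1:

```elixir
defmodule TestTelemetryCleanup do
  use ExUnit.Case, async: false

  setup do
    handler_id = "log-response-handler"

    :ok =
      Telemetry.attach_many(
        handler_id,
        [[:external_message, :consumed], [:external_message, :received]],
        LogResponseHandler,
        :handle_event,
        nil
      )

    # Detach when the test process exits, so the next test can attach again.
    on_exit(fn -> Telemetry.detach(handler_id) end)
    :ok
  end
end
```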
When providing StartMetadata and StopMetadata, these values will be sent independently to start and stop events. […] In general, StopMetadata should only provide values that are additive to StartMetadata so that handlers, such as those used for metrics, can rely entirely on the stop event.
It's not really clear from this whether StartMetadata values need to be supplied in StopMetadata as well if they should be available for both events. The code seems to suggest they should; the docs seem to suggest not to do that.
I'm working on an instrumentation package for sending Ecto data to New Relic (https://github.com/binaryseed/new_relic_ecto). I'm the author of New Relic's Elixir agent (https://github.com/newrelic/elixir_agent/)
I'm wondering what the expected workflow is for users who would want to install the package.
For example, in this case, I'm wondering about the following:
If I knew the otp_app I could get the list of Repos & attach them. But generally, do we want users to have to glue things together, or should we try to make it as automatic as possible? Should packages provide their own attach, or do we want users interacting with :telemetry.attach directly?
Right now I'm thinking of at least providing a helper that translates from the Repo module to the Telemetry event for the user, and wraps up the details of how my handler is implemented. Here's how that is currently shaping up: binaryseed/new_relic_ecto#4
def attach(repo: repo) do
  Telemetry.attach(
    "new_relic_ecto",
    telemetry_prefix(repo) ++ [:query],
    NewRelic.Ecto.Telemetry,
    :handle_event,
    %{}
  )
end
Just hoping to get a sense of the desired user experience... Thanks!
I am wondering: if I have two nodes connected via libcluster, and I run telemetry.attach on node A and telemetry.execute on node B, will node A execute the handle_event callback automatically?
Or how else can I achieve this with telemetry?
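Telemetry handlers run synchronously in the emitting process on the local node, so node A won't see node B's events by itself. One way to bridge this yourself is a forwarding handler on node B; the event name, registered process name, and node name below are illustrative.

```elixir
# On node B: forward selected events to a registered process on node A.
:telemetry.attach(
  "forward-to-node-a",
  [:my_app, :work, :done],
  fn event, measurements, metadata, _config ->
    # {name, node} addressing delivers to the registered process on node A.
    send({:metrics_collector, :node_a@host}, {:telemetry, event, measurements, metadata})
  end,
  nil
)
```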
Thank you.
This will allow this package to be used by everyone in the community for event dispatching, which will improve the coverage of events throughout the ecosystem. Right now we are planning to move only the core though.
@tsloughter, can you please review the current API and let us know if you would like to do any changes in the process?
or rebar.config:
{deps, [{telemetry, "~> 0.4.1"}]}.
should be changed to:
{deps, [{telemetry, "0.4.1"}]}.
for Erlang, at the end of the readme.
Moreover, there is no 0.4.1 release, so 0.4.0 should be used.
And also
telemetry:execute(
[web, request, done],
#{latency => Latency},
#{request_path => Path, status_code => Status}
)
uses request_path, while the handler
handle_event([web, request, done], #{latency := Latency}, #{path := Path,
             status_code := Status}, _Config) ->
    ?LOG_INFO("[~s] ~p sent in ~p", [Path, Status, Latency]).
matches on the key path. This causes an error when trying to copy-paste the example from the Readme.
Now that this library is implemented in Erlang, would it be possible to have an Elixir-native Telemetry module to interface with, instead of :telemetry?
I'll PR if it's straightforward...
My version information:
...
"telemetry": {:hex, :telemetry, "0.4.1", "ae2718484892448a24470e6aa341bc847c3277bfb8d4e9289f7474d752c09c7f", [:rebar3], [], "hexpm", "4738382e36a0a9a2b6e25d67c960e40e1a2c95560b9f936d8e29de8cd858480f"},
...
$ asdf current
elixir 1.9.4 (set by /Users/isanders/projects/.tool-versions)
erlang 22.2.1 (set by /Users/isanders/projects/.tool-versions)
My error looks like this:
===> Compiling telemetry
===> Uncaught error in rebar_core. Run with DEBUG=1 to see stacktrace or consult rebar3.crashdump
===> When submitting a bug report, please include the output of `rebar3 report "your command"`
** (Mix) Could not compile dependency :telemetry, "/Users/isanders/.asdf/installs/elixir/1.9.4/.mix/rebar3 bare compile --paths="/Users/isanders/projects/my_app/_build/test/lib/*/ebin"" command failed. You can recompile this dependency with "mix deps.compile telemetry", update it with "mix deps.update telemetry" or clean it with "mix deps.clean telemetry"
** (exit) 1
(mix) lib/mix/tasks/cmd.ex:45: Mix.Tasks.Cmd.run/1
(mix) lib/mix/task.ex:331: Mix.Task.run_task/3
(mix) lib/mix/task.ex:365: Mix.Task.run_alias/3
(mix) lib/mix/task.ex:292: Mix.Task.run/2
(mix) lib/mix/cli.ex:79: Mix.CLI.run_task/2
(elixir) lib/code.ex:813: Code.require_file/2
The rebar3.crashdump is the weirdest part:
$ cat deps/telemetry/rebar3.crashdump
Error: {badmatch,{error,enoent}}
[{rebar_erlc_compiler,modify_erlcinfo,5,
[{file,"/home/tristan/Devel/rebar3/_build/prod/lib/rebar/src/rebar_erlc_compiler.erl"},
{line,477}]},
{rebar_erlc_compiler,'-update_erlcinfo_fun/2-fun-0-',4,
[{file,"/home/tristan/Devel/rebar3/_build/prod/lib/rebar/src/rebar_erlc_compiler.erl"},
{line,463}]},
{lists,foldl,3,[{file,"lists.erl"},{line,1263}]},
{rebar_erlc_compiler,update_erlcinfo,3,
[{file,"/home/tristan/Devel/rebar3/_build/prod/lib/rebar/src/rebar_erlc_compiler.erl"},
{line,448}]},
{rebar_erlc_compiler,'-update_erlcinfo_fun/2-fun-0-',4,
[{file,"/home/tristan/Devel/rebar3/_build/prod/lib/rebar/src/rebar_erlc_compiler.erl"},
{line,463}]},
{lists,foldl,3,[{file,"lists.erl"},{line,1263}]},
{rebar_erlc_compiler,update_erlcinfo,3,
[{file,"/home/tristan/Devel/rebar3/_build/prod/lib/rebar/src/rebar_erlc_compiler.erl"},
{line,448}]},
{rebar_erlc_compiler,'-update_erlcinfo_fun/2-fun-0-',4,
[{file,"/home/tristan/Devel/rebar3/_build/prod/lib/rebar/src/rebar_erlc_compiler.erl"},
{line,463}]}]
I don't share a computer with @tsloughter, or even know him, but he is the best lead I have on this.
Hi! I saw Chris describe Telemetry in his ElixirConf talk and want to explore using it.
Is there a more detailed usage example available somewhere, or a "demo" app with source we can read on GitHub to help get started using it?
Is it likely to be covered in Programming Elixir 1.6 or Programming Phoenix 1.4 ?
(I've tried searching the latest versions of both eBooks and not found a reference to Telemetry...)
Thanks!
P.S. LMK if this is more appropriate for the ElixirForum, happy to close this issue and move it to there ...
We've seen an issue in Absinthe where a mix task is run that calls code which is instrumented with telemetry, but the handler lookup fails. The ETS table hadn't been created since telemetry wasn't started.
Given that the goal is to have telemetry embedded everywhere, should we rescue the ArgumentError that arises in cases like this?
telemetry/src/telemetry_handler_table.erl
Lines 45 to 47 in 9f00993
Compiling 1 file (.ex)
** (ArgumentError) argument error
(stdlib) :ets.lookup(:telemetry_handler_table, [:absinthe, :execute, :operation, :start])
/Users/Jaap/Documents/development/family-five/server/deps/telemetry/src/telemetry_handler_table.erl:47: :telemetry_handler_table.list_for_event/1
/Users/Jaap/Documents/development/family-five/server/deps/telemetry/src/telemetry.erl:114: :telemetry.execute/3
lib/absinthe/phase/telemetry.ex:18: Absinthe.Phase.Telemetry.run/2
lib/absinthe/pipeline.ex:366: Absinthe.Pipeline.run_phase/3
lib/absinthe.ex:110: Absinthe.run/3
lib/mix/tasks/compile.schema_json.ex:12: Mix.Tasks.Compile.SchemaJson.run/1
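One possible shape of the suggested rescue, sketched in Elixir rather than the library's actual Erlang source (illustrative only, not the library's code):

```elixir
defmodule TelemetryHandlerTableSketch do
  # Sketch: treat "telemetry not started" the same as "no handlers attached".
  def list_for_event(event_name) do
    :ets.lookup(:telemetry_handler_table, event_name)
  rescue
    # The ETS table doesn't exist because the :telemetry app isn't running,
    # e.g. inside a Mix task; behave as if there were no handlers.
    ArgumentError -> []
  end
end
```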
Hi! The hosted docs are still on 0.3, and that is a bit confusing given the API changes.
Could you guys update it there? https://hexdocs.pm/telemetry/
Thanks!
Using Telemetry from git (sha: 9a9da7b8), the telemetry app won't work in a Mix release. It works fine when starting the application with mix.
This is the error we get when trying to start the release:
[14:02:39.213] [error] module=proc_lib function=crash_report/4
Process #PID<0.3570.0> terminating
** (exit) {:bad_return, {{:telemetry_app, :start, [:normal, []]}, {:EXIT, {:undef, [{:telemetry_app, :start, [:normal, []], []}, {:application_master, :start_it_old, 4, [file: 'application_master.erl', line: 293]}]}}}}
(kernel 8.1.2) application_master.erl:142: :application_master.init/4
(stdlib 3.16.1) proc_lib.erl:226: :proc_lib.init_p_do_apply/3
Initial Call: :application_master.init/4
Ancestors: [#PID<0.3569.0>]
Message Queue Length: 1
Messages: [{:EXIT, #PID<0.3571.0>, :normal}]
Links: [#PID<0.3569.0>, #PID<0.3496.0>]
Dictionary: []
Trapping Exits: true
Status: :running
Heap Size: 376
Stack Size: 28
Reductions: 225
We have added span/3 to telemetry, so it probably makes sense to build a new release. WDYT?
@arkgil, would you please do the honors when you are around? Thank you!
Originally posted by @bryannaegele in #12 (comment):
We've discussed a number of ways to discover telemetry events for a variety of use cases, most recently as it pertains to registering event handlers for tracing purposes. Knowledge of event names is a prerequisite to registering handlers, so we are left with two potential avenues: compile-time introspection or explicit declaration.
Note that this is not intended to be the sole mechanism for event discovery. Issue #12 talks about a way to discover events as they happen. This discussion is about discovering events before they happen so they can be listened to in an unintrusive way.
Compile-time introspection is arguably the best overall solution, since it requires no additional steps by the user. In my initial investigation, however, introspection would be difficult to achieve, especially across languages.
Explicit declaration, by requiring the user or library author to explicitly declare possible events, gives us a straightforward mechanism to make use of that declaration at various steps and with a broad range of applications.
One option would be to provide a behaviour with a telemetry_events/0 function which returns a list of discoverable events.
Another option would be to introduce a telemetry_events module attribute with a list of discoverable events.
Each of the above is easily discoverable at runtime and compile-time. My initial idea was to use module attributes but, after further consideration, I'm leaning towards @josevalim's idea of going with a behaviour. It's just as easy to use for a user, feels more idiomatic, and provides flexibility going forward. It's probably more useful from a tooling perspective, as well.
- A telemetry behaviour with a telemetry_events/0 callback (default impl returns []).
- telemetry:register_event/1, telemetry:deregister_event/1, and telemetry:register_all_events/1 functions to support further runtime declaration flexibility and discoverability (ETS storage is probably fine).
- telemetry:list_all_events/0 would return all registered events.

Hello, first of all thanks for creating a really cool & needed library.
I have a question: as I understand from the code, after calling :telemetry.execute we call the attached handler functions immediately, in a synchronous way. Am I right?
If I am, I have doubts whether in some cases (with high-frequency usage, ecto for example) this is an optimal solution. Can you please clarify this topic?
I was working on adding some new telemetry events to Commanded, to be used in the bridge library I'm developing for Otel. A point of confusion came for me when I needed to pass the metadata to :telemetry.span/3. In the PR, you'll see that I passed the metadata as the StartMetadata, but then passed an empty map for the StopMetadata.
After ruminating on this with @bryannaegele, it became obvious to me that I should be passing the same value to both start and stop. However, the documentation did not point me in this direction, and the API was surprising initially. I would have expected the StopMetadata to be merged into the StartMetadata.
I'm opening this issue to start a discussion on whether this needs a simple documentation update, or whether the API needs to be reconsidered slightly.
I've got a previously working Elixir script with Mix.install:
Mix.install([
{:telemetry, "~> 1.1.0"},
])
I've recently upgraded to macOS Ventura (on M1) and tried to run the script (elixir test.exs), and I get this error:
** (ErlangError) Erlang error: :enoent
(elixir 1.14.2) lib/system.ex:940: System.shell("/Users/yrashk/.mix/elixir/1-14/rebar3 bare compile --paths /Users/yrashk/Library/Caches/mix/installs/elixir-1.14.2-erts-13.1.1/c0c70753ffb5bc2d1e26727be812c93b/_build/dev/lib/*/ebin", [stderr_to_stdout: true, into: %Mix.Shell{callback: #Function<0.74950258/1 in Mix.Shell.IO.cmd/2>}, env: [{"ERL_LIBS", "/Users/yrashk/Library/Caches/mix/installs/elixir-1.14.2-erts-13.1.1/c0c70753ffb5bc2d1e26727be812c93b/_build/dev/lib"}, {"REBAR_BARE_COMPILER_OUTPUT_DIR", "/Users/yrashk/Library/Caches/mix/installs/elixir-1.14.2-erts-13.1.1/c0c70753ffb5bc2d1e26727be812c93b/_build/dev/lib/telemetry"}, {"REBAR_CONFIG", "/Users/yrashk/Library/Caches/mix/installs/elixir-1.14.2-erts-13.1.1/c0c70753ffb5bc2d1e26727be812c93b/_build/dev/lib/telemetry/mix.rebar.config"}, {"REBAR_PROFILE", "prod"}, {"TERM", "dumb"}]])
$ elixir --version
Erlang/OTP 25 [erts-13.1.1] [source] [64-bit] [smp:10:10] [ds:10:10:10] [async-threads:1] [jit] [dtrace]
Not quite sure what went wrong here... I've tried deleting all the caches, which didn't help.
I'm interested in seeing telemetry extended with an API to enable an additional type of instrumentation: the mutation of an outbound request / query with Distributed Trace context information.
The general API could be implemented as an instrumentation hook that a library could build into its code, enabling telemetry handlers to mutate specific data structures.
Taking HTTP requests as an example, we could imagine an HTTP client library adding telemetry hooks:
defmodule HttpClient do
  def request(url, body, headers) do
    extra_headers =
      :telemetry.decorate(
        [:httpoison, :request, :headers],
        %{url: url, body: body, headers: headers}
      )

    validate(extra_headers)
    do_request(url, body, headers ++ extra_headers)
  end
end
Then a handler could attach to [:httpoison, :request, :headers], and do some work in the handler...
defmodule MyTelemetryHandler do
  def handle([:httpoison, :request, :headers], request) do
    context = get_trace_context(request)
    generate_distributed_tracing_headers(context)
  end
end
This would provide a general mechanism for outbound instrumentation, without hard-wiring to any vendor-specific instrumentation details.
Very open to discussions about strategy and naming!
Issues to sort out:
For context, this issue has been discussed before in #24 and beam-telemetry/telemetry_metrics#46
I have a use case where a GenServer needs to monitor a connection. The connection automatically reconnects when it disconnects, and it publishes telemetry events for it.
The problem is my GenServer can crash and get restarted, but :telemetry doesn't clean up the event handler.
Task.async(fn -> :ok = :telemetry.attach(:foo, :foo, &handler/4, :foo) end)
|> Task.await()
:telemetry.list_handlers(:foo) != []
Is there any performant way to have :telemetry Process.monitor/1 anything that calls attach/4, and automatically detach when those processes end?
There are ways to do this on the user side of :telemetry, but it seems very hacky: the config argument of attach could be {handler_id, pid_of_caller}, then the handler can call Process.alive?(pid_of_caller) and call detach if it returns false. It also seems less performant than just having a process monitor.
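That workaround would look something like this (hacky, as noted; the handler module, event name, and handler id are illustrative):

```elixir
defmodule OwnerAwareHandler do
  # The handler detaches itself once the process that attached it has died.
  def handle_event(event, measurements, metadata, {handler_id, owner_pid}) do
    if Process.alive?(owner_pid) do
      do_handle(event, measurements, metadata)
    else
      :telemetry.detach(handler_id)
    end
  end

  defp do_handle(_event, _measurements, _metadata), do: :ok
end

# Attach with the owner's pid embedded in the config:
:telemetry.attach(
  "my-handler",
  [:my_app, :event],
  &OwnerAwareHandler.handle_event/4,
  {"my-handler", self()}
)
```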
Thanks for the help.
This is a follow-up to the discussion on the Ecto mailing list and @josevalim's comment here.
@GregMefford proposed that maybe, instead of subscribing to prefixes of events like we do now, we should subscribe directly to event names, which would allow us to perform a simple :ets.lookup to find matching handlers.
I really like the simplicity of the proposed approach. I think that we were considering it when discussing the initial implementation, and I'm not sure why we settled on the prefixes 🤔
I think that the difference boils down to following: in place of
Telemetry.attach(..., [:phoenix], ..., ...)
there would be
Telemetry.attach(..., Phoenix.events(), ..., ...)
To me the second example looks much more readable and better expresses the intent (i.e. subscribing to all the events published by Phoenix). However, it has a limitation: Phoenix (or any other project which uses this library) would need to know all the event names at the time of attaching the handler. Let's imagine that Phoenix emits the event with the execution time of the controller action. With the prefixes, the event name could be dynamically constructed like:
Telemetry.execute([:phoenix, :latency, YourApp.Controller, :show], 12.01)
With static event names we would need to use the event metadata:
Telemetry.execute([:phoenix, :controller, :latency], 12.01,
  %{controller: YourApp.Controller, action: :show})
What do you think?
/cc @josevalim @GregMefford @studzien @chrismccord @cloud8421
I am wondering: can Telemetry be used to send an event with a nested payload?
:telemetry.execute(
[:web, :request, :done],
%{latency: latency},
%{request_path: path, status_code: status}
)
But what if I need to emit an event whose measurement payload may be a deeply nested map? I'm not sure the type of measurements is documented. Or is it a limitation of the available reporters only?
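Reporters generally assume measurements are a flat map of numeric values, so one common pattern (a suggestion, not an official rule from the docs) is to keep nested structures in the metadata and only numbers in the measurements:

```elixir
# Numbers go in measurements; nested/structured data goes in metadata.
:telemetry.execute(
  [:web, :request, :done],
  %{latency: latency, bytes_sent: bytes_sent},
  %{request: %{path: path, headers: headers}, status_code: status}
)
```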
This is a spin-off from the discussion in #56. @hauleth mentioned providing telemetry:span/3, and I replied that I don't think we can come up with a generally useful API and that we should have a separate discussion about it. So instead of me being grumpy, I thought I would open up said discussion and send a proposal:
telemetry:span(EventPrefix, StartMetadata, (() -> {Result, StopMetadata})) :: Result
This will emit three events:
- EventPrefix ++ [start] with measurement #{system_time := erlang:system_time()} and StartMetadata as metadata
- EventPrefix ++ [exception] with measurement #{duration := erlang:monotonic_time()} and #{kind := throw | error | exit, reason := term(), stacktrace := list()} as metadata
- EventPrefix ++ [stop] with measurement #{duration := erlang:monotonic_time()} and StopMetadata as metadata
The only limitation of this approach is that we can't configure measurements, but I would say configuring metadata is good enough, and despite my initial comment, this should cover a good amount of scenarios.
Shall we ship with this approach?