
Comments (10)

Jeffail commented on July 20, 2024

I'm currently working on this and hope to have a first draft ready for tomorrow. The work is on this branch: https://github.com/Jeffail/benthos/tree/feature/distributed-tracing

Progress so far:

  • Using the opentracing API
  • Message parts are able to have a context attached
  • That context can be used as a reference to a span
  • Spans are created at the input level, and children of those root spans can be constructed by any Benthos component
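
As a rough illustration of the points above, here is a minimal sketch using only the standard library. The `Span` and `Part` types and all function names are hypothetical stand-ins, not the Benthos or opentracing APIs. It shows a context attached to a message part being used as a reference to a span, with the root created at the input and a child created by a later component.

```go
package main

import (
	"context"
	"fmt"
)

// Span is a toy stand-in for a tracing span (hypothetical, not opentracing.Span).
type Span struct {
	Name   string
	Parent *Span
}

// spanKey is the private context key under which a span is stored.
type spanKey struct{}

// Part is a hypothetical message part. The context field is unexported so
// that callers cannot reach it directly or reuse it for cancellation.
type Part struct {
	Data []byte
	ctx  context.Context
}

// NewPartWithRootSpan mimics an input creating a root span for a new part.
func NewPartWithRootSpan(data []byte, inputName string) *Part {
	root := &Span{Name: inputName}
	ctx := context.WithValue(context.Background(), spanKey{}, root)
	return &Part{Data: data, ctx: ctx}
}

// SpanFromPart returns the span referenced by the part's context, or nil.
func SpanFromPart(p *Part) *Span {
	if p == nil || p.ctx == nil {
		return nil
	}
	s, _ := p.ctx.Value(spanKey{}).(*Span)
	return s
}

// StartChild creates a child of whatever span the part currently references.
func StartChild(p *Part, name string) *Span {
	return &Span{Name: name, Parent: SpanFromPart(p)}
}

func main() {
	part := NewPartWithRootSpan([]byte("hello"), "input_http_server")
	child := StartChild(part, "processor_example")
	fmt.Println(child.Name, "is a child of", child.Parent.Name)
}
```

Keeping the context unexported and reachable only through helpers is one way to avoid it turning into a general-purpose cancellation handle.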

TODO:

  • Verify that multiple calls to Finish on a span don't blow it up (makes my code much simpler)
  • Create a new component tracer similar to metrics, with options for different aggregators
  • Create a new processor trace similar to metric
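
On the first TODO item: if repeated calls to Finish turned out not to be safe, one fallback would be a small wrapper that forwards only the first call. The sketch below is an assumption-laden illustration using `sync.Once`. The `Finisher` interface and `FinishOnce` helper are hypothetical names, and nothing here says how opentracing or Jaeger spans actually behave.

```go
package main

import (
	"fmt"
	"sync"
)

// Finisher is the single method this sketch needs from a span.
type Finisher interface {
	Finish()
}

// onceSpan wraps a Finisher so that only the first Finish call reaches it.
type onceSpan struct {
	inner Finisher
	once  sync.Once
}

// FinishOnce returns a Finisher that is safe to finish any number of times.
func FinishOnce(s Finisher) Finisher {
	return &onceSpan{inner: s}
}

// Finish forwards to the wrapped span on the first call and is a no-op after.
func (o *onceSpan) Finish() {
	o.once.Do(o.inner.Finish)
}

// countingSpan records how many times Finish reached the underlying span.
type countingSpan struct {
	finished int
}

func (c *countingSpan) Finish() { c.finished++ }

func main() {
	raw := &countingSpan{}
	span := FinishOnce(raw)
	span.Finish()
	span.Finish()
	fmt.Println("underlying Finish calls:", raw.finished)
}
```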

I'm implementing this in a way that doesn't change any existing framework APIs, so we won't need to wait for a V2 release to have this out. I'm also being very careful not to expose the context associated with message parts in the API as I do not want this to bleed over into some sort of cancellation mechanism by accident.

from connect.

Jeffail commented on July 20, 2024

Branch is looking good so far. I've got a new tracer component similar to metrics, which currently supports only jaeger and none. I still need to add traces to more processors, but this looks like it'll be ready to merge into master soon. The best part is that none of the changes will break the config spec or internal API, so this won't require a major release bump.


Jeffail commented on July 20, 2024

Implemented: ce0609c
Released: https://github.com/Jeffail/benthos/releases/tag/v1.6.0


ledor473 commented on July 20, 2024

I might take a stab at it, and we can discuss improvements once an initial PR is sent.


Jeffail commented on July 20, 2024

Hey @ledor473, thanks for the quick feedback. I'll add those fields to the config spec. I'm not too worried about having all fields exposed for now since it's easy to add more as/when they get requested.

I wouldn't normally use environment variables directly since they can be specified within a config: https://github.com/Jeffail/benthos/blob/master/docs/config_interpolation.md#environment-variables. However, the AWS components have already set a precedent of allowing direct env var configs, so I'm not opposed to adding it for Jaeger as well.


Jeffail commented on July 20, 2024

Hey @ledor473,

I'm not particularly well versed in how distributed tracing is used, so I'd need to do some reading up on it. From my basic knowledge, though, it does seem very fitting for a project like Benthos, so I'd be happy to explore it as an option.

I'm going to mark this one as help needed as it would be good to share notes as to how much work would be involved, how we would test it, etc, as it sounds like a big task.

Thanks for the suggestion, I have a feeling this could be extremely valuable and I wouldn't have considered it otherwise.


nwest1 commented on July 20, 2024

I’ve done a bit of work with tracing. The biggest complicating factor here is that there isn’t a standard way to inject or extract a span. It’s specifically omitted from the opentracing spec.

There are zipkin http standards (x-b3-* headers) but that’s one tracer and one protocol.

Internal tracing could be nice for profiling. “zipkin-http” and “jaeger-http” extract/injectors could be provided, but anything else should probably be written by the end user.


Jeffail commented on July 20, 2024

Planning to release the first phase of this in v1.6.0 later today, which is:

  • New tracer component in the root of the config spec, allowing you to choose a tracer target. This feature is considered stable in that I do not intend to change it without a major version release.

  • Internal API for working with spans on messages. This is mostly hidden from the current stream APIs but exposed through helper functions. If I find a major flaw in these functions (even one that doesn't actually break anything) then I might modify them without a major version release.

  • The actual information exposed by Benthos components through opentracing is considered experimental. I've made a first pass at exposing useful information from all processors, but the information exposed as well as its formatting is subject to adjustment without a major version release. I've added a disclaimer to the documentation for tracers that explains this.

  • Each message is given a root span at the input level of a Benthos pipeline, and that span is finished when the message is acknowledged at the output level. An input component can also extract a root span from a previous service. This is already implemented in the HTTP input types (using headers), and I intend to gradually add solutions for this to most input types where possible.

  • All APIs for opentracing within Benthos assume a global tracer. I'm doing this to save having to propagate a tracer reference through all components. This would be a problem if we decided to do clever stuff like namespacing spans for pipelines running in streams mode, or outputting to multiple tracers. However, doing so would also require breaking changes to the API so this would need to come in at V2 anyway, which I'm open to if there's a good case for it.
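
The global tracer trade-off in the last point can be sketched as follows. This is a toy illustration using only the standard library. The `Tracer` interface, `prefixTracer`, and `runProcessor` are hypothetical and far simpler than the opentracing equivalents. The point is that components look the tracer up on demand instead of receiving a reference, which also means every pipeline in the process shares it.

```go
package main

import (
	"fmt"
	"sync"
)

// Tracer is a toy tracer interface (hypothetical). StartSpan returns the
// name of the span it would have created.
type Tracer interface {
	StartSpan(name string) string
}

// noopTracer is the default tracer and creates nothing.
type noopTracer struct{}

func (noopTracer) StartSpan(string) string { return "" }

// prefixTracer labels spans with a fixed service prefix.
type prefixTracer struct {
	prefix string
}

func (p prefixTracer) StartSpan(name string) string { return p.prefix + "." + name }

var (
	globalMu sync.RWMutex
	global   Tracer = noopTracer{}
)

// SetGlobalTracer replaces the process-wide tracer.
func SetGlobalTracer(t Tracer) {
	globalMu.Lock()
	defer globalMu.Unlock()
	global = t
}

// GlobalTracer returns the process-wide tracer.
func GlobalTracer() Tracer {
	globalMu.RLock()
	defer globalMu.RUnlock()
	return global
}

// runProcessor stands in for any component. It needs no tracer argument.
func runProcessor(name string) string {
	return GlobalTracer().StartSpan(name)
}

func main() {
	SetGlobalTracer(prefixTracer{prefix: "benthos"})
	fmt.Println(runProcessor("processor_example"))
}
```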

If anyone has any concerns or feedback that might change these plans please let me know soon, I'm more than happy to delay merging if I've not gotten this quite right.


ledor473 commented on July 20, 2024

@Jeffail I've had a quick look at the changes in the branch, and what stands out most to me is the configuration. Jaeger has quite a few useful settings, and while you've got the required ones, I think it would be nice to have access to all of them.

A way to do it would be to use configuration.FromEnv() like here: https://github.com/jaegertracing/jaeger-client-go/blob/master/config/example_test.go#L110
Which would let people use any of these environment variables: https://github.com/jaegertracing/jaeger-client-go#environment-variables
The change would likely be only this line: https://github.com/Jeffail/benthos/compare/feature/distributed-tracing#diff-eb55ef0e4904b260bcd6fd7ba4318fe4R80

That being said, I'm not sure whether Benthos uses environment variables elsewhere, so if you would prefer to expose more settings in JaegerConfiguration, I think the following would be valuable:

  • JAEGER_SAMPLER_TYPE: Especially useful to use the remote sampler
  • JAEGER_SAMPLER_MANAGER_HOST_PORT: Needed when using the remote sampler
  • JAEGER_TAGS: Lets you configure tracer-level tags
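
For illustration, here is a rough sketch of how explicit config fields and those three environment variables could coexist, with the environment overriding the config when a variable is set. It uses only the standard library and is not the jaeger-client-go implementation. The `JaegerConfig` struct, its field names, and `OverrideFromEnv` are hypothetical, and the sketch assumes JAEGER_TAGS is a comma-separated list of key=value pairs.

```go
package main

import (
	"fmt"
	"os"
	"strings"
)

// JaegerConfig is a hypothetical subset of tracer settings.
type JaegerConfig struct {
	SamplerType        string
	SamplerManagerAddr string
	Tags               map[string]string
}

// parseTags splits "k1=v1,k2=v2" into a map, skipping malformed pairs.
func parseTags(raw string) map[string]string {
	tags := map[string]string{}
	for _, pair := range strings.Split(raw, ",") {
		kv := strings.SplitN(pair, "=", 2)
		if len(kv) != 2 {
			continue
		}
		tags[strings.TrimSpace(kv[0])] = strings.TrimSpace(kv[1])
	}
	return tags
}

// OverrideFromEnv returns cfg with any set environment variables applied.
// The lookup function has the same signature as os.LookupEnv so that it
// can be replaced in tests.
func OverrideFromEnv(cfg JaegerConfig, lookup func(string) (string, bool)) JaegerConfig {
	if v, ok := lookup("JAEGER_SAMPLER_TYPE"); ok {
		cfg.SamplerType = v
	}
	if v, ok := lookup("JAEGER_SAMPLER_MANAGER_HOST_PORT"); ok {
		cfg.SamplerManagerAddr = v
	}
	if v, ok := lookup("JAEGER_TAGS"); ok {
		cfg.Tags = parseTags(v)
	}
	return cfg
}

func main() {
	cfg := OverrideFromEnv(JaegerConfig{SamplerType: "const"}, os.LookupEnv)
	fmt.Printf("%+v\n", cfg)
}
```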


Jeffail commented on July 20, 2024

Added those extra fields. I've left it so that doing direct environment based configuration is possible in the future if it becomes a hot request.

Leaving it as a PR for now. I'm going to walk away and clear my head for a couple of hours before reading through it again.

I think the only snag I've encountered so far is with the batch processor: when a message doesn't trigger the batch flush, its root span can be finished before its child spans. Jaeger seems to cope fine with that, but as it's undefined territory (and looks odd) I'll need a solution eventually. I haven't got one yet that I would consider "clean", so I'm leaving it as it is for now.
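
For the sake of discussion, one possible direction (not something implemented in the branch, and not necessarily "clean") would be to defer the real finish of the root until every open child has closed. The sketch below uses only the standard library. The `deferredRoot` type and its methods are hypothetical.

```go
package main

import (
	"fmt"
	"sync"
)

// deferredRoot calls onFinish only after Finish has been requested AND
// every open child has been closed. onFinish runs while the lock is held,
// so it must not call back into the same deferredRoot.
type deferredRoot struct {
	mu        sync.Mutex
	open      int
	requested bool
	finished  bool
	onFinish  func()
}

// StartChild registers an open child and returns a function that closes it.
// Calling the returned function more than once has no further effect.
func (r *deferredRoot) StartChild() (done func()) {
	r.mu.Lock()
	r.open++
	r.mu.Unlock()

	var once sync.Once
	return func() {
		once.Do(func() {
			r.mu.Lock()
			defer r.mu.Unlock()
			r.open--
			r.maybeFinishLocked()
		})
	}
}

// Finish requests that the root be finished as soon as no children are open.
func (r *deferredRoot) Finish() {
	r.mu.Lock()
	defer r.mu.Unlock()
	r.requested = true
	r.maybeFinishLocked()
}

// maybeFinishLocked must be called with r.mu held.
func (r *deferredRoot) maybeFinishLocked() {
	if r.requested && r.open == 0 && !r.finished {
		r.finished = true
		r.onFinish()
	}
}

func main() {
	var order []string
	root := &deferredRoot{onFinish: func() { order = append(order, "root") }}

	childDone := root.StartChild()
	root.Finish() // finish requested while the child is still buffered in a batch
	order = append(order, "child")
	childDone()

	fmt.Println(order)
}
```

The cost is extra bookkeeping per message, and the root's recorded duration would stretch to cover the time spent waiting for the batch to flush, which may or may not be what you want to see in a trace.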

