Giter Club home page Giter Club logo

gsoc-swift-tracing's Introduction

gsoc-swift-tracing's People

Contributors

ktoso avatar pokryfka avatar slashmo avatar

Stargazers

 avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar

Watchers

 avatar  avatar  avatar  avatar  avatar  avatar

Forkers

ktoso

gsoc-swift-tracing's Issues

`TracingInstrument` default implementation of startSpan is potentially infinite loop

TracingInstrument extends Instrument with startSpan method:

/// An `Instrument` with added functionality for distributed tracing. Is uses the span-based tracing model and is
/// based on the OpenTracing/OpenTelemetry spec.
public protocol TracingInstrument: Instrument {
    /// Start a new `Span` within the given `BaggageContext` at a given timestamp.
    /// - Parameters:
    ///   - operationName: The name of the operation being traced. This may be a handler function, database call, ...
    ///   - context: The `BaggageContext` within to start the new `Span`.
    ///   - kind: The `SpanKind` of the new `Span`.
    ///   - timestamp: The `DispatchTime` at which to start the new `Span`.
    func startSpan(
        named operationName: String,
        context: BaggageContext,
        ofKind kind: SpanKind,
        at timestamp: DispatchTime?
    ) -> Span
}

However, at the same time it provides its default implementation with exactly the same signature.

It is possible to create new type of a TracingInstrument which does not override the default implementation, example:

extension XRayRecorder: TracingInstrument {
    public func extract<Carrier, Extractor>(_ carrier: Carrier, into baggage: inout BaggageContext, using extractor: Extractor) where Carrier == Extractor.Carrier, Extractor : ExtractorProtocol {
        // TODO: impl
    }
    
    public func inject<Carrier, Injector>(_ baggage: BaggageContext, into carrier: inout Carrier, using injector: Injector) where Carrier == Injector.Carrier, Injector : InjectorProtocol {
        // TODO: impl
    }
}

in which case:

let instrument: TracingInstrument = XRayRecorder(emitter: emitter)
instrument.startSpan(/ .../)

will be recursively calling itself


I understand the motivation to provide a default implementation was to provide default value for SpanKind.

Perhaps it would be safer if it was optional from the very beginning:

    func startSpan(
        named operationName: String,
        context: BaggageContext,
        ofKind kind: SpanKind?,
        at timestamp: DispatchTime?
    ) -> Span

or not included in the arguments in the extension (in fact the value of kind is not used at all even if different than default value):

    public func startSpan(
        named operationName: String,
        context: BaggageContext,
        at timestamp: DispatchTime? = nil
    ) -> Span {
        self.startSpan(named: operationName, context: context, ofKind: .internal, at: nil)
    }

Build a Context PoC

In its initial version, Context will be a wrapper around a dictionary ([String : Any]) with methods to inject and extract metadata.
The shape of both Context and its storage will likely change in the future, but this first iteration allows us to start experimenting with potential use-cases.

inject() implementations should only be additive

FakeTracer today does remove a header from headers if it's not in the baggage:

 else {
            headers.remove(name: Self.headerName)
        }

it should not do that; we should not be 'destructive' in inject() other than if we happen to override a header because we're setting headers from baggage.

Removing the else there will be enough here ๐Ÿ‘

Add License headers

At the beginning of each file in this project we want to have a license header (similar to NIO). Additionally, in the Locks file we need to add NIOs license headers underneath our own.

Add forceFlush to trace instruments

This is a follow up of pokryfka/aws-xray-sdk-swift#16 (comment).

I do not have a strong opinion on that, still I think its worth a discussion and it exceeds scope of a comment in PR.


XRayRecorder currently exposes:

the recorder currently exposes "NIO" flush:

    public func flush(on eventLoop: EventLoop) -> EventLoopFuture<Void> {}

as well as blocking wait without NIO dependency:

    public func wait(_ callback: ((Error?) -> Void)? = nil) { // }

@ktoso

Could you handle this in the recorder's shutdown?

yes for "services" (Vapor, Kitura, ...)

however in case of "lambda runtime", my understanding is that we do need to emit all the segments during event handling (event though it will make the request handling longer) because otherwise lambda may be suspended (and potentially never wake up again) before emitting finishes, example:

private struct ExampleLambdaHandler: EventLoopLambdaHandler {
    typealias In = Cloudwatch.ScheduledEvent
    typealias Out = Void

    private let recorder = XRayRecorder()

    private func doWork(on eventLoop: EventLoop) -> EventLoopFuture<Void> {
        eventLoop.submit { usleep(100_000) }.map { _ in }
    }

    func handle(context: Lambda.Context, event: In) -> EventLoopFuture<Void> {
        recorder.segment(name: "ExampleLambdaHandler", context: context) {
            self.doWork(on: context.eventLoop)
        }.flatMap {
            self.recorder.flush(on: context.eventLoop)
        }
    }
}

Lambda.run(ExampleLambdaHandler())

to be honest, while my explanation above does make sense to me, I do plan checking more in details how its implemented/enforced in "official" XRay SDKs;

also note that while in case of XRay emitting = sending UDP in local network, which should be comparatively inexpensive and may be finished (subject of more testing) before lambda is suspended even without explicit flushing;
emitting spans using different TracingInstrument in lambda handler will be typically much mroe expensive and will take more time;

Polish instrument initialization and lookups

related question is how the InstrumentationSystem should be configured,
for instance should there be only one instrument, one of a kind and so on:

    // FIXME: smarter impl
    public static var tracer: TracingInstrument {
        self.lock.withReaderLock {
            let tracer: TracingInstrument? = self._tracer
            let res: TracingInstrument = tracer ?? NoOpTracingInstrument()
            return res
        }
    }

BTW the code above will not find a tracer if its provided by official and public MultiplexInstrument

Originally posted by @pokryfka in #31 (comment)

Consider InstrumentationSystem (?) similar to MetricsSystem et al

Libraries will often need to pull an instrument "out of thin air" in a similar way as metrics have to be pulled out sometimes.

We should offer something similar to https://github.com/apple/swift-metrics#selecting-a-metrics-backend-implementation-applications-only

Naming will evolve here but something like InstrumentationSystem to be consistent with Metrics and logging perhaps?
We should also think how it relates to tracing; would we allow e.g. InstrumentationSystem.instrument and also InstrumentationSystem.tracer (where a tracer is-a instrument but not all usages need a tracer etc.)

SpanAttributes as a type

Since the spec suggests those should keep order, just a plain old dict is not going serve this well:

Attributes SHOULD preserve the order in which they're set. Setting an attribute with the same key as an existing attribute SHOULD overwrite the existing attribute's value.

This basically means a SortedDict which we don't have;

It's a SHOULD so we can move forward without it for now but we want to be in a place where we can swap the impl without breaking anyone to conform to this requirement.

So, let's:

struct SpanAttributes { 
    subscript(_ name: String) -> SpanAttribute? 
    func forEachAttribute(_ callback: (String, SpanAttribute) -> ())
}

and for now back it with a private [String:SpanAttribute] which we can easily replace eventually

Extract "InjectInto/ExtractFrom" from InstrumentProtocol

As mentioned in #47

While working on this I noticed that the current shape of InstrumentProtocol with its associated types InjectInto & ExtractFrom prohibits us from setting up a singleton instance to be bootstrapped, as we can't know upfront what the user will bootstrap the system with (so what's the properties type?).

Also, it's definitely possible for a system to use different technologies (HTTP, gRPC, ...) being instrumented by the same cross-cutting tool. That's why it probably makes sense to not have the Instrument be generic over the InjectInto/ExtractFrom types, but rather to move this responsibility somewhere else.

Consider simplifying SpanAttribute types - use Swift types?

Per OT:

The Span interface MUST provide:
An API to set a single Attribute where the attribute properties are passed as arguments. This MAY be called SetAttribute.
To avoid extra allocations some implementations may offer a separate API for each of the possible value types.

TracingInstrument defines two public type: SpanAttribute and SpanAttributes which provide A LOT more and need to be implemented by any TracingInstrument implementation either by:

  • embracing them and adding dependency on the types even if they not directly match their own internal definition of "an attribute" (see XRays annotation and metadata); or
  • adding lots of bridging and copying data back and forward in the process

to be honest I think SpanAttribute and SpanAttributes are a nice example of Swift type safety but its a bit of an hassle to adapt to

in my original XRay Segment implementation I defined only setters (no getters) for XRay Annotation and decided to expose only Swift types:

    internal typealias Annotations = [String: AnnotationValue]

    internal enum AnnotationValue {
        case string(String)
        case integer(Int)
        case float(Float)
        case bool(Bool)
    }

    private func setAnnotations(_ newElements: Annotations) {
        lock.withWriterLockVoid {
            for (k, v) in newElements {
                _annotations.updateValue(v, forKey: k)
            }
        }
    }

    public func setAnnotation(_ value: String, forKey key: String) {
        setAnnotations([key: .string(value)])
    }
    public func setAnnotation(_ value: Bool, forKey key: String) {
        setAnnotations([key: .bool(value)])
    }

    public func setAnnotation(_ value: Int, forKey key: String) {
        setAnnotations([key: .integer(value)])
    }

    public func setAnnotation(_ value: Float, forKey key: String) {
        setAnnotations([key: .float(value)])
    }

added a getter and a function to remove a value trying to conform to Span type but that's still not enough,
I need to expose a (copy) of the collection with all the annotations and a function with a callback modifying all the annotations.

Which OT requirement does it fulfils and what is a use case for that?

I think in a typical use case user creates a span, adds a few attributes (perhaps overriding previous ones) and that's it really.

First end-to-end sample (HTTPClient -> NIO Server -> ...)

First sample in which we want to:

  • have a client issue an http request (our "wrapped" http client for now)
    • a) with some values in context
    • b) without any values in context
  • inject from context to headers
  • receive on a server and extract the context
    • show that we got the same values
    • show how an inject could "make up" a new trace id if none was present in incoming request
    • fun topic: showcase how we'd implement sampling here ("one in n requests should be traced") etc
  • the server should log something, and these logs should include the metadata we got from the context

Maybe make another request from there?

Consider nicer string description of baggage context

Current printout just uses the default properties which is BaggageContext(_storage: [:]) which reads a bit meh, since we want to hide that _storage.

Specifically, it looks like this today:

BaggageContext(_storage: [Baggage.AnyBaggageContextKey(keyType: MyModule.MyFunKey, _name: nil): Baggage.BaggageContext.(unknown context at $10773422c).ValueContainer(value: ...)])

That's a lot of noise :)

Perhaps the default rendering should only render the keys?

Rename package name to `swift-tracing`

As the library and its interface matures, I think it would be great to change the package name to swift-tracing,
even though the GIT repository name remains gsoc-swift-tracing.

Step 0: Sample (single threaded, no branch/join) of inject/extract-ing context

Let's start at step zero before we take it further.

As we go through this project we'll want to cover a bunch of cases and samples ("scenarios") where and how to use the context.

Let's start at step zero and just have an example how in single threaded normal local code one might use the context to:

  • inject some values
  • pass around the context through some methods
  • see that it's usable in there
  • extract values from the context

--

From that step, we can start talking about "where the extract/inject code would live" (e.g. the "glue code" between context + a specific framework (http, other) + and specific tracer, and how we'd fit that in.

Once we have the general shape, let's continue to Step 1 (perhaps inject/extract from http headers), and continue adding more cases :) (we have them outlined in our project plan google doc, but let's take them one step at a time here)

Instrument "box" naming

(ignoring how the inject/extract to/from are expressed).

We'd want to write Instrument as the generic rather than I since that's common swift style to use full words.
But we today use Instrument for the wrapper type:

    init<Instrument>(instrument: Instrument)
        where Instrument: InstrumentProtocol,
        Instrument.InjectInto == HTTPHeaders,
        Instrument.ExtractFrom == HTTPHeaders {

        self.instrument = Instrument(instrument) // conflict

What would be a better naming strategy for this type:

/// A box-type for an `InstrumentProtocol`, necessary for creating homogeneous collections of `InstrumentProtocol`s.
public struct Instrument<InjectInto, ExtractFrom>: InstrumentProtocol {
    private let inject: (BaggageContext, inout InjectInto) -> Void
    private let extract: (ExtractFrom, inout BaggageContext) -> Void

    /// Wrap the given `InstrumentProtocol` inside an `Instrument`.
    /// - Parameter instrument: The `InstrumentProtocol` being wrapped.
    public init<I>(_ instrument: I)
        where
        I: InstrumentProtocol,
        I.InjectInto == InjectInto,
        I.ExtractFrom == ExtractFrom {

TracingInstrument does come to mind, but these ones are not specifically tracing, they are generic.

Having looked at this again I guess this should be AnyInstrument and the protocol can become InstrumentProtocol (OR BaggageInstrument)... WDYT?

OTel style Spans: Links

Some notes about how links can be done:

The Java impl does:

    Builder addLink(SpanContext spanContext);
    Builder addLink(SpanContext spanContext, Attributes attributes);
//    * <p>Links are used to link {@link Span}s in different traces. Used (for example) in batching operations, where a single batch handler processes multiple requests from different traces or the same trace.
    Builder addLink(Link link);

We can basically look the same here:

    mutating func addLink(context: BaggageContext, attributes: SpanAttributes = .none)
    mutating func addLink(_ link: Link)

Same as with any other add things we can add adding... ones if we want via extensions later on.

as for the comment on the spec:

  • An API to record a single Link where the Link properties are passed as arguments. This MAY be called AddLink.
  • An API to record a single Link whose attributes or attribute values are lazily constructed, with the intention of avoiding unnecessary work if a link is unused. If the language supports overloads then this SHOULD be called AddLink otherwise AddLazyLink MAY be considered.

I think we can handle the lazy case inside the SpanAttributes #68

To be honest this starts to lead us into that these context things should become Copy-on-Write types, so we can share them -- and this is a perfect example since we'll never mutate the span attributes here ๐Ÿค”

As step 0 we can just be eager for now, but let's then make a ticket about the lazy part -- I think we should solve it by copy-on-write style things rather than some () -> SpanAttributes which would have been the other option.

Naming naming naming :-)

At some point once we have enough examples, let us gather prior art of names of the bits pieces and settle on names we'd like to stick to.

Add error recording API to Span

To facilitate recording an exception languages SHOULD provide a RecordException convenience method. The signature of the method is to be determined by each language and can be overloaded as appropriate. The method MUST record an exception as an Event with the conventions outlined in the exception semantic conventions document.

According to the Otel spec, exceptions must be recorded as SpanEvent: https://github.com/open-telemetry/opentelemetry-specification/blob/master/specification/trace/semantic_conventions/exceptions.md

SwiftLog Backend

Evaluate whether it makes sense to build a Logging LogHandler that uses the Context as metadata storage.

Create MultiplexMiddleware

Currently, it's annoying for each instrumented system to deal with multiple instrumentation middleware. Instead, we want to create a MultiplexInstrumentationMiddleware that takes all other middleware on init and itself conforms to InstrumentationMiddlewareProtocol. Then instrumented systems can think of the whole instrumentation stack as one middleware that under the hood calls out to all middleware.

Prior art for this is Logging 's MultiplexLogHandler.

Share BaggageContext across multiple NIO handlers

Currently, there's no way of sharing BaggageContext across multiple NIO channel handlers. This means, each handler would have to create a new BaggageContext (like in this example) which eliminates the ability for context propagation across handlers in the same channel pipeline.

@ktoso & I thought about adding a new baggage property to the ChannelHandlerContext so that a BaggageContext can be created in the outer most channel handler, set on the Context, and then be accessed in later channel handlers. This would probably also mean that NIO itself would have to depend on the BaggageContext library.

cc @weissi I'd love to get your input on this as well. I'm also happy to to move this over to the NIO repo once we have a general idea on how to make this integration happen ๐Ÿ˜Š

Remove tracing instrument current span

Proposing to remove 'tracingInstrument.currentSpan' completely from our APIs

Thinking some more about it I think we should be doubling down on the fact that we need to pass context around explicitly and let's not try to offer the thread local workaround. If we need to we could revisit this but for now let's remove it and not dive into the world of scope managers at all, as were on the side of explicitly context passing already anyway.

The main reason is that we want baggage context to be agnostic to any implementation of even tracing and thus everyone should be able to adopt it. And, since there are so many concurrency stories in swift it will be hard or impossible to implement a thread local style current span that actually works reliably - e.g. across nio that then uses dispatch that then uses some couritines etc... let's not assume we are able to solve this for the time being.

Take the Context for a spin

As our first use case of the context, we'll write some functions inside a Unit Test that take a Context and inject, extract metadata.

Make Span attributes write-only

Currently Span attributes are modeled through two types SpanAttributes & SpanAttribute. To make conformance to the Span protocol easier we want to drop these types from the public API and instead expose methods on Span for the different primitives, e.g. setAttribute(_ string: String, forKey key: String) instead of the subscript-based attributes property.

Expose Context metadata as Logging metadata

Set the collected information stored in the Context as metadata of a Logger. This Logger will be exposed as part of the Context API. When logging messages using said Logger the stored metadata will then be logged too.

Compile use cases on CI

Now that the number of use cases is growing we might want to start adding them to the CI pipeline. @ktoso Do you think this makes sense? Or should we rather write separate integration tests?

Logging performance considerations

Currently, the computed logger property creates String representations (String(describing: value)) of everything inside the BaggageContext, although a logging configuration may not even print out this metadata.

One possibility suggested by @ktoso is to create a custom LogHandler that allows us to use the String(describing: value) call only when needed.

BaggageContext and Swift Logging: Design alternatives

Logging support has now been implemented as part of a BaggageLogging library (that depends on swift-log). It allows for setting a base Logger that is then being used to compute the context.logger property. In the current implementation, this base Logger is injected into the BaggageContext by stuffing it in the baggage via a special BaggageContext.BaseLoggerKey.

Below are some pro/cons on this approach and a possible alternative.

(1) Storing the Logger inside the BaggageContexts storage

Pros ๐Ÿ‘

  • It's easy, no need to pass a Logger while initializing a BaggageContext
  • It allows us to add the logger property in a separate library, as the BaggageContext itself becomes the "property storage"

Cons ๐Ÿ‘Ž

  • When setting the metadata on the Logger we had to manually exclude the BaggageContext.BaseLoggerKey so that it's not included in the log
  • This may set an example for others to stuff in non-metadata related things into the BaggageContext
  • It feels like a hack because we weren't able to add the Logger as a stored property in the BaggageContext extension

(2) Adding Logger to BaggageContext directly

Pros ๐Ÿ‘

  • Feels more natural, no need to make an exception for the otherwise metadata-only BaggageContext
  • Doesn't lead to exclusion of certain keys when adding to the Logger metadata

Cons ๐Ÿ‘Ž

  • BaggageContext would need to depend on swift-log directly

What are your thoughts on this. Other pros/cons, or even other alternatives to this approach?

Inject/Extract HTTP Headers

  • Manually inject HTTP headers derived from Context into AsyncHTTPClient
  • Manually extract HTTP headers from AsyncHTTPClient and store in Context

Context and W3C TraceContext

Let's implement how a https://www.w3.org/TR/trace-context-1/#considerations-for-trace-id-field-generation would be extracted/injected to/from a baggage.

See also: https://github.com/open-telemetry/opentelemetry-specification/blob/master/specification/trace/api.md#spancontext

A SpanContext represents the portion of a Span which must be serialized and propagated along side of a distributed context. SpanContexts are immutable. SpanContext MUST be a final (sealed) class.

The OpenTelemetry SpanContext representation conforms to the w3c TraceContext specification. It contains two identifiers - a TraceId and a SpanId - along with a set of common TraceFlags and system-specific TraceState values.

TraceId A valid trace identifier is a 16-byte array with at least one non-zero byte.

SpanId A valid span identifier is an 8-byte array with at least one non-zero byte.

TraceFlags contain details about the trace. Unlike Tracestate values, TraceFlags are present in all traces. Currently, the only TraceFlags is a boolean sampled flag.

Tracestate carries system-specific configuration data, represented as a list of key-value pairs. TraceState allows multiple tracing systems to participate in the same trace.

IsValid is a boolean flag which returns true if the SpanContext has a non-zero TraceID and a non-zero SpanID.

IsRemote is a boolean flag which returns true if the SpanContext was propagated from a remote parent. When creating children from remote spans, their IsRemote flag MUST be set to false.

Please review the W3C specification for details on the Tracestate field.

NOOPInstrument nitpicks

  • It could be a struct
    • no need for the let instance -- no need to use a class and let many refs to it, a struct is cheaper
  • move its definition to Instrument.swift rather than InstrumentationSystem.swift
  • should be public, no reason to disallow people from using it
  • spelling: please let's spell it as "NoOp" for "no operation" NOOP reads a bit weird, it's not an acronym.
    • also, prior art in public struct SwiftLogNoOpLogHandler: LogHandler {

Minor things, noticed while poking around

Add swiftformat and "sanity" script

Similar to other SSWG projects, so we can avoid formatting conflicts :)

No strong opinions on the rules, I'll copy what I have but feel free to put other rules in the config if you have strong opinions on something.

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    ๐Ÿ–– Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. ๐Ÿ“Š๐Ÿ“ˆ๐ŸŽ‰

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google โค๏ธ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.