
tracingplane-java's Introduction

brown.tracingplane

1 Quick Start

Documentation and tutorials for this project are located in the Tracing Plane Wiki; see also the Tracing Plane Javadocs.

2 Concepts

The Tracing Plane introduces two key abstractions: BaggageContexts and Execution-Flow Scoped Variables.

2.1 BaggageContext

A BaggageContext is a general-purpose request context, intended to be used within and across distributed services.

For example, a request in a microservices environment might involve multiple services, which make calls across the network to each other.

For each request, a BaggageContext carries request metadata (things like request IDs, tags, etc.). Its goal is to be passed alongside the request while it executes.

It's very useful to pass around BaggageContext objects at runtime. They are used by a range of debugging and monitoring tools, the classic example being distributed tracing (such as Zipkin, OpenTracing, and Dapper). Other examples include resource management and dynamic monitoring. We use the name tracing tools to refer to such tools.

2.2 Execution-Flow Scoped Variables

An execution-flow scoped variable is similar, in concept, to a thread-local variable. However, instead of being dynamically scoped to threads, EFS variables are scoped to end-to-end requests.

EFS variables follow requests inline as they execute. The TraceID used by distributed tracing tools is an example of an execution-flow scoped variable.

Updates to an EFS variable occur locally to an EFS instance -- for example, if my request is doing several things concurrently, they might have different values for their respective EFS variables. This relates to the notion of causality -- EFS variables follow execution's causality. The SpanID used by distributed tracing tools is an EFS variable that demonstrates this -- several concurrent execution branches could be executing simultaneously, each with a different value for SpanID.
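The branching behavior described above can be sketched in a few lines of Java. This is only an illustration under stated assumptions: `EfsSketch`, `Context`, and `branch` are hypothetical names invented here, not the Tracing Plane API. The sketch shows a TraceID that stays the same for the whole request while each concurrent branch carries its own SpanID.

```java
import java.util.concurrent.CompletableFuture;

// Hypothetical stand-in for a request context; NOT the Tracing Plane API.
final class EfsSketch {
    /** Immutable context carrying two execution-flow scoped values. */
    static final class Context {
        final long traceId; // identical for every branch of the request
        final long spanId;  // may differ between concurrent branches

        Context(long traceId, long spanId) {
            this.traceId = traceId;
            this.spanId = spanId;
        }

        /** Gives a new execution branch its own copy with its own span ID. */
        Context branch(long newSpanId) {
            return new Context(traceId, newSpanId);
        }
    }

    /** Runs two concurrent branches and returns the span ID each one observed. */
    static long[] runTwoBranches(Context root) {
        Context left = root.branch(1L);
        Context right = root.branch(2L);
        CompletableFuture<Long> a = CompletableFuture.supplyAsync(() -> left.spanId);
        CompletableFuture<Long> b = CompletableFuture.supplyAsync(() -> right.spanId);
        return new long[] { a.join(), b.join() };
    }

    public static void main(String[] args) {
        Context root = new Context(42L, 0L);
        long[] seen = runTwoBranches(root);
        // Each branch sees its own span ID; the root context is unchanged.
        System.out.println(seen[0] + " " + seen[1] + " root=" + root.spanId);
    }
}
```

Because the context is immutable, an update in one branch cannot leak into a sibling branch, which is the property that distinguishes this from a shared, request-wide variable.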

The Tracing Plane exposes EFS variables with an interface definition language, called BDL (Baggage Definition Language), and a corresponding compiler. BDL is similar to protocol buffers, and generates accessors that interface with BaggageContext instances and encapsulate all of the concurrency and propagation nuances that are easy to get wrong.

3 Project Information

3.1 Project Goals

It's actually very hard to get context propagation right and very hard to deploy new tracing tools in today's distributed systems:

  • It's hard to instrument systems to pass around contexts, because it involves touching lots of little bits of code in many places. Instrumentation is hard -- only do it once.
  • It's hard to agree on context formats across system components, especially if they use different languages or frameworks. Agree on a general-purpose format -- bind specific tools to it later.
  • It's hard to get the behavior of contexts right -- for example, if a request has a high degree of fan-in, how do you reconcile mismatched IDs or context values? Encapsulate well-defined propagation behavior for data types.
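To make the fan-in question concrete, here is a minimal sketch of one well-defined propagation behavior: a set-valued field whose join is set union. The class and method names are hypothetical, and set union is an assumption chosen for illustration, not necessarily how the Tracing Plane merges any particular type.

```java
import java.util.Arrays;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical illustration of a merge rule for a set-valued context field.
final class FanInMergeSketch {
    /**
     * Joins the values carried by two branches. Set union is commutative,
     * associative, and idempotent, so the result does not depend on the
     * order in which branches arrive at the fan-in point.
     */
    static Set<String> join(Set<String> left, Set<String> right) {
        Set<String> merged = new TreeSet<>(left);
        merged.addAll(right);
        return merged;
    }

    public static void main(String[] args) {
        Set<String> a = new TreeSet<>(Arrays.asList("id-1", "id-2"));
        Set<String> b = new TreeSet<>(Arrays.asList("id-2", "id-3"));
        // Mismatched IDs from both branches are kept rather than discarded.
        System.out.println(join(a, b));
    }
}
```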

In general, we want a BaggageContext that can carry any tracing tool's data in a consistent way.

The Tracing Plane is a layered design for context propagation in distributed systems. It involves a data serialization format, a protocol for interpreting data, an interface definition language (called BDL -- Baggage Definition Language), and a compiler.

The Tracing Plane enables interoperability between systems and tracing applications. It provides a "narrow waist" for tracing, analogous to the role of TCP/IP in networking.

3.2 Similar Projects

We list some similar projects here to make it more concrete what the Tracing Plane is:

  • Go's context package provides request-scoped contexts that you pass around in Go programs. Similarly, we propose BaggageContext objects be passed around in the same way (though our API is slightly different).
  • Span contexts in Zipkin, OpenTracing and Dapper -- these pass around metadata (span and trace IDs) for distributed tracing. The goal of BaggageContext is to provide a well-defined, concrete data format that these tracing tools would be able to use, to store their IDs.
  • Instrumentation in OpenTracing is similar to instrumentation for the Tracing Plane. One difference is that BaggageContext instances are truly opaque at instrumentation time, and are passed across all execution boundaries (including, for example, in request responses); whereas OpenTracing spans are conceptually tightly bound to the task of distributed tracing.

3.3 Project Status

This is an active research project at Brown University by Jonathan Mace and Prof. Rodrigo Fonseca. This work is supported in part by NSF award 1452712, and by generous gifts from Facebook and Google.

The Tracing Plane is motivated by many years of collective experience in end-to-end tracing and numerous tracing-related research projects including X-Trace, Quanto, Retro, and Pivot Tracing. You can also check out our research group's GitHub.

We currently provide a Java implementation of the Tracing Plane. However, the Tracing Plane is not tightly coupled to Java, and is designed with interoperability in mind - between languages, systems, platforms, etc.

Keep an eye out for our research paper about Baggage, coming soon.

4 Other Useful Links

Project Wiki

Tracing Plane Javadoc

Example Zipkin / OpenTracing Tracers that are backed by BDL: github.com/JonathanMace/tracingplane-opentracing

tracingplane-java's People

Contributors

jonathanmace


tracingplane-java's Issues

One-of fields

One-of fields are useful, especially for use cases like supporting both 64-bit and 128-bit IDs.

Volatile fields

Volatile fields are fields that we only want one side of a branch to ever see.

When we branch a context, only one side of the branch will get the field; the other side will get null.

Volatile fields enable the implementation of state-based CRDTs based on vector clocks that have random component IDs; you declare a component id field that is volatile, and a map<id, ...> for the values.

Volatile fields are always transient, because we can't guarantee that an intermediary component will respect the volatile-ness.
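A minimal sketch of the proposed branch behavior follows. All names are hypothetical, and the choice that the first side keeps the field is an assumption; the proposal only requires that exactly one side does.

```java
// Hypothetical illustration of a volatile field; NOT generated Tracing Plane code.
final class VolatileFieldSketch {
    static final class Bag {
        /** Volatile field: at most one live branch holds it; null everywhere else. */
        final Long componentId;

        Bag(Long componentId) {
            this.componentId = componentId;
        }
    }

    /**
     * Branches a bag into two sides. Assumption: the first side keeps the
     * volatile field and the second side gets null. The parent should not
     * be used again after branching.
     */
    static Bag[] branch(Bag parent) {
        return new Bag[] { new Bag(parent.componentId), new Bag(null) };
    }

    public static void main(String[] args) {
        Bag[] sides = branch(new Bag(7L));
        System.out.println(sides[0].componentId + " " + sides[1].componentId);
    }
}
```

A side that receives null would pick a fresh random component ID before writing to the map, so that two branches never update the same entry.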

Baggage Buffers - Pending Features

The following features are yet to be implemented in Baggage Buffers:

  • Inline fields
  • Max and Min
  • Add/Subtract Counter
  • Take comments in .bb file and put them as class/field comments in generated code.

Documentation and Javadoc TODOs

README markdowns in root and project subdirs:

  • Readmes for root project and all subprojects. Figure out appropriate way of presenting the information.
  • Tracingplane organization documentation
  • Directory of resources (wiki pages, javadocs, package docs)

Javadocs

  • Package level documentation
  • Docs on all public classes
  • Fold some of the package doc README info into javadocs too

Wiki pages

  • One or several pages outlining researchy parts of tracing plane, mostly pulled from paper
  • High level overview of how to use tracingplane
  • Tutorial: Downloading / cloning / building / installing
  • Tutorial: set up a maven project or add jar to classpath
  • Tutorial: instrumenting a system (threads, runnables, callables); instrumenting a system (custom queues); instrumenting a system (network / io); advanced: using aspectj
  • FAQ page: what is the license? do you have systems instrumented already that I can plug into? is it compatible with Scala?

Other Project TODOs:

  • Make download links for distribution JARs with dependencies
  • Make license explicit somewhere
  • For distribution JARs, enumerate all dependencies and their licenses
  • Split out BDL compiler to separate project
  • Fix scala versions

Configurable Overflow Per-Bag

A bag could predefine the point at which overflow is acceptable, so that the bag itself can overflow without causing all subsequent bags to overflow

To achieve this, baggage buffers would add a way of annotating the acceptable size limit

Then in the baggage protocol, we would need a per-bag overflow marker. I think the best way of doing this would be to add an indexed child with no index specified (which would therefore be lexicographically less than the bag's other children)

Implement enums

Enum of values that can just map to integers

Can then be used for flag sets
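As a rough sketch of how enum values that map to integers could back a flag set, the following uses Java's standard `EnumSet`. The `Flag` values and the bit layout (one bit per ordinal) are assumptions made for illustration only.

```java
import java.util.EnumSet;

// Hypothetical illustration; the Flag values are invented for this example.
final class EnumFlagSketch {
    enum Flag { SAMPLED, DEBUG, PRIORITY }

    /** Packs a flag set into an int, one bit per enum ordinal. */
    static int encode(EnumSet<Flag> flags) {
        int bits = 0;
        for (Flag f : flags) {
            bits |= 1 << f.ordinal();
        }
        return bits;
    }

    /** Unpacks an int produced by encode back into a flag set. */
    static EnumSet<Flag> decode(int bits) {
        EnumSet<Flag> flags = EnumSet.noneOf(Flag.class);
        for (Flag f : Flag.values()) {
            if ((bits & (1 << f.ordinal())) != 0) {
                flags.add(f);
            }
        }
        return flags;
    }

    public static void main(String[] args) {
        int bits = encode(EnumSet.of(Flag.SAMPLED, Flag.PRIORITY));
        System.out.println(bits + " " + decode(bits));
    }
}
```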

End-to-end Tests

After migrating Hadoop instrumentation and brownsys/tracing, do some end-to-end tests to measure overheads and check correctness.

Accessor methods

In the current implementation, users manually manipulate object fields, because that was quicker to implement.

Ideally, it should be more protobuf-style, with getters, setters, and existence checkers.

This will also make it easier to provide things such as optionals and one-ofs.
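A protobuf-style accessor surface might look roughly like the sketch below. `ExampleBag` and `requestId` are hypothetical names, and the code is hand-written for illustration rather than output of the BDL compiler.

```java
// Hypothetical illustration of protobuf-style accessors; NOT compiler output.
final class AccessorSketch {
    static final class ExampleBag {
        private Long requestId; // null means "not set"

        /** Existence checker. */
        boolean hasRequestId() {
            return requestId != null;
        }

        /** Getter; returns 0 when the field is unset, as protobuf getters return a default. */
        long getRequestId() {
            return requestId == null ? 0L : requestId;
        }

        /** Setter; returns this so calls can be chained. */
        ExampleBag setRequestId(long value) {
            this.requestId = value;
            return this;
        }

        /** Clears the field back to the unset state. */
        ExampleBag clearRequestId() {
            this.requestId = null;
            return this;
        }
    }

    public static void main(String[] args) {
        ExampleBag bag = new ExampleBag().setRequestId(99L);
        System.out.println(bag.hasRequestId() + " " + bag.getRequestId());
    }
}
```

Keeping the field private is what leaves room for optionals and one-ofs later: the storage can change without breaking callers.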

Baggage Protocol bag options

Bag options could be specified as a separate child bag rather than in the header. This would be better for supporting changes to the options (e.g., if you want to specify a lower bag threshold).

Implement structs

When you want to merge multiple fields as a single entity, you want to use structs.

Structs also impose lower overhead.

Tracingplane Spec

We need a spec so that other people know what to implement if they want to integrate tracingplane into other languages.

Transient fields

Bags should support transient fields. Transient fields are omitted during serialization; that is, they only live within a single process.

Transient fields should behave like other fields -- e.g., they should be bags, and support merge/join.

Since transient fields are not serialized, transient fields do not need a field number. They could be declared like:

int64 myregularid = 0;
int64 mytransientid = transient;

That's a little ugly, but I think it'll do for now.
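On the Java side, the intended behavior of the two fields declared above could be sketched as follows. The class is hypothetical and the wire format (a single 8-byte long) is an assumption for illustration; it is not the baggage serialization format.

```java
import java.nio.ByteBuffer;

// Hypothetical illustration of a transient field; NOT the baggage wire format.
final class TransientFieldSketch {
    static final class Bag {
        long myRegularId;   // serialized
        long myTransientId; // process-local: never written to the wire

        /** Writes only the regular field. */
        byte[] serialize() {
            return ByteBuffer.allocate(Long.BYTES).putLong(myRegularId).array();
        }

        /** Reads the regular field; the transient field keeps its default. */
        static Bag deserialize(byte[] bytes) {
            Bag bag = new Bag();
            bag.myRegularId = ByteBuffer.wrap(bytes).getLong();
            return bag;
        }
    }

    public static void main(String[] args) {
        Bag local = new Bag();
        local.myRegularId = 5L;
        local.myTransientId = 6L;
        Bag remote = Bag.deserialize(local.serialize());
        System.out.println(remote.myRegularId + " " + remote.myTransientId);
    }
}
```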

Implement true/false flags

Implement true-wins and false-wins flags.

With inline data atoms, this can be a zero-length data atom whose presence indicates the value.
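The merge rules for the two flag types reduce to boolean OR and AND. The sketch below uses hypothetical method names and plain booleans, setting aside the data-atom encoding.

```java
// Hypothetical illustration of merge rules for boolean flags.
final class FlagMergeSketch {
    /** True-wins: the merged flag is true if either branch saw true. */
    static boolean mergeTrueWins(boolean left, boolean right) {
        return left || right;
    }

    /** False-wins: the merged flag is true only if both branches saw true. */
    static boolean mergeFalseWins(boolean left, boolean right) {
        return left && right;
    }

    public static void main(String[] args) {
        System.out.println(mergeTrueWins(true, false));  // prints "true"
        System.out.println(mergeFalseWins(true, false)); // prints "false"
    }
}
```

With the zero-length data atom encoding, a true-wins merge corresponds to keeping the atom if either side carries it.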

Split out repos

Split out compiler as a separate repo from the java library

Application - XTrace Lite

As an extension to tutorials, write an xtrace lite application that will aid transit layer instrumentation.
