
OpenTelemetry .NET Automatic Instrumentation


This project adds OpenTelemetry instrumentation to .NET applications without having to modify their source code.


Warning

The following documentation refers to the in-development version of OpenTelemetry .NET Automatic Instrumentation. Docs for the latest version (1.7.0) can be found on opentelemetry.io or in this repository.


Quick start

If you'd like to try the instrumentation on an existing application before learning more about the configuration options and the project, follow the instructions at Using the OpenTelemetry.AutoInstrumentation NuGet packages or use one of the install scripts described below.

To see the telemetry from your application directly on the standard output, set the following environment variables to true before launching your application:

  • OTEL_DOTNET_AUTO_LOGS_CONSOLE_EXPORTER_ENABLED
  • OTEL_DOTNET_AUTO_METRICS_CONSOLE_EXPORTER_ENABLED
  • OTEL_DOTNET_AUTO_TRACES_CONSOLE_EXPORTER_ENABLED
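For example, in a POSIX shell (the application name ./MyNetApp is hypothetical):

```shell
# Send telemetry from all three signals to standard output.
export OTEL_DOTNET_AUTO_LOGS_CONSOLE_EXPORTER_ENABLED=true
export OTEL_DOTNET_AUTO_METRICS_CONSOLE_EXPORTER_ENABLED=true
export OTEL_DOTNET_AUTO_TRACES_CONSOLE_EXPORTER_ENABLED=true

# Then launch your (hypothetical) application:
# ./MyNetApp
```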

For a demo using Docker Compose, clone this repository and follow examples/demo/README.md.

Components

OpenTelemetry .NET Automatic Instrumentation is built on top of OpenTelemetry .NET.

You can find all references in OpenTelemetry.AutoInstrumentation.csproj and OpenTelemetry.AutoInstrumentation.AdditionalDeps/Directory.Build.props.

To automatically instrument applications, the OpenTelemetry .NET Automatic Instrumentation does the following:

  1. Injects and configures the OpenTelemetry .NET SDK into the application.
  2. Adds OpenTelemetry Instrumentation to key packages and APIs used by the application.

You can enable the OpenTelemetry .NET Automatic Instrumentation as a .NET Profiler to inject additional instrumentations of this project at runtime, using a technique known as monkey-patching. When enabled, the OpenTelemetry .NET Automatic Instrumentation generates traces for libraries that don't already generate traces using the OpenTelemetry .NET SDK.

See design.md for an architectural overview.

Status

The versioning information and stability guarantees can be found in the versioning documentation.

Compatibility

OpenTelemetry .NET Automatic Instrumentation should work with all officially supported operating systems and versions of .NET.

The minimal supported version of .NET Framework is 4.6.2.

Supported processor architectures are x86, x64, and ARM64.

Note

The ARM64 build does not support CentOS-based images.

CI tests run against the following operating systems:

Instrumented libraries and frameworks

See config.md#instrumented-libraries-and-frameworks.

Get started

Considerations on scope

Instrumenting self-contained applications is supported through NuGet packages. Note that a self-contained application is automatically generated in .NET 7+ whenever the dotnet publish or dotnet build command is used with a Runtime Identifier (RID) parameter, for example when -r or --runtime is passed to the command.

Install

Download and extract the appropriate binaries from the latest release.

Note

The path where you put the binaries is referenced as $INSTALL_DIR.

Instrument a .NET application

When running your application, make sure to:

  1. Set the resources.
  2. Set the environment variables from the table below.
| Environment variable | .NET version | Value |
|---|---|---|
| COR_ENABLE_PROFILING | .NET Framework | 1 |
| COR_PROFILER | .NET Framework | {918728DD-259F-4A6A-AC2B-B85E1B658318} |
| COR_PROFILER_PATH_32 | .NET Framework | $INSTALL_DIR/win-x86/OpenTelemetry.AutoInstrumentation.Native.dll |
| COR_PROFILER_PATH_64 | .NET Framework | $INSTALL_DIR/win-x64/OpenTelemetry.AutoInstrumentation.Native.dll |
| CORECLR_ENABLE_PROFILING | .NET | 1 |
| CORECLR_PROFILER | .NET | {918728DD-259F-4A6A-AC2B-B85E1B658318} |
| CORECLR_PROFILER_PATH | .NET on Linux glibc | $INSTALL_DIR/linux-x64/OpenTelemetry.AutoInstrumentation.Native.so |
| CORECLR_PROFILER_PATH | .NET on Linux musl | $INSTALL_DIR/linux-musl-x64/OpenTelemetry.AutoInstrumentation.Native.so |
| CORECLR_PROFILER_PATH | .NET on macOS | $INSTALL_DIR/osx-x64/OpenTelemetry.AutoInstrumentation.Native.dylib |
| CORECLR_PROFILER_PATH_32 | .NET on Windows | $INSTALL_DIR/win-x86/OpenTelemetry.AutoInstrumentation.Native.dll |
| CORECLR_PROFILER_PATH_64 | .NET on Windows | $INSTALL_DIR/win-x64/OpenTelemetry.AutoInstrumentation.Native.dll |
| DOTNET_ADDITIONAL_DEPS | .NET | $INSTALL_DIR/AdditionalDeps |
| DOTNET_SHARED_STORE | .NET | $INSTALL_DIR/store |
| DOTNET_STARTUP_HOOKS | .NET | $INSTALL_DIR/net/OpenTelemetry.AutoInstrumentation.StartupHook.dll |
| OTEL_DOTNET_AUTO_HOME | All versions | $INSTALL_DIR |
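As a sketch, the .NET (non-Framework) settings above can be applied from a POSIX shell. The install location is an assumption, and the uname check below only distinguishes Linux glibc from macOS; musl-based distributions need the linux-musl-x64 path instead:

```shell
INSTALL_DIR="$HOME/otel-dotnet-auto"   # assumed extraction directory

export CORECLR_ENABLE_PROFILING=1
export CORECLR_PROFILER='{918728DD-259F-4A6A-AC2B-B85E1B658318}'

# Pick the native profiler binary for the current OS (x64 and glibc assumed).
case "$(uname -s)" in
  Linux)  export CORECLR_PROFILER_PATH="$INSTALL_DIR/linux-x64/OpenTelemetry.AutoInstrumentation.Native.so" ;;
  Darwin) export CORECLR_PROFILER_PATH="$INSTALL_DIR/osx-x64/OpenTelemetry.AutoInstrumentation.Native.dylib" ;;
esac

export DOTNET_ADDITIONAL_DEPS="$INSTALL_DIR/AdditionalDeps"
export DOTNET_SHARED_STORE="$INSTALL_DIR/store"
export DOTNET_STARTUP_HOOKS="$INSTALL_DIR/net/OpenTelemetry.AutoInstrumentation.StartupHook.dll"
export OTEL_DOTNET_AUTO_HOME="$INSTALL_DIR"
```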

Note

Some settings can be omitted on .NET. For more information, see config.md.

Important

Starting with .NET 8, setting the environment variable DOTNET_EnableDiagnostics=0 disables all diagnostics, including the CLR Profiler facility, which is needed to launch the instrumentation when not using .NET startup hooks. Ensure that DOTNET_EnableDiagnostics=1, or, if you'd like to limit diagnostics to the CLR Profiler only, set both DOTNET_EnableDiagnostics=1 and DOTNET_EnableDiagnostics_Profiler=1 while setting the other diagnostics features to 0. See this issue for more guidance.
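A minimal sketch of the second option, using only the two variable names given above (which other per-feature switches exist, and their exact names, should be checked against the linked issue):

```shell
# Keep diagnostics enabled, and explicitly keep the CLR Profiler facility on.
export DOTNET_EnableDiagnostics=1
export DOTNET_EnableDiagnostics_Profiler=1
# Other per-feature diagnostics switches can then be set to 0 as needed.
```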

Shell scripts

You can install OpenTelemetry .NET Automatic Instrumentation and instrument your .NET application using the provided Shell scripts.

Note

On macOS, coreutils is required.

Example usage:

# Download the bash script
curl -sSfL https://github.com/open-telemetry/opentelemetry-dotnet-instrumentation/releases/download/v1.6.0/otel-dotnet-auto-install.sh -O

# Install core files
sh ./otel-dotnet-auto-install.sh

# Enable execution for the instrumentation script
chmod +x $HOME/.otel-dotnet-auto/instrument.sh

# Setup the instrumentation for the current shell session
. $HOME/.otel-dotnet-auto/instrument.sh

# Run your application with instrumentation
OTEL_SERVICE_NAME=myapp OTEL_RESOURCE_ATTRIBUTES=deployment.environment=staging,service.version=1.0.0 ./MyNetApp

The otel-dotnet-auto-install.sh script uses the following environment variables as parameters:

| Parameter | Description | Required | Default value |
|---|---|---|---|
| OTEL_DOTNET_AUTO_HOME | Location where binaries are to be installed | No | $HOME/.otel-dotnet-auto |
| OS_TYPE | Possible values: linux-glibc, linux-musl, macos, windows | No | Calculated |
| ARCHITECTURE | Possible values for Linux: x64, arm64 | No | Calculated |
| TMPDIR | Temporary directory used when downloading the files | No | $(mktemp -d) |
| VERSION | Version to download | No | 1.6.0 |
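For example, the auto-detection can be overridden by exporting the parameters before running the script (the values below are illustrative, and the download itself is left commented out):

```shell
export OTEL_DOTNET_AUTO_HOME="$HOME/.otel-dotnet-auto"
export OS_TYPE="linux-glibc"
export ARCHITECTURE="x64"
export VERSION="1.6.0"

# sh ./otel-dotnet-auto-install.sh   # downloads and extracts into $OTEL_DOTNET_AUTO_HOME
```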

The instrument.sh script uses the following environment variables as parameters:

| Parameter | Description | Required | Default value |
|---|---|---|---|
| ENABLE_PROFILING | Whether to set the .NET CLR Profiler, possible values: true, false | No | true |
| OTEL_DOTNET_AUTO_HOME | Location where binaries are to be installed | No | $HOME/.otel-dotnet-auto |
| OS_TYPE | Possible values: linux-glibc, linux-musl, macos, windows | No | Calculated |
| ARCHITECTURE | Possible values for Linux: x64, arm64 | No | Calculated |
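For instance, to set up only the startup-hook-based instrumentation without the CLR Profiler (sourcing the script is left commented out here):

```shell
export ENABLE_PROFILING=false   # skip the .NET CLR Profiler setup

# . "$HOME/.otel-dotnet-auto/instrument.sh"
```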

PowerShell module (Windows)

On Windows, you should install OpenTelemetry .NET Automatic Instrumentation and instrument your .NET application using the provided PowerShell module. Example usage (run as administrator):

# Download the module
$module_url = "https://github.com/open-telemetry/opentelemetry-dotnet-instrumentation/releases/download/v1.7.0/OpenTelemetry.DotNet.Auto.psm1"
$download_path = Join-Path $env:temp "OpenTelemetry.DotNet.Auto.psm1"
Invoke-WebRequest -Uri $module_url -OutFile $download_path -UseBasicParsing

# Import the module to use its functions
Import-Module $download_path

# Install core files (online vs offline method)
Install-OpenTelemetryCore
Install-OpenTelemetryCore -LocalPath "C:\Path\To\OpenTelemetry.zip" 

# Set up the instrumentation for the current PowerShell session
Register-OpenTelemetryForCurrentSession -OTelServiceName "MyServiceDisplayName"

# Run your application with instrumentation
.\MyNetApp.exe

You can get usage information by calling:

# List all available commands
Get-Command -Module OpenTelemetry.DotNet.Auto

# Get command's usage information
Get-Help Install-OpenTelemetryCore -Detailed

Updating OpenTelemetry installation:

# Import the previously downloaded module. After an update the module is found in the default install directory.
# Note: It's best to use the same version of the module for installation and uninstallation to ensure proper removal.
Import-Module "C:\Program Files\OpenTelemetry .NET AutoInstrumentation\OpenTelemetry.DotNet.Auto.psm1"

# If IIS was previously registered, use RegisterIIS = $true.
Update-OpenTelemetryCore -RegisterIIS $true

# If Windows services were previously registered, these must be re-registered manually.
Unregister-OpenTelemetryForWindowsService -WindowsServiceName MyServiceName
Update-OpenTelemetryCore
Register-OpenTelemetryForWindowsService -WindowsServiceName MyServiceName -OTelServiceName MyOtelServiceName

Warning

The PowerShell module works only with PowerShell 5.1, which is the version installed by default on Windows.

Instrument a container

You can find our demonstrative example that uses Docker Compose here.

You can also consider using the Kubernetes Operator for OpenTelemetry Collector.

Instrument a Windows Service running a .NET application

See windows-service-instrumentation.md.

Instrument an ASP.NET application deployed on IIS

See iis-instrumentation.md.

Configuration

See config.md.

Manual instrumentation

See manual-instrumentation.md.

Log to trace correlation

See log-trace-correlation.md.

Troubleshooting

See troubleshooting.md.

Contact

See CONTRIBUTING.md.

Contributing

See CONTRIBUTING.md.

Community Roles

Maintainers (@open-telemetry/dotnet-instrumentation-maintainers):

Approvers (@open-telemetry/dotnet-instrumentation-approvers):

Emeritus Maintainer/Approver/Triager:

Learn more about roles in the community repository.


opentelemetry-dotnet-instrumentation's Issues

Add OTel events support and exception semantic convention

Run containers during test execution

Why

  1. Should use only the resources needed for particular tests → no need to spin up all containers upfront
  2. Possibility just to execute tests using dotnet test or from IDE
  3. Possibility to debug tests

What

PoC

#56

Dependencies

Would be more effective to first do #109

Support for monitoring SharePoint

Are you requesting automatic instrumentation for a framework or library? Please describe.

  • Framework or library name: SharePoint

Is your feature request related to a problem? Please describe.
This feature request is meant to explore how we could support monitoring a system like SharePoint.

Describe the solution you'd like
SharePoint allows you to extend its functionality by adding additional WCF services. To support SharePoint, we would need to be able to instrument WCF services and ASP.NET requests. We need to find out if SharePoint already includes a dependency on System.Diagnostics.DiagnosticSource. If it does include that dependency, is there a way to configure a binding redirect so that we can use an appropriate version of that dependency? If someone extends SharePoint and includes that dependency, how can we ensure that a binding redirect gets configured?

Hardcoded name of the output file in the code

Describe the bug
There is a hardcoded output file name here. It was confirmed that it causes client applications with profiler attached to crash when the .dylib file's name is changed after compilation - see this comment for reference.

To Reproduce
Steps to reproduce the behavior:

  1. Install profiler on macOS
  2. Rename .dylib file
  3. Set new name in CORECLR_PROFILER_PATH
  4. Run application
  5. Application crashes

Expected behavior
I assume the name of the profiler file should not affect how it works, and it definitely shouldn't crash client apps.


Runtime environment (please complete the following information):

  • Instrumentation mode: Used files compiled on my machine
  • OS: macOS Catalina Version 10.15.7 (19H524)
  • .NET version: .NET 5

Full support for .NET Activities & deprecate proprietary Datadog.Trace library

Today we use a proprietary library (Datadog.Trace) to start and stop spans and to perform related tasks. This library is used both for our auto-instrumentation and for customer-created spans (where required).

.NET has an API to represent spans (Activity). It is part of all .NET Core versions that we support and is distributed as a NuGet package for .NET Framework. We are going to transition toward using the .NET API instead of Datadog's proprietary library.

Activity is also the underlying data exchange mechanism for OTel tracing. Using it will ensure that our auto-instrumentation can be mixed and matched with both OTel API-based telemetry and vendor-agnostic .NET API-based telemetry.

This document outlines the plan and discusses the risks and benefits.

Integrations that will be enabled once we start using activities

It would be nice to compile a list of such libraries for reference purposes. Here is a preliminary / incomplete list.
@cijothomas , @lmolkova , all - did you happen to have a pointer to a more complete list?

Update: Created #21 specifically to list/discuss libraries/components that support Activity or DS.

A number of client libraries will "light up" without the need to develop integrations. This includes all libraries that use OTel APIs and/or .NET Activities.

  • ASP.NET WebForms (4.5+)
  • ASP.NET MVC (4+)
  • ASP.NET WebAPI (4.5+)
  • ASP.NET Core (1.1+)
  • HttpClient (4.5+), and .NET Core (1.1+)
  • HttpWebRequest
  • SqlClient .NET Core (1.0+), NuGet 4.3.0
  • Microsoft.Data.SqlClient (1.1.0)
  • Azure EventHubs Client SDK (1.1.0)
  • Azure ServiceBus Client SDK (3.0.0)

Engineering plan

Phase I (completed):

  1. Proof of concept: https://github.com/DataDog/dd-trace-dotnet/tree/gregp/hack-a-dog-202007-diagnosticsource-activity/src/Datadog.Trace.ClrProfiler.Managed/ActivityBasedTracing

Phase II:

  1. Solve the System.Diagnostics.DiagnosticSource.dll versioning problem.
    Applications are likely to have a dependency on that library. The activity collector and the instrumentation libraries also need to reference that library. We need to ensure that no versioning conflict affects the customer apps. See the section on Diagnostic Source assembly versioning for a detailed discussion.

  2. Start using the dynamic shim (described below) in the places where we are already using Diagnostic Source in the Tracer. This will address the risk of version conflicts early and expose the shim component to real usage.

Phase III:

  1. Create an Activity collector that does not take dependency on any Datadog libraries and uses a dependency-free configuration API. Serialization / exporter support will be based on OpenTelemetry SDK concepts, but will be optimized for the Tracer use-case.

  2. Create a new version of Datadog.Trace library that fires Activities internally. Use that version for all auto-instrumentation.

Phase IV:

  1. Beta. We can start testing internally and with customers using a feature flag. However, we will have suboptimal performance since we will have the overhead of creating unnecessary Span objects.

  2. One by one, change each integration to emit Activities instead of Spans. As we do this, we will gradually remove the superfluous performance overhead caused by that redirection. We expect different OTel vendors who are taking advantage of this feature to actively participate in this porting effort.

Risks and mitigations

  • Servicing.
    If an issue is discovered in System.Diagnostics.DiagnosticSource.dll, we depend on Microsoft to fix it.

    • An issue in this library is possible, but highly unlikely. That code is used by millions of applications across the .NET universe. If an issue is critical (e.g. security), Microsoft is likely to fix it quickly.
  • Dependency versioning.
    The tracer will take a dependency on System.Diagnostics.DiagnosticSource.dll and applications are likely to also reference that library. Assembly resolution in .NET is complex. We will need to make sure that the right version is always loaded.

    • This problem is shared by all other standard-conform tracing solutions. Microsoft’s Runtime Diagnostics team and the Azure Monitor team have experience solving this issue and they offered consulting services in the context of the OTel forum.
    • We already have a prototype that passes tests. We reviewed the solution strategy with selected team members and with the Microsoft Runtime Diagnostics team. We will review the approach with the rest of the team before final implementation.
  • Performance.
    Based on visual code analysis we expect that performance will improve. We will remove several locks and object creations on the hot path. However, this is not certain.

    • We have benchmarks in place to track resource utilization and how it changes with this change.
  • Migration of custom code using Datadog.Trace.
    Customers who have instrumented their code using Datadog.Trace will need to migrate to .NET Activities in the long term.

    • Preliminary migration plan:
  1. Announce the change. Warn the customers that they will need to upgrade or otherwise their custom spans will stop working at a certain date. Upgrade options:
    (a) Move to a new version of Datadog.Trace. That will require no code change other than referencing the new version. This will have a small perf impact, but code will continue working. This will be the version mentioned in Step 3 of the Engineering Plan. Customers will need to be aware that this version will exist for back-compatibility purposes only and that we will no longer improve it. Future features will be based on Activities (see b below).
    (b) Use .NET Activities (or OTel API, which are also using Activities under the hood). This will require code changes. However, we expect perf improvements and going forward all features will be based on that.
  2. This is the above-mentioned Datadog.Trace version that uses Activities under the hood. At the same time, start emitting appropriate deprecation warnings from other library versions.

Additional questions related to risks & mitigations

  • What will happen with the OpenTracing support?

    • OpenTracing is being superseded by OTel
  • (DD Vendor specific) How will we deal with trace ids and span ids that use the W3C Trace Context standard format? This is the default format for Activities but is not supported yet by Datadog afaik

    • W3C spec includes a description of how to deal with vendor-specific formats. In a nutshell, we will add a tag on Activity to store our IDs and we will ALSO populate the W3C field as described by that spec for such cases.

Diagnostic Source assembly versioning

In order to work with Activity-based telemetry, different Tracer components (most importantly, the instrumentation library, and the telemetry collector) need to use types located in the System.Diagnostics.DiagnosticSource.dll assembly (we refer to it as DiagnosticSource.dll or just DS.dll for brevity). This assembly is commonly used by applications. If Tracer components statically reference DS, and the version referenced is not compatible with the version referenced by the application, a versioning conflict will occur and the application will crash.

To avoid this, we will not statically reference DS. Instead, we will use it via a reflection wrapper (we will refer to it here as the Dynamic Loader, or DL). DL will expose the public APIs required by any Tracer component that are normally exposed by DS. For example: Start-Activity, Stop-Activity, Subscribe-Activity-Listener, Get-Activity-Properties and so on. The DL will load a version of DS that is guaranteed to be compatible with the application at runtime, and it will forward the respective API calls via cached delegates. The delegates will be created via reflection and compiled Expressions.

The application itself will use DS in a normal manner and will not be aware of the DL. If an application inspects the Activity stack, it will see both activities started by the application and activities started by auto-instrumentation in a consistent manner.

To ensure that the DL always loads an appropriate version of the DS assembly, we will use the strategy described below.

Vendoring-in DS

We will vendor a copy of DS into DL. This means we will copy the respective sources from the dotnet runtime and build them into the DL library. We will use this vendored copy as a “fallback-DS” in cases where no version of DS can be loaded.

Initially, we will use DS that will be included with .NET 5, and we will update it if it becomes necessary, or if a critical security or performance patch is released.

We describe how the fallback-DS is used below. In the algorithm below, it helps avoid any potential versioning conflicts. The drawback is that in certain situations this strategy will prevent auto-instrumentation-based Activities and other Activities emitted by the application from "seeing" each other: the tracer will correctly collect all Activities resulting from auto-instrumentation but miss any other activities, and the application components will miss any auto-instrumentation-based activities. However, this will only occur in well-defined, very rare edge cases, where instrumentation can be expected to be partially broken anyway.

Versioning background

Version 4.0.2.0 of the System.Diagnostics.DiagnosticSource assembly is the first official version of that assembly that contains the Activity type (previous versions contained DiagnosticSource only).
That version was shipped in the System.Diagnostics.DiagnosticSource NuGet version 4.4.0 on 2017-06-28.

It is highly unlikely that an application references an older version of DS; e.g., any ASP.NET Core application will reference at least NuGet 4.4.1, and our current Tracer for .NET Core also requires 4.4.1. However, it is not entirely impossible for an application to statically reference an older DS version. Currently, we would crash. This solution avoids crashes in this scenario (and in several others).

DS loading algorithm

  • When the Tracer initializes, DL will execute the Try-Using-Loaded-DS routine.

The Try-Using-Loaded-DS routine results in one of:

  • a reference to a loaded and "usable" DS assembly instance
  • a reference to the vendored fallback-DS
  • null (DS was not loaded, but it may be loaded using another method)

{

  • Inspect all loaded assemblies.

  • If DS is loaded in a way that is not guaranteed to work well with auto-instrumentation, we will gracefully abort trying to load DS. We will log a message and resort to using the vendored fallback-DS. Specifically, we handle the following cases:

    • A DS version 4.0.2.0 (or later) is loaded, but it is not loaded into the default load context. (Today, this would lead to inconsistent trace hierarchy graphs.)
    • A DS version 4.0.2.0 (or later) is loaded, more than once in any load context. (Today, this would lead to inconsistent trace hierarchy graphs.)
    • A DS version prior to 4.0.2.0 is loaded. (Today, this would cause the application to crash).
  • If a DS version 4.0.2.0 (or later) is loaded exactly once and it is loaded into the default load context, we will use that instance of the assembly. DL will use reflection to load all required types and Expressions to create and cache delegates for API forwarding.

  • If DS is not loaded, we will make a note of it so that other methods can be used.

}

  • If the Try-Using-Loaded-DS routine results in either a proper or a fallback DS instance, we will use that. If it returns null, we will request the runtime to load DS, without specifying any particular assembly version.

  • This is because it is possible that the Tracer is initializing early in the application process and the application was going to load DS but has not done so yet. We want to load whatever version of DS the application would have loaded if the Tracer were not attached. By not specifying a particular DS version, we cause the runtime to perform the assembly resolution process and search the assembly probing path according to the logic of whichever .NET version is running.

  • After the explicit request to load DS completes, we will execute the Try-Using-Loaded-DS routine again to see if the result is “valid”. If we detect that DS is still not loaded (e.g. it was not on the assembly resolution path) we will use the vendored fallback.

  • During the lifetime of the application we will use the AppDomain’s AssemblyLoad event to receive a notification if a version of the DS assembly is loaded at any time. Each time it happens, we will invoke the Try-Using-Loaded-DS routine:

    • If at any time we detect that there is a single instance of DS loaded that is "valid" as per the Try-Using-Loaded-DS routine, we will switch to using that DS instance.
    • If at any time after the initial explicit load attempt we detect that DS is either not loaded or loaded in an invalid way as per the Try-Using-Loaded-DS routine, we will switch to using the fallback-DS.
    • We explicitly accept that several milliseconds of connected spans can be lost if such transition occurs.

Remove deprecated packages

There are a couple of projects that are marked as deprecated by Datadog but were brought over as part of the seeding of this repository. Since those projects are not necessary anymore and this OpenTelemetry project does not have to maintain backwards compatibility with them, we should remove them from this repo.

  • src/Datadog.Trace.AspNet/Datadog.Trace.AspNet.csproj
  • src/Datadog.Trace.ClrProfiler.Managed/Datadog.Trace.ClrProfiler.Managed.csproj

Add OTLP exporter support

This feature request is meant to track our overall OTLP support. The specification can be found at https://github.com/open-telemetry/opentelemetry-specification/tree/main/specification/protocol.

The specification calls out 2 different versions of the exporter:

  1. OTLP/gRPC
  2. OTLP/HTTP

Most languages already have support for OTLP/gRPC, but adding support for gRPC in this project will be harder because of the native dependencies required to support .NET Framework and .NET Standard 2.0 runtimes. .NET Standard 2.1 offers a fully managed implementation of gRPC that we can leverage to avoid the native code dependency. However, the performance of that fully managed library (at least under certain circumstances) was not as good on Linux for .NET Core 3.1 applications. I believe that this performance was improved with .NET 5.

OTLP/HTTP will likely be easier to support because we'll only have the protobuf dependency which does not require a native library.

Remove DDSpanContextPropagator

OpenTelemetry spec says:

Additional Propagators implementing vendor-specific protocols such as AWS X-Ray trace header protocol MUST NOT be maintained or distributed as part of the Core OpenTelemetry repositories.

As this repo qualifies as "Core OpenTelemetry repository" (correct me if I am wrong), it must not contain vendor-specific propagators.

Change version to 0.0.1

What

  • Change the version of the project to 0.0.1 or something similar.
  • Document how to update the version in docs/RELEASING.md. More: #134 (comment)

Why

We have inherited the version from the Datadog repo. We have to start versioning this repo on its own.

Add possibility to add optional OTel HTTP attributes

Is your feature request related to a problem? Please describe.
Add the possibility to add optional OTel HTTP attributes like http.scheme, http.host, and http.target via some configuration.
Related comment: #63 (comment)

Describe the solution you'd like
The configuration can be added via TracerSettings behind a configuration key like OTEL_HTTP_OPTIONAL_ATTRIBUTES (boolean type).
Here are the attributes which could be restored: b333cad

Describe alternatives you've considered
Do not add this feature at all.

Additional context
By default, the optional OTel HTTP attributes should NOT be added (because of additional performance cost).

OTel conventions should be the default

What
Change the profiler to use OTel conventions by default.

Why
For now, outbound HTTP, trace ID, and the exporter all use Datadog conventions by default. That should not be the case, as this is an OTel repo. We shouldn't even implement things here that are not in scope of the OTel documentation.

[OTel Org] Add labels to entry level tasks for new contributors

Hi,

based on open-telemetry/community#469 I have added open-telemetry/opentelemetry-dotnet-instrumentation to Up For Grabs:

https://up-for-grabs.net/#/filters?names=646

There are currently no issues with the label help wanted. Please add this label to your entry-level tasks so that people can easily find a way to contribute.

If "help wanted" is not the right label, let me know and I can change it (e.g. to "good first issue" or "up-for-grabs"), or you can provide a pull request by editing https://github.com/up-for-grabs/up-for-grabs.net/blob/gh-pages/_data/projects/opentelemetry-dotnet-instrumentation.yml

Thanks!

Minimize usage of singletons

Why

Overuse of singletons, like here, makes the code very tightly coupled and almost impossible to unit test.

What

Refactor the code to get rid of as many singletons as possible. Where applicable, it would be good to use dependency injection (e.g., via constructor). It would also be good to minimize the number of constructors: each type should ideally have one constructor, or two if a second is needed just for the sake of testing internals.

For example, classes like WebRequest_GetResponseAsync_Integration could have a constructor accepting a tracer and a second, parameterless one which injects Tracer.Instance. The tracer can then be passed when invoking WebRequestCommon.GetResponse_OnMethodBegin(). However, we would also need to make sure that OnMethodBegin and OnMethodEnd are NOT static (and that instrumentation still works).

Add license headers for C# source files

Why

https://github.com/open-telemetry/community/blob/main/CONTRIBUTING.md#code-attribution

What

  1. License information should be included in all source files where applicable.
  2. Add a CI check to validate if all (applicable) source files have license information (at least C# and C++ source code)
  3. Remove from src/Datadog.Trace/GlobalSuppressions.cs:
// Copyright claims are suppressed until re-naming is finalized
[assembly: SuppressMessage("StyleCop.CSharp.DocumentationRules", "SA1636:File header copyright text must match", Justification = "Reviewed.")]
[assembly: SuppressMessage("StyleCop.CSharp.DocumentationRules", "SA1641:File header company name text must match", Justification = "Reviewed.")]
[assembly: SuppressMessage("StyleCop.CSharp.DocumentationRules", "SA1638:File header file name documentation must match file name", Justification = "Reviewed.")]
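The CI check from item 2 could be a small scan over the tree. A hedged C# sketch (the expected header text, type name, and directory layout are assumptions, not the project's actual values):

```csharp
using System.IO;
using System.Linq;

public static class LicenseHeaderCheck
{
    // Assumed header; the real text must match the repository's license policy.
    private const string ExpectedHeader = "// Copyright The OpenTelemetry Authors";

    // Returns the paths of all .cs files under 'root' whose first line lacks the header.
    public static string[] FindFilesMissingHeader(string root) =>
        Directory.EnumerateFiles(root, "*.cs", SearchOption.AllDirectories)
            .Where(f => File.ReadLines(f).FirstOrDefault()?.StartsWith(ExpectedHeader) != true)
            .ToArray();
}
```

CI would fail the build whenever the returned array is non-empty, printing the offending paths.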

Add automated release

What
Add a trigger in the GH Actions workflow that will run the build on tag push and create a GH release.

Why
We need to have an automated release process.

Dll metadata needs updating

src/Datadog.Trace.ClrProfiler.Native/Resource.rc still contains some Datadog references that should be updated for the OpenTelemetry build of the project.

CI is unable to build Vendored.System.Diagnostics.DiagnosticSource.Tests.csproj

When working on #48 I got the following error in Azure Pipelines:

/opt/hostedtoolcache/dotnet/dotnet build /home/vsts/work/1/s/test/Vendored.System.Diagnostics.DiagnosticSource.Tests/Vendored.System.Diagnostics.DiagnosticSource.Tests.csproj -dl:CentralLogger,"/home/vsts/work/_tasks/DotNetCoreCLI_5541a522-603c-47ad-91fc-a4b1d163081b/2.181.0/dotnet-build-helpers/Microsoft.TeamFoundation.DistributedTask.MSBuild.Logger.dll"*ForwardingLogger,"/home/vsts/work/_tasks/DotNetCoreCLI_5541a522-603c-47ad-91fc-a4b1d163081b/2.181.0/dotnet-build-helpers/Microsoft.TeamFoundation.DistributedTask.MSBuild.Logger.dll" /nowarn:netsdk1138
Microsoft (R) Build Engine version 16.8.3+39993bd9d for .NET
Copyright (C) Microsoft Corporation. All rights reserved.

  Determining projects to restore...
  Restored /home/vsts/work/1/s/test/Vendored.System.Diagnostics.DiagnosticSource.Tests/Vendored.System.Diagnostics.DiagnosticSource.Tests.csproj (in 1.03 sec).
##[error]test/Vendored.System.Diagnostics.DiagnosticSource.Tests/Vendored.System.Diagnostics.DiagnosticSource.Tests.csproj(0,0): Error NU1608: Detected package version outside of dependency constraint: xunit.core 2.3.1 requires xunit.extensibility.core (= 2.3.1) but version xunit.extensibility.core 2.4.1 was resolved.
/home/vsts/work/1/s/test/Vendored.System.Diagnostics.DiagnosticSource.Tests/Vendored.System.Diagnostics.DiagnosticSource.Tests.csproj : error NU1608: Detected package version outside of dependency constraint: xunit.core 2.3.1 requires xunit.extensibility.core (= 2.3.1) but version xunit.extensibility.core 2.4.1 was resolved.
##[error]test/Vendored.System.Diagnostics.DiagnosticSource.Tests/Vendored.System.Diagnostics.DiagnosticSource.Tests.csproj(0,0): Error NU1608: Detected package version outside of dependency constraint: xunit.core 2.3.1 requires xunit.extensibility.execution (= 2.3.1) but version xunit.extensibility.execution 2.4.1 was resolved.
/home/vsts/work/1/s/test/Vendored.System.Diagnostics.DiagnosticSource.Tests/Vendored.System.Diagnostics.DiagnosticSource.Tests.csproj : error NU1608: Detected package version outside of dependency constraint: xunit.core 2.3.1 requires xunit.extensibility.execution (= 2.3.1) but version xunit.extensibility.execution 2.4.1 was resolved.
##[error]test/Vendored.System.Diagnostics.DiagnosticSource.Tests/Vendored.System.Diagnostics.DiagnosticSource.Tests.csproj(0,0): Error NU1107: Version conflict detected for xunit.assert. Install/reference xunit.assert 2.4.1 directly to project Vendored.System.Diagnostics.DiagnosticSource.Tests to resolve this issue. 
 Vendored.System.Diagnostics.DiagnosticSource.Tests -> Microsoft.DotNet.RemoteExecutor 5.0.0-beta.20501.7 -> xunit.assert (>= 2.4.1) 
 Vendored.System.Diagnostics.DiagnosticSource.Tests -> xunit 2.3.1 -> xunit.assert (= 2.3.1).
/home/vsts/work/1/s/test/Vendored.System.Diagnostics.DiagnosticSource.Tests/Vendored.System.Diagnostics.DiagnosticSource.Tests.csproj : error NU1107: Version conflict detected for xunit.assert. Install/reference xunit.assert 2.4.1 directly to project Vendored.System.Diagnostics.DiagnosticSource.Tests to resolve this issue. 
/home/vsts/work/1/s/test/Vendored.System.Diagnostics.DiagnosticSource.Tests/Vendored.System.Diagnostics.DiagnosticSource.Tests.csproj : error NU1107:  Vendored.System.Diagnostics.DiagnosticSource.Tests -> Microsoft.DotNet.RemoteExecutor 5.0.0-beta.20501.7 -> xunit.assert (>= 2.4.1) 
/home/vsts/work/1/s/test/Vendored.System.Diagnostics.DiagnosticSource.Tests/Vendored.System.Diagnostics.DiagnosticSource.Tests.csproj : error NU1107:  Vendored.System.Diagnostics.DiagnosticSource.Tests -> xunit 2.3.1 -> xunit.assert (= 2.3.1).
  Failed to restore /home/vsts/work/1/s/test/Vendored.System.Diagnostics.DiagnosticSource.Tests/Vendored.System.Diagnostics.DiagnosticSource.Tests.csproj (in 3.81 sec).
  1 of 3 projects are up-to-date for restore.

Build FAILED.

Example: https://dev.azure.com/opentelemetry/pipelines/_build/results?buildId=2572&view=logs&j=908793bb-4951-51df-c238-a3e8efd99439&t=672be53d-32c4-5a9a-40be-231648dc190c&l=331

For me, the build has always worked both on macOS and Linux (inside a container), especially since there is a test/Vendored.System.Diagnostics.DiagnosticSource.Tests/Directory.build.props. Note that the pipeline passed on Windows (sic!).

The desired outcome is that !test/Vendored.System.Diagnostics.DiagnosticSource.Tests/Vendored.System.Diagnostics.DiagnosticSource.Tests.csproj is removed from .azure-pipelines/unit-tests.yml and the build passes.

CI is not running due to lack of required pool

The CI configuration requires a pool, Arm64, that doesn't exist for the OTel repo. So far I haven't been able to locate someone with the rights to create a similar pool for OTel. Lacking that, we may have to disable the tests using this pool for the time being.

Technical plan for using OTel semantic conventions

OTel recommendations for semantic conventions around data transport and layout differ from how they were historically used in the Datadog tracer that was used as a basis for the OTel tracer. Examples are tag names, span names, etc.

We need to allow for a slow and steady transition towards OTel conventions without breaking the existing data path, as current vendors may not be able to adjust their backends or transition all of their customers.

The scope of this work is to:

  • Understand the delta between what we currently have in the Tracer and what we should have for OTel.
  • Define and document an architecture that will allow us to use OTel conventions (preferably) but not be blocked on modifying all integrations. It should also allow existing exporter(s) to continue sending data unchanged, to remain backward-compatible.

Disable / Remove config autoloading

Background

GlobalSettings introduces some automatic config loading. This could lead to serious security issues and also makes bug tracking harder.
We should either disable that feature (make it optional) or remove it completely. See discussion at #129 (review)

See "datadog.json" part:

            var configurationFileName = configurationSource.GetString(ConfigurationKeys.ConfigurationFileName) ??
                                        configurationSource.GetString("OTEL_DOTNET_TRACER_CONFIG_FILE") ??
                                        Path.Combine(GetCurrentDirectory(), "datadog.json");
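One way to make the file loading opt-in is to drop the implicit datadog.json fallback and only honor an explicitly set variable. A hedged sketch (the variable name is taken from the snippet above; the class and method shape are assumptions):

```csharp
using System;
using System.IO;

public static class OptInFileConfiguration
{
    // Returns the configuration file path, or null when the feature is disabled.
    public static string GetConfigurationFilePath()
    {
        string path = Environment.GetEnvironmentVariable("OTEL_DOTNET_TRACER_CONFIG_FILE");

        // No implicit fallback to "<cwd>/datadog.json": an unset variable means disabled.
        if (string.IsNullOrEmpty(path))
        {
            return null;
        }

        return File.Exists(path) ? path : null;
    }
}
```

With this shape, nothing is loaded from the current directory unless the user explicitly opted in.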

📌 Maintain a catalogue of libraries with known explicit telemetry

Hey team!

As discussed in the SIG meeting, it would be useful to create and maintain a catalogue of libraries / APIs / components that emit telemetry out of the box using DiagnosticSource and/or Activities (or some other well-known method).

For easier commenting / editing I have created this doc.
Please help completing it by adding information about the listed libraries and by adding new libraries to the list.
Please feel free to correct things if something is wrong.

Any feedback is welcome!

Flaky test: RuntimeMetrics.RuntimeMetricsWriterTests.ShouldCaptureFirstChanceExceptions

The test is failing many times in CI:

2021-03-03T20:24:39.5684852Z [xUnit.net 00:00:11.3384586]     Datadog.Trace.Tests.RuntimeMetrics.RuntimeMetricsWriterTests.ShouldCaptureFirstChanceExceptions [FAIL]
2021-03-03T20:24:40.9605968Z   Failed Datadog.Trace.Tests.RuntimeMetrics.RuntimeMetricsWriterTests.ShouldCaptureFirstChanceExceptions [46 ms]
2021-03-03T20:24:40.9607521Z   Error Message:
2021-03-03T20:24:40.9608473Z    Moq.MockException : 
2021-03-03T20:24:40.9609815Z Expected invocation on the mock should never have been performed, but was 2 times: s => s.Increment("runtime.dotnet.exceptions.count", It.IsAny<Int32>(), It.IsAny<Double>(), It.IsAny<String[]>())
2021-03-03T20:24:40.9611297Z No setups configured.
2021-03-03T20:24:40.9612154Z 
2021-03-03T20:24:40.9612520Z Performed invocations: 
2021-03-03T20:24:40.9613149Z IDogStatsd.Gauge("runtime.dotnet.threads.count", 40, 1, null)
2021-03-03T20:24:40.9613627Z IDogStatsd.Gauge("runtime.dotnet.mem.committed", 202768384, 1, null)
2021-03-03T20:24:40.9614132Z IDogStatsd.Gauge("runtime.dotnet.cpu.user", -15625, 1, null)
2021-03-03T20:24:40.9614616Z IDogStatsd.Gauge("runtime.dotnet.cpu.system", 0, 1, null)
2021-03-03T20:24:40.9615154Z IDogStatsd.Gauge("runtime.dotnet.cpu.percent", -390.6, 1, null)
2021-03-03T20:24:40.9615866Z IDogStatsd.Increment("runtime.dotnet.exceptions.count", 1, 1, ["exception_type:SocketException"])
2021-03-03T20:24:40.9616628Z IDogStatsd.Increment("runtime.dotnet.exceptions.count", 5, 1, ["exception_type:WebException"])
2021-03-03T20:24:40.9617276Z   Stack Trace:
2021-03-03T20:24:40.9617972Z      at Moq.Mock.ThrowVerifyException(MethodCall expected, IEnumerable`1 setups, IEnumerable`1 actualCalls, Expression expression, Times times, Int32 callCount)
2021-03-03T20:24:40.9618830Z    at Moq.Mock.VerifyCalls(Mock targetMock, MethodCall expected, Expression expression, Times times)
2021-03-03T20:24:40.9619554Z    at Moq.Mock.Verify[T](Mock`1 mock, Expression`1 expression, Times times, String failMessage)
2021-03-03T20:24:40.9620174Z    at Moq.Mock`1.Verify(Expression`1 expression, Times times)
2021-03-03T20:24:40.9620789Z    at Moq.Mock`1.Verify(Expression`1 expression, Func`1 times)
2021-03-03T20:24:40.9622069Z    at Datadog.Trace.Tests.RuntimeMetrics.RuntimeMetricsWriterTests.ShouldCaptureFirstChanceExceptions() in D:\a\1\s\test\Datadog.Trace.Tests\RuntimeMetrics\RuntimeMetricsWriterTests.cs:line 90
2021-03-03T20:24:43.6757048Z Results File: D:\a\_temp\VssAdministrator_WIN-4TJ496I205J_2021-03-03_20_24_30.trx
2021-03-03T20:24:43.6758094Z 
2021-03-03T20:24:43.6811170Z Failed!  - Failed:     1, Passed:   396, Skipped:     1, Total:   398, Duration: 30 s - Datadog.Trace.Tests.dll (net461)

2021-03-02T19:18:55.1876354Z [xUnit.net 00:00:10.4189318]     Datadog.Trace.Tests.RuntimeMetrics.RuntimeMetricsWriterTests.ShouldCaptureFirstChanceExceptions [FAIL]
2021-03-02T19:18:55.2040127Z   Failed Datadog.Trace.Tests.RuntimeMetrics.RuntimeMetricsWriterTests.ShouldCaptureFirstChanceExceptions [54 ms]
2021-03-02T19:18:55.2042995Z   Error Message:
2021-03-02T19:18:55.2045234Z    Moq.MockException : 
2021-03-02T19:18:55.2046916Z Expected invocation on the mock should never have been performed, but was 3 times: s => s.Increment("runtime.dotnet.exceptions.count", It.IsAny<Int32>(), It.IsAny<Double>(), It.IsAny<String[]>())
2021-03-02T19:18:55.2050639Z No setups configured.
2021-03-02T19:18:55.2052439Z 
2021-03-02T19:18:55.2053635Z Performed invocations: 
2021-03-02T19:18:55.2055103Z IDogStatsd.Gauge("runtime.dotnet.threads.count", 32, 1, null)
2021-03-02T19:18:55.2056424Z IDogStatsd.Gauge("runtime.dotnet.mem.committed", 373817344, 1, null)
2021-03-02T19:18:55.2057526Z IDogStatsd.Gauge("runtime.dotnet.cpu.user", -15625, 1, null)
2021-03-02T19:18:55.2058760Z IDogStatsd.Gauge("runtime.dotnet.cpu.system", 0, 1, null)
2021-03-02T19:18:55.2059793Z IDogStatsd.Gauge("runtime.dotnet.cpu.percent", -390.6, 1, null)
2021-03-02T19:18:55.2061088Z IDogStatsd.Increment("runtime.dotnet.exceptions.count", 10, 1, ["exception_type:WebException"])
2021-03-02T19:18:55.2062359Z IDogStatsd.Increment("runtime.dotnet.exceptions.count", 2, 1, ["exception_type:SocketException"])
2021-03-02T19:18:55.2064122Z IDogStatsd.Increment("runtime.dotnet.exceptions.count", 20, 1, ["exception_type:HttpRequestException"])
2021-03-02T19:18:55.2080657Z   Stack Trace:
2021-03-02T19:18:55.2082925Z      at Moq.Mock.ThrowVerifyException(MethodCall expected, IEnumerable`1 setups, IEnumerable`1 actualCalls, Expression expression, Times times, Int32 callCount) in C:\projects\moq4\Source\Mock.cs:line 473
2021-03-02T19:18:55.2085217Z    at Moq.Mock.VerifyCalls(Mock targetMock, MethodCall expected, Expression expression, Times times) in C:\projects\moq4\Source\Mock.cs:line 451
2021-03-02T19:18:55.2086761Z    at Moq.Mock.Verify[T](Mock`1 mock, Expression`1 expression, Times times, String failMessage) in C:\projects\moq4\Source\Mock.cs:line 314
2021-03-02T19:18:55.2089574Z    at Moq.Mock`1.Verify(Expression`1 expression, Times times) in C:\projects\moq4\Source\Mock.Generic.cs:line 459
2021-03-02T19:18:55.2090985Z    at Moq.Mock`1.Verify(Expression`1 expression, Func`1 times) in C:\projects\moq4\Source\Mock.Generic.cs:line 466
2021-03-02T19:18:55.2092622Z    at Datadog.Trace.Tests.RuntimeMetrics.RuntimeMetricsWriterTests.ShouldCaptureFirstChanceExceptions() in D:\a\1\s\test\Datadog.Trace.Tests\RuntimeMetrics\RuntimeMetricsWriterTests.cs:line 90
2021-03-02T19:18:56.4223223Z Results File: D:\a\_temp\VssAdministrator_WIN-80S6H5TTEN9_2021-03-02_19_18_47.trx
2021-03-02T19:18:56.4224315Z 
2021-03-02T19:18:56.4325403Z Failed!  - Failed:     1, Passed:   396, Skipped:     1, Total:   398, Duration: 35 s - Datadog.Trace.Tests.dll (netcoreapp2.1)

2021-03-03T00:57:08.9749625Z [xUnit.net 00:00:10.0116210]     Datadog.Trace.Tests.RuntimeMetrics.RuntimeMetricsWriterTests.ShouldCaptureFirstChanceExceptions [FAIL]
2021-03-03T00:57:09.1419253Z   Failed Datadog.Trace.Tests.RuntimeMetrics.RuntimeMetricsWriterTests.ShouldCaptureFirstChanceExceptions [22 ms]
2021-03-03T00:57:09.1428692Z   Error Message:
2021-03-03T00:57:09.1453711Z    Moq.MockException : 
2021-03-03T00:57:09.1455001Z Expected invocation on the mock should never have been performed, but was 2 times: s => s.Increment("runtime.dotnet.exceptions.count", It.IsAny<Int32>(), It.IsAny<Double>(), It.IsAny<String[]>())
2021-03-03T00:57:09.1456383Z No setups configured.
2021-03-03T00:57:09.1457056Z 
2021-03-03T00:57:09.1457807Z Performed invocations: 
2021-03-03T00:57:09.1458712Z IDogStatsd.Gauge("runtime.dotnet.threads.count", 31, 1, null)
2021-03-03T00:57:09.1460517Z IDogStatsd.Gauge("runtime.dotnet.mem.committed", 284004352, 1, null)
2021-03-03T00:57:09.1461531Z IDogStatsd.Gauge("runtime.dotnet.cpu.user", -0, 1, null)
2021-03-03T00:57:09.1462586Z IDogStatsd.Gauge("runtime.dotnet.cpu.system", -0, 1, null)
2021-03-03T00:57:09.1463482Z IDogStatsd.Gauge("runtime.dotnet.cpu.percent", -0, 1, null)
2021-03-03T00:57:09.1464559Z IDogStatsd.Increment("runtime.dotnet.exceptions.count", 31, 1, ["exception_type:HttpRequestException"])
2021-03-03T00:57:09.1465895Z IDogStatsd.Increment("runtime.dotnet.exceptions.count", 12, 1, ["exception_type:SocketException"])
2021-03-03T00:57:09.1466851Z   Stack Trace:
2021-03-03T00:57:09.1467885Z      at Moq.Mock.ThrowVerifyException(MethodCall expected, IEnumerable`1 setups, IEnumerable`1 actualCalls, Expression expression, Times times, Int32 callCount)
2021-03-03T00:57:09.1469998Z    at Moq.Mock.VerifyCalls(Mock targetMock, MethodCall expected, Expression expression, Times times)
2021-03-03T00:57:09.1471134Z    at Moq.Mock.Verify[T](Mock`1 mock, Expression`1 expression, Times times, String failMessage)
2021-03-03T00:57:09.1472169Z    at Moq.Mock`1.Verify(Expression`1 expression, Times times)
2021-03-03T00:57:09.1473214Z    at Moq.Mock`1.Verify(Expression`1 expression, Func`1 times)
2021-03-03T00:57:09.1477637Z    at Datadog.Trace.Tests.RuntimeMetrics.RuntimeMetricsWriterTests.ShouldCaptureFirstChanceExceptions() in D:\a\1\s\test\Datadog.Trace.Tests\RuntimeMetrics\RuntimeMetricsWriterTests.cs:line 90
2021-03-03T00:57:09.2229895Z Results File: D:\a\_temp\VssAdministrator_fv-az91-343_2021-03-03_00_57_00.trx
2021-03-03T00:57:09.2231262Z 
2021-03-03T00:57:09.2276522Z Failed!  - Failed:     1, Passed:   399, Skipped:     1, Total:   401, Duration: 32 s - Datadog.Trace.Tests.dll (net5.0)

Introduce IoC / simple DI to improve extendibility

Why

Introduce very simple DI to make current solution more extendable.

What

Because full IoC containers add unnecessary complexity and reduce performance, I would propose a "factory configurator" pattern.
Instead of hosting anything like an IoC container, the factory configurator will only hold references to implementations of the concrete interfaces (see example below). The OTel repo will host DefaultFactoryConfiguration and a distro will host e.g. SplunkFactoryConfiguration, where it could override the sub-services / factories or construct the whole implementation from scratch. The factory configuration type could be loaded from an environment variable, from a static, etc. The IFactoryConfiguration reference must be passed down the line (I expect it will be in the Tracer instance?)

This feature is connected to: #117

class SomeFactoryConfiguration : IFactoryConfiguration
{
   // Singleton services: one shared instance, created eagerly.
   public IMyService MyService { get; } = new MyDistroService();

   // Transient services: a new instance on each access.
   public IMyFactory MyFactory => new MyDistroFactory();
}
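Resolving the factory configuration type from an environment variable could look like the sketch below (the variable name, interface, and fallback type are illustrative assumptions):

```csharp
using System;

public interface IFactoryConfiguration { }

public class DefaultFactoryConfiguration : IFactoryConfiguration { }

public static class FactoryConfigurationLoader
{
    public static IFactoryConfiguration Load()
    {
        string typeName = Environment.GetEnvironmentVariable("OTEL_DOTNET_FACTORY_CONFIGURATION");
        if (string.IsNullOrEmpty(typeName))
        {
            return new DefaultFactoryConfiguration();
        }

        // A distro ships its own assembly and points the variable at its type, e.g.
        // "Splunk.Instrumentation.SplunkFactoryConfiguration, Splunk.Instrumentation".
        var type = Type.GetType(typeName, throwOnError: true);
        return (IFactoryConfiguration)Activator.CreateInstance(type);
    }
}
```

The instance returned by Load() would then be passed down the line, e.g. into the Tracer.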

Configuration source binding & validation

Why

Currently, some of the configuration source values are validated late, at runtime. We should NOT throw invalid-configuration exceptions later than initialization. At the same time, some configuration values fall back to the default value when an invalid value is passed (e.g. for enums). This could confuse the user and also cause confusion internally (other parts of the system do not know the actual behavior, because the set value and the effective behavior differ).

What

  • The settings instance should be read-only after initialization.
  • All settings should be validated at early startup so we can fail fast.
  • All defaults should be set at early startup.

I am open to ideas on how to implement validators. Example using property attributes:

class MySettings : IAppSettings
{
   [Configuration(ConfigurationKeys.DebugEnabled, defaultValue: false)]
   public bool DebugEnabled { get; private set; }

   [Configuration(ConfigurationKeys.MyEnum)]
   [ConfigurationValidator(typeof(EnumValidator))]
   public MyEnum MyEnum { get; private set; }
}

pros:

  • easy to read

cons:

  • requires some reflection (slow but done only once)

UPD 2023: Consider using a fluent style, so no reflection is needed and readability remains.
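A minimal explicit-binding sketch in the spirit of that update, with no reflection and fail-fast validation at startup (all key, type, and method names are illustrative assumptions):

```csharp
using System;
using System.Collections.Generic;

public class SettingsBinder
{
    private readonly IReadOnlyDictionary<string, string> _source;

    public SettingsBinder(IReadOnlyDictionary<string, string> source) => _source = source;

    public bool BindBool(string key, bool defaultValue)
    {
        if (!_source.TryGetValue(key, out var raw))
        {
            return defaultValue;
        }

        // Fail fast at initialization instead of silently falling back to the default.
        if (!bool.TryParse(raw, out var value))
        {
            throw new FormatException($"Invalid value '{raw}' for setting '{key}'.");
        }

        return value;
    }
}

public class MySettings
{
    public bool DebugEnabled { get; private set; }

    // Each setting is declared once with its key and default; an invalid
    // value throws here, during startup, never later at runtime.
    public static MySettings Load(SettingsBinder binder) => new MySettings
    {
        DebugEnabled = binder.BindBool("OTEL_DOTNET_AUTO_DEBUG", defaultValue: false),
    };
}
```

After Load() returns, the settings instance is effectively read-only (private setters), which matches the first bullet above.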

Example issues

Current implementations

  • TracerSettings
  • IntegrationSettings
  • GlobalSettings

Remove big BLOBs from Git repo history

Problem

After #70 we pulled BLOBs into the Git repository. This is what I saw after git fetch:

Receiving objects: 100% (597/597), 43.73 MiB | 4.30 MiB/s, done

Inspecting big objects in Git:

$git rev-list --all | xargs -rL1 git ls-tree -r --long | sort -uk3 | sort -rnk4  | less
100644 blob 36346064ad2437279f8a8371df00c75bf88b37cf 22442896   test/Datadog.Trace.ClrProfiler.IntegrationTests/CI/Data/gitda
ta-03/objects/pack/pack-7fa7cbe93d4ea75e170452b838378c33df1f8b50.pack
100644 blob 8dbb48aa5640ae723c1523e3edafa1c46cb45b22 22441964   test/Datadog.Trace.ClrProfiler.IntegrationTests/CI/Data/gitda
ta-01/objects/pack/pack-996ad06b538da839c6c16a8e878c37ac7a66b0a8.pack
100644 blob 9958f19ca10cda84161780068c66cd60d088577b 22437075   test/Datadog.Trace.ClrProfiler.IntegrationTests/CI/Data/gitda
ta-04/objects/pack/pack-c15d5f0d30e22438173611f8d403f4df497c024f.pack
100644 blob da42dd55fccac27b620396054d50484daba46f55 22436166   test/Datadog.Trace.ClrProfiler.IntegrationTests/CI/Data/gitda
ta-02/objects/pack/pack-ae1a6c8f26907755201d8984390c0679322589af.pack
100644 blob 932a42009599e096b061c8e0cf52dca0de713623 22435531   test/Datadog.Trace.ClrProfiler.IntegrationTests/CI/Data/gitda
ta-05/objects/pack/pack-cf1c5a3f513e65c8502d26ae38d746a506b9d564.pack
100644 blob ae6a3a3818007dff2723d43b95246664e6174ad7 5412090    src/Datadog.Trace.ClrProfiler.Native/lib/fmt_x64-windows-stat
ic/lib/fmt.lib
100644 blob be5cec66567e69b6abdf7f2c753285799853745a 5271876    src/Datadog.Trace.ClrProfiler.Native/lib/fmt_x64-windows-stat
ic/debug/lib/fmtd.lib
100644 blob 114e7079afe99a530d6e2182e3956b89484a424d 4550498    src/Datadog.Trace.ClrProfiler.Native/lib/fmt_x86-windows-stat
ic/lib/fmt.lib
100644 blob 9ceb6355130a5e87b5e0c97e9d46d73f92d4ea10 3797062    src/Datadog.Trace.ClrProfiler.Native/lib/fmt_x86-windows-stat
ic/debug/lib/fmtd.lib
100644 blob 8f68a334d9b4481d96ded3142a9a56764977f0ad  808620    test/Datadog.Trace.ClrProfiler.IntegrationTests/CI/Data/gitda
ta-03/objects/pack/pack-7fa7cbe93d4ea75e170452b838378c33df1f8b50.idx
100644 blob 74104f742f407c6009b32fa64e70932f5aefd193  808620    test/Datadog.Trace.ClrProfiler.IntegrationTests/CI/Data/gitda
ta-04/objects/pack/pack-c15d5f0d30e22438173611f8d403f4df497c024f.idx
100644 blob 5e6a540477537aff1bf86d0948e2d565e8a24308  808620    test/Datadog.Trace.ClrProfiler.IntegrationTests/CI/Data/gitda
ta-02/objects/pack/pack-ae1a6c8f26907755201d8984390c0679322589af.idx
100644 blob 58b8b2d9096d92f48a80bc504ed774bcc008f8d9  808620    test/Datadog.Trace.ClrProfiler.IntegrationTests/CI/Data/gitda
ta-01/objects/pack/pack-996ad06b538da839c6c16a8e878c37ac7a66b0a8.idx
100644 blob d489a8287e83a9939d9c415dae07db8d593c0adb  808340    test/Datadog.Trace.ClrProfiler.IntegrationTests/CI/Data/gitda
ta-05/objects/pack/pack-cf1c5a3f513e65c8502d26ae38d746a506b9d564.idx

The first file in the list is 22,442,896 bytes ≈ 21.4 MB.
The last one is 808,340 bytes ≈ 0.77 MB (still quite big).

These files were introduced here in the upstream repository: DataDog/dd-trace-dotnet#1247

There is a git folder parser within the tracer for a Datadog-specific product. The test suite uses several git folder data sets, cloned and checked out in different ways, and tests the parser against those folders:
https://github.com/DataDog/dd-trace-dotnet/blob/a9582968040b32ca78640d2fe99f5ef0573bf853/test/Datadog.Trace.ClrProfiler.IntegrationTests/CI/GitParserTests.cs#L14-L104

Possible Solution no 1

  1. @tonyredondo in https://github.com/DataDog/dd-trace-dotnet creates a repo with fewer objects and runs the tests against that repo's .git folder.
  2. All Datadog devs are pushing their changes before history rewrite
  3. @tonyredondo removes the GitParserTests-related BLOBs from the git history
  4. All Datadog devs who use https://github.com/DataDog/dd-trace-dotnet as origin should update their local repositories. The easiest way is to clone it again: https://stackoverflow.com/questions/48267025/how-to-sync-local-history-after-massive-git-history-rewrite
  5. ⚠️⚠️⚠️ All forks would be affected after this change.
  6. @pellared https://github.com/open-telemetry/opentelemetry-dotnet-instrumentation is not a "normal" fork so we could just first sync (cherry-pick) the changes
  7. @pellared then do the same history-rewrite as mentioned in point 3 but for https://github.com/open-telemetry/opentelemetry-dotnet-instrumentation
  8. All OTel devs create new forks of https://github.com/open-telemetry/opentelemetry-dotnet-instrumentation

I am concerned about how this change would impact "normal" forks. IMO it can almost "destroy" them.

Possible Solution no 2

  1. @pellared removes the tests GitParserTests from https://github.com/open-telemetry/opentelemetry-dotnet-instrumentation
  2. @pellared removes the GitParserTests-related BLOBs from the https://github.com/open-telemetry/opentelemetry-dotnet-instrumentation git history
  3. All OTel devs create new forks of https://github.com/open-telemetry/opentelemetry-dotnet-instrumentation

Plan

Personally, I am concerned about possible side-effects of rewriting https://github.com/DataDog/dd-trace-dotnet for the forks (GitHub sees 66 but I imagine there are more).

That is why I suggest: Solution no 2.

Additional Resources:

Extendibility via plugins

Why

Distro providers should have simpler ways to build their own distro. A full fork is hard to manage and keep in sync.

What

Introduce a plugin architecture for extendibility. Plugins could reside in a /plugins folder from which DLLs can be dynamically loaded.

This feature is connected to: #118
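The dynamic loading step could be sketched as below (the folder convention and the loader shape are assumptions; a real implementation would likely also probe the assemblies for known extension points):

```csharp
using System.Collections.Generic;
using System.IO;
using System.Reflection;

public static class PluginLoader
{
    // Loads every DLL found in 'pluginsDirectory' and returns the loaded assemblies.
    // Returns an empty list when the directory does not exist.
    public static IReadOnlyList<Assembly> LoadPlugins(string pluginsDirectory)
    {
        var assemblies = new List<Assembly>();
        if (!Directory.Exists(pluginsDirectory))
        {
            return assemblies;
        }

        foreach (string dll in Directory.EnumerateFiles(pluginsDirectory, "*.dll"))
        {
            // LoadFrom resolves the plugin's own dependencies from its folder.
            assemblies.Add(Assembly.LoadFrom(dll));
        }

        return assemblies;
    }
}
```

A distro would then drop its DLLs into the /plugins folder instead of forking the whole repository.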

Set Span status according to OTel specs

The topic was brought up by @nrcventura during the SIG meeting: https://youtu.be/e02Kw3VzgyU?t=371

The problem is that currently, it looks like the Span status is often set to ERROR when it should not be.

Each status assignment needs to be revised to make sure that it adheres to the OTel specification.

From: https://github.com/open-telemetry/opentelemetry-specification/blob/main/specification/trace/api.md#set-status

The status code SHOULD remain unset, except for the following circumstances:

When the status is set to ERROR by Instrumentation Libraries, the status codes SHOULD be documented and predictable. The status code should only be set to ERROR according to the rules defined within the semantic conventions. For operations not covered by the semantic conventions, Instrumentation Libraries SHOULD publish their own conventions, including status codes.

Generally, Instrumentation Libraries SHOULD NOT set the status code to Ok, unless explicitly configured to do so. Instrumentation libraries SHOULD leave the status code as Unset unless there is an error, as described above.

From: https://github.com/open-telemetry/opentelemetry-specification/blob/main/specification/trace/semantic_conventions/http.md#status

Span Status MUST be left unset if HTTP status code was in the 1xx, 2xx or 3xx ranges, unless there was another error (e.g., network error receiving the response body; or 3xx codes with max redirects exceeded), in which case status MUST be set to Error.

For HTTP status codes in the 4xx and 5xx ranges, as well as any other code the client failed to interpret, status MUST be set to Error.

Don't set the span status description if the reason can be inferred from http.status_code.
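The quoted HTTP rules amount to a small mapping; a stand-in sketch (SpanStatus and the mapper are illustrative, not the SDK's actual types):

```csharp
public enum SpanStatus
{
    Unset,
    Error
}

public static class HttpSpanStatusMapper
{
    // 1xx-3xx leave the status unset; 4xx-5xx, and any code the client failed
    // to interpret, map to Error, per the quoted semantic conventions.
    public static SpanStatus FromHttpStatusCode(int statusCode)
    {
        if (statusCode >= 100 && statusCode < 400)
        {
            return SpanStatus.Unset;
        }

        return SpanStatus.Error;
    }
}
```

Other error conditions (e.g. a network error while receiving the response body) would still set Error independently of the status code.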

We may lean more towards setting the error status and offering a configuration option to change that behaviour (see: #85 (comment)).

Here are the docs defining how exceptions should be handled: https://github.com/open-telemetry/opentelemetry-specification/blob/main/specification/trace/semantic_conventions/exceptions.md

Moreover, only OTel required tags should be assigned when OTel semantic convention is selected. See: #82 (comment)

Docs should be created describing what the auto-instrumentation does (at least for OTel semantic convention).

Change separator to match OTel convention

Background

Since the OTel convention requires a comma separator for propagation var configuration (specs), we should change all other places as well to match a single design line (using ; and , at the same time would be very confusing).

It should for sure happen before GA, but for beta it seems optional.

Remove Dependabot configuration

Dependabot is currently configured in both this repository and the "upstream" DataDog/dd-trace-dotnet repo. Right now there's considerable code-sharing, so the Dependabot PRs in this repo create a bunch of noise in between upstream pulls. In the SIG meeting, we decided to remove the Dependabot configuration for now.

Add a badge for CI status and CodeCov

Is your feature request related to a problem?
Most of the existing OpenTelemetry repositories display a badge for CI status and code coverage. In an effort to keep the dotnet-instrumentation repository up to date and consistent with the other established SDKs and repositories, there should be a status-tracking badge in the main README.md file.

Describe the solution you'd like.
As a developer contributing to OpenTelemetry, I recommend adding a CI status and code coverage percentage badge at the top of the README document of the dotnet-instrumentation repository. Code coverage and CI badges are a common feature of many modern open source projects, which improves readability and convenience to developers. By adding CI and code coverage badges to the README.md, with a quick scan, any observer will be able to know the status of the repository.

cc: @alolita

Allow for OTel span data model without code divergence from Datadog

Hey folks,

Based on the discussion in the meetings, I want to make a concrete proposal.
Please treat it as a first stub; we can modify it in any way we want. I just want to help move the discussion from theoretical to practical, so that we can move forward. :)

Challenge:
The OTel tracer must ensure that span tags use OTel conventions for both tag names and tag content format. For OTel it's a must-have; for Datadog it is optional. The OTel tracer currently depends on Datadog's contributions. Until this changes, we need to make code integrations easy. This is a proposal on how we can achieve that.

Requirements:

  • OTel data model conventions must be met.
  • Code bases must not diverge as a result of this particular effort. This implies that:
    • Changes to the OTel Tracer repo must be merged into the Datadog repo.
    • Datadog will only do it if it has zero performance impact and low complexity effort.

The following solution applies only to new-style CallTarget instrumentations, and to DiagnosticSource-based instrumentations. We will not modify older CallSite instrumentations.
We will delegate the logic of setting tags on spans to a tag-creator/span-initializer class, which will be a singleton per instrumentation. The logic of extracting data from the instrumented code will be shared, only the tag configuration will be delegated.

Consider the HttpClient instrumentation as an example. The meat of the logic is here.

Note the line

Scope scope = ScopeFactory.CreateOutboundHttpScope(Tracer.Instance, requestMessage.Method.Method, requestMessage.RequestUri, IntegrationId, out HttpTags tags);

It does 2 things: (1) creates a span and (2) sets tags on that span. We want to separate these two concerns as follows:

I. Create an instrumentation-specific iface:

interface ISpanInitializerForSocketsHttpHandler
{
    void InitializeSpan(Span span, string httpMethod, Uri requestUri, IntegrationInfo integrationId);
}

Here, span is the span object that will be emitted, and httpMethod, requestUri, etc. are all the data extracted from the instrumented API that is relevant for the span. If we wanted more (e.g. request headers), we would add a corresponding parameter.
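For illustration, an OTel implementation of that interface might look as follows; Span and IntegrationInfo are simplified stand-ins for the tracer's real types, and the span name and "http.*" tag names follow the OTel HTTP semantic conventions:

```csharp
using System;
using System.Collections.Generic;

// Simplified stand-ins for the tracer's real types.
public class Span
{
    public string OperationName { get; set; }
    public Dictionary<string, string> Tags { get; } = new Dictionary<string, string>();
}

public struct IntegrationInfo { }

public interface ISpanInitializerForSocketsHttpHandler
{
    void InitializeSpan(Span span, string httpMethod, Uri requestUri, IntegrationInfo integrationId);
}

public class SpanInitializerForSocketsHttpHandler_OTel : ISpanInitializerForSocketsHttpHandler
{
    public void InitializeSpan(Span span, string httpMethod, Uri requestUri, IntegrationInfo integrationId)
    {
        // OTel HTTP semantic conventions: low-cardinality span name, "http.*" tags.
        span.OperationName = $"HTTP {httpMethod}";
        span.Tags["http.method"] = httpMethod;
        span.Tags["http.url"] = requestUri.ToString();
    }
}
```

A SpanInitializerForSocketsHttpHandler_Datadog counterpart would set Datadog's tag names instead, while the surrounding extraction code stays shared.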

II. Now the driver class will get a singleton like this:

internal class SocketsHttpHandlerCommon
{
    ...
    ISpanInitializerForSocketsHttpHandler s_spanInitializer = new SpanInitializerForSocketsHttpHandler_VendorName();
    ...
}

That one line will be different for each vendor, e.g.:

    ISpanInitializerForSocketsHttpHandler s_spanInitializer = new SpanInitializerForSocketsHttpHandler_NewRelic();
    ISpanInitializerForSocketsHttpHandler s_spanInitializer = new SpanInitializerForSocketsHttpHandler_Splunk();

Specifically, in the OTel repo it will be

    ISpanInitializerForSocketsHttpHandler s_spanInitializer = new SpanInitializerForSocketsHttpHandler_OTel();

and in the Datadog repo it will be

    ISpanInitializerForSocketsHttpHandler s_spanInitializer = new SpanInitializerForSocketsHttpHandler_Datadog();

III. Then, the above-mentioned line that calls the ScopeFactory would need to be refactored into something like:

Scope scope = tracer.StartActive(..);
if (scope != null)
{
    s_spanInitializer.InitializeSpan(scope.Span, requestMessage.Method.Method, requestMessage.RequestUri, IntegrationId);
    return new CallTargetState(scope);
}
else
{
    return CallTargetState.GetDefault();
}

Some minor refactoring of the Tracer / Scope classes will be required to allow setting all tags, the operation name, etc. after creation (most of it is possible already).

Each vendor will supply its own SpanInitializerForSocketsHttpHandler_Xxx class. Initially, they can all live in all repos in each respective instrumentation folder. Long term, we may or may not require them to live in vendor-specific repositories.

Notably, it is important that the responsibility of the
SpanInitializerForInstrumentationName_VendorName.InitializeSpan(Span span, ...)
method is to completely initialize the Span with the intercepted data from the instrumented code and from the environment. Nothing else.

This will ensure that we can continue using the same code base while allowing for span data-model customization.

Because the customization is, essentially, compile-time, a similar mechanism can be used for exporters. The Datadog exporters will expect Datadog tags on Spans, the OTel exporter will expect OTel tags, and each vendor can decide what to use when building their product.
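The end-to-end shape of this mechanism can be sketched in one self-contained example. Everything here is an illustrative stand-in: the Span type, the interface, the VendorX name, and all tag names are hypothetical, not the actual tracer types or any vendor's real tag conventions. The point is only that the one statically-chosen initializer determines the tag shapes that the matching exporter will later rely on.

```csharp
using System;
using System.Collections.Generic;

// Minimal stand-in for the tracer's Span type, for illustration only.
internal class Span
{
    public string OperationName { get; set; }
    public Dictionary<string, string> Tags { get; } = new Dictionary<string, string>();
}

// Hypothetical per-vendor initializer contract (see the sketch above).
internal interface ISpanInitializerForSocketsHttpHandler
{
    void InitializeSpan(Span span, string httpMethod, Uri requestUri, int integrationId);
}

// OTel-flavored initializer; tag names are illustrative.
internal class SpanInitializerForSocketsHttpHandler_OTel : ISpanInitializerForSocketsHttpHandler
{
    public void InitializeSpan(Span span, string httpMethod, Uri requestUri, int integrationId)
    {
        span.OperationName = "HTTP " + httpMethod;
        span.Tags["http.method"] = httpMethod;
        span.Tags["http.url"] = requestUri.ToString();
    }
}

// A hypothetical vendor's initializer, shaping the same intercepted data
// differently.
internal class SpanInitializerForSocketsHttpHandler_VendorX : ISpanInitializerForSocketsHttpHandler
{
    public void InitializeSpan(Span span, string httpMethod, Uri requestUri, int integrationId)
    {
        span.OperationName = "http.request";
        span.Tags["vendorx.http.verb"] = httpMethod;
        span.Tags["vendorx.http.target"] = requestUri.AbsolutePath;
    }
}

internal static class Driver
{
    // The single line that differs per distribution. The exporter shipped in
    // the same distribution can rely on the tag shapes this line implies.
    private static readonly ISpanInitializerForSocketsHttpHandler s_spanInitializer =
        new SpanInitializerForSocketsHttpHandler_OTel();

    public static Span InterceptedRequest(string httpMethod, Uri requestUri)
    {
        var span = new Span();
        s_spanInitializer.InitializeSpan(span, httpMethod, requestUri, 0);
        return span; // with the OTel initializer: OperationName == "HTTP GET" for a GET
    }
}
```

Swapping the one `s_spanInitializer` line to the VendorX implementation changes the emitted operation name and tag keys without touching any other driver code, which is exactly the divergence the matching exporter is built against.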

Q & A

Who will do the work?

  1. OTel team will review this proposal and validate that it meets our goals.
  2. Datadog tracer team will review the proposal and agree (after potential modifications, if required).
  3. OTel team will do the work for one instrumentation. This will include any required refactoring of the Tracer, etc., if any. Datadog will code-review.
  4. We will push into OTel master branch, then integrate into Datadog repo.
  5. OTel team will do the work for all other instrumentation. Datadog will code-review.
  6. We will push into OTel master branch, then integrate into Datadog repo.

(Of course, steps 5 and 6 can happen in batches; we do not need to do everything at once. The point is to do one instrumentation, get it into all the necessary places, and only then proceed to the rest.)

What if for some particular existing instrumentation, the OTel data model requires some data currently not collected by that instrumentation?

In all likelihood this is "useful data". We will modify the instrumentation (the non-vendor-specific code) to collect the additional info and include it in both Datadog and OTel tags.

There may be some exceptions to this, e.g. when collecting some info is very expensive and the existing instrumentation does not need it. In that case we have two options:

  • There are good reasons not to collect that data and OTel will skip it.
  • That one particular instrumentation will not be compatible, and the code for it (and only for it) will diverge. I expect that very few instrumentations, if any, will fall into this bucket.

What about existing DiagnosticSource-based telemetry integrations?

We will use the same InitializeSpan(..)-style pattern there.
