Review the inherited state of V1 metrics and determine which changes are necessary for

Hi <a class="user-mention notranslate" data-hovercard-type="user" data-hovercard-url="

<a class="user-mention notranslate" data-hovercard-type="user" data-hovercard-url="/us

Thanks for the pointer <a class="user-mention notranslate" data-hovercard-type="user"

<a class="user-mention notranslate" data-hovercard-type="user" data-hovercard-url="/us

Metrics about aws-sdk-java-v2 HOT 27 CLOSED

aws commented on July 18, 2024

Metrics

from aws-sdk-java-v2.

Comments (27)

millems commented on July 18, 2024 4

We recognize this feature to be a blocker for many customers to migrate from 1.x to 2.x. We still believe this feature to be important and intend to implement it.

from aws-sdk-java-v2.

dagnir commented on July 18, 2024 4

Hi @gmariotti we don't have any plans currently for a Micrometer implementation.

from aws-sdk-java-v2.

dagnir commented on July 18, 2024 2

@swaranga Yep, absolutely. You can set publishers at the request level as well using the RequestOverrideConfiguration

GetObjectRequest.builder()
                ...
                .overrideConfiguration(o -> o.addMetricPublisher(myPublisher))
                .build();

from aws-sdk-java-v2.

rehevkor5 commented on July 18, 2024 1

It's possible to implement your own MetricPublisher in order to produce metrics in Micrometer format. However, there are some wrinkles due to differences between CloudWatch metrics and the approach used by eg. Prometheus metrics. Here are some details which people might find helpful.

Duration type metrics are directly convertible to Timer.

Boolean type metrics can be converted into two Counters: for example one which increments for true and one which increments for false.

Number type metrics can be tricky. Some of them (such as RetryCount) can be recorded as a Counter (which increments with each given value). However, metrics such as HttpMetric#AVAILABLE_CONCURRENCY, HttpMetric#LEASED_CONCURRENCY, and HttpMetric#PENDING_CONCURRENCY_ACQUIRES behave more like gauges. Also, they are difficult because they are are only sent (via MetricCollection passed to MetricPublisher#publish()) if a request was made. Therefore, if you treat those metrics as a Gauge they will become "stuck" at their last value until another request occurs. CloudWatch aggregates metrics like this once per minute, recording the sum of the values and the count of how many values were given (as well as min & max value). If the metric isn't reported, CloudWatch seems to assume that the value was zero during that time. Therefore, what you can get from CloudWatch is the average value (by dividing sum / count) or min/max of the metric in each time period. We could record these as a DistributionSummary if we need more details (though it might be somewhat misleading). However, a simpler and more reliable way is to imitate the CloudWatch approach and record the value sum as a Counter along with the value count as its own Counter. This allows us to perform calculations such as rate(...sum_count{...}[...]) / rate(...sample_count{...}[...]) (or, equivalently, increase() / increase()), which basically give us the average value for the time range. It's not necessarily 100% accurate when no requests are happening, but at least it will show zero instead of the last value. I'm about to implement this approach, hopefully it works as expected.

from aws-sdk-java-v2.

brharrington commented on July 18, 2024

I'm curious if this has gotten any more thought, is there a current recommendation?

For our use-cases, we mostly care about the request volume and latency. I found that I can get most of what I care about using the pluggable http client support. I'm creating a wrapper for an SdkHttpClient or SdkAsyncHttpClient that captures basic metrics about the requests. The main gap is that I don't see a good way to get the following information that was easily accessible from the V1 metrics collector:

Service name (e.g. EC2)
Request type (e.g., DescribeInstances)
AWS error code, we get the HTTP status code, but that doesn't always map as we would expect to the AWS error

For synchronous clients we have a hacky way to get the first two by looking at the stacktrace and seeing what the entry point to the SDK was. For async clients that is not viable. Is there a better way to access that type of information at the HTTP client layer?

More generally, is there a better integration point that could be used?

from aws-sdk-java-v2.

millems commented on July 18, 2024

@brharrington

For metrics, volume and latency the ExecutionInterceptor is probably the best integration point for a custom metrics implementation.

from aws-sdk-java-v2.

brharrington commented on July 18, 2024

Thanks for the pointer @millems . Looks like it solves most of my problems.

from aws-sdk-java-v2.

millems commented on July 18, 2024

@brharrington LMK if you have any questions.

from aws-sdk-java-v2.

swaranga commented on July 18, 2024

@millems What is the right way to log latencies for a request using the Interceptor API. One way could be to add the start timestamp in a beforeExecution call to the ExecutionAttributes instance and in the afterExecution method, get the start time from the attributes and calculate the elapsed time and log the latency.

Is that the recommended approach?

from aws-sdk-java-v2.

millems commented on July 18, 2024

@swaranga That's the recommended way to log the latency for the entire execution (which includes marshalling and retries), yes.

from aws-sdk-java-v2.

swaranga commented on July 18, 2024

@millems One problem with this approach is that if I want to execute a different interceptor per request so that I can tie the metrics logging with my parent unit of work I cannot do that since there is no way to add a interceptor for a request object.

This is useful for instance when I am serving a request for my callers and as part of that I need to call an AWS API, I would like to log metrics for that particular call along with other metrics for the incoming request to my service. I could probably achieve this by using hacky ThreadLocals but that makes the whole code brittle to refactoring.

This could be addressed by adding support for per request interceptors but assuming both the SDK level interceptors and the request level interceptors are executed for a given request, for metrics logging that may not be the most optimal API design and here is why:

Consider an an application that processes messages from Kinesis using the KCL library which are then transformed and written to a different Kinesis stream. In such a case, I will want to use the same SDK instance for both the read and write to/from Kinesis as recommended by SDK docs. And since I have no way of assigning a request level interceptor for the get records call from KCL (since the call is made directly from the KCL code), I will have to add an SDK level interceptor. At the same time, I will also want to add a request level interceptor for the part of my code that does the Kinesis write so that the Kinesis write call metrics are tied to my unit of work which is the processing of each message.

In such a setup, if both the interceptors are executed, I will have incorrect metrics logged (call counts, failures, avg latency etc). I thus think the interceptor API is too low level to be useful for flexible metrics logging and I ask that you reconsider the design you had for the V1 SDK that only executed one RequestMetricCollector even if multiple collectors were configured at different levels (Http Client, SDK, Request instance - in the order of preference).

A dedicated API is definitely needed, purpose-built for metrics logging and should be addressed before the V2 SDK become more ubiquitous.

from aws-sdk-java-v2.

millems commented on July 18, 2024

We agree that the interceptor API isn't optimal for someone just caring about metrics. There will be a separate interface for metrics. Here's our latest thoughts: #1304 #1362

from aws-sdk-java-v2.

swaranga commented on July 18, 2024

Thank you, I have added my comments on the design PR

from aws-sdk-java-v2.

jeffsteinmetz commented on July 18, 2024

An incremental implementation that at least adds the Java JVM Memory, JVM Threads and OS File Descriptors would be beneficial while the interfaces to the AWS specific metrics are worked out.

from aws-sdk-java-v2.

swaranga commented on July 18, 2024

@jeffsteinmetz Why should the SDK provide metrics on your application's JVM memory usage, file descriptors and threads? Do you mean memory, file descriptors and threads used by the SDK itself? If so, threads might be doable but I am not sure about memory usage. For file descriptors, may be it can provide number of open Http connections from the SDK but beyond that I am not very sure this should be in scope.

from aws-sdk-java-v2.

jeffsteinmetz commented on July 18, 2024

@swaranga, feature parity with the 1.x Java SDK to provide JVM and Machine Metrics. We use this in the 1.x java sdk,

see
https://docs.aws.amazon.com/AWSJavaSDK/latest/javadoc/com/amazonaws/metrics/package-summary.html

and

https://docs.aws.amazon.com/sdk-for-java/v1/developer-guide/generating-sdk-metrics.html

Machine Metrics

Memory

TotalMemory - Total amount of memory currently available to the JVM for current and future objects, measured in bytes. This value may vary over time, depending on the host environment.
FreeMemory - An approximation to the total amount of memory currently available to the JVM for future allocated objects, measured in bytes.
UsedMemory - TotalMemory minus FreeMemory.
SpareMemory - The maximum amount of memory that the JVM will attempt to use, measured in bytes, minus UsedMemory.
Threads

ThreadCount - The current number of live threads including both daemon and non-daemon threads.
DeadLockThreadCount - The number of threads that are deadlocked waiting for object monitors or ownable synchronizers, if any. Threads are deadlocked in a cycle waiting for a lock of these two types if each thread owns one lock while trying to acquire another lock already held by another thread in the cycle. No metrics is generated when the value is zero.
DaemonThreadCount - The current number of live daemon threads. No metrics is generated when the value is zero.
PeakThreadCount - The peak live thread count since the JVM started or since the peak was reset.
TotalStartedThreadCount - The total number of threads created and also started since the JVM started.
File Descriptors

OpenFileDescriptorCount - Number of opened file descriptors of the operating system.
SpareFileDescriptorCount - Maximum number of file descriptors of the operating system minus OpenFileDescriptorCount.

and from the 1.x docs

The AWS SDK for Java can generate metrics for visualization and monitoring with CloudWatch that measure:

your application’s performance when accessing AWS
the performance of your JVMs when used with AWS
runtime environment details such as heap memory, number of threads, and opened file descriptors

from aws-sdk-java-v2.

millems commented on July 18, 2024

+1, I could see it being nice to get JVM metrics into cloudwatch. It's more of a "Java on AWS" feature than a "Java SDK" feature, but it was included in the Java SDK 1.11.x, so it makes sense as a feature request.

from aws-sdk-java-v2.

swaranga commented on July 18, 2024

Understood. Although there seems to be no new development on the related PR's. Is this still a priority?

from aws-sdk-java-v2.

debora-ito commented on July 18, 2024

Feature request from v1: Allow a user to add a JVM shutdown hook to drain metrics queue - aws/aws-sdk-java#1296

from aws-sdk-java-v2.

gigiigig commented on July 18, 2024

Hello! Do you have any ETA for this? Thanks.

from aws-sdk-java-v2.

millems commented on July 18, 2024

@gigiigig Sorry, we unfortunately can't provide ETAs, but we will keep this issue up to date as we progress.

from aws-sdk-java-v2.

dagnir commented on July 18, 2024

Hi all,

We're happy to announce that the preview of the client-side metrics feature is now released! You can try it out starting with version 2.13.52. Please note if you want to use the cloudwatch-metric-publisher, the version to use is 2.13.52-PREVIEW. For more information please take a look at our announcement blog post.

from aws-sdk-java-v2.

swaranga commented on July 18, 2024

From the blog post, it seems like the metrics publisher is configured at the SDK level. Can I also attach one at the per request level? The V1 SDK had that option to override it at the per request level that allowed us to publish the metrics for the request along with the current unit of work.

Is that feasible?

from aws-sdk-java-v2.

gmariotti commented on July 18, 2024

Is there any plan to provide a Micrometer based implementation for the metric publisher?

from aws-sdk-java-v2.

dagnir commented on July 18, 2024

Hi everyone, the Metrics feature is now GA so I'm going to ahead and close this issue. Please check it out and feel free to open an Issue for any problems or feedback you have. https://aws.amazon.com/blogs/developer/client-side-metrics-for-the-aws-sdk-for-java-v2-is-now-generally-available/

from aws-sdk-java-v2.

ginkel commented on July 18, 2024

@rehevkor5 Has your attempt to expose metrics to Micrometer been successful? Thanks!

from aws-sdk-java-v2.

PatrykGala commented on July 18, 2024

@rehevkor5 @ginkel this is my solution: https://gist.github.com/PatrykGala/e4aec004eb55cd8cbdee328f217771c7
The code creates gauge and counter for Integer metrics, one of them is not necessary for specific metric.

from aws-sdk-java-v2.

Metrics about aws-sdk-java-v2 HOT 27 CLOSED

Comments (27)

Related Issues (20)

Recommend Projects

React

Vue.js

Typescript

TensorFlow

Django

Laravel

D3

Recommend Topics

javascript

web

server

Machine learning

Visualization

Game

Recommend Org

Facebook

Microsoft

Google

Alibaba

D3

Tencent