prometheus-hystrix's People

Contributors

ahus1, benzvan, beorn7, dadadom, dependabot[bot], tomcz

prometheus-hystrix's Issues

Ratpack Compatibility

Hi, I'm having an issue integrating this with a Ratpack application. I'm getting the following error on startup:

INFO: An exception was caught and reported. Message: java.lang.IllegalStateException: Cannot install Hystrix integration because another concurrency strategy (class com.netflix.hystrix.strategy.concurrency.HystrixConcurrencyStrategyDefault) is already installed
java.lang.IllegalStateException: Cannot install Hystrix integration because another concurrency strategy (class com.netflix.hystrix.strategy.concurrency.HystrixConcurrencyStrategyDefault) is already installed

It looks like this bit of prometheus-hystrix code:
https://github.com/ahus1/prometheus-hystrix/blob/master/src/main/java/com/soundcloud/prometheus/hystrix/HystrixPrometheusMetricsPublisher.java#L94

is clashing with this bit of Ratpack code:
https://github.com/ratpack/ratpack/blob/master/ratpack-hystrix/src/main/java/ratpack/hystrix/HystrixModule.java#L90

Is it necessary to register the default HystrixConcurrencyStrategy in prometheus-hystrix or can it be removed?

Java 7 compatibility

Hi, I am trying to use this cool lib in my application to publish Hystrix metrics. Unfortunately, the application fails to start up. I figure that's because the application runs on Java 7 while this lib requires Java 8 (1.8).

We don't have a plan to upgrade to Java 8, and I found that all versions on Maven Central are 3.x, which require Java 8.

So my question is: what's the best way to get it to work on Java 7? As we just need basic functionality, I am thinking of rebuilding 2.x with Java 7. Is that the way to go? Can you confirm?

Thanks!

prometheus-hystrix not working with micrometer.io libraries

This no longer works. It appears that micrometer.io has replaced it. I have:

    <dependency>
        <groupId>org.springframework.boot</groupId>
        <artifactId>spring-boot-starter-actuator</artifactId>
    </dependency>
    <dependency>
        <groupId>de.ahus1.prometheus.hystrix</groupId>
        <artifactId>prometheus-hystrix</artifactId>
    </dependency>
    <dependency>
        <groupId>io.micrometer</groupId>
        <artifactId>micrometer-core</artifactId>
    </dependency>
    <dependency>
        <groupId>io.micrometer</groupId>
        <artifactId>micrometer-registry-prometheus</artifactId>
    </dependency>

But I never see hystrix_command_event_total in the output of /actuator/prometheus, even after many API calls have been made. I see a lot of Micrometer output but nothing from prometheus-hystrix.

Does your library no longer work now that we have micrometer.io?

Need guide for transformation from old prometheus queries (3.4.0) to new queries (4.0.0)

I wanted to use an existing Grafana dashboard (https://grafana.com/dashboards/2113) that nicely shows Hystrix circuit breaker information. It didn't work at all. It took me hours to find the reason: the Grafana dashboard uses metrics that are no longer exported as of version 4.0.0 of prometheus-hystrix.
It would be nice to include a transformation guide for the metrics.
An additional question: why did you change the exported metrics in the first place? I do not understand the benefits of the complete metric rewrite.

Metric names do not comply with Prometheus standards and bucket sizes are not configurable

I know this is on the roadmap. I just wanted to get an issue in place for the PR I'm likely to submit next week to fix some or all of the following metrics:

hystrix_command_error_total
hystrix_command_event_total
hystrix_command_latency_execute_seconds_bucket
hystrix_command_latency_execute_seconds_count
hystrix_command_latency_execute_seconds_sum
hystrix_command_latency_total_seconds_bucket
hystrix_command_latency_total_seconds_count
hystrix_command_latency_total_seconds_sum
hystrix_thread_pool_completed_task_count
hystrix_thread_pool_count_threads_executed
hystrix_thread_pool_largest_pool_size
hystrix_thread_pool_queue_size
hystrix_thread_pool_rolling_count_threads_executed
hystrix_thread_pool_rolling_max_active_threads
hystrix_thread_pool_thread_active_count
hystrix_thread_pool_total_task_count

Remove some of the Hystrix metrics

Is it possible to remove some of the metrics from the REST endpoint (e.g. remove everything except "hystrix_command_event_total")?

I'm currently using this to init metrics:

HystrixPrometheusMetricsPublisher
    .builder()
    .shouldExportDeprecatedMetrics(false)
    .shouldRegisterDefaultPlugins(false)
    .shouldExportProperties(false)
    .buildAndRegister();

but cannot find a way to filter collectors that I don't need.
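
As a stopgap that does not depend on any option of this library, the text exposition output can be post-filtered before it is served. The sketch below is only an illustration under that assumption: the class `ExpositionFilterSketch` and its methods are hypothetical, and it matches names exactly, so a histogram would need its `_bucket`, `_sum`, and `_count` sample names listed alongside its base name.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical helper: keeps only the exposition lines of allowed metrics.
final class ExpositionFilterSketch {

    // Returns the lines whose metric name is contained in allowedNames.
    static List<String> keepOnly(List<String> lines, Set<String> allowedNames) {
        List<String> kept = new ArrayList<>();
        for (String line : lines) {
            String name = metricNameOf(line);
            if (name != null && allowedNames.contains(name)) {
                kept.add(line);
            }
        }
        return kept;
    }

    // Extracts the metric name from a "# HELP"/"# TYPE" line or a sample line.
    // Returns null for blank lines and for any other comment line.
    static String metricNameOf(String line) {
        String trimmed = line.trim();
        if (trimmed.isEmpty()) {
            return null;
        }
        if (trimmed.startsWith("#")) {
            String[] parts = trimmed.split("\\s+", 4);
            boolean helpOrType = parts.length >= 3
                    && (parts[1].equals("HELP") || parts[1].equals("TYPE"));
            return helpOrType ? parts[2] : null;
        }
        // A sample line starts with the name, followed by '{' or whitespace.
        int end = 0;
        while (end < trimmed.length()
                && trimmed.charAt(end) != '{'
                && !Character.isWhitespace(trimmed.charAt(end))) {
            end++;
        }
        return trimmed.substring(0, end);
    }

    public static void main(String[] args) {
        List<String> scraped = Arrays.asList(
                "# TYPE hystrix_command_event_total counter",
                "hystrix_command_event_total{command_name=\"a\",} 3.0",
                "# TYPE hystrix_thread_pool_queue_size gauge",
                "hystrix_thread_pool_queue_size{pool_name=\"p\",} 0.0");
        Set<String> allowed = new HashSet<>(Arrays.asList("hystrix_command_event_total"));
        keepOnly(scraped, allowed).forEach(System.out::println);
    }
}
```

The label names and values in `main` are made up for the example. Dropping series on the Prometheus side with `metric_relabel_configs` is another route that leaves the application untouched.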

Release on Maven Central

Couldn't find any version of this project on Maven Central. Any chance to release the current version soon?

Negative latency when circuit breaker is in open state

I noticed that the hystrix_command_latency_execute_seconds_sum and hystrix_command_latency_total_seconds_sum counters sometimes show small glitches like this:

8891.582999999922 @1509010418.551
8891.617999999922 @1509010419.551
8891.633999999922 @1509010420.551
8891.633999999922 @1509010421.551 <-- attention
8891.631999999921 @1509010422.551 <-- attention
8891.631999999921 @1509010423.551
8891.64699999992 @1509010424.551
8891.68199999992 @1509010425.551

Prometheus functions like rate() or increase() interpret the drop between these two values as a counter reset and produce a huge peak on the graph.
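
To see why such a tiny drop turns into a peak, here is a simplified model of reset handling, fed with the samples listed above. It is a sketch, not Prometheus's actual implementation: the class name is hypothetical, and the extrapolation to the window boundaries that Prometheus also applies is left out.

```java
// Simplified model of an "increase" calculation with counter-reset handling.
final class CounterIncreaseSketch {

    // Difference between last and first sample, corrected for every drop.
    static double increase(double[] samples) {
        if (samples.length < 2) {
            return 0.0;
        }
        double result = samples[samples.length - 1] - samples[0];
        for (int i = 1; i < samples.length; i++) {
            // A value lower than its predecessor is read as "the counter
            // restarted from zero", so the predecessor is added back.
            if (samples[i] < samples[i - 1]) {
                result += samples[i - 1];
            }
        }
        return result;
    }

    public static void main(String[] args) {
        double[] samples = {
            8891.582999999922, 8891.617999999922, 8891.633999999922,
            8891.633999999922, 8891.631999999921, 8891.631999999921,
            8891.64699999992, 8891.68199999992
        };
        // The real growth is about 0.1 s, but the single 2 ms drop adds
        // the whole previous counter value on top of it.
        System.out.println(increase(samples));
    }
}
```

With these samples the computed increase is roughly 8891.7 rather than roughly 0.1, which matches the kind of spike described here.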

Deeper investigation showed that this is caused by a series of short-circuited executions: the latency reported for a short-circuited command is always -1 ms.
Debug code like this:

HystrixCommandCompletionStream.getInstance(cmdKey)
    .observe()
    .subscribe(hystrixCommandCompletion -> {
        // Log whether the command actually ran, plus its event type and latency.
        LOG.warn(
            "CMD Completion: executed={} {}",
            hystrixCommandCompletion.didCommandExecute(),
            hystrixCommandCompletion
        );
    });

produces:

CMD Completion: executed=false listItemsByUserId[SHORT_CIRCUITED][-1 ms]
CMD Completion: executed=true listItemsByUserId[FAILURE][2 ms]
CMD Completion: executed=false listItemsByUserId[SHORT_CIRCUITED][-1 ms]
CMD Completion: executed=false listItemsByUserId[SHORT_CIRCUITED][-1 ms]
CMD Completion: executed=false listItemsByUserId[SHORT_CIRCUITED][-1 ms]
CMD Completion: executed=false listItemsByUserId[SHORT_CIRCUITED][-1 ms]
CMD Completion: executed=false listItemsByUserId[SHORT_CIRCUITED][-1 ms]
CMD Completion: executed=false listItemsByUserId[SHORT_CIRCUITED][-1 ms]
CMD Completion: executed=false listItemsByUserId[SHORT_CIRCUITED][-1 ms]
CMD Completion: executed=true listItemsByUserId[FAILURE][2 ms]
CMD Completion: executed=false listItemsByUserId[SHORT_CIRCUITED][-1 ms]
CMD Completion: executed=false listItemsByUserId[SHORT_CIRCUITED][-1 ms]

In purely synthetic tests with a 100% failure rate, the total latency sum can even go below 0:

hystrix_command_latency_execute_seconds_count{command_group="SOME_GROUP",command_name="listItemsByUserId",} 684.0
hystrix_command_latency_execute_seconds_sum{command_group="SOME_GROUP",command_name="listItemsByUserId",} -0.3050000000000003
hystrix_command_latency_total_seconds_count{command_group="SOME_GROUP",command_name="listItemsByUserId",} 684.0
hystrix_command_latency_total_seconds_sum{command_group="SOME_GROUP",command_name="listItemsByUserId",} -0.21800000000000028

(Time machine, isn't it? :) )

I think this could be safely fixed by an additional check in the HystrixPrometheusMetricsPublisherCommand class:

HystrixCommandCompletionStream.getInstance(commandKey)
    .observe()
    .subscribe(hystrixCommandCompletion -> {
        if (hystrixCommandCompletion.didCommandExecute()) {
            histogramLatencyTotal.observe(hystrixCommandCompletion.getTotalLatency() / 1000d);
            histogramLatencyExecute.observe(hystrixCommandCompletion.getExecutionLatency() / 1000d);
        }
        for (HystrixEventType hystrixEventType : HystrixEventType.values()) {
            // this code is not touched
        }
    });

What do you think?
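
The effect of that check can be replayed without Hystrix at all. The sketch below uses the twelve completions from the debug output above as plain arrays (the class and method names are hypothetical) and compares the latency sum with and without the guard.

```java
// Replays the logged completions and sums their latencies in seconds.
final class LatencySumSketch {

    // Sums latencies; when onlyExecuted is true, completions that never ran
    // (such as short-circuited ones reporting -1 ms) are skipped.
    static double sumSeconds(boolean[] executed, long[] latencyMs, boolean onlyExecuted) {
        double sum = 0.0;
        for (int i = 0; i < latencyMs.length; i++) {
            if (onlyExecuted && !executed[i]) {
                continue;
            }
            sum += latencyMs[i] / 1000d;
        }
        return sum;
    }

    public static void main(String[] args) {
        // Ten SHORT_CIRCUITED completions (-1 ms) and two FAILURE completions (2 ms).
        boolean[] executed = {false, true, false, false, false, false,
                              false, false, false, true, false, false};
        long[] latencyMs = {-1, 2, -1, -1, -1, -1, -1, -1, -1, 2, -1, -1};

        System.out.println("unguarded: " + sumSeconds(executed, latencyMs, false));
        System.out.println("guarded:   " + sumSeconds(executed, latencyMs, true));
    }
}
```

Unguarded, the sum ends up at about -0.006 s; with the guard it stays at about 0.004 s and can only grow.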

Another strategy was already registered.

Hi,

I'm using Spring Cloud Sleuth to track the calls. So, Sleuth registers the default MetricsPublisher.
When HystrixPrometheusMetricsPublisher tries to register with HystrixPlugins, it throws an "Another strategy was already registered." error.

Fixed / global labels

We would like to add a fixed label to each metric, e.g. service_name="MyService". This is needed to be able to distinguish between different service instances. In our case the Hystrix command key / group is the same, as it is physically the same code, just differently named instances.

Using the namespace is not really an option, as this makes finding and using metrics more difficult.

Ideally one should be able to provide a factory of some kind to create e.g. a counter / histogram. We can then provide a subclass of the real prometheus metric and add our logic as required.

Any other thoughts?

Error: text format parsing error in line 1: invalid metric name

My configuration:

global:
  scrape_interval:     15s
  evaluation_interval: 15s
  external_labels:
      monitor: "gatewaymonitor"

scrape_configs:
  - job_name: "gateway"
    metrics_path: "/public/metrics"

    static_configs:
      - targets: ["10.8.110.83:18888"]

I'm using Spring Cloud with Zuul, Hystrix, and Ribbon (the whole Netflix OSS stack).
I'm able to see the Hystrix metrics at http://localhost:18888/public/metrics
I can't see any issue with my prometheus.yml.
The error I'm getting: https://ibb.co/hDXR5w
Please help.

hystrix_command_total and hystrix_command_error_total are difficult to use

The two metrics hystrix_command_total and hystrix_command_error_total are difficult to use and probably return wrong values.

For example, if a command completes with the event SHORT_CIRCUITED, it is counted neither as an error nor towards the total count. This matches the code in HystrixCommandMetrics.HealthCounts.plus, but is nevertheless counterintuitive.

Especially if you want to put all the observed events in relation to the total number of commands received, this leads to strange results, as the number of observed SHORT_CIRCUITED events is high, while both the error rate and the total rate are low.
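
A small worked example of the counting rule described above. This is a sketch with a hypothetical class and made-up event counts. The issue only states that SHORT_CIRCUITED is left out of both numbers; which other event types count as errors is an assumption here.

```java
import java.util.EnumMap;
import java.util.Map;

// Hypothetical model of the "total" and "error" counting rule.
final class HealthCountsSketch {

    enum Event { SUCCESS, FAILURE, TIMEOUT, THREAD_POOL_REJECTED, SEMAPHORE_REJECTED, SHORT_CIRCUITED }

    // Sums the counts of the given event types; missing types count as 0.
    static long count(Map<Event, Long> counts, Event... events) {
        long sum = 0;
        for (Event e : events) {
            sum += counts.getOrDefault(e, 0L);
        }
        return sum;
    }

    // Assumption: these four event types are the ones counted as errors.
    static long errors(Map<Event, Long> counts) {
        return count(counts, Event.FAILURE, Event.TIMEOUT,
                Event.THREAD_POOL_REJECTED, Event.SEMAPHORE_REJECTED);
    }

    // Total is successes plus errors; SHORT_CIRCUITED is in neither.
    static long total(Map<Event, Long> counts) {
        return count(counts, Event.SUCCESS) + errors(counts);
    }

    public static void main(String[] args) {
        Map<Event, Long> counts = new EnumMap<>(Event.class);
        counts.put(Event.SUCCESS, 5L);
        counts.put(Event.FAILURE, 2L);
        counts.put(Event.SHORT_CIRCUITED, 93L);

        long observed = count(counts, Event.values());
        // prints: total=7 errors=2 observed=100
        System.out.println("total=" + total(counts) + " errors=" + errors(counts)
                + " observed=" + observed);
    }
}
```

With 93 of 100 observed events short-circuited, the total is 7 and the error count is 2, which is the mismatch this issue is about.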

Creating Dashboard for mean Latency

Hi everyone,

I am currently trying to create a Grafana dashboard with some of the Hystrix Prometheus metrics. One of the panels is a graph that should show the mean latency over time.
For this I am using the following query:
avg(hystrix_command_latency_total_seconds_sum{command_group=~"$commandGroup", command_name=~"$commandName"} ) by (command_name, command_group)

Unfortunately, it looks as if this metric just sums up all latencies over time, because my graph looks like this:
image

I tried to divide the metric hystrix_command_latency_total_seconds_sum by hystrix_command_latency_total_seconds_count, but this gives me unrealistically low results.

Does anyone know how to properly create a query for a mean latency over time?

Thanks in advance!
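
For reference, both series are counters that only ever grow, so a mean over a time window has to come from their deltas rather than from their absolute values. In PromQL that is usually written as rate(hystrix_command_latency_total_seconds_sum[5m]) / rate(hystrix_command_latency_total_seconds_count[5m]). The arithmetic behind it is sketched below with made-up scrape values and a hypothetical class name.

```java
// Mean latency over a window, derived from two scrapes of _sum and _count.
final class MeanLatencySketch {

    // Returns (delta of _sum) / (delta of _count) in seconds,
    // or NaN when no command completed inside the window.
    static double meanLatencySeconds(double sumStart, double sumEnd,
                                     double countStart, double countEnd) {
        double deltaCount = countEnd - countStart;
        if (deltaCount <= 0) {
            return Double.NaN;
        }
        return (sumEnd - sumStart) / deltaCount;
    }

    public static void main(String[] args) {
        // Hypothetical scrapes: _sum grew from 12.0 s to 13.5 s while
        // _count grew from 400 to 430 completions.
        double mean = meanLatencySeconds(12.0, 13.5, 400, 430);
        System.out.println("mean latency in window: " + mean + " s");

        // Dividing the absolute values instead yields the average since
        // process start, which reacts very slowly to recent changes.
        System.out.println("lifetime average: " + (13.5 / 430) + " s");
    }
}
```

In this example the window mean is 1.5 s over 30 completions, about 0.05 s, while the lifetime average is lower because it is diluted by all earlier completions.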
