
metrics's Issues

Filtering by level/key when querying metrics does not behave as expected

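The application uses Spring Framework 4.3 and pulls in the metrics-core-api 1.7.9 package to measure some business data.
Several Timers are registered under the RT group, and viewing them through http://localhost:8006/metrics/RT/ works as expected.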

"data":[
    {
      "interval": 5,
      "metric": "com.xxx.xxx.xxx.xxx.count",
      "metricLevel": "MAJOR",
      "metricType": "COUNTER",
      "tags": {
      },
      "timestamp": 1553842875299,
      "value": 10
    },
(remaining entries omitted)
}

However, querying by key name, http://localhost:8006/metrics/RT/com.xxx.xxx.xxx.xxx,
returns an HTTP 500 error.
Filtering by level, http://localhost:8006/metrics/RT/level/MAJOR,
returns no content:

{
  "success": false,
  "message": "No metric matching the specified level found!",
  "timestamp": 1553843344155
}

Yet http://localhost:8006/metrics/RT/level/NORMAL?above=true does return data.

Large array allocation

==WARNING==  allocating large array--thread_id[0x00007f77bc03d000]--thread_name[ali-metrics-pool-1-thread-5]--array_size[8589941536 bytes]--array_length[1073742690 elememts]
os_prio=0 tid=0x00007f77bc03d000 nid=0x2436 runnable
        at com.alibaba.metrics.reporter.bin.zigzag.io.LongArrayOutputStream.<init>(LongArrayOutputStream.java:10)
        at com.alibaba.metrics.reporter.bin.zigzag.LongCodec.decompress(LongCodec.java:49)
        at com.alibaba.metrics.reporter.bin.zigzag.LongDZBP.fromBytes(LongDZBP.java:183)
        at com.alibaba.metrics.server.MetricsOnDisk.getDataFromDisk(MetricsOnDisk.java:91)
        at com.alibaba.metrics.server.MetricsSearchService.distinguishSearch(MetricsSearchService.java:326)
        at com.alibaba.metrics.server.MetricsSearchService.search(MetricsSearchService.java:130)
        at com.alibaba.metrics.server.MetricsSearchService.search(MetricsSearchService.java:104)
        at com.alibaba.metrics.rest.MetricsResource.searchMetrics(MetricsResource.java:287)
        at sun.reflect.GeneratedMethodAccessor877.invoke(Unknown Source)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:498)
        at org.glassfish.jersey.server.model.internal.ResourceMethodInvocationHandlerFactory$1.invoke(ResourceMethodInvocationHandlerFactory.java:81)
        at org.glassfish.jersey.server.model.internal.AbstractJavaResourceMethodDispatcher$1.run(AbstractJavaResourceMethodDispatcher.java:151)
        at org.glassfish.jersey.server.model.internal.AbstractJavaResourceMethodDispatcher.invoke(AbstractJavaResourceMethodDispatcher.java:171)
        at org.glassfish.jersey.server.model.internal.JavaResourceMethodDispatcherProvider$ResponseOutInvoker.doDispatch(JavaResourceMethodDispatcherProvider.java:152)
        at org.glassfish.jersey.server.model.internal.AbstractJavaResourceMethodDispatcher.dispatch(AbstractJavaResourceMethodDispatcher.java:104)
        at org.glassfish.jersey.server.model.ResourceMethodInvoker.invoke(ResourceMethodInvoker.java:406)
        at org.glassfish.jersey.server.model.ResourceMethodInvoker.apply(ResourceMethodInvoker.java:350)
        at org.glassfish.jersey.server.model.ResourceMethodInvoker.apply(ResourceMethodInvoker.java:106)
        at org.glassfish.jersey.server.ServerRuntime$1.run(ServerRuntime.java:259)
        at org.glassfish.jersey.internal.Errors$1.call(Errors.java:271)
        at org.glassfish.jersey.internal.Errors$1.call(Errors.java:267)
        at org.glassfish.jersey.internal.Errors.process(Errors.java:315)
        at org.glassfish.jersey.internal.Errors.process(Errors.java:297)
        at org.glassfish.jersey.internal.Errors.process(Errors.java:267)
        at org.glassfish.jersey.process.internal.RequestScope.runInScope(RequestScope.java:319)
        at org.glassfish.jersey.server.ServerRuntime.process(ServerRuntime.java:236)
        at org.glassfish.jersey.server.ApplicationHandler.handle(ApplicationHandler.java:1028)
        at com.alibaba.metrics.rest.server.jersey.HttpHandlerContainer.handle(HttpHandlerContainer.java:122)
        at com.sun.net.httpserver.Filter$Chain.doFilter(Filter.java:79)
        at sun.net.httpserver.AuthFilter.doFilter(AuthFilter.java:83)
        at com.sun.net.httpserver.Filter$Chain.doFilter(Filter.java:82)
        at sun.net.httpserver.ServerImpl$Exchange$LinkHandler.handle(ServerImpl.java:675)
        at com.sun.net.httpserver.Filter$Chain.doFilter(Filter.java:79)
        at sun.net.httpserver.ServerImpl$Exchange.run(ServerImpl.java:647)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1152)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:627)
        at java.lang.Thread.run(Thread.java:852)
java.lang.OutOfMemoryError: Java heap space

Error log is not printed when reading the config file

problem

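When the properties file cannot be found while it is being read, log.error is called, so why is there no log output? I am running the demo example.
Log statements that I added to the demo myself do produce output.

The error occurs here: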
log.error("Error when loading property file:", e);

Relevant code snippet:

 Properties prop = new Properties();
        if (configFile == null)
            return prop;
        InputStream input = null;
        try {
            input = new FileInputStream(configFile);
            // load a properties file
            prop.load(input);
        } catch (IOException e) {
            log.error("Error when loading property file:", e);
        } finally {
            if (input != null) {
                try {
                    input.close();
                } catch (IOException e) {
                    // ignore
                }
            }
        }
 return prop;

My log4j configuration:

`
log4j.rootLogger=debug, CONSOLE,DailyFile

log4j.appender.CONSOLE=org.apache.log4j.ConsoleAppender
log4j.appender.CONSOLE.Threshold=DEBUG
log4j.appender.CONSOLE.layout.ConversionPattern=%d{yyyy-MM-dd HH:mm:ss} -%-4r [%c] %-5p %x - %m%n
log4j.appender.CONSOLE.Target=System.out
log4j.appender.CONSOLE.Encoding=UTF-8
log4j.appender.CONSOLE.layout=org.apache.log4j.PatternLayout

log4j.appender.DailyFile=org.apache.log4j.DailyRollingFileAppender
log4j.appender.DailyFile.File=metrics_log.log
log4j.appender.DailyFile.Encoding=UTF-8
log4j.appender.DailyFile.DatePattern=yyyy-MM-dd'.log'
log4j.appender.DailyFile.layout=org.apache.log4j.PatternLayout
log4j.appender.DailyFile.layout.ConversionPattern=[metrics_demo] %d{yyyy-MM-dd HH:mm:ss} %5p %c{1}:%L : %m%n
`

Add getGauge(String group, MetricName name) to MetricManager

After you register a Gauge to MetricManager using com.alibaba.metrics.MetricManager#register, there is no way to query it from MetricManager.

As an alternative, you can use MetricManager.getIMetricManager().getGauges() to get all the gauges, but it would be better to provide a simpler way.

Possible memory leak

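Memory leak report:

The instance "0x760904030" of "com.alibaba.metrics.server.MetricsMemoryCache", loaded by "xxxClassLoader @ 0x760829790", occupies 1,215,172,200 (33.23%) bytes. The memory is accumulated mainly in the instance "0x715e43858" of "java.util.TreeMap$Entry", loaded by "bootstrap class loader".

Single files of more than 1 GB were generated under the /logs/metrics/bin/<date>/ directory. After deleting these files and restarting the application, full GC stopped; without deleting them, full GC continued even after a restart.

An OQL query shows that a large number of long arrays from the on-disk binary files have been read into memory.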

image

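  1. Investigate why single files of more than 1 GB were generated.
  2. Consider adding a size limit when binary data is written to disk.
  3. When reading data from files, add safeguards to prevent too much data from being loaded into memory.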
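
Point 3 could be approached by validating a decoded length before allocating an array for it. The following is a minimal sketch under stated assumptions, not the library's code: `BoundedDecode`, `allocateChecked` and the `maxElements` cap are hypothetical names, and the real decode path (`LongCodec.decompress` in the stack trace of the "Large array allocation" report) may be structured differently.

```java
// Hypothetical guard: refuse to allocate when a length read from a file header
// is negative or larger than a configured cap.
final class BoundedDecode {
    private BoundedDecode() {}

    /**
     * Allocates a long[] only if declaredLength is within [0, maxElements].
     * declaredLength is assumed to come from untrusted on-disk data.
     */
    static long[] allocateChecked(int declaredLength, int maxElements) {
        if (declaredLength < 0 || declaredLength > maxElements) {
            throw new IllegalArgumentException(
                "Refusing to allocate " + declaredLength
                + " elements (limit " + maxElements + ")");
        }
        return new long[declaredLength];
    }
}
```

With a cap in place, a corrupted or oversized file leads to a rejected read instead of a multi-gigabyte allocation such as the 1073742690-element array reported above.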

NPE is thrown when running in spring-boot

A java application running under spring boot:

2019-03-14 11:46:07.202 ERROR 20364 --- [rter-1-thread-1] com.alibaba.metrics.tomcat.HttpGaugeSet  : Exception occur when getting connector global stats: 

java.lang.NullPointerException: null
  at com.alibaba.metrics.tomcat.HttpGaugeSet.getValueInternal(HttpGaugeSet.java:87)
  at com.alibaba.metrics.CachedMetricSet.refreshIfNecessary(CachedMetricSet.java:48)
  at com.alibaba.metrics.tomcat.HttpGaugeSet$HttpStatGauge.getValue(HttpGaugeSet.java:169)
  at com.alibaba.metrics.reporter.bin.StructMetricManagerReporter.report(StructMetricManagerReporter.java:123)
  at com.alibaba.metrics.reporter.MetricManagerReporter.report(MetricManagerReporter.java:249)
  at com.alibaba.metrics.reporter.MetricManagerReporter$1.run(MetricManagerReporter.java:67)
  at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
  at java.util.concurrent.FutureTask.runAndReset$$$capture(FutureTask.java:308)
  at java.util.concurrent.FutureTask.runAndReset(FutureTask.java)
  at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:180)
  at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:294)
  at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
  at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
  at java.lang.Thread.run(Thread.java:748)

image

It looks like only 1 connector was returned by JMXUtils.getObjectNames(globalReqProcessor) when the metrics were populated, whereas at runtime the same call returns 2 connectors.

I think the code should check for connectors that are newly added at runtime.

Monitoring QPS: how should Slf4jReporter report the metric?

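Slf4jReporter currently reports a Counter by reading Counter.getCount(), a value that keeps accumulating from the moment the Counter is created.
If a Counter is used to measure QPS, the reported QPS is therefore wrong.
So how should QPS be monitored?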
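
One common way to derive a rate from a cumulative count is to take the difference between two successive readings and divide by the elapsed time. The sketch below illustrates that idea only; `RateFromCounter` is a hypothetical helper, not part of the metrics library, and it takes the count and timestamp as plain arguments rather than reading them from a real Counter.

```java
// Hypothetical helper: turns successive readings of a cumulative counter
// into a per-second rate over the interval between readings.
final class RateFromCounter {
    private long lastCount;
    private long lastNanos;

    RateFromCounter(long initialCount, long nowNanos) {
        this.lastCount = initialCount;
        this.lastNanos = nowNanos;
    }

    /** Returns events per second since the previous reading, then remembers this reading. */
    double ratePerSecond(long currentCount, long nowNanos) {
        long deltaCount = currentCount - lastCount;
        long deltaNanos = nowNanos - lastNanos;
        lastCount = currentCount;
        lastNanos = nowNanos;
        if (deltaNanos <= 0) {
            return 0.0; // no elapsed time, so no meaningful rate
        }
        return deltaCount * 1_000_000_000.0 / deltaNanos;
    }
}
```

A reporter could call `ratePerSecond(counter.getCount(), System.nanoTime())` once per reporting interval; whether the library already offers a metric type better suited to rates is not stated in this report.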

SystemInfoUtils path handling problem

Code that triggers the problem:

        <dependency>
            <groupId>com.alibaba.middleware</groupId>
            <artifactId>metrics-os</artifactId>
            <version>2.0.6</version>
        </dependency>
            SystemInfoUtils.init();
            SystemInfoUtils.sigar.getCpu();

Error:
java.io.IOException: The filename, directory name, or volume label syntax is incorrect.

Preliminary analysis:

    public static void loadLib() throws IOException{
...
        URL res = SystemInfoUtils.class.getResource(resource);

        String path = res.getPath();
        if (path.indexOf("!") > 0)  {
            path = path.substring(0, path.substring(0, path.indexOf("!")).lastIndexOf("/")) + resource;
        }
        if (path.indexOf(":") > 0 && !osName.contains("Windows")) {
            path = path.substring(path.indexOf(":") + 1);
        }
        File file = new File(path);

        if (!file.exists()) {
            file.createNewFile();//IOException occurs.

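With the metrics-os jar on the classpath, the path obtained from SystemInfoUtils.class.getResource(resource) has the form:
file:/D:/data/maven/repository/com/alibaba/middleware/metrics-os/2.0.6/metrics-os-2.0.6.jar!/sigar-amd64-winnt.dll
So when the file is created, the leading file: part should be stripped from the path. The code that adjusts the path afterwards looks problematic on Windows; please confirm.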
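
The path adjustment described above can be sketched as follows. This is an illustration under assumptions, not a patch to SystemInfoUtils: `JarSiblingPath` and `resolve` are hypothetical names, and the sketch only removes a literal `file:` prefix instead of relying on the position of the first `:` (which on Windows also appears after the drive letter).

```java
// Hypothetical helper: maps the path of a resource inside a jar to a path
// next to that jar, and strips the "file:" scheme prefix on every OS.
final class JarSiblingPath {
    private JarSiblingPath() {}

    static String resolve(String urlPath, String resource) {
        String path = urlPath;
        int bang = path.indexOf('!');
        if (bang > 0) {
            // Keep the directory that contains the jar, then append the resource name.
            String jarPart = path.substring(0, bang);
            path = jarPart.substring(0, jarPart.lastIndexOf('/')) + resource;
        }
        if (path.startsWith("file:")) {
            path = path.substring("file:".length());
        }
        return path;
    }
}
```

For the path in this report the result is `/D:/data/maven/repository/com/alibaba/middleware/metrics-os/2.0.6/sigar-amd64-winnt.dll`. Copying the resource to a temporary file via `getResourceAsStream` would avoid path parsing altogether, but that is a larger change than the one discussed here.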

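Joining the contributor community

Hi, is there a contributor community, for example a DingTalk or WeChat group?
From a look at metrics, it seems the project is still in the process of being decoupled from HSF; a lot of Diamond-related pieces are still in it.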

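Many statistical errors when using FastCompass in production to measure cache hit rate

When using FastCompass in production to measure the cache hit rate, many of the statistics come out wrong. The corresponding test code is:

    public void execute() {
        long startTime = System.currentTimeMillis();
        boolean isSuccess = false;
        try {
            TimeUnit.MICROSECONDS.sleep(1000);
            // do business stuff......
            // isSuccess = true;
        } catch (Exception e) {
            // .......
        } finally {
            long endTime = System.currentTimeMillis();
            // simulate the success rate: fail only when startTime is a multiple of 10
            isSuccess = startTime % 10 != 0;
            metric.record(endTime - startTime, isSuccess);
        }
    }

Simulated data is used here, so the success rate should be about 90%.
With a single thread, this code produces correct output.
With multiple threads (a thread pool of size 100), many of the captured values of bucket_count, success_bucket_count, fail_bucket_count and success_rate are 0.0.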

Too many logs are output

When the number of metrics reaches the upper limit, a warn log will be output every time:

[ WARN ] [2019-04-11 20:00:42] 65634a0c4570ce860ea9f5410eeedb3a- metrics size > 5000, a nop metric will be returned. name: middleware.tomcat.http.request.path{path=xxx.com/isExistNumber.do}

This risks filling up the disk.
Some throttling is needed here, for example outputting the warning a few times and then stopping.
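
The "warn a few times and then stop" idea can be sketched with an atomic counter. `LimitedWarner` is a hypothetical class, and the `Consumer<String>` sink stands in for whatever logger the library actually uses; this is one possible shape, not the project's implementation.

```java
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Consumer;

// Hypothetical throttle: forwards at most maxWarnings messages to the sink,
// marks the last one, and silently drops everything after that.
final class LimitedWarner {
    private final int maxWarnings;
    private final Consumer<String> sink;
    private final AtomicInteger emitted = new AtomicInteger();

    LimitedWarner(int maxWarnings, Consumer<String> sink) {
        this.maxWarnings = maxWarnings;
        this.sink = sink;
    }

    /** Returns true if the message was forwarded, false if it was suppressed. */
    boolean warn(String message) {
        if (emitted.get() >= maxWarnings) {
            return false; // fast path once the limit is reached; avoids counter overflow
        }
        int n = emitted.incrementAndGet();
        if (n > maxWarnings) {
            return false; // another thread took the last slot
        }
        sink.accept(n == maxWarnings ? message + " (further warnings suppressed)" : message);
        return true;
    }
}
```

A time-based variant (for example, at most one warning per minute) would keep the condition visible in long-running processes; which trade-off fits better is left open here.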

java.lang.NullPointerException

env: Windows 10

metrics version: 2.0.6

action: run the HTTP-based metrics demo (Bootstrap)

problem:
When I access http://localhost:8006/metrics/list, I get the following error:
`Error handling request:
java.lang.NullPointerException
at com.alibaba.metrics.os.windows.CpuUsageGaugeSet.collectCpuInfo(CpuUsageGaugeSet.java:118)
at com.alibaba.metrics.os.windows.CpuUsageGaugeSet.getValueInternal(CpuUsageGaugeSet.java:145)
at com.alibaba.metrics.CachedMetricSet.refreshIfNecessary(CachedMetricSet.java:61)
at com.alibaba.metrics.os.windows.CpuUsageGaugeSet$2.getValue(CpuUsageGaugeSet.java:194)
at com.alibaba.metrics.os.windows.CpuUsageGaugeSet$2.getValue(CpuUsageGaugeSet.java:191)
at com.alibaba.metrics.common.NormalMetricsCollector.collect(NormalMetricsCollector.java:135)
at com.alibaba.metrics.rest.MetricsResource.buildMetricRegistry(MetricsResource.java:372)
at com.alibaba.metrics.rest.MetricsResource.buildMetricRegistry(MetricsResource.java:359)
at com.alibaba.metrics.rest.MetricsResource.listMetrics(MetricsResource.java:98)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.glassfish.jersey.server.model.internal.ResourceMethodInvocationHandlerFactory.lambda$static$0(ResourceMethodInvocationHandlerFactory.java:76)
at org.glassfish.jersey.server.model.internal.AbstractJavaResourceMethodDispatcher$1.run(AbstractJavaResourceMethodDispatcher.java:148)
at org.glassfish.jersey.server.model.internal.AbstractJavaResourceMethodDispatcher.invoke(AbstractJavaResourceMethodDispatcher.java:191)
at org.glassfish.jersey.server.model.internal.JavaResourceMethodDispatcherProvider$ResponseOutInvoker.doDispatch(JavaResourceMethodDispatcherProvider.java:200)
at org.glassfish.jersey.server.model.internal.AbstractJavaResourceMethodDispatcher.dispatch(AbstractJavaResourceMethodDispatcher.java:103)
at org.glassfish.jersey.server.model.ResourceMethodInvoker.invoke(ResourceMethodInvoker.java:493)
at org.glassfish.jersey.server.model.ResourceMethodInvoker.apply(ResourceMethodInvoker.java:415)
at org.glassfish.jersey.server.model.ResourceMethodInvoker.apply(ResourceMethodInvoker.java:104)
at org.glassfish.jersey.server.ServerRuntime$1.run(ServerRuntime.java:277)
at org.glassfish.jersey.internal.Errors$1.call(Errors.java:272)
at org.glassfish.jersey.internal.Errors$1.call(Errors.java:268)
at org.glassfish.jersey.internal.Errors.process(Errors.java:316)
at org.glassfish.jersey.internal.Errors.process(Errors.java:298)
at org.glassfish.jersey.internal.Errors.process(Errors.java:268)
at org.glassfish.jersey.process.internal.RequestScope.runInScope(RequestScope.java:289)
at org.glassfish.jersey.server.ServerRuntime.process(ServerRuntime.java:256)
at org.glassfish.jersey.server.ApplicationHandler.handle(ApplicationHandler.java:703)
at com.alibaba.metrics.rest.server.jersey.HttpHandlerContainer.handle(HttpHandlerContainer.java:133)
at com.sun.net.httpserver.Filter$Chain.doFilter(Filter.java:79)
at sun.net.httpserver.AuthFilter.doFilter(AuthFilter.java:83)
at com.sun.net.httpserver.Filter$Chain.doFilter(Filter.java:82)
at sun.net.httpserver.ServerImpl$Exchange$LinkHandler.handle(ServerImpl.java:675)
at com.sun.net.httpserver.Filter$Chain.doFilter(Filter.java:79)
at sun.net.httpserver.ServerImpl$Exchange.run(ServerImpl.java:647)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)

`

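Multiple metric entries printed on a single line in metrics.log

When using metrics.log, I found that some lines contain log entries for multiple metrics, which causes data loss when the log is aggregated.
For example: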

image

Version info:
metrics-core-api 1.7.9
pandora 2019-03-release-hsf-bugfix (bundles the 1.7.9 implementation)
springframework 4.3.21

Support metrics for thread pools

For a custom thread pool, are the metrics we care about (poolSize, activeThreads, queuedTasks, completedTasks) supported?
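
Whether the library ships such metrics is not answered in this report, but the four values can be read from a standard `java.util.concurrent.ThreadPoolExecutor`. The sketch below exposes them as suppliers; `ThreadPoolGauges` and the map of `LongSupplier` are hypothetical stand-ins for however gauges would be registered with MetricManager.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.function.LongSupplier;

// Hypothetical adapter: one supplier per thread pool statistic, each reading
// the live value from the executor when polled.
final class ThreadPoolGauges {
    private ThreadPoolGauges() {}

    static Map<String, LongSupplier> gaugesFor(ThreadPoolExecutor pool) {
        Map<String, LongSupplier> gauges = new LinkedHashMap<>();
        gauges.put("poolSize", pool::getPoolSize);                 // current number of threads
        gauges.put("activeThreads", pool::getActiveCount);         // approximate, per the JDK docs
        gauges.put("queuedTasks", () -> pool.getQueue().size());   // tasks waiting in the queue
        gauges.put("completedTasks", pool::getCompletedTaskCount); // approximate total completed
        return gauges;
    }
}
```

Each supplier could then be wrapped in a gauge and registered under a name of the caller's choosing.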

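Compass getCount() result is inconsistent with compass.getSuccessCount()

The compass data is printed once per minute,
and the result of getCount() turns out to be inconsistent with that of compass.getSuccessCount().

This shows up after running for a while in our production environment, and with high probability.
Printed in the first minute:
totalCount=624 successCount=619
Printed one minute later:
totalCount=679,successCount=679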

How can I update a Gauge?

The listener receives a Gauge that needs to be kept in MetricManager. How can I update a Gauge that is already registered in MetricManager? If I call MetricManager.register again with the same metricName, MetricRegistryImpl#register throws an exception: "A metric named ... already exists".

        CpuUsageService cpuUsageService = new CpuUsageServiceImpl(100L, 500L);
        cpuUsageService.addListener("foo.bar", cpu -> {
            SortedMap<MetricName, Gauge> gauges = MetricManager.getIMetricManager().getGauges(DUBBO_GROUP, MetricFilter.ALL);
            Gauge<Float> cpuUser = gauges.get(new MetricName("dubbo.cpu." + invoker.getUrl().getHost(), MetricLevel.MAJOR));
            if (cpuUser == null) {
                MetricName metricName = new MetricName("dubbo.cpu." + invoker.getUrl().getHost(), MetricLevel.MAJOR);
                MetricManager.register(DUBBO_GROUP, metricName, cpu);
            } else {
                //TODO: I need to update the Gauge here
            }
        });
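
One pattern that avoids re-registering is to register a single gauge whose value can be changed afterwards, and have the listener update that value. The sketch below is self-contained and uses a stand-in interface: `SimpleGauge` is hypothetical and only mirrors the `getValue()` shape of a gauge, and `SettableGauge` is likewise not a class from the library.

```java
import java.util.concurrent.atomic.AtomicReference;

// Stand-in for a gauge interface; the real com.alibaba.metrics.Gauge may
// declare additional methods.
interface SimpleGauge<T> {
    T getValue();
}

// Hypothetical gauge that is registered once and updated in place afterwards.
final class SettableGauge<T> implements SimpleGauge<T> {
    private final AtomicReference<T> value;

    SettableGauge(T initial) {
        this.value = new AtomicReference<>(initial);
    }

    /** Called by the listener whenever a new reading arrives. */
    void setValue(T newValue) {
        value.set(newValue);
    }

    @Override
    public T getValue() {
        return value.get();
    }
}
```

In the listener above, the `else` branch would then call `setValue(...)` on the already registered instance instead of registering again.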

Metrics that have been cleaned up cannot be queried

If a user creates a Counter with:

Counter counter = MetricManager.getCounter("test", MetricName.build(xxx));

and holds the reference for later use, the counter is automatically deleted if it is not accessed within 1 day (the default, which is configurable).

Further updates to the counter can then be neither logged nor queried over HTTP.

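MetricRegistry memory usage

MetricRegistry is a map. The current code handles growth by returning a no-op implementation once too many metrics have been registered.
Could inactive metrics be cleaned up automatically so that the map stays at a reasonable size?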
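
The idea of evicting inactive entries can be sketched with a last-access timestamp per entry. Everything here is hypothetical (`ExpiringRegistry`, `getOrCreate`, `evictIdle`, and passing the clock as an argument); it illustrates the technique and does not describe how MetricRegistry works.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Supplier;

// Hypothetical registry: remembers when each entry was last requested and
// drops entries that have been idle longer than maxIdleMillis.
final class ExpiringRegistry<V> {
    private static final class Entry<T> {
        final T value;
        volatile long lastAccessMillis;

        Entry(T value, long nowMillis) {
            this.value = value;
            this.lastAccessMillis = nowMillis;
        }
    }

    private final ConcurrentHashMap<String, Entry<V>> entries = new ConcurrentHashMap<>();
    private final long maxIdleMillis;

    ExpiringRegistry(long maxIdleMillis) {
        this.maxIdleMillis = maxIdleMillis;
    }

    /** Returns the existing value for name, or creates one; either way marks it as used. */
    V getOrCreate(String name, Supplier<V> factory, long nowMillis) {
        Entry<V> e = entries.computeIfAbsent(name, k -> new Entry<>(factory.get(), nowMillis));
        e.lastAccessMillis = nowMillis;
        return e.value;
    }

    /** Removes idle entries; the returned count is exact only without concurrent writers. */
    int evictIdle(long nowMillis) {
        int before = entries.size();
        entries.values().removeIf(e -> nowMillis - e.lastAccessMillis > maxIdleMillis);
        return before - entries.size();
    }

    int size() {
        return entries.size();
    }
}
```

A caller that keeps a long-lived reference to a value instead of calling `getOrCreate` each time would not refresh the timestamp, so its entry could be evicted while still in use; that is the same pitfall described in "Metrics that have been cleaned up cannot be queried".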

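Quick start description is incomplete

The Maven dependency in the wiki's Quick Start Guide (《快速接入指南》) uses version 2.0.0, which may not have been published successfully: IDEA cannot download the jar for that version. Upgrading to 2.0.1 resolves the dependency, but with only the metrics-core-api module the Counter and other demo code that follows does not work. The metrics-core-impl module is also required; otherwise the following error is reported:
image
In addition, version 2.0.6 reports the same error even when both metrics-core-api and metrics-core-impl are declared as dependencies.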
