alibaba / metrics
The metrics library for Apache Dubbo and any frameworks or systems.
License: Apache License 2.0
public Builder withCollectLevel(CollectLevel level) {
this.collectLevel = collectLevel;
return this;
}
withCollectLevel has no effect in FileMetricManagerReporter: the Builder assigns collectLevel to itself instead of using the level parameter.
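The setter quoted above ignores its `level` argument. A minimal sketch of a corrected setter follows; `ReporterBuilder` and the enum constants are hypothetical stand-ins used only for illustration, not the library's actual classes.

```java
public class CollectLevelBuilderSketch {
    // Hypothetical stand-in for the library's CollectLevel enum.
    enum CollectLevel { LOW, HIGH }

    // Hypothetical stand-in for the reporter's Builder.
    static class ReporterBuilder {
        private CollectLevel collectLevel = CollectLevel.LOW;

        ReporterBuilder withCollectLevel(CollectLevel level) {
            // Assign the parameter; "this.collectLevel = collectLevel" is a self-assignment.
            this.collectLevel = level;
            return this;
        }

        CollectLevel collectLevel() {
            return collectLevel;
        }
    }

    public static void main(String[] args) {
        ReporterBuilder builder = new ReporterBuilder().withCollectLevel(CollectLevel.HIGH);
        System.out.println(builder.collectLevel());
    }
}
```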
The application uses Spring Framework 4.3 and pulls in metrics-core-api 1.7.9 to measure some business data.
Several Timers are registered under the group RT. Viewing them at http://localhost:8006/metrics/RT/ works as expected.
"data":[
{
"interval": 5,
"metric": "com.xxx.xxx.xxx.xxx.count",
"metricLevel": "MAJOR",
"metricType": "COUNTER",
"tags": {
},
"timestamp": 1553842875299,
"value": 10
},
(remaining entries omitted)
}
However, querying by key name at http://localhost:8006/metrics/RT/com.xxx.xxx.xxx.xxx
returns an HTTP 500 error.
Filtering by level at http://localhost:8006/metrics/RT/level/MAJOR
returns no content:
{
"success": false,
"message": "No metric matching the specified level found!",
"timestamp": 1553843344155
}
Yet http://localhost:8006/metrics/RT/level/NORMAL?above=true does return data.
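Hi
The BufferPoolMetricSet constructor only accepts an MBeanServer.
Does this restrict BufferPoolMetricSet to collecting metrics inside the agent's own JVM, ruling out collection through a JMX client?
Should the MBeanServer in the constructor be replaced with MBeanServerConnection?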
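For context, `MBeanServer` extends `MBeanServerConnection` in the JDK, so code written against the connection interface works both locally and over a remote JMX connection. The following is a minimal sketch of reading a buffer pool attribute through that interface; `BufferPoolReader` and `readBufferPoolAttribute` are hypothetical names, not part of the library.

```java
import java.io.IOException;
import java.lang.management.ManagementFactory;
import javax.management.JMException;
import javax.management.MBeanServerConnection;
import javax.management.ObjectName;

public class BufferPoolReader {
    // Reads one numeric attribute of a java.nio BufferPool MXBean through any connection.
    // Checked JMX and I/O failures are rethrown as IllegalStateException to keep the sketch short.
    static long readBufferPoolAttribute(MBeanServerConnection connection, String pool, String attribute) {
        try {
            ObjectName name = new ObjectName("java.nio:type=BufferPool,name=" + pool);
            return ((Number) connection.getAttribute(name, attribute)).longValue();
        } catch (JMException | IOException e) {
            throw new IllegalStateException("Cannot read " + attribute + " of pool " + pool, e);
        }
    }

    public static void main(String[] args) {
        // The platform MBeanServer is itself an MBeanServerConnection.
        MBeanServerConnection local = ManagementFactory.getPlatformMBeanServer();
        System.out.println("direct buffer count: " + readBufferPoolAttribute(local, "direct", "Count"));
        System.out.println("direct memory used: " + readBufferPoolAttribute(local, "direct", "MemoryUsed"));
    }
}
```

For a remote JVM, the same method could be given the connection obtained from `JMXConnectorFactory.connect(...).getMBeanServerConnection()`.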
Both the metrics-bin and metrics-reporter modules contain the package com.alibaba.metrics.reporter.bin
==WARNING== allocating large array--thread_id[0x00007f77bc03d000]--thread_name[ali-metrics-pool-1-thread-5]--array_size[8589941536 bytes]--array_length[1073742690 elememts]
os_prio=0 tid=0x00007f77bc03d000 nid=0x2436 runnable
at com.alibaba.metrics.reporter.bin.zigzag.io.LongArrayOutputStream.<init>(LongArrayOutputStream.java:10)
at com.alibaba.metrics.reporter.bin.zigzag.LongCodec.decompress(LongCodec.java:49)
at com.alibaba.metrics.reporter.bin.zigzag.LongDZBP.fromBytes(LongDZBP.java:183)
at com.alibaba.metrics.server.MetricsOnDisk.getDataFromDisk(MetricsOnDisk.java:91)
at com.alibaba.metrics.server.MetricsSearchService.distinguishSearch(MetricsSearchService.java:326)
at com.alibaba.metrics.server.MetricsSearchService.search(MetricsSearchService.java:130)
at com.alibaba.metrics.server.MetricsSearchService.search(MetricsSearchService.java:104)
at com.alibaba.metrics.rest.MetricsResource.searchMetrics(MetricsResource.java:287)
at sun.reflect.GeneratedMethodAccessor877.invoke(Unknown Source)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.glassfish.jersey.server.model.internal.ResourceMethodInvocationHandlerFactory$1.invoke(ResourceMethodInvocationHandlerFactory.java:81)
at org.glassfish.jersey.server.model.internal.AbstractJavaResourceMethodDispatcher$1.run(AbstractJavaResourceMethodDispatcher.java:151)
at org.glassfish.jersey.server.model.internal.AbstractJavaResourceMethodDispatcher.invoke(AbstractJavaResourceMethodDispatcher.java:171)
at org.glassfish.jersey.server.model.internal.JavaResourceMethodDispatcherProvider$ResponseOutInvoker.doDispatch(JavaResourceMethodDispatcherProvider.java:152)
at org.glassfish.jersey.server.model.internal.AbstractJavaResourceMethodDispatcher.dispatch(AbstractJavaResourceMethodDispatcher.java:104)
at org.glassfish.jersey.server.model.ResourceMethodInvoker.invoke(ResourceMethodInvoker.java:406)
at org.glassfish.jersey.server.model.ResourceMethodInvoker.apply(ResourceMethodInvoker.java:350)
at org.glassfish.jersey.server.model.ResourceMethodInvoker.apply(ResourceMethodInvoker.java:106)
at org.glassfish.jersey.server.ServerRuntime$1.run(ServerRuntime.java:259)
at org.glassfish.jersey.internal.Errors$1.call(Errors.java:271)
at org.glassfish.jersey.internal.Errors$1.call(Errors.java:267)
at org.glassfish.jersey.internal.Errors.process(Errors.java:315)
at org.glassfish.jersey.internal.Errors.process(Errors.java:297)
at org.glassfish.jersey.internal.Errors.process(Errors.java:267)
at org.glassfish.jersey.process.internal.RequestScope.runInScope(RequestScope.java:319)
at org.glassfish.jersey.server.ServerRuntime.process(ServerRuntime.java:236)
at org.glassfish.jersey.server.ApplicationHandler.handle(ApplicationHandler.java:1028)
at com.alibaba.metrics.rest.server.jersey.HttpHandlerContainer.handle(HttpHandlerContainer.java:122)
at com.sun.net.httpserver.Filter$Chain.doFilter(Filter.java:79)
at sun.net.httpserver.AuthFilter.doFilter(AuthFilter.java:83)
at com.sun.net.httpserver.Filter$Chain.doFilter(Filter.java:82)
at sun.net.httpserver.ServerImpl$Exchange$LinkHandler.handle(ServerImpl.java:675)
at com.sun.net.httpserver.Filter$Chain.doFilter(Filter.java:79)
at sun.net.httpserver.ServerImpl$Exchange.run(ServerImpl.java:647)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1152)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:627)
at java.lang.Thread.run(Thread.java:852)
java.lang.OutOfMemoryError: Java heap space
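When the properties file is read and cannot be found, log.error is called, so why is there no log output? I am running the demo example.
Log statements that I added to the demo myself do produce output.
The error occurs here: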
log.error("Error when loading property file:", e);
Properties prop = new Properties();
if (configFile == null)
return prop;
InputStream input = null;
try {
input = new FileInputStream(configFile);
// load a properties file
prop.load(input);
} catch (IOException e) {
log.error("Error when loading property file:", e);
} finally {
if (input != null) {
try {
input.close();
} catch (IOException e) {
// ignore
}
}
}
return prop;
My log4j configuration:
`
log4j.rootLogger=debug, CONSOLE,DailyFile
log4j.appender.CONSOLE=org.apache.log4j.ConsoleAppender
log4j.appender.CONSOLE.Threshold=DEBUG
log4j.appender.CONSOLE.layout.ConversionPattern=%d{yyyy-MM-dd HH:mm:ss} -%-4r [%c] %-5p %x - %m%n
log4j.appender.CONSOLE.Target=System.out
log4j.appender.CONSOLE.Encoding=UTF-8
log4j.appender.CONSOLE.layout=org.apache.log4j.PatternLayout
log4j.appender.DailyFile=org.apache.log4j.DailyRollingFileAppender
log4j.appender.DailyFile.File=metrics_log.log
log4j.appender.DailyFile.Encoding=UTF-8
log4j.appender.DailyFile.DatePattern=yyyy-MM-dd'.log'
log4j.appender.DailyFile.layout=org.apache.log4j.PatternLayout
log4j.appender.DailyFile.layout.ConversionPattern=[metrics_demo] %d{yyyy-MM-dd HH:mm:ss} %5p %c{1}:%L : %m%n
`
After you register a Gauge with MetricManager using com.alibaba.metrics.MetricManager#register, there is no way to query it from MetricManager.
As a workaround, you can call MetricManager.getIMetricManager().getGauges() to get all the gauges, but a simpler way should be provided.
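Memory leak report:
One instance of "com.alibaba.metrics.server.MetricsMemoryCache" (0x760904030), loaded by "xxxClassLoader @ 0x760829790", occupies 1,215,172,200 bytes (33.23%). The memory is accumulated mainly in one instance of "java.util.TreeMap$Entry" (0x715e43858), loaded by the bootstrap class loader.
Single files of more than 1 GB were generated under /logs/metrics/bin/<date>/. After these files were deleted and the application restarted, full GC stopped; without deleting them, full GC continued after the restart.
An OQL query showed that a large number of long arrays from the binary files on disk had been read into memory.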
A Java application running under Spring Boot:
2019-03-14 11:46:07.202 ERROR 20364 --- [rter-1-thread-1] com.alibaba.metrics.tomcat.HttpGaugeSet : Exception occur when getting connector global stats:
java.lang.NullPointerException: null
at com.alibaba.metrics.tomcat.HttpGaugeSet.getValueInternal(HttpGaugeSet.java:87)
at com.alibaba.metrics.CachedMetricSet.refreshIfNecessary(CachedMetricSet.java:48)
at com.alibaba.metrics.tomcat.HttpGaugeSet$HttpStatGauge.getValue(HttpGaugeSet.java:169)
at com.alibaba.metrics.reporter.bin.StructMetricManagerReporter.report(StructMetricManagerReporter.java:123)
at com.alibaba.metrics.reporter.MetricManagerReporter.report(MetricManagerReporter.java:249)
at com.alibaba.metrics.reporter.MetricManagerReporter$1.run(MetricManagerReporter.java:67)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
at java.util.concurrent.FutureTask.runAndReset$$$capture(FutureTask.java:308)
at java.util.concurrent.FutureTask.runAndReset(FutureTask.java)
at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:180)
at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:294)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
It looks as if only 1 connector was returned by JMXUtils.getObjectNames(globalReqProcessor) when the metrics were populated, whereas at runtime the same call returns 2 connectors.
I think the code should check for connectors that are added at runtime.
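The ecosystem is also lacking in places; the author is probably busy. I happen to be working in this area right now, so once I have it working I will submit some code for the author to review.
Take latency as an example: p99 and similar percentiles are computed periodically on each machine and then reported. How is the p99 of the whole service cluster computed? It seems that no calculation would be accurate.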
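Per-node percentiles cannot be averaged into a cluster percentile. One common approach is for every node to report counts for a shared set of latency buckets, so the counts can be summed and the percentile read from the merged histogram. The sketch below illustrates this; the bucket bounds and all names are assumptions for illustration, and the result is only as precise as the bucket width.

```java
import java.util.Arrays;
import java.util.List;

public class ClusterPercentileSketch {
    // Upper bounds (ms) of the latency buckets; every node must use the same bounds.
    static final long[] UPPER_BOUNDS_MS = {10, 50, 100, 500, 1000};

    // Sums per-node bucket counts, then returns the upper bound of the bucket
    // that contains the requested quantile (0 < quantile <= 1).
    static long clusterPercentile(List<long[]> perNodeCounts, double quantile) {
        long[] merged = new long[UPPER_BOUNDS_MS.length];
        long total = 0;
        for (long[] counts : perNodeCounts) {
            for (int i = 0; i < merged.length; i++) {
                merged[i] += counts[i];
                total += counts[i];
            }
        }
        if (total == 0) {
            return 0; // no samples at all
        }
        long rank = (long) Math.ceil(quantile * total);
        long seen = 0;
        for (int i = 0; i < merged.length; i++) {
            seen += merged[i];
            if (seen >= rank) {
                return UPPER_BOUNDS_MS[i];
            }
        }
        return UPPER_BOUNDS_MS[merged.length - 1];
    }

    public static void main(String[] args) {
        long[] nodeA = {90, 5, 3, 1, 1};     // 100 requests on node A
        long[] nodeB = {10, 20, 40, 20, 10}; // 100 requests on node B
        System.out.println(clusterPercentile(Arrays.asList(nodeA, nodeB), 0.99));
    }
}
```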
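Slf4jReporter currently reports a Counter by reading Counter.getCount(), a value that has accumulated continuously since the counter was created.
If a Counter is used to measure QPS, the reported QPS is wrong.
So how should QPS be monitored?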
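One way to derive a rate from a cumulative count is to take the difference between two consecutive samples and divide by the elapsed time. The sketch below assumes this delta approach; `CounterRateSketch` is a hypothetical helper, not part of the library, and a rate-oriented metric type may be a better fit if the library provides one.

```java
public class CounterRateSketch {
    private long lastCount;
    private long lastTimestampMs;

    CounterRateSketch(long initialCount, long initialTimestampMs) {
        this.lastCount = initialCount;
        this.lastTimestampMs = initialTimestampMs;
    }

    // Returns events per second since the previous sample and remembers the new sample.
    synchronized double sample(long cumulativeCount, long timestampMs) {
        long elapsedMs = timestampMs - lastTimestampMs;
        long delta = cumulativeCount - lastCount;
        lastCount = cumulativeCount;
        lastTimestampMs = timestampMs;
        return elapsedMs <= 0 ? 0.0 : delta * 1000.0 / elapsedMs;
    }

    public static void main(String[] args) {
        CounterRateSketch rate = new CounterRateSketch(100, 0);
        // The cumulative count grew from 100 to 400 over 60 seconds.
        System.out.println(rate.sample(400, 60_000));
    }
}
```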
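The metrics-core and reporter modules copy and paste a large amount of code directly from open-source projects. Why not simply depend on those projects?
Problem code: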
<dependency>
<groupId>com.alibaba.middleware</groupId>
<artifactId>metrics-os</artifactId>
<version>2.0.6</version>
</dependency>
SystemInfoUtils.init();
SystemInfoUtils.sigar.getCpu();
Error:
java.io.IOException: The filename, directory name, or volume label syntax is incorrect.
Preliminary analysis:
public static void loadLib() throws IOException{
...
URL res = SystemInfoUtils.class.getResource(resource);
String path = res.getPath();
if (path.indexOf("!") > 0) {
path = path.substring(0, path.substring(0, path.indexOf("!")).lastIndexOf("/")) + resource;
}
if (path.indexOf(":") > 0 && !osName.contains("Windows")) {
path = path.substring(path.indexOf(":") + 1);
}
File file = new File(path);
if (!file.exists()) {
file.createNewFile();//IOException occurs.
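With the metrics-os jar on the classpath, SystemInfoUtils.class.getResource(resource) returns a path of the form:
file:/D:/data/maven/repository/com/alibaba/middleware/metrics-os/2.0.6/metrics-os-2.0.6.jar!/sigar-amd64-winnt.dll
That is, the leading "file:" part should be stripped from the path before the file is created. The later change to this part of the code looks problematic; please confirm.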
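An alternative that avoids parsing the URL string altogether is to open the resource as a stream and copy it to a temporary file. The sketch below assumes that approach; `NativeLibExtractor` and `extractToTempFile` are hypothetical names and this is not the library's loadLib implementation.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

public class NativeLibExtractor {
    // Copies a stream (for example from Class#getResourceAsStream) to a temp file
    // and returns its path, so no "file:" or "!" handling is needed.
    static Path extractToTempFile(InputStream in, String suffix) {
        try {
            Path target = Files.createTempFile("native-lib-", suffix);
            target.toFile().deleteOnExit();
            // The temp file already exists, so the copy must be allowed to replace it.
            Files.copy(in, target, StandardCopyOption.REPLACE_EXISTING);
            return target;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) throws IOException {
        String resource = "/sigar-amd64-winnt.dll"; // resource name taken from the report above
        try (InputStream in = NativeLibExtractor.class.getResourceAsStream(resource)) {
            if (in == null) {
                System.out.println("resource not on classpath: " + resource);
                return;
            }
            System.out.println(extractToTempFile(in, ".dll"));
        }
    }
}
```

The returned path could then be passed to `System.load`, which expects an absolute file name.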
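Hi, is there a community for collaboration, for example a DingTalk or WeChat group?
Looking at metrics, it seems to be still in the process of being decoupled from HSF; a lot of Diamond-related code is still in there.
In production, using FastCompass to measure the cache hit rate produced many incorrect statistics. The corresponding test code is: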
public void execute() {
long startTime = System.currentTimeMillis();
boolean isSuccess = false;
try {
TimeUnit.MICROSECONDS.sleep(1000);
// do business stuff......
//isSuccess = true;
} catch (Exception e) {
//.......
} finally {
long endTime = System.currentTimeMillis();
// simulate the success rate
isSuccess = startTime % 10 == 0 ? false : true;
metric.record(endTime - startTime, isSuccess);
}
}
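Simulated data is used here; the success rate should be about 90%.
With a single thread, this code produces the correct output.
With multiple threads (a thread pool of size 100), many of the captured data points have bucket_count, success_bucket_count, fail_bucket_count and success_rate all equal to 0.0.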
When the number of metrics reaches the upper limit, a WARN log is written on every call:
[ WARN ] [2019-04-11 20:00:42] 65634a0c4570ce860ea9f5410eeedb3a- metrics size > 5000, a nop metric will be returned. name: middleware.tomcat.http.request.path{path=xxx.com/isExistNumber.do}
This risks filling up the disk.
The logging should be limited, for example by writing the warning a few times and then stopping.
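The "log a few times, then stop" idea can be sketched with an atomic counter. `LimitedWarning` is a hypothetical helper, not part of the library, and the limit of 3 is an arbitrary example value.

```java
import java.util.concurrent.atomic.AtomicInteger;

public class LimitedWarning {
    private final int maxWarnings;
    private final AtomicInteger emitted = new AtomicInteger();

    LimitedWarning(int maxWarnings) {
        this.maxWarnings = maxWarnings;
    }

    // Returns true for the first maxWarnings calls and false afterwards.
    boolean shouldLog() {
        if (emitted.get() >= maxWarnings) {
            return false; // fast path: stop incrementing once the limit is reached
        }
        return emitted.incrementAndGet() <= maxWarnings;
    }

    public static void main(String[] args) {
        LimitedWarning limiter = new LimitedWarning(3);
        for (int i = 0; i < 10; i++) {
            if (limiter.shouldLog()) {
                System.out.println("metrics size > limit, a nop metric will be returned (call " + i + ")");
            }
        }
    }
}
```

A variant could reset the counter periodically so that the warning reappears occasionally instead of being silenced for good.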
env: Windows 10
metrics version: 2.0.6
action: run the HTTP-based metrics demo (Bootstrap)
problem:
When I access http://localhost:8006/metrics/list, I get the following error:
`Error handling request:
java.lang.NullPointerException
at com.alibaba.metrics.os.windows.CpuUsageGaugeSet.collectCpuInfo(CpuUsageGaugeSet.java:118)
at com.alibaba.metrics.os.windows.CpuUsageGaugeSet.getValueInternal(CpuUsageGaugeSet.java:145)
at com.alibaba.metrics.CachedMetricSet.refreshIfNecessary(CachedMetricSet.java:61)
at com.alibaba.metrics.os.windows.CpuUsageGaugeSet$2.getValue(CpuUsageGaugeSet.java:194)
at com.alibaba.metrics.os.windows.CpuUsageGaugeSet$2.getValue(CpuUsageGaugeSet.java:191)
at com.alibaba.metrics.common.NormalMetricsCollector.collect(NormalMetricsCollector.java:135)
at com.alibaba.metrics.rest.MetricsResource.buildMetricRegistry(MetricsResource.java:372)
at com.alibaba.metrics.rest.MetricsResource.buildMetricRegistry(MetricsResource.java:359)
at com.alibaba.metrics.rest.MetricsResource.listMetrics(MetricsResource.java:98)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.glassfish.jersey.server.model.internal.ResourceMethodInvocationHandlerFactory.lambda$static$0(ResourceMethodInvocationHandlerFactory.java:76)
at org.glassfish.jersey.server.model.internal.AbstractJavaResourceMethodDispatcher$1.run(AbstractJavaResourceMethodDispatcher.java:148)
at org.glassfish.jersey.server.model.internal.AbstractJavaResourceMethodDispatcher.invoke(AbstractJavaResourceMethodDispatcher.java:191)
at org.glassfish.jersey.server.model.internal.JavaResourceMethodDispatcherProvider$ResponseOutInvoker.doDispatch(JavaResourceMethodDispatcherProvider.java:200)
at org.glassfish.jersey.server.model.internal.AbstractJavaResourceMethodDispatcher.dispatch(AbstractJavaResourceMethodDispatcher.java:103)
at org.glassfish.jersey.server.model.ResourceMethodInvoker.invoke(ResourceMethodInvoker.java:493)
at org.glassfish.jersey.server.model.ResourceMethodInvoker.apply(ResourceMethodInvoker.java:415)
at org.glassfish.jersey.server.model.ResourceMethodInvoker.apply(ResourceMethodInvoker.java:104)
at org.glassfish.jersey.server.ServerRuntime$1.run(ServerRuntime.java:277)
at org.glassfish.jersey.internal.Errors$1.call(Errors.java:272)
at org.glassfish.jersey.internal.Errors$1.call(Errors.java:268)
at org.glassfish.jersey.internal.Errors.process(Errors.java:316)
at org.glassfish.jersey.internal.Errors.process(Errors.java:298)
at org.glassfish.jersey.internal.Errors.process(Errors.java:268)
at org.glassfish.jersey.process.internal.RequestScope.runInScope(RequestScope.java:289)
at org.glassfish.jersey.server.ServerRuntime.process(ServerRuntime.java:256)
at org.glassfish.jersey.server.ApplicationHandler.handle(ApplicationHandler.java:703)
at com.alibaba.metrics.rest.server.jersey.HttpHandlerContainer.handle(HttpHandlerContainer.java:133)
at com.sun.net.httpserver.Filter$Chain.doFilter(Filter.java:79)
at sun.net.httpserver.AuthFilter.doFilter(AuthFilter.java:83)
at com.sun.net.httpserver.Filter$Chain.doFilter(Filter.java:82)
at sun.net.httpserver.ServerImpl$Exchange$LinkHandler.handle(ServerImpl.java:675)
at com.sun.net.httpserver.Filter$Chain.doFilter(Filter.java:79)
at sun.net.httpserver.ServerImpl$Exchange.run(ServerImpl.java:647)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)
`
Unit tests do not pass under JDK 11
Ensure the metrics can be exported to Spring micrometer.
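HttpGaugeSet, BufferPoolMetricSet, and possibly other files are missing the license header and should have it added.
I sincerely hope this project will not simply stop being maintained.
Could Timer include a count of the errors that occurred within the last minute?
That way I would need only one Timer to collect all the data I need. The error count is wanted for monitoring and alerting.
For a custom thread pool, are the metrics of interest (poolSize, activeThreads, queuedTasks, completedTasks) supported?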
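Whether the library ships a ready-made metric set for thread pools is not stated here, but the four values are all available from the JDK's `ThreadPoolExecutor`. The sketch below reads them into a map; `ThreadPoolStats` is a hypothetical helper, and each value could in principle be exposed through a Gauge.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

public class ThreadPoolStats {
    // Takes a point-in-time snapshot of the pool's counters.
    static Map<String, Long> snapshot(ThreadPoolExecutor pool) {
        Map<String, Long> stats = new LinkedHashMap<>();
        stats.put("poolSize", (long) pool.getPoolSize());
        stats.put("activeThreads", (long) pool.getActiveCount());
        stats.put("queuedTasks", (long) pool.getQueue().size());
        stats.put("completedTasks", pool.getCompletedTaskCount());
        return stats;
    }

    public static void main(String[] args) throws InterruptedException {
        ThreadPoolExecutor pool = new ThreadPoolExecutor(
                2, 2, 0L, TimeUnit.SECONDS, new LinkedBlockingQueue<Runnable>());
        for (int i = 0; i < 5; i++) {
            pool.execute(() -> { });
        }
        pool.shutdown();
        pool.awaitTermination(5, TimeUnit.SECONDS);
        System.out.println(snapshot(pool));
    }
}
```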
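When calculating a cluster histogram, it is usually difficult to configure the correct buckets the first time, so we should provide an HTTP endpoint to update the buckets dynamically.
Is there a specific plan to support dynamic parameters in annotations?
The Metrics naming convention says: "Do not use too many tags; 4-5 is generally enough." Suppose I want to break a statistic down by the value of some application metric A, which takes dozens of distinct values, say 1-100, and I want the QPS of calls where A=1, the QPS of calls where A=2, and so on. With tags this would be *.qps{A=1,A=2...}. Is this supported, and is there a better approach?
The REST service is implemented with JAX-RS. How can it be integrated into Spring MVC?
The Compass results are printed once a minute.
The result of getCount() turned out to be inconsistent with the result of compass.getSuccessCount().
In our production environment this appears after running for a while, and with high probability.
Output for the first minute: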
totalCount=624 successCount=619
Output one minute later:
totalCount=679,successCount=679
Is Prometheus supported?
Hello, does Dubbo Metrics support application A collecting the JVM metrics of application B,
given that application B exposes a JMX port?
The listener receives a Gauge that needs to be persisted in MetricManager. How can I update a Gauge inside MetricManager? MetricRegistryImpl#register throws an exception, "A metric named ... already exists", if I call MetricManager.register again with the same metricName.
CpuUsageService cpuUsageService = new CpuUsageServiceImpl(100L, 500L);
cpuUsageService.addListener("foo.bar", cpu -> {
SortedMap<MetricName, Gauge> gauges = MetricManager.getIMetricManager().getGauges(DUBBO_GROUP, MetricFilter.ALL);
Gauge<Float> cpuUser = gauges.get(new MetricName("dubbo.cpu." + invoker.getUrl().getHost(), MetricLevel.MAJOR));
if (cpuUser == null) {
MetricName metricName = new MetricName("dubbo.cpu." + invoker.getUrl().getHost(), MetricLevel.MAJOR);
MetricManager.register(DUBBO_GROUP, metricName, cpu);
} else {
//TODO: I need to update the Gauge here
}
});
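One possible way around the "already exists" error is to register a single mutable gauge once and only update the value it holds afterwards. The sketch below assumes that pattern; the `Gauge` interface here is a hypothetical stand-in (the real interface may declare additional methods), and `SettableGauge` is not a class from the library.

```java
import java.util.concurrent.atomic.AtomicReference;

public class SettableGaugeSketch {
    // Hypothetical stand-in for the library's Gauge interface.
    interface Gauge<T> {
        T getValue();
    }

    // Register one instance once; later updates only change the held value.
    static class SettableGauge<T> implements Gauge<T> {
        private final AtomicReference<T> value;

        SettableGauge(T initial) {
            this.value = new AtomicReference<>(initial);
        }

        void set(T newValue) {
            value.set(newValue);
        }

        @Override
        public T getValue() {
            return value.get();
        }
    }

    public static void main(String[] args) {
        SettableGauge<Float> cpuUser = new SettableGauge<>(0.10f);
        // In the listener, update the existing gauge instead of registering a new one.
        cpuUser.set(0.42f);
        System.out.println(cpuUser.getValue());
    }
}
```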
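I cannot understand the unit conversion here. I think the correct condition should be if (TimeUnit.SECONDS.equals(timestampPrecision)).
The same code appears in several places in the library; search globally for the keyword timestampPrecision.
Is this a bug?
Codecov test coverage data is not being collected.
Please refer to this project:
https://github.com/lovepoem/codecov-travis-maven-junit5-example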
The metrics data should be collectable by prometheus.
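This has no direct relationship with Dubbo.
If a user creates a Counter with:
Counter counter = MetricManager.getCounter("test", MetricName.build(xxx));
and holds the reference for later use, the counter is deleted automatically if it is not accessed within 1 day (the default, which can be configured).
Further updates to the counter can then neither be logged nor queried over HTTP.
MetricRegistry is a map. The current code handles growth by returning a no-op implementation once too many metrics have been registered.
Could inactive metrics be cleaned up automatically so that the map stays a reasonable size?
Counter accumulates continuously. How can accumulation per period be implemented? For example, sum the data for each minute, then sum the data for each hour on top of the per-minute sums.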