A performance monitor for Storm topologies. It consists of:
- A TaskHook that extracts topology information and manages local metrics
- A simple HTTP server that can report the last 100 seconds of gathered metrics
- A set of configurable charts that understand the metrics schema and can help identify performance problems in the monitored topology:
  - A histogram that shows pending tuple counts (i.e. the backlog) on each stream in the topology
  - A "Share of Voice" chart that shows the pending tuple counts for a given stream for each of a component's tasks
  - A "Share of Voice" chart that shows the relative size of the backlogs on all streams for the topology
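To make the "Share of Voice" idea concrete, here is a minimal sketch of the underlying arithmetic: each task's pending tuple count divided by the total across all tasks. The class name `ShareOfVoice`, the method `shares`, and the input shape (task ID mapped to pending count) are hypothetical and chosen for illustration; they are not part of this project's API or its metrics schema.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper, not part of this project: illustrates the
// "share of the total backlog" calculation only.
public class ShareOfVoice {

    /**
     * Returns each task's fraction of the total pending tuple count.
     * If the total is zero, every task gets a share of 0.0.
     */
    public static Map<Integer, Double> shares(Map<Integer, Long> pendingByTask) {
        long total = 0L;
        for (long pending : pendingByTask.values()) {
            total += pending;
        }
        // LinkedHashMap keeps the input order, which is convenient for charting.
        Map<Integer, Double> result = new LinkedHashMap<>();
        for (Map.Entry<Integer, Long> e : pendingByTask.entrySet()) {
            double share = (total == 0L) ? 0.0 : (double) e.getValue() / total;
            result.put(e.getKey(), share);
        }
        return result;
    }

    public static void main(String[] args) {
        // Assumed sample data: task ID -> pending tuples on one stream.
        Map<Integer, Long> pending = new LinkedHashMap<>();
        pending.put(1, 30L);
        pending.put(2, 10L);
        pending.put(3, 60L);
        System.out.println(shares(pending));
    }
}
```

A task whose share is far larger than its peers' is a likely bottleneck candidate for that stream.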
A typical use of MonitorClient in your own code looks like this:

    Map<String, Object> conf = new HashMap<String, Object>();
    conf.put("storm.monitor.host", a_String_IP_address);
    conf.put("storm.monitor.port", an_Integer_Port_Number);
    conf.put("storm.monitor.start", System.currentTimeMillis()); // should be the same for all calls
    conf.put("storm.monitor.bucketsize", 1000L);
    MonitorClient mclient = MonitorClient.forConfig(conf);
    ...
    // There are multiple metric groups, each with multiple metrics.
    // Components have names and multiple instances, each of which has an integer ID
    mclient.declare(metricGroup, metric, task_id, component_id);
    ...
    mclient.increment(metricGroup, metric, 1L, task_id);
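The `storm.monitor.start` and `storm.monitor.bucketsize` properties suggest that metrics are grouped into fixed-size time buckets counted from a shared start time, which would explain why the start value should be the same for all calls. The sketch below shows that arithmetic under two assumptions that the source does not confirm: the bucket size is in milliseconds, and the 100-second reporting window corresponds to 100 buckets of 1000 ms. `BucketMath` and its methods are hypothetical and are not part of this project.

```java
// Hypothetical helper, not part of this project: illustrates how a shared
// start time and a bucket size could map timestamps to bucket indices.
public class BucketMath {

    /** Index of the bucket that contains timestampMs, counting from startMs. */
    public static long bucketIndex(long startMs, long bucketSizeMs, long timestampMs) {
        if (bucketSizeMs <= 0) {
            throw new IllegalArgumentException("bucketSizeMs must be positive");
        }
        // floorDiv keeps indices consistent for timestamps before startMs.
        return Math.floorDiv(timestampMs - startMs, bucketSizeMs);
    }

    /** Number of buckets needed to cover windowMs, rounded up. */
    public static long bucketsInWindow(long windowMs, long bucketSizeMs) {
        if (bucketSizeMs <= 0) {
            throw new IllegalArgumentException("bucketSizeMs must be positive");
        }
        return (windowMs + bucketSizeMs - 1) / bucketSizeMs;
    }

    public static void main(String[] args) {
        long start = 1_000L;      // assumed shared start time (ms)
        long bucketSize = 1_000L; // matches the 1000L in the example above
        System.out.println(bucketIndex(start, bucketSize, 3_500L));
        System.out.println(bucketsInWindow(100_000L, bucketSize));
    }
}
```

If two clients used different start values, the same wall-clock instant would fall into different bucket indices under this scheme, and their counts could not be lined up.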
You can also add the MonitorClient to all spouts and bolts in a topology by adding this line to the driver class before submitting the topology:

    TaskHook.registerTo(config);

Here, config is of type backtype.storm.Config and contains the properties described above.
See web/monitor-web in this repo for sample HTML and the JavaScript it requires.