Giter Club home page Giter Club logo

linkedin / cruise-control Goto Github PK

View Code? Open in Web Editor NEW
2.7K 132.0 574.0 13.57 MB

Cruise-control is the first of its kind to fully automate the dynamic workload rebalance and self-healing of a Kafka cluster. It provides great value to Kafka users by simplifying the operation of Kafka clusters.

Home Page: https://github.com/linkedin/cruise-control/tags

License: BSD 2-Clause "Simplified" License

Java 98.21% Shell 0.12% Groovy 0.03% Python 1.56% HTML 0.08%
kafka cluster-management self-healing

cruise-control's People

Contributors

abhishekmendhekar avatar amuraru avatar bachmanity1 avatar becketqin avatar ccisgg avatar efeg avatar egyedt avatar emilyki avatar gergowilder avatar hshi2022 avatar ilievladiulian avatar jiao-zhangs avatar kismob avatar kyguy avatar linehrr avatar lmr3796 avatar matthinograb avatar mgrubent avatar mhratson avatar mimaison avatar mohitpali avatar nareshv avatar stanislavkozlovski avatar sudoa avatar tomncooper avatar urbandan avatar viktorsomogyi avatar vtomasr5 avatar will-lo avatar zornhsu avatar

Stargazers

 avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar

Watchers

 avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar

cruise-control's Issues

BAD GET Request for rest API - /proposal

Here is /state output which shows all goals are ready along with Proposal

{MonitorState: {state: RUNNING(11.600% trained), NumValidWindows: (1/1) (100.000%), NumValidPartitions: 306/306 (100.000%), flawedPartitions: 0}, ExecutorState: {state: NO_TASK_IN_PROGRESS}, AnalyzerState: {isProposalReady: true, ReadyGaols: [NetworkInboundUsageDistributionGoal, ReplicaDistributionGoal, LeaderBytesInDistributionGoals, CpuUsageDistributionGoal, TopicReplicaDistributionGoal, DiskUsageDistributionGoal, PotentialNwOutGoal, NetworkOutboundUsageDistributionGoal, RackAwareCapacityGoal]}}

Monitored Windows:
{1504614600000=100.000%}

Goal Readiness:
                             RackAwareCapacityGoal, (requiredNumSnapshots=1, minMonitoredPartitionPercentage=0.000, includedAllTopics=true), Ready
                                PotentialNwOutGoal, (requiredNumSnapshots=1, minMonitoredPartitionPercentage=0.950, includedAllTopics=false), Ready
                         DiskUsageDistributionGoal, (requiredNumSnapshots=1, minMonitoredPartitionPercentage=0.950, includedAllTopics=true), Ready
               NetworkInboundUsageDistributionGoal, (requiredNumSnapshots=1, minMonitoredPartitionPercentage=0.950, includedAllTopics=false), Ready
              NetworkOutboundUsageDistributionGoal, (requiredNumSnapshots=1, minMonitoredPartitionPercentage=0.950, includedAllTopics=false), Ready
                          CpuUsageDistributionGoal, (requiredNumSnapshots=1, minMonitoredPartitionPercentage=0.950, includedAllTopics=false), Ready
                    LeaderBytesInDistributionGoals, (requiredNumSnapshots=1, minMonitoredPartitionPercentage=0.950, includedAllTopics=false), Ready
                      TopicReplicaDistributionGoal, (requiredNumSnapshots=1, minMonitoredPartitionPercentage=0.000, includedAllTopics=true), Ready
                           ReplicaDistributionGoal, (requiredNumSnapshots=1, minMonitoredPartitionPercentage=0.000, includedAllTopics=true), Ready

But when I execute /proposal REST API, get BAD REQUEST

http://localhost:9090/kafkacruisecontrol/proposal?&verbose=true&ignore_proposal_cache=false&withAvailableValidWindows=true&withAvailableValidPartitions=true

http://localhost:9090/kafkacruisecontrol/proposal?goals=DiskUsageDistributionGoal&verbose=true&ignore_proposal_cache=false&withAvailableValidWindows=true&withAvailableValidPartitions=true

Bad GET request '/proposal'

have a folder for user libs

instead of -jars PATH_TO_YOUR_JAR_1,PATH_TO_YOUR_JAR_2, just drop user code (including transitive libs) under /contrib or some such folder and have it automatically be on the classpath

Custom Sampler ( & Reporter)

If i understand correctly the default way is to use the CruiseControlMetricsReporter but it should be possible to use a different one.
For Example, if already have metrics in prometheus/graphite, i would need to write a different metric sampler for cruise-control...?

  • What metrics are needed?
  • How would one write a different sampler?
  • Is there documentation to do this?

MonitorState NULL?

When I start cruise control, following log info appears constantly:

[2017-09-13 16:43:54,618] INFO Skipping best proposal precomputing because load monitor does not have enough snapshots. (com.linkedin.kafka.cruisecontrol.analyzer.GoalOptimizer)
[2017-09-13 16:44:24,618] INFO Skipping best proposal precomputing because load monitor does not have enough snapshots. (com.linkedin.kafka.cruisecontrol.analyzer.GoalOptimizer)
[2017-09-13 16:44:54,619] INFO Skipping best proposal precomputing because load monitor does not have enough snapshots. (com.linkedin.kafka.cruisecontrol.analyzer.GoalOptimizer)
[2017-09-13 16:45:24,619] INFO Skipping best proposal precomputing because load monitor does not have enough snapshots. (com.linkedin.kafka.cruisecontrol.analyzer.GoalOptimizer)
[2017-09-13 16:45:54,620] INFO Skipping best proposal precomputing because load monitor does not have enough snapshots. (com.linkedin.kafka.cruisecontrol.analyzer.GoalOptimizer)
[2017-09-13 16:46:24,620] INFO Skipping best proposal precomputing because load monitor does not have enough snapshots. (com.linkedin.kafka.cruisecontrol.analyzer.GoalOptimizer)
[2017-09-13 16:46:54,621] INFO Skipping best proposal precomputing because load monitor does not have enough snapshots. (com.linkedin.kafka.cruisecontrol.analyzer.GoalOptimizer)

Here is my browser's result , url address is localhost:8094/kafkacruisecontrol/state
{MonitorState: {state: RUNNING(0.000% trained), NumValidWindows: (0/0) (NaN%), NumValidPartitions: 0/4129 (0.000%), flawedPartitions: 0}, ExecutorState: {state: NO_TASK_IN_PROGRESS}, AnalyzerState: {isProposalReady: false, ReadyGaols: []}}

I can't get any useful information about my kafka state, what's wrong ?

kafka broker disconnected

this is log:
wechatimg7
and this is my kafka server.properties:
wechatimg8
i sure my kafka server is normal running
kafka version:0.10.1

Json Response From the Servlet Should Include a Version Field

The current version of the Cruise Control enables receiving JSON responses from the servlet by setting the json=true. However, the returned JSON response does not include a version field. This makes the applications using JSON responses prone to failures upon changes in the response format.

The JSON response should specify a version field in the response.

cannot connect to Bootstrap broker

[2017-08-30 19:01:21,153] INFO Initiating client connection, connectString=172.16.22.16:2181/ sessionTimeout=30000 watcher=org.I0Itec.zkclient.ZkClient@40e6dfe1 (org.apache.zookeeper.ZooKeeper)
[2017-08-30 19:01:21,171] INFO Opening socket connection to server 172.16.22.16/172.16.22.16:2181. Will not attempt to authenticate using SASL (unknown error) (org.apache.zookeeper.ClientCnxn)
[2017-08-30 19:01:21,182] INFO Waiting for keeper state SyncConnected (org.I0Itec.zkclient.ZkClient)
[2017-08-30 19:01:21,188] INFO Socket connection established to 172.16.22.16/172.16.22.16:2181, initiating session (org.apache.zookeeper.ClientCnxn)
[2017-08-30 19:01:21,251] INFO Session establishment complete on server 172.16.22.16/172.16.22.16:2181, sessionid = 0x35e1785afbac94c, negotiated timeout = 30000 (org.apache.zookeeper.ClientCnxn)
[2017-08-30 19:01:21,252] INFO zookeeper state changed (SyncConnected) (org.I0Itec.zkclient.ZkClient)
[2017-08-30 19:01:21,498] WARN Bootstrap broker 172.16.22.16:9092 disconnected (org.apache.kafka.clients.NetworkClient)
[2017-08-30 19:01:21,704] WARN Bootstrap broker 172.16.22.16:9092 disconnected (org.apache.kafka.clients.NetworkClient)
[2017-08-30 19:01:21,910] WARN Bootstrap broker 172.16.22.16:9092 disconnected (org.apache.kafka.clients.NetworkClient)
[2017-08-30 19:01:22,116] WARN Bootstrap broker 172.16.22.16:9092 disconnected (org.apache.kafka.clients.NetworkClient)
[2017-08-30 19:01:22,321] WARN Bootstrap broker 172.16.22.16:9092 disconnected (org.apache.kafka.clients.NetworkClient)
[2017-08-30 19:01:22,526] WARN Bootstrap broker 172.16.22.16:9092 disconnected (org.apache.kafka.clients.NetworkClient)
[2017-08-30 19:01:22,730] WARN Bootstrap broker 172.16.22.16:9092 disconnected (org.apache.kafka.clients.NetworkClient)
[2017-08-30 19:01:22,934] WARN Bootstrap broker 172.16.22.16:9092 disconnected (org.apache.kafka.clients.NetworkClient)

kafka-0.8.2.2

kafka-0.8.2.2 is not supported?

=======
[2017-08-30 15:47:04,626] INFO zookeeper state changed (SyncConnected) (org.I0Itec.zkclient.ZkClient)
Exception in thread "main" org.apache.kafka.common.protocol.types.SchemaException: Error reading field 'brokers': Error reading field 'host': Error reading string of length 12592, only 2017 bytes available
at org.apache.kafka.common.protocol.types.Schema.read(Schema.java:73)
at org.apache.kafka.clients.NetworkClient.parseResponse(NetworkClient.java:380)
at org.apache.kafka.clients.NetworkClient.handleCompletedReceives(NetworkClient.java:449)
at org.apache.kafka.clients.NetworkClient.poll(NetworkClient.java:269)
at org.apache.kafka.clients.consumer.internals.ConsumerNetworkClient.poll(ConsumerNetworkClient.java:232)
at org.apache.kafka.clients.consumer.internals.ConsumerNetworkClient.poll(ConsumerNetworkClient.java:195)
at org.apache.kafka.clients.consumer.internals.Fetcher.getTopicMetadata(Fetcher.java:253)
at org.apache.kafka.clients.consumer.internals.Fetcher.getAllTopicMetadata(Fetcher.java:233)
at org.apache.kafka.clients.consumer.KafkaConsumer.listTopics(KafkaConsumer.java:1340)
at com.linkedin.kafka.cruisecontrol.monitor.sampling.KafkaSampleStore.ensureTopicCreated(KafkaSampleStore.java:125)
at com.linkedin.kafka.cruisecontrol.monitor.sampling.KafkaSampleStore.configure(KafkaSampleStore.java:104)
at org.apache.kafka.common.config.AbstractConfig.getConfiguredInstance(AbstractConfig.java:207)
at com.linkedin.kafka.cruisecontrol.monitor.task.LoadMonitorTaskRunner.(LoadMonitorTaskRunner.java:86)
at com.linkedin.kafka.cruisecontrol.monitor.task.LoadMonitorTaskRunner.(LoadMonitorTaskRunner.java:61)
at com.linkedin.kafka.cruisecontrol.monitor.LoadMonitor.(LoadMonitor.java:112)
at com.linkedin.kafka.cruisecontrol.KafkaCruiseControl.(KafkaCruiseControl.java:75)
at com.linkedin.kafka.cruisecontrol.KafkaCruiseControlMain.main(KafkaCruiseControlMain.java:47)

Failed to start cruise control with parameter -daemon

[cruise-control]$ ./kafka-cruise-control-start.sh
USAGE: ./kafka-cruise-control-start.sh [-daemon] [-name servicename] [-loggc] [-jars jar_path] config_path [port]
[cruise-control]$ ./kafka-cruise-control-start.sh -daemon config/cruisecontrol.properties
Enabling Java debug options: -agentlib:jdwp=transport=dt_socket,server=y,suspend=n,address=5005
./kafka-cruise-control-start.sh: line 182: : No such file or directory
[cruise-control]$

Improve the behavior of MetricSampleAggregator.

Currently the metric sample aggregator provides the aggregated metrics in the following two ways:

  1. Available Valid Partitions - A cluster model with all the topics whose partitions are valid in all the snapshot windows.
  2. Available Recent Valid Snapshot Windows - A cluster model with the most recent continuous valid snapshot windows (A snapshot window is valid if its monitored partitions percentage meets the model completeness requirements of all the predefined goals. This means that if the latest window is invalid, the cluster model will not have even one valid snapshot window.

We should change the behavior of case 2 to just include all the valid snapshot windows.

Add options to __KafkaCruiseControl* topic creation

We're using log compaction by default. To make cruise control work we have to set the cleanup.policy to "delete".
It would be great if we can specify this (and maybe other topic configs) inside cruise control so they're set by default on topic creation.

NotEnoughValidSnapshotsException

## All the get APIs (except state) are throwing this:

com.linkedin.kafka.cruisecontrol.exception.NotEnoughValidSnapshotsException: The cluster model generation requires 1 snapshots with minimum monitored partitions percentage of 0.0. But there are only 0 valid snapshots available.
at com.linkedin.kafka.cruisecontrol.monitor.LoadMonitor.aggregateMetrics(LoadMonitor.java:511)
at com.linkedin.kafka.cruisecontrol.monitor.LoadMonitor.clusterModel(LoadMonitor.java:379)
at com.linkedin.kafka.cruisecontrol.monitor.LoadMonitor.clusterModel(LoadMonitor.java:347)
at com.linkedin.kafka.cruisecontrol.KafkaCruiseControl.clusterModel(KafkaCruiseControl.java:219)
at com.linkedin.kafka.cruisecontrol.servlet.KafkaCruiseControlServlet.getClusterLoad(KafkaCruiseControlServlet.java:349)
at com.linkedin.kafka.cruisecontrol.servlet.KafkaCruiseControlServlet.doGet(KafkaCruiseControlServlet.java:154)
at javax.servlet.http.HttpServlet.service(HttpServlet.java:687)
at javax.servlet.http.HttpServlet.service(HttpServlet.java:790)
at org.eclipse.jetty.servlet.ServletHolder.handle(ServletHolder.java:841)
at org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:535)
at org.eclipse.jetty.server.handler.ScopedHandler.nextHandle(ScopedHandler.java:188)
at org.eclipse.jetty.server.session.SessionHandler.doHandle(SessionHandler.java:1595)
at org.eclipse.jetty.server.handler.ScopedHandler.nextHandle(ScopedHandler.java:188)
at org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1253)
at org.eclipse.jetty.server.handler.ScopedHandler.nextScope(ScopedHandler.java:168)
at org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:473)
at org.eclipse.jetty.server.session.SessionHandler.doScope(SessionHandler.java:1564)
at org.eclipse.jetty.server.handler.ScopedHandler.nextScope(ScopedHandler.java:166)
at org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:1155)
at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:141)
at org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:132)
at org.eclipse.jetty.server.Server.handle(Server.java:564)
at org.eclipse.jetty.server.HttpChannel.handle(HttpChannel.java:317)
at org.eclipse.jetty.server.HttpConnection.onFillable(HttpConnection.java:251)
at org.eclipse.jetty.io.AbstractConnection$ReadCallback.succeeded(AbstractConnection.java:279)
at org.eclipse.jetty.io.FillInterest.fillable(FillInterest.java:110)
at org.eclipse.jetty.io.ChannelEndPoint$2.run(ChannelEndPoint.java:124)
at org.eclipse.jetty.util.thread.Invocable.invokePreferred(Invocable.java:128)
at org.eclipse.jetty.util.thread.Invocable$InvocableExecutor.invoke(Invocable.java:222)
at org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.doProduce(EatWhatYouKill.java:294)
at org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.run(EatWhatYouKill.java:199)
at org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:673)
at org.eclipse.jetty.util.thread.QueuedThreadPool$2.run(QueuedThreadPool.java:591)
at java.lang.Thread.run(Thread.java:745)

## State API is giving this even after 2 hours of start:
{MonitorState: {state: RUNNING(0.000% trained), NumValidWindows: (0/0) (NaN%), NumValidPartitions: 0/288 (0.000%), flawedPartitions: 0}, ExecutorState: {state: NO_TASK_IN_PROGRESS}, AnalyzerState: {isProposalReady: false, ReadyGoals: []}}

I read the old threads, have already checked all that is mentioned there:

  • __CruiseControlMetrics topic is present
  • cruise-control-metrics-reporter.jar is present on kafka brokers
  • metric.reporters has been set to com.linkedin.kafka.cruisecontrol.metricsreporter.CruiseControlMetricsReporter

Please help me with what should I do now to get snapshots.

Implement better errorhandling within CruiseControlMetricsReporter

We are looking to use CruiseControl for our next cluster expansion. We actively started working on getting the CruiseControl setup in our development environment.

1.) CruiseControl is expecting manual broker.id assignments and failing to startup when it is set to -1.

2.) We missed creating the __CruiseControlMetrics topic before starting the brokers with reporter enabled. Reviewing current CC issues helped as Issue #21 and debug logging made us realize what might be happening. Improved documentation probably will have helped us understand the Topic should be created prior to enabling the metrics reported on the broker. I have created the topic now, but, would also help if documentation can clarify what topic config such as retention period, number of partitions should be used.

3.) Better errorhandling would ensure missing topic , it may be better for the reporter to check that the topic exists and log an error. Currently it keeps logging excessively that topic is missing.

HTTP REST API timeout

I am trying to test cruise-control with 1 kafka broker. I use default settings without optional step 0 in QuickStart guide. Cruise control starts but HTTP endpoint (http://localhost:9090/kafkacruisecontrol/state) is timing out. How I can be sure cruise control is working and use his REST API?
Port is open:
telnet localhost 9090
Trying 127.0.0.1...
Connected to localhost.
Escape character is '^]'.

In the browser:
ERR_EMPTY_RESPONSE
or timeout in curl
curl http://localhost:9090/kafkacruisecontrol/state

The logs from the server is:
[2017-09-01 17:44:58,303] INFO DefaultSessionIdManager workerName=node0 (org.eclipse.jetty.server.session)
[2017-09-01 17:44:58,303] INFO No SessionScavenger set, using defaults (org.eclipse.jetty.server.session)
[2017-09-01 17:44:58,307] INFO Scavenging every 600000ms (org.eclipse.jetty.server.session)
[2017-09-01 17:44:58,316] INFO Started o.e.j.s.ServletContextHandler@2d0bfb24{/,null,AVAILABLE} (org.eclipse.jetty.server.handler.ContextHandler)
[2017-09-01 17:44:58,326] INFO Started ServerConnector@4525d1d3{HTTP/1.1,[http/1.1]}{0.0.0.0:9090} (org.eclipse.jetty.server.AbstractConnector)
[2017-09-01 17:44:58,327] INFO Started @2168ms (org.eclipse.jetty.server.Server)
Kafka Cruise Control started.

2017/09/01 17:45:06 {MonitorState: {state: LOADING(0.000% trained), LoadingProgress: -100.000%, NumValidWindows: (0/0): (NaN%), NumValidPartitions: 0/65 (0.000%), FlawedPartitions: 0}, ExecutorState: {state: NO_TASK_IN_PROGRESS}, AnalyzerState: {isProposalReady: false, ReadyGaols: []}}

2017-09-01 17:50:28,271] INFO Skipping best proposal precomputing because load monitor is in LOADING state. (com.linkedin.kafka.cruisecontrol.analyzer.GoalOptimizer)
[2017-09-01 17:50:33,216] INFO Sample loading finished. Loaded 0 partition metrics samples and 0 broker metric samples in 335021 ms (com.linkedin.kafka.cruisecontrol.monitor.sampling.KafkaSampleStore)

Null Borker_id Exception

Hi, we are testing CruiseControl on our production Kafka clusters, and got following error message:

[2017-12-27 22:42:16,678] FATAL (kafka.Kafka$)
java.lang.NumberFormatException: null
at java.lang.Integer.parseInt(Integer.java:542)
at java.lang.Integer.parseInt(Integer.java:615)
at com.linkedin.kafka.cruisecontrol.metricsreporter.CruiseControlMetricsReporter.configure(CruiseControlMetricsReporter.java:111)
at org.apache.kafka.common.config.AbstractConfig.getConfiguredInstances(AbstractConfig.java:239)
at kafka.server.KafkaConfig.(KafkaConfig.scala:925)
at kafka.server.KafkaConfig$.fromProps(KafkaConfig.scala:779)
at kafka.server.KafkaConfig$.fromProps(KafkaConfig.scala:776)
at kafka.server.KafkaServerStartable$.fromProps(KafkaServerStartable.scala:28)
at kafka.Kafka$.main(Kafka.scala:58)
at kafka.Kafka.main(Kafka.scala)

By look at source code, seems CruiseControlMetricsReporter needs to get Broker_id from KafkaConfig object, which reads from server.properties file.

The problem is our production Kafka cluster uses automatic Broker-ids generated by ZooKeeper, so there is no such "broker.id" attribute in the server.properties file, which causes the FATAL error above. Any suggestions to unblock ourselves in this scenario? Thanks!

Ren

[Kafka 0.10.2] CruiseControlMetricsReporter cannot be cast to kafka.metrics.KafkaMetricsReporter

Full stack

[2017-09-01 21:56:47,140] 0    [main] FATAL kafka.Kafka$  -  java.lang.ClassCastException: 
com.linkedin.kafka.cruisecontrol.metricsreporter.CruiseControlMetricsReporter cannot be cast to 
kafka.metrics.KafkaMetricsReporter
at k 
kafka.metrics.KafkaMetricsReporter$$anonfun$startReporters$1.apply(KafkaMetricsReporter.scala:65)
at 
kafka.metrics.KafkaMetricsReporter$$anonfun$startReporters$1.apply(KafkaMetricsReporter.scala:64)
at scala.collection.IndexedSeqOptimized$class.foreach(IndexedSeqOptimized.scala:33)
at scala.collection.mutable.WrappedArray.foreach(WrappedArray.scala:34)
at kafka.metrics.KafkaMetricsReporter$.startReporters(KafkaMetricsReporter.scala:64)
at kafka.server.KafkaServerStartable$.fromProps(KafkaServerStartable.scala:27)
at kafka.Kafka$.main(Kafka.scala:58)
at kafka.Kafka.main(Kafka.scala)

metrics config:

kafka.metrics.reporters=com.linkedin.kafka.cruisecontrol.metricsreporter.CruiseControlMetricsReporter

Shared lib location:

/usr/share/java/kafka/

Does cruise-control support securityProtocol SASL_PLAINTEXT

I have error , when run ./kafka-cruise-control-start.sh config/cruisecontrol.properties 9999
(com.linkedin.kafka.cruisecontrol.config.KafkaCruiseControlConfig)
Exception in thread "main" java.lang.IllegalArgumentException: clientSaslMechanism must be non-null in client mode if securityProtocol is SASL_PLAINTEXT
at org.apache.kafka.common.network.ChannelBuilders.create(ChannelBuilders.java:59)

i'm trying set clientSaslMechanism=GSSAPI in config but this did not help

Cruise Control cannot find sampling topic matches __CruiseControlMetrics

i got this error when i try to start Cruise Control:

[2017-09-06 09:48:26,651] INFO Kafka version : 0.10.1.0 (org.apache.kafka.common.utils.AppInfoParser)
[2017-09-06 09:48:26,651] INFO Kafka commitId : 3402a74efb23d1d4 (org.apache.kafka.common.utils.AppInfoParser)
Exception in thread "main" java.lang.IllegalStateException: Cruise Control cannot find sampling topic matches __CruiseControlMetrics in the target cluster.
at com.linkedin.kafka.cruisecontrol.monitor.sampling.CruiseControlMetricsReporterSampler.configure(CruiseControlMetricsReporterSampler.java:175)
at org.apache.kafka.common.config.AbstractConfig.getConfiguredInstance(AbstractConfig.java:207)
at com.linkedin.kafka.cruisecontrol.monitor.sampling.MetricFetcherManager.(MetricFetcherManager.java:110)
at com.linkedin.kafka.cruisecontrol.monitor.sampling.MetricFetcherManager.(MetricFetcherManager.java:62)
at com.linkedin.kafka.cruisecontrol.monitor.task.LoadMonitorTaskRunner.(LoadMonitorTaskRunner.java:61)
at com.linkedin.kafka.cruisecontrol.monitor.LoadMonitor.(LoadMonitor.java:112)
at com.linkedin.kafka.cruisecontrol.KafkaCruiseControl.(KafkaCruiseControl.java:75)
at com.linkedin.kafka.cruisecontrol.KafkaCruiseControlMain.main(KafkaCruiseControlMain.java:47)

did i miss some configurations or should i create this topic manually?

Add an option to blacklist topics from cruise control.

Hi,

we're using https://github.com/andreas-schroeder/kafka-health-check which creates several kafka topics. One broker health topic with one partition assigned to the broker and a "replication-check" topic with one replica per broker.
The problem is - when using the rack-aware goal, cruise control catches an exception because there are more replicas than racks.
Because of this it would be great to be able to blacklist whole topics from cruise control. Blacklisting the topic from partition movement is not sufficient here because the rack-aware goal still throws exceptions.

error while generating report

[2017-09-14 08:40:28,081] INFO Skipping best proposal precomputing because load monitor does not have enough snapshots. (com.linkedin.kafka.cruisecontrol.analyzer.GoalOptimizer)
[2017-09-14 08:40:33,594] ERROR Error building partition metric sample for test-0 (com.linkedin.kafka.cruisecontrol.monitor.sampling.CruiseControlMetricsProcessor)
java.lang.IllegalArgumentException: The partition bytes out rate 730.200000 is greater than the broker bytes out rate 13.832645
at com.linkedin.kafka.cruisecontrol.model.ModelUtils.estimateLeaderCpuUtil(ModelUtils.java:64)
at com.linkedin.kafka.cruisecontrol.monitor.sampling.CruiseControlMetricsProcessor.buildPartitionMetricSample(CruiseControlMetricsProcessor.java:222)
at com.linkedin.kafka.cruisecontrol.monitor.sampling.CruiseControlMetricsProcessor.process(CruiseControlMetricsProcessor.java:73)
at com.linkedin.kafka.cruisecontrol.monitor.sampling.CruiseControlMetricsReporterSampler.getSamples(CruiseControlMetricsReporterSampler.java:114)
at com.linkedin.kafka.cruisecontrol.monitor.sampling.SamplingFetcher.fetchSamples(SamplingFetcher.java:89)
at com.linkedin.kafka.cruisecontrol.monitor.sampling.SamplingFetcher.fetchMetricsForAssignedPartitions(SamplingFetcher.java:72)
at com.linkedin.kafka.cruisecontrol.monitor.sampling.MetricFetcher.call(MetricFetcher.java:24)
at com.linkedin.kafka.cruisecontrol.monitor.sampling.MetricFetcher.call(MetricFetcher.java:16)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)
[2017-09-14 08:40:33,597] ERROR Error building partition metric sample for __CruiseControlMetrics-0 (com.linkedin.kafka.cruisecontrol.monitor.sampling.CruiseControlMetricsProcessor)
java.lang.IllegalArgumentException: The partition bytes in rate 22.108129 is greater than the broker bytes in rate 13.282005
at com.linkedin.kafka.cruisecontrol.model.ModelUtils.estimateLeaderCpuUtil(ModelUtils.java:60)
at com.linkedin.kafka.cruisecontrol.monitor.sampling.CruiseControlMetricsProcessor.buildPartitionMetricSample(CruiseControlMetricsProcessor.java:222)
at com.linkedin.kafka.cruisecontrol.monitor.sampling.CruiseControlMetricsProcessor.process(CruiseControlMetricsProcessor.java:73)
at com.linkedin.kafka.cruisecontrol.monitor.sampling.CruiseControlMetricsReporterSampler.getSamples(CruiseControlMetricsReporterSampler.java:114)
at com.linkedin.kafka.cruisecontrol.monitor.sampling.SamplingFetcher.fetchSamples(SamplingFetcher.java:89)
at com.linkedin.kafka.cruisecontrol.monitor.sampling.SamplingFetcher.fetchMetricsForAssignedPartitions(SamplingFetcher.java:72)
at com.linkedin.kafka.cruisecontrol.monitor.sampling.MetricFetcher.call(MetricFetcher.java:24)
at com.linkedin.kafka.cruisecontrol.monitor.sampling.MetricFetcher.call(MetricFetcher.java:16)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)
[2017-09-14 08:40:33,598] INFO Generated 114(2 skipped) partition metric samples and 1 broker metric samples for timestamp 1505392789111 (com.linkedin.kafka.cruisecontrol.monitor.sampling.CruiseControlMetricsProcessor)

cannot find sampling topic matches __CruiseControlMetrics in the target cluster.

[2017-09-04 13:37:11,377] INFO ConsumerConfig values:
auto.commit.interval.ms = 5000
auto.offset.reset = latest
bootstrap.servers = [172.31.217.202:9092, 172.31.217.203:9092, 172.31.217.204:9092]
check.crcs = true
client.id = CruiseControlMetricsReporterSamplerConsumer-1417835929
connections.max.idle.ms = 540000
enable.auto.commit = false
exclude.internal.topics = true
fetch.max.bytes = 52428800
fetch.max.wait.ms = 500
fetch.min.bytes = 1
group.id = CruiseControlMetricsReporterSampler-6370840339601264904
heartbeat.interval.ms = 3000
interceptor.classes = null
key.deserializer = class org.apache.kafka.common.serialization.StringDeserializer
max.partition.fetch.bytes = 1048576
max.poll.interval.ms = 2147483647
max.poll.records = 2147483647
metadata.max.age.ms = 300000
metric.reporters = []
metrics.num.samples = 2
metrics.sample.window.ms = 30000
partition.assignment.strategy = [class org.apache.kafka.clients.consumer.RangeAssignor]
receive.buffer.bytes = 65536
reconnect.backoff.ms = 50
request.timeout.ms = 305000
retry.backoff.ms = 100
sasl.kerberos.kinit.cmd = /usr/bin/kinit
sasl.kerberos.min.time.before.relogin = 60000
sasl.kerberos.service.name = null
sasl.kerberos.ticket.renew.jitter = 0.05
sasl.kerberos.ticket.renew.window.factor = 0.8
sasl.mechanism = GSSAPI
security.protocol = PLAINTEXT
send.buffer.bytes = 131072
session.timeout.ms = 10000
ssl.cipher.suites = null
ssl.enabled.protocols = [TLSv1.2, TLSv1.1, TLSv1]
ssl.endpoint.identification.algorithm = null
ssl.key.password = null
ssl.keymanager.algorithm = SunX509
ssl.keystore.location = null
ssl.keystore.password = null
ssl.keystore.type = JKS
ssl.protocol = TLS
ssl.provider = null
ssl.secure.random.implementation = null
ssl.trustmanager.algorithm = PKIX
ssl.truststore.location = null
ssl.truststore.password = null
ssl.truststore.type = JKS
value.deserializer = class com.linkedin.kafka.cruisecontrol.metricsreporter.metric.MetricSerde
(org.apache.kafka.clients.consumer.ConsumerConfig)
[2017-09-04 13:37:11,390] INFO Kafka version : 0.10.1.0 (org.apache.kafka.common.utils.AppInfoParser)
[2017-09-04 13:37:11,391] INFO Kafka commitId : 3402a74efb23d1d4 (org.apache.kafka.common.utils.AppInfoParser)
Exception in thread "main" java.lang.IllegalStateException: Cruise Control cannot find sampling topic matches __CruiseControlMetrics in the target cluster.
at com.linkedin.kafka.cruisecontrol.monitor.sampling.CruiseControlMetricsReporterSampler.configure(CruiseControlMetricsReporterSampler.java:175)
at com.linkedin.kafka.cruisecontrol.monitor.sampling.MetricFetcherManager.(MetricFetcherManager.java:111)
at com.linkedin.kafka.cruisecontrol.monitor.sampling.MetricFetcherManager.(MetricFetcherManager.java:62)
at com.linkedin.kafka.cruisecontrol.monitor.task.LoadMonitorTaskRunner.(LoadMonitorTaskRunner.java:61)
at com.linkedin.kafka.cruisecontrol.monitor.LoadMonitor.(LoadMonitor.java:112)
at com.linkedin.kafka.cruisecontrol.KafkaCruiseControl.(KafkaCruiseControl.java:75)
at com.linkedin.kafka.cruisecontrol.KafkaCruiseControlMain.main(KafkaCruiseControlMain.java:47)

My kafka version: kafka_2.11-0.11.0.0
kafka topice:
__CruiseControlMetrics
test

Failed to start kafka server with latest cruise control metrics reporter

I noticed some improvement recently, so I git cloned the latest cruise control, and get the metrics reporter after ./gradlew jar
I replaced the old one with the new reporter, and then failed to boot kafka anymore unless I comment out the report in server.properties. The kafka I'm using is kafka_2.11-0.11.0.0

[2017-10-12 16:35:33,939] INFO Socket connection established to xxxxxxx/xxxxxx:2181, initiating session (org.apache.zookeeper.ClientCnxn)
[2017-10-12 16:35:33,949] INFO Session establishment complete on server 10.27.235.137/10.27.235.137:2181, sessionid = 0x15ebd36d0090043, negotiated timeout = 40000 (org.apache.zookeeper.ClientCnxn)
[2017-10-12 16:35:33,951] INFO zookeeper state changed (SyncConnected) (org.I0Itec.zkclient.ZkClient)
[2017-10-12 16:35:34,067] INFO Cluster ID = hMzPvbduRpmD0ucSMOu5og (kafka.server.KafkaServer)
[2017-10-12 16:35:34,083] INFO Using default value of localhost:9090 for cruise.control.metrics.reporter.bootstrap.servers (com.linkedin.kafka.cruisecontrol.metricsreporter.CruiseControlMetricsReporter)
[2017-10-12 16:35:34,083] INFO Using default value of PLAINTEXT for cruise.control.metrics.reporter.security.protocol (com.linkedin.kafka.cruisecontrol.metricsreporter.CruiseControlMetricsReporter)
[2017-10-12 16:35:34,089] FATAL [Kafka Server 1], Fatal error during KafkaServer startup. Prepare to shutdown (kafka.server.KafkaServer)
org.apache.kafka.common.config.ConfigException: Missing required configuration "bootstrap.servers" which has no default value.
at org.apache.kafka.common.config.ConfigDef.parseValue(ConfigDef.java:463)
at org.apache.kafka.common.config.ConfigDef.parse(ConfigDef.java:453)
at org.apache.kafka.common.config.AbstractConfig.(AbstractConfig.java:62)
at org.apache.kafka.common.config.AbstractConfig.(AbstractConfig.java:75)
at org.apache.kafka.clients.producer.ProducerConfig.(ProducerConfig.java:359)
at org.apache.kafka.clients.producer.KafkaProducer.(KafkaProducer.java:287)
at com.linkedin.kafka.cruisecontrol.metricsreporter.CruiseControlMetricsReporter.configure(CruiseControlMetricsReporter.java:109)
at org.apache.kafka.common.config.AbstractConfig.getConfiguredInstances(AbstractConfig.java:297)
at kafka.server.KafkaServer.startup(KafkaServer.scala:202)
at kafka.server.KafkaServerStartable.startup(KafkaServerStartable.scala:38)
at kafka.Kafka$.main(Kafka.scala:65)
at kafka.Kafka.main(Kafka.scala)
[2017-10-12 16:35:34,093] INFO [Kafka Server 1], shutting down (kafka.server.KafkaServer)
[2017-10-12 16:35:34,096] INFO Terminate ZkClient event thread. (org.I0Itec.zkclient.ZkEventThread)
[2017-10-12 16:35:34,101] INFO EventThread shut down for session: 0x15ebd36d0090043 (org.apache.zookeeper.ClientCnxn)
[2017-10-12 16:35:34,101] INFO Session: 0x15ebd36d0090043 closed (org.apache.zookeeper.ZooKeeper)
[2017-10-12 16:35:34,104] INFO [Kafka Server 1], shut down completed (kafka.server.KafkaServer)
[2017-10-12 16:35:34,104] FATAL Exiting Kafka. (kafka.server.KafkaServerStartable)
[2017-10-12 16:35:34,107] INFO [Kafka Server 1], shutting down (kafka.server.KafkaServer)

violation and failure detection log REST end point

Its good to have a chronological list of events which can be made available as a REST end point in CC.

Right now, we can have custom notifications via emails for

1 goal violation detection
2 broker failures, etc

If we can hit this /kafkacruisecontrol/events, we should be able to see the most recent things that the CC was able to detect on a broker cluster.

Error Response From the Servlet Should Be Standardized

In some cases the servlet returns an error message in case of a failure. However, in some other cases it throws an exception. The behavior during a failure should be standardized -- i.e. do not throw exceptions and return a well-defined error response in case of failures.

Centralization schedule question

Hello,I have a question:
i think it is impossible for u to haven only one server to deploy cruise-control。
if i have serval servers to deploy cruise-control,
all of the servers will monitor,analyzer,detector,finally
execute the preposals。if so,the preposals will be executed
more than once?
i want to know how do you handle this in your product environment

Add peak load information to the cluster model

As of now we only use the average utilization to decide how to move partitions and leaders. We should include the peak information into the cluster model as well.

So we need another goal to optimize the workload during peak time, especially for the CPU and network IO.

restart kafka server meet error

after set the metric.reporters to com.linkedin.kafka.cruisecontrol.metricsreporter.CruiseControlMetricsReporter

i restart the kafka server, meet the problem as below,

[2017-10-11 10:21:52,359] FATAL [Kafka Server 0], Fatal error during KafkaServer startup. Prepare to shutdown (kafka.server.KafkaServer) org.apache.kafka.common.config.ConfigException: Missing required configuration "bootstrap.servers" which has no default value. at org.apache.kafka.common.config.ConfigDef.parseValue(ConfigDef.java:463) at org.apache.kafka.common.config.ConfigDef.parse(ConfigDef.java:453) at org.apache.kafka.common.config.AbstractConfig.(AbstractConfig.java:62) at org.apache.kafka.common.config.AbstractConfig.(AbstractConfig.java:75) at org.apache.kafka.clients.producer.ProducerConfig.(ProducerConfig.java:360) at org.apache.kafka.clients.producer.KafkaProducer.(KafkaProducer.java:288) at com.linkedin.kafka.cruisecontrol.metricsreporter.CruiseControlMetricsReporter.configure(CruiseControlMetricsReporter.java:109) at org.apache.kafka.common.config.AbstractConfig.getConfiguredInstances(AbstractConfig.java:297) at kafka.server.KafkaServer.startup(KafkaServer.scala:202) at kafka.server.KafkaServerStartable.startup(KafkaServerStartable.scala:38) at kafka.Kafka$.main(Kafka.scala:65) at kafka.Kafka.main(Kafka.scala)

i add the bootstrap.servers in config/server.properties will not work.
how to solve it. kafka problem or version problem or sth else?

FAILURE: Build failed with an exception

15:06:33.980 [ERROR] [system.err] /home/winson/cruise-control/cruise-control-metrics-reporter/src/main/java/com/linkedin/kafka/cruisecontrol/metricsreporter/CruiseControlMetricsReporter.java:182: error: local variable ccm is accessed from within inner class; needs to be declared final
15:06:33.981 [ERROR] [system.err] LOG.debug("Failed to send cruise control metric {}", ccm);
15:06:33.981 [ERROR] [system.err] ^
15:06:33.989 [ERROR] [system.err] /home/winson/cruise-control/cruise-control-metrics-reporter/src/main/java/com/linkedin/kafka/cruisecontrol/metricsreporter/metric/BrokerMetric.java:56: error: cannot find symbol
15:06:33.989 [ERROR] [system.err] Long.BYTES /* time / + Integer.BYTES / broker id / +
15:06:33.989 [ERROR] [system.err] ^
15:06:33.990 [ERROR] [system.err] symbol: variable BYTES
15:06:33.990 [ERROR] [system.err] location: class Long
15:06:33.990 [ERROR] [system.err] /home/winson/cruise-control/cruise-control-metrics-reporter/src/main/java/com/linkedin/kafka/cruisecontrol/metricsreporter/metric/BrokerMetric.java:56: error: cannot find symbol
15:06:33.990 [ERROR] [system.err] Long.BYTES /
time / + Integer.BYTES / broker id */ +
15:06:33.990 [ERROR] [system.err] ^
15:06:33.991 [ERROR] [system.err] symbol: variable BYTES
15:06:33.991 [ERROR] [system.err] location: class Integer
15:06:33.994 [ERROR] [system.err] /home/winson/cruise-control/cruise-control-metrics-reporter/src/main/java/com/linkedin/kafka/cruisecontrol/metricsreporter/metric/BrokerMetric.java:57: error: cannot find symbol
15:06:33.994 [ERROR] [system.err]


please help me !

java.lang.IllegalArgumentException: Malformed \uxxxx encoding

All i get is this error when i run it. Please help me out

[niraj@localhost cruise-control-master]$ ./kafka-cruise-control-start.sh /home/niraj/Workspace/kafka/Test/out/artifacts/Test_jar/Test.jar config/cruisecontrol.properties
Exception in thread "main" java.lang.IllegalArgumentException: Malformed \uxxxx encoding.
at java.util.Properties.loadConvert(Properties.java:574)
at java.util.Properties.load0(Properties.java:391)
at java.util.Properties.load(Properties.java:341)
at com.linkedin.kafka.cruisecontrol.KafkaCruiseControlMain.main(KafkaCruiseControlMain.java:35)
[niraj@localhost cruise-control-master]$

unable to start kafka-cruise-control-start

hello
when i try to run the following command
./kafka-cruise-control-start.sh [-jars PATH_TO_YOUR_JAR_1,PATH_TO_YOUR_JAR_2] config/cruisecontrol.properties [port]

I dont know wich Jars files i need to put in the command line !
PATH_TO_YOUR_JAR_1,PATH_TO_YOUR_JAR_2 : what represent this jars ? and where i can find them ?

0.000% trained & valid partitions

Hi, I'm new to Cruise Control, and trying to run it for our production Kafka clusters. I have followed all the steps in wiki, and got following info from REST API:
curl -X GET host-running-cruisecontrol/kafkacruisecontrol/state?verbose=true

{MonitorState: {state: RUNNING(0.000% trained), NumValidWindows: (0/0) (NaN%), NumValidPartitions: 0/781 (0.000%), flawedPartitions: 0}, ExecutorState: {state: NO_TASK_IN_PROGRESS}, AnalyzerState: {isProposalReady: false, ReadyGoals: []}}

Monitored Windows:
{}

Goal Readiness:
RackAwareGoal, (requiredNumSnapshots=1, minMonitoredPartitionPercentage=0.000, includedAllTopics=true), NotReady
ReplicaCapacityGoal, (requiredNumSnapshots=1, minMonitoredPartitionPercentage=0.000, includedAllTopics=true), NotReady
CpuCapacityGoal, (requiredNumSnapshots=1, minMonitoredPartitionPercentage=0.950, includedAllTopics=true), NotReady
DiskCapacityGoal, (requiredNumSnapshots=1, minMonitoredPartitionPercentage=0.950, includedAllTopics=true), NotReady
NetworkInboundCapacityGoal, (requiredNumSnapshots=1, minMonitoredPartitionPercentage=0.950, includedAllTopics=true), NotReady
NetworkOutboundCapacityGoal, (requiredNumSnapshots=1, minMonitoredPartitionPercentage=0.950, includedAllTopics=true), NotReady
PotentialNwOutGoal, (requiredNumSnapshots=1, minMonitoredPartitionPercentage=0.950, includedAllTopics=false), NotReady
DiskUsageDistributionGoal, (requiredNumSnapshots=1, minMonitoredPartitionPercentage=0.950, includedAllTopics=true), NotReady
NetworkInboundUsageDistributionGoal, (requiredNumSnapshots=1, minMonitoredPartitionPercentage=0.950, includedAllTopics=false), NotReady
NetworkOutboundUsageDistributionGoal, (requiredNumSnapshots=1, minMonitoredPartitionPercentage=0.950, includedAllTopics=false), NotReady
CpuUsageDistributionGoal, (requiredNumSnapshots=1, minMonitoredPartitionPercentage=0.950, includedAllTopics=false), NotReady
LeaderBytesInDistributionGoal, (requiredNumSnapshots=1, minMonitoredPartitionPercentage=0.950, includedAllTopics=false), NotReady
TopicReplicaDistributionGoal, (requiredNumSnapshots=1, minMonitoredPartitionPercentage=0.000, includedAllTopics=true), NotReady
ReplicaDistributionGoal, (requiredNumSnapshots=1, minMonitoredPartitionPercentage=0.000, includedAllTopics=true), NotReady

There's no snapshots being taken, and it kept this state for hours. Seems that I didn't set up the service correctly... Could anyone give me some hint?

Thanks!
Ren

The goal for a proposal says fixed when it is not

There are 2 issues

  1. The goal shows that the status (FIXED) but the replica's are not moved. I have enabled self healing
  2. When a post request is issued - /kafkacruisecontrol/rebalance?goals=ReplicaDistributionGoal&dryrun=true. The goal shows as VIOLATED. This is very confusing for the user
    Kindly take a look at the image present attached

screen shot 2017-10-23 at 11 06 19 am

Cruise Control doesnt self heal when the goals are violated

I have a 3 broker cluster and I see the goals are violated. Broker1 has 116 partitions, Broker2 has 116 partitions and Broker 3 has 0 partitions but Kafka Cruise Controller doesnt redistribute the load.
I have set self.heal.enabled =true.

Missing default value for "num.sample.loading.threads"

num.sample.loading.threads is a config that determines the number of Kafka sample store processing / consumer threads. The current implementation sets the values for these config within the corresponding SampleStore class. It should be defined as a config in cruisecontrol.properties.

cannot start kafka server after adding metric.reporters class

i've just added the config in my kafka config file [server.properties]

metric.reporters=com.linkedin.kafka.cruisecontrol.metricsreporter.CruiseControlMetricsReporter

I have the error below

[2017-10-12 14:49:22,698] INFO Session establishment complete on server example.com/127.0.0.1:2181, sessionid = 0x15f10ad90270008, negotiated timeout = 6000 (org.apache.zookeeper.ClientCnxn)
[2017-10-12 14:49:22,699] INFO zookeeper state changed (SyncConnected) (org.I0Itec.zkclient.ZkClient)
[2017-10-12 14:49:22,885] INFO Cluster ID = o0tbTctYTu27tXkfEqCp9w (kafka.server.KafkaServer)
[2017-10-12 14:49:22,899] INFO Using default value of localhost:9092 for cruise.control.metrics.reporter.bootstrap.servers (com.linkedin.kafka.cruisecontrol.metricsreporter.CruiseControlMetricsReporter)
[2017-10-12 14:49:22,899] INFO Using default value of PLAINTEXT for cruise.control.metrics.reporter.security.protocol (com.linkedin.kafka.cruisecontrol.metricsreporter.CruiseControlMetricsReporter)
[2017-10-12 14:49:22,905] FATAL [KafkaServer id=0] Fatal error during KafkaServer startup. Prepare to shutdown (kafka.server.KafkaServer)
org.apache.kafka.common.config.ConfigException: Missing required configuration "bootstrap.servers" which has no default value.
at org.apache.kafka.common.config.ConfigDef.parseValue(ConfigDef.java:463)
at org.apache.kafka.common.config.ConfigDef.parse(ConfigDef.java:453)
at org.apache.kafka.common.config.AbstractConfig.(AbstractConfig.java:62)
at org.apache.kafka.common.config.AbstractConfig.(AbstractConfig.java:75)
at org.apache.kafka.clients.producer.ProducerConfig.(ProducerConfig.java:360)
at org.apache.kafka.clients.producer.KafkaProducer.(KafkaProducer.java:292)
at com.linkedin.kafka.cruisecontrol.metricsreporter.CruiseControlMetricsReporter.configure(CruiseControlMetricsReporter.java:109)
at org.apache.kafka.common.config.AbstractConfig.getConfiguredInstances(AbstractConfig.java:297)
at kafka.server.KafkaServer.startup(KafkaServer.scala:207)
at kafka.server.KafkaServerStartable.startup(KafkaServerStartable.scala:38)
at kafka.Kafka$.main(Kafka.scala:89)
at kafka.Kafka.main(Kafka.scala)

Get NotEnoughValidSnapshotsException when fetching /partition_load

I'm new here. When I try to fetch partition_load, I get this exception

Error processing GET request '/partition_load'
com.linkedin.kafka.cruisecontrol.exception.NotEnoughValidSnapshotsException: The cluster model generation requires 1 snapshots with minimum monitored partitions percentage of 0.0. But there are only 0 valid snapshots available.
	at com.linkedin.kafka.cruisecontrol.monitor.LoadMonitor.aggregateMetrics(LoadMonitor.java:509)
	at com.linkedin.kafka.cruisecontrol.monitor.LoadMonitor.clusterModel(LoadMonitor.java:377)
	at com.linkedin.kafka.cruisecontrol.KafkaCruiseControl.clusterModel(KafkaCruiseControl.java:234)
	at com.linkedin.kafka.cruisecontrol.servlet.KafkaCruiseControlServlet.getPartitionLoad(KafkaCruiseControlServlet.java:348)
	at com.linkedin.kafka.cruisecontrol.servlet.KafkaCruiseControlServlet.doGet(KafkaCruiseControlServlet.java:140)
	at javax.servlet.http.HttpServlet.service(HttpServlet.java:687)
	at javax.servlet.http.HttpServlet.service(HttpServlet.java:790)
	at org.eclipse.jetty.servlet.ServletHolder.handle(ServletHolder.java:841)
	at org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:535)
	at org.eclipse.jetty.server.handler.ScopedHandler.nextHandle(ScopedHandler.java:188)
	at org.eclipse.jetty.server.session.SessionHandler.doHandle(SessionHandler.java:1595)
	at org.eclipse.jetty.server.handler.ScopedHandler.nextHandle(ScopedHandler.java:188)
	at org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1253)
	at org.eclipse.jetty.server.handler.ScopedHandler.nextScope(ScopedHandler.java:168)
	at org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:473)
	at org.eclipse.jetty.server.session.SessionHandler.doScope(SessionHandler.java:1564)
	at org.eclipse.jetty.server.handler.ScopedHandler.nextScope(ScopedHandler.java:166)
	at org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:1155)
	at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:141)
	at org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:132)
	at org.eclipse.jetty.server.Server.handle(Server.java:564)
	at org.eclipse.jetty.server.HttpChannel.handle(HttpChannel.java:317)
	at org.eclipse.jetty.server.HttpConnection.onFillable(HttpConnection.java:251)
	at org.eclipse.jetty.io.AbstractConnection$ReadCallback.succeeded(AbstractConnection.java:279)
	at org.eclipse.jetty.io.FillInterest.fillable(FillInterest.java:110)
	at org.eclipse.jetty.io.ChannelEndPoint$2.run(ChannelEndPoint.java:124)
	at org.eclipse.jetty.util.thread.Invocable.invokePreferred(Invocable.java:128)
	at org.eclipse.jetty.util.thread.Invocable$InvocableExecutor.invoke(Invocable.java:222)
	at org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.doProduce(EatWhatYouKill.java:294)
	at org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.run(EatWhatYouKill.java:199)
	at org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:673)
	at org.eclipse.jetty.util.thread.QueuedThreadPool$2.run(QueuedThreadPool.java:591)
	at java.lang.Thread.run(Thread.java:745)

I don't mean why I have no valid snapshot, and I get these information in the log

[2017-09-22 18:40:55,346] INFO Rolled out new snapshot window 1506077100000, number of snapshots = 4 (com.linkedin.kafka.cruisecontrol.monitor.sampling.aggregator.MetricSampleAggregator)
[2017-09-22 18:40:55,346] INFO Removed snapshot window 1506076200000, number of snapshots = 3 (com.linkedin.kafka.cruisecontrol.monitor.sampling.aggregator.MetricSampleAggregator)

Any idea?
Thank you!

Question about capacity.json

Hiya,

What do the values in capacity.json represent?

      "capacity": {
        "DISK": "100000",
        "CPU": "100",
        "NW_IN": "10000",
        "NW_OUT": "10000"
      }

I'm guessing DISK and NW_IN/OUT are in bytes, but what is CPU measured on?

Thanks,

Ben

Bootstrap broker 192.168.50.4:6667 disconnected (org.apache.kafka.clients.NetworkClient)

[2017-08-31 10:33:12,109] INFO Kafka version : 0.10.1.0 (org.apache.kafka.common.utils.AppInfoParser)
[2017-08-31 10:33:12,110] INFO Kafka commitId : 3402a74efb23d1d4 (org.apache.kafka.common.utils.AppInfoParser)
[2017-08-31 10:33:12,156] INFO Starting ZkClient event thread. (org.I0Itec.zkclient.ZkEventThread)
[2017-08-31 10:33:12,162] INFO Client environment:zookeeper.version=3.4.8--1, built on 02/06/2016 03:18 GMT (org.apache.zookeeper.ZooKeeper)
[2017-08-31 10:33:12,162] INFO Client environment:host.name=zhangfeng.localdomain (org.apache.zookeeper.ZooKeeper)
[2017-08-31 10:33:12,162] INFO Client environment:java.version=1.8.0_77 (org.apache.zookeeper.ZooKeeper)
[2017-08-31 10:33:12,162] INFO Client environment:java.vendor=Oracle Corporation (org.apache.zookeeper.ZooKeeper)
[2017-08-31 10:33:12,163] INFO Client environment:java.home=/soft/jdk1.8.0_77/jre (org.apache.zookeeper.ZooKeeper)
[2017-08-31 10:33:12,163] INFO Client environment:java.class.path=./cruise-control/build/dependant-libs/commons-math3-3.6.1.jar:./cruise-control/build/dependant-libs/cruise-control-metrics-reporter.jar:./cruise-control/build/dependant-libs/gson-2.7.jar:./cruise-control/build/dependant-libs/hamcrest-core-1.3.jar:./cruise-control/build/dependant-libs/javax.servlet-api-3.1.0.jar:./cruise-control/build/dependant-libs/jetty-http-9.4.6.v20170531.jar:./cruise-control/build/dependant-libs/jetty-io-9.4.6.v20170531.jar:./cruise-control/build/dependant-libs/jetty-security-9.4.6.v20170531.jar:./cruise-control/build/dependant-libs/jetty-server-9.4.6.v20170531.jar:./cruise-control/build/dependant-libs/jetty-servlet-9.4.6.v20170531.jar:./cruise-control/build/dependant-libs/jetty-util-9.4.6.v20170531.jar:./cruise-control/build/dependant-libs/jline-0.9.94.jar:./cruise-control/build/dependant-libs/jopt-simple-4.9.jar:./cruise-control/build/dependant-libs/junit-4.12.jar:./cruise-control/build/dependant-libs/kafka-clients-0.10.1.0.jar:./cruise-control/build/dependant-libs/kafka_2.10-0.10.1.0-test.jar:./cruise-control/build/dependant-libs/kafka_2.10-0.10.1.0.jar:./cruise-control/build/dependant-libs/log4j-1.2.17.jar:./cruise-control/build/dependant-libs/lz4-1.3.0.jar:./cruise-control/build/dependant-libs/metrics-core-2.2.0.jar:./cruise-control/build/dependant-libs/metrics-core-3.2.2.jar:./cruise-control/build/dependant-libs/netty-3.7.0.Final.jar:./cruise-control/build/dependant-libs/scala-library-2.10.6.jar:./cruise-control/build/dependant-libs/slf4j-api-1.7.25.jar:./cruise-control/build/dependant-libs/slf4j-log4j12-1.7.21.jar:./cruise-control/build/dependant-libs/snappy-java-1.1.2.6.jar:./cruise-control/build/dependant-libs/zkclient-0.9.jar:./cruise-control/build/dependant-libs/zookeeper-3.4.8.jar:./cruise-control/build/libs/cruise-control.jar:./cruise-control-metrics-reporter/build/libs/cruise-control-metrics-reporter.jar (org.apache.zookeeper.ZooKeeper)
[2017-08-31 10:33:12,163] INFO Client environment:java.library.path=/usr/java/packages/lib/amd64:/usr/lib64:/lib64:/lib:/usr/lib (org.apache.zookeeper.ZooKeeper)
[2017-08-31 10:33:12,163] INFO Client environment:java.io.tmpdir=/tmp (org.apache.zookeeper.ZooKeeper)
[2017-08-31 10:33:12,163] INFO Client environment:java.compiler= (org.apache.zookeeper.ZooKeeper)
[2017-08-31 10:33:12,163] INFO Client environment:os.name=Linux (org.apache.zookeeper.ZooKeeper)
[2017-08-31 10:33:12,163] INFO Client environment:os.arch=amd64 (org.apache.zookeeper.ZooKeeper)
[2017-08-31 10:33:12,163] INFO Client environment:os.version=4.4.0-43-Microsoft (org.apache.zookeeper.ZooKeeper)
[2017-08-31 10:33:12,163] INFO Client environment:user.name=root (org.apache.zookeeper.ZooKeeper)
[2017-08-31 10:33:12,164] INFO Client environment:user.home=/root (org.apache.zookeeper.ZooKeeper)
[2017-08-31 10:33:12,164] INFO Client environment:user.dir=/root/CruiseControl (org.apache.zookeeper.ZooKeeper)
[2017-08-31 10:33:12,164] INFO Initiating client connection, connectString=192.168.50.3:2181,192.168.50.4:2181,192.168.50.5:2181/ sessionTimeout=30000 watcher=org.I0Itec.zkclient.ZkClient@6328d34a (org.apache.zookeeper.ZooKeeper)
[2017-08-31 10:33:12,184] INFO Waiting for keeper state SyncConnected (org.I0Itec.zkclient.ZkClient)
[2017-08-31 10:33:12,191] INFO Opening socket connection to server 192.168.50.5/192.168.50.5:2181. Will not attempt to authenticate using SASL (unknown error) (org.apache.zookeeper.ClientCnxn)
[2017-08-31 10:33:12,242] INFO Socket connection established to 192.168.50.5/192.168.50.5:2181, initiating session (org.apache.zookeeper.ClientCnxn)
[2017-08-31 10:33:12,544] INFO Session establishment complete on server 192.168.50.5/192.168.50.5:2181, sessionid = 0x35e043e42a30842, negotiated timeout = 30000 (org.apache.zookeeper.ClientCnxn)
[2017-08-31 10:33:12,546] INFO zookeeper state changed (SyncConnected) (org.I0Itec.zkclient.ZkClient)
[2017-08-31 10:33:13,150] WARN Bootstrap broker 192.168.50.5:6667 disconnected (org.apache.kafka.clients.NetworkClient)
[2017-08-31 10:33:13,512] WARN Bootstrap broker 192.168.50.4:6667 disconnected (org.apache.kafka.clients.NetworkClient)
[2017-08-31 10:33:14,262] WARN Bootstrap broker 192.168.50.4:6667 disconnected (org.apache.kafka.clients.NetworkClient)
[2017-08-31 10:33:14,543] WARN Bootstrap broker 192.168.50.5:6667 disconnected (org.apache.kafka.clients.NetworkClient)
[2017-08-31 10:33:14,862] WARN Bootstrap broker 192.168.50.5:6667 disconnected (org.apache.kafka.clients.NetworkClient)
[2017-08-31 10:33:15,282] WARN Bootstrap broker 192.168.50.4:6667 disconnected (org.apache.kafka.clients.NetworkClient)
[2017-08-31 10:33:15,451] WARN Bootstrap broker 192.168.50.5:6667 disconnected (org.apache.kafka.clients.NetworkClient)
[2017-08-31 10:33:16,101] WARN Bootstrap broker 192.168.50.4:6667 disconnected (org.apache.kafka.clients.NetworkClient)

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. 📊📈🎉

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google ❤️ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.