hdinsight / hdinsight-kafka-tools
HDInsight Kafka Tools
I see the Kafka directory in the HDP folder, but there is no service definition in Ambari. What am I missing here?
If Kafka is configured for IP advertising as per https://docs.microsoft.com/en-us/azure/hdinsight/kafka/apache-kafka-connect-vpn-gateway#configure-kafka-for-ip-advertising, the rebalance script stops working. The logs show the following error:
"2019-04-05 11:15:10,908 - rebalance_rackaware.py [11028] main - INFO - Checking if all brokers are up.
2019-04-05 11:15:10,908 - rebalance_rackaware.py [11028] main - ERROR - VM 0 with FQDN: wn0-pro-ka has no brokers assigned. Ensure that all brokers are up! It is not recommended to perform replica rebalance when brokers are down.
2019-04-05 11:15:10,908 - rebalance_rackaware.py [11028] main - INFO - No need to rebalance. Current Kafka replica assignment has High Availability OR minimum requirements for rebalance not met. Check logs at /var/log/kafka/rebalance_log for more info."
It looks like the script relies on the host names returned by ZooKeeper. Check this code:
'''
def get_brokerhost_info(zookeeper_client):
    logger.info("Associating brokers to hosts...")
    zk_brokers_ids = zookeeper_client.get_children(BROKERS_ID_PATH)
    brokers_info = {}
    for zk_broker_id in zk_brokers_ids:
        zk_broker_id_data, stat = zookeeper_client.get('{0}/{1}'.format(BROKERS_ID_PATH, zk_broker_id))
        zk_broker_info = json.loads(zk_broker_id_data)
        zk_broker_host = zk_broker_info['host'].split('.')[0]
        brokers_info[zk_broker_host] = zk_broker_id
    return brokers_info
'''
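With IP advertising, the "host" field stored in ZooKeeper is an IP address rather than a host name, so splitting it on '.' yields "10" instead of the VM name. This breaks the script: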
[email protected]:~$ /usr/hdp/current/kafka-broker/bin/zookeeper-shell.sh zk0-....:2181 <<< "get /brokers/ids/1004"
Connecting to zk0-....:2181
Welcome to ZooKeeper!
JLine support is disabled
WATCHER::
WatchedEvent state:SyncConnected type:None path:null
{"listener_security_protocol_map":{"PLAINTEXT":"PLAINTEXT"},"endpoints":["PLAINTEXT://10.0.0.5:9092"],"rack":"/default-rack","jmx_port":9999,"host":"10.0.0.5","timestamp":"1554359490433","port":9092,"version":4}
Is there any chance you could fix this so that the script also handles Kafka configured for IP advertising? Thanks.
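One possible direction, as a minimal sketch rather than the maintainers' fix: detect whether the "host" value registered in ZooKeeper is an IP address and, if so, keep it whole instead of splitting on '.'. The helper names `is_ip_address`, `broker_host_key` and `parse_broker_host` are hypothetical and not part of rebalance_rackaware.py. The sketch assumes Python 3 (for the standard `ipaddress` module) and assumes the rest of the script would then match brokers to VMs by IP address instead of by short host name.

```python
import ipaddress
import json


def is_ip_address(host):
    """Return True if host parses as an IPv4 or IPv6 address."""
    try:
        ipaddress.ip_address(host)
        return True
    except ValueError:
        return False


def broker_host_key(host):
    """Hypothetical helper: pick the key used to identify a broker's VM.

    An IP-advertised broker keeps its full address; a broker registered
    by FQDN is reduced to its short host name, as the script does today.
    """
    return host if is_ip_address(host) else host.split('.')[0]


def parse_broker_host(zk_broker_id_data):
    """Extract the host key from the JSON stored under /brokers/ids/<id>."""
    if isinstance(zk_broker_id_data, bytes):
        zk_broker_id_data = zk_broker_id_data.decode('utf-8')
    return broker_host_key(json.loads(zk_broker_id_data)['host'])
```

Inside `get_brokerhost_info`, the `split('.')[0]` line could then call `broker_host_key(zk_broker_info['host'])`. The check that compares these keys against VM names would still need a matching change, which is not shown here.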
I'm referencing the --throttle parameter of kafka-reassign-partitions.sh as documented here.
There should be a way to pass in a throttle parameter so that it's possible to limit the impact these data-intensive operations will have on users.
The other key step is to remove the throttle after rebalancing. Running kafka-reassign-partitions.sh with --verify once the reassignment has completed should remove the throttle automatically.
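As a rough sketch of how the request could look in the script, the snippet below only builds the argument lists for the two kafka-reassign-partitions.sh invocations: one that executes the plan with the tool's `--throttle` option (bytes per second) and one that verifies it afterwards. The function name `build_reassign_command`, its parameters and the plan file path are hypothetical. The location of the tool is an assumption based on the directory of zookeeper-shell.sh in the session quoted earlier, and `--zookeeper` assumes a Kafka version whose tool still accepts that option.

```python
# Assumed location of the Kafka CLI tools on an HDInsight node.
KAFKA_BIN = "/usr/hdp/current/kafka-broker/bin"


def build_reassign_command(zookeeper, plan_file, action, throttle_bytes_per_sec=None):
    """Hypothetical helper: build argv for kafka-reassign-partitions.sh.

    action is "execute" or "verify"; the throttle is only meaningful
    when executing a reassignment plan.
    """
    if action not in ("execute", "verify"):
        raise ValueError("action must be 'execute' or 'verify'")
    cmd = [
        KAFKA_BIN + "/kafka-reassign-partitions.sh",
        "--zookeeper", zookeeper,
        "--reassignment-json-file", plan_file,
        "--" + action,
    ]
    if throttle_bytes_per_sec is not None:
        if action != "execute":
            raise ValueError("a throttle can only be set with 'execute'")
        cmd += ["--throttle", str(int(throttle_bytes_per_sec))]
    return cmd
```

The returned lists could be passed to `subprocess.check_call`, first with `action="execute"` and a throttle, then with `action="verify"` repeated until the tool reports that the reassignment has completed.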