ahrtr / etcd-defrag

An easier to use and smarter etcd defragmentation tool

License: MIT License

Go 89.36% Makefile 2.55% Shell 7.63% Dockerfile 0.46%

defragmentation etcd

etcd-defrag's People

caojiamingalan svenwiltink clementnuss git-yww mcntrn elbehery soulkyu

etcd-defrag's Issues

Ready for production?

Hi Benjamin Wang

Is this tool ready for production yet?

I have seen dirty and unreliable scripts for etcd defragmentation, which is why I am very happy to see this tool. Thank you!

"Evaluation result is false" should not be redirected to os.Stderr

Currently, when etcd-defrag skips an endpoint because the defragmentation rule evaluates to false, the log line reporting this is written to stderr.

This is not an error, so it should not go to stderr; stdout would be more appropriate.

Here is the relevant Go line:

etcd-defrag/main.go

Line 141 in 88a7fdd

 fmt.Fprintf(os.Stderr, "Evaluation result is false, so skipping endpoint: %s\n", ep) 

I can create a PR if you think it's OK!
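A minimal sketch of the proposed change, assuming the message is built by a small helper. The helper name `skipMessage` and the endpoint in `main` are hypothetical and not part of etcd-defrag. The only point is that the informational line is written to `os.Stdout` rather than `os.Stderr`:

```go
package main

import (
	"fmt"
	"os"
)

// skipMessage is a hypothetical helper (not part of etcd-defrag) that
// builds the informational line quoted in this issue.
func skipMessage(ep string) string {
	return fmt.Sprintf("Evaluation result is false, so skipping endpoint: %s\n", ep)
}

func main() {
	// Skipping an endpoint is expected behaviour, so the message goes to
	// stdout; stderr stays reserved for real failures.
	fmt.Fprint(os.Stdout, skipMessage("https://127.0.0.1:2379"))
}
```

With this split, a caller that redirects stderr to an alerting channel would no longer see routine skips there.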

defragmentation rule not working properly

I have set up a defragmentation rule: dbSizeInUse / dbSize < 0.5.
Based on the etcd database size, none of the endpoints should be defragmented.

Rule calculation per endpoint:
https://10.8.38.111:2379: 48812032 (dbSizeInUse) / 48812032 (dbSize) = 1
https://10.8.60.123:2379: 48799744 (dbSizeInUse) / 48799744 (dbSize) = 1
https://10.8.62.107:2379: 48803840 (dbSizeInUse) / 48807936 (dbSize) = 0.99991

But for some reason defragmentation was executed anyway.
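The arithmetic in this report can be checked with a short Go sketch. It hard-codes the rule dbSizeInUse / dbSize < 0.5 and the sizes reported for the three endpoints; the type and function names are illustrative and are not etcd-defrag's own rule evaluator. It only confirms the reporter's expectation and does not explain why the tool decided otherwise.

```go
package main

import "fmt"

// memberStatus holds the two sizes the rule uses. Illustrative type only.
type memberStatus struct {
	endpoint    string
	dbSize      int64
	dbSizeInUse int64
}

// needsDefrag evaluates the rule dbSizeInUse / dbSize < 0.5 in floating point.
func needsDefrag(s memberStatus) bool {
	return float64(s.dbSizeInUse)/float64(s.dbSize) < 0.5
}

// reportedMembers returns the values quoted in this issue.
func reportedMembers() []memberStatus {
	return []memberStatus{
		{"https://10.8.38.111:2379", 48812032, 48812032},
		{"https://10.8.60.123:2379", 48799744, 48799744},
		{"https://10.8.62.107:2379", 48807936, 48803840},
	}
}

func main() {
	for _, m := range reportedMembers() {
		ratio := float64(m.dbSizeInUse) / float64(m.dbSize)
		fmt.Printf("%s: ratio=%.5f defragment=%t\n", m.endpoint, ratio, needsDefrag(m))
	}
}
```

Under this rule, every endpoint in the report evaluates to false, so none of them should have been selected.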

etcd-defrag execution example:

Validating configuration.
Validating the defragmentation rule: dbSizeInUse / dbSize < 0.5 ... valid
Performing health check.
endpoint: https://10.8.60.123:2379, health: true, took: 13.016349ms, error:
endpoint: https://10.8.62.107:2379, health: true, took: 11.974902ms, error:
endpoint: https://10.8.38.111:2379, health: true, took: 18.972517ms, error:
Getting members status
endpoint: https://10.8.38.111:2379, dbSize: 48812032, dbSizeInUse: 48812032, memberId: b12d455af0b42502, leader: 27e0fbbbc2bc90b, revision: 1560450397, term: 5761, index: 1905174647
endpoint: https://10.8.60.123:2379, dbSize: 48799744, dbSizeInUse: 48799744, memberId: 27e0fbbbc2bc90b, leader: 27e0fbbbc2bc90b, revision: 1560450397, term: 5761, index: 1905174648
endpoint: https://10.8.62.107:2379, dbSize: 48807936, dbSizeInUse: 48803840, memberId: c97a792b85f34523, leader: 27e0fbbbc2bc90b, revision: 1560450397, term: 5761, index: 1905174648
Running compaction until revision: 1560450397 ... successful
3 endpoint(s) need to be defragmented: [https://10.8.38.111:2379 https://10.8.62.107:2379 https://10.8.60.123:2379]
[Before defragmentation] endpoint: https://10.8.38.111:2379, dbSize: 49053696, dbSizeInUse: 46804992, memberId: b12d455af0b42502, leader: 27e0fbbbc2bc90b, revision: 1560450404, term: 5761, index: 1905174656
Defragmenting endpoint "https://10.8.38.111:2379"
Finished defragmenting etcd endpoint "https://10.8.38.111:2379". took 1.083007173s
[Post defragmentation] endpoint: https://10.8.38.111:2379, dbSize: 46170112, dbSizeInUse: 46170112, memberId: b12d455af0b42502, leader: 27e0fbbbc2bc90b, revision: 1560450416, term: 5761, index: 1905174668
[Before defragmentation] endpoint: https://10.8.62.107:2379, dbSize: 49025024, dbSizeInUse: 46235648, memberId: c97a792b85f34523, leader: 27e0fbbbc2bc90b, revision: 1560450420, term: 5761, index: 1905174672
Defragmenting endpoint "https://10.8.62.107:2379"
Finished defragmenting etcd endpoint "https://10.8.62.107:2379". took 962.878881ms
[Post defragmentation] endpoint: https://10.8.62.107:2379, dbSize: 46219264, dbSizeInUse: 46219264, memberId: c97a792b85f34523, leader: 27e0fbbbc2bc90b, revision: 1560450429, term: 5761, index: 1905174681
[Before defragmentation] endpoint: https://10.8.60.123:2379, dbSize: 49004544, dbSizeInUse: 46272512, memberId: 27e0fbbbc2bc90b, leader: 27e0fbbbc2bc90b, revision: 1560450432, term: 5761, index: 1905174684
Defragmenting endpoint "https://10.8.60.123:2379"
Finished defragmenting etcd endpoint "https://10.8.60.123:2379". took 936.36402ms
[Post defragmentation] endpoint: https://10.8.60.123:2379, dbSize: 46223360, dbSizeInUse: 46215168, memberId: 27e0fbbbc2bc90b, leader: 27e0fbbbc2bc90b, revision: 1560450435, term: 5761, index: 1905174687
The defragmentation is successful.

etcd cluster with a learner is not supported

We tried to use etcd-defrag to defragment our etcd clusters and found that it fails quickly, because the learner node in the cluster does not support the RPC used for the health check.

Here is the execution log:

Validating configuration.
Validating the defragmentation rule: dbQuotaUsage > 0.8 || dbSizeFree/dbQuotaUsage > 0.5 ... valid
Performing health check.
{"level":"warn","ts":"2023-10-12T17:51:18.358902+0800","logger":"client","caller":"[email protected]/retry_interceptor.go:65","msg":"retrying of unary invoker failed","target":"etcd-endpoints://0xc00030a000/11.11.11.11:2379","method":"/etcdserverpb.KV/Range","attempt":0,"error":"rpc error: code = Unavailable desc = etcdserver: rpc not supported for learner"}
endpoint: https://11.11.11.11:2379/, health: false, took: 7.499546ms, error: etcdserver: rpc not supported for learner
endpoint: https://33.33.33.33:2379/, health: true, took: 7.733879ms, error:
endpoint: https://44.44.44.44:2379/, health: true, took: 9.555876ms, error:
endpoint: https://55.55.55.55:2379/, health: true, took: 10.164246ms, error:
endpoint: https://66.66.66.66:2379/, health: true, took: 9.741549ms, error:
endpoint: https://22.22.22.22:2379/, health: true, took: 43.014812ms, error:

So is this an ongoing issue?
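One possible workaround, sketched here under assumptions, is to leave learner members out of the health check, since the log shows the check issuing a KV Range request that learners reject. The `member` type and `votingEndpoints` function below are hypothetical stand-ins, not etcd-defrag code; the sketch assumes the caller already knows which members are learners. Whether a learner should still be defragmented is a separate question that this sketch does not address.

```go
package main

import "fmt"

// member is a simplified, hypothetical stand-in for a cluster member entry.
type member struct {
	clientURL string
	isLearner bool
}

// votingEndpoints returns the client URLs of all non-learner members,
// i.e. the endpoints that can serve the Range request used by the check.
func votingEndpoints(members []member) []string {
	var eps []string
	for _, m := range members {
		if m.isLearner {
			continue // learners reject KV Range, so skip them here
		}
		eps = append(eps, m.clientURL)
	}
	return eps
}

func main() {
	members := []member{
		{"https://11.11.11.11:2379", true}, // the learner from the log above
		{"https://22.22.22.22:2379", false},
		{"https://33.33.33.33:2379", false},
	}
	fmt.Println(votingEndpoints(members))
}
```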

Plan to release 0.2.0

The release will include the following fixes and enhancements:

5b6d922
#6
Updates to readme

cc @CaojiamingAlan @guettli
