Comments (10)
apiVersion: batch/v1
kind: Job
metadata:
  name: clickhouse-restore-specific-table
  namespace: ck
You don't need to create /etc/clickhouse-backup/config.yml
or run clickhouse-backup
inside the Job.
You need to execute clickhouse-client, which will call the REST API via SQL and execute clickhouse-backup inside the clickhouse-server pod where you are trying to restore data:
clickhouse-client --host $CLICKHOUSE_HOST --port $CLICKHOUSE_PORT --user $CLICKHOUSE_USER --password $CLICKHOUSE_PASSWORD --query="INSERT INTO system.backup_actions(command) VALUES('restore_remote --table=db.table_name $REMOTE_BACKUP_NAME')"
After that, poll the command status:
while [[ "in progress" == $(clickhouse-client -mn -q "SELECT status FROM system.backup_actions WHERE command='restore_remote --table=db.table_name $REMOTE_BACKUP_NAME' ORDER BY start DESC LIMIT 1 FORMAT TabSeparatedRaw" --host="$CLICKHOUSE_HOST" --port="$CLICKHOUSE_PORT" --user="$CLICKHOUSE_USER" --password="$CLICKHOUSE_PASSWORD") ]]; do
  echo "still in progress $REMOTE_BACKUP_NAME on $CLICKHOUSE_HOST";
  sleep 1;
done;
Then delete the local backup:
clickhouse-client --host $CLICKHOUSE_HOST --port $CLICKHOUSE_PORT --user $CLICKHOUSE_USER --password $CLICKHOUSE_PASSWORD --query="INSERT INTO system.backup_actions(command) VALUES('delete local $REMOTE_BACKUP_NAME')"
For details, see the restore Job in
https://gist.github.com/Slach/d933ecebf93edbbaed7ce0a2deeaabb7
in particular
https://gist.github.com/Slach/d933ecebf93edbbaed7ce0a2deeaabb7#file-restore-job-yaml
from clickhouse-backup.
Is the process the same for restoring ReplicatedAggregatingMergeTree?
I typically create tables like this:
CREATE TABLE db_1.table_1 ON CLUSTER replicated
...
ENGINE = ReplicatedAggregatingMergeTree('/clickhouse/tables/{shard}/db_1.table_1', '{replica}')
from clickhouse-backup.
The optimal command sequence is:
1. restore_remote --schema --table=db.table backup_name
   on every host
2. delete local backup_name
   on every host
3. restore_remote --data --table=db.table backup_name
   on the first replica of each shard
4. delete local backup_name
   on the first replica of each shard
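The sequence above can be sketched as INSERTs into system.backup_actions (one per step, polling the status between steps as shown earlier; `db.table` and `backup_name` are placeholders):

```sql
-- 1. on every host: restore the schema only
INSERT INTO system.backup_actions(command) VALUES('restore_remote --schema --table=db.table backup_name');
-- 2. on every host: drop the local copy of the backup
INSERT INTO system.backup_actions(command) VALUES('delete local backup_name');
-- 3. on the first replica of each shard: restore the data
INSERT INTO system.backup_actions(command) VALUES('restore_remote --data --table=db.table backup_name');
-- 4. on the first replica of each shard: drop the local copy again
INSERT INTO system.backup_actions(command) VALUES('delete local backup_name');
```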
from clickhouse-backup.
So with this Job:
apiVersion: batch/v1
kind: Job
metadata:
  name: clickhouse-restore-table
  namespace: ck
spec:
  backoffLimit: 0
  template:
    metadata:
      name: clickhouse-restore-table
    spec:
      restartPolicy: Never
      containers:
        - name: clickhouse-restore
          image: clickhouse/clickhouse-client:latest
          imagePullPolicy: IfNotPresent
          env:
            - name: CLICKHOUSE_HOST
              value: "chi-ck3-replicated-0-0.ck.svc.cluster.local"
            - name: CLICKHOUSE_PORT
              value: "9000"
            - name: CLICKHOUSE_USER
              value: admin
            - name: CLICKHOUSE_PASSWORD
              value: admin321
            - name: REMOTE_BACKUP_NAME
              value: "my_backup_name"
how do I even find out what my_backup_name is?
I tried to simply list the backups, but I failed even at this step:
apiVersion: batch/v1
kind: Job
metadata:
  name: list-backups
  namespace: ck
spec:
  backoffLimit: 1
  template:
    metadata:
      name: list-backups
    spec:
      restartPolicy: Never
      containers:
        - name: list-backups
          image: altinity/clickhouse-backup:latest
          command: ["/bin/bash"]
          args: ["-c", "clickhouse-backup -c /etc/clickhouse-backup/config.yml list"]
          volumeMounts:
            - name: config-volume
              mountPath: /etc/clickhouse-backup
          env:
            - name: S3_BUCKET
              valueFrom:
                secretKeyRef:
                  name: clickhouse-s3-secrets
                  key: S3_BUCKET
            - name: S3_ACCESS_KEY
              valueFrom:
                secretKeyRef:
                  name: clickhouse-s3-secrets
                  key: S3_ACCESS_KEY
            - name: S3_SECRET_KEY
              valueFrom:
                secretKeyRef:
                  name: clickhouse-s3-secrets
                  key: S3_SECRET_KEY
      volumes:
        - name: config-volume
          configMap:
            name: clickhouse-backup-config
k logs -n ck list-backups-k2tfl
2024/05/01 06:30:48.433569 info clickhouse connection prepared: tcp://localhost:9000 run ping logger=clickhouse
2024/05/01 06:30:48.434964 warn clickhouse connection ping: tcp://localhost:9000 return error: dial tcp [::1]:9000: connect: connection refused, will wait 5 second to reconnect logger=clickhouse
2024/05/01 06:30:53.439189 info clickhouse connection prepared: tcp://localhost:9000 run ping logger=clickhouse
2024/05/01 06:30:53.440219 warn clickhouse connection ping: tcp://localhost:9000 return error: dial tcp [::1]:9000: connect: connection refused, will wait 5 second to reconnect logger=clickhouse
2024/05/01 06:30:58.440609 info clickhouse connection prepared: tcp://localhost:9000 run ping logger=clickhouse
2024/05/01 06:30:58.441468 warn clickhouse connection ping: tcp://localhost:9000 return error: dial tcp [::1]:9000: connect: connection refused, will wait 5 second to reconnect logger=clickhouse
2024/05/01 06:31:03.444703 info clickhouse connection prepared: tcp://localhost:9000 run ping logger=clickhouse
2024/05/01 06:31:03.445457 warn clickhouse connection ping: tcp://localhost:9000 return error: dial tcp [::1]:9000: connect: connection refused, will wait 5 second to reconnect logger=clickhouse
2024/05/01 06:31:08.446086 info clickhouse connection prepared: tcp://localhost:9000 run ping logger=clickhouse
from clickhouse-backup.
Please read again and follow the links from #907 (comment).
You don't need to execute clickhouse-backup in a Job.
You need to run clickhouse-backup server
as a sidecar inside kind: ClickHouseInstallation,
and use clickhouse-client
in the Job
to execute commands like SELECT * FROM system.backup_list,
SELECT * FROM system.backup_actions,
and INSERT INTO system.backup_actions(command) VALUES(...)
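The advice above can be sketched as a Job that runs only clickhouse-client (the Job name is a made-up placeholder; host and credentials are the ones from this thread; system.backup_list and system.backup_actions are virtual tables exposed by the clickhouse-backup server sidecar, so this assumes that sidecar is already running):

```yaml
apiVersion: batch/v1
kind: Job
metadata:
  name: list-remote-backups
  namespace: ck
spec:
  backoffLimit: 0
  template:
    spec:
      restartPolicy: Never
      containers:
        - name: clickhouse-client
          image: clickhouse/clickhouse-client:latest
          command: ["/bin/bash", "-c"]
          args:
            - |
              # list the backups known to the clickhouse-backup sidecar
              clickhouse-client --host chi-ck3-replicated-0-0.ck.svc.cluster.local \
                --port 9000 --user admin --password admin321 \
                --query="SELECT * FROM system.backup_list"
```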
from clickhouse-backup.
Yeah, I've been trying to figure this out for many days now.
So with this:
apiVersion: batch/v1
kind: Job
metadata:
  name: query-backup-names
  namespace: ck
spec:
  backoffLimit: 1
  template:
    metadata:
      name: query-backup-names
    spec:
      restartPolicy: Never
      containers:
        - name: clickhouse-client
          image: clickhouse/clickhouse-client:latest
          env:
            - name: CLICKHOUSE_HOST
              value: "chi-ck3-replicated-0-0.ck.svc.cluster.local"
            - name: CLICKHOUSE_PORT
              value: "9000"
            - name: CLICKHOUSE_DB
              value: "default"
            - name: CLICKHOUSE_USER
              value: "admin"
            - name: CLICKHOUSE_PASSWORD
              value: "admin321"
          command: ["/bin/bash", "-c"]
          args:
            - |
              clickhouse-client --host $CLICKHOUSE_HOST --port $CLICKHOUSE_PORT --user $CLICKHOUSE_USER --password $CLICKHOUSE_PASSWORD --format TabSeparated --query="SELECT name FROM system.backups"
I see
k logs -n ck query-backup-names-pmstf
backup_name
Is this expected? Will the backup I run to S3 leave any traces in system.backups, or will they be gone after the data is moved?
from clickhouse-backup.
clickhouse-backup
doesn't handle system.backups.
I didn't mention system.backups
in #907 (comment)
from clickhouse-backup.
Read my comments again, twice:
#907 (comment)
and
#907 (comment)
from clickhouse-backup.
I have no idea why this is not working.
I'm starting over with a backup of a single table, to see if I can find where the bug is.
So when I run this manifest:
apiVersion: batch/v1
kind: Job
metadata:
  name: backup-discord-alert-table
  namespace: ck
spec:
  backoffLimit: 0
  template:
    metadata:
      name: backup-discord-alert-table
    spec:
      restartPolicy: Never
      containers:
        - name: clickhouse-backup
          image: altinity/clickhouse-backup:2.2.7
          imagePullPolicy: IfNotPresent
          env:
            - name: CLICKHOUSE_USERNAME
              value: "admin"
            - name: CLICKHOUSE_PASSWORD
              value: "admin321"
            - name: CLICKHOUSE_HOST
              value: "chi-ck3-replicated-0-0.ck.svc.cluster.local"
            - name: CLICKHOUSE_PORT
              value: "9000"
            - name: S3_BUCKET
              valueFrom:
                secretKeyRef:
                  name: clickhouse-s3-secrets
                  key: S3_BUCKET
            - name: S3_ENDPOINT
              value: "https://192.168.3.80:8010"
            - name: S3_ACCESS_KEY
              valueFrom:
                secretKeyRef:
                  name: clickhouse-s3-secrets
                  key: S3_ACCESS_KEY
            - name: S3_SECRET_KEY
              valueFrom:
                secretKeyRef:
                  name: clickhouse-s3-secrets
                  key: S3_SECRET_KEY
            - name: S3_REGION
              value: "us-east-1"
            - name: S3_ACL
              value: "private"
            - name: S3_PATH
              value: "clickhouse-backup"
            - name: S3_SSE
              value: "none"
            - name: S3_FORCE_PATH_STYLE
              value: "true"
            - name: S3_DISABLE_SSL
              value: "true"
          command:
            - /bin/sh
            - -c
            - |
              clickhouse-backup create --table discord_alert.btc_all_activity_10_c_sell
I see this
k logs -n ck backup-discord-alert-table-4vgx4
2024/05/01 09:47:25.461828 info clickhouse connection prepared: tcp://chi-ck3-replicated-0-0.ck.svc.cluster.local:9000 run ping logger=clickhouse
2024/05/01 09:47:25.470269 info clickhouse connection open: tcp://chi-ck3-replicated-0-0.ck.svc.cluster.local:9000 logger=clickhouse
2024/05/01 09:47:25.470326 info SELECT metadata_path FROM system.tables WHERE database = 'system' AND metadata_path!='' LIMIT 1; logger=clickhouse
2024/05/01 09:47:25.481700 info SELECT name, engine FROM system.databases WHERE name NOT IN (?) with args [[system INFORMATION_SCHEMA information_schema _temporary_and_external_tables system INFORMATION_SCHEMA information_schema _temporary_and_external_tables]] logger=clickhouse
2024/05/01 09:47:25.493636 info SHOW CREATE DATABASE airflow
logger=clickhouse
2024/05/01 09:47:25.503626 info SHOW CREATE DATABASE binance
logger=clickhouse
2024/05/01 09:47:25.561411 info SHOW CREATE DATABASE bitmex
logger=clickhouse
2024/05/01 09:47:25.580280 info SHOW CREATE DATABASE default
logger=clickhouse
2024/05/01 09:47:25.589218 info SHOW CREATE DATABASE deribit
logger=clickhouse
2024/05/01 09:47:25.597226 info SHOW CREATE DATABASE deribit_days_range
logger=clickhouse
2024/05/01 09:47:25.603402 info SHOW CREATE DATABASE discord_alert
logger=clickhouse
2024/05/01 09:47:25.611717 info SHOW CREATE DATABASE kafka
logger=clickhouse
2024/05/01 09:47:25.621722 info SHOW CREATE DATABASE test_db
logger=clickhouse
2024/05/01 09:47:25.677830 info SELECT name, count(*) as is_present FROM system.settings WHERE name IN (?, ?) GROUP BY name with args [show_table_uuid_in_table_create_query_if_not_nil display_secrets_in_show_and_select] logger=clickhouse
2024/05/01 09:47:25.691222 info SELECT name FROM system.databases WHERE engine IN ('MySQL','PostgreSQL','MaterializedPostgreSQL') logger=clickhouse
2024/05/01 09:47:25.701573 info SELECT countIf(name='data_path') is_data_path_present, countIf(name='data_paths') is_data_paths_present, countIf(name='uuid') is_uuid_present, countIf(name='create_table_query') is_create_table_query_present, countIf(name='total_bytes') is_total_bytes_present FROM system.columns WHERE database='system' AND table='tables' logger=clickhouse
2024/05/01 09:47:25.719833 info SELECT database, name, engine , data_paths , uuid , create_table_query , coalesce(total_bytes, 0) AS total_bytes FROM system.tables WHERE is_temporary = 0 AND match(concat(database,'.',name),'discord_alert.btc_all_activity_10_c_sell') ORDER BY total_bytes DESC SETTINGS show_table_uuid_in_table_create_query_if_not_nil=1 logger=clickhouse
2024/05/01 09:47:26.058808 info SELECT metadata_path FROM system.tables WHERE database = 'system' AND metadata_path!='' LIMIT 1; logger=clickhouse
2024/05/01 09:47:26.071073 info SELECT toUInt8(count()) udf_presents FROM system.columns WHERE database='system' AND table='functions' AND name='create_query' logger=clickhouse
2024/05/01 09:47:26.084102 info SELECT name, create_query FROM system.functions WHERE create_query!='' logger=clickhouse
2024/05/01 09:47:26.097101 info SELECT value FROM system
.build_options
where name='VERSION_INTEGER' logger=clickhouse
2024/05/01 09:47:26.105861 info SELECT count() FROM system.columns WHERE database='system' AND table='disks' AND name='type' logger=clickhouse
2024/05/01 09:47:26.165548 info SELECT path, any(name) AS name, any(type) AS type FROM system.disks GROUP BY path logger=clickhouse
2024/05/01 09:47:26.178409 info SELECT count() FROM system.tables WHERE database='system' AND name='parts_columns' logger=clickhouse
2024/05/01 09:47:26.193619 info SELECT column, groupUniqArray(type) AS uniq_types FROM system.parts_columns WHERE active AND database=? AND table=? GROUP BY column HAVING length(uniq_types) > 1 with args [discord_alert btc_all_activity_10_c_sell] logger=clickhouse
2024/05/01 09:47:26.225145 info ALTER TABLE discord_alert
.btc_all_activity_10_c_sell
FREEZE WITH NAME '29b39a4fd3794e5699acd7011c57b12a'; logger=clickhouse
2024/05/01 09:47:26.246140 info ALTER TABLE discord_alert
.btc_all_activity_10_c_sell
UNFREEZE WITH NAME '29b39a4fd3794e5699acd7011c57b12a' logger=clickhouse
2024/05/01 09:47:26.254073 info SELECT mutation_id, command FROM system.mutations WHERE is_done=0 AND database=? AND table=? with args [discord_alert btc_all_activity_10_c_sell] logger=clickhouse
2024/05/01 09:47:26.265360 info done backup=2024-05-01T09-47-25 logger=backuper operation=create table=discord_alert.btc_all_activity_10_c_sell
2024/05/01 09:47:26.265428 info SELECT value FROM system
.build_options
where name='VERSION_DESCRIBE' logger=clickhouse
2024/05/01 09:47:26.273843 info done backup=2024-05-01T09-47-25 duration=812ms logger=backuper operation=create
2024/05/01 09:47:26.273899 info clickhouse connection closed logger=clickhouse
It completed very fast, which is suspicious.
So to verify the effective result of this Job, I run the following.
At this point I'm unable to find where the backup went:
apiVersion: batch/v1
kind: Job
metadata:
  name: list-backups
  namespace: ck
spec:
  backoffLimit: 1
  template:
    metadata:
      name: list-backups
    spec:
      restartPolicy: Never
      containers:
        - name: list-backups
          image: altinity/clickhouse-backup:latest
          command: ["/bin/bash", "-c"]
          args:
            - |
              echo 'host: chi-ck3-replicated-0-0.ck.svc.cluster.local' > /etc/clickhouse-backup/config.yml
              echo 'port: 9000' >> /etc/clickhouse-backup/config.yml
              echo 'username: admin' >> /etc/clickhouse-backup/config.yml
              echo 'password: admin321' >> /etc/clickhouse-backup/config.yml
              echo 'remote_storage: s3' >> /etc/clickhouse-backup/config.yml
              echo 's3:' >> /etc/clickhouse-backup/config.yml
              echo '  endpoint: https://192.168.3.80:8010' >> /etc/clickhouse-backup/config.yml
              echo '  bucket: '${S3_BUCKET}'' >> /etc/clickhouse-backup/config.yml
              echo '  access_key: '${S3_ACCESS_KEY}'' >> /etc/clickhouse-backup/config.yml
              echo '  secret_key: '${S3_SECRET_KEY}'' >> /etc/clickhouse-backup/config.yml
              echo '  region: us-east-1' >> /etc/clickhouse-backup/config.yml
              echo '  acl: private' >> /etc/clickhouse-backup/config.yml
              echo '  path: clickhouse-backup' >> /etc/clickhouse-backup/config.yml
              echo '  force_path_style: true' >> /etc/clickhouse-backup/config.yml
              echo '  disable_cert_verification: true' >> /etc/clickhouse-backup/config.yml
              clickhouse-backup -c /etc/clickhouse-backup/config.yml list
          env:
            - name: S3_BUCKET
              valueFrom:
                secretKeyRef:
                  name: clickhouse-s3-secrets
                  key: S3_BUCKET
            - name: S3_ACCESS_KEY
              valueFrom:
                secretKeyRef:
                  name: clickhouse-s3-secrets
                  key: S3_ACCESS_KEY
            - name: S3_SECRET_KEY
              valueFrom:
                secretKeyRef:
                  name: clickhouse-s3-secrets
                  key: S3_SECRET_KEY
k logs -n ck list-backups-d4w25
/bin/bash: line 1: /etc/clickhouse-backup/config.yml: No such file or directory
/bin/bash: line 2: /etc/clickhouse-backup/config.yml: No such file or directory
/bin/bash: line 3: /etc/clickhouse-backup/config.yml: No such file or directory
/bin/bash: line 4: /etc/clickhouse-backup/config.yml: No such file or directory
/bin/bash: line 5: /etc/clickhouse-backup/config.yml: No such file or directory
/bin/bash: line 6: /etc/clickhouse-backup/config.yml: No such file or directory
/bin/bash: line 7: /etc/clickhouse-backup/config.yml: No such file or directory
/bin/bash: line 8: /etc/clickhouse-backup/config.yml: No such file or directory
/bin/bash: line 9: /etc/clickhouse-backup/config.yml: No such file or directory
/bin/bash: line 10: /etc/clickhouse-backup/config.yml: No such file or directory
/bin/bash: line 11: /etc/clickhouse-backup/config.yml: No such file or directory
/bin/bash: line 12: /etc/clickhouse-backup/config.yml: No such file or directory
/bin/bash: line 13: /etc/clickhouse-backup/config.yml: No such file or directory
/bin/bash: line 14: /etc/clickhouse-backup/config.yml: No such file or directory
/bin/bash: line 15: /etc/clickhouse-backup/config.yml: No such file or directory
2024/05/01 10:19:27.999433 info clickhouse connection prepared: tcp://localhost:9000 run ping logger=clickhouse
2024/05/01 10:19:28.000802 warn clickhouse connection ping: tcp://localhost:9000 return error: dial tcp [::1]:9000: connect: connection refused, will wait 5 second to reconnect logger=clickhouse
2024/05/01 10:19:33.001057 info clickhouse connection prepared: tcp://localhost:9000 run ping logger=clickhouse
2024/05/01 10:19:33.001911 warn clickhouse connection ping: tcp://localhost:9000 return error: dial tcp [::1]:9000: connect: connection refused, will wait 5 second to reconnect logger=clickhouse
I try to override the default behaviour, but it always points to localhost.
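Two things stand out in that output. The "No such file or directory" errors happen because /etc/clickhouse-backup does not exist in the container image, so none of the echo lines ever write the file; and clickhouse-backup reads its connection settings from the clickhouse: section of config.yml (with remote_storage under general:), so top-level host/port/username keys are ignored and it falls back to localhost. A sketch of corrected args (only the script changes; this still runs clickhouse-backup in a Job, which the replies above advise against):

```yaml
args:
  - |
    mkdir -p /etc/clickhouse-backup
    cat > /etc/clickhouse-backup/config.yml <<EOF
    general:
      remote_storage: s3
    clickhouse:
      host: chi-ck3-replicated-0-0.ck.svc.cluster.local
      port: 9000
      username: admin
      password: admin321
    s3:
      endpoint: https://192.168.3.80:8010
      bucket: ${S3_BUCKET}
      access_key: ${S3_ACCESS_KEY}
      secret_key: ${S3_SECRET_KEY}
      region: us-east-1
      acl: private
      path: clickhouse-backup
      force_path_style: true
      disable_cert_verification: true
    EOF
    clickhouse-backup -c /etc/clickhouse-backup/config.yml list
```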
from clickhouse-backup.
Please stop and read my comments above.
Your manifest is wrong;
I explained why, twice.
Don't try to execute clickhouse-backup
directly in a Job.
Use clickhouse-client
and the system.backup_list and system.backup_actions commands.
Execute clickhouse-backup server
inside kind: ClickHouseInstallation.
Read and figure out the examples from
https://gist.github.com/Slach/d933ecebf93edbbaed7ce0a2deeaabb7
Did you try to read the documentation?
The third line of the README explains why you can't run clickhouse-backup remotely.
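The sidecar layout from the gist looks roughly like this (abridged sketch; the installation name, pod template name, and images are placeholders; see the gist for the full working example):

```yaml
apiVersion: clickhouse.altinity.com/v1
kind: ClickHouseInstallation
metadata:
  name: ck3
spec:
  templates:
    podTemplates:
      - name: clickhouse-with-backup
        spec:
          containers:
            - name: clickhouse
              image: clickhouse/clickhouse-server:latest
            # clickhouse-backup runs as a sidecar in the same pod,
            # so it shares the server's /var/lib/clickhouse data volume
            - name: clickhouse-backup
              image: altinity/clickhouse-backup:latest
              command: ["clickhouse-backup", "server"]
```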
from clickhouse-backup.