
Comments (4)

Slach commented on August 16, 2024

Yes, clickhouse-backup executes server-side s3:CopyObject / s3:CreateMultipartUpload API calls during backup when an S3 disk is present.

Use clickhouse-backup 2.4.33
and add object_disk_path to the s3 section in /etc/clickhouse-backup/config.yml.

The clickhouse-backup credentials in the s3 section need read access to your original S3 bucket (the one holding the original data) during the create_remote operation,
and write access to the destination S3 bucket during restore_remote on the destination cluster,

and read+write access to the backup bucket you provide in the s3 section of /etc/clickhouse-backup/config.yml.

All three buckets can be the same bucket, as long as each uses a different path inside it.
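
A minimal sketch of such an s3 section (the bucket name, paths, and credentials below are hypothetical placeholders, not values from this thread):

```yaml
s3:
  access_key: "AKIA..."             # placeholder; these credentials need read on the source-data bucket,
  secret_key: "..."                 # write on the destination bucket, read+write on this backup bucket
  bucket: "my-backup-bucket"        # placeholder backup bucket
  path: "backup"                    # prefix for backup metadata inside the bucket
  object_disk_path: "object_disks"  # prefix for data copied from the S3 object disk
  region: "us-east-1"
```

With this in place, clickhouse-backup create_remote on the source cluster and clickhouse-backup restore_remote on the destination trigger the server-side copy calls described above.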


and1990 commented on August 16, 2024

Sorry, let me try to make this clearer.

I have set up two ClickHouse clusters on two separate EC2 instances, with identical configurations. I created a database called test on ClickHouse cluster1 and inserted some data into it. The original data is stored in S3. My goal is to back up the data on ClickHouse cluster1 and then restore the database test on ClickHouse cluster2.

Here's an overview of the steps I've followed:

  1. Run clickhouse-backup create on the first EC2 instance:

    clickhouse-backup create test-backup-0303 --tables=test.events

  2. Copy the backup files to the second EC2 instance:

    tar -czvf "test_meta.tgz" -C /var/lib/clickhouse/backup/ test-backup-0303/
    tar -czvf "test_data.tgz" -C /var/lib/clickhouse/disks/s3/backup/ test-backup-0303/

    scp test_meta.tgz username@ec2:/root
    scp test_data.tgz username@ec2:/root

  3. On the second EC2 instance, extract the files into the corresponding directories and run clickhouse-backup list; it will display the backup test-backup-0303:

    tar -xzvf /root/test_meta.tgz -C /var/lib/clickhouse/backup/
    tar -xzvf /root/test_data.tgz -C /var/lib/clickhouse/disks/s3/backup/

  4. Run clickhouse-backup restore test-backup-0303 on the second EC2 instance.

My question is: does the original data stored in S3 get copied to another S3 path when restoring the test database on the second EC2 instance? If so, how can I prevent copying the S3 data during the restore?
Because I want the two ClickHouse clusters to share the same ClickHouse data.

The clickhouse-backup config is:

```yaml
general:
    remote_storage: none
    max_file_size: 0
    disable_progress_bar: true
    backups_to_keep_local: 0
    backups_to_keep_remote: 0
    log_level: debug
    allow_empty_backups: false
    download_concurrency: 8
    upload_concurrency: 8
    use_resumable_state: true
    restore_schema_on_cluster: ""
    upload_by_part: true
    download_by_part: true
    restore_database_mapping: {}
    retries_on_failure: 3
    retries_pause: 30s
    watch_interval: 1h
    full_interval: 24h
    watch_backup_name_template: shard{shard}-{type}-{time:20060102150405}
    retriesduration: 30s
    watchduration: 1h0m0s
    fullduration: 24h0m0s
clickhouse:
    username: default
    password: "123456"
    host: 127.0.0.1
    port: 9000
    disk_mapping: {}
    skip_tables:
        - system.*
        - default.*
        - INFORMATION_SCHEMA.*
        - information_schema.*
        - _temporary_and_external_tables.*
    timeout: 5m
    freeze_by_part: false 
    freeze_by_part_where: ""
    use_embedded_backup_restore: false
    embedded_backup_disk: ""
    backup_mutations: true
    restore_as_attach: false
    check_parts_columns: true
    secure: false
    skip_verify: false
    sync_replicated_tables: false
    log_sql_queries: true
    config_dir: /etc/clickhouse-server/
    restart_command: systemctl restart clickhouse-server
    ignore_not_exists_error_during_freeze: true
    check_replicas_before_attach: true
    tls_key: ""
    tls_cert: ""
    tls_ca: ""
    debug: false
```
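
(Note: remote_storage is set to none here, which is why I copy the backup by hand with tar and scp instead of using the create_remote/restore_remote flow from the first comment.)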


Slach commented on August 16, 2024

Does your second EC2 instance have the same S3 disk credentials and path
in the <storage_configuration> section of its clickhouse-server config?

If I understand your setup properly, then
/var/lib/clickhouse/disks/s3/backup/test-backup-0303/
doesn't contain the DATA; it actually contains metadata files that reference keys in the S3 bucket the first EC2 instance uses for its s3 disk.
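
For illustration, such a metadata file is a small text file roughly like the following (the exact layout varies by ClickHouse version, and the object key here is a made-up placeholder):

```
3                     <- metadata format version
1 446                 <- number of S3 objects, total size in bytes
446 abc/defghijklmnop <- object size and its key in the first instance's S3 bucket
0                     <- reference count
0                     <- read-only flag
```

So restoring on the second instance only re-creates these references; the data itself stays at the same S3 keys.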

Could you share the output of
SELECT * FROM system.disks
and
SELECT * FROM system.storage_policies
from both the first and second EC2 instances?


and1990 commented on August 16, 2024

Yes, you are right. The second EC2 instance doesn't contain the data, only the metadata. Now it works well. Thanks.

