
influxdb-tools's Introduction

InfluxDB Tools

Scripts to migrate off InfluxDB to a better database :)

  • schema-influx-to-clickhouse.py - Generate ClickHouse table schemas based on InfluxDB measurements.

  • line-protocol-to-clickhouse.py - Load InfluxDB line-protocol backup data into ClickHouse.

influx-backup.py

InfluxDB backup/restore script using HTTP API and line-protocol format.

  • Use InfluxDB HTTP API
  • Backup raw data into text files in line-protocol format
  • Restore from a backup
  • Chunked read/write
  • Separate file for each measurement
  • Backup/restore individual measurements
  • Backup/restore specific retention
  • Incremental backups using "since", "until" date/time arguments
  • Delayed restore
  • Gzip support for backup/restore process

It is recommended to do a delayed restore using --restore-chunk-delay and --restore-measurement-delay so that your InfluxDB instance does not quickly run out of memory or I/O (see the delayed restore example below).

Usage

usage: influx-backup.py [-h] --url URL --user USER --dir DIR
                        [--measurements MEASUREMENTS]
                        [--from-measurement FROM_MEASUREMENT]
                        [--retention RETENTION] [--gzip] [--dump]
                        [--dump-db DUMP_DB] [--dump-since DUMP_SINCE]
                        [--dump-until DUMP_UNTIL] [--restore]
                        [--restore-db RESTORE_DB]
                        [--restore-chunk-delay RESTORE_CHUNK_DELAY]
                        [--restore-measurement-delay RESTORE_MEASUREMENT_DELAY]

InfluxDB backup script

optional arguments:
  -h, --help            show this help message and exit
  --url URL             InfluxDB URL including schema and port
  --user USER           InfluxDB username. Password must be set as env var
                        INFLUX_PW, otherwise will be asked.
  --dir DIR             directory name for backup or restore from
  --measurements MEASUREMENTS
                        comma-separated list of measurements to dump/restore
  --from-measurement FROM_MEASUREMENT
                        dump/restore from this measurement and on (ignored
                        when using --measurements)
  --retention RETENTION
                        retention to dump/restore
  --gzip                dump/restore into/from gzipped files automatically
  --dump                create a backup
  --dump-db DUMP_DB     database to dump
  --dump-since DUMP_SINCE
                        start date in the format YYYY-MM-DD (starting
                        00:00:00) or YYYY-MM-DDTHH:MM:SSZ
  --dump-until DUMP_UNTIL
                        end date in the format YYYY-MM-DD (exclusive)
                        or YYYY-MM-DDTHH:MM:SSZ
  --restore             restore from a backup
  --restore-db RESTORE_DB
                        database target of restore
  --restore-chunk-delay RESTORE_CHUNK_DELAY
                        restore delay in sec or subsec between chunks of 5000
                        points
  --restore-measurement-delay RESTORE_MEASUREMENT_DELAY
                        restore delay in sec or subsec between measurements

Examples

Dump stats db:

./influx-backup.py --url https://influxdb.localhost:8086 --user admin --dump --dump-db stats --dir stats

Dump heartbeat measurement from stats db with data until 2017-09-01:

./influx-backup.py --url https://influxdb.localhost:8086 --user admin --dump --dump-db stats --dir stats \
    --dump-until 2017-09-01 --measurements heartbeat

NOTE: If you get a ChunkedEncodingError on dump, try limiting the data set using the "since"/"until" arguments.
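
For example, to dump only August 2017 from the stats db (the date range is illustrative):

./influx-backup.py --url https://influxdb.localhost:8086 --user admin --dump --dump-db stats --dir stats \
    --dump-since 2017-08-01 --dump-until 2017-09-01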

Restore from stats dir into stats_new db:

./influx-backup.py --url https://influxdb.localhost:8086 --user admin --restore --restore-db stats_new \
    --dir stats

Restore only heartbeat measurement from stats dir into stats_new db:

./influx-backup.py --url https://influxdb.localhost:8086 --user admin --restore --restore-db stats_new \
    --dir stats --measurements heartbeat
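
Delayed restore from stats dir into stats_new db, pausing between write chunks and between measurements (the delay values are illustrative):

./influx-backup.py --url https://influxdb.localhost:8086 --user admin --restore --restore-db stats_new \
    --dir stats --restore-chunk-delay 0.5 --restore-measurement-delay 2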

influxdb-tools's People

Contributors

roman-vynar, timhawes

influxdb-tools's Issues

IndexError: list index out of range

I'm running the latest influx-backup.py on an up-to-date Debian 9.

The InfluxDB version is 0.12 on CentOS 6, and access is allowed without user/pass:

python3 influx-backup.py --url http://IP:8086 --user "" --dump --dump-db data --dir backup/

I get:

>> 2019-02-06 16:00:59 UTC
Starting backup of "data" db to "backup/" dir 

Measurements:
['TABLES]

Traceback (most recent call last):
  File "influx-backup.py", line 319, in <module>
    dump(args.dump_db, WHERE)
  File "influx-backup.py", line 128, in dump
    msfields[i['series'][0]['name']] = {x[0]: x[1] for x in i['series'][0]['values']}
  File "influx-backup.py", line 128, in <dictcomp>
    msfields[i['series'][0]['name']] = {x[0]: x[1] for x in i['series'][0]['values']}
IndexError: list index out of range
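
A defensive check along these lines (a sketch written against the code shape shown in the traceback, not the maintainer's fix; 'res' is a stand-in for whatever the query result list is called) would skip results that come back without a populated 'series' list instead of raising:

# Sketch: guard the dict comprehension from dump() so results without a
# populated 'series' list (e.g. measurements reported with no fields by
# this older InfluxDB version) are skipped instead of raising IndexError.
for i in res:
    series = i.get('series') or []
    if not series or not series[0].get('values'):
        continue
    msfields[series[0]['name']] = {x[0]: x[1] for x in series[0]['values']}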

Entity too large error while restoring the DB

While trying to restore a DB from a file of around 5 MB, the following error is produced:

Loading stats...  413 HTTP error, <html>
<head><title>413 Request Entity Too Large</title></head>
<body>
<center><h1>413 Request Entity Too Large</h1></center>
<hr><center>nginx/1.15.6</center>
</body>
</html>
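
The 413 response comes from the nginx proxy in front of InfluxDB rather than from InfluxDB itself, so raising the proxy's request body size limit is one option. A client-side workaround (a sketch only; the helper name and batch size are illustrative, not part of influx-backup.py) is to split the payload into smaller batches before POSTing:

import requests

# Sketch: POST the dump in smaller batches so each request stays under the
# proxy's body-size limit. url, auth, params and batch_size are illustrative.
def post_in_batches(url, auth, params, lines, batch_size=1000):
    for i in range(0, len(lines), batch_size):
        payload = "\n".join(lines[i:i + batch_size])
        r = requests.post(url + "/write", auth=auth, params=params,
                          data=payload.encode("UTF-8"))
        r.raise_for_status()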

Restore fails with UnicodeError

If the dump contains non-ASCII characters, the data restore fails. Encoding the payload to UTF-8 before passing it to requests solves the problem:

r = requests.post(URL+'/write', auth=AUTH, params=params, data=data.encode("UTF-8"))

Invalid line protocol when no tag in source measurements

I found that parsing failed while restoring data from a backup, and realized it was because the source measurements didn't have any tags.

Adding a tag length check before writing each backed-up data point could fix this issue:

if len(tags) == 0:
    rows.append(f"{m} {','.join(fields)} {timestamp}\n")
else:
    rows.append(f"{m},{','.join(tags)} {','.join(fields)} {timestamp}\n")

Failed to backup measurements with "/" in name

$ ./influx-backup.py --url http://localhost:8086 --dump-db k8s --dir backup --user admin --dump
Password:
>> 2018-04-18 20:07:48 UTC
Starting backup of "k8s" db to "backup" dir

Measurements:
['cpu/limit', 'cpu/node_allocatable', 'cpu/node_capacity', 'cpu/node_reservation', 'cpu/node_utilization', 'cpu/request', 'cpu/usage', 'cpu/usage_rate', 'cpu_usage_1d', 'filesystem/inodes', 'filesystem/inodes_free', 'filesystem/limit', 'filesystem/usage', 'ingress/access-log', 'memory/cache', 'memory/limit', 'memory/major_page_faults', 'memory/major_page_faults_rate', 'memory/node_allocatable', 'memory/node_capacity', 'memory/node_reservation', 'memory/node_utilization', 'memory/page_faults', 'memory/page_faults_rate', 'memory/request', 'memory/rss', 'memory/usage', 'memory/working_set', 'memory_usage_1d', 'network/rx', 'network/rx_errors', 'network/rx_errors_rate', 'network/rx_rate', 'network/tx', 'network/tx_errors', 'network/tx_errors_rate', 'network/tx_rate', 'uptime']

Measurement fields:
{'network/tx_errors_rate': {'value': 'float'}, 'memory/limit': {'value': 'integer'}, 'memory/node_utilization': {'value': 'float'}, 'memory/request': {'value': 'integer'}, 'memory/page_faults_rate': {'value': 'float'}, 'cpu/limit': {'value': 'integer'}, 'cpu/node_utilization': {'value': 'float'}, 'network/tx_rate': {'value': 'float'}, 'network/tx': {'value': 'integer'}, 'memory/cache': {'value': 'integer'}, 'memory/node_capacity': {'value': 'float'}, 'network/tx_errors': {'value': 'integer'}, 'uptime': {'value': 'integer'}, 'cpu/node_capacity': {'value': 'float'}, 'cpu/request': {'value': 'integer'}, 'filesystem/usage': {'value': 'integer'}, 'cpu/node_reservation': {'value': 'float'}, 'cpu/node_allocatable': {'value': 'float'}, 'memory/node_allocatable': {'value': 'float'}, 'ingress/access-log': {'request_size': 'float', 'browser_version': 'string', 'browser_name': 'string', 'client_addr': 'string', 'response_size': 'float', 'host': 'string', 'engine_name': 'string', 'container_name': 'string', 'client_os': 'string', 'duration': 'float', 'is_bot': 'boolean', 'client_platform': 'string', 'scheme': 'string', 'is_mobile': 'boolean', 'labels': 'string', 'client_port': 'float', 'engine_version': 'string', 'namespace_name': 'string'}, 'filesystem/inodes': {'value': 'integer'}, 'network/rx_errors_rate': {'value': 'float'}, 'memory/usage': {'value': 'integer'}, 'network/rx': {'value': 'integer'}, 'filesystem/limit': {'value': 'integer'}, 'memory/node_reservation': {'value': 'float'}, 'cpu/usage_rate': {'value': 'integer'}, 'cpu/usage': {'value': 'integer'}, 'network/rx_errors': {'value': 'integer'}, 'memory/major_page_faults': {'value': 'integer'}, 'network/rx_rate': {'value': 'float'}, 'memory/working_set': {'value': 'integer'}, 'filesystem/inodes_free': {'value': 'integer'}, 'memory/major_page_faults_rate': {'value': 'float'}, 'memory/rss': {'value': 'integer'}, 'memory/page_faults': {'value': 'integer'}}

Dumping cpu/limit...Traceback (most recent call last):
  File "./influx-backup.py", line 320, in <module>
    dump(args.dump_db, WHERE)
  File "./influx-backup.py", line 148, in dump
    f = open('%s/%s' % (DIR, m), 'w')
FileNotFoundError: [Errno 2] No such file or directory: 'backup/cpu/limit'
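
One way to avoid this (a sketch, not the project's actual fix) is to sanitize the measurement name before using it as a file name; DIR and m are the variables shown in the traceback, and the U+2571 replacement character follows a suggestion from a later report below:

# Sketch: replace "/" in measurement names before building the backup file
# path, so names like "cpu/limit" do not point into non-existent directories.
def safe_filename(measurement):
    return measurement.replace('/', '\u2571')

f = open('%s/%s' % (DIR, safe_filename(m)), 'w')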

Measurements with "°C" (celsius) don't get extracted

If the measurement name contains special character(s), it doesn't get extracted at all. The "Measurement fields" get filled incorrectly in the internal hash.

export INFLUX_PW=none;./influx-backup.py --url http://localhost:8086 --user none --dump --dump-db database --dir /var/tmp/database --measurements "°C"

2018-06-22 10:18:16 UTC
Starting backup of "database" db to "/var/tmp/database" dir

Measurements:
['°C']

Measurement fields:
{'°C': {'temperature': 'float', 'icon_str': 'string', 'current_temperature': 'float', 'operation_mode_str': 'string', 'supported_features': 'float', 'value': 'float', 'mode': 'float', 'min_temp': 'float', 'state': 'string', 'attribution_str': 'string', 'max_temp': 'float', 'friendly_name_str': 'string', 'operation_list_str': 'string'}}

Ignoring °C... 0
Done.

Restore of Dump

When I want to restore a dump from an InfluxDB, dumping works fine and succeeds.

Restore of the dump on the same machine:

root@raspberrypi:/home/pi/influxdb-tools# ./influx-backup.py --url http://localhost:8086 --user admin --restore --restore-db stockwage2 --dir stockwage
Password:

2018-03-21 09:23:22 UTC
Starting restore from "stockwage" dir to "stockwage2" db.

Files:
['sensorar2']

Confirm restore into "stockwage2" db? [yes/no] yes

Loading sensorar2... 404 HTTP error

Or on a remote machine:
root@raspberrypi:/home/pi/influxdb-tools# ./influx-backup.py --url http://192.168.178.35:8086 --user root --restore --restore-db stockwage --dir stockwage
Password:

2018-03-21 09:27:18 UTC
Starting restore from "stockwage" dir to "stockwage" db.

Files:
['sensorar2']

Confirm restore into "stockwage" db? [yes/no] yes

Loading sensorar2...<class 'requests.exceptions.ConnectionError'>

Fixing some issues with special characters in names and non-latin1 characters in names and values

When trying to save and restore the InfluxDB data for my HomeAssistant instance I ran into some errors:

  • It is impossible to save measurements containing a slash ('/') like 'm/s', 'l/h' or 'm³/d'.
    You can replace the slash in the filename with another character (U+2044 FRACTION SLASH,
    U+2215 DIVISION SLASH, U+2571 BOX DRAWINGS LIGHT DIAGONAL UPPER RIGHT TO LOWER LEFT, etc.)
    or with a sequence of characters ('<@>', '=>' or whatever you like).
    I preferred the U+2571 solution and fixed it that way.
  • Identifiers (measurements, tag keys and field keys) containing certain characters are not stored correctly.
    Since it is allowed to use a blank (' '), comma (',') and equal sign ('=') in tag keys and field keys,
    and blanks and commas in measurements (see the InfluxDB documentation),
    these characters must also be escaped. Until now (commit #166779d) only values are escaped.
    I fixed this as well (see the sketch after this list).
  • Measurements, tag keys, field keys or values containing non-latin1 characters cannot be restored.
    When restoring files containing characters which are not included in the latin1 character set,
    building the write request fails. When the data parameter to requests.post() is a string,
    it is converted to the standard HTTP character set latin1, which fails for non-latin1 characters.
    When this parameter is of type bytes, no conversion takes place and everything works fine.
    This issue is also fixed.
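
A minimal sketch of the escaping and encoding described above (the helper names are illustrative and not the ones used in influx-backup.py; the escaping rules follow the InfluxDB line-protocol documentation):

# Line protocol: commas and spaces must be escaped in measurement names;
# commas, equal signs and spaces must be escaped in tag keys, tag values
# and field keys. Helper names here are illustrative.
def escape_measurement(name):
    return name.replace(',', '\\,').replace(' ', '\\ ')

def escape_key(key):
    return key.replace(',', '\\,').replace('=', '\\=').replace(' ', '\\ ')

# For the non-latin1 problem, encode the payload to bytes before posting,
# as in the earlier report: requests.post(..., data=data.encode('UTF-8'))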

The 1st and 3rd flaws were already mentioned in issues #3 and #6.

I hope these changes make this nice program a little bit more usable :-)

Volker Böhm ([email protected])

P.S.: I have a working copy of the changes in my git repository but I don't know how to put it into a pull request on GitHub.

I attached the updated file influx-backup.py as influx-backup.txt since the bloody GitHub does not accept PY-files :-(
influx-backup .txt
