Giter Club home page Giter Club logo

tonsofdump's Introduction

tonsofdump

An ultra-fast, multi-threaded MySQL backup tool written in Go

Installation

You can download this like all other Go tools, if you don't know how to install or use Go programs, that's something you'll need to learn (it's easy though)

go get github.com/BrianLeishman/tonsofdump

Usage

This will only work on tables that have primary keys!! It's very important to how this tool operates, and it will not function otherwise!

  -P int
        your MySQL port (default 3306)
  -b int
        the chunk size in bytes (defaults to 8388608, or 8MB) (default 8388608)
  -d string
        your MySQL database
  -f string
        the storage root for companies of downloaded files
  -h string
        your MySQL host (default "localhost")
  -m int
        number of max threads/tables to download at once (default CPU count)
  -n    skips downloading data for tables
  -p string
        your MySQL password
  -t string
        comma separated list of tables to dump
  -u string
        your MySQL username (default "root")
  -z    set to compress to a *.tar.bz2 file (requires that tar and lbzip2 are installed)
tonsofdump -u root -p password -d myschema -h localhost -t 'mytable,mytable2,$procs'

This will create *.sql files for each of those two tables listed and all the stored procedures on myschema

The pseudo tables are

  • $all - Switches the dump tool to "all" mode, which downloads all tables (not views) except for the tables that are also listed. So -t '$all,log' will download every table on myschema except for log
  • $funcs - Downloads all functions on myschema and stores them as "$funcs.sql"
  • $procs - Downloads all stored procedures
  • $views - Downloads all views

All string type fields get stored as hex literals for fast importing and absolutely no chance of SQL injection-style errors.

String type fields with character sets get stored with their collations inline, except for BLOB and BINARY types. For example, the value 'yeet' will get stored in the dump file like this (depending on your column character set/collation): _utf8mb4 x'79656574'collate utf8mb4_unicode_ci

Numeric fields, null values, and date/time fields get stored literally since they don't have the chance of containing characters that would break the MySQL queries.


Why is this so fast? Well for starters, when the imported table is dropped and recreated, the recreation syntax doesn't include the indexes and foreign keys (except for the primary key and unique indexes). This way each insert isn't rebuilding each index at the time of insertion, and after every insert finishes the indexes are then added to the table.

Unique keys are included from the start in the event that somehow data exists that isn't unique (shouldn't be possible, but you never know), so that the unique keys aren't added at the end and can fail, which would be hard to debug.

Also the unique keys existing at the beginning allow us to run our insert statements as insert ignore into... and throw out the inconsistent duplicate values.

Insert queries (and the select statements used to download the data) try to hit an average command size of about 8MB, which is calculated by looking up the the average row size for the table in the information schema, and seeing how many rows would be needed to hit 8MB, which seems to be the sweet spot (at least in my testing) for getting the data downloaded as quickly as possible.

This also includes all triggers for a table when a table is downloaded, which the MySQL official dump tool seems to think are separate, but I think that's strange.

At the end of the dump tool running, there's a command outputted that can be used to import the saved *.sql files, including an extra "$constraints.sql" file, which is a file that contains all the foreign key statements for every table, which is intended to be ran at the end of the import to account for possible circular foreign key references between tables.

tonsofdump's People

Contributors

brianleishman avatar ianshearer avatar

Stargazers

Pieter Claerhout avatar

Watchers

James Cloos avatar  avatar

tonsofdump's Issues

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    ๐Ÿ–– Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. ๐Ÿ“Š๐Ÿ“ˆ๐ŸŽ‰

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google โค๏ธ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.