split-exec

split-exec: a sleek and efficient online DML change tool for databases, with zero online impact, perceptible progress, and controllable latency.

Current status: in development. The first usable release may be available on April 15, 2024.

Why?

When using a database (such as MySQL or PostgreSQL), we often need to perform DML operations that affect a large number of rows. This type of SQL is referred to as "Big-SQL" in the following text.

For example, when data expires, we may want to:

  • Mark old data as deleted. This can be done using a SQL update statement like:

    update online_task set status='deleted' where create_time<'2023-02-01'

  • Delete old data to reclaim space. This can be achieved with an SQL delete statement like:

    delete from online_task where create_time<'2023-02-01'

  • Migrate data from the live table to an archive table. This can be done using an SQL insert statement like:

    insert into archive_task select * from online_task where create_time<'2023-02-01'

However, executing Big-SQL directly on a table can cause many problems, such as:

  • Coarse-grained locks held by Big-SQL during execution affect live queries.
  • Big-SQL that touches large amounts of data causes heavy CPU and IO load.
  • Replaying Big-SQL on replicas also takes a long time, and the replicas lag behind while it is replayed.

Risk summary: exceeding the connection limit, denied connections, accumulation of slow queries, and data latency.

In addition, executing Big-SQL directly is like running a black box: we cannot estimate how much more time is required.

How?

To address this issue, we aim to split a large SQL statement into smaller sub-statements based on the primary key or a unique key. Additionally, while executing these sub-statements, we monitor the data latency between the master and the replicas.
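The idea can be illustrated with a minimal Python sketch. It is not split-exec's actual implementation, which is still under development. It assumes the online_task table from the examples above has an integer primary key named id (the column name is an assumption) and uses the standard-library sqlite3 module only so that the sketch is runnable. The split_delete function and the get_lag, sleep, and on_progress callbacks are hypothetical names; a real tool would measure replica lag in a database-specific way.

```python
import sqlite3
import time
from typing import Callable, Optional


def split_delete(
    conn: sqlite3.Connection,
    cutoff: str,
    batch_size: int = 1000,
    max_lag: float = 1.0,
    get_lag: Callable[[], float] = lambda: 0.0,   # hypothetical lag probe, in seconds
    sleep: Callable[[float], None] = time.sleep,
    on_progress: Optional[Callable[[int, int], None]] = None,
    poll_interval: float = 0.5,
) -> int:
    """Delete rows older than `cutoff` in primary-key batches; return rows deleted."""
    # Count the affected rows up front so progress can be reported.
    total = conn.execute(
        "SELECT COUNT(*) FROM online_task WHERE create_time < ?", (cutoff,)
    ).fetchone()[0]

    done = 0
    last_id = None  # highest primary key handled so far
    while True:
        # Find the next batch of matching primary keys after last_id.
        if last_id is None:
            ids = conn.execute(
                "SELECT id FROM online_task WHERE create_time < ? "
                "ORDER BY id LIMIT ?",
                (cutoff, batch_size),
            ).fetchall()
        else:
            ids = conn.execute(
                "SELECT id FROM online_task WHERE id > ? AND create_time < ? "
                "ORDER BY id LIMIT ?",
                (last_id, cutoff, batch_size),
            ).fetchall()
        if not ids:
            break
        low, high = ids[0][0], ids[-1][0]

        # Wait while the replicas are too far behind the master.
        while get_lag() > max_lag:
            sleep(poll_interval)

        # Run one small sub-statement and commit it, so locks are held briefly.
        cur = conn.execute(
            "DELETE FROM online_task "
            "WHERE id BETWEEN ? AND ? AND create_time < ?",
            (low, high, cutoff),
        )
        conn.commit()

        done += cur.rowcount
        last_id = high
        if on_progress is not None:
            on_progress(done, total)
    return done
```

Each sub-statement touches at most batch_size rows and is committed on its own, so locks are short-lived and each replicated change is small. Because the total is counted first, the on_progress callback can report how much work remains, which a single Big-SQL statement cannot do.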

The sample flow design looks like this: design-flow-map

TODO: add

Usage

TODO

Contributors

cassiaman7
