
Rusqlite Migration


Rusqlite Migration is a simple and performant schema migration library for rusqlite.

  • Performance:
    • Fast database opening: to keep track of the current migration state, most tools create one or more tables in the database. These tables require parsing by SQLite and are queried with SQL statements. This library uses the user_version value instead. It’s much lighter as it is just an integer at a fixed offset in the SQLite file.
    • Fast compilation: this crate is very small and does not use macros to define the migrations.
  • Simplicity: this crate strives for simplicity. Just define a set of SQL statements as strings in your Rust code. Add more SQL statements over time as needed. No external CLI required. Additionally, rusqlite_migration works especially well with other small libraries complementing rusqlite, like serde_rusqlite.
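To illustrate why the user_version approach is so cheap: in the SQLite file format, user_version is a 4-byte big-endian integer stored at byte offset 60 of the 100-byte database header, so reading it involves no SQL parsing at all. A dependency-free sketch of reading it from raw header bytes (the function name is made up for this example):

```rust
use std::convert::TryInto;

/// Read `user_version` from the first 100 bytes of a SQLite file.
/// Per the SQLite file format, it is a 4-byte big-endian integer
/// stored at byte offset 60 of the header.
fn user_version(header: &[u8]) -> Option<u32> {
    let bytes: [u8; 4] = header.get(60..64)?.try_into().ok()?;
    Some(u32::from_be_bytes(bytes))
}

fn main() {
    // Fabricated header bytes with user_version set to 3.
    let mut header = [0u8; 100];
    header[60..64].copy_from_slice(&3u32.to_be_bytes());
    println!("{:?}", user_version(&header)); // prints Some(3)
}
```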

Example

Here, we define SQL statements to run with Migrations::new() and run these (if necessary) with Migrations::to_latest().

use rusqlite::{params, Connection};
use rusqlite_migration::{Migrations, M};

// 1️⃣ Define migrations
let migrations = Migrations::new(vec![
    M::up("CREATE TABLE friend(name TEXT NOT NULL);"),
    // In the future, add more migrations here:
    //M::up("ALTER TABLE friend ADD COLUMN email TEXT;"),
]);

let mut conn = Connection::open_in_memory().unwrap();

// Apply some PRAGMA, often better to do it outside of migrations
conn.pragma_update_and_check(None, "journal_mode", &"WAL", |_| Ok(())).unwrap();

// 2️⃣ Update the database schema, atomically
migrations.to_latest(&mut conn).unwrap();

// 3️⃣ Use the database 🥳
conn.execute("INSERT INTO friend (name) VALUES (?1)", params!["John"])
    .unwrap();

Please see the examples folder for more, in particular:

  • async migrations in the quick_start_async.rs file
  • migrations with multiple SQL statements (using for instance r#"…"# or include_str!(…))
  • migrations defined from a directory with SQL files
  • use of lazy_static
  • migrations to previous versions (downward migrations)

I’ve also made a cheatsheet of SQLite pragmas for improved performance and consistency.

Built-in tests

To test that the migrations are working, you can add this in your test module:

#[test]
fn migrations_test() {
    assert!(MIGRATIONS.validate().is_ok());
}

The migrations object is also suitable for snapshot testing with insta, using its Debug representation. You can store a snapshot of your migrations like this:

#[test]
fn migrations_insta_snapshot() {
    let migrations = Migrations::new(vec![
        // ...
    ]);
    insta::assert_debug_snapshot!(migrations);
}

Optional Features

Rusqlite_migration provides several Cargo features. They are:

  • from-directory: enable loading migrations from *.sql files in a given directory
  • alpha-async-tokio-rusqlite: enable support for async migrations with tokio-rusqlite. As the name implies, there are no API stability guarantees on this feature.

Active Users


This crate is actively used in a number of projects; you can find an up-to-date list of those on crates.io.

A number of contributors are also reporting issues as they arise, another indicator of active use.

Contributing

Contributions (documentation or code improvements in particular) are welcome, see contributing!

We use various tools for testing that you may find helpful to install locally (for example, to fix failing CI checks).

Acknowledgments

I would like to thank all the contributors, as well as the authors of the dependencies this crate uses.

Thanks to Migadu for offering a discounted service to support this project. This is not an endorsement by Migadu, though.


rusqlite_migration's Issues

Breaking changes for v2.0

A wishlist of breaking changes for the next major version:

  • Change return types so that we can fix the cargo clippy pedantic warnings, like casting `i64` to `usize` may lose the sign of the value or casting `usize` to `u32` may truncate the value on targets with 64-bit wide pointers
  • Introduce intermediary types so that we could write M::up(…).hook(…).down(…).hook(…)
  • Remove deprecated items

Other things to decide on:

  • Remove Eq implementation on M #45
  • Remove PartialEq from our Error type?

I don’t know if a breaking version will ever be released though, there are costs to forcing users to migrate.

Deal with multi-line SQL

.up() currently fails silently with multi-line SQL statements (later statements are not executed). Perhaps we could split PRAGMA support into .up_pragma() and gain multi-line support in .up()?

Fix clippy pedantic warnings on the async module

Current issues:

warning: this method could have a `#[must_use]` attribute
  --> rusqlite_migration/src/asynch.rs:25:5
   |
25 |     pub fn new(ms: Vec<M<'static>>) -> Self {
   |     ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ help: add the attribute: `#[must_use] pub fn new(ms: Vec<M<'static>>) -> Self`
   |
   = help: for further information visit https://rust-lang.github.io/rust-clippy/master/index.html#must_use_candidate
   = note: `-W clippy::must-use-candidate` implied by `-W clippy::pedantic`

warning: docs for function returning `Result` missing `# Errors` section
  --> rusqlite_migration/src/asynch.rs:55:5
   |
55 |     pub async fn current_version(&self, async_conn: &AsyncConnection) -> Result<SchemaVersion> {
   |     ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
   |
   = help: for further information visit https://rust-lang.github.io/rust-clippy/master/index.html#missing_errors_doc
   = note: `-W clippy::missing-errors-doc` implied by `-W clippy::pedantic`

warning: docs for function returning `Result` missing `# Errors` section
  --> rusqlite_migration/src/asynch.rs:82:5
   |
82 |     pub async fn to_latest(&self, async_conn: &mut AsyncConnection) -> Result<()> {
   |     ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
   |
   = help: for further information visit https://rust-lang.github.io/rust-clippy/master/index.html#missing_errors_doc

warning: docs for function returning `Result` missing `# Errors` section
   --> rusqlite_migration/src/asynch.rs:116:5
    |
116 |     pub async fn to_version(&self, async_conn: &mut AsyncConnection, version: usize) -> Result<()> {
    |     ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
    |
    = help: for further information visit https://rust-lang.github.io/rust-clippy/master/index.html#missing_errors_doc

warning: docs for function returning `Result` missing `# Errors` section
   --> rusqlite_migration/src/asynch.rs:139:5
    |
139 |     pub async fn validate(&self) -> Result<()> {
    |     ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
    |
    = help: for further information visit https://rust-lang.github.io/rust-clippy/master/index.html#missing_errors_doc

warning: `rusqlite_migration` (lib) generated 5 warnings

Maybe some can be silenced? Like the doc ones, where we merely link to the sync method?

Handle errors when migrations return rows

When using rusqlite with the extra_check feature, if a migration returns rows, an
ExecuteReturnedResults error will be returned.
Users might expect some of their migrations to return rows, so they might want to ignore this error for some migrations only.
Perhaps we should expose a boolean field indicating whether this error should be ignored (like may_return_rows)?

match tx.execute_batch(down) {
    Ok(()) |
    // Ignore this error, migrations may return results
    Err(rusqlite::Error::ExecuteReturnedResults) => (),
    Err(e) => return Err(Error::with_sql(e, down)),
}

Also add a test to make sure that migrations returning rows actually return this error.

Support and require rusqlite 0.29 in the next stable release (1.1.0)

As we found out in #68, tokio-rusqlite will need to be updated for us to support rusqlite version 0.29 (we had no such requirement with version 1.0.0 of this crate, because tokio-rusqlite had not been introduced then).

Once programatik29/tokio-rusqlite#11 is merged, we should update tokio-rusqlite and we will be able to move to a hard requirement of rusqlite 0.29 (also in the examples and tests), instead of the open-ended ranges of versions we currently support (for the record, I’m the one who made this misguided promise).

This will also be our chance to remove some of the #[allow(deprecated)] currently spread in the codebase.

Implement migration to previous versions

Currently, only upward migrations are implemented.

Some way to go back to a previous version could be provided, for instance like this:

M::up("CREATE TABLE t(a, b);")
       .down("DROP TABLE t;");

This keeps downward migrations non-mandatory.

Motivation

  • It seems to be a fairly standard feature
  • Interest was expressed, for instance by @MightyPork in #2

Unit tests idea

  • Use the newly introduced SQLite drop-column feature. Have a set of migrations with validation at each level (migrate to version 1, then down to 0, then to version 2 and down to 1, then 0).
  • Check that missing migrations return an error

Integrate with Insta for testing

Insta is a popular snapshot testing tool for Rust. I’m not too familiar with it, but it seems that this library could be integrated with it in two ways (they are not exclusive):

  1. Version snapshots of the Migrations along with the code. We should already support this, thanks to the Debug serialisation implemented on this struct.
  2. Just like we have a validate method for testing convenience, we could also have a method that outputs the final state of the database (or of the internal tables holding the schema?) after running all the migrations. That would be what insta would snapshot.
  • The use of all those features should be documented in the Readme.
  • This may be a good chance to introduce insta on our own tests.

It would go some way towards achieving the goals of #101, but in a less complex and intrusive way:

  • it would be just a function, and
  • it would compose with the framework users already use for snapshot testing.

to_latest() panics when current_version > max_schema_version

When the user_version value in the database is higher than the number of migrations, Migrations::to_latest() attempts an out-of-bounds access and triggers a panic in lib.rs:331:

thread 'main' panicked at 'index out of bounds: the len is 1 but the index is 1',

Example program:

fn main() -> Result<(), Box<dyn std::error::Error>> {
    let mut conn = rusqlite::Connection::open_in_memory()?;

    // pretend that the database has been migrated to version 2...
    conn.pragma_update(None, "user_version", 2)?;

    // ...but our migrations are defined only up to version 1
    let migrations = rusqlite_migration::Migrations::new(vec![
        rusqlite_migration::M::up("CREATE TABLE users (name TEXT);"),
    ]);

    // this call will panic
    migrations.to_latest(&mut conn)?;

    Ok(())
}

The best behavior would probably be to return an error in this case.
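A minimal sketch of that fix, reduced to the slicing logic (all names here are hypothetical, not the crate's actual internals): replace the unchecked index with a bounds check that surfaces an error instead of panicking.

```rust
/// Return the migrations still to apply, or an error when the stored
/// version exceeds the number of defined migrations (instead of the
/// current out-of-bounds panic).
fn remaining_migrations<'a>(
    migrations: &'a [&'a str],
    current_version: usize,
) -> Result<&'a [&'a str], String> {
    migrations.get(current_version..).ok_or_else(|| {
        format!(
            "database is at version {} but only {} migrations are defined",
            current_version,
            migrations.len()
        )
    })
}

fn main() {
    let migrations = ["CREATE TABLE users (name TEXT);"];
    // Database claims version 2, but only 1 migration exists:
    // an Err is returned instead of a panic.
    assert!(remaining_migrations(&migrations, 2).is_err());
}
```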

Motivation: I use rusqlite_migration in an embedded system that uses A/B updates. Suppose that version A of the system uses database version 5. The system is later updated to version B, which migrates the database to version 6. An unrelated problem then makes version B crash, so the system reboots back to the old version A. But version A defines migrations only up to version 5, while the database is at version 6, so the migration panics, version A stops working, and the embedded system fails.

Hide off-by-one SchemaVersion::Inside

As pointed out in #6,

If SchemaVersion::Inside was not off-by-one, it would be more straightforward to use. .to_version() uses the natural version numbering: 0=nothing run yet, 1 = after first migration, etc.

Perhaps a solution would be to make to_version() take a usize that would be converted to SchemaVersion, hiding the off-by-one to the user.
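A hypothetical sketch of that conversion (the enum here is simplified for illustration; the crate's real SchemaVersion has more variants and different internals):

```rust
#[derive(Debug, PartialEq)]
enum SchemaVersion {
    /// No migration applied yet (user-facing version 0).
    NoneSet,
    /// After migration n + 1, stored with the internal off-by-one index.
    Inside(usize),
}

/// Convert the natural, user-facing numbering (0 = nothing run yet,
/// 1 = after the first migration, ...) into the internal
/// representation, hiding the off-by-one from the caller.
fn to_schema_version(user_facing: usize) -> SchemaVersion {
    match user_facing {
        0 => SchemaVersion::NoneSet,
        n => SchemaVersion::Inside(n - 1),
    }
}

fn main() {
    assert_eq!(to_schema_version(0), SchemaVersion::NoneSet);
    assert_eq!(to_schema_version(1), SchemaVersion::Inside(0));
}
```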

Debug-format on MigrationHook causes a stack overflow

A similar situation to #45. The debug formatter calls itself without a stop condition. Can be verified with:

#[test]
fn test_migration_hook_debug() {
    let m = M::up_with_hook("", |_| HookResult::Ok(()));
    assert_eq!(String::new(), format!("{:?}", m));
}

Solution: Replace the implementation with something that doesn't call itself, such as:

impl Debug for Box<dyn MigrationHook> {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        write!(f, "MigrationHook")
    }
}

FYI: Found thanks to mutation testing.

Remove deprecated symbols

Find things marked as deprecated and remove them after sufficient notice has been given (for now, at least one release published more than a month earlier).

For instance, #6 removed an error.

Consider passing owned values to the edit API

Basically this (started in #78)

diff --git a/rusqlite_migration/src/builder.rs b/rusqlite_migration/src/builder.rs
index b74d260..558de82 100644
--- a/rusqlite_migration/src/builder.rs
+++ b/rusqlite_migration/src/builder.rs
@@ -41,7 +41,7 @@ impl<'u> MigrationsBuilder<'u> {
     /// Errors:
     ///
     /// Panics in case a migration with a given `id` does not exist.
-    pub fn edit(mut self, id: usize, f: impl Fn(&mut M)) -> Self {
+    pub fn edit(mut self, id: usize, f: impl Fn(M) -> M) -> Self {
         if id < 1 {
             panic!("id cannot be equal to 0");
         }

The advantage is that we could use all methods of M.

The drawback is that passing the owned value and inserting it back could be costly.

Too strict constraint on `rusqlite` version

af4da52 introduced a strict constraint pinning rusqlite to 0.29.0. Now that 0.30.0 has been released, rusqlite_migration is no longer compatible with it, forcing users to stay on 0.29.0.

Is there a reason for this, or would you consider relaxing this constraint again?

Support tokio-rusqlite?

It might be valuable to support tokio-rusqlite. We could put that behind a feature to prevent this crate from becoming heavier in the general case, and expose methods that take an “async” Connection.

Feel free to share your opinion in comments or to 👍🏼 / 👎🏼 .

Suggestion: make `fn Migrations::max_schema_version` public

fn Migrations::max_schema_version(&self) is a currently private method that returns the number of available migrations wrapped in SchemaVersion.

I would like access to this value in my own code. I'm using Migrations::from_directory() and include_dir! to construct my Migrations, so getting the count myself is non-trivial AFAICT.

My use case is to work out whether any migrations will be run by fn Migrations::to_latest() before running it, and if so to take a backup first.

I created a simple PR for this: #171.

Release fix for panic when current_version > max_schema_version

I just went to file an issue for (and fix) #17, when I saw that it had already been done! 😅 I would love for this to be released, as we almost shipped the broken version.

Also, if there's a feature you're looking to implement before release (e.g. #4 ), I'd love to contribute.

Support PRAGMA statement in migrations

This was first discussed in #2 (comment) by @MightyPork

If I read the code right, you now run all the migrations in one big transaction, but that means that you can't have pragmas with side effects in a migration (looking at PRAGMA journal_mode = WAL). That could be useful...

What if it used one transaction per migration (including an update of user_version)? Then you could have a no-transaction version of M::up().

This could only break if the program was killed between running the migration (setting the pragma) and updating user_version, then the version number is not updated. ...but that's no problem if the migration was really just setting a pragma, it can be safely re-run.

I'm thinking of syntax like M::up(include_str!("file.sql")).transaction(false) or .no_txn(). This is how you could chain .down() as well, when it is implemented. 🤔

Run auxiliary Rust code during migration

I would like to run some code during a migration (e.g. converting columns), but currently it has to be done either after the fact or during subsequent queries, both of which are awkward. Is functionality like this of any interest?

Store migration hashes to detect accidental modifications

Hello @cljoly,
Most database migration libraries store a hash sum of a given migration in the database to be able to detect migration modifications. From my experience this feature often comes in very useful due to accidental changes performed by distracted programmers, which tend to slip through review.

In our implementation, since we only store the 32-bit migration id in the user_version PRAGMA field, we won't be able to duplicate the same mechanism using this storage technique.

My idea for a backwards-compatible potential solution is to create our own migrations table, containing just two columns:

  • id - the migration id,
  • hashsum - a sha256 hash of the migration content (up + down)

This feature can be additive and feature-flagged, since not every user may require it (although it would be strongly advised for the extra consistency benefits).

This table would be used only to compare the hashes, not to verify the current migration - this would still be done with the user_version field. The hash comparison could be either implicit (done when a migration is requested) or explicit (performed with a new method).
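As a dependency-free illustration of that comparison (std's DefaultHasher stands in for SHA-256 here, and all names are hypothetical):

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

/// Hash the content of a migration (up + optional down).
/// A real implementation would use a cryptographic hash like SHA-256.
fn migration_hash(up: &str, down: Option<&str>) -> u64 {
    let mut hasher = DefaultHasher::new();
    up.hash(&mut hasher);
    down.hash(&mut hasher);
    hasher.finish()
}

/// Compare a stored hash against the migration as currently defined.
fn was_modified(stored: u64, up: &str, down: Option<&str>) -> bool {
    stored != migration_hash(up, down)
}

fn main() {
    let stored = migration_hash("CREATE TABLE t(a);", None);
    // An accidental edit to the migration text is detected...
    assert!(was_modified(stored, "CREATE TABLE t(a, b);", None));
    // ...while an unchanged migration passes the check.
    assert!(!was_modified(stored, "CREATE TABLE t(a);", None));
}
```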

Other solutions I thought about were:

  • Storing the hashes not in the database, but in a separate file instead. If someone removes that file, we just regenerate it and assume the migrations weren't modified.
  • Storing only the hash of hashes to verify if any migration changed without being able to tell which one.
  • Using 16 bits from the user_version for the migration id and the other 16 bits for a combined hash of hashes of all the migrations (using e.g. Fletcher-16). This would allow us to find accidental modifications, but not point to the migration that was modified. It would still allow us to count 2^16 − 2 migrations. I rejected this idea due to the risk of collisions on 16 bits, which would limit the usefulness of this feature.
  • Using bloom filters instead of a migration counter. This of course would be a breaking change. I rejected this idea because the risk of collision on a 32 bit bloom filter grows very quickly. After just around 10 migrations there would be a 1 in 5 chance of a false positive.
  • Using some kind of 16 bit CRC which would find 1 error, but not on the bit level, but on the hash level?
  • Using the hashes as parameters of a polynomial and finding a zero point? - May not work at all, but it's something that would be interesting conceptually.

Comparison on migrations with hooks causes a stack overflow

The implementation of PartialEq for hooks is recursive without a stop condition. See lib.rs:133 for details - eq calls == which calls eq.

A test such as:

#[test]
fn test_migration_hook_partial_eq() {
    let m = M::up_with_hook("", move |_: &Transaction| Ok(()));
    assert_eq!(m.up_hook, m.up_hook);
}

causes a stack overflow. Any comparison between any two M objects with hooks will cause a stack overflow.

Solutions:

  1. Implement PartialEq and Eq for M manually, ignoring the up_hook and down_hook fields.
  2. Compare closure memory addresses for equality.
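The first option could be sketched as follows (the struct is reduced to std-only types for illustration; field names are assumed from the issue, not the crate's exact definitions):

```rust
struct M<'a> {
    up: &'a str,
    down: Option<&'a str>,
    up_hook: Option<Box<dyn Fn()>>,
}

impl PartialEq for M<'_> {
    fn eq(&self, other: &Self) -> bool {
        // Deliberately ignore `up_hook` (and any other hook fields):
        // comparing the boxed closures is what recursed without end.
        self.up == other.up && self.down == other.down
    }
}

impl Eq for M<'_> {}

fn main() {
    let a = M { up: "CREATE TABLE t(a);", down: None, up_hook: Some(Box::new(|| {})) };
    let b = M { up: "CREATE TABLE t(a);", down: None, up_hook: None };
    // Equal despite differing hooks; no stack overflow.
    assert!(a == b);
}
```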

Which one do you prefer @cljoly?

FYI: This issue has been noticed thanks to cargo mutants and mutation testing.

Test downward migrations

Similar to the validate method, but for downward migrations. We can't just add a parameter to validate because that would be a breaking API change. But maybe a common builder could be introduced and validate would just be a special case of that.

  • Update the documentation in the Readme

Contribute to the documentation

As someone who isn’t the author of this, you have super powers! You can see what is unclear in the documentation.

So please, when you try to use this library, if you notice anything unclear in the README or the docs, comment on this issue or open a PR, so that other people would find their way more easily. Thanks a lot!

Further improve performance of `from_directory`

In #77, we started limiting intermediary structures to improve the performance of from_directory. We could go further by dropping the BTreeMap altogether and using only a Vec<Option<Migration>>, initially filled with Nones. A Vec is generally a more cache friendly structure than BTreeMap and we could even preallocate it.

Another possible improvement would be to keep this Vec<Option<Migration>> in the constructor.

Feature Request: support migration of duckdb-rs

Given that duckdb-rs provides a very similar interface to rusqlite, it might be good to provide an intermediate trait to support schema migration of duckdb-rs. DuckDB is heavily inspired by SQLite and also supports user_version.

Happy to contribute the PR if it's something you're interested in merging.

Load migrations from a directory and store them in the binary

I believe adding functionality to load migration files from a specified directory, parse them into M::up/M::down instances, and return a Migrations<'static> instance would be beneficial for simple use cases that do not require Rust-based hooks.
Supporting hooks would also be a possibility, but a bit more complex (not sure how to do it at the moment).

How I see it in terms of code:

lazy_static! {
    static ref MIGRATIONS: Migrations<'static> = Migrations::from_directory("migrations");
}

This code would scan a migrations directory for SQL files, sort them by name (assuming there is a {name}-up.sql and possibly a {name}-down.sql file), and create M::up (and M::down) instances for them.

Some considerations:

  • Names should possibly start with a long number, e.g. 000001; conflicting numbers should cause a panic or a compile-time error. We can require that, so there are no surprises. The filenames would then have three separate parts: {number}-{name}-[up,down].sql
  • Adding hooks can be then performed by modifying the migrations directly with additional named/numbered hooks, so they can be sorted into the Migrations in a deterministic way.
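The filename convention from the considerations above could be parsed like this (a sketch; the function name and return shape are assumptions for illustration):

```rust
/// Parse a `{number}-{name}-up.sql` or `{number}-{name}-down.sql`
/// filename into (number, name, is_up). Returns None for files that
/// do not match the convention.
fn parse_migration_filename(file: &str) -> Option<(u64, &str, bool)> {
    let stem = file.strip_suffix(".sql")?;
    let (rest, is_up) = if let Some(r) = stem.strip_suffix("-up") {
        (r, true)
    } else if let Some(r) = stem.strip_suffix("-down") {
        (r, false)
    } else {
        return None;
    };
    // Split `{number}-{name}` at the first dash.
    let (number, name) = rest.split_once('-')?;
    Some((number.parse().ok()?, name, is_up))
}

fn main() {
    assert_eq!(
        parse_migration_filename("000001-create-friend-up.sql"),
        Some((1, "create-friend", true))
    );
    // Files outside the convention are skipped.
    assert_eq!(parse_migration_filename("README.md"), None);
}
```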
