db_migrator's People

Contributors

fljdin, laurenz


db_migrator's Issues

Sequence nextval computation

Hello,

As you will see in the examples below, the migrated sequences can end up with various unwanted nextval values:

  • each sequence is restarted as if its increment were 1
  • MS SQL stores the current sequence value differently

Oracle

SQL> CREATE USER test IDENTIFIED BY "test";
SQL> CREATE SEQUENCE test.seq1 START WITH 10 INCREMENT BY 10 NOCACHE;
SQL> SELECT test.seq1.nextval FROM DUAL;

   NEXTVAL
----------
        10

SQL> SELECT test.seq1.nextval FROM DUAL;

   NEXTVAL
----------
        20
SELECT db_migrate(
   plugin => 'ora_migrator',
   server => 'oracle',
   only_schemas => ARRAY['TEST']
);

SELECT nextval('test.seq1');
 nextval 
---------
     31
-- should be 30

MySQL

mysql> CREATE DATABASE test;
mysql> CREATE TABLE test.table1(id INT AUTO_INCREMENT, PRIMARY KEY(id));
mysql> INSERT INTO test.table1 (id) VALUES (null), (null);
mysql> SELECT * FROM test.table1;
+----+
| id |
+----+
|  1 |
|  2 |
+----+
SELECT db_migrate(
   plugin => 'mysql_migrator',
   server => 'mysql',
   only_schemas => ARRAY['test']
);

SELECT nextval('test.table1_seq');
 nextval
---------
       4
-- should be 3

MS SQL

1> CREATE SCHEMA test;
2> GO
1> CREATE SEQUENCE test.seq1 START WITH 10 INCREMENT BY 10 NO CACHE;
2> GO
1> SELECT NEXT VALUE FOR test.seq1;
2> GO

--------------------
                  10
1> SELECT NEXT VALUE FOR test.seq1;
2> GO

--------------------
                  20
SELECT db_migrate(
   plugin => 'mssql_migrator',
   server => 'mssql',
   only_schemas => ARRAY['test']
);

SELECT nextval('test.seq1');
 nextval
---------
      21
-- should be 30
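
For the Oracle example above, a fix could compute the restart value from both the last value and the increment read from the source. A minimal sketch (not the current db_migrator behavior):

CREATE SEQUENCE test.seq1 INCREMENT 10 START WITH 30;
-- last_value = 20 and increment = 10 come from the source sequence;
-- the first nextval() must return last_value + increment = 30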

Refresh staging schema from PostgreSQL database

Hi

Currently, the db_migrate_prepare method relies on plugin callbacks to populate the staging schema tables. All adjustments therefore have to be made by hand afterwards, by modifying the contents of those tables.

In some migrations, the target data model is already supplied by the development teams. It is guaranteed to match the application version of the remote model and already contains all the necessary adjustments (column types, constraints and indexes, stored procedures).

I can imagine two solutions to meet this need:

  • write a new plugin to connect to a PostgreSQL database (a plugin called pg_migrator) and enable the staging schema to be refreshed from a desired data model, and then carry out the migration with another plugin and the db_migrate_mkforeign method.
  • create a new method similar to db_migrate_prepare, to be called "db_migrate_prepare_local", where the tables in the staging schema would be simple views compiling information from the local catalog.

The first solution would leave db_migrator untouched. This plugin could be provided by another project, maintained by anyone. It could also be proposed as an example from the db_migrator project, replacing the noop_migrator plugin, which is provided for testing purposes. This is elegant, but also ridiculous, since it's not db_migrator's purpose to migrate data from PostgreSQL to PostgreSQL.

The second solution might be simpler to implement, but the extension would not be able to consult the catalog of another database in the instance. The schemas would have to be made available locally for the refresh, then dropped before starting the migration of relations and data.
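
A minimal sketch of the second solution, assuming the staging tables table has the columns (schema, table_name, migrate); every other staging relation would need a similar view:

CREATE VIEW pgsql_stage.tables AS
SELECT n.nspname AS schema,
       c.relname AS table_name,
       true      AS migrate
FROM pg_catalog.pg_class c
   JOIN pg_catalog.pg_namespace n ON n.oid = c.relnamespace
WHERE c.relkind = 'r'
  AND n.nspname NOT IN ('pg_catalog', 'information_schema');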

What do you think about it?
Regards, Florent

Implement BULK loading

During the materialize_foreign_table step, data are loaded from the foreign table with an INSERT ... SELECT *.

Your recent post about COPY performance could inspire db_migrator (Ora2Pg already uses COPY).
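
For comparison, a sketch of both approaches (the table names are illustrative):

-- current approach: one INSERT ... SELECT per table
INSERT INTO public.tab1 SELECT * FROM fdw_stage.tab1;

-- hypothetical COPY-based alternative, driven by an external client that
-- streams the rows in text format between two connections:
--   source connection: COPY (SELECT * FROM fdw_stage.tab1) TO STDOUT
--   target connection: COPY public.tab1 FROM STDIN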

Statement table functions

A refactoring could introduce table functions that return the statements for each relation; that way, it would be easier to separate looping over the staging tables from statement execution, as I suggested in #24.

Example with db_indexes_statements table function:

CREATE FUNCTION db_indexes_statements(
  plugin         name,
  pgstage_schema name    DEFAULT NAME 'pgsql_stage'
) RETURNS TABLE (statement text)
  LANGUAGE plpgsql VOLATILE CALLED ON NULL INPUT SET search_path = pg_catalog AS
$$BEGIN
  /* loop through all index columns */
  FOR <needed variables> IN <index columns query>
  LOOP
    /* put the variables into this dynamic statement */
    statement := format('CREATE INDEX ...');
    -- do more logic...

    RETURN NEXT;
  END LOOP;
END;$$;
CREATE FUNCTION db_migrate_indexes(
  plugin         name,
  pgstage_schema name    DEFAULT NAME 'pgsql_stage'
) RETURNS integer
  LANGUAGE plpgsql VOLATILE CALLED ON NULL INPUT SET search_path = pg_catalog AS
$$DECLARE
  stmt text;
  rc   integer := 0;
BEGIN
  FOR stmt IN SELECT statement
                FROM db_indexes_statements(plugin => plugin, pgstage_schema => pgstage_schema)
  LOOP
    BEGIN
      EXECUTE stmt;
    EXCEPTION
      WHEN others THEN
        /* print the exception detail and
           insert the exception into the migrate_log table */
        rc := rc + 1;
    END;
  END LOOP;

  RETURN rc;
END;$$;

With this decoupled system, it becomes possible to perform the migration with external tools without calling db_migrate or any of the db_migrate_* methods.

External tools in any language could consume the db_migrator API (the staging tables and the proposed statement table functions) to execute the statements and catch exceptions themselves. From my point of view, this would provide a better way to speed up the migration with multiple processes.
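
For instance, a hypothetical external worker could fetch the statement list and execute its share of it:

SELECT statement
  FROM db_indexes_statements(plugin => 'mssql_migrator', pgstage_schema => 'pgsql_stage');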

mssql_migrator new plugin

Hi,

I released another plugin for SQL Server (https://github.com/fljdin/mssql_migrator) under PostgreSQL license.
Without planning it, it comes out the same week as the Ora2Pg v24 release!

The current version 0.0.1 performs a minimal migration of the AdventureWorks database.
It does not yet handle partition schemes, and it leaves identifiers in capital letters.

Best regards
Florent

Separate indexes from constraints migration

The db_migrate_constraints method performs much more than its name says, i.e. creating constraints:

  • Creating primary key and unique constraints
  • Creating foreign key constraints
  • Creating CHECK constraints
  • Creating indexes
  • Setting defaults on table columns

Therefore, it may be worthwhile to provide dedicated methods for indexes and for each kind of constraint, as is done with materialize_foreign_table, which db_migrate_tables calls so that the operation can be sped up in parallel.
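
For illustration, the split could look like this (db_migrate_indexes is a hypothetical name following the existing naming scheme):

SELECT db_migrate_indexes(plugin => 'ora_migrator');      -- indexes only
SELECT db_migrate_constraints(plugin => 'ora_migrator');  -- constraints and defaults only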

Define lower and upper boundaries for partitioning

I thought some more about partition boundaries.
Perhaps they can be passed as text[]: one value for HASH, two for RANGE and arbitrarily many for LIST.
Then there is the problem of numbers vs. other data types. Perhaps we can require that the values in the text[] are already properly quoted, so that we can simply use them as they are.

Originally posted by @laurenz in #11 (comment)
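
For illustration, pre-quoted boundary arrays could look like this (a sketch of the suggested convention, not an existing API):

SELECT ARRAY['''2020-01-01''', '''2021-01-01'''];  -- RANGE: lower and upper bound
SELECT ARRAY['''FR''', '''DE''', '''IT'''];        -- LIST: arbitrarily many values
SELECT ARRAY['4'];                                 -- HASH: one value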

Triggers may depend on views

The db_migrate() method builds triggers before views, but triggers may depend on views.

If there is no doubt about this, we should reorder the calls in db_migrate() and update the function comments and the documentation in the README.

Schema name in capital letters

Hi laurenz,

I suggest you state explicitly that the schema name must be written in capital letters in the command SELECT oracle_migrate(server => 'myserver', only_schemas => '{MYSCHEMA}');
or in CREATE FOREIGN TABLE ... SERVER myserver OPTIONS (schema 'MYSCHEMA', table 'ACCOUNT');
because it took me and other developers a while to figure out why it didn't work with lowercase names (Oracle folds unquoted identifiers to upper case, so the names in its catalog are capitalized).

Thanks again for your work on this tool.

Foreign table removal management

Hi,

Currently, foreign table removal is performed at the same time as table creation and data loading. The materialize_foreign_table method consists of the following steps:

  • Renaming the foreign table (ALTER FOREIGN TABLE %I.%I RENAME TO %I)
  • Creating an identical table, partitioned or not (CREATE TABLE %I.%I (LIKE %I.%I))
  • Creating table partitions, if required (CREATE TABLE %1$I.%3$I PARTITION OF %1$I.%2$I %4$s)
  • Loading the data, if requested (INSERT INTO %I.%I SELECT * FROM %I.%I)
  • Dropping the foreign table (DROP FOREIGN TABLE %I.%I)

If any one of these steps fails, the foreign table remains in place.

However, with the db_migrate_tables method, the foreign table is removed if any error occurs.
If you wish to retry the migration of a failed table, you have to rebuild the foreign table, which can be tedious, although it is now possible with the low-level function construct_foreign_tables_statements.

How about postponing the removal of foreign tables until the end of the migration, with a call to the db_migrate_finish method?
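
The migration flow could then look like this (a sketch of the proposed behavior; the function names are the existing ones):

SELECT db_migrate_prepare(plugin => 'ora_migrator', server => 'oracle');
SELECT db_migrate_mkforeign(plugin => 'ora_migrator', server => 'oracle');
SELECT db_migrate_tables(plugin => 'ora_migrator');
-- failed tables keep their foreign tables, so they can simply be retried
SELECT db_migrate_finish();  -- would also drop the remaining foreign tables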

Installation process

Hello everyone!
Can someone please help me with the installation process? I am unable to follow the steps given on the set-up page.
I want to install the db_migrator extension for PostgreSQL; my environment is Linux.
Thanks in advance.
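
For reference, a typical installation on Linux is a PGXS build. A minimal sketch, assuming the GitHub sources and pg_config in the PATH:

-- in a shell:
--   git clone https://github.com/cybertec-postgresql/db_migrator.git
--   cd db_migrator
--   sudo make install
-- then in the target database (a plugin such as ora_migrator is needed too):
CREATE EXTENSION db_migrator;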

mysql_migrator development

Hi Laurenz,

I released a new plugin for MySQL (fljdin/mysql_migrator) with a minimal implementation of the staging views required by the db_migrator.db_migrate() methods.

At this time, the migration reaches the final stages, but with minor constraint errors in my development (issue #1).

Best regards,
Florent

exec method db_migrate_prepare() [ERROR: schema ""public"" does not exist]

Hello Laurenz,

we have a problem with your new db_migrator extension.
See the steps below:

CREATE EXTENSION oracle_fdw;
CREATE EXTENSION ora_migrator with version '0.9.2';
CREATE EXTENSION db_migrator;

CREATE SERVER oradb FOREIGN DATA WRAPPER oracle_fdw OPTIONS (dbserver '//<xxx>');
GRANT USAGE ON FOREIGN SERVER oradb TO test;
CREATE USER MAPPING FOR test SERVER oradb OPTIONS (user '<user>', password '<passw>');
SELECT oracle_diag('oradb');
SET search_path = public;
SET lc_messages TO 'en_US.UTF-8';
SELECT db_migrate_prepare(plugin =>'ora_migrator', server=>'oradb', only_schemas=>'{"<schema>"}');
 
SQL Error [3F000]: ERROR: schema ""public"" does not exist
  Where: PL/pgSQL function "public"."db_migrate_prepare"("name","name","name","name","name"[],"jsonb") line 19 at EXECUTE
 
test=# \dx
                             List of installed extensions
     Name     | Version |   Schema   |                   Description
--------------+---------+------------+-------------------------------------------------
 db_migrator  | 0.9.0   | public     | Tools to migrate other databases to PostgreSQL
 ora_migrator | 0.9.2   | public     | Tools to migrate Oracle databases to PostgreSQL
 oracle_fdw   | 1.2     | public     | foreign data wrapper for Oracle access
 plpgsql      | 1.0     | pg_catalog | PL/pgSQL procedural language
(4 rows)

We tried it before with the older version, where everything went well with the oracle_migrate_prepare method, but our problem is that we need the delta synchronization via the replica.

Thanks for help.

Case-insensitive operator !~*

Hi Laurenz,

My new Docker environment threw collation errors with mysql_fdw, like this:

SELECT condition FROM fdw_stage.checks WHERE condition !~ '^"[^"]*" IS NOT NULL' ;
ERROR:  failed to prepare the MySQL query: 
Character set 'utf8mb3_bin' cannot be used in conjunction with 'binary' in call to regexp_like.

This piece of SQL comes from db_migrator, in the step that refreshes the "checks" staging table. After some research, it appears that the !~ operator is translated by mysql_fdw into a NOT REGEXP BINARY statement [1].
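
For reference, the same query with the case-insensitive operator:

SELECT condition FROM fdw_stage.checks WHERE condition !~* '^"[^"]*" IS NOT NULL';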

What do you think about using the case-insensitive operator !~* instead? It fixed my problem.
Regards, Florent

Extend API for partial (filtered) indexes

Hi Laurenz,

My current research on porting MS SQL to PostgreSQL with a new db_migrator plugin brings up a new API need.
As Oracle does not provide partial indexes, the staging tables indexes and index_columns lack filter information.

Given the SQL Server documentation, partial (filtered) indexes must be handled by db_migrator:

CREATE NONCLUSTERED INDEX FIBillOfMaterialsWithEndDate
    ON Production.BillOfMaterials (ComponentID, StartDate)
    WHERE EndDate IS NOT NULL ;

It seems that only PostgreSQL and SQL Server support this feature, as described by Markus in this post.
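
A sketch of the API extension I have in mind (the column name where_clause is an assumption):

ALTER TABLE pgsql_stage.indexes ADD COLUMN where_clause text;
-- db_migrator would append the predicate when the column is set:
-- CREATE INDEX ... ON ... (...) WHERE EndDate IS NOT NULL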

I will send you a patch proposal soon.
Regards

Why is db_migrator relocatable?

I was wondering how to refactor those lines, found in many places in db_migrator, where extschema is determined with a pg_extension lookup:

EXECUTE format('SET LOCAL search_path = %I, %s', pgstage_schema, extschema);

After reading the documentation about extension relocatability, is there any reason to allow relocating db_migrator within a database? Since the functions call each other, it might be acceptable to forbid ALTER EXTENSION ... SET SCHEMA and use the @extschema@ placeholder to refer to the extension schema, which would be set at installation time only.
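
A minimal sketch of the non-relocatable variant (assuming relocatable = false in the control file, so that @extschema@ is substituted at installation time and the pg_extension lookup becomes unnecessary):

EXECUTE format('SET LOCAL search_path = %I, @extschema@', pgstage_schema);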

Regards, Florent

Index column expressions may be capitalized

Hi,

While experimenting with mssql_migrator development, I noticed that capitalized column names are not double-quoted:

WARNING:  Error executing "CREATE UNIQUE INDEX "AK_Department_Name" 
                           ON "HumanResources"."Department" (Name ASC)"
DETAIL:  column "name" does not exist: 
WARNING:  Error executing "CREATE UNIQUE INDEX "AK_Employee_LoginID" 
                           ON "HumanResources"."Employee" (LoginID ASC)"
DETAIL:  column "loginid" does not exist: 
WARNING:  Error executing "CREATE UNIQUE INDEX "AK_Employee_NationalIDNumber" 
                           ON "HumanResources"."Employee" (NationalIDNumber ASC)"
DETAIL:  column "nationalidnumber" does not exist:
...

I made a local patch to db_migrator, but a comment in the code suggested that the problem is much harder than this.

diff --git a/db_migrator--1.0.0.sql b/db_migrator--1.0.0.sql
index cec64ac..852280a 100644
--- a/db_migrator--1.0.0.sql
+++ b/db_migrator--1.0.0.sql
@@ -2113,7 +2113,7 @@ BEGIN
       END IF;
 
       /* fold expressions to lower case */
-      stmt := stmt || separator || expr
+      stmt := stmt || separator || format('%I', expr)
                    || CASE WHEN des THEN ' DESC' ELSE ' ASC' END;
       separator := ', ';
    END LOOP;
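
The caution seems justified: %I quotes its whole argument as a single identifier, which is correct for plain column names but breaks genuine index expressions:

SELECT format('%I', 'LoginID');      -- "LoginID"      (the desired quoting)
SELECT format('%I', 'lower(Name)');  -- "lower(Name)"  (a broken index expression)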

What do you think?

Florent

db_migrate is not migrating database into PostgreSQL

db_migrate works fine for some of the schemas: it creates the schema and loads the data. But some schemas are not migrated into PostgreSQL by db_migrate, and it does not show any error details explaining why. It says that the migration completed successfully, but the schema itself is not created in PostgreSQL. Can you please help me with the log?

Add an option to not partition a table

I just realized that there is another problem.

When the data are migrated, you just select from the partitioned table, that is, you get all data from all partitions, right?
That would fail to load on PostgreSQL if you omitted one partition.

I am not sure what to do about that. I have two ideas:

  1. Change the way data are migrated so that it is partition by partition (that is the hardest solution).
  2. Only provide the option to migrate partitioning with all partitions as they are or not migrate partitioning at all. That would be good enough from my point of view.
    Then we should do away with the migrate column for partitions. We can either introduce a migrate_partitioning flag on the table or expect that the user deletes the partitions from the migrator table.

Originally posted by @laurenz in #11 (comment)
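
A sketch of the second idea, assuming the migrate_partitioning flag lives on the staging tables table:

ALTER TABLE pgsql_stage.tables ADD COLUMN migrate_partitioning boolean DEFAULT true;

-- migrate the table as an ordinary, unpartitioned table:
UPDATE pgsql_stage.tables
   SET migrate_partitioning = false
 WHERE schema = 'test' AND table_name = 'orders';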
