Comments (27)
Reading the code (I made a few comments), I think this adds the flexibility needed to add what we need for Cassandra.
I will try to merge this in my current work (or rebase on top of it) to see if I can already use it.
from evolve.
Good news: after implementing a CqlStatementBuilder
and hacking the missing logic into the Evolve
class, the migration test now works for Cassandra. It has the previously mentioned limitations, but I think these limitation are fairly reasonable.
Now, I noticed that using this strategy, we end up having multiple entries in the changelog table, manly because each migration script is divided into statements, and each statement is then considered as a migration. This does not look like a big issue to me since they will have the same checksum, name and description, but I'm not sure what are the ramifications of this (maybe that can be solved by changing the logic in Evolve and save migrations based on files and not statements).
Sources: https://github.com/Pvlerick/Evolve/tree/issue-39-add-support-for-cassandra
from evolve.
I started working on a prototype: https://github.com/Pvlerick/Evolve/tree/issue-39-add-support-for-cassandra
It is a bit clunky because Cassandra's nomenclature doesn't map so well with SQL concepts, that I think that in the end it will hold.
from evolve.
Hi @Pvlerick
First of all thank you for the interest and the work you put in Evole. π
I'm not familiiar with Cassandra. Is it a database which needs a schema migration tool ?
I'm not convinced by the interest of Evolve for that kind of nosql database.
Can you enlighten me please ?
from evolve.
Cassandra is a distributed, high availability and scalable database. It is considered NoSQL but it behaves in a similar way in many regards as a classical RDBMS, although it has quite a lot of limitations when it comes to features compared to classical RDBMS (that's the trade-off for being distributed and such).
It has its own query language: CQL (Cassandra Query Language) which is quite similar to SQL. Keyspaces (databases/schema) and tables must be created using it, altered, dropped, etc... using that language, so the need for a migration tool is there.
We currently need such a tool where I work at and Evolve is the best candidate so far, as you wrote all the mechanics and it's just a matter of wiring it all together. As I said, some parts might be a bit clunky because of some of the concepts' names and because some of the abstractions are not going to fit 100%, but from what I have seen so far it could work.
If you don't want to include it in the project, we could try to make this an extension in a different package?
from evolve.
Thank you for the feedback. Appreciate it.
I thought the database schema was more or less managed in the app code with Cassandra.
So, I will be pleased to help you when I can in your effort to develop a Cassandra driver for Evolve.
And let's see where it leads us :)
from evolve.
@Pvlerick For your information, I'm working on the CassandraCSharpDriver integration part in MsBuild / dotnet build. It's a tricky part in Evolve I'm starting to get used to.
You can concentrate on the "Cassandra database stuff".
from evolve.
@lecaillon Ok, cool! I'm progressing well, the dialect part is almost complete and I didn't encounter anything blocking yet.
from evolve.
@lecaillon I have the dialect working but I started to run into some limitations due to the Cassandra drivers, mainly:
- A script can only contain one statement, I didn't find a way to execute multiple statements in one file - obvious workaround is to have a file for each statement, not ideal to say the least;
- The drivers return a
CqlBatchTransaction
whenBeginTransaction
is called onCqlConnection
(see here), which means that any statement that is not idempotent (such as analter table
) will fail because it is executed twice (once whenExecuteNonQuery
is called, once whenCommit
is called - at least that's my current assessment.
Sources updated: https://github.com/Pvlerick/Evolve/tree/issue-39-add-support-for-cassandra
cc @jorgebay, just in case he has something to chip in ;-)
from evolve.
@Pvlerick The Oracle driver acts exacty like this. It supports only one statement at a time (that's why I did not implement it until now). That is not the best case scenario for us :)
I see (at least) 2 options:
- Use the DatabaseHelper.BatchDelimiter property in the cql scripts. Evolve use this string to split one script into several statements. You can give it a try, but it's more a hack than a clean solution IMO.
- The proper way to do it is for each script, to parse the content of the file to extract queries one by one. I didn't do it earlier because no one seems to be interested in Oracle, and honestly it is quite tricky to do. I see a lot of edge cases. More over at the time I did not have a lot of unit tests, so I prefered to be careful on this point.
If you want to work on this point with me I would be really happy. It would be very interresting. Tell me what you want to do.
I did not understand your second remark. :)
from evolve.
If I can give you some advice, I would stay away of the ADO.NET component of the driver and use the core API, that supports parallel execution, request pipelining, ...
from evolve.
@lecaillon I tried using the BatchDelimiter
property, but it didn't work.
Extracting queries from the script is a possible solution, but I'm not sure implementing a parser on our own is a good idea: that's a lot of work and as you say there are tons of edge cases.
Maybe we could implement something simple and document limitations. From the top of my head, I would suggest something that splits the file in lines and look for the end-of-statement character (in CQL, ;
) at the end of each line and regroup lines based on this.
The obvious limitation is that scripts should not have literals spanning on multiple lines that contain ;
at the end of one of the lines, and not have a comment ending with ;
.
To me, this sounds like a reasonable limitation for the tool and is easy to implement. Nothing prevents us from having a better implementation in the future that would remove this limitation, of course.
My second point was that the Evolve
class wraps each statement in a BeginTransaction
/Commit
. It seems that the Cassandra's drivers do not handle this well when it comes to statements such as alter table
.
Basically, when I run my tests with an alter table
, the table gets the new column added when the ExecuteNonQuery
is executed, but it seems that it is executed again when the Commit
is called, which results in an error since the table already has the column. I will have to check if the behavior is the same for other statements (something that @jorgebay can probably confirm).
@jorgebay: we are using the ADO drivers because they allow us to reuse most of the code already written for SQL engines. Since this is for database/keyspace/whatever evolution and versioning, it does not need the extra features from the core API (which indeed should absolutely be used in any application). Should we take from your warning that you plan an decommissioning this, or can we use it safely with the warnings/limitations that you documented?
from evolve.
@lecaillon for the second point, one way to solve this would be to be able to optionnaly remove the BeginTransaction
/Commit
using a property on the Evolve
class. I'll give it a try in another branch and see if this is sound.
from evolve.
@Pvlerick Knowing that some drivers do not support transaction for DDL instructions, it can be an interresting option. Disable it, even for SQL statements seems awkward.
from evolve.
@Pvlerick By the way, integration of the CassandraCSharpDriver is almost done. It should be easy to use in Evolve. That's a good news :)
from evolve.
@lecaillon ok, good news!
Here is the branch for optional transactions : Pvlerick@a657006
Tests are all green, but since this is your code you can probably assess if that's sound, as I might have missed something. The diff is a bit messy, but I did a reformat in VS and it removed some trailing spaces all around :-)
from evolve.
@Pvlerick Concerning the Query parser we can start with an naive implementation only activated for Cassandra migrations.
from evolve.
@lecaillon where would you put that implementation? In a subtype of MigrationScript?
from evolve.
@Pvlerick I'm going to create a specific branch on this one, for us to work on this.
I think the design (even naΓ―ve) must meet those 2 goals:
- Be able to go on with the old fashioned
MigrationScript
when the database driver does not need this feature. ie for all existing drivers excepting Cassandra and Oracle (one day maybe). - Be able to extract all the queries of a script and execute the non DDL contiguous ones in the same transaction.
from evolve.
The proposal
SqlStatementBuilder
Introduce a new abstract class : SqlStatementBuilder
, in charge of parsing the content of a MigrationScript
. A builder is set on each specific DatabaseHelper
. We use the SimpleSqlStatementBuilder
on all existing databases, and the CassandraSqlStatementBuilder
for the CassandraDatabase
.
Each SqlStatementBuilder
has to implement this method:
protected abstract IEnumerable<SqlStatement> Parse(string sqlScript, string delimiter);
SqlStatement
The new SqlStatement
class holds the sql statement and the execution context (some props like CanExecuteInTransaction).
Sources updated : https://github.com/lecaillon/Evolve/tree/feature-query-parser
What do you think ?
from evolve.
Ok, in the meantime, I fixed the migration count issue on the Evolve
class.
I think we now have all the pieces that we need to make this work :-)
I can probably add a bit more tests for Cassandra, but all in all it looks like we have a good starting point that I can start using at work.
from evolve.
@Pvlerick
How do you see the next steps ?
from evolve.
@lecaillon if you implement your proposal in master, I should be able to rebase on it and cleanup the history in my branch then open PR.
from evolve.
@Pvlerick
Ok, I will make my best to implement my part as soon as possible. I keep you in touch when I'm ready.
On your side have you got a deadline for this feature ?
from evolve.
@lecaillon no deadline, but the sooner the better, off course ;-)
I will already start the preparatory work on adding this to the build/deployment process.
from evolve.
@Pvlerick Merged the new SQL query parser in master. Implementation done. Lack few tests.
Feel free to give me your feedback concerning the design, the naming, comments ...
I'm waiting for your PR.
from evolve.
I guess this can be considered done for now, I'll open separate issues for the tweaks.
from evolve.
Related Issues (20)
- Issue while using Evolve with MySQL HOT 1
- Exception on long running migration when using TransactionKind.CommitAll HOT 6
- [Logging] Documentation missing
- Can command line interface accept password from environment HOT 1
- Bug: Error when trying to migrate large files
- Error: `syntax error at or near "-" at character 30` when running Evolve.Erase()
- [question] how to capture all SQL so that it can be applied by some other tool? HOT 1
- Feature Request: Code-based Migrations
- Allow versions longer than long.MaxValue
- Unable to load shared library 'SQLite.Interop.dll' HOT 5
- Unable to apply migration from two different sql files HOT 1
- Inconsistent behavior in treating paths in Evolve.Evolve() on Windows and Linux HOT 1
- Tool does not work the .NET 7 HOT 2
- Use Microsoft.Data.SqlClient HOT 1
- evolve and Postgres get error "SHOW VARIABLES LIKE '" HOT 4
- [Evolve Exception] CREATE INDEX CONCURRENTLY cannot be executed within a pipeline Sql query: -- evolve-tx-off
- Cannot proceed, because event scheduler is disabled. HOT 1
- Version sorting HOT 1
- Is the timeout capped at 60 seconds somewhere? HOT 5
- Migrate from System.Data.SqlClient to Microsoft.Data.SqlClient
Recommend Projects
-
React
A declarative, efficient, and flexible JavaScript library for building user interfaces.
-
Vue.js
π Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.
-
Typescript
TypeScript is a superset of JavaScript that compiles to clean JavaScript output.
-
TensorFlow
An Open Source Machine Learning Framework for Everyone
-
Django
The Web framework for perfectionists with deadlines.
-
Laravel
A PHP framework for web artisans
-
D3
Bring data to life with SVG, Canvas and HTML. πππ
-
Recommend Topics
-
javascript
JavaScript (JS) is a lightweight interpreted programming language with first-class functions.
-
web
Some thing interesting about web. New door for the world.
-
server
A server is a program made to process requests and deliver data to clients.
-
Machine learning
Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.
-
Visualization
Some thing interesting about visualization, use data art
-
Game
Some thing interesting about game, make everyone happy.
Recommend Org
-
Facebook
We are working to build community through open source technology. NB: members must have two-factor auth.
-
Microsoft
Open source projects and samples from Microsoft.
-
Google
Google β€οΈ Open Source for everyone.
-
Alibaba
Alibaba Open Source for everyone
-
D3
Data-Driven Documents codes.
-
Tencent
China tencent open source team.
from evolve.