
isomorphicdb's Introduction

IsomorphicDB


TODO List

  • PostgreSQL compatibility
    • PostgreSQL wire protocol
    • Data types
    • Data definition language
      • Create/Drop/Alter table
      • Create/Drop index
      • Primary/Foreign keys
      • Check constraints
    • Data manipulation language
  • Transactions
  • Reactive Dataflow query execution
    • With RSocket for inter node communication
  • Query compilation
  • HTAP (Hybrid transactional/analytical processing)
  • Raft replication
  • Operability
    • Smooth version upgrade
    • Ease of adding/removing node from the cluster (including quick replication)
    • Self-driving

Play around with the project

See docs

Project structure

  • docs/ - project documentation
  • etc/ - scripts for conveniently running compatibility tests and other tools
    • ci/ - script helpers to run commands on GitHub Actions
    • local/ - scripts for local usage
  • gradle/ - gradle wrapper to run tests/compatibility tests
  • node_engine/ - module that glues all the others together to handle incoming network requests and execute them across the other modules
  • postgres/ - crate to consolidate PostgreSQL functionality
    • query_ast/ - abstract syntax tree of parsed SQL query
    • query_parser/ - parser that produces an AST from a SQL string
    • query_response/ - module representing a successful or an error response sent to a client after query execution
  • sql_engine/ - crate to consolidate SQL query engine functionality
    • catalog/ - API for accessing data and its definition
    • data_definition/ - group of modules responsible for representing SQL DDL queries
      • execution_plan/ - data structures representing operations of the Data Definition Language part of SQL
    • data_manipulation/ - group of modules responsible for representing SQL DML queries
      • operators/ - SQL operators like +, -, LIKE, AND, OR and others
      • query_plan/ - query plan that is executed over database data
      • query_result/ - internal representation of query execution result
      • typed_queries/ - represents a query structure that is ready for type coercion and type checking
      • typed_tree/ - typed binary tree of SQL operators
      • untyped_queries/ - represents a query structure that is ready for type resolution
      • untyped_tree/ - untyped binary tree of SQL operators (see the sketch after this listing)
    • definition_planner/ - API to create an execution plan for a DDL query
    • entities/ - database entities
      • definition/ - database object names and their definitions
      • types/ - SQL types
    • query_analyzer/ - API to analyse a parsed SQL query
    • query_planner/ - API to create an execution plan for a SQL query
    • query_processing/ - API to process a parsed/analyzed SQL query
      • type_check/
      • type_coercion/
      • type_inference/
    • scalar/ - represents primitive types as scalar values that can be used as intermediate computational results
  • storage/ - database transactional storage
    • api/ - type aliases and traits that define the API for in_memory and persistent storage
    • binary/ - represents primitive types as raw binary vectors
    • in_memory/ - in memory only storage
    • persistent/ - persistent storage
  • tests/
    • compatibility/ - Groovy-based tests to check compatibility with PostgreSQL
    • erlang_client/ - Erlang-based tests
    • fixtures/ - files needed to set up non-default local testing
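
The typed_tree/untyped_tree split mentioned above has roughly the following shape. This is only an illustrative sketch; the names BinaryOp, UntypedTree, TypedValue and TypedTree are hypothetical and do not match the actual types in the crates:

// Illustrative sketch of the untyped/typed operator tree idea; all names are hypothetical.

// Binary SQL operators such as +, -, AND, OR, LIKE.
enum BinaryOp {
    Plus,
    Minus,
    And,
    Or,
    Like,
}

// Untyped tree: produced right after parsing/analysis, leaves are still raw strings.
enum UntypedTree {
    Item(String),
    Op {
        op: BinaryOp,
        left: Box<UntypedTree>,
        right: Box<UntypedTree>,
    },
}

// Typed tree: the same shape, but leaves carry resolved SQL values,
// ready for type coercion, type checking and execution.
enum TypedValue {
    SmallInt(i16),
    Integer(i32),
    Bool(bool),
}

enum TypedTree {
    Item(TypedValue),
    Op {
        op: BinaryOp,
        left: Box<TypedTree>,
        right: Box<TypedTree>,
    },
}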

Development

Build time dependencies

isomorphicdb uses postgres-parser to parse PostgreSQL 13 SQL syntax, which requires LLVM. Thus, to build the project you need to install LLVM and add it to your $PATH.

On Ubuntu the following command should be sufficient:

sudo apt install llvm

On macOS with zsh:

brew install llvm
echo 'export PATH="/usr/local/opt/llvm/bin:$PATH"' >> ~/.zshrc
source ~/.zshrc

Setting up integration suite for local testing

For now, testing is local and manual: some software has to be installed on the local machine, and query results have to be checked visually.

  1. Install psql (PostgreSQL client)
    1. It often comes with the PostgreSQL binaries. On macOS, you can install it with brew by executing the following command:
    brew install postgresql
  2. Start the isomorphicdb instance with the command:
    cargo run
  3. Start psql with the following command:
    psql -h 127.0.0.1 -W
    1. enter any password
  4. Run the SQL scripts from the compatibility folder

Compute code coverage locally

  1. Run ./etc/local/code_coverage.sh
  2. Open ./target/debug/coverage/index.html in your browser

Running Compatibility tests locally

  1. Install Java version 8 or 11 (the versions that have been tested)
  2. (Optional) Install Gradle version 6 (the version that has been tested)
  3. Run RUST_LOG=debug cargo run from the project folder in a separate terminal window
  4. Run ./etc/local/compatibility.sh

Running Erlang Client Compatibility tests locally

  1. Install Erlang version 23.1 (the version currently tested on CI). You can install a specific version of Erlang via asdf.
  2. Install rebar3 to run Erlang Common Test.
  3. Run ./etc/ci/erlang_client.sh.
  4. Kill the isomorphicdb process manually before running the tests again.

isomorphicdb's People

Contributors

alex-dukhno, calldata, dependabot[bot], hbina, lpiepiora, oleksandr-valentirov, silathdiir, suhaibaffan, tolledo, vkulichenko


isomorphicdb's Issues

Research on Querying Engines

Initial source is "Designing Data-Intensive Applications"

  • Chapter 2 "Data Model and Query Languages"
  • Chapter 3 "Storage and Retrieval"

Setup automated running skeleton

Blocked by #34.
To run automated tests, the server has to implement just enough of the PostgreSQL wire protocol to communicate with psql and the rust-postgres driver.
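
A minimal sketch of such a check, assuming the synchronous rust-postgres (postgres) crate; the connection parameters and the statement are assumptions based on the local-setup section of this README, not a verified test:

// Hedged sketch: connect through the rust-postgres driver and run a trivial statement.
use postgres::{Client, NoTls};

fn main() -> Result<(), postgres::Error> {
    // host/user/password are assumptions; the server currently accepts any password
    let mut client = Client::connect("host=127.0.0.1 user=postgres password=123", NoTls)?;
    client.batch_execute("create schema automation_smoke;")?;
    println!("server speaks enough of the wire protocol for rust-postgres");
    Ok(())
}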

Create a wiki page with materials on databases including: video courses, research papers and projects, articles and blogs

Anonymous Check constraint

Epic #15
Blocked by #2

SQL to reproduce:

create schema SMOKE_QUERIES;

create table SMOKE_QUERIES.NOT_NULLABLE_TABLE (
    column_1 smallint check (column_1 > 100)
);

insert into SMOKE_QUERIES.NOT_NULLABLE_TABLE values (99);

Expected result:

ERROR: check 'column value > 100' constraint is violated

The problem is that the check constraint should reuse the predicate evaluation logic and should be persisted to the system.columns table.
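
A minimal, hypothetical sketch of the kind of predicate evaluation involved; the Predicate type and check_violated function are illustrative, not the actual data_manipulation/operators code:

// Hypothetical sketch: evaluate a check (column_1 > 100) predicate against an inserted value.
enum Predicate {
    GreaterThan(i16),
}

fn check_violated(predicate: &Predicate, inserted: i16) -> bool {
    match predicate {
        Predicate::GreaterThan(limit) => !(inserted > *limit),
    }
}

fn main() {
    let check = Predicate::GreaterThan(100);
    if check_violated(&check, 99) {
        eprintln!("ERROR: check 'column value > 100' constraint is violated");
    }
}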

Implement cross table constraints

  • Primary Key
  • Unique
  • Foreign Key

Test cases:

  • create table with FK to PK on another table
  • create table with FK referencing a column on another table that is neither a PK nor Unique
  • Multiple columns PK
  • Multiple columns Unique
  • Multiple columns FK
  • FK with wrong reference type

Pretty print of selected result

Currently, the client dumps data from the server as a UTF-8 string.
If a select query was sent, the client has to print the selected data in a human-readable format like the following (a rough formatting sketch is shown after the example):

+------+------+
| col1 | col2 |
+------+------+
|     1|     2|
+------+------+
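
A minimal sketch of the kind of client-side formatting meant here, assuming rows already arrive as strings; render_table is a hypothetical helper, not existing client code:

// Hedged sketch: pad each column to its widest value and draw an ASCII grid around it.
fn render_table(headers: &[&str], rows: &[Vec<String>]) -> String {
    // column width = widest of the header and all values in that column
    let mut widths: Vec<usize> = headers.iter().map(|h| h.len()).collect();
    for row in rows {
        for (i, cell) in row.iter().enumerate() {
            if cell.len() > widths[i] {
                widths[i] = cell.len();
            }
        }
    }
    let separator: String = widths
        .iter()
        .map(|w| format!("+{}", "-".repeat(w + 2)))
        .collect::<String>()
        + "+\n";
    let mut out = separator.clone();
    for (header, width) in headers.iter().zip(&widths) {
        out.push_str(&format!("| {:<1$} ", header, *width));
    }
    out.push_str("|\n");
    out.push_str(&separator);
    for row in rows {
        for (cell, width) in row.iter().zip(&widths) {
            out.push_str(&format!("| {:>1$} ", cell, *width));
        }
        out.push_str("|\n");
        out.push_str(&separator);
    }
    out
}

fn main() {
    let rows = vec![vec!["1".to_string(), "2".to_string()]];
    print!("{}", render_table(&["col1", "col2"], &rows));
}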

Improve Error handling in Frontend of Storage

The storage frontend does not handle situations when the backend stores fewer records than requested (e.g. it does not try to roll back writes by removing them, or by writing back old values in the case of an update).

Support different SQL types

  • integer types smallint, integer and bigint #77
  • character string types char(n) and varchar(n) #89
  • boolean type #107
  • numeric types decimal(p, s) and its alias numeric(p, s) #90
  • approximate types real and double precision #91
  • date time types time [without timezone], time with timezone, timestamp [without timezone], timestamp with timezone and date
  • time interval #93

Improve on disk SledStorage error handling

  1. create_schema - what if sled won't be able to create the Db (error handling is sketched after this list)
  2. create_table - what if sled won't be able to open_tree normally
  3. drop_table - what if sled won't be able to drop_tree normally
  4. insert_into -
    1. what if sled won't be able to open_tree normally
    2. what if sled won't be able to insert normally
  5. select_all_from -
    1. what if sled won't be able to open_tree normally
    2. what if sled won't be able to read data in iter normally
  6. update_all -
    1. open_tree
    2. fetch key with iter().keys()
  7. delete_all_from -
    1. open_tree
    2. fetch key with iter().keys()
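
A hedged sketch of one way to surface these failures instead of assuming they cannot happen; the StorageError type and the error mapping are hypothetical, only the sled calls (open_tree, insert) are the real crate API:

// Hypothetical sketch: map sled failures into a storage-level error instead of panicking.
use sled::Db;

#[derive(Debug)]
enum StorageError {
    Io(String),
}

fn insert_into(db: &Db, table: &str, key: &[u8], values: &[u8]) -> Result<(), StorageError> {
    // open_tree can fail, e.g. on an underlying I/O error
    let tree = db
        .open_tree(table)
        .map_err(|e| StorageError::Io(e.to_string()))?;
    // insert can fail as well; report it to the frontend instead of unwrapping
    tree.insert(key, values)
        .map_err(|e| StorageError::Io(e.to_string()))?;
    Ok(())
}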

Fix termination workaround for integration tests

The first part of the workaround is in the tests themselves: a test client sends a query that can't be parsed, and then a second client opens a connection to the server Node.
The second part is in the sql engine: it returns QueryResult::Terminate if the query is not parsable.
The third part is in Node: it sets its state to STOPPED when it gets QueryResult::Terminate from the sql engine.

Possible ways to fix it are:

  1. try to use the async tokio-postgres client (sketched below), if it doesn't have the same problem as the synchronous one (sfackler/rust-postgres#613)
  2. use a PostgreSQL driver for another language and write the tests in that language (e.g. Java, Python or Ruby)
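
A minimal sketch of option 1, assuming the async tokio-postgres crate with the tokio runtime; the connection string and the statement are assumptions, and whether this actually avoids the problem in rust-postgres#613 still needs to be verified:

// Hedged sketch: tokio-postgres splits the client from the connection task,
// so the test can terminate by dropping the client instead of sending an unparsable query.
use tokio_postgres::NoTls;

#[tokio::main]
async fn main() -> Result<(), tokio_postgres::Error> {
    let (client, connection) =
        tokio_postgres::connect("host=127.0.0.1 user=postgres password=123", NoTls).await?;
    // the connection future drives the wire protocol; run it on a background task
    let handle = tokio::spawn(connection);

    client.batch_execute("create schema termination_check;").await?;

    // dropping the client lets the connection finish and the test shut down cleanly
    drop(client);
    let _ = handle.await;
    Ok(())
}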

Preserve data inside storage after restart

Currently, the database stores all metadata about schemas and tables in memory only. If the process restarts, all data is lost. Technically the data is saved on disk with sled, but the storage layer would not find it there and would assume that nothing exists.

Anonymous Not Null Constraint

Currently, the database supports only type constraints (for the list of implemented constraints see #17).

SQL to reproduce:

create schema SMOKE_QUERIES;

create table SMOKE_QUERIES.NOT_NULLABLE_TABLE (
    column_1 smallint not null
);

insert into SMOKE_QUERIES.NOT_NULLABLE_TABLE values (null);

Expected result:

ERROR: not null constraint for column_1 is violated

When implementing this task, see the conversation in #124 (comment) about supporting null values in the storage.

Optionally, the database should support reporting violations of multiple constraints.

Named Not Null Constraint

Currently, the database supports only type constraints (for the list of implemented constraints see #17).

SQL to reproduce:

create schema SMOKE_QUERIES;

create table SMOKE_QUERIES.NOT_NULLABLE_TABLE (
    column_1 smallint constraint column_1_is_never_null not null
);

insert into SMOKE_QUERIES.NOT_NULLABLE_TABLE values (null);

Expected result:

ERROR: constraint 'column_1_is_never_null' is violated

When implementing this task, see the conversation in #124 (comment) about supporting null values in the storage.

Optionally, the database should support reporting violations of multiple constraints.

Research on Conflict Resolution.

Initial question: what exactly is it - a protocol, a strategy?

Initial source is "Designing Data-Intensive Applications", Chapter 5 "Replication".
Paragraphs:

  • "Handling Write Conflicts"
  • "Detecting Concurrent Writes"

Setup manually running skeleton

  • implement the simplest possible interaction with the psql client
  • Provide documentation of the local setup (for now it is local only)
  • Provide a documented list of PostgreSQL queries for testing compatibility

Handle `^C` in client app

How to reproduce:

  1. Run server app
  2. Run client app
  3. Press ^C in client app
  4. Server does not deregister Token(n)

Properly handle empty string command from client app

Server fails on empty SQL query

steps to reproduce:

  • start server
  • start client
  • type help and hit enter in client app
  • wait for server response
  • hit enter in client app

client trace with the command RUST_LOG=trace cargo run --bin client

 INFO  client > Starting client
help
 DEBUG client > typed command "help\n"
 DEBUG client > command send to server
 DEBUG client > 255 server result code
Ok("")

 DEBUG client > typed command "\n"
 DEBUG client > command send to server
 DEBUG client > 69 server result code
Unknown server code 69. Exiting!

server trace with the command RUST_LOG=trace cargo run --bin server

 INFO  server > Starting server on port 7000
 INFO  server > SQL engine has been created
 INFO  server > Network buffer of 256 size created
 DEBUG server > Connection 127.0.0.1:58240 is accepted
 TRACE server > 4 bytes read from network connection
 DEBUG server > help
 DEBUG server > Received from 127.0.0.1:58240 client
 DEBUG sqlparser::parser > Parsing sql 'help'...
 DEBUG server            > Query execution result
 DEBUG server            > Err(UnimplementedBranch("ParserError(\"Expected a keyword at the beginning of a statement, found: help\")"))
 TRACE server            > 0 bytes read from network connection
 DEBUG server            > 
 DEBUG server            > Received from 127.0.0.1:58240 client
 DEBUG sqlparser::parser > Parsing sql ''...
 DEBUG server            > Query execution result
 DEBUG server            > Err(UnimplementedBranch("UNIMPLEMENTED HANDLING OF \nNone\n STATEMENT!"))
 TRACE server            > 0 bytes read from network connection
 DEBUG server            > 
 DEBUG server            > Received from 127.0.0.1:58240 client
 DEBUG sqlparser::parser > Parsing sql ''...
 DEBUG server            > Query execution result
 DEBUG server            > Err(UnimplementedBranch("UNIMPLEMENTED HANDLING OF \nNone\n STATEMENT!"))
Error: Os { code: 32, kind: BrokenPipe, message: "Broken pipe" }

Implement authentication with PostgreSQL wire protocol

Blocked by #194

Currently, the database does not handle authentication at all. For now we can:

  • create a users table in the system namespace during FrontendStore initialisation
  • create a single record with an admin user and a simple password like 123
  • adjust the authentication flow in the protocol - here be dragons.
    One idea is to add an Authenticator trait to the protocol crate and implement it in the node module that glues the PG wire protocol to the other parts of the backend (see the sketch below)
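
A rough sketch of that idea; the trait name, its signature and the hard-coded credentials are all assumptions, not existing code:

// Hypothetical sketch: a trait the protocol crate could call during connection start-up,
// with a trivial implementation living in the node module.
trait Authenticator {
    // returns true if the user/password pair is accepted
    fn authenticate(&self, user: &str, password: &str) -> bool;
}

// Simplest possible implementation backed by the single admin record proposed above;
// a real implementation would read the users table in the system namespace.
struct SingleUserAuthenticator;

impl Authenticator for SingleUserAuthenticator {
    fn authenticate(&self, user: &str, password: &str) -> bool {
        user == "admin" && password == "123"
    }
}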

Named Check Constraint

Epic #15
Blocked by #2

SQL to reproduce:

create schema SMOKE_QUERIES;

create table SMOKE_QUERIES.NOT_NULLABLE_TABLE (
    column_1 smallint constraint column_1_greater_than_100 check (column_1 > 100)
);

insert into SMOKE_QUERIES.NOT_NULLABLE_TABLE values (99);

Expected result:

ERROR: constraint 'column_1_greater_than_100' is violated

The problem is that the check constraint should reuse the predicate evaluation logic and should be persisted to the system.columns table.

Correctly close connection on client side

How to reproduce:

  1. Run the example from the README
  2. In the client app press ^C
    The server app crashes with the backtrace:
thread 'main' panicked at 'Unexpected error: Connection reset by peer (os error 54)', src/bin/server.rs:124:25
stack backtrace:
   0: backtrace::backtrace::libunwind::trace
             at /Users/runner/.cargo/registry/src/github.com-1ecc6299db9ec823/backtrace-0.3.44/src/backtrace/libunwind.rs:86
   1: backtrace::backtrace::trace_unsynchronized
             at /Users/runner/.cargo/registry/src/github.com-1ecc6299db9ec823/backtrace-0.3.44/src/backtrace/mod.rs:66
   2: std::sys_common::backtrace::_print_fmt
             at src/libstd/sys_common/backtrace.rs:78
   3: <std::sys_common::backtrace::_print::DisplayBacktrace as core::fmt::Display>::fmt
             at src/libstd/sys_common/backtrace.rs:59
   4: core::fmt::write
             at src/libcore/fmt/mod.rs:1063
   5: std::io::Write::write_fmt
             at src/libstd/io/mod.rs:1426
   6: std::sys_common::backtrace::_print
             at src/libstd/sys_common/backtrace.rs:62
   7: std::sys_common::backtrace::print
             at src/libstd/sys_common/backtrace.rs:49
   8: std::panicking::default_hook::{{closure}}
             at src/libstd/panicking.rs:204
   9: std::panicking::default_hook
             at src/libstd/panicking.rs:224
  10: std::panicking::rust_panic_with_hook
             at src/libstd/panicking.rs:470
  11: rust_begin_unwind
             at src/libstd/panicking.rs:378
  12: std::thread::local::fast::Key<T>::try_initialize
  13: server::main
             at src/bin/server.rs:124
  14: std::rt::lang_start::{{closure}}
             at /rustc/4fb7144ed159f94491249e86d5bbd033b5d60550/src/libstd/rt.rs:67
  15: std::rt::lang_start_internal::{{closure}}
             at src/libstd/rt.rs:52
  16: std::panicking::try::do_call
             at src/libstd/panicking.rs:303
  17: __rust_maybe_catch_panic
             at src/libpanic_unwind/lib.rs:86
  18: std::panicking::try
             at src/libstd/panicking.rs:281
  19: std::panic::catch_unwind
             at src/libstd/panic.rs:394
  20: std::rt::lang_start_internal
             at src/libstd/rt.rs:51
  21: std::rt::lang_start
             at /rustc/4fb7144ed159f94491249e86d5bbd033b5d60550/src/libstd/rt.rs:67
  22: server::main
note: Some details are omitted, run with `RUST_BACKTRACE=full` for a verbose backtrace.

Process finished with exit code 101
