datajoint-core's Introduction

DataJoint Core Library

The DataJoint Core Library is a low-level software library that holds code shared across the user-level DataJoint frameworks. Rather than maintaining their own implementations, DataJoint frameworks, such as those written in Python and MATLAB, can use the library to connect to Structured Query Language (SQL) databases, execute queries against a connection, and read query results. The core library aims to remove the burden of writing and maintaining duplicate code across language-specific DataJoint frameworks, improving developer productivity, ecosystem maintainability, and framework extensibility. Future work can extend the library to house more code that currently exists at the user level, such as building generic SQL queries and building schemas.

This project started as a UTDesign Capstone project with senior students in Computer Science at The University of Texas at Dallas.

Repository Layout

This repository primarily contains two Rust packages (also known as crates) that make up the core library.

At the moment, datajoint-python is a temporary home for integrating the core library into the Python library.

Set Up

See the setup document for setup instructions.

datajoint-core's People

Contributors

benjuan-507, garmoned, guzman-raphael, jackson-nestelroad, jhocevar, rhijke, space-cowboy2000, vumthi, yashals

datajoint-core's Issues

Create file structure with crates and connection submodule

Set up the initial file structure for the datajoint-core repository. This repository should contain two Rust library crates (packages) and should follow standard conventions for organizing both. The following two packages should be contained in the repository:

  • datajoint-core - A Rust library for shared code across DataJoint clients. Will currently facilitate connections to SQL databases and raw querying.
  • datajoint-core-ffi-c - A Rust library for calling datajoint-core from other languages using the C FFI. This crate defines all C-interface functions (written in Rust) that expose the functionality of datajoint-core to other languages.

The datajoint-core package should currently contain only one submodule: the connection submodule. This submodule should be accessible to the outside world (for example, as datajoint_core::connection). In the future, datajoint-core will support more submodules, such as query and schema, as more functionality is ported over.

Both Rust packages should be usable as a library crate that can be used by other Rust programs.

Here is an example structure that puts these two packages in one repository:

/packages
    /datajoint-core
        /src
            /connection
        Cargo.toml
    /datajoint-core-ffi-c
        /src
        Cargo.toml
LICENSE
README.md
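A minimal sketch of how the connection submodule could be made reachable as datajoint_core::connection. The module is inlined here so the snippet stands alone; in the layout above it would live under src/connection. The Connection struct and its new function are hypothetical placeholders, not the library's actual API.

```rust
// Stand-in for packages/datajoint-core/src/lib.rs.
// In the layout above, the module body would live in src/connection/mod.rs
// and be declared here with `pub mod connection;`. It is inlined only to keep
// this sketch self-contained.
pub mod connection {
    /// Hypothetical placeholder type (an assumption for illustration).
    #[derive(Debug, Default)]
    pub struct Connection;

    impl Connection {
        /// Creates the placeholder value; performs no I/O.
        pub fn new() -> Self {
            Connection
        }
    }
}

// A downstream crate depending on datajoint-core could then write:
//     use datajoint_core::connection::Connection;
```

Declaring the module as pub at the crate root is what makes it visible to other Rust programs that use the crate as a library.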

Send a raw SQL query over a database connection

A database connection should have a public interface for running raw SQL queries (in string form). Queries are assumed to be correct for the database with no placeholder arguments. Query results and potential errors should be safely communicated to the caller.
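One possible shape for such an interface is sketched below. Everything in it is an assumption made for illustration: the RawQuery trait, the QueryError type, and the MockConnection stand-in (used so the sketch runs without a database server) are hypothetical names, not part of datajoint-core.

```rust
use std::fmt;

/// Hypothetical error type returned to the caller instead of panicking.
#[derive(Debug, PartialEq)]
pub enum QueryError {
    NotConnected,
    Database(String),
}

impl fmt::Display for QueryError {
    fn fmt(&self, f: &mut fmt::Formatter) -> fmt::Result {
        match self {
            QueryError::NotConnected => write!(f, "not connected"),
            QueryError::Database(msg) => write!(f, "database error: {}", msg),
        }
    }
}

impl std::error::Error for QueryError {}

/// Hypothetical public interface for running raw SQL in string form.
pub trait RawQuery {
    /// Sends `query` verbatim (no placeholder arguments) and returns the
    /// number of rows affected.
    fn execute(&mut self, query: &str) -> Result<u64, QueryError>;
}

/// In-memory stand-in for a real connection; it never touches a database.
pub struct MockConnection {
    pub connected: bool,
}

impl RawQuery for MockConnection {
    fn execute(&mut self, query: &str) -> Result<u64, QueryError> {
        if !self.connected {
            return Err(QueryError::NotConnected);
        }
        if query.trim().is_empty() {
            return Err(QueryError::Database("empty query".to_string()));
        }
        // A real implementation would forward the string to the database here.
        Ok(0)
    }
}
```

Returning a Result keeps both the query outcome and any failure inside the type system, so the caller decides how to handle each error.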

Initialize database connection with settings object

Allow a database connection to be initialized and established using a settings object (as defined by #3). Initialization and connection should be (if possible) two separate actions to reflect what currently happens in the client libraries.
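The separation between initializing and connecting could look roughly like the sketch below. The field names, the is_connected helper, and the String error type are assumptions for illustration; the connect method only flips a flag where a real implementation would open the database connection.

```rust
/// Hypothetical subset of the settings object.
#[derive(Debug, Clone, PartialEq)]
pub struct ConnectionSettings {
    pub hostname: String,
    pub port: u16,
}

/// Hypothetical connection that separates construction from connecting.
#[derive(Debug)]
pub struct Connection {
    settings: ConnectionSettings,
    connected: bool,
}

impl Connection {
    /// Initialization: stores the settings and performs no I/O.
    pub fn new(settings: ConnectionSettings) -> Self {
        Connection { settings, connected: false }
    }

    /// Establishing: a real implementation would open the connection here.
    /// This sketch only records the state change.
    pub fn connect(&mut self) -> Result<(), String> {
        if self.connected {
            return Err("already connected".to_string());
        }
        self.connected = true;
        Ok(())
    }

    /// Marks the connection as closed.
    pub fn disconnect(&mut self) {
        self.connected = false;
    }

    pub fn is_connected(&self) -> bool {
        self.connected
    }

    pub fn settings(&self) -> &ConnectionSettings {
        &self.settings
    }
}
```

Keeping new free of I/O means a connection object can be created and configured up front, with the fallible step isolated in connect.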

Add documentation to all C FFI code

All C FFI code should have properly formatted Rustdoc comments that explain what each function does, its side effects, its output parameters, and its return values.
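As an illustration of the level of detail meant here, the sketch below documents one C-interface function. The function connection_is_connected, the Connection layout, and the specific error codes are hypothetical; a function exported from the library would additionally need the no_mangle attribute, which is omitted so the snippet compiles unchanged across Rust editions.

```rust
use std::os::raw::c_int;

/// Hypothetical connection type, reduced to the one field this sketch needs.
pub struct Connection {
    pub connected: bool,
}

/// Reports whether a connection is currently open.
///
/// # Parameters
/// - `this`: the connection to inspect. Must not be null.
/// - `out`: output parameter. On success it receives `1` if the connection
///   is open and `0` otherwise. Must not be null.
///
/// # Returns
/// `0` on success, or `1` if either pointer is null (assumed error code).
///
/// # Side effects
/// None. The connection is not modified.
///
/// # Safety
/// Both pointers must be null or point to valid, properly aligned values
/// that stay alive for the duration of the call.
pub unsafe extern "C" fn connection_is_connected(
    this: *const Connection,
    out: *mut c_int,
) -> c_int {
    if this.is_null() || out.is_null() {
        return 1;
    }
    // Both pointers were checked for null above; validity is the caller's
    // responsibility, as stated in the Safety section.
    unsafe {
        *out = (*this).connected as c_int;
    }
    0
}
```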

Implement default settings for database connections

Define and implement a unified set of settings (along with their defaults) for connecting to MySQL and Postgres databases. datajoint-core should currently use only the default settings, but changing settings programmatically and reading them from a file should also be supported. All settings should be mapped appropriately to SQLx for creating a database connection.
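A rough sketch of a settings object with defaults is shown below. The field names and default values (localhost, port 3306, user root) are assumptions chosen for illustration, not defaults decided by the project. The uri method shows one way the settings could be mapped to a database connection URL; it does not percent-encode the username or password, which a real implementation would need to do.

```rust
/// Database backends named in the issue.
#[derive(Debug, Clone, PartialEq)]
pub enum DatabaseType {
    MySql,
    Postgres,
}

/// Hypothetical unified settings object; all fields are assumptions.
#[derive(Debug, Clone, PartialEq)]
pub struct ConnectionSettings {
    pub database_type: DatabaseType,
    pub hostname: String,
    pub port: u16,
    pub username: String,
    pub password: String,
    pub database_name: String,
}

impl Default for ConnectionSettings {
    /// Illustrative defaults only.
    fn default() -> Self {
        ConnectionSettings {
            database_type: DatabaseType::MySql,
            hostname: "localhost".to_string(),
            port: 3306,
            username: "root".to_string(),
            password: String::new(),
            database_name: String::new(),
        }
    }
}

impl ConnectionSettings {
    /// Builds a connection URL of the form scheme://user:pass@host:port/db.
    /// No percent-encoding is applied in this sketch.
    pub fn uri(&self) -> String {
        let scheme = match self.database_type {
            DatabaseType::MySql => "mysql",
            DatabaseType::Postgres => "postgres",
        };
        format!(
            "{}://{}:{}@{}:{}/{}",
            scheme, self.username, self.password, self.hostname, self.port, self.database_name
        )
    }
}
```

Individual fields can be overridden with struct update syntax, for example `ConnectionSettings { port: 5432, ..ConnectionSettings::default() }`.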

Set up SQLx

Set up SQLx, a crate for connecting to and querying various types of SQL databases. This crate should only be used by datajoint-core.

Support generic decoding in Rust and C FFI

The C FFI should have generic decoding using one method as opposed to needing to call a method for every possible data type. The Rust library will need to support generic decoding as well.

pub enum DecodeResult {
    Int8(i8),
    UInt8(u8),
    Int16(i16),
    UInt16(u16),
    Int32(i32),
    UInt32(u32),
    String(String),
    Float32(f32),
    Float64(f64),
    Bytes(Vec<u8>),
}

pub fn TableColumnRef::type_name(&self) -> DataJointType;
pub fn TableRow::decode(&self, column: TableColumnRef) -> DecodeResult;
pub fn TableRow::try_decode(&self, column: TableColumnRef) -> Result<DecodeResult, Error>;

// Getters based on native type.
int table_row_get_int8(TableRow* this, TableColumnRef* column, int8_t* out);
int table_row_get_by_name_int8(TableRow* this, const char* column_name, int8_t* out);
int table_row_get_by_ordinal_int8(TableRow* this, size_t column_ordinal, int8_t* out);

int table_row_get_uint8(TableRow* this, TableColumnRef* column, uint8_t* out);
int table_row_get_by_name_uint8(TableRow* this, const char* column_name, uint8_t* out);
int table_row_get_by_ordinal_uint8(TableRow* this, size_t column_ordinal, uint8_t* out);

// And more...

// Decodes the value to the buffer allocated by the caller.
int table_row_decode_to_buffer(TableRow* this, TableColumnRef* column, void* buffer, size_t buffer_size, size_t* output_size, NativeDecodedType* type);

// Wraps void* data, size_t size, and NativeDecodedType type.
struct AllocatedDecodedValue;

// Decodes the value, making its own allocation in the process.
int table_row_decode_to_allocation(TableRow* this, TableColumnRef* column, AllocatedDecodedValue* output);

// Allocates memory for the wrapper struct.
AllocatedDecodedValue* allocated_decoded_value_new();
// Returns allocated memory.
void* allocated_decoded_value_data(AllocatedDecodedValue* this);
// Returns size of memory allocation.
size_t allocated_decoded_value_size(AllocatedDecodedValue* this);
// Returns type of allocated memory for reading.
NativeDecodedType allocated_decoded_value_type(AllocatedDecodedValue* this);
// Calls the proper deallocator.
void allocated_decoded_value_free(AllocatedDecodedValue* this);
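On the Rust side, decoding into a caller-allocated buffer could be approached as in the sketch below. DecodeResult is copied from the enum above. The NativeDecodedType variants, the helper to_native_bytes, and the safe decode_to_buffer function (which returns the required size as its error) are assumptions for illustration rather than the planned implementation.

```rust
/// Copied from the enum proposed above.
#[derive(Debug, Clone, PartialEq)]
pub enum DecodeResult {
    Int8(i8),
    UInt8(u8),
    Int16(i16),
    UInt16(u16),
    Int32(i32),
    UInt32(u32),
    String(String),
    Float32(f32),
    Float64(f64),
    Bytes(Vec<u8>),
}

/// Hypothetical tag that tells the caller how to read the raw bytes.
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum NativeDecodedType {
    Int8, UInt8, Int16, UInt16, Int32, UInt32, String, Float32, Float64, Bytes,
}

/// Converts a decoded value to native-endian bytes plus its type tag.
pub fn to_native_bytes(value: &DecodeResult) -> (Vec<u8>, NativeDecodedType) {
    match value {
        DecodeResult::Int8(v) => (v.to_ne_bytes().to_vec(), NativeDecodedType::Int8),
        DecodeResult::UInt8(v) => (v.to_ne_bytes().to_vec(), NativeDecodedType::UInt8),
        DecodeResult::Int16(v) => (v.to_ne_bytes().to_vec(), NativeDecodedType::Int16),
        DecodeResult::UInt16(v) => (v.to_ne_bytes().to_vec(), NativeDecodedType::UInt16),
        DecodeResult::Int32(v) => (v.to_ne_bytes().to_vec(), NativeDecodedType::Int32),
        DecodeResult::UInt32(v) => (v.to_ne_bytes().to_vec(), NativeDecodedType::UInt32),
        DecodeResult::String(s) => (s.as_bytes().to_vec(), NativeDecodedType::String),
        DecodeResult::Float32(v) => (v.to_ne_bytes().to_vec(), NativeDecodedType::Float32),
        DecodeResult::Float64(v) => (v.to_ne_bytes().to_vec(), NativeDecodedType::Float64),
        DecodeResult::Bytes(b) => (b.clone(), NativeDecodedType::Bytes),
    }
}

/// Copies the value into `buffer`. On success returns the number of bytes
/// written and the type tag; if the buffer is too small, returns the
/// required size so the caller can retry with a larger buffer.
pub fn decode_to_buffer(
    value: &DecodeResult,
    buffer: &mut [u8],
) -> Result<(usize, NativeDecodedType), usize> {
    let (bytes, ty) = to_native_bytes(value);
    if bytes.len() > buffer.len() {
        return Err(bytes.len());
    }
    buffer[..bytes.len()].copy_from_slice(&bytes);
    Ok((bytes.len(), ty))
}
```

A single function of this kind lets the C layer expose one generic decode entry point, with the type tag telling the caller how to interpret the bytes.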

Add C++ FFI

The existing DataJoint implementations in MATLAB and Python are both object-oriented, but C FFIs are necessarily not object-oriented. This will ultimately result in a lot of code duplication as interfaces are built in each target language to bundle up the exposed C functions.

Both MATLAB and Python support C++ FFIs, which would reduce this overhead. In Python, using a tool such as SWIG, one can define a class interface in a C++ header and interact with that class as a regular Python class.

namespace dj {
  class Table {
   public:
    void insert(Entries* entries);
    ...
  };
}

class Animal(dj.Table):
  ...
The MATLAB side of things is less ideal, since it does not (currently) support direct inheritance of C++ classes and, even worse, doesn't permit factory constructors. However, one can otherwise interface with C++ objects through the built-in clibgen package. So while building the table interface may require some redundant code, other interfaces like query objects that don't involve user class definitions wouldn't.

Doing this amounts to wrapping the C interface in an additional C++ header, linking to the existing DLL, and building with SWIG or clibgen, which is fairly straightforward. An added benefit is that SWIG offers bindings to other target languages, including JavaScript, at little extra cost.

var Animal = {...};
Object.setPrototypeOf(Animal, dj.Table);

Initial C FFI

Structs

  • ConnectionSettings
  • Connection
  • Executor
  • Cursor
  • TableRow
  • TableRowVector

Functions

ConnectionSettings

  • ConnectionSettings* connection_settings_new();
  • void connection_settings_free(ConnectionSettings* self);
  • void connection_settings_set_*(ConnectionSettings* self, T value);
  • T connection_settings_get_*(ConnectionSettings* self);

Connection

  • Connection* connection_new(ConnectionSettings* settings);
  • void connection_free(Connection* self);
  • int connection_connect(Connection* self);
  • int connection_disconnect(Connection* self);
  • int connection_reconnect(Connection* self);
  • ConnectionSettings* connection_get_settings(Connection* self);
  • int connection_executor(Connection* self, Executor* out);
  • int connection_execute_query(Connection* self, const char* query, unsigned long long* out_size);
  • int connection_fetch_query(Connection* self, const char* query, Cursor* out_cursor);

Executor

  • Executor* executor_new();
  • void executor_free(Executor* self);
  • int executor_execute(Executor* self, const char* query, unsigned long long* out_size);
  • int executor_fetch_one(Executor* self, const char* query, TableRow* out);
  • int executor_fetch_all(Executor* self, const char* query, TableRowVector* out);
  • Cursor* executor_cursor(Executor* self, Cursor* out);

Cursor

  • Cursor* cursor_new();
  • void cursor_free(Cursor* self);
  • int cursor_next(Cursor* self, TableRow* out);
  • int cursor_rest(Cursor* self, TableRowVector* out);

TableColumnRef

  • TableColumnRef* table_column_ref_new();
  • void table_column_ref_free(TableColumnRef* self);
  • size_t table_column_ref_ordinal(TableColumnRef* self);
  • const char* table_column_ref_name(TableColumnRef* self);
  • DataJointType table_column_ref_type(TableColumnRef* self);

TableRow

  • TableRow* table_row_new();
  • void table_row_free(TableRow* self);
  • int table_row_is_empty(TableRow* self);
  • int table_row_columns(TableRow* self, TableColumnRef* out_columns, size_t columns_size);
  • size_t table_row_column_count(TableRow* self);
  • int table_row_get_column_with_name(TableRow* self, const char* column_name, TableColumnRef* out);
  • int table_row_get_column_with_ordinal(TableRow* self, size_t ordinal, TableColumnRef* out);
  • int table_row_get_*(TableRow* self, TableColumnRef* column, T* out);
  • int table_row_get_by_name_*(TableRow* self, const char* column_name, T* out);
  • int table_row_get_by_ordinal_*(TableRow* self, size_t ordinal, T* out);

TableRowVector

  • TableRowVector* table_row_vector_new();
  • void table_row_vector_free(TableRowVector* self);
  • size_t table_row_vector_size(TableRowVector* self);
  • TableRow* table_row_vector_get(TableRowVector* self, size_t index);

Notes

  • T* T_new() functions for structs without default constructors should call malloc to allocate memory.
  • NULL checks should be implemented for all self parameters and associated with a common error code.
  • All functions that return int return an error code: 0 for success, non-zero for error. No panics are allowed, so the try_* versions of all methods should be called under the hood.
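The notes above can be sketched for a single function as follows. The error code constants, the Connection layout, and the try_connect method are hypothetical. The pointer parameter is named this because self is a reserved word in Rust, and the no_mangle attribute that an exported function would need is omitted so the snippet compiles unchanged across Rust editions.

```rust
use std::os::raw::c_int;
use std::panic::{catch_unwind, AssertUnwindSafe};

// Hypothetical error codes; 0 means success as described in the notes.
pub const SUCCESS: c_int = 0;
pub const ERROR_NULL_POINTER: c_int = 1;
pub const ERROR_OPERATION_FAILED: c_int = 2;
pub const ERROR_PANIC: c_int = 3;

/// Hypothetical connection type for this sketch.
pub struct Connection {
    pub connected: bool,
}

impl Connection {
    /// Fallible variant that reports errors instead of panicking.
    pub fn try_connect(&mut self) -> Result<(), String> {
        if self.connected {
            Err("already connected".to_string())
        } else {
            self.connected = true;
            Ok(())
        }
    }
}

/// C-interface wrapper: checks for null, calls the try_* method, and maps
/// every outcome (including an unexpected panic) to an integer code.
///
/// # Safety
/// `this` must be null or point to a valid `Connection`.
pub unsafe extern "C" fn connection_connect(this: *mut Connection) -> c_int {
    if this.is_null() {
        return ERROR_NULL_POINTER;
    }
    // Checked for null above; validity is the caller's responsibility.
    let conn = unsafe { &mut *this };
    // catch_unwind stops a panic from unwinding into the calling language.
    match catch_unwind(AssertUnwindSafe(|| conn.try_connect())) {
        Ok(Ok(())) => SUCCESS,
        Ok(Err(_)) => ERROR_OPERATION_FAILED,
        Err(_) => ERROR_PANIC,
    }
}
```

Mapping the null check to one shared constant gives every function in the interface the same answer for the same mistake, which keeps error handling in the client languages uniform.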
