
venflow's Introduction

Hello there 👋 twentyfourminutes

General Kenobi

I am Twenty, aka TwentyFourMinutes, and I write a lot of code.

Almost all of my code is written in C#, as it's my favourite language. The back-end is where you can find me, since JavaScript is my worst nightmare. Most of what I do is open-source and uses the latest features the .NET team throws at us. Raw speed and developer experience are the key points of all my projects.

If you want to know more about me and what I do, either check out my pinned repositories or my website.

TwentyFourMinutes's GitHub stats

venflow's People

Contributors

altriusrs, dependabot[bot], github-actions[bot], shubhamshah14102, t0shik, twentyfourminutes


venflow's Issues

RepoDb - Fluent Query

Hi, I am interested to see your benchmark results, as this project targets PostgreSQL.

Can you take a look here? It seems the benchmark queries all records from the DB and does the post-filtering on the client.

You can use the code below.

connection.QueryAll<T>(top: 1);

For all the ToList() calls, can you use the AsList() method instead? It is part of RepoDb.Extensions.

Add a fluent query building API

Definition

The new fluent API should be as fast and allocation-free as possible, while still providing the same level of configuration capabilities as the old one. The most notable changes are:

  • The cache key of each query will be the MethodInfo of its SQL statement, which will from now on be required to be defined as a lambda.
    See: #28
  • The actual configuration of a query will be compiled AOT and keyed by the method and class name of the lambda.
  • The runtime fluent API will no longer carry any data; instead it will use empty structs and dummy methods.

Design

The finalized fluent API for querying will look something along the lines of:

await database.Table.Query(table => $"select {table:*} from {table} where {table.Id} = {0}").TrackChanges().SingleAsync();

For now, the current goal is just to allow parameters and configurations. This forces us to limit the amount of flexibility, which is not too big of a deal, as the lost use cases were very uncommon anyway. Some of these restrictions are:

  • The complete query has to be inline; that means the following code will no longer be considered legal:
    var query = database.Table.Query(() => "select * from table");
    
    if(trackChanges)
       query.TrackChanges();
      
    var result = await query.SingleAsync();
  • Also, we do not allow any conditional SQL in the lambda itself, so this, providing a variable of type Func, and similar attempts would be considered illegal:
    await database.Table.Query(() => "select Id " + (includeName ? ", Name" : "") + " from table").SingleAsync();

The fluent API itself will no longer be based on interfaces and instance methods, but rather on empty structs and extension methods. The SingleAsync method will receive its data through an ambient ThreadStatic variable which holds data such as the DbCommand and the query materializer that should be used.
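The ambient-data pattern can be sketched in isolation (a minimal sketch; all names here are illustrative assumptions, not Venflow's internals):

```csharp
using System;

// The builder struct carries no state; the producing and consuming methods
// share data through a [ThreadStatic] slot instead.
public readonly struct QueryBuilder { } // empty: no fields, no allocations

public static class AmbientQuery
{
    [ThreadStatic] private static string? _commandText;

    public static QueryBuilder Query(string sql)
    {
        _commandText = sql;   // the query method writes the ambient slot
        return default;
    }

    public static string Execute(this QueryBuilder _)
    {
        return _commandText!; // the terminal method reads it back
    }
}
```

Usage: `AmbientQuery.Query("select 1").Execute()` yields the SQL back even though the struct flowing through the call chain is empty. Since the thread is the implicit channel, this only works when the whole fluent chain is inline, which is exactly the restriction stated above.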

Proposal

public class Table<T> where T : class, new()
{
    public QueryBuilder<T> Query()
    {
        return default;
    }
}

[StructLayout(LayoutKind.Sequential, Pack = 0)]
public readonly ref struct QueryBuilder<T> where T : class, new()
{

}

public static class QueryBuilderExtensions
{
    public static QueryBuilder<T> TrackChanges<T>(this QueryBuilder<T> builder) where T : class, new()
    {
        return default;
    }
    
    public static QueryBuilder<T> TrackChanges<T>(this QueryBuilder<T> builder, bool trackChanges) where T : class, new()
    {
        return default;
    }

    public static Task<T> SingleAsync<T>(this QueryBuilder<T> builder) where T : class, new()
    {        
        return Task.FromResult(default(T));
    }
}

Interpolated method names should omit Interpolated and non-interpolated methods should include Raw

Discussed in #10

Originally posted by TwentyFourMinutes September 4, 2021
Currently we have the following methods: QueryInterpolatedSingle, QueryInterpolatedBatch and ExecuteInterpolatedAsync. Their purpose is to avoid SQL injection and to allow for easier parameterized SQL. However, it is quite easy to mistakenly call their non-interpolated counterparts.

Proposal

Omit Interpolated in all of the above-mentioned method names and instead add Raw to their counterpart methods, e.g. QueryRawSingle, QueryRawBatch and ExecuteRawAsync.

Pros

  • It is a lot less likely to mistakenly execute queries which are vulnerable to SQL injection.
  • It is a lot shorter and easier to read.

Cons

  • It is a breaking change.

I would be happy about any thoughts or suggestions, as well as a vote with either 👍 or 👎!

Add CRUD code generators

Definition

Writing all of the generators will be one of the most time-consuming and bug-prone tasks in the beginning, if not the most. However, there are a lot more benefits than downsides, with the biggest ones being:

  • Debuggable and inspectable materializers & inserters in a meaningful manner
  • Optimization of the code by the compiler, instead of purely relying on the JIT and hand-written IL
  • Way more maintainable code, as hand-written IL is very error-prone and hard to maintain

However, as mentioned, there are downsides which do hurt, but it is well worth the trade-off, with the most notable being:

  • No reading and writing of hidden fields and/or accessors; this includes read-only properties and all accessors other than public
  • Not the most optimized materializer theoretically possible; however, we need less code loaded to accomplish the same task.

Query

Design

As mentioned before, the materializer itself will no longer be as optimized as theoretically possible. This comes down to a simple problem: there is no consistent and easy way to predict AOT which columns may be returned by an SQL query, meaning before the query has actually executed once.

This therefore means that we have two parts which play together while reading a DbDataReader.

  1. The actual detection of which columns got returned.
    This will be done by a dictionary that gets compiled AOT, containing the column name of each property as a key, with the value being the property's absolute index in the entity. On the first execution of this query at runtime, these will be mapped to a new static array where the index is the column index of the reader and the value is the property index.
  2. The actual materialization, which gets reused by all queries with the same table, order of joins and other configuration options such as change tracking.
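The two-step mapping above can be sketched as follows (a minimal, illustrative sketch; the dictionary contents and names are assumptions, not Venflow's generated code):

```csharp
using System;
using System.Collections.Generic;

public static class ColumnMapDemo
{
    // Step 1 (compiled AOT in the design above): column name -> absolute
    // property index in the entity.
    public static readonly Dictionary<string, ushort> PropertyIndices = new()
    {
        ["id"] = 0,
        ["name"] = 1,
        ["content"] = 2,
    };

    // Step 2 (built on the first execution at runtime): reader column index ->
    // property index, so the hot materializer path is a plain array lookup.
    public static ushort[] BuildColumnMap(string[] readerColumns)
    {
        var map = new ushort[readerColumns.Length];
        for (var i = 0; i < readerColumns.Length; i++)
            map[i] = PropertyIndices[readerColumns[i]];
        return map;
    }
}
```

For example, a query returning the columns `name, id` in that order produces the map `[1, 0]`, which is the kind of `cols` array the generated parsers switch over.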

Proposal

Single entity:

public static Person PersonParser_1(DbDataReader reader, ushort[] cols)
{
    Person person = null;
    var columnCount = cols.Length;

    for (var index = 0; index < columnCount; index++)
    {
        switch (cols[index])
        {
            case 0:
                person = new Person();
                person.Id = reader.GetFieldValue<int>(index);
                break;
            case 1:
                person.Name = reader.GetFieldValue<string>(index);
                break;
            case 2:
                person.Content = reader.GetFieldValue<string>(index);
                break;
        }
    }

    return person;
}

Multiple joined entities (one-to-many):

public static async Task<Person> PersonParser_2(DbDataReader reader, ushort[] cols) 
{
    Person person = null;
    int personId = default;
    Data lastData = null;

    var startColumnIndex = 0;
    var columnCount = cols.Length;

    while(await reader.ReadAsync())
    {
        for(var index = startColumnIndex; index < columnCount; index++)
        {
            switch(cols[index])
            {
                case 0:
                    person = new Person();
                    person.Id = personId = reader.GetFieldValue<int>(index);

                    startColumnIndex = 3;
                break;
                case 1:
                    person.Name = reader.GetFieldValue<string>(index);
                break;
                case 2:
                    person.Address = reader.GetFieldValue<string>(index);
                break;

                case 3:
                    lastData = new Data();
                    person.Data.Add(lastData);
                    lastData.Id = reader.GetFieldValue<int>(index);
                    lastData.PersonId = personId;
                break;
                case 4:
                    lastData.Content = reader.GetFieldValue<string>(index);
                break;
            }
        }
    }

    return person;
}

Insert

Design

The insert doesn't change much in its implementation other than reusing some parts of the inserter.

Proposal

Single entity:

public static void PersonInserter(NpgsqlCommand cmd, Person p) 
{
    var parameters = cmd.Parameters;

    parameters.Add(new NpgsqlParameter<string>("@p0", p.Name));
    parameters.Add(new NpgsqlParameter<string>("@p1", p.Content));

    cmd.CommandText = "INSERT INTO people (name, content) VALUES (@p0, @p1) RETURNING id";
}

public static async Task Inserter(Person p) 
{
    using var cmd = new NpgsqlCommand();

    PersonInserter(cmd, p);

    p.Id = (int)await cmd.ExecuteScalarAsync();
}

Multiple entities:

public static void PersonInserter(NpgsqlParameterCollection parameters, StringBuilder commandText, Person[] people) 
{
    commandText.Append("INSERT INTO people (name, content) VALUES ");

    var absoluteIndex = 0;

    for(int i = 0; i < people.Length; i++)
    { 
        var p = people[i];

        var name1 = "@p" + absoluteIndex++;
        var name2 = "@p" + absoluteIndex++;

        commandText.Append('(')
                     .Append(name1)
                     .Append(',').Append(' ')
                     .Append(name2)
                     .Append(')').Append(',').Append(' ');

        parameters.Add(new NpgsqlParameter<string>(name1, p.Name));
        parameters.Add(new NpgsqlParameter<string>(name2, p.Content));
    }

    commandText.Length -= 2;
    commandText.Append(" RETURNING id");
}

public static async Task Inserter(Person[] p) 
{
    using var cmd = new NpgsqlCommand();
    var commandText = new StringBuilder();
    PersonInserter(cmd.Parameters, commandText, p);
    cmd.CommandText = commandText.ToString();

    await using var reader = await cmd.ExecuteReaderAsync(p.Length == 1 ? CommandBehavior.SingleRow : CommandBehavior.Default);

    var index = 0;

    while(await reader.ReadAsync())
    {
        p[index++].Id = reader.GetFieldValue<int>(0);
    }
}

Update

Design

We used to store an array of booleans with a fixed length equal to the total number of theoretically updatable columns, which was suboptimal, as it required an array allocation and an iteration over the whole array even if only one column was updated. In Reflow this will completely change.

From now on, changes will be stored in local numeric fields in which each bit represents a changed column. Additionally, the trackChanges flag will be stored in there as well, to reduce memory usage even further. Assuming there are four updatable columns, there would be one byte field. The least significant bit would indicate whether changes should be tracked, and the following four bits (from least significant to most significant) would be responsible for the columns. The remaining three bits would be unused. If there are more than 7 updatable fields, we will change the field to the next smallest fitting numeric type, e.g. ushort/uint/ulong. For now we will restrict the maximum number of updatable columns to 63 (as one bit is required for the aforementioned trackChanges flag).

While building the SQL required to actually update the columns in the database, we use an algorithm which iterates only over the set bits, which improves performance even further. Additionally, we no longer need to invoke a delegate to obtain the new value of each column.
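The set-bit walk described above can be demonstrated in isolation (a minimal sketch; ChangeBits and ChangedColumns are illustrative names, not Reflow's generated code):

```csharp
using System;
using System.Collections.Generic;
using System.Numerics;

public static class ChangeBits
{
    // Walks only the set bits of a change mask, lowest-first:
    // x & (~x + 1) isolates the lowest set bit, x & (x - 1) clears it.
    public static List<int> ChangedColumns(byte section)
    {
        var columns = new List<int>();
        while (section != 0)
        {
            var lowest = (byte)(section & (byte)(~section + 1));
            columns.Add(BitOperations.Log2(lowest)); // bit position = column index
            section &= (byte)(section - 1);
        }
        return columns;
    }
}
```

With a mask of `0b0101`, only columns 0 and 2 are visited; the loop runs once per set bit instead of once per possible column, which is where the performance win over the boolean array comes from.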

Proposal

public static void Update(Person p, StringBuilder builder, NpgsqlParameterCollection parameters, ref uint absoluteIndex) 
{
    if(p is not PersonProxy proxy)
        return;

    proxy.GetSectionChanges(0, out var section);

    while(section != 0)
    {
        string name = "@p" + absoluteIndex++;

        // section & (~section + 1) isolates the lowest set bit of the change mask
        switch((byte)(section & (byte)(~section + 1)))
        {
            case 1 << 0:
                builder.Append("id = ");

                parameters.Add(new NpgsqlParameter<int>(name, proxy.Id));
                break;
            case 1 << 1: 

                builder.Append("name = ");

                parameters.Add(new NpgsqlParameter<string>(name, proxy.Name));
                break;

            default:
                throw new Exception();
        }

        builder.Append(name)
               .Append(',')
               .Append(' ');

        section &= (byte)(section - 1); // clear the lowest set bit
    }
}

public class Person
{
    public virtual int Id { get; set; }
    public virtual string Name { get; set; }
    public string Content { get; set; }
}

public class PersonProxy : Person
{
    private byte _hasChanges_1_8;

    public PersonProxy(bool trackChanges = false)
    {
        if(trackChanges)
            _hasChanges_1_8 |= 1;
    }
    
    public override int Id
    { 
        get 
        { 
            return base.Id;
        } 
        set
        { 
            base.Id = value; 
            
            if((byte)(_hasChanges_1_8 & 1) != 0)
            {
                _hasChanges_1_8 |= 1 << 1;
            }
        }
   }
   
   public override string Name
   { 
        get 
        { 
            return base.Name;
        } 
        set
        { 
            base.Name = value; 
            
            if((byte)(_hasChanges_1_8 & 1) != 0)
            {
                _hasChanges_1_8 |= 1 << 2;
            }
        }
   }
    
   public void GetSectionChanges(byte sectionIndex, out byte section)
   {
       switch(sectionIndex)
       {
           case 0:
               section = (byte)(_hasChanges_1_8 >> 1);
               break;
           default:
               throw new Exception();
       }
   }
}

Delete

Definition

The deletion of entities doesn't change much in its implementation either, other than no longer accessing the primary key through a delegate.

Proposal

Single entity:

public static async Task<int> Delete(Person p) 
{
    using var cmd = new NpgsqlCommand();

    cmd.Parameters.Add(new NpgsqlParameter<int>("@p0", p.Id));

    cmd.CommandText = "DELETE FROM people WHERE id = @p0";

    // await inside the method so the command isn't disposed before the query completes
    return await cmd.ExecuteNonQueryAsync();
}

Multiple entities:

public static async Task<int> Delete(Person[] p) 
{
    if(p.Length == 0)
        return 0;

    using var cmd = new NpgsqlCommand();

    var parameters = cmd.Parameters;

    var commandText = new StringBuilder();

    commandText.Append("DELETE FROM people WHERE id IN (");

    for(int i = 0; i < p.Length; i++)
    {
        var name = "@p" + i;
        parameters.Add(new NpgsqlParameter<int>(name, p[i].Id));
        commandText.Append(name)
                   .Append(',').Append(' ');
    }

    commandText.Length -= 2;
    commandText.Append(')');

    cmd.CommandText = commandText.ToString();

    // await inside the method so the command isn't disposed before the query completes
    return await cmd.ExecuteNonQueryAsync();
}

Format the generated source with CSharpier

This would allow for way easier debugging of the generated sources. However, this feature should be opt-in, as CSharpier takes time to do its magic and we do not necessarily want to format the sources on every build. An attribute on the assembly, such as [FormatVenflowSource], would be a way to accomplish that. However, we will need to wait until there is a proper way to pass the sources to CSharpier.

Add option to mark non-primary columns as database generated.

Any column should be able to be marked as database generated.

API Proposal

As the configuration possibilities for a single column have stacked up by now, we should consider adding a new fluent API just for columns.
This could look something like this:

entityBuilder.Property(x => x.Column).WithName("column");

With the option to mark it as database generated:

entityBuilder.Property(x => x.Column).HasDefault();

// Or

entityBuilder.Property(x => x.Column).IsGenerated();

CLR Enums are not correctly parsed in string interpolation

Describe the bug

Currently CLR Enums are not getting converted to their underlying numeric type.

Reproduction

var row = await database.Table.QueryInterpolatedSingle($"SELECT * FROM table WHERE clr_enum = {CLREnum.Value}").QueryAsync();
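One possible fix, sketched here as a standalone helper (an assumption, not the actual Venflow patch): detect enums in the generic parameter path and convert them to their underlying numeric type before handing the value to the driver.

```csharp
using System;

public static class EnumParameterHelper
{
    // Converts a CLR enum to its underlying numeric type (int, byte, ...)
    // so the database driver receives a number instead of the enum itself.
    // Non-enum values pass through unchanged.
    public static object NormalizeEnum<T>(T value)
    {
        if (typeof(T).IsEnum)
            return Convert.ChangeType(value, Enum.GetUnderlyingType(typeof(T)))!;
        return value!;
    }
}
```

For example, `NormalizeEnum(DayOfWeek.Tuesday)` yields the boxed int `2`, while `NormalizeEnum(42)` is returned as-is. Note that this simple version boxes the value; an allocation-free variant would need per-underlying-type branches inside the interpolation handler.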

snake_case naming convention support

Is your feature request related to a problem, if so please describe it.

As there is no database creation logic and the ORM primarily targets Postgres, it would make sense to support the native naming convention; this would make building on top of existing databases easier as well.

Describe the solution you'd like

I've got a working solution here:
T0shik@ddb91f7

If you want this change I can create a pull request, if you'd like this to be implemented a particular way feel free to tell me I'll change it.
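For illustration, a minimal PascalCase-to-snake_case mapper could look like the sketch below (the linked commit's actual implementation may differ):

```csharp
using System;
using System.Text;

public static class NamingConventions
{
    // Maps PascalCase/camelCase member names to snake_case column names,
    // e.g. "PersonId" -> "person_id". Consecutive capitals such as "ID"
    // become "i_d"; a production version would special-case acronyms.
    public static string ToSnakeCase(string name)
    {
        var builder = new StringBuilder(name.Length + 4);
        for (var i = 0; i < name.Length; i++)
        {
            var c = name[i];
            if (char.IsUpper(c))
            {
                if (i > 0)
                    builder.Append('_');
                builder.Append(char.ToLowerInvariant(c));
            }
            else
            {
                builder.Append(c);
            }
        }
        return builder.ToString();
    }
}
```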

Additional context (if any)

Nada

Instantiate database table properties

Definition

Currently we have a global cache of all DatabaseConfigurations which point to delegates containing runtime-generated backing-field accessors, which allows for the following usage:

public class Db : Database
{
    public Table<Person> People { get; }
}

However, with AOT-generated code we cannot access fields which are generated by the compiler at runtime. Therefore we will need to fall back to either an internal or public set accessor.

Design

While the aforementioned restriction is somewhat inconvenient, we can at least improve performance by declaring the base Database class as a generic type, which allows for a generic-based cache instead of a type-keyed dictionary cache. This generic cache will be instantiated on the first call of the Database constructor.

The instantiater itself, represented below by a static lambda, will be generated AOT.

Proposal

public class PostgreDb : Database<PostgreDb>
{
    public Table<Person> People { get; set; }
}

public class Database<T> where T : Database<T>
{
    private static Action<Database<T>> _instantiater;

    protected Database()
    {
        if(_instantiater is null)
        {
            // this assignment is generated AOT for the concrete database type
            _instantiater = static db => (db as PostgreDb).People = new Table<Person>();
        }
        
        _instantiater.Invoke(this);
    }
}

Add a custom InterpolatedStringHandler to handle allocation free sql and parameter collection

Definition

The purpose of this handler is to replace the old usage of FormattableString, which worked and required way less code; however, the memory allocation was quite immense.

With the new custom string interpolation we got with .NET 6 and C# 10, we can reduce the number of heap allocations to exactly 0. Not only do we avoid the boxing of value-type parameters, but also the allocation of the FormattableString object itself and of the expression trees which were required to allow for maintainable SQL.

Design

As we enforce that a query through the fluent API has to be inline and that the content of the lambda has to be an interpolated string, we can use a ThreadStatic variable, which acts as a shared data pool between the query method and the SqlInterpolationHandler.

The data we provide to the string handler will contain the ParameterIndecies array, which holds the locations of all parameters (0-based). This will be required once we add the following syntax: table => "select * from {table}"; another array will then be required which contains the actual strings that replace the placeholder data. However, this syntax will most likely be implemented in v2.1.0.

Additionally, we will be able to compute the exact length of the string the CommandBuilder will produce, as long as the query doesn't contain any collection parameters.

Proposal

[InterpolatedStringHandler]
public struct SqlInterpolationHandler
{
    internal class ScopeData
    {
        [ThreadStatic]
        internal static ScopeData Current;

        internal StringBuilder CommandBuilder;
        internal NpgsqlParameterCollection Parameters;
        internal short[] ParameterIndecies;
    }

    private int _interpolationIndex;
    private short _parameterIndex;
    private short _absolutParameterIndex;
    private short _nextParameterIndex;

    private readonly StringBuilder _commandBuilder;
    private readonly short[] _parameterIndecies;
    private readonly NpgsqlParameterCollection _parameters;

    public SqlInterpolationHandler(int literalLength, int formattedCount)
    {
        _interpolationIndex = 0;
        _parameterIndex = 0;
        _absolutParameterIndex = 0;

        var current = ScopeData.Current;

        _commandBuilder = current.CommandBuilder;
        _parameterIndecies = current.ParameterIndecies;
        _parameters = current.Parameters;
        _nextParameterIndex = _parameterIndecies.Length > 0 ? _parameterIndecies[0] : (short)-1;
    }

    public void AppendLiteral(string value)
    {
        _commandBuilder.Append(value);
    }

    public void AppendFormatted<T>(T value)
    {
        BaseAppendFormatted(value);
    }

    public void AppendFormatted<T>(T value, string format) => AppendFormatted<T>(value);

    private void BaseAppendFormatted<T>(T value)
    {
        if (_interpolationIndex++ != _nextParameterIndex)
        {
            return;
        }

        if (++_parameterIndex < _parameterIndecies.Length)
            _nextParameterIndex = _parameterIndecies[_parameterIndex];

        var parameterName = "@p" + _absolutParameterIndex++;

        _parameters.Add(new NpgsqlParameter<T>(parameterName, value));

        _commandBuilder.Append(parameterName);
    }
}

Usage

Func<SqlInterpolationHandler> query = () => $"select * from table where id = {0}";

var command = new NpgsqlCommand();
command.Connection = conn;

var commandBuilder = new StringBuilder(34);

var scope = new SqlInterpolationHandler.ScopeData
{
    ParameterIndecies = new short[] { 0 },
    Parameters = command.Parameters,
    CommandBuilder = commandBuilder
};

SqlInterpolationHandler.ScopeData.Current = scope;

query.Invoke();

command.CommandText = commandBuilder.ToString();

Updates to the readme file

Hey,
the contributing file has some scope for improvement and could be made more concise after the errors are removed.
I would like to contribute, so could you please assign this issue to me?

RepoDb: QueryAll

On this code line, I would highly recommend using QueryAll<T>(top: X) instead.

Like the one in our documentation:

using (var connection = new SqlConnection(connectionString).EnsureOpen())
{
	var people = connection.QueryAll<Person>(top: 100);
	// Do the stuff for 'people' here
}

Why? Using a where on Query<T>() would force writing WHERE (Field IS NOT NULL), which slows down the process along the way, even though both return the same result.

Change to async calls once available

Update the currently synchronous calls to their async versions once available.
Expected with Npgsql v5.0.0:

  • Use UnprepareAsync instead of Unprepare on NpgsqlCommands #2944
  • Use GetColumnSchemaAsync instead of GetColumnSchema on DbDataReaders #3010

Default timestamp insert is not now() even if specified in the db rule

Describe the bug

The database timestamptz column gets the value '0001-12-31 23:42:30.000000 +00:00 BC' when inserting a new row if the timestamp is not specified in the entity (even if the database column has a default value of now()).

Reproduction

  public class Interception {
    public Guid Id { get; set; }
    public DateTime gdh_inter { get; set; }
  }

  Interception i = new Interception();
  await db.Interceptions.InsertAsync(i);

Further technical details

The only way to get rid of this bug for now is to specify a default datetime in the C# class:

  public class Interception {
    public Guid Id { get; set; }
    public DateTime gdh_inter { get; set; } = DateTime.UtcNow; 
  }
