Giter Club home page Giter Club logo

Comments (11)

fklebert avatar fklebert commented on July 4, 2024

I fully support the idea to not use offsets in expressions at all. I can not think of any meaningful use case that would require this.

In addition I want to drop an idea:
Actually I think that an offset should be a dedicated read-only type with an own keyword.
So something like:

struct Schema
{
    offset myOffset;
    uint8 someValue;
    varsize someVariableSizedValue;
myOffset:
   string stringValue;
};

Any offset would be read-only in the API and only would have a getter. During writing the offset would be initialized correctly (without me having to call initializeOffsets() or anything like that).

from zserio.

Mi-La avatar Mi-La commented on July 4, 2024

What about offset arrays?

struct Schema
{
    offset myOffsetArray[];
    uint8 someValue;
myOffsetArray[@index]:
    string stringArray[];
};

What should zserio return before writing? Is an empty array ok?

from zserio.

mikir avatar mikir commented on July 4, 2024

Actually, the requirement that during writing the offsets should be initialized correctly and automatically together with no setters is very strict.

Consider this:

struct Schema
{
    uint32 myOffsetArray[];
    uint8 someValue;
myOffsetArray[@index % someValue]:
    string stringArray[];
};

To calculate automatically the size of myOffsetArray is not trivial considering that user can invent any expression in offset label. And such calculation must be done because then we are not able to calculate the offset of stringArray.

If we need such strict requirement which it probably makes sense from the user point view, we must be very strict in the usage of the offsets:

  1. Only myOffsetArray[@index] will be allowed. No other expressions in brackets [] will be allowed.
  2. Each offset can be used only once as offset label. This is something which is not checked at all currently in Zserio. And implementation of such checking without these restrictions is very difficult. For example, consider this legal usage:
struct Schema
{
    uint32 myOffsetArray[2];
    uint8 someValue;
myOffsetArray[0]:
    string stringValue1;
myOffsetArray[1]:
    string stringValue2;
};

Putting all together, we must implement a huge restrictions for usage of offsets at the beginning if we want to improve usability. This could potentially break some schemas but we probably might to afford it together with change of the documentation that such usage of offsets is illegal.

Regarding new offset keyword, I support this idea but I can see one drawback. Currently, user can optimize the memory for offsets knowing the maximum size of the blob. For example by using of the type uint16. This won't be possible by new keyword. We will use probably uint32 for offset allowing 4GB blobs as the maximum. Of course, we can invent something like offset16, offset32, offset64 but not sure if this is not too complicated. Besides of that, we cannot disable usage of normal types for offsets like uint16 myOffset because of the backward compatibility.

from zserio.

fklebert avatar fklebert commented on July 4, 2024

I was thinking about using varsize as the underlying datatype for offset. Still never below 1 byte, but better than hardcoded 4 bytes.
I also support the idea of not allowing expressions in offset labels.

What about offset arrays?

struct Schema
{
    offset myOffsetArray[];
    uint8 someValue;
myOffsetArray[@index]:
    string stringArray[];
};

What should zserio return before writing? Is an empty array ok?

I guess empty array would be OK.

from zserio.

mikir avatar mikir commented on July 4, 2024

Actually, varsize is evil and it has been already disabled in Zserio. The problem was that you are not possible to calculate bit size of varsize in advance without knowing the value of the offset. So, you are not able to calculate the offset at all. In another words, you must have 'stable' types for the offsets.

from zserio.

fklebert avatar fklebert commented on July 4, 2024

Actually, varsize is evil and it has been already disabled in Zserio. The problem was that you are not possible to calculate bit size of varsize in advance without knowing the value of the offset. So, you are not able to calculate the offset at all. In another words, you must have 'stable' types for the offsets.

Ah... I see... makes sense.

from zserio.

Mi-La avatar Mi-La commented on July 4, 2024

Only myOffsetArray[@Index] will be allowed. No other expressions in brackets [] will be allowed.

Then we could even skip the @index keyword and write just the brackets - or we could just use the array without brackets at all :-):

    uint32 myOffsetArray[];
myOffsetArray[]:
    string stringArray[];

or yet better:

    uint32 myOffsetArray[];
myOffsetArray:
    string stringArray[];

from zserio.

MisterGC avatar MisterGC commented on July 4, 2024

Dedicated Offset Types: I support the idea of explicit types and don't see a problem with having different variants. @mikir, why do you think this could be too complicated? Are there implementation concerns? Users need to choose the offset field size anyway. We also need to consider backward compatibility. Using plain numeric types could generate a warning like, "Using offset fields with plain numeric fields like uint32 is not recommended; use offset32 instead."

Improved Syntax: I like the improved syntax suggestion. For compatibility, @index should remain optional. I prefer the following format:

    uint32 myOffsetArray[];
myOffsetArray[]:
    string stringArray[];

This way, it's clear at first glance whether it's an offset to the stringArray or an offset array to each element, even if the field definition is not close to the offset label.

from zserio.

mikir avatar mikir commented on July 4, 2024

Dedicated Offset Types: I support the idea of explicit types and don't see a problem with having different variants. @mikir, why do you think this could be too complicated? Are there implementation concerns? Users need to choose the offset field size anyway. We also need to consider backward compatibility. Using plain numeric types could generate a warning like, "Using offset fields with plain numeric fields like uint32 is not recommended; use offset32 instead."

Actually, I had two minor things in my mind:

  • Schema evolution
    If user decided to use for example offset16 and after schema evolution the blob will be significantly increased, then incompatible change to offset32 will be needed.
  • Offset overflow checking
    We will probably need to check the offset overflow properly in case of lower size offset types.

As you pointed out, we have currently the same problems with offsets anyway, so probably this is not a big issue.

Improved Syntax: I like the improved syntax suggestion. For compatibility, @index should remain optional. I prefer the following format:

    uint32 myOffsetArray[];
myOffsetArray[]:
    string stringArray[];

We should just check the abnormal usage:

    varsize myOffsetArraySize;
    varsize myStringArraySize;
    uint32 myOffsetArray[myOffsetArraySize];
myOffsetArray[]:
     string stringArray[myStringArraySize];

from zserio.

mikir avatar mikir commented on July 4, 2024

Because we have discussed 4 different ideas in this issue, I have created three new issues where we can continue discussions:

#641 - Offset expressions should be disabled
#642 - Consider to simplify indexed offsets syntax
#643 - Consider to introduce new zserio type for offsets

from zserio.

mikir avatar mikir commented on July 4, 2024

Don't forget to update documentation with some more detailed description of the offsets.

from zserio.

Related Issues (20)

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. 📊📈🎉

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google ❤️ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.