Comments (11)
I fully support the idea to not use offsets in expressions at all. I can not think of any meaningful use case that would require this.
In addition I want to drop an idea:
Actually I think that an offset should be a dedicated read-only type with an own keyword.
So something like:
struct Schema
{
offset myOffset;
uint8 someValue;
varsize someVariableSizedValue;
myOffset:
string stringValue;
};
Any offset
would be read-only in the API and only would have a getter. During writing the offset would be initialized correctly (without me having to call initializeOffsets() or anything like that).
from zserio.
What about offset arrays?
struct Schema
{
offset myOffsetArray[];
uint8 someValue;
myOffsetArray[@index]:
string stringArray[];
};
What should zserio return before writing? Is an empty array ok?
from zserio.
Actually, the requirement that during writing the offsets should be initialized correctly and automatically together with no setters is very strict.
Consider this:
struct Schema
{
uint32 myOffsetArray[];
uint8 someValue;
myOffsetArray[@index % someValue]:
string stringArray[];
};
To calculate automatically the size of myOffsetArray
is not trivial considering that user can invent any expression in offset label. And such calculation must be done because then we are not able to calculate the offset of stringArray
.
If we need such strict requirement which it probably makes sense from the user point view, we must be very strict in the usage of the offsets:
- Only
myOffsetArray[@index]
will be allowed. No other expressions in brackets[]
will be allowed. - Each offset can be used only once as offset label. This is something which is not checked at all currently in Zserio. And implementation of such checking without these restrictions is very difficult. For example, consider this legal usage:
struct Schema
{
uint32 myOffsetArray[2];
uint8 someValue;
myOffsetArray[0]:
string stringValue1;
myOffsetArray[1]:
string stringValue2;
};
Putting all together, we must implement a huge restrictions for usage of offsets at the beginning if we want to improve usability. This could potentially break some schemas but we probably might to afford it together with change of the documentation that such usage of offsets is illegal.
Regarding new offset
keyword, I support this idea but I can see one drawback. Currently, user can optimize the memory for offsets knowing the maximum size of the blob. For example by using of the type uint16
. This won't be possible by new keyword. We will use probably uint32
for offset
allowing 4GB blobs as the maximum. Of course, we can invent something like offset16
, offset32
, offset64
but not sure if this is not too complicated. Besides of that, we cannot disable usage of normal types for offsets like uint16 myOffset
because of the backward compatibility.
from zserio.
I was thinking about using varsize
as the underlying datatype for offset
. Still never below 1 byte, but better than hardcoded 4 bytes.
I also support the idea of not allowing expressions in offset labels.
What about offset arrays?
struct Schema { offset myOffsetArray[]; uint8 someValue; myOffsetArray[@index]: string stringArray[]; };
What should zserio return before writing? Is an empty array ok?
I guess empty array would be OK.
from zserio.
Actually, varsize
is evil and it has been already disabled in Zserio. The problem was that you are not possible to calculate bit size of varsize
in advance without knowing the value of the offset. So, you are not able to calculate the offset at all. In another words, you must have 'stable' types for the offsets.
from zserio.
Actually,
varsize
is evil and it has been already disabled in Zserio. The problem was that you are not possible to calculate bit size ofvarsize
in advance without knowing the value of the offset. So, you are not able to calculate the offset at all. In another words, you must have 'stable' types for the offsets.
Ah... I see... makes sense.
from zserio.
Only myOffsetArray[@Index] will be allowed. No other expressions in brackets [] will be allowed.
Then we could even skip the @index
keyword and write just the brackets - or we could just use the array without brackets at all :-):
uint32 myOffsetArray[];
myOffsetArray[]:
string stringArray[];
or yet better:
uint32 myOffsetArray[];
myOffsetArray:
string stringArray[];
from zserio.
Dedicated Offset Types: I support the idea of explicit types and don't see a problem with having different variants. @mikir, why do you think this could be too complicated? Are there implementation concerns? Users need to choose the offset field size anyway. We also need to consider backward compatibility. Using plain numeric types could generate a warning like, "Using offset fields with plain numeric fields like uint32 is not recommended; use offset32 instead."
Improved Syntax: I like the improved syntax suggestion. For compatibility, @index
should remain optional. I prefer the following format:
uint32 myOffsetArray[];
myOffsetArray[]:
string stringArray[];
This way, it's clear at first glance whether it's an offset to the stringArray
or an offset array to each element, even if the field definition is not close to the offset label.
from zserio.
Dedicated Offset Types: I support the idea of explicit types and don't see a problem with having different variants. @mikir, why do you think this could be too complicated? Are there implementation concerns? Users need to choose the offset field size anyway. We also need to consider backward compatibility. Using plain numeric types could generate a warning like, "Using offset fields with plain numeric fields like uint32 is not recommended; use offset32 instead."
Actually, I had two minor things in my mind:
- Schema evolution
If user decided to use for exampleoffset16
and after schema evolution the blob will be significantly increased, then incompatible change tooffset32
will be needed. - Offset overflow checking
We will probably need to check the offset overflow properly in case of lower size offset types.
As you pointed out, we have currently the same problems with offsets anyway, so probably this is not a big issue.
Improved Syntax: I like the improved syntax suggestion. For compatibility,
@index
should remain optional. I prefer the following format:uint32 myOffsetArray[]; myOffsetArray[]: string stringArray[];
We should just check the abnormal usage:
varsize myOffsetArraySize;
varsize myStringArraySize;
uint32 myOffsetArray[myOffsetArraySize];
myOffsetArray[]:
string stringArray[myStringArraySize];
from zserio.
Because we have discussed 4 different ideas in this issue, I have created three new issues where we can continue discussions:
#641 - Offset expressions should be disabled
#642 - Consider to simplify indexed offsets syntax
#643 - Consider to introduce new zserio type for offsets
from zserio.
Don't forget to update documentation with some more detailed description of the offsets.
from zserio.
Related Issues (20)
- Fix MISRA C++ 2023 rule 8.2.6 "An object with integral, enumerated, or pointer to void type shall not be cast to a pointer type"
- Fix MISRA C++ 2023 rule 7.0.3 "The numerical value of a character shall not be used"
- Fix for MISRA C++ 2023 rule MISRACPP2023-8_2_5-a 'reinterpret_cast' should not be used HOT 1
- Fix MISRA C++ 2023 rule 7.0.4 "The operands of bitwise operators and shift operators shall be appropriate"
- Fix MISRA C++ 2023 rule 6.7.1 "Local variables shall not have static storage duration"
- Fix MISRA C++ 2023 rule 13.3.3 "The parameters in all declarations or overrides of a function shall either be unnamed or have identical names"
- Fix MISRA C++ 2023 rule 12.3.1 "The union keyword shall not be used"
- Fix MISRA C++ 2023 rule 28.6.2 "Forwarding references and std::forward shall be used together"
- Fix MISRA C++ 2023 rule 5.10.1 "User-defined identifiers shall have an appropriate form"
- Fix MISRA C++ 2023 rule MISRACPP2023-6_4_1-b Identifier 'add' is hiding other identifier with the same name declared in outer scope HOT 1
- Fix MISRA C++ 2023 rule 15.0.1 "Special member functions shall be provided appropriately"
- Fix MISRA C++ 2023 rule MISRACPP2023-6_4_1-c The identifier 'allocator' hides an identifier declared in class 'BitStreamReader' HOT 1
- Fix MISRA C++ 2023 rule 19.2.2 "The #include directive shall be followed by either a <filename> or "filename" sequence"
- Offset expressions should be disabled
- Consider to simplify indexed offsets syntax
- Consider to introduce new zserio type for offsets
- Reading enum values does not work in Python 3.12 HOT 3
- Duplicated offsets should be disabled
- Offsets using parameters should not be allowed
Recommend Projects
-
React
A declarative, efficient, and flexible JavaScript library for building user interfaces.
-
Vue.js
🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.
-
Typescript
TypeScript is a superset of JavaScript that compiles to clean JavaScript output.
-
TensorFlow
An Open Source Machine Learning Framework for Everyone
-
Django
The Web framework for perfectionists with deadlines.
-
Laravel
A PHP framework for web artisans
-
D3
Bring data to life with SVG, Canvas and HTML. 📊📈🎉
-
Recommend Topics
-
javascript
JavaScript (JS) is a lightweight interpreted programming language with first-class functions.
-
web
Some thing interesting about web. New door for the world.
-
server
A server is a program made to process requests and deliver data to clients.
-
Machine learning
Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.
-
Visualization
Some thing interesting about visualization, use data art
-
Game
Some thing interesting about game, make everyone happy.
Recommend Org
-
Facebook
We are working to build community through open source technology. NB: members must have two-factor auth.
-
Microsoft
Open source projects and samples from Microsoft.
-
Google
Google ❤️ Open Source for everyone.
-
Alibaba
Alibaba Open Source for everyone
-
D3
Data-Driven Documents codes.
-
Tencent
China tencent open source team.
from zserio.