ndsev / zserio
zero sugar, zero fat, zero serialization overhead
Home Page: https://zserio.org/
License: BSD 3-Clause "New" or "Revised" License
Documentation comment parsing is buggy. Consider rewriting the documentation comment grammar.
For example, the following multi-line documentation comment
/** Comment
**/
fails to compile with the error
unexpected token: null
Add new Zserio language elements for Services and RPC definition. Implement RPC C++ generator based on https://github.com/grpc/grpc/tree/master/src/cpp.
Python runtime library
I would like to document my zserio code so that the documentation also appears in the generated code. In the end I want to be able to generate documentation (of the generated code!) with Doxygen or Javadoc respectively.
Something like the following should be properly handled by the emitter:
/*
* @brief Experience instance
* @since 0.1.0
*/
struct Experience
{
bit:6 yearsOfExperience;
Language programmingLanguage;
};
/*
* @brief Programming language enum
* @since 0.2.0
*/
enum bit:2 Language
{
CPP = 0,
JAVA = 1,
PYTHON = 2,
JS = 3
};
When an optional field depends on another optional field or on an optional parameter, zserio tries to check that the optional clause for both fields is the same. However, we are only able to compare the two expressions as strings, and when the strings are not equal we fire the warning, even when the expressions are semantically the same.
The warning is still very useful to inform a user that something could be wrong, so it's not a good idea to remove it. However, to be able to write warning-free zserio sources, we should be able to disable this warning at least from the command line.
scripts/build.sh always builds into the fixed directories /build and deploys into /distr. That will not work on read-only source checkouts.
Please add an option to specify a build-directory, creating the subdirectories build / distr inside the specified directory is imho fine.
This could simplify zserio tests, but it might be necessary to be able to disable particular warnings on command line or directly in the language.
The Java runtime library has its version stored in the Manifest file. The C++ runtime library doesn't have any version. Such a version could be stored in a C++ header file, just to have a reference during bug reporting.
We are not able to check the correct version of the runtime library automatically.
Consider creating a new script update_version.sh which updates the version in all zserio packages (core, extensions, runtime libraries).
When running this command in the shell:
touch test.ds && java -jar ./zserio.jar test.ds
zserio throws an unhandled NullPointerException:
[ERROR] Internal error
java.lang.NullPointerException
at zserio.tools.ZserioTool.checkPackageName(ZserioTool.java:336)
at zserio.tools.ZserioTool.parsePackage(ZserioTool.java:303)
at zserio.tools.ZserioTool.parse(ZserioTool.java:179)
at zserio.tools.ZserioTool.process(ZserioTool.java:161)
at zserio.tools.ZserioTool.execute(ZserioTool.java:155)
at zserio.tools.ZserioTool.runTool(ZserioTool.java:66)
at zserio.tools.ZserioTool.main(ZserioTool.java:47)
In the current plain implementation zserio streams are only backward compatible if new content is added to the end of the stream. It is not possible to change structures in the middle of the stream without adding additional data.
The additional data which we will call compatibility data for now, shall not be directly added to the stream but be an additional stream which older parsers may process. The original stream footprint shall not be touched since we want to still maintain a wire frame free format.
Currently we use NativeUnsignedLongType in Java to emulate uint64. It is also used for the generic VarUInt type. NativeUnsignedLongType is based on java.math.BigInteger, and range checking is disabled for BigInteger.
The parser and the Java generator accept the following schema, but the C++ emitter fails with an internal error.
Schema:
package tutorial;
struct Employee
{
uint8 age : age <= 65; // max age is 65
string name;
uint16 salary;
optional uint16 bonus;
Title title;
// if employee is a team lead, list the team members
Employee teamMember[] if title == Title.TEAM_LEAD;
};
enum uint8 Title
{
DEVELOPER = 0,
TEAM_LEAD = 1,
CTO = 2
};
Error generated by C++ generator:
Emitting C++ code
[ERROR] Internal error
java.lang.StackOverflowError
at java.util.ArrayList$Itr.<init>(Unknown Source)
at java.util.ArrayList.iterator(Unknown Source)
at zserio.ast.CompoundType.needsChildrenInitialization(CompoundType.java:184)
Currently we use a custom BitStreamCloseable interface to prevent warnings when users don't close a reader / writer properly. This is because our readers and writers do nothing in the close method. The only exception is FileBitStreamWriter, which flushes the buffer to a file in the close method.
We can either extend the Closeable interface in BitStreamReader and BitStreamWriter and thus force users to close the readers / writers properly, or we can just implement the Closeable interface in FileBitStreamWriter. However, implementing Closeable only in a single writer might make our stream reader / writer classes inconsistent.
The current zserio compiler throws warnings for unused structures. This is basically a good thing, as it keeps an eye on unused parts of the schema.
But of course there will always be one warning we have to ignore, namely the one for the root element.
I think the SQLite extension does work with it when using sql_database or sql_table. Those do not trigger warnings.
If we could prefix a structure with a keyword such as root, the zserio compiler would not need to throw the warning.
In choice and union types, not all fields are available at the same time. Therefore the following should not compile:
package bad_constraint_error;
enum uint8 Selector
{
BLACK,
GREY,
RED
};
choice EnumParamChoice(Selector selector) on selector
{
case Selector.BLACK:
int8 black;
case GREY:
int16 grey : black > 0 && grey > 0; // ERROR because 'black' is not available!
default:
int64 other;
};
Currently zserio supports variable integer encoding with fixed sizes of the resulting bytes.
So a varuint64 in zserio cannot really hold the complete range of 64 bit but has a payload of 57 bit. The other bits are used for the variable encoding.
Other serialization languages do feature complete range with variable encoding. This can result in byte sizes greater than 8 bytes for varuint64 for example.
We should also add this capability to zserio so that we can retrofit it onto other serializations as well.
Proposal is to have:
| type | payload | comment |
|---|---|---|
| varuint | up to 64 bit | |
| varint | up to 64 bit | |
| var(u)int64 | up to 57 bit | keep for backward compatibility |
| var(u)int32 | up to 29 bit | keep for backward compatibility |
| var(u)int16 | up to 15 bit | keep for backward compatibility |
Keeping the current variable integer encodings may also be useful for people who prefer a bounded maximum size over a fully variable encoding.
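To illustrate why a full-range variable encoding can need more than 8 bytes, here is a minimal C++ sketch. It assumes a generic LEB128-style layout (7 payload bits per byte, lowest group first, top bit as continuation flag), which is not necessarily the wire format zserio would choose; the names encodeVarUInt and decodeVarUInt are hypothetical.

```cpp
#include <cstddef>
#include <cstdint>
#include <stdexcept>
#include <vector>

// Encodes the full 64-bit range; each byte carries 7 payload bits,
// the top bit signals that another byte follows (assumed layout).
std::vector<std::uint8_t> encodeVarUInt(std::uint64_t value)
{
    std::vector<std::uint8_t> bytes;
    do
    {
        std::uint8_t byte = static_cast<std::uint8_t>(value & 0x7F);
        value >>= 7;
        if (value != 0)
            byte = static_cast<std::uint8_t>(byte | 0x80);
        bytes.push_back(byte);
    } while (value != 0);
    return bytes;
}

// Decodes a value produced by encodeVarUInt.
std::uint64_t decodeVarUInt(const std::vector<std::uint8_t>& bytes)
{
    std::uint64_t value = 0;
    unsigned shift = 0;
    for (std::size_t i = 0; i < bytes.size(); ++i)
    {
        if (shift >= 64)
            throw std::runtime_error("encoded value is longer than 10 bytes");
        value |= static_cast<std::uint64_t>(bytes[i] & 0x7F) << shift;
        if ((bytes[i] & 0x80) == 0)
            return value;
        shift += 7;
    }
    throw std::runtime_error("truncated input");
}
```

With this layout the largest 64-bit value takes 10 bytes, while values below 128 still fit into a single byte.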
static const bool fitsInPlace = sizeof(Holder<T>) <= sizeof(UntypedHolder::MaxInPlaceType);
if (fitsInPlace)
{
holder = new (&m_untypedHolder.inPlace) Holder<T>();
m_isInPlace = true;
}
else
{
holder = new Holder<T>();
m_untypedHolder.heap = holder;
}
The if in the code above could be replaced with a template.
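One way this could look is tag dispatch on a compile-time constant. The sketch below assumes C++11; Holder, UntypedHolder and AnyHolderSketch are hypothetical stand-ins for the real runtime types, and std::max_align_t is an assumed in-place buffer type.

```cpp
#include <cstddef>
#include <new>
#include <type_traits>

// Hypothetical stand-ins for the types used in the snippet above.
template <typename T>
struct Holder
{
    T value;
};

union UntypedHolder
{
    typedef std::max_align_t MaxInPlaceType; // assumed in-place buffer type
    void* heap;
    MaxInPlaceType inPlace;
};

class AnyHolderSketch
{
public:
    AnyHolderSketch() : m_isInPlace(false), m_destroy(nullptr) { m_untypedHolder.heap = nullptr; }
    ~AnyHolderSketch() { reset(); }
    AnyHolderSketch(const AnyHolderSketch&) = delete;
    AnyHolderSketch& operator=(const AnyHolderSketch&) = delete;

    template <typename T>
    Holder<T>* create()
    {
        reset();
        // The size test is a compile-time constant that selects an overload.
        typedef std::integral_constant<bool,
                (sizeof(Holder<T>) <= sizeof(UntypedHolder::MaxInPlaceType))> FitsInPlace;
        Holder<T>* holder = createImpl<T>(FitsInPlace());
        m_destroy = &AnyHolderSketch::destroyImpl<T>;
        return holder;
    }

    bool isInPlace() const { return m_isInPlace; }

private:
    template <typename T>
    Holder<T>* createImpl(std::true_type) // holder fits into the in-place buffer
    {
        m_isInPlace = true;
        return new (&m_untypedHolder.inPlace) Holder<T>();
    }

    template <typename T>
    Holder<T>* createImpl(std::false_type) // holder is too big, allocate on the heap
    {
        Holder<T>* holder = new Holder<T>();
        m_untypedHolder.heap = holder;
        m_isInPlace = false;
        return holder;
    }

    template <typename T>
    static void destroyImpl(AnyHolderSketch& self)
    {
        typedef Holder<T> HolderType;
        if (self.m_isInPlace)
            reinterpret_cast<HolderType*>(&self.m_untypedHolder.inPlace)->~HolderType();
        else
            delete static_cast<HolderType*>(self.m_untypedHolder.heap);
    }

    void reset()
    {
        if (m_destroy != nullptr)
            m_destroy(*this);
        m_destroy = nullptr;
    }

    UntypedHolder m_untypedHolder;
    bool m_isInPlace;
    void (*m_destroy)(AnyHolderSketch&);
};
```

Only the branch that applies to a given T is instantiated, so the compiler never has to generate the placement new for a type that does not fit into the buffer.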
The definition of native array types in the C++ generator unnecessarily requires specifying their element types. The element type can be deduced automatically from the Zserio type definitions.
The built-in operator numbits is not clearly defined in the documentation:
"The numbits(value) operator is defined for unsigned integers as minimum number of bits required to encode value-1. The returned number is of type uint8. The numbits operator returns 1 if applied to value 0 or 1."
Such a definition forces users to ask additional questions, like why we have an exception for value 0 or why we don't have an operator which returns the number of bits needed to encode value.
Possible solutions:
1. Change the numbits(value) operator to return the number of bits required to encode value. This is very similar to the bit_length() operation defined in Python as follows: "Return the number of bits necessary to represent an integer in binary, excluding the sign and leading zeros."
2. Change the numbits description in the documentation to define numbits(num) as the minimum number of bits required to encode num different values. This change will include the removal of the exception numbits(0) = 1, so numbits(0) will be 0.
3. Do both 1. and 2. together. The newly introduced operator can be named bitlength.
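The three definitions discussed here can be compared side by side in a small C++ sketch. The function names are hypothetical, and the result for num = 1 in the "different values" reading (0 bits) is an interpretation of the proposal, not something the documentation states.

```cpp
#include <cstdint>

// Bits needed to represent value in binary without leading zeros
// (same idea as Python's int.bit_length()).
inline std::uint8_t bitLength(std::uint64_t value)
{
    std::uint8_t bits = 0;
    while (value != 0)
    {
        ++bits;
        value >>= 1;
    }
    return bits;
}

// numbits as currently documented: bits needed to encode value-1,
// with the special case that 0 and 1 both give 1.
inline std::uint8_t numBitsCurrent(std::uint64_t value)
{
    return (value <= 1) ? static_cast<std::uint8_t>(1) : bitLength(value - 1);
}

// numbits read as "bits needed to encode num different values",
// without the special case (assumed: 0 or 1 distinct values need 0 bits).
inline std::uint8_t numBitsDistinctValues(std::uint64_t num)
{
    return (num == 0) ? static_cast<std::uint8_t>(0) : bitLength(num - 1);
}
```

For example, 8 needs 4 bits when encoded as a value, but 8 different values (0 to 7) fit into 3 bits, which is where the two readings diverge.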
struct Test
{
uint64 offsets[];
offsets[@index]:
uint32 data[];
};
This schema generates the following offset setter in Java (the offset checker is similar):
private final class __OffsetSetter_data implements zserio.runtime.array.OffsetSetter
{
@Override
public void setOffset(int __index, long __byteOffset)
{
final java.math.BigInteger __value = (java.math.BigInteger)__byteOffset;
getOffsets().setElementAt(__value, __index);
}
}
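A cast from long __byteOffset to BigInteger is not possible! The value would have to be converted instead, for example with java.math.BigInteger.valueOf(__byteOffset).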
Currently, running zserio.jar only generates reading classes by default. We should change the default option to -withWriterCode, so that people interested in read-only code need to specify a command line switch, not the other way around.
The initializeOffsets method assumes that the application has already set the correct size of the offset array.
If the application fails to resize the offset arrays, an out-of-bounds exception is thrown. This can be improved: initializeOffsets can resize the offset arrays automatically to make life easier for the application.
// missing package at the beginning
struct Test
{
int32 value;
};
The command java -jar zserio.jar test.zs -doc doc fires a NullPointerException!
struct Data8
{
int8 first;
int8 second;
};
struct Data16
{
int16 first;
int16 second;
};
choice Test(int numBits) on numBits
{
case 8:
Data8 array8[];
case 16:
Data16 array16[];
};
Getters cast Object to ObjectArray which fires an "unchecked cast" warning in Java.
When a value (of up to 64 bits) is spread over 9 bytes in an unaligned stream, it cannot be read using ByteArrayBitStreamReader.readBits()!
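The situation can be reproduced with a small C++ sketch: a 64-bit value that starts 4 bits into the stream touches 9 bytes. The function readBitsSketch is hypothetical and reads bit by bit for clarity, assuming the most significant bit of each byte comes first; it is not how the runtime reader is implemented.

```cpp
#include <cstddef>
#include <cstdint>
#include <stdexcept>
#include <vector>

// Reads numBits (at most 64) starting at an arbitrary bit position.
// Assumed bit order: most significant bit of each byte first.
std::uint64_t readBitsSketch(const std::vector<std::uint8_t>& buffer,
        std::size_t bitPosition, unsigned numBits)
{
    if (numBits > 64)
        throw std::invalid_argument("at most 64 bits can be read at once");
    if (bitPosition + numBits > buffer.size() * 8)
        throw std::out_of_range("read past the end of the buffer");

    std::uint64_t value = 0;
    for (unsigned i = 0; i < numBits; ++i, ++bitPosition)
    {
        const std::uint8_t byte = buffer[bitPosition / 8];
        const unsigned bit = (byte >> (7 - bitPosition % 8)) & 1u;
        value = (value << 1) | bit;
    }
    return value;
}
```

A production reader would work on whole bytes for speed, which is exactly where the ninth byte needs special handling because it does not fit into a single 64-bit accumulator.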
It would be great to add constructors that allow usage of C++11 initializer lists, at least for simple structures without options and for arrays.
Example:
struct Wgs84
{
float32 longitude;
float32 latitude;
};
generates a simple C++ class, that needs to be filled in your code like this:
Wgs84 coordinate;
coordinate.setLongitude(11.);
coordinate.setLatitude(50.);
It would be great if one could use:
Wgs84 coordinate(11., 50.);
or
Wgs84 coordinate({11., 50.});
instead.
The same is valid for arrays that contain simple types, like an array of integers or floats.
The easiest way to allow std::initializer_list for sequences is probably to group the members into a structure and add an additional constructor that accepts a const reference to this (internal) struct. The rest will be done by the compiler automatically. Here is what the generated class could look like (shortened):
class Wgs84
{
public:
struct members
{
float longitude;
float latitude;
};
Wgs84(const members& m) : m_members(m) {}
// [... generated methods as before ...]
private:
members m_members;
};
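A compilable version of this idea, assuming C++11, might look as follows. The nested struct is a plain aggregate, so a braced list converts to it implicitly; the name Members and the accessor set shown here are illustrative only and do not reflect what the generator emits.

```cpp
class Wgs84
{
public:
    // Plain aggregate, so it can be created from a braced list.
    struct Members
    {
        float longitude;
        float latitude;
    };

    Wgs84() : m_members() {}

    // Intentionally not explicit: allows Wgs84 coordinate({11.f, 50.f});
    Wgs84(const Members& members) : m_members(members) {}

    float getLongitude() const { return m_members.longitude; }
    void setLongitude(float longitude) { m_members.longitude = longitude; }

    float getLatitude() const { return m_members.latitude; }
    void setLatitude(float latitude) { m_members.latitude = latitude; }

private:
    Members m_members;
};
```

With this class, `Wgs84 coordinate({11.f, 50.f});` replaces the default construction followed by two setter calls.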
See the disabled cast warning in tests/build.xml: <compilerarg value="-Xlint:-cast"/>.
In the generated code, we use code like (int) 5, which fires a redundant cast warning.
If you set CMAKE_GENERATOR to something other than make, you also need to override MAKE (which is the binary that is actually called for compilation). If you don't do this, the build will fail.
I would expect a warning (at least!) in this case, or better, let cmake handle that for you. Calling cmake with the option --build <dir> will find the right binary and call it.
To reproduce this issue on Linux, install Ninja and run:
CMAKE_GENERATOR=Ninja ./scripts/build.sh all-linux64
It would be good to have constraints on invisible arrays, simply as follows:
uint8 value[1..10]
the array must have at least one and at most ten entries
uint8 value[2..]
at least two values must be in the array
uint8 value[..23]
not more than 23 entries are possible; this could also be written as uint8 value[0..23]
uint8 value[5]
exactly 5 entries must be in the array
The following might also be possible:
uint8 value[2 .. 200 % 2]
there must be at least 2 and at most 200 entries in the array, and they have to come in pairs.
struct Company
{
string employees[3..5];
};
would be equivalent to
struct Company
{
varuint64 numEntries : numEntries >= 3 && numEntries <= 5;
string employees[numEntries];
};
An adaptation of the numEntries type is IMHO not necessary, so no bit:3 numEntries in the case of this example.
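The check that such a length constraint implies could be sketched in C++ as follows. ArrayLengthConstraint and isValidLength are hypothetical names used only for illustration; the sketch assumes that an omitted lower bound means 0, an omitted upper bound means "no limit", and the optional "% n" part means the length must be a multiple of n.

```cpp
#include <cstddef>
#include <limits>

// Describes a constraint such as [3..5] or [2 .. 200 % 2].
struct ArrayLengthConstraint
{
    std::size_t minLength;  // 0 when omitted, as in [..23]
    std::size_t maxLength;  // maximum size_t when omitted, as in [2..]
    std::size_t multipleOf; // 1 when no "% n" part is given
};

// Returns true if an array with the given length satisfies the constraint.
inline bool isValidLength(const ArrayLengthConstraint& constraint, std::size_t length)
{
    return length >= constraint.minLength &&
            length <= constraint.maxLength &&
            constraint.multipleOf != 0 &&
            length % constraint.multipleOf == 0;
}
```

For the Company example the constraint would be {3, 5, 1}, which accepts three to five employees and rejects everything else.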
enum bit:63 Enum63
{
ENUM63_VALUE1 = 0
};
MSVC doesn't allow a base type other than int for enums (prior to C++11). GCC probably chooses a bigger base type based on the highest value in the enum. We should investigate what the C99 standard says about enums.
Neither the C++ nor the Java expression formatter handles casting.
struct BitStructureParameter( bit:1 a, bit:15 b, bit:29 c )
{
bit< a > value1;
bit< b > value2;
bit< c > value3;
};
The generated write and read code passes parameter values (e.g. getB()) to the bit stream write / read methods without any casting.
Generated code:
void BitStructureParameter::write(zserio::BitStreamWriter& _out, zserio::PreWriteAction _preWriteAction)
{
if ((_preWriteAction & zserio::PRE_WRITE_CHECK_RANGES) != 0)
checkRanges();
_out.writeBits(m_value1, getA());
_out.writeBits64(m_value2, getB()); // possible loss of data due to conversion
_out.writeBits64(m_value3, getC());
}
subtype Data D;
struct Data(int32 param)
{
int32 data : data < param;
};
struct Test
{
D data; // doesn't fire an error!
};
subtype Data D;
struct Data(int32 param)
{
int32 data : data < param;
};
struct Test
{
D(10) data; // doesn't fire an error!
};
D is not used in the Test class!
Java version >= 1.7 supports binary literals. Currently we change binary literals to hex value for all Java versions.
Generate documentation for GRPC services
We should also add float32 and float64 to the zserio language.
Implement RPC Java generator based on https://github.com/grpc/grpc-java
Float literals do not support scientific notation like e-1 or 1.5E10. It would be nice to support it because we now have the float16, float32 and float64 types.
MSVC is not officially supported yet. We need MSVC support to test gRPC on Windows, because gRPC supports only MSVC (officially).
Also MSVC fires different warnings and some of them could be relevant.
The following compiles:
struct SomethingIsWrong
{
varuint64 value if value > 0;
varuint64 after;
};
The if clause correctly checks that the after field cannot be used here, but there is no check that the field itself (value) is not yet available either (unlike in constraints).
Currently, the arrays in the C++ runtime library (except for the object array) have a fixed element type, for example Float16Array. This array is used even if the schema defines a subtype of float16, as in the following example:
subtype float16 ElementType;
struct Something
{
ElementType array[];
};
So the generated C++ code uses the resolved element type for all arrays. This is a pity because C++ supports typedefs.
As a consequence, ArrayType returns the name of the resolved element type, for example float16[].
To prevent usage of std::vector<bool>, our BoolArray is based on uint8_t. It might be good for performance, but a user would expect to get a bool when accessing an element, e.g. with boolArray.elementAt(0).
If we want to use an underlying type other than bool, we should still provide bool on the container's interface.
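A minimal C++ sketch of such an interface is shown below. BoolArraySketch is a hypothetical class, not the runtime's BoolArray; it keeps uint8_t storage but converts at the boundary, so callers only ever see bool.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Stores booleans as bytes (avoiding std::vector<bool>),
// but exposes bool on every public method.
class BoolArraySketch
{
public:
    void add(bool value)
    {
        m_data.push_back(static_cast<std::uint8_t>(value ? 1 : 0));
    }

    // The explicit comparison yields a real bool instead of a narrowed integer.
    bool elementAt(std::size_t index) const
    {
        return m_data.at(index) != 0;
    }

    void setElementAt(bool value, std::size_t index)
    {
        m_data.at(index) = static_cast<std::uint8_t>(value ? 1 : 0);
    }

    std::size_t size() const
    {
        return m_data.size();
    }

private:
    std::vector<std::uint8_t> m_data;
};
```

Because the conversion is an explicit comparison rather than an implicit integer-to-bool conversion, code using such accessors should also not need the implicit narrowing that typically triggers "forcing value to bool" style warnings.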
Currently we have disabled MSVC warning C4800: forcing value to bool 'true' or 'false' (performance warning), because the generated code fires the warning when a BoolArray is used as a parameter in a parameterized type.
The following two packages compile without any problem even though the constant ConstraintsConstant is not visible in the package constraint_table:
package constraint_table;
import constraint_constant.SomeStructure;
sql_table ConstraintsTable
{
int32 withoutSql;
uint16 sqlCheckConstant sql "CHECK(sqlCheckConstant < @ConstraintsConstant)";
};
package constraint_constant;
const uint16 ConstraintsConstant = 123;
struct SomeStructure
{
uint32 someValue;
};
If a token is missing at the end of a parsed zserio file, 'null' is reported to be found.
Could we rename this token to EndOfFile? That would make it easier to understand what is missing.
Parsing test.ds
[ERROR] test.ds:2:1: expecting SEMICOLON, found 'null'
There are FileBitStreamWriter and FileBitStreamReader for Java but not for C++. It might be useful to implement them for C++ as well.
If the keyword 'struct' is missing, the parser prints the following error:
expecting EOF, found 'Experience'
Consider improving this error message, for example by printing something like
expecting zserio keyword (struct, enum, etc...), found 'Experience'
It would be great to have correct syntax highlighting of zs files in typical editors like Vim or Sublime Text. Maybe you could even create a pull request for https://github.com/github/linguist later, which would also enable correct highlighting in GitHub comments.