ndsev / zserio

zero sugar, zero fat, zero serialization overhead

License: BSD 3-Clause "New" or "Revised" License

Java 36.70% CMake 1.45% FreeMarker 5.99% C++ 36.55% Shell 2.42% Python 14.01% ANTLR 0.13% ZenScript 2.74% HTML 0.01% C 0.01%

schema-language serialization-framework code-generation cpp java grpc sqlite wire-format serialization compactness

zserio's Issues

Improve documentation comment parsing

Documentation comment parsing is buggy. Consider rewriting the documentation comment grammar.

For example, the following multi-line documentation comment

/** Comment
 **/

fails to compile with the error

unexpected token: null

GRPC support for C++

Add new Zserio language elements for Services and RPC definition. Implement RPC C++ generator based on https://github.com/grpc/grpc/tree/master/src/cpp.

Implement zserio runtime library for Python

Python runtime library

Add support for code documentation

I would like to document my zserio code so that the documentation also appears in the generated code. In the end, I want to be able to generate documentation (of the generated code!) with Doxygen or Javadoc, respectively.

Something like the following should be properly considered by the emitter:

/*
 * @brief Experience instance
 * @since 0.1.0
 */
struct Experience
{
    bit:6       yearsOfExperience;
    Language    programmingLanguage;
};

/*
 * @brief Programming language enum
 * @since 0.2.0
 */
enum bit:2 Language
{
    CPP     = 0,
    JAVA    = 1,
    PYTHON  = 2,
    JS      = 3
};

Consider command line argument to disable "optional clause warnings"

When an optional field depends on another optional field or on an optional parameter, zserio tries to check that the optional clause for both fields is the same. However, we are only able to compare the two expressions as strings, and when the strings are not equal we fire the warning, even when the expressions are semantically the same.
The warning is still very useful to inform a user that something could be wrong, so it's not a good idea to remove it. However, to be able to write warning-free zserio source, we should be able to disable this warning, at least from the command line.
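
To illustrate why a plain string comparison produces false warnings, here is a minimal sketch. It uses Python's own expression parser as a stand-in for the zserio expression grammar, which is an assumption made purely for illustration; the function names are hypothetical and are not part of zserio.

```python
import ast


def same_text(lhs: str, rhs: str) -> bool:
    """Compare two expressions as raw strings (the behaviour described above)."""
    return lhs == rhs


def same_structure(lhs: str, rhs: str) -> bool:
    """Compare two expressions by their parsed syntax trees.

    Whitespace and redundant parentheses do not appear in the tree,
    so they no longer cause a mismatch.
    """
    left = ast.dump(ast.parse(lhs, mode="eval"))
    right = ast.dump(ast.parse(rhs, mode="eval"))
    return left == right


if __name__ == "__main__":
    print(same_text("count > 0", "(count>0)"))
    print(same_structure("count > 0", "(count>0)"))
```

A structural comparison still treats count > 0 and 0 < count as different, so a switch to disable the warning would remain useful even with a smarter check.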

Add possibility to build out-of-source

scripts/build.sh always builds into the fixed directories /build and deploys into /distr. That will not work on read-only source checkouts.

Please add an option to specify a build-directory, creating the subdirectories build / distr inside the specified directory is imho fine.

Consider command line option to treat warnings as errors

This could simplify zserio tests, but it might be necessary to be able to disable particular warnings on command line or directly in the language.

Introduce C++ runtime library version

The Java runtime library has its version stored in the Manifest file. The C++ runtime library doesn't have any version. Such a version could be stored somewhere in a C++ header file, just to have a reference during bug reporting.

We are not able to check for the correct version of the runtime library automatically.

Consider creating a new script update_version.sh which updates the version in all zserio packages (core, extensions, runtime libraries).

NullPointerException when reading an empty file

When running this command in the shell:
touch test.ds && java -jar ./zserio.jar test.ds

zserio throws an unhandled NullPointerException:

[ERROR] Internal error
java.lang.NullPointerException
	at zserio.tools.ZserioTool.checkPackageName(ZserioTool.java:336)
	at zserio.tools.ZserioTool.parsePackage(ZserioTool.java:303)
	at zserio.tools.ZserioTool.parse(ZserioTool.java:179)
	at zserio.tools.ZserioTool.process(ZserioTool.java:161)
	at zserio.tools.ZserioTool.execute(ZserioTool.java:155)
	at zserio.tools.ZserioTool.runTool(ZserioTool.java:66)
	at zserio.tools.ZserioTool.main(ZserioTool.java:47)

Implement Python generator

Implement Python generator as a zserio extension:

add Python extension configuration
expression formatter
native types
configuration for python tests

Add emitters for:

Add python tests

language
arguments
complex

Additional compatibility data

In the current plain implementation zserio streams are only backward compatible if new content is added to the end of the stream. It is not possible to change structures in the middle of the stream without adding additional data.
The additional data which we will call compatibility data for now, shall not be directly added to the stream but be an additional stream which older parsers may process. The original stream footprint shall not be touched since we want to still maintain a wire frame free format.

Implement range checking for types mapped to BigInteger

Currently we use NativeUnsignedLongType in Java to emulate uint64. It's used also for generic VarUInt type. NativeUnsignedLongType is based on java.math.BigInteger type and range checking is disabled for BigInteger.
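
The missing check amounts to a simple bounds test on an arbitrary-precision integer. The sketch below shows the idea in Python, whose int plays the role of BigInteger here; the function name and the error message are hypothetical and not taken from the zserio runtime.

```python
UINT64_MIN = 0
UINT64_MAX = (1 << 64) - 1  # 18446744073709551615


def check_uint64_range(value: int) -> int:
    """Return value unchanged if it fits into uint64, otherwise raise ValueError."""
    if not UINT64_MIN <= value <= UINT64_MAX:
        raise ValueError(
            f"value {value} is outside of the uint64 range "
            f"<{UINT64_MIN}, {UINT64_MAX}>"
        )
    return value


if __name__ == "__main__":
    print(check_uint64_range(UINT64_MAX))
```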

C++ generator fails on recursive definitions without parameters

The parser and the Java generator handle the following schema without problems, but the C++ emitter fails with an internal error.

Schema:

package tutorial;


struct Employee
{
  uint8           age : age <= 65; // max age is 65
  string          name;
  uint16          salary;
  optional uint16 bonus;
  Title           title;
 
  // if employee is a team lead, list the team members
  Employee      teamMember[] if title == Title.TEAM_LEAD;
};
 
enum uint8 Title
{
  DEVELOPER = 0,
  TEAM_LEAD = 1,
  CTO       = 2
};

Error generated by C++ generator:

Emitting C++ code
[ERROR] Internal error
java.lang.StackOverflowError
        at java.util.ArrayList$Itr.<init>(Unknown Source)
        at java.util.ArrayList.iterator(Unknown Source)
        at zserio.ast.CompoundType.needsChildrenInitialization(CompoundType.java:184)
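
The stack trace ends in CompoundType.needsChildrenInitialization, which suggests unbounded recursion over the self-referencing Employee type. Below is a minimal Python sketch of one way to guard such a traversal with a visited set. The data model (a dict of field type names and a set of parameterized types) and the function itself are hypothetical and do not mirror zserio's actual Java code.

```python
from typing import Dict, List, Optional, Set


def needs_children_initialization(
    type_name: str,
    fields: Dict[str, List[str]],
    parameterized: Set[str],
    visited: Optional[Set[str]] = None,
) -> bool:
    """Return True if any type reachable through the fields is parameterized.

    The visited set stops the walk when a type refers back to itself,
    directly or through other types.
    """
    if visited is None:
        visited = set()
    if type_name in visited:
        return False  # already examined, do not descend again
    visited.add(type_name)

    for child in fields.get(type_name, []):
        if child in parameterized:
            return True
        if needs_children_initialization(child, fields, parameterized, visited):
            return True
    return False


if __name__ == "__main__":
    # Simplified model of the schema above: Employee contains Employee.
    schema = {"Employee": ["Title", "Employee"], "Title": []}
    print(needs_children_initialization("Employee", schema, set()))
```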

Consider implementing Closeable interface in BitStreamReader and BitStreamWriter

Currently we use a custom BitStreamCloseable interface to prevent warnings when users don't close a reader / writer properly. This is because our readers and writers do nothing in the close method.
The only exception is FileBitStreamWriter which flushes the buffer to a file in the close method.

We can either extend Closeable interface in BitStreamReader and BitStreamWriter and thus force users to close the readers / writers properly, or we can just implement Closeable interface in FileBitStreamWriter. Implementing Closeable only in a single writer might bring inconsistency in our stream reader / writer classes however.

Introduce command line argument to enable "unused warnings"

The current zserio compiler throws warnings for unused structures. This is basically a good thing to keep an eye on unused parts of the schema.
But of course there will be always one warning we have to ignore, being the one for a root element.

I think the SQLite extension does work with it when using sql_database or sql_table. Those do not trigger warnings.

If we could prefix a structure with a keyword such as root or similar, then the zserio compiler would not need to throw the warning.

Fields in choice or union expressions are wrongly resolved

Choice and union types do not have all fields available at the same time. Therefore the following should not compile:

package bad_constraint_error;

enum uint8 Selector
{
    BLACK,
    GREY,
    RED
};

choice EnumParamChoice(Selector selector) on selector
{
    case Selector.BLACK:
        int8 black;

    case GREY:
        int16 grey : black > 0 && grey > 0; // ERROR because 'black' is not available!

    default:
        int64 other;
};

Add new language element for variable integers

Currently zserio supports variable integer encoding with fixed sizes of the resulting bytes.
So a varuint64 in zserio cannot really hold the complete range of 64 bit but has a payload of 57 bit. The other bits are used for the variable encoding.
Other serialization languages do feature complete range with variable encoding. This can result in byte sizes greater than 8 bytes for varuint64 for example.

We should also add this capability to zserio so that we can retrofit it onto other serializations as well.

Proposal is to have:

type          payload        comment
varuint       up to 64 bit
varint        up to 64 bit
var(u)int64   up to 57 bit   keep for backward compatibility
var(u)int32   up to 29 bit   keep for backward compatibility
var(u)int16   up to 15 bit   keep for backward compatibility

Keeping the current variable integer encodings may also be useful for people who prefer to stick with the fixed maximum sizes rather than a fully variable encoding.
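
The payload sizes in the table can be reproduced with a little arithmetic. The sketch below assumes that, for the unsigned types, every byte except the last one spends one bit on a "more bytes follow" flag; this matches the 15 / 29 / 57 bit figures above but is stated here as an assumption. For comparison it also includes a LEB128-style encoder, a commonly used full-range scheme that is swapped in for illustration only and is not the encoding proposed for varuint.

```python
def payload_bits(max_bytes: int) -> int:
    """Payload of a size-limited varuint under the assumption stated above."""
    return 7 * (max_bytes - 1) + 8


def encode_leb128(value: int) -> bytes:
    """Encode an unsigned integer with 7 payload bits per byte (LEB128 style)."""
    if value < 0:
        raise ValueError("only unsigned values are supported")
    out = bytearray()
    while True:
        byte = value & 0x7F
        value >>= 7
        if value:
            out.append(byte | 0x80)  # continuation bit set, more bytes follow
        else:
            out.append(byte)
            return bytes(out)


if __name__ == "__main__":
    for size in (2, 4, 8):
        print(size, payload_bits(size))
    print(len(encode_leb128(2**64 - 1)))
```

With 7 payload bits per byte, the full 64-bit range needs up to 10 bytes, which is the "more than 8 bytes" trade-off mentioned above.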

Optimize AnyHolder inplace creation

static const bool fitsInPlace = sizeof(Holder<T>) <= sizeof(UntypedHolder::MaxInPlaceType);
if (fitsInPlace)
{
    holder = new (&m_untypedHolder.inPlace) Holder<T>();
    m_isInPlace = true;
}
else
{
    holder = new Holder<T>();
    m_untypedHolder.heap = holder;
}

The if in the code above could be replaced with a template.

Improve C++ native type mapping

The definition of native array types in the C++ generator unnecessarily requires their element types to be specified. The element type can be deduced automatically from the Zserio type definitions.

Clarify built-in operator numbits

Built-in operator numbits is not clearly defined in the documentation:

The numbits(value) operator is defined for unsigned integers as minimum number of bits required to encode value-1. The returned number is of type uint8. The numbits operator returns 1 if applied to value 0 or 1.

Such a definition forces users to ask additional questions, such as why there is an exception for value 0, or why there is no operator which returns the number of bits needed to encode value.

Possible solutions:

1. Change the numbits(value) operator to return the number of bits required to encode value, very similar to bit_length(value) as defined in Python:

   Return the number of bits necessary to represent an integer in binary, excluding the sign and leading zeros.

2. Change the numbits description in the documentation to numbits(num) as the minimum number of bits required to encode num different values. This change includes removing the exception numbits(0) = 1, so numbits(0) will be 0.
3. Do both 1. and 2. together. The newly introduced operator could be named bitlength.
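
The difference between the variants is easiest to see side by side. The following Python sketch models the documented behaviour and the two proposals; the function names are hypothetical, and the result numbits_option_2(1) == 0 is an assumption that follows from reading "num different values" literally, since the proposal only states the result for 0.

```python
def numbits_documented(value: int) -> int:
    """numbits as currently documented: bits to encode value - 1, 0 and 1 give 1."""
    if value < 0:
        raise ValueError("numbits is defined for unsigned integers only")
    if value <= 1:
        return 1
    return (value - 1).bit_length()


def numbits_option_1(value: int) -> int:
    """Proposal 1: bits required to encode value itself, like int.bit_length()."""
    if value < 0:
        raise ValueError("numbits is defined for unsigned integers only")
    return value.bit_length()


def numbits_option_2(num: int) -> int:
    """Proposal 2: bits required to distinguish num different values."""
    if num < 0:
        raise ValueError("numbits is defined for unsigned integers only")
    if num == 0:
        return 0
    return (num - 1).bit_length()


if __name__ == "__main__":
    for v in (0, 1, 2, 8, 9):
        print(v, numbits_documented(v), numbits_option_1(v), numbits_option_2(v))
```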

Java generated code for uint64 offsets doesn't compile

struct Test
{
    uint64 offsets[];
offsets[@index]:
    uint32 data[];
};

Generates the following offsets setter (similar for offsets checker):

    private final class __OffsetSetter_data implements zserio.runtime.array.OffsetSetter
    {
        @Override
        public void setOffset(int __index, long __byteOffset)
        {
            final java.math.BigInteger __value = (java.math.BigInteger)__byteOffset;
            getOffsets().setElementAt(__value, __index);
        }
    }

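A cast from long __byteOffset to BigInteger is not possible in Java! An explicit conversion such as java.math.BigInteger.valueOf(__byteOffset) would be needed instead.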

Make -withWriterCode the default in zserio.jar

Currently, running zserio.jar generates only the reading classes by default. We should change the default option to -withWriterCode so that people interested in read-only code need to specify the command line switch, not the other way around.

initializeOffsets method does not resize offset arrays

The initializeOffsets method assumes that the offset array already has the correct size, set by the application.

If the application fails to resize the offset arrays, an out-of-bounds exception is thrown. This can be improved: initializeOffsets can resize the offset arrays automatically to make life easier for the application.

NullPointerException when generating doc from source without package definition

// missing package at the beginning
struct Test
{
    int32 value;
};

The command java -jar zserio.jar test.zs -doc doc throws a NullPointerException!

Choice / Union fires "unchecked" warning when contains an array

struct Data8
{
    int8 first;
    int8 second;
};

struct Data16
{
    int16 first;
    int16 second;
};

choice Test(int32 numBits) on numBits
{
case 8:
    Data8 array8[];
case 16:
    Data16 array16[];
};

Getters cast Object to ObjectArray which fires an "unchecked cast" warning in Java.

Java ByteArrayBitStreamReader cannot read bit:63 values from unaligned stream

When the value is spread over 9 bytes in an unaligned stream, it cannot be read using ByteArrayBitStreamReader.readBits()!
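
The 9-byte case follows directly from the bit offset: a 63-bit value that starts at bit offset 2 or later within a byte crosses nine byte boundaries. Here is a minimal Python sketch of the arithmetic and of a reader that handles it. It assumes a big-endian, most-significant-bit-first stream, and the function names are hypothetical rather than the ByteArrayBitStreamReader API.

```python
def bytes_spanned(bit_offset: int, num_bits: int) -> int:
    """Number of bytes touched by num_bits bits starting at bit_offset."""
    first = bit_offset // 8
    last = (bit_offset + num_bits - 1) // 8
    return last - first + 1


def read_bits(data: bytes, bit_offset: int, num_bits: int) -> int:
    """Read num_bits bits (MSB first) starting at bit_offset, at any alignment."""
    first = bit_offset // 8
    last = (bit_offset + num_bits - 1) // 8
    if last >= len(data):
        raise ValueError("not enough data in the stream")
    # Load every touched byte into one integer, then drop the bits
    # that follow the value and mask off the bits that precede it.
    chunk = int.from_bytes(data[first:last + 1], "big")
    trailing = (last + 1) * 8 - (bit_offset + num_bits)
    return (chunk >> trailing) & ((1 << num_bits) - 1)


if __name__ == "__main__":
    for offset in range(8):
        print(offset, bytes_spanned(offset, 63))
```

Because Python integers are unbounded, the 72-bit intermediate value is not a problem here; a Java implementation would have to combine a long with one extra byte.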

optional support for initializer_list

It would be great to add constructors that allow the use of C++11 initializer_list, at least for simple structures without options and for arrays.

Example:

struct Wgs84
{
  float32 longitude;
  float32 latitude;
};

generates a simple C++ class, that needs to be filled in your code like this:

Wgs84 coordinate;
coordinate.setLongitude(11.);
coordinate.setLatitude(50.);

It would be great if one could use:

Wgs84 coordinate(11., 50.);

Wgs84 coordinate({11., 50.});

instead.
The same is valid for arrays that contain simple types, like an array of integers or floats.

The easiest way to allow std::initializer_list for sequences is probably to group the members into a structure and add an additional constructor that accepts a const reference to this (internal) struct. The rest will be done by the compiler automatically. Here is how the generated class could look (shortened):

class Wgs84
{
public:
    // members grouped into a plain struct so that they can be brace-initialized
    struct members
    {
        float longitude;
        float latitude;
    };
    Wgs84(const members& m) : m_members(m) {}

    [... generated methods as before ...]

private:
    members m_members;
};

Remove unnecessary casts in Java generated code

See disabled cast warning in tests/build.xmt: <compilerarg value="-Xlint:-cast"/>.
In the generated code, we use code like (int) 5, which fires redundant cast warning.

CMAKE_GENERATOR other than make doesn't work

If you set CMAKE_GENERATOR to something different than make, you would also need to overwrite MAKE (which is actually the binary that is called for compilation). If you don't do this, the build will fail.

I would expect a warning (at least!) in this case, better let cmake handle that for you. Calling cmake with the option --build <dir> will find out the binary and call it.

To reproduce this issue on linux install ninja and run:

CMAKE_GENERATOR=Ninja ./scripts/build.sh all-linux64

Invisible array constraints

It would be good to have constraints on invisible arrays, simply as follows:

uint8 value[1..10]: the array must have at least one and at most ten entries
uint8 value[2..]: at least two values must be in the array
uint8 value[..23]: not more than 23 entries are possible (could also be written as uint8 value[0..23])
uint8 value[5]: exactly 5 entries must be in the array

Possibly also:

uint8 value[2 .. 200 % 2]: there must be at least 2 and at most 200 entries in the array, and they have to come in pairs.

Example:

invisible zserio

struct Company
{
    string employees[3..5];
};

classic zserio

struct Company
{
    varuint64 numEntries : numEntries >= 3 && numEntries <= 5;
    string    employees[numEntries];
};

An adaptation of the numEntries type is IMHO not necessary, so there is no need for bit:3 numEntries in the example.
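
The proposed range syntax boils down to a check on the array length. As a rough sketch of the intended semantics, the Python function below validates a length against an optional minimum, an optional maximum and an optional "multiple of" rule (the % 2 variant). The function and its parameter names are hypothetical, and the reading of % 2 as "length must be divisible by 2" is an assumption based on the "come in pairs" remark.

```python
from typing import Optional


def length_ok(
    length: int,
    minimum: Optional[int] = None,
    maximum: Optional[int] = None,
    multiple_of: Optional[int] = None,
) -> bool:
    """Check an array length against the proposed [min..max % n] constraint."""
    if minimum is not None and length < minimum:
        return False
    if maximum is not None and length > maximum:
        return False
    if multiple_of is not None and length % multiple_of != 0:
        return False
    return True


if __name__ == "__main__":
    # string employees[3..5];
    print([n for n in range(8) if length_ok(n, minimum=3, maximum=5)])
```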

Enums with a base type larger than 32 bits don't work in C++

enum bit:63 Enum63
{
    ENUM63_VALUE1 = 0
};

MSVC doesn't allow a base type other than int for enums (prior to C++11). GCC probably chooses a bigger base type based on the highest value in the enum. We should investigate what the C99 standard says about enums.

Expression formatter doesn't solve casting at all

Neither C++ nor Java solves casting in expression formatter.

struct BitStructureParameter( bit:1 a, bit:15 b, bit:29 c )
{
    bit< a >    value1;
    bit< b >    value2;
    bit< c >    value3;
};

Writer and reader parts pass parameter values (e.g. getB()) without any casting to the bit stream reader write / read methods.

Generated code:

void BitStructureParameter::write(zserio::BitStreamWriter& _out, zserio::PreWriteAction _preWriteAction)
{
    if ((_preWriteAction & zserio::PRE_WRITE_CHECK_RANGES) != 0)
        checkRanges();

    _out.writeBits(m_value1, getA());
    _out.writeBits64(m_value2, getB()); // possible loss of data due to conversion
    _out.writeBits64(m_value3, getC());
}

Subtypes to parameterized types are not handled correctly in C++

A subtype of a parameterized type doesn't fire an error when it is used as a non-parameterized type!

subtype Data D;
struct Data(int32 param)
{
    int32 data : data < param;
};
struct Test
{
    D data; // doesn't fire an error!
};

Subtype (typedef) is not used in the generated code

subtype Data D;
struct Data(int32 param)
{
    int32 data : data < param;
};
struct Test
{
    D(10) data; // doesn't fire an error!
};

D is not used in the Test class!

Generate binary literals properly according to Java version

Java version >= 1.7 supports binary literals. Currently we change binary literals to hex value for all Java versions.

Doc Emitter GRPC Support

Generate documentation for GRPC services

Support float32 and float64

We should also add float32 and float64 to the zserio language.

GRPC support for Java

Implement RPC Java generator based on https://github.com/grpc/grpc-java

Implement scientific notation for float literals

Float literals do not support scientific notation like e-1 or 1.5E10. It would be nice to support it because we now have the float16, float32 and float64 types.

GRPC Support

Support MSVC compiler

MSVC is not officially supported yet. We need MSVC support to test gRPC on Windows, because gRPC supports only MSVC (officially).
Also MSVC fires different warnings and some of them could be relevant.

Disable if clauses which use the same field

The following compiles:

struct SomethingIsWrong
{
    varuint64 value if value > 0;
    varuint64 after;
};

The compiler correctly rejects an if clause that references a later field such as after, but it does not check that the field's own value is unavailable in its if clause as well (unlike in constraints, where the field itself may be used).

Don't resolve element type for arrays in C++

Currently, arrays in the C++ runtime library (except for object arrays) have a fixed element type, for example Float16Array. This array is used even if the schema defines a subtype of float16, as in the following example:

subtype float16 ElementType;

struct Something
{
    ElementType array[];
};

So the generated C++ code uses the resolved element type for all arrays. This is a pity because C++ supports typedefs.

As a consequence of this, ArrayType returns name of a resolved element type, for example float16[].

Implement streaming RPC methods

BoolArray doesn't work with bool elements and uses uint8_t instead

To prevent usage of std::vector<bool> our BoolArray is based on uint8_t. It might be good for performance, but user would expect to get a bool type when accessing an element -> e.g. boolArray.elementAt(0).
If we want to use another underlying type than bool, we should be able to provide bool on the container's interface.

Currently we disabled MSVC warning C4800: forcing value to bool 'true' or 'false' (performance warning), because the generated code fires the warning when a BoolArray is used as a parameter in a parameterized type.

Imports are resolved wrongly

The following two packages are compiled without any problem even if the constant ConstraintsConstant is not visible in the package constraint_table:

package constraint_table;

import constraint_constant.SomeStructure;

sql_table ConstraintsTable
{
    int32  withoutSql;
    uint16 sqlCheckConstant sql "CHECK(sqlCheckConstant < @ConstraintsConstant)";
};

package constraint_constant;

const uint16 ConstraintsConstant = 123;

struct SomeStructure
{
    uint32  someValue;
};

Add new language: Python

Currently zserio only supports generating code for JAVA and C++.
We should add support for Python for better coverage of used languages.

#40 Python runtime library
#52 Python generator
#88 Python SQLite Support
#93 Python GRPC Support

EndOfFile-token is called 'null' by parser

If a token is missing at the end of a parsed zserio file, 'null' is reported as the token found.
Could we rename this token to EndOfFile? That would make it easier to understand what is missing.

Parsing test.ds
[ERROR] test.ds:2:1: expecting SEMICOLON, found 'null'

Implement FileBitStreamWriter and FileBitStreamReader for C++

There are FileBitStreamWriter and FileBitStreamReader for Java but not for C++. It might be useful to implement them for C++ as well.

Improve parsing error message if 'struct' is missing

If keyword 'struct' is missing the parser prints the following error:

expecting EOF, found 'Experience'

Consider improving this error message, for example by printing something like

expecting zserio keyword (struct, enum, etc...), found 'Experience'

Provide syntax-highlighting-files

Would be great to have correct syntax highlighting of zs in typical editors like vim or sublime text. Maybe you could even create a pull-request for https://github.com/github/linguist later which would also enable correct highlighting in github comments.
