breese / trial.protocol Goto Github PK
View Code? Open in Web Editor NEWNetwork wire protocols
Network wire protocols
CMakeFiles/dynamic_dict_validator.dir/__/src/message_parsers/dynamic_dict_validator.cpp.o: In function `trial::protocol::dynamic::variable::map(std::initializer_list<std::pair<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const, trial::protocol::dynamic::variable> >)':
/home/blinktrade/Projetos/tradex/3rd/trial.protocol/include/trial/protocol/dynamic/detail/variable.ipp:964: multiple definition of `trial::protocol::dynamic::variable::map(std::initializer_list<std::pair<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const, trial::protocol::dynamic::variable> >)'
CMakeFiles/dynamic_dict_validator.dir/dynamic_dict_validator.cpp.o:/home/blinktrade/Projetos/tradex/3rd/trial.protocol/include/trial/protocol/dynamic/detail/variable.ipp:964: first defined here
CMakeFiles/dynamic_dict_validator.dir/__/src/message_parsers/dynamic_dict_validator.cpp.o: In function `trial::protocol::dynamic::variable::map()':
/home/blinktrade/Projetos/tradex/3rd/trial.protocol/include/trial/protocol/dynamic/detail/variable.ipp:959: multiple definition of `trial::protocol::dynamic::variable::map()'
CMakeFiles/dynamic_dict_validator.dir/dynamic_dict_validator.cpp.o:/home/blinktrade/Projetos/tradex/3rd/trial.protocol/include/trial/protocol/dynamic/detail/variable.ipp:959: first defined here
CMakeFiles/dynamic_dict_validator.dir/__/src/message_parsers/dynamic_dict_validator.cpp.o: In function `trial::protocol::json::basic_reader<char>::frame** std::__copy_move<false, true, std::random_access_iterator_tag>::__copy_m<trial::protocol::json::basic_reader<char>::frame*>(trial::protocol::json::basic_reader<char>::frame* const*, trial::protocol::json::basic_reader<char>::frame* const*, trial::protocol::json::basic_reader<char>::frame**)':
/home/blinktrade/Projetos/tradex/3rd/trial.protocol/include/trial/protocol/dynamic/detail/variable.ipp:944: multiple definition of `trial::protocol::dynamic::variable::null'
CMakeFiles/dynamic_dict_validator.dir/dynamic_dict_validator.cpp.o:/home/blinktrade/boost_1_61_0/boost/test/unit_test_suite.hpp:338: first defined here
CMakeFiles/dynamic_dict_validator.dir/__/src/message_parsers/dynamic_dict_validator.cpp.o: In function `std::_Deque_base<trial::protocol::json::basic_reader<char>::frame, std::allocator<trial::protocol::json::basic_reader<char>::frame> >::_Deque_impl::_M_swap_data(std::_Deque_base<trial::protocol::json::basic_reader<char>::frame, std::allocator<trial::protocol::json::basic_reader<char>::frame> >::_Deque_impl&)':
/home/blinktrade/Projetos/tradex/3rd/trial.protocol/include/trial/protocol/dynamic/detail/variable.ipp:948: multiple definition of `trial::protocol::dynamic::variable::array(std::initializer_list<trial::protocol::dynamic::variable>)'
CMakeFiles/dynamic_dict_validator.dir/dynamic_dict_validator.cpp.o:/home/blinktrade/Projetos/tradex/3rd/trial.protocol/include/trial/protocol/dynamic/detail/variable.ipp:948: first defined here
CMakeFiles/dynamic_dict_validator.dir/__/src/message_parsers/dynamic_dict_validator.cpp.o: In function `new_allocator':
/home/blinktrade/Projetos/tradex/3rd/trial.protocol/include/trial/protocol/dynamic/detail/variable.ipp:943: multiple definition of `trial::protocol::dynamic::variable::array()'
CMakeFiles/dynamic_dict_validator.dir/dynamic_dict_validator.cpp.o:/home/blinktrade/Projetos/tradex/3rd/trial.protocol/include/trial/protocol/dynamic/detail/variable.ipp:943: first defined here
clang: error: linker command failed with exit code 1 (use -v to see invocation)
test/CMakeFiles/dynamic_dict_validator.dir/build.make:127: recipe for target 'test/dynamic_dict_validator' failed
From #35
An Allocator doesn't have to be default constructible or stateless. All functions should accept an allocator instance,
const Allocator& a
. You can default this to an instance ofAllocator()
but users who use allocators where this is not an option, can supply the appropriate instance.
Previous discussion start at: #4 (comment)
Enumerations have their own distinct type, so they cannot be added to dynamic::variable
. However, they have an integer as the underlying type, so they could be stored as such.
I am considering adding direct support for storing and retrieving enumerations.
For example:
enum class light { off, on };
// Create variable with on state.
dynamic::variable data(light::on)
// Check if stored value is an enum.
// Notice: this will succeed for any integer as well.
if (data.is<light>())
{
// Retrieve the stored value as an enumerator.
// Notice: the stored value may not match a valid enumerator.
auto value = data.value<light>();
}
The disadvantage of this, is that the use can store an arbitrary integer, and then retrieve it as an enum which may not have an enumerator with the given integer value. This will result in undefined behaviour.
I did several small tests and I don't get when this token could help me in my logic. It's never emitted in my tests (I didn't grep through the code to know where it's being used).
dynamic::variable
supports associative arrays. This is currently implemented with std::map
for convenience - I was too lazy to implement a hash function for variable
.
I deliberately kept the variable
iterators as ForwardIterator so we would be able to switch the implementation to std::unordered_map
.
While writing test suites that uses std algorithms, I encountered several that cannot be used with variable
because of its iterator category. For instance, we can use std::is_sorted
(requires ForwardIterator) but not std::sort
(requires RandomAccessIterator.)
The interesting algorithms are std::partition
(requires BidirectionalIterator) and std::sort
. Lack of support for std::partition
is not a major problem because we can get a partition with std::remove
. Support for std::sort
is more interesting, because although it makes no sense on an associative array (which is either sorted by key, or unsortable) but it would be nice to be able to sort the elements in a normal array.
So the question is, shall variable::iterator
model BidirectionalIterator/RandomAccessIterator or continue to model ForwardIterator? The former will give us access to more algorithms, but we lose the ability to use std::unordered_map
.
I have slight preference for ForwardIterator, because reordering algorithms, e.g. std::sort
and std::reverse
, may have a strange effect on associative arrays.
Would you be interested in having a XML parser in Trial.Protocol? I'll need one XML parser to implement WebDAV, so I could write one in the design of pull parsers like the ones in this library.
So, I was playing with lua bindings to trial.protocol (actually it's a bigger project that happens to have JSON support as well and I'm using trial.protocol for that) and I end up finally having an use-case for the writer
API. So, the following lua code:
local json = require('json')
local writer = json.writer.new()
writer:begin_object()
writer:value('foo')
writer:value('bar')
writer:end_object()
print(writer:generate())
will print:
{"foo":"bar"}
So far so good, but the literal API seems incomplete. There are two types of literals — nodes and insignificant linear whitespace. Linear whitespace literals are useful to indent the generated document, but node literals are useful in serialization libraries. For my lua bindings, the user might write a __tojson()
metamethod as:
function __tojson(self, state)
local writer = state.writer
writer:begin_object()
writer:value('type')
writer:value(0)
writer:end_object()
end
But that's kind of verbose. And given the type to be serialized has a constant representation, that's also inefficient. I would like to be able to write the following:
function __tojson(self, state)
local writer = state.writer
writer:literal('{"type":0}')
end
But that obviously isn't going to work. When called as part of a bigger object as in:
writer:begin_object()
writer:value('hello')
writer:value('world')
writer:value('foo')
encode_foo(foo, writer)
writer:end_object()
print(writer:generate())
The output will be:
{"hello":"world","foo"{"type":0}}
An invalid JSON.
So, we need a type of literal that accounts for a raw node to be written. I don't think that's hard. I think the bikeshedding to choose the function name is going to be more demanding than the implementation itself.
Anyway, I'm not in a rush to see this issue solved. There are other features besides JSON I have to work on before releasing my project.
Hey,
a friend of mine ( @skhaz ) showed me a very interesting JSON library for C: https://github.com/cesanta/frozen
I thought it could be useful as an inspiration design for higher-level abstractions.
There is a topic from your blog:
Better yet, define an mpg123::error exception that inherits from std::system_error, so exceptions from the mpg123 adapter can be distinguished from other system_error exceptions in the try-catch block.
— http://breese.github.io/2017/05/12/customizing-error-codes.html
I just noticed that you use this trick in the library (and then remembered your blog post):
include/trial/protocol/json/error.hpp:class error : public std::system_error
However, the PR for the skip algorithm I've sent previously (I was playing with json::partial::skip()
lately) doesn't follow this trick and throws system_error
directly:
So Boost.Predef already uses asciidoc to document stuff: https://github.com/boostorg/predef/blob/caa0627ab5518333d8c0f272667140c793adca1d/doc/build.jam#L8
Could we use asciidoc from now on too? Current docs kind of stalled already, so I guess it's not very “fun” to touch QuickBook. Maybe asciidoc helps.
I mean, reader::value<T>()
will already do a switch
on symbol
to possibly throw a runtime error. This precondition makes the API easier to use wrong and I fail to see a benefit on this.
Btw, we're using trial.protocol "heavily" (all JSON messages) on our new project at BlinkTrade, so if you need to advertise "is there a real usage case?", you can mention it (only boring thing til now is the increased number of assert(false)
and code complexity thanks to #3).
This library looks good, and good things provoke requests like these. Here's an initial list:
An Allocator doesn't have to be a template with a single parameter. i.e. The template parameter shouldn't be template<typename> class Allocator
but instead just class Allocator
. You can default it to std::allocator<char>
, it doesn't matter, since you'll be rebinding as necessary.
An Allocator doesn't have to be default constructible or stateless. All functions should accept an allocator instance, const Allocator& a
. You can default this to an instance of Allocator()
but users who use allocators where this is not an option, can supply the appropriate instance.
Allocator construction shouldn't use addressof(*ptr)
but instead to_address(ptr)
(using addressof(*ptr)
here before ptr references an object of T is undefined behavior).
Allocators might be final
so do not derive from them unconditionally. Instead of deriving from Allocator, derive from a facility like boost::empty_value<Allocator>
(which will use inheritance if not final, or otherwise just store a member).
So, I'm trying to update tabjson to support the concatenated JSON streaming model. Therefore the first change I had to do was replacing reader
by chunk_reader
. However as soon as I made the change the project failed to compile due to algorithms such as json::partial::skip()
no longer working.
These algorithms hardcode the type json::reader
. However I don't think the current decision is wrong. It's pretty much unclear how to recover once a composite operation fails midway. Failing to accept chunk_reader
args is a good thing here. If we were to change the algorithms I envision two approaches:
Both approaches are undesirable in my eyes.
However the question remains: how would one approach the usage of algorithms such as partial::skip()
in a chunked stream? I have a few ideas and I thought I'd open up an issue to discuss them. I'll comment more later (a few of the ideas will also be discussed in separate issues).
On the feature/dynamic branch it is possible to construct a dynamic::variable
by passing a variable::array_type
(aka std::vector<variable>
.) I want to change this to used named constructors instead.
For example, the following code:
std::vector<variable> values;
variable data(values);
should be changed into:
std::variable<variable> values;
variable data = variable::array(values.begin(), values.end());
The reason for this change is that the current code can lead to surprising results. While working on adding support for operator<<
I encountered an infinite recursion because operator<<(variable)
was invoked when given a std::vector<variable>
.
I could have solved the recursion by adding an overload for std::vector<variable>
but that would preclude users from adding their own. Notice that operator<<(variable)
simply forwards to operator<<
of the underlying type, so the default behaviour of operator<<(std::vector<variable>)
should be a compiler error (unless a custom overload exists.) It is not used for serialization.
All of the above also applies to variable::map_type
.
Any objections?
Floating-point numbers are current referred to as symbol::number
(for comparison integer numbers use symbol::integer
.)
I have never been pleased with that term, so I am planning to renaming it to symbol::real
. This name fits well with the meaning of std::real
.
This will be a (trivial) breaking-change for dynamic::variable
, so I want to hear if there are any objections.
Likewise, I plan to rename code::float_number
, code::double_number
, and code::long_double_number
to code::short_real
, code::real
, and code::long_real
respectively.
I know Boost.Hana serialization code wasn't merged yet. The issue is a future TODO item. I'll tackle it myself. And I don't want to keep my mind busy with this information. So I'll just dump the idea here and get back to it at some point in the future.
So I previously suggested the idea to optionally merge matching/decode phases (for string values only): #44 (comment). Once the feature hits the repo, I can update the Boost.Hana serialization code to make use of it.
The idea is: when I hit a member for which a std::string
value is desired, branch the algorithm on:
struct_str_member.capacity() == 0
Do not merge matching/decode steps. Perform a separate match to get the literal()
size and reserve enough data in one step. Then collect the string.
struct_str_member.capacity() != 0
Trust the user's size estimate heuristic and just merge matching/decode steps.
when trying to use intrusive serialization (e.g. with a member template 'serialize') I fail to compile the code. I have put a demo working with boost::archive::text_iarchive / test_oarchive in my fork (latest develop)
https://github.com/bodomartin/trial.protocol/
which is otherwise synced with your latest develop.
(using gcc 5.4.0, linux) - I noticed that when using a private serialize template the serializer_overloader does not find the member function? The boost serialization lib uses the 'access' friend.
Tested on commit cea147e, but looking at the source code from latest develop, I think it should be failing too. I can write an isolated case to test it on develop
later.
Compiler is clang. Error is:
application.cpp:36:9: error: no matching member function for call to 'erase'
obj.erase(it);
~~~~^~~~~
3rd/trial.protocol/include/trial/dynamic/variable.hpp:869:14: note: candidate function not viable: no known conversion
from 'trial::dynamic::basic_variable<std::allocator>::key_iterator' to 'trial::dynamic::basic_variable<std::allocator>::const_iterator' for 1st argument
iterator erase(const_iterator position);
^
Also, finally I am coding the part where performance is not the critical and I can do whole parses and get the convenience of dynamic::variable
on my code 🎉🎉🎉.
So, I'm developing a Telegram client and I'm using this lib: https://core.telegram.org/tdlib
What I learnt is that a few applications (including tdlib) will reject JSON messages if they include the fractional part (even if it's all zeros). Here's the reply I get from the previous application when sending one of such requests:
{"code":400.000000000000,"@type":"error","message":"Failed to parse JSON object as TDLib request: Can't parse \"559815.000000000\" as an integer"}
So I guess Trial.Protocol's generator needs to detect such patterns and omit the fractional part if it's going to be all zeroes. Otherwise this class of applications will not work.
For my case I can (and will) use a workaround. However I think it's important for Trial.Protocol to up its game, so I'm reporting this issue.
I have no need for this feature in the very long road. Actually, I could use them if there was an algorithm to generate random JSONs based on a JSON schema so I could add fuzzy testings in my applications, but I don't want to. This issue is just a reminder for missing features.
Specification (still a draft): http://json-schema.org/
RapidJSON has this feature: http://rapidjson.org/md_doc_schema.html
And Mozilla services use RapidJSON because they need JSON Schema validation: https://github.com/mozilla-services/mozilla-pipeline-schemas
I've been experimenting on using exceptions to bail out early of undesired keys, but then stumbled on the noexcept
problem: https://github.com/vinipsmaker/trial.protocol/blob/203e70548a504fb63ba4230826d01f446f9b560d/include/trial/protocol/json/scan.hpp#L426
Explained at #21 (comment)
Error:
That points to the following code:
trial.protocol/test/bintoken/decoder_suite.cpp
Line 1362 in 481dbf3
It should be an easy fix.
include/c++/12.1.0/bits/stl_pair.h: In instantiation of 'struct std::pair<const trial::dynamic::basic_variable<std::allocator<char> >, trial::dynamic::basic_variable<std::allocator<char> > >':
include/c++/12.1.0/type_traits:1467:45: required by substitution of 'template<class _From1, class _To1, class> static std::true_type std::__is_convertible_helper<const trial::dynamic::basic_variable<std::allocator<char> >&, trial::dynamic::basic_variable<std::allocator<char> >, false>::__test(int) [with _From1 = const trial::dynamic::basic_variable<std::allocator<char> >&; _To1 = trial::dynamic::basic_variable<std::allocator<char> >; <template-parameter-1-3> = <missing>]'
include/c++/12.1.0/type_traits:1476:42: required from 'struct std::__is_convertible_helper<const trial::dynamic::basic_variable<std::allocator<char> >&, trial::dynamic::basic_variable<std::allocator<char> >, false>'
include/c++/12.1.0/type_traits:1482:12: required from 'struct std::is_convertible<const trial::dynamic::basic_variable<std::allocator<char> >&, trial::dynamic::basic_variable<std::allocator<char> > >'
include/c++/12.1.0/type_traits:3284:72: required from 'constexpr const bool std::is_convertible_v<const trial::dynamic::basic_variable<std::allocator<char> >&, trial::dynamic::basic_variable<std::allocator<char> > >'
include/c++/12.1.0/bits/stl_pair.h:259:18: required from 'static constexpr bool std::pair<_T1, _T2>::_S_convertible() [with _U1 = const trial::dynamic::basic_variable<std::allocator<char> >&; _U2 = const trial::dynamic::basic_variable<std::allocator<char> >&; _T1 = trial::dynamic::basic_variable<std::allocator<char> >; _T2 = trial::dynamic::basic_variable<std::allocator<char> >]'
include/c++/12.1.0/bits/stl_pair.h:268:65: required from 'struct std::pair<trial::dynamic::basic_variable<std::allocator<char> >, trial::dynamic::basic_variable<std::allocator<char> > >'
include/trial/protocol/bintoken/serialization/dynamic/variable.hpp:121:44: required from here
include/c++/12.1.0/bits/stl_pair.h:268:65: in 'constexpr' expansion of 'std::pair<const trial::dynamic::basic_variable<std::allocator<char> >, trial::dynamic::basic_variable<std::allocator<char> > >::_S_convertible<const trial::dynamic::basic_variable<std::allocator<char> >&, const trial::dynamic::basic_variable<std::allocator<char> >&>()'
include/c++/12.1.0/bits/stl_pair.h:260:20: error: the value of 'std::is_convertible_v<const trial::dynamic::basic_variable<std::allocator<char> >&, trial::dynamic::basic_variable<std::allocator<char> > >' is not usable in a constant expression
260 | return is_convertible_v<_U2, _T2>;
| ^~~~~~~~~~~~~~~~~~~~~~~~~~
In file included from include/c++/12.1.0/bits/char_traits.h:42,
from include/c++/12.1.0/string:40:
include/c++/12.1.0/type_traits:3284:25: note: 'std::is_convertible_v<const trial::dynamic::basic_variable<std::allocator<char> >&, trial::dynamic::basic_variable<std::allocator<char> > >' used in its own initializer
3284 | inline constexpr bool is_convertible_v = is_convertible<_From, _To>::value;
this patch solves it
+++ b/bintoken/serialization/dynamic/variable.hpp 2022-05-11 15:15:31.021192091 -0700
@@ -118,7 +118,9 @@ struct save_overloader< protocol::bintok
ar.template save<bintoken::token::null>();
for (auto it = data.begin(); it != data.end(); ++it)
{
- auto value = std::make_pair(it.key(), it.value());
+ auto key = it.key();
+ auto val = it.value();
+ auto value = std::make_pair(key, val);
ar.save_override(value.first, protocol_version);
ar.save_override(value.second, protocol_version);
}
I have some loops like the following on my codebase:
for (;;) {
// Key
if (reader.symbol() == json::token::symbol::end_object) {
reader.next();
break;
}
assert(reader.symbol() == json::token::symbol::string);
auto current_key = reader.literal();
current_key.remove_prefix(1);
current_key.remove_suffix(1);
if (!reader.next())
throw Error("bad object");
// Value
try {
if (current_key == "TestReqID") {
msg.header.req_id = reader.value<decltype(msg.header.req_id)>();
read_req_id = true;
} else if (current_key == "SendTime") {
msg.send_time = reader.value<decltype(msg.send_time)>();
read_send_time = true;
}
json::partial::skip(reader);
} catch (const json::error&) {
throw Error("bad object");
}
}
It'd be nice if there was a built-in function to compare current string-value with a user-provided string_view
. Doing so would spare a call to reader.value<std::string>()
which is an allocating function.
I have added a tentative version of tree parsing and formatting of JSON in the feature/tree branch. I would like to solicit feedback on this before it enters the develop branch.
The API is fairly straight-forward:
// Converts a JSON formatted input buffer into a dynamic::variable.
// The input buffer must be one of the supported buffer types (e.g. std::string, std::ostringstream, std::vector)
// <trial/protocol/json/parse.hpp>
template <typename BufferType>
dynamic::variable parse(BufferType input);
// Converts a dynamic::variable into a JSON formatted output buffer.
// The output buffer is created.
// <trial/protocol/json/format.hpp>
template <typename BufferType>
BufferType format(dynamic::variable);
// Converts a dynamic::variable into a JSON formatted output buffer.
// The output buffer is supplied by the user.
// <trial/protocol/json/format.hpp>
template <typename BufferType>
void format(dynamic::variable, BufferType&);
From #35
Allocators might be
final
so do not derive from them unconditionally. Instead of deriving from Allocator, derive from a facility likeboost::empty_value<Allocator>
(which will use inheritance if not final, or otherwise just store a member).
About small_union
Could this line (below) be added to the main CMakeLists.txt to make the
include dir 'installable'?
install(DIRECTORY ${TRIAL_PROTOCOL_ROOT}/include/ DESTINATION include)
dynamic::variable
is currently defined as an alias for dynamic::basic_variable<char>
. This CharT
template parameter is then used to select the internal string type. So you need to define a new alias for dynamic::basic_variable<wchar_t>
to handle wide-character strings.
After some thinking, I believe that it would be better to remove the template parameter completely, and instead add all four standard string types as union members of dynamic::variable
(which then becomes a class instead of an alias.)
This means that a dynamic::variable
can store any fundamental type (where void/nullptr are replaced by a null type) and any standard string, as well as an array or map of other dynamic::variable
.
Comments?
From #35
Allocator construction shouldn't use
addressof(*ptr)
but insteadto_address(ptr)
(usingaddressof(*ptr)
here before ptr references an object of T is undefined behavior).
From #35
An Allocator doesn't have to be a template with a single parameter. i.e. The template parameter shouldn't be
template<typename> class Allocator
but instead just class Allocator. You can default it tostd::allocator<char>
, it doesn't matter, since you'll be rebinding as necessary.
The following will fail on bad input:
void f(boost::string_view raw_result)
{
trial::dynamic::variable result;
try {
trial::protocol::json::iarchive archive(raw_result);
archive >> result
} catch (...) {
// ...
}
}
The serialization code doesn't handle the trial::protocol::json::token::symbol::error
enumeration:
I have some hurry to solve this issue then I'll probably send a PR (after I solve the issue on my own tree). But which error handling do you think it fits into Trial.Protocol design? Is throwing an exception fine here?
Given this JSON:
{
"abc": [ { "def": 4 }, {} ],
"def": 42
}
and a reader which is in the root JSON point:
json::reader reader(raw_json);
assert(reader.symbol() == json::token::symbol::begin_object);
find_key(reader, "def")
would advance the reader to hold the point/value 42
.
An initial non-tested implementation is:
if (reader.symbol() != json::token::symbol::begin_object)
return false;
if (!reader.next())
return false;
while (true) {
// Key
if (reader.symbol() == json::token::symbol::end_object) {
reader.next();
return false;
}
if (reader.value<std::string>() == key) {
reader.next();
return reader.symbol() != json::token::symbol::error;
}
if (!reader.next())
return false;
// Value
skip_element(reader);
if (reader.symbol() == json::token::error)
return false;
}
I think this function should only try to find the key if it is given the token at the begin_object
token because the semantics are clearer. If you're already iterating over the object keys, there is no point in provide an “easier” function to you, as all you'd need is the skip_element
function and stay in the loop (one call to the skip_element
in the else clause and nothing more... which is 2 extra lines for user code...).
Also, if you think this function can belong to Trial.Protocol, another function could be close_current_object
which would pair with the reader
advanced by find_key
nicely.
clang (3.9.1) , boost 1.62 , complains about ctype in libc++ ,and in the json vector serialization. I have been trying to fix these but no success for far :-|
Explained at #21 (comment)
include/trial/protocol/detail/lightweight_test.hpp:42:28: error: no member named 'report_errors_remind' in namespace 'boost::detail'
boost::detail::report_errors_remind();
While working on batches of JSONs stored in a file, I've stumbled upon this error. I'd normally provide an accompanying fix, but I think you'll have way easier time delving in the decoding layers due to code familiarity.
Pull #45 provides a test. It's just a copy'n'paste of an already existing test, but changing the input to move the NULL terminator further away in the byte sequence.
I had to disable JSON support in emilua (when built on windows) because build failures on trial.protocol. One build failure happened at
. I quickly found some code on StackOverflow to replace that function. However other build failures happened as soon as I fixed that one and so I just disabled the JSON module for now.There is a FIXME
comment already fixed:
I mean, I was surprised to see this comment, because on my tests, there was already an exception happening on overflow. Then I went further into implementation and saw implementation:
dynamic::variable
is a heteregenous data structure that can contain both arrays (std::vector<T>
) and associated arrays (std::map<T, U>
). Only the associated arrays have keys, so the iterators dereferences the value. Therefore a special key iterator, which dereferences the key, and associated functions have been added.
We need a way to distinguish between the two in the interface.
The conventional way is to use a prefix. For example:
variable::key_iterator it = object.key_begin();
variable::key_iterator it2 = object.key_find("key");
I have experimented with an alternative (emulating inner classes) that uses C++ constructs (scoped resolution operator and dot operator) instead of prefixes. This gives the following interface:
variable::key::iterator it = object.key.begin();
variable::key::iterator it2 = object.key.find("key");
While I find this to be a nicely scoped interface, the price is that each variable
object will contain an extra pointer (the nested key struct has a pointer to its parent class.)
Is this an acceptable price or should I revert to prefixes?
So the returned string uses the provided allocator. A new overload like this:
template<class Alloc> inline
std::basic_string<char, std::char_traits<char>, Alloc>
reader::value<std::basic_string<char, std::char_traits<char>, Alloc>>(
const Alloc& alloc) const;
In this post, they show interest in JSON parser which is not DOM-based:
Theoretically, glTF2 should be less efficient to parse than Collada, as it requires parsing the whole JSON into memory. Collada can be streamed in by using a SAX XML parser
I'm opening this issue just as a reminder in case I decide to open a PR against their code base. Trial.Protocol would gain another heavy user to the JSON library.
So, I've been keeping the topic of parsers combinators on the back of my head since ~2016 when you introduced me to the idea.
TBH I had some resistance to the approach of parser combinators because they're just too difficult to implement and of limited usefulness/reuse (or so I thought). However recently I was playing with re2c on a new project and in this new project I had the chance to declare several mini-parsers and jump from one to another while they reuse data pretty seamlessly. The idea immediately struck me. I was using parsers combinators.
Usually the talks around the topic focus on functional languages and the approaches I've seen are pretty complex (and limited too TBH). However on re2c I had the chance to combine parsers under an imperative approach and the result was pretty pleasant and much more natural on C++.
The experiment also made me realize that all this time I've been facing parsers combinators under the wrong angle. Usually parsers combinators are presented focusing on the reuse of matchers for mini-parsers and a few very simple combinators/glues. However it has weak appeal given the rules for mini-parsers (i.e. a few very simple regexes that could be implemented very easily by hand) (1) offer very little reuse and (2) only offer reuse for the easy logic. When I'm interested in reusing a parser I want to reuse the logic for the hard stuff (e.g. state/nested-level tracking on JSON parsers or the complex algorithm to determine the body length on HTTP). The hard logic is not on the matchers, but on the code for everything else. It's the logic from this big self-contained parser that I want to use, not a few matchers that I could have written myself.
That got me thinking: what is it that will enable me to reuse the logic from the big parser? What is it that re2c is doing to enable me to combine mini-parsers so seamlessly? The answer was right in front of me. Combining mini-parsers the re2c-way allows one to defer the handling of some token straight away to another mini-parser. Let's begin with a JSON example:
{ "hello": "world" }
Suppose you want a syntactic extension that allows one to encode dates in JSON documents:
{ "hello": %2021-07-07% }
What you want here is: when the token %2021-07-07%
is reached, a different parser is invoked to handle it and then we advance the main parser as well. So, really, it's two things that are seen by the main parser:
Trial.Protocol already offers #1
. #2
can be emulated. For instance, before the parser for our syntactic date extension returns, it could call reader.next("null")
. However I contend it's desirable to offer an interface that consumes the already-decoded token/value to skip some unnecessary logic. For instance, this mini date parser could be calling reader.next<token::null>()
instead. Could we have this addition? I don't think it's a polemic addition to the interface.
That covers the main part of the problem. And then, for the last part, we have the matchers again: what to do on token::code::error_unexpected_token
? If we're using a chunked parser then this should be a non-fatal error as it'll allow for a fallback mini-parser to be invoked to handle just that token. This change would allow one to reuse Trial.Protocol parser to parse JSON documents with comments, for instance. However this little point on non-fatal unexpected-token is polemic and it deserves way more thought than the previous points. In the meantime, could chunk_reader.next<token::null>()
be added?
A declarative, efficient, and flexible JavaScript library for building user interfaces.
🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.
TypeScript is a superset of JavaScript that compiles to clean JavaScript output.
An Open Source Machine Learning Framework for Everyone
The Web framework for perfectionists with deadlines.
A PHP framework for web artisans
Bring data to life with SVG, Canvas and HTML. 📊📈🎉
JavaScript (JS) is a lightweight interpreted programming language with first-class functions.
Some thing interesting about web. New door for the world.
A server is a program made to process requests and deliver data to clients.
Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.
Some thing interesting about visualization, use data art
Some thing interesting about game, make everyone happy.
We are working to build community through open source technology. NB: members must have two-factor auth.
Open source projects and samples from Microsoft.
Google ❤️ Open Source for everyone.
Alibaba Open Source for everyone
Data-Driven Documents codes.
China tencent open source team.