Giter Club home page Giter Club logo

trial.protocol's People

Contributors

breese avatar vinipsmaker avatar vinipsmaker2 avatar

Stargazers

 avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar

Watchers

 avatar  avatar  avatar  avatar

trial.protocol's Issues

Linking error in dynamic::variable

CMakeFiles/dynamic_dict_validator.dir/__/src/message_parsers/dynamic_dict_validator.cpp.o: In function `trial::protocol::dynamic::variable::map(std::initializer_list<std::pair<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const, trial::protocol::dynamic::variable> >)':
/home/blinktrade/Projetos/tradex/3rd/trial.protocol/include/trial/protocol/dynamic/detail/variable.ipp:964: multiple definition of `trial::protocol::dynamic::variable::map(std::initializer_list<std::pair<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const, trial::protocol::dynamic::variable> >)'
CMakeFiles/dynamic_dict_validator.dir/dynamic_dict_validator.cpp.o:/home/blinktrade/Projetos/tradex/3rd/trial.protocol/include/trial/protocol/dynamic/detail/variable.ipp:964: first defined here
CMakeFiles/dynamic_dict_validator.dir/__/src/message_parsers/dynamic_dict_validator.cpp.o: In function `trial::protocol::dynamic::variable::map()':
/home/blinktrade/Projetos/tradex/3rd/trial.protocol/include/trial/protocol/dynamic/detail/variable.ipp:959: multiple definition of `trial::protocol::dynamic::variable::map()'
CMakeFiles/dynamic_dict_validator.dir/dynamic_dict_validator.cpp.o:/home/blinktrade/Projetos/tradex/3rd/trial.protocol/include/trial/protocol/dynamic/detail/variable.ipp:959: first defined here
CMakeFiles/dynamic_dict_validator.dir/__/src/message_parsers/dynamic_dict_validator.cpp.o: In function `trial::protocol::json::basic_reader<char>::frame** std::__copy_move<false, true, std::random_access_iterator_tag>::__copy_m<trial::protocol::json::basic_reader<char>::frame*>(trial::protocol::json::basic_reader<char>::frame* const*, trial::protocol::json::basic_reader<char>::frame* const*, trial::protocol::json::basic_reader<char>::frame**)':
/home/blinktrade/Projetos/tradex/3rd/trial.protocol/include/trial/protocol/dynamic/detail/variable.ipp:944: multiple definition of `trial::protocol::dynamic::variable::null'
CMakeFiles/dynamic_dict_validator.dir/dynamic_dict_validator.cpp.o:/home/blinktrade/boost_1_61_0/boost/test/unit_test_suite.hpp:338: first defined here
CMakeFiles/dynamic_dict_validator.dir/__/src/message_parsers/dynamic_dict_validator.cpp.o: In function `std::_Deque_base<trial::protocol::json::basic_reader<char>::frame, std::allocator<trial::protocol::json::basic_reader<char>::frame> >::_Deque_impl::_M_swap_data(std::_Deque_base<trial::protocol::json::basic_reader<char>::frame, std::allocator<trial::protocol::json::basic_reader<char>::frame> >::_Deque_impl&)':
/home/blinktrade/Projetos/tradex/3rd/trial.protocol/include/trial/protocol/dynamic/detail/variable.ipp:948: multiple definition of `trial::protocol::dynamic::variable::array(std::initializer_list<trial::protocol::dynamic::variable>)'
CMakeFiles/dynamic_dict_validator.dir/dynamic_dict_validator.cpp.o:/home/blinktrade/Projetos/tradex/3rd/trial.protocol/include/trial/protocol/dynamic/detail/variable.ipp:948: first defined here
CMakeFiles/dynamic_dict_validator.dir/__/src/message_parsers/dynamic_dict_validator.cpp.o: In function `new_allocator':
/home/blinktrade/Projetos/tradex/3rd/trial.protocol/include/trial/protocol/dynamic/detail/variable.ipp:943: multiple definition of `trial::protocol::dynamic::variable::array()'
CMakeFiles/dynamic_dict_validator.dir/dynamic_dict_validator.cpp.o:/home/blinktrade/Projetos/tradex/3rd/trial.protocol/include/trial/protocol/dynamic/detail/variable.ipp:943: first defined here
clang: error: linker command failed with exit code 1 (use -v to see invocation)
test/CMakeFiles/dynamic_dict_validator.dir/build.make:127: recipe for target 'test/dynamic_dict_validator' failed

AllocatorAwareContainer

From #35

An Allocator doesn't have to be default constructible or stateless. All functions should accept an allocator instance, const Allocator& a. You can default this to an instance of Allocator() but users who use allocators where this is not an option, can supply the appropriate instance.

[dynamic] enum support?

Enumerations have their own distinct type, so they cannot be added to dynamic::variable. However, they have an integer as the underlying type, so they could be stored as such.

I am considering adding direct support for storing and retrieving enumerations.

For example:

enum class light { off, on };

// Create variable with on state.
dynamic::variable data(light::on)

// Check if stored value is an enum.
// Notice: this will succeed for any integer as well.
if (data.is<light>())
{
  // Retrieve the stored value as an enumerator.
  // Notice: the stored value may not match a valid enumerator.
  auto value = data.value<light>();
}

The disadvantage of this, is that the use can store an arbitrary integer, and then retrieve it as an enum which may not have an enumerator with the given integer value. This will result in undefined behaviour.

Ordered or unordered?

dynamic::variable supports associative arrays. This is currently implemented with std::map for convenience - I was too lazy to implement a hash function for variable.

I deliberately kept the variable iterators as ForwardIterator so we would be able to switch the implementation to std::unordered_map.

While writing test suites that uses std algorithms, I encountered several that cannot be used with variable because of its iterator category. For instance, we can use std::is_sorted (requires ForwardIterator) but not std::sort (requires RandomAccessIterator.)

The interesting algorithms are std::partition (requires BidirectionalIterator) and std::sort. Lack of support for std::partition is not a major problem because we can get a partition with std::remove. Support for std::sort is more interesting, because although it makes no sense on an associative array (which is either sorted by key, or unsortable) but it would be nice to be able to sort the elements in a normal array.

So the question is, shall variable::iterator model BidirectionalIterator/RandomAccessIterator or continue to model ForwardIterator? The former will give us access to more algorithms, but we lose the ability to use std::unordered_map.

I have slight preference for ForwardIterator, because reordering algorithms, e.g. std::sort and std::reverse, may have a strange effect on associative arrays.

XML parser

Would you be interested in having a XML parser in Trial.Protocol? I'll need one XML parser to implement WebDAV, so I could write one in the design of pull parsers like the ones in this library.

Literal whitespace vs literal node

So, I was playing with lua bindings to trial.protocol (actually it's a bigger project that happens to have JSON support as well and I'm using trial.protocol for that) and I end up finally having an use-case for the writer API. So, the following lua code:

local json = require('json')

local writer = json.writer.new()

writer:begin_object()
writer:value('foo')
writer:value('bar')
writer:end_object()
print(writer:generate())

will print:

{"foo":"bar"}

So far so good, but the literal API seems incomplete. There are two types of literals — nodes and insignificant linear whitespace. Linear whitespace literals are useful to indent the generated document, but node literals are useful in serialization libraries. For my lua bindings, the user might write a __tojson() metamethod as:

function __tojson(self, state)
    local writer = state.writer

    writer:begin_object()
    writer:value('type')
    writer:value(0)
    writer:end_object()
end

But that's kind of verbose. And given the type to be serialized has a constant representation, that's also inefficient. I would like to be able to write the following:

function __tojson(self, state)
    local writer = state.writer

    writer:literal('{"type":0}')
end

But that obviously isn't going to work. When called as part of a bigger object as in:

writer:begin_object()
writer:value('hello')
writer:value('world')
writer:value('foo')
encode_foo(foo, writer)
writer:end_object()
print(writer:generate())

The output will be:

{"hello":"world","foo"{"type":0}}

An invalid JSON.

So, we need a type of literal that accounts for a raw node to be written. I don't think that's hard. I think the bikeshedding to choose the function name is going to be more demanding than the implementation itself.

Anyway, I'm not in a rush to see this issue solved. There are other features besides JSON I have to work on before releasing my project.

Use error specializations thoroughly

There is a topic from your blog:

Better yet, define an mpg123::error exception that inherits from std::system_error, so exceptions from the mpg123 adapter can be distinguished from other system_error exceptions in the try-catch block.

http://breese.github.io/2017/05/12/customizing-error-codes.html

I just noticed that you use this trick in the library (and then remembered your blog post):

include/trial/protocol/json/error.hpp:class error : public std::system_error

However, the PR for the skip algorithm I've sent previously (I was playing with json::partial::skip() lately) doesn't follow this trick and throws system_error directly:

Should `category() == token::category::data` really be a precondition?

I mean, reader::value<T>() will already do a switch on symbol to possibly throw a runtime error. This precondition makes the API easier to use wrong and I fail to see a benefit on this.

Btw, we're using trial.protocol "heavily" (all JSON messages) on our new project at BlinkTrade, so if you need to advertise "is there a real usage case?", you can mention it (only boring thing til now is the increased number of assert(false) and code complexity thanks to #3).

Fixing Allocator support in Trial.Json

This library looks good, and good things provoke requests like these. Here's an initial list:

  • An Allocator doesn't have to be a template with a single parameter. i.e. The template parameter shouldn't be template<typename> class Allocator but instead just class Allocator. You can default it to std::allocator<char>, it doesn't matter, since you'll be rebinding as necessary.

  • An Allocator doesn't have to be default constructible or stateless. All functions should accept an allocator instance, const Allocator& a. You can default this to an instance of Allocator() but users who use allocators where this is not an option, can supply the appropriate instance.

  • Allocator construction shouldn't use addressof(*ptr) but instead to_address(ptr) (using addressof(*ptr) here before ptr references an object of T is undefined behavior).

  • Allocators might be final so do not derive from them unconditionally. Instead of deriving from Allocator, derive from a facility like boost::empty_value<Allocator> (which will use inheritance if not final, or otherwise just store a member).

Reader algorithms on chunk_reader

So, I'm trying to update tabjson to support the concatenated JSON streaming model. Therefore the first change I had to do was replacing reader by chunk_reader. However as soon as I made the change the project failed to compile due to algorithms such as json::partial::skip() no longer working.

These algorithms hardcode the type json::reader. However I don't think the current decision is wrong. It's pretty much unclear how to recover once a composite operation fails midway. Failing to accept chunk_reader args is a good thing here. If we were to change the algorithms I envision two approaches:

  • Extend the algorithms to embed the reading logic through user-injection points (e.g. a closure called to read more once end-of-buffer is reached).
  • Use a coroutine so the algorithm can resume once the user fills more of the buffer.

Both approaches are undesirable in my eyes.

However the question remains: how would one approach the usage of algorithms such as partial::skip() in a chunked stream? I have a few ideas and I thought I'd open up an issue to discuss them. I'll comment more later (a few of the ideas will also be discussed in separate issues).

Array and map construction

On the feature/dynamic branch it is possible to construct a dynamic::variable by passing a variable::array_type (aka std::vector<variable>.) I want to change this to used named constructors instead.

For example, the following code:

std::vector<variable> values;
variable data(values);

should be changed into:

std::variable<variable> values;
variable data = variable::array(values.begin(), values.end());

The reason for this change is that the current code can lead to surprising results. While working on adding support for operator<< I encountered an infinite recursion because operator<<(variable) was invoked when given a std::vector<variable>.

I could have solved the recursion by adding an overload for std::vector<variable> but that would preclude users from adding their own. Notice that operator<<(variable) simply forwards to operator<< of the underlying type, so the default behaviour of operator<<(std::vector<variable>) should be a compiler error (unless a custom overload exists.) It is not used for serialization.

All of the above also applies to variable::map_type.

Any objections?

dynamic::variable: rename number to real

Floating-point numbers are current referred to as symbol::number (for comparison integer numbers use symbol::integer.)

I have never been pleased with that term, so I am planning to renaming it to symbol::real. This name fits well with the meaning of std::real.

This will be a (trivial) breaking-change for dynamic::variable, so I want to hear if there are any objections.

Likewise, I plan to rename code::float_number, code::double_number, and code::long_double_number to code::short_real, code::real, and code::long_real respectively.

Merged matching & decode on Boost.Hana serialization

I know Boost.Hana serialization code wasn't merged yet. The issue is a future TODO item. I'll tackle it myself. And I don't want to keep my mind busy with this information. So I'll just dump the idea here and get back to it at some point in the future.

So I previously suggested the idea to optionally merge matching/decode phases (for string values only): #44 (comment). Once the feature hits the repo, I can update the Boost.Hana serialization code to make use of it.

The idea is: when I hit a member for which a std::string value is desired, branch the algorithm on:

struct_str_member.capacity() == 0

Do not merge matching/decode steps. Perform a separate match to get the literal() size and reserve enough data in one step. Then collect the string.

struct_str_member.capacity() != 0

Trust the user's size estimate heuristic and just merge matching/decode steps.

intrusive serialization

when trying to use intrusive serialization (e.g. with a member template 'serialize') I fail to compile the code. I have put a demo working with boost::archive::text_iarchive / test_oarchive in my fork (latest develop)

https://github.com/bodomartin/trial.protocol/

which is otherwise synced with your latest develop.
(using gcc 5.4.0, linux) - I noticed that when using a private serialize template the serializer_overloader does not find the member function? The boost serialization lib uses the 'access' friend.

obj.erase(it) fails on map

Tested on commit cea147e, but looking at the source code from latest develop, I think it should be failing too. I can write an isolated case to test it on develop later.

Compiler is clang. Error is:

application.cpp:36:9: error: no matching member function for call to 'erase'
    obj.erase(it);
    ~~~~^~~~~
3rd/trial.protocol/include/trial/dynamic/variable.hpp:869:14: note: candidate function not viable: no known conversion
      from 'trial::dynamic::basic_variable<std::allocator>::key_iterator' to 'trial::dynamic::basic_variable<std::allocator>::const_iterator' for 1st argument
    iterator erase(const_iterator position);
             ^

Also, finally I am coding the part where performance is not the critical and I can do whole parses and get the convenience of dynamic::variable on my code 🎉🎉🎉.

Generator for real numbers shouldn't print the fractional part if there is none

So, I'm developing a Telegram client and I'm using this lib: https://core.telegram.org/tdlib

What I learnt is that a few applications (including tdlib) will reject JSON messages if they include the fractional part (even if it's all zeros). Here's the reply I get from the previous application when sending one of such requests:

{"code":400.000000000000,"@type":"error","message":"Failed to parse JSON object as TDLib request: Can't parse \"559815.000000000\" as an integer"}

So I guess Trial.Protocol's generator needs to detect such patterns and omit the fractional part if it's going to be all zeroes. Otherwise this class of applications will not work.

For my case I can (and will) use a workaround. However I think it's important for Trial.Protocol to up its game, so I'm reporting this issue.

JSON Schema support

I have no need for this feature in the very long road. Actually, I could use them if there was an algorithm to generate random JSONs based on a JSON schema so I could add fuzzy testings in my applications, but I don't want to. This issue is just a reminder for missing features.

Specification (still a draft): http://json-schema.org/

RapidJSON has this feature: http://rapidjson.org/md_doc_schema.html

And Mozilla services use RapidJSON because they need JSON Schema validation: https://github.com/mozilla-services/mozilla-pipeline-schemas

Issues with constness in std::make_pair

include/c++/12.1.0/bits/stl_pair.h: In instantiation of 'struct std::pair<const trial::dynamic::basic_variable<std::allocator<char> >, trial::dynamic::basic_variable<std::allocator<char> > >':                                                                                                                                
include/c++/12.1.0/type_traits:1467:45:   required by substitution of 'template<class _From1, class _To1, class> static std::true_type std::__is_convertible_helper<const trial::dynamic::basic_variable<std::allocator<char> >&, trial::dynamic::basic_variable<std::allocator<char> >, false>::__test(int) [with _From1 = const trial::dynamic::basic_variable<std::allocator<char> >&; _To1 = trial::dynamic::basic_variable<std::allocator<char> >; <template-parameter-1-3> = <missing>]'                                      
include/c++/12.1.0/type_traits:1476:42:   required from 'struct std::__is_convertible_helper<const trial::dynamic::basic_variable<std::allocator<char> >&, trial::dynamic::basic_variable<std::allocator<char> >, false>'                                                                                                       
include/c++/12.1.0/type_traits:1482:12:   required from 'struct std::is_convertible<const trial::dynamic::basic_variable<std::allocator<char> >&, trial::dynamic::basic_variable<std::allocator<char> > >'                                                                                                                      
include/c++/12.1.0/type_traits:3284:72:   required from 'constexpr const bool std::is_convertible_v<const trial::dynamic::basic_variable<std::allocator<char> >&, trial::dynamic::basic_variable<std::allocator<char> > >'                                                                                                      
include/c++/12.1.0/bits/stl_pair.h:259:18:   required from 'static constexpr bool std::pair<_T1, _T2>::_S_convertible() [with _U1 = const trial::dynamic::basic_variable<std::allocator<char> >&; _U2 = const trial::dynamic::basic_variable<std::allocator<char> >&; _T1 = trial::dynamic::basic_variable<std::allocator<char> >; _T2 = trial::dynamic::basic_variable<std::allocator<char> >]'                                                                                                                                    
include/c++/12.1.0/bits/stl_pair.h:268:65:   required from 'struct std::pair<trial::dynamic::basic_variable<std::allocator<char> >, trial::dynamic::basic_variable<std::allocator<char> > >' 
                                                                                                                                   
include/trial/protocol/bintoken/serialization/dynamic/variable.hpp:121:44:   required from here
                         
include/c++/12.1.0/bits/stl_pair.h:268:65:   in 'constexpr' expansion of 'std::pair<const trial::dynamic::basic_variable<std::allocator<char> >, trial::dynamic::basic_variable<std::allocator<char> > >::_S_convertible<const trial::dynamic::basic_variable<std::allocator<char> >&, const trial::dynamic::basic_variable<std::allocator<char> >&>()'                                                                                                                                                                             
include/c++/12.1.0/bits/stl_pair.h:260:20: error: the value of 'std::is_convertible_v<const trial::dynamic::basic_variable<std::allocator<char> >&, trial::dynamic::basic_variable<std::allocator<char> > >' is not usable in a constant expression                                                                             
  260 |             return is_convertible_v<_U2, _T2>;                                                                                                                                              
      |                    ^~~~~~~~~~~~~~~~~~~~~~~~~~                                                                                                                                               
In file included from include/c++/12.1.0/bits/char_traits.h:42,                                                             
                 from include/c++/12.1.0/string:40:                                                                         
include/c++/12.1.0/type_traits:3284:25: note: 'std::is_convertible_v<const trial::dynamic::basic_variable<std::allocator<char> >&, trial::dynamic::basic_variable<std::allocator<char> > >' used in its own initializer                                                                                                         
 3284 |   inline constexpr bool is_convertible_v = is_convertible<_From, _To>::value;

this patch solves it

+++ b/bintoken/serialization/dynamic/variable.hpp	2022-05-11 15:15:31.021192091 -0700
@@ -118,7 +118,9 @@ struct save_overloader< protocol::bintok
             ar.template save<bintoken::token::null>();
             for (auto it = data.begin(); it != data.end(); ++it)
             {
-                auto value = std::make_pair(it.key(), it.value());
+                auto key = it.key();
+                auto val = it.value();
+                auto value = std::make_pair(key, val);
                 ar.save_override(value.first, protocol_version);
                 ar.save_override(value.second, protocol_version);
             }

Compare current string-value with a string_view w/o allocating

I have some loops like the following on my codebase:

for (;;) {
    // Key
    if (reader.symbol() == json::token::symbol::end_object) {
        reader.next();
        break;
    }

    assert(reader.symbol() == json::token::symbol::string);
    auto current_key = reader.literal();
    current_key.remove_prefix(1);
    current_key.remove_suffix(1);

    if (!reader.next())
        throw Error("bad object");

    // Value
    try {
        if (current_key == "TestReqID") {
            msg.header.req_id = reader.value<decltype(msg.header.req_id)>();
            read_req_id = true;
        } else if (current_key == "SendTime") {
            msg.send_time = reader.value<decltype(msg.send_time)>();
            read_send_time = true;
        }
        json::partial::skip(reader);
    } catch (const json::error&) {
        throw Error("bad object");
    }
}

It'd be nice if there was a built-in function to compare current string-value with a user-provided string_view. Doing so would spare a call to reader.value<std::string>() which is an allocating function.

Tree parsing and formatting

I have added a tentative version of tree parsing and formatting of JSON in the feature/tree branch. I would like to solicit feedback on this before it enters the develop branch.

The API is fairly straight-forward:

// Converts a JSON formatted input buffer into a dynamic::variable.
// The input buffer must be one of the supported buffer types (e.g. std::string, std::ostringstream, std::vector)
// <trial/protocol/json/parse.hpp>
template <typename BufferType>
dynamic::variable parse(BufferType input);

// Converts a dynamic::variable into a JSON formatted output buffer.
// The output buffer is created.
// <trial/protocol/json/format.hpp>
template <typename BufferType>
BufferType format(dynamic::variable);

// Converts a dynamic::variable into a JSON formatted output buffer.
// The output buffer is supplied by the user.
// <trial/protocol/json/format.hpp>
template <typename BufferType>
void format(dynamic::variable, BufferType&);

Do not inherit from allocator

From #35

Allocators might be final so do not derive from them unconditionally. Instead of deriving from Allocator, derive from a facility like boost::empty_value<Allocator> (which will use inheritance if not final, or otherwise just store a member).

About small_union

make include dir/files installable

Could this line (below) be added to the main CMakeLists.txt to make the
include dir 'installable'?

install(DIRECTORY ${TRIAL_PROTOCOL_ROOT}/include/ DESTINATION include)

Wide strings

dynamic::variable is currently defined as an alias for dynamic::basic_variable<char>. This CharT template parameter is then used to select the internal string type. So you need to define a new alias for dynamic::basic_variable<wchar_t> to handle wide-character strings.

After some thinking, I believe that it would be better to remove the template parameter completely, and instead add all four standard string types as union members of dynamic::variable (which then becomes a class instead of an alias.)

This means that a dynamic::variable can store any fundamental type (where void/nullptr are replaced by a null type) and any standard string, as well as an array or map of other dynamic::variable.

Comments?

Use to_address()

From #35

Allocator construction shouldn't use addressof(*ptr) but instead to_address(ptr) (using addressof(*ptr) here before ptr references an object of T is undefined behavior).

Change Allocator template to Allocator type

From #35

An Allocator doesn't have to be a template with a single parameter. i.e. The template parameter shouldn't be template<typename> class Allocator but instead just class Allocator. You can default it to std::allocator<char>, it doesn't matter, since you'll be rebinding as necessary.

trial::protocol::json::iarchive(raw_result) >> trial::dynamic::variable{} fails on bad input

The following will fail on bad input:

void f(boost::string_view raw_result)
{
    trial::dynamic::variable result;
    try {
        trial::protocol::json::iarchive archive(raw_result);
        archive >> result
    } catch (...) {
        // ...
    }
}

The serialization code doesn't handle the trial::protocol::json::token::symbol::error enumeration:

I have some hurry to solve this issue then I'll probably send a PR (after I solve the issue on my own tree). But which error handling do you think it fits into Trial.Protocol design? Is throwing an exception fine here?

Function find_key

Given this JSON:

{
  "abc": [ { "def": 4 }, {} ],
  "def": 42
}

and a reader which is in the root JSON point:

json::reader reader(raw_json);
assert(reader.symbol() == json::token::symbol::begin_object);

find_key(reader, "def") would advance the reader to hold the point/value 42.

An initial non-tested implementation is:

if (reader.symbol() != json::token::symbol::begin_object)
  return false;

if (!reader.next())
  return false;

while (true) {
  // Key
  if (reader.symbol() == json::token::symbol::end_object) {
    reader.next();
    return false;
  }

  if (reader.value<std::string>() == key) {
    reader.next();
    return reader.symbol() != json::token::symbol::error;
  }

  if (!reader.next())
    return false;

  // Value
  skip_element(reader);
  if (reader.symbol() == json::token::error)
    return false;
}

I think this function should only try to find the key if it is given the token at the begin_object token because the semantics are clearer. If you're already iterating over the object keys, there is no point in provide an “easier” function to you, as all you'd need is the skip_element function and stay in the loop (one call to the skip_element in the else clause and nothing more... which is 2 extra lines for user code...).

Also, if you think this function can belong to Trial.Protocol, another function could be close_current_object which would pair with the reader advanced by find_key nicely.

fails to compile with clang

clang (3.9.1) , boost 1.62 , complains about ctype in libc++ ,and in the json vector serialization. I have been trying to fix these but no success for far :-|

Tests fail to build on boost 1.76

include/trial/protocol/detail/lightweight_test.hpp:42:28: error: no member named 'report_errors_remind' in namespace 'boost::detail'
            boost::detail::report_errors_remind();

Parser gets confused on non-NULL terminated string

While working on batches of JSONs stored in a file, I've stumbled upon this error. I'd normally provide an accompanying fix, but I think you'll have way easier time delving in the decoding layers due to code familiarity.

Pull #45 provides a test. It's just a copy'n'paste of an already existing test, but changing the input to move the NULL terminator further away in the byte sequence.

Interface scoping via prefix or nested struct?

dynamic::variable is a heteregenous data structure that can contain both arrays (std::vector<T>) and associated arrays (std::map<T, U>). Only the associated arrays have keys, so the iterators dereferences the value. Therefore a special key iterator, which dereferences the key, and associated functions have been added.

We need a way to distinguish between the two in the interface.

The conventional way is to use a prefix. For example:

variable::key_iterator it = object.key_begin();
variable::key_iterator it2 = object.key_find("key");

I have experimented with an alternative (emulating inner classes) that uses C++ constructs (scoped resolution operator and dot operator) instead of prefixes. This gives the following interface:

variable::key::iterator it = object.key.begin();
variable::key::iterator it2 = object.key.find("key");

While I find this to be a nicely scoped interface, the price is that each variable object will contain an extra pointer (the nested key struct has a pointer to its parent class.)

Is this an acceptable price or should I revert to prefixes?

Can `dynamic::variable::map_type::erase` accept `string_view`?

This is not the same as #19. #19 is a bug. This is a feature request.

The following code fails:

auto a = obj.value<trial::dynamic::variable::map_type>();
a.erase(boost::string_view{});

I have to allocate a std::string to erase the proper element. It'd be nice if it worked with string_view directly.

Parsers combinators

So, I've been keeping the topic of parsers combinators on the back of my head since ~2016 when you introduced me to the idea.

TBH I had some resistance to the approach of parser combinators because they're just too difficult to implement and of limited usefulness/reuse (or so I thought). However recently I was playing with re2c on a new project and in this new project I had the chance to declare several mini-parsers and jump from one to another while they reuse data pretty seamlessly. The idea immediately struck me. I was using parsers combinators.

Usually the talks around the topic focus on functional languages and the approaches I've seen are pretty complex (and limited too TBH). However on re2c I had the chance to combine parsers under an imperative approach and the result was pretty pleasant and much more natural on C++.

The experiment also made me realize that all this time I've been facing parsers combinators under the wrong angle. Usually parsers combinators are presented focusing on the reuse of matchers for mini-parsers and a few very simple combinators/glues. However it has weak appeal given the rules for mini-parsers (i.e. a few very simple regexes that could be implemented very easily by hand) (1) offer very little reuse and (2) only offer reuse for the easy logic. When I'm interested in reusing a parser I want to reuse the logic for the hard stuff (e.g. state/nested-level tracking on JSON parsers or the complex algorithm to determine the body length on HTTP). The hard logic is not on the matchers, but on the code for everything else. It's the logic from this big self-contained parser that I want to use, not a few matchers that I could have written myself.

That got me thinking: what is it that will enable me to reuse the logic from the big parser? What is it that re2c is doing to enable me to combine mini-parsers so seamlessly? The answer was right in front of me. Combining mini-parsers the re2c-way allows one to defer the handling of some token straight away to another mini-parser. Let's begin with a JSON example:

{ "hello": "world" }

Suppose you want a syntactic extension that allows one to encode dates in JSON documents:

{ "hello": %2021-07-07% }

What you want here is: when the token %2021-07-07% is reached, a different parser is invoked to handle it and then we advance the main parser as well. So, really, it's two things that are seen by the main parser:

  1. When it reaches a certain point, a hole appears. IOW, it's a chunked parser.
  2. When the hole is over, the state of the parser should be advanced as if it consumed a value token.

Trial.Protocol already offers #1. #2 can be emulated. For instance, before the parser for our syntactic date extension returns, it could call reader.next("null"). However I contend it's desirable to offer an interface that consumes the already-decoded token/value to skip some unnecessary logic. For instance, this mini date parser could be calling reader.next<token::null>() instead. Could we have this addition? I don't think it's a polemic addition to the interface.

That covers the main part of the problem. And then, for the last part, we have the matchers again: what to do on token::code::error_unexpected_token? If we're using a chunked parser then this should be a non-fatal error as it'll allow for a fallback mini-parser to be invoked to handle just that token. This change would allow one to reuse Trial.Protocol parser to parse JSON documents with comments, for instance. However this little point on non-fatal unexpected-token is polemic and it deserves way more thought than the previous points. In the meantime, could chunk_reader.next<token::null>() be added?

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. 📊📈🎉

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google ❤️ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.