Comments (9)

tomas-abrahamsson avatar tomas-abrahamsson commented on September 28, 2024

Hi, interesting stuff!

About switching MsgDef to a map: I'm getting the impression you are using the gpb module directly as opposed to generating an encoder/decoder module and using it? If yes, then you would probably get some performance improvement more or less for free if you can switch over to using a generated module. Can you, or are you limited to gpb due to something? Maybe you or your clients are using the Elixir protobuf, in which case it might imply some restrictions? I'm not familiar with how it works. I'd be interested to learn more about your use case. The generated encoder/decoder does not traverse the list of definitions. Instead, the definitions are embedded in the structure of the generated code.

I've kept the data-driven encoder/decoder in the gpb module mostly to be able to cross-check encoding/decoding results against generated modules, but I have generally not spent any effort on performance or other bells and whistles here. That effort has gone into the code generator instead.

Regarding maps, I just opened an issue over in hexpm/hex_core#134 about what the oldest OTP version to support is. (If it is still 17, it would impose some limitations on how one can phrase the maps expressions.)

Regarding the API of the (presumably intended) gpb module, I think an approach could be to work on maps internally, but the API would still need to accept defs as a list as well (for backward compatibility), and in that case convert the defs to a map as a first step. As the documented definitions format is a list, I guess it would make sense to also expose, as an API function, the function that turns the definition list into a map. (An alternative is of course to define a new version of the definitions format, but I think that would be more work, since the code generator would then need to be adapted as well.)
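A minimal sketch of the "accept a list, work on a map" idea described above. The module and function names (`defs_sketch`, `normalize_defs/1`, `find_msg_def/2`) are hypothetical and not part of gpb. It assumes the definitions are a list containing `{{msg, Name}, Fields}` and `{{enum, Name}, Enumerators}` tuples, and it skips any other elements. It only uses `is_map/1`, `maps:from_list/1` and `maps:find/2`, which should be available from OTP 17 onwards.

```erlang
-module(defs_sketch).
-export([normalize_defs/1, find_msg_def/2]).

%% Accept either the documented list format or an already converted map.
%% A map is passed through unchanged, so callers may convert once and reuse it.
normalize_defs(Defs) when is_map(Defs) ->
    Defs;
normalize_defs(Defs) when is_list(Defs) ->
    %% Keep only elements keyed by a two-tuple such as {msg, Name} or
    %% {enum, Name}; elements that do not match the pattern are skipped.
    maps:from_list([{Key, Value} || {{_Kind, _Name} = Key, Value} <- Defs]).

%% Look up a message definition without traversing a list.
%% Returns {ok, Fields} or error.
find_msg_def(MsgName, DefsMap) when is_map(DefsMap) ->
    maps:find({msg, MsgName}, DefsMap).
```

With this shape, an entry point could call `normalize_defs/1` first and then do lookups against the map, while existing callers keep passing the list.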


Regarding iolists, do you have any performance figures for only this part of the proposed change? Currently, the code relies on the Erlang optimization that binaries are initially write-appendable under the hood. This is at least the case for the generated code, maybe also for the gpb module (I don't remember), but again, are you referring to the data-driven encoder/decoder in the gpb module? I tried earlier with iolists instead, but didn't find that it made much of a speedup, if I remember correctly. Unfortunately, I don't think I have any results to share anymore, and it was quite some time ago.

from gpb.

sasa1977 avatar sasa1977 commented on September 28, 2024

> I'm getting the impression you are using the gpb module directly as opposed to generating an encoder/decoder module and using it?

Yeah, I was benching and optimizing gpb:encode_msg. I didn't look at the generated encoder, but I naively assumed that the code in those modules would internally use gpb. Is that not the case, i.e. you're saying that gpb is basically a parallel implementation of the encoder/decoder?

> Regarding io-lists, do you have any performance figures for only this part of the proposed change?

As mentioned in the first comment, switching to iolists brought an extra 2x improvement (on the particular data structure I was benching).

> Currently, the code relies on the Erlang optimization that binaries are initially write-appendable under the hood.

Yes, a binary is write-appendable, but it still needs to be reallocated if it doesn't have enough space (source). Given a large enough input data structure, I'd expect frequent reallocation. With iolists, these reallocations can be avoided. Only at the end do we invoke an efficient iolist_to_binary to produce the encoded bytes.
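To make the two strategies concrete, here is a minimal sketch that encodes a list of non-negative integers as varints in both ways. It is not gpb's code; the module and function names (`iolist_sketch`, `encode_append/1`, `encode_iolist/1`) are made up for illustration. Both functions should produce the same bytes; they differ only in how the result is accumulated.

```erlang
-module(iolist_sketch).
-export([encode_append/1, encode_iolist/1]).

%% Protobuf-style varint: 7 payload bits per byte, least significant group
%% first, with the high bit set on every byte except the last.
varint(N) when is_integer(N), N >= 0, N < 128 ->
    <<N>>;
varint(N) when is_integer(N), N >= 128 ->
    <<1:1, (N band 127):7, (varint(N bsr 7))/binary>>.

%% Strategy 1: keep appending to one accumulator binary. The runtime can
%% append in place, but has to reallocate when the spare capacity runs out.
encode_append(Ints) ->
    lists:foldl(fun(N, Acc) -> <<Acc/binary, (varint(N))/binary>> end,
                <<>>, Ints).

%% Strategy 2: collect the pieces in an iolist and flatten once at the end.
encode_iolist(Ints) ->
    iolist_to_binary([varint(N) || N <- Ints]).
```

For example, `iolist_sketch:encode_iolist([1, 300])` and `iolist_sketch:encode_append([1, 300])` both return `<<1,172,2>>`. Which one is faster depends on the input, so it would need to be measured on representative data.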

> are you limited to gpb due to something?

I'll double check and report back.

sasa1977 avatar sasa1977 commented on September 28, 2024

> As mentioned in the first comment, switching to iolists brought an extra 2x improvement (on the particular data structure I was benching).

Sorry, this statement is misleading. The speed up was 2x after the maps optimization. But disregarding that, and speaking in absolute numbers, with the test input I'm using the master version encoding takes 5ms on average. If I switch to iolists, it takes 4ms on average.

tomas-abrahamsson avatar tomas-abrahamsson commented on September 28, 2024

> Is that not the case, i.e. you're saying that gpb is basically a parallel implementation of the encoder/decoder?

Yes, that's correct, it is a parallel implementation, there is no run-time dependency from the generated code to gpb.

> but it still needs to be reallocated if it doesn't have enough space [...] with the test input I'm using the master version encoding takes 5ms on average. If I switch to iolists, it takes 4ms on average.

Good point about reallocations, I didn't think about that. And a 20% improvement on encoding (for this particular input) is indeed something :) So this seems like it would be a worthwhile improvement.

There could probably be a break-even point somewhere: if a binary in an iolist is small, use integers instead when the byte values are below 256. I'm thinking about memory usage for binaries vs. integers, as described in the efficiency guide. For example, let's say we have a field of type bytes, with field number 1 and a value of 17 bytes. The key for a length-delimited field with number 1 (the field number combined with wire type 2) is (1 bsl 3) + 2 = 10, which is below 256. The length 17 is below 128, so the varint encoding of the length will also be below 256. Then it would be slightly more memory-efficient to store it in the iolist as [10, 17, <<17 bytes>>] than as the more general [<<10>>, <<17>>, <<17 bytes>>] or [<<10, 17>>, <<17 bytes>>]. But this could well be a premature optimization that just results in overly complex code.
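The arithmetic in this example can be checked with a small sketch. The module and function names (`tag_sketch`, `key/2`, `variants/0`) are hypothetical. It only covers the case from the example, where both the key and the length fit in a single byte, and it shows that the three iolist shapes flatten to the same bytes. It does not measure memory usage.

```erlang
-module(tag_sketch).
-export([key/2, variants/0]).

%% Field key: the field number shifted left by 3 bits, combined with the
%% wire type in the low 3 bits. Wire type 2 means length-delimited.
key(FieldNum, WireType) ->
    (FieldNum bsl 3) bor WireType.

%% Three iolist shapes for field number 1 of type bytes with a 17-byte value.
%% Valid only while Key and Len each fit in one byte as a varint (below 128).
variants() ->
    Payload = binary:copy(<<0>>, 17),
    Key = key(1, 2),
    Len = byte_size(Payload),
    [[Key, Len, Payload],          %% bare integers as bytes
     [<<Key>>, <<Len>>, Payload],  %% one small binary per piece
     [<<Key, Len>>, Payload]].     %% key and length in one binary
```

Passing each element of `tag_sketch:variants()` to `iolist_to_binary/1` gives the same 19-byte binary starting with the bytes 10 and 17.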

sasa1977 avatar sasa1977 commented on September 28, 2024

> Yes, that's correct, it is a parallel implementation, there is no run-time dependency from the generated code to gpb.

How stable would you say the gpb implementation is? I'm asking because my clients are currently using that one, and it seems that switching to the generated code might be a pretty large undertaking at this point.

tomas-abrahamsson avatar tomas-abrahamsson commented on September 28, 2024

Stable in what sense? I'd say both implementations are fairly well tested. Most work has gone into the generated code, both because it is a somewhat more complex problem and because there are more options for it.

I forgot to mention that for the generated code, there is also a nif option to generate code that uses Google's C++ protobuf library via NIFs to encode and decode, and a bit more performance can be squeezed out with the bypass_wrappers option. But there are some caveats if you plan to use or switch between overlapping sets of proto definitions (see this section of the README.nif-cc), and the build process of course becomes a bit more complex.

sasa1977 avatar sasa1977 commented on September 28, 2024

OK thanks! Let me know if you're interested in accepting these two perf improvements for the gpb module.

tomas-abrahamsson avatar tomas-abrahamsson commented on September 28, 2024

Yes, definitely, I think they'd be nice improvements.

tomas-abrahamsson avatar tomas-abrahamsson commented on September 28, 2024

Thanks for both PRs. I will take a look.
