Comments (5)
Yep I agree, there is too much inlining right now. I've also ran into code size problems with MPack. I have a port of MPack to Pebble smartwatch lying around here, and it compiles fine on it, but the unit test suite is way too huge to run. I've only been able to run minor individual tests, and there's no point in the port if it takes up most of the space available to the program.
The recent inlining came about due to some aggressive optimizations I've been doing the past couple months. I do want to point out that none of the inlining was done speculatively. A few months ago I started writing a benchmarking suite for schemaless data parsers which I'm getting ready to release publicly soon. It compares the performance and code size of various libraries including the reference msgpack implementation, MPack, CMP, RapidJSON, and YAJL. This benchmarking suite largely drove the development of v0.5 of MPack which made major performance improvements. It's what I've been using to do these performance tests, and the inlining changes were all made on observation of very real, statistically significant performance improvements in my tests (on a desktop PC, a Chromebook, and a Raspberry Pi.) So I'm not just assuming inlining is automatically faster or guessing at what to inline; they do actually improve performance.
Some of these inlines don't need to be done directly with different optimization options however. I'm switching over the benchmarking suite to -O3 -flto -fwhole-program
, which makes any use-once inline unnecessary. This is what the kernel style guide you linked is referring to. I had originally avoided this though, since in my experience most projects in the real world aren't actually compiled this way, and I wanted to keep the benchmarking as realistic as possible (I'm still avoiding profile-guided optimization in benchmarking for the same reason.) And I do agree that performance isn't everything, especially since typically the data parser is such a minor part of any large application, so optimizing for code size instead of speed is usually better anyway. But these things are hard to demonstrate objectively, and I do still want MPack to be the fastest schemaless parser available. So there's a tradeoff to be made here.
In the meantime you might be able to get away with replacing static inline
with MPACK_INLINE
in the code, and then defining MPACK_INLINE
to static __attribute__((unused))
or something. If your linker is good and/or configured properly it should be able to merge down duplicate static function definitions. In the longer term, I might consider adding a configuration option to choose between optimizing for size or speed which would conditionally inline some functions. This might make MPack be able to claim to be the fastest and smallest MessagePack parser when configured appropriately. (This is especially true since it doesn't do any of the templating that the reference implementation does, or that other parsers claim is key to their performance, like RapidJSON. It isn't. I believe that the extent of templating they do is bad, and one of the ways I want to be able to demonstrate this is by having a smaller code footprint.)
Anyway thanks for bringing this up. It's good to know the issues you're having. I agree and have been having similar issues, and it gives me a good direction to work on. I'll definitely be making this a priority for v0.6.
from mpack.
Hi Nicholas (@ludocode),
Thanks for you clear answer. This library is one of the only ones which fit embedded (micro-controller) usage because use of malloc/calloc free is left how it is configured. The use of dynamic memory is not acceptable for MISRA-C. Code size is the most important for deeply embedded systems (e.g ARM Cortex-Mx). I will track the development and when I have some enhancements I will send you pull requests.
from mpack.
This is now implemented on develop (b2bb68c). This work was done as part of a conversion of inline usage away from static inline
, instead using "extern inlines" as appropriate to provide a single definition when a function cannot be inlined.
A new option MPACK_OPTIMIZE_FOR_SIZE
is available in mpack-config.h
. If you are using GCC it will detect the use of -Os
and will automatically enable the option. Any non-trivial functions previously declared inline
are now only declared inline
if MPACK_OPTIMIZE_FOR_SIZE
is not true.
You can see the complexity of supporting all of this here:
mpack/src/mpack/mpack-platform.h
Lines 131 to 203 in b2bb68c
This new option seems like it may be pointless though. When using my benchmarking suite as a semi-realistic test of executable size, enabling MPACK_OPTIMIZE_FOR_SIZE
under -Os -flto
doesn't seem to accomplish anything; in fact with GCC 5.2 it actually makes the writer test larger by about 170 bytes!
Remember, inline
is just a compiler suggestion, and when compiling with -Os
, the compiler is likely to ignore it if it does not actually reduce the size of the binary. I'm guessing adding inline
in lots of places makes the compiler perform more inter-procedural analysis to find cases where it can actually reduce the size, but it doesn't hurt anything because most of the time it refuses to inline the function. The avoidance of static
also doesn't matter, since the linker was already smart enough to merge duplicate definitions.
So the reasoning in your kernel.org documentation may no longer be true, at least not when using a modern compiler and configuring it to optimize for size with link-time optimization. Using static
and inline
liberally doesn't actually bloat the binary at all, and in fact can actually further reduce the size.
Anyway, give the new code a try and see if the option does anything for you. I didn't try any other platforms or compilers, and I didn't try a non-lto build either, so maybe there are cases where it helps. Make sure you are compiling with -Os -flto
and linking with -Os -flto -fwhole-program
if possible. Also, and I'm sure this is obvious, but double-check to make sure neither of DEBUG
and _DEBUG
are defined. Try toggling the MPACK_OPTIMIZE_FOR_SIZE
option to see how it affects the size of your binary. If it doesn't have a significant effect I'm most likely going to remove it and just use MPACK_INLINE for everything.
from mpack.
I just released version 0.6 with the above inline fixes and the MPACK_OPTIMIZE_FOR_SIZE
option, so I'm resolving this as fixed. The size option might be removed at some point if it doesn't really reduce the size under proper size-optimized builds, but either way it's fine for now.
from mpack.
Thanks for putting your time and effort in this, totaly agree on your points and have nothing to add.
Kind regards,
Jerry
from mpack.
Related Issues (20)
- Stream reader with manual memory management? HOT 2
- comparison with libmpack HOT 1
- Warning about incompatible int types in mpack_snprintf calls HOT 1
- ESP32-C3 rebooting when executing functions from the MPack library
- Stream parser that doesn't buffer the entire message
- tools/unit.sh help - does not work
- How to know the end of message pack in serial communication? HOT 1
- memory error on avr (arduino uno)
- File IO incompatible with custom context HOT 2
- Suggestions for warnings? HOT 3
- mpack crashes without a matching call to mpack_complete_* HOT 3
- Node API for read+write? HOT 1
- How to change the data with node API? Or how to write fast when I know the data offset or giving node? HOT 2
- get nothing if prev node is nil HOT 3
- Make it possible to use a dynamic buffer with mpack writer HOT 2
- builder page allocation leak and segfault HOT 6
- Missing Symbolic Link HOT 1
- Should NULL data be allowed for mpack_write_str() with count of 0? HOT 1
- Error handler segfault HOT 1
- How can I get it header only? HOT 1
Recommend Projects
-
React
A declarative, efficient, and flexible JavaScript library for building user interfaces.
-
Vue.js
🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.
-
Typescript
TypeScript is a superset of JavaScript that compiles to clean JavaScript output.
-
TensorFlow
An Open Source Machine Learning Framework for Everyone
-
Django
The Web framework for perfectionists with deadlines.
-
Laravel
A PHP framework for web artisans
-
D3
Bring data to life with SVG, Canvas and HTML. 📊📈🎉
-
Recommend Topics
-
javascript
JavaScript (JS) is a lightweight interpreted programming language with first-class functions.
-
web
Some thing interesting about web. New door for the world.
-
server
A server is a program made to process requests and deliver data to clients.
-
Machine learning
Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.
-
Visualization
Some thing interesting about visualization, use data art
-
Game
Some thing interesting about game, make everyone happy.
Recommend Org
-
Facebook
We are working to build community through open source technology. NB: members must have two-factor auth.
-
Microsoft
Open source projects and samples from Microsoft.
-
Google
Google ❤️ Open Source for everyone.
-
Alibaba
Alibaba Open Source for everyone
-
D3
Data-Driven Documents codes.
-
Tencent
China tencent open source team.
from mpack.