Comments (8)
I have another optimization proposal.
In _ntoa_long and _ntoa_long_long you can take a different processing path depending on the base of the number. If the base is a power of 2, you can replace the division and modulo operations with a simple bitwise right shift and an AND operation. This should improve performance considerably on most microcontrollers, and even more on those without hardware division support.
Another optimization (in the same functions) is to use a lookup table to decide whether the output character should be a digit or a letter. It replaces the if statement and all the following calculations with a single memory access.
I propose something like this:
const char LUT[] = "0123456789ABCDEF";
const char lut[] = "0123456789abcdef";

// internal itoa for 'long' type
static size_t _ntoa_long(out_fct_type out, char* buffer, size_t idx, size_t maxlen, unsigned long value, bool negative, unsigned long base, unsigned int prec, unsigned int width, unsigned int flags)
{
  char buf[PRINTF_NTOA_BUFFER_SIZE];
  size_t len = 0U;

  // no hash for 0 values
  if (!value)
  {
    flags &= ~FLAGS_HASH;
  }

  // write if precision != 0 and value is != 0
  if (!(flags & FLAGS_PRECISION) || value)
  {
    if (base == 10U)
    {
      // decimal: only digits are produced, so either table works
      do
      {
        const unsigned int digit = (unsigned int)(value % base);
        buf[len++] = LUT[digit];
        value /= base;
      } while (value && (len < PRINTF_NTOA_BUFFER_SIZE));
    }
    else
    {
      // base is assumed to be a power of two here (2, 8 or 16)
      const char* lutUsed = (flags & FLAGS_UPPERCASE) ? LUT : lut;
      // bits per digit: 4 for hex, 1 for binary, 3 for octal
      const unsigned int baseBits = (base == 16U) ? 4U : ((base == 2U) ? 1U : 3U);
      do
      {
        const unsigned int digit = (unsigned int)(value & (base - 1U));
        buf[len++] = lutUsed[digit];
        value >>= baseBits;
      } while (value && (len < PRINTF_NTOA_BUFFER_SIZE));
    }
  }

  return _ntoa_format(out, buffer, idx, maxlen, buf, len, negative, (unsigned int)base, prec, width, flags);
}
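To check that the mask-and-shift path produces the same digits as the general division path, a small standalone sketch may help. It is not part of the library: the names DIGITS, base_shift, digits_pow2 and digits_divmod are hypothetical, and it only handles the uppercase table for brevity. Digits are written least significant first, the same order the _ntoa functions use before formatting.

```c
#include <assert.h>
#include <stddef.h>

/* Digit table shared by all bases up to 16 (uppercase only, for brevity). */
static const char DIGITS[] = "0123456789ABCDEF";

/* Hypothetical helper: bits per digit for a power-of-two base
 * (2 -> 1, 8 -> 3, 16 -> 4). */
static unsigned int base_shift(unsigned long base)
{
    unsigned int shift = 0U;
    while ((1UL << shift) < base) {
        shift++;
    }
    return shift;
}

/* Power-of-two path: mask for the digit, shift for the next value.
 * Writes digits in reverse order and returns the number written. */
static size_t digits_pow2(unsigned long value, unsigned long base, char *buf, size_t cap)
{
    assert(buf != NULL && cap > 0U);
    assert(base >= 2UL && base <= 16UL && (base & (base - 1UL)) == 0UL);
    const unsigned int shift = base_shift(base);
    size_t len = 0U;
    do {
        buf[len++] = DIGITS[value & (base - 1UL)];
        value >>= shift;
    } while (value && (len < cap));
    return len;
}

/* General path with modulo and division, kept for comparison. */
static size_t digits_divmod(unsigned long value, unsigned long base, char *buf, size_t cap)
{
    assert(buf != NULL && cap > 0U);
    assert(base >= 2UL && base <= 16UL);
    size_t len = 0U;
    do {
        buf[len++] = DIGITS[value % base];
        value /= base;
    } while (value && (len < cap));
    return len;
}
```

Calling both functions with the same value and a base of 2, 8 or 16 should fill the two buffers identically. How much faster the first one is depends on the target and compiler, so it is worth measuring on the actual microcontroller.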
from printf.
(I am just a downstream observer, not a participant.)
I recognize those kinds of ARM optimizations, and they sound good. This is great.
However, I'd also like to encourage you to optimize for code size. I can help if you like.
from printf.
Hi @ledvinap, thanks a lot for your detailed investigation and the corresponding results. Amazing!!
I'm on Easter holiday at the moment and will look into all the issues when I return home.
Happy Easter!
from printf.
Hi @ledvinap, I am presuming this means that out_fct_type would be removed and only out_object.fn would remain? This makes sense because fctprintf is the most general overload and all other variants can pass through the same struct, so there should be no need to have two separate function pointers.
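To make the question concrete, here is one possible shape for such a struct. This is only a sketch under my own assumptions: apart from out_object.fn, every name here (buffer_out, callback_out, buffer_put, callback_put, emit_string, count_chars) is hypothetical and not taken from the proposal or the library. The user callback has the (char, void*) shape that fctprintf accepts.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical layout: only the member name fn comes from the discussion. */
typedef struct out_object {
    void (*fn)(char character, struct out_object *self);
} out_object;

/* Buffer-backed output, as an snprintf-style variant might use. */
typedef struct {
    out_object base;   /* first member, so an out_object* can be cast back */
    char *buffer;
    size_t idx;
    size_t maxlen;
} buffer_out;

static void buffer_put(char character, out_object *self)
{
    buffer_out *out = (buffer_out *)self;
    if (out->idx < out->maxlen) {
        out->buffer[out->idx] = character;
    }
    out->idx++;   /* keep counting so the caller can detect truncation */
}

/* Wrapper around a user callback, as an fctprintf-style variant might use. */
typedef struct {
    out_object base;
    void (*user_fn)(char character, void *arg);
    void *arg;
} callback_out;

static void callback_put(char character, out_object *self)
{
    callback_out *out = (callback_out *)self;
    out->user_fn(character, out->arg);
}

/* Example user callback: counts the characters it receives. */
static void count_chars(char character, void *arg)
{
    (void)character;
    (*(size_t *)arg)++;
}

/* Formatting code only ever sees out_object and its single function pointer. */
static void emit_string(out_object *out, const char *s)
{
    assert(out != NULL && out->fn != NULL);
    while (*s) {
        out->fn(*s++, out);
    }
}
```

With this layout the formatting code needs one indirect call per character regardless of the destination, and the buffer and callback variants differ only in how they set up their struct before calling it.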
from printf.
That would actually be a good idea for the base-10 branch as well. I would perhaps explicitly write value % 10 instead of value % base, although I believe most optimizing compilers will understand that base is 10 at that point.
But gcc and clang will use multiplication and shifting instead of division whenever the divisor is an integer constant known at compile time (example here, the latter calls __aeabi_idivmod).
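For readers who have not seen the multiply-and-shift trick, here is a hand-written sketch of what such a transformation amounts to for 32-bit unsigned values. The names div10 and digits_base10 are hypothetical, and the exact instruction sequence a compiler emits depends on the target and flags. The constant 0xCCCCCCCD is 2^35 / 10 rounded up, which is known to give the exact quotient for every uint32_t input.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* value / 10 without a divide instruction: multiply by ceil(2^35 / 10)
 * in 64 bits and keep the bits above position 35. */
static uint32_t div10(uint32_t value)
{
    return (uint32_t)(((uint64_t)value * 0xCCCCCCCDULL) >> 35);
}

/* Decimal digits in reverse order (least significant first).
 * The remainder is derived from the quotient, so no modulo is needed. */
static size_t digits_base10(uint32_t value, char *buf, size_t cap)
{
    assert(buf != NULL && cap > 0U);
    size_t len = 0U;
    do {
        const uint32_t quotient = div10(value);
        buf[len++] = (char)('0' + (value - quotient * 10U));
        value = quotient;
    } while (value && (len < cap));
    return len;
}
```

In practice there is no need to write this by hand. Writing value % 10 and value / 10 with a literal divisor lets the compiler pick the sequence that suits the target, whereas a divisor held in a variable generally forces a real division or a runtime library call.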
from printf.
@ledvinap : Hello Petr,
I think these different suggestions should be split into separate issues; and for each issue, a PR. I realize this repository hasn't seen much traction, but still.
from printf.
@legier: See also issue #116.
from printf.