
Comments (8)

legier commented on May 25, 2024

I have another optimization proposal.
In _ntoa_long and _ntoa_long_long you can take a different processing path depending on the base of the number. If the base is a power of 2, you can replace the division and modulo operations with a simple bitwise right shift and AND operation. This should improve performance considerably on most microcontrollers, and even more on those without hardware division support.
Another optimization (in those functions) is to use a lookup table to decide whether the output should be a digit or a letter. It replaces the if statement and all the following calculations with a single memory access.
I propose something like this:

const char LUT[] = "0123456789ABCDEF";
const char lut[] = "0123456789abcdef";
// internal itoa for 'long' type
static size_t _ntoa_long(out_fct_type out, char* buffer, size_t idx, size_t maxlen, unsigned long value, bool negative, unsigned long base, unsigned int prec, unsigned int width, unsigned int flags)
{
  char buf[PRINTF_NTOA_BUFFER_SIZE];
  size_t len = 0U;

  // no hash for 0 values
  if (!value) 
  {
    flags &= ~FLAGS_HASH;
  }

  // write if precision != 0 and value is != 0
  if (!(flags & FLAGS_PRECISION) || value) 
  {
    if(base == 10)
    {
      do
      {
        const char digit = (char) (value % base);
        buf[len++] = LUT[digit];
        value /= base;
      } while(value && (len < PRINTF_NTOA_BUFFER_SIZE));
    }
    else
    {
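      // assumes base is 2, 8 or 16 here, so mask and shift can replace % and /
      const char* lutUsed = (flags & FLAGS_UPPERCASE) ? LUT : lut;
      // bits per digit: 4 for hex, 1 for binary, 3 for octal
      unsigned long baseBits = (base == 16 ? 4 : (base == 2 ? 1 : 3));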
      do
      {
        const char digit = (char) (value & (base - 1));
        buf[len++] = lutUsed[digit];
        value >>= baseBits;
      } while(value && (len < PRINTF_NTOA_BUFFER_SIZE));
    }
  }

  return _ntoa_format(out, buffer, idx, maxlen, buf, len, negative, (unsigned int)base, prec, width, flags);
}
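
The shift-and-mask idea can also be tried in isolation. The following is a minimal sketch, not code from the library: `utoa_pow2` and its parameters are hypothetical names, and it assumes the only power-of-2 bases of interest are 2, 8 and 16 (1, 3 or 4 bits per digit).

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

/* Hypothetical helper, not part of printf: writes `value` in base 2^bits
 * (bits = 1, 3 or 4) into `out` as a NUL-terminated string.
 * Returns the number of digits written, or 0 if `out` is too small. */
static size_t utoa_pow2(unsigned long value, unsigned bits, int uppercase,
                        char *out, size_t outlen)
{
  static const char upper[] = "0123456789ABCDEF";
  static const char lower[] = "0123456789abcdef";
  const char *digits = uppercase ? upper : lower;
  const unsigned long mask = (1UL << bits) - 1UL;
  char tmp[sizeof(unsigned long) * CHAR_BIT]; /* worst case: base 2 */
  size_t len = 0;
  size_t i;

  assert(bits == 1 || bits == 3 || bits == 4);

  /* least significant digit first: the mask replaces %, the shift replaces / */
  do {
    tmp[len++] = digits[value & mask];
    value >>= bits;
  } while (value);

  if (len + 1 > outlen) {
    return 0; /* no room for the digits plus the terminator */
  }

  /* reverse into the caller's buffer */
  for (i = 0; i < len; i++) {
    out[i] = tmp[len - 1 - i];
  }
  out[len] = '\0';
  return len;
}
```

Calling `utoa_pow2(255UL, 4, 1, buf, sizeof buf)` fills `buf` with the hexadecimal digits of 255 without executing a single division; the lookup table also removes the digit-versus-letter branch.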

from printf.


 avatar commented on May 25, 2024

(I am just a downstream observer, not a participant.)

I recognize those kinds of ARM optimizations, and they sound good. This is great.

However, I'd also like to encourage you to optimize for code size as well. And I can help if you like.


mpaland commented on May 25, 2024

Hi @ledvinap, thanks a lot for your detailed investigation and the corresponding results. Amazing!!
I'm on Easter holiday at the moment and will have a look at all the issues when I return home.
Happy Easter!


vgrudenic commented on May 25, 2024

Hi @ledvinap, I am presuming this means that out_fct_type would be removed and only out_object.fn would remain? This makes sense because fctprintf is the most general overload and all other variants can pass through the same struct, so there should be no need to have two separate function pointers.
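
To make the "one struct, one function pointer" idea concrete, here is a rough sketch. Only the names `out_object` and `fn` come from this thread; the field layout, the two sink functions, `emit` and `count_chars` are hypothetical and are not the proposed implementation.

```c
#include <stddef.h>

/* Hypothetical layout: every output variant goes through the same struct. */
typedef struct out_object {
  void (*fn)(char c, struct out_object *self); /* the single output hook */
  char  *buffer;                      /* used by the buffer sink */
  size_t idx;                         /* characters produced so far */
  size_t maxlen;                      /* capacity of buffer */
  void (*user_fn)(char c, void *arg); /* used by the callback sink */
  void  *user_arg;
} out_object;

/* sprintf-style sink: store the character if it still fits */
static void out_to_buffer(char c, out_object *self)
{
  if (self->idx < self->maxlen) {
    self->buffer[self->idx] = c;
  }
  self->idx++;
}

/* fctprintf-style sink: forward the character to a user callback */
static void out_to_callback(char c, out_object *self)
{
  self->user_fn(c, self->user_arg);
  self->idx++;
}

/* Stand-in for the formatting core: it only ever sees out->fn */
static void emit(out_object *out, const char *s)
{
  while (*s) {
    out->fn(*s++, out);
  }
}

/* Example user callback: counts the characters it receives */
static void count_chars(char c, void *arg)
{
  (void)c;
  ++*(size_t *)arg;
}
```

With this shape the formatting core needs no second function pointer type: a buffer-backed call sets `fn = out_to_buffer`, and a callback-backed call sets `fn = out_to_callback` together with `user_fn` and `user_arg`.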


vgrudenic commented on May 25, 2024

That would actually be a good idea for the base-10 branch too. I would perhaps explicitly write value % 10 instead of value % base, although I believe most optimizing compilers will understand that base is 10 at that point.

But gcc and clang will use multiplication and shifting instead of division whenever the divisor is a compile-time constant (example here, the latter calls __aeabi_idivmod).
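
As an illustration of that transformation, the sketch below contrasts a literal divisor with a run-time one and spells out the multiply-and-shift form by hand for 32-bit operands. The function names are hypothetical, and the instructions a given compiler actually emits depend on the target and optimization level.

```c
#include <stdint.h>

/* Literal divisor: an optimizing compiler is free to turn this into a
 * multiplication and a shift, so no division routine is needed. */
static uint32_t div10_const(uint32_t n)
{
  return n / 10U;
}

/* Run-time divisor: without knowing `base`, the compiler has to emit a
 * real division (a library call on cores without a divide instruction). */
static uint32_t div_runtime(uint32_t n, uint32_t base)
{
  return n / base;
}

/* The same transformation written by hand for 32-bit operands:
 * 0xCCCCCCCD is ceil(2^35 / 10), and (n * 0xCCCCCCCD) >> 35 equals
 * n / 10 for every 32-bit n. */
static uint32_t div10_mulshift(uint32_t n)
{
  return (uint32_t)(((uint64_t)n * 0xCCCCCCCDULL) >> 35);
}
```

Writing `value % 10` and `value /= 10` in the base-10 branch gives the compiler the literal form directly, rather than relying on it to prove that base is 10 inside that branch.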


eyalroz commented on May 25, 2024

@ledvinap : Hello Petr,

I think these different suggestions should be split into separate issues; and for each issue, a PR. I realize this repository hasn't seen much traction, but still.


eyalroz commented on May 25, 2024

@legier : See also issue #116 .
