Giter Club home page Giter Club logo

Comments (9)

kfitch avatar kfitch commented on August 17, 2024

I would argue this is a case where the documentation should be changed, and M_NORMALIZE_RADIANS should be cleaned/fixed. Having a function called sin(or cos) that has a different input range than the normal mathematical function is just asking for trouble. I violates the "principle of least surprise" and is likely to lead to problems for users of the library.

Having a faster version of sin(or cos) that has a distinct name to be used in those cases where the input is known to be constrained might be a good idea, If we did have such a function I think either +/-Pi or +/-2Pi would be more reasonable restrictions than 0:2Pi. Off the top of my head, here are some possible names:

  • fastsin
  • sin_restricted
  • sin_PiOrLess
  • sin_???
  • sin_profit

P.S.

Should M_NORMALIZE_RADIANS a part of the public API of PAL? Currently it lives in one of the public headers and as a result could be used by users of PAL. I suspect M_NORMALIZE_RADIANS is something that should be internal. If you look at the implementations of trig function in the msun code it takes a different approach to "normalizing." We probably we want the flexibility to tweak this macro in the future for performance reasons without risking breaking user code.

from pal.

lfochamon avatar lfochamon commented on August 17, 2024

Given the whole idea of the library, wouldn't it make sense to define a range---be it [0,2pi) or [-pi,pi)---and leave the generation/normalization of the range to the user. In other words, just p_sin and p_cos be the "fast" functions. That would seem to be more in line with the "no belt/suspender" and GIGO ideas. Specially given the benchmarks @kfitch reported in #117. Otherwise, +1 for making M_NORMALIZE_RADIANS internal.

from pal.

aolofsson avatar aolofsson commented on August 17, 2024

@kfitch Yes, agree that we may be breaking one principle here for the sake of another. There are many embedded libraries that have used this before. I will add links to them in the README.md to clarify. We seem to be having a culture clash here between resource embedded and desktop/server developer viewpoints?:-)

from pal.

mansourmoufid avatar mansourmoufid commented on August 17, 2024

I agree that the periodic trigonometric functions should assume their argument is in the correct range. But in this case it would be nice to provide an fmod function which developers can use to ensure that it is.

For example:

p_fmod_f32(x, n, 2.f * M_PI); // optional
p_cos_f32(x, y, n);

from pal.

kfitch avatar kfitch commented on August 17, 2024

@eliteraspberries That is definitely one reasonable approach, but it brings up a few more questions:

For the case where the library user does not know for sure the range of his inputs and would use the example code you presented, could we do better performance wise? If the "fmod" operation was inside the cos function then we would have a chance to have better cache usage (or even avoid the cache by using registers). Also, the compiler might have more wiggle room to schedule things when unrolling loops and looping to fill in potential pipeline holes. And finally there are potential efficiencies to be had when combining the fmod with the cos (or other trig function). Look at the implementation under the msun directory. They actually wrap into +/-pi/2 (instead of +/-2pi), and keep track of the multiple of pi/2 needed. This lets them use various symmetries when evaluating the function. By limiting to a range closer to 0 it might be possible to use fewer terms in the Taylor series expansion approximation, which again might improve performance. Of course, the proof is in the pudding. The only way to know how much impact these potential optimizations would have would be to try it out. And, probably try it out on a few different vector lengths on a few different platforms.

For the case where the library user does know the input range to be limited to +/- 2Pi we can be very fast.

I would argue that having two versions of each trig function would be the ideal case; one with the full input range, and another limited to +/- 2Pi. This lets the library user express their intent/expectations clearly. Of course a larger API means a larger maintenance burden.

from pal.

mansourmoufid avatar mansourmoufid commented on August 17, 2024

@kfitch That sounds like it would please everyone, with a little more work.

In that case, I would humbly suggest to implement a static inline version of the fmod function in a header file, so it could be used inline by several functions. So the compiler does all the work.

Something like this for example:

fmod.h:

static inline void _fmod(const float *a, float *c, size_t n, float x)
{
    size_t i;
    for (i = 0; i < n; i++) {
        ...
    }
}
static inline void _fmod_2pi(const float *a, float *c, size_t n)
{
    _fmod(a, c, n, 2.f * M_PI);
}
void fmod(const float *, float *, size_t, float);
void fmod_2pi(const float *, float *, size_t);

fmod.c:

#include "fmod.h"
void fmod(const float *a, float *c, size_t n, float x)
{
    _fmod(a, c, n, x);
}
void fmod_2pi(const float *a, float *c, size_t n)
{
    _fmod_2pi(a, c, n);
}

cos.c:

#include "fmod.h"
static inline _cos(const float *a, float *c, size_t n)
{
    size_t i;
    for (i = 0; i < n; i++) {
        ...
    }
}
void cos(const float *a, float *c, size_t n)
{
    _cos(a, c, n);
}
void cos_norm(const float *a, float *c, size_t n)
{
    _fmod_2pi(a, c, n);
    _cos(a, c, n);
}

The compiler would inline _fmod and _cos into the cos_norm function, perhaps combine the two for loops, and do whatever instruction scheduling.

The library would end up being slightly bigger though. If everyone likes the idea, I can work on it tomorrow. In the end, I'm happy with whatever the project decides.

from pal.

mateunho avatar mateunho commented on August 17, 2024

@eliteraspberries 👍 LGTM

from pal.

lfochamon avatar lfochamon commented on August 17, 2024

Is implementing the fmods in a header file a good idea (it's effectively what's happening)? Also, if you're providing cos_norm, do you need a public fmod_2pi (@kfitch raised that issue earlier)? Some implementations of trigonometric functions might use different ranges in the future.

from pal.

mansourmoufid avatar mansourmoufid commented on August 17, 2024

I think there are two separate issues here. First is the unit testing data. It should have input which is bound to [0, 2pi]. That's easy to fix, just generate new data (gold).

The second is that the implementations may not properly handle input on [0, 2pi] -- perhaps only on [-pi, pi] or [0, pi/2]. This issue has to be handled by the individual function implementation. For example the tangent implementation uses a core algorithm for input in [0, pi/4] and then uses identities and symmetries to extend that to [0, 2pi].

from pal.

Related Issues (20)

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. 📊📈🎉

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google ❤️ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.