
A wide linear algebra crate for games and graphics.

Home Page: https://crates.io/crates/ultraviolet


ultraviolet's Introduction

`ultraviolet`

This is a crate for computer-graphics and games-related linear and geometric algebra, but fast, both in terms of productivity and in terms of runtime performance.

In terms of productivity, ultraviolet uses no generics and is designed to be as straightforward of an interface as possible, resulting in fast compilation times and clear code. In addition, the lack of generics and Rust type-system "hacks" result in clear and concise errors that are easy to parse and fix for the user.

In terms of runtime performance, ultraviolet was designed from the start with performance in mind. To do so, we provide two separate kinds of each type, each with nearly identical functionality, one with usual scalar f32 values, and the other a 'wide' type which uses SIMD f32x4 vectors for each value. This design is clear and explicit in intent, and it also allows code to take full advantage of SIMD.

The 'wide' types use an "SoA" (Structure of Arrays) architecture such that each wide data structure actually contains the data for 4 or 8 of its associated data type and will do any operation on all of the SIMD 'lanes' at the same time. For example, a Vec3x8 is equivalent to 8 Vec3s all bundled together into one data structure.

Doing this is potentially much (factor of 10) faster than a standard "AoS" (Array of Structs) layout, though it does depend on your workload and algorithm requirements. Algorithms must be carefully architected to take full advantage of this, and doing so can be easier said than done, especially if your algorithm involves significant branching.
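To make the layout difference concrete, here is a minimal sketch using hypothetical stand-in types built on plain arrays. These are not ultraviolet's definitions (the real wide types are backed by SIMD registers); the names `from_lanes` and `dot` on this stand-in are assumptions for illustration only.

```rust
// Hypothetical stand-in types for illustration; NOT ultraviolet's definitions.

/// Scalar 3D vector: one x, one y, one z.
#[derive(Clone, Copy, Debug, PartialEq)]
struct Vec3 {
    x: f32,
    y: f32,
    z: f32,
}

/// "Wide" 3D vector: four x values, four y values, four z values.
/// Lane `i` of the three arrays together forms one logical Vec3.
#[derive(Clone, Copy, Debug, PartialEq)]
struct Vec3x4 {
    x: [f32; 4],
    y: [f32; 4],
    z: [f32; 4],
}

impl Vec3x4 {
    /// Pack four scalar vectors into one wide vector (AoS -> SoA).
    fn from_lanes(v: [Vec3; 4]) -> Self {
        Vec3x4 {
            x: [v[0].x, v[1].x, v[2].x, v[3].x],
            y: [v[0].y, v[1].y, v[2].y, v[3].y],
            z: [v[0].z, v[1].z, v[2].z, v[3].z],
        }
    }

    /// Lane-wise dot product: four independent dot products side by side.
    fn dot(self, other: Self) -> [f32; 4] {
        let mut out = [0.0; 4];
        for i in 0..4 {
            out[i] = self.x[i] * other.x[i]
                + self.y[i] * other.y[i]
                + self.z[i] * other.z[i];
        }
        out
    }
}
```

Because every lane runs the same arithmetic on its own data, a real SIMD-backed type can replace each inner loop with a handful of vector instructions.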

ultraviolet was the first Rust math library to be designed in this "AoSoA" manner, though nalgebra now supports it for several of their data structures as well.

Benchmarks

See mathbench-rs for latest benchmarks (may not be fully up-to-date with git master).

Cargo Features

To help further improve build times, ultraviolet puts various functionality under feature flags. For example, the 2d and 3d projective geometric algebras as well as f64 and integer types are disabled by default. In order to enable them, enable the corresponding crate feature flags in your Cargo.toml. For example:

[dependencies]
ultraviolet = { version = "0.9", features = [ "f64", "int" ] }

Will enable the f64 and int features. Here's a list of the available features:

f64 – Enable 64-bit wide floating point (f64) support. The naming convention is D[Type]; for example, DVec3x4 is a collection of 4 3d vectors with f64 precision each.
int – Enable integer vector types.
bytemuck – Enable casting of many types to byte arrays, for use with graphics APIs.
mint – Enable interoperation with other math crates through the mint interface.
num-traits – Enable identity traits for interoperation with other math crates.
serde – Enable Serialize and Deserialize implementations for many scalar types.

Crate Features

This crate is currently being dogfooded in my ray tracer rayn, and is being used by various independent Rust game developers for various projects. It does what those users have currently needed it to do.

There are a couple relatively unique/novel features in this library, the most important being the use of the Geometric Algebra.

Instead of implementing complex number algebra (for 2d rotations) and Quaternion algebra (for 3d rotations), we use Rotors, a concept taken from Geometric Algebra, to represent 2d and 3d rotations.

What this means for the programmer is that you will be using the Rotor3 type in place of a Quaternion, though you can expect it to do basically all the same things that a Quaternion does. In fact, Quaternions are directly isomorphic to Rotors (meaning they are in essence the same thing, just formulated differently). The reason this decision was made was twofold: first, the derivation of the math is actually quite simple to understand. All the derivations for the code implemented in the Rotor structs in this library are written out in the derivations folder of the GitHub repo; I derived them manually as part of the implementation.

On the other hand, Quaternions are often basically just seen as black boxes that we programmers use to do rotations because they have some nice properties, but that we don't really understand. You can use Rotors this same way, but you can also easily understand them. Second is that in some sense they can be seen as 'more correct' than Quaternions. Specifically, they facilitate a more proper understanding of rotation as being something that occurs within a plane rather than something that occurs around an axis, as is generally thought. Finally, Rotors also generalize to 4 and even higher dimensions, and if someone wants to they could implement a Rotor4 which retains all the properties of a Rotor3/Quaternion but does rotation in 4 dimensions instead, something which simply is not possible to do with Quaternions.

If it's missing something you need it to do, bug me on the GitHub issue tracker and/or Rust community discord server (I'm Fusha there) and I'll try to add it for you, if I believe it fits with the vision of the lib :)

Examples

Euler Integration

Euler Integration is a method for numerically solving ordinary differential equations. If that sounds complicated, don't worry! The method is common in games, but its details don't matter unless you are implementing some kind of physics simulation. Keep reading for the code below!

The point is that if you are doing the same basic math operations on multiple floating point values with no conditionals (no ifs), porting to wide data types and parallel processing is quite simple.

Here is the scalar example of Euler Integration:

fn integrate(
    pos: &mut [uv::Vec3],
    vel: &mut [uv::Vec3],
    acc: &[uv::Vec3],
    dt: f32,
) {
    for ((position, velocity), acceleration) in pos.iter_mut().zip(vel).zip(acc) {
        *velocity = *velocity + *acceleration * dt;
        *position = *position + *velocity * dt;
    }
}

The code loops over each set of corresponding position, velocity, and acceleration vectors. It first adjusts the velocity by the acceleration scaled by the amount of time that has passed and then adjusts the position by the velocity scaled by the amount of time that has passed.

These are all multiplication, addition, and assignment operators that need to be applied in the same way to all of the variables in question.

To port this function to wide data types and parallel processing, all we have to do is change the data types and we're done! The new function looks like this:

fn integrate_x8(
    pos: &mut [uv::Vec3x8],
    vel: &mut [uv::Vec3x8],
    acc: &[uv::Vec3x8],
    dt: uv::f32x8,
) {
    for ((position, velocity), acceleration) in pos.iter_mut().zip(vel).zip(acc) {
        *velocity = *velocity + *acceleration * dt;
        *position = *position + *velocity * dt;
    }
}

This function now processes 8 sets of vectors in parallel and brings significant speed gains!

The only caveat is that the calling code that creates the slices of vectors needs to be modified to populate these wide data types with 8 sets of values instead of just one. The scalar code for that looks like this:

let mut pos: Vec<uv::Vec3> = Vec::with_capacity(100);
let mut vel: Vec<uv::Vec3> = Vec::with_capacity(100);
let mut acc: Vec<uv::Vec3> = Vec::with_capacity(100);

// You would probably write these constant values in-line but
// they are here for illustrative purposes
let pos_x = 1.0f32;
let pos_y = 2.0f32;
let pos_z = 3.0f32;

let vel_x = 4.0f32;
let vel_y = 5.0f32;
let vel_z = 6.0f32;

let acc_x = 7.0f32;
let acc_y = 8.0f32;
let acc_z = 9.0f32;

for _ in 0..100 {
    pos.push(uv::Vec3::new(pos_x, pos_y, pos_z));
    vel.push(uv::Vec3::new(vel_x, vel_y, vel_z));
    acc.push(uv::Vec3::new(acc_x, acc_y, acc_z));
}

Whereas to populate the same for the 8-lane wide Vec3x8 data type, the code could look like this:

let mut pos: Vec<uv::Vec3x8> = Vec::with_capacity(100 / 8 + 1);
let mut vel: Vec<uv::Vec3x8> = Vec::with_capacity(100 / 8 + 1);
let mut acc: Vec<uv::Vec3x8> = Vec::with_capacity(100 / 8 + 1);

let pos_x = uv::f32x8::splat(1.0f32);
let pos_y = uv::f32x8::splat(2.0f32);
let pos_z = uv::f32x8::splat(3.0f32);

let vel_x = uv::f32x8::splat(4.0f32);
let vel_y = uv::f32x8::splat(5.0f32);
let vel_z = uv::f32x8::splat(6.0f32);

let acc_x = uv::f32x8::splat(7.0f32);
let acc_y = uv::f32x8::splat(8.0f32);
let acc_z = uv::f32x8::splat(9.0f32);

for _ in 0..(100 / 8 + 1) {
    pos.push(uv::Vec3x8::new(pos_x, pos_y, pos_z));
    vel.push(uv::Vec3x8::new(vel_x, vel_y, vel_z));
    acc.push(uv::Vec3x8::new(acc_x, acc_y, acc_z));
}

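Note that 100 / 8 in maths terms would be 12.5, but we can't conveniently have a half-sized Vec3x8. Integer division truncates this to 12, so the code adds 1 to make room for the final, partially filled vector.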

There are various ways to handle these 'remainder' vectors. You could fall back to scalar code, or progressively fall back to narrower wide types, such as Vec3x4, or you can just consider whether the cost of calculating a few additional vectors that you won't use is worth adding complexity to your code.
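As one possible way to handle the remainder, the sketch below pads the last group with a filler value so that every group is full. It works on plain `[f32; 8]` arrays as a stand-in for wide types; `wide_len` and `pack_padded` are hypothetical helper names, not part of ultraviolet.

```rust
const LANES: usize = 8;

/// Number of 8-lane groups needed to hold `n` scalar items (ceiling division).
fn wide_len(n: usize) -> usize {
    (n + LANES - 1) / LANES
}

/// Pack scalar values into 8-lane groups. Unused lanes in the final group
/// are filled with `pad`, so their results can simply be ignored later.
fn pack_padded(values: &[f32], pad: f32) -> Vec<[f32; LANES]> {
    let mut out = Vec::with_capacity(wide_len(values.len()));
    for chunk in values.chunks(LANES) {
        let mut lanes = [pad; LANES];
        lanes[..chunk.len()].copy_from_slice(chunk);
        out.push(lanes);
    }
    out
}
```

For 100 items this gives 13 groups, the same as `100 / 8 + 1`. The ceiling division differs only when the count is an exact multiple of 8 (96 items need 12 groups, not 13).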

Ray-Sphere Intersection

Scalar code that operates on a single value at a time needs some restructuring to take advantage of SIMD and the 4-/8-wide data types.

Below is an example of scalar ray-sphere intersection code using Vec3 for points and vectors:

fn ray_sphere_intersect(
    ray_o: uv::Vec3,
    ray_d: uv::Vec3,
    sphere_o: uv::Vec3,
    sphere_r_sq: f32,
) -> f32 {
    let oc = ray_o - sphere_o;
    let b = oc.dot(ray_d);
    let c = oc.mag_sq() - sphere_r_sq;
    let descrim = b * b - c;

    if descrim > 0.0 {
        let desc_sqrt = descrim.sqrt();

        let t1 = -b - desc_sqrt;
        if t1 > 0.0 {
            t1
        } else {
            let t2 = -b + desc_sqrt;
            if t2 > 0.0 {
                t2
            } else {
                f32::MAX
            }
        }
    } else {
        f32::MAX
    }
}

This porting guide will not discuss the details of the algorithm, but will focus on how to convert the code to apply parallel SIMD operations on wide data types.

The first thing to do is to convert the parameter and return types from scalar Vec3 to wide Vec3x8 and f32x8:

fn ray_sphere_intersect_x8(
    ray_o: uv::Vec3x8,
    ray_d: uv::Vec3x8,
    sphere_o: uv::Vec3x8,
    sphere_r_sq: uv::f32x8,
) -> uv::f32x8 {

Each call to the function will process 8 ray-sphere intersections in parallel. The first four lines of the function remain the same:

    let oc = ray_o - sphere_o;
    let b = oc.dot(ray_d);
    let c = oc.mag_sq() - sphere_r_sq;
    let descrim = b * b - c;

Despite this code being the same, the calculations for 8 rays and spheres will be carried out at the same time!

The next line of the scalar code tests the value of descrim to see if it is greater than 0.0. When operating on 8 values at a time, the code cannot branch along two separate paths because the value of descrim for each of the 8 values may cause branching to different sets of operations. To support this we would need to convert back to scalar code and then we lose all the performance benefits of parallel processing.

So, how do we convert this? We have a tradeoff to consider depending on the frequency of divergence, that is, depending on how often the branch will follow one or the other path. If it is very likely for the given data and algorithm that the majority of branches will take one path, we can check whether all lanes take that path and then branch based on that. Such a bias toward one branch path is relatively rare, and in the case of this algorithm it is common to branch either way so this approach would produce slower code.
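For reference, the "check whether all lanes agree" approach can be sketched as follows. The mask is modelled here as a plain `[bool; 8]`, which is an assumption for illustration and not ultraviolet's mask type; `gt_zero`, `none` and `sqrt_if_any_hit` are hypothetical helper names.

```rust
/// Hypothetical stand-in for an 8-lane comparison mask.
type Mask8 = [bool; 8];

/// Compare each lane against zero.
fn gt_zero(v: [f32; 8]) -> Mask8 {
    let mut m = [false; 8];
    for i in 0..8 {
        m[i] = v[i] > 0.0;
    }
    m
}

/// True when no lane of the mask is set.
fn none(m: Mask8) -> bool {
    !m.iter().any(|&b| b)
}

/// Returns `None` when every lane fails the test, so the caller can skip
/// all remaining work for this group of 8. Otherwise the square root is
/// computed for all lanes, including those that failed (they become NaN).
fn sqrt_if_any_hit(descrim: [f32; 8]) -> Option<[f32; 8]> {
    if none(gt_zero(descrim)) {
        return None;
    }
    let mut out = [0.0; 8];
    for i in 0..8 {
        out[i] = descrim[i].sqrt();
    }
    Some(out)
}
```

The early exit only pays off when whole groups frequently agree; with mixed groups the check is pure overhead.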

Another approach is to calculate the results for both branches for all 8 lanes, and then filter the results with masks that select the correct values from the possibilities at the end.

To create the mask by comparing each of the 8 lanes of descrim with 0.0:

    let desc_pos = descrim.cmp_gt(uv::f32x8::splat(0.0));

In the true case of the original scalar version, we then have more arithmetic operations that end up looking the exact same when we do them on the vectorized version:

    let desc_sqrt = descrim.sqrt();

    let t1 = -b - desc_sqrt;

And now in the scalar code we have another branch based on t1 > 0.0, so we apply the same technique, with a little bit extra:

    let t1_valid = t1.cmp_gt(uv::f32x8::splat(0.0)) & desc_pos;

The & desc_pos at the end performs a bitwise AND that combines two masks: one saying whether t1 > 0.0 holds for each lane, and one saying whether descrim > 0.0 holds for each lane. If both are true for a lane, the value for that lane in t1_valid is true; otherwise it is false. This combines the nested logic of the scalar code.

The true case of the t1 > 0.0 condition just returns t1, but the false case has some more calculation and branching that can be ported in a similar way:

    let t2 = -b + desc_sqrt;
    let t2_valid = t2.cmp_gt(uv::f32x8::splat(0.0)) & desc_pos;

This may sound like it could be slower than scalar code because this algorithm being applied to wide data types is doing all the calculations for both branches regardless of which is true, and you would be right!

This approach is indeed a tradeoff and depends on the likelihood of branching one way or the other, and the cost of calculation of the branches. However, even with an algorithm that is particularly branch-heavy like the ray-sphere intersection we're analyzing here, in practice, the benefits of being able to calculate multiple pieces of data simultaneously often result in a net win! As with all optimization, measurement tells the truth.

At this point, we have ported almost the entire algorithm. We have values for t1 and t2 for each of the 8 lanes. We have mask values in t1_valid that indicate whether both descrim > 0.0 && t1 > 0.0 for each lane. And we have t2_valid with values indicating exactly descrim > 0.0 && t2 > 0.0. When the scalar code does not return t1 or t2, it returns f32::MAX. How do we now select the correct return value for each of the lanes?

ultraviolet has a blend function on the mask types that uses the true or false values for each of the lanes to select from the calculated values for the true and false cases. So if a were a wide vector of values that would be calculated in the true case of a branch, and b were for the false case, with a mask m we could select from a and b based on m by calling m.blend(a, b) and the result would be the desired output values!
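The selection semantics described above can be written out on plain arrays. This is only a stand-in that mirrors the behaviour of `m.blend(a, b)` as the text describes it; it is not how the library implements it.

```rust
/// Lane-wise select: take `a[i]` where the mask is true, otherwise `b[i]`.
/// Stand-in for `m.blend(a, b)` on plain arrays.
fn blend(mask: [bool; 8], a: [f32; 8], b: [f32; 8]) -> [f32; 8] {
    let mut out = b;
    for i in 0..8 {
        if mask[i] {
            out[i] = a[i];
        }
    }
    out
}
```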

Let's try to apply that to the scalar code by looking just at its logical control flow:

    if descrim > 0.0 {
        if t1 > 0.0 {
            t1
        } else {
            if t2 > 0.0 {
                t2
            } else {
                f32::MAX
            }
        }
    } else {
        f32::MAX
    }

So if we take the outer-most if condition...

    let t = t1_valid.blend(t1, ???);

What is the value for the false case of the descrim > 0.0 && t1 > 0.0 test? There are two possibilities: either descrim <= 0.0, which is the false case of the descrim > 0.0 condition, or descrim > 0.0 && t1 <= 0.0, which is the else case where we handle t2. This looks complicated. Let's try looking at the descrim > 0.0 && t2 > 0.0 case in the scalar code and try blending that:

    let t = t2_valid.blend(t2, uv::f32x8::splat(std::f32::MAX));

So descrim > 0.0 && t2 > 0.0 has two false cases: either descrim <= 0.0 and we want to return f32::MAX, or descrim > 0.0 && t2 <= 0.0 and we want to return f32::MAX. This blend therefore covers the false case of the scalar descrim > 0.0 condition and the false case of the t1 > 0.0 condition. That leaves only the true case of the t1 > 0.0 condition to resolve...

And that is exactly what t1_valid.blend(t1, ???) would select! So we can combine the two blends like this:

    let t = t2_valid.blend(t2, uv::f32x8::splat(std::f32::MAX));
    let t = t1_valid.blend(t1, t);

t now contains t1, t2 or f32::MAX as appropriate for each of the lanes! We have completed the port of the scalar algorithm code to leverage SIMD operations on 8-lane wide data types to calculate 8 ray-sphere intersections in parallel!

Below is the full example of the same ray-sphere intersection algorithm implemented using the wide Vec3x8 type:

fn ray_sphere_intersect_x8(
    ray_o: uv::Vec3x8,
    ray_d: uv::Vec3x8,
    sphere_o: uv::Vec3x8,
    sphere_r_sq: uv::f32x8,
) -> uv::f32x8 {
    let oc = ray_o - sphere_o;
    let b = oc.dot(ray_d);
    let c = oc.mag_sq() - sphere_r_sq;
    let descrim = b * b - c;

    let desc_pos = descrim.cmp_gt(uv::f32x8::splat(0.0));

    let desc_sqrt = descrim.sqrt();

    let t1 = -b - desc_sqrt;
    let t1_valid = t1.cmp_gt(uv::f32x8::splat(0.0)) & desc_pos;

    let t2 = -b + desc_sqrt;
    let t2_valid = t2.cmp_gt(uv::f32x8::splat(0.0)) & desc_pos;

    let t = t2_valid.blend(t2, uv::f32x8::splat(std::f32::MAX));
    let t = t1_valid.blend(t1, t);

    t
}
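As a sanity check on the order of the two blends, the sketch below compares the nested scalar branching with the "compute everything, then select" formulation for a single lane. It uses plain `f32` and takes `b` and `c` directly, so it does not depend on ultraviolet; the function names are hypothetical.

```rust
/// Scalar reference: the nested branching from the scalar version.
fn select_branching(b: f32, c: f32) -> f32 {
    let descrim = b * b - c;
    if descrim > 0.0 {
        let desc_sqrt = descrim.sqrt();
        let t1 = -b - desc_sqrt;
        if t1 > 0.0 {
            t1
        } else {
            let t2 = -b + desc_sqrt;
            if t2 > 0.0 { t2 } else { f32::MAX }
        }
    } else {
        f32::MAX
    }
}

/// Select-style version for one lane: every value is computed first,
/// then the masks pick the result in the same order as the two blends.
fn select_masked(b: f32, c: f32) -> f32 {
    let descrim = b * b - c;
    let desc_pos = descrim > 0.0;
    // NaN when descrim < 0.0; any comparison with NaN is false, and the
    // `& desc_pos` below masks the lane out regardless.
    let desc_sqrt = descrim.sqrt();
    let t1 = -b - desc_sqrt;
    let t1_valid = (t1 > 0.0) & desc_pos;
    let t2 = -b + desc_sqrt;
    let t2_valid = (t2 > 0.0) & desc_pos;
    let t = if t2_valid { t2 } else { f32::MAX }; // t2_valid.blend(t2, MAX)
    if t1_valid { t1 } else { t } // t1_valid.blend(t1, t)
}
```

Note that the square root is taken for lanes with a negative discriminant too. In scalar Rust this yields NaN rather than a panic, and the masks discard those lanes.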

ultraviolet's Issues

Impl PartialEq for Vec3x4 and other SIMD vectors

I noticed that PartialEq is only implemented for Vec3 and f32x4, etc. but not for Vec3x4.
It would make using them in macros and tests much easier.

error[E0369]: binary operation `==` cannot be applied to type `Vec3x4`

`Similarity{N}::into_homogeneous_matrix` ignores scale

This function constructs matrix from translation component and multiplies by matrix made of rotation component. I guess scale component should be used as well.

Union of Aabb

Assume we have:

let a = Aabb::new(Vec3::new(0.0, 0.0, 0.0), Vec3::new(2.0, 2.0, 2.0));
let b = Aabb::new(Vec3::new(-1.0, -1.0, -1.0), Vec3::new(1.0, 1.0, 1.0));

Can we have a union function such that:

union(a, b) == Aabb::new(Vec3::new(-1.0, -1.0, -1.0), Vec3::new(2.0, 2.0, 2.0))
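A minimal sketch of what such a function might compute, using a hypothetical `Aabb3` struct on plain arrays rather than the crate's own `Aabb` type: the union's minimum corner is the component-wise minimum of both minimum corners, and likewise for the maximum.

```rust
/// Hypothetical axis-aligned bounding box; not the crate's Aabb type.
#[derive(Clone, Copy, Debug, PartialEq)]
struct Aabb3 {
    min: [f32; 3],
    max: [f32; 3],
}

/// Smallest box that contains both `a` and `b`.
fn aabb_union(a: Aabb3, b: Aabb3) -> Aabb3 {
    let mut out = a;
    for i in 0..3 {
        out.min[i] = a.min[i].min(b.min[i]);
        out.max[i] = a.max[i].max(b.max[i]);
    }
    out
}
```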

Add mint support for non-wide types

Add methods to extract translation and rotation from `Mat4`

Given a Mat4 of the form M = T * R where T is a translation matrix and R is a rotation matrix, I would like to be able to retrieve T as a Vec3 and R as a Rotor3 from M.

Doing this would also make it possible to implement a Mat4 -> Isometry3 conversion.

Retrieving T is easy: it is the translation given by M.transform_point3(Vec3::zero()).
To get R one can do

Mat3::new(
  M.transform_vec3(Vec3::unit_x()),
  M.transform_vec3(Vec3::unit_y()),
  M.transform_vec3(Vec3::unit_z()),
).into_rotor3()

But maybe there is a better way.

What is your opinion about adding these methods?
One problem is that not all Mat4s are the product of a translation and a rotation, so this factorization would not always make sense.

PartialEq for Vec and Mat

Once again, something I would find useful, and I actually need. Is there any reason why it's not implemented?

Possibly change Default implementations to be closer to 'identity' rather than just uniformly zero

Currently, Default implementations simply make all values zero. This is concise and clean from a memory standpoint but can result in some possibly unintuitive behavior and, more importantly, invalid values of some constructs being created by Default. For example, Rotors should always have a norm of one, but currently upon being constructed by Default::default, they will have norm of 0 which is invalid. Similarly for square matrices, perhaps Default should create identity matrices.

Addition and scalar multiplication of matrices.

I notice that these impls are missing:

impl Add for MatN (and probably AddAssign as well)
impl Mul<f32> for MatN

Admittedly, these aren't the most common operations, but they do come up occasionally, and it seems weird that they're missing.

There is a workaround, but it's a bit annoying; split the matri(x/ces) into (its/their) component vectors, and apply the operation separately on each component.

rotate, scale, and nonuniform_scale methods for Mat4

As Mat4 (and Wat4) already have translate and translated methods, it would make sense for it to also have rotate, rotated, scale, scaled, nonuniform_scale, and nonuniform_scaled methods.

Implement 2d projective geometric algebra

I'm thinking under a pga2d module

`Mul<Bivec3>` implementation for `Mat3`

I am currently writing a physics engine (just to learn) and I happened to find ultraviolet while looking for a linear algebra library. I had never heard of geometric algebra before so learning about bivectors and rotors was interesting.

I managed to figure out that torque should probably be represented by a bivector but I ran into a problem when I was going to calculate the angular acceleration. I am assuming that the inverse of the inertia tensor can still be represented by a 3x3 matrix but I noticed that there is no implementation of Mul<Bivec3> for Mat3. Should there be one or am I going about this in the wrong way?

I am also considering using a Bivec3 (which I noticed has an implementation of Mul<Bivec3>) for the inertia tensor but I guess this would only let me have the equivalent of a diagonal matrix.

multiply_left/multiply_right methods for matrices and rotors

What do you think about adding multiply methods for matrices and rotors? I'm thinking about methods of the form

pub fn multiply_left(&mut self, lhs: Self) {
  *self = lhs * *self;
}

pub fn multiply_right(&mut self, rhs: Self) {
  *self = *self * rhs;
}

The multiply right is probably the most useful since I believe these types already have an *= implementation

Do new benchmarks

nalgebra has gained support for AoSoA-style SIMD like we use (see https://www.dimforge.com/blog/2020/03/23/simd-aosoa-in-nalgebra/), so benchmarks need updating

Feature: Mat2/3/4::adjugate method

The adjugate is just like the inverse, but not scaled by the inverse of the determinant.
It can be used as a more efficient alternative to the inverse when working with homogeneous matrices (since multiplication by a scalar has no effect on homogeneous coordinates), saving a determinant and a division operation.

Might be worth considering.
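To illustrate the relationship described in this request, here is a sketch for the 2x2 case on a plain row-major array type. The type alias and function names are hypothetical and are not a proposal for the crate's actual API.

```rust
/// Hypothetical row-major 2x2 matrix.
type Mat2 = [[f32; 2]; 2];

/// Adjugate of a 2x2 matrix: swap the diagonal, negate the off-diagonal.
fn adjugate(m: Mat2) -> Mat2 {
    [[m[1][1], -m[0][1]], [-m[1][0], m[0][0]]]
}

fn determinant(m: Mat2) -> f32 {
    m[0][0] * m[1][1] - m[0][1] * m[1][0]
}

/// The inverse is the adjugate scaled by 1 / determinant.
/// Returns `None` for singular matrices.
fn inverse(m: Mat2) -> Option<Mat2> {
    let det = determinant(m);
    if det == 0.0 {
        return None;
    }
    let adj = adjugate(m);
    Some([
        [adj[0][0] / det, adj[0][1] / det],
        [adj[1][0] / det, adj[1][1] / det],
    ])
}
```

Skipping the division (and, for larger matrices, the determinant itself) is where the saving mentioned above comes from.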

Inverse and eigen methods on matrix

Thanks for the crate. I am looking for a methods which implements inverse, eigenvalues, eigenvectors. Are these supported?

Implement 2d and 3d bivectors and rotors

Some reference https://marctenbosch.com/quaternions/

[Question] SIMD on if-cases

Thanks for the cool library! I want to use SIMD instructions for the following code, but I was not able to understand the methods working on the signs of individual components due to the lack of documentation. Could you tell me, if this is possible with the current ultraviolet?

// all variables are of type f32, but should be of type f32x8
let beta = if x >= 0.0 {
    alpha
} else {
    TWO_PI - alpha
};

What would also help: What exactly does the sign_bit() function do?
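The compare-then-select pattern from the README applies to this case as well. The sketch below expresses it lane by lane on plain arrays. With the wide types, the comparison would presumably produce a mask that is then passed to `blend`, but the exact name of a greater-or-equal comparison method is an assumption not confirmed here; `select_beta` is a hypothetical name.

```rust
const TWO_PI: f32 = 2.0 * std::f32::consts::PI;

/// Lane-wise version of `if x >= 0.0 { alpha } else { TWO_PI - alpha }`.
fn select_beta(x: [f32; 8], alpha: [f32; 8]) -> [f32; 8] {
    let mut beta = [0.0; 8];
    for i in 0..8 {
        let mask = x[i] >= 0.0; // per-lane comparison
        let if_true = alpha[i];
        let if_false = TWO_PI - alpha[i]; // computed for every lane
        beta[i] = if mask { if_true } else { if_false };
    }
    beta
}
```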

256-bit avx instructions under stable

Pending on Lokathor/wide#46

constructors should be const as often as possible

matrix constructors that need floating point ops can't be const, but VecN::new and MatN::identity and so on can be marked as const

intersect_ray/intersect_plane returns an Option for possibly no reason?

The family of functions intersect_ray/line/plane on Plane/Ray all return an Option that is actually always Some, while the comment (which is not present on Ray::intersect_plane) states that the caller has to check prior to calling the given function whether the ray actually intersects the plane or not. The return type in general made me at first assume that the operation is actually checking whether the intersection occurred or not. Was the Option forgotten to be removed or was it planned to actually be a checked operation at some point?

The code in question: https://github.com/termhn/ultraviolet/blob/0dfc114deda9e8ed94c964de027bd74a206cd672/src/geometry.rs#L177-L182

256 and 512-bit avx instructions under nightly through packed_simd

This will let us provide f32x8, f32x16, f64x4, f64x8 types

Add rigid body transform types (isometry, similarity, affine)

Built-in support for wgpu

Including built-in support for wgpu for functions like projection_wgpu and that kind of thing would be really nice. It's easy to fix it by just multiplying by a correction matrix, but it's not quite convenient.

f64 types have no PartialEq

It seems PartialEq is not implemented for the f64 types!

However, the f32 types do have PartialEq, so what gives?

Example:

rotor3s!(
    Rotor3 => (Mat3, Vec3, Bivec3, f32),
    Rotor3x4 => (Mat3x4, Vec3x4, Bivec3x4, f32x4),
    Rotor3x8 => (Mat3x8, Vec3x8, Bivec3x8, f32x8)
);

impl std::cmp::PartialEq for Rotor3 {
    fn eq(&self, other: &Self) -> bool {
        self.s == other.s && self.bv == other.bv
    }
}

#[cfg(feature = "f64")]
rotor3s!(
    DRotor3 => (DMat3, DVec3, DBivec3, f64),
    DRotor3x2 => (DMat3x2, DVec3x2, DBivec3x2, f64x2),
    DRotor3x4 => (DMat3x4, DVec3x4, DBivec3x4, f64x4)
);

// No PartialEq for DRotor3? Come to think of it, DBivec3 doesn't have one either, even though Bivec3 does...

Should I PR? Was this an oversight or an intentional decision?

Rotor interpolation seems strange

My geometric algebra is fairly non-existent, so apologies if this is working correctly, and I just haven't understood things.

It seems like you get strange results when interpolating between two rotors. I think I'd expect that you would be returned a rotor which represents a rotation between the two input rotors. So, if I interpolate between two identical rotors, I'd expect to get the same rotor returned. That doesn't seem to happen. For example, if I try to interpolate between two identity rotors:

Rotor2::identity().lerp(Rotor2::identity(), 0.5)

Rotor2 {
    s: 4.0,
    bv: Bivec2 {
        xy: 0.0,
    },
}

Note how the scalar is 4, instead of 1. Is it right to say that to interpolate between two rotors, then both the scalar and the bivector should just be independently interpolated?
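For comparison, one common formulation is a component-wise linear interpolation followed by renormalization (often called "nlerp"). The sketch below uses a hypothetical stand-in `Rotor2` struct and is not the crate's implementation; it only shows that this formulation returns the identity when both inputs are the identity.

```rust
/// Hypothetical stand-in for a 2D rotor: scalar part and xy bivector part.
#[derive(Clone, Copy, Debug, PartialEq)]
struct Rotor2 {
    s: f32,
    xy: f32,
}

impl Rotor2 {
    fn identity() -> Self {
        Rotor2 { s: 1.0, xy: 0.0 }
    }

    fn mag(self) -> f32 {
        (self.s * self.s + self.xy * self.xy).sqrt()
    }

    /// Interpolate each component independently, then renormalize so the
    /// result is again a unit rotor. Divides by zero if the interpolated
    /// components cancel out exactly (e.g. opposite rotors at t = 0.5).
    fn nlerp(self, end: Self, t: f32) -> Self {
        let s = self.s + (end.s - self.s) * t;
        let xy = self.xy + (end.xy - self.xy) * t;
        let mag = (s * s + xy * xy).sqrt();
        Rotor2 { s: s / mag, xy: xy / mag }
    }
}
```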

Implement a Mat3 -> Rotor3 conversion ?

The possibility to convert a rotation matrix into a rotor could be useful.
The problem is that such a conversion might not always make sense if the matrix is not a rotation matrix.
One could

Have a method Mat3::to_rotor3(&self) -> Option<Rotor3> that would test if self is actually a rotation (slow and returns an Option)
Return garbage if the matrix is not a rotation and be explicit about it in the documentation.

Here is how the Matrix -> Quaternion is done in cgmath

Improve 2d matrix support/transforms

Currently Mat2 is pretty bone-dry. It would probably be useful to support some 2d stuff there.

Implement PartialEq trait to compare vector types

Hey, I really like ultraviolet so far. I haven't used any SIMD yet but will at some point.

Was just wondering if it would be possible to implement the PartialEq trait in the macro that generates the vectors? I use comparisons a lot on vectors and was surprised it didn't exist.

Can't compile ultraviolet with "int" feature.

It seems that there is a missing colon in Add:add; when I tried compiling ultraviolet I got an error suggesting to insert an extra colon.

Publish version 0.5.0 to crates.io

It would allow access to the serde feature for use in other crate dependencies :)

Implement 3d projective geometric algebra

under pga3d module

Builder pattern for Mat4?

Perhaps the uv::mat::Mat4::translate method could be more builder-like?
Instead of uv::mat::Mat4::translate(&mut self, Vec3) it could be uv::mat::Mat4::translate(self, Vec3)

Improve implementation of vec ops

Implement sum and difference of products using fma

https://pharr.org/matt/blog/2019/11/03/difference-of-floats.html
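The technique from the linked post can be sketched in Rust with the standard library's `f32::mul_add`, which performs a fused multiply-add. The function names below are hypothetical, and whether `mul_add` maps to a hardware FMA instruction depends on the compilation target.

```rust
/// Computes `a * b - c * d` with error compensation using fused
/// multiply-add (the approach attributed to Kahan in the linked post).
fn difference_of_products(a: f32, b: f32, c: f32, d: f32) -> f32 {
    let cd = c * d;
    // Rounding error of the product above: cd - c * d, computed exactly.
    let err = (-c).mul_add(d, cd);
    // a * b - cd with a single rounding.
    let dop = a.mul_add(b, -cd);
    dop + err
}

/// Computes `a * b + c * d` by reusing the difference form.
fn sum_of_products(a: f32, b: f32, c: f32, d: f32) -> f32 {
    difference_of_products(a, b, -c, d)
}
```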

Unclear documentation for "angle" parameters

When creating rotations (Rotor, etc.), the methods use a parameter angle, which is not described as being in radians or degrees.

This could be cleared up, as the user has either to go to the source code to determine the implementation, or use the old trial-and-error approach.

Mat4.translate() not working or unintuitive?

https://github.com/termhn/ultraviolet/blob/f09d279f99fbb1e941952575afa1f539a2b9a901/src/mat.rs#L1099-L1102

I tried calling translate on a Mat4 and noticed that the end result was not at all what I was expecting it to be.

I would have expected the matrix to represent a translation along the provided vector after calling the function, but upon inspecting the source code I saw that this doesn't seem to be the case at all.

Is the translation done here some other kind of mathematical operation I'm not aware of (I failed to find anything suggesting this after looking around a bit), or is this code not in a functioning state?

It seems to me that changing only a single value in the matrix can't possibly be called translation, so this code is confusing me quite a bit.

Implement `serde::Serialize` and `serde::Deserialize` for bivec and rotor types

The serde traits are implemented for the vector and matrix types. I think it would be useful to have them for the bivec and rotor types as well.

I can try to write these implementations, using those of vector and matrices as model.

Serde for Vec and Mat

It would be really nice to have serde support behind a feature for the types it makes sense for. I could only find Vec and Mat, though I might be missing some.

Mat3::from_nonuniform_scale_homogeneous should take Vec2D instead of Vec3D

Currently the Z value is unused.

Rotor2 do not rotate...

Quoting https://github.com/termhn/ultraviolet/blob/master/src/rotor.rs

            /// Rotates a vector by this rotor.
            ///
            /// `self` *must* be normalized!
            #[inline]
            pub fn rotate_vec(self, vec: &mut $vt) {
                let fx = self.s * vec.x + self.bv.xy * vec.y;
                let fy = self.s * vec.y - (self.bv.xy * vec.x);

                vec.x = self.s * fx - (self.bv.xy * fy);
                vec.y = self.s * fy + self.bv.xy * fx;
            }

If I call s = self.s, b = self.bv.xy, then vec.x becomes

vec.x = s * fx - b * fy
vec.x = s * s * vec.x + s*b * vec.y - b * s * vec.y + b * b * vec.x 
vec.x = self.mag() * vec.x

Which is definitely not what we want.. I don't know what the correct formula is however
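For reference, here is one self-consistent formulation, sketched with a hypothetical stand-in type. The sign convention (s = cos(angle/2), xy = -sin(angle/2)) is an assumption chosen for this sketch and may differ from what the library uses. Compared with the quoted code, the signs in the second pair of lines are swapped, so the expansion becomes x' = (s² - b²)x + 2sb·y and y' = (s² - b²)y - 2sb·x, which is the usual 2D rotation matrix under that convention.

```rust
/// Hypothetical stand-in for a 2D rotor; not the crate's type.
#[derive(Clone, Copy, Debug)]
struct Rotor2 {
    s: f32,
    xy: f32,
}

impl Rotor2 {
    /// Assumed convention: s = cos(angle / 2), xy = -sin(angle / 2).
    fn from_angle(angle: f32) -> Self {
        let (sin, cos) = (angle * 0.5).sin_cos();
        Rotor2 { s: cos, xy: -sin }
    }

    /// Rotates `v` counter-clockwise by the rotor's angle (rotor must be
    /// normalized). Expands to:
    ///   x' = (s^2 - b^2) x + 2 s b y
    ///   y' = (s^2 - b^2) y - 2 s b x
    fn rotate(self, v: [f32; 2]) -> [f32; 2] {
        let (s, b) = (self.s, self.xy);
        let fx = s * v[0] + b * v[1];
        let fy = s * v[1] - b * v[0];
        [s * fx + b * fy, s * fy - b * fx]
    }
}
```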

Spherical Linear Interpolation

I implemented the necessary trig functions in both wide and packed_simd so it works in master using git deps but pending on Lokathor/wide#48 and rust-lang/packed_simd#286 to be merged in order to release

Any plans for f64 support?

I understand wanting to avoid generics and how f64 would negatively impact SIMD speed. I believe there are plenty of use cases that make support for f64 based types worth adding, even if it involved making separate types for those to avoid using generics.

Newest ultraviolet from crates.io fails to build

To replicate, create a new cargo project, e.g. with cargo init test_ultraviolet.
Then add the following to Cargo.toml (0.4.5 is the newest version at the time of writing this):

[dependencies]
ultraviolet = "0.4.5"

and try building the project with cargo build.

This results in the following error:

    Updating crates.io index
   Compiling wide v0.4.4
   Compiling bytemuck v1.2.0
   Compiling ultraviolet v0.4.5
error[E0599]: no method named `sin_cos` found for struct `wide::m_f32x4::f32x4` in the current scope
   --> /home/anula/.cargo/registry/src/github.com-1ecc6299db9ec823/ultraviolet-0.4.5/src/mat.rs:234:36
    |
234 |                 let (s, c) = angle.sin_cos();
    |                                    ^^^^^^^ method not found in `wide::m_f32x4::f32x4`
...
569 | mat3s!(Mat3 => Rotor3, Bivec3, Mat4, Vec4, Vec2, Vec3, f32, Wat3 => WRotor3, WBivec3, Wat4, Wec4, Wec2, Wec3, f32x4);
    | --------------------------------------------------------------------------------------------------------------------- in this macro invocation
    |
    = note: this error originates in a macro (in Nightly builds, run with -Z macro-backtrace for more info)

error[E0599]: no method named `sin_cos` found for struct `wide::m_f32x4::f32x4` in the current scope
   --> /home/anula/.cargo/registry/src/github.com-1ecc6299db9ec823/ultraviolet-0.4.5/src/rotor.rs:110:45
    |
110 |                 let (sin, cos) = half_angle.sin_cos();
    |                                             ^^^^^^^ method not found in `wide::m_f32x4::f32x4`
...
319 | rotor2s!(Rotor2 => (Mat2, Vec2, Bivec2, f32), WRotor2 => (Wat2, Wec2, WBivec2, f32x4));
    | --------------------------------------------------------------------------------------- in this macro invocation
    |
    = note: this error originates in a macro (in Nightly builds, run with -Z macro-backtrace for more info)

error[E0599]: no method named `sin_cos` found for struct `wide::m_f32x4::f32x4` in the current scope
   --> /home/anula/.cargo/registry/src/github.com-1ecc6299db9ec823/ultraviolet-0.4.5/src/rotor.rs:121:45
    |
121 |                 let (sin, cos) = half_angle.sin_cos();
    |                                             ^^^^^^^ method not found in `wide::m_f32x4::f32x4`
...
319 | rotor2s!(Rotor2 => (Mat2, Vec2, Bivec2, f32), WRotor2 => (Wat2, Wec2, WBivec2, f32x4));
    | --------------------------------------------------------------------------------------- in this macro invocation
    |
    = note: this error originates in a macro (in Nightly builds, run with -Z macro-backtrace for more info)

error[E0599]: no method named `sin_cos` found for struct `wide::m_f32x4::f32x4` in the current scope
   --> /home/anula/.cargo/registry/src/github.com-1ecc6299db9ec823/ultraviolet-0.4.5/src/rotor.rs:370:45
    |
370 |                 let (sin, cos) = half_angle.sin_cos();
    |                                             ^^^^^^^ method not found in `wide::m_f32x4::f32x4`
...
677 | rotor3s!(Rotor3 => (Mat3, Vec3, Bivec3, f32), WRotor3 => (Wat3, Wec3, WBivec3, f32x4));
    | --------------------------------------------------------------------------------------- in this macro invocation
    |
    = note: this error originates in a macro (in Nightly builds, run with -Z macro-backtrace for more info)

error: aborting due to 4 previous errors

For more information about this error, try `rustc --explain E0599`.
error: could not compile `ultraviolet`.

To learn more, run the command again with --verbose.

I believe the cause is the newest version of the wide crate. ultraviolet seems to build with wide versions up to 0.4.3, but it only specifies wide ^0.4 as its dependency, so a fresh build picks up 0.4.4.

As a workaround, I can force wide back to the older version with cargo update -p wide --precise 0.4.3, after which the build succeeds.

Could we have a `Rotor3::into_euler_angles()`?

There's currently a Rotor3::from_euler_angles, but also having a Rotor3::into_euler_angles() would be convenient. I'm still not too familiar with the differences between quaternions and rotors, or if there is much of a difference, so I'm not quite sure how to implement it myself.

Add matrix initializers

For example, initializers for perspective and orthographic projection matrices.
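As an illustration of what such an initializer could look like, here is a minimal sketch of a perspective projection. It assumes OpenGL conventions (right-handed view space looking down -z, clip-space depth from -1 to 1) and column-major storage. `Mat4Sketch`, `perspective_gl_sketch`, and `transform` are hypothetical names, not ultraviolet's API:

```rust
/// Hypothetical column-major 4x4 matrix, indexed as `m[col][row]`.
type Mat4Sketch = [[f32; 4]; 4];

/// Perspective projection under the assumed OpenGL conventions.
/// `fov_y` is the vertical field of view in radians.
fn perspective_gl_sketch(fov_y: f32, aspect: f32, near: f32, far: f32) -> Mat4Sketch {
    let f = 1.0 / (fov_y * 0.5).tan();
    let depth = near - far;
    [
        [f / aspect, 0.0, 0.0, 0.0],
        [0.0, f, 0.0, 0.0],
        [0.0, 0.0, (far + near) / depth, -1.0],
        [0.0, 0.0, 2.0 * far * near / depth, 0.0],
    ]
}

/// Multiplies the matrix by a homogeneous column vector.
fn transform(m: &Mat4Sketch, v: [f32; 4]) -> [f32; 4] {
    let mut out = [0.0f32; 4];
    for col in 0..4 {
        for row in 0..4 {
            out[row] += m[col][row] * v[col];
        }
    }
    out
}
```

Under these assumptions, a point on the near plane maps to a depth of -1 after the perspective divide and a point on the far plane maps to 1. Other graphics APIs use different depth ranges and handedness, so a real initializer would likely need one variant per convention.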

Automated code review for ultraviolet

Hi!
I am a member of the team developing monocodus, a service that performs automatic code review of GitHub pull requests to help organizations ensure high code quality.
We’ve developed some useful features so far, and now we’re looking for early users and feedback to find out what we should improve and which features the community needs most.

We ran monocodus on a pre-created fork of your repo on GitHub, https://github.com/monocodus-demonstrations/ultraviolet/pulls, and it found some potential formatting issues. I hope this information is useful to you, and I would be happy to receive any feedback here or by email at [email protected].

If you want to try our service, feel free to follow the link: https://www.monocodus.com
The service is entirely free of charge for open source projects. Hope you’ll like it :)

refract not working as expected

I've been working through Ray Tracing in One Weekend using ultraviolet as the math library. After implementing dielectric refraction, the output is incorrect when using ultraviolet's built-in refraction function, whereas porting the refract algorithm from the book works as expected.
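For context, a Rust port of the book's refract might look roughly like the sketch below. It follows the form given in recent editions of Ray Tracing in One Weekend. The poster's actual port is not shown, so this is an assumption, and `V3`, `dot`, and `refract_sketch` are illustrative names rather than ultraviolet's API. It assumes `uv` and `n` are unit vectors, that `n` points against the incoming ray, and that `etai_over_etat` is the ratio of the refractive index on the incident side to the index on the transmitted side:

```rust
/// Plain 3-component vector used only for this sketch.
type V3 = [f32; 3];

fn dot(a: V3, b: V3) -> f32 {
    a[0] * b[0] + a[1] * b[1] + a[2] * b[2]
}

/// Refracts the unit direction `uv` through a surface with unit normal `n`.
fn refract_sketch(uv: V3, n: V3, etai_over_etat: f32) -> V3 {
    // Cosine of the incidence angle, clamped against rounding error.
    let cos_theta = (-dot(uv, n)).min(1.0);
    // Component of the refracted ray perpendicular to the normal.
    let mut perp: V3 = [0.0; 3];
    for i in 0..3 {
        perp[i] = etai_over_etat * (uv[i] + cos_theta * n[i]);
    }
    // Component parallel to the normal, chosen so the result has unit length.
    let parallel = -(1.0 - dot(perp, perp)).abs().sqrt();
    [
        perp[0] + parallel * n[0],
        perp[1] + parallel * n[1],
        perp[2] + parallel * n[2],
    ]
}
```

Differences in argument conventions (normalized versus unnormalized inputs, the direction of the normal, or which way the index ratio is defined) are a common reason two refract implementations disagree, so those are worth checking when comparing against a library function.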

ultraviolet:

reference (matches the book):

error[E0599]: no method named `cmp_ge` found for struct `wide::f32x4` in the current scope
  --> src/util.rs:32:19
   |
32 |                 r.cmp_ge(eps).none()
   |                   ^^^^^^ method not found in `wide::f32x4`
...
38 | impl_eq_eps_wide!(f32x4, f32x8, f64x2, f64x4);
   | ---------------------------------------------- in this macro invocation
   |
   = help: items from traits can only be used if the trait is in scope
   = note: the following trait is implemented but not in scope; perhaps add a `use` for it:
           `use wide::CmpGe;`
   = note: this error originates in a macro (in Nightly builds, run with -Z macro-backtrace for more info)

I'm able to fix them by applying the following diff:

diff --git a/src/lib.rs b/src/lib.rs
index 8f37eba..fe000f2 100644
--- a/src/lib.rs
+++ b/src/lib.rs
@@ -120,6 +120,7 @@ pub use transform::*;
 pub use vec::*;
 
 pub(crate) use wide;
+use wide::*;
 
 pub use wide::f32x4;
 pub use wide::f32x8;

Was that statement missing or am I doing something wrong somewhere?
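The compiler note above points at the underlying rule: a method defined by a trait can only be called when that trait is in scope. The self-contained sketch below reproduces the situation with hypothetical stand-ins. `simd_like`, `Lanes4`, and `all_below_eps` are invented for illustration, and the `CmpGe` trait here only borrows its name from the error message; its signature is not taken from wide:

```rust
mod simd_like {
    /// Hypothetical stand-in for a 4-lane SIMD float type.
    pub struct Lanes4(pub [f32; 4]);

    /// Lane-wise `>=` exposed as a trait method rather than an inherent one.
    pub trait CmpGe {
        fn cmp_ge(&self, rhs: f32) -> [bool; 4];
    }

    impl CmpGe for Lanes4 {
        fn cmp_ge(&self, rhs: f32) -> [bool; 4] {
            [
                self.0[0] >= rhs,
                self.0[1] >= rhs,
                self.0[2] >= rhs,
                self.0[3] >= rhs,
            ]
        }
    }
}

// Removing `CmpGe` from this import makes the call below fail with E0599,
// because the method only exists through the trait.
use simd_like::{CmpGe, Lanes4};

/// True when no lane is greater than or equal to `eps`.
fn all_below_eps(r: &Lanes4, eps: f32) -> bool {
    !r.cmp_ge(eps).iter().any(|&lane| lane)
}
```

Importing just the trait the compiler names is a narrower alternative to a glob import, since it brings in only what the call site needs.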