statrs-dev / statrs Goto Github PK

Statistical computation library for Rust

Home Page: https://docs.rs/statrs/latest/statrs/

License: MIT License

Rust 99.84% Shell 0.16%

statrs's Issues

Refactor traits

Refactor traits to be more specific (e.g. trait Mode and trait Median). Should probably be done in the same step as this issue

Test coverage for Statistic.rank for n > 10

Need code coverage for the case where data length > 10 in the rank function, in order to cover the quick_sort impl. As a side note, the quick sort impl is borrowed wholesale from Math.NET, and there is probably a more idiomatic Rust implementation that could be used.
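As a rough illustration of what a more idiomatic version might look like, the sketch below ranks data by sorting indices with the standard library's `sort_by` and `f64::total_cmp` instead of a hand-written quick sort. The function name `average_ranks` and the choice to average ranks across ties are assumptions made for this example; they are not the statrs API or the Math.NET behavior.

```rust
/// Hypothetical helper (not part of statrs): returns the 1-based rank of
/// each element, giving tied values the mean of the ranks they span.
pub fn average_ranks(data: &[f64]) -> Vec<f64> {
    // Sort indices rather than values so each rank can be written back
    // to the position of the original element.
    let mut order: Vec<usize> = (0..data.len()).collect();
    order.sort_by(|&a, &b| data[a].total_cmp(&data[b]));

    let mut ranks = vec![0.0; data.len()];
    let mut i = 0;
    while i < order.len() {
        // Find the end of the run of equal values starting at sorted position i.
        let mut j = i + 1;
        while j < order.len() && data[order[j]] == data[order[i]] {
            j += 1;
        }
        // Sorted positions i..j hold ranks i+1 through j; share their mean.
        let shared = (i + 1 + j) as f64 / 2.0;
        for &idx in &order[i..j] {
            ranks[idx] = shared;
        }
        i = j;
    }
    ranks
}
```

Because the standard sort has no special small-input branch visible to the caller, a version along these lines would not need a separate test just to reach the n > 10 path.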

Implementation of inverse digamma

Need resources on good implementation of inverse digamma (psi) function. The Math.NET code seems to be incorrect and doesn't even pass their own unit tests

Implement Erlang distribution

References:
https://github.com/mathnet/mathnet-numerics/blob/master/src/Numerics/Distributions/Erlang.cs
https://en.wikipedia.org/wiki/Erlang_distribution

Better README

Need to improve README documentation

Implement exponential integral

https://github.com/mathnet/mathnet-numerics/blob/master/src/Numerics/SpecialFunctions/ExponentialIntegral.cs

Implement Geometric distribution

http://en.wikipedia.org/wiki/Geometric_distribution

https://github.com/mathnet/mathnet-numerics/blob/master/src/Numerics/Distributions/Geometric.cs

Error handling: Panics vs Result

Currently the responsibility for guarding against exceptional cases (e.g. input outside the valid domain, mathematically invalid operations) is passed to the user. We panic when an operation does not make mathematical sense (e.g. calculating the cumulative distribution function of a discrete distribution at a negative input), which forces users to double-check that their inputs are valid. While this results in technically correct and predictable behavior from the API, I'm not sure it's ergonomic or idiomatic, and I have been mulling over introducing a Result-based API either replacing or in addition to the stricter panic-based API. This warrants some discussion, however, and I would love feedback from the community.
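To make the two options concrete, here is a minimal sketch of how a Result-based method could sit next to a panicking one. Everything in it is hypothetical: `ToyBernoulli`, `CdfError`, and `checked_cdf` are names invented for this example and do not reflect the actual statrs types or signatures.

```rust
/// Hypothetical error type for this sketch only.
#[derive(Debug, PartialEq)]
pub enum CdfError {
    /// The input was negative or NaN.
    OutOfDomain,
}

/// Toy stand-in for a discrete distribution (not the statrs type).
pub struct ToyBernoulli {
    pub p: f64,
}

impl ToyBernoulli {
    /// Result-based variant: reports invalid input to the caller.
    pub fn checked_cdf(&self, x: f64) -> Result<f64, CdfError> {
        // `!(x >= 0.0)` is true for negative values and for NaN.
        if !(x >= 0.0) {
            Err(CdfError::OutOfDomain)
        } else if x < 1.0 {
            Ok(1.0 - self.p)
        } else {
            Ok(1.0)
        }
    }

    /// Panic-based variant: same logic, but invalid input aborts.
    pub fn cdf(&self, x: f64) -> f64 {
        self.checked_cdf(x)
            .expect("cdf input must be non-negative")
    }
}
```

Layering the panicking method on top of the checked one keeps the domain logic in a single place, which is one way the two APIs could coexist if both are wanted.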

What is the desired behavior for gamma distributions when rate is INF

I'm not personally familiar enough with the gamma distribution to say what the desired behavior for pmf, ln_pmf, and cdf should be if shape or rate are f64::INFINITY

Implement inverse gamma distribution

https://github.com/mathnet/mathnet-numerics/blob/master/src/Numerics/Distributions/InverseGamma.cs

http://en.wikipedia.org/wiki/Inverse-gamma_distribution

Use dev-dependencies for random number generation

I just saw this in the code:

    #[ignore]
    #[test]
    fn test_mean_variance_stability() {
        // TODO: Implement tests. Depends on Mersenne Twister RNG implementation.
        // Currently hesistant to bring extra dependency just for test
    }

You can add dependencies to the Cargo.toml that are only used when running the tests, but not when using the library as a dependency: http://doc.crates.io/specifying-dependencies.html#development-dependencies

Review of iterator statistics trait

Currently the iterator statistics trait is treated as a special case: to act over an iterator, the methods need to take a mutable reference, so all the traits from statrs::statistics are (going to be) combined in the IterStatistics trait, which is implemented for all Iterators. I haven't come up with a better solution, but for some reason this implementation doesn't sit too well with me, and I'd love to have someone review it and provide feedback.
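For reviewers who want something to poke at, the pattern described above can be sketched roughly as follows. The trait name `IterStats` and its single `mean` method are placeholders for this example, not the real IterStatistics definition.

```rust
/// Hypothetical, cut-down version of an iterator statistics trait.
pub trait IterStats {
    /// Consumes the remaining items and returns their mean
    /// (NaN for an empty iterator).
    fn mean(&mut self) -> f64;
}

// Blanket impl: every iterator over f64 gets the method.
impl<I: Iterator<Item = f64>> IterStats for I {
    fn mean(&mut self) -> f64 {
        let mut count = 0.0;
        let mut mean = 0.0;
        // `&mut I` is itself an Iterator, so it can drive the loop.
        for x in self {
            count += 1.0;
            // Incremental update avoids summing everything first.
            mean += (x - mean) / count;
        }
        if count > 0.0 {
            mean
        } else {
            f64::NAN
        }
    }
}
```

One alternative that may be worth weighing: since `&mut I` implements `Iterator` whenever `I` does, the methods could take `self` by value, and callers who need to keep the iterator could pass `iter.by_ref()`.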

Implement Harmonic functions

https://github.com/mathnet/mathnet-numerics/blob/master/src/Numerics/SpecialFunctions/Harmonic.cs

Implement Multinomial distribution

References:
https://github.com/mathnet/mathnet-numerics/blob/master/src/Numerics/Distributions/Multinomial.cs
https://en.wikipedia.org/wiki/Multinomial_distribution

Port streaming statistics

Port over streaming statistics. Implement statistics trait for Vector (should be simple wrapper of slice)
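As a starting point, a streaming accumulator could look something like the sketch below. It uses Welford's online algorithm for the mean and sample variance, which is an assumption on my part rather than a port of the Math.NET code, and the `RunningStats` name and methods are hypothetical.

```rust
/// Hypothetical streaming accumulator (not the statrs or Math.NET type).
#[derive(Debug, Default, Clone)]
pub struct RunningStats {
    count: u64,
    mean: f64,
    // Running sum of squared deviations from the current mean.
    m2: f64,
}

impl RunningStats {
    pub fn new() -> Self {
        Self::default()
    }

    /// Folds one observation into the running totals (Welford update).
    pub fn push(&mut self, x: f64) {
        self.count += 1;
        let delta = x - self.mean;
        self.mean += delta / self.count as f64;
        // Uses the deviation from the updated mean for the second factor.
        self.m2 += delta * (x - self.mean);
    }

    /// Mean of the observations so far, or NaN if none were pushed.
    pub fn mean(&self) -> f64 {
        if self.count == 0 {
            f64::NAN
        } else {
            self.mean
        }
    }

    /// Unbiased sample variance, or NaN with fewer than two observations.
    pub fn variance(&self) -> f64 {
        if self.count < 2 {
            f64::NAN
        } else {
            self.m2 / (self.count - 1) as f64
        }
    }
}
```

The accumulator holds only three numbers, so it works on data that never fits in memory; a slice or Vector implementation could simply push each element in turn.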

Implement Categorical distribution

References:
https://github.com/mathnet/mathnet-numerics/blob/master/src/Numerics/Distributions/Categorical.cs
https://en.wikipedia.org/wiki/Categorical_distribution

Change Univariate implementation for discrete distributions

Distributions are currently Univariate<i64, f64> but I'm pretty sure they can all be changed to Univariate<u64, f64>

Remove matches on floats

Go through code and remove matches on floats, replace them with if/else. See link
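The rewrite is mechanical. A sketch on a made-up function (`pow_special` is hypothetical and does not come from the statrs code base), with the old match kept as a comment:

```rust
/// Hypothetical helper: raises `x` to the power `p`, special-casing a few
/// exponents that can be computed without calling `powf`.
pub fn pow_special(x: f64, p: f64) -> f64 {
    // Before:
    //     match p {
    //         0.0 => 1.0,
    //         1.0 => x,
    //         2.0 => x * x,
    //         _ => x.powf(p),
    //     }
    // After: the same branches written as explicit float comparisons.
    if p == 0.0 {
        1.0
    } else if p == 1.0 {
        x
    } else if p == 2.0 {
        x * x
    } else {
        x.powf(p)
    }
}
```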

Implement multinomial function

https://github.com/mathnet/mathnet-numerics/blob/master/src/Numerics/SpecialFunctions/Factorial.cs

Packages `exponential` and `factorial` need doc strings

Implement inverse of regularized lower incomplete gamma

Implement `_checked` interface

Result of discussion from #41. Most likely worth it to hold off on this until @migi's pull request is submitted and merged

Port over Statistics extensions

Port over Numerics/Statistics extensions for f64 slice and iterable

Better implementation for 'hidden' generics

For certain distributions (Bernoulli comes to mind), the second generic parameter N is hidden: Bernoulli depends on Binomial, which has two parameters P and N, but N is always 1 in the Bernoulli distribution. This effectively prevents users from defining the numeric types for such a distribution through the constructor and requires them to explicitly annotate the type on the variable, leading to verbose declarations such as `let n: Result<Bernoulli<f64, u64>> = Bernoulli::new(0.5)`. I'd like to find a way around this issue after 0.4.0 is released (or maybe before, if a solution is found quickly enough).

Implement cauchy distribution

https://en.wikipedia.org/wiki/Cauchy_distribution

https://github.com/mathnet/mathnet-numerics/blob/master/src/Numerics/Distributions/Cauchy.cs

Support for multiple numeric types

Currently statrs only supports f64, but I'd like to extend that at a minimum to f32 and possibly to other numeric types as well (especially for things in the statistics module). The num crate might be worth looking into, but I'm hesitant about introducing the dependency when it might make its way into the standard library at some point.

Run cargo publish with nightly

Next time the crate is published, use nightly cargo in order to add ourselves to the science category on crates.io

Sampling with rand crate

Experiment with sampling based on the normal distribution in the rand crate (or, better yet, implement the ziggurat algorithm for normal sampling)

References:
https://github.com/rust-lang-nursery/rand/blob/master/src/distributions/mod.rs#L224
https://github.com/rust-lang-nursery/rand/blob/master/src/distributions/ziggurat_tables.rs

Testing for special functions

Special functions are sorely in need of unit testing

Consolidate cdf error handling behavior

Some distributions panic:

Some distributions return 0 or 1:

Some distributions return NaN:

FisherSnedecor (check construction args as well)

No error handling defined:

Cauchy
Geometric
Normal

My gut reaction is to panic on all invalid input domains for cdf and possibly special functions as well. The other options are to move towards propagating NaN or returning Result<T, StatsError> for functions like cdf, pdf, pmf, ln_pdf, and ln_pmf (possibly including special functions).

Math.NET has an implementation but no unit tests, and I couldn't find a source for the calculation from a cursory search. If anyone can provide a source or derivation for either a closed form or numerical solution (or prove the calculation for https://github.com/mathnet/mathnet-numerics/blob/master/src/Numerics/Distributions/Categorical.cs#L274) then we can move forward with implementation

Sampling tests with KS test

References:

https://github.com/huonw/random-tests/blob/master/std_dists.rs#L116-L130
https://github.com/huonw/random-tests/blob/master/kolmogorov_smirnov.rs
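For orientation, the core of such a test is the one-sample Kolmogorov-Smirnov statistic: the largest gap between the empirical CDF of the samples and the target CDF. The sketch below is a minimal version written for this note, not code taken from the linked repository; `ks_statistic` is a hypothetical name, and turning the statistic into a pass/fail decision (choosing a critical value) is left out.

```rust
/// Hypothetical helper: one-sample KS statistic of `samples` against `cdf`.
/// Returns 0.0 for an empty slice.
pub fn ks_statistic<F: Fn(f64) -> f64>(samples: &[f64], cdf: F) -> f64 {
    let mut sorted = samples.to_vec();
    sorted.sort_by(|a, b| a.total_cmp(b));

    let n = sorted.len() as f64;
    let mut d: f64 = 0.0;
    for (i, &x) in sorted.iter().enumerate() {
        let f = cdf(x);
        // The empirical CDF jumps from i/n to (i+1)/n at x, so check the
        // gap on both sides of the step.
        let above = (i as f64 + 1.0) / n - f;
        let below = f - i as f64 / n;
        d = d.max(above).max(below);
    }
    d
}
```

A sampling test would draw many values from a distribution, pass them in along with that distribution's cdf, and fail if the statistic exceeds a threshold chosen for the sample size.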

Convert discrete distributions to return integers from sample

Discrete distributions currently implement Distribution<f64> but this should be changed to Distribution<i64> to be more accurate
