
jstat's Introduction

jStat - JavaScript Statistical Library


jStat provides native JavaScript implementations of statistical functions. Full details are available in the docs. jStat covers more distributions than most libraries, including the Weibull, Cauchy, Poisson, hypergeometric, and beta distributions. For most distributions, jStat provides the pdf, cdf, inverse, mean, mode, variance, and a sample function, allowing for more complex calculations.

NOTICE: The previous case-sensitive jStat module will no longer be updated. Use the all-lowercase jstat instead when doing an npm install or similar.

Using jStat in a Browser

jStat can be used in the browser. The jStat object will be added to the window. For example:

<script src="components/jstat.js"></script> <!-- include jStat, from the CDN or otherwise -->

<script>
...
var jstat = this.jStat(dataset); // jStat will be added to the window
...
data[i]['cum'] = jstat.normal(jstat.mean(), jstat.stdev()).cdf(data[i].x);
...
</script>

CDN

The library is hosted on jsDelivr using the following url:

//cdn.jsdelivr.net/npm/jstat@latest/dist/jstat.min.js

Note that 'latest' can be replaced with any released version of jStat.

Module Loaders

Currently jStat is exposed as j$ and jStat inside an object, rather than exported directly. This may confuse some module loaders, but it should be easy to remedy with the correct configuration.

NodeJS & NPM

To install via npm:

npm install --save jstat

When loading under Node be sure to reference the child object.

var { jStat } = require('jstat');

RequireJS Shim

For RequireJS, both the exports and an init function must be specified.

requirejs.config({
  paths: {
    'jstat': 'path/to/jstat/dist/jstat.min'
  },
  shim: {
    jstat: {
      exports: ['j$', 'jStat'],
      init: function () {
        return {
          j$: j$,
          jStat: jStat
        };
      }
    }
  }
});

Build Prerequisites

In order to build jStat, you need to have GNU make 3.8 or later, Node.js 0.2 or later, and git 1.7 or later. (Earlier versions might work OK, but are not tested.)

Windows users have two options:

  1. Install msysgit (Full installer for official Git), GNU make for Windows, and a binary version of Node.js. Make sure all three packages are installed to the same location (by default, this is C:\Program Files\Git).
  2. Install Cygwin (make sure you install the git, make, and which packages), then either follow the Node.js build instructions or install the binary version of Node.js.

Mac OS users should install Xcode (comes on your Mac OS install DVD, or downloadable from Apple's Xcode site) and Homebrew (http://mxcl.github.com/homebrew/). Once Homebrew is installed, run brew install git to install git, and brew install node to install Node.js.

Linux/BSD users should use their appropriate package managers to install make, git, and node, or build from source if you swing that way.

Building jStat

First, clone a copy of the jStat git repo by running git clone https://github.com/jstat/jstat.git (GitHub no longer serves the unauthenticated git:// protocol).

To download all necessary libraries run npm install.

Then, to get a complete, minified version of jStat and all documentation, simply cd to the jstat directory and type make. If you don't have Node installed and/or want to make a basic, uncompressed, unlinted version of jstat, use make jstat instead of make.

The built version of jStat will be put in the dist/ subdirectory.

Generate just the documentation by running make doc. Documentation will be placed in dist/docs by default.

To remove all built files, run make clean.

Running Tests

Execute all tests by running make test.

Or if you wish to run a specific test, cd to test/<subdir> and run node <some_test>-test.js.

Get the Code

For those who don't want to build jStat themselves, both the minified and unminified source are located in the dist/ directory.

Contribute

jStat is now going to follow most of the v8 JavaScript guidelines. There will be plenty of source that uses the old style, but we're going to work away from that.

Also, we'll be going through and reimplementing a good portion of the code to run faster. Hopefully it won't take too long to get the project on one basic standard.

When submitting pull requests, no need to check in dist/*.js. They'll be recompiled for distribution anyway.

Join the Community

We always like discussion of how to improve jStat. Join us at our mailing list and let us know what you'd like to see. Also come ask questions in the #jstat channel on irc.freenode.net.

jstat's People

Contributors

adamnovak, akrawitz, arturaugusto, budnix, fbukevin, gorbach, jakutis, jamescgibson, jfkw, kotarou3, maciejkula, mrwillihog, mryellow, nunosempere, petulla, peytonm, pieterlukasse, pratapvardhan, rasmusab, richarddmorey, rintaun, skawian, sled, smarden1, swamwithturtles, tlsim, trevnorris, tushargupta51, utvara, yiyuezhuo


jstat's Issues

wrong col value in map method (core.js)

I changed the subtract method in my fork, but encountered a bug in the map method:
the col value passed to the function is wrong.

PS: it seems to be correct in the tushargupta51 fork.

cacheable results

when a result is generated, the results should be cached on the single object so they don't have to be calculated again.

also would need to have a reset function that would clear the cache if the object is altered.
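
A minimal sketch of the caching idea, assuming a hypothetical wrapper object. The names CachedStats, get, set, and reset are illustrative only and are not part of jStat.

```javascript
// Hypothetical wrapper: caches derived statistics until the data changes.
function CachedStats(data) {
  this.data = data.slice();
  this.cache = {};
}

// Compute a named statistic once and reuse the stored value afterwards.
CachedStats.prototype.get = function (name, compute) {
  if (!Object.prototype.hasOwnProperty.call(this.cache, name)) {
    this.cache[name] = compute(this.data);
  }
  return this.cache[name];
};

// Clear every cached value; call this whenever the data is altered.
CachedStats.prototype.reset = function () {
  this.cache = {};
};

// Replace the data and invalidate the cache in one step.
CachedStats.prototype.set = function (data) {
  this.data = data.slice();
  this.reset();
};

// Example statistic used with get().
function mean(values) {
  var total = 0;
  for (var i = 0; i < values.length; i++) total += values[i];
  return total / values.length;
}
```

Calling get('mean', mean) twice runs the computation once; after set() or reset() the next call recomputes it.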

style guidelines

create simple style guidelines to be followed when coding for jstat.

Spelling mistake

Project description is

JavaScript Statistical Libraray

Should say

JavaScript Statistical Library

Not usable as a node.js module?

This library builds with npm and is listed in the npm directory, but it can't itself be require()'d like a module, and none of its files export any methods. Is that by design?

vector methods should return jStat object

If an instance method of jStat returns an array, then the array should always be wrapped in a jStat object. Example:

jStat( 1, 5, 5 ).cumsum()  // returns an Array, not a jStat instance

Division By Zero in gammap

If you call the gammap function with a and x such that a is greater than x by one, then line 119 of special.js throws a division by zero exception.

fn.cumsum( true )

running fn.cumsum( true ) results in an error if running over a matrix.

poisson cdf returns values >1

Similar issue to the binomial cdf, but this time there's no log(exp()) issue going on:

jStat(15,25,11).poisson(.725).cdf().toArray();
1,
1.0000000000000002,
1.0000000000000002,
1.0000000000000002,
1.0000000000000002,
1.0000000000000002,
1.0000000000000002,
1.0000000000000002,
1.0000000000000002,
1.0000000000000002,
1.0000000000000002

Since most of the discrete cdfs are going to be evaluated by summing, it would make sense to add a check:

if( sum >= 1) return 1;

That has the benefit of speeding up the code by not evaluating useless terms. But it might also have the downside of masking precision issues (like it would have with the binomial).
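
The proposed check can be sketched on a standalone summation. This is an illustration only, not jStat's implementation; poissonCdf is a hypothetical name.

```javascript
// Poisson CDF P(X <= k) for rate lambda, by direct summation of the pmf.
function poissonCdf(k, lambda) {
  if (k < 0) return 0;
  var term = Math.exp(-lambda); // pmf at 0
  var sum = term;
  for (var i = 1; i <= Math.floor(k); i++) {
    term *= lambda / i;         // pmf(i) = pmf(i - 1) * lambda / i
    sum += term;
    if (sum >= 1) return 1;     // stop early once rounding reaches 1
  }
  return Math.min(sum, 1);
}
```

As the report notes, the clamp hides rounding error rather than removing it, so it is best paired with a test that compares against a reference implementation.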

choices for build

the user should be able to select what parts they want for the build.

this will also be usable on the webpage for custom deployments.

create Makefile

create a Makefile to use Node.js to build out jStat. similar to what jQuery is doing.

Matrix Inversion - Gauss-Jordan

On some test matrices I noticed matrix inversion was returning the wrong matrix. After much testing I noticed that the Gauss-Jordan algorithm has a problem since the left side matrix after Gauss-Jordan for matrix inversion was not the identity matrix. I would fix this myself but I was getting lost in the Gauss-Jordan algorithm so in the mean time I will do some other fixes.
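
For reference, a compact Gauss-Jordan inversion with partial pivoting is sketched below. It is a standalone illustration with a hypothetical function name (invert) and an assumed singularity tolerance of 1e-12; it is not the jStat code under discussion.

```javascript
// Invert a square matrix (array of row arrays) by Gauss-Jordan elimination
// with partial pivoting. Throws if a pivot is smaller than the tolerance.
function invert(matrix) {
  var n = matrix.length;
  // Build the augmented matrix [A | I].
  var aug = matrix.map(function (row, i) {
    var identityRow = [];
    for (var j = 0; j < n; j++) identityRow.push(i === j ? 1 : 0);
    return row.slice().concat(identityRow);
  });

  for (var col = 0; col < n; col++) {
    // Pick the row with the largest absolute entry in this column.
    var pivot = col;
    for (var r = col + 1; r < n; r++) {
      if (Math.abs(aug[r][col]) > Math.abs(aug[pivot][col])) pivot = r;
    }
    if (Math.abs(aug[pivot][col]) < 1e-12) throw new Error('Matrix is singular');
    var tmp = aug[col];
    aug[col] = aug[pivot];
    aug[pivot] = tmp;

    // Scale the pivot row so the pivot becomes 1.
    var scale = aug[col][col];
    for (var c = 0; c < 2 * n; c++) aug[col][c] /= scale;

    // Eliminate this column from every other row.
    for (var i2 = 0; i2 < n; i2++) {
      if (i2 === col) continue;
      var factor = aug[i2][col];
      if (factor === 0) continue;
      for (var k = 0; k < 2 * n; k++) aug[i2][k] -= factor * aug[col][k];
    }
  }
  // The left half is now the identity; the right half holds the inverse.
  return aug.map(function (row) { return row.slice(n); });
}
```

A useful regression test for this kind of bug is to multiply the result by the input and check that the product is the identity matrix.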

require("jStat") fails

npm install jStat
require("jStat");

> require('jStat');
Error: Cannot find module 'jStat'
    at Function.Module._resolveFilename (module.js:331:15)
    at Function.Module._load (module.js:273:25)
    at Module.require (module.js:357:17)
    at require (module.js:373:17)
    at repl:1:1
    at REPLServer.defaultEval (repl.js:130:27)
    at bound (domain.js:255:14)
    at REPLServer.runBound [as eval] (domain.js:268:12)
    at REPLServer.<anonymous> (repl.js:277:12)
    at REPLServer.EventEmitter.emit (events.js:104:17)

Problem with jStat.studentt.inv

jStat.studentt.inv doesn't return the correct values. For example, jStat.studentt.inv(0.05, 5) returns the same value as jStat.studentt.inv(0.95, 5). A quick glance at the source makes me naively think that the line:
return (p > 0) ? x : -x;
should be:
return (p >= 0.5) ? x : -x;
But that's without my knowing if the rest of the function is correct.
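
The sign logic can be checked on a case with a closed form. Student's t with one degree of freedom is the standard Cauchy distribution, whose quantile is tan(pi * (p - 0.5)). The sketch below computes the magnitude from the tail probability and then applies the proposed comparison; cauchyInv is a hypothetical name, not a jStat function.

```javascript
// Quantile of Student's t with 1 degree of freedom (the standard Cauchy).
// The magnitude comes from the smaller tail probability; the sign is then
// chosen by comparing p with 0.5, as suggested in the report.
function cauchyInv(p) {
  var tail = Math.min(p, 1 - p);                      // always <= 0.5
  var x = Math.abs(Math.tan(Math.PI * (tail - 0.5))); // non-negative magnitude
  return (p >= 0.5) ? x : -x;
}
```

With a test of p > 0 instead, every p in (0, 1) would take the positive branch, which matches the reported symptom of inv(0.05, 5) equalling inv(0.95, 5).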

PDFs return inconsistent values outside of support

The PDF functions for the different distributions are inconsistent in the value they return if the specified x falls outside of the support for the distribution.
For example:
jStat.triangular.pdf(-1,1,5,3) returns 0
jStat.chisquare.pdf(-1,5) returns NaN
jStat.centralF.pdf(-1,2,5) returns undefined
I suggest that a consistent value be returned across all of the distributions. Furthermore, I suggest that this value be either 0 or NaN. I think NaN is the appropriate way in Javascript to express the concept of being mathematically undefined, whereas undefined means undefined in computer programming terms. Whether the returned value should be 0 or NaN seems to be a mathematical issue, which I will leave to those more qualified.
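
One way to enforce a single convention is a wrapper applied to every density. The sketch below is an assumption about how that could look; withSupport and its parameters are hypothetical names, and the choice of 0 as the outside value is only an example.

```javascript
// Wrap a density so that every x outside [lower, upper] yields the same value.
// Non-numeric or NaN input always yields NaN.
function withSupport(pdf, lower, upper, outside) {
  return function (x) {
    if (typeof x !== 'number' || isNaN(x)) return NaN;
    if (x < lower || x > upper) return outside;
    return pdf(x);
  };
}

// Example density: exponential with rate 1, supported on [0, Infinity).
var expPdf = withSupport(function (x) { return Math.exp(-x); }, 0, Infinity, 0);
```

Switching the whole library between 0 and NaN would then be a one-argument change.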

gammaln precision error

A while back I simplified gammaln() for a performance improvement, at the cost of precision. But now that's become a problem.

Revisit gammaln() and make it more precise.

jstat needs to be cleaned up and needs a new release

Sorry for the generic issue title. It seems jstat is now stagnant and the repo doesn't have a proper, tagged release. It might be fine if http://jstat.org was updated with API documentation for v1.0.0 but that is missing as well.

The README mentions that the project is in the middle of a merge. I'm not asking for the merge to be completed but it would be nice if v1.0.0 was documented or if a new release was made from the repo code.

Prebuild dist folder

In order to use jstat without having to install node.js or to integrate it with WebJars, it would be nice if the dist folder would be part of the repository

jStat.fn.norm is incorrect

jStat.fn.norm is currently assigned in core.js even though the method has been moved to linearalgebra.js. This needs to be fixed.

performance tests

create performance tests to make sure jstat isn't hammering the browser.

js console

add a console users can use to generate data.

replace vows unit tests

vows.js isn't working on node v0.10. seems to be no longer supported. need to replace it w/ something else.

ES5 Function#bind fallback issue

<script>Function.prototype.bind = false;</script>
<script src="jstat.js"></script>
<script>
function Foo(a,b,c) {
  this.a = a;
  this.b = b;
  this.c = c;
}

var Bound = Foo.bind({}, 1, 2);
var b = new Bound(3);
console.log(Object.keys(b)); // get [], should be ['a', 'b', 'c']
console.log(b.c); // undefined, should be 3
</script>

Here is how es5-shim handles it.
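
A fallback that also works with new can be sketched as follows. It is written in the spirit of that approach but is not es5-shim's actual code, and bindCompat is an illustrative name.

```javascript
// Bind `target` to `thisArg` with leading arguments, while still allowing the
// bound function to be used as a constructor.
function bindCompat(target, thisArg) {
  var boundArgs = Array.prototype.slice.call(arguments, 2);
  function Empty() {}
  function bound() {
    var args = boundArgs.concat(Array.prototype.slice.call(arguments));
    if (this instanceof bound) {
      // Called with `new`: ignore thisArg and initialise the new object.
      var result = target.apply(this, args);
      return Object(result) === result ? result : this;
    }
    return target.apply(thisArg, args);
  }
  // Make instances of `bound` inherit from target.prototype.
  if (target.prototype) {
    Empty.prototype = target.prototype;
    bound.prototype = new Empty();
  }
  return bound;
}
```

The key step is the instanceof check: without it, the constructor writes its properties onto thisArg and the new object stays empty, which is the behaviour shown in the report.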

fn.mode( true ) fails for matrices

fn.mode( true ) fails when run over a matrix. Example:

jStat([[1,2,4],[3,4,1],[1,4,4]]).mode() === [1,4,4]
jStat([[1,2,4],[3,4,1],[1,4,4]]).mode(true) === false

The latter result should be 4.

fn.variance & fn.stdev - flag population/sample

In the static methods for variance() and stdev(), a flag can be passed to indicate whether to compute the population or the sample statistic for matrices. This cannot be done for the instance methods, though it should be implemented.
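
The difference the flag makes is only the divisor. A standalone sketch, with hypothetical function names and a boolean flag whose meaning (true selects the sample statistic) is an assumption for this example:

```javascript
// Variance of an array. When `sample` is true, divide by n - 1 (sample
// variance); otherwise divide by n (population variance).
function variance(values, sample) {
  var n = values.length;
  var mean = 0;
  for (var i = 0; i < n; i++) mean += values[i];
  mean /= n;
  var sumSquares = 0;
  for (var j = 0; j < n; j++) sumSquares += (values[j] - mean) * (values[j] - mean);
  return sumSquares / (sample ? n - 1 : n);
}

// Standard deviation with the same flag.
function stdev(values, sample) {
  return Math.sqrt(variance(values, sample));
}
```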

cannot use jStat.randn(n)

it throws an error:

jStat.randn()
-1.8389875144420988
jStat.randn(3)
TypeError: Object 0,0,0,0,0,0,0,0,0 has no method 'alter'

Known Issue: Regression

There is a known issue with regression. It occurs due to some type checking (some things require a jStat object while others don't etc). It will get fixed in future.

ttest function

I want to use ttest to check whether my data (in an array) is normally distributed.

What does the "value" parameter mean in the ttest function?

jStat.ttest( value, array, sides)?

Thanks.

binomial cdf returns values >1

There's some imprecision here creeping in from somewhere.

> jStat.binomial.cdf(21,22,.3)
1.000000000005807

R returns:

> print(pbinom(21,22,.3),digits=22)
[1] 0.9999999999968618435986

The problem appears to be combinationln, because using no logs:

>  var d = [];  
> for(var i=0;i<22;i++) d[i] = jStat.combination(22,i) * Math.pow(.3,i) * Math.pow(1-.3,22-i)
> jStat.sum(d);  
0.9999999999968605

agrees with R, and using logs for only the likelihood term:

> for(var i=0;i<22;i++) d[i] = jStat.combination(22,i) * Math.exp(i * Math.log(.3) + (22-i)*Math.log(1-.3))
1.6108943932619987e-10
> jStat.sum(d);
0.99999999999686

also agrees with R. jStat.combination() fails, however, for moderately sized N and k:

> jStat.combination(200,150)
Infinity

So the logs are necessary for large N or small/large p situations. We should probably add a check to make sure that N and p are "appropriately" sized before taking logs. And maybe if the cdf ever goes above 1 or below 0, we make it 1 or 0, respectively?
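
A sketch of one possible combination of these ideas: accumulate the log of the binomial coefficient as a sum of logs of ratios, so it stays finite for large N, and clamp the final sum to [0, 1]. The names logChoose and binomialCdf are hypothetical, the code assumes 0 < p < 1, and it is not jStat's implementation.

```javascript
// log of C(n, k), accumulated term by term so that it stays finite even
// where the coefficient itself would overflow to Infinity.
function logChoose(n, k) {
  k = Math.min(k, n - k); // use the symmetric, shorter product
  var result = 0;
  for (var i = 1; i <= k; i++) result += Math.log((n - k + i) / i);
  return result;
}

// Binomial CDF P(X <= x) for n trials with success probability p (0 < p < 1).
function binomialCdf(x, n, p) {
  if (x < 0) return 0;
  if (x >= n) return 1;
  var sum = 0;
  for (var k = 0; k <= Math.floor(x); k++) {
    sum += Math.exp(logChoose(n, k) + k * Math.log(p) + (n - k) * Math.log(1 - p));
  }
  return Math.min(Math.max(sum, 0), 1); // clamp rounding error into [0, 1]
}
```

The clamp guarantees a valid probability but, as with the Poisson case, it can mask a genuine loss of precision in the terms.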

Instance function "multiply" takes only Array?

Is it by design? I am so confused.

var vector = jStat([1,1,1]);
var vectorT = vector.transpose();

console.log(vector.multiply(vectorT));
console.log(vector.multiply(vectorT.toArray()));
console.log(jStat.multiply(vector, vectorT));
console.log(jStat.multiply(vector, vectorT.toArray()));
Object[[NaN, NaN, NaN]]
Object[[]]
[NaN, NaN, NaN]
3

I built from this revision e4c2d0c

jstat.org link in Readme

The preamble to the README about the divergence between this repo and jstat.org is confusing, as jstat.org seems to redirect to this repo.

Just a heads up.

Where is jstat.org?

What happened to jstat.org and the statistical library that was located there?

Issue with beta distribution's PDF

Calling jStat.beta.pdf(0, 1, 4) returns NaN, when it should return 4. This seems to be true for all values of the beta parameter except for 1. Likewise, calling jStat.beta.pdf(1, 4, 1) also returns NaN instead of 4. Again, this problem seems to exist for all values of the beta parameter except for 1.
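
The report does not identify the cause. One plausible source of NaN at the boundary, assuming the density is evaluated in log space, is the product 0 * log(0), which is NaN in IEEE arithmetic. The sketch below shows a guard for that case; xlogy and betaLogKernel are illustrative names, not jStat functions.

```javascript
// x * log(y), defined as 0 when x is 0 (even though log(0) is -Infinity).
function xlogy(x, y) {
  return x === 0 ? 0 : x * Math.log(y);
}

// Log of the unnormalised beta density x^(alpha - 1) * (1 - x)^(beta - 1).
function betaLogKernel(x, alpha, beta) {
  return xlogy(alpha - 1, x) + xlogy(beta - 1, 1 - x);
}

// For alpha = 1 the normalising constant is B(1, beta) = 1 / beta, so the
// density at x = 0 is beta * exp(kernel), which evaluates to beta.
var densityAtZero = 4 * Math.exp(betaLogKernel(0, 1, 4));
```

Here densityAtZero evaluates to 4, the value the report expects, whereas the unguarded product 0 * Math.log(0) would propagate NaN.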

Multiple Mode Values in Distribution

The "Mode" is a rather nasty fellow... What should happen if every value in a distribution occurs only once, or more generally if there are multiple mode values?

In jstat you decided to return false if there is more than one mode value. This is understandable, but in my opinion it is not for the implementation in the framework to decide. We actually have customers who want to see the multiple mode values in their reports.

I would vote for returning an array when there is more than one mode value and a scalar when there is just one.

Florian
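
The proposed behaviour can be sketched as a standalone function. The name modes and the ascending sort of tied values are assumptions made for this example.

```javascript
// Return the mode of an array of numbers: a scalar when exactly one value is
// most frequent, otherwise an array of all tied values in ascending order.
function modes(values) {
  var counts = new Map();
  var best = 0;
  values.forEach(function (v) {
    var c = (counts.get(v) || 0) + 1;
    counts.set(v, c);
    if (c > best) best = c;
  });
  var result = [];
  counts.forEach(function (c, v) {
    if (c === best) result.push(v);
  });
  result.sort(function (a, b) { return a - b; });
  return result.length === 1 ? result[0] : result;
}
```

A caller that needs a uniform type can wrap the result with [].concat(modes(data)).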

bug in gamma/chisq cdf and inverse cdf

(I just cloned the repository today, so I think I'm working with the most recent version; I would give you a revision number, but I don't know much about git)

There are several issues with the gamma cdf and inverse cdf.

  • On line 143 of distribution.js, the inverse cdf of the gamma distribution is defined as:
2 * jStat.gammapInv( p, 0.5 * dof );

However, the function gammapInv() is undefined. On line 173 of special.js, the function gammapinv() is defined (note the case change). This just seems to be a typo, and I would just correct it and submit the change, but:

  • The function for the cdf and inverse cdf of the gamma functions are still wrong even with the change. For instance,
jStat.gamma.cdf(1,4,2);

returns 1.764162786476752, which of course can't be correct because it is greater than 1. I believe the problem is the fact that the gamma.cdf(x, dof) function is defined on line 139 of distribution.js as

jStat.gammap( x / 2, dof / 2 );

however, the gamma cdf must be normalized (see the R help, for instance). I played around with getting that to work, but couldn't get them to match up with R even with normalizing; I figured someone with more familiarity with the jStat code would probably be able to fix it quickly.

Beta distribution PDF returns NaN for large parameters

For large values of the parameters, beta distribution returns NaN in evaluating the pdf.

For example jStat.beta.pdf(0.05,200, 4000).

The older jstat from jstat.org seems to use a different method for calculating the pdf in these cases, which succeeds where jStat doesn't.

I've done an interactive example comparing them on this page: http://www.peakconversion.com/calc/jstattest.html

The problem seems to be at this line:
pdf : function( x, alpha, beta ) {
    return (x > 1 || x < 0) ? 0 : ( Math.pow( x, alpha - 1 ) * Math.pow( 1 - x, beta - 1 )) / jStat.betafn( alpha, beta );
},
which results in 0/0

Tracing execution through the older jstat it comes to this comment, which could be relevant:

/* n*p or n*q can underflow to zero if n and p or q are small. This
used to occur in dbeta, and gives NaN as from R 2.3.0. */
lc = jstat.stirlerr(n) - jstat.stirlerr(x) - jstat.stirlerr(n-x) - jstat.bd0(x,n*p) - jstat.bd0(n-x,n*q);

Looks like it's calculating the pdf quite differently.

gammap approximation breaks down for small k

The gammap function has issues with moderately small values of a, where the function doesn't increase monotonically as x increases. For instance,

jStat( 0.5, 3, 10 ).gamma( .15, 2 ).cdf().toArray();

yields

0.8253665805712278,
0.84370429375926,
0.8380937268336934,
0.8179624512257149,
0.7883836589089586,
0.75262694142181,
0.7129903285130516,
0.9738102454941774,
0.9785160308224301,
0.9823072094250467

If you look at a plot, it appears to have a discontinuity. The problem gets worse for progressively smaller values of a. The problem appears to be in lines 128-132 of special.js:

} else if ( x < a + 1 ) {
    for ( ; i <= ITMAX; i++ ) {
        sum += del *= x / ++ap;
    }
    endval = sum * Math.exp( -x + a * Math.log( x ) - ( aln ));
}

because it appears to happen when x < a + 1. I had a brief look at the Numerical Recipes algorithm but didn't immediately see what might be wrong. Could be not enough terms (ITMAX) in the series?
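
The report leaves the cause open. For comparison, the series for the regularized lower incomplete gamma function with an explicit convergence test can be sketched as below. This follows the series given in Numerical Recipes; the function names and the ITMAX and EPS values are assumptions, and it is neither jStat's code nor a diagnosis of the bug.

```javascript
// ln(Gamma(x)) for x > 0, using the Lanczos coefficients from Numerical Recipes.
function gammaln(x) {
  var cof = [76.18009172947146, -86.50532032941677, 24.01409824083091,
             -1.231739572450155, 0.1208650973866179e-2, -0.5395239384953e-5];
  var y = x;
  var tmp = x + 5.5;
  tmp -= (x + 0.5) * Math.log(tmp);
  var ser = 1.000000000190015;
  for (var j = 0; j < 6; j++) ser += cof[j] / ++y;
  return -tmp + Math.log(2.5066282746310005 * ser / x);
}

// Regularized lower incomplete gamma P(a, x) via its power series:
// P(a, x) = exp(-x) * x^a / Gamma(a) * sum_{n>=0} x^n / (a (a+1) ... (a+n)).
function gammapSeries(a, x) {
  if (x <= 0) return 0;
  var ITMAX = 1000;
  var EPS = 1e-14;
  var ap = a;
  var del = 1 / a;
  var sum = del;
  for (var i = 1; i <= ITMAX; i++) {
    ap += 1;
    del *= x / ap;
    sum += del;
    if (Math.abs(del) < Math.abs(sum) * EPS) break; // series has converged
  }
  return sum * Math.exp(-x + a * Math.log(x) - gammaln(a));
}
```

Two closed forms make convenient checks: P(1, x) = 1 - exp(-x), and P(0.5, x) = erf(sqrt(x)).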
