numbers / numbers.js
Advanced Mathematics Library for Node.js and JavaScript
License: Apache License 2.0
Let's implement a legit algorithm. Maybe http://en.wikipedia.org/wiki/Pollard's_rho_algorithm
I'll work on this when I next need to take a break from finals studying.
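For anyone picking this up, here is a rough sketch of Pollard's rho with Floyd cycle detection. The names pollardRho and gcd, the starting value 2, and the default constant c = 1 are illustrative assumptions, not anything that exists in numbers.js, and the size guard is only there because plain Numbers lose exactness above 2^53.

```javascript
// Greatest common divisor via the Euclidean algorithm (non-negative inputs).
function gcd(a, b) {
  while (b !== 0) {
    var t = a % b;
    a = b;
    b = t;
  }
  return a;
}

// Pollard's rho with Floyd cycle detection (hypothetical helper).
// Returns a non-trivial factor of n, or null when this attempt finds none
// (which is what happens when n is prime).
function pollardRho(n, c) {
  if (n < 4) return null;      // 0..3 have no non-trivial factor
  if (n % 2 === 0) return 2;
  // x * x must stay below 2^53 to be exact with plain Numbers.
  if (n > 94906265) throw new RangeError('n is too large for Number arithmetic.');
  c = c || 1;
  var f = function (x) { return (x * x + c) % n; };
  var x = 2, y = 2, d = 1;
  while (d === 1) {
    x = f(x);        // tortoise: one step
    y = f(f(y));     // hare: two steps
    d = gcd(Math.abs(x - y), n);
  }
  return d === n ? null : d;
}

var factor = pollardRho(8051); // 8051 = 83 * 97, so this is one of those two
```

When the call returns null for a composite n, retrying with a different c is the usual workaround.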
I'm about to start implementing statistical regressions, however I'd like to hear your opinions on what the most useful implementation would be. The options I have in mind are:
A third possible option would be to specify the length of X in option 1, such that the regression model could be used to predict beyond the length of Y, but I think if we are going to address that level of complexity it would be easier to simply return the function and let the user decide which X values they care about.
We could implement a range "helper" function for 2 that could look something like this:
var reg = numbers.statistics.exponentialRegression(array);
var results = reg(numbers.range(1,array.length));
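To make the "return the function" option concrete, here is a minimal sketch. It assumes the X values are 1..n, that every Y value is positive, and that the fit is ordinary least squares on ln(y); the standalone range and exponentialRegression functions below are hypothetical stand-ins for the proposed numbers.range and numbers.statistics.exponentialRegression.

```javascript
// Hypothetical helper: integers from start to end, inclusive.
function range(start, end) {
  var out = [];
  for (var i = start; i <= end; i++) out.push(i);
  return out;
}

// Fit y = a * exp(b * x) by least squares on ln(y), with x = 1..ys.length.
// Assumes at least two points and strictly positive ys.
// Returns a function that maps an array of x values to predictions.
function exponentialRegression(ys) {
  var n = ys.length, sx = 0, sy = 0, sxx = 0, sxy = 0;
  for (var i = 0; i < n; i++) {
    var x = i + 1, ly = Math.log(ys[i]);
    sx += x;
    sy += ly;
    sxx += x * x;
    sxy += x * ly;
  }
  var b = (n * sxy - sx * sy) / (n * sxx - sx * sx);
  var a = Math.exp((sy - b * sx) / n);
  return function (xs) {
    return xs.map(function (x) { return a * Math.exp(b * x); });
  };
}

var data = [2, 4, 8, 16];
var reg = exponentialRegression(data);
var fitted = reg(range(1, data.length));        // values over the observed range
var forecast = reg(range(1, data.length + 1));  // one step beyond the data
```

Because the caller chooses the X values, predicting past the end of Y needs no extra parameter.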
npm run format fails with errors. I get the following:
npm ERR! [email protected] format: `gulp format`
npm ERR! Exit status 8
@LarryBattle thoughts on what this is due to?
I've been thinking a lot lately about what the API should look like, which seems important to nail down before we add too much more code (things'll just get harder to change and the API's key for marketing/usability). Let's use this issue to try to figure out the overarching principles.
Right now we have two conflicting modes of operation going on:
If we're moving in the object direction, I think we need to figure out how we're going to keep that feeling light and usable—cause it can get clunky really fast.
A couple principles came up from the last discussion:
Having factory methods on numbers that make it easy to get from built-ins to our custom objects at the start of the chain. I.e.: numbers.createMatrix(arr).matrixMethod().
Instead of the createX naming, we could also do newX or makeX, but, please, let's not end up with numbers.matrix(arr); there should be something in the method name that implies that an object is being created.
All object methods that take another custom object as an argument (e.g. Vector getting the distance between another Vector) should also take a built-in representation of that other object wherever possible. @milroc suggested this and I totally agree.
(One thing to consider here is whether passing in a native structure rather than one of our objects should ever cause the return type to also be a built-in rather than one of our custom objects. Though this may be a moot point if we have our objects extend from the built-ins...see below.)
We should also shorten some of our method names in general (along the lines of what @revivek did in #57)
Combining the first two, then, calculating the distance between two vectors would look like:
numbers.createVector([0,1,3,4]).distanceFrom([3,5,6,7])
as opposed to the current:
(new numbers.linear.Vector([0,1,3,4])).distanceFrom(new numbers.linear.Vector([3,5,6,7]));
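A minimal sketch of how those two principles could fit together. The Vector below is a hypothetical stand-in (the real numbers.linear.Vector may be shaped differently), and Euclidean distance is assumed for distanceFrom.

```javascript
// Hypothetical Vector type holding a private copy of the data.
function Vector(data) {
  this.data = data.slice();
}

// Accepts either another Vector or a plain array of the same length.
Vector.prototype.distanceFrom = function (other) {
  var b = other instanceof Vector ? other.data : other;
  if (b.length !== this.data.length) {
    throw new Error('Vectors must have the same length.');
  }
  var sum = 0;
  for (var i = 0; i < b.length; i++) {
    var d = this.data[i] - b[i];
    sum += d * d;
  }
  return Math.sqrt(sum);
};

// Factory method: the name says that an object is being created.
var numbers = {
  createVector: function (arr) { return new Vector(arr); }
};

var viaArray = numbers.createVector([0, 1, 3, 4]).distanceFrom([3, 5, 6, 7]);
var viaVector = numbers.createVector([0, 1, 3, 4])
  .distanceFrom(numbers.createVector([3, 5, 6, 7]));
```

Both calls go through the same code path, so the built-in form costs nothing beyond the instanceof check.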
Then I've also been thinking about a couple other principles.
Maybe our data types could extend the built-in objects. So Matrix would extend Array, for instance, and you could do things like numbers.createMatrix(arr).transpose()[0][0], because the Matrix returned by .transpose() would also be an Array.* This could be super convenient. Most significantly, it would let our data structures interact with other libraries that expect native arrays.
The downside is that it requires giving up some encapsulation. For instance, the idea of caching the length property in rowCount goes away because the data can now be updated without the object knowing about it. Similarly, if we had a Set object you could imagine caching the mean/average in an instance property to speed up a lot of calculations, but allowing direct access to this data would make it impossible to automatically know when to invalidate this cache. (Note that in both of these examples the data could really have been updated directly anyway, but at least in the non-array approach you'd have had to go through .data, which could easily be documented as internal or even renamed .__data.)
One option would be to say in the docs that the underlying Array methods should only be used if they don't transform the underlying data. That would still leave some utility in the native Array interface (e.g. direct access to an element like in the transpose example, but also access to a row or set of rows with .slice, the ability to loop over rows in implementations that support .forEach, etc).
Another option would be to create a naming convention for a method that recalculates any internal properties, i.e. we could say "go crazy with the native array interface, setting things, deleting them, adding them, whatever, and then just call .update() or whatever to reset the key stuff in the internal state". Having to call an update() seems a little much though.
Btw, another really cool application of extending the built-ins: applying it to functions too. So we could have something like:
function linearPointDiff(x) {
  // The derivative of m*x + b is m at every point, so x is unused.
  return this.m;
}
numbers.createLinearFn = function(m, b) {
//if this mode of extension doesn't make sense,
//see the footnote about extending arrays at the bottom
var func = function(x) { return m*x + b; }
func.isContinuous = true;
func.m = m;
func.b = b;
func.pointDiff = linearPointDiff;
return func;
}
var line = numbers.createLinearFn(3, 0);
line(4); //fuck, we can evaluate the function directly!
line.pointDiff(5); //and it has methods
If we decide not to have our objects extend the built-in data types, then we should create a consistently-named method on every object for getting from that object to a representation of it using a built-in structure. Maybe something like toBuiltIn. That way the user can call this method at the end of the chain before handing their data off to the next part of their application.
For convenience, I think we should allow subclass methods to be called on super-class objects where applicable. For instance, if someone has created a Matrix object and tries to call determinant, that should transparently forward the call to the determinant on SquareMatrix (if the Matrix is square).
I don't like the idea of the superclass definition knowing about its subclasses (the coupling seems way too tight), but we should be able to avoid that if we just keep all the code for adding the forwarding (which would modify Superclass.prototype directly) in a separate part of the codebase. That way, there'd still be one location with the primary superclass definition and that chunk of code could easily be transplanted into another project and operate without any dependencies on the subclass.
Finally, there were two other things that were bothering me:
If there are things that we don't want to put in objects, what are they and where should those go? As I mentioned in the other issue, one example of this might be the methods that operate on single numbers, because having to create a Number object to house those methods seems like overkill. In the other issue, I proposed putting these in a numbers.util "static class" that would function basically how the library does now. But is that the best option? What are the alternatives?
As we think about restructuring in terms of objects, the objects that make the most sense to me seem to be those around mathematical constructs (Set, Sequence, Distribution, Function, etc), but this is very different from our current taxonomy which is based around mathematical fields (calculus, stats, linear algebra, primality, etc).
Now, on one hand, I can see this switch actually resolving some ambiguity and confusion. For instance, why are min and max on basic whereas median and mode are on stats? In a restructuring, they'd all be united under Set.
But I'm worried that having these mathematical constructs as the top-level organizational structures might make the library seem less accessible from the outside. Maybe that's just a documentation issue, though? I.e. we could tag each method with the mathematical fields it's relevant to, and then the docs would still be able to show all the stats methods or all the calc methods. The other option would be to use these structures internally but somehow expose an API structured like the current one...but I can't imagine how that would work. Does this seem like a big problem?
Overall thoughts? Additions?
Sorry for the length, but the API is arguably the most important design decision for the library's success, so it seemed worth a full discussion.
CCs: @sjkaliski, @davidbyrd11. Also @KartikTalwar, who's been contributing a lot so might want to follow these developments.
* Extending Array in JavaScript is a mess, but it can be done workably by having an object constructor that just returns a native array with methods tacked onto it directly ("parasitic inheritance" in Crockford-ese). And this can even be performant if the constructor tacks on functions which are only created once in an outside scope and then simply referenced by the returned array's properties...somehow, this even seems to end up faster than standard prototypal inheritance (I guess because Chrome really optimizes Array construction). I made a test for this here.
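Roughly what that footnote describes, as a sketch: createMatrix and transpose here are illustrative names, and copying the input rows is my own assumption rather than a settled design.

```javascript
// Created once in the outer scope and shared by every matrix.
function transpose() {
  var rows = this.length;
  var cols = rows ? this[0].length : 0;
  var out = [];
  for (var j = 0; j < cols; j++) {
    out.push([]);
    for (var i = 0; i < rows; i++) out[j].push(this[i][j]);
  }
  return createMatrix(out);
}

// "Parasitic" constructor: build a native array, then attach methods to it.
function createMatrix(rows) {
  var m = rows.map(function (row) { return row.slice(); }); // real Array
  m.transpose = transpose; // a reference only, no new function per matrix
  return m;
}

var t = createMatrix([[1, 2, 3], [4, 5, 6]]).transpose();
var isNative = Array.isArray(t); // the result is still a genuine Array
var element = t[0][1];           // direct element access works
```

One caveat: a method attached this way is an enumerable property, so it shows up in for...in loops over the matrix.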
e.g.
numbers = require('numbers');
numbers.matrix.transpose([1,0,0,0]); // [ undefined ]
numbers.matrix.transpose([[1],[0],[0]]); // [ [1, 0, 0] ]
I'll fix this right meow
@sjkaliski:
> @milroc bring this back to life?
Just reviewing the status of things here: a lot of stuff is still under debate, and I understand that, but if the goal is to make this active or even keep it alive, more effort needs to be put into merging and closing issues.
Someone shouldn't have to ping each of us multiple times just to get a response or acknowledgement for something they did to improve something you started. We all have a lot of other things going on and all but we need to establish a procedure to handle or at least reply to people contributing.
Takeaway: decide if numbers.js is worth reviving or make @StDako a contributor.
this is just me being nitpicky, and I think I'll really be the only one that cares, but are we going for a "less memory usage, more operations" approach or a "more memory usage, less operations" approach?
for example, I'm currently implementing the Gauss-Seidel iteration method for solving linear systems, and I'm faced with the problem of having an extra array OR computing matrix-vector multiplication. the former takes no extra FLOPS but adds a factor of n to space complexity, while the latter takes n^2 extra FLOPS but uses no extra memory.
I could write both methods, of course, and have different naming conventions. but yeah.
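For reference, a bare-bones Gauss-Seidel sketch that sidesteps the trade-off by stopping on the largest per-component change, which needs neither a second array nor an extra matrix-vector product. That stopping rule, the zero starting guess, and the gaussSeidel signature are assumptions for illustration, and convergence is only guaranteed for suitable matrices (e.g. strictly diagonally dominant ones).

```javascript
// Solve A x = b by Gauss-Seidel iteration, overwriting x as it goes.
// A is an array of rows; tol and maxIter control when to stop.
function gaussSeidel(A, b, tol, maxIter) {
  var n = b.length, x = [];
  for (var k = 0; k < n; k++) x.push(0); // start from the zero vector

  for (var iter = 0; iter < maxIter; iter++) {
    var maxChange = 0;
    for (var i = 0; i < n; i++) {
      var sum = b[i];
      for (var j = 0; j < n; j++) {
        if (j !== i) sum -= A[i][j] * x[j]; // uses already-updated entries
      }
      var next = sum / A[i][i];
      maxChange = Math.max(maxChange, Math.abs(next - x[i]));
      x[i] = next;
    }
    if (maxChange < tol) break; // O(1) extra memory, no residual computed
  }
  return x;
}

// 4x + y = 1 and 2x + 3y = 2 have the solution x = 0.1, y = 0.6.
var solution = gaussSeidel([[4, 1], [2, 3]], [1, 2], 1e-12, 1000);
```

A residual-based check would be stricter, at the cost of the extra n^2 operations per sweep mentioned above.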
You can change numbers.calculus.riemann to numbers.calculus.Riemann in "How to use" section.
In the readme the below example is given:
var numbers = require('numbers');
var func = function(x) {
return Math.sin(x);
}
numbers.calculus.riemann(func, -2, 4, 200);
I'm curious as to whether it's done this way to illustrate that numbers.calculus.riemann is intended to receive a function, or if this more concise method might be preferable:
var numbers = require('numbers');
numbers.calculus.riemann(Math.sin, -2, 4, 200);
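As a quick check that the two forms are interchangeable, here is a stand-in riemann using left endpoints. The left-endpoint rule is an assumption on my part (the library may sample differently); the point is only that any function of one argument can be passed directly.

```javascript
// Left Riemann sum of f over [start, finish] using n equal subintervals.
// Hypothetical stand-in for numbers.calculus.riemann.
function riemann(f, start, finish, n) {
  var width = (finish - start) / n;
  var total = 0;
  for (var i = 0; i < n; i++) {
    total += f(start + i * width);
  }
  return total * width;
}

var viaWrapper = riemann(function (x) { return Math.sin(x); }, -2, 4, 200);
var direct = riemann(Math.sin, -2, 4, 200);
// Both calls evaluate exactly the same expression at every sample point.
```

The wrapper only matters when the callee passes extra arguments or relies on this, neither of which applies to Math.sin here.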
The evaluate function's been hanging around in the calculus methods to support the function-as-string syntax. I say we ditch that syntax and the evaluate function all together and only take real function objects as method args. Would offer a performance boost and better consistency/reliability.
Thoughts?
npm run lint returns more than 100 warnings for all the javascript files in test and lib.
I suggest we fix those.
I might submit a pull request to fix this....soon.
Hi all,
I've been speaking with @StDako and I think a great next step for numbers.js is to move this project into a new organization, Numbers.
Not only will this be the new home of numbers.js, but an open space for interesting math-related libraries. The goal is for it to become a great resource for those who love (or are just becoming interested in) programming and math.
Cheers,
Steve
So I started writing tests for calculus.
Here's an example:
var assert = require('assert');
var numeric = require('../index.js');
var calculus = numeric.calculus;
suite('numeric', function() {
test('pointDiff should return the derivative at a point, provided function', function(done) {
var func = function(x) {
return 2 * x + 2;
};
assert.equal(2, calculus.pointDiff(func, 5));
done();
});
});
Now, the point derivative of 2x + 2 at any value is 2. The method returns the following: 1.9999999999242843
This is "basically" 2, but it won't pass tests. So I've considered two options:
I like two more, although it certainly is going to be a hassle to set up. But this way we can have an "acceptable" error bound, which is sustainable and acceptable for the browser, and has minimal effect in performance or large scale operations.
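A tolerance-based comparison could be as small as the sketch below. The name approxEqual and the 1e-6 bound are placeholders, not a decided convention.

```javascript
// True when actual is within epsilon of expected (hypothetical test helper).
function approxEqual(actual, expected, epsilon) {
  return Math.abs(actual - expected) < epsilon;
}

var estimate = 1.9999999999242843; // the value pointDiff returned above

var strictPass = (estimate === 2);                 // false: exact equality fails
var tolerantPass = approxEqual(estimate, 2, 1e-6); // true: within the bound
```

In a test this would read as assert.ok(approxEqual(calculus.pointDiff(func, 5), 2, 1e-6)), with the bound chosen per method.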
we're currently at v0.5.0, whatever the hell that means, and I'm interested in knowing what the plans are. I'm the one driving the dev, and I'm super 100% okay with that as numerical stuff is my game, but I just wanted opinions from everyone else as to what we should add. here are my thoughts, the version numbers are just some made-up-on-the-fly system:
release of linalg.js, which is capable of solving linear systems. this should have:
I've already implemented most of this and am going to submit a PR soon. gotta write tests first and that's going to be a mess.
TODO:
release of sparse.js, a combination of linalg.js and matrix.js but for sparse vectors and matrices.
TODO:
release of odes.js, used to solve ODEs. I've already written some solvers but I need to change some things.
TODO:
release of pdes.js, used to solve PDEs. I don't know much about solving PDEs numerically except for using the finite difference method which is pretty simple to implement I think.
TODO:
release of interpolate.js, used for interpolation stuff. already written a bit of this (linear splines) but I need to look into it more (cubic splines are best!)
this is something I may discuss privately with Steve & Kartik (the latter of whom I've already mentioned it to) but it will probably be its own project, so ideas for 1.0.0 are welcome.
@KartikTalwar @sjkaliski sorry for the essay.
@LarryBattle since you've been helping out recently too.
I realize that there would be quite some overhead, so this would have to be enabled separately, and might not even be supported by all tools.
http://visionmedia.github.com/mocha/
Sometimes you might want to write something other than an assertion test, but you'll still want them to be executed by the same test framework so just wrap it all in mocha... I'll help
I was working on some graphing, attempting to correlate date of post to some value. I'll take care of this.
There appear to have been quite a few changes since package.json was last updated. It would be good to bump the version and publish to npm again.
Additionally, the browserified build hasn't been touched in a while despite changes. It would be nice to update it and push it to bower.
After building the public/numbers.js file and including it in the page, no global numbers object is available.
I think there needs to be a line like global.numbers = numbers in the lib/numbers.js file, but I don't know if that might create a potential leak or error in node.js, in cases where someone might want to use something other than numbers as the library object.
E.g. for calc, basic operations, etc.
More generally, what parts of the API are fixed? Any need for BC?
This was sparked by a discussion in a previous issue / pull request.
The goal of this issue is to consider abstracting out matrices and vectors as separate structures, on top of which our functions are applied.
Rather than using an array for vectors and multi-dimensional array for matrices, it may be worth considering creating new structures. We can bind specific functions to these as well (e.g. inverting a matrix).
The test cases for random.distribution.* fail 15% of the time.
This is causing working builds to randomly fail in Travis CI.
Thus a rewrite of the test cases is needed.
Unfortunately I need to read up on this subject matter before I can proceed with a rewrite.
Error Message:
✖ 1 of 128 tests failed:
1) numbers random.distribution.irwinHall should return a normal distribution of length n within bounds of (m/2 - sub, m/2):
AssertionError: Math.abs(49.31523224141216 - 50) < 0.5
at Object.testing.approxEquals (/home/travis/build/sjkaliski/numbers.js/test/testing.js:14:10)
at Context.<anonymous> (/home/travis/build/sjkaliski/numbers.js/test/random.test.js:156:13)
at Test.Runnable.run (/home/travis/build/sjkaliski/numbers.js/node_modules/mocha/lib/runnable.js:196:15)
at Runner.runTest (/home/travis/build/sjkaliski/numbers.js/node_modules/mocha/lib/runner.js:344:10)
at /home/travis/build/sjkaliski/numbers.js/node_modules/mocha/lib/runner.js:390:12
at next (/home/travis/build/sjkaliski/numbers.js/node_modules/mocha/lib/runner.js:270:14)
at /home/travis/build/sjkaliski/numbers.js/node_modules/mocha/lib/runner.js:279:7
at next (/home/travis/build/sjkaliski/numbers.js/node_modules/mocha/lib/runner.js:227:23)
at Object._onImmediate (/home/travis/build/sjkaliski/numbers.js/node_modules/mocha/lib/runner.js:247:5)
at processImmediate [as _immediateCallback] (timers.js:345:15)
make: *** [test] Error 1
npm ERR! Test failed. See above for more details.
Source: https://travis-ci.org/sjkaliski/numbers.js/jobs/35846620
RangeError: Maximum call stack size exceeded
at Object.basic.max (\node_modules\numbers\lib\numbers\basic.js:226:19)
For big arrays (~10⁷ elements), Math.min and Math.max produce a RangeError: Maximum call stack size exceeded in node.js.
See http://stackoverflow.com/questions/1669190/javascript-min-max-array-values
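A plain loop avoids the problem because nothing is spread onto the call stack. A sketch, with max and min as illustrative standalone functions rather than the current basic.max and basic.min:

```javascript
// Largest element of a non-empty array, found with a simple scan.
function max(arr) {
  if (arr.length === 0) throw new Error('Array is empty.');
  var best = arr[0];
  for (var i = 1; i < arr.length; i++) {
    if (arr[i] > best) best = arr[i];
  }
  return best;
}

// Smallest element of a non-empty array.
function min(arr) {
  if (arr.length === 0) throw new Error('Array is empty.');
  var best = arr[0];
  for (var i = 1; i < arr.length; i++) {
    if (arr[i] < best) best = arr[i];
  }
  return best;
}

// Works no matter how long the array is.
var big = [];
for (var k = 0; k < 500000; k++) big.push(k % 1000);
var largest = max(big);
var smallest = min(big);
```

The argument-count limit that Math.max.apply runs into depends on the engine, so the loop is the portable option.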
Now that we have more than one linear algebra–related type (Matrix, Vector, and, shortly, Square Matrix) seems like the namespace housing these should be called something other than "matrix".
I was thinking a "linear" namespace, where each property points to the constructor function, i.e.
numbers.linear.Matrix
numbers.linear.Vector
etc.
But that's easy.
My bigger concern is how to mesh the OO paradigm used in these data structures with the "helper methods operating on native data types" paradigm used in the other namespaces.
One option would be to alias numbers.matrix to numbers.linear.Matrix, allowing Matrix's "static" methods (which could include addition, scalar, etc.) to mirror the API that's at numbers.matrix now, and we could set up the same thing for Vector. This would look and feel stylistically consistent, but it also seems like it would defeat the point of having the types in the first place, as the user would be putting in and getting out raw arrays, with the conversion to and from a Matrix happening internally.
So maybe we do need to expose the objects as objects and just be inconsistent?
Alternatively, maybe the other namespaces can be reworked to be similarly object-oriented? So rather than have "calculus", "statistic", and "prime" namespaces, say, we might have objects like Function, which would have methods like RiemannSum and pointDiff; Distribution, which would have methods like mean, mode, randomSample; and Number, which would have methods like primeFactorization and isPrime.
Thoughts?
Hey all,
I was trying to test out some stuff from #80, and rather than changing up everyone's work on the Matrix data structure, I thought I'd give Set a try. The reason this is not a PR is that there is still work to be done (namely testing and potential problems below).
problems with Set currently:
Let's use Gulp instead of Make for build automation.
I'll push a pull request soon.
so it's published now....https://npmjs.org/package/numbers
i figured we could bounce it around the internet tonight, maybe throw it up on HN tomorrow too...
http://jsdoc.info/sjkaliski/numbers.js/ doesn't work as jsdoc.info has expired, so now we have nothing for documentation. not sure what anyone would want to do about this, or what we can do. we could host it on our own website (I can get a domain, preferably .info because it's cheaper), or someone could put it on their personal website. either way, no documentation isn't ideal
I noticed the gh-pages branch was created recently. Will we be continuing with a jsDoc compiler, or moving to a more manual documentation process?
If it's the latter, I would be interested in getting that started. I'm aware that the API is very much in flux at the moment, but I could get the structure and design worked out I think.
I closed #135 because I decided to stick with the [ [1, 0, 0] ] way of creating row vectors instead of the [1, 0, 0]. that being said, we should probably add a section to the readme saying that we're using this format as it may be ambiguous.
I generally prefer to use built in data structures for languages. However I feel like it might be useful to actually define Objects to encapsulate Vectors and Matrices that might help improve code maintenance and legibility when expanding the matrix portion of numbers.js.
I say Minimum Viable because the less Object-heavy, the better, in my personal opinion.
The README refers to a chi square test in the stats module - but it doesn't appear to be there.
FYI I found this lib while searching for one specifically that contains a chi square test.
mentioned briefly in #113. it would be nice to be able to have each component of the repo be separate. not all users care about statistics and matrix operations. some questions arise:
exports)
The problem from 2) is: how do we merge all of this into one large module? I'm sure this isn't too difficult, I am just unaware of any library that does this.
Should have public dir to hold numbers.js and numbers.min.js
e.g.
n = require('numbers')
var m1 = [[1,2,3],[4,5,6],[7,8,9]];
var m2 = n.matrix.scalar(m1, 10);
this function alters m1 as well.
I think it would be good to have an option for having both in-place operations and ... not in-place. Julia (http://julialang.org) uses the ! character to represent in-place operations, but we are not lucky enough to have access to this character. any suggestions?
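One possibility, sketched here with a made-up InPlace suffix standing in for Julia's !: the plain name returns a fresh matrix, and the suffixed name mutates its argument. Both function names are placeholders for discussion.

```javascript
// Returns a new matrix; M is left untouched.
function scalar(M, k) {
  return M.map(function (row) {
    return row.map(function (value) { return value * k; });
  });
}

// Multiplies every entry of M by k in place and returns M itself.
function scalarInPlace(M, k) {
  for (var i = 0; i < M.length; i++) {
    for (var j = 0; j < M[i].length; j++) {
      M[i][j] *= k;
    }
  }
  return M;
}

var m1 = [[1, 2, 3], [4, 5, 6], [7, 8, 9]];
var m2 = scalar(m1, 10);        // m1 still holds 1..9
var same = scalarInPlace(m1, 10); // m1 now holds 10..90, and same === m1
```

The copying version costs one extra allocation per call, which ties into the memory-versus-operations question raised earlier.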
It seems that getCol() only returns the first N elements of a matrix's row where N is the length of a row. I was working with a [100x9] matrix and getCol() only returned the first 9 elements of a column.
I believe the code for getCol() should be this. (Notice the for loop's condition check):
matrix.getCol = function(M, n) {
var result = [];
if (n < 0) {
throw new Error('The specified column must be a positive integer.');
} else if (n >= M[0].length) {
throw new Error('The specified column must be between 0 and the number of columns - 1.');
}
for (var i=0; i<M.length; i++) {
result.push(M[i][n]);
}
return result;
}
Rather than this:
matrix.getCol = function(M, n) {
var result = [];
if (n < 0) {
throw new Error('The specified column must be a positive integer.');
} else if (n >= M[0].length) {
throw new Error('The specified column must be between 0 and the number of columns - 1.');
}
for (var i=0; i<M[0].length; i++) {
result.push(M[i][n]);
}
return result;
}
Am I missing something simple? Maybe I have your notation for rows and columns confused
The functions gcd and egcd give weird results on edge cases like with zero and negative values:
numbers.basic.gcd(3,0) // result 1, expected 3
numbers.basic.egcd(3, 0) // result [1, 0, 1], expected [3, 1, 0]
numbers.basic.egcd(0, 3) // result [1, 0, 0], expected [3, 0, 1]
numbers.basic.gcd(-2, -6) // result 2, as expected
numbers.basic.egcd(-2, -6) // result [-2, 1, 0], expected [2, -1, 0]
numbers.basic.egcd(-2, 5) // result [1, 3, 4], expected [1, 2, 1]
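For comparison, here is a sketch of the iterative extended Euclidean algorithm that runs on absolute values and restores the signs at the end. It is not the library's current code, and the helper withSign is hypothetical. It returns [g, x, y] with a*x + b*y = g and g >= 0, and reproduces the expected values listed above.

```javascript
// Negate x when negative is true, without ever producing -0.
function withSign(x, negative) {
  return negative && x !== 0 ? -x : x;
}

// Extended Euclid for integers: returns [g, x, y] with a*x + b*y = g >= 0.
function egcd(a, b) {
  var oldR = Math.abs(a), r = Math.abs(b);
  var oldS = 1, s = 0;
  var oldT = 0, t = 1;
  while (r !== 0) {
    var q = Math.floor(oldR / r);
    var tmp = r; r = oldR - q * r; oldR = tmp;
    tmp = s; s = oldS - q * s; oldS = tmp;
    tmp = t; t = oldT - q * t; oldT = tmp;
  }
  // oldS and oldT are coefficients for |a| and |b|; flip them for negative inputs.
  return [oldR, withSign(oldS, a < 0), withSign(oldT, b < 0)];
}

function gcd(a, b) {
  return egcd(a, b)[0];
}

var zeroRight = egcd(3, 0);  // [3, 1, 0]
var zeroLeft = egcd(0, 3);   // [3, 0, 1]
var bothNeg = egcd(-2, -6);  // [2, -1, 0]
var mixed = egcd(-2, 5);     // [1, 2, 1]
```

Bezout coefficients are not unique, so other implementations can return different but equally valid pairs; the invariant worth testing is a*x + b*y = g.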
random.sample() and statistic.randomSample() are both identical methods.
Should there only be one?
I vote for random.sample() to stay.
It just makes the library more approachable for new developers (it's a standard on popular github .js libraries) and future-proofs the library: in case, for whatever reason, we need to have dependencies, we can have a lib file for that. I'd make a pull request to fix this if you want, but figured I'd get others' input on it.
couldn't find any info about licensing. Is it MIT, BSD or anything similar?
This is outside of the refactoring being done on the matrix class and should likely be done farther out in the development cycle (3.0.0, for example, is when I'd expect it). I feel like this would require a lot of development for a very low amount of utility (given the power of other currently faster languages).
In case anyone would like to implement this prior to that (I would love some recommendations), here's an introduction on how Mathematica handles it:
f[x_] := 3 x
g[x_] := 7 x
h[x_] := f[x] g[x]
Plot[h[x], {x, -5, 5}]
m[x_] := {{g[x], f[x]}, {h[x], g[x]}}
d[x_] := m[x].m[x]
d[1] (*= {{112, 42}, {294, 112}}*)
If you'd like to implement this, I'd recommend posting on here to help get a better idea on how it may be useful.
It would be likely that we'd have to extend the javascript function object in order to make this work properly.
They are implemented using JS apply(), thus copying an array into the stack frame; on large arrays it runs out of memory and crashes. Easy to reproduce in nodejs.
After running the test several times I'll usually see the below error:
✖ 1 of 61 tests failed:
1) numbers randomSample should return an array of random numbers in a certain bound:
AssertionError: 5 == 4
Not really a bug but could potentially lead to misleading error messages during testing.
The problem is that most of the test cases supply the expected value as the actual value and vice versa for assert.equal() and assert.deepEqual().
_Node 0.10.x Assertion Signatures:_
assert.equal(actual, expected, [message])
assert.deepEqual(actual, expected, [message])
Example of incorrect usage.
File: complex.test.js
Line: 16
Code:
assert.equal(10, res.im);
Should be:
Code:
assert.equal(res.im, 10);
Matrix.test.js is the only file that uses assert.deepEqual() and assert.equal() correctly.
I'll try to submit a pull request if I have time this week.
Quick question:
Does it make sense for me to send you a pull request for this modified determinant calculator?
https://github.com/KartikTalwar/numbers.js/blob/master/lib/numbers/matrix.js#L136
If it does, let me know. I'll send you the pull request (for master and abstract branches)
Thanks
So the other night I was wondering if anyone had thoughts on async vs sync methods. I am doing some research on how this could benefit certain math calculations, but was curious if anyone had any thoughts