Giter Club home page Giter Club logo

randomised_pca's Introduction

Principal component analysis.

Constructs a nearly optimal rank-k approximation U diag(s) Va to A,
centering the columns of A first when raw is False, using n_iter
normalized power iterations, with block size l, started with a
min(m,n) x l random matrix, when A is m x n; the reference PCA_
below explains "nearly optimal." k must be a positive integer <=
the smaller dimension of A, n_iter must be a nonnegative integer,
and l must be a positive integer >= k.

The rank-k approximation U diag(s) Va comes in the form of a
singular value decomposition (SVD) -- the columns of U are
orthonormal, as are the rows of Va, and the entries of s are all
nonnegative and nonincreasing. U is m x k, Va is k x n, and
len(s)=k, when A is m x n.

Increasing n_iter or l improves the accuracy of the approximation
U diag(s) Va; the reference PCA_ below describes how the accuracy
depends on n_iter and l. Please note that even n_iter=1 guarantees
superb accuracy, whether or not there is any gap in the singular
values of the matrix A being approximated, at least when measuring
accuracy as the spectral norm || A - U diag(s) Va || of the matrix
A - U diag(s) Va (relative to the spectral norm ||A|| of A, and
accounting for centering when raw is False).

Notes
-----
To obtain repeatable results, reset the seed for the pseudorandom
number generator.

The user may ascertain the accuracy of the approximation
U diag(s) Va to A by invoking diffsnorm(A, U, s, Va), when raw is
True. The user may ascertain the accuracy of the approximation
U diag(s) Va to C(A), where C(A) refers to A after centering its
columns, by invoking diffsnormc(A, U, s, Va), when raw is False.

Parameters
----------
A : array_like, shape (m, n)
    matrix being approximated
k : int, optional
    rank of the approximation being constructed;
    k must be a positive integer <= the smaller dimension of A,
    and defaults to 6
raw : bool, optional
    centers A when raw is False but does not when raw is True;
    raw must be a Boolean and defaults to False
n_iter : int, optional
    number of normalized power iterations to conduct;
    n_iter must be a nonnegative integer, and defaults to 2
l : int, optional
    block size of the normalized power iterations;
    l must be a positive integer >= k, and defaults to k+2

Returns
-------
U : ndarray, shape (m, k)
    m x k matrix in the rank-k approximation U diag(s) Va to A or
    C(A), where A is m x n, and C(A) refers to A after centering
    its columns; the columns of U are orthonormal
s : ndarray, shape (k,)
    vector of length k in the rank-k approximation U diag(s) Va to
    A or C(A), where A is m x n, and C(A) refers to A after
    centering its columns; the entries of s are all nonnegative and
    nonincreasing
Va : ndarray, shape (k, n)
    k x n matrix in the rank-k approximation U diag(s) Va to A or
    C(A), where A is m x n, and C(A) refers to A after centering
    its columns; the rows of Va are orthonormal

Examples
--------
>>> from numpy.random import uniform
>>> A = uniform(low=-1.0, high=1.0, size=(100, 2))
>>> A = A.dot(uniform(low=-1.0, high=1.0, size=(2, 100)))

Then we can either do:
>>> from scipy.linalg import svd
>>> (U, s, Va) = svd(A, full_matrices=False)
>>> A = A / s[0]

Or, as provided by this file:
>>> from randomised_pca pca
>>> (U, s, Va) = pca(A, 2, True)

This example produces a rank-2 approximation U diag(s) Va to A such
that the columns of U are orthonormal, as are the rows of Va, and
the entries of s are all nonnegative and are nonincreasing.

References
----------
.. [PCA] Nathan Halko, Per-Gunnar Martinsson, and Joel Tropp,
         Finding structure with randomness: probabilistic
         algorithms for constructing approximate matrix
         decompositions, arXiv:0909.4061 [math.NA; math.PR], 2009
         (available at `arXiv <http://arxiv.org/abs/0909.4061>`_).

randomised_pca's People

Contributors

tygert avatar ajtulloch avatar vassago1911 avatar facebook-github-bot avatar

Watchers

James Cloos avatar

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    ๐Ÿ–– Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. ๐Ÿ“Š๐Ÿ“ˆ๐ŸŽ‰

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google โค๏ธ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.