
twpca's Introduction

โš ๏ธ Please use our newer code --- Piecewise Linear Time Warping:

Our new work removes the assumption of low-dimensional dynamics, and uses a new optimization framework to avoid local minima in the warping function fitting routine. The new code package is also better optimized for speed, contains cross-validation routines, and has tools for working with spike data in continuous time.

[DEPRECATED] Time warped principal components analysis (TWPCA)

Ben Poole 🍺, Alex H. Williams 🎙️, Niru Maheswaranathan ⚽


Overview

Installation

Again, this package is deprecated and should be treated as legacy software. If you still want to install it, you can do so manually:

```shell
git clone https://github.com/ganguli-lab/twpca
cd twpca
pip install -e .
```

Description

Analysis of multi-trial neural data often relies on a strict alignment of neural activity to stimulus or behavioral events. However, activity on a single trial may be shifted and skewed in time due to differences in attentional state, biophysical kinetics, and other unobserved latent variables. This temporal variability can inflate the apparent dimensionality of data and obscure our ability to recover inherently simple, low-dimensional structure.

Here we present a novel method, time-warped PCA (twPCA), that simultaneously identifies temporal warps of individual trials and low-dimensional structure across neurons and time. Furthermore, we identify the temporal warping in a data-driven, unsupervised manner, removing the need for explicit knowledge of external variables responsible for temporal variability.

For more information, check out our abstract or poster.

We also encourage you to look into our new package, affinewarp, which was built with similar applications in mind.

Code

We provide code for twPCA in Python (note: we use TensorFlow as the computational backend).

To apply twPCA to your own dataset, first install the code (pip install twpca), then load your dataset and shape it into a 3D numpy array with dimensions (number of trials, number of timepoints per trial, number of neurons). For example, a dataset with 100 trials, each lasting 50 samples, with 25 recorded neurons should have shape (100, 50, 25).
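For instance, a placeholder array with the expected shape (the random values here simply stand in for real recordings):

```python
import numpy as np

# Hypothetical dataset: 100 trials, 50 timepoints per trial, 25 neurons.
n_trials, n_timepoints, n_neurons = 100, 50, 25
data = np.random.rand(n_trials, n_timepoints, n_neurons)
```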

Then, you can apply twPCA to your data by running from twpca import TWPCA; model = TWPCA(data, n_components).fit() where n_components is the number of low-rank factors you wish to fit and data is a 3D numpy array as described above. A more thorough example is given below:

```python
from twpca import TWPCA
from twpca.datasets import jittered_neuron

# Generate a dataset consisting of a single feature that is jittered on every
# trial. This helper function returns the raw feature, as well as the aligned
# (ground truth) data and the observed (jittered) data.
feature, aligned_data, raw_data = jittered_neuron()

# Apply TWPCA to the dataset with the given number of components (this follows
# the scikit-learn fit/transform API).
n_components = 1
model = TWPCA(raw_data, n_components).fit()

# The model object now contains the low-rank factors.
time_factors = model.params['time']      # compare this to the ground truth feature
neuron_factors = model.params['neuron']  # in this single-neuron example, a scalar

# Use the model object to align data (compare to aligned_data from above).
estimated_aligned_data = model.transform()
```

We have provided a more thorough demo notebook demonstrating the application of twPCA to a synthetic dataset.

Further detail

Motivation

Performing dimensionality reduction on misaligned time series produces illusory complexity. For example, the figure below shows that a dataset consisting of a single feature jittered across trials (red data) appears high-dimensional: its spectrum of singular values decays slowly.

[Figure: singular value spectra of aligned vs. jittered data]
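This effect is easy to reproduce. The sketch below (an illustration, not code from the package) builds a rank-one dataset from a single Gaussian bump, jitters it across trials, and compares the singular value spectra:

```python
import numpy as np

rng = np.random.default_rng(0)
n_trials, n_time = 50, 100

# A single Gaussian bump, jittered in time on every trial.
t = np.arange(n_time)
feature = np.exp(-0.5 * ((t - n_time // 2) / 5.0) ** 2)

jittered = np.empty((n_trials, n_time))
for k in range(n_trials):
    jittered[k] = np.roll(feature, rng.integers(-10, 11))

aligned = np.tile(feature, (n_trials, 1))

# The aligned data is exactly rank one; jitter spreads variance across
# many singular values, so the spectrum decays slowly.
s_aligned = np.linalg.svd(aligned, compute_uv=False)
s_jittered = np.linalg.svd(jittered, compute_uv=False)
```

The aligned spectrum has a single nonzero singular value, while the jittered spectrum has many sizable ones, despite both datasets containing the same underlying feature.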

The twPCA model

To address this problem for a sequence of multi-dimensional time series, we simultaneously fit a latent factor model (e.g. a matrix decomposition) and time warping functions that align the latent factors to each measured time series. Each trial is modeled as a low-rank matrix where the neuron factors are fixed (gray box below) while the time factors vary from trial to trial by warping a canonical temporal factor differently on each trial.

[Figure: schematic of the twPCA model — fixed neuron factors, per-trial warped time factors]
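A minimal sketch of this generative structure (all names here are illustrative, not the package's API): a shared temporal factor and fixed neuron loadings produce each trial through a per-trial shift of the time axis, so every trial remains a rank-one matrix.

```python
import numpy as np

rng = np.random.default_rng(1)
n_trials, n_time, n_neurons, n_components = 20, 60, 10, 1

# Canonical low-rank factors: one temporal factor shared across trials,
# and one fixed loading vector across neurons.
time_factor = np.exp(-0.5 * ((np.arange(n_time) - 30) / 4.0) ** 2)[:, None]  # (time, 1)
neuron_factor = rng.random((n_components, n_neurons))                        # (1, neurons)

# Each trial applies its own warp (here, a simple shift) to the time axis.
data = np.empty((n_trials, n_time, n_neurons))
grid = np.arange(n_time, dtype=float)
for k in range(n_trials):
    shift = rng.uniform(-5, 5)
    warped_time = np.interp(grid, grid + shift, time_factor[:, 0])[:, None]
    data[k] = warped_time @ neuron_factor  # rank-one trial
```

Real warps in twPCA are more general monotone functions; a shift is the simplest case.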

twpca's People

Contributors

ahwillia, nhat-le, nirum, poolio


twpca's Issues

append to obj_history

When calling fit, we should append to obj_history (instead of reinitializing it) if reinitialize is set to False
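A toy sketch of the proposed behavior (not the package's actual class; names are illustrative):

```python
class Model:
    """Toy model tracking an objective history across fit calls."""

    def __init__(self):
        self.obj_history = []

    def fit(self, losses, reinitialize=True):
        if reinitialize:
            self.obj_history = []        # start a fresh history
        self.obj_history.extend(losses)  # append when reinitialize=False
        return self

m = Model()
m.fit([3.0, 2.0])
m.fit([1.5], reinitialize=False)  # continues the existing history
```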

Handle variable length trials and data with NaN values

Trial lengths can vary in some experiments. We should extend twPCA to handle this by simply ignoring NaN values in the data tensor: users still specify the data as a trials x time x neurons array, but mark missing entries with NaN.
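One way this could work (a sketch under the NaN-masking assumption, not the package's code) is to mask missing entries out of the reconstruction error:

```python
import numpy as np

def masked_error(data, reconstruction):
    """Mean squared error over observed entries; NaNs mark missing data."""
    mask = ~np.isnan(data)
    resid = np.where(mask, data - reconstruction, 0.0)
    return np.sum(resid ** 2) / np.sum(mask)

# A trials x time x neurons tensor where one trial ends a timepoint early.
data = np.ones((2, 4, 3))
data[1, 3, :] = np.nan
recon = np.ones((2, 4, 3))
err = masked_error(data, recon)
```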

Another possibility would be to pass in a vector holding the indices of trial start or end:

```python
model = TWPCA(n_components).fit(data, trial_start=..., trial_end=...)
```

Improve transform for discrete data

For discrete data like spikes, the transform function that maps the data into the aligned space can smear the spikes out in time, producing continuous values. We should add an option to either round spikes to the nearest bin or return the continuous spike times in the aligned space.
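The rounding option could look like this sketch (`round_spike_times` is a hypothetical helper, not part of the package):

```python
import numpy as np

def round_spike_times(warped_times, n_bins):
    """Snap warped (continuous) spike times to the nearest valid time bin."""
    return np.clip(np.rint(warped_times).astype(int), 0, n_bins - 1)

# Continuous spike times after warping, snapped back onto a 50-bin grid.
rounded = round_spike_times(np.array([0.2, 3.6, 49.9]), n_bins=50)
```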

Add ability to smooth the data prior to learning warp functions?

Currently we do this for spiking datasets, but it is not part of the twpca package yet. We could incorporate a smooth kwarg that applies some temporal smoothing to data before model.fit is called. Then model.transform would transform the original data.
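A sketch of what such smoothing might look like, assuming Gaussian smoothing along the time axis (`smooth_data` and its signature are illustrative, not the proposed API; `sigma` is in units of time bins):

```python
import numpy as np

def smooth_data(data, sigma, radius_factor=4):
    """Gaussian-smooth a (trials, time, neurons) tensor along the time axis."""
    radius = int(radius_factor * sigma)
    t = np.arange(-radius, radius + 1)
    kernel = np.exp(-0.5 * (t / sigma) ** 2)
    kernel /= kernel.sum()
    # Convolve each (trial, neuron) time series with the Gaussian kernel.
    return np.apply_along_axis(
        lambda x: np.convolve(x, kernel, mode="same"), 1, data)

rng = np.random.default_rng(0)
spikes = rng.poisson(0.1, size=(5, 200, 8)).astype(float)
smoothed = smooth_data(spikes, sigma=3.0)
```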

New loss functions

Users should have the ability to specify different loss functions. A few easy ones:

  • Logistic loss (spiking data where there is at most 1 spike per bin)
  • Poisson loss (spiking data with multiple spikes per bin)
  • Gamma loss (for calcium imaging data)

In the case of Logistic/Poisson loss functions, this would remove the need to smooth the data as a preprocessing step (see #2), but would necessitate adding regularization for smoothness on the temporal factors. This could also help with #3 - e.g. the reconstruction can be interpreted as a probability of spiking in the logistic case.

use mean instead of sum for reconstruction error

Currently we compute the reconstruction error by summing over components. Instead, we should compute the mean by normalizing by the number of elements. This will hopefully make the choice of regularization hyperparameters more robust across datasets.
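The point is that a summed error grows with dataset size while a mean does not, so a regularization weight tuned against the mean stays on a comparable scale across datasets. A minimal illustration:

```python
import numpy as np

resid = np.random.default_rng(0).normal(size=(100, 50, 25))  # model residuals

sum_error = np.sum(resid ** 2)    # grows with the number of elements
mean_error = np.mean(resid ** 2)  # stays on a fixed scale across dataset sizes
```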

ImportError: DLL load failed: The specified module could not be found.

```
ImportError                               Traceback (most recent call last)
<ipython-input-40-fe2dcd6c15ce> in <module>
----> 1 from twpca import TWPCA

~\Anaconda3\lib\site-packages\twpca\__init__.py in <module>
---> 10 from .model import TWPCA

~\Anaconda3\lib\site-packages\twpca\model.py in <module>
----> 6 import tensorflow as tf

~\Anaconda3\lib\site-packages\tensorflow\python\pywrap_tensorflow.py in <module>
---> 58   from tensorflow.python.pywrap_tensorflow_internal import *

ImportError: DLL load failed: The specified module could not be found.

Failed to load the native TensorFlow runtime.

See https://www.tensorflow.org/install/errors

for some common reasons and solutions.  Include the entire stack trace
above this error message when asking for help.
```

abstract and poster links are broken

For more information about TWPCA, you provided links to an "abstract" and a "poster", but the links do not work.
Please correct these links.
Thank you.
