TorchNTK

An Arbitrary PyTorch Architecture Neural Tangent Kernel Library

This code was developed to bridge a gap in NTK computation before the release of PyTorch 1.11. Now that PyTorch 1.11 is released, we advise you to take a look at functorch's NTK page, which will generally see better development and improvements than this repo. In other words, we do not expect to maintain this repo going forward.

Installation

  • git clone this repository
git clone https://github.com/pnnl/torchntk
  • Add to PYTHONPATH
export PYTHONPATH="${PYTHONPATH}:/my/path/TorchNTK/"
  • Make sure you have the correct dependencies installed. Broadly, this code was tested with PyTorch 1.9, numba 0.53.1, and TensorBoard 2.6.0, on Python 3.8.8.

  • The torch.vmap function is only available in nightly releases of PyTorch. torch.vmap is used for only one implementation of the autograd calculation; it is not required.

  • For the notebooks comparing against neural-tangents, you will also need jax, jaxlib, and neural-tangents installed. This can be tricky for Windows users; we suggest consulting the original neural-tangents page for detailed installation instructions here.

  • For the tensorboard.ipynb notebook, download the dataset from here and place it into ./DATA/; you could just as well use any other dataset or simulated data.

Basic Usage

import torchntk
import torch

DEVICE = 'cpu' # or 'cuda'

model = Pytorch_Model() # any architecture, BUT it must terminate in a single output neuron
model.to(DEVICE)

Y = model(X) # X is your input data, already on DEVICE

NTK_components = torchntk.autograd.autograd_components_ntk(model, Y)
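The structure of NTK_components depends on the function's return type. As a minimal sketch, assuming it is a dict-like mapping from parameter names to same-shaped [N, N] component matrices (an assumption; check the function's docstring for the exact format), the full empirical NTK is just their sum:

# sum the per-parameter components into the full N x N empirical NTK
# (assumes each value is an [N, N] torch.Tensor)
NTK = sum(NTK_components.values())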

Alternatively, a generally faster implementation exists if torch.vmap is available (currently only in PyTorch nightly builds):

import torchntk
import torch
from torch.utils.data import DataLoader, TensorDataset

DEVICE = 'cuda'

model = Pytorch_Model() # any architecture, BUT it must terminate in a single output neuron
model.to(DEVICE)

xloader = DataLoader(TensorDataset(My_data, My_targets), batch_size=64, shuffle=False)

NTK_components = torchntk.autograd.vmap_ntk_loader(model, xloader)
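For reference, My_data and My_targets above are ordinary tensors. A minimal, hypothetical setup with MNIST-shaped random data (placeholder names, not part of the library) might look like:

# hypothetical placeholder data: 256 flattened 28x28 images with scalar targets
My_data = torch.randn(256, 784, device=DEVICE)
My_targets = torch.randn(256, 1, device=DEVICE)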

Finally, if you are using a fully connected network (a network composed only of torch.nn.Linear layers), you can use this last method, which is typically much faster:

import math

import torchntk
import torch

DEVICE = 'cuda'

def activation(X):
    return torch.tanh(X)

def d_activation(X):
    return torch.cosh(X)**-2

class MLP(torch.nn.Module):
    def __init__(self,):
        super(MLP, self).__init__()
        self.d1 = torch.nn.Linear(784, 100, bias=True)
        self.d2 = torch.nn.Linear(100, 100, bias=True)
        self.d3 = torch.nn.Linear(100, 1, bias=True)
    def forward(self, x_0):
        # divide each layer's output by the square root of its width
        x_1 = activation(self.d1(x_0)) / math.sqrt(100)
        x_2 = activation(self.d2(x_1)) / math.sqrt(100)
        x_3 = activation(self.d3(x_2)) / math.sqrt(1)
        return x_3, x_2, x_1, x_0


model = MLP()
model.to(DEVICE)

x_3, x_2, x_1, x_0 = model(X) # for some data X on DEVICE

Xs = [x_0.T.detach(),
      x_1.T.detach(),
      x_2.T.detach()]

layers = [model.d1,
          model.d2,
          model.d3]

# this must match each layer's width
ds_int = [100, 100, 1]

# this must match what you divided each layer by, squared;
# i.e., if you didn't divide each layer by anything, this should be all ones.
ds_float = [100.0, 100.0, 1.0]

config = {'Xs': Xs,
          'layers': layers,
          'ds_int': ds_int,
          'ds_float': ds_float,
          'dactivation_t': d_activation}

components = torchntk.explicit.explicit_ntk(**config)
# components is a list of torch.Tensor objects representing each component of
# the NTK from each parameterized operation, in reverse order. Meaning,
# components[0] is the outermost layer's weight matrix NTK component,
# components[1] is the outermost layer's bias vector NTK component,
# ...
# components[-1] is the first layer's bias vector NTK component.

# To get the full NTK, simply sum the components across the list's dimension.
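A minimal sketch of that summation, assuming every entry of components is an [N, N] torch.Tensor:

# stack the per-operation components and sum across the list dimension
NTK = torch.stack(components, dim=0).sum(dim=0)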

Logging with TensorBoard

Check the tensorboard.ipynb notebook.

Once installed, TensorBoard can be started from the command line with:

tensorboard --logdir=LOGDIR
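As a minimal, hypothetical sketch (not the notebook's exact code), scalars can be written with torch.utils.tensorboard and then viewed with the command above:

from torch.utils.tensorboard import SummaryWriter

# write a few example scalars; in the notebook these would be NTK-derived quantities
writer = SummaryWriter(log_dir='LOGDIR')
for epoch in range(3):
    writer.add_scalar('example/metric', float(epoch), global_step=epoch)
writer.close()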

Possible Metrics of Interest

The condition number of the NTK is the ratio of its maximum eigenvalue to its minimum eigenvalue. It is negatively correlated with model performance.
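A minimal sketch of computing it from an already-computed NTK matrix (the matrix below is a random placeholder, not real output):

import torch

# placeholder symmetric NTK; in practice, use the matrix computed by torchntk above
NTK = torch.eye(64) + 0.01 * torch.randn(64, 64)
NTK = 0.5 * (NTK + NTK.T)

eigenvalues = torch.linalg.eigvalsh(NTK)  # returned in ascending order
condition_number = eigenvalues[-1] / eigenvalues[0]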

Credit

"torchntk.autograd.old_autograd_ntk" was directly adatapted from the TENAS group's code, available here , and you can view their paper on neural architecture seach here; authored by Chen, Wuyang and Gong, Xinyu and Wang, Zhangyang and titled: "Neural Architecture Search on ImageNet in Four GPU Hours: A Theoretically Inspired Perspective"

Some backward propagation functions were originally copied and then heavily modified from this article by Pierre Jaumier, available here.

I've also included some utility functions that I directly copied from the PyTorch source; therefore, their license clause is included in ours.

Experimental autograd operations were adapted from web pages in the pre-release of PyTorch 1.11; now that PyTorch 1.11 is released, we advise you to take a look at functorch's NTK page.

Software TODO (or how you can contribute)

  • Add explicit calculations for more varied architectures
  • Parallelize computation across multiple GPUs
  • Turn the notebook that demonstrates the different algorithms into a pytest test that asserts all implementations produce approximately the same outputs (a minimal sketch follows below)
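A minimal sketch of such a test, assuming the different methods have already produced full NTK matrices (the tensors below are placeholders, not real outputs):

import torch

def test_ntk_implementations_agree():
    # placeholders; in the real test these would come from torchntk.autograd
    # and torchntk.explicit applied to the same model and data
    ntk_autograd = torch.eye(8)
    ntk_explicit = torch.eye(8)
    assert torch.allclose(ntk_autograd, ntk_explicit, rtol=1e-4, atol=1e-6)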
