leoduba / lkan

This project forked from indoxer/lkan

Variations of Kolmogorov-Arnold Networks

License: MIT License

C++ 1.37% Python 94.70% Cuda 3.92%

Large Kolmogorov-Arnold Networks

Implementations of KAN variations.

Installation

Note: the CUDA version is not ready yet, so just install with `pip install .`

Way 1 (untested):

Requires Python 3.10 and nvcc to be installed already.

pip install .

The best way:

Install conda https://conda.io/projects/conda/en/latest/user-guide/install/index.html

conda create -n lkan python=3.10
conda activate lkan
conda install cuda-nvcc
pip install .

To run MNIST, select a config in main.py and run main.py.

To view charts, run `tensorboard --logdir .experiments/`

Info

Performance (RTX 2060 Mobile, MNIST):

MLP (31.8M parameters) - 51 it/s

KANLinear0 (32.3M parameters) - 4.3 it/s

KANLinear (31M parameters) - 36.5 it/s

KANLinearFFT (33M parameters) - 40 it/s
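To put these numbers in proportion, here is a small sketch that divides each throughput by the MLP baseline. The figures are copied from the list above; the dictionary and variable names are only for illustration.

```python
# Throughput in iterations per second, copied from the performance list.
throughput = {
    "MLP": 51.0,
    "KANLinear0": 4.3,
    "KANLinear": 36.5,
    "KANLinearFFT": 40.0,
}

baseline = throughput["MLP"]
# Fraction of the MLP speed that each layer type reaches.
relative = {name: its / baseline for name, its in throughput.items()}

for name, ratio in relative.items():
    print(f"{name}: {ratio:.2f}x of MLP speed")
```

On these numbers KANLinearFFT reaches roughly 0.78x and KANLinear roughly 0.72x of the MLP speed, while KANLinear0 is more than ten times slower.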

KANConv2d on MNIST:

test_conv.py - file with training code

MNIST, batch_size=64, epochs=5, lr=0.0003

Normal CNN (42154 parameters) - 98.85% accuracy, 130 it/s (memory bottleneck on MNIST, so real performance is better)

Convolution KAN (40050 parameters) - 97.3% accuracy, 44 it/s (needs more optimization). It needs hyperparameter tuning or an architecture change: KANLinearFFT is problematic at small sizes because its cost is O(N^2*L*2*G), and for small N and L, as in CNN kernels, the 2*G factor is significant.
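A rough sketch of why the 2*G factor matters at small sizes, taking the O(N^2*L*2*G) expression at face value. Reading N as layer width, L as number of layers and G as grid size is an assumption here, and so are the example values; the function names are hypothetical and not part of lkan.

```python
def plain_cost(n: int, layers: int) -> int:
    """Cost term without the grid factor: N^2 * L."""
    return n * n * layers


def fft_kan_cost(n: int, layers: int, grid: int) -> int:
    """Cost term from the expression in the text: N^2 * L * 2 * G."""
    return plain_cost(n, layers) * 2 * grid


grid = 5  # assumed grid size, for illustration only
for n in (3, 512):  # a kernel-sized N versus a wide linear layer
    factor = 2 * grid
    print(f"N={n}: cost={fft_kan_cost(n, 1, grid)}, 2*G={factor}, 2*G/N={factor / n:.2f}")
```

With N=3 the factor 2*G=10 is larger than N itself, whereas with N=512 it is small next to N.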

Docs

See examples/

continual_training.ipynb - continual training, with a comparison of MLP, KANLinear and KANLinearFFT

Problems

update_grid on CUDA raises an error: torch.linalg.lstsq on CUDA offers only one algorithm, which assumes the matrix is full rank. Temporary fix: the lstsq computation is moved to the CPU.
update_grid_from_samples in the original KAN runs the model multiple times. Is that necessary?
Parameter counting: should the grid count as a parameter or not?
MLP training is almost instant, but KAN trains slowly at the start.
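The CPU workaround for the first problem could look roughly like the sketch below. This is not the code used in lkan: the helper name lstsq_on_cpu and the choice of the "gelsd" driver are assumptions. In PyTorch, the CUDA backend of torch.linalg.lstsq supports only the "gels" driver, which assumes a full-rank matrix, while the CPU backend also offers "gelsy", "gelsd" and "gelss", which handle rank-deficient input.

```python
import torch


def lstsq_on_cpu(A: torch.Tensor, B: torch.Tensor) -> torch.Tensor:
    """Solve min ||A X - B|| on the CPU and return X on A's original device.

    Hypothetical helper, not part of lkan.
    """
    device = A.device
    # "gelsd" is SVD-based, tolerates rank-deficient A, and is CPU-only.
    result = torch.linalg.lstsq(A.cpu(), B.cpu(), driver="gelsd")
    return result.solution.to(device)


# Rank-deficient example: the second column is twice the first.
A = torch.tensor([[1.0, 2.0], [2.0, 4.0], [3.0, 6.0]])
B = torch.tensor([[1.0], [2.0], [3.0]])
X = lstsq_on_cpu(A, B)
residual = (A @ X - B).abs().max().item()
print(X.shape, residual)
```

The device round trip costs a copy in each direction, so this only makes sense for something called occasionally, such as a grid update, rather than on every training step.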

TODO/Ideas:

Citations

@misc{liu2024kan,
      title={KAN: Kolmogorov-Arnold Networks}, 
      author={Ziming Liu and Yixuan Wang and Sachin Vaidya and Fabian Ruehle and James Halverson and Marin Soljačić and Thomas Y. Hou and Max Tegmark},
      year={2024},
      eprint={2404.19756},
      archivePrefix={arXiv},
      primaryClass={cs.LG}
}