
deep-learning-project-template's Issues

promote unit testing in open research

Should we promote providing minimal unit tests from here?

It's the first thing I check in research repos, and usually none of them provide minimal testing, which makes me doubt the reliability of the results shown in the paper.


clone project

git clone https://github.com/YourGithubName/deep-learning-project-template

install project

cd deep-learning-project-template
pip install -e .
pip install -r requirements.txt
Next, navigate to any file and run it.

module folder

cd project

run module (example: mnist as your main contribution)

python lit_classifier_main.py
Imports

This project is set up as a package, which means you can easily import any file into any other file like so:

from project.datasets.mnist import mnist
from project.lit_classifier_main import LitClassifier
from pytorch_lightning import Trainer

# model
model = LitClassifier()

# data
train, val, test = mnist()

# train
trainer = Trainer()
trainer.fit(model, train, val)

# test using the best model!
trainer.test(test_dataloaders=test)
Citation

@article{YourName,
  title={Your Title},
  author={Your team},
  journal={Location},
  year={Year}
}

CI continuously failing

Problem

The template test test_classifier.py does not pass when triggered as a GitHub action.

The issue comes from the imports: from project.datasets.mnist import mnist fails because there is no datasets folder in project, since datasets is ignored (see l.37 of .gitignore).

Pulling the MNIST dataset every time the GitHub action is triggered does not seem reasonable. My proposal would be to have a samples_datasets/ subfolder in test, not ignored, with a limited number of data points, and to use that for testing.
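
A rough sketch of what such a test could look like (the sample shapes, the fast_dev_run flag, and the random tensors standing in for checked-in sample files are illustrative, not part of the template):

import torch
from torch.utils.data import DataLoader, TensorDataset
from pytorch_lightning import Trainer
from project.lit_classifier_main import LitClassifier

def test_lit_classifier_on_samples():
    # a handful of MNIST-shaped samples that could live in tests/samples_datasets/
    x = torch.rand(32, 1, 28, 28)
    y = torch.randint(0, 10, (32,))
    loader = DataLoader(TensorDataset(x, y), batch_size=8)

    model = LitClassifier()
    # fast_dev_run runs a single batch of train/val, which is enough for CI
    trainer = Trainer(fast_dev_run=True)
    trainer.fit(model, loader, loader)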

If anyone agrees that this is a reasonable way to go, I'll create a PR :)

Parameter hidden_dim unused

Hey,

I was just quickly browsing through the templates when I noticed that for the Autoencoder project the argparse parameter 'hidden_dim' is unused:

https://github.com/PyTorchLightning/deep-learning-project-template/blob/faef5bbf4f3c392b9e4d71e606edaab5ce4d0aaf/project/lit_autoencoder.py#L54

I suspect this should be used by the model to define the number of hidden dimensions. The other argparse parameter is probably accounted for in the Trainer (I'm new to Lightning). Thanks for this useful project.
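
A hypothetical sketch of the suspected fix, threading hidden_dim into the layer widths (the 28*28 input size and the 3-dimensional bottleneck are placeholders mirroring the autoencoder example, not the actual patch):

import torch.nn as nn
import pytorch_lightning as pl

class LitAutoEncoder(pl.LightningModule):
    def __init__(self, hidden_dim: int = 64):
        super().__init__()
        # hidden_dim now controls the width of the hidden layers instead of a hard-coded value
        self.encoder = nn.Sequential(nn.Linear(28 * 28, hidden_dim), nn.ReLU(), nn.Linear(hidden_dim, 3))
        self.decoder = nn.Sequential(nn.Linear(3, hidden_dim), nn.ReLU(), nn.Linear(hidden_dim, 28 * 28))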

cookiecutter

As was mentioned in another ticket, maybe it is time to modernize the example a bit.
If/when this happens, it would be nice to consider 'cookiecutter' as a template engine,
so the instructions, like 'delete this', 'make a directory', etc., could be automated nicely.
E.g. similar to this template:
https://github.com/jeannefukumaru/cookiecutter-ml

Request for a more intermediate/advanced template, showing off the real power of Lightning

The current example templates in the project folder are nice, but simpler than most engineering/research projects will be.

Would it be possible to have a template with all batteries included that shows how to properly use:

  • The DataModule introduced in 0.9.0, which you likely want to use if your dataset does not support something like: dataset = MNIST('', train=True, download=True, transform=transforms.ToTensor()) (see the sketch after this list)
  • The new Metric API introduced in 1.0.0 with self.log()
  • The to_torchscript function. Needing access to the code to restore models can be troublesome if a script change was used to train a model but never pushed to git, or if there are library mismatches. TorchScript makes sure that the saved model can be used to reproduce the results, with the added benefit of working in C++.
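
As a rough illustration of the first point, here is a minimal LightningDataModule sketch (MyDataModule and the random-tensor dataset are placeholders, not part of this template):

import torch
import pytorch_lightning as pl
from torch.utils.data import DataLoader, TensorDataset, random_split

class MyDataModule(pl.LightningDataModule):
    def setup(self, stage=None):
        # swap the random tensors for your real dataset loading/splitting logic
        full = TensorDataset(torch.rand(1000, 1, 28, 28), torch.randint(0, 10, (1000,)))
        self.train_set, self.val_set = random_split(full, [800, 200])

    def train_dataloader(self):
        return DataLoader(self.train_set, batch_size=32)

    def val_dataloader(self):
        return DataLoader(self.val_set, batch_size=32)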

Examples using all the hooks

I think it would be more than beneficial to provide an example that uses all the hooks from the LightningModule and LightningDataModule, so that the end user has a clear overview of customisation and use.
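
As a sketch of what such an example might start from (only a handful of the available hooks are shown, and hook names can differ between Lightning versions):

import torch
import torch.nn.functional as F
import pytorch_lightning as pl

class HookDemo(pl.LightningModule):
    def __init__(self):
        super().__init__()
        self.layer = torch.nn.Linear(28 * 28, 10)

    def forward(self, x):
        return self.layer(x.view(x.size(0), -1))

    def on_train_start(self):
        # called once before the first training epoch
        print("starting training")

    def training_step(self, batch, batch_idx):
        x, y = batch
        return F.cross_entropy(self(x), y)

    def validation_step(self, batch, batch_idx):
        x, y = batch
        self.log("val_loss", F.cross_entropy(self(x), y))

    def configure_optimizers(self):
        return torch.optim.Adam(self.parameters(), lr=1e-3)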

Generic extension of this template with Hydra configs

Recently I've been trying to develop a very general PyTorch Lightning + Hydra template.

The idea is to extend deep-learning-project-template by just adding some Hydra config structure and a simple example which initializes modules from config parts in the most generic and obvious way.

I've been wondering if it would be a good idea to develop an official template of this sort and add it to the Lightning ecosystem as an alternative to deep-learning-project-template?
Currently Hydra still seems a little rough around the edges and it's not obvious how configs should be structured for Lightning, but such a template could still be a useful addition in the future.

Here is a link to my version:
https://github.com/hobogalaxy/lightning-hydra-template
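
As a rough sketch of the config-driven instantiation idea (the config layout and keys below are illustrative, not the structure used in the linked repo):

from omegaconf import OmegaConf
from hydra.utils import instantiate

# each config block names the class to build via Hydra's _target_ key
cfg = OmegaConf.create({
    "model": {"_target_": "project.lit_classifier_main.LitClassifier"},
    "trainer": {"_target_": "pytorch_lightning.Trainer", "max_epochs": 3},
})

model = instantiate(cfg.model)      # builds LitClassifier()
trainer = instantiate(cfg.trainer)  # builds Trainer(max_epochs=3)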

Metrics code is not as educational as it could be

The latest docs state that validation_end should return a dict, optionally with two special keys, progress_bar and log. The code in this template instead puts the metrics only at the top level, so they do not show up in whatever logger the user has selected. This is particularly confusing because the example code on the main docs page doesn't make use of this either, so the only way to learn about the existence of the log key is to look at the validation_end signature directly, leading new users to assume all metrics are logged automatically, as in issues such as Lightning-AI/pytorch-lightning#324 (comment). I stumbled upon that issue while trying to debug the same problem, but since I started with MLFlow I initially thought I must be doing something wrong with the non-default logger and ended up somewhat confused.

This has been clarified in the example code talked about in the above issue; that change should probably be ported to this repo. I also think there's a good argument this is non-obvious enough to put in the "minimal example" in the docs; if agreed I'll open an issue on the main repo for that.
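
Concretely, the pattern the docs describe looks roughly like this (a sketch of the legacy return-dict API; newer Lightning versions replace it with self.log):

import torch

# inside your LightningModule (legacy API)
def validation_end(self, outputs):
    avg_loss = torch.stack([x["val_loss"] for x in outputs]).mean()
    # only metrics under the special "log" key are sent to the configured
    # logger (TensorBoard, MLFlow, ...); top-level keys alone are not logged
    return {"val_loss": avg_loss, "log": {"val_loss": avg_loss}}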

Encourage Modularity of Lightning Modules

It would be great if people's Lightning modules became easily accessible via pip install. This may already work as is, but it would be great to showcase and encourage this pattern:

pip install git+https://github.com/teddykoker/teddys-model/

Then in Python:

import pytorch_lightning as pl
from pl_bolts.datamodules import ImageNet
# the installed package must expose a valid Python identifier, e.g. teddys_model
from teddys_model import TeddysModel

pl.Trainer().fit(TeddysModel(), ImageNet())

This encourages reusability of modules and datamodules, which would be great for research reproducibility as well as plug-and-play with different models.
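
For the pip-install pattern above to work, the model repo only needs minimal packaging, roughly along these lines (file name and metadata are illustrative for the hypothetical teddys_model package):

# setup.py at the root of the hypothetical teddys_model repo
from setuptools import setup, find_packages

setup(
    name="teddys_model",
    version="0.1.0",
    packages=find_packages(),
    install_requires=["pytorch-lightning"],
)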

GitHub Template

@williamFalcon have you considered making this a GitHub Template?

"After you make your repository a template, anyone with access to the repository can generate a new repository with the same directory structure and files."

add a GAN example.

I want to add a GAN as a project.
Is it okay if I implement it myself and open a PR here?
