Comments (10)
@Borda apparently PyTorch 1.2.0 came out a few hours ago, which is breaking the SummaryWriter... I pinned the PyTorch version to 1.1.0 for now until I sort that out (thus the 0.4.1 update...)
from lightning.
Hm, I just downgraded torch to 1.1 but I still have problems with Experiment. I'm not getting any output for TensorBoard. Are there some more examples I could look at?
You're probably in a Jupyter notebook. TensorBoard and Jupyter have issues.
I am indeed. Bummer... So no jupyterlab + TensorBoard for now?
@cwerner yeah, maybe it works in the new version PyTorch just released. Update to 1.2.0 and Lightning 0.4.2 to see what happens. They changed SummaryWriter quite a bit.
Cool. Will do. Thanks!
Sorry, I don't have a connection right now.
@williamFalcon just a heads-up. Still no output in Lightning, but I get TensorBoard output if I use SummaryWriter directly in my PyTorch code from JupyterLab (on port 6006, not in the extension). Could it be that I'm missing something in the Lightning code?
@cwerner ok, I need to look into this because Experiment IS a SummaryWriter as well.
Can you post code with both?
But at a minimum I can make it so SummaryWriter can just be used directly. That way we can start the process to support other loggers.
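To make the "use SummaryWriter directly, then support other loggers" idea concrete, here is a rough sketch, not Lightning's actual API. `ScalarLogger`, `InMemoryLogger` and `log_losses` are hypothetical names made up for illustration. The only assumption taken from PyTorch is that `SummaryWriter.add_scalar` accepts the tag, the value and the step in that order.

```python
from typing import List, Protocol, Tuple


class ScalarLogger(Protocol):
    """Hypothetical minimal interface: anything with a SummaryWriter-style add_scalar."""

    def add_scalar(self, tag: str, scalar_value: float, global_step: int) -> None:
        ...


class InMemoryLogger:
    """Stand-in logger that keeps scalars in a list instead of writing event files."""

    def __init__(self) -> None:
        self.records: List[Tuple[str, float, int]] = []

    def add_scalar(self, tag: str, scalar_value: float, global_step: int) -> None:
        self.records.append((tag, float(scalar_value), global_step))


def log_losses(logger: ScalarLogger, losses: List[float], tag: str = "train/loss") -> None:
    """Log one scalar per step; works with any object that exposes add_scalar."""
    for step, loss in enumerate(losses):
        logger.add_scalar(tag, loss, step)


# Example run with the in-memory stand-in.
logger = InMemoryLogger()
log_losses(logger, [0.9, 0.5, 0.25])
```

Under that assumption, passing a `torch.utils.tensorboard.SummaryWriter` instance instead of `InMemoryLogger` should work the same way, and other loggers would only need to provide a matching `add_scalar`.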
Will be off my machine for a few days but will update and try. If I still have issues I'll post some code here...