Comments (3)
return {} for now
from lightning.
but we're adding support so that you don't have to implement a val function when you don't need one (#82)
from lightning.
The recommended solution:
def validation_epoch_end(self, validation_outputs):
    return {}
also produces the error "AttributeError: 'dict' object has no attribute 'callback_metrics'":
---------------------------------------------------------------------------
AttributeError Traceback (most recent call last)
<ipython-input-27-f49f97955aea> in <module>
4 check_val_every_n_epoch=3,
5 )
----> 6 trainer.fit(model)
~/anaconda3/lib/python3.8/site-packages/pytorch_lightning/trainer/states.py in wrapped_fn(self, *args, **kwargs)
46 if entering is not None:
47 self.state = entering
---> 48 result = fn(self, *args, **kwargs)
49
50 # The INTERRUPTED state can be set inside the run function. To indicate that run was interrupted
~/anaconda3/lib/python3.8/site-packages/pytorch_lightning/trainer/trainer.py in fit(self, model, train_dataloader, val_dataloaders, datamodule)
1082 self.accelerator_backend = CPUBackend(self)
1083 self.accelerator_backend.setup(model)
-> 1084 results = self.accelerator_backend.train(model)
1085
1086 # on fit end callback
~/anaconda3/lib/python3.8/site-packages/pytorch_lightning/accelerators/cpu_backend.py in train(self, model)
37
38 def train(self, model):
---> 39 results = self.trainer.run_pretrain_routine(model)
40 return results
~/anaconda3/lib/python3.8/site-packages/pytorch_lightning/trainer/trainer.py in run_pretrain_routine(self, model)
1222
1223 # run a few val batches before training starts
-> 1224 self._run_sanity_check(ref_model, model)
1225
1226 # clear cache before training
~/anaconda3/lib/python3.8/site-packages/pytorch_lightning/trainer/trainer.py in _run_sanity_check(self, ref_model, model)
1255 num_loaders = len(self.val_dataloaders)
1256 max_batches = [self.num_sanity_val_steps] * num_loaders
-> 1257 eval_results = self._evaluate(model, self.val_dataloaders, max_batches, False)
1258
1259 # allow no returns from eval
~/anaconda3/lib/python3.8/site-packages/pytorch_lightning/trainer/evaluation_loop.py in _evaluate(self, model, dataloaders, max_batches, test_mode)
397
398 # log callback metrics
--> 399 self.__update_callback_metrics(eval_results, using_eval_result)
400
401 # Write predictions to disk if they're available.
~/anaconda3/lib/python3.8/site-packages/pytorch_lightning/trainer/evaluation_loop.py in __update_callback_metrics(self, eval_results, using_eval_result)
419 if isinstance(eval_results, list):
420 for eval_result in eval_results:
--> 421 self.callback_metrics = eval_result.callback_metrics
422 else:
423 self.callback_metrics = eval_results.callback_metrics
AttributeError: 'dict' object has no attribute 'callback_metrics'
Comment out those two lines (the validation_epoch_end definition above) and it works without any issue. This issue should not be closed until either the problem is fixed or the documentation is updated to show that returning {} does not work.
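For anyone trying to see why the traceback ends where it does, here is a minimal sketch that runs without Lightning installed. It only mimics the attribute access visible in the traceback (lines 419-423 of evaluation_loop.py); the names FakeEvalResult, update_callback_metrics and try_update are hypothetical stand-ins of mine, not part of the library, and the real method presumably does more than this.

```python
class FakeEvalResult:
    """Hypothetical stand-in for a result object that exposes callback_metrics."""

    def __init__(self, callback_metrics):
        self.callback_metrics = callback_metrics


def update_callback_metrics(eval_results):
    """Mimic the attribute access shown in the traceback (assumed simplification)."""
    if isinstance(eval_results, list):
        metrics = None
        for eval_result in eval_results:
            # Each item is expected to expose .callback_metrics
            metrics = eval_result.callback_metrics
        return metrics
    # A single result is expected to expose .callback_metrics as well
    return eval_results.callback_metrics


def try_update(eval_results):
    """Return the metrics, or the error message if the attribute is missing."""
    try:
        return update_callback_metrics(eval_results)
    except AttributeError as err:
        return str(err)


# An object with the attribute goes through.
ok = try_update(FakeEvalResult({"val_loss": 0.1}))

# A plain dict, like the {} returned from validation_epoch_end, does not.
failed = try_update({})

print(ok)
print(failed)
```

If this reading is right, the code path in the traceback expects a result object rather than a plain dict, which would explain why returning {} fails while leaving validation_epoch_end undefined works.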
from lightning.