Comments (13)
Hi @wjaskowski thanks for reaching out.
Indeed right now we do not accept NaN/None/(+/-)Inf values.
However, we had some internal discussion about it.
One idea is to make it similar to what TensorBoard does: each NaN/None is displayed as a graphic icon, like a triangle or star. Its location on the "y" axis is determined by the preceding numeric value.
Its location on the "x" axis is the preceding value + 1.
What do you think?
from neptune-client.
Hello @fwindolf ,
This feature request is quite deep in our backlog, so currently, there is no ETA for it, unfortunately.
Is this behavior a blocker for your workflows?
from neptune-client.
This is a somewhat stale issue but the first thing that comes up when you google the behaviour.
Any news/updates on this?
from neptune-client.
Not really a blocker, but having NaNs occur during training for whatever reason seems to be a common enough problem to justify experiment tracking not completely breaking, imo.
So a
run["my_metric"].append(1.0)
run["my_metric"].append(float("nan"))
run["my_metric"].append(3.0)
will only show the 1.0. I see why adding NaN support would open up quite a few edge cases for visualizations etc., but maybe a short-term fix could be simply ignoring NaN, +-inf, etc. during the list iteration when syncing the metric.
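Until something like that exists in the client, the same idea can be approximated on the caller's side. A minimal sketch, assuming a hypothetical helper `finite_only` (not part of neptune-client) that drops non-finite values before they are appended:

```python
import math
from typing import Iterable, List


def finite_only(values: Iterable[float]) -> List[float]:
    """Return only the values that are neither NaN nor +/-inf.

    Hypothetical helper for illustration; not a neptune-client function.
    """
    return [v for v in values if math.isfinite(v)]


values = [1.0, float("nan"), 3.0, float("inf")]
kept = finite_only(values)
print(kept)  # [1.0, 3.0]
```

Each value in `kept` could then be passed to `run["my_metric"].append(...)`. Note that dropping entries shifts any implicit step numbering, so the chart no longer shows where the gap was.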
from neptune-client.
Sounds good. You might also want to consider placing those triangles on the bottom/top of the visible plot.
But visualization is one thing - the most important one is to be able to send and download the data.
from neptune-client.
Thanks for the suggestion!
We will consider it as well.
from neptune-client.
Would replacing the nan/inf values with 0/some high-end value while logging be a viable workaround in your case?
Something like:
import math

metric = float("nan")
if math.isnan(metric):
    # Log 0 in place of NaN; `run` is an already-initialized Neptune run
    run["my_metric"].append(0)
I've also submitted your feedback around ignoring NaN/inf to the product team. Thank you :)
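The workaround can be extended to cover inf as well. A minimal sketch, assuming a hypothetical `sanitize` helper and arbitrarily chosen placeholder values (`NAN_FILL`, `INF_FILL`); none of these names come from neptune-client:

```python
import math

NAN_FILL = 0.0  # assumed placeholder logged instead of NaN
INF_FILL = 1e9  # assumed "high-end value" logged instead of +/-inf


def sanitize(value: float, nan_fill: float = NAN_FILL, inf_fill: float = INF_FILL) -> float:
    """Map NaN and +/-inf to finite placeholders; pass other values through."""
    if math.isnan(value):
        return nan_fill
    if math.isinf(value):
        # Keep the sign so -inf maps to -inf_fill
        return math.copysign(inf_fill, value)
    return value


for raw in (2.5, float("nan"), float("inf"), float("-inf")):
    print(raw, "->", sanitize(raw))
```

The result of `sanitize(metric)` would then be what gets passed to `run["my_metric"].append(...)`. The placeholders distort the chart scale, so pick them with the expected range of the metric in mind.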
from neptune-client.
Hello @fwindolf ,
Just checking whether the above workaround works for you.
from neptune-client.
Sorry I missed the notification of the last comment.
We solved it by not logging nans as 0, inf as a big number which is okay for now. It skews the readability of graphs but it's better than not seeing anything.
Thanks for forwarding the issue!
from neptune-client.
Did you mean not logging nans as 0? :)
from neptune-client.
Hi there! Is this something that is actively worked on? I experienced some solid trouble recently because my training process diverged and the logging did not show where the NANs started to show up at first. This can be very valuable information for debugging. What is bad about the way e.g. tensorboard handles NAN/inf?
While the workaround is fine in most cases, my model showed values of around zero all the time and then started to diverge so replacing NANs with zeros in principle works but is not ideal in my situation.
from neptune-client.
Hello @rschiewer ,
The product team is currently scoping this. This seems to involve relatively high engineering effort, so there is no ETA as of now, unfortunately :(
In your case, since the values hover around zero, can you replace NaNs with a high value so that they show up in charts, and you can then know when your model starts diverging?
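If placeholder values make the chart hard to read, another option is to record where the first non-finite value occurred. A minimal sketch, assuming a hypothetical helper `first_non_finite_step` that works on a locally kept list of metric values (not a neptune-client function):

```python
import math
from typing import Iterable, Optional


def first_non_finite_step(values: Iterable[float]) -> Optional[int]:
    """Return the 0-based index of the first NaN/inf value, or None if all are finite."""
    for step, value in enumerate(values):
        if not math.isfinite(value):
            return step
    return None


losses = [0.02, 0.01, -0.03, float("nan"), float("nan")]
print(first_non_finite_step(losses))  # 3
```

The returned index could be stored in a separate field of the run, so the point of divergence stays visible even when the NaNs themselves are not plotted.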
from neptune-client.
Hey everyone! Just a quick update here.
Neptune v1.8.3 now skips trying to log NaN and Inf values and throws a warning instead. This means you no longer have to check for nan/inf values in your code.
from neptune-client.