
neptune-client's People

Contributors

aleksanderwww, aniezurawski, artsiomtserashkovich, asledz, bartoszprusak, blaizzy, dependabot[bot], domikoz, dominik-jastrzebski, herudaio, hubertjaworski, jakubczakon, kamil-kaczmarek, kshitij12345, normandy7, pankin397, patrycja-j, patrykgala, piotrjander, pitercl, pk-neptune, pkasprzyk, raalsky, shnela, siddhantsadangi, szymon-kuklewicz, szysad, twolodzko, tytuskalicki, wjaskowski


neptune-client's Issues

Not possible to send infs and nans

What is the reason for not allowing inf and nan values to be sent as metrics? I imagine they cannot be plotted, but they still carry information.
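Until non-finite values are supported, one client-side workaround is to filter them out before sending. The guard below is only a sketch; the commented-out `send_metric` call just illustrates where it would sit:

```python
import math

def finite_or_none(value):
    """Return the value if it is finite; otherwise None so the caller can skip it."""
    return value if math.isfinite(value) else None

# Illustrative usage; experiment.send_metric is the call this would guard:
for step, loss in enumerate([0.5, float("nan"), float("inf"), 0.1]):
    value = finite_or_none(loss)
    if value is not None:
        pass  # experiment.send_metric("loss", value)  -- sketch only
```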

create a silent flag in neptune

I would like to be able to run scripts with Neptune tracking, but when I want to run them without tracking (for instance in tests) I want a flag that turns off Neptune tracking.

Something like this would be great

import neptune

SILENT=True

if SILENT:
   neptune.set_silent()

neptune.init()
neptune.create_experiment()
...
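neptune has no `set_silent()` today, so the following is only a sketch of how such a switch could be emulated on the user side: a stand-in object that swallows every tracking call, returned instead of a real experiment when tracking is disabled.

```python
class NoOpExperiment:
    """Stand-in experiment that silently accepts any tracking call."""

    def __getattr__(self, name):
        def no_op(*args, **kwargs):
            return None
        return no_op


def create_experiment(silent=False, **kwargs):
    """Return a real Neptune experiment, or a no-op stand-in when silent=True."""
    if silent:
        return NoOpExperiment()
    import neptune  # imported lazily so silent runs need no credentials
    neptune.init()
    return neptune.create_experiment(**kwargs)
```

In tests one would then call `create_experiment(silent=True)` and keep the rest of the script unchanged.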

experiment.get_system_properties() doesn't return "hostname"

I think there is some regression. For recent experiments, experiment.get_properties() returns either an empty dictionary or {'key1': 'value1', 'key2': '17', 'key3': 'other-value'} (whatever that is) in the case of the sandbox project.

For older experiments, I still can get the properties.

This is probably a backend issue but there is no better place to put it.

Problem with argparse

When using argparse together with Neptune, Python fails to parse the arguments correctly.

Example Code:

import argparse
parser = argparse.ArgumentParser("cifar")
parser.add_argument('--data', type=str, default='../data', help='location of the data corpus')
# parser.add_argument('--batch_size', type=int, default=64, help='batch size')
args = parser.parse_args()

# import neptune

With those lines commented out as above, running:

python temp.py --data ./data

works without problems.

However, if I uncomment them:

import argparse
parser = argparse.ArgumentParser("cifar")
parser.add_argument('--data', type=str, default='../data', help='location of the data corpus')
parser.add_argument('--batch_size', type=int, default=64, help='batch size')
args = parser.parse_args()

import neptune

Run

python temp.py --batch-size 100

raises an unrecognized-argument error.
Is there any solution?

Failed to send channel value: SSL certificate validation failed.

Hi,

Just discovered neptune.ai and it's really great!

I have the following error:
Failed to send channel value: SSL certificate validation failed. Set NEPTUNE_ALLOW_SELF_SIGNED_CERTIFICATE environment variable to accept self-signed certificates.

Even though I set NEPTUNE_ALLOW_SELF_SIGNED_CERTIFICATE in my .bashrc, neptune-client still throws this error.

.bashrc:
export NEPTUNE_ALLOW_SELF_SIGNED_CERTIFICATE=True

python:

Python 3.7.7 (default, Mar 10 2020, 15:16:38) 
[GCC 7.5.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import os
>>> os.getenv("NEPTUNE_ALLOW_SELF_SIGNED_CERTIFICATE")
'True'
>>>

System:
ubuntu 18.04
neptune-client 0.4.116
python 3.7.7
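One thing worth ruling out is whether the variable actually reaches the process that runs the script (jobs launched from non-login shells, schedulers, or IDEs may not read .bashrc). Setting it from Python before the neptune import takes shell configuration out of the picture; note that the exact accepted values of the flag are an assumption here:

```python
import os

# Set the flag in-process, before neptune is imported, so the client sees it
# regardless of how the shell was configured ("TRUE" mirrors the error
# message's wording, but the accepted spellings are an assumption):
os.environ["NEPTUNE_ALLOW_SELF_SIGNED_CERTIFICATE"] = "TRUE"

# import neptune  # must come after the assignment
```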

Getting 'Failed to send channel value' error after hours of training

Hi!

First of all, thanks for sharing your amazing library!

I am quite new to Neptune and am trying to run (and log) some training jobs on 1 GPU. Everything went smoothly for ~20 hours, but then I got an error (Failed to send channel value.). I am wondering what might have caused this.

It happened at the same time to all 3 of my jobs. I see a few possibilities:

  1. I have been doing something wrong (e.g. logging too much info).
  2. There was a server error on your side.
  3. Something wrong with my server.

I would highly appreciate any hint on how I can deal with this :-)

Logs from Neptune's stderr:

E1023 04:07:48.158644 35187409351088 channels_values_sender.py:164] Failed to send channel value.
Traceback (most recent call last):
  File "/gpfs/share/skynet/apps/anaconda3/envs/wmlce_env_1.6.1/lib/python3.6/site-packages/neptune/utils.py", line 210, in wrapper
    return func(*args, **kwargs)
  File "/gpfs/share/skynet/apps/anaconda3/envs/wmlce_env_1.6.1/lib/python3.6/site-packages/neptune/internal/backends/hosted_neptune_backend.py", line 556, in send_channels_values
    channelsValues=input_channels_values
  File "/gpfs/share/skynet/apps/anaconda3/envs/wmlce_env_1.6.1/lib/python3.6/site-packages/bravado/http_future.py", line 239, in response
    six.reraise(*sys.exc_info())
  File "/gpfs/share/skynet/apps/anaconda3/envs/wmlce_env_1.6.1/lib/python3.6/site-packages/six.py", line 693, in reraise
    raise value
  File "/gpfs/share/skynet/apps/anaconda3/envs/wmlce_env_1.6.1/lib/python3.6/site-packages/bravado/http_future.py", line 200, in response
    swagger_result = self._get_swagger_result(incoming_response)
  File "/gpfs/share/skynet/apps/anaconda3/envs/wmlce_env_1.6.1/lib/python3.6/site-packages/bravado/http_future.py", line 124, in wrapper
    return func(self, *args, **kwargs)
  File "/gpfs/share/skynet/apps/anaconda3/envs/wmlce_env_1.6.1/lib/python3.6/site-packages/bravado/http_future.py", line 303, in _get_swagger_result
    self.request_config.response_callbacks,
  File "/gpfs/share/skynet/apps/anaconda3/envs/wmlce_env_1.6.1/lib/python3.6/site-packages/bravado/http_future.py", line 334, in unmarshal_response
    raise_on_unexpected(incoming_response)
  File "/gpfs/share/skynet/apps/anaconda3/envs/wmlce_env_1.6.1/lib/python3.6/site-packages/bravado/http_future.py", line 408, in raise_on_unexpected
    raise make_http_exception(response=http_response)
bravado.exception.HTTPInternalServerError: 500 : {"code":500,"errorType":"INTERNAL_SERVER_ERROR","title":"Internal Server Error (2fb6177e655)"}

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/gpfs/share/skynet/apps/anaconda3/envs/wmlce_env_1.6.1/lib/python3.6/site-packages/neptune/internal/channels/channels_values_sender.py", line 156, in _send_values
    self._experiment._send_channels_values(channels_with_values)
  File "/gpfs/share/skynet/apps/anaconda3/envs/wmlce_env_1.6.1/lib/python3.6/site-packages/neptune/experiments.py", line 1138, in _send_channels_values
    self._backend.send_channels_values(self, channels_with_values)
  File "/gpfs/share/skynet/apps/anaconda3/envs/wmlce_env_1.6.1/lib/python3.6/site-packages/neptune/utils.py", line 221, in wrapper
    raise ServerError()
neptune.api_exceptions.ServerError: Server error. Please try again later.

(The same traceback repeats every couple of minutes between 20:17 and 20:19 with different request IDs: e50dc164b5c, d98440aac5e, b34b09c679a.)

Logs from my machine (jobs are still working, but not logging anything)

Traceback (most recent call last):
  File "/gpfs/share/skynet/apps/anaconda3/envs/wmlce_env_1.6.1/lib/python3.6/site-packages/neptune/internal/threads/ping_thread.py", line 37, in run
    self.__backend.ping_experiment(self.__experiment)
  File "/gpfs/share/skynet/apps/anaconda3/envs/wmlce_env_1.6.1/lib/python3.6/site-packages/neptune/utils.py", line 210, in wrapper
    return func(*args, **kwargs)
  File "/gpfs/share/skynet/apps/anaconda3/envs/wmlce_env_1.6.1/lib/python3.6/site-packages/neptune/internal/backends/hosted_neptune_backend.py", line 611, in ping_experiment
    self.backend_swagger_client.api.pingExperiment(experimentId=experiment.internal_id).response()
  File "/gpfs/share/skynet/apps/anaconda3/envs/wmlce_env_1.6.1/lib/python3.6/site-packages/bravado/client.py", line 279, in __call__
    request_config=request_config,
  File "/gpfs/share/skynet/apps/anaconda3/envs/wmlce_env_1.6.1/lib/python3.6/site-packages/bravado/requests_client.py", line 399, in request
    self.authenticated_request(sanitized_params),
  File "/gpfs/share/skynet/apps/anaconda3/envs/wmlce_env_1.6.1/lib/python3.6/site-packages/bravado/requests_client.py", line 440, in authenticated_request
    return self.apply_authentication(requests.Request(**request_params))
  File "/gpfs/share/skynet/apps/anaconda3/envs/wmlce_env_1.6.1/lib/python3.6/site-packages/bravado/requests_client.py", line 445, in apply_authentication
    return self.authenticator.apply(request)
  File "/gpfs/share/skynet/apps/anaconda3/envs/wmlce_env_1.6.1/lib/python3.6/site-packages/neptune/oauth.py", line 90, in apply
    self.auth.refresh_token_if_needed()
  File "/gpfs/share/skynet/apps/anaconda3/envs/wmlce_env_1.6.1/lib/python3.6/site-packages/neptune/utils.py", line 210, in wrapper
    return func(*args, **kwargs)
  File "/gpfs/share/skynet/apps/anaconda3/envs/wmlce_env_1.6.1/lib/python3.6/site-packages/neptune/oauth.py", line 51, in refresh_token_if_needed
    self._refresh_token()
  File "/gpfs/share/skynet/apps/anaconda3/envs/wmlce_env_1.6.1/lib/python3.6/site-packages/neptune/oauth.py", line 54, in _refresh_token
    self.session.refresh_token(self.session.auto_refresh_url)
  File "/gpfs/share/skynet/apps/anaconda3/envs/wmlce_env_1.6.1/lib/python3.6/site-packages/requests_oauthlib/oauth2_session.py", line 446, in refresh_token
    self.token = self._client.parse_request_body_response(r.text, scope=self.scope)
  File "/gpfs/share/skynet/apps/anaconda3/envs/wmlce_env_1.6.1/lib/python3.6/site-packages/oauthlib/oauth2/rfc6749/clients/base.py", line 421, in parse_request_body_response
    self.token = parse_token_response(body, scope=scope)
  File "/gpfs/share/skynet/apps/anaconda3/envs/wmlce_env_1.6.1/lib/python3.6/site-packages/oauthlib/oauth2/rfc6749/parameters.py", line 431, in parse_token_response
    validate_token_parameters(params)
  File "/gpfs/share/skynet/apps/anaconda3/envs/wmlce_env_1.6.1/lib/python3.6/site-packages/oauthlib/oauth2/rfc6749/parameters.py", line 438, in validate_token_parameters
    raise_from_error(params.get('error'), params)
  File "/gpfs/share/skynet/apps/anaconda3/envs/wmlce_env_1.6.1/lib/python3.6/site-packages/oauthlib/oauth2/rfc6749/errors.py", line 405, in raise_from_error
    raise cls(**kwargs)
oauthlib.oauth2.rfc6749.errors.InvalidGrantError: (invalid_grant) Offline user session not found

Offline mode usage

I have a machine whose compute nodes have no internet access. Is it possible to use offline mode during the compute job and upload the results to the web interface after the job is finished?

Upload source code with *.py option.

When creating the experiment I need to list files by name:

neptune.create_experiment(upload_source_files=['main.py', 'utils.py', 'config.yaml'])

I would like to be able to just use glob patterns. For example:

neptune.create_experiment(upload_source_files=['*.py', '*.yaml'])
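Until wildcard support lands, a possible workaround is to expand the patterns yourself and pass the resulting list. The helper below is a sketch, demonstrated against a scratch directory rather than a real project:

```python
import glob
import os
import tempfile

def expand_patterns(patterns):
    """Expand glob patterns into a sorted, de-duplicated list of paths."""
    matched = set()
    for pattern in patterns:
        matched.update(glob.glob(pattern))
    return sorted(matched)

# Demo against a scratch directory:
with tempfile.TemporaryDirectory() as workdir:
    for name in ("main.py", "utils.py", "config.yaml"):
        open(os.path.join(workdir, name), "w").close()
    files = expand_patterns([os.path.join(workdir, "*.py"),
                             os.path.join(workdir, "*.yaml")])
    # files now lists all three paths; it could then be passed as
    # neptune.create_experiment(upload_source_files=files)
```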

ImportError when importing neptune

I get the following error message when importing neptune:

import neptune

  File "E:\01_Programs\01_Anaconda\envs\mlenv\lib\site-packages\neptune\__init__.py", line 19, in <module>
    from neptune import envs, projects, experiments
  File "E:\01_Programs\01_Anaconda\envs\mlenv\lib\site-packages\neptune\projects.py", line 23, in <module>
    from neptune.experiments import Experiment, push_new_experiment
  File "E:\01_Programs\01_Anaconda\envs\mlenv\lib\site-packages\neptune\experiments.py", line 31, in <module>
    from neptune.internal.utils.image import get_image_content
  File "E:\01_Programs\01_Anaconda\envs\mlenv\lib\site-packages\neptune\internal\utils\image.py", line 19, in <module>
    from PIL import Image
  File "E:\01_Programs\01_Anaconda\envs\mlenv\lib\site-packages\PIL\Image.py", line 94, in <module>
    from . import _imaging as core
ImportError: cannot import name '_imaging'

Python 3.6

Please help

add an option to pass a list of channels to .get_numeric_channel_values method

Currently, when fetching experiment data I have two options:

exp = project.get_experiment(id=['PROJ-28'])[0]
exp.get_numeric_channel_values('auc_train')
exp.get_numeric_channel_values('auc_train', 'auc_valid')

I would like to be able to pass a list without unpacking it.
Today:

channel_list = ['auc_train', 'auc_valid']
exp.get_numeric_channel_values(*channel_list)

I would like to:

channel_list = ['auc_train', 'auc_valid']
exp.get_numeric_channel_values(channel_list)

What do you think?
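One way the API could accept both calling conventions without breaking existing code is a small normalization step. This is only a sketch of the idea, not the library's implementation:

```python
def normalize_channels(*channels):
    """Accept either varargs ('a', 'b') or a single list/tuple (['a', 'b'])."""
    if len(channels) == 1 and isinstance(channels[0], (list, tuple)):
        return list(channels[0])
    return list(channels)

# Both spellings would then reach the backend identically:
# exp.get_numeric_channel_values('auc_train', 'auc_valid')
# exp.get_numeric_channel_values(['auc_train', 'auc_valid'])
```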

StringIO and 'neptune.send_artifact'

Hello,

It's a pleasure working with Neptune, thanks.
I am trying to log a StringIO as an artifact. I wish to avoid saving a temporary file just in order to log it and then having to remove it.

Neptune version: 0.4.126
It was installed using Conda on Ubuntu 18.04.
Python version 3.7.8

Here is a minimal example to reproduce:

import neptune
from io import StringIO
summary_string_io = StringIO()
summary_string_io.write("something, something.")
neptune.init('a/b')
neptune.create_experiment(name='minimal_example')
neptune.send_artifact(summary_string_io, destination="summary.txt")

The error I'm getting is (I omitted some of my paths):

Traceback (most recent call last):
  File "", line 12, in <module>
    neptune.send_artifact(summary_string_io, destination="summary.txt")
  File "lib/python3.7/site-packages/neptune/__init__.py", line 355, in send_artifact
    return get_experiment().log_artifact(artifact, destination)
  File "lib/python3.7/site-packages/neptune/experiments.py", line 620, in log_artifact
    experiment=self)
  File "lib/python3.7/site-packages/neptune/internal/storage/storage_utils.py", line 230, in upload_to_storage
    upload_api_fun(**dict(kwargs, data=file_chunk_stream, progress_indicator=progress_indicator))
  File "lib/python3.7/site-packages/neptune/internal/backends/hosted_neptune_backend.py", line 692, in upload_experiment_output
    query_params={})
  File "lib/python3.7/site-packages/neptune/internal/backends/hosted_neptune_backend.py", line 847, in _upload_loop
    ret = with_api_exceptions_handler(self._upload_loop_chunk)(fun, part, data, **kwargs)
  File "lib/python3.7/site-packages/neptune/utils.py", line 211, in wrapper
    return func(*args, **kwargs)
  File "lib/python3.7/site-packages/neptune/internal/backends/hosted_neptune_backend.py", line 865, in _upload_loop_chunk
    response = fun(data=part.get_data(), headers=headers, **kwargs)
  File "lib/python3.7/site-packages/neptune/internal/storage/datastream.py", line 34, in get_data
    return io.BytesIO(self.data)
TypeError: a bytes-like object is required, not 'str'

I tried looking for examples but I couldn't find any.

Thank you,
Ben.
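Given that the error complains about needing a bytes-like object, one possible workaround is to encode the StringIO contents into a BytesIO first. Whether send_artifact accepts an in-memory binary stream at all is an assumption here, not something the docs confirm:

```python
import io

summary_string_io = io.StringIO()
summary_string_io.write("something, something.")

# Encode the accumulated text into a binary stream:
summary_bytes_io = io.BytesIO(summary_string_io.getvalue().encode("utf-8"))

# then, hypothetically:
# neptune.send_artifact(summary_bytes_io, destination="summary.txt")
```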

JsonDecodeError

Hi, I'm running into the same issue that Richard reported in #280. I'm creating Neptune experiments within a loop, testing the same model on multiple subsets of a dataset.

The model is the same each time and the JsonDecodeError occurs randomly, i.e. the code will run for hours then randomly throw the error.

Sorry if this is the wrong place to post, I would have commented in the other thread but the issue was already closed.

Python Logger logs not captured in neptune UI

I am just getting started with my first neptune project and am running into a problem with the logger. I haven't seen any information online about this particular problem so I wanted to post it here.

The following is an example of a project and experiment initialization that passes in a python logger object.

import logging
import neptune
from src.keys import NEPTUNE_TOKEN
from neptune.experiments import Experiment
neptune.init('richt3211/thesis', api_token=NEPTUNE_TOKEN)

logger = logging.getLogger()

exp = neptune.create_experiment(
    name='test log',
    description='testing logger',
    logger=logger
)

logger.info('Starting experiment')

When I run this in a jupyter notebook, the experiment is created, but no logs appear.

I believe this is the correct way to capture a Python logger in Neptune according to the docs. If it's not, the docs might need to be updated to give a clearer example. If it is the correct way, there may be a bug.

Experiment state and stdout channel do not update when resuming an experiment

Hi,

Sending metrics when resuming an experiment works fine. However, the experiment state does not change from "succeeded" to "running", and neither system information (CPU/GPU) nor stdout is logged when resuming. Code example:

import time
import neptune
from neptune.sessions import Session

# Initialize experiment
neptune.init(project_qualified_name='Test/project')
experiment = neptune.create_experiment(name='test')
experiment_id = experiment.id

# Send metrics
for i in range(10):
    time.sleep(2)
    print('logging this to STDOUT channel.')
    experiment.send_metric('iter', i)
    
experiment.stop()

# Resume experiment
session = Session()
project = session.get_project(project_qualified_name='Test/project')
experiment = project.get_experiments(id=experiment_id)[0]

# Send metrics
for i in range(10):
    time.sleep(2)
    print('Unable to log this to STDOUT channel.')
    experiment.send_metric('iter', 3*i)

experiment.stop()

.download_artifact() doesn't work

I have tried downloading the artifact programmatically:

from neptune.sessions import Session

session = Session()
project = session.get_project(project_qualified_name='jakub-czakon/blog-hpo')
exp = project.get_experiments(id='BLOG-97')[0]

exp.download_artifact('forest_results.pkl', '.')

It results in the following error

---------------------------------------------------------------------------
AttributeError                            Traceback (most recent call last)
<ipython-input-17-8f3ed3e70168> in <module>
      6 exp = project.get_experiments(id='BLOG-97')[0]
      7 
----> 8 exp.download_artifact('forest_results.pkl', '.')

~/.envs/npt_dev/lib/python3.5/site-packages/neptune/experiments.py in download_artifact(self, filename, destination_dir)
    328             raise NotADirectory(destination_dir)
    329 
--> 330         self._client.download_data(self._project, path, destination_path)
    331 
    332     def send_graph(self, graph_id, value):

~/.envs/npt_dev/lib/python3.5/site-packages/neptune/client.py in wrapper(*args, **kwargs)
     51     def wrapper(*args, **kwargs):
     52         try:
---> 53             return func(*args, **kwargs)
     54         except requests.exceptions.SSLError:
     55             raise SSLError()

~/.envs/npt_dev/lib/python3.5/site-packages/neptune/client.py in download_data(self, project, path, destination)
    700                                      query_params={
    701                                          "projectId": project.internal_id,
--> 702                                          "path": path
    703                                      }) as response:
    704             if response.status_code == NOT_FOUND:

~/.envs/npt_dev/lib/python3.5/site-packages/neptune/client.py in _download_raw_data(self, api_method, headers, path_params, query_params)
    823         url = self.api_address + api_method.operation.path_name + "?"
    824 
--> 825         for key, val in path_params.iteritems():
    826             url = url.replace("{" + key + "}", val)
    827 

AttributeError: 'dict' object has no attribute 'iteritems'

Error message unclear

Running

import neptune
neptune.init()

neptune.send_metric('metric', 0.3)

results in a generic Python list error:

IndexError: list index out of range

when it could say that you need to run:

neptune.create_experiment()
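Until the library raises a more descriptive error, callers could wrap the send themselves. In the sketch below, send_fn stands in for neptune.send_metric (passed in as a parameter so the example stays self-contained):

```python
def send_metric_safely(send_fn, name, value):
    """Call send_fn and turn the bare IndexError raised when no experiment
    is active into an actionable message."""
    try:
        return send_fn(name, value)
    except IndexError:
        raise RuntimeError(
            "No active experiment: call neptune.create_experiment() "
            "before sending metrics."
        ) from None

# Intended usage: send_metric_safely(neptune.send_metric, 'metric', 0.3)
```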

How do I download from public neptune drive

Hi,

I am trying to download the /public/dsb_2018_data/ data on the public Neptune drive. I can't find any sample code on downloading data from the Neptune public drive.

Any help is very appreciated.

Thanks

allow neptune.send_metric to take "step" in addition to timestamp

I have some metrics that are logged every step and other metrics that are logged every 10 steps. Unfortunately, this makes the graphs non-comparable on the interface.

Could neptune.send_metric take a step count, so that both graphs become comparable when displayed?

Thanks

create_experiment() fails on windows 10

Hi there,

I enjoy neptune very much and on my macbook everything works fine. But when I run the same code on my Windows 10 machine, I get an error when calling create_experiment().

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "C:\ProgramData\Anaconda3\envs\rl_insurance\lib\site-packages\neptune\__init__.py", line 177, in create_experiment
    notebook_id=notebook_id
  File "C:\ProgramData\Anaconda3\envs\rl_insurance\lib\site-packages\neptune\projects.py", line 400, in create_experiment
    click.echo(str(experiment.id))
  File "C:\ProgramData\Anaconda3\envs\rl_insurance\lib\site-packages\click\utils.py", line 218, in echo
    file = _default_text_stdout()
  File "C:\ProgramData\Anaconda3\envs\rl_insurance\lib\site-packages\click\_compat.py", line 675, in func
    rv = wrapper_func()
  File "C:\ProgramData\Anaconda3\envs\rl_insurance\lib\site-packages\click\_compat.py", line 436, in get_text_stdout
    rv = _get_windows_console_stream(sys.stdout, encoding, errors)
  File "C:\ProgramData\Anaconda3\envs\rl_insurance\lib\site-packages\click\_winconsole.py", line 295, in _get_windows_console_stream
    func = _stream_factories.get(f.fileno())
AttributeError: 'StdOutWithUpload' object has no attribute 'fileno'

It happens when I run:

import neptune
import cfg
neptune.init(api_token=cfg.neptune_token, project_qualified_name=cfg.neptune_project_name)
neptune.create_experiment()

I run it in conda environments both times.

Download artifact from a previous experiment??

Hi,

  1. I have created an experiment.
  2. I logged a '.csv' file to that experiment and then stopped it (experiment.stop()).
  3. I now need to access that file from code.

Something like loading the experiment and getting the output files via 'experiment.download_artifact'.
The code might look like this:

previous_experiment = neptune.load_experiment(id='SAN-1')  # loading a previously made experiment
my_csv = previous_experiment.download_artifact('xyz.csv')

Is it possible?

neptune-client crashes on MacOS Catalina

Given neptune-client newly installed on MacOS Catalina

pip3 install neptune-client

when importing neptune

python3 -c "import neptune"

Python crashes with

Path:                  /usr/local/Cellar/python/3.7.4_1/Frameworks/Python.framework/Versions/3.7/Resources/Python.app/Contents/MacOS/Python
Identifier:            Python
Version:               3.7.4 (3.7.4)
Code Type:             X86-64 (Native)
Parent Process:        Python [7526]
Responsible:           Terminal [7510]
User ID:               501

Date/Time:             2019-10-07 20:59:20.675 +0530
OS Version:            Mac OS X 10.15 (19A582a)
Report Version:        12
Anonymous UUID:        CB7F20F6-96C0-4F63-9EC5-AFF3E0989687


Time Awake Since Boot: 3000 seconds

System Integrity Protection: enabled

Crashed Thread:        0  Dispatch queue: com.apple.main-thread

Exception Type:        EXC_CRASH (SIGABRT)
Exception Codes:       0x0000000000000000, 0x0000000000000000
Exception Note:        EXC_CORPSE_NOTIFY

Application Specific Information:
/usr/lib/libcrypto.dylib
abort() called
Invalid dylib load. Clients should not load the unversioned libcrypto dylib as it does not have a stable ABI.

This seems to be exactly the problem described in this SO question.

I was able to simply fix the problem by following this SO answer and doing:

pip3 uninstall cryptography
pip3 install cryptography

Not sure what the solution would be, other than noting this in the README and perhaps even alerting the user on stdout during import, if that is at all possible.

Asking for `experiment.name` takes a long time

Some properties of the experiments need to be downloaded from the backend. I understand this logic, but it is impractical. My case:

I have 1000 experiments and want to download some of them and filter them by name, but calling experiment.name requires 1000 calls to the server, which takes ages. This is a pain. Another server call happens when I ask for experiment.get_properties(), and another for experiment.state. That is 3000 requests to the server just for such simple data.

I think it would be perfectly reasonable to make get_experiments() download those data and keep them static, and if this is too convoluted, then please at least allow filtering experiments by name in get_experiments().
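As a workaround in the meantime: assuming the leaderboard table already carries the `id` and `name` columns (it does in my projects), a single `project.get_leaderboard()` call can replace the per-experiment requests, with the filtering done locally on the returned DataFrame:

```python
def experiments_by_name(project, name_fragment):
    """Fetch the leaderboard once, then filter experiments by name locally.

    Replaces N calls to experiment.name with a single request.
    """
    board = project.get_leaderboard()  # one call, returns a pandas DataFrame
    mask = board['name'].str.contains(name_fragment, na=False)
    ids = board.loc[mask, 'id'].tolist()
    return project.get_experiments(id=ids) if ids else []
```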

Python logging only works by first initializing Neptune

I've noticed that the python logging module only writes to Neptune's STDOUT channel when I initialize the Neptune experiment first. Like so:

import logging
import sys

import neptune

neptune.init(project_qualified_name='test_project')
experiment = neptune.create_experiment(name='test')

logger = logging.getLogger()
logger.setLevel(logging.INFO)

handler = logging.StreamHandler(sys.stdout)
handler.setLevel(logging.INFO)
formatter = logging.Formatter('%(asctime)s - %(name)s - %(levelname)s - %(message)s')
handler.setFormatter(formatter)
logger.addHandler(handler)

logger.info('Initializing experiment')

for i in range(5):
    experiment.send_metric('iter', i)
    
    logger.info('Iteration: {}'.format(i))

logger.info('Wrapping up experiment')

experiment.stop()

The STDOUT channel of my neptune experiment does not show output of the logger instance when initializing it the other way around:

import logging
import sys

import neptune

logger = logging.getLogger()
logger.setLevel(logging.INFO)

handler = logging.StreamHandler(sys.stdout)
handler.setLevel(logging.INFO)
formatter = logging.Formatter('%(asctime)s - %(name)s - %(levelname)s - %(message)s')
handler.setFormatter(formatter)
logger.addHandler(handler)

neptune.init(project_qualified_name='test_project')
experiment = neptune.create_experiment(name='test')

logger.info('Initializing experiment')

for i in range(5):
    experiment.send_metric('iter', i)
    
    logger.info('Iteration: {}'.format(i))

logger.info('Wrapping up experiment')

experiment.stop()

A solution to this, if possible, would benefit me in a project with a complicated sequence of class initializations.
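A workaround that may fit a complicated initialization order, under the assumption that Neptune swaps `sys.stdout` when the experiment is created (so a handler created earlier still points at the old stream): (re)create the stdout handler after `create_experiment()` rather than before:

```python
import logging
import sys

def attach_stdout_handler(logger, level=logging.INFO):
    """(Re)attach a StreamHandler bound to the *current* sys.stdout.

    Call this after neptune.create_experiment(), or call it again after
    it, so the handler picks up Neptune's wrapped stdout stream.
    """
    handler = logging.StreamHandler(sys.stdout)
    handler.setLevel(level)
    handler.setFormatter(logging.Formatter(
        '%(asctime)s - %(name)s - %(levelname)s - %(message)s'))
    logger.addHandler(handler)
    return handler
```

That way the classes can be initialized in any order, and only the handler attachment has to wait for the experiment.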

set_property silently fails when value is not a string

I ran the following script:

import neptune

neptune.init('jakub-czakon/examples')

with neptune.create_experiment():
    neptune.send_metric('score', 0.9)
    neptune.set_property('was_logged', True)

and no property is set for this experiment. There is no error; it just doesn't send it.
But if I run the following with str(True):

import neptune

neptune.init('jakub-czakon/examples')

with neptune.create_experiment():
    neptune.send_metric('score', 0.9)
    neptune.set_property('was_logged', str(True))

Everything works just fine.
Maybe set_property() should call str(value) automatically, or at least raise a warning or something.
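Until then, a thin user-side wrapper (a sketch, not a library change) avoids the silent failure by always coercing to a string:

```python
def set_property_safe(experiment, key, value):
    """Work around the silent failure by always sending a string value."""
    experiment.set_property(key, str(value))
```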

Invalid JSON received from frontend.

I got an "Invalid JSON received from frontend." message when logging in with neptune account login. It may be because I am using a SOCKS proxy. Please kindly suggest a way to solve this problem.

I modified the neptune/client.py file, in the function def _upload_tar_data(self, experiment, api_method, data):, to:

         proxies = {"http": "socks5h://127.0.0.1:9999", 'https': 'socks5h://127.0.0.1:9999'}
         return session.send(session.prepare_request(request), proxies=proxies)

With that change, it works for neptune-client without the command line.
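For what it's worth, recent versions of the client appear to accept a `proxies` argument on `neptune.init()` (it is forwarded to the HTTP backend), which may avoid patching client.py altogether. A small helper to build the requests-style dict:

```python
def socks_proxies(host='127.0.0.1', port=9999):
    """Build a requests-style proxies dict for a local SOCKS5 proxy."""
    url = f'socks5h://{host}:{port}'
    return {'http': url, 'https': url}

# neptune.init(project_qualified_name='...', api_token=..., proxies=socks_proxies())
```

Note that SOCKS support still requires the `requests[socks]` extra (PySocks) to be installed.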

Discovery of git repo location fails

In my setup, the discovery of the git repo location fails. Let me give the context: I have a main script where the experiment is defined, but the experiment is executed from a different script, which is outside my repo (to be precise, I run my experiments using ray). In such a context, the git repo of my experiment script is not located correctly. The problem is in:

def discover_git_repo_location():
    import __main__

    if hasattr(__main__, '__file__'):
        return os.path.dirname(os.path.abspath(__main__.__file__))
    return None

which asks for __main__, which, in my case, is some external ray module.

I would appreciate any workaround tips, e.g., how to change my __main__.

Related to neptune-ai/neptune-contrib#55
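A possible workaround (hypothetical, untested against ray): before calling `neptune.init()`, point `__main__.__file__` at the real experiment script, so that `discover_git_repo_location()` resolves the repo directory instead of ray's runner module. The path below is a placeholder:

```python
import os
import __main__

# Override the attribute that discover_git_repo_location() inspects.
# Replace the placeholder with the absolute path of your experiment script.
__main__.__file__ = '/path/to/my/repo/experiment.py'

# This mirrors what discover_git_repo_location() will now compute:
repo_dir = os.path.dirname(os.path.abspath(__main__.__file__))
```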

Error: X-coordinates must be strictly increasing

sorry to bother you guys, really great app

# Connect your script to Neptune

PARAMS = {'boosting_type': 'gbdt',
          'objective': 'binary',
          'metric': 'auc',
          'bagging_fraction': 0.7,
          'seed': 2020,
          }

# Create an experiment and log hyperparameters
neptune.create_experiment(name='test-lgb-1',
                          description='1st test on LC data, train-test-split, basic-fe',
                          params={**PARAMS,
                                  # 'num_boosting_round': NUM_BOOSTING_ROUNDS
                                  },
                          upload_source_files=['train.py', 'environment.yaml'],
                          )

# read data
train = pd.read_csv('cfn-train.csv')
test = pd.read_csv('cfn-testa.csv')

fea = [f for f in train.columns if f not in ['id', 'isDefault']]
X_train = train[fea]
X_test = test[fea]
y_train = train['isDefault']

folds = 5
seed = 2008
kf = KFold(n_splits=folds, shuffle=True, random_state=seed)

# _______________________________

cv_scores = []
for i, (train_ind, valid_ind) in enumerate(kf.split(X_train, y_train)):
    print('************************************ {} ************************************'.format(str(i + 1)))
    X_train_split, y_train_split, X_val, y_val = X_train.iloc[train_ind], y_train[train_ind], X_train.iloc[valid_ind], y_train[valid_ind]

    train_matrix = lgb.Dataset(X_train_split, label=y_train_split)  # categorical_feature = ['grade, subGrade']
    valid_matrix = lgb.Dataset(X_val, label=y_val)

    gbm = lgb.train(PARAMS,
                    train_set=train_matrix,
                    # num_boost_round=NUM_BOOSTING_ROUNDS,
                    valid_sets=valid_matrix,
                    verbose_eval=200,
                    early_stopping_rounds=200,
                    # valid_names=['train', 'valid'],
                    callbacks=[neptune_monitor()],  # monitor learning curves (prefix)
                    )

    val_pred = gbm.predict(X_val, num_iteration=gbm.best_iteration)
    cv_scores.append(roc_auc_score(y_val, val_pred))
    print(cv_scores)

I am hitting this bug and I don't know how to fix it... I searched and found an article, but it really got me confused...

Failed to send channel value.
Traceback (most recent call last):
  File "F:\Anaconda\envs\kaggle\lib\site-packages\neptune\internal\channels\channels_values_sender.py", line 156, in _send_values
    self._experiment._send_channels_values(channels_with_values)
  File "F:\Anaconda\envs\kaggle\lib\site-packages\neptune\experiments.py", line 1138, in _send_channels_values
    self._backend.send_channels_values(self, channels_with_values)
  File "F:\Anaconda\envs\kaggle\lib\site-packages\neptune\utils.py", line 210, in wrapper
    return func(*args, **kwargs)
  File "F:\Anaconda\envs\kaggle\lib\site-packages\neptune\internal\backends\hosted_neptune_backend.py", line 560, in send_channels_values
    raise ChannelsValuesSendBatchError(experiment.id, batch_errors)
neptune.api_exceptions.ChannelsValuesSendBatchError: Received batch errors sending channels' values to experiment LEN-9. Cause: Error(code=400, message='X-coordinates must be strictly increasing for channel: 4176ea13-112e-41f7-a353-fcd8348b3379. Invalid point: InputChannelValue(timestamp=2020-09-30T07:11:18.643Z, x=0.0, numericValue=0.7006723423306764, textVa', type=None) (metricId: '4176ea13-112e-41f7-a353-fcd8348b3379', x: 0.0) Skipping 100 values.
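The error comes from all folds writing to the same channels: each new fold restarts the LightGBM iteration counter at x=0, which violates the strictly-increasing-x requirement. Assuming the neptune-contrib monitor accepts a `prefix` argument, giving each fold its own channel namespace avoids the clash:

```python
def fold_prefix(fold_idx):
    # Channels become e.g. 'fold0_valid_0-auc', 'fold1_valid_0-auc', ...
    # so each channel's x values start at 0 exactly once.
    return f'fold{fold_idx}_'

# inside the CV loop:
#     callbacks=[neptune_monitor(prefix=fold_prefix(i))]
```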

NQL Allowing for Space in Field Names

I'm trying to use the advanced search option with NQL but am running into an issue writing a query for a specific field. My field is named "trained epochs", which is a nice way to quickly check the progress of a model without looking at the charts.

I would like to write a query for experiments that only have at least one value for the trained epochs metric. However, because there is a space in the field name, I can't write a query to accomplish this.

So far I have tried the following queries

"trained epochs" > 0
epochs > 0
trained\ epochs > 0 # trying to escape the space character

Is it possible to allow for spaces in NQL fields? Another solution would be to have an option to rename the metric so that it is compatible with NQL. If it isn't possible to allow for spaces in NQL, it would be nice to have something in the docs that mentions this for field name creation so that users are encouraged (or even forced) to not have spaces in the metric and log names.

Default value of upload_source_files is None

Hi,

the default value of upload_source_files in create_experiment is None, which is not an iterable. Therefore the following loop causes an error:

for filepath in upload_source_files:
    expanded_source_files |= set(glob.glob(filepath))

(line 391-392 in neptune/projects.py)

Running code with:
project.create_experiment(name='foo', upload_source_files=[])
does solve this problem, yet is not an elegant solution.

According to the doc-string, it is an optional argument:
upload_source_files (:obj:list, optional, default is ``['main.py']``):

Could you please fix this issue?

Thanks in advance!
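A minimal guard on the library side (a sketch of the fix, not the actual patch) would just treat None the same as an empty list:

```python
import glob

def expand_source_files(upload_source_files):
    """Expand glob patterns, treating None the same as an empty list."""
    expanded = set()
    for filepath in (upload_source_files or []):
        expanded |= set(glob.glob(filepath))
    return expanded
```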

Problems when using `num_workers` in Pytorch on MacOS

I have code like this:

from torch.utils.data import DataLoader
# ...
def main():
    # ...
    train_loader = DataLoader(
        datasets.train,
        shuffle=True,
        batch_size=ARGS.batch_size,
        num_workers=ARGS.num_workers,
        pin_memory=True,
    )

where num_workers determines how many worker processes PyTorch uses to read the data. If this is set to 0, then PyTorch reads the data in the main process. On MacOS, setting it to any value other than 0 seems to mess up neptune (the problem does not seem to appear on Linux; I have not tried Windows).

What seems to happen is that neptune starts a new experiment for each worker (this is with num_workers=4):

https://ui.neptune.ai/tmk/fcm/e/FCM-38
https://ui.neptune.ai/tmk/fcm/e/FCM-39
https://ui.neptune.ai/tmk/fcm/e/FCM-40
https://ui.neptune.ai/tmk/fcm/e/FCM-41
Traceback (most recent call last):
  File "<string>", line 1, in <module>
  File "/Users/tk324/anaconda/envs/pytorch/lib/python3.8/multiprocessing/spawn.py", line 116, in spawn_main
    exitcode = _main(fd, parent_sentinel)
  File "/Users/tk324/anaconda/envs/pytorch/lib/python3.8/multiprocessing/spawn.py", line 125, in _main
    prepare(preparation_data)
  File "/Users/tk324/anaconda/envs/pytorch/lib/python3.8/multiprocessing/spawn.py", line 236, in prepare
    _fixup_main_from_path(data['init_main_from_path'])
  File "/Users/tk324/anaconda/envs/pytorch/lib/python3.8/multiprocessing/spawn.py", line 287, in _fixup_main_from_path
    main_content = runpy.run_path(main_path,
  File "/Users/tk324/anaconda/envs/pytorch/lib/python3.8/runpy.py", line 263, in run_path
    return _run_module_code(code, init_globals, run_name,
  File "/Users/tk324/anaconda/envs/pytorch/lib/python3.8/runpy.py", line 96, in _run_module_code
    _run_code(code, mod_globals, init_globals,
  File "/Users/tk324/anaconda/envs/pytorch/lib/python3.8/runpy.py", line 86, in _run_code
    exec(code, run_globals)
  File "/Users/tk324/PycharmProjects/fair-dist-matching/run_clust.py", line 4, in <module>
    main()
  File "/Users/tk324/PycharmProjects/fair-dist-matching/clustering/optimisation/train.py", line 170, in main
    input_shape = get_data_dim(context_loader)
  File "/Users/tk324/PycharmProjects/fair-dist-matching/shared/utils/utils.py", line 44, in get_data_dim
    x = next(iter(data_loader))[0]
  File "/Users/tk324/anaconda/envs/pytorch/lib/python3.8/site-packages/torch/utils/data/dataloader.py", line 291, in __iter__
    return _MultiProcessingDataLoaderIter(self)
  File "/Users/tk324/anaconda/envs/pytorch/lib/python3.8/site-packages/torch/utils/data/dataloader.py", line 737, in __init__
    w.start()
  File "/Users/tk324/anaconda/envs/pytorch/lib/python3.8/multiprocessing/process.py", line 121, in start
    self._popen = self._Popen(self)
  File "/Users/tk324/anaconda/envs/pytorch/lib/python3.8/multiprocessing/context.py", line 224, in _Popen
    return _default_context.get_context().Process._Popen(process_obj)
  File "/Users/tk324/anaconda/envs/pytorch/lib/python3.8/multiprocessing/context.py", line 283, in _Popen
    return Popen(process_obj)
  File "/Users/tk324/anaconda/envs/pytorch/lib/python3.8/multiprocessing/popen_spawn_posix.py", line 32, in __init__
    super().__init__(process_obj)
  File "/Users/tk324/anaconda/envs/pytorch/lib/python3.8/multiprocessing/popen_fork.py", line 19, in __init__
    self._launch(process_obj)
  File "/Users/tk324/anaconda/envs/pytorch/lib/python3.8/multiprocessing/popen_spawn_posix.py", line 42, in _launch
    prep_data = spawn.get_preparation_data(process_obj._name)
  File "/Users/tk324/anaconda/envs/pytorch/lib/python3.8/multiprocessing/spawn.py", line 154, in get_preparation_data
    _check_not_importing_main()
  File "/Users/tk324/anaconda/envs/pytorch/lib/python3.8/multiprocessing/spawn.py", line 134, in _check_not_importing_main
    raise RuntimeError('''
RuntimeError: 
        An attempt has been made to start a new process before the
        current process has finished its bootstrapping phase.

        This probably means that you are not using fork to start your
        child processes and you have forgotten to use the proper idiom
        in the main module:

            if __name__ == '__main__':
                freeze_support()
                ...

        The "freeze_support()" line can be omitted if the program
        is not going to be frozen to produce an executable.
Traceback (most recent call last):
  File "/Users/tk324/anaconda/envs/pytorch/lib/python3.8/site-packages/torch/utils/data/dataloader.py", line 779, in _try_get_data
    data = self._data_queue.get(timeout=timeout)
  File "/Users/tk324/anaconda/envs/pytorch/lib/python3.8/multiprocessing/queues.py", line 107, in get
    if not self._poll(timeout):
  File "/Users/tk324/anaconda/envs/pytorch/lib/python3.8/multiprocessing/connection.py", line 257, in poll
    return self._poll(timeout)
  File "/Users/tk324/anaconda/envs/pytorch/lib/python3.8/multiprocessing/connection.py", line 424, in _poll
    r = wait([self], timeout)
  File "/Users/tk324/anaconda/envs/pytorch/lib/python3.8/multiprocessing/connection.py", line 930, in wait
    ready = selector.select(timeout)
  File "/Users/tk324/anaconda/envs/pytorch/lib/python3.8/selectors.py", line 415, in select
    fd_event_list = self._selector.poll(timeout)
  File "/Users/tk324/anaconda/envs/pytorch/lib/python3.8/site-packages/torch/utils/data/_utils/signal_handling.py", line 66, in handler
    _error_if_any_worker_fails()
RuntimeError: DataLoader worker (pid 53749) exited unexpectedly with exit code 1. Details are lost due to multiprocessing. Rerunning with num_workers=0 may give better error trace.

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "run_clust.py", line 4, in <module>
    main()
  File "/Users/tk324/PycharmProjects/fair-dist-matching/clustering/optimisation/train.py", line 170, in main
    input_shape = get_data_dim(context_loader)
  File "/Users/tk324/PycharmProjects/fair-dist-matching/shared/utils/utils.py", line 44, in get_data_dim
    x = next(iter(data_loader))[0]
  File "/Users/tk324/anaconda/envs/pytorch/lib/python3.8/site-packages/torch/utils/data/dataloader.py", line 363, in __next__
    data = self._next_data()
  File "/Users/tk324/anaconda/envs/pytorch/lib/python3.8/site-packages/torch/utils/data/dataloader.py", line 974, in _next_data
    idx, data = self._get_data()
  File "/Users/tk324/anaconda/envs/pytorch/lib/python3.8/site-packages/torch/utils/data/dataloader.py", line 941, in _get_data
    success, data = self._try_get_data()
  File "/Users/tk324/anaconda/envs/pytorch/lib/python3.8/site-packages/torch/utils/data/dataloader.py", line 792, in _try_get_data
    raise RuntimeError('DataLoader worker (pid(s) {}) exited unexpectedly'.format(pids_str))
RuntimeError: DataLoader worker (pid(s) 53749) exited unexpectedly

With num_workers=0, this runs fine.
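A likely explanation (an assumption, but consistent with the four extra FCM-38..41 runs and with the RuntimeError text): on macOS, DataLoader workers are started with the 'spawn' method, which re-imports the main module in every worker process, so any module-level code, including Neptune experiment creation, runs once per worker. Guarding the entry point keeps setup in the parent process only:

```python
def main():
    # neptune.init(...) / neptune.create_experiment(...) and the
    # DataLoader(..., num_workers=4) construction belong here,
    # not at module level.
    ...

if __name__ == '__main__':
    # Workers spawned on macOS re-import this module; because of the
    # guard, main() -- and the experiment creation inside it -- does
    # not re-run in them.
    main()
```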

send list of values to channel

I would like to be able to send a list of values to a channel rather than sending one value at a time in a for loop.
Something like the following would be great:

neptune.send_metric('accuracy', [0.8, 0.4, 0.9])

neptune.send_artifacts(['model.pkl','report.pdf'])

neptune.send_images('diagnostics', ['roc_auc.png','pred_dist.png', 'conf_matrix.png'])

neptune.set_property({'data_version': 'f23fasdqw122312',
                      'data_path': 'data/raw/table.csv',
                      'model_path': 'models/model_v1.csv'})
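Until such a batch API exists, a thin user-side wrapper over the current one (a sketch; it just loops) gives the same call shape:

```python
def send_metrics(experiment, channel_name, values):
    """Log a whole list of values to one channel, in order."""
    for value in values:
        experiment.send_metric(channel_name, value)

def set_properties(experiment, properties):
    """Set several properties from a dict at once."""
    for key, value in properties.items():
        experiment.set_property(key, value)
```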

cannot use '.' in tags

I am trying to add the feature version as the tag '1.0' and it results in an error:

neptune.api_exceptions.ExperimentValidationError: Tags: [1.0] are invalid. Valid tags may contain only lowercase letters, digits, underscores and dashes.

Instantiate estimator from best parameters

Hello!

First off, I absolutely love the interface and the easy integration/documentation you provide!! And hats off to @jakubczakon, who provided great tutorials! However, I stumbled on a problem when retrieving the best results. I saved artifacts but I'm not successful in retrieving them.

>>> exp = project.get_experiments('BAY-59')[0]
>>> artifacts = exp.download_artifacts()
>>> print(artifacts)
None

And if I try retrieving the best results via project.get_leaderboard('BAY-59'), the result is a string.

`log_artifact` fails silently

I tried to upload my pytorch model weights as an artifact through a BytesIO:

buffer = BytesIO()
print(getsizeof(buffer))  # 96
torch.save(model.state_dict(), buffer)
print(getsizeof(buffer))  # 101291839
experiment.log_artifact(artifact=buffer, destination="fold0.pth")

But the artifact is empty (0B) and the python script doesn't crash:

Screenshot from 2020-06-24 16-40-00
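One likely cause, worth checking before anything else: torch.save() leaves the buffer's position at the end of the stream, so when the client later read()s it, it gets zero bytes, which matches the 0B artifact. Rewinding before the upload should help (sketch with a stand-in payload instead of real weights):

```python
from io import BytesIO

buffer = BytesIO()
buffer.write(b'fake model bytes')  # stand-in for torch.save(model.state_dict(), buffer)

# The stream position is now at the end: read() would return b''.
# Rewind before handing the buffer to Neptune:
buffer.seek(0)
# experiment.log_artifact(artifact=buffer, destination="fold0.pth")
```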

simplejson.errors.JSONDecodeError

This didn't happen before with the same code; I am using the fastai Neptune callback.

Traceback (most recent call last):
  File "finetune.py", line 116, in <module>
    neptune.init(project_qualified_name='natsume/electra-glue')
  File "/home/yisiang/miniconda3/envs/ml/lib/python3.7/site-packages/neptune/__init__.py", line 148, in init
    backend = HostedNeptuneBackend(api_token, proxies)
  File "/home/yisiang/miniconda3/envs/ml/lib/python3.7/site-packages/neptune/utils.py", line 210, in wrapper
    return func(*args, **kwargs)
  File "/home/yisiang/miniconda3/envs/ml/lib/python3.7/site-packages/neptune/internal/backends/hosted_neptune_backend.py", line 91, in __init__
    self._client_config = self._create_client_config(self.credentials.api_token, backend_client)
  File "/home/yisiang/miniconda3/envs/ml/lib/python3.7/site-packages/neptune/utils.py", line 210, in wrapper
    return func(*args, **kwargs)
  File "/home/yisiang/miniconda3/envs/ml/lib/python3.7/site-packages/neptune/internal/backends/hosted_neptune_backend.py", line 930, in _create_client_config
    config = backend_client.api.getClientConfig(X_Neptune_Api_Token=api_token).response().result
  File "/home/yisiang/miniconda3/envs/ml/lib/python3.7/site-packages/bravado/http_future.py", line 200, in response
    swagger_result = self._get_swagger_result(incoming_response)
  File "/home/yisiang/miniconda3/envs/ml/lib/python3.7/site-packages/bravado/http_future.py", line 124, in wrapper
    return func(self, *args, **kwargs)
  File "/home/yisiang/miniconda3/envs/ml/lib/python3.7/site-packages/bravado/http_future.py", line 303, in _get_swagger_result
    self.request_config.response_callbacks,
  File "/home/yisiang/miniconda3/envs/ml/lib/python3.7/site-packages/bravado/http_future.py", line 337, in unmarshal_response
    op=operation,
  File "/home/yisiang/miniconda3/envs/ml/lib/python3.7/site-packages/bravado/http_future.py", line 374, in unmarshal_response_inner
    content_value = response.json()
  File "/home/yisiang/miniconda3/envs/ml/lib/python3.7/site-packages/bravado/requests_client.py", line 160, in json
    return self._delegate.json(**kwargs)
  File "/home/yisiang/miniconda3/envs/ml/lib/python3.7/site-packages/requests/models.py", line 898, in json
    return complexjson.loads(self.text, **kwargs)
  File "/home/yisiang/miniconda3/envs/ml/lib/python3.7/site-packages/simplejson/__init__.py", line 525, in loads
    return _default_decoder.decode(s)
  File "/home/yisiang/miniconda3/envs/ml/lib/python3.7/site-packages/simplejson/decoder.py", line 370, in decode
    obj, end = self.raw_decode(s)
  File "/home/yisiang/miniconda3/envs/ml/lib/python3.7/site-packages/simplejson/decoder.py", line 400, in raw_decode
    return self.scan_once(s, idx=_w(s, idx).end())
simplejson.errors.JSONDecodeError: Expecting value: line 1 column 1 (char 0)

download image channel

It would be useful to download images from the image channel.
For example:

exp = project.get_experiment(id='PROJ-1')
exp.download_image_channel('local/dest/to/image/dir')

And the images would land in my `local/dest/to/image/dir` folder.

ModuleNotFoundError: No module named 'oauthlib.oauth2' but it is installed

I'm trying to run an experiment but I am getting an error:

neptune.init("blackarbsceo/hpo-es-features", api_token=api_key)
neptune.create_experiment(
    "hyperparameter-optuna-rf-test", upload_source_files=["*.py"]
)
neptune_callback = optuna_utils.NeptuneCallback(log_study=True, log_charts=True)
Traceback (most recent call last):
  File "C:\Users\kngka\Anaconda3\envs\mlfinlab2\lib\site-packages\IPython\core\interactiveshell.py", line 3418, in run_code
    exec(code_obj, self.user_global_ns, self.user_ns)
  File "<ipython-input-2-63e8b42c81c5>", line 17, in <module>
    import neptune
  File "C:\Program Files\JetBrains\PyCharm Community Edition 2020.1.2\plugins\python-ce\helpers\pydev\_pydev_bundle\pydev_import_hook.py", line 21, in do_import
    module = self._system_import(name, *args, **kwargs)
  File "C:\Users\kngka\Anaconda3\envs\mlfinlab2\lib\site-packages\neptune\__init__.py", line 24, in <module>
    from neptune.internal.backends.hosted_neptune_backend import HostedNeptuneBackend
  File "C:\Program Files\JetBrains\PyCharm Community Edition 2020.1.2\plugins\python-ce\helpers\pydev\_pydev_bundle\pydev_import_hook.py", line 21, in do_import
    module = self._system_import(name, *args, **kwargs)
  File "C:\Users\kngka\Anaconda3\envs\mlfinlab2\lib\site-packages\neptune\internal\backends\hosted_neptune_backend.py", line 58, in <module>
    from neptune.oauth import NeptuneAuthenticator
  File "C:\Program Files\JetBrains\PyCharm Community Edition 2020.1.2\plugins\python-ce\helpers\pydev\_pydev_bundle\pydev_import_hook.py", line 21, in do_import
    module = self._system_import(name, *args, **kwargs)
  File "C:\Users\kngka\Anaconda3\envs\mlfinlab2\lib\site-packages\neptune\oauth.py", line 21, in <module>
    from oauthlib.oauth2 import TokenExpiredError, OAuth2Error
  File "C:\Program Files\JetBrains\PyCharm Community Edition 2020.1.2\plugins\python-ce\helpers\pydev\_pydev_bundle\pydev_import_hook.py", line 21, in do_import
    module = self._system_import(name, *args, **kwargs)
ModuleNotFoundError: No module named 'oauthlib.oauth2'
oauthlib                  3.1.0                    pypi_0    pypi
neptune-client            0.4.132+2.g26cdb5a          pypi_0    pypi
neptune-contrib           0.25.0                   pypi_0    pypi

add many tags at once

Hey,

It would be good/friendly to be able to add many tags at once. It could be, for example, npt_exp.append_tag(['tag1', 'tag2', 'tag3']), so a list of tags. In this case adding a single tag just as a string would still be valid, like: npt_exp.append_tag('tag1')

@aniezurawski @jakubczakon what do you think?

parameters are all strings

Problem
my code

PARAMS = {'lr': 0.0005,
          'dropout': 0.2,
          'batch_size': 64,
          'optimizer': 'adam',
          }

project = neptune.Session().get_project('kamil/Tensor-Cell-Demo')
npt_exp = project.create_experiment(name='neural-net-mnist',
                                    params=PARAMS
                                    )

Now, when I do npt_exp.get_parameters()['batch_size'], I get a string back, while it should be an int.

Solution
npt_exp.get_parameters() should return a dict with the original types.

What do you guys think: @pitercl @lukasz-walkiewicz
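Until the types round-trip through the backend, a client-side workaround (a sketch using the standard library's ast.literal_eval, which safely parses Python literals) can restore them:

```python
import ast

def parse_param(value):
    """Best-effort conversion of a stringified parameter back to its type."""
    try:
        return ast.literal_eval(value)
    except (ValueError, SyntaxError):
        return value  # e.g. 'adam' is not a literal and stays a string

# What get_parameters() currently returns vs. the recovered types:
raw = {'lr': '0.0005', 'dropout': '0.2', 'batch_size': '64', 'optimizer': 'adam'}
typed = {k: parse_param(v) for k, v in raw.items()}
```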

ModuleNotFoundError (it is installed and appears with pip list)

Hi,

Running in VS Code Jupyter Notebooks with Python environment 3.8.2
neptune-client 0.4.130

when I try to:
import neptune

I get this error:


ModuleNotFoundError                       Traceback (most recent call last)
<ipython-input> in <module>
----> 1 import neptune
      2
      3 import pandas as pd
      4 import numpy as np
      5

ModuleNotFoundError: No module named 'neptune'

And yes, the library is installed:

!pip list

Package                 Version
----------------------- --------
...                     6.0.7
nbformat                5.0.8
neptune-client          0.4.130
neptune-notebooks       0.0.16
nest-asyncio            1.4.3
...

I have tried with different versions but nothing, any idea?

Thank you
