Giter Club home page Giter Club logo

pydgn's People

Contributors

dependabot[bot] avatar diningphil avatar gravins avatar sauravmaheshkar avatar

Stargazers

 avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar

Watchers

 avatar  avatar  avatar  avatar  avatar  avatar  avatar

pydgn's Issues

Keep getting raylet error

๐Ÿ”จ Describe the bug

Hi, I am keep getting raylet error when tryin g to run the example. Is there a way to stop using ray since I am running the experiment on my local computer?

Thank you!

(raylet) /home/jwtxwd/anaconda3/envs/pydgn/lib/python3.8/site-packages/ray/autoscaler/_private/cli_logger.py:57: FutureWarning: Not all Ray CLI dependencies were found. In Ray 1.4+, the Ray CLI, autoscaler, and dashboard will only be usable via pip install 'ray[default]'. Please update your install command.
(raylet) warnings.warn(
2022-10-14 13:08:12,974 WARNING worker.py:1189 -- The agent on node EW22-05284 failed with the following error:
Traceback (most recent call last):
File "/home/jwtxwd/anaconda3/envs/pydgn/lib/python3.8/site-packages/ray/new_dashboard/agent.py", line 354, in
loop.run_until_complete(agent.run())
File "/home/jwtxwd/anaconda3/envs/pydgn/lib/python3.8/asyncio/base_events.py", line 616, in run_until_complete
return future.result()
File "/home/jwtxwd/anaconda3/envs/pydgn/lib/python3.8/site-packages/ray/new_dashboard/agent.py", line 144, in run
modules = self._load_modules()
File "/home/jwtxwd/anaconda3/envs/pydgn/lib/python3.8/site-packages/ray/new_dashboard/agent.py", line 98, in _load_modules
c = cls(self)
File "/home/jwtxwd/anaconda3/envs/pydgn/lib/python3.8/site-packages/ray/new_dashboard/modules/reporter/reporter_agent.py", line 148, in init
self._metrics_agent = MetricsAgent(dashboard_agent.metrics_export_port)
File "/home/jwtxwd/anaconda3/envs/pydgn/lib/python3.8/site-packages/ray/_private/metrics_agent.py", line 75, in init
prometheus_exporter.new_stats_exporter(
File "/home/jwtxwd/anaconda3/envs/pydgn/lib/python3.8/site-packages/ray/_private/prometheus_exporter.py", line 333, in new_stats_exporter
exporter = PrometheusStatsExporter(
File "/home/jwtxwd/anaconda3/envs/pydgn/lib/python3.8/site-packages/ray/_private/prometheus_exporter.py", line 266, in init
self.serve_http()
File "/home/jwtxwd/anaconda3/envs/pydgn/lib/python3.8/site-packages/ray/_private/prometheus_exporter.py", line 320, in serve_http
start_http_server(
File "/home/jwtxwd/anaconda3/envs/pydgn/lib/python3.8/site-packages/prometheus_client/exposition.py", line 168, in start_wsgi_server
TmpServer.address_family, addr = _get_best_family(addr, port)
File "/home/jwtxwd/anaconda3/envs/pydgn/lib/python3.8/site-packages/prometheus_client/exposition.py", line 157, in _get_best_family
infos = socket.getaddrinfo(address, port)
File "/home/jwtxwd/anaconda3/envs/pydgn/lib/python3.8/socket.py", line 918, in getaddrinfo
for res in _socket.getaddrinfo(host, port, family, type, proto, flags):
socket.gaierror: [Errno -2] Name or service not known

(raylet) Traceback (most recent call last):
(raylet) File "/home/jwtxwd/anaconda3/envs/pydgn/lib/python3.8/site-packages/ray/new_dashboard/agent.py", line 366, in
(raylet) raise e
(raylet) File "/home/jwtxwd/anaconda3/envs/pydgn/lib/python3.8/site-packages/ray/new_dashboard/agent.py", line 354, in
(raylet) loop.run_until_complete(agent.run())
(raylet) File "/home/jwtxwd/anaconda3/envs/pydgn/lib/python3.8/asyncio/base_events.py", line 616, in run_until_complete
(raylet) return future.result()
(raylet) File "/home/jwtxwd/anaconda3/envs/pydgn/lib/python3.8/site-packages/ray/new_dashboard/agent.py", line 144, in run
(raylet) modules = self._load_modules()
(raylet) File "/home/jwtxwd/anaconda3/envs/pydgn/lib/python3.8/site-packages/ray/new_dashboard/agent.py", line 98, in _load_modules
(raylet) c = cls(self)
(raylet) File "/home/jwtxwd/anaconda3/envs/pydgn/lib/python3.8/site-packages/ray/new_dashboard/modules/reporter/reporter_agent.py", line 148, in init
(raylet) self._metrics_agent = MetricsAgent(dashboard_agent.metrics_export_port)
(raylet) File "/home/jwtxwd/anaconda3/envs/pydgn/lib/python3.8/site-packages/ray/_private/metrics_agent.py", line 75, in init
(raylet) prometheus_exporter.new_stats_exporter(
(raylet) File "/home/jwtxwd/anaconda3/envs/pydgn/lib/python3.8/site-packages/ray/_private/prometheus_exporter.py", line 333, in new_stats_exporter
(raylet) exporter = PrometheusStatsExporter(
(raylet) File "/home/jwtxwd/anaconda3/envs/pydgn/lib/python3.8/site-packages/ray/_private/prometheus_exporter.py", line 266, in init
(raylet) self.serve_http()
(raylet) File "/home/jwtxwd/anaconda3/envs/pydgn/lib/python3.8/site-packages/ray/_private/prometheus_exporter.py", line 320, in serve_http
(raylet) start_http_server(
(raylet) File "/home/jwtxwd/anaconda3/envs/pydgn/lib/python3.8/site-packages/prometheus_client/exposition.py", line 168, in start_wsgi_server
(raylet) TmpServer.address_family, addr = _get_best_family(addr, port)
(raylet) File "/home/jwtxwd/anaconda3/envs/pydgn/lib/python3.8/site-packages/prometheus_client/exposition.py", line 157, in _get_best_family
(raylet) infos = socket.getaddrinfo(address, port)
(raylet) File "/home/jwtxwd/anaconda3/envs/pydgn/lib/python3.8/socket.py", line 918, in getaddrinfo
(raylet) for res in _socket.getaddrinfo(host, port, family, type, proto, flags):
(raylet) socket.gaierror: [Errno -2] Name or service not known
(raylet) /home/jwtxwd/anaconda3/envs/pydgn/lib/python3.8/site-packages/ray/autoscaler/_private/cli_logger.py:57: FutureWarning: Not all Ray CLI dependencies were found. In Ray 1.4+, the Ray CLI, autoscaler, and dashboard will only be usable via pip install 'ray[default]'. Please update your install command.
(raylet) warnings.warn(
2022-10-14 13:08:14,624 WARNING worker.py:1189 -- The agent on node EW22-05284 failed with the following error:
Traceback (most recent call last):
File "/home/jwtxwd/anaconda3/envs/pydgn/lib/python3.8/site-packages/ray/new_dashboard/agent.py", line 354, in
loop.run_until_complete(agent.run())
File "/home/jwtxwd/anaconda3/envs/pydgn/lib/python3.8/asyncio/base_events.py", line 616, in run_until_complete
return future.result()
File "/home/jwtxwd/anaconda3/envs/pydgn/lib/python3.8/site-packages/ray/new_dashboard/agent.py", line 144, in run
modules = self._load_modules()
File "/home/jwtxwd/anaconda3/envs/pydgn/lib/python3.8/site-packages/ray/new_dashboard/agent.py", line 98, in _load_modules
c = cls(self)
File "/home/jwtxwd/anaconda3/envs/pydgn/lib/python3.8/site-packages/ray/new_dashboard/modules/reporter/reporter_agent.py", line 148, in init
self._metrics_agent = MetricsAgent(dashboard_agent.metrics_export_port)
File "/home/jwtxwd/anaconda3/envs/pydgn/lib/python3.8/site-packages/ray/_private/metrics_agent.py", line 75, in init
prometheus_exporter.new_stats_exporter(
File "/home/jwtxwd/anaconda3/envs/pydgn/lib/python3.8/site-packages/ray/_private/prometheus_exporter.py", line 333, in new_stats_exporter
exporter = PrometheusStatsExporter(
File "/home/jwtxwd/anaconda3/envs/pydgn/lib/python3.8/site-packages/ray/_private/prometheus_exporter.py", line 266, in init
self.serve_http()
File "/home/jwtxwd/anaconda3/envs/pydgn/lib/python3.8/site-packages/ray/_private/prometheus_exporter.py", line 320, in serve_http
start_http_server(
File "/home/jwtxwd/anaconda3/envs/pydgn/lib/python3.8/site-packages/prometheus_client/exposition.py", line 168, in start_wsgi_server
TmpServer.address_family, addr = _get_best_family(addr, port)
File "/home/jwtxwd/anaconda3/envs/pydgn/lib/python3.8/site-packages/prometheus_client/exposition.py", line 157, in _get_best_family
infos = socket.getaddrinfo(address, port)
File "/home/jwtxwd/anaconda3/envs/pydgn/lib/python3.8/socket.py", line 918, in getaddrinfo
for res in _socket.getaddrinfo(host, port, family, type, proto, flags):
socket.gaierror: [Errno -2] Name or service not known

Metric improvement

Feature description

Metric should always use the result from the _handle_reduction function, even when accumulating the number of samples.

This is because _handle_reduction may do something more complicated in some metrics that override the function.

Be careful, however, that this works for both "mean" and "sum" reductions.

Ideas on how to do it

No response

Additional info

No response

Weighted Additive Loss

Feature description

Improve AdditiveLoss by adding the possibility of weighting the different losses.

Ideas on how to do it

No response

Additional info

No response

Training engine and returned data list

Feature description

When shuffling the dataset, the training engine will return the data list shuffled according to the permutation of the data sampler. We should make sure that we reorder the data list.

Ideas on how to do it

No response

Additional info

No response

Accumulate predictions for metric that require global statistics

Feature description

Metrics like AP require that all the samples of the train/valt/test dataset are taken into account when computing a score. We should add an option to allow, with greater usage of memory, to accumulate all predictions and target values until the end of an epoch, and subsequently compute the epoch score.

Ideas on how to do it

No response

Additional info

No response

Improving `load_dataset`

Feature description

The load_dataset function assumes:

  • the dataset_kwargs.pt are stored necessarily in the processed folder, which may not be the case (not all datasets assume the processed data has to be stored there)
  • no extra arguments can be passed to the dataset at runtime

Ideas on how to do it

The function should

  • accept extra kwargs dict to pass additional arguments to the dataset at run-time (e.g., to specify 'training', 'validation', 'test')
  • we should store the dataset_kwargs.pt file in the main dataset folder, since we have more guarantees that this folder will be created

Additional info

No response

Bug: training engine does not work with shuffle = False

๐Ÿ”จ Describe the bug

On line 495 of infer this error is raised:

AssertionError: Training Engine requires a pydgn.data.sampler.RandomSampler as the sampler of the loader when shuffle=True. Please see the documentation of DataProvider and the _get_loader method.

But when shuffle=False a torch.utils.data.sampler.SequentialSampler, not None. So we should fix it.

Telegram Bot support

Feature description

Add telegram bot support to send messages whenever an experiment breaks suddenly or the entire set of experiments (with the chance to have granularity) has finished

Ideas on how to do it

Once you create a bot, it should be easy to send a message to a particular chat.

Additional info

No response

Add option to specify subset of GPUs

Feature description

In case one wants to force specific GPU IDs to be used, it should be possible to do so when running pydgn-train.

Ideas on how to do it

No response

Additional info

No response

Filtering down configurations for analysis

Feature description

Add a utility feature to filter the set of configurations of an outer fold's model selection to perform subsequent analyses.

Ideas on how to do it

Use a dictionary-based approach, where we decide what to filter using key-value matchings and AND/OR logic.

Additional info

No response

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    ๐Ÿ–– Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. ๐Ÿ“Š๐Ÿ“ˆ๐ŸŽ‰

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google โค๏ธ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.