Comments (3)
@wxdrizzle Can you provide a simple code example?

BTW, why did you go with `dict_params = task.get_parameters(cast=True)` and `task.set_parameters(dict_params)` rather than `dict_params = task.connect(dict_params)`?
from clearml.
Hi @ainoam , thanks a lot for your reply!
Code example
First, I created a file `training.py` with the following code:
```python
from clearml import Task
import sys

cli = sys.argv[1:]
# '--manually' tells the script it is being run by hand, not by an agent
run_by_agent = '--manually' not in cli

def read_yaml(run_by_agent):
    if run_by_agent:
        dict_params = {}
    else:
        dict_params = {
            'dataset/modalities': ['CT', 'MRI'],
            'model/name': 'u-net',
        }
    return dict_params

if not run_by_agent:
    task = Task.init(project_name='tmp_project', task_name='tmp_task', task_type="training")
    # the following two lines are because I want the agents to use my existing Python environment
    task.set_base_docker(docker_image='/home/xxx/software/anaconda3/envs/research')
    task.set_packages([])
    dict_params = read_yaml(run_by_agent)
    task.set_parameters(dict_params)
else:
    task = Task.init()
    dict_params = task.get_parameters(cast=True)

print('run by agent?', run_by_agent)
print('dict_params: ', dict_params)
print('type of dataset/modalities', type(dict_params['dataset/modalities']))
```
Then I executed this file manually with `python training.py --manually`. Here `--manually` just tells the code that it is not being executed by an agent. (I'm not sure whether there is a simpler way to determine automatically whether the file is run by an agent or manually; if so, please let me know.) The output was:
```
run by agent? False
dict_params:  {'dataset/modalities': ['CT', 'MRI'], 'model/name': 'u-net'}
type of dataset/modalities <class 'list'>
```
Note that I used `task.set_base_docker()` to prevent the subsequent hyperparameter-optimization step from creating a new Python environment. This method is from here. As you can see, the type of the hyperparameter "dataset/modalities" is `list`, as expected.
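On detecting agent runs: if I'm reading the clearml API correctly, `Task.running_locally()` can replace the `--manually` flag entirely. A minimal sketch (worth verifying against your clearml version, since this is my assumption rather than something from the thread):

```python
from clearml import Task

# Task.running_locally() reports whether the script was launched by hand
# (True) or is being executed by a clearml-agent (False), so no custom
# command-line flag is needed.
run_by_agent = not Task.running_locally()

task = Task.init(project_name='tmp_project', task_name='tmp_task', task_type="training")
```

If this works as described, the `sys.argv` handling in `training.py` above could be dropped.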
Next, I found that the ID of the generated task was "13f202cc8a014ba4b92f1e93e34352d1". Then I created another file, `hyperparam_optim.py`, with the following code:
```python
from clearml.automation import UniformParameterRange, UniformIntegerParameterRange, ParameterSet, DiscreteParameterRange
from clearml.automation import HyperParameterOptimizer, GridSearch, Objective
from clearml.automation.optuna import OptimizerOptuna
from clearml import Task

task = Task.init(project_name='tmp_project', task_name='hyperparam_optim',
                 task_type=Task.TaskTypes.optimizer, reuse_last_task_id=False)

objective_metric = Objective('test', 'dice_mean')
optimizer = GridSearch(
    base_task_id='13f202cc8a014ba4b92f1e93e34352d1',
    hyper_parameters=[
        DiscreteParameterRange('model/name', ['resnet']),
    ],
    objective_metric=objective_metric,
    num_concurrent_workers=16,
    objective_metric_title='test',
    objective_metric_series='dice_mean',
    objective_metric_sign='max',
    execution_queue='one_gpu_work',
    max_iteration_per_job=50000,
)

optimizer.start()
optimizer.wait()
optimizer.stop()
```
Note that the `base_task_id` will be different if you try to reproduce this.
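As a side note, the base task could also be looked up by name instead of hard-coding its ID, which makes the script reproducible. A sketch, assuming the task created by `training.py` still exists in `tmp_project`:

```python
from clearml import Task

# Look up the baseline task created by training.py by its project and
# name, then use its ID as the HPO base task.
base_task = Task.get_task(project_name='tmp_project', task_name='tmp_task')
base_task_id = base_task.id
```

`base_task_id` can then be passed to `GridSearch(...)` in place of the literal string.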
Then I ran `python hyperparam_optim.py`, and in the WebApp I saw one task run by the HPO. In that task's console output, you can see that the type of the hyperparameter "dataset/modalities" changed from `str` back to `list` is not preserved: it is now `str`. As a current workaround, I have to go through all hyperparameters manually and use `eval(xxxx)` to convert each value of `xxxx` from `str` back to the type it should be. If you have any solution, I'd be very grateful.
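For what it's worth, `ast.literal_eval` is a safer alternative to `eval` for this kind of workaround, since it only accepts Python literals and cannot execute arbitrary code. A minimal sketch (the helper name `cast_param` is mine):

```python
import ast

def cast_param(value):
    """Try to parse a string back into a Python literal (list, int, ...);
    fall back to the raw string for plain values like 'u-net'."""
    if not isinstance(value, str):
        return value
    try:
        return ast.literal_eval(value)
    except (ValueError, SyntaxError):
        return value

# Simulate parameters that came back from the server as strings
dict_params = {'dataset/modalities': "['CT', 'MRI']", 'model/name': 'u-net'}
dict_params = {k: cast_param(v) for k, v in dict_params.items()}
print(type(dict_params['dataset/modalities']))  # <class 'list'>
```

This still guesses types from string form, so it is a workaround rather than a fix.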
Regarding why I didn't use `task.connect()`
One reason is that my hyperparameters fall into different categories: some are for the dataset, some are for the model, and so on. For example, assume the hyperparameter dict `dict_params` has two keys, `"model/name"` and `"dataset/name"`. If I use `task.connect(dict_params)`, these keys all appear in the "General" section of the WebApp, as shown below:
But I prefer to have separate sections such as "model" and "dataset". If I use `task.set_parameters(dict_params)` instead, I can achieve this, as shown below:
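For reference, `task.connect()` also accepts a `name` argument that sets the WebApp section, so per-category dicts can get their own sections too. A sketch, assuming that parameter behaves as I expect:

```python
from clearml import Task

task = Task.init(project_name='tmp_project', task_name='tmp_task')

model_params = {'name': 'u-net'}
dataset_params = {'modalities': ['CT', 'MRI']}

# Each dict should appear under its own named section in the WebApp
# instead of everything landing in "General".
model_params = task.connect(model_params, name='model')
dataset_params = task.connect(dataset_params, name='dataset')
```

This keeps the sectioning benefit of `set_parameters()` while retaining `connect()`'s two-way sync.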
Another reason is that I'm actually not sure how to use `task.connect()` with agents. As you can see above, in my daily practice, when I run experiments manually I want to read the hyperparameters from a specific yaml file to initialize `dict_params`. However, consider the case where I clone an existing task, modify some hyperparameters, and then send it to a queue so that an agent can run it. In that case there is no yaml file path, so I can only set `dict_params` to `{}` before calling `dict_params = task.connect(dict_params)`. I found that the result is still `{}`; that is, I'm not able to get the modified hyperparameters via `task.connect()` when running under an agent. Did I do something wrong? I'd really appreciate any suggestions on this as well.
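My understanding (an assumption worth verifying, but it matches the `{}` result described above) is that under an agent `connect()` only overrides values for keys that already exist in the dict, so connecting an empty dict gives the backend nothing to fill in. Keeping the defaults in code rather than only in the yaml file would sidestep this:

```python
from clearml import Task

task = Task.init(project_name='tmp_project', task_name='tmp_task')

# Defaults defined in code (e.g. mirroring the yaml file). When a
# clearml-agent runs a cloned copy, connect() should overwrite these
# values with the ones edited in the WebApp; with an empty dict there
# are no keys for it to overwrite.
dict_params = {'dataset/modalities': ['CT', 'MRI'], 'model/name': 'u-net'}
dict_params = task.connect(dict_params)
```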
The problem was solved by installing the new version of clearml, as described at #975 (comment). Thanks a lot!!