
azureml-cheatsheets's Introduction

Azure Machine Learning Cheat Sheets

https://azure.github.io/azureml-cheatsheets

This website is built using Docusaurus 2, a modern static website generator.

Contributions

Please see the contributing guide.

Contributor License Agreement

This project welcomes contributions and suggestions. Most contributions require you to agree to a Contributor License Agreement (CLA) declaring that you have the right to, and actually do, grant us the rights to use your contribution. For details, visit https://cla.opensource.microsoft.com.

When you submit a pull request, a CLA bot will automatically determine whether you need to provide a CLA and decorate the PR appropriately (e.g., status check, comment). Simply follow the instructions provided by the bot. You will only need to do this once across all repos using our CLA.

Code of Conduct

This project has adopted the Microsoft Open Source Code of Conduct. For more information see the Code of Conduct FAQ or contact [email protected] with any additional questions or comments.

azureml-cheatsheets's People

Contributors

abeomor, aminsaied, fuhuifang, jongholeewest, kit1980, konabuta, kyoro1, lostmygithubaccount, microsoft-github-policy-service[bot], mx-iao, ruizhuanguw, thomasp-ms, vmagelo, xia-xiao

azureml-cheatsheets's Issues

[Cheatsheet] Logging examples

Provide more run.log examples, along with examples of how to read the metrics back from a run via the SDK (highlighting the different formats the metrics come back in: list, dict, ...).

The following metrics can be added to a run while training an experiment.

Scalar

Log a numerical or string value to the run with the given name using log. Logging a metric to a run causes that metric to be stored in the run record in the experiment. You can log the same metric multiple times within a run, the result being considered a vector of that metric.

Example: run.log("accuracy", 0.95)
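
Logging the same metric name repeatedly builds up such a vector, which renders as a chart in the studio. A minimal sketch (train_one_epoch is a hypothetical helper; run is the current Run):

# One value per epoch accumulates into a series under "accuracy"
for epoch in range(10):
    acc = train_one_epoch()  # hypothetical training step returning a float
    run.log("accuracy", acc)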

List

Log a list of values to the run with the given name using log_list.

Example: run.log_list("accuracies", [0.6, 0.7, 0.87])

Row

Using log_row creates a metric with multiple columns as described in kwargs. Each named parameter generates a column with the value specified. log_row can be called once to log an arbitrary tuple, or multiple times in a loop to generate a complete table.

Example: run.log_row("Y over X", x=1, y=0.4)
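
As noted above, calling log_row in a loop builds the table one row at a time; a small sketch:

# Each call appends one row to the "Y over X" metric
for x in range(1, 4):
    run.log_row("Y over X", x=x, y=x * 0.4)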

Table

Log a dictionary object to the run with the given name using log_table.

Example: run.log_table("Y over X", {"x":[1, 2, 3], "y":[0.6, 0.7, 0.89]})

Image

Log an image to the run record. Use log_image to log an image file or a matplotlib plot to the run. These images will be visible and comparable in the run record.

Example: run.log_image("ROC", path)

import numpy as np
import matplotlib.pyplot as plt

# Render a plot, save it to a file, then attach the file to the run record
X = np.random.rand(10, 10)
plt.imshow(X)
plt.savefig('test.png')
run.log_image('test.png', path='test.png')
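
To read metrics back from a run via the SDK (as this issue requests), a minimal sketch:

# get_metrics returns a dict keyed by metric name; metrics logged repeatedly
# or via log_list come back as lists, and log_table entries as dicts of lists
metrics = run.get_metrics()
print(metrics["accuracy"])    # a scalar, or a list if logged multiple times
print(metrics["accuracies"])  # [0.6, 0.7, 0.87]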

[Cheatsheet] Pip requirements examples

Provide examples of what is possible for pip installation in Azure ML environments (ADO work item).

More details:
This ask came from the DeepSpeed team as they develop their DeepSpeed curated environment.

How can we include "non-standard" pip commands in environments? For example:

apex's install builds a number of kernels (CUDA/C++) and needs to be installed like this:
pip install -v --disable-pip-version-check --no-cache-dir --global-option="--cpp_ext" --global-option="--cuda_ext" ./

which is easy to define in a traditional Dockerfile, but doing it in the curated environment (CE) setup is a bit tricky.

The Conda file's pip section is effectively a pip requirements.txt file and supports most pip options. See doc at:
https://pip.pypa.io/en/stable/reference/pip_install/#requirements-file-format

That includes --global-option, for example, but requirements files do not support the complete set of pip flags.

If there's a flag we need that isn't covered there, the only path today is to add it directly to the Dockerfile.
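
For options that requirements files do support, the SDK can also set them programmatically. A minimal sketch, assuming SDK v1's CondaDependencies API (the package and index URL are placeholders):

from azureml.core import Environment
from azureml.core.conda_dependencies import CondaDependencies

conda = CondaDependencies()
conda.add_pip_package("torch==1.13.1")  # placeholder package
conda.set_pip_option("--extra-index-url https://example.org/simple")  # placeholder index URL

env = Environment(name="my-env")
env.python.conda_dependencies = conda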

Medium-term, we're working on changes that will store our environments as a Dockerfile + context, which will make advanced use cases like this more straightforward.

[Snippet] Snippets for Hyperdrive runs

Proposal for the new snippet:

Adding Snippet for Hyperdrive runs

https://docs.microsoft.com/en-us/azure/machine-learning/how-to-tune-hyperparameters
https://docs.microsoft.com/en-us/python/api/azureml-train-core/azureml.train.hyperdrive.hyperdriverun?view=azure-ml-py

Example of the snippet:

from azureml.train.hyperdrive import (BanditPolicy, HyperDriveConfig,
                                      PrimaryMetricGoal, RandomParameterSampling,
                                      choice, normal, uniform)

# Define the hyperparameter search space
param_sampling = RandomParameterSampling({
    "learning_rate": normal(10, 3),
    "keep_probability": uniform(0.05, 0.1),
    "batch_size": choice(16, 32, 64, 128)
})

# Terminate poorly performing runs early
early_termination_policy = BanditPolicy(slack_factor=0.1, evaluation_interval=1, delay_evaluation=5)

# src is the ScriptRunConfig for the training script
hd_config = HyperDriveConfig(run_config=src,
                             hyperparameter_sampling=param_sampling,
                             policy=early_termination_policy,
                             primary_metric_name="accuracy",
                             primary_metric_goal=PrimaryMetricGoal.MAXIMIZE,
                             max_total_runs=100,
                             max_concurrent_runs=4)

from azureml.core.experiment import Experiment
experiment = Experiment(workspace, experiment_name)
hyperdrive_run = experiment.submit(hd_config)
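
Once the run completes, a natural follow-up for the snippet (get_best_run_by_primary_metric is part of HyperDriveRun):

# Retrieve the best-performing child run and its logged metrics
best_run = hyperdrive_run.get_best_run_by_primary_metric()
print(best_run.get_metrics())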

NOTE:
All snippets are added to this folder. Feel free to submit a PR directly to the repo to fast track this snippet request

Installation instruction of npm and yarn does not link to the right place?

Which cheat sheet? Describe the issue

cheat sheet: Contributing
description: It seems the instructions for installing npm and yarn link to the Docusaurus installation page rather than to npm/yarn themselves. Although there are links in the requirements section, it might be better to give the installation links directly on the contributing page.

Datasets and Datastore clarification

The current page on datasets and datastores needs clarifying:

  • Too many ways to do the same thing. Make the recommended approach (datasets) clear.
  • For adding a data reference to ScriptRunConfig we mention secret environment variables. Make it clear that the user does not need to know about these to use DataReference in their script.

Adding Version Number to the Notebook Snippets

Currently, customers don't know which version of the SDK the snippets are for.

How about if we put SDK version into the snippet like this:

"Import Workspace": {
"prefix": ["import-workspace"],
"body": [
"from azureml.core import Workspace",
"$0"
],
"description": "Import Workspace class",
"python_sdk_version": "1.1.18"
},

Update to Show SDK V2

Describe the request

It would be great if you updated the cheat sheets to show information using SDK v2, as that's the recommended SDK version when starting a new project now.

Additional context

It would perhaps be worth keeping v1 around, but with a toggle switch or something similar to view both versions.

[Snippet] Need a bunch of snippets for registering assets in AML

Proposal for the new snippet:

Need snippets for registering:

  • Models
  • Datastores
  • Datasets
  • Environments

Example of the snippet:

Dataset (https://docs.microsoft.com/en-us/azure/machine-learning/how-to-create-register-datasets)

# titanic_ds is an existing Dataset object; register() returns the registered dataset
titanic_ds = titanic_ds.register(workspace=ws,
                                 name='titanic_ds',
                                 description='titanic training data')

Datastore (https://docs.microsoft.com/en-us/azure/machine-learning/how-to-access-data):

import os
from azureml.core import Datastore

blob_datastore_name='azblobsdk' # Name of the datastore to workspace
container_name=os.getenv("BLOB_CONTAINER", "<my-container-name>") # Name of Azure blob container
account_name=os.getenv("BLOB_ACCOUNTNAME", "<my-account-name>") # Storage account name
account_key=os.getenv("BLOB_ACCOUNT_KEY", "<my-account-key>") # Storage account access key

blob_datastore = Datastore.register_azure_blob_container(workspace=ws,
                                                         datastore_name=blob_datastore_name,
                                                         container_name=container_name,
                                                         account_name=account_name,
                                                         account_key=account_key)

file_datastore_name='azfilesharesdk' # Name of the datastore to workspace
file_share_name=os.getenv("FILE_SHARE_CONTAINER", "<my-fileshare-name>") # Name of Azure file share container
account_name=os.getenv("FILE_SHARE_ACCOUNTNAME", "<my-account-name>") # Storage account name
account_key=os.getenv("FILE_SHARE_ACCOUNT_KEY", "<my-account-key>") # Storage account access key

file_datastore = Datastore.register_azure_file_share(workspace=ws,
                                                     datastore_name=file_datastore_name, 
                                                     file_share_name=file_share_name, 
                                                     account_name=account_name,
                                                     account_key=account_key)

adlsgen2_datastore_name = 'adlsgen2datastore'

subscription_id=os.getenv("ADL_SUBSCRIPTION", "<my_subscription_id>") # subscription id of ADLS account
resource_group=os.getenv("ADL_RESOURCE_GROUP", "<my_resource_group>") # resource group of ADLS account

account_name=os.getenv("ADLSGEN2_ACCOUNTNAME", "<my_account_name>") # ADLS Gen2 account name
tenant_id=os.getenv("ADLSGEN2_TENANT", "<my_tenant_id>") # tenant id of service principal
client_id=os.getenv("ADLSGEN2_CLIENTID", "<my_client_id>") # client id of service principal
client_secret=os.getenv("ADLSGEN2_CLIENT_SECRET", "<my_client_secret>") # the secret of service principal

adlsgen2_datastore = Datastore.register_azure_data_lake_gen2(workspace=ws,
                                                             datastore_name=adlsgen2_datastore_name,
                                                             account_name=account_name, # ADLS Gen2 account name
                                                             filesystem='test', # ADLS Gen2 filesystem
                                                             tenant_id=tenant_id, # tenant id of service principal
                                                             client_id=client_id, # client id of service principal
                                                             client_secret=client_secret) # the secret of service principal

Model (https://docs.microsoft.com/en-us/azure/machine-learning/how-to-deploy-update-web-service):

from azureml.core.model import Model

# Register new model.
new_model = Model.register(model_path="outputs/sklearn_mnist_model.pkl",
                           model_name="sklearn_mnist",
                           tags={"key": "0.1"},
                           description="test",
                           workspace=ws)

Environments (https://docs.microsoft.com/en-us/azure/machine-learning/how-to-train-with-custom-image):

# Register Environments
from azureml.core.environment import Environment
myenv = Environment(name="myenv")
myenv.register(workspace=ws)
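
A useful companion to these would be snippets for retrieving registered assets back by name; a minimal sketch (names match the registration examples above):

from azureml.core import Dataset, Datastore, Environment
from azureml.core.model import Model

# Retrieve each asset type by the name it was registered under
titanic_ds = Dataset.get_by_name(ws, name='titanic_ds')
blob_datastore = Datastore.get(ws, 'azblobsdk')
model = Model(ws, name='sklearn_mnist')
myenv = Environment.get(workspace=ws, name='myenv')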

NOTE:
All snippets are added to this folder. Feel free to submit a PR directly to the repo to fast track this snippet request

[Snippet] AzureML Widgets

Proposal for the new snippet:

Adding a snippet for the AML Widgets

Documentation: https://docs.microsoft.com/en-us/python/api/azureml-widgets/azureml.widgets.rundetails?view=azure-ml-py#remarks
Python Package: https://pypi.org/project/azureml-widgets/

Example of the Full snippet:

from azureml.core import Workspace, Experiment, Run
from azureml.widgets import RunDetails

ws = Workspace.from_config()
exp = ws.experiments['$experiment_name']
run = Run(exp, '$run_id')
RunDetails(run).show()

Example of the simple snippet:

from azureml.widgets import RunDetails
RunDetails($run_object).show()

Here $run_object is a Run instance, constructed as Run(experiment, run_id, outputs=None, **kwargs); see the Run class documentation for details.

NOTE:
All snippets are added to this folder. Feel free to submit a PR directly to the repo to fast track this snippet request

Commands in ScriptRunConfig

ScriptRunConfig can accept a command argument. Add an example and some real-world use cases.

e.g. some simplified version of the following

from azureml.core import Environment, Experiment, ScriptRunConfig, Workspace
from azureml.core.runconfig import MpiConfiguration

ws = Workspace.from_config()
target = ws.compute_targets["gpu-nc24-ssh"]

env = Environment.from_pip_requirements('elr2-classification', '../requirements.txt')
env.docker.enabled = True
env.docker.base_image = 'mcr.microsoft.com/azureml/openmpi3.1.2-cuda10.1-cudnn7-ubuntu18.04'

mpi_config = MpiConfiguration(process_count_per_node=4, node_count=2)

# Set PYTHONPATH and cd into a subdirectory before launching the training script
command = "export PYTHONPATH=$PWD/ELR2 && cd ELR2-Scenarios/classification/elr2_version && python test_distributed_training.py".split()

src = ScriptRunConfig(
    source_directory='../../..',
    command=command,
    compute_target=target,
    distributed_job_config=mpi_config,
    environment=env,
)

run = Experiment(ws, "elr2-ddp-test").submit(src)

[Snippet] Some ML pipeline snippets for individual steps

Proposal for the new snippet:

https://docs.microsoft.com/en-us/azure/machine-learning/concept-ml-pipelines

Example of the snippet:

from azureml.core import Workspace, Experiment, Datastore, Dataset
from azureml.data import OutputFileDatasetConfig
from azureml.data.datapath import DataPath
from azureml.pipeline.core import Pipeline
from azureml.pipeline.steps import PythonScriptStep

ws = Workspace.from_config()
blob_store = Datastore(ws, "workspaceblobstore")
compute_target = ws.compute_targets["STANDARD_NC6"]
experiment = Experiment(ws, 'MyExperiment')

# File dataset used as the pipeline input
input_data = Dataset.File.from_files(
    DataPath(blob_store, '20newsgroups/20news.pkl'))
input_named = input_data.as_named_input('raw_data')

# Intermediate data written by the prep step and read by the training step
dataprep_output = OutputFileDatasetConfig()

dataprep_step = PythonScriptStep(
    name="prep_data",
    script_name="dataprep.py",
    compute_target=compute_target,
    arguments=[input_named.as_mount(), dataprep_output],
    source_directory="myfolder",
)

train_step = PythonScriptStep(
    name="train",
    script_name="train.py",
    arguments=["--input", dataprep_output.as_input(), "--output", OutputFileDatasetConfig()],
    compute_target=compute_target,
    source_directory="myfolder",
)

pipeline = Pipeline(workspace=ws, steps=[dataprep_step, train_step])

pipeline_run = experiment.submit(pipeline)
pipeline_run.wait_for_completion()

NOTE:
All snippets are added to this folder. Feel free to submit a PR directly to the repo to fast track this snippet request

[Snippet] Create Tabular or File Dataset

Proposal for the new snippet:

Add a snippet to make it easy to create datasets in AML
https://docs.microsoft.com/en-us/azure/machine-learning/how-to-create-register-datasets

Example of the snippet:

from azureml.core import Dataset

# datastore is an existing Datastore object (retrieved as in the next example)
# create a FileDataset pointing to files in the 'animals' folder and its subfolders recursively
datastore_paths = [(datastore, 'animals')]
animal_ds = Dataset.File.from_files(path=datastore_paths)

# create a FileDataset from image and label files behind public web urls
web_paths = ['https://azureopendatastorage.blob.core.windows.net/mnist/train-images-idx3-ubyte.gz',
             'https://azureopendatastorage.blob.core.windows.net/mnist/train-labels-idx1-ubyte.gz']
mnist_ds = Dataset.File.from_files(path=web_paths)

or

from azureml.core import Workspace, Datastore, Dataset

datastore_name = 'your datastore name'

# get existing workspace
workspace = Workspace.from_config()
    
# retrieve an existing datastore in the workspace by name
datastore = Datastore.get(workspace, datastore_name)

# create a TabularDataset from 3 file paths in datastore
datastore_paths = [(datastore, 'weather/2018/11.csv'),
                   (datastore, 'weather/2018/12.csv'),
                   (datastore, 'weather/2019/*.csv')]

weather_ds = Dataset.Tabular.from_delimited_files(path=datastore_paths)
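
A typical follow-up, to show the dataset is usable, is materializing it in memory:

# Load the TabularDataset into a pandas DataFrame
weather_df = weather_ds.to_pandas_dataframe()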

NOTE:
All snippets are added to this folder. Feel free to submit a PR directly to the repo to fast track this snippet request

[Cheatsheet] Example: Add path to data directory

Question

I am using ScriptRunConfig with the command argument. This expects that all the paths passed are relative to the datastore mount. The following code correctly mounts training_dataset and test_dataset.

How can we specify paths to folders? Dataset always expects files, and DataPath doesn't work. I couldn't find an example online.

ds = Datastore.register_azure_file_share(workspace=ws, 
                                         datastore_name='NAME', 
                                         file_share_name='NAME',
                                         account_name='NAME', 
                                         account_key='key',
                                         create_if_not_exists=True)
 
ds = Datastore.get(ws,'NAME')
print(f'found the data datasource     {ds}')
 
 
training_dataset = Dataset.File.from_files(path=(ds, 'train_path_tsv'))
test_dataset = Dataset.File.from_files(path=(ds, 'test_path_tsv'))
output_dir = DataPath(datastore=ds, path_on_datastore='output/')
model_dir = DataPath(datastore=ds, path_on_datastore='model_path')
 
config = ScriptRunConfig(source_directory='.',
                         command=['python',
                                  'script.py',
                                  '--config',
                                  "config.yaml",
                                  '--output',
                                  output_dir,
                                  '--overwrite',
                                  '',                                  
                                  '--data.train_set.CSV.data_files',
                                  training_dataset.as_mount(),
                                   '--data.eval_set.CSV.data_files',
                                  test_dataset.as_mount(),
                                  '--model.model_name_or_path',
                                  model_dir],
                         compute_target=compute_target,
                         environment=env)

Potential answer

from azureml.pipeline.core import PipelineData

output_dir = PipelineData(
    name="output_dir",
    datastore=pipeline_datastore,
    pipeline_output_name="output_dir",
    is_directory=True,
)
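
For a plain ScriptRunConfig run (outside a pipeline), an alternative sketch is OutputFileDatasetConfig, which can represent a directory on a datastore as a run output:

from azureml.data import OutputFileDatasetConfig

# Directory on the datastore where the run writes its output
output_dir = OutputFileDatasetConfig(destination=(ds, 'output/'))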

The copyright year

Which cheat sheet? Describe the issue

The footer shows only "2021". Should it be updated to include the current year?

ScriptRunConfig with commands

ScriptRunConfig supports a command argument (from which version of the SDK?). This is really useful; let's add it to the cheat sheet.

e.g.

command = "export PYTHONPATH=$PWD && cd foo && python lights_on.py".split()

src = ScriptRunConfig(
    source_directory='src',
    command=command,
    compute_target=target,
    distributed_job_config=mpi_config,
    environment=pytorch_env,
)

Multilingual support

Some customers in Japan have asked us to translate these pages into Japanese. I would like to contribute Japanese pages. Is there any plan to support non-English languages on this site?

I don't know much about Docusaurus, but it seems to have multi-language support: i18n - Introduction

Model deployment

We are missing model deployment.

Worse, model deployment is poorly documented in our official docs, and the process is not straightforward.

[Snippet] AzureML Consume Dataset

Proposal for the new snippet:

Adding a snippet to consume datasets from AzureML Datasets, similar to what is shown on the AzureML Dataset page. This will download the dataset to the user's working directory.

Example of the snippet:

# azureml-core of version 1.0.72 or higher is required
from azureml.core import Workspace, Dataset

subscription_id = '<subscription-id>'
resource_group = '<resource-group>'
workspace_name = '<workspace-name>'

workspace = Workspace(subscription_id, resource_group, workspace_name)

dataset = Dataset.get_by_name(workspace, name='<dataset-name>')
dataset.download(target_path='.', overwrite=False)

NOTE:
All snippets are added to this folder. Feel free to submit a PR directly to the repo to fast track this snippet request

Summary report for `Experiment`

Describe the request

We have an example on checking logs with a picture; what about adding a summary view as in the attached screenshot?

(screenshot of the proposed experiment summary)

Such a graph is much easier to check at first glance, like a kind of summary report.

[Snippet] MpiConfiguration

Proposal for the new snippet:

Automatically generate a distributed training configuration, e.g. MpiConfiguration, including the import.

Example of the snippet:

from azureml.core.runconfig import MpiConfiguration

distributed_config = MpiConfiguration(
    process_count_per_node=args.process_count_per_node,
    node_count=args.node_count,
)
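
A sketch of where this plugs in (target and env are assumed to be defined as usual):

from azureml.core import ScriptRunConfig

src = ScriptRunConfig(source_directory='src',
                      script='train.py',
                      compute_target=target,
                      distributed_job_config=distributed_config,
                      environment=env)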

This is one of those scenarios I have to look up every time!

Description about "datastore.upload_files"

Describe the request

When we use datastore.upload_files, we may encounter a warning like the one at this link, and we're urged to use Dataset.File.upload_directory instead:

"datastore.upload_files" is deprecated after version 1.0.69

How about adding some notes in this section?

https://azure.github.io/azureml-cheatsheets/docs/cheatsheets/python/v1/data/#upload-to-blob-datastore

Ref: the official reference is https://docs.microsoft.com/en-us/python/api/azureml-core/azureml.data.azure_storage_datastore.azureblobdatastore?view=azure-ml-py#azureml-data-azure-storage-datastore-azureblobdatastore-upload-files
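
A sketch of the replacement call (Dataset.File.upload_directory is the recommended API; the local path and target folder here are placeholders):

from azureml.core import Dataset
from azureml.data.datapath import DataPath

# Upload a local folder to the datastore without the deprecated API
ds = Dataset.File.upload_directory(src_dir='./data',
                                   target=DataPath(datastore, 'data'))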

Developing on a compute cluster

Developing on an Azure ML compute cluster is a frequent pain point, for (at least) the following reasons:

  • Idle time waiting for the cluster to warm up / download images
  • Changes to the environment cause an image rebuild, which is very slow

Teams often find it easier to work with a VM to avoid some of this (which has its own drawbacks, e.g. working with datasets on a VM, and needing an additional migration step from the VM to the Azure ML SDK).

Proposal to write a guide for "Developing on an azureml compute cluster".

A typical use case might be testing distributed training, for which a cluster that matches your production environment is preferred.

Tips:

  • Increase idle time on the cluster
  • Set up the cluster with SSH in case you want to get inside
  • Use command with ScriptRunConfig to:
    • Add environment variables and,
    • Update the python environment without having to rebuild the docker image
    • Run a setup.py script ahead of the "training" script as a best practice to achieve the above

Another tip is to use PYTHONPATH=$PWD if you are actively working on source code, so you don't have to update the Python environment (see the sketch below).
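
A sketch combining these tips in one command (the extra requirements file and script names are placeholders; target and env are assumed to be defined):

from azureml.core import ScriptRunConfig

# Install extra packages at run time instead of rebuilding the Docker image
command = ("export PYTHONPATH=$PWD && "
           "pip install -r extra-requirements.txt && "
           "python train.py").split()

src = ScriptRunConfig(source_directory='src',
                      command=command,
                      compute_target=target,
                      environment=env)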
