README - Guidance for training an AWS DeepRacer model using Amazon SageMaker

Table of contents

  • Overview
    • Cost
  • Prerequisites
    • Operating system
  • Deployment steps
  • Deployment validation
  • Running the guidance
  • Next steps
  • Cleanup

Overview

The AWS DeepRacer console is optimized to provide a user-friendly introduction to reinforcement learning for developers new to machine learning. As developers go deeper in their machine learning journey, they need more control and more options for tuning and refining their reinforcement learning models for racing with AWS DeepRacer. This guidance provides developers with a deep dive on how they can use an Amazon SageMaker notebook instance to directly train and evaluate DeepRacer models with full control, including augmenting the simulation environment, manipulating inputs to the neural network, modifying the neural network architecture, running distributed rollouts, and debugging their models.

Architecture overview

[Architecture diagram: training an AWS DeepRacer model using Amazon SageMaker]

Cost

You are responsible for the cost of the AWS services used while running this Guidance.

As of 10/30/2023, the cost of running this guidance with the default settings in the US East (N. Virginia) Region is approximately $31.27 per month for training 5 models, 1 hour each, with training spread across 5 days. Remember to shut down your SageMaker notebook instance each day.

| Service | Assumptions | Cost per month |
| --- | --- | --- |
| Amazon SageMaker Studio Notebook | 1 notebook instance used for 25 hours | $9.97 |
| Amazon SageMaker Training | 5 jobs per month x 1 instance per job x 1 hour per job, 32 GB SSD storage | $6.87 |
| AWS RoboMaker | 25 Simulation Unit hours (SU-hours) | $10.00 |
| Amazon CloudWatch | 5 GB logs storage | $2.52 |
| Amazon Simple Storage Service | 10 GB data, 1,000 PUT, 1,000 GET requests | $0.26 |
| Amazon Kinesis Video Streams | 5 hours data ingestion per day, 5 days storage | $1.65 |
| VPC | All traffic flows through a Gateway VPC Endpoint | $0.00 |
| Total | | $31.27 |

Prerequisites (required)

This guidance is targeted toward those familiar with the AWS Console and the AWS DeepRacer service. Users are expected to have a basic understanding of AWS DeepRacer, Amazon SageMaker, AWS RoboMaker, and general machine learning concepts. It guides users in using these services directly to train and tune their models to a higher level of performance. It should be run in the US East (N. Virginia) Region.

Operating system

Since the guidance runs in the AWS cloud on an Amazon SageMaker notebook instance, you only need a web browser; access it from a Mac or Windows machine. Linux is not recommended.

Deployment steps (required)

To deploy the dpr401 AWS CloudFormation stack in order to run the DPR401-Notebook instance on Amazon SageMaker (an equivalent AWS CLI command follows these steps):

  1. Log in to your personal AWS account in the AWS Console.
  2. Search for AWS CloudFormation in the search bar on the top of the page.
  3. On the AWS CloudFormation home page, select Create stack and choose With new resources (standard) from the drop-down menu.
  4. On the Create stack page, under Specify template, select Amazon S3 URL under Template source and paste http://dpr401.s3.amazonaws.com/dpr401.yaml under Amazon S3 URL.
  5. Select Next.
  6. On the Stack description page, under Stack name, enter dpr401.
  7. Select Next.
  8. On the Configure stack options page, use the default settings and select Next.
  9. On the Review dpr401 page, review your stack and, under Capabilities, check the box to acknowledge that AWS CloudFormation might create IAM resources with custom names.
  10. Choose Submit.
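
If you prefer the command line, here is a rough AWS CLI equivalent (a sketch, assuming your credentials are already configured and using the same template URL over https):

aws cloudformation create-stack \
  --stack-name dpr401 \
  --template-url https://dpr401.s3.amazonaws.com/dpr401.yaml \
  --capabilities CAPABILITY_NAMED_IAM \
  --region us-east-1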

Deployment validation

To validate that your AWS CloudFormation stack and Amazon SageMaker notebook instance were created successfully (a CLI check follows these steps):

  1. To validate your AWS CloudFormation stack, log in to your personal AWS account in the AWS Console.
  2. Search for AWS CloudFormation in the search bar on the top of the AWS Console page.
  3. On the Stacks page, under Stack name, verify you have a stack titled dpr401 with a Status of CREATE_COMPLETE.
  4. Select the stack dpr401.
  5. On the dpr401 page, confirm under the Resources tab that DPR401Notebook and DPR401Role are listed.
  6. Next, to validate your Amazon Sagemaker notebook instance, search for Amazon Sagemaker in the search bar at the top of the page.
  7. On the Amazon Sagemaker console homepage, on the left navigation menu, expand the Notebook section and select Notebook instances.
  8. On the Notebook instances page, confirm DPR401-Notebook is listed under Name.
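
You can also confirm both resources from the AWS CLI; a minimal sketch, assuming the stack was created in us-east-1:

aws cloudformation describe-stacks \
  --stack-name dpr401 \
  --query 'Stacks[0].StackStatus' \
  --region us-east-1
# Expected output: "CREATE_COMPLETE"

aws sagemaker list-notebook-instances \
  --name-contains DPR401 \
  --region us-east-1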

Running the guidance

To run the Guidance for training an AWS DeepRacer model using Amazon SageMaker:

  1. On the Amazon SageMaker console homepage, on the left navigation menu, expand the Notebook section and select Notebook instances.
  2. On the Notebook instances page, find DPR401-Notebook under Name and select the Open Jupyter link for the notebook. This will open a new browser tab and place you in the Jupyter notebook interface.
  3. Select guidance-for-training-an-aws-deepracer-model-using-amazon-sagemaker under the Files tab.
  4. Under the Files tab, open the dpr401 folder.
  5. Choose the deepracer_rl.ipynb file to open the notebook in a new tab.
  6. To initialize and build the environments, scroll down to the Workshop Checkpoint #1 cell and select it.
  7. Select Cell in the menu bar at the top of the notebook and choose Run All from the dropdown menu. This executes all the cells up to this point in the notebook, loading all the modules the notebook needs, setting up the environment, and downloading and building the Docker containers. Initializing and building the environments will take up to 20 minutes.

Train the RL model

The training process involves using AWS RoboMaker to emulate driving experiences in the environment, relaying the experiences at fixed intervals to Amazon SageMaker as input to train the deep neural network, and uploading the updated network weights to an S3 location.

Select the Workshop Checkpoint #1 cell, and click the ▶ Run button until you reach Workshop Checkpoint #2. This will execute the following steps:

  1. Configure presets: You can modify the presets to define the initial agent parameters. This cell is commented out by default, so the default parameters are used. You can use it to manually "pre-train" the neural network.
  2. Copy custom files to S3 bucket so that Amazon SageMaker and AWS RoboMaker can pick it up: This is where the chosen reward function and model_metadata.json file are copied to S3, where Amazon SageMaker and AWS RoboMaker can pick them up. These files are discussed in the later section "Customize Your Training."
  3. Train the RL model using the Python SDK Script mode: This section first defines the metric_definitions and the custom_hyperparameter, then starts the SageMaker training job (a sketch of these pieces follows this list). Visit the DeepRacer documentation for detailed information on the hyperparameters.
  4. Configure the AWS RoboMaker job: This creates the simulation application from the Docker container and sets up the Kinesis video streams.
  5. Launch the Simulation job(s) on AWS RoboMaker: This starts the simulation job.
  6. Visualizing the simulations in AWS RoboMaker: This provides a link for accessing the AWS RoboMaker environment, and another for watching the Kinesis Video Stream.
  7. Creating a temporary folder to plot metrics: This creates a local temp folder used for storing metrics from the training job.
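
As a rough illustration of step 3: SageMaker metric definitions are name/regex pairs applied to the training job's log stream, and they feed an RL estimator together with the hyperparameters. A minimal sketch, with illustrative regexes and parameter values that may differ from the notebook's actual cells:

from sagemaker.rl import RLEstimator

# Each metric value is the regex capture group, scraped from the job's logs.
metric_definitions = [
    {"Name": "episode_reward_mean", "Regex": "Total reward=(.*?),"},  # illustrative regex only
]

estimator = RLEstimator(
    entry_point="training_worker.py",   # hypothetical entry point name
    source_dir="src",
    image_uri=sagemaker_image_uri,      # the training image built earlier (assumed variable)
    role=sagemaker_role,
    instance_type="ml.c5.2xlarge",      # illustrative instance type
    instance_count=1,
    hyperparameters=custom_hyperparameter,
    metric_definitions=metric_definitions,
)
estimator.fit(wait=False)               # start the training job asynchronously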

Allow the job to run

  1. It will take a few minutes for the Amazon SageMaker job and the AWS RoboMaker jobs to boot up and start training. You can see this in the Amazon SageMaker training and AWS RoboMaker simulation job consoles. A status of Preparing indicates the environment is still starting.
  2. At this point, allow your Amazon SageMaker and AWS RoboMaker jobs to perform training. The default is 600 seconds (10 minutes), set by job_duration_in_seconds = 600 earlier in the notebook.
  3. Feel free to visit the AWS RoboMaker job and the Kinesis Video stream in the meantime to watch the training progress.

After training is complete

After training is complete, select the Workshop Checkpoint #2 cell and click the ▶ Run button until you have reached Workshop Checkpoint #3.

If you would like to import your trained model into the AWS DeepRacer console, visit Import Model and paste in the S3 path provided in the Upload Your Model into the DeepRacer console cell.

Evaluate your models

After training your model you can evaluate the current state of the training by using an evaluation simulation.

Select the Workshop Checkpoint #3 cell and click the ▶ Run button until you have reached Workshop Checkpoint #4. This will start an evaluation job.

The evaluation Simulation Job shares several parameters with the training job, including the world_name and race_type. Since you are evaluating a trained model, you can run an evaluation against the same world name and race type, or against a different world name and race type.

[Image: Spain track]

There are additional parameters that you can change to customize the evaluation; a sketch of setting them follows the table below.

Key value pairs

| Key | Value |
| --- | --- |
| yaml_config['NUMBER_OF_TRIALS'] | Sets the number of laps for the evaluation |
| yaml_config['DISPLAY_NAME'] | Displayed in the upper left corner to identify the current racer |
| yaml_config['LEADERBOARD_TYPE'] | Leave as "LEAGUE" |
| yaml_config['LEADERBOARD_NAME'] | Displayed on the bottom area of the media output |
| yaml_config['CAR_COLOR'] | Controls the color of the race car |
| yaml_config['NUMBER_OF_RESETS'] | The number of resets allowed per lap |
| yaml_config['PENALTY_SECONDS'] | Leave as "5" |
| yaml_config['OFF_TRACK_PENALTY'] | Number of seconds added to the race time when the race car leaves the track |
| yaml_config['COLLISION_PENALTY'] | Number of seconds added to the race time when the race car collides with an obstacle, such as a box in the OBJECT_AVOIDANCE race type |
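
For example, the evaluation cell's yaml_config dictionary might be updated as below (a sketch; the values are illustrative, not recommendations):

# Illustrative values only; set these in the evaluation cell of the notebook.
yaml_config['NUMBER_OF_TRIALS'] = 3            # laps to evaluate
yaml_config['DISPLAY_NAME'] = "MyRacer"        # shown in the upper left corner
yaml_config['LEADERBOARD_TYPE'] = "LEAGUE"     # leave as "LEAGUE"
yaml_config['LEADERBOARD_NAME'] = "DPR401"     # shown at the bottom of the media output
yaml_config['CAR_COLOR'] = "Blue"
yaml_config['NUMBER_OF_RESETS'] = 10
yaml_config['PENALTY_SECONDS'] = "5"           # leave as "5"
yaml_config['OFF_TRACK_PENALTY'] = 3.0         # seconds added per off-track event
yaml_config['COLLISION_PENALTY'] = 5.0         # seconds added per collision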

Below is an example of the evaluation job media output using the parameters set in the notebook. [Image: evaluation media output]

Plot metrics for evaluation job

For evaluation jobs, the metrics you plot are based on the time your race car takes to go around the track, including penalties. This differs from training jobs because evaluation jobs measure the trained model's performance, not the rewards returned during training.


Head to head evaluation

After Workshop Checkpoint #4, you will note that the notebook contains a head-to-head evaluation. This is out of scope for the DPR401 workshop, but if you have two trained models, you can configure the S3 path to the second model and perform a head-to-head evaluation.

View your output logs and videos

Open the Amazon S3 console and select your sagemaker-us-east-1- bucket. Select the deepracer-notebook-sagemaker-- prefix and explore:

  1. The training_metrics.json and evaluation_metrics.json files contain metrics on the behavior of the model, mainly suitable for showing progress during training or evaluation.
  2. The sim- folders contain the simulation logs for the AWS RoboMaker environment.
  3. The iteration-data folder contains videos of the training and evaluation jobs and the simulation trace logs. The logs are much more detailed than the JSON files mentioned above and are suitable for analysis. There are three video views: a top-down view, a 45-degree from-behind view, and the same 45-degree from-behind view with a track overlay.
  4. The model folder contains the unfrozen TensorFlow graph. This is what is trained further or imported into the DeepRacer console.
  5. The train-output folder (a few folders deep) contains the model.tar.gz file, appropriate for loading onto a physical AWS DeepRacer vehicle and for optimization with the Intel OpenVINO toolkit.

Next steps

Now that you have successfully trained and evaluated an AWS DeepRacer model using the notebook, you can explore how to customize training in the areas covered in this section.

This section shows which areas to modify for customizations; once you make the modifications, you will need to go back and re-run the appropriate sections of the notebook to apply them. It is intended as a general guidebook for pursuing your own path of customization, not a prescriptive set of steps. Feel free to "think big" and brainstorm about the possibilities.

Create a reward function

In general, you design your reward function to act like an incentive plan. You can customize your reward function using the relevant input parameters passed into it.

Reward function files

  1. Navigate to src/artifacts/rewards/ in the notebook to see example reward functions.


  2. Once you pick or modify one, locate the cell in the notebook labeled Copy custom files to S3 bucket so that Amazon SageMaker and AWS RoboMaker can pick it up, and modify this line to copy your new reward function:

     !aws s3 cp ./src/artifacts/rewards/default.py {s3_location}/customer_reward_function.py

Example reward functions

Follow the center line in time trials (follow_center_line.py)

This example determines how far away the agent is from the center line, and it gives a higher reward the closer the agent is to the center of the track, encouraging it to closely follow the center line.

Stay inside the two borders in time trials (stay_inside_two_border.py)

This example simply gives high rewards if the agent stays inside the borders and lets the agent figure out the best path to finish a lap. It is easy to program and understand, but likely takes longer to converge.

Prevent zig-zag in time trials (prevent_zig_zag.py)

This example incentivizes the agent to follow the center line but penalizes it with a lower reward if it steers too much, which helps prevent zig-zag behavior. The agent learns to drive smoothly in the simulator and likely keeps the same behavior when deployed in the physical vehicle.

Stay on one lane without crashing into stationary obstacles or moving vehicles (object_avoidance_head_to_head.py)

This reward function rewards the agent for staying between the track borders and penalizes it for getting too close to the next object in front. The agent can move from lane to lane to avoid crashes. The total reward is a weighted sum of the reward and penalty; the example gives more weight to the penalty term to focus on safety by avoiding crashes. You can play with different weights to train the agent for different driving behaviors and to achieve different driving performances.

Create your own reward function

If you wish to create your own reward function there is a pattern to the function that you must use:

def reward_function(params):
    
    reward = ...

    return float(reward)

A list of parameters and their definitions is located at https://docs.aws.amazon.com/deepracer/latest/developerguide/deepracer-reward-function-input.html
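
As a concrete illustration of this pattern, here is a center-line-following reward function in the style of the follow_center_line.py example described above (the marker distances are illustrative; the file in src/artifacts/rewards/ may differ):

def reward_function(params):
    # Reward the agent for staying close to the center line.
    track_width = params['track_width']
    distance_from_center = params['distance_from_center']

    # Markers at increasing distances from the center line
    marker_1 = 0.1 * track_width
    marker_2 = 0.25 * track_width
    marker_3 = 0.5 * track_width

    if distance_from_center <= marker_1:
        reward = 1.0
    elif distance_from_center <= marker_2:
        reward = 0.5
    elif distance_from_center <= marker_3:
        reward = 0.1
    else:
        reward = 1e-3  # likely off track

    return float(reward)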

Add additional python modules

Several Python modules are included in the AWS RoboMaker simapp, but if you want to add more, locate the cell labeled Run these commands if you wish to modify the Amazon SageMaker and AWS RoboMaker code and add additional !docker cp commands to copy the modules you want into the container.

Use another programming language

If you want to use another programming language for your reward function, you can use the Python boto3 library to invoke an AWS Lambda function. Such a method might look like the following:

import boto3
import json

lambdaservice = boto3.client('lambda')

def reward_function(params):
    response = lambdaservice.invoke(FunctionName='YourFunctionHere',
                                    Payload=json.dumps(params))
    # The returned Payload is a streaming body; read it before converting.
    return float(response["Payload"].read())

You will need to modify the IAM role for the notebook to grant it permission to invoke the Lambda function (lambda:InvokeFunction).

Alternatively, you could load the alternate program and any required interpreters and libraries into the Docker container, then call out from the Python reward function with an os.system() or subprocess.run() call. In that case, you need to consider how to pass the parameters and receive the return value, perhaps by writing temp files to disk or by setting environment variables. Note that the reward function runs 10 to 15 times a second during training, so the overhead introduced by calling another executable may be an issue. Due to this overhead, most reinforcement learning researchers stick to Python, which is also the language the rl_coach framework is written in.
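
A minimal sketch of that approach, assuming a hypothetical external executable at /opt/reward/reward_binary that accepts the parameters as a JSON argument and prints the reward to stdout:

import json
import subprocess

def reward_function(params):
    # Hypothetical external program; it must be copied into the container
    # (for example with the !docker cp approach above) along with any
    # interpreter or libraries it needs.
    result = subprocess.run(
        ["/opt/reward/reward_binary", json.dumps(params)],
        capture_output=True, text=True, check=True,
    )
    return float(result.stdout.strip())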

Modify the training algorithm

Training adjusts the weights and biases of the neural network so that the correct decisions are made. There are many methods, or algorithms, for how to determine which weights and biases should be adjusted and by how much.

The default algorithm for DeepRacer training is PPO, or Proximal Policy Optimization. This algorithm works with both discrete and continuous action spaces, and tends to be stable but data hungry.

The SAC, or Soft Actor Critic, algorithm is also available. This algorithm only works with a continuous action space, and it is less stable but requires less training data to learn.

Read more about PPO versus SAC at https://docs.aws.amazon.com/deepracer/latest/developerguide/deepracer-how-it-works-reinforcement-learning-algorithm.html

How to set an algorithm

PPO policy

The default algorithm is PPO; if no training algorithm is set in the model_metadata.json file, this is the algorithm used. The metric_definitions and custom_hyperparameter in the notebook's Train the RL model using the Python SDK Script mode cells are coded for PPO.

SAC policy

If you want to change the training algorithm to SAC, first modify the model_metadata.json file with "training_algorithm" : "sac" and a continuous action space, such as:

{  "action_space" : {
    "steering_angle" : {
      "high" : 30.0,
      "low" : -30.0
    },
    "speed" : {
      "high" : 1.0,
      "low" : 0.5
    }
  },
  "sensor" : [ "FRONT_FACING_CAMERA" ],
  "neural_network" : "DEEP_CONVOLUTIONAL_NETWORK_SHALLOW",
  "version" : "4",
  "training_algorithm" : "sac",
  "action_space_type" : "continuous",
  "preprocess_type" : null,
  "regional_parameters" : null
}

Find example model_metadata.json files in src/artifacts/actions, such as the front-shallow-continuous-sac.json file. After choosing an example, modifying it, or creating a new one, locate the cell labeled Copy custom files to S3 bucket so that Amazon SageMaker and AWS RoboMaker can pick it up and modify the following line to instead copy the file you intend:

!aws s3 cp ./src/artifacts/actions/default.json {s3_location}/model/model_metadata.json

Additionally, modify the hyperparameters in the notebook (look two cells below the label Train the RL model using the Python SDK Script mode) to include the SAC hyperparameters instead:

custom_hyperparameter = {
    "s3_bucket": s3_bucket,
    "s3_prefix": s3_prefix,
    "aws_region": aws_region,
    "model_metadata_s3_key": "%s/model/model_metadata.json" % s3_prefix,
    "reward_function_s3_source": "%s/customer_reward_function.py" % s3_prefix,
    "batch_size": "64",
    "lr": "0.0003",
    "exploration_type": "Additive_noise",
    "e_greedy_value": "0.05",
    "epsilon_steps": "10000",
    "discount_factor": "0.999",
    "sac_alpha": "0.2",
    "stack_size": "1",
    "loss_type": "Mean squared error",
    "num_episodes_between_training": "20",
    "term_cond_avg_score": "100000.0",
    "term_cond_max_episodes": "100000"
}

Cleanup

Deprovision resources so your account does not continue to be charged after you complete the workshop. In the notebook, scroll down and select the Workshop Checkpoint #5 cell. Click the ▶ Run button to execute the rest of the cells in the notebook. This will cancel the AWS RoboMaker and Amazon SageMaker jobs (if still running), delete the Amazon Kinesis video streams, delete the Amazon Elastic Container Registry (ECR) repositories, and delete the AWS RoboMaker simapp.

Clean up the Amazon S3 bucket

Consider uncommenting the Clean your S3 bucket cell and executing it if you want to empty the Amazon S3 bucket of generated logs and data, including the trained model. You may also choose to visit the S3 console at https://s3.console.aws.amazon.com/s3/buckets and delete the bucket.

If you choose not to do this, you may incur S3 storage costs.

Delete any imported DeepRacer models

If you imported a model into the DeepRacer console, delete it by visiting DeepRacer > Your Models, selecting the model, and choosing Delete under the Actions menu. If you choose not to do this, you may incur AWS DeepRacer model storage costs.

Terminate and Delete the Role and Notebook

Visit CloudFormation Stacks and select the radio button for the dpr401 stack. Select the Delete button. This will terminate and delete the Amazon SageMaker notebook and delete the IAM role.
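
Equivalently, from the AWS CLI (assuming the stack was created in us-east-1):

aws cloudformation delete-stack --stack-name dpr401 --region us-east-1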

Developer tools

Log Analyzer and Visualizations

License summary

This sample code is made available under a modified MIT license. See the LICENSE file.


aws-deepracer-workshops's Issues

NameError: name 'plt' is not defined

When running the Jupyter notebook 'DeepRacer Log Analysis.ipynb' I get the error below in a Windows environment (Python 3.8.1).

NameError                                 Traceback (most recent call last)
<ipython-input-2-fd8b8819ef67> in <module>
      1 # Plot the results
----> 2 fig, ax = plt.subplots(figsize=(20,10))
      3 plot_points(ax, waypoints[:-1,0:2])
      4 plot_points(ax, waypoints[:-1,2:4])
      5 plot_points(ax, waypoints[:-1,4:6])

NameError: name 'plt' is not defined

See full screen print at https://www.screencast.com/t/kvOtMEJf

The same works fine on MacOS (Python 3.7.3).
Has anybody experienced the same problem and found a work around?
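
A likely fix, given the NameError: the plotting cell assumes matplotlib was imported earlier in the notebook, so run (or add) an import cell before it:

import matplotlib.pyplot as plt  # must be executed before the plotting cell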

Access Key for DeepRacer Workshop

I went to the DeepRacer workshop and got a temporary AWS account to play around with DeepRacer while at re:Invent. I'm trying to download the log files to follow along with the notebook example, but I don't seem to have permissions to generate an Access Key to configure the aws cli. What should I do?

I went here: https://console.aws.amazon.com/iam/home?region=us-east-1#/users/aws-deepracer-workshop?section=security_credentials and "Create access key" is disabled.

KeyError: 'outputLocation' when trying to get all the info

In the following code from the DeepRacer log analysis notebook, the job_desc returned from robomaker.describe_simulation_job did not have the key "outputLocation" (KeyError: 'outputLocation'). Therefore, we could not locate the s3_bucket and s3_prefix for the simtrace log and retrieve the corresponding CSV file. Below is the code as well as a screenshot of the error.

Get all the info

job_desc = robomaker.describe_simulation_job(job=robomaker_job_arn)

is_training = job_desc['simulationApplications'][0]['launchConfig']['launchFile'] == "distributed_training.launch"
s3_bucket = job_desc['outputLocation']['s3Bucket']
s3_prefix = job_desc['outputLocation']['s3Prefix']
job_type = "training" if is_training else "evaluation"
simtrace_path = "iteration-data/{}/".format(job_type)

Download all the simtrace iteration data

!aws s3 sync s3://{s3_bucket}/{s3_prefix}/{simtrace_path} ./tmp --exclude "*" --include "*-{job_type}-simtrace.csv"


sagemaker role errors

I'm trying to run the DeepRacer log analysis tool from https://github.com/aws-samples/aws-deepracer-workshops/blob/master/log-analysis/DeepRacer%20Log%20Analysis.ipynb on my local laptop. However, I get the error below while trying to run step [5] "Create an IAM role".

try:
    sagemaker_role = sagemaker.get_execution_role()
except:
    sagemaker_role = get_execution_role('sagemaker')

print("Using Sagemaker IAM role arn: \n{}".format(sagemaker_role))
Couldn't call 'get_role' to get Role ARN from role name arn:aws:iam::26********:root to get Role path.
---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
<ipython-input-5-3bea8175b8c7> in <module>
      1 try:
----> 2     sagemaker_role = sagemaker.get_execution_role()
      3 except:

/opt/conda/lib/python3.7/site-packages/sagemaker/session.py in get_execution_role(sagemaker_session)
   3302     )
-> 3303     raise ValueError(message.format(arn))
   3304 

ValueError: The current AWS identity is not a role: arn:aws:iam::26********:root, therefore it cannot be used as a SageMaker execution role

During handling of the above exception, another exception occurred:

NameError                                 Traceback (most recent call last)
<ipython-input-5-3bea8175b8c7> in <module>
      2     sagemaker_role = sagemaker.get_execution_role()
      3 except:
----> 4     sagemaker_role = get_execution_role('sagemaker')
      5 
      6 print("Using Sagemaker IAM role arn: \n{}".format(sagemaker_role))

NameError: name 'get_execution_role' is not defined

Does anybody know what needs to be done to execute the above code without errors?
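
A probable cause, for what it's worth: sagemaker.get_execution_role() only works inside SageMaker, where the notebook runs under an execution role; on a local laptop the current identity is a user (here the account root), not a role. A common workaround is to supply a role ARN explicitly (placeholder ARN shown):

# Running locally: pass a SageMaker execution role ARN directly instead of
# asking the SDK to discover one (it cannot, since the local identity is a user).
sagemaker_role = "arn:aws:iam::<account-id>:role/<your-sagemaker-execution-role>"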

Kumo Torakku training track

Hi,

Please could you bundle in the .npy for the new Kumo Torakku training track, and also the leaderboard evaluation track so we can analyse our training?

Thanks!
Lyndon

Change track size

Analyze the reward distribution for your reward function

because Kumo Torakku has negative y values, I shamelessly took RichardFan's modification for plot_track and refactored it to offer an x_shift and y_shift. They may not apply to other tracks. You will need to change it in the future. Simply add parameters:

track_size=(700,1000), y_shift=300

track = la.plot_track(df, l_center_line, l_inner_border, l_outer_border)

plt.title("Reward distribution for all actions ")
im = plt.imshow(track, cmap='hot', interpolation='bilinear', origin="lower") 

I found a problem when I uncomment the track_size line. It pops up an error:

  File "<ipython-input-36-3595092372fb>", line 4
    track_size=(700,1000), y_shift=300
              ^
SyntaxError: can't assign to literal

Hopefully someone can help to fix it. Thanks!
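
A likely fix: the keyword arguments are sitting on their own line as a standalone statement, which Python cannot parse, instead of being inside the call. Passing them as arguments to plot_track should work (assuming this version of plot_track accepts track_size and y_shift):

# Pass the extra parameters inside the call rather than on their own line.
track = la.plot_track(df, l_center_line, l_inner_border, l_outer_border,
                      track_size=(700, 1000), y_shift=300)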

How to train DeepRacer when the account expires?

We were told at the DeepRacer workshop that access to the RoboMaker simulation of the track, as well as the SageMaker notebook, would be available to download so we could continue training DeepRacer before the official preview opens. Where can we get all of this?

The simulator might have a bug; my simulator always starts from the (0,0) coordinate

Hi there,
hope you are doing well. I have had a horrible week using AWS DeepRacer to train my model. My issue is that the simulator always starts at the (0,0) coordinate and not on the track I am supposed to train on. See the images below.

(two screenshots showing the car spawned at the origin, off the track)

After about 30 minutes, the car was still seeing only grass around it. Is there something wrong with the simulator?

If there's a bug, please stop this gimmick, because we have to pay money for using AWS!!!

Thanks,
Bill

Burner Account fails to run any simulations

I'm not sure where else to ask this now that the workshops are closed, but my burner account has yet to successfully run a training simulation (or at least it doesn't provide any useful data or video). Evaluations also fail after the "training run" completes, and they never register in the console or become reviewable in any way.

Is it possible anyone has an extra they could send me? This seems to be a problem that only my account is experiencing, unfortunately. I'm aware that scale is still very low and timeouts are common, but my issue seems to prevent any useful functionality.

Thanks for any help.

Need to add Championship Cup Warm-up track

How can I get hold of the .npy file for the Championship Cup Warm-up track?


I'm happy to open a pull request to add it to the tracks folder. I don't see it in there at the moment:

(screenshot of the tracks folder listing, without the new track)

I searched the DeepRacer forum and the internet, and tried checking my S3 logs for the training and evaluation that I did on this new track, but I couldn't find any mention of its .npy file. 🤷‍♂

Track waypoint data documentation

I've been looking at the Championship Cup 2019 Track.

In this repo, there's a .npy file of waypoints.

https://github.com/aws-samples/aws-deepracer-workshops/blob/master/log-analysis/tracks/ChampionshipCup2019_track.npy

The track there is approximately 1.066 wide. I'm assuming the units of that file are meters. (Since the overall width of the map is about 8, and the official width is 34 feet.)

However all the tracks on this documentation page are 24 inches wide, which is 0.6096m.

Even accounting for the ambiguous 3cm grey curb, that would still only get us to 0.7602m.

Where did these numpy files come from?

I haven't seen a physical track myself, but in all the videos they do not look like they're a whole meter wide.

I would love to see a README at /log-analysis/tracks which explains what a .npy file is, which column is which, that the units are in meters, and where the data came from.
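
For reference, a minimal sketch of reading one of these files, assuming the column layout used by the log-analysis notebook (center line, inner border, and outer border coordinate pairs):

import numpy as np

waypoints = np.load("log-analysis/tracks/ChampionshipCup2019_track.npy")
print(waypoints.shape)       # (N, 6), presumably in meters
center = waypoints[:, 0:2]   # center line x, y
inner = waypoints[:, 2:4]    # inner border x, y
outer = waypoints[:, 4:6]    # outer border x, y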

Broken link

"The latest workshop lab is run as part of AWS DeepRacer events conducted in 2022."

Workshop Lab Link in Readme does not work.

Error : UnknownServiceError

Boto3 : 1.15.143

With the code below, Boto3 raises an UnknownServiceError.

import os
import boto3

envroot = os.getcwd()
aws_data_path = set(os.environ.get('AWS_DATA_PATH', '').split(os.pathsep))
aws_data_path.add(os.path.join(envroot, 'models'))
os.environ.update({'AWS_DATA_PATH': os.pathsep.join(aws_data_path)})

region = "us-east-1"
dr_client = boto3.client('deepracer', region_name=region,
                         endpoint_url="https://deepracer-prod.{}.amazonaws.com".format(region))
models = dr_client.list_models(ModelType="REINFORCEMENT_LEARNING", MaxResults=100)["Models"]
for model in models:
    if model["ModelName"] == model_name:
        break

These are the error logs:

UnknownServiceError Traceback (most recent call last)
in
6 region = "us-east-1"
7 dr_client = boto3.client('deepracer', region_name=region,
----> 8 endpoint_url="https://deepracer-prod.{}.amazonaws.com".format(region))
9 models = dr_client.list_models(ModelType="REINFORCEMENT_LEARNING",MaxResults=100)["Models"]
10 for model in models:

~/anaconda3/envs/tensorflow_p36/lib/python3.6/site-packages/boto3/__init__.py in client(*args, **kwargs)
89 See :py:meth:boto3.session.Session.client.
90 """
---> 91 return _get_default_session().client(*args, **kwargs)
92
93

~/anaconda3/envs/tensorflow_p36/lib/python3.6/site-packages/boto3/session.py in client(self, service_name, region_name, api_version, use_ssl, verify, endpoint_url, aws_access_key_id, aws_secret_access_key, aws_session_token, config)
261 aws_access_key_id=aws_access_key_id,
262 aws_secret_access_key=aws_secret_access_key,
--> 263 aws_session_token=aws_session_token, config=config)
264
265 def resource(self, service_name, region_name=None, api_version=None,

~/anaconda3/envs/tensorflow_p36/lib/python3.6/site-packages/botocore/session.py in create_client(self, service_name, region_name, api_version, use_ssl, verify, endpoint_url, aws_access_key_id, aws_secret_access_key, aws_session_token, config)
836 is_secure=use_ssl, endpoint_url=endpoint_url, verify=verify,
837 credentials=credentials, scoped_config=self.get_scoped_config(),
--> 838 client_config=config, api_version=api_version)
839 monitor = self._get_internal_component('monitor')
840 if monitor is not None:

~/anaconda3/envs/tensorflow_p36/lib/python3.6/site-packages/botocore/client.py in create_client(self, service_name, region_name, is_secure, endpoint_url, verify, credentials, scoped_config, api_version, client_config)
78 'choose-service-name', service_name=service_name)
79 service_name = first_non_none_response(responses, default=service_name)
---> 80 service_model = self._load_service_model(service_name, api_version)
81 cls = self._create_client_class(service_name, service_model)
82 endpoint_bridge = ClientEndpointBridge(

~/anaconda3/envs/tensorflow_p36/lib/python3.6/site-packages/botocore/client.py in _load_service_model(self, service_name, api_version)
119 def _load_service_model(self, service_name, api_version=None):
120 json_model = self._loader.load_service_model(service_name, 'service-2',
--> 121 api_version=api_version)
122 service_model = ServiceModel(json_model, service_name=service_name)
123 return service_model

~/anaconda3/envs/tensorflow_p36/lib/python3.6/site-packages/botocore/loaders.py in _wrapper(self, *args, **kwargs)
130 if key in self._cache:
131 return self._cache[key]
--> 132 data = func(self, *args, **kwargs)
133 self._cache[key] = data
134 return data

~/anaconda3/envs/tensorflow_p36/lib/python3.6/site-packages/botocore/loaders.py in load_service_model(self, service_name, type_name, api_version)
376 raise UnknownServiceError(
377 service_name=service_name,
--> 378 known_service_names=', '.join(sorted(known_services)))
379 if api_version is None:
380 api_version = self.determine_latest_version(

UnknownServiceError: Unknown service: 'deepracer'. Valid service names are: accessanalyzer, acm, acm-pca, alexaforbusiness, amplify, apigateway, apigatewaymanagementapi, apigatewayv2, appconfig, appflow, application-autoscaling, application-insights, appmesh, appstream, appsync, athena, autoscaling, autoscaling-plans, backup, batch, braket, budgets, ce, chime, cloud9, clouddirectory, cloudformation, cloudfront, cloudhsm, cloudhsmv2, cloudsearch, cloudsearchdomain, cloudtrail, cloudwatch, codeartifact, codebuild, codecommit, codedeploy, codeguru-reviewer, codeguruprofiler, codepipeline, codestar, codestar-connections, codestar-notifications, cognito-identity, cognito-idp, cognito-sync, comprehend, comprehendmedical, compute-optimizer, config, connect, connectparticipant, cur, dataexchange, datapipeline, datasync, dax, detective, devicefarm, directconnect, discovery, dlm, dms, docdb, ds, dynamodb, dynamodbstreams, ebs, ec2, ec2-instance-connect, ecr, ecs, efs, eks, elastic-inference, elasticache, elasticbeanstalk, elastictranscoder, elb, elbv2, emr, es, events, firehose, fms, forecast, forecastquery, frauddetector, fsx, gamelift, glacier, globalaccelerator, glue, greengrass, groundstation, guardduty, health, honeycode, iam, identitystore, imagebuilder, importexport, inspector, iot, iot-data, iot-jobs-data, iot1click-devices, iot1click-projects, iotanalytics, iotevents, iotevents-data, iotsecuretunneling, iotsitewise, iotthingsgraph, ivs, kafka, kendra, kinesis, kinesis-video-archived-media, kinesis-video-media, kinesis-video-signaling, kinesisanalytics, kinesisanalyticsv2, kinesisvideo, kms, lakeformation, lambda, lex-models, lex-runtime, license-manager, lightsail, logs, machinelearning, macie, macie2, managedblockchain, marketplace-catalog, marketplace-entitlement, marketplacecommerceanalytics, mediaconnect, mediaconvert, medialive, mediapackage, mediapackage-vod, mediastore, mediastore-data, mediatailor, meteringmarketplace, mgh, migrationhub-config, mobile, mq, mturk, neptune, networkmanager, opsworks, opsworkscm, organizations, outposts, personalize, personalize-events, personalize-runtime, pi, pinpoint, pinpoint-email, pinpoint-sms-voice, polly, pricing, qldb, qldb-session, quicksight, ram, rds, rds-data, redshift, redshift-data, rekognition, resource-groups, resourcegroupstaggingapi, robomaker, route53, route53domains, route53resolver, s3, s3control, s3outposts, sagemaker, sagemaker-a2i-runtime, sagemaker-runtime, savingsplans, schemas, sdb, secretsmanager, securityhub, serverlessrepo, service-quotas, servicecatalog, servicediscovery, ses, sesv2, shield, signer, sms, sms-voice, snowball, sns, sqs, ssm, sso, sso-admin, sso-oidc, stepfunctions, storagegateway, sts, support, swf, synthetics, textract, timestream-query, timestream-write, transcribe, transfer, translate, waf, waf-regional, wafv2, workdocs, worklink, workmail, workmailmessageflow, workspaces, xray

Model load failed Error

Hi,

I am currently trying to upload my simulated RL model, which was trained on the AWS DeepRacer console, onto my DeepRacer. I noticed that there is no longer an upload button on the AWS DeepRacer console.

After looking through the AWS DeepRacer training course, I found that the tar.gz file of the model can be uploaded to the DeepRacer through a thumbdrive. From here, I uploaded the tar.gz file into a newly created models folder in my thumbdrive and connected it to my DeepRacer.

Once I log into the DeepRacer console through the IP Address, I am able to see that my model is an option that I can upload onto the DeepRacer. However, once I try to upload the model onto the DeepRacer, I run into the following issue:

"Model load failed! Please check the ROS logs"

From here, I go into logs and open the ROS Display Logs for the message. According to my ROS logs, it says "File Not Found!"

I do not understand why I am getting this error, as I followed the steps and got everything to work before. My RL model simulated completely fine and evaluated correctly, and I created the right hierarchy on my thumb drive by creating a folder called "models" and placing the tar.gz file in it.

Does anyone know why I am getting this error? How do I resolve this?

Thank you

Sample reward_function method documentation wrong

In the documentation of the reward_function methods given as samples, the parameters to the function are explained. Waypoints are explained as follows:

@waypoints (ordered list) :: list of waypoint in order; each waypoint is a set of coordinates
(x,y,yaw) that define a turning point

This is actually not true, because the simulator only provides x and y coordinates for the waypoints, not the yaw.

Log downloads NoRegionError.

I am trying to download the logs with cw_utils but I keep getting NoRegionError. Does anyone know how I can fix this?

stream_name = 'XXXX'
fname = 'logs/deepracer-%s.log' %stream_name
cw_utils.download_log(fname, stream_prefix=stream_name)
---------------------------------------------------------------------------
NoRegionError                             Traceback (most recent call last)
<ipython-input-48-da5f8f73ddad> in <module>
----> 1 cw_utils.download_log(fname, stream_prefix=stream_name)

~/projects/deepracer-models/aws-deepracer-workshops/log-analysis/cw_utils.py in download_log(fname, stream_name, stream_prefix, log_group, start_time, end_time)
     59             end_time=end_time
     60         )
---> 61         for event in logs:
     62             f.write(event['message'].rstrip())
     63             f.write("\n")

~/projects/deepracer-models/aws-deepracer-workshops/log-analysis/cw_utils.py in get_log_events(log_group, stream_name, stream_prefix, start_time, end_time)
     12 
     13 def get_log_events(log_group, stream_name=None, stream_prefix=None, start_time=None, end_time=None):
---> 14     client = boto3.client('logs')
     15     if stream_name is None and stream_prefix is None:
     16         print("both stream name and prefix can't be None")

/anaconda3/lib/python3.7/site-packages/boto3/__init__.py in client(*args, **kwargs)
     89     See :py:meth:`boto3.session.Session.client`.
     90     """
---> 91     return _get_default_session().client(*args, **kwargs)
     92 
     93 

/anaconda3/lib/python3.7/site-packages/boto3/session.py in client(self, service_name, region_name, api_version, use_ssl, verify, endpoint_url, aws_access_key_id, aws_secret_access_key, aws_session_token, config)
    261             aws_access_key_id=aws_access_key_id,
    262             aws_secret_access_key=aws_secret_access_key,
--> 263             aws_session_token=aws_session_token, config=config)
    264 
    265     def resource(self, service_name, region_name=None, api_version=None,

/anaconda3/lib/python3.7/site-packages/botocore/session.py in create_client(self, service_name, region_name, api_version, use_ssl, verify, endpoint_url, aws_access_key_id, aws_secret_access_key, aws_session_token, config)
    837             is_secure=use_ssl, endpoint_url=endpoint_url, verify=verify,
    838             credentials=credentials, scoped_config=self.get_scoped_config(),
--> 839             client_config=config, api_version=api_version)
    840         monitor = self._get_internal_component('monitor')
    841         if monitor is not None:

/anaconda3/lib/python3.7/site-packages/botocore/client.py in create_client(self, service_name, region_name, is_secure, endpoint_url, verify, credentials, scoped_config, api_version, client_config)
     84         client_args = self._get_client_args(
     85             service_model, region_name, is_secure, endpoint_url,
---> 86             verify, credentials, scoped_config, client_config, endpoint_bridge)
     87         service_client = cls(**client_args)
     88         self._register_retries(service_client)

/anaconda3/lib/python3.7/site-packages/botocore/client.py in _get_client_args(self, service_model, region_name, is_secure, endpoint_url, verify, credentials, scoped_config, client_config, endpoint_bridge)
    326         return args_creator.get_client_args(
    327             service_model, region_name, is_secure, endpoint_url,
--> 328             verify, credentials, scoped_config, client_config, endpoint_bridge)
    329 
    330     def _create_methods(self, service_model):

/anaconda3/lib/python3.7/site-packages/botocore/args.py in get_client_args(self, service_model, region_name, is_secure, endpoint_url, verify, credentials, scoped_config, client_config, endpoint_bridge)
     45         final_args = self.compute_client_args(
     46             service_model, client_config, endpoint_bridge, region_name,
---> 47             endpoint_url, is_secure, scoped_config)
     48 
     49         service_name = final_args['service_name']

/anaconda3/lib/python3.7/site-packages/botocore/args.py in compute_client_args(self, service_model, client_config, endpoint_bridge, region_name, endpoint_url, is_secure, scoped_config)
    115 
    116         endpoint_config = endpoint_bridge.resolve(
--> 117             service_name, region_name, endpoint_url, is_secure)
    118 
    119         # Override the user agent if specified in the client config.

/anaconda3/lib/python3.7/site-packages/botocore/client.py in resolve(self, service_name, region_name, endpoint_url, is_secure)
    400         region_name = self._check_default_region(service_name, region_name)
    401         resolved = self.endpoint_resolver.construct_endpoint(
--> 402             service_name, region_name)
    403         if resolved:
    404             return self._create_endpoint(

/anaconda3/lib/python3.7/site-packages/botocore/regions.py in construct_endpoint(self, service_name, region_name)
    120         for partition in self._endpoint_data['partitions']:
    121             result = self._endpoint_for_partition(
--> 122                 partition, service_name, region_name)
    123             if result:
    124                 return result

/anaconda3/lib/python3.7/site-packages/botocore/regions.py in _endpoint_for_partition(self, partition, service_name, region_name)
    133                 region_name = service_data['partitionEndpoint']
    134             else:
--> 135                 raise NoRegionError()
    136         # Attempt to resolve the exact region for this partition.
    137         if region_name in service_data['endpoints']:

NoRegionError: You must specify a region.
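
A likely fix: boto3 cannot find a default region on this machine. Set one before calling cw_utils.download_log(), either with aws configure or in the notebook itself:

import os
# boto3 reads AWS_DEFAULT_REGION when no region is configured elsewhere;
# use the region your DeepRacer logs live in.
os.environ["AWS_DEFAULT_REGION"] = "us-east-1"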
