RL Pick-and-Throw

Project Description

This project models a gripper for pick-and-throw tasks in PyBullet, with the goal of throwing scraps into designated buckets.

In this repository, we use reinforcement learning (RL) to learn efficient policies for controlling a Cartesian gripper. The grasping is hardcoded; the rest of the motion is controlled by a policy learned with RL. The gripper has 4 degrees of freedom (DoF), and the agent chooses only a target position for the gripper, a release position, and the velocity of the gripper during the movement.
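As an illustration of the action described above, here is a minimal sketch of how one agent decision could be represented. The class name ThrowAction, its field names, the 3D layout of both positions, the scalar velocity, and the resulting 7-value vector are all assumptions made for this example; they are not taken from the repository code.

```python
from dataclasses import dataclass
from typing import List, Sequence, Tuple

Vec3 = Tuple[float, float, float]


@dataclass(frozen=True)
class ThrowAction:
    """Hypothetical container for one agent decision (illustrative names only)."""

    target: Vec3     # position the gripper should move to
    release: Vec3    # position at which the object is released
    velocity: float  # gripper velocity during the movement

    @classmethod
    def from_vector(cls, values: Sequence[float]) -> "ThrowAction":
        # Assumed layout: [target xyz, release xyz, velocity] -> 7 values.
        if len(values) != 7:
            raise ValueError(f"expected 7 values, got {len(values)}")
        v = [float(c) for c in values]
        return cls(target=(v[0], v[1], v[2]),
                   release=(v[3], v[4], v[5]),
                   velocity=v[6])

    def to_vector(self) -> List[float]:
        # Flatten back to the same assumed 7-value layout.
        return [*self.target, *self.release, self.velocity]


action = ThrowAction.from_vector([0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 1.5])
```

A flat vector of this kind is the usual shape of a continuous action for agents such as SAC, TD3 or PPO, which is why the sketch converts in both directions.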

Installation

This project requires Python 3.8 and the packages listed in requirements.txt. Install them with the command below:

pip install -r requirements.txt

Usage

To run the simulation, move to the code directory and use the following command:

python3 main.py [params]

where params are the parameters of the simulation. The available parameters are:

Parameter	Short name	Description	Values
--agent	-a	The agent to use	pap, timeOpt, sac, td3, ppo
--episodes	-e	The number of episodes to run	int
--gui	-g	Whether to use the GUI or not	True, False
--verbose	-v	Whether to print the logs or not	True, False
--reward	-r	The reward function to use	success, success_and_time, success_time_and_distance
--model	-m	The directory with the model to use	str
--seed	-s	The seed to use	int
--save_data	-d	The directory where you want to save your experiments	str
--domain_randomization	-dr	Whether to use domain randomization or not	True, False
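The table above can be mirrored with Python's argparse module. The following is only a sketch of what such a parser might look like, not the actual contents of main.py: the helper str2bool, the function build_parser, and all default values are assumptions introduced for illustration. The boolean helper accepts both True/False and 1/0.

```python
import argparse


def str2bool(value: str) -> bool:
    """Hypothetical helper: parse 'True'/'False' (or '1'/'0') into a bool."""
    v = value.strip().lower()
    if v in {"true", "1"}:
        return True
    if v in {"false", "0"}:
        return False
    raise argparse.ArgumentTypeError(f"expected True or False, got {value!r}")


def build_parser() -> argparse.ArgumentParser:
    # Option names follow the table; every default below is an assumption.
    p = argparse.ArgumentParser(prog="main.py")
    p.add_argument("-a", "--agent", default="pap",
                   choices=["pap", "timeOpt", "sac", "td3", "ppo"])
    p.add_argument("-e", "--episodes", type=int, default=1)
    p.add_argument("-g", "--gui", type=str2bool, default=False)
    p.add_argument("-v", "--verbose", type=str2bool, default=False)
    p.add_argument("-r", "--reward", default="success",
                   choices=["success", "success_and_time",
                            "success_time_and_distance"])
    p.add_argument("-m", "--model", type=str, default=None)
    p.add_argument("-s", "--seed", type=int, default=None)
    p.add_argument("-d", "--save_data", type=str, default=None)
    p.add_argument("-dr", "--domain_randomization", type=str2bool,
                   default=False)
    return p


args = build_parser().parse_args(
    ["-a", "td3", "-e", "10", "-g", "1", "-v", "True", "-dr", "False"]
)
print(args.agent, args.episodes, args.gui, args.domain_randomization)
```

Using a converter function instead of type=bool matters here, because bool("False") is True in Python and would silently enable the option.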

Training

To train an agent, move to the code directory and use the following command:

python3 train_agent.py [params]

where params are the parameters of the training. The available parameters are:

Parameter	Short name	Description	Values
--agent	-a	The agent to use	ppo, sac, td3
--episodes	-e	The number of episodes to run	int
--save	-s	The path to save the model	str
--model	-m	The directory with the pretrained model to use	str
--reward	-r	The reward function to use	neural_net, weighted, success, lin_reg
--hyperparams	-hp	The directory to the yaml file with the hyperparameters	str

To train a new estimator of the PaP for the reward function, move to the code directory and use the following command:

python3 train_reward.py [params]

where params are the parameters of the training. The available parameters are:

Parameter	Short name	Description	Values
--episodes	-e	The number of episodes to run	int
--save_path	-s	The path to save the model	str
--pretrained_path	-m	The directory with the pretrained model to use	str

Workspace configuration

The workspace is defined in the file workspace.yaml; edit the parameter values in this file to change it. The file defines a rectangular x-y-z workspace centred on the origin of the robot, with offset_z giving the height of the conveyor belt. It also contains the gripper delay, which is the time the gripper takes to close or open, and the gripper latency, which is the time between an action being sent to the gripper and the gripper starting to open or close.
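To make these quantities concrete, here is a small sketch that checks whether a point lies inside the x-y footprint of the workspace and computes how long a gripper command takes to complete. The key names size_x, size_y, gripper_delay and gripper_latency, the numeric values, and the Workspace class are hypothetical; the real workspace.yaml may use different names and units. The dictionary stands in for the parsed YAML file, and the z axis is left out because how offset_z relates to the vertical extent is not specified here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Workspace:
    """Hypothetical view of workspace.yaml (key names are assumptions)."""

    size_x: float           # full extent of the workspace along x
    size_y: float           # full extent of the workspace along y
    gripper_delay: float    # time the gripper takes to close or open
    gripper_latency: float  # time before the gripper starts to move

    def contains_xy(self, x: float, y: float) -> bool:
        # The workspace is centred on the robot origin, so each axis
        # spans [-size / 2, +size / 2].
        return abs(x) <= self.size_x / 2 and abs(y) <= self.size_y / 2

    def actuation_time(self) -> float:
        # Time from sending a command until the gripper has fully
        # opened or closed: latency first, then the motion itself.
        return self.gripper_latency + self.gripper_delay


# Stand-in for the parsed YAML content; the values are made up.
config = {"size_x": 1.0, "size_y": 0.5,
          "gripper_delay": 0.25, "gripper_latency": 0.125}
ws = Workspace(**config)
print(ws.contains_xy(0.4, 0.2), ws.actuation_time())
```

The sum of latency and delay is the lead time a throwing policy has to account for: the release command must be sent that long before the object should actually leave the gripper.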

Demo

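You can test the repository with the following commands. The first runs the TD3 model stored in models/td3_sb3_1 for 10 episodes with the GUI and logging enabled; the second trains a SAC agent for 10000 episodes and saves it to models/SAC_demo: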

python3 main.py -a td3 -m models/td3_sb3_1 -r success_and_time -v 1 -e 10 -g 1 -hp hyperparams/td3_sb3.yaml

python3 train_agent.py -a sac -e 10000 -r success_and_time -s models/SAC_demo -hp hyperparams/sac_sb3.yaml
