

Learning From Play

Bringing self-supervised learning to multi-task robotic manipulation.
Explore the docs »

View Demo · Report Bug · Request Feature

Table of Contents

  1. About The Project
  2. Data collection
  3. Training
  4. Deploying
  5. License
  6. Acknowledgements
  7. Contact

About The Project

Data collection

  • There are two options for collecting data: 1. Full teleoperation, via 'data_collection/vr_data_collection.py'. 2. Scripted data collection, which executes simple scripted behaviours (e.g. top-down block grasping and door/drawer manipulation). Scripted collection is a good way to test whether the model can learn basic, non-diverse motions, but a strong model needs teleoperated data. As a result, only the teleoperated pathway is up to date; we last used the scripted pathway roughly two months ago and have not tested it since.
  • To teleoperate, you'll need to set up pyBullet VR (https://docs.google.com/document/d/1I4m0Letbkw4je5uIBxuCfhBcllnwKojJAyYSTjHbrH8/edit?usp=sharing), run the 'App_PhysicsServer_SharedMemory_VR' executable you build in that process, then run 'data_collection/vr_data_collection.py'. The arm will track your controller; the main trigger closes the gripper, and the secondary trigger saves the trajectory you have collected. We save 'full state' rather than images during data collection, because this allows us to deterministically reset the environment to any saved state and then collect images from whatever angle is desired!
  • The npz files created during this process are converted to tf records using 'notebooks/Creating_tf_records'.
  • This isn't the easiest of processes, so here is a link to the validation dataset: https://drive.google.com/drive/folders/1AoN9grOONiO4tT12mXKvW1arB5suk7Bo?usp=sharing. Contact us for the training dataset; we'll get back to you within a day at the latest, and more than anything would love to chat about ideas with anyone interested in this area! (If anyone is interested in using this as a data generator for offline RL, let us know and we'll put in the work to create a module which collects and labels data according to a sparse reward function.)
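The npz-to-TFRecord conversion follows the standard tf.train.Example pattern. The sketch below is illustrative only, assuming feature names like 'obs' and 'acts' — the actual keys used by 'notebooks/Creating_tf_records' may differ:

```python
import numpy as np
import tensorflow as tf

def trajectory_to_example(obs, acts):
    """Serialise one trajectory (arrays of states and actions) as a tf.train.Example."""
    def float_list(arr):
        return tf.train.Feature(float_list=tf.train.FloatList(value=arr.flatten()))
    features = tf.train.Features(feature={
        # NOTE: these feature names are assumptions for illustration.
        "obs": float_list(obs),
        "acts": float_list(acts),
        # Store the shape so the flat array can be restored after parsing.
        "obs_shape": tf.train.Feature(int64_list=tf.train.Int64List(value=obs.shape)),
    })
    return tf.train.Example(features=features)

def convert(npz_path, tfrecord_path):
    """Write a trajectory stored in an npz file into a TFRecord file."""
    data = np.load(npz_path)
    with tf.io.TFRecordWriter(tfrecord_path) as writer:
        example = trajectory_to_example(data["obs"], data["acts"])
        writer.write(example.SerializeToString())
```

Storing serialised Examples like this is what lets `tf.data.TFRecordDataset` stream trajectories efficiently from a GCS bucket during training.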

Training

  • To train a model on Colab, use notebooks/train_lfp.ipynb, which walks you through hardware-specific setup (GCS/GDRIVE/local and TPU/GPU), creating the dataloader and models, using the trainer class, and logging to WandB or Comet.
  • To train a model on GCP, follow 'useful_commands.md'. The first commands should be entered into GCP's console; once your TPU instance is created, use the GCP 'compute' pane to SSH in and follow the remaining steps (which clone the repo, install the dependencies and launch a basic deterministic model). Before training a model on GCP, you'll need to set up a GCS bucket to store data and save model weights — the name of this bucket is defined at the top of 'useful_commands.md'. You'll see that we've created two buckets, one in Iowa (as Colab's TPUs are all hosted there) and one in Groningen (as our allocation of TFRC TPUs is hosted there).
  • Particular args you may want to modify are: -tfr (looks for tfrecords in the data folder instead of npz; necessary for GCS), -i (images), and -n (if not None, makes the model probabilistic).
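As a rough illustration, flags of this shape are typically wired up with argparse. This is a sketch only — the flag semantics beyond what's described above, the types, and the defaults are assumptions, not the project's actual definitions:

```python
import argparse

# Minimal sketch of how the training flags above might be declared.
# Only -tfr, -i and -n come from the README; everything else is assumed.
parser = argparse.ArgumentParser(description="Train an LFP model (illustrative only)")
parser.add_argument("-tfr", action="store_true",
                    help="read tfrecords from the data folder instead of npz (needed for GCS)")
parser.add_argument("-i", action="store_true",
                    help="train from images rather than full state")
parser.add_argument("-n", type=float, default=None,
                    help="if not None, makes the model probabilistic")

# Example invocation: tfrecords on, probabilistic model.
args = parser.parse_args(["-tfr", "-n", "0.1"])
```

Check the argument definitions at the top of the actual training script for the authoritative list.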

Deploying

  • Pretrained model: https://drive.google.com/drive/folders/11nwcfXqc0n7Ava2sSCKHcCjJPn52RV7t?usp=sharing
  • Once you've trained or downloaded a model, place it in the 'saved_models' folder.
  • Run the notebooks/deploy notebook with the same args you trained with. The args for the pretrained model are contained in a .txt file in the folder.
  • This notebook walks you through some pre-checks (it plots trajectory reconstructions to make sure the model's outputs make sense, and plots the latent space), then opens the environment and offers two ways of testing it: 1. Taking examples from the validation set, initialising the environment to the first state and setting the final state as the goal. 2. Resetting randomly and using a tester class to generate goals from a predefined set (e.g. door left, block shelf). These goals adjust the environment to ensure the test is valid (e.g. the door-left test makes sure the door starts on the right side of the cupboard).
  • The deploy notebook also runs some of the tests featured in our blog post: it uses the goal-set testing method to load different models and evaluate them against a set of goals, generates adversarial blocks to test robustness, and allows save/replay of trajectories while displaying the latent space to visualise plan sampling.
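The validation-set testing loop in option 1 amounts to: reset the environment to the trajectory's first state, set its last state as the goal, and roll the policy out. Here is a toy, self-contained sketch of that loop — the environment and policy below are stand-ins for illustration, not the project's actual classes:

```python
import numpy as np

class ToyEnv:
    """Stand-in for the pyBullet environment: the state is just a point in R^3."""
    def reset_to_state(self, state):
        # Deterministic reset to a saved full state, as described above.
        self.state = np.array(state, dtype=np.float64)
    def step(self, action):
        self.state = self.state + action
        return self.state

def greedy_policy(state, goal, step_size=0.25):
    """Dummy goal-conditioned policy: move a fixed fraction of the way to the goal."""
    return step_size * (goal - state)

def rollout(env, trajectory, steps=20):
    """Init the env to the trajectory's first state, set its last state as goal, roll out."""
    start, goal = trajectory[0], trajectory[-1]
    env.reset_to_state(start)
    for _ in range(steps):
        state = env.step(greedy_policy(env.state, goal))
    return np.linalg.norm(state - goal)  # remaining distance to goal

env = ToyEnv()
traj = np.array([[0.0, 0.0, 0.0], [0.5, 0.5, 0.0], [1.0, 1.0, 1.0]])
final_error = rollout(env, traj)
```

In the real notebook the policy is the trained LFP model and success is judged per goal type, but the reset-to-start / final-state-as-goal structure is the same.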

License

Distributed under the MIT License. See LICENSE for more information.

Acknowledgements

Research supported with Cloud TPUs from Google's TPU Research Cloud (TRC).

Contact

Project Link: https://github.com/sholtodouglas/learning_from_play
