ShapeStacks

This repository contains a Python interface to the ShapeStacks dataset. It also includes baseline models for intuitive physics tasks trained on ShapeStacks.

For more information about the project, please visit our project page at http://shapestacks.robots.ox.ac.uk

If you use the ShapeStacks dataset or the intuitive physics models of this repository, please cite our publication:

@InProceedings{Groth_2018_ECCV,
author = {Groth, Oliver and Fuchs, Fabian B. and Posner, Ingmar and Vedaldi, Andrea},
title = {ShapeStacks: Learning Vision-Based Physical Intuition for Generalised Object Stacking},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

Software Requirements

The code has been tested on Ubuntu 16.04 with Python 3.5.2. The major software requirements can be installed via:

$ sudo apt-get install python3-pip python3-dev virtualenv

To run the intuitive physics models efficiently on a GPU, you should also install the latest NVIDIA drivers together with CUDA and cuDNN versions that are compatible with TensorFlow.

Installation

Python

All Python dependencies of the ShapeStacks code should live in their own virtual environment. All runtime requirements can be easily installed via the following commands:

$ virtualenv -p python3 venv
$ source venv/bin/activate
(venv) $ pip3 install -r requirements.txt

Additional requirements for development purposes can be found in dev_requirements.txt and can be added on demand.

(venv) $ pip3 install -r dev_requirements.txt

MuJoCo

To run simulation-related code (e.g. creating new scenarios, rendering scenarios or running a stacking algorithm), you need to have the MuJoCo physics engine installed.

We provide a quick installation guide here, but in case any issues occur during installation, we refer to the original maintainers of MuJoCo and mujoco-py for troubleshooting.

Data Setup

ShapeStacks Data

The ShapeStacks dataset together with additional documentation can be downloaded here.

After downloading and unpacking the data, the dataset directory living under SHAPESTACKS_DATASET should look like this:

${SHAPESTACKS_DATASET}/
|__ meta/
    |__ blacklist_stable.txt
    |__ blacklist_unstable.txt
|__ mjcf/
    |__ meshes/
    |__ textures/
    |__ assets.xml
    |__ env_blocks-easy-h=2-vcom=0-vpsf=0-v=1.xml
    |__ ...
    |__ env_ccs-hard-h=6-vcom=5-vpsf=0-v=120.xml
    |__ world_blocks-easy-h=2-vcom=0-vpsf=0-v=1.xml
    |__ ...
    |__ world_ccs-hard-h=6-vcom=5-vpsf=0-v=120.xml
|__ recordings/
    |__ env_blocks-easy-h=2-vcom=0-vpsf=0-v=1/
    |__ ...
    |__ env_ccs-hard-h=6-vcom=5-vpsf=0-v=120/
|__ splits/
    |__ blocks_all/
        |__ ...
    |__ ccs_all/
        |__ eval.txt
        |__ test.txt
        |__ train.txt
        |__ eval_bgr_mean.npy
        |__ test_bgr_mean.npy
        |__ train_bgr_mean.npy
    |__ default/
        |__ ...
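The scenario names in the listing above follow a visible pattern: a prefix (env or world), a few dash-separated tags, and key=value fields. The sketch below splits a name into those parts. The function name parse_scenario_name is hypothetical and not part of this repository, and the code assumes that all field values are integers, as in the examples shown. It does not interpret what the individual fields mean.

```python
def parse_scenario_name(name):
    """Split a scenario name into prefix, free-form tags and key=value fields.

    Hypothetical helper, based only on the naming pattern visible in the
    directory listing (e.g. 'env_ccs-hard-h=6-vcom=5-vpsf=0-v=120').
    """
    # Drop a trailing '.xml' so MJCF file names parse like directory names.
    if name.endswith(".xml"):
        name = name[:-len(".xml")]

    # Everything before the first underscore is the prefix ('env' or 'world').
    prefix, _, rest = name.partition("_")

    tags, fields = [], {}
    for token in rest.split("-"):
        if "=" in token:
            key, _, value = token.partition("=")
            # Assumption: field values are integers, as in the examples.
            fields[key] = int(value)
        else:
            tags.append(token)
    return {"prefix": prefix, "tags": tags, "fields": fields}


if __name__ == "__main__":
    print(parse_scenario_name("env_ccs-hard-h=6-vcom=5-vpsf=0-v=120"))
```

Such a helper can be handy for grouping recordings by their fields without hard-coding string offsets.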

FAIR Real Block Tower Images

For convenient use with this codebase, we also provide a restructured version of the real image test set of block towers released by Lerer et al. which can be downloaded here.

After downloading and unpacking the data, the dataset directory living under FAIRBLOCKS_DATASET should look like this:

${FAIRBLOCKS_DATASET}/
|__ meta/
|__ recordings/
    |__ img_frame1_1.png
    |__ ...
    |__ img_frame1_516.png
|__ splits/
    |__ default/
        |__ test.txt
        |__ test_bgr_mean.npy
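Before pointing any script at the unpacked data, it can help to check that the top-level layout matches the listings in this section. The following is a minimal sketch: the helper missing_subdirs and the two tuples are hypothetical names, and the expected subdirectories are copied from the listings for SHAPESTACKS_DATASET and FAIRBLOCKS_DATASET.

```python
import os

# Top-level subdirectories as shown in the directory listings of this README.
SHAPESTACKS_SUBDIRS = ("meta", "mjcf", "recordings", "splits")
FAIRBLOCKS_SUBDIRS = ("meta", "recordings", "splits")


def missing_subdirs(dataset_root, expected):
    """Return the expected subdirectories that are absent under dataset_root."""
    return [
        name
        for name in expected
        if not os.path.isdir(os.path.join(dataset_root, name))
    ]
```

For example, missing_subdirs(os.environ["SHAPESTACKS_DATASET"], SHAPESTACKS_SUBDIRS) returns an empty list when all four subdirectories are present.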

Data Provider

ShapeStacks and Fairblocks Provider

We provide interfaces to ShapeStacks and FAIR's real block tower images via public input functions in shapestacks_provider.py and fairblocks_provider.py. These input functions can be used as the input_fn when setting up a tf.estimator.Estimator in TensorFlow.

Segmentation Utilities

We provide utility functions to load the custom segmentation maps of ShapeStacks in segmentation_utils.py.

Running Scripts

Before any script with a __main__ entry point can be run, the virtual environment needs to be activated and some environment variables need to be set. This can be conveniently done via:

$ . ./activate_venv.sh
Set environment varibale SHAPESTACKS_CODE_HOME=/path/to/this/repository
Activated virtual environment 'venv'.

The complementary script deactivate_venv.sh deactivates the environment again and unsets all environment variables.

$ . ./deactivate_venv.sh
Unset environment varibale SHAPESTACKS_CODE_HOME=
Deactivated virtual environment 'venv'.
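Since the activation script sets SHAPESTACKS_CODE_HOME, your own scripts can fail early with a readable message when the variable is missing. The sketch below shows one way to do that. The function name add_code_home_to_path is hypothetical and not part of this repository.

```python
import os
import sys


def add_code_home_to_path(environ=os.environ):
    """Prepend SHAPESTACKS_CODE_HOME to sys.path, or fail with a clear message.

    Hypothetical helper: the repository's own scripts may handle this
    differently.
    """
    code_home = environ.get("SHAPESTACKS_CODE_HOME")
    if not code_home:
        raise RuntimeError(
            "SHAPESTACKS_CODE_HOME is not set; run '. ./activate_venv.sh' first."
        )
    # Avoid adding the same entry twice when called repeatedly.
    if code_home not in sys.path:
        sys.path.insert(0, code_home)
    return code_home
```

Passing the environment as a parameter keeps the helper easy to test without modifying the real process environment.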

Example: Training a stability predictor

The script train_inception_v4_shapestacks.py can be used to train a visual stability predictor on the ShapeStacks dataset. The main parameters are:

  • --data_dir which needs to point to the dataset location SHAPESTACKS_DATASET (see the dataset section for details)
  • --model_dir which defines a MODEL_DIR where all the TensorFlow output and snapshots will be stored during training
  • --real_data_dir which can optionally point to the location FAIRBLOCKS_DATASET (see the dataset section for details) to evaluate the performance of trained model snapshots on the real block tower images

An example run of the training script looks like this:

(venv) $ cd intuitive_physics/stability_predictor
(venv) $ python train_inception_v4_shapestacks.py \
--data_dir ${SHAPESTACKS_DATASET} \
--real_data_dir ${FAIRBLOCKS_DATASET} \
--model_dir ${MODEL_DIR}
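If you launch training runs from Python, for instance to sweep over several model directories, the same command line can be assembled programmatically. This is only a sketch: build_train_command is a hypothetical helper, and it uses only the three flags described above.

```python
import shlex


def build_train_command(data_dir, model_dir, real_data_dir=None):
    """Assemble the training command as an argument list (hypothetical helper)."""
    cmd = [
        "python",
        "train_inception_v4_shapestacks.py",
        "--data_dir", data_dir,
        "--model_dir", model_dir,
    ]
    # --real_data_dir is optional, so only add it when a path is given.
    if real_data_dir is not None:
        cmd += ["--real_data_dir", real_data_dir]
    return cmd


if __name__ == "__main__":
    cmd = build_train_command("/data/shapestacks", "/tmp/model")
    # Print a shell-safe rendering of the command for inspection.
    print(" ".join(shlex.quote(arg) for arg in cmd))
```

The returned list can be passed to subprocess.run with cwd set to intuitive_physics/stability_predictor inside the activated virtual environment.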

After a successful run of the training script, a model directory should have been created and populated like this:

${MODEL_DIR}/
|__ eval_*/
    |__ events.out.tfevents.*
|__ snapshots/
    |__ eval=0.xxxxxx/
        |__ checkpoint
        |__ model.ckpt-xxxxxx.data-000000-of-000001
        |__ model.ckpt-xxxxxx.index
        |__ model.ckpt-xxxxxx.meta
    |__ topn_eval_models.dict
|__ checkpoint
|__ events.out.tfevents.*
|__ graph.pbtxt
|__ model.ckpt-xxxxxx.data-000000-of-000001
|__ model.ckpt-xxxxxx.index
|__ model.ckpt-xxxxxx.meta
|__ ...

You can track the training progress by pointing TensorBoard to the model's root directory:

(venv) $ tensorboard --logdir=stability_predictor:${MODEL_DIR}

The most recent model checkpoints during training are kept in the model's root directory. If the training script finds existing checkpoints in MODEL_DIR, it will automatically load the most recent one and resume training from there.

During training, the checkpoints which perform best on the validation set are also saved to the snapshots/ subdirectory. The number of best checkpoints to keep can be set via --n_best_eval.

We provide the best performing models from the ShapeStacks paper on our project page.

Example: Running a stability predictor

After a stability predictor has been trained, the latest checkpoint or a particular snapshot can be loaded back into a tf.estimator.Estimator.

To instantiate a stability predictor as a tf.estimator.Estimator from the latest checkpoint in the MODEL_DIR you can use the following Python code:

import sys
import os
import tensorflow as tf

sys.path.insert(0, os.environ['SHAPESTACKS_CODE_HOME'])
from tf_models.inception.inception_model import inception_v4_logregr_model_fn

# ...

stability_predictor = tf.estimator.Estimator(
    model_fn=inception_v4_logregr_model_fn,
    model_dir=MODEL_DIR,
    config=run_config,
    params={})

You can also set the model_dir parameter of tf.estimator.Estimator to MODEL_DIR/snapshots/<snapshot_name> to load the weights of a particular snapshot.

Afterwards, you can call the standard estimator APIs evaluate() or predict() on the loaded estimator to run it on new data. A working example can be found in the provided test script test_inception_v4_shapestacks.py.
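To choose a snapshot directory to pass as model_dir, you can enumerate the subdirectories of MODEL_DIR/snapshots. The sketch below assumes they are named eval=<number>, as in the directory listing of the training example. The helper list_snapshots is hypothetical, and the code deliberately does not decide whether a higher or lower value is better, since that depends on the metric encoded in the name.

```python
import os


def list_snapshots(model_dir):
    """Return (score, path) pairs for subdirectories named 'eval=<float>'.

    Hypothetical helper; the naming scheme is taken from the directory
    listing in this README.
    """
    snapshots_dir = os.path.join(model_dir, "snapshots")
    found = []
    for entry in sorted(os.listdir(snapshots_dir)):
        path = os.path.join(snapshots_dir, entry)
        # Skip plain files such as topn_eval_models.dict.
        if not (os.path.isdir(path) and entry.startswith("eval=")):
            continue
        try:
            score = float(entry[len("eval="):])
        except ValueError:
            # Ignore directories whose suffix is not a number.
            continue
        found.append((score, path))
    return found
```

The path of the chosen pair can then be used as the model_dir argument of tf.estimator.Estimator.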

Licensing

The model implementations under tf_models are taken from the official TensorFlow models repository and are licensed under the Apache License, Version 2.0.

shapestacks's Issues

Only first frame image is provided?

Hi, thanks for open-sourcing such a great dataset. I have downloaded the RGB images and unzipped them, but for each data point I only find frame-0 images xxx-mono-0.png from different camera views. Does that mean the full falling videos are not provided? Would it be possible to upload them for download? I understand the dataset would be very large and it's been 4 years since the dataset release, so it's completely fine if that's not possible.

Mass and gravity

Hi @ogroth and thanks for releasing your code. It works well! Is it possible to determine the mass of an object (cuboid, rectangle, ...) when I create a new scenario? And the gravity of the world?
Thanks

[Question] is it possible to generalise?

Hi!
I am trying to integrate physics intuition during robot training. I am currently using Mujoco and that's why I think this project would be relatively easy to integrate.

How difficult would it be to extend this project to different tasks or physics events?
