
gqn-datasets's Introduction

Datasets used to train Generative Query Networks (GQNs) in the ‘Neural Scene Representation and Rendering’ paper.

The following versions of the dataset are available:

  • rooms_ring_camera. Scenes of a variable number of random objects captured in a square room of size 7x7 units. Wall textures, floor textures, and the shapes of the objects are randomly chosen from a fixed pool of discrete options. There are 5 possible wall textures (red, green, cerise, orange, yellow), 3 possible floor textures (yellow, white, blue) and 7 possible object shapes (box, sphere, cylinder, capsule, cone, icosahedron and triangle). Each scene contains 1, 2 or 3 objects. In this simplified version of the dataset, the camera only moves on a fixed ring and always faces the center of the room. This is the ‘easiest’ version of the dataset; use this version for fast training.

  • rooms_free_camera_no_object_rotations. As in rooms_ring_camera, except the camera moves freely. However, the objects themselves do not rotate around their axes, which makes the modeling task somewhat easier. This version is of ‘medium’ difficulty.

  • rooms_free_camera_with_object_rotations. As in rooms_free_camera_no_object_rotations, the camera moves freely; however, objects can rotate around their vertical axes across scenes. This is the ‘hardest’ version of the dataset.

  • jaco. A reproduction of the Jaco robotic arm is placed in the middle of the room along with one spherical target object. The arm has nine joints. As above, the appearance of the room is modified for each episode by randomly choosing a different texture for the walls and floor from a fixed pool of options. In addition, both the colour and position of the target are modified randomly. Finally, the joint angles of the arm are also initialised at random within a range of physically sensible positions.

  • shepard_metzler_5_parts. Each object is composed of randomly coloured cubes that are positioned by a self-avoiding random walk in a 3D grid. As above, the camera is parametrised by its position, yaw and pitch; however, it is constrained to only move around the object at a fixed distance from its centre. This is the ‘easy’ version of the dataset, where each object is composed of only 5 parts.

  • shepard_metzler_7_parts. This is the ‘hard’ version of the above dataset, where each object is composed of 7 parts.

  • mazes. Random mazes that were created using an OpenGL-based DeepMind Lab game engine (Beattie et al., 2016). Each maze is constructed out of an underlying 7 by 7 grid, with walls falling on the boundaries of the grid locations. However, the agent can be positioned at any continuous position in the maze. The mazes contain 1 or 2 rooms, with multiple connecting corridors. The walls and floor textures of each maze are determined by random uniform sampling from a predefined set of textures.
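Several of the versions above parametrise the camera by its position plus yaw and pitch. As a rough sketch of the viewpoint encoding described in the paper (the function name encode_pose is illustrative, not the repository's actual API; the real data reader defines the canonical encoding), the two angles are expanded into their sines and cosines, giving a 7-dimensional vector:

```python
import math

def encode_pose(x, y, z, yaw, pitch):
    # Position is kept as-is; each angle is replaced by (cos, sin) so the
    # representation is continuous across the angular wrap-around.
    return [x, y, z,
            math.cos(yaw), math.sin(yaw),
            math.cos(pitch), math.sin(pitch)]
```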

Usage example

To select which dataset to load, instantiate a reader, passing the correct version argument. Note that the constructor will set up all the queues used by the reader. To get tensors, call read on the data reader, passing in the desired batch size.

  import tensorflow as tf

  from data_reader import DataReader  # data_reader.py from this repository

  root_path = 'path/to/datasets/root/folder'
  data_reader = DataReader(dataset='jaco', context_size=5, root=root_path)
  data = data_reader.read(batch_size=12)

  # The reader uses TF1-style input queues, so run it inside a monitored
  # session, which starts the queue runners automatically.
  with tf.train.SingularMonitoredSession() as sess:
    d = sess.run(data)

Download

Raw data files referred to in this document are available to download here. To download the datasets you can use the gsutil cp command; see also the gsutil installation instructions.
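As a hedged sketch, downloading a single dataset version might look like the following. The bucket path gs://gqn-dataset is an assumption based on the published download location; verify it against the link above before running:

```shell
# Assumed bucket path; check the download page if the bucket has moved.
DATASET=rooms_ring_camera              # any version name from the list above
SRC="gs://gqn-dataset/${DATASET}"
DEST="${HOME}/gqn-datasets"

mkdir -p "${DEST}"
echo "copying ${SRC} -> ${DEST}"
# -m parallelises the transfer; -r recurses over the train/ and test/ shards.
command -v gsutil >/dev/null && gsutil -m cp -r "${SRC}" "${DEST}" || true
```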

Notes

This is not an official Google product.

gqn-datasets's People

Contributors

buesma, derpson, fabioviola, fvioladm


gqn-datasets's Issues

Incorrect file paths

It seems the _get_dataset_files method generates filenames from 0 to num_files - 1, e.g.

root/shepard_metzler_5_parts/train/000-of-900.tfrecord
to
root/shepard_metzler_5_parts/train/899-of-900.tfrecord

However, the files on Google Cloud are numbered from 1 to num_files:

root/shepard_metzler_5_parts/train/001-of-900.tfrecord
to
root/shepard_metzler_5_parts/train/900-of-900.tfrecord

This causes training to crash when the program tries to access the missing 0th file.
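A minimal sketch of the fix described above, assuming the shard-naming scheme shown (the helper name shard_filenames is hypothetical, not the repository's actual method):

```python
def shard_filenames(dataset, split, num_files):
    # Shards on the cloud bucket are numbered 1..num_files, not
    # 0..num_files-1, e.g. 'shepard_metzler_5_parts/train/001-of-900.tfrecord'.
    return [
        '{}/{}/{:03d}-of-{:03d}.tfrecord'.format(dataset, split, i, num_files)
        for i in range(1, num_files + 1)   # start at 1, end at num_files
    ]
```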

Data in numpy format

Hey guys,

I converted the entire dataset to numpy and I was wondering if you'd like to integrate that into the official bucket. My free $300 of credit will be used up at some point :D The data can be found here. Let me know what you think.

Jens

Environment

Have you considered releasing the environment you used?

Top-down views of maze and normalization parameters

Hi I had 2 questions pertaining to the dataset:

  1. The paper mentions 'top-down views' of the maze configurations. Are these views also included in the maze dataset, and if so, at which file indices would I be able to find them?

  2. I am trying to normalize the datasets and finding their means and variances is taking about a full day per dataset. If the authors already have this information, would it be possible to know the mean and standard deviation of each training set, for each RGB channel?

Thanks for your help!
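On question 2, one way to avoid holding a whole dataset in memory is to accumulate running per-channel sums in a single pass. This is a generic sketch (not from the dataset authors), assuming images arrive as numpy arrays of shape (H, W, 3) with values in [0, 1]:

```python
import numpy as np

class ChannelStats:
    """Accumulate per-channel mean and std over a stream of RGB images."""

    def __init__(self):
        self.n = 0               # total pixel count seen so far
        self.s = np.zeros(3)     # per-channel sum of pixel values
        self.s2 = np.zeros(3)    # per-channel sum of squared values

    def update(self, image):
        h, w, _ = image.shape
        flat = image.reshape(-1, 3)
        self.n += h * w
        self.s += flat.sum(axis=0)
        self.s2 += (flat ** 2).sum(axis=0)

    def mean(self):
        return self.s / self.n

    def std(self):
        # var = E[x^2] - E[x]^2, computed from the accumulated sums
        return np.sqrt(self.s2 / self.n - self.mean() ** 2)
```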

GQN official implementation

Hello, it has been a while, so I would like to ask: is there any plan to release the public code as well as the trained model?

Whole Dataset

Dear Fabio Viola,

I checked the GQN dataset from https://github.com/deepmind/gqn-datasets. shepard_metzler_7_parts contains 900 tfrecords for training, each with 20 scenes, so there are only 18,000 scenes. For mazes, there are 1080 records with 100 scenes each, which means 108,000 in total.

So I think this link contains only part of the whole data, right? If so, could you send a link to the whole dataset? Many thanks in advance!

Best,
Bing

About GQN download

Is it free to download the GQN datasets?
And do I need to apply for another new Google Cloud bucket?

Reading the dataset without tensorflow

Hi,

I have searched for quite a long time now, and I'm looking for a fast and efficient way of reading your dataset without TensorFlow. I could use a minimum of TensorFlow code, but from what I've seen, we are forced to run the DataReader.read method inside a TensorFlow session.

I've looked into solutions like this code https://github.com/pgmmpk/tfrecord, but it handles things a different way and the data is decoded incorrectly.

Do you have recommendations on how to use the dataset without or with minimal tensorflow code?

Thanks in advance.
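In the meantime, a dependency-free sketch of the TFRecord framing may help: each record is stored as a little-endian uint64 payload length, a 4-byte CRC of the length, the payload, and a 4-byte CRC of the payload. The sketch below skips CRC verification, and the yielded bytes are still a serialized tf.train.Example that needs a protobuf parser to decode; the function name iter_tfrecords is ours, not part of any library:

```python
import struct

def iter_tfrecords(path):
    # Yield the raw payload of each record in a .tfrecord file using only
    # the stdlib: [uint64 length][4-byte CRC][payload][4-byte CRC].
    with open(path, 'rb') as f:
        while True:
            header = f.read(8)
            if not header:
                return                 # clean end of file
            length, = struct.unpack('<Q', header)
            f.read(4)                  # skip the length CRC
            yield f.read(length)       # raw serialized tf.train.Example
            f.read(4)                  # skip the payload CRC
```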

Camera intrinsic parameter

Hello, I am working on the GQN dataset and I would like to know the camera intrinsic parameters for the rooms_ring_camera dataset.
The paper says that "Images are rendered using MuJoCo's default OpenGL renderer", so I guess the camera parameters could be shared for future research.
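Until the renderer settings are published, intrinsics can only be approximated. A generic pinhole-camera sketch, where fov_deg is a placeholder and not the dataset's true field of view:

```python
import math

def intrinsics(width, height, fov_deg):
    # Focal length in pixels from an assumed vertical field of view;
    # the principal point is taken to be the image centre.
    f = height / (2.0 * math.tan(math.radians(fov_deg) / 2.0))
    cx, cy = width / 2.0, height / 2.0
    return [[f,   0.0, cx],
            [0.0, f,   cy],
            [0.0, 0.0, 1.0]]
```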

Resolution

Hi,

I am a little curious about the resolution of the rendered images.
Have you tried higher resolutions?
Or is training at higher resolution slower, which is why 64x64 is used in the paper?
