Giter Club home page Giter Club logo

106aproj's Introduction

If you are viewing this as a README on GitHub, click here to view the project website.

Introduction

Have you ever wanted your resident large, expensive humanoid robot to help you solve a children's puzzle? We aim to show how a Baxter robot can be programmed to solve a jigsaw puzzle with computer vision techniques and minimal human interaction. More concretely, we give the robot an image of the completed puzzle, along with the puzzle's dimensions and number of pieces, then place pieces on the table one by one and have Baxter place them with correct position and orientation.

Design + Implementation

Hardware

We worked with:

modules

Calibration

calibration

We break up our approach to puzzle solving into two main stages: calibration and a piece solver loop.

The purpose of the calibration stage is to connect the pixel coordinates of images from the camera with physical coordinates usable by baxter. To do this, we:

  1. Place a square AR tag at the upper left corner of the area in which Baxter will solve the puzzle
  2. Capture an image from the webcam, segment the AR tag, and calculate a deskewing transformation using the expected orientation and aspect ratio (1:1 for a square) of the AR tag when viewed top-down. This deskewing transformation allows us to transform raw images from the camera into a "deskewed pixel space" that simulates what the image would like like if the camera were directly above the table, pointing downwards. By calculating the center of the segmented AR tag in deskewed pixel space, we get the origin of our puzzle-solving coordinate frame in deskewed pixel space. segmentation
  3. Place baxter's camera directly above the AR tag pointing down. We then process the images from Baxter's hand camera with AR tag recognition software to find the coordinate transformation between Baxter's hand and the AR tag. This allows us to calculate the coordinate transformation between Baxter and the middle of the AR tag. We store this transformation, allowing Baxter to know where the origin of the puzzle-solving coordinate frame is even after the AR tag is removed.
  4. Remove the AR tag. Baxter is now ready to solve the puzzle!

Physical Piece Solving

Physical Piece solving (picking up and placing a new piece with roughly correct position and orientation) uses the following loop:

  1. Capture an image of the table before the new piece is placed
  2. Put the new piece on the table roughly at the origin, and capture another image. Using pixel-wise absolute difference between this image and the one from the previous step (and some additional processing), we segment the new puzzle piece. piece segmentation
  3. Using the deskewed pixel position and rotation of the piece from Abstract Piece Solving (see next section), compute the physical coordinates where the piece should be placed. This step also requires pixels-per-meter (ppm) in the deskewed image space, which is calculated with knowledge of the dimensions of the AR tag from the original calibration image.
  4. Using the moveit motion planning library, have Baxter pick up the piece from the origin and place it at the desired location with the desired orientation. actuation

Abstract Piece Solving

Abstract Piece Solving refers to the problem of: given a (mostly) segmented image of a puzzle piece as well as a reference image of the completed puzzle, determine the location (in pixel coordinates) and rotation of the piece within the reference image. The steps were as follows:

  1. Use SURF keypoint matching to calculate the transformation (known as a homography) that maps from pixels in the piece image to pixels in the reference image. see this for reference. If SURF fails, try SIFT instead.

  2. Calculate the angle between the top edge of the original piece image and that edge after being transformed by the found homography.

  3. Calculate the centroid of the image of the homography. We assume the puzzle is grid-like in that each piece in the solved puzzle has a clear row and column index. Therefore, we use the closest location on that implicit grid as the outputted piece location. matching

Challenges

As can be seen from our videos, one of the main challenges we faced was the losses of precision incurred in the conversion from pixel position to physical position. The main causes of these losses were:

  • problems with the baxter hand camera image quality made AR tag recognition somewhat suspect in the calibration phase
  • not placing pieces directly on the origin, causing pickup to fail
  • the suction gripper would sometime latch on to a piece at a non-center point of the piece; this caused problems because our algorithm attempts to have Baxter release the piece directly above the location determined to be that piece's correct center in physical coordinates.
  • The simplification we made by assuming the puzzle is roughly gridlike is not perfect (i.e. piece centers don't fall exactly on the intersection points of the grid), so even if all else goes right the piece may be place slightly off center from its correct final position.
  • For the particular Baxter we were using, the camera feed from the right hand was interpreted as coming from the left hand's camera, and vice versa. This meant that instead of just using built in transform calculation packages that worked with the AR tag package, we had to manually compose transforms to determine the transform between the base of the Baxter robot and the origin of the puzzle-solving frame.

Results

We were able to largely accomplish our goal of having Baxter be able to assist in solving a large jigsaw puzzle! Our short video shows Baxter placing 5 different puzzle pieces in roughly the correct positions and almost perfect orientations. Besides minor hiccups in piece pickup, the video shows how Baxter could theoretically assist with solving an entire puzzle.

Using the puzzle assistant in practice is not the most user-friendly, but it also performs a nontrivial task. After taking any large jigsaw puzzle and inputting a reference image for it (taken from a picture of the box, for example) and the puzzle dimensions, you can have Baxter do all the "heavy lifting" of solving the puzzle for you!

{% include youtube.html %} Link to video

Team

Evan Lohn

Evan is a 4th year EECS undergraduate interested in machine learning and basketball.

Contributions

His focus was on the pipeline from deskewed images to final physical puzzle positions, as well as on abstract puzzle solving.

Rebecca Abraham

Rebecca is a 4th year EECS undergraduate and musician with interests in robotics.

Contributions

She implemented abstract puzzle solving using SURF and SIFT and adapted the AR tags for use with our calibration routine, replacing our initial, less robust attempt.

Jason Huynh

Jason is a 4th year EECS undergraduate with a side interest in personal electric vehicles.

Contributions

He worked on the image deskewing and the ROS pipeline, writing the code for communicating solved puzzle piece positions to Baxter as well as for the pick and place routine.

This website showcases the work of Rebecca Abraham, Jason Huynh, and Evan Lohn on a primarily self-directed EECS C106A class project.

106aproj's People

Contributors

evanlohn avatar rebeccaabraham avatar beccyabraham avatar

Watchers

Jason Huynh avatar  avatar

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    ๐Ÿ–– Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. ๐Ÿ“Š๐Ÿ“ˆ๐ŸŽ‰

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google โค๏ธ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.