
conq_python's People

Contributors

arthurlovekin, dcolli23, petermitrano, saketspradhan, skwirskj

Forkers

saketspradhan

conq_python's Issues

Object Locations in Map Class

  • Create a class to represent an object in the map, such as a hammer or other tool (see the sketch below)
  • Find out how waypoints are designated in Spot (e.g. pose, lat/long), specifically for the navigateTo function
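
A minimal sketch of what such a class could look like, assuming we key each object to the GraphNav waypoint it was observed from and store its pose in that waypoint's frame; the name MapObject and its field names are illustrative, not existing code:

```python
from dataclasses import dataclass

from bosdyn.client.math_helpers import SE3Pose  # Spot SDK pose type


@dataclass
class MapObject:
    """Illustrative container for an object (e.g. a hammer) localized in the GraphNav map."""
    label: str                      # e.g. "hammer"
    waypoint_id: str                # GraphNav waypoint the object is associated with
    waypoint_tform_object: SE3Pose  # object pose expressed in that waypoint's frame
```

For the Python SDK, GraphNavClient.navigate_to is keyed by waypoint ID strings, which is why the sketch stores one per object.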

Investigate BDAII ROS2 driver

Description

BDAII has been actively developing a ROS2 driver for Spot for a while now. When we first investigated it in early 2023, the driver wasn't in a state we could use. Now that the lab has begun the switch to ROS2 and the driver is far more developed, we should strongly consider using it.

Tasks

  • Investigate the ROS2 driver
    • Is there any core functionality missing?
      • Ensure that arm control is fully implemented.
  • Plan out how the driver would be used.
    • Would we run the driver on the laptop? On the core?
    • Would we continue development with the Python routines?
      • It's likely that the data collection routines should stay in Python, given complaints about ROS2 bandwidth.
      • If we do continue Python routine development, would we need interface shims for the actual sending/receiving of messages?
        • This could be a lot of work and confusing to people onboarding in the future.
      • Instead, it might be more reasonable to draw clear boundaries between what works with ROS and what works with the Python API directly.

Determine what, if any, development standards and tooling should be used for the project

This can be a rabbit hole, but it's worth thinking about what development standards and tools should be enforced when developing with other people. I talked with Peter about this and we're of the opinion that formatting is probably the only standard that should be considered.

  • Investigate Python formatting enforcement options
    • Look into pre-commit tool for auto-formatting on commits (uses pre-commit hooks)
    • Other GitHub-based options?

Get familiar with manipulation stack

Description

We've developed some basic functionality for grasping/manipulating an object given a detection from the perception pipeline. The folk(s) working on the manipulation stack should become familiar with how this is done.

Tasks

  • Look through the existing code we have for:
    • Grasping
    • Manipulation
  • Sit down/chat with Dylan and/or Peter about current capabilities, limitations, future directions.
  • Generate any necessary documentation for this.

Develop modification of map during run time

Modify the map in real-time with object and waypoint locations to enable smarter navigation:

  1. Investigate using missions vs. modifying the graph at run time
  2. Implement and test

Scope out replay capability for Spot

High-Level Description

As of right now, we don't have a simulator for the Spot robot, and even if we did, generating photorealistic images can be difficult. Because of this, it would be great if we could prototype and test against recorded logs, especially for perception and navigation development, e.g. detecting objects and localizing those objects in the map.

We should scope out how difficult this would be and determine if the amount of work is worth it.

Notes From Peter

Logging over WiFi is slow. It's about 1 FPS if you log all information from all cameras. The data logger class in the repo takes the robot state object and images as protobuf messages and pickles them in a background thread.
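
For reference, a minimal sketch of that pattern (pickling protobuf messages from a background thread); the class and method names here are made up, not the actual data logger in the repo:

```python
import pickle
import queue
import threading


class BackgroundPickler:
    """Illustrative sketch: queue robot state/image protos and pickle them off the control thread."""

    def __init__(self, path: str):
        self._queue = queue.Queue()
        self._file = open(path, 'wb')
        self._thread = threading.Thread(target=self._worker, daemon=True)
        self._thread.start()

    def log(self, robot_state, images):
        # Protobuf messages are picklable, so we can enqueue them directly.
        self._queue.put({'state': robot_state, 'images': images})

    def _worker(self):
        while True:
            packet = self._queue.get()
            if packet is None:  # sentinel used by close()
                break
            pickle.dump(packet, self._file)

    def close(self):
        self._queue.put(None)
        self._thread.join()
        self._file.close()
```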

If we want higher frequency logging there are some workarounds:

  • Plug in an ethernet cable and hold the laptop while walking behind the robot.
    • Peter says to not underestimate this. It worked for him just fine when recording data for the OpenXEmbodiment dataset.
  • Investigate getting a NUC (the Spot core is a pain to work with) and plugging it in via ethernet.
    • We've run into issues with the Spot core in the past: it runs Ubuntu 18.04 (annoying if we're doing anything beyond data logging), it doesn't have a GPU (not a concern for data logging), and we haven't gotten the WiFi dongles to work with it yet, so installing things (e.g. via pip) is laborious.

Components To Scope

  • Data Collection
    • Do we need to capture information from all cameras? A subset?
    • Capture all joint configurations? Or just poses in odom/body/world frame?
      • Peter and I agree that capturing joint configurations isn't terribly important.
    • Storage data types?
      • If using ROS, this is handled for us by its message types and bag-file storage (Peter's not a fan)
      • Peter's data logging class just pickles the protobuf messages.
  • Replay tools
    • There are two approaches to consider regarding replay time:
      • Use the ROS replay methodology that plays messages in real time, where the routine you're testing can drop messages if it isn't processing fast enough.
        • This is the most "realistic" option, but it may not be required for prototyping if we manually rate-limit the messages, e.g. by capturing at a low frequency.
      • Non-real-time replay: just process the messages with no regard for how long computation takes. Basically, an ideal scenario.
    • Regardless of whether data collection uses ROS, it could be beneficial to write a data conversion routine from protobuf -> ROS bag so that replay is dead simple.
      • Added benefit of forcing development of perception ROS interface for use with eventual BDAII ROS2 Spot driver.

Additional Tasks

  • Ask members of the ROAHM lab (Adam Li?) if they have any replay capability already coded up for Spot.

Get familiar with current navigation stack

Description

We/Boston Dynamics have already written functionality to record a map offline and navigate with that pre-recorded map online. The folk(s) working on navigation should become familiar with this stack and understand its current capabilities and limits.

Tasks

  • Look into code we have for:
    • Offline map recording
    • Clickmap navigation
  • Sit down/chat with Dylan about the process for using the navigation stack.
  • Generate documentation for this process.

Prototype map visualization with log replay using rerun

Description

It will be helpful to be able to visualize the map during log replay so that we can visualize detections, localization state, localized objects, etc. Get this to a point where visualization of an offline-recorded GraphNav map is fairly streamlined.
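
A minimal sketch of what the rerun side could look like, assuming a recent rerun SDK with rr.Points3D and rr.set_time_seconds; the app id, entity paths, and waypoint_positions array are made up for illustration:

```python
import numpy as np
import rerun as rr

# Placeholder for GraphNav waypoint positions in the map/seed frame, shape (N, 3).
waypoint_positions = np.zeros((10, 3))

rr.init("conq_map_replay", spawn=True)           # open the rerun viewer
rr.log("map/waypoints", rr.Points3D(waypoint_positions))

# During log replay, stamp each frame so the viewer timeline matches the recorded data.
rr.set_time_seconds("log_time", 12.3)
rr.log("detections/hammer", rr.Points3D([[1.0, 0.5, 0.0]]))
```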

Navigate robot in loop of waypoints

With a map uploaded, have the robot navigate along all waypoints in a loop. This will be used later for patrolling for and adding world objects to the map.
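
A rough sketch of the loop, assuming the map is already uploaded and the robot is localized; lease/estop handling and waypoint ordering are omitted, and graph_nav_client is assumed to be an already-constructed GraphNavClient:

```python
import time

from bosdyn.api.graph_nav import graph_nav_pb2


def reached_goal(graph_nav_client, cmd_id):
    feedback = graph_nav_client.navigation_feedback(cmd_id)
    return feedback.status == graph_nav_pb2.NavigationFeedbackResponse.STATUS_REACHED_GOAL


def patrol_waypoints(graph_nav_client):
    """Navigate to every waypoint in the uploaded map, looping forever."""
    graph = graph_nav_client.download_graph()
    waypoint_ids = [wp.id for wp in graph.waypoints]

    while True:
        for waypoint_id in waypoint_ids:
            cmd_id = None
            while True:
                # Re-issue the short-lived command until the goal is reached,
                # mirroring the pattern in the SDK's graph_nav examples.
                cmd_id = graph_nav_client.navigate_to(waypoint_id, 1.0, command_id=cmd_id)
                time.sleep(0.5)
                if reached_goal(graph_nav_client, cmd_id):
                    break
```

Note that download_graph() doesn't return waypoints in patrol order, so a real implementation would probably sort them (e.g. by annotation name) or follow recorded edges instead.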

Implement A State Machine/Behavior Tree For Tool Retrieval Demo

Currently, the control logic for the tool retrieval demo is hard-coded with conditionals. This should be replaced with a somewhat general approach using a state machine or behavior tree. However, we don't want to reinvent the wheel here; we should use a state machine or behavior tree library that makes this much simpler for us. This GitHub page that lists some libraries might be a good first place to look.
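
As one data point, here is a tiny sketch using the transitions library; the choice of library, the states, and the trigger names are just an example, not a decision about the demo's actual structure:

```python
from transitions import Machine  # pip install transitions


class ToolRetrievalDemo:
    """Model object; the Machine below attaches state and trigger methods to it."""
    pass


states = ['searching', 'approaching', 'grasping', 'returning', 'done']
state_transitions = [
    {'trigger': 'tool_detected', 'source': 'searching', 'dest': 'approaching'},
    {'trigger': 'in_reach', 'source': 'approaching', 'dest': 'grasping'},
    {'trigger': 'grasp_succeeded', 'source': 'grasping', 'dest': 'returning'},
    {'trigger': 'grasp_failed', 'source': 'grasping', 'dest': 'searching'},
    {'trigger': 'at_drop_off', 'source': 'returning', 'dest': 'done'},
]

demo = ToolRetrievalDemo()
machine = Machine(model=demo, states=states, transitions=state_transitions, initial='searching')

demo.tool_detected()  # searching -> approaching
print(demo.state)     # 'approaching'
```

A behavior-tree library (e.g. py_trees) would be the other direction to evaluate; either way, the point is that adding a new behavior becomes adding states/transitions rather than rewriting conditionals.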

Error when optimizing anchors after recording a map

After running the map recording and closing loops, I get this error when selecting the option to optimize anchors:

E0214 11:12:03.750103680   32274 hpack_parser.cc:999]                  Error parsing 'content-type' metadata: invalid value
Unclassified exception: <_MultiThreadedRendezvous of RPC that terminated with:
	status = StatusCode.UNKNOWN
	details = "Stream removed"
	debug_error_string = "UNKNOWN:Error received from peer  {created_time:"2024-02-14T11:12:03.750339124-05:00", grpc_status:2, grpc_message:"Stream removed"}"

Find communication method that works for everyone

Description

We use Slack in the lab and will almost certainly be continuing that. However, I need to figure out if we're going to be using the old Agrobots Slack channel or creating a new one.

Tasks

  • Figure out if we'll keep the old Slack channel or if we'll create a new one.
  • Add everyone to the channel.

Consider adding map to log data

Description

Now that we've added localization state to the data recorder (#18), we'll probably want to be able to visualize/use the map in replay. However, the map is currently stored in a separate location from the log files, which makes it difficult to share a full log with someone.

We should consider doing the following when recording a log while Conq is using a GraphNav map (see the sketch after this list):

  • duplicate the map and store it in the log directory
  • indicate in metadata.json that the log was recorded using the map
  • record the relative path of the map in metadata.json
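
A rough sketch of what that could look like at recording time; the directory layout and metadata key names below are assumptions, not the current format:

```python
import json
import shutil
from pathlib import Path


def attach_map_to_log(log_dir: Path, map_dir: Path) -> None:
    """Copy the GraphNav map into the log directory and note it in metadata.json."""
    map_copy = log_dir / "graph_nav_map"
    shutil.copytree(map_dir, map_copy)

    metadata_path = log_dir / "metadata.json"
    metadata = json.loads(metadata_path.read_text()) if metadata_path.exists() else {}
    metadata["used_graph_nav_map"] = True           # assumed key name
    metadata["map_relative_path"] = map_copy.name   # assumed key name
    metadata_path.write_text(json.dumps(metadata, indent=2))
```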

Consider using per-image capture time in `ConqDataRecorder` Logs

Right now, ConqDataRecorder logs a single timestamp per recorded packet: the time the recorder started recording all of that packet's images. For robust replay, it would be best to have a per-image timestamp, though it's not yet clear whether this is actually a problem. Consider changing the single timestamp to per-image timestamps.
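
If we do make the change, the packet layout could move from one shared timestamp to one per image; a sketch with made-up field names (capture_image is a stand-in for the real capture call):

```python
import time


def capture_image(source: str):
    """Stand-in for an actual image capture; returns a placeholder payload."""
    return {'source': source}


# Current-style packet: one timestamp shared by every image in the packet.
packet = {
    'packet_time': time.time(),
    'images': [capture_image('front_left'), capture_image('hand')],
}

# Per-image timestamps: record the capture time alongside each image instead.
packet = {
    'images': [
        {'capture_time': time.time(), 'image': capture_image('front_left')},
        {'capture_time': time.time(), 'image': capture_image('hand')},
    ],
}
```

The SDK's ImageResponse protos already carry shot.acquisition_time, which could serve as the per-image timestamp instead of the host clock.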

Allow loading of multiple episodes in playback

Description

When creating the log playback functionality, I didn't realize that multiple episodes can correspond to the same run (one continuous running of Conq). To visualize the whole experiment, I need to allow loading/playback of multiple episode pickle files.

  • Implement down-sampling per pickle file so we only keep data at the frequency we want (no wasted memory).
  • Implement iteration over all pickle files in the ConqLog class (see the sketch below).
  • Change the playback demo's log argument to reflect this change.
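
A sketch of the iteration/down-sampling idea; the file naming scheme, the assumption that each episode pickle holds a list of packets, and the stride parameter are all illustrative:

```python
import pickle
from pathlib import Path


def iter_episode_packets(log_dir: Path, stride: int = 1):
    """Yield packets from every episode pickle in a log directory, keeping every stride-th one."""
    for episode_path in sorted(log_dir.glob("episode_*.pkl")):  # assumed naming scheme
        with episode_path.open("rb") as f:
            packets = pickle.load(f)  # assumed: one list of packets per episode file
        yield from packets[::stride]
```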

Run the tool retrieval demo

Description

Running the tool retrieval demo from last semester is a good way to get familiar with the project. We should run the demo and dive into how each component works, discuss pros/cons of approaches, and brainstorm next steps for the manipulation, perception, and navigation components.

Tasks

  • Dylan should run this once by himself to make sure we don't have any bugs
    • We agreed that it'd be best for the team to just hack through this since we don't want to try rerunning in Duderstadt.
  • Run the demo with everyone
  • Discuss demo results
    • What works well and shouldn't be tinkered with?
    • What works okay and may be worth rewriting to make more robust?
    • Are there any missing pieces that should be developed?
      • #3 is already highlighted as a need. This should also be implemented in a general way such that developing new behaviors/tasks is straightforward.
    • Are there any components that function okay but are a headache to work with?

Get familiar with current perception stack

Description

We've already whipped up some perception capability for the tool retrieval demo. It uses Roboflow for data annotation and possibly also for training the model (I forget). The folk(s) working on perception this semester should familiarize themselves with the code and processes we have for this so far.

Tasks

  • Look through the existing code we have for:
    • Data logging
    • Model inference
  • Sit down/chat with Dylan about the perception pipeline from data collection through model inference
  • Generate documentation for this process

Research GPS candidates for Conq

  • Spot SDK 4.0.0 now supports using GPS for GraphNav.
  • Find a GPS receiver that supports the NMEA-0183 messaging protocol
  • When comparing options, base the decision on accuracy, cost, size, and ruggedness

Visual Servoing: Finishing up

  • Separate threads for visual pose stream and controller
  • Transformation between camera frame and hand frame
  • Transformation between hand frame and gravity-aligned body frame (see the sketch below)
  • Settle on object pose convention
  • Switch hand pose frame or object pose frame
  • Settle on the get_object_pose() return type: (pos, rot) / (pos, quat) / (pos, euler)
  • Modify arm move commands' args based on return pose type
  • Fix PD controller
  • Estimate latency
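
For the frame-transformation items, the Spot SDK's frame helpers can already return the hand pose in the gravity-aligned body frame; a minimal sketch, assuming an authenticated RobotStateClient named robot_state_client:

```python
from bosdyn.client.frame_helpers import (GRAV_ALIGNED_BODY_FRAME_NAME, HAND_FRAME_NAME,
                                          get_a_tform_b)


def hand_pose_in_body(robot_state_client):
    """Return the hand pose expressed in the gravity-aligned body frame as an SE3Pose."""
    snapshot = robot_state_client.get_robot_state().kinematic_state.transforms_snapshot
    return get_a_tform_b(snapshot, GRAV_ALIGNED_BODY_FRAME_NAME, HAND_FRAME_NAME)
```

The camera-to-hand transform can be looked up the same way from the transforms snapshot attached to each image response.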
