Sep 3 – 4, 2025
Hörsaalgebäude, Campus Poppelsdorf, Universität Bonn
Europe/Berlin timezone

Completing the Puzzle of Unseen Perception in Grasping and Navigation

Not scheduled
1h 30m
Open Space (first floor)


Poster Embodied AI Poster Session

Speaker

Anas Gouda (TU Dortmund)

Description

We address two critical capabilities required for autonomous robots operating in indoor environments, both centered on robust perception of unseen objects. Generalizing to unseen objects can support various applications; here we focus on mobile robotics.

The first focus is on robotic grasping, where 6D pose estimation is needed for successful manipulation. While 6D tracking is now reliable, the main bottleneck lies in 2D segmentation of unseen objects: grasp success remains limited by how accurately novel instances can be segmented. Our goal is to bring 2D segmentation to practical levels for reliable, generalizable grasping.

The second focus is on high-speed navigation, where mobile robots must avoid dynamic, previously unseen obstacles in real time. To enable this, we develop a model for 3D bounding box detection of moving objects using event cameras. These sensors offer low-latency, high-temporal-resolution input essential for detecting fast motion and enabling rapid trajectory adjustments.

Together, these efforts target key missing pieces in current perception pipelines for handling unseen objects. We collected two datasets, MR6D and MTevent, which we use to benchmark models and improve performance in both segmentation and 3D detection tasks. By advancing segmentation for grasping and enabling fast 3D detection for navigation, this work contributes toward more adaptable and capable robotic systems.


Figure 1: Example from the MR6D dataset showing an image from the O3dyn robot camera with the projected 6D annotation of a pallet. No pipeline can robustly predict 6D poses for several of the challenging cases included in the dataset, such as pallets.
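
As a rough illustration of what "projecting a 6D annotation" into an image involves, the sketch below transforms the corners of an object-aligned 3D box by a pose (rotation R, translation t) and applies a pinhole camera model. It is not the MR6D annotation tooling: the function names, box size, pose, and camera intrinsics (fx, fy, cx, cy) are all hypothetical values chosen for the example, and lens distortion is ignored.

```python
from typing import List, Sequence, Tuple

Point3 = Tuple[float, float, float]
Pixel = Tuple[float, float]


def box_corners(size: Sequence[float]) -> List[Point3]:
    """Return the 8 corners of a box centred at the object origin."""
    hx, hy, hz = (s / 2.0 for s in size)
    return [(x, y, z) for x in (-hx, hx) for y in (-hy, hy) for z in (-hz, hz)]


def project_pose(
    points: Sequence[Point3],
    R: Sequence[Sequence[float]],  # 3x3 rotation, object frame -> camera frame
    t: Sequence[float],            # translation in the camera frame (metres)
    fx: float, fy: float, cx: float, cy: float,  # assumed pinhole intrinsics
) -> List[Pixel]:
    """Project object-frame points into pixel coordinates."""
    pixels = []
    for p in points:
        # Rigid transform: p_cam = R * p + t
        xc, yc, zc = (
            sum(R[i][j] * p[j] for j in range(3)) + t[i] for i in range(3)
        )
        if zc <= 0:
            raise ValueError("point lies behind the camera")
        # Pinhole projection (no lens distortion modelled)
        pixels.append((fx * xc / zc + cx, fy * yc / zc + cy))
    return pixels


if __name__ == "__main__":
    identity = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
    # Hypothetical pallet-sized box (metres), 2 m in front of the camera.
    corners = box_corners((1.2, 0.8, 0.144))
    for u, v in project_pose(corners, identity, (0.0, 0.0, 2.0),
                             600.0, 600.0, 320.0, 240.0):
        print(f"{u:.1f}, {v:.1f}")
```

Connecting the projected corners with line segments gives the kind of wireframe overlay commonly used to visualise a pose annotation.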

Figure 2: Example from the MTevent dataset showing 3D bounding box annotations for a forklift, captured using our stereo-event + RGB camera system. Most existing work on moving object detection with event cameras focuses on simple scenes and 2D detection. No current method can handle realistic scenarios such as the one shown in this image.
