RoboSplat

Novel Demonstration Generation with Gaussian Splatting Enables Robust One-Shot Manipulation

RSS 2025

RoboSplat is a framework that leverages 3D Gaussian Splatting (3DGS) to generate novel demonstrations for RGB-based policy learning.

Starting from a single expert demonstration and multi-view images, our method generates diverse and visually realistic data for policy learning, enabling robust performance across six types of generalization (object poses, object types, camera views, scene appearance, lighting conditions, and embodiments) in the real world.

Compared with previous 2D data augmentation methods, our approach achieves significantly better results across the generalization types listed above, and it does so within a single unified framework.

RoboSplat Demonstration

Method

We start from a single manually collected demonstration and a set of multi-view images that capture the whole scene. The demonstration provides task-related keyframes, while the multi-view images support scene reconstruction. After aligning the reconstructed frame with the real-world frame and segmenting the scene into its components, we edit the scene automatically to target each of the six types of generalization.
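As an illustration of what editing a segmented scene component can look like, the sketch below moves one object's Gaussians to a new pose, which is the kind of edit that object-pose generalization calls for. It is a minimal sketch under assumptions, not the RoboSplat implementation: the function names (`rotation_z`, `transform_gaussians`), the array layout, and the boolean segmentation mask are all hypothetical, and each Gaussian is represented only by its mean and covariance. It relies on the standard fact that a rigid transform x -> R x + t maps a Gaussian's mean to R mu + t and its covariance to R Sigma R^T.

```python
import numpy as np

def rotation_z(theta):
    """Rotation matrix for an angle theta (radians) about the z axis."""
    c, s = np.cos(theta), np.sin(theta)
    return np.array([[c, -s, 0.0],
                     [s,  c, 0.0],
                     [0.0, 0.0, 1.0]])

def transform_gaussians(means, covs, mask, R, t):
    """Apply the rigid transform x -> R x + t to the Gaussians selected by mask.

    means: (N, 3) Gaussian centers
    covs:  (N, 3, 3) Gaussian covariance matrices
    mask:  (N,) boolean array marking the object to move (hypothetical
           output of a segmentation step)
    Returns edited copies; the inputs are left unchanged.
    """
    new_means = means.copy()
    new_covs = covs.copy()
    # Row-vector form of R @ mu + t for every selected center.
    new_means[mask] = means[mask] @ R.T + t
    # Covariances rotate as R @ Sigma @ R^T; translation does not affect them.
    new_covs[mask] = R @ covs[mask] @ R.T
    return new_means, new_covs

# Toy scene: two Gaussians, only the first belongs to the object being moved.
means = np.array([[1.0, 0.0, 0.0],
                  [0.0, 1.0, 0.0]])
covs = np.tile(np.diag([1.0, 4.0, 9.0]), (2, 1, 1))
mask = np.array([True, False])

new_means, new_covs = transform_gaussians(
    means, covs, mask, rotation_z(np.pi / 2), np.array([0.0, 0.0, 0.5])
)
```

In this toy scene the masked Gaussian is rotated by 90 degrees about z and lifted by 0.5, while the unmasked one is untouched. A full 3DGS scene also stores per-Gaussian opacity and view-dependent color, which this sketch leaves out.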

Augmented Demonstrations

Real World Deployment

Five Tasks

Pick Object

Close Drawer

Pick-Place-Close

Dual-Pick-Place

Sweep

Generalization

Our Team

Sizhe Yang*¹,²

Wenye Yu*¹,³

Jia Zeng¹

Jun Lv³

Kerui Ren¹,³

Cewu Lu³

Dahua Lin¹,²

Jiangmiao Pang¹

¹Shanghai AI Laboratory, ²The Chinese University of Hong Kong, ³Shanghai Jiao Tong University
* Equal contribution

If you have any questions, please contact Sizhe Yang and Wenye Yu.