Chenghao Gu

I am a second-year Master's student at Tsinghua University, SIGS, supervised by Prof. Zhi Wang on 3D Vision and Embodied AI. My research interests include Robot Data Generation, Real-to-Sim-to-Real, Generalizable Robot Learning, 3D Visual Reconstruction, and Interactive AI Generation.

Email / Google Scholar / GitHub

Research Interests

My research focuses on computer graphics and computer vision, particularly on 3D reconstruction and simulation techniques that facilitate effective robot learning.

Experience

Tsinghua University, SIGS
M.S. in Data Science and Information Technology
2024.09 – 2027.06 (Expected)

Tencent Robotics X
World Model Research Intern
2026.02 – Present

Publications

IGen: Scalable Data Generation for Robot Learning from Open-World Images
Chenghao Gu*, Haolan Kang*, Junchao Lin*, Jinghe Wang, Duo Wu, Shuzhao Xie, Fanding Huang, Junchen Ge, Ziyang Gong, Letian Li, Hongying Zheng, Changwei Lv, Zhi Wang
IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2026
project page / arXiv / pdf

IGen scalably generates realistic visual observations and executable robot actions from open-world images, enabling robotic manipulation policies trained purely on synthesized data to match the performance of policies trained on real-world data.

DragScene: Interactive 3D Scene Editing with Single-view Drag Instructions
Chenghao Gu, Zhenzhe Li, Zhengqi Zhang, Yunpeng Bai, Shuzhao Xie, Zhi Wang
arXiv preprint, 2024
arXiv / pdf

DragScene is an effective drag-style 3D scene editing framework, enabling controllable and view-consistent edits on real-world 3D scenes from a single reference view by combining 2D latent optimization with point-based 3D clues.

Tuning-Free Visual Customization via View Iterative Self-Attention Control
Xiaojie Li, Chenghao Gu, Shuzhao Xie, Yunpeng Bai, Weixiang Zhang, Zhi Wang
arXiv preprint, 2024
arXiv / pdf / code

VisCtrl is a tuning-free method that injects the appearance and structure of a user-specified subject into a target image via iterative self-attention control, enabling consistent personalized editing of images, videos, and 3D scenes with only a single reference image.

Template from Jon Barron. Last updated in July 2025.