2025

CDP: Towards Robust Autoregressive Visuomotor Policy Learning via Causal Diffusion

Jiahua Ma*, Yiran Qin*, Yixiong Li, Xuanqi Liao, Yulan Guo, Ruimao Zhang# (* equal contribution, # corresponding author, project lead)

Conference on Robot Learning (CoRL)

VIKI-R: Coordinating Embodied Multi-Agent Cooperation via Reinforcement Learning

Li Kang*, Xiufeng Song*, Heng Zhou*, Yiran Qin#, Jie Yang, Xiaohong Liu, Philip Torr, Lei Bai#, Zhenfei Yin# (* equal contribution, # corresponding author)

ArXiv Preprint

ReSo: A Reward-driven Self-organizing LLM-based Multi-Agent System for Reasoning Tasks

Heng Zhou*, Hejia Geng*, Xiangyuan Xue, Li Kang, Yiran Qin, Zhiyong Wang, Zhenfei Yin#, Lei Bai# (* equal contribution, # corresponding author)

ArXiv Preprint

Towards Robust Evaluation of STEM Education: Leveraging MLLMs in Project-Based Learning

Yanhao Jia, Xinyi Wu, Qinglin Zhang, Yiran Qin, Luwei Xiao, Shuai Zhao# (# corresponding author)

ArXiv Preprint

Context as Memory: Scene-Consistent Interactive Long Video Generation with Memory Retrieval

Jiwen Yu, Jianhong Bai, Yiran Qin, Quande Liu#, Xintao Wang, Pengfei Wan, Di Zhang, Xihui Liu# (# corresponding author)

ArXiv Preprint

RoboFactory: Exploring Embodied Agent Collaboration with Compositional Constraints

Yiran Qin*, Li Kang*, Xiufeng Song*, Zhenfei Yin#, Xiaohong Liu, Xihui Liu, Ruimao Zhang#, Lei Bai# (* equal contribution, # corresponding author)

International Conference on Computer Vision (ICCV) 2025; Best Paper Award at CVPR 2025 MEIS Workshop

A Survey of Interactive Generative Video

Jiwen Yu*, Yiran Qin*, Haoxuan Che*, Quande Liu#, Xintao Wang, Pengfei Wan, Di Zhang, Kun Gai, Hao Chen, Xihui Liu# (* equal contribution, # corresponding author)

ArXiv Preprint

2024

GameFactory: Creating New Games with Generative Interactive Videos

Jiwen Yu*, Yiran Qin*, Xintao Wang#, Pengfei Wan, Di Zhang, Xihui Liu# (* equal contribution, # corresponding author)

International Conference on Computer Vision (ICCV) 2025

Interactive Generative Video as Next-Generation Game Engine

Jiwen Yu*, Yiran Qin*, Haoxuan Che, Quande Liu, Xintao Wang#, Pengfei Wan, Di Zhang, Xihui Liu# (* equal contribution, # corresponding author)

ArXiv Preprint

WorldSimBench: Towards Video Generation Models as World Simulators

Yiran Qin*, Zhelun Shi*, Jiwen Yu, Xijun Wang, Enshen Zhou, Lijun Li, Zhenfei Yin, Xihui Liu, Lu Sheng, Jing Shao#, Lei Bai#, Ruimao Zhang# (* equal contribution, # corresponding author)

International Conference on Machine Learning (ICML) 2025; Oral at CVPR 2025 WorldModelBench Workshop

T2ISafety: Benchmark for Assessing Fairness, Toxicity, and Privacy in Image Generation

Lijun Li*, Zhelun Shi*, Xuhao Hu, Bowen Dong, Yiran Qin, Xihui Liu, Lu Sheng, Jing Shao# (* equal contribution, # corresponding author)

Conference on Computer Vision and Pattern Recognition (CVPR) 2025

High-Dynamic Radar Sequence Prediction for Weather Nowcasting Using Spatiotemporal Coherent Gaussian Representation

Ziye Wang, Yiran Qin, Lin Zeng, Ruimao Zhang# (# corresponding author)

International Conference on Learning Representations (ICLR) 2025 Oral

NavigateDiff: Visual Predictors are Zero-Shot Navigation Assistants

Yiran Qin*, Ao Sun*, Yuze Hong, Benyou Wang, Ruimao Zhang# (* equal contribution, # corresponding author)

International Conference on Robotics and Automation (ICRA) 2025

MineDreamer: Learning to Follow Instructions via Chain-of-Imagination for Simulated-World Control

Enshen Zhou*, Yiran Qin*, Zhenfei Yin, Yuzhou Huang, Ruimao Zhang#, Lu Sheng#, Yu Qiao, Jing Shao (* equal contribution, # corresponding author, project lead)

International Conference on Intelligent Robots and Systems (IROS) 2025

Story3D-Agent: Exploring 3D Storytelling Visualization with Large Language Models

Yuzhou Huang, Yiran Qin, Shunlin Lu, Xintao Wang#, Rui Huang, Ying Shan, Ruimao Zhang# (# corresponding author)

ArXiv Preprint

2023

MP5: A Multi-modal Open-ended Embodied System in Minecraft via Active Perception

Yiran Qin*, Enshen Zhou*, Qichang Liu*, Zhenfei Yin, Lu Sheng#, Ruimao Zhang#, Yu Qiao, Jing Shao (* equal contribution, # corresponding author, project lead)

Conference on Computer Vision and Pattern Recognition (CVPR) 2024

Toward Accurate Camera-based 3D Object Detection via Cascade Depth Estimation and Calibration

Chaoqun Wang*, Yiran Qin*, Zijian Kang, Ningning Ma, Ruimao Zhang (* equal contribution)

International Conference on Robotics and Automation (ICRA) 2024

Boosting 3D Object Detection via Self-Distilling Introspective Data

Chaoqun Wang, Yiran Qin, Zijian Kang, Ningning Ma, Yukai Shi, Zhen Li, Ruimao Zhang# (# corresponding author)

IEEE Transactions on Intelligent Transportation Systems (TITS)

SupFusion: Supervised LiDAR-Camera Fusion for 3D Object Detection

Yiran Qin*, Chaoqun Wang*, Zijian Kang, Ningning Ma, Zhen Li, Ruimao Zhang# (* equal contribution, # corresponding author)

International Conference on Computer Vision (ICCV) 2023
