ACE-Brain-0: Spatial Intelligence as a Shared Scaffold for Universal Embodiments

Ziyang Gong1,2*, Zehang Luo1*, Anke Tang1*, Zhe Liu1,5*, Shi Fu3, Zhi Hou1†,‡, Ganlin Yang6, Weiyun Wang7, Xiaofeng Wang1, Jianbo Liu1, Gen Luo8, Haolan Kang5, Shuang Luo3, Yue Zhou9, Yong Luo10, Li Shen11, Xiaosong Jia7, Yao Mu2, Xue Yang2‡, Chunxiao Liu1, Junchi Yan2, Hengshuang Zhao5, Dacheng Tao3‡, Xiaogang Wang1‡
1ACE Robotics, 2Shanghai Jiao Tong University, 3Nanyang Technological University, 4The Chinese University of Hong Kong, 5The University of Hong Kong, 6University of Science and Technology of China, 7Fudan University, 8Xiamen University, 9East China Normal University, 10Wuhan University, 11Sun Yat-sen University
*Equal contribution   †Project Leader   ‡Corresponding author

Intelligence Domains Comparison

From the ACE-Brain-0 paper: Tables 2–5, four domains, six bar charts per domain

Overview

ACE-Brain-0 is a generalist multimodal foundation model designed to unify perception, reasoning, and decision-making across diverse embodied domains, including spatial cognition, autonomous driving, low-altitude sensing, and embodied interaction. Built upon a unified multimodal large language model (MLLM) architecture, ACE-Brain-0 learns a shared spatial reasoning substrate that enables generalization across heterogeneous physical environments and agent embodiments.

Extensive evaluation across 24 benchmarks demonstrates that ACE-Brain-0 achieves state-of-the-art or competitive performance across multiple domains, validating its effectiveness as a unified embodied intelligence model.

Figure: ACE-Brain-0 teaser

Key Features

Spatial Intelligence as Scaffold

Built around a unified spatial representation that bridges perception and thinking across heterogeneous embodiments, enabling robust 3D scene understanding as the core cognitive backbone.

Scaffold-Specialize-Reconcile

The SSR training paradigm first establishes a shared spatial foundation, then cultivates domain-specific experts in isolation, and finally harmonizes them through data-free model merging, eliminating gradient interference and catastrophic forgetting.
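
The final "reconcile" step can be illustrated with a minimal sketch of data-free merging. The page does not state which merging rule ACE-Brain-0 uses, so the rule below (averaging each expert's offset from the shared scaffold, often called task-vector averaging) is an assumption chosen for illustration. The function name `merge_experts`, the `scale` parameter, and the toy weights are hypothetical.

```python
from typing import Dict, List

# Hypothetical weight format: parameter name -> flat list of values.
Weights = Dict[str, List[float]]


def merge_experts(scaffold: Weights, experts: List[Weights], scale: float = 1.0) -> Weights:
    """Merge experts without any training data.

    Assumed rule (not confirmed by the source): average each expert's
    offset from the scaffold and add the scaled average back to the scaffold.
    """
    merged: Weights = {}
    for name, base in scaffold.items():
        values = []
        for i, b in enumerate(base):
            # Mean offset of all experts from the shared scaffold weight.
            delta = sum(expert[name][i] - b for expert in experts) / len(experts)
            values.append(b + scale * delta)
        merged[name] = values
    return merged


# Toy example: two experts fine-tuned from the same scaffold.
scaffold = {"w": [1.0, 2.0]}
driving_expert = {"w": [1.5, 2.0]}
aerial_expert = {"w": [1.0, 3.0]}

merged = merge_experts(scaffold, [driving_expert, aerial_expert])
print(merged)  # {'w': [1.25, 2.5]}
```

No gradients or training samples appear anywhere in the merge, which is what makes a step of this kind data-free; only the scaffold and expert weights are needed.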

Universal Embodiment

A single foundation brain that generalizes across spatial cognition, autonomous driving, low-altitude sensing, and embodied interaction, covering four distinct intelligence domains within one unified architecture and training paradigm.

Performance Highlights

Comprehensive comparison of ACE-Brain-0 against state-of-the-art models across 24 benchmarks spanning four intelligence domains. Bold values denote the best result per benchmark; Airspatial is a lower-is-better metric.

Benchmark         VeBrain  Pelican-VL  MiMo-Embodied  RoboBrain2.5  Vlaser  ACE-Brain-0
  Spatial Cognition
VSIBench             39.9        52.8           48.5          41.0    60.3         63.3
MMSI-Bench           27.3        26.0           31.7          29.3    27.2         32.2
BLINK                79.7        56.8            0.0          84.3    84.9         83.9
SITE                 51.4        52.3           44.8          52.6    47.5         53.1
SAT                  73.3        67.3           78.7          63.3    66.7         92.0
MindCube             30.1        31.0           32.3          28.1    34.6         82.1
Multi3DRef           67.8         7.9            8.2           8.2     8.2         59.6
  Autonomous Driving
MME-RealWorld        60.1        57.9           60.3          60.0    41.6         71.2
MAPLM                22.9        24.9           74.5          22.5    29.1         77.8
DriveAction          78.3        77.2           81.0          80.5    78.1         81.3
NuscenesQA           29.3        14.8           56.7          33.2    33.1         58.8
NuPlanQA             82.9        83.4           73.7          79.3    78.3         91.7
LingoQA              55.0        56.0           69.9          48.0    59.6         65.8
  Low-Altitude Sensing
UrbanVideo-Bench     36.5        37.1           26.0          37.5    30.4         56.9
AirCop               51.9        50.8           50.2          49.9    25.3         70.3
AVI-Math             25.4        22.5           33.7          26.1    19.3         35.0
Airspatial         1583.4      1586.6          289.4        1509.3  1597.7        258.0
HRVQA                37.9        38.6           22.2          13.4    27.0         61.2
  Embodied Interaction
ERQA                 40.3        39.8           46.8          44.3    41.0         41.5
RoboVQA              29.2        28.1            0.9          32.9     7.9         64.6
OpenEQA              63.8        63.3           74.1          62.6    56.3         70.0
EgoPlan2             27.3        39.4           43.0          44.9    53.4         55.3
EmbSpatial           70.5        73.2           76.2          75.6    75.3         77.3
EB-Habitat           15.0        16.3           16.7          26.3    40.0         42.3

Notes:

  • Bold values indicate the best result in each row.
  • Airspatial is a lower-is-better metric; ACE-Brain-0 achieves the lowest error (258.0).
  • Results sourced from Tables 2–5 of the ACE-Brain-0 paper.
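
As a reading aid, the sketch below shows one way to recover the best result per row when bold formatting is unavailable, treating Airspatial as lower-is-better. It uses three rows copied from the table above; the names `ROWS`, `LOWER_IS_BETTER`, and `best_model` are hypothetical helpers and not part of any ACE-Brain-0 release.

```python
# Column order follows the table header.
MODELS = ["VeBrain", "Pelican-VL", "MiMo-Embodied", "RoboBrain2.5", "Vlaser", "ACE-Brain-0"]

# Three rows copied from the table; extend with the remaining rows as needed.
ROWS = {
    "VSIBench": [39.9, 52.8, 48.5, 41.0, 60.3, 63.3],
    "BLINK": [79.7, 56.8, 0.0, 84.3, 84.9, 83.9],
    "Airspatial": [1583.4, 1586.6, 289.4, 1509.3, 1597.7, 258.0],
}

# Benchmarks where a smaller value is the better result.
LOWER_IS_BETTER = {"Airspatial"}


def best_model(benchmark):
    """Return (model, score) for the best entry in one table row."""
    scores = ROWS[benchmark]
    pick = min if benchmark in LOWER_IS_BETTER else max
    best = pick(scores)
    return MODELS[scores.index(best)], best


for name in ROWS:
    print(name, best_model(name))
# VSIBench ('ACE-Brain-0', 63.3)
# BLINK ('Vlaser', 84.9)
# Airspatial ('ACE-Brain-0', 258.0)
```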

BibTeX

@misc{gong2026acebrain0spatialintelligenceshared,
      title={ACE-Brain-0: Spatial Intelligence as a Shared Scaffold for Universal Embodiments}, 
      author={Ziyang Gong and Zehang Luo and Anke Tang and Zhe Liu and Shi Fu and Zhi Hou and Ganlin Yang and Weiyun Wang and Xiaofeng Wang and Jianbo Liu and Gen Luo and Haolan Kang and Shuang Luo and Yue Zhou and Yong Luo and Li Shen and Xiaosong Jia and Yao Mu and Xue Yang and Chunxiao Liu and Junchi Yan and Hengshuang Zhao and Dacheng Tao and Xiaogang Wang},
      year={2026},
      eprint={2603.03198},
      archivePrefix={arXiv},
      primaryClass={cs.RO},
      url={https://arxiv.org/abs/2603.03198}, 
}