1 Institute of Artificial Intelligence (TeleAI), China Telecom; 2 ShanghaiTech University; 3 Zhejiang University; 4 Shanghai Jiao Tong University
*Equal Contribution Corresponding Author

Video

Scroll down for more real-world experiment videos.

Abstract

Soccer presents a significant challenge for humanoid robots, demanding tightly integrated perception-action capabilities for tasks like perception-guided kicking and whole-body balance control. Existing approaches suffer from inter-module instability in modular pipelines or conflicting training objectives in end-to-end frameworks. We propose Perception-Action integrated Decision-making (PAiD), a progressive architecture that decomposes soccer skill acquisition into three stages: motion-skill acquisition via human motion tracking, lightweight perception-action integration for positional generalization, and physics-aware sim-to-real transfer. This staged decomposition establishes stable foundational skills, avoids reward conflicts during perception integration, and minimizes sim-to-real gaps. Experiments on the Unitree G1 demonstrate high-fidelity human-like kicking with robust performance under diverse conditions—including static or rolling balls, various positions, and disturbances—while maintaining consistent execution across indoor and outdoor scenarios. Our divide-and-conquer strategy advances robust humanoid soccer capabilities and offers a scalable framework for complex embodied skill acquisition.

Method

[Image 1]



Experiments


Penalty kick

Left Position (near)
Left Position (far)
Forward Position (near)
Forward Position (far)
Right Position (near)
Right Position (far)

Consecutive random position shots

11 consecutive successful shots (ball placed at random positions)

Shooting in the style of star players

C. Ronaldo (original)
C. Ronaldo
Neymar (original)
Neymar
Mbappé (original)
Mbappé

Grass Terrain Test

Standard motion on grass
Star-player motion on grass

Kicking a moving ball

From right to left
From left to right



BibTeX


@misc{kong2026learningsoccerskillshumanoid,
      title={Learning Soccer Skills for Humanoid Robots: A Progressive Perception-Action Framework},
      author={Jipeng Kong and Xinzhe Liu and Yuhang Lin and Jinrui Han and Sören Schwertfeger and Chenjia Bai and Xuelong Li},
      year={2026},
      eprint={2602.05310},
      archivePrefix={arXiv},
      primaryClass={cs.RO},
      url={https://arxiv.org/abs/2602.05310},
}