SEA-Nav: Efficient Policy Learning for Safe and Agile Quadruped Navigation in Cluttered Environments

Shiyi Chen, Mingye Yang, Haiyan Mao, Jiaqi Zhang, Haiyi Liu,
Shuheng He, Debing Zhang, Zihao Qiu, Chun Zhang
Tsinghua University, Imperial College London

Abstract

Efficiently training quadruped navigation policies for densely cluttered environments remains a significant challenge. Existing methods either lack safety and agility even in simple obstacle distributions or locomote slowly in complex environments, and they often require excessively long training phases. To this end, we propose SEA-Nav (Safe, Efficient, and Agile Navigation), a reinforcement learning framework for quadruped navigation. Within diverse and dense obstacle environments, a differentiable control barrier function (CBF)-based shield constrains the navigation policy to output safe velocity commands. An adaptive collision replay mechanism and hazardous exploration rewards are introduced to increase the probability of learning from critical experiences, guiding efficient exploration and exploitation. Finally, kinematic action constraints are incorporated to ensure safe velocity commands, facilitating successful physical deployment. To the best of our knowledge, this is the first approach that achieves highly challenging quadruped navigation in the real world with minute-level training time.

Framework Overview

SEA-Nav framework
We present SEA-Nav, a highly efficient RL navigation framework for quadruped robots that integrates ACSI experience replay and an end-to-end differentiable LSE-CBF shield. It achieves agile and safe deployment with minute-level training.
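
To make the idea of a CBF shield on velocity commands concrete, here is a minimal sketch. It is not the SEA-Nav implementation. It assumes that "LSE" stands for log-sum-exp (the page does not say so), that the robot is a 2D point with single-integrator dynamics, and that obstacles are circles given as (cx, cy, radius). The function names `lse_barrier` and `shield` and the parameters `alpha` and `k` are hypothetical. The per-obstacle barriers are merged into one smooth barrier h, and the nominal command is projected onto the half-space where grad h · v + alpha · h >= 0.

```python
import math

def lse_barrier(p, obstacles, k=10.0):
    """Smooth lower bound on min_i (distance to obstacle i - radius), plus its gradient.

    Assumes p does not coincide with any obstacle center.
    """
    hs, grads = [], []
    for cx, cy, r in obstacles:
        dx, dy = p[0] - cx, p[1] - cy
        d = math.hypot(dx, dy)
        hs.append(d - r)                # signed clearance to this obstacle
        grads.append((dx / d, dy / d))  # gradient of the clearance w.r.t. p
    m = min(hs)
    # Shift by the minimum for numerical stability (standard log-sum-exp trick).
    ws = [math.exp(-k * (h - m)) for h in hs]
    s = sum(ws)
    h = m - math.log(s) / k             # equals -(1/k) * log(sum_i exp(-k * h_i))
    gx = sum(w * g[0] for w, g in zip(ws, grads)) / s
    gy = sum(w * g[1] for w, g in zip(ws, grads)) / s
    return h, (gx, gy)

def shield(v_nom, p, obstacles, alpha=1.0, k=10.0):
    """Return the velocity closest to v_nom that satisfies grad_h . v + alpha * h >= 0."""
    h, (gx, gy) = lse_barrier(p, obstacles, k)
    slack = gx * v_nom[0] + gy * v_nom[1] + alpha * h
    if slack >= 0.0:
        return v_nom                    # nominal command already satisfies the constraint
    g2 = gx * gx + gy * gy
    if g2 < 1e-12:
        return (0.0, 0.0)               # degenerate gradient: fall back to stopping
    scale = -slack / g2                 # closed-form projection onto the half-space
    return (v_nom[0] + scale * gx, v_nom[1] + scale * gy)

# Heading straight at an obstacle: the forward speed is reduced.
print(shield((1.0, 0.0), (0.0, 0.0), [(1.0, 0.0, 0.5)]))
# Heading away from it: the command passes through unchanged.
print(shield((-1.0, 0.0), (0.0, 0.0), [(1.0, 0.0, 0.5)]))
```

Because the log-sum-exp value never exceeds the smallest individual clearance, keeping h non-negative keeps every obstacle clearance non-negative, and a single smooth constraint admits a closed-form projection. In a learning setting the same expressions could be written in an autodiff framework so that gradients flow through the shield; this sketch uses plain Python only to stay self-contained.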

One-take Demonstration

In a one-take experiment with 10 trials, the robot completed every run with zero collisions.

Result after approximately 30 minutes of training

Result after approximately 60 minutes of training

Cluttered Room

SEA-Nav

SEA-Nav-b*

Dynamic Obstacle

SEA-Nav

SEA-Nav-b*

Obstacle Course

SEA-Nav

SEA-Nav-b*

S-Bend Track

SEA-Nav

SEA-Nav-b*

* SEA-Nav-b uses the Unitree built-in MPC controller and onboard sparse LiDAR.