Free keywords:
-
Abstract:
Vision-based reinforcement learning (RL) methods enable efficient policy learning and adaptive decision-making for the navigation of quadrotor uncrewed aerial vehicles (QUAVs) in complex, high-dimensional flight environments. Although end-to-end vision-based RL approaches are effective, they often function as closed-box models and lack interpretability. We develop an explainable vision-based hierarchical RL algorithm for QUAV navigation that integrates perception, obstacle avoidance, and motion control into a unified framework. Because of the high-dimensional state space and complex dynamics of QUAV tasks, traditional RL methods often suffer from sparse, difficult-to-obtain rewards. To address this, we introduce an echoic hindsight experience replay mechanism, which accelerates convergence by transforming failed episodes into successful ones. Building on this, we propose an RL-based proportional-integral-derivative-retarded control method that leverages multirate measurements to enhance low-level control performance, improving the maneuverability and precision of QUAV operations. Both simulated and real-world experiments demonstrate the effectiveness of the proposed method for QUAV navigation in complex environments.
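As background for the hindsight replay idea the abstract builds on, the sketch below shows standard hindsight experience replay goal relabeling: a failed episode is stored a second time with the state it actually reached substituted as the goal, so the relabeled copy earns success rewards and densifies the sparse signal. This illustrates only the generic mechanism; the "echoic" variant described in the abstract is not specified here, and all function names and the sparse reward are illustrative assumptions.

```python
import numpy as np

def her_relabel(episode, reward_fn):
    """Generic hindsight relabeling (not the paper's echoic variant):
    treat the final achieved state of a failed episode as if it had
    been the goal, and recompute rewards under that substituted goal."""
    achieved_goal = episode[-1]["next_state"]
    relabeled = []
    for step in episode:
        new_step = dict(step)  # copy so the original episode is kept intact
        new_step["goal"] = achieved_goal
        new_step["reward"] = reward_fn(new_step["next_state"], achieved_goal)
        relabeled.append(new_step)
    return relabeled

def sparse_reward(state, goal, tol=1e-3):
    # Illustrative sparse navigation reward: 0 on reaching the goal, -1 otherwise.
    dist = np.linalg.norm(np.asarray(state) - np.asarray(goal))
    return 0.0 if dist < tol else -1.0
```

Both the original and the relabeled copy are then added to the replay buffer, so the agent learns from the failed trajectory as if it had been a successful one.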
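The low-level controller mentioned in the abstract augments PID control with a retarded (time-delayed) feedback term; the exact formulation, the RL-based gain tuning, and the use of multirate measurements are in the paper itself. A generic discrete-time sketch of a PID controller plus a delayed-error term, with hypothetical gains, is:

```python
from collections import deque

class PIDRController:
    """Illustrative PID-retarded controller (not the paper's exact law):
    standard P, I, and D terms plus feedback on the error measured
    `delay_steps` samples in the past."""

    def __init__(self, kp, ki, kd, kr, delay_steps, dt):
        self.kp, self.ki, self.kd, self.kr = kp, ki, kd, kr
        self.dt = dt
        self.integral = 0.0
        self.prev_error = 0.0
        # Fixed-length buffer of past errors realizes the retarded term.
        self.history = deque([0.0] * delay_steps, maxlen=delay_steps)

    def update(self, error):
        self.integral += error * self.dt
        derivative = (error - self.prev_error) / self.dt
        delayed_error = self.history[0]  # error from delay_steps samples ago
        self.history.append(error)
        self.prev_error = error
        return (self.kp * error
                + self.ki * self.integral
                + self.kd * derivative
                - self.kr * delayed_error)
```

The sign and size of the delayed-feedback gain are design choices; in the retarded-control literature a delayed term is often used as a noise-robust substitute for derivative action.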