

Journal Article

Vision-Based Hierarchical Reinforcement Learning for Quadrotor UAV Navigation

Authors

Sun,  Qiyu
External Organizations;

Ji,  Jiaxin
External Organizations;

Mu,  Jinzhen
External Organizations;

Xu,  Jing
External Organizations;

Kocarev,  Ljupco
External Organizations;

Kurths,  Jürgen
Potsdam Institute for Climate Impact Research;

Citation

Sun, Q., Ji, J., Mu, J., Xu, J., Kocarev, L., Kurths, J. (2025): Vision-Based Hierarchical Reinforcement Learning for Quadrotor UAV Navigation. - IEEE/ASME Transactions on Mechatronics, 30, 6, 4154-4164.
https://doi.org/10.1109/TMECH.2025.3596019


Cite as: https://publications.pik-potsdam.de/pubman/item/item_33877
Abstract
Vision-based reinforcement learning (RL) methods enable efficient policy learning and adaptive decision-making for quadrotor uncrewed aerial vehicle (UAV) navigation in complex, high-dimensional flight environments. Although end-to-end vision-based RL approaches are effective, they often function as closed-box models that lack interpretability. We develop an explainable vision-based hierarchical RL algorithm for quadrotor UAV (QUAV) navigation that integrates perception, obstacle avoidance, and motion control into a unified framework. Because QUAV tasks have a high-dimensional state space and complex dynamics, traditional RL methods often suffer from sparse, difficult-to-obtain rewards. To address this, we introduce the echoic hindsight experience replay mechanism, which accelerates convergence by transforming failed episodes into successful ones. Building on this, we propose an RL-based proportional-integral-derivative-retarded control method that leverages multirate measurements to enhance low-level control performance, improving maneuverability and precision in QUAV operations. Both simulated and real-world experiments demonstrate the effectiveness of the proposed method for UAV navigation in complex environments.
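
The idea of turning failed episodes into successful ones can be illustrated with a minimal sketch of standard hindsight experience replay using the "final" relabeling strategy. This is not the echoic variant the abstract introduces, whose details are not given here. The `Transition` fields, the `sparse_reward` function, and the 0.1 tolerance are all assumptions made for illustration.

```python
from dataclasses import dataclass, replace
from typing import List, Tuple

Point = Tuple[float, float, float]  # hypothetical 3-D position (x, y, z)


@dataclass(frozen=True)
class Transition:
    state: Point      # position before the action
    action: Point     # commanded displacement (illustrative only)
    achieved: Point   # position actually reached after the action
    goal: Point       # goal the agent was pursuing
    reward: float


def sparse_reward(achieved: Point, goal: Point, tol: float = 0.1) -> float:
    """Assumed sparse reward: 0 within `tol` of the goal, otherwise -1."""
    dist = sum((a - g) ** 2 for a, g in zip(achieved, goal)) ** 0.5
    return 0.0 if dist <= tol else -1.0


def relabel_final(episode: List[Transition], tol: float = 0.1) -> List[Transition]:
    """Relabel every transition with the episode's final achieved position
    as the goal, then recompute the reward against that substituted goal."""
    if not episode:
        return []
    new_goal = episode[-1].achieved
    return [
        replace(t, goal=new_goal, reward=sparse_reward(t.achieved, new_goal, tol))
        for t in episode
    ]


# A failed episode: the goal is at x = 5, but the vehicle only reaches x = 2.
goal = (5.0, 0.0, 1.0)
failed = [
    Transition((0.0, 0.0, 1.0), (1.0, 0.0, 0.0), (1.0, 0.0, 1.0), goal, -1.0),
    Transition((1.0, 0.0, 1.0), (1.0, 0.0, 0.0), (2.0, 0.0, 1.0), goal, -1.0),
]
relabeled = relabel_final(failed)
```

In this sketch, both the original and relabeled transitions would be stored in the replay buffer, so the learner sees at least one non-negative reward per episode even when the true goal is never reached.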