An improved Q-learning path planning algorithm based on IAPF
DOI: https://doi.org/10.54097/yzhqnd95
Keywords: Q-learning, APF, Reinforcement learning, Path planning
Abstract
Unmanned Aerial Vehicles (UAVs), particularly quadrotor UAVs, are widely recognized for their cost-effectiveness, operational flexibility, and vertical takeoff and landing capabilities, making them well suited to specific airspace operations and emergency response scenarios. Efficient path planning is critical for UAV mission execution, requiring optimal flight paths that ensure safe obstacle avoidance while addressing challenges such as dynamic environments, energy optimization, and multi-parameter management. Despite advances in path planning techniques, including the artificial potential field (APF) method and reinforcement learning (RL), issues such as local optima and parameter-tuning complexity persist, limiting adaptability in dynamic environments. This study proposes a hybrid approach that integrates Markov Decision Process (MDP) theory with an improved artificial potential field (IAPF) method to enhance quadrotor UAV path planning in three-dimensional environments. By generating global waypoints from known obstacle information, the method minimizes flight path deviations and improves navigation performance. The results demonstrate significant improvements in trajectory accuracy and adaptability, offering a robust solution for UAV path optimization in complex scenarios.
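As a minimal, illustrative sketch of the general idea of coupling an artificial potential field with Q-learning (an APF-shaped reward guiding a tabular agent), the Python snippet below plans a route on a small 2-D grid. The grid size, obstacle layout, APF gains (K_ATT, K_REP, RHO0), and reward constants are all illustrative assumptions, not the parameters or the IAPF formulation used in this study, which operates in a three-dimensional environment with a quadrotor model.

```python
import numpy as np

# Hypothetical 2-D grid world standing in for the UAV workspace (illustrative only).
GRID = 10
GOAL = np.array([9, 9])
OBSTACLES = [np.array([4, 4]), np.array([4, 5]), np.array([6, 7])]
ACTIONS = [(-1, 0), (1, 0), (0, -1), (0, 1)]   # up, down, left, right

K_ATT, K_REP, RHO0 = 1.0, 5.0, 2.0             # assumed APF gains and influence radius


def potential(p):
    """Artificial potential: attractive toward the goal, repulsive near obstacles."""
    u_att = 0.5 * K_ATT * np.sum((p - GOAL) ** 2)
    u_rep = 0.0
    for obs in OBSTACLES:
        rho = np.linalg.norm(p - obs)
        if 0 < rho <= RHO0:
            u_rep += 0.5 * K_REP * (1.0 / rho - 1.0 / RHO0) ** 2
    return u_att + u_rep


def step(state, action):
    """Apply an action; reward is the drop in potential plus goal/collision terms."""
    nxt = np.clip(state + np.array(action), 0, GRID - 1)
    if any(np.array_equal(nxt, o) for o in OBSTACLES):
        return state, -10.0, False              # collision: stay put, penalize
    reward = potential(state) - potential(nxt)  # moving "downhill" in the field is rewarded
    done = np.array_equal(nxt, GOAL)
    return nxt, reward + (50.0 if done else -0.1), done


# Tabular Q-learning over the APF-shaped reward.
Q = np.zeros((GRID, GRID, len(ACTIONS)))
alpha, gamma, eps = 0.1, 0.95, 0.2
rng = np.random.default_rng(0)

for episode in range(2000):
    s = np.array([0, 0])
    for _ in range(200):
        a = rng.integers(len(ACTIONS)) if rng.random() < eps else int(np.argmax(Q[s[0], s[1]]))
        s2, r, done = step(s, ACTIONS[a])
        Q[s[0], s[1], a] += alpha * (r + gamma * np.max(Q[s2[0], s2[1]]) - Q[s[0], s[1], a])
        s = s2
        if done:
            break

print("Greedy action at start cell:", ACTIONS[int(np.argmax(Q[0, 0]))])
```

In this sketch the potential field plays the role of a dense shaping signal, so the agent receives informative feedback long before it first reaches the goal; the repulsive term discourages paths that pass close to obstacles, which is the qualitative effect an IAPF term is intended to have on the learned policy.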
License
Copyright (c) 2025 Highlights in Science, Engineering and Technology

This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License.







