Intelligent Perception System of Moving Robot for Dynamic Environment
DOI: https://doi.org/10.54097/9nd37191
Keywords: Deep learning; Gold-yolo; DCN2.
Abstract
Object detection remains a significant challenge in computer vision for mobile robot grasping tasks. Rapid advances in deep learning have produced many new object detection algorithms and improved accuracy in large-scale environments. Mobile robot grasping, however, demands highly responsive target detection: any delay can prevent the robot from reacting in time and degrade task execution. In addition, the background changes constantly as the robot moves quickly, which introduces dynamic interference. Addressing real-time processing and dynamic background interference in these tasks requires faster target detection, more comprehensive feature capture, and better use of image context information. This paper introduces a new information fusion structure based on the YOLOv8 model and the Gold-YOLO gather-and-distribute mechanism. The sampling approach retains the C2f module from YOLOv8 (a faster implementation of the CSP bottleneck with two convolutions) and adds the deformable convolution module from Deformable Convolutional Networks v2 (DCNv2). The two sampling methods are interleaved, which improves detection accuracy across the varying shapes that the same object presents as the robot moves.
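The deformable sampling mentioned in the abstract can be illustrated with a minimal pure-Python sketch. It is not the paper's implementation: it handles a single channel, a 3x3 kernel, stride 1 and zero padding, and the function names (`bilinear`, `deform_conv3x3`, `plain_conv3x3`) are hypothetical. In a DCNv2-style layer the per-tap offsets and modulation masks would be predicted by a learned convolution; here they are assumed to be given as inputs.

```python
import math


def bilinear(img, y, x):
    """Sample img (a list of rows) at fractional (y, x); outside the image counts as 0."""
    h, w = len(img), len(img[0])
    y0, x0 = math.floor(y), math.floor(x)
    total = 0.0
    for yy in (y0, y0 + 1):
        for xx in (x0, x0 + 1):
            if 0 <= yy < h and 0 <= xx < w:
                # Bilinear weight falls off linearly with distance to the grid point.
                weight = (1 - abs(y - yy)) * (1 - abs(x - xx))
                total += weight * img[yy][xx]
    return total


def deform_conv3x3(img, kernel, offsets, masks):
    """Modulated deformable 3x3 convolution (single channel, stride 1, padding 1).

    offsets[i][j] holds 9 (dy, dx) pairs, one per kernel tap (assumed given here).
    masks[i][j] holds 9 modulation scalars, typically in [0, 1] (assumed given here).
    """
    h, w = len(img), len(img[0])
    out = [[0.0] * w for _ in range(h)]
    for i in range(h):
        for j in range(w):
            acc = 0.0
            for k in range(9):
                ky, kx = divmod(k, 3)
                dy, dx = offsets[i][j][k]
                # Regular grid position of the tap, shifted by its offset.
                sample = bilinear(img, i + ky - 1 + dy, j + kx - 1 + dx)
                acc += kernel[ky][kx] * masks[i][j][k] * sample
            out[i][j] = acc
    return out


def plain_conv3x3(img, kernel):
    """Ordinary 3x3 convolution, recovered as the zero-offset, unit-mask special case."""
    h, w = len(img), len(img[0])
    zero_offsets = [[[(0.0, 0.0)] * 9 for _ in range(w)] for _ in range(h)]
    unit_masks = [[[1.0] * 9 for _ in range(w)] for _ in range(h)]
    return deform_conv3x3(img, kernel, zero_offsets, unit_masks)
```

With all offsets at zero and all masks at one, the layer reduces to a standard convolution, which is why fixed-grid and deformable sampling can be mixed within one network. With a kernel that is 1 at the centre tap only and every offset set to (0, 0.5), each output becomes the average of a pixel and its right-hand neighbour, which shows how the sampling grid can follow an object's changing shape.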
Copyright (c) 2025 Highlights in Science, Engineering and Technology

This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License.