Research on Image Recognition and Deep Learning Models in Computer Vision

Authors

  • Chunpu Qiao

DOI:

https://doi.org/10.54097/byaner02

Keywords:

Computer vision; image recognition; deep learning; multi-scale features; adaptive attention; MSA-Net.

Abstract

Mainstream deep learning models still struggle with multi-scale target recognition and complex scenes, which limits their generalization and robustness. This paper proposes a multi-scale adaptive attention network (MSA-Net) that improves image recognition by combining multi-scale features with adaptive attention. MSA-Net extracts multi-scale features through a multi-branch convolutional network to capture target information at different resolutions, and introduces an adaptive attention module that dynamically adjusts the distribution of global and local attention, strengthening the model's focus on key regions. Experiments are conducted on public datasets including ImageNet and CIFAR-10. MSA-Net reaches a classification accuracy of 94.6% on CIFAR-10, 3.2% higher than ResNet; its mean average precision (mAP) rises to 82.5%, and its inference speed reaches 22.8 FPS. Visualization experiments confirm the advantages of MSA-Net in complex backgrounds and multi-scale target scenes. Practical applications show that MSA-Net achieves efficient image recognition in fields such as security monitoring and medical imaging, and provides technical support for automation and intelligent processing in these scenarios.
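The abstract does not give the layer-level design of MSA-Net, so the following is only a toy sketch of the general idea it describes: several branches view the same input at different scales, and a softmax attention step weights the branches before fusing them. Everything here is an illustrative assumption rather than the paper's method: the function names (`avg_pool`, `branch_descriptor`, `fuse_with_attention`), the use of average pooling in place of learned convolutional branches, and the fixed `score_weights` in place of a trained attention module.

```python
import math


def avg_pool(image, k):
    """Downsample a 2-D list by averaging non-overlapping k x k blocks."""
    h, w = len(image), len(image[0])
    return [
        [
            sum(image[i + di][j + dj] for di in range(k) for dj in range(k)) / (k * k)
            for j in range(0, w - k + 1, k)
        ]
        for i in range(0, h - k + 1, k)
    ]


def branch_descriptor(image, k):
    """One hypothetical 'branch': pool at scale k, then summarise as [mean, max]."""
    pooled = avg_pool(image, k)
    flat = [v for row in pooled for v in row]
    return [sum(flat) / len(flat), max(flat)]


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def fuse_with_attention(image, scales=(1, 2, 4), score_weights=(1.0, 1.0)):
    """Weight each scale's descriptor by a softmax attention score and sum them.

    score_weights stands in for parameters a real attention module would learn.
    """
    descriptors = [branch_descriptor(image, k) for k in scales]
    scores = [
        sum(w * d for w, d in zip(score_weights, desc)) for desc in descriptors
    ]
    attention = softmax(scores)
    fused = [
        sum(a * desc[c] for a, desc in zip(attention, descriptors))
        for c in range(len(descriptors[0]))
    ]
    return attention, fused


# A 4 x 4 input with one small bright target.
image = [
    [0, 0, 0, 0],
    [0, 8, 0, 0],
    [0, 0, 0, 0],
    [0, 0, 0, 0],
]
attention, fused = fuse_with_attention(image)
print("attention per scale:", attention)
print("fused descriptor:", fused)
```

In this toy input the finest scale keeps the small target sharpest, so it receives the largest attention weight; a trained network would learn such weights from data instead of using a fixed scoring rule.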

Published

11-05-2025

How to Cite

Qiao, C. (2025). Research on Image Recognition and Deep Learning Models in Computer Vision. Highlights in Science, Engineering and Technology, 138, 138-146. https://doi.org/10.54097/byaner02