Research on Olympic Medal Prediction and Influencing Factors Based on Multi-Dimensional Feature Fusion and Machine Learning

Authors

  • Ruihan Chen
  • Zhe Hu

DOI:

https://doi.org/10.54097/82y5eb28

Keywords:

Multiple Linear Regression Model, Gradient Boosting Model, SHAP Analysis, LASSO-Logistic Regression, Olympic Medal Prediction.

Abstract

Existing Olympic medal prediction models rely heavily on macroeconomic indicators and lack interpretability. To address these limitations, this study proposes a hybrid machine learning framework that integrates multi-dimensional feature engineering with model interpretability. It constructs a 19-feature system incorporating historical medals, athlete gender ratios, elite coach attributes (simulated via Monte Carlo), and host-nation effects. Three complementary models are developed: (1) a multiple linear regression predicting 2028 medal distributions (RMSE=0.25); (2) a LASSO-Logistic regression identifying nations likely to win their first medal (e.g., UAE and Samoa, probability >0.5); (3) a Gradient Boosting Tree with SHAP interpretability quantifying elite coaches’ contribution (SHAP mean=0.217, *p*<0.001). Hyperparameter optimization via RandomizedSearchCV achieves high accuracy (gold medal MAE=0.09, total medals MAE=0.25). Key innovations include dynamic feature fusion and micro-level quantification of coach impact, providing actionable insights for Olympic resource allocation.
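
As a rough illustration of the tuning step named in the abstract, the sketch below runs scikit-learn's RandomizedSearchCV over a gradient boosting regressor and reports held-out MAE. It is not the authors' pipeline: the data are synthetic, the 19 columns are random stand-ins for the paper's features, and the search space, fold count, and iteration budget are assumptions chosen only to keep the example small.

```python
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.metrics import mean_absolute_error
from sklearn.model_selection import RandomizedSearchCV, train_test_split

# Synthetic stand-in data (assumption): 200 hypothetical country-year rows, 19 features.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 19))
# Hypothetical target: a noisy linear function of the first three features.
y = 2.0 * X[:, 0] + X[:, 1] - 0.5 * X[:, 2] + rng.normal(scale=0.1, size=200)

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)

# Illustrative search space (assumption); the abstract does not state the real ranges.
param_distributions = {
    "n_estimators": [50, 100, 200],
    "learning_rate": [0.01, 0.05, 0.1],
    "max_depth": [2, 3, 4],
}

# Sample 5 of the 27 possible configurations and score each by 3-fold CV MAE.
search = RandomizedSearchCV(
    GradientBoostingRegressor(random_state=0),
    param_distributions=param_distributions,
    n_iter=5,
    scoring="neg_mean_absolute_error",
    cv=3,
    random_state=0,
)
search.fit(X_train, y_train)

# Evaluate the refitted best model on the held-out rows.
test_mae = mean_absolute_error(y_test, search.predict(X_test))
# Reference point: always predicting the training mean.
baseline_mae = mean_absolute_error(y_test, np.full_like(y_test, y_train.mean()))
print(search.best_params_, round(test_mae, 3), round(baseline_mae, 3))
```

Scoring with `neg_mean_absolute_error` makes the search select on the same kind of metric the abstract reports; comparing against a constant-mean baseline gives a quick sanity check that the tuned model has learned something from the features.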

Published

28-09-2025

How to Cite

Chen, R., & Hu, Z. (2025). Research on Olympic Medal Prediction and Influencing Factors Based on Multi-Dimensional Feature Fusion and Machine Learning. Highlights in Science, Engineering and Technology, 155, 489-497. https://doi.org/10.54097/82y5eb28