Olympic Medal Count Prediction Research Based on A Hybrid ARIMA-Xgboost-Lightgbm Model

Authors

  • Zijun Du
  • Yirui Zheng

DOI:

https://doi.org/10.54097/gj0pmz16

Keywords:

Olympic medal prediction, ARIMA, XGBoost, LightGBM.

Abstract

This study proposes an integrated ARIMA–XGBoost–LightGBM model to predict both medal counts and medal-winning probabilities for the 2028 Los Angeles Olympic Games. The framework first applies an ARIMA model to forecast time-series features, such as the number of athletes and participating countries. These forecasts are then fed into an XGBoost regressor to estimate gold, silver, bronze, and total medal counts per country. In addition, a LightGBM classifier predicts the probability that a nation will win at least one medal. Under ten-fold cross-validation, R² exceeded 0.84 in every medal category, indicating accurate predictions that generalize well to held-out data. Based on the predicted probabilities, six countries are projected to win their first Olympic medals. By combining time-series forecasting with machine learning, the hybrid approach can support strategic sports planning and medal forecasting for complex, dynamic events such as the Olympics.
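
The evaluation protocol named above, ten-fold cross-validated R², can be sketched as follows. This is a minimal illustration and not the authors' code: the data are synthetic, the one-feature least-squares fit is a deliberately simple stand-in for the XGBoost regressor, and the names `fit_line`, `k_fold_r2`, `athletes`, and `medals` are hypothetical.

```python
import random


def r2_score(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_y = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_y) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


def fit_line(xs, ys):
    """Ordinary least squares for y = a + b*x.

    Assumption: a simple stand-in for the XGBoost regressor in the abstract.
    """
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    b = sxy / sxx
    return my - b * mx, b


def k_fold_r2(xs, ys, k=10, seed=0):
    """Return the R^2 measured on each of k held-out folds."""
    idx = list(range(len(xs)))
    random.Random(seed).shuffle(idx)
    folds = [idx[i::k] for i in range(k)]  # k disjoint, near-equal folds
    scores = []
    for held_out in folds:
        held = set(held_out)
        train = [i for i in idx if i not in held]
        # Fit on the other k-1 folds only, then score on the held-out fold.
        a, b = fit_line([xs[i] for i in train], [ys[i] for i in train])
        preds = [a + b * xs[i] for i in held_out]
        scores.append(r2_score([ys[i] for i in held_out], preds))
    return scores


# Synthetic data (hypothetical): medals roughly proportional to athletes sent.
rng = random.Random(42)
athletes = [rng.uniform(5, 600) for _ in range(200)]
medals = [0.1 * a + rng.gauss(0, 3) for a in athletes]

scores = k_fold_r2(athletes, medals, k=10)
mean_r2 = sum(scores) / len(scores)
print(f"mean R^2 over {len(scores)} folds: {mean_r2:.3f}")
```

Each fold is scored only on countries the model did not see during fitting, which is why a high cross-validated R² is read as evidence of generalization and not just of goodness of fit.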


Published

28-09-2025

How to Cite

Du, Z., & Zheng, Y. (2025). Olympic Medal Count Prediction Research Based on A Hybrid ARIMA-Xgboost-Lightgbm Model. Highlights in Science, Engineering and Technology, 155, 243-251. https://doi.org/10.54097/gj0pmz16