Predictive Model Construction Based on GA And K-Means Algorithm

Authors

  • Yongshuo Du
  • Bosheng Huang
  • Yunfei Hou

DOI:

https://doi.org/10.54097/xjtk4p55

Keywords:

Medal prediction model; genetic algorithm (GA); K-means cluster analysis; RCA index.

Abstract

This paper proposes a data trend prediction framework that combines a genetic algorithm (GA) with K-means cluster analysis, focusing on the synergistic application of computational intelligence algorithms to complex data modeling. First, the GA is used to construct prediction models for the total number of medals and the number of gold medals. Second, K-means cluster analysis groups the participating countries into discrete classes of competitive sports strength; these classes enter the regression equation as dummy variables and significantly improve the goodness of fit. In addition, the RCA index is introduced to quantify each country's comparative advantage in each sport and to analyze the relevance of the medal distribution. Predictive accuracy is validated with the root mean square error (RMSE), mean absolute percentage error (MAPE), and mean absolute error (MAE), and the contribution of coaches is quantified through residual analysis. The experimental results show that the GA regression model has the smallest medal prediction error and that the feature coding mechanism enhances the interpretability of the model, providing an efficient computational framework for Olympic performance prediction.
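
As an illustration of two quantities named in the abstract, the following minimal Python sketch computes an RCA index and the three error metrics. It is not the authors' implementation. It assumes that RCA refers to the Balassa-style revealed comparative advantage ratio (a country's medal share in a sport divided by that sport's share of all medals), which the abstract does not state explicitly. The function names, the country labels "A" and "B", and all numbers are hypothetical toy values.

```python
import math

def rca(medals, country, sport):
    """Balassa-style RCA (assumed form): country's share of its own medals
    won in `sport`, divided by the share of all medals awarded in `sport`."""
    country_total = sum(medals[country].values())
    sport_total = sum(row.get(sport, 0) for row in medals.values())
    world_total = sum(sum(row.values()) for row in medals.values())
    country_share = medals[country].get(sport, 0) / country_total
    world_share = sport_total / world_total
    return country_share / world_share

def rmse(actual, predicted):
    """Root mean square error."""
    return math.sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual))

def mae(actual, predicted):
    """Mean absolute error."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)

def mape(actual, predicted):
    """Mean absolute percentage error, in percent; assumes no actual value is zero."""
    return 100 * sum(abs((a - p) / a) for a, p in zip(actual, predicted)) / len(actual)

# Hypothetical medal table: country -> sport -> medal count.
toy_medals = {
    "A": {"swimming": 6, "judo": 2},
    "B": {"swimming": 2, "judo": 6},
}
rca_a_swimming = rca(toy_medals, "A", "swimming")  # (6/8) / (8/16) = 1.5

# Hypothetical actual vs. predicted medal totals for three countries.
actual = [10, 20, 40]
predicted = [12, 18, 40]
print(rca_a_swimming, rmse(actual, predicted), mae(actual, predicted), mape(actual, predicted))
```

Under this assumed definition, an RCA value above 1 indicates that a country wins a larger share of its medals in a sport than the field as a whole does. Note that MAPE is undefined for countries with zero actual medals, so such cases would need separate handling.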

Published

23-05-2025

How to Cite

Du, Y., Huang, B., & Hou, Y. (2025). Predictive Model Construction Based on GA And K-Means Algorithm. Highlights in Science, Engineering and Technology, 140, 358-364. https://doi.org/10.54097/xjtk4p55