Gaussian Process Regression Model Based on Random Sampling and Secondary Encoding Techniques
DOI:
https://doi.org/10.54097/trr45v06
Keywords:
Gaussian Process Regression, Random Sampling, Model Fusion.
Abstract
Gaussian Process Regression (GPR) is a flexible non-parametric method that is widely used in prediction tasks because it fits nonlinear functions well. However, the computational cost of exact GPR grows cubically with the sample size, since training requires factorizing an n × n kernel matrix, which limits its application to large-scale datasets. To address this issue, this paper proposes a GPR model based on the Stacking framework. The model has two core components. First, random sampling is used to draw multiple subsamples from the original dataset, and an independent GPR model is trained on each subsample. Because each subsample is relatively small, this strategy reduces the computational cost, and the submodels can be trained in parallel to improve efficiency further. Second, to reduce the performance variance among submodels, a model fusion mechanism is adopted: the predictions of the individual submodels are treated as new features, and a secondary GPR model is trained as a combiner to aggregate them. This two-layer design reduces the computational cost of GPR and improves the generalization capability of the predictive model through model fusion. Simulation experiments and real-world data analyses demonstrate that the proposed method has a clear competitive advantage over traditional regression models.
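The two-layer structure described in the abstract can be sketched as follows. This is a minimal illustration under stated assumptions, not the paper's implementation: the names `SimpleGPR`, `fit_stacked_gpr`, and `predict_stacked_gpr` are hypothetical, the RBF kernel and its fixed hyperparameters are placeholders (no hyperparameter optimization is performed), and the combiner is trained on a fresh random subsample whose rows may overlap with the submodels' training rows, a choice the abstract does not specify. The cost argument is that exact GPR on n points is O(n^3), whereas K submodels on m-point subsamples cost O(K m^3) with m much smaller than n.

```python
import numpy as np

def rbf_kernel(A, B, length_scale=1.0):
    # Squared Euclidean distances between rows of A and rows of B.
    d2 = (np.sum(A**2, axis=1)[:, None]
          + np.sum(B**2, axis=1)[None, :]
          - 2.0 * A @ B.T)
    return np.exp(-0.5 * np.maximum(d2, 0.0) / length_scale**2)

class SimpleGPR:
    """Exact GP regression with a fixed RBF kernel (assumed, not tuned)."""

    def __init__(self, length_scale=1.0, noise=1e-2):
        self.length_scale = length_scale
        self.noise = noise  # observation-noise variance added to the diagonal

    def fit(self, X, y):
        self.X = X
        self.y_mean = y.mean()
        K = rbf_kernel(X, X, self.length_scale) + self.noise * np.eye(len(X))
        L = np.linalg.cholesky(K)  # O(m^3) in the number of training rows m
        self.alpha = np.linalg.solve(L.T, np.linalg.solve(L, y - self.y_mean))
        return self

    def predict(self, X_new):
        # Posterior mean only; predictive variance is omitted for brevity.
        K_star = rbf_kernel(X_new, self.X, self.length_scale)
        return K_star @ self.alpha + self.y_mean

def fit_stacked_gpr(X, y, n_models=5, subsample_size=100, seed=0):
    rng = np.random.default_rng(seed)
    # Layer 1: one GPR per random subsample. The fits are independent,
    # so this loop could be run in parallel.
    base_models = []
    for _ in range(n_models):
        idx = rng.choice(len(X), size=subsample_size, replace=False)
        base_models.append(SimpleGPR(length_scale=1.0).fit(X[idx], y[idx]))
    # Layer 2: submodel predictions become the features of a combiner GPR.
    # Assumption: the combiner also sees only a random subsample.
    meta_idx = rng.choice(len(X), size=subsample_size, replace=False)
    Z = np.column_stack([m.predict(X[meta_idx]) for m in base_models])
    combiner = SimpleGPR(length_scale=2.0).fit(Z, y[meta_idx])
    return base_models, combiner

def predict_stacked_gpr(base_models, combiner, X_new):
    Z_new = np.column_stack([m.predict(X_new) for m in base_models])
    return combiner.predict(Z_new)

# Toy usage on synthetic data (illustrative only).
rng = np.random.default_rng(1)
X = rng.uniform(-3.0, 3.0, size=(600, 1))
y = np.sin(X[:, 0]) + 0.1 * rng.standard_normal(600)
base_models, combiner = fit_stacked_gpr(X, y)
X_test = np.linspace(-2.5, 2.5, 50)[:, None]
y_pred = predict_stacked_gpr(base_models, combiner, X_test)
```

In practice a stacking combiner is often trained on held-out or out-of-fold submodel predictions to limit overfitting; the sketch above skips that step to stay short.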
License
Copyright (c) 2025 Highlights in Science, Engineering and Technology

This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License.







