Predicting the spread of Chikungunya using an ensemble regression approach: A case study of Chad, Brazil and Paraguay

by Mohamed El Bachir¹, Ebenezer Maka Maka²˒³, Yannick Malong²˒³, Benjamin Garga⁴, Daouda Hassana Daouda¹, Hamadjam Abboubakar³˒⁵*

¹ University of Ngaoundéré, Faculty of Science, Department of Mathematics and Computer Science, P.O. Box 454, Ngaoundéré, Cameroon.
² University of Douala, ENSPD, Department of Computer Engineering and Telecommunications, P.O. Box 2701, Douala, Cameroon.
³ University of Douala, National Higher Polytechnic School of Douala, Laboratory of Computer Science, Data Science and Artificial Intelligence, Douala, Cameroon.
⁴ University of Ngaoundéré, ENSAI, Department of Electrical, Electronic and Automatic Engineering, P.O. Box 455, Ngaoundéré, Cameroon.
⁵ University of Ngaoundéré, School of Geology and Mining Engineering, P.O. Box 115, Meiganga, Cameroon.

*Corresponding author: [email protected]

Received: 19.11.2025         Accepted: 13.01.2026         Published online: 15.01.2026

The Chikungunya virus, transmitted primarily by female Aedes aegypti and Aedes albopictus mosquitoes, poses a growing global public health challenge because of its debilitating symptoms and rapid spread. Recent outbreaks in Southeast Asia, South America, and Central and East Africa highlight how difficult it is to predict epidemics accurately, given the complex interactions among environmental, climatic, and biological factors. Traditional epidemiological surveillance systems are often insufficient for early outbreak detection. This study applies ensemble regression, a machine learning technique, to develop predictive models of Chikungunya epidemics in Chad, Brazil, and Paraguay. Random Forest and XGBoost regressors, each optimized via Grid Search, are combined within a Voting Regressor ensemble. The ensemble achieved lower RMSE and MAE than either individual model. These improvements were not statistically significant at the 5% level, however: under the paired t-test and the Wilcoxon signed-rank test, the Voting Regressor did not differ significantly from XGBoost (p = 0.2126 and p = 0.2081, respectively) or from Random Forest (p = 0.2607 and p = 0.2997, respectively).
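
As a rough illustration of the evaluation described in the abstract, the sketch below shows how an unweighted voting ensemble averages the predictions of two regressors and how RMSE, MAE, and a paired t statistic on absolute errors can be computed. It is a minimal standard-library sketch, not the authors' pipeline. The case counts and the two prediction lists are made-up toy values, chosen so that the two models' errors partly cancel, and all function names are hypothetical. Equal weighting is an assumption, since the abstract does not state the ensemble weights. Comparing absolute errors is likewise an assumption about which paired quantities were tested.

```python
import math
import statistics


def voting_average(*prediction_lists):
    """Unweighted voting for regression: average the models' predictions per sample."""
    return [statistics.fmean(preds) for preds in zip(*prediction_lists)]


def rmse(y_true, y_pred):
    """Root mean squared error."""
    return math.sqrt(statistics.fmean((t - p) ** 2 for t, p in zip(y_true, y_pred)))


def mae(y_true, y_pred):
    """Mean absolute error."""
    return statistics.fmean(abs(t - p) for t, p in zip(y_true, y_pred))


def paired_t_statistic(y_true, pred_a, pred_b):
    """Paired t statistic on per-sample absolute errors of model A minus model B.

    Only the statistic is computed here; a p-value additionally needs the
    Student t distribution with n - 1 degrees of freedom.
    """
    diffs = [abs(t - a) - abs(t - b) for t, a, b in zip(y_true, pred_a, pred_b)]
    n = len(diffs)
    return statistics.fmean(diffs) / (statistics.stdev(diffs) / math.sqrt(n))


# Toy values for illustration only (NOT data or results from the study).
y_true = [10.0, 20.0, 30.0, 40.0]    # hypothetical observed case counts
rf_pred = [12.0, 18.0, 33.0, 37.0]   # hypothetical Random Forest predictions
xgb_pred = [9.0, 23.0, 28.0, 44.0]   # hypothetical XGBoost predictions

ens_pred = voting_average(rf_pred, xgb_pred)

for name, pred in [("RF", rf_pred), ("XGB", xgb_pred), ("Voting", ens_pred)]:
    print(f"{name}: RMSE={rmse(y_true, pred):.3f}  MAE={mae(y_true, pred):.3f}")

print("t (Voting vs RF):", round(paired_t_statistic(y_true, ens_pred, rf_pred), 3))
```

In practice, p-values for the two tests named in the abstract could be obtained with `scipy.stats.ttest_rel` and `scipy.stats.wilcoxon` applied to the paired per-sample errors. A lower ensemble error on a small sample, as in this toy example, does not by itself imply statistical significance.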
