Machine Learning Model for Predicting Rice Crop Yield: A Case Study in Hadejia and Auyo, Nigeria
DOI:
https://doi.org/10.56919/usci.2541.024
Keywords:
Gradient Boosting, FAO, MSE, RMSE, R²
Abstract
Study’s Excerpt:
• Machine learning crop yield prediction models can adapt to changing environmental conditions and provide real-time predictions for improved decision-making.
• Three sets of data were used for easy analysis, model training, and predictive tasks.
• The Random Forest algorithm predicted crop yield with about 95% accuracy (R² of 0.9529).
Full Abstract:
Accurate crop yield prediction is essential for addressing food security challenges, particularly in regions facing climatic variability and resource constraints. This study proposes a machine learning–based framework for rice yield prediction in Hadejia and Auyo, Jigawa State, Nigeria, by integrating soil properties, irrigation methods, water usage, fertilization practices, pest infestation data, and local weather variables. Four ensemble learning algorithms, Random Forest, Gradient Boosting, XGBoost, and LightGBM, were trained and evaluated using both a traditional 80/20 hold-out split and k-fold cross-validation to ensure robust performance assessment. Among these models, Random Forest achieved the highest predictive accuracy, recording an R² of 0.9529 and RMSE of 1.1118, demonstrating its effectiveness in capturing complex, non-linear interactions among agronomic factors. The proposed approach underscores the value of localized data, offering farmers, policymakers, and stakeholders a scalable decision-support tool for optimizing resource allocation, mitigating risks, and enhancing overall agricultural productivity. This research provides a practical roadmap for precision agriculture initiatives in Jigawa State and other regions with similar agroecological conditions by illustrating how comprehensive feature integration and ensemble-based machine learning can significantly improve yield forecasts.
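To make the evaluation terms in the abstract concrete, the following is a minimal sketch of how RMSE and R² are conventionally computed, and of how an 80/20 hold-out split and a k-fold split partition a dataset. It is not the authors' code: the function names, the seed, the choice of k = 5, and the yield values are all assumptions made for illustration, and no model from the study is reproduced here.

```python
import math
import random


def rmse(y_true, y_pred):
    """Root mean squared error: square root of the mean squared residual."""
    mse = sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)
    return math.sqrt(mse)


def r2_score(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_y = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_y) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


def holdout_split(n, test_fraction=0.2, seed=0):
    """Shuffle indices 0..n-1 and split them into (train, test) lists."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    n_test = round(n * test_fraction)
    return idx[n_test:], idx[:n_test]


def kfold_indices(n, k=5, seed=0):
    """Yield (train, validation) index lists for k roughly equal folds."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    folds = [idx[i::k] for i in range(k)]
    for i in range(k):
        train = [j for f, fold in enumerate(folds) if f != i for j in fold]
        yield train, folds[i]


# Made-up yield values (illustrative only, not the study's data).
y_true = [4.0, 5.0, 6.0, 7.0, 8.0]
y_pred = [4.5, 5.0, 5.5, 7.0, 8.5]

print("RMSE:", rmse(y_true, y_pred))
print("R2:", r2_score(y_true, y_pred))

train_idx, test_idx = holdout_split(10)
print("hold-out sizes:", len(train_idx), len(test_idx))
```

In a workflow like the one described, each candidate model would be fitted on the training indices and scored on the held-out indices with these metrics; under k-fold cross-validation the scores from each fold are typically averaged.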
License
Copyright (c) 2025 Inuwa Abdurrahman, Abubakar Miyim Muhammad

This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License.