Crop Prediction Based on Soil Classification using Machine Learning with Classifier Ensembling
David Johnson
Hongyang Sun
Globally, agriculture is a backbone of national economies and an active field of research. Soils come in many types, and each type has different characteristics that affect which crops it can support. A variety of methods and models are used in this field to increase yields. Crop growth depends largely on the macronutrient and micronutrient content of the soil and its pH, together with climatic conditions such as rainfall, humidity, and temperature. Farmers, however, are often unable to select the appropriate crops based on these environmental and soil factors, and manual prediction of suitable crops for a given piece of land has frequently failed. In this work, we use machine learning techniques to recommend crops based on soil classification, or soil series. We carry out a comparative analysis of several popular classification algorithms, namely K-Nearest Neighbors (KNN), Random Forest (RF), Decision Tree (DT), Support Vector Machine (SVM), Gaussian Naive Bayes (GNB), Gradient Boosting (GB), Extreme Gradient Boosting (XGBoost), and a Voting Ensemble classifier, to recommend the cultivable crop(s) most suitable for a particular piece of land given the characteristics of its soil and environment. To this end, we collected and preprocessed a large dataset of crop yield and environmental data from multiple sources. Our results show that the voting ensemble classifier outperforms the other classifiers in prediction accuracy, achieving 94.67%. Feature importance analysis reveals that weather conditions, such as temperature and rainfall, and fertilizer usage are the most critical factors in predicting crop yield.
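To illustrate the voting ensemble idea, the following is a minimal sketch that assumes hard (majority) voting over the labels predicted by several base classifiers. The abstract does not state whether hard or soft voting was used, so this is only one possible reading. The function names (`hard_vote`, `accuracy`), the tie-breaking rule, the crop labels, and the per-model predictions are all hypothetical and are not taken from the study's data.

```python
from collections import Counter


def hard_vote(model_predictions):
    """Combine per-model label predictions by majority vote.

    model_predictions: list of equal-length label lists, one per base model.
    Ties are broken in favor of the earliest model in the list (an assumption).
    """
    if not model_predictions:
        raise ValueError("need predictions from at least one model")
    n_samples = len(model_predictions[0])
    if any(len(p) != n_samples for p in model_predictions):
        raise ValueError("all models must predict the same number of samples")

    combined = []
    # zip(*...) regroups the predictions so each tuple holds one sample's votes.
    for votes in zip(*model_predictions):
        counts = Counter(votes)
        top = max(counts.values())
        # Pick the first label, in model order, that reaches the top count.
        winner = next(label for label in votes if counts[label] == top)
        combined.append(winner)
    return combined


def accuracy(predicted, actual):
    """Fraction of samples whose predicted label matches the true label."""
    return sum(p == a for p, a in zip(predicted, actual)) / len(actual)


# Hypothetical predictions from three base models on four soil samples.
knn_pred = ["rice", "maize", "wheat", "rice"]
rf_pred = ["rice", "wheat", "wheat", "jute"]
svm_pred = ["maize", "wheat", "wheat", "cotton"]
true_crop = ["rice", "wheat", "wheat", "jute"]  # hypothetical ground truth

ensemble_pred = hard_vote([knn_pred, rf_pred, svm_pred])
ensemble_acc = accuracy(ensemble_pred, true_crop)
```

In practice the per-model predictions would come from trained classifiers such as those compared in this work. A soft-voting variant would average predicted class probabilities instead of counting labels.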