Bibliographic Details
Title: |
Predicting Soil Textural Classes Using Random Forest Models: Learning from Imbalanced Dataset. |
Authors: |
Mallah, Sina; Delsouz Khaki, Bahareh; Davatgar, Naser; Scholten, Thomas; Amirian-Chakan, Alireza; Emadi, Mostafa; Kerry, Ruth; Mosavi, Amir Hosein; Taghizadeh-Mehrjardi, Ruhollah |
Source: |
Agronomy; Nov 2022, Vol. 12 Issue 11, p2613, 16p |
Subject Terms: |
RANDOM forest algorithms, DIGITAL soil mapping, MACHINE learning, SOILS, SOIL texture, SOIL mapping, LAND cover |
Geographic Terms: |
IRAN |
Abstract: |
Soil provides a key interface between the atmosphere and the lithosphere and plays an important role in food production, ecosystem services, and biodiversity. Recently, demand for applying machine learning (ML) methods to improve the knowledge and understanding of soil behavior has increased. Real-world datasets, however, are inherently imbalanced, which leads ML models to overestimate the majority classes and underestimate the minority ones. The aim of this study was to investigate the effects of imbalance in training data on the performance of a random forest (RF) model. The original (imbalanced) dataset included 6100 soil texture records from the surface layer of agricultural fields in northern Iran. A resampling approach using the synthetic minority oversampling technique (SMOTE) was employed to make a balanced dataset from the original data. Bioclimatic and remotely sensed data, distance, and terrain attributes were used as environmental covariates to model and map soil textural classes. Results based on mean minimal depth (MMD) showed that, when imbalanced data were used, distance and annual mean precipitation were important, but when balanced data were employed, terrain attributes and remotely sensed data played a key role in predicting soil texture. Balancing the data also raised the overall accuracy from 44% to 59% and the kappa value from 0.30 to 0.52. Similar increasing trends were observed for the recall and F-scores. It is concluded that, in modeling soil texture classes using RF models through a digital soil mapping approach, data should be balanced before modeling. [ABSTRACT FROM AUTHOR] |
Database: |
Complementary Index |
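
The two technical ideas named in the abstract, SMOTE resampling and the kappa statistic, can be illustrated with a minimal sketch. This is not the authors' code or data: it is a simplified, standard-library-only Python illustration, and the function names `smote_oversample` and `cohen_kappa`, the parameter `k`, and the toy two-feature samples are all assumptions made for the example. It assumes the usual SMOTE idea of creating a synthetic minority sample at a random point on the segment between a minority sample and one of its nearest same-class neighbours, and the usual definition kappa = (p_o - p_e) / (1 - p_e).

```python
import math
import random
from collections import Counter


def smote_oversample(samples, labels, k=5, seed=0):
    """Oversample every smaller class up to the size of the largest class.

    Simplified SMOTE-style sketch (hypothetical helper, not a library API).
    """
    rng = random.Random(seed)
    by_class = {}
    for x, y in zip(samples, labels):
        by_class.setdefault(y, []).append(list(x))
    target = max(len(rows) for rows in by_class.values())

    out_x, out_y = [list(x) for x in samples], list(labels)
    for y, rows in by_class.items():
        if len(rows) < 2:
            continue  # interpolation needs at least two points in the class
        for _ in range(target - len(rows)):
            base = rng.choice(rows)
            # k nearest same-class neighbours of the base point, excluding itself
            others = [r for r in rows if r is not base]
            neighbours = sorted(others, key=lambda r: math.dist(base, r))[:k]
            mate = rng.choice(neighbours)
            gap = rng.random()  # position along the segment base -> mate
            out_x.append([b + gap * (m - b) for b, m in zip(base, mate)])
            out_y.append(y)
    return out_x, out_y


def cohen_kappa(observed, predicted):
    """Cohen's kappa: agreement corrected for chance (undefined if p_e == 1)."""
    n = len(observed)
    p_o = sum(o == p for o, p in zip(observed, predicted)) / n
    obs_freq, pred_freq = Counter(observed), Counter(predicted)
    p_e = sum(obs_freq[c] * pred_freq[c] for c in obs_freq) / (n * n)
    return (p_o - p_e) / (1 - p_e)


# Toy, made-up covariate vectors and texture labels (illustration only).
X = [[0.0, 0.0], [0.1, 0.2], [0.2, 0.1], [0.3, 0.3], [1.0, 1.0], [1.2, 0.9]]
y = ["loam", "loam", "loam", "loam", "clay", "clay"]
X_bal, y_bal = smote_oversample(X, y, k=1)
print(Counter(y_bal))  # both classes now have 4 samples
print(cohen_kappa(["a", "a", "b", "b"], ["a", "a", "b", "a"]))
```

In a workflow like the one the abstract describes, resampling of this kind would be applied to the training data only, and the classifier would then be scored on untouched validation data with overall accuracy and kappa.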