Random Forest
Details
At each split, rather than searching over all \(p\) predictors, a random forest considers only a fresh random sample of \(m_{try}\) of them, usually set to \(m_{try} \approx \sqrt{p}\); the majority of the \(p\) predictors are therefore not even eligible at any given split. Because this de-correlates the individual trees, random forests with \(m_{try} \approx \sqrt{p}\) show an improvement over bagged trees, which correspond to the special case \(m_{try} = p\).
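The contrast between bagging and a random forest can be sketched with the `randomForest` package, where the `mtry` argument plays the role of \(m_{try}\) (the data here is simulated purely for illustration):

```r
library(randomForest)

# Illustrative data: p = 16 predictors, binary response
set.seed(1)
p <- 16
X <- as.data.frame(matrix(rnorm(500 * p), ncol = p))
y <- factor(rbinom(500, 1, 0.5))

# Bagging is the special case mtry = p: every split searches all predictors
bag.fit <- randomForest(X, y, mtry = p)

# A random forest de-correlates the trees by sampling a fresh subset of
# mtry = sqrt(p) predictors at each split
rf.fit <- randomForest(X, y, mtry = floor(sqrt(p)))
```

For classification, `randomForest` in fact defaults `mtry` to `floor(sqrt(p))`, so the last call only makes the rule explicit.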
Examples
# \donttest{
# Use the first 750 observations to keep the example quick
sample_data <- sample_data[1:750, ]
yvar <- "Loan.Type"  # response: type of loan taken
xvar <- c("sex", "married", "age", "havejob", "educ", "political.afl",
          "rural", "region", "fin.intermdiaries", "fin.knowldge", "income")
BchMk.RF <- RF_Model(sample_data, c(xvar, "networth"), yvar)
#> + Fold01: mtry= 2
#> - Fold01: mtry= 2
#> + Fold01: mtry= 7
#> - Fold01: mtry= 7
#> + Fold01: mtry=12
#> - Fold01: mtry=12
#> + Fold02: mtry= 2
#> - Fold02: mtry= 2
#> + Fold02: mtry= 7
#> - Fold02: mtry= 7
#> + Fold02: mtry=12
#> - Fold02: mtry=12
#> + Fold03: mtry= 2
#> - Fold03: mtry= 2
#> + Fold03: mtry= 7
#> - Fold03: mtry= 7
#> + Fold03: mtry=12
#> - Fold03: mtry=12
#> + Fold04: mtry= 2
#> - Fold04: mtry= 2
#> + Fold04: mtry= 7
#> - Fold04: mtry= 7
#> + Fold04: mtry=12
#> - Fold04: mtry=12
#> + Fold05: mtry= 2
#> - Fold05: mtry= 2
#> + Fold05: mtry= 7
#> - Fold05: mtry= 7
#> + Fold05: mtry=12
#> - Fold05: mtry=12
#> + Fold06: mtry= 2
#> - Fold06: mtry= 2
#> + Fold06: mtry= 7
#> - Fold06: mtry= 7
#> + Fold06: mtry=12
#> - Fold06: mtry=12
#> + Fold07: mtry= 2
#> - Fold07: mtry= 2
#> + Fold07: mtry= 7
#> - Fold07: mtry= 7
#> + Fold07: mtry=12
#> - Fold07: mtry=12
#> + Fold08: mtry= 2
#> - Fold08: mtry= 2
#> + Fold08: mtry= 7
#> - Fold08: mtry= 7
#> + Fold08: mtry=12
#> - Fold08: mtry=12
#> + Fold09: mtry= 2
#> - Fold09: mtry= 2
#> + Fold09: mtry= 7
#> - Fold09: mtry= 7
#> + Fold09: mtry=12
#> - Fold09: mtry=12
#> + Fold10: mtry= 2
#> - Fold10: mtry= 2
#> + Fold10: mtry= 7
#> - Fold10: mtry= 7
#> + Fold10: mtry=12
#> - Fold10: mtry=12
#> Aggregating results
#> Selecting tuning parameters
#> Fitting mtry = 2 on full training set
BchMk.RF
#> Random Forest
#>
#> 601 samples
#> 12 predictor
#> 4 classes: 'No.Loan', 'Formal', 'Informal', 'L.Both'
#>
#> Pre-processing: centered (12), scaled (12)
#> Resampling: Cross-Validated (10 fold)
#> Summary of sample sizes: 539, 542, 541, 541, 540, 541, ...
#> Resampling results across tuning parameters:
#>
#> mtry Accuracy Kappa
#> 2 0.7638425 0.2074747
#> 7 0.7455337 0.2130976
#> 12 0.7438671 0.2255766
#>
#> Accuracy was used to select the optimal model using the largest value.
#> The final value used for the model was mtry = 2.
# }