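predict(gbmFit1, newdata = testx)[1:5]
To compare different models, a second model can be built with bagged decision trees and named gbmFit2.
gbmFit2 = train(trainx, trainy, method = "treebag", trControl = fitControl)
Another way to obtain predictions is the extractPrediction function; part of its output is shown below.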
models = list(gbmFit1, gbmFit2)
predValues = extractPrediction(models,testX = testx, testY = testy)
head(predValues)
obs pred model dataType object
1 Active Active gbm Training Object1
2 Active Active gbm Training Object1
3 Active Inactive gbm Training Object1
4 Active Active gbm Training Object1
5 Active Active gbm Training Object1
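6 Active Active gbm Training Object1
The predictions for the test samples can then be extracted from this result.
testValues = subset(predValues, dataType == "Test")
To obtain predicted class probabilities instead, use the extractProb function.
probValues = extractProb(models, testX = testx, testY = testy)
testProbs = subset(probValues, dataType == "Test")
When evaluating a classifier, the most important step is to examine the confusion matrix of its predictions.
Pred1 = subset(testValues, model == "gbm")
Pred2 = subset(testValues, model == "treebag")
confusionMatrix(Pred1$pred, Pred1$obs)
confusionMatrix(Pred2$pred, Pred2$obs)
The results are shown below; the first model's accuracy (0.8397) is slightly higher than the second's (0.8244).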
Reference
Prediction Active Inactive
Active 65 12
Inactive 9 45
Accuracy : 0.8397
Reference
Prediction Active Inactive
Active 63 12
Inactive 11 45
Accuracy : 0.8244
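The accuracy figures above can be reproduced directly from the cell counts. The following is a minimal base R sketch with the counts copied from the two confusion matrices; the names cm1, cm2 and accuracy are illustrative helpers, not part of caret.

```r
# Counts copied from the two confusion matrices above
# (rows = prediction, columns = reference). matrix() fills column by column.
cm1 <- matrix(c(65, 9, 12, 45), nrow = 2,
              dimnames = list(Prediction = c("Active", "Inactive"),
                              Reference  = c("Active", "Inactive")))
cm2 <- matrix(c(63, 11, 12, 45), nrow = 2,
              dimnames = list(Prediction = c("Active", "Inactive"),
                              Reference  = c("Active", "Inactive")))

# Accuracy = correctly classified cases (the diagonal) / all cases.
# 'accuracy' is a hypothetical helper written for this example.
accuracy <- function(cm) sum(diag(cm)) / sum(cm)

round(accuracy(cm1), 4)  # gbm:     110 / 131 = 0.8397
round(accuracy(cm2), 4)  # treebag: 108 / 131 = 0.8244
```

Both models are scored on the same 131 test samples, so the gap between them comes down to two additional correct predictions by the first model.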
Finally, the ROCR package is used to plot the ROC curve (shown here for the first model).
prob1 = subset(testProbs, model == "gbm")
prob2 = subset(testProbs, model == "treebag")
library(ROCR)
prob1$label = ifelse(prob1$obs == 'Active', 1, 0)
pred1 = prediction(prob1$Active, prob1$label)
perf1 = performance(pred1, measure="tpr", x.measure="fpr" )
plot( perf1 )
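To make clear what the "tpr" versus "fpr" curve represents, here is a rough base R sketch that builds a ROC curve by hand. The scores and labels are toy values invented for illustration, not output from gbmFit1, and the variable names are assumptions of this example.

```r
# Toy data, invented for illustration only: predicted probability of the
# positive class and the true 0/1 label for eight cases.
score <- c(0.9, 0.8, 0.7, 0.6, 0.55, 0.4, 0.3, 0.1)
label <- c(1,   1,   0,   1,   1,    0,   0,   0)

# Sweep thresholds from high to low; a case is predicted positive
# when its score is at or above the threshold.
thresholds <- c(Inf, sort(unique(score), decreasing = TRUE))
tpr <- sapply(thresholds, function(t) sum(score >= t & label == 1) / sum(label == 1))
fpr <- sapply(thresholds, function(t) sum(score >= t & label == 0) / sum(label == 0))

# Area under the curve by the trapezoidal rule.
auc <- sum(diff(fpr) * (head(tpr, -1) + tail(tpr, -1)) / 2)

plot(fpr, tpr, type = "l",
     xlab = "False positive rate", ylab = "True positive rate")
abline(0, 1, lty = 2)  # reference line for a random classifier
```

With ROCR itself, the area under the curve should be obtainable from the same prediction object via performance(pred1, measure = "auc"), which would allow the two models to be compared with a single number.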