I am trying to run a Partial Least Squares Discriminant Analysis (PLSDA) in R, followed by a prediction and confusion matrix, using the mdatools package. The mdatools::plsda function works normally, as does stats::predict, but I have been experiencing difficulties with caret::confusionMatrix.
Here is a reproducible example:
```r
library(tidyverse)
library(mdatools)
library(caret)

# Get data
data(iris)

# Sampling 50/50 for train and valid
set.seed(2000)
Train <- iris %>% group_by(Species) %>% sample_frac(.5, replace = FALSE)
Valid <- anti_join(iris, Train)

# Separating categorical variable from numerical variables
PLSDA_Response = as.matrix(Train[,5])
PLSDA_Predictors = as.matrix(Train[,c(1:4)])
Predict_Response = as.matrix(Valid[,5])
Predict_Predictors = as.matrix(Valid[,c(1:4)])

# Run PLS-DA
pls_comp <- mdatools::plsda(PLSDA_Predictors, as.factor(PLSDA_Response), ncomp = 1)

# Run the prediction
pls_comp_predict <- stats::predict(pls_comp, Predict_Predictors)

# Run the Confusion Matrix
pls_CM <- caret::confusionMatrix(pls_comp_predict, as.factor(Predict_Predictors))
```
The message I receive is: Error: `data` and `reference` should be factors with the same levels.
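As a minimal sketch of what this message seems to ask for: both arguments have to be factors, and both have to carry the same set of levels. In the call above, `as.factor(Predict_Predictors)` turns the numeric measurements into a factor, so its levels are numbers rather than species names, and `pls_comp_predict` is probably a results object rather than a factor (an assumption worth checking with `class(pls_comp_predict)`). The toy vectors below are hypothetical and not taken from the model; they only illustrate the level-matching requirement using base R.

```r
# Hypothetical toy data, standing in for true and predicted class labels
reference <- factor(c("setosa", "versicolor", "virginica", "setosa"))
predicted_chr <- c("setosa", "virginica", "virginica", "setosa")

# Build the prediction factor with exactly the levels of the reference,
# even though "versicolor" never appears among the predictions
predicted <- factor(predicted_chr, levels = levels(reference))

# The two conditions named in the error message
stopifnot(is.factor(predicted), is.factor(reference))
stopifnot(identical(levels(predicted), levels(reference)))

# Base-R cross-tabulation of predictions against the reference;
# two factors prepared this way are the kind of input that
# caret::confusionMatrix(predicted, reference) expects
cm <- table(Prediction = predicted, Reference = reference)
print(cm)
```

Setting `levels = levels(reference)` explicitly matters because a class that is never predicted would otherwise be dropped from the prediction factor, and the two level sets would no longer match.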
Can someone help me, please? Thanks! 🙂