I have a data frame containing feature variables, query IDs, and relevancy labels, and I want to use it to train a rank:ndcg model in XGBoost. Building the DMatrix fails with the error below. I've checked the data: every value is numeric, with no NAs or non-numeric entries, so the problem looks like it lies with the query IDs. I have one query ID per row, so the number of query IDs equals the number of rows. If I pass only the top query ID, xgb.DMatrix does not raise the error. The code is below; my full data frame has 20 columns and 201000 rows:
<code>library(dplyr)
library(xgboost)

model_nh <- model_nh %>% select(race_ID, result_relevancy, var1, var2, var3, var4)
# race_ID needs to be in ascending order so rows of the same query are contiguous
model_nh <- model_nh %>% arrange(race_ID)
feature_cols <- c("var1", "var2", "var3", "var4")
X <- as.matrix(model_nh[, feature_cols])
train_label <- model_nh$result_relevancy
query_ids <- model_nh$race_ID
dtrain <- xgb.DMatrix(data = X, label = train_label, group = query_ids)
# Output of the call above:
# Error in setinfo.xgb.DMatrix(dmat, names(p), p[[1]]) :
#   The sum of groups must equal to the number of rows in the input data
</code>
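One possible reading of the error message, sketched on toy data: `group` appears to expect the number of rows in each query group (one entry per query), not one ID per row. If so, the sum of the values passed would have to equal `nrow(X)`, which the raw `race_ID` column will not satisfy. The `toy` data frame and its values below are hypothetical stand-ins for `model_nh`, and the sketch assumes the rows are already sorted by `race_ID`.

```r
# Hypothetical stand-in for model_nh, already sorted by race_ID
toy <- data.frame(
  race_ID          = c(101, 101, 101, 102, 102, 103),
  result_relevancy = c(2, 1, 0, 1, 0, 0),
  var1             = c(0.5, 0.1, 0.3, 0.9, 0.2, 0.4)
)

# Passing the IDs themselves: their sum is unrelated to the row count
ids_sum_matches <- sum(toy$race_ID) == nrow(toy)

# Run lengths of the sorted ID column give one size per query group
group_sizes <- rle(toy$race_ID)$lengths
sizes_sum_matches <- sum(group_sizes) == nrow(toy)

# Only attempt the DMatrix if xgboost is installed
if (requireNamespace("xgboost", quietly = TRUE)) {
  dtoy <- xgboost::xgb.DMatrix(
    data  = as.matrix(toy[, "var1", drop = FALSE]),
    label = toy$result_relevancy,
    group = group_sizes
  )
}
```

Applied to the real data, the same idea would be `group = rle(model_nh$race_ID)$lengths` after the `arrange(race_ID)` step. That reading would also fit the error disappearing for a single value, if that value happens to describe one group covering all the rows supplied.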