ggplot2 - R xgboost importance plot with many features


I am trying out the Kaggle housing prices challenge: https://www.kaggle.com/c/house-prices-advanced-regression-techniques

Here is the script I wrote:

library(Matrix)   # sparse.model.matrix
library(xgboost)

train    <- read.csv("train.csv")
train$Id <- NULL

# Keep rows with NAs while building the model matrix, then restore the old setting.
previous_na_action <- options('na.action')
options(na.action = 'na.pass')
sparse_matrix <- sparse.model.matrix(SalePrice ~ . - 1, data = train)
options(na.action = previous_na_action$na.action)

# "reg:linear" was renamed "reg:squarederror" in newer xgboost releases.
model <- xgboost(data = sparse_matrix, label = train$SalePrice, missing = NA,
                 max.depth = 6, eta = 0.3, nthread = 4, nrounds = 16,
                 verbose = 2, objective = "reg:linear")

importance <- xgb.importance(feature_names = sparse_matrix@dimnames[[2]], model = model)
print(xgb.plot.importance(importance_matrix = importance))

The data has over 70 features, and I used xgboost with max.depth = 6 and nrounds = 16.

The importance plot is getting messed up because every feature is drawn. How do I view only the top 5 features or so?


Check out the top_n argument of xgb.plot.importance. It does exactly what you want.

# Plot only the top 5 most important variables.
print(xgb.plot.importance(importance_matrix = importance, top_n = 5))

Edit: top_n is only available in the development version of xgboost. An alternative method is to subset the importance table yourself before plotting:

print(xgb.plot.importance(importance_matrix = importance[1:5])) 
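If you would rather not rely on the row order of the importance table, the subsetting step can be made explicit. The following is only a sketch in base R: top_features is a hypothetical helper (not part of xgboost), and the toy table uses made-up feature names and Gain values purely to show the shape of the data. It assumes the table has Feature and Gain columns, as the one returned by xgb.importance does.

```r
# Hypothetical helper (not an xgboost function): keep the n rows with the largest Gain.
top_features <- function(importance, n = 5) {
  imp <- as.data.frame(importance)
  imp <- imp[order(imp$Gain, decreasing = TRUE), , drop = FALSE]
  head(imp, n)
}

# Toy importance table with made-up values, standing in for xgb.importance() output.
toy <- data.frame(
  Feature = c("a", "b", "c", "d", "e", "f", "g"),
  Gain    = c(0.05, 0.30, 0.10, 0.02, 0.25, 0.08, 0.20),
  stringsAsFactors = FALSE
)

top5 <- top_features(toy, 5)
print(top5$Feature)

# Horizontal bar chart of the selected rows; reversed so the largest bar is drawn on top.
barplot(rev(top5$Gain), names.arg = rev(top5$Feature),
        horiz = TRUE, las = 1, xlab = "Gain")
```

With the real model you would pass importance instead of toy. The base barplot call is used here only so the sketch does not depend on any particular xgboost version.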
