Beginner experimenting with LGBM here. I have code that looks like this:

    import lightgbm as lgb
    import matplotlib.pyplot as plt

    clf = lgb.LGBMClassifier(max_depth=3, verbosity=-1, n_estimators=3)
    clf.fit(train_data[features], train_data['y'], sample_weight=train_data['weight'])
    print(f"I have {clf.n_estimators_} estimators")  # this reports 3

    fig, ax = plt.subplots(nrows=4, figsize=(50, 36), sharex=True)
    lgb.plot_tree(clf, tree_index=7, dpi=600, ax=ax[0])  # why is there a 7th tree?
    lgb.plot_tree(clf, tree_index=8, dpi=600, ax=ax[1])  # why is there an 8th tree?
    # lgb.plot_tree(clf, tree_index=9, dpi=600, ax=ax[2])   # crashes
    # lgb.plot_tree(clf, tree_index=10, dpi=600, ax=ax[3])  # crashes
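I can at least sidestep the crashes by asking the fitted booster how many trees it actually holds, instead of hard-coding indices. A small sketch of what I mean, reusing the same clf as above:

    # num_trees() counts every individual tree in the model; for me it comes
    # out as 9, so tree_index 0..8 plot fine and 9+ raise an error.
    n_trees = clf.booster_.num_trees()
    print(f"booster holds {n_trees} trees")

    fig, axes = plt.subplots(nrows=n_trees, figsize=(50, 9 * n_trees))
    for i in range(n_trees):
        lgb.plot_tree(clf, tree_index=i, dpi=600, ax=axes[i])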
I am surprised that despite n_estimators=3, I seem to have 9 trees. How do I actually set the number of trees, and, related to that, what does n_estimators do? I've read the docs, and I thought it would be the number of trees, but it seems to be something else.
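One hypothesis I'd like to rule in or out: if my target has 3 classes, that would explain 9 = 3 rounds × 3 classes, since (as far as I understand) LightGBM trains one tree per class per boosting round for multiclass objectives, so n_estimators would count rounds rather than individual trees. Here's the check I ran against the fitted classifier's attributes:

    # If trees-per-round == n_classes, then n_estimators counts boosting
    # rounds, not individual trees.
    print(clf.n_classes_)            # number of classes in y
    print(clf.n_estimators)          # 3, the value I passed in
    print(clf.booster_.num_trees())  # 9 == 3 * 3?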
Separately, how do I interpret the individual trees and their ordering (0, 1, 2, ...)? I know random forests, where every tree is equally important. In boosting, my mental model is that the first tree matters most and each subsequent tree contributes a smaller correction. So, looking at the tree diagrams, how can I "simulate" LightGBM's inference process in my head?
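To make the question concrete, here is the kind of manual replay I have in mind, sketched with Booster.predict's start_iteration/num_iteration arguments. I'm assuming the rounds' raw scores simply add up, and that a softmax maps them to probabilities in the multiclass case; please correct me if that's not how it actually works:

    import numpy as np

    booster = clf.booster_
    X = train_data[features].head(5)  # a few rows to sanity-check

    # Raw (pre-link-function) score of each boosting round in isolation.
    per_round = [
        booster.predict(X, start_iteration=i, num_iteration=1, raw_score=True)
        for i in range(booster.current_iteration())
    ]

    # My guess: inference just sums the rounds' raw scores...
    manual_raw = np.sum(per_round, axis=0)
    print(np.allclose(manual_raw, booster.predict(X, raw_score=True)))  # True?

    # ...and, for a multiclass model, a softmax turns the per-class totals
    # into the probabilities that predict_proba reports.
    manual_proba = np.exp(manual_raw) / np.exp(manual_raw).sum(axis=1, keepdims=True)
    print(np.allclose(manual_proba, clf.predict_proba(X)))  # True?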