15 Nov Gradient Boosting of Decision Trees
The 1811 release of SAP IBP (Integrated Business Planning) has some amazing new features, and near the top of my list of favorites sits the new Gradient Boosting forecasting algorithm. This new machine learning algorithm provides the next step into advanced forecasting techniques in IBP.
Technically, its full name is Gradient Boosting of Decision Trees. Decision trees are a machine learning method that determines relationships through a hierarchical set of decision points, creating a tree structure (like the one you see in this handy-dandy graphic, which can be used to decide whether you should go play outside or stay in and binge-watch whatever Netflix recommends).
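To make the idea concrete, here is a minimal Python sketch of such a tree. It does not reproduce the graphic's actual splits: the questions, the 5 degree threshold, and the names Node, Leaf, decide, and TREE are all assumptions made up for illustration.

```python
from dataclasses import dataclass
from typing import Callable, Union


@dataclass
class Leaf:
    decision: str  # the answer reached at the bottom of the tree


@dataclass
class Node:
    question: str                 # human-readable label for the decision point
    test: Callable[[dict], bool]  # returns True to take the "yes" branch
    if_yes: Union["Node", Leaf]
    if_no: Union["Node", Leaf]


def decide(tree: Union[Node, Leaf], day: dict) -> str:
    """Follow yes/no answers from the root until a leaf is reached."""
    while isinstance(tree, Node):
        tree = tree.if_yes if tree.test(day) else tree.if_no
    return tree.decision


# Hypothetical splits, invented for illustration only.
TREE = Node(
    "Is it raining?", lambda d: d["raining"],
    Leaf("stay in"),
    Node(
        "Is it colder than 5 degrees C?", lambda d: d["temperature_c"] < 5,
        Leaf("stay in"),
        Leaf("play outside"),
    ),
)

print(decide(TREE, {"raining": False, "temperature_c": 18}))  # play outside
```

In this sketch the questions are written by hand; a learning algorithm's job is to pick the questions and thresholds from the data instead.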
The power of the algorithm comes from the boosting: an iterative process that adds trees one at a time, each new tree fitted to the errors left by the trees before it, with the train and test datasets used to fit the model and to check its error.
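The iterative idea can be sketched in a few lines of plain Python. This is a generic illustration of gradient boosting for squared error, not SAP IBP's implementation: it uses one-split trees ("stumps") on a single input, and the names fit_stump, boost, and mse, plus the toy data, are assumptions. For squared error, the residuals each tree is fitted to are proportional to the negative gradient of the loss, which is where the "gradient" in the name comes from.

```python
def fit_stump(xs, residuals):
    """Fit a one-split tree (a "stump") that best predicts the residuals."""
    best = None
    for threshold in sorted(set(xs))[:-1]:  # skip the max so both sides are non-empty
        left = [r for x, r in zip(xs, residuals) if x <= threshold]
        right = [r for x, r in zip(xs, residuals) if x > threshold]
        left_mean, right_mean = sum(left) / len(left), sum(right) / len(right)
        sse = (sum((r - left_mean) ** 2 for r in left)
               + sum((r - right_mean) ** 2 for r in right))
        if best is None or sse < best[0]:
            best = (sse, threshold, left_mean, right_mean)
    _, threshold, left_mean, right_mean = best
    return lambda x: left_mean if x <= threshold else right_mean


def boost(xs, ys, n_trees, learning_rate=0.1):
    """Add stumps one at a time, each fitted to the current residuals."""
    base = sum(ys) / len(ys)  # start from a constant prediction
    stumps, preds = [], [base] * len(ys)
    for _ in range(n_trees):
        residuals = [y - p for y, p in zip(ys, preds)]
        stump = fit_stump(xs, residuals)
        stumps.append(stump)
        # Take only a small step toward the residuals (the learning rate).
        preds = [p + learning_rate * stump(x) for x, p in zip(xs, preds)]
    return lambda x: base + learning_rate * sum(s(x) for s in stumps)


def mse(model, xs, ys):
    """Mean squared error of a model over a dataset."""
    return sum((y - model(x)) ** 2 for x, y in zip(xs, ys)) / len(ys)


# Toy non-linear data (made up): y = x squared.
xs = list(range(1, 21))
ys = [x ** 2 for x in xs]
for n in (1, 10, 100):
    print(n, "trees -> training MSE", round(mse(boost(xs, ys, n), xs, ys), 1))
```

The training error falls as trees are added. That is also why a separate test dataset matters: error on the training data keeps shrinking even after the model has started to overfit.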
Now that you’re all honorary Decision Tree Experts, let’s take a look at my top 3 takeaways from this new feature:
1. It’s (Still) Considered the Best
Despite having roots in the boosting methods of the mid-1990s, Gradient Boosting is still a go-to algorithm in data science competitions. Just like regression techniques, it lets you model relationships between variables, but one of its greatest benefits is the ability to model complex non-linear relationships without having to transform the data. This means you can say goodbye to taking the natural log in a linear regression model and instead let the algorithm determine the relationship. Don’t quite believe me? An empirical comparison of different learning algorithms found that…
“With excellent performance on all eight metrics, calibrated boosted trees were the best learning algorithm overall. Random forests are close second.” (Caruana et al., 2005)
Which leads into the next takeaway…
2. Calibration Takes Time
The key word in the study’s results is calibrated. The algorithm is only the best when it is properly calibrated. Even though it has only 3 calibration parameters (Maximum Number of Trees, Maximum Tree Depth, and Learning Rate), don’t expect to get the best results on the first try. Calibration takes time, and to achieve the best results it needs to be done for each dataset. This means finding the best set of parameters for every planning object or group of planning objects. You’ll find yourself, like Goldilocks, moving from one set of parameters to the next until you find the set that is just right, and then you get to begin again with another dataset. When you take the time to tune the Gradient Boosting algorithm, find someone who understands how it works to save some time, because overfitting the data is a very common and frustrating problem.
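Outside of IBP, the Goldilocks search can be written as a simple grid search: try every combination of the three parameters and keep the one with the lowest error on data the model has not seen. The sketch below is plain Python under stated assumptions, not IBP's tuning procedure: the data is synthetic, the tree learner is a bare-bones one, every function name is hypothetical, and max_trees is used as an exact tree count with no early stopping.

```python
import math
import random
from itertools import product


def build_tree(xs, ys, depth):
    """Grow a regression tree; a leaf is a float, a split is (threshold, left, right)."""
    mean = sum(ys) / len(ys)
    if depth == 0 or len(set(xs)) < 2:
        return mean
    best = None
    for t in sorted(set(xs))[:-1]:  # skip the max so both sides are non-empty
        left = [(x, y) for x, y in zip(xs, ys) if x <= t]
        right = [(x, y) for x, y in zip(xs, ys) if x > t]
        lm = sum(y for _, y in left) / len(left)
        rm = sum(y for _, y in right) / len(right)
        sse = sum((y - lm) ** 2 for _, y in left) + sum((y - rm) ** 2 for _, y in right)
        if best is None or sse < best[0]:
            best = (sse, t, left, right)
    _, t, left, right = best
    return (t,
            build_tree([x for x, _ in left], [y for _, y in left], depth - 1),
            build_tree([x for x, _ in right], [y for _, y in right], depth - 1))


def tree_predict(tree, x):
    """Walk down the splits until a leaf value is reached."""
    while isinstance(tree, tuple):
        threshold, left, right = tree
        tree = left if x <= threshold else right
    return tree


def fit_boosted(xs, ys, max_trees, max_depth, learning_rate):
    """Boost trees of the given depth, each fitted to the current residuals."""
    base = sum(ys) / len(ys)
    trees, preds = [], [base] * len(ys)
    for _ in range(max_trees):
        residuals = [y - p for y, p in zip(ys, preds)]
        tree = build_tree(xs, residuals, max_depth)
        trees.append(tree)
        preds = [p + learning_rate * tree_predict(tree, x) for x, p in zip(xs, preds)]
    return lambda x: base + learning_rate * sum(tree_predict(t, x) for t in trees)


def mse(model, xs, ys):
    return sum((y - model(x)) ** 2 for x, y in zip(xs, ys)) / len(ys)


def grid_search(train, test, grid):
    """Try every combination; return (test_mse, max_trees, max_depth, learning_rate)."""
    results = []
    for max_trees, max_depth, rate in product(
            grid["max_trees"], grid["max_depth"], grid["learning_rate"]):
        model = fit_boosted(*train, max_trees, max_depth, rate)
        results.append((mse(model, *test), max_trees, max_depth, rate))
    return min(results)  # lowest held-out error wins


# Synthetic monthly demand (made up): a seasonal pattern plus noise.
rng = random.Random(0)
months = [t % 12 for t in range(48)]
demand = [100 + 20 * math.sin(2 * math.pi * m / 12) + rng.gauss(0, 5) for m in months]
train = (months[:36], demand[:36])  # first three years
test = (months[36:], demand[36:])   # final year held out

GRID = {"max_trees": [10, 50, 100], "max_depth": [1, 2, 3],
        "learning_rate": [0.05, 0.1, 0.3]}
print(grid_search(train, test, GRID))
```

Scoring each combination on the held-out year rather than on the training data is what guards against overfitting, and the whole search has to be repeated for each new dataset.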
3. It’s a Big Step Forward
The release of this new model gives you access to an advanced forecasting technique that normally requires programming in Python or R. By eliminating the programming barrier to entry for both the gradient boosting algorithm and time series analysis, SAP IBP has taken a step toward delivering a much more robust forecasting and demand planning tool. I look forward to what future releases bring to the demand planning application.