Forecasting products where demand is intermittent (ID) has long been a problem for forecasters.
The reason is that when the historic record contains many periods with zero demand, it is difficult for forecasting algorithms to pick up the demand signal needed to drive the forecast. This is a non-trivial problem.
There are many businesses where this type of demand IS their business – companies that supply spare parts, for instance. But many ‘normal’ businesses are encountering this problem more and more as they move to more frequent and more granular forecasts in order to drive supply chain efficiencies.
For example, moving from monthly to weekly forecasting will help ensure that products are resupplied at the right time in the month, but introducing weekly buckets will also increase the number of periods with zero demand in the historic record. The same applies where companies move from forecasting demand in aggregate to forecasting by customer. As a result, it is not uncommon to find upwards of a third of a product portfolio having ID characteristics.
Unsurprisingly this issue is attracting increasing attention in the academic community, but although the array of available tools for forecasting ID is increasing, there is very little consensus about what is the right approach to use and in what circumstances.
Part of the reason for this confusion is that in these circumstances it is very difficult to measure how good a forecast actually is.
Why? There are two basic problems:
Reason Number 1: The denominator problem.
Most commonly used forecast error metrics consist of a numerator – a measure of error – and a denominator. For example, the most common metric, MAPE, is calculated by taking the error in each period (expressed in absolute terms), dividing it by the actual demand for that period, and expressing the result as a percentage.
This presents an obvious problem: with ID the denominator is often zero, so the resulting MAPE will bounce around depending on whether there is demand in that period and will often be impossible to compute at all (division by zero).
One approach to this is to use SMAPE instead. SMAPE uses the average of the actual and the forecast as the denominator, which deals with the infinity problem but can still be a very volatile metric that is difficult to interpret and use.
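To make the denominator problem concrete, here is a minimal Python sketch of the per-period calculations described above. The demand and forecast values are made up for illustration, and the function names `ape` and `sape` are illustrative labels rather than any library's API.

```python
import math

def ape(actual, forecast):
    """Absolute percentage error for one period (the building block of MAPE)."""
    if actual == 0:
        return math.nan  # division by zero: no meaningful value
    return abs(actual - forecast) / abs(actual) * 100

def sape(actual, forecast):
    """Symmetric version: the denominator is the average of actual and forecast."""
    denominator = (abs(actual) + abs(forecast)) / 2
    if denominator == 0:
        return math.nan  # actual and forecast both zero: 0/0
    return abs(actual - forecast) / denominator * 100

# Hypothetical intermittent series with a flat forecast of 2 per period
demand = [0, 0, 10, 0, 0]
forecast = [2, 2, 2, 2, 2]

ape_values = [ape(a, f) for a, f in zip(demand, forecast)]
sape_values = [sape(a, f) for a, f in zip(demand, forecast)]

print(ape_values)   # undefined in every zero-demand period
print(sape_values)  # computable, but pinned at 200 in every zero-demand period
```

In this example the symmetric version is always computable, but it takes its maximum value of 200% in every period where demand is zero and the forecast is not, which is one reason it is hard to interpret for ID products.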
A further approach uses a newer metric: MASE, which stands for Mean Absolute Scaled Error. This metric uses the average naïve error as the denominator. Because a naïve forecast uses the actual demand in one period as the forecast for the subsequent period, the naïve forecast error reflects the volatility of demand. Scaling by it makes the resulting error metric less volatile and more meaningful.
MASE is not without its own problems. Not least, from a practical point of view it is awkward to have one type of metric for ID products and another for non-ID products, particularly when the boundary between them is blurred and changeable.
But there is a bigger problem, which is rarely discussed in academic circles because, I suspect, it is a bit embarrassing: we don’t know how to measure error properly in the first place!
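As a rough illustration of the idea, the sketch below scales the mean absolute error of a forecast by the mean absolute error of a one-step naïve forecast over the historic record. The numbers are invented, and the function name `mase` and its argument names are assumptions made for this example.

```python
def mase(actuals, forecasts, history):
    """Mean absolute error of the forecast divided by the mean absolute
    error of a one-step naive forecast over the historic record."""
    naive_errors = [abs(history[t] - history[t - 1]) for t in range(1, len(history))]
    naive_mae = sum(naive_errors) / len(naive_errors)
    if naive_mae == 0:
        raise ValueError("naive error is zero, so the scaled error is undefined")
    mae = sum(abs(a - f) for a, f in zip(actuals, forecasts)) / len(actuals)
    return mae / naive_mae

# Hypothetical data: historic record, then four periods being evaluated
history = [0, 4, 0, 0, 4]
actuals = [0, 0, 6, 0]
forecasts = [1.5, 1.5, 1.5, 1.5]

score = mase(actuals, forecasts, history)
print(score)  # below 1 means a smaller average error than the naive forecast
```

The denominator is only zero when the historic record never changes from one period to the next, so the metric remains computable for series with many zero-demand periods.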
Reason Number 2: The numerator problem.
The root of this problem is that forecasting algorithms cannot forecast in which period demand will occur; they all attempt to forecast the average level of demand. In the case of ID the forecasts will be very wrong in those periods when there is zero demand. But so long as over time they are right on average everything will be fine because positive and negative errors will cancel each other out; right?
Look at this pattern of (intermittent) demand over 5 periods: four periods with zero demand and one period with a demand of 10.
What is the best possible forecast for this series and what is the average error?
If you said ‘2’ per period you would be correct, and the average absolute forecast error would be:
((4 x 2) + 8)/5 = 3.2.
But if the forecast were zero for every period – which is clearly wrong because that means that there would be no stock to service demand – what would the average error be?
The answer is: ((4 x 0) + 10)/5 = 2 !!
How can it be that a forecast that is obviously and potentially calamitously bad has a lower error than one which is ‘obviously’ correct?
The reason is that error measured in this way ‘optimises on the median’: average absolute error is minimised by the median of demand, not by its mean. The lowest average error occurs when the forecast sits at the point where as many actual values fall above it as below it. In the case of series with 50% or more periods with no demand, this means that the ‘best’ forecast is one that is consistently zero!
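The effect is easy to check numerically. The sketch below uses a series consistent with the arithmetic above (four zero periods and one period of 10; the position of the non-zero period is an assumption and does not affect the result) and computes the average absolute error for several flat forecast levels.

```python
from statistics import median

# Four zero-demand periods and one period of 10 (ordering is an assumption)
demand = [0, 0, 10, 0, 0]

def mean_absolute_error(actuals, level):
    """Average absolute error of a flat forecast at the given level."""
    return sum(abs(a - level) for a in actuals) / len(actuals)

candidate_levels = [0, 1, 2, 3, 5, 10]
mae_by_level = {level: mean_absolute_error(demand, level) for level in candidate_levels}
best_level = min(mae_by_level, key=mae_by_level.get)

for level, error in mae_by_level.items():
    print(level, error)

print("lowest error at forecast level:", best_level)
print("median of demand:", median(demand))
print("mean of demand:", sum(demand) / len(demand))
```

The error rises steadily as the flat forecast moves away from zero, so the lowest error sits at the median of demand (0) rather than at its mean (2).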
The implication is that traditional error metrics are a completely unreliable guide to the quality of forecasts for ID products. As a result it is impossible to select the most appropriate forecasting algorithm for ID products and difficult to determine how well they are being forecast.
What can be done?
One approach is to compare the forecast with the average level of demand, but this brings a new set of issues. One of these is that we want forecasts that pick up shifts in the level of demand, which is not possible if we pursue this approach.
What is required is an approach which simultaneously measures both the extent to which the forecast captures the level of demand on average (i.e. the forecast bias) and the level of variation around the average. When this is expressed in relation to the level of naïve error we have a way of dealing with all of the problems that ID can throw at a practitioner trying to measure the performance of the forecast process.
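One possible reading of this idea, offered as a hedged sketch rather than a definitive formula, is to report the average error (bias) and the spread of errors around that average separately, each divided by the average naïve error. The function name, the sign convention (forecast minus actual), the use of the standard deviation as the measure of variation, and the data are all assumptions made for illustration.

```python
from statistics import pstdev

def scaled_bias_and_variation(actuals, forecasts):
    """Return (bias, variation), each divided by the mean absolute naive error.
    Assumed conventions: error = forecast - actual; variation = standard
    deviation of the errors around their own average."""
    errors = [f - a for a, f in zip(actuals, forecasts)]
    bias = sum(errors) / len(errors)
    variation = pstdev(errors)
    naive_errors = [abs(actuals[t] - actuals[t - 1]) for t in range(1, len(actuals))]
    naive_mae = sum(naive_errors) / len(naive_errors)
    return bias / naive_mae, variation / naive_mae

# Hypothetical series: four zero-demand periods and one period of 10
demand = [0, 0, 10, 0, 0]

zero_bias, zero_variation = scaled_bias_and_variation(demand, [0] * 5)
mean_bias, mean_variation = scaled_bias_and_variation(demand, [2] * 5)

print("zero forecast:", zero_bias, zero_variation)
print("forecast of 2:", mean_bias, mean_variation)
```

On this series the two flat forecasts show the same variation, but only the forecast of 2 per period has zero bias, so the consistently zero forecast no longer looks like the better one.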