When practitioners are asked what forecast accuracy measure they use, the usual response is ‘MAPE’, short for Mean Absolute Percentage Error.
It is easy to see why; intuitively, it makes sense.
Because it represents the average difference between the actual and the forecast, expressed as a percentage of the actual (or the forecast), it seems to capture the most important attribute of a forecast: how close it is to the actuals, in relative terms. It is also easy to calculate and to communicate.
However, in my view MAPE is an unsound measure with little practical usefulness; its widespread use has probably held back the progress of forecasting over the last decade or so.
How can I justify this statement?
Reason Number 1: It is statistically suspect
Whether the forecast or the actual is used as the denominator in the equation, MAPE suffers because it is bounded on the lower end – you cannot go lower than zero percent – but is unbounded on the upper end – the percentage error can be infinite. This is a particular problem if the denominator is small or zero, which is often the case if we are attempting to forecast products with intermittent demand, for example.
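A minimal sketch (with made-up numbers, not from the article) of how small actuals blow MAPE up:

```python
def mape(actuals, forecasts):
    """Mean Absolute Percentage Error, with the actual as the denominator."""
    errors = [abs(a - f) / abs(a) for a, f in zip(actuals, forecasts)]
    return 100 * sum(errors) / len(errors)

# Stable demand: small errors against a steady base of ~100 units.
stable = mape([100, 102, 98, 101], [99, 100, 100, 100])

# Intermittent demand: the periods with tiny actuals dominate the average.
intermittent = mape([1, 100, 2, 100], [10, 100, 10, 100])

print(f"stable demand MAPE:       {stable:.1f}%")
print(f"intermittent demand MAPE: {intermittent:.1f}%")
```

The absolute errors in the intermittent series are modest in unit terms, yet the percentage errors against actuals of 1 and 2 push the MAPE into the hundreds of percent; with an actual of zero the calculation fails entirely.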
To get around the asymmetric nature of this metric, academics have proposed a modified measure called SMAPE, which stands for symmetric MAPE. This takes the average of the forecast and the actual as the denominator, a change which gives the metric an upper bound (of 200%).
Unfortunately, this doesn’t solve all our problems, because the same absolute error can have a different impact on SMAPE depending on whether it comes from over- or under-forecasting. Nor does it address the second source of statistical weakness: the problem of weighting.
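A quick illustration of that asymmetry, assuming the common SMAPE definition with the mean of actual and forecast in the denominator:

```python
def smape_term(actual, forecast):
    """Single-period SMAPE: error as a percentage of the mean of the two values."""
    return 100 * abs(forecast - actual) / ((abs(actual) + abs(forecast)) / 2)

# The same absolute error of 10 units against an actual of 100:
under = smape_term(100, 90)   # under-forecast by 10
over = smape_term(100, 110)   # over-forecast by 10

print(f"under-forecast by 10: {under:.2f}%")  # ≈ 10.53%
print(f"over-forecast by 10:  {over:.2f}%")   # ≈ 9.52%
```

Because over-forecasting inflates the denominator, identical misses are penalised more harshly when the forecast falls short than when it overshoots.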
The usual method of calculating MAPE involves calculating the percentage error for each period and then taking a simple arithmetic average over a date range.
This means that periods with low levels of activity are given the same weight as those with high levels of activity, which is obviously a problem if the level of activity varies and we are trying to get a sense of the quality of our forecast on average.
In addition, all periods are given equal weight regardless of how recent they are, so good performance many months ago will disguise a more recent deterioration in performance, which is obviously of concern from a practical point of view.
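The weighting problem can be seen by comparing an equal-weighted MAPE with a volume-weighted alternative (the sum of absolute errors over the sum of actuals, often called WAPE). The figures below are illustrative, not from the article:

```python
actuals = [1000, 1000, 1000, 10]
forecasts = [950, 1050, 980, 20]  # last period: a 100% error, but only on 10 units

# Equal-weighted: each period's percentage error counts the same.
mape = 100 * sum(abs(a - f) / a for a, f in zip(actuals, forecasts)) / len(actuals)

# Volume-weighted (WAPE): errors are weighted by the units involved.
wape = 100 * sum(abs(a - f) for a, f in zip(actuals, forecasts)) / sum(actuals)

print(f"simple MAPE: {mape:.1f}%")
print(f"WAPE:        {wape:.1f}%")
```

One tiny period drags the simple MAPE to around 28%, even though, weighted by volume, the forecast missed the total demand by only about 4%.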
Reason Number 2: The results are not comparable
All other things being equal, a product with highly volatile demand will have a higher MAPE than one with stable demand, because volatile demand is more difficult to forecast. So a higher MAPE does not necessarily mean that the forecast is worse.
In addition, the many businesses forecasting hundreds or thousands of products on a frequent basis cannot, for practical reasons, routinely track the performance of the entire portfolio. Averages of low-level MAPEs are not helpful for the reasons mentioned above, but calculating MAPE at higher levels in a product hierarchy is also problematic, because the volatility of demand reduces as actual and forecast values are aggregated across products.
As a result, MAPE cannot be used to compare forecast performance between products, geographies or different levels in a hierarchy.
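A small illustration (invented numbers) of why aggregation flatters MAPE: offsetting errors on individual products cancel out when the products are summed.

```python
def mape(actuals, forecasts):
    """Equal-weighted Mean Absolute Percentage Error."""
    return 100 * sum(abs(a - f) / a for a, f in zip(actuals, forecasts)) / len(actuals)

# Two products whose errors happen to offset each other period by period.
prod_a_actual, prod_a_fcst = [100, 100], [120, 80]   # +20%, then -20%
prod_b_actual, prod_b_fcst = [100, 100], [80, 120]   # -20%, then +20%

sku_level = (mape(prod_a_actual, prod_a_fcst) + mape(prod_b_actual, prod_b_fcst)) / 2

total_actual = [a + b for a, b in zip(prod_a_actual, prod_b_actual)]
total_fcst = [a + b for a, b in zip(prod_a_fcst, prod_b_fcst)]
aggregated = mape(total_actual, total_fcst)

print(f"average SKU-level MAPE: {sku_level:.1f}%")
print(f"aggregated MAPE:        {aggregated:.1f}%")
```

At SKU level the average MAPE is 20%, yet the aggregated forecast looks perfect, so a MAPE measured at category or total level tells you little about performance further down the hierarchy.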
Reason Number 3: It can’t tell you whether a forecast is good or bad
‘So what?’ you might say. ‘I am mostly interested in seeing how performance evolves over time, and MAPE works well for this.’ I don’t think so!
The final problem with MAPE is that we can’t tell whether a particular level of MAPE is good or bad. Take this string of MAPE results:
2.0%, 3.2%, 2.4%, 3.0%, 4.2%
Is that level of error good or bad?
We have no idea. If the level of demand is very stable it may be poor. Alternatively, if demand is volatile, it could be an exceptional level of performance.
And should we be worried by the values for the last three data points? Is it evidence of deterioration in performance or is it just noise? Is the final value, 4.2%, statistically significant evidence that the forecast process is now out of control or could it be a random ‘blip’? Again we have no idea.
Sadly, it is impossible to make sound judgements about forecast quality using MAPE, nor is it possible to use it with confidence to set targets or to manage forecast performance in any other way.
In order to manage and improve the performance of a forecast process practitioners need a metric that is reliable, comparable and actionable…and MAPE isn’t it.