Demand Modeling Explained - Part 3: The Signal and the Noise
This third part in the series discusses the core concept of demand modeling: isolating signal from the noise in the data.
It is often said that it is impossible to generate a perfect demand forecast, and this is true if your definition of perfect is that future demand will come in exactly as forecasted. What makes it impossible is the existence of noise: seemingly random or arbitrary quantities, or fluctuations of quantities, within the total demand pattern. This noise is not forecastable. The best we could hope to achieve is to filter the historic noise out completely and keep all the historic signal - the nicely behaving and forecastable part. We could then generate a perfect forecast of the signal only. The future noise, which is not forecastable, would then be the only contributor to the forecast error. The only thing we can be sure of is that there will be noise in the future, just as there has been in history; we just cannot predict how, when and where it will occur.
In demand modeling we give up on second-guessing the noise. It is futile to try to predict it, so why waste effort, time and money doing so? When demand planners, sales managers or collaborators make manual adjustments to demand forecast quantities, this is what they are doing: trying to predict the noise that the forecasting mechanism could not. They only ever succeed in improving the accuracy if the forecasting mechanism was inferior to begin with or is malnourished.
Instead, demand modeling aims to get better at isolating every last little bit of signal from the noise.
What is signal?
In short: signal is that part of the demand data that has predictive value.
However, there are many different ways one could perceive that. One essential perspective is whether one considers signal (and noise) in a deterministic or a stochastic way. In the demand modeling world everything is stochastic, which includes its view of signal and noise. This distinction is an essential part of demand modeling, and so is an essential concept to understand to appreciate the value of demand modeling. Therefore, let me explain briefly what each of these terms means. Again in short:
Deterministic means that an outcome is exact and inevitable. Deterministic systems take exact values as input and their output is also in terms of exact values. All of their internal processes also view all data to be exact.
Stochastic means an outcome can have any value from a range of values, and each value within that range has a certain probability of occurring. Stochastic systems can take inputs in either deterministic or stochastic form, and they output stochastic results, which can be simplified as needed to deterministic values for use by other systems that may be deterministic in nature. All internal calculations in stochastic systems take into account the full range of values and their probabilities.
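To make the distinction concrete, here is a minimal sketch of the two views of the same data. The order history, the function names and the use of a plain empirical frequency table are all assumptions made for illustration; a real demand modeling engine will represent its ranges in its own way.

```python
from collections import Counter

# Hypothetical weekly order quantities for one item (illustrative data only).
history = [100, 100, 150, 100, 90, 100, 150, 100, 100, 110]

def empirical_distribution(values):
    """Stochastic view: each observed value with its relative frequency."""
    counts = Counter(values)
    total = len(values)
    return {value: count / total for value, count in sorted(counts.items())}

def mean(dist):
    """Collapse a distribution to one deterministic number (its mean)."""
    return sum(value * prob for value, prob in dist.items())

def quantile(dist, level):
    """Smallest value whose cumulative probability reaches `level`."""
    cumulative = 0.0
    for value, prob in sorted(dist.items()):
        cumulative += prob
        if cumulative >= level - 1e-12:  # small tolerance for float rounding
            return value
    return max(dist)

stochastic_view = empirical_distribution(history)  # a range with probabilities
deterministic_view = mean(stochastic_view)         # a single exact-looking number
```

The stochastic view keeps the fact that 150 units shows up in a fifth of the weeks; the deterministic view reduces everything to one number. A downstream system that can only consume single values can still be fed the mean or a chosen quantile, which is the "simplified as needed" step mentioned above.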
Traditional forecasting systems are deterministic: they output a single forecasted demand quantity for each item in each time period. That quantity is exact and inevitable; demand will occur exactly as forecasted, always. And there is zero forecast error. Right! Okay okay, so developers of such forecasting approaches realize this is not true, and they determine a forecast error based on some assumptions, but they do so... only after the forecast itself has been determined! See for example "Variability, Volatility, Uncertainty, and Error" for a more technical description of how this error is determined and why it is a poor proxy for the uncertainty around the demand. Two main problems with this approach as it pertains to the current topic are 1) that the internal forecasting calculation is completely ignorant of the stochastic nature of the demand, resulting in an unreliable and inaccurate forecast, and 2) that systems consuming the output forecast treat it as exact and inevitable, where clearly it is neither of those. Vendors and users of such systems pay lip service to the inaccuracy, but have no way of incorporating it in any of the downstream planning processes in any remotely reliable way.
Demand modeling on the other hand is stochastic: it utilizes the fact that demand and supply are uncertain in nature. Direct demand is uncertain to begin with: a customer that usually orders exactly 100 units of an item every Tuesday may one week suddenly order 150 units, or may order on a Wednesday instead. Both the quantity and the timing are variable and uncertain. Since supply is also variable in nature, any dependent demand will be even more uncertain than direct demand. Dependent demand may be for finished goods in upstream warehouses or plants, or may also be demand for components that need to be forecasted (in case they are sold in their own right; otherwise they can simply be planned, rather than forecasted). With supply considerations in the mix - variations in lead time (they are not exact) and variations in quantity (someone kits or ships more or less) - the stochastic nature increases. Note that since your customers have their own supply chain to their customers in turn, their variability in supply contributes significantly to the stochastic nature of the demand they impose on you.
In other words traditional forecasting ignores the inherent stochastic nature of demand, whilst demand modeling embraces it.
With this perspective applied to signal and noise, the claim of the second paragraph of this article will start making sense. With the traditional deterministic view of a stochastic world it is impossible to separate signal from noise and then perfectly forecast the signal. There is natural variability in the signal which would not allow a perfect deterministic forecast. Any deviation of the demand, however normal it may be, can only be accounted for as forecast error, since the forecast was expressed as an exact number. Naturally, any noise will also show as variability and needs to also be accounted in exactly the same way as the variability in the clean signal. This illustrates how a deterministic view makes it impossible to cleanly separate signal and noise.
Using a stochastic view of a stochastic world provides a natural way to separate signal and noise. It's like a round peg in a round hole. The signal is not an exact number but a range of values each with a probability of occurring, just like the overall demand values as they occur in the real world. Another, more traditional way of looking at that is that there is a variability around the mean value of the signal, but that variability is still part of the signal. The difference between signal and noise is now NOT synonymous with the difference between the mean of the demand and the variance around it, but rather with the distinction between the portion of demand that is predictable and the portion that is not.
How is the signal determined?
How does demand modeling determine what portion is signal and what is not? The approach is a form of decomposition, similar to traditional approaches, but stochastic, not deterministic. Depending on the aspects you choose to include, the total data gets split into one or more sets, each consisting of both a signal and a noise portion. In a deterministic approach you very quickly hit the limits where more data and more decomposition would cease to add value, and more likely than not you would soon start to make the results worse instead of better. In a stochastic approach that is not true: we may at worst incorrectly qualify some of the signal as noise, but all its effects would still be included, just considered unpredictable.
In the simplest case we only have a base forecast without trend, annual seasonality, or any causal effects. In this case the total demand gets split into a portion that is signal and a portion that is noise. The last blog post describing the differences between traditional forecasting methods and demand modeling explains how the statistical baseline is determined. That statistical baseline in fact constitutes the signal in this simplest case. The ideal situation for demand modeling is to make the granularity of this baseline as detailed as possible. The most common granularity used is by individual sales order line, which is sub-daily by item and by ship-to location (i.e. the customer's warehouse). This is the most granular that can be readily found in any ERP system. But when more granular data is available demand modeling will eagerly accept that, all the way to point-of-sale scan data from store cash registers. The more detailed the data, the more signal is preserved, and the more clearly the signal can be identified from within the noise.
Even in this simplest of cases the signal identified will however not typically be an equidistant horizontal range around a horizontal line (the mean forecast). From the detail all kinds of patterns can be identified. For example each ship-to location may show very clear ordering patterns favoring certain days of the week, and may even exclude some days completely, such as Saturdays, Sundays and holidays. Similarly there may be clear patterns for weeks within the month, driven for example by sales targets in fiscal calendars. Due to the daily and sub-daily detail all that signal is automatically detected.
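As a rough illustration of the kind of pattern that only becomes visible at order-line detail, the sketch below computes each weekday's share of the ordered quantity for one ship-to location. The order lines and the helper name weekday_shares are made up for this example, and a simple share table is only a stand-in for whatever the engine actually does.

```python
from collections import defaultdict
from datetime import date

# Hypothetical order lines for one item and one ship-to location:
# (order date, quantity). Illustrative data only.
order_lines = [
    (date(2024, 1, 2), 100),   # Tuesday
    (date(2024, 1, 9), 100),   # Tuesday
    (date(2024, 1, 16), 150),  # Tuesday
    (date(2024, 1, 24), 100),  # Wednesday
    (date(2024, 1, 30), 100),  # Tuesday
]

WEEKDAYS = ["Mon", "Tue", "Wed", "Thu", "Fri", "Sat", "Sun"]

def weekday_shares(lines):
    """Fraction of total ordered quantity that falls on each weekday."""
    totals = defaultdict(float)
    for order_date, quantity in lines:
        totals[WEEKDAYS[order_date.weekday()]] += quantity
    grand_total = sum(totals.values())
    return {day: totals[day] / grand_total for day in WEEKDAYS}

shares = weekday_shares(order_lines)
for day in WEEKDAYS:
    print(f"{day}: {shares[day]:.0%}")
```

Aggregated to weekly or monthly buckets, this location would just show roughly 100 units a week. At daily detail it is obvious that it orders almost exclusively on Tuesdays and never at the weekend, and that is signal.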
Now assume we have a series of demand history which is untrended and has no annual seasonality, but has a half dozen or so spikes in demand at random points in time. Without telling the demand model that something happened at those times, it will consider those spikes as being noise. The spikes will also disturb the determination of what is signal during the remainder of the time, when there are no spikes. Let's say for example that it determined that the data were 50% signal and 50% noise. Now if we were to tell the demand modeling engine that those spikes were actually promotions that were run and nothing more, it may determine that there was an average uplift of 150%, that the baseline was actually 75% signal and 25% noise and the uplift had 60% signal and 40% noise. All of this of course roughly speaking for explanatory purposes. By its stochastic nature the demand model will in fact form a much richer image of the signal and the noise than mere percentages. In this fashion we have given the model market intelligence that allowed it to better isolate the signal from the noise.
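The effect of handing the model that piece of market intelligence can be sketched with a deliberately crude stand-in: plain means and standard deviations instead of a stochastic engine. The demand numbers, the promotion flags and the summarize helper are all invented for illustration, and the spread figure is not the signal/noise percentage discussed above, just a simple proxy for how much unexplained variation remains.

```python
from statistics import mean, pstdev

# Hypothetical demand per period, with three promotional spikes.
demand = [100, 105, 95, 250, 100, 98, 102, 260, 100, 95, 105, 240]
# Market intelligence: which periods had a promotion running.
promo = [False, False, False, True, False, False,
         False, True, False, False, False, True]

def summarize(values):
    """Mean level and unexplained spread of a set of periods."""
    return {"mean": mean(values), "spread": pstdev(values)}

# Without the flags, every spike is unexplained variation.
unflagged = summarize(demand)

# With the flags, baseline and promoted periods are described separately.
baseline = summarize([d for d, p in zip(demand, promo) if not p])
promoted = summarize([d for d, p in zip(demand, promo) if p])
uplift = promoted["mean"] / baseline["mean"] - 1  # 1.5 means +150%

print(f"unflagged: mean {unflagged['mean']:.1f}, spread {unflagged['spread']:.1f}")
print(f"baseline:  mean {baseline['mean']:.1f}, spread {baseline['spread']:.1f}")
print(f"uplift:    {uplift:.0%}")
```

Once the promotions are flagged, the baseline level drops back to where it belongs and its unexplained spread shrinks dramatically, while the spikes become a measurable uplift instead of noise.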
We could continue to fine-tune in this manner. Say that there were differences between the promotions that were run. These differences will have smaller impact than the choice of including the promotions at all or not, but still will help better isolate the signal. Say we were to tell the model that some of the spikes were a 20% discount, some were a 3-for-2 and some had 25% discount. Not only will it identify different uplifts for each and an improved signal-to-noise ratio for each than the 60/40 found before, but it will also improve again the signal-to-noise ratio of the baseline to say 80% signal and 20% noise. You can keep on fine-tuning this way for any relevant market intelligence: was there a brochure, a hostess in the store, special shelf space, end caps, and so forth. Each additional piece of information, if significant, will help further fine-tune the isolation of the signal from the noise.
Trend and seasonality impacts can be added in a similar way, although for short-term forecasts it is usually not worthwhile to identify a separate signal-to-noise ratio for the trend. It can simply be incorporated in the baseline forecast. For long-term forecasts the continuation of a trend becomes more and more uncertain, so it may prove useful to determine it separately. For seasonality it may sometimes be worthwhile to determine a separate ratio, especially when seasonal effects are strong.
In this fashion any impacts whatsoever can be included in the demand model with more or less supporting information, as it is available. The more information is available, the higher the likelihood that the model can distill more signal from the noise. More on that in part 6.
What is noise?
I want to just briefly touch on what the noise looks like. In traditional forecasting systems we distinguish between the forecast and the forecast error. All the variability is contained in the forecast error. To determine whether the forecast is good, one of the tests performed is to check how the forecast error is dispersed. In short, the error needs to be stationary, independent and identically distributed around a zero mean. If it is not, it is a sign that the quality of the forecast is questionable. Stationary means that the statistical properties of the error do not change over time. For example, there is no trend or seasonality present in the error; those should all be included in the forecast. Independent means that the occurrence of one error does not influence the occurrence of another one. Identically distributed means that every error follows the same probability distribution as all the others. A zero mean means that the average across all the errors is zero.
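For readers who have not seen such checks, here is a minimal sketch of two crude diagnostics on a series of forecast errors: the lag-1 autocorrelation as a hint about independence, and a comparison of the mean of the first and second half as a hint about stationarity. The two error series and the function names are assumptions for illustration; a real system would rely on formal statistical tests rather than these shortcuts.

```python
from statistics import mean

def lag1_autocorrelation(errors):
    """Correlation between each error and the one that follows it."""
    m = mean(errors)
    numerator = sum((a - m) * (b - m) for a, b in zip(errors, errors[1:]))
    denominator = sum((e - m) ** 2 for e in errors)
    return numerator / denominator

def half_means(errors):
    """Mean error of the first and second half of the series."""
    half = len(errors) // 2
    return mean(errors[:half]), mean(errors[half:])

# Hypothetical forecast errors (actual minus forecast).
well_behaved = [1, 2, -1, -2, 2, 1, -2, -1]
trending = [-4, -3, -2, -1, 1, 2, 3, 4]  # a trend was left out of the forecast

for name, errors in [("well behaved", well_behaved), ("trending", trending)]:
    print(name, mean(errors), lag1_autocorrelation(errors), half_means(errors))
```

Both series average zero, so the zero-mean check alone would pass them both. The second one fails the other two checks: consecutive errors are strongly correlated and the mean drifts from negative to positive, which tells a traditional forecaster that a trend is still hiding in the error.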
For noise none of this applies.
For most statisticians this is a hard thing to accept, because noise and variability are practically synonymous in their world. In demand modeling noise is purely that portion of the demand that is not predictable. And the unpredictable part has whatever characteristics happen to be unpredictable. It may frequently be stationary, independent and identically distributed around zero or close to it, but that is just coincidence.
As a simple example, say we have 2 years' worth of monthly demand and every month the value is exactly 100 units, except for 1 month where the value is 1300 units. In traditional forecasting one of two things happens. Either the 1300 is removed as an outlier (which is highly likely in this extreme example), in which case the monthly forecast would be a correct 100, but the variability would be severely underestimated. Or it is not identified as an outlier (which is more likely in most realistic cases, where the noise is not as extreme as in this example), in which case the forecast would be the average of 150 per month, which is clearly very wrong, although at least the variability would be near-correct. In demand modeling, say we did not identify the cause of the spike: 1200 of the 1300 would be considered noise, the average of the forecast would be 100, and the total variability would fully include the effect of the spike. So in this simple example the noise is zero in every month except for one huge positive spike, and clearly does not average zero overall.
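The arithmetic of this example is easy to verify. The sketch below simply recomputes the numbers from the paragraph above; the signal level of 100 is taken from the text rather than derived, and the variable names are my own.

```python
from statistics import mean, pstdev

# 24 months of demand: 100 units every month, except one month of 1300.
demand = [100] * 23 + [1300]

# Traditional forecasting, spike kept: the forecast is the plain average.
average_forecast = mean(demand)   # 3600 / 24 = 150
average_spread = pstdev(demand)   # includes the spike

# Traditional forecasting, spike removed as an outlier.
cleaned = [d for d in demand if d != 1300]
outlier_forecast = mean(cleaned)  # 100
outlier_spread = pstdev(cleaned)  # 0: the variability is underestimated

# Demand modeling view from the text: the signal sits at 100 and
# everything above it is unexplained, i.e. noise.
signal_level = 100                # taken from the text, not computed
noise = [d - signal_level for d in demand]
noise_mean = mean(noise)          # 1200 / 24 = 50, so not zero

print(average_forecast, outlier_forecast, outlier_spread, noise_mean)
```

The noise is zero in 23 of the 24 months and 1200 in the remaining one, so its mean is 50 units. That is exactly the kind of noise a zero-mean, identically distributed error assumption cannot describe.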
How can we view the results?
So demand modeling goes into great detail in identifying all sources of signal and noise. To view the total forecast, all of that - all the various signal contributions and all the various noise contributions - can be combined into one combined range of values per time period, where the mean (i.e. the average) in each period can be shown as a time-dependent line equivalent to how the forecast is represented in traditional forecasting systems. The total variability can be represented in many different ways. A good example is the screenshot of the last figure in the prior blog post.
To represent the forecast at any level of aggregation is a piece of cake. Since it is generated in the highest granularity, it can simply be summed to whatever granularity a user may want to interact with the data. If at any level the user may want to add market intelligence they can do so. How this is then incorporated into the model is explained in detail in part 6.
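For a single number per period, summing to a higher level is plain addition. For a full range of values it takes one more step. The sketch below shows one way this could work, under the assumption (mine, not something the series states) that the demands of two locations are independent, so that their ranges combine by convolution. The two store distributions and the function names are invented for illustration.

```python
from collections import defaultdict
from itertools import product

def combine(dist_a, dist_b):
    """Distribution of the sum of two independent discrete demands."""
    total = defaultdict(float)
    for (qty_a, p_a), (qty_b, p_b) in product(dist_a.items(), dist_b.items()):
        total[qty_a + qty_b] += p_a * p_b
    return dict(sorted(total.items()))

def mean(dist):
    """Expected value of a discrete distribution."""
    return sum(qty * prob for qty, prob in dist.items())

# Hypothetical one-period demand ranges for two ship-to locations.
store_a = {0: 0.5, 100: 0.5}
store_b = {50: 0.75, 150: 0.25}

# Aggregate to the regional level.
region = combine(store_a, store_b)
print(region)        # quantities with their probabilities
print(mean(region))  # equals mean(store_a) + mean(store_b)
```

The mean of the aggregate is just the sum of the means, which is why the familiar forecast line can be rolled up by simple addition. The range around it, however, has to be combined rather than added, and it comes out relatively narrower than the ranges of the individual locations.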
Similarly, because all the various impacts are determined separately, each can potentially be shown independently or in whatever combination makes sense. For example, you could show a forecast with or without promotional lift, or show just the promotional lift itself. Many combinations may not make much sense to view in isolation for any practical purpose, so commercial applications may choose to only expose a smaller subset of combinations. This also helps protect the intellectual property of how the model works from competitors. For this reason, signal and noise will likely never be visible in isolation in any commercial application. For practical purposes only the overall variability is of interest, not which part of it is signal versus noise. That split is just needed for the engine to determine the forecast in the most reliable way. The resultant overall variability is what matters.
In the end, as a user you will not likely see signal and noise split out in a demand model. As such it is not important to understand in order to use a demand modeling system. However, if you want to know what makes a demand model tick under the hood, this is the essential concept.