Forecasts

Analyze

Description & Overview

The "Analyze" page is designed for building forecast models and for studying in depth the factors that influence a forecast. The page includes two elements:

The "Parameters" area is used to select and configure the basic parameters for the forecast calculation;
The "Results" area shows the forecast results and the factors that affect the forecasts, and lets you study their nature.

How we build the forecast

MySales has the most technologically and mathematically advanced system core (engine), tested on hundreds of millions of different positions and stores, which is responsible for building a stable, high-quality forecast for each position in each store.

The forecast is built at different levels, starting with the most aggregated and ending with the most detailed:

Product group - all stores
Product group - region
Product group - store
SKU - all stores
SKU - region
SKU - store

Forecasting is an automatic, multi-stage, iterative process that follows a hierarchy and works with a huge data array. When building a forecast, all available history is analyzed at the high levels, up to the last 3-4 years of sales. This is a huge amount of data: if you have, for example, 100 stores and 10,000 SKUs, that is one million combinations multiplied by 3-4 years of sales history. At each recalculation of all forecasts, MySales analyzes gigabytes of information for small retail chains and terabytes of information for chains with hundreds of stores.
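To make the scale in the example above concrete, the arithmetic can be sketched as follows. The weekly granularity is an assumption made for illustration, and only the SKU-store level is counted here.

```python
# Rough size estimate for the 100-store, 10,000-SKU example (illustrative only).
stores = 100
skus = 10_000
weeks_per_year = 52  # assumption: history is stored as weekly totals

combinations = stores * skus  # SKU-store pairs
records = {years: combinations * years * weeks_per_year for years in (3, 4)}

print(f"{combinations:,} SKU-store combinations")
for years, n in records.items():
    print(f"{years} years of history: {n:,} weekly records")
```

Each of these records also carries prices, discounts, balances and other factors, which is why the total volume grows so quickly.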

In describing the MySales forecast calculation algorithm, we would like to warn you against trying to implement such an algorithm yourself. Even if you manage to implement it in a reasonable amount of time, it will take years to test it on millions of different positions and catch errors, and a lot of effort to optimize calculation performance to an acceptable level, before you can get any economic benefit from using it.

The main stages of building the MySales forecast, which the system performs for each combination at each level, are as follows:

Downloading and preparing data from the DBMS, as well as preparing the MySales file storage, which is used to speed up the forecast calculation and to reduce the load on the DBMS when processing huge amounts of data. The system loads not only historical sales, prices, discounts, balances and arrivals, but also the product hierarchy, product directory, geographic hierarchy, store directory and external data (weather, macroeconomics, competitor prices).
Inclusion of data for analogues by store or by position. The system always sees the latest sales data for a new store or new position and uses analogue data only for those periods where there is no information about the new product.
Formation of data packages for forecast calculation in a multiprocessor environment using multiple threads (usually 3 to 9).
Primary cleaning of sales data from key influencing factors to calculate trend and seasonality. This stage is important in order to separate the impact of prices, discounts and promotions on sales and to calculate seasonal uplifts correctly. In retail, because of the constant rotation of the assortment, the history for many SKUs is rarely long enough to simply exclude the periods affected by discounts or price changes. So, instead of excluding those periods, MySales cleans the sales of the influence of these factors, so as not to shorten an already limited sales history, and excludes only the most extreme periods. Also, at all stages of sales analysis, MySales excludes periods with significant sales losses caused by the product being out of stock or insufficiently stocked. As a result, the trend and seasonality are calculated correctly even if the sales history is limited to only one year, and missing periods can be filled from higher forecasting levels.
Seasonality calculation. MySales considers both multiplicative (seasonal coefficient) and additive seasonality. This makes it possible to choose the best method of applying seasonality after analyzing whether the volatility of the seasonal periods increases along with the trend (general growth or decline in sales). In addition, at the product group level, the seasonality of the group's average price is calculated, which is necessary to predict the average price of the group over a long horizon, since seasonal peaks usually have a higher average price in the group.
Calculation of the trend of overall growth or decline in sales. It is also carried out after cleaning sales of the main influencing factors (price, discount, promotion) and of seasonality. The system also calculates average and median sales using the cleaned sales.
Filling in the original matrix of historical values of predictors (influencing factors). A number of predictors are calculated values, for example, the ratio of the price for the current period to the average price for previous periods
Analysis of how sales depend on influencing factors (price, discount, weather, a macroeconomic factor, which is most often the exchange rate, cannibalization, etc.). When calculating each influencing factor, the system selectively cleans it of other, more significant influencing factors, for example promo and seasonality. When analyzing price dependence, the system also analyzes the trend of inflation or deflation in order to clean historical prices of such an impact. A number of factors are analyzed separately for the low, high and medium seasons, so that price elasticity, for example for ice cream, can differ between winter and summer.
If it is not possible to determine the dependencies on the influencing factors at detailed levels, the system takes the key ones (for example, the influence of price or seasonality) from higher levels. For example, for the SKU-store level, such dependencies can be taken from the SKU-region or SKU-all stores level, or from the group-store, group-region or group-all stores levels.
Filling the target matrix of future predictor values. Here, a differentiated approach is used: for some factors the value can be a simple average or median, for some it is a predicted value, and for some it is information that was entered by the user in the customer’s system and loaded into the MySales data warehouse. It is recommended that you always load the assortment matrix and future prices into MySales as soon as they become known, and enter the data about promotional activities.
Calculation of correlation coefficients and formation of an automatic forecasting model based on these coefficients using those factors that affect sales
Testing the automatic model as well as pre-configured forecast models. Here, a forecast model is understood as a set of influencing factors.
Choosing the model that gives the best accuracy on past sales. When evaluating each model, the system excludes periods when there was not enough stock to support sales, as well as promotional periods, which have the greatest impact on sales.
After the best model is selected, MySales builds the forecast for future periods.
The forecast also goes through a number of steps aimed at ensuring its stability and reliability, for example calculating the minimum sales values and calculating the autocorrelation of the model error in order to adjust the forecast for the coming weeks and make it more accurate.
The next stage is the calculation of promo uplifts. At this stage, the system applies to the forecast the promo uplift coefficients generated by a set of neural networks (Dusya), or data from comparable promos if any were found. Promo uplifts are adapted to each predicted position in each store to take into account the individual characteristics of different combinations and their sensitivity to promotional factors. The minimum and maximum limits of the promo uplifts are also calculated to ensure a stable result for new positions that do not have enough promo history.
After calculating promo uplifts, the system rebalances the model taking into account all the influencing factors. This is necessary to balance the influence of the price and the promo effect in the promo forecast, because they are often highly correlated.
At this stage the forecast is ready, and the system calculates the safety stock (SS). SS is calculated as the standard deviation of the forecast from sales in previous periods. As in the ice cream example above, SS differs between summer and winter, so the forecast is divided into three ranges (high, low and medium) and the safety stock is calculated separately for each range.
Next, the forecast is used to calculate price recommendations, so that you can determine the optimal price for sales value or margin.
The final step is to calculate the possible economic effect in the past, in the form of increased sales or of sales losses for periods in which influencing factors were not known to the system, as well as a reduction in stock.
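The safety stock stage in the list above can be illustrated with a minimal sketch. It is not the MySales implementation: the function names, the seasonal-coefficient thresholds used to split weeks into ranges, and the reading of "standard deviation of the forecast from sales" as a root-mean-square deviation are all assumptions made for illustration.

```python
from math import sqrt

def season_range(coef, low=0.9, high=1.1):
    """Classify a week by its seasonal coefficient (thresholds are hypothetical)."""
    if coef < low:
        return "low"
    if coef > high:
        return "high"
    return "medium"

def safety_stock_by_range(sales, forecast, seasonal_coef):
    """Root-mean-square deviation of forecast from sales, per season range.

    All three arguments are equal-length lists with one value per past week.
    """
    squared = {"low": [], "medium": [], "high": []}
    for s, f, c in zip(sales, forecast, seasonal_coef):
        squared[season_range(c)].append((f - s) ** 2)
    # A range with no weeks gets 0.0 rather than raising an error.
    return {name: (sqrt(sum(v) / len(v)) if v else 0.0)
            for name, v in squared.items()}

# Made-up history: two low-season, two high-season and two medium-season weeks.
sales = [10, 12, 30, 34, 20, 20]
forecast = [8, 14, 30, 30, 20, 21]
coef = [0.5, 0.5, 1.5, 1.5, 1.0, 1.0]
ss = safety_stock_by_range(sales, forecast, coef)
print({name: round(value, 2) for name, value in ss.items()})
```

In this made-up data the high season has the largest forecast deviation, so it receives the largest safety stock, which matches the idea of keeping separate values for each range.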

It is also worth noting that the system has a separate algorithm and calculation sequence for new positions that do not have an analogue: for such positions, the system uses the sales of the average position in the product group, adjusting them to the price elasticity of the group using the specific price of the new item, and also applying a number of other restrictions and calculations to make the result more stable and accurate.
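As a rough illustration of the idea for new positions without an analogue, the sketch below scales the sales of the group's average position by a price effect. The constant-elasticity formula and the function name are assumptions chosen for illustration. The text above does not specify the formula MySales uses, and the additional restrictions it mentions are not modeled here.

```python
def new_item_forecast(group_avg_sales, group_avg_price, new_price, elasticity):
    """Scale average-item sales by (new_price / group_avg_price) ** elasticity.

    elasticity is the group's price elasticity (typically negative).
    This constant-elasticity form is an assumption, not the MySales formula.
    """
    if group_avg_price <= 0 or new_price <= 0:
        raise ValueError("prices must be positive")
    return group_avg_sales * (new_price / group_avg_price) ** elasticity

# Hypothetical numbers: the new item is 25% more expensive than the group
# average, and the group's elasticity is -2.
estimate = new_item_forecast(group_avg_sales=40, group_avg_price=10.0,
                             new_price=12.5, elasticity=-2.0)
print(round(estimate, 1))
```

With a negative elasticity, an item priced above the group average gets a forecast below the average position's sales, and an item priced below it gets a higher one.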

There is also a separate forecasting algorithm for expanding distribution. Example: a position was sold in 10 of the 30 stores in a region and, given its good dynamics and potential, the category manager decided to list it in all 30 stores in the region. In this case, the system uses sales dynamics at the SKU-region, group-region and group-store levels to build the forecast for this position in a store where it has no sales history and is being listed for the first time.

The set of all factors that the system takes into account is set by the user and can be individually adapted. The default set of factors is described in the "Default Predictors" section.

Parameters

The "Parameters" area is used to choose and configure the basic parameters of the forecast model. It consists of the following categories:

Stores
Groups
SKU

Data is selected as follows:

button opens the drop-down list for selecting data. It shows previously used data and lets you select new data when you click "select...";

button allows you to select all the data for the relevant category;
button allows you to skip building a forecast for individual SKUs and build it only at the Group level.

The "Settings" button opens additional parameters for configuring the forecast (to hide them, click the button again):

parameter "End week" – sets the last week of historical data that the system will see, even if historical data exists after the chosen week (the last week of the training sample). Used to verify the constructed forecast;
parameter "Profile" – selects the type of forecast:
- "52-weeks" – 52-week forecast calculation;
- "52-weeks log" – 52-week forecast calculation with a detailed report on the process of building the forecast model (the report can be seen on the "Models" tab);
- "26-weeks" – 26-week forecast calculation;
- "Brain 1000" – forecast calculation using a neural network in 1000 iterations;
- "Brain 3000" – forecast calculation using a neural network in 3000 iterations;
parameter "Supplier" – supplier selector. Suppliers are selected in the same way as stores and groups.
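The way "End week" supports forecast verification can be sketched as a simple holdout split: history up to the chosen week is used for training, and the later weeks are compared with the forecast. The function names, the integer week numbers and the use of WAPE as the accuracy measure are assumptions for illustration. They are not part of the MySales interface.

```python
def split_by_end_week(history, end_week):
    """Split {week_number: units_sold} into training data and a holdout."""
    train = {w: v for w, v in history.items() if w <= end_week}
    holdout = {w: v for w, v in history.items() if w > end_week}
    return train, holdout

def wape(actual, forecast):
    """Weighted absolute percentage error over the weeks present in both dicts."""
    weeks = actual.keys() & forecast.keys()
    total = sum(actual[w] for w in weeks)
    if total == 0:
        raise ValueError("no actual sales in the compared weeks")
    return sum(abs(actual[w] - forecast[w]) for w in weeks) / total

# Made-up history for five weeks; the system "sees" only weeks 1 to 3.
history = {1: 10, 2: 12, 3: 11, 4: 13, 5: 12}
train, holdout = split_by_end_week(history, end_week=3)

# A hypothetical forecast for the hidden weeks, compared with what really sold.
forecast = {4: 12, 5: 12}
print(f"WAPE on holdout weeks: {wape(holdout, forecast):.2%}")
```

Because weeks 4 and 5 were hidden from training, the error measured on them gives an honest estimate of how the forecast would have performed.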

When you press "run", the forecast calculation begins. Detailed information about the calculation progress is displayed under the progress bar. When the calculation finishes, an audible signal sounds.

Results

The "Results" area displays the forecast results and the factors that affect them, and lets you study their nature.