Time series forecasting is an important task for effective and efficient planning in many fields, such as finance, weather and energy. Traditional approaches like SARIMA models often require manual data pre-processing steps (e.g. differencing to make the data stationary), and it is hard to explain their predictions to people without forecasting expertise. In addition, these models do not make it easy to incorporate additional domain knowledge to improve precision. To address these problems, Facebook researchers recently released FBProphet, a time series forecasting tool that supports both Python and R. FBProphet provides a decomposition regression model that is extensible and configurable, with interpretable parameters.

In this blog post, I will use FBProphet to forecast item demand using the data from the Kaggle competition “Store Item Demand Forecasting Challenge”.

Data analysis

Let’s start by importing the Python packages that we need. 

Then, we load the train data using Pandas. You may need to change the path to where you put the train.csv file.

As there are many items and stores, I will restrict the analysis to item 1 and store 1.
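The loading and filtering steps above can be sketched roughly as follows. This is a minimal sketch, not the original notebook code: the helper name `load_store_item` is hypothetical, the column names `date`, `store`, `item` and `sales` are assumed to match the competition's train.csv, and a tiny inline sample stands in for the real file so the snippet runs on its own. The columns are renamed to `ds` and `y` because that is the layout Prophet expects.

```python
import io

import pandas as pd


def load_store_item(csv_source, store=1, item=1):
    """Load the sales data and keep one store/item pair in Prophet's ds/y layout.

    `csv_source` can be a file path (e.g. the location of train.csv) or a buffer.
    The column names used here are assumptions about the file's header.
    """
    df = pd.read_csv(csv_source, parse_dates=["date"])
    subset = df[(df["store"] == store) & (df["item"] == item)]
    # Prophet expects the timestamp column to be named "ds" and the target "y".
    return (
        subset.rename(columns={"date": "ds", "sales": "y"})[["ds", "y"]]
        .sort_values("ds")
        .reset_index(drop=True)
    )


# Tiny made-up sample standing in for train.csv (illustrative values only).
sample_csv = io.StringIO(
    "date,store,item,sales\n"
    "2013-01-01,1,1,13\n"
    "2013-01-02,1,1,11\n"
    "2013-01-01,2,1,12\n"
    "2013-01-01,1,2,33\n"
)
data = load_store_item(sample_csv)
print(data)
```

With the real file, replacing `sample_csv` with the path to train.csv and calling `data.describe()` should give summary statistics of the kind shown below.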

                 y
count  1826.000000
mean     19.971522
std       6.741022
min       4.000000
25%      15.000000
50%      19.000000
75%      24.000000
max      50.000000

OK, we have the data now. Let’s plot it.

As you can see, the dataset contains five years of daily sales (1826 days, from the start of 2013 to the end of 2017). We will use the last 180 days for testing and the rest for training. Then, we will plot both parts to double-check the split.
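A split like the one described above might look like this. It is a sketch under assumptions: the helper name `split_last_days` is hypothetical, and a synthetic daily series with the same length as the real data (1826 rows starting on 2013-01-01) is used so the snippet runs without train.csv.

```python
import numpy as np
import pandas as pd


def split_last_days(df, test_days=180):
    """Hold out the final `test_days` rows of a daily series for testing."""
    df = df.sort_values("ds").reset_index(drop=True)
    train = df.iloc[:-test_days]
    test = df.iloc[-test_days:]
    return train, test


# Synthetic stand-in: one row per day, same length as the store 1 / item 1 data.
dates = pd.date_range("2013-01-01", periods=1826, freq="D")
demo = pd.DataFrame({"ds": dates, "y": np.arange(1826)})

train, test = split_last_days(demo)
print(len(train), len(test))
print(test["ds"].min(), test["ds"].max())
```

Splitting by position rather than at random keeps the test period strictly after the training period, which is what a forecasting evaluation needs.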

Forecasting with FBProphet

Let’s create a Prophet model, add the weekly, yearly components and then fit it with the train dataset.
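One way to set this up is sketched below. The original code is not shown here, so this is an assumption about how the components were enabled: it uses the `weekly_seasonality` and `yearly_seasonality` constructor flags, whereas the same effect could also be achieved with `add_seasonality`. The helper name `fit_prophet` and the `PROPHET_KWARGS` dictionary are hypothetical. The import is done inside the function because the package was renamed from `fbprophet` to `prophet` in version 1.0.

```python
# Seasonal components named in the text. Daily seasonality is switched off
# because the data has only one observation per day.
PROPHET_KWARGS = {
    "weekly_seasonality": True,
    "yearly_seasonality": True,
    "daily_seasonality": False,
}


def fit_prophet(train):
    """Fit a Prophet model on a DataFrame with `ds` (dates) and `y` (values)."""
    # Try the current package name first, then fall back to the older one.
    try:
        from prophet import Prophet
    except ImportError:
        from fbprophet import Prophet

    model = Prophet(**PROPHET_KWARGS)
    model.fit(train)
    return model
```

Calling `model = fit_prophet(train)` with the training DataFrame from the previous step returns the fitted model.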

Now, I can ask the model to predict the future on the testing dataset. I will also ask it to estimate the training points, so the model predicts over the whole dataset, as follows.

          ds       yhat
0 2013-01-01   9.232334
1 2013-01-02   9.853309
2 2013-01-03  10.262132
3 2013-01-04  11.855213
4 2013-01-05  13.551500
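A forecast covering both the training and testing dates, like the one above, could be produced along these lines. This is a sketch: the helper name `predict_full_range` is hypothetical, and `model` is assumed to be an already fitted Prophet model, whose `predict` method takes a DataFrame with a `ds` column and returns one row per date with a `yhat` column among others.

```python
import pandas as pd


def predict_full_range(model, train, test):
    """Predict for every date in train and test with an already fitted model."""
    # Stack the training and testing dates into one frame with a `ds` column.
    dates = pd.concat([train[["ds"]], test[["ds"]]], ignore_index=True)
    forecast = model.predict(dates)
    # Keep only the date and the point forecast.
    return forecast[["ds", "yhat"]]
```

For example, `predict_full_range(model, train, test).head()` would print the first five rows. Note that this helper discards the other columns Prophet returns, so the full output of `model.predict` is still needed for anything that uses the uncertainty intervals or the fitted components.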

FBProphet comes with a handy plot function allowing us to plot the three fitted components. Let’s plot them.

The trend component shows sales rising steadily from 2013 to 2018. The yearly component shows that sales peak around June, and the weekly component shows sales rising gradually from Monday to Sunday.

Let’s plot the forecast, train, and test datasets to validate this.

The plot above shows that the downward trend during the last 6 months is captured correctly and that the forecast values follow 7 distinct cosine-like curves, one for each day of the week. Let’s plot the Monday and Sunday datapoints of the actual (train and test) and forecast datasets.
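Picking out the datapoints for a single day of the week can be sketched as follows. The helper name `select_weekday` is hypothetical, and a short synthetic date range is used in place of the real data. pandas numbers weekdays from 0 (Monday) to 6 (Sunday).

```python
import pandas as pd


def select_weekday(df, weekday):
    """Return the rows of `df` whose `ds` falls on `weekday` (0=Monday, 6=Sunday)."""
    return df[df["ds"].dt.dayofweek == weekday]


# Two weeks of synthetic dates standing in for the actual and forecast frames.
demo = pd.DataFrame({"ds": pd.date_range("2013-01-01", periods=14, freq="D")})

mondays = select_weekday(demo, 0)
sundays = select_weekday(demo, 6)
print(mondays["ds"].dt.day_name().unique())
print(sundays["ds"].dt.day_name().unique())
```

Applying the same filter to both the actual and the forecast DataFrames gives the two pairs of series to plot against each other.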

It can be seen that the Monday and Sunday datapoints are separated and FBProphet captures this property correctly. I can now plot all weekdays to gain some insights.

Finally, FBProphet’s SMAPE on this dataset is about 18%, as reported below. To improve the precision, we need to tune FBProphet’s hyperparameters and add additional regressors.

The SMAPE error is: 18.283400082248754
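For reference, SMAPE (symmetric mean absolute percentage error) can be computed roughly as in the sketch below. Several variants of the metric exist. This one divides each absolute error by the mean of the absolute actual and forecast values and treats a term as 0 when both values are 0. It is an assumption that the figure above was computed this way, and the sample numbers are made up for illustration.

```python
import numpy as np


def smape(actual, forecast):
    """Symmetric mean absolute percentage error, in percent (range 0 to 200)."""
    actual = np.asarray(actual, dtype=float)
    forecast = np.asarray(forecast, dtype=float)
    diff = np.abs(forecast - actual)
    denom = (np.abs(actual) + np.abs(forecast)) / 2.0
    # Where both values are 0 the denominator is 0; count that term as 0 error.
    ratio = np.divide(diff, denom, out=np.zeros_like(diff), where=denom != 0)
    return 100.0 * ratio.mean()


# Illustrative values only, not taken from the competition data.
print("The SMAPE error is:", smape([100, 200], [110, 180]))
```

In the setting of this post, `actual` would be the `y` values of the test set and `forecast` the `yhat` values for the same dates.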

Conclusion

FBProphet is very flexible, as it allows you to add multiple seasonal components and additional regressors. Its components are interpretable, so users can tune the parameters intuitively to improve performance. In addition, unlike SARIMA models, FBProphet does not require regularly spaced data points, so users do not need to interpolate missing data, e.g. after removing outliers. Finally, FBProphet uses Stan to fit the model, and Stan does not yet support GPUs, so FBProphet does not scale well when fed with many data points.