AR-Net Future Method Produces Autocorrelation Issues, Historic Data Errors Linearly Increase #1546
Replies: 4 comments
-
@ourownstory - I think you're probably the person who would need to answer this. I think this is mainly your method. I can see where you do it in your code, but I don't understand your rationale.
-
@mmangione Thank you for the nice explanation of what you are observing.
First, the accuracy of prediction on your holdout set tends to become progressively worse as you get farther from the end of the training data, especially if there are any trend or other paradigm changes. However, this seems to be a smaller contributing factor in your case.
Second, the prediction accuracy for each forecast step is lower as you predict farther into the future, i.e. further away from your prediction origin (the last state observed by the model). This is why yhat30 performs far worse than yhat1 - just as the weather forecast for tomorrow is far more accurate than one made for 30 days from today.
Third, your dataset is very small for the size of your model. If you are predicting 30 days ahead and are using 60 days as lagged observations, with no hidden layers, your model size for the AR component alone is (30 x 60 =) 1800 parameters, which will most likely overfit on (1.5 x 365 - 30 - 60 + 1 ≈) 458 data samples.
As you are using statistics commonly used with ARIMA and are referring to a 'shift-method [...] to iteratively produce future predictions', I presume you may be expecting that the model is fitted in a traditional ARIMA style for a single forecast step and then unrolled for the desired number of steps. This is however not the case - the model fits a matrix (with optional latent layers) regressing each lag onto each forecast step, akin to fitting a separate AR model for each forecast step.
I suggest reducing your forecast horizon, and if a larger horizon is needed, fitting a second model on lower-frequency data, e.g. weekly data. I hope this helps despite my late answer, and please let me know if I misunderstood anything.
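To make the sizing point concrete, here is a minimal sketch (my own illustration, not NeuralProphet's internal code) of why a 60-lag, 30-step AR component with no hidden layers amounts to a single 60x30 weight matrix, i.e. 1800 parameters fitted jointly rather than a 1-step model unrolled recursively:

```python
import torch.nn as nn

# Hedged sketch: with no hidden layers, the AR component is effectively one
# linear map from the 60 lagged observations to all 30 forecast steps,
# fitted jointly (direct multi-step forecasting).
n_lags, n_forecasts = 60, 30
ar_block = nn.Linear(n_lags, n_forecasts, bias=False)

n_params = sum(p.numel() for p in ar_block.parameters())
print(n_params)  # 60 * 30 = 1800 weights to estimate from roughly 458 samples
```

Reducing the forecast horizon (or resampling to weekly data and fitting a second model) shrinks this matrix proportionally.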
-
@ourownstory Thank you for taking the time to answer my questions! I appreciate the detail in your answers. They help me understand quite a bit about what's happening here and how I might improve the model. I have a couple of follow-up questions:
I have since switched from Ljung-Box to a more general Lagrange Multiplier method. It says the same thing: the emergence of a significant value indicates a potential "missing variable". In this case, supporting your point, an improper time horizon would appear as a missing variable, since the predictive power of the data alone is not enough. (Off topic, but it is curious that in the overfit state the autocorrelation measures all fell within tolerance and suggested no autocorrelation issues. That makes me wary of these methods for this application.)
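For reference, these diagnostics can be run on a single residual series roughly like this (my own sketch; `resid` is assumed to be the NaN-free array of y minus one yhat column):

```python
import numpy as np
from statsmodels.stats.diagnostic import acorr_ljungbox, acorr_lm
from statsmodels.stats.stattools import durbin_watson

def residual_diagnostics(resid: np.ndarray, lags: int = 10) -> dict:
    # Ljung-Box Q test; recent statsmodels versions return a DataFrame here.
    lb = acorr_ljungbox(resid, lags=[lags])
    # Lagrange Multiplier test for autocorrelation in the residuals.
    lm_stat, lm_pval, _, _ = acorr_lm(resid, nlags=lags)
    return {
        "ljung_box_p": float(lb["lb_pvalue"].iloc[0]),
        "lm_p": float(lm_pval),
        "durbin_watson": float(durbin_watson(resid)),  # ~2 means little lag-1 autocorrelation
    }
```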
I suspect that my model was simply transitioning from an overfit state to a better-fitting state through the regression process, although I am curious about your assessment. Thank you for taking the time to help me understand this. P.S. I have already improved my models significantly since your last post. Based on your feedback, I started to include economic data as regressor variables. That seems to have helped significantly, as these are unit shipments from a warehouse.
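For anyone following along, wiring such regressors into NeuralProphet might look roughly like this (a hedged sketch; `econ_index` is a hypothetical column name, and whether a lagged or a future regressor is appropriate depends on whether future values are known in advance):

```python
from neuralprophet import NeuralProphet

# df is assumed to have columns: ds, y, econ_index (hypothetical economic series)
m = NeuralProphet(n_lags=60, n_forecasts=30)

# Past-observed covariate, used like extra autoregressive inputs:
m.add_lagged_regressor("econ_index")
# Alternatively, if future values of the series are known ahead of time:
# m.add_future_regressor("econ_index")

metrics = m.fit(df, freq="D")
forecast = m.predict(df)
```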
-
Hi @mmangione, you are welcome. I would suggest using (seasonal) naive predictions as a baseline to troubleshoot and benchmark against. It is a solid, simple-to-interpret, yet not-so-simple-to-beat baseline in many applications. Next, look at how your accuracy (on your train set) changes from yhat1 to yhatN. To help properly, I would need to better understand your forecasting task and available data.
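As a rough illustration of that baseline (my own sketch, assuming daily data with weekly seasonality and the usual ds/y column layout): the seasonal naive forecast for day t is just the observed value one season earlier.

```python
import pandas as pd

def seasonal_naive_mae(df: pd.DataFrame, season_length: int = 7) -> float:
    """In-sample MAE of a seasonal naive forecast: predict y[t] with y[t - season_length]."""
    y = df["y"].to_numpy()
    pred = y[:-season_length]      # value one season earlier
    actual = y[season_length:]
    return float(abs(actual - pred).mean())
```

Any model worth keeping should beat this number comfortably on the same evaluation window.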
-
Discussed in #1519
Originally posted by mmangione January 23, 2024
We use NP for a number of prediction tasks, and I've been focusing on improving the accuracy of our forecasts. One problem I've been running into is that the shift-method the NP team uses to iteratively produce future predictions is undocumented, undiscussed, and unexamined. Can someone from the NP team explain it to me?
For context, here is the result of the training fit. This is a 60-day holdout test where we fit the training data and test it on the holdout set for prediction. It worked great:
Here is the result of the 30-day forecast fit:
As I progress forward in my future predictions, performance on the historic data suffers. The historic fits become less and less accurate, and every metric degrades. I first noticed this in the uncertainty measurements: I have been using CQR, and my miscoverage rate increases linearly with the yhat step.
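For concreteness, the miscoverage check can be written roughly as follows (a sketch under the assumption that the forecast dataframe exposes per-step quantile columns named like "yhat3 5.0%" / "yhat3 95.0%"; the exact naming may differ by NeuralProphet version):

```python
import pandas as pd

def miscoverage_by_step(forecast: pd.DataFrame, n_forecasts: int = 30,
                        lo: str = "5.0%", hi: str = "95.0%") -> pd.Series:
    """Fraction of observed y values falling outside the [lo, hi] interval, per forecast step."""
    rates = {}
    for i in range(1, n_forecasts + 1):
        lo_col, hi_col = f"yhat{i} {lo}", f"yhat{i} {hi}"
        sub = forecast[["y", lo_col, hi_col]].dropna()
        outside = (sub["y"] < sub[lo_col]) | (sub["y"] > sub[hi_col])
        rates[i] = float(outside.mean())
    return pd.Series(rates, name="miscoverage")
```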
So I began to examine my results and found that it is no fluke: each successive yhat has poorer performance on its own historic data. I then started looking at the residuals, and that's when I saw an indicator of what is happening... whatever operation is being performed on the data is introducing autocorrelation issues, and very significant ones too.
In one particular dataset, with daily data, 1.5 years of history, and a 30-day forward prediction window, yhat1 had:
However, when I looked at yhat30, it had:
For yhat1, the autocorrelation falls within acceptable parameters. For yhat30, the autocorrelation is very pronounced: DW says we've introduced positive autocorrelation errors, and LB simply says it's displaying significant issues - on its own historic prediction data.
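A rough sketch of the per-step check behind those numbers (my own illustration, assuming a NeuralProphet forecast dataframe with columns y, yhat1, ..., yhat30):

```python
import pandas as pd
from statsmodels.stats.stattools import durbin_watson

def dw_by_step(forecast: pd.DataFrame, n_forecasts: int = 30) -> pd.Series:
    """Durbin-Watson statistic of the residuals for each yhat column."""
    stats = {}
    for i in range(1, n_forecasts + 1):
        sub = forecast[["y", f"yhat{i}"]].dropna()
        resid = sub["y"] - sub[f"yhat{i}"]
        stats[i] = float(durbin_watson(resid))  # well below 2 indicates positive autocorrelation
    return pd.Series(stats, name="durbin_watson")
```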
TL;DR: Whatever is being done to produce these future predictions introduces autocorrelation issues, does not maintain consistency across ar-lag numbers, and seems to result in linearly increasing errors over the historical data period. These are pretty significant results.