Return residuals using Prophet
Additional option for Prophet to return residuals i.e. actual - prediction for historical data.
Nabeel committed Jan 17, 2020
1 parent c3adb9f commit 6340942
Showing 2 changed files with 9 additions and 2 deletions.
9 changes: 8 additions & 1 deletion core/_prophet.py
@@ -402,9 +402,11 @@ def _set_params(self):
self.load_script = 'true' == self.kwargs['load_script'].lower()

# Set the return type
- # Valid values are: yhat, trend, seasonal, seasonalities, all.
+ # Valid values are: yhat, trend, seasonal, seasonalities, all, y_then_yhat, residual.
# Add _lower or _upper to the series name to get lower or upper limits.
# The special case of 'all' returns all output columns from Prophet. This can only be used with 'load_script=true'.
+ # 'y_then_yhat' returns actual values for historical periods and forecast values for future periods
+ # 'residual' returns y - yhat for historical periods
if 'return' in self.kwargs:
self.result_type = self.kwargs['return'].lower()
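
The comments in this hunk describe how the `return` keyword is interpreted. A minimal sketch of that behaviour follows. `parse_return_type` is a hypothetical helper, not a function from the repository, and raising an error when `all` is used without `load_script=true` is an assumption about how the stated restriction might be enforced. The `yhat` default is taken from docs/Prophet.md.

```python
def parse_return_type(kwargs, default="yhat"):
    """Hypothetical helper mirroring the comments above; not the project's code."""
    # Values are compared case-insensitively, as in the hunk's .lower() calls.
    result_type = kwargs.get("return", default).lower()
    load_script = kwargs.get("load_script", "false").lower() == "true"

    # Assumption: 'all' returns several columns, so reject it outside the load script.
    if result_type == "all" and not load_script:
        raise ValueError("return=all requires load_script=true")
    return result_type


print(parse_return_type({"return": "Residual"}))
print(parse_return_type({"return": "yhat_upper"}))
print(parse_return_type({}))
```

Suffixes such as `_upper` and `_lower` pass through unchanged here because they are simply part of the column name that is later looked up in the forecast.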

@@ -779,6 +781,11 @@ def _forecast(self):
self.forecast.loc[:len(self.forecast) - self.periods - 1, self.result_type] \
= self.input_df.loc[:len(self.request_df) - self.periods - 1, 'y']

+ # For return=residual we return y - yhat for historical periods and Null for future periods
+ elif 'residual' in self.result_type:
+ # Create the residuals for historical periods by subtracting yhat from y
+ self.forecast.loc[:len(self.request_df)-self.periods-1, self.result_type] = self.input_df.loc[:len(self.request_df)-self.periods-1, 'y'] - self.forecast.loc[:len(self.request_df)-self.periods-1, 'yhat']

# Update to the original index from the request data frame
self.forecast.index = self.request_index.index
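
The residual branch in this hunk can be tried in isolation. The sketch below uses small, made-up data frames as stand-ins for `self.input_df`, `self.forecast` and `self.periods`. It assumes both frames share a default integer index and that the last `periods` rows are the future placeholders. Because `.loc` slices by label and includes the end label, the `- 1` is what stops the slice at the last historical row.

```python
import numpy as np
import pandas as pd

# Illustrative stand-ins for the class attributes used in the diff.
periods = 2  # number of future periods appended after the history
input_df = pd.DataFrame({"y": [10.0, 12.0, 11.0, np.nan, np.nan]})
forecast = pd.DataFrame({"yhat": [9.5, 12.5, 11.0, 12.0, 13.0]})

# Label of the last historical row; .loc slicing is end-inclusive.
last_hist = len(forecast) - periods - 1

# Residual = actual - prediction, written only to the historical rows.
# Creating the column through .loc leaves the remaining rows as NaN.
forecast.loc[:last_hist, "residual"] = (
    input_df.loc[:last_hist, "y"] - forecast.loc[:last_hist, "yhat"]
)

print(forecast)
```

The future rows keep `NaN` in the `residual` column, which corresponds to the "Null for future periods" behaviour described in the code comment.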

2 changes: 1 addition & 1 deletion docs/Prophet.md
@@ -51,7 +51,7 @@ Any of these arguments can be included in the final string parameter for the Pro

| Keyword | Description | Sample Values | Remarks |
| --- | --- | --- | --- |
- | return | The output of the expression | `all`, `yhat`, `yhat_upper`, `yhat_lower`, `y_then_yhat`, `y_then_yhat_upper`, `y_then_yhat_lower`, `trend`, `trend_upper`, `trend_lower`, `seasonal`, `seasonal_upper`, `seasonal_lower`, `yearly`, `yearly_upper`, `yearly_lower` & any other column in the forecast output | `yhat` refers to the forecast values. This is the default value. The `y_then_yhat` options allow you to plot the actual values for historical data and forecast values only for future dates. Upper and lower limits are available for each type of output.<br><br>The `all` option returns all the columns from the Prophet forecast. This option is only valid if used in combination with the `load_script=true` parameter as it will return multiple columns. |
+ | return | The output of the expression | `all`, `yhat`, `yhat_upper`, `yhat_lower`, `y_then_yhat`, `y_then_yhat_upper`, `y_then_yhat_lower`, `trend`, `trend_upper`, `trend_lower`, `additive_terms`, `additive_terms_upper`, `additive_terms_lower`, `residual` & any other column in the forecast output | `yhat` refers to the forecast values. This is the default value. The `y_then_yhat` options allow you to plot the actual values for historical data and forecast values only for future dates. Upper and lower limits are available for each type of output.<br><br>The `residual` option returns actual minus predictions (i.e. y - yhat).<br><br>The `all` option returns all the columns from the Prophet forecast. This option is only valid if used in combination with the `load_script=true` parameter as it will return multiple columns. |
| freq | The frequency of the time series | `D`, `MS`, `M`, `H`, `T`, `S`, `ms`, `us` | The most common options would be D for Daily, MS for Month Start and M for Month End. The default value is D, however this will mess up results if you provide the values in a different frequency, so always specify the frequency. See the full set of options [here](http://pandas.pydata.org/pandas-docs/stable/user_guide/timeseries.html#offset-aliases). |
| debug | Flag to output additional information to the terminal and logs | `true`, `false` | Information will be printed to the terminal as well as to a log file: `..\qlik-py-env\core\logs\Prophet Log <n>.txt`. Particularly useful is looking at the Request Data Frame to see what you are sending to the algorithm and the Forecast Data Frame to see the possible result columns. |
| load_script | Flag for calling the function from the Qlik load script. | `true`, `false` | Set to `true` if calling the Prophet function from the load script in the Qlik app. This will change the output to a table consisting of two fields; `ds` which is the datetime dimension passed to Prophet, and the specified return value (`yhat` by default). `ds` is returned as a string in the format `YYYY-MM-DD hh:mm:ss TT`.<br/><br/>This parameter only applies to the `Prophet` function. |