AbstractPipeline

class AbstractPipeline

Bases: etna.core.saving.AbstractSaveable

Interface for pipeline.

Methods

`backtest`(ts, metrics[, n_folds, mode, ...])	Run backtest with the pipeline.
`fit`(ts)	Fit the Pipeline.
`forecast`([prediction_interval, quantiles, ...])	Make predictions.
`load`(path)	Load an object.
`predict`(ts[, start_timestamp, ...])	Make in-sample predictions on dataset in a given range.
`save`(path)	Save the object.

abstract backtest(ts: etna.datasets.tsdataset.TSDataset, metrics: List[etna.metrics.base.Metric], n_folds: Union[int, List[etna.pipeline.base.FoldMask]] = 5, mode: str = 'expand', aggregate_metrics: bool = False, n_jobs: int = 1, joblib_params: Optional[Dict[str, Any]] = None, forecast_params: Optional[Dict[str, Any]] = None) → Tuple[pandas.core.frame.DataFrame, pandas.core.frame.DataFrame, pandas.core.frame.DataFrame]

Run backtest with the pipeline.

Parameters

ts (etna.datasets.tsdataset.TSDataset) – Dataset to fit models in backtest
metrics (List[etna.metrics.base.Metric]) – List of metrics to compute for each fold
n_folds (Union[int, List[etna.pipeline.base.FoldMask]]) – Number of folds or the list of fold masks
mode (str) – Train generation policy, one of ‘expand’ or ‘constant’
aggregate_metrics (bool) – If True, aggregate metrics over folds; otherwise return raw per-fold metrics
n_jobs (int) – Number of jobs to run in parallel
joblib_params (Optional[Dict[str, Any]]) – Additional parameters for joblib.Parallel
forecast_params (Optional[Dict[str, Any]]) – Additional parameters for forecast()

Returns

metrics_df, forecast_df, fold_info_df – Metrics dataframe, forecast dataframe and dataframe with information about folds

Return type

Tuple[pd.DataFrame, pd.DataFrame, pd.DataFrame]
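The difference between the two mode values can be illustrated with a small index-based sketch. It assumes that ‘expand’ keeps the start of the train window fixed so the window grows with each fold, while ‘constant’ slides a fixed-size window forward. The function fold_ranges and its horizon parameter are hypothetical names used only for illustration; this is not the library's implementation.

```python
from typing import List, Tuple


def fold_ranges(
    n_points: int, horizon: int, n_folds: int, mode: str = "expand"
) -> List[Tuple[range, range]]:
    """Return (train, test) index ranges for each fold, oldest fold first.

    Hypothetical helper: assumes test folds are consecutive blocks of
    `horizon` points at the end of the series.
    """
    if mode not in ("expand", "constant"):
        raise ValueError("mode must be 'expand' or 'constant'")
    first_test_start = n_points - n_folds * horizon
    if first_test_start <= 0:
        raise ValueError("not enough points for the requested folds")
    folds = []
    for i in range(n_folds):
        test_start = first_test_start + i * horizon
        # 'expand' keeps the train start fixed; 'constant' slides it forward
        # so that every fold trains on the same number of points.
        train_start = 0 if mode == "expand" else i * horizon
        train = range(train_start, test_start)
        test = range(test_start, test_start + horizon)
        folds.append((train, test))
    return folds


expand = fold_ranges(n_points=20, horizon=3, n_folds=3, mode="expand")
constant = fold_ranges(n_points=20, horizon=3, n_folds=3, mode="constant")

print([len(train) for train, _ in expand])    # train sizes grow fold by fold
print([len(train) for train, _ in constant])  # train sizes stay the same
```

Under these assumptions, the test ranges are identical in both modes; only the amount of history available for training differs.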

abstract fit(ts: etna.datasets.tsdataset.TSDataset) → etna.pipeline.base.AbstractPipeline

Fit the Pipeline.

Parameters: ts (etna.datasets.tsdataset.TSDataset) – Dataset with timeseries data
Returns: Fitted Pipeline instance
Return type: etna.pipeline.base.AbstractPipeline

abstract forecast(prediction_interval: bool = False, quantiles: Sequence[float] = (0.025, 0.975), n_folds: int = 3) → etna.datasets.tsdataset.TSDataset

Make predictions.

Parameters

prediction_interval (bool) – If True, return a prediction interval for the forecast
quantiles (Sequence[float]) – Levels of the prediction distribution. By default, the 2.5% and 97.5% levels are taken to form a 95% prediction interval
n_folds (int) – Number of folds to use in the backtest for prediction interval estimation

Returns

Dataset with predictions

Return type

etna.datasets.tsdataset.TSDataset
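The relationship between quantile levels and interval coverage can be checked with a few lines of arithmetic. The helpers interval_coverage and symmetric_quantiles below are hypothetical and not part of the library; the sketch also assumes that levels must lie strictly between 0 and 1.

```python
from typing import Sequence, Tuple


def interval_coverage(quantiles: Sequence[float]) -> float:
    """Nominal coverage of the interval between the lowest and highest level."""
    if not all(0 < q < 1 for q in quantiles):
        raise ValueError("quantile levels must lie strictly between 0 and 1")
    # Rounding hides floating-point noise such as 0.9499999999999999.
    return round(max(quantiles) - min(quantiles), 10)


def symmetric_quantiles(coverage: float) -> Tuple[float, float]:
    """Lower and upper levels of a symmetric interval with the given coverage."""
    if not 0 < coverage < 1:
        raise ValueError("coverage must lie strictly between 0 and 1")
    tail = (1 - coverage) / 2
    return (round(tail, 10), round(1 - tail, 10))


print(interval_coverage((0.025, 0.975)))  # coverage of the default levels
print(symmetric_quantiles(0.8))           # levels for an 80% interval
```

A pair produced this way could be passed as the quantiles argument when prediction_interval is True.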

abstract predict(ts: etna.datasets.tsdataset.TSDataset, start_timestamp: Optional[pandas._libs.tslibs.timestamps.Timestamp] = None, end_timestamp: Optional[pandas._libs.tslibs.timestamps.Timestamp] = None, prediction_interval: bool = False, quantiles: Sequence[float] = (0.025, 0.975)) → etna.datasets.tsdataset.TSDataset

Make in-sample predictions on dataset in a given range.

Currently, when segments start at different timestamps, this method is only guaranteed to work if start_timestamp >= the beginning of every segment.

Parameters

ts (etna.datasets.tsdataset.TSDataset) – Dataset to make predictions on.
start_timestamp (Optional[pandas._libs.tslibs.timestamps.Timestamp]) – First timestamp of the prediction range to return; should be >= the first timestamp in ts, and every segment is expected to begin at or before start_timestamp. If not set, the first timestamp at which every segment has begun is taken.
end_timestamp (Optional[pandas._libs.tslibs.timestamps.Timestamp]) – Last timestamp of the prediction range to return; expected to be less than or equal to the last timestamp in ts. If not set, the last timestamp of ts is taken.
prediction_interval (bool) – If True, return a prediction interval for the forecast.
quantiles (Sequence[float]) – Levels of the prediction distribution. By default, the 2.5% and 97.5% levels are taken to form a 95% prediction interval.

Returns

Dataset with predictions in [start_timestamp, end_timestamp] range.

Raises

ValueError: – Value of end_timestamp is less than start_timestamp.
ValueError: – Value of start_timestamp is before the point where every segment has started.
ValueError: – Value of end_timestamp is after the last timestamp.

Return type

etna.datasets.tsdataset.TSDataset
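The defaults and the three error conditions described for predict can be sketched with plain dates. The function resolve_predict_range and the dictionary layout of segments are hypothetical stand-ins for a real dataset; a concrete pipeline may perform these checks differently.

```python
from datetime import date, timedelta
from typing import Dict, List, Optional, Tuple


def resolve_predict_range(
    segments: Dict[str, List[date]],
    start: Optional[date] = None,
    end: Optional[date] = None,
) -> Tuple[date, date]:
    """Apply the documented defaults and checks to a prediction range.

    `segments` maps a segment name to its sorted timestamps (hypothetical layout).
    """
    # First timestamp at which every segment has already begun.
    latest_segment_start = max(ts[0] for ts in segments.values())
    last_timestamp = max(ts[-1] for ts in segments.values())

    start = latest_segment_start if start is None else start
    end = last_timestamp if end is None else end

    if start < latest_segment_start:
        raise ValueError("start_timestamp is before the point where every segment has started")
    if end > last_timestamp:
        raise ValueError("end_timestamp is after the last timestamp")
    if end < start:
        raise ValueError("end_timestamp is less than start_timestamp")
    return start, end


def daily(first: date, n_days: int) -> List[date]:
    """Build n_days consecutive daily timestamps starting at `first`."""
    return [first + timedelta(days=i) for i in range(n_days)]


# Segment "b" starts two days later than segment "a"; both end on 2024-01-10.
segments = {"a": daily(date(2024, 1, 1), 10), "b": daily(date(2024, 1, 3), 8)}

default_range = resolve_predict_range(segments)
print(default_range)
```

With these two segments, the default range starts on the day segment "b" begins, because that is the first timestamp covered by every segment.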