etna.auto.Auto
- class Auto(target_metric: BaseMetric, horizon: int, metric_aggregation: Literal['median', 'mean', 'std', 'notna_size', 'percentile_5', 'percentile_25', 'percentile_75', 'percentile_95'] = 'mean', backtest_params: dict | None = None, experiment_folder: str | None = None, pool: Pool | PoolGenerator | List[BasePipeline] = Pool.no_freq_super_fast, generate_params: Dict[str, Any] | None = None, runner: AbstractRunner | None = None, storage: BaseStorage | None = None, metrics: List[BaseMetric] | None = None)
Bases: AutoBase
Automatic pipeline selection via defined or custom pipeline pool.
Note
This class requires the auto extension to be installed. Read more about this on the installation page.
Note
Class initialization can be slow when using default pools, because pretrained models are downloaded.
Initialize Auto class.
- Parameters:
target_metric (BaseMetric) – Metric to optimize.
horizon (int) – Horizon to forecast for.
metric_aggregation (Literal['median', 'mean', 'std', 'notna_size', 'percentile_5', 'percentile_25', 'percentile_75', 'percentile_95']) – Aggregation method for per-segment metrics. By default, mean aggregation is used.
backtest_params (dict | None) – Custom parameters for backtest instead of default backtest parameters.
experiment_folder (str | None) – Name for saving experiment results; it determines the name of the optuna study. By default, isn’t set.
pool (Pool | PoolGenerator | List[BasePipeline]) – Pool of pipelines to choose from. By default, the no_freq_super_fast pool is used. For a description of all available pools, see the Pool docs.
generate_params (Dict[str, Any] | None) – Dictionary with parameters to fill pool templates. Available parameters are timestamp_column, chronos_device and timesfm_device. For a full description, see the Pool docs. For a usage example, see the 205-automl notebook.
runner (AbstractRunner | None) – Runner to use for distributed training. By default, LocalRunner is used.
storage (BaseStorage | None) – Optuna storage to use. By default, sqlite storage is used.
metrics (List[BaseMetric] | None) – List of metrics to compute. By default, the Sign, SMAPE, MAE, MSE and MedAE metrics are used.
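To make the metric_aggregation options concrete, here is a minimal pure-Python sketch of how per-segment metric values could be collapsed into a single number under each mode. It is not the library's implementation: the helper names (aggregate_segment_metrics, _percentile), the sample standard deviation, the linear-interpolation percentile, and the dropping of missing values before aggregation are all assumptions made for illustration.

```python
import math
import statistics
from typing import Dict, List, Optional

# Percentile levels named in the metric_aggregation literal.
_PERCENTILE_MODES = {"percentile_5": 5, "percentile_25": 25, "percentile_75": 75, "percentile_95": 95}


def _percentile(sorted_values: List[float], q: float) -> float:
    """Percentile with linear interpolation between neighbours (assumed convention)."""
    pos = (len(sorted_values) - 1) * q / 100
    lower, upper = math.floor(pos), math.ceil(pos)
    weight = pos - lower
    return sorted_values[lower] * (1 - weight) + sorted_values[upper] * weight


def aggregate_segment_metrics(per_segment: Dict[str, Optional[float]], mode: str = "mean") -> float:
    """Hypothetical helper: reduce per-segment metric values to one score."""
    # Assumption: missing values (None or NaN) are dropped before aggregating.
    values = sorted(v for v in per_segment.values() if v is not None and not math.isnan(v))
    if mode == "notna_size":
        return float(len(values))
    if not values:
        raise ValueError("no non-missing metric values to aggregate")
    if mode == "mean":
        return statistics.mean(values)
    if mode == "median":
        return statistics.median(values)
    if mode == "std":
        # Assumption: sample standard deviation; needs at least two values.
        return statistics.stdev(values)
    if mode in _PERCENTILE_MODES:
        return _percentile(values, _PERCENTILE_MODES[mode])
    raise ValueError(f"unknown aggregation mode: {mode}")


# Illustrative values only, not real results.
smape_by_segment = {"segment_a": 10.0, "segment_b": 20.0, "segment_c": None, "segment_d": 40.0}
for mode in ("mean", "median", "notna_size", "percentile_25"):
    print(mode, aggregate_segment_metrics(smape_by_segment, mode))
```

The choice of mode changes which pipeline looks best: mean rewards good average behaviour, while an upper percentile penalizes pipelines that fail badly on a few segments.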
Methods
fit(ts[, timeout, n_trials, initializer, ...]) – Start automatic pipeline selection.
Get pipelines from pool.
objective(ts, target_metric, ...[, ...]) – Optuna objective wrapper for the pool stage.
summary() – Get Auto trials summary.
top_k([k]) – Get top k pipelines with the best metric value.
- fit(ts: TSDataset, timeout: int | None = None, n_trials: int | None = None, initializer: _Initializer | None = None, callback: _Callback | None = None, **kwargs) -> BasePipeline
Start automatic pipeline selection.
There are two stages:
Pool stage: trying every pipeline in a pool
Tuning stage: tuning the tune_size best pipelines from the previous stage using Tune.
The tuning stage starts only if the limits on n_trials and timeout aren’t exceeded. Tuning goes from the best pipeline to the worst, and the trial limits (n_trials, timeout) are divided evenly among the pipelines. If there is no limit on the number of trials, only the first pipeline is tuned, until the user stops the process.
- Parameters:
ts (TSDataset) – TSDataset to fit on.
timeout (int | None) – Timeout for optuna. N.B. this is timeout for each worker. By default, isn’t set.
n_trials (int | None) – Number of trials for optuna. N.B. this is number of trials for each worker. By default, isn’t set.
initializer (_Initializer | None) – Object that is called before each pipeline backtest, can be used to initialize loggers.
callback (_Callback | None) – Object that is called after each pipeline backtest, can be used to log extra metrics.
**kwargs – Parameter tune_size (default: 0) determines how many pipelines to fit during tuning stage. Other parameters are passed into optuna optuna.study.Study.optimize().
- Return type:
BasePipeline
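The even split of the trial limit across the tuning stage can be sketched as follows. This is an illustration under stated assumptions, not the library's code: the helper tuning_trial_budgets is hypothetical, the pool stage is assumed to consume one trial per pipeline in the pool, and any remainder from the division is assumed to go to the better-ranked pipelines.

```python
from typing import List, Optional


def tuning_trial_budgets(n_trials: Optional[int], pool_size: int, tune_size: int) -> List[Optional[int]]:
    """Hypothetical helper: trial budget for each tuned pipeline, best pipeline first."""
    if tune_size <= 0:
        # tune_size defaults to 0, so the tuning stage is skipped.
        return []
    if n_trials is None:
        # No limit: only the first pipeline is tuned, with an unbounded budget.
        return [None]
    # Assumption: the pool stage spends one trial per pipeline in the pool.
    remaining = n_trials - pool_size
    if remaining <= 0:
        # The limit is already exhausted, so the tuning stage does not start.
        return []
    base, extra = divmod(remaining, tune_size)
    # Assumption: leftover trials go to the better-ranked pipelines.
    return [base + (1 if i < extra else 0) for i in range(tune_size)]


print(tuning_trial_budgets(n_trials=18, pool_size=8, tune_size=3))   # [4, 3, 3]
print(tuning_trial_budgets(n_trials=5, pool_size=8, tune_size=3))    # []
print(tuning_trial_budgets(n_trials=None, pool_size=8, tune_size=3))  # [None]
```

The same arithmetic would apply to timeout, with seconds in place of trials. In practice this means that n_trials should comfortably exceed the pool size if the tuning stage is expected to do meaningful work.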
- static objective(ts: TSDataset, target_metric: BaseMetric, metric_aggregation: Literal['median', 'mean', 'std', 'notna_size', 'percentile_5', 'percentile_25', 'percentile_75', 'percentile_95'], metrics: List[BaseMetric], backtest_params: dict, config_mapping: Dict[str, dict], initializer: _Initializer | None = None, callback: _Callback | None = None) -> Callable[[Trial], float]
Optuna objective wrapper for the pool stage.
- Parameters:
ts (TSDataset) – TSDataset to fit on.
target_metric (BaseMetric) – Metric to optimize.
metric_aggregation (Literal['median', 'mean', 'std', 'notna_size', 'percentile_5', 'percentile_25', 'percentile_75', 'percentile_95']) – Aggregation method for per-segment metrics.
metrics (List[BaseMetric]) – List of metrics to compute.
backtest_params (dict) – Custom parameters for backtest instead of default backtest parameters.
initializer (_Initializer | None) – Object that is called before each pipeline backtest, can be used to initialize loggers.
callback (_Callback | None) – Object that is called after each pipeline backtest, can be used to log extra metrics.
config_mapping (Dict[str, dict]) – Mapping from config hashes to configs.
- Returns:
function that runs specified trial and returns its evaluated score
- Return type:
objective
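The shape of this method, a factory that closes over its arguments and returns a function from a trial to a score, can be sketched in plain Python. Everything below is a stand-in for illustration rather than the library's code: FakeTrial replaces the optuna trial, config_hash and make_objective are hypothetical helpers, the idea that the trial carries the config hash in its params is an assumption, and evaluate stands in for the backtest plus metric aggregation.

```python
import hashlib
import json
from typing import Callable, Dict


def config_hash(config: dict) -> str:
    """Hypothetical helper: stable hash of a config (MD5 of its sorted JSON)."""
    return hashlib.md5(json.dumps(config, sort_keys=True).encode()).hexdigest()


class FakeTrial:
    """Stand-in for an optuna trial that already carries a sampled config hash."""

    def __init__(self, hash_: str) -> None:
        self.params = {"hash": hash_}


def make_objective(
    config_mapping: Dict[str, dict],
    evaluate: Callable[[dict], float],
) -> Callable[[FakeTrial], float]:
    """Return a function that scores the config selected by a trial."""

    def objective(trial: FakeTrial) -> float:
        # Look up the full config from the hash stored on the trial (assumed layout).
        config = config_mapping[trial.params["hash"]]
        # In the real workflow this step would be a backtest plus metric aggregation.
        return evaluate(config)

    return objective


# Toy pool of two configs; the "score" is just the lag, for illustration only.
configs = [{"model": "naive", "lag": 1}, {"model": "naive", "lag": 7}]
mapping = {config_hash(c): c for c in configs}
objective = make_objective(mapping, evaluate=lambda c: float(c["lag"]))
print(objective(FakeTrial(config_hash(configs[1]))))  # 7.0
```

Keeping only a hash on the trial and resolving it through config_mapping means the study storage does not need to hold full pipeline configs.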
- summary() -> DataFrame
Get Auto trials summary.
The summary contains the following columns:
hash: hash of the pipeline;
pipeline: pipeline object;
metrics: columns with metrics’ values;
elapsed_time: fitting time of pipeline (doesn’t include model initialization);
state: state of the trial;
study: name of the study in which trial was made.
- Returns:
dataframe with detailed info on each performed trial
- Return type:
study_dataframe