etna.auto.Tune

class Tune(pipeline: BasePipeline, target_metric: BaseMetric, horizon: int, metric_aggregation: Literal['median', 'mean', 'std', 'notna_size', 'percentile_5', 'percentile_25', 'percentile_75', 'percentile_95'] = 'mean', backtest_params: dict | None = None, experiment_folder: str | None = None, runner: AbstractRunner | None = None, storage: BaseStorage | None = None, metrics: List[BaseMetric] | None = None, sampler: BaseSampler | None = None, params_to_tune: Dict[str, BaseDistribution] | None = None)

Bases: AutoBase

Automatic tuning of a custom pipeline.

This class takes a given pipeline and tries to optimize its hyperparameters using params_to_tune.

Trials with duplicate parameters are skipped and previously computed results are returned.
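The duplicate-skipping behaviour can be pictured as a cache keyed by a hash of the sampled parameters. The following is only a minimal sketch of that idea and not the library's implementation; `params_hash` and `CachedEvaluator` are hypothetical names introduced for illustration.

```python
import hashlib
import json


def params_hash(params: dict) -> str:
    """Stable hash of a parameter dict (key order does not matter)."""
    encoded = json.dumps(params, sort_keys=True).encode("utf-8")
    return hashlib.sha256(encoded).hexdigest()


class CachedEvaluator:
    """Evaluate parameter sets, reusing stored results for repeated ones."""

    def __init__(self, evaluate):
        self._evaluate = evaluate
        self._results = {}
        self.n_evaluations = 0

    def __call__(self, params: dict) -> float:
        key = params_hash(params)
        if key not in self._results:
            # First time these parameters are seen: run the expensive evaluation.
            self._results[key] = self._evaluate(params)
            self.n_evaluations += 1
        # Duplicates fall through to the previously computed result.
        return self._results[key]


# Toy score standing in for a backtest: distance of the lag from 7.
evaluator = CachedEvaluator(lambda p: abs(p["lag"] - 7) / 7)
scores = [evaluator({"lag": 3}), evaluator({"lag": 7}), evaluator({"lag": 3})]
```

Here the third call repeats the first parameter set, so only two evaluations are actually run.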

Note

This class requires the auto extension to be installed. Read more about this on the installation page.

Initialize Tune class.

Parameters:

pipeline (BasePipeline) – Pipeline to optimize.
target_metric (BaseMetric) – Metric to optimize.
horizon (int) – Horizon to forecast for.
metric_aggregation (Literal['median', 'mean', 'std', 'notna_size', 'percentile_5', 'percentile_25', 'percentile_75', 'percentile_95']) – Aggregation method for per-segment metrics. By default, mean aggregation is used.
backtest_params (dict | None) – Custom parameters for backtest instead of default backtest parameters.
experiment_folder (str | None) – Name for saving experiment results; it determines the name of the optuna study. By default, it isn’t set.
runner (AbstractRunner | None) – Runner to use for distributed training. By default, LocalRunner is used.
storage (BaseStorage | None) – Optuna storage to use. By default, sqlite storage is used with name “etna-auto.db”.
metrics (List[BaseMetric] | None) – List of metrics to compute. By default, Sign, SMAPE, MAE, MSE, MedAE metrics are used.
sampler (BaseSampler | None) – Optuna sampler to use. By default, TPE sampler is used.
params_to_tune (Dict[str, BaseDistribution] | None) – Parameters of pipeline that should be tuned with corresponding tuning distributions. By default, pipeline.params_to_tune() is used.
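To illustrate what metric_aggregation does, the sketch below collapses per-segment metric values into the single number that is then optimized. It covers only a subset of the listed methods, and it is an assumption-laden illustration rather than the library's code: dropping NaN values before aggregating and the linear-interpolation percentile are choices made for this sketch, and `aggregate` and `percentile` are hypothetical helpers.

```python
import math
import statistics


def percentile(values, q):
    """Percentile with linear interpolation between the closest ranks."""
    ordered = sorted(values)
    position = (len(ordered) - 1) * q / 100
    lower = math.floor(position)
    upper = math.ceil(position)
    fraction = position - lower
    return ordered[lower] + (ordered[upper] - ordered[lower]) * fraction


def aggregate(per_segment: dict, method: str = "mean") -> float:
    """Collapse per-segment metric values into a single number."""
    # Assumption of this sketch: NaN values are ignored by every method.
    values = [v for v in per_segment.values() if not math.isnan(v)]
    if method == "mean":
        return statistics.fmean(values)
    if method == "median":
        return statistics.median(values)
    if method == "notna_size":
        return float(len(values))
    if method.startswith("percentile_"):
        return percentile(values, int(method.split("_")[1]))
    raise ValueError(f"Unsupported aggregation: {method}")


# Made-up per-segment metric values, one of them missing.
smape_by_segment = {
    "segment_a": 10.0,
    "segment_b": 20.0,
    "segment_c": 30.0,
    "segment_d": float("nan"),
}
mean_value = aggregate(smape_by_segment, "mean")
p75_value = aggregate(smape_by_segment, "percentile_75")
```

Choosing a high percentile instead of the mean makes the tuning focus on the worst-performing segments rather than on the average one.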

Methods

`fit`(ts[, timeout, n_trials, initializer, ...])	Start automatic pipeline tuning.
`objective`(ts, pipeline, params_to_tune, ...)	Optuna objective wrapper.
`summary`()	Get trials summary.
`top_k`([k])	Get top k pipelines with the best metric value.

fit(ts: TSDataset, timeout: int | None = None, n_trials: int | None = None, initializer: _Initializer | None = None, callback: _Callback | None = None, **kwargs) → BasePipeline

Start automatic pipeline tuning.

Parameters:

ts (TSDataset) – TSDataset to fit on.
timeout (int | None) – Timeout for optuna. N.B. this is the timeout for each worker. By default, it isn’t set.
n_trials (int | None) – Number of trials for optuna. N.B. this is the number of trials for each worker. By default, it isn’t set.
initializer (_Initializer | None) – Object that is called before each pipeline backtest, can be used to initialize loggers.
callback (_Callback | None) – Object that is called after each pipeline backtest, can be used to log extra metrics.
**kwargs – Additional parameters for optuna.study.Study.optimize().

Return type:

BasePipeline
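A possible end-to-end use of the class is sketched below. It assumes etna is installed with the auto extension, that `NaiveModel`, `Pipeline`, `SMAPE` and `etna.distributions.IntDistribution` are available under the import paths shown, and that tunable parameters are addressed with dotted keys such as `"model.lag"`; these details are not stated on this page and should be checked against the installed version. The function name `tune_naive_pipeline` is hypothetical.

```python
def tune_naive_pipeline(ts, horizon=7, n_trials=20):
    """Tune the lag of a naive model; needs etna with the auto extension."""
    # Imports are local so this sketch can be loaded without etna installed.
    from etna.auto import Tune
    from etna.distributions import IntDistribution  # assumed import path
    from etna.metrics import SMAPE
    from etna.models import NaiveModel
    from etna.pipeline import Pipeline

    pipeline = Pipeline(model=NaiveModel(lag=1), horizon=horizon)
    tuner = Tune(
        pipeline=pipeline,
        target_metric=SMAPE(),
        horizon=horizon,
        metric_aggregation="mean",
        backtest_params={"n_folds": 3},
        # Assumed key format: "<pipeline attribute>.<parameter name>".
        params_to_tune={"model.lag": IntDistribution(low=1, high=14)},
    )
    # With the default LocalRunner, n_trials applies to the single local worker.
    best_pipeline = tuner.fit(ts=ts, n_trials=n_trials)
    return best_pipeline, tuner.summary(), tuner.top_k(k=3)
```

Calling `tune_naive_pipeline(ts)` with a prepared TSDataset would return the best pipeline together with the trials summary and the three best candidates. Because the default storage is the sqlite file “etna-auto.db”, repeated runs in the same working directory may reuse that file.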

static objective(ts: TSDataset, pipeline: BasePipeline, params_to_tune: Dict[str, BaseDistribution], target_metric: BaseMetric, metric_aggregation: Literal['median', 'mean', 'std', 'notna_size', 'percentile_5', 'percentile_25', 'percentile_75', 'percentile_95'], metrics: List[BaseMetric], backtest_params: dict, initializer: _Initializer | None = None, callback: _Callback | None = None) → Callable[[Trial], float]

Optuna objective wrapper.

Parameters:

ts (TSDataset) – TSDataset to fit on.
pipeline (BasePipeline) – Pipeline to tune.
params_to_tune (Dict[str, BaseDistribution]) – Parameters of pipeline that should be tuned with corresponding tuning distributions.
target_metric (BaseMetric) – Metric to optimize.
metric_aggregation (Literal['median', 'mean', 'std', 'notna_size', 'percentile_5', 'percentile_25', 'percentile_75', 'percentile_95']) – Aggregation method for per-segment metrics.
metrics (List[BaseMetric]) – List of metrics to compute.
backtest_params (dict) – Custom parameters for backtest instead of default backtest parameters.
initializer (_Initializer | None) – Object that is called before each pipeline backtest, can be used to initialize loggers.
callback (_Callback | None) – Object that is called after each pipeline backtest, can be used to log extra metrics.

Returns:

function that runs specified trial and returns its evaluated score

Return type:

Callable[[Trial], float]
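The wrapper pattern described here, a function that captures its configuration and returns a callable taking a trial and producing a score, can be sketched without Optuna as follows. `StubTrial`, `make_objective` and the toy score function are hypothetical stand-ins; only the `suggest_int(name, low, high)` call mirrors the real Optuna trial interface.

```python
from typing import Callable, Dict, Tuple


class StubTrial:
    """Minimal stand-in for an Optuna trial (hypothetical, for illustration)."""

    def __init__(self, fixed_params: Dict[str, int]):
        self._fixed_params = fixed_params
        self.params: Dict[str, int] = {}

    def suggest_int(self, name: str, low: int, high: int) -> int:
        # A real sampler would pick the value; here it is fixed and clipped.
        value = min(max(self._fixed_params[name], low), high)
        self.params[name] = value
        return value


def make_objective(
    search_space: Dict[str, Tuple[int, int]],
    score_fn: Callable[[Dict[str, int]], float],
) -> Callable[[StubTrial], float]:
    """Capture the search space and scoring function in a closure."""

    def _objective(trial: StubTrial) -> float:
        # Ask the trial for one value per tunable parameter, then score them.
        params = {
            name: trial.suggest_int(name, low, high)
            for name, (low, high) in search_space.items()
        }
        return score_fn(params)

    return _objective


# Toy score standing in for "backtest, compute metrics, aggregate".
objective = make_objective(
    {"model.lag": (1, 14)},
    lambda p: abs(p["model.lag"] - 7) / 7,
)
score = objective(StubTrial({"model.lag": 10}))
```

The returned callable has the shape that `optuna.study.Study.optimize()` expects from an objective: it receives a trial and returns a float.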

summary() → DataFrame

Get trials summary.

The summary contains the following columns:

hash: hash of the pipeline;
pipeline: pipeline object;
metrics: columns with metrics’ values;
elapsed_time: fitting time of pipeline (doesn’t include model initialization);
state: state of the trial.

Returns:

dataframe with detailed info on each performed trial

Return type:

DataFrame

top_k(k: int = 5) → List[BasePipeline]

Get top k pipelines with the best metric value.

Only complete and non-duplicate trials are taken into account.

Parameters:

k (int) – Number of pipelines to return.

Returns:

List of top k pipelines.

Return type:

List[BasePipeline]
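The selection rule (complete entries only, duplicates removed, best metric first) can be sketched on plain dictionaries as below. This is an illustration under assumptions, not the library's code: the record layout, the `score` field, the `greater_is_better` flag and the example values are all made up, and the state names follow Optuna's `COMPLETE` and `FAIL` trial states.

```python
def top_k(trials, k=5, greater_is_better=False):
    """Return the k best pipelines among complete, non-duplicate trials."""
    seen = set()
    candidates = []
    for trial in trials:
        # Skip failed or unfinished trials and repeated pipeline hashes.
        if trial["state"] != "COMPLETE" or trial["hash"] in seen:
            continue
        seen.add(trial["hash"])
        candidates.append(trial)
    # Lower is better for error metrics; flip the order otherwise.
    candidates.sort(key=lambda t: t["score"], reverse=greater_is_better)
    return [t["pipeline"] for t in candidates[:k]]


# Made-up trial records; strings stand in for pipeline objects.
trials = [
    {"hash": "a1", "pipeline": "lag=3", "score": 12.5, "state": "COMPLETE"},
    {"hash": "b2", "pipeline": "lag=7", "score": 8.0, "state": "COMPLETE"},
    {"hash": "a1", "pipeline": "lag=3", "score": 12.5, "state": "COMPLETE"},
    {"hash": "c3", "pipeline": "lag=9", "score": 5.0, "state": "FAIL"},
    {"hash": "d4", "pipeline": "lag=1", "score": 20.0, "state": "COMPLETE"},
]
best = top_k(trials, k=2)
```

Note that the failed trial is excluded even though its score is the lowest, and the repeated hash contributes only one candidate.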