etna.transforms.DensityOutliersTransform

class DensityOutliersTransform(in_column: str, window_size: int = 15, distance_coef: float = 3, n_neighbors: int = 3, distance_func: Literal['absolute_difference'] | Callable[[float, float], float] = 'absolute_difference', ignore_flag_column: str | None = None)

Bases: OutliersTransform

Transform that uses get_anomalies_density() to find anomalies in data.

Warning

This transform can suffer from look-ahead bias. To transform data at a given timestamp, it uses information from the whole training part.
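The effect can be illustrated with a minimal sketch in plain Python. It assumes, as the distance_coef description suggests, that the distance threshold is a multiple of a standard deviation computed over the fitted data; the series and variable names below are hypothetical and the code does not call etna.

```python
from statistics import pstdev

# Hypothetical training values for one segment; the spike arrives late.
train = [10.0, 10.5, 9.5, 10.0, 10.2, 9.8, 30.0, 10.1]
distance_coef = 3

# Threshold that would be available using only the first four points.
past_only = distance_coef * pstdev(train[:4])

# Threshold computed once over the whole training part (assumed behaviour).
whole_train = distance_coef * pstdev(train)

# The later spike widens the threshold applied to the earlier points.
print(past_only < whole_train)  # prints True
```

Under this assumption, the decision for an early timestamp depends on values observed after it.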

Create instance of DensityOutliersTransform.

Parameters:
  • in_column (str) – name of processed column

  • window_size (int) – size of windows to build

  • distance_coef (float) – multiplier of the standard deviation that sets the distance threshold for deciding whether two points are close to each other

  • n_neighbors (int) – minimum number of close neighbors a point must have in order not to be considered an outlier

  • distance_func (Literal['absolute_difference'] | Callable[[float, float], float]) – distance function. If a string is specified, the corresponding vectorized implementation is used. A custom callable is applied as a scalar function, which results in worse performance.

  • ignore_flag_column (str | None) – name of the column that marks values to skip during the outlier check
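How the parameters above interact can be sketched in plain Python. This is a simplified illustration of a density-based rule, not the library's implementation: the function name density_outlier_indices, the exact window boundaries, the strict comparison, and the use of the population standard deviation of the whole series are all assumptions.

```python
from statistics import pstdev
from typing import Callable, List, Sequence


def absolute_difference(x: float, y: float) -> float:
    """Scalar distance standing in for the 'absolute_difference' option."""
    return abs(x - y)


def density_outlier_indices(
    values: Sequence[float],
    window_size: int = 15,
    distance_coef: float = 3,
    n_neighbors: int = 3,
    distance_func: Callable[[float, float], float] = absolute_difference,
) -> List[int]:
    """Return indices of points that lack enough close neighbors in every window."""
    # Assumption: threshold = distance_coef * std of the whole series.
    threshold = distance_coef * pstdev(values)
    n = len(values)
    outliers = []
    for idx in range(n):
        is_outlier = True
        # Slide over every window of length window_size that contains idx.
        first_start = max(0, idx - window_size + 1)
        last_start = max(0, min(idx, n - window_size))
        for start in range(first_start, last_start + 1):
            window = range(start, min(start + window_size, n))
            close = sum(
                1
                for j in window
                if j != idx and distance_func(values[idx], values[j]) < threshold
            )
            # One sufficiently dense window is enough to keep the point.
            if close >= n_neighbors:
                is_outlier = False
                break
        if is_outlier:
            outliers.append(idx)
    return outliers


series = [1.0, 1.1, 0.9, 1.0, 25.0, 1.05, 0.95, 1.0, 1.1, 0.9]
result = density_outlier_indices(series, window_size=5, distance_coef=1, n_neighbors=2)
print(result)  # prints [4]
```

Passing a custom scalar callable as distance_func, for example `lambda x, y: (x - y) ** 2`, follows the same pattern; in the sketch, as in the description above, such a callable is evaluated pair by pair.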

Methods

detect_outliers(ts)

Call get_anomalies_density() function with self parameters.

fit(ts)

Fit the transform.

fit_transform(ts)

Fit and transform TSDataset.

get_regressors_info()

Return the list with regressors created by the transform.

inverse_transform(ts)

Inverse transform TSDataset.

load(path)

Load an object.

params_to_tune()

Get default grid for tuning hyperparameters.

save(path)

Save the object.

set_params(**params)

Return new object instance with modified parameters.

to_dict()

Collect all information about etna object in dict.

transform(ts)

Transform TSDataset inplace.

Attributes

This class stores its __init__ parameters as attributes.

original_values

Backward compatibility property.

outliers_timestamps

Backward compatibility property.

detect_outliers(ts: TSDataset) → Dict[str, Series]

Call get_anomalies_density() function with self parameters.

Parameters:

ts (TSDataset) – dataset to process

Returns:

dict of outliers in format {segment: [outliers_timestamps]}

Return type:

Dict[str, Series]

fit(ts: TSDataset) → OutliersTransform

Fit the transform.

Parameters:

ts (TSDataset) – Dataset to fit the transform on.

Returns:

The fitted transform instance.

Return type:

OutliersTransform

fit_transform(ts: TSDataset) → TSDataset

Fit and transform TSDataset.

May be reimplemented, but this is not recommended.

Parameters:

ts (TSDataset) – TSDataset to transform.

Returns:

Transformed TSDataset.

Return type:

TSDataset

get_regressors_info() → List[str]

Return the list with regressors created by the transform.

Returns:

List with regressors created by the transform.

Return type:

List[str]

inverse_transform(ts: TSDataset) → TSDataset

Inverse transform TSDataset.

Apply the _inverse_transform method.

Parameters:

ts (TSDataset) – TSDataset to be inverse transformed.

Returns:

TSDataset after applying inverse transformation.

Return type:

TSDataset

classmethod load(path: Path) → Self

Load an object.

Warning

This method uses the dill module, which is not secure. It is possible to construct malicious data that will execute arbitrary code during loading. Never load data that could have come from an untrusted source or that could have been tampered with.
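One way to detect tampering before loading is to compare the file against a digest recorded at save time. The following is a minimal sketch using only the standard library; the helper names sha256_of and is_untampered and the file name transform.zip are hypothetical, and the check is only meaningful if the recorded digest is kept somewhere an attacker cannot modify. It does not make loading untrusted data safe.

```python
import hashlib
import hmac
import tempfile
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Compute the SHA-256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as f:
        for chunk in iter(lambda: f.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest()


def is_untampered(path: Path, expected_hex: str) -> bool:
    """Return True if the file still matches the digest recorded earlier."""
    return hmac.compare_digest(sha256_of(path), expected_hex)


with tempfile.TemporaryDirectory() as tmp:
    artifact = Path(tmp) / "transform.zip"  # hypothetical saved object
    artifact.write_bytes(b"example payload")
    recorded = sha256_of(artifact)  # store this in a trusted location

    ok_before = is_untampered(artifact, recorded)
    artifact.write_bytes(b"modified payload")  # simulate tampering
    ok_after = is_untampered(artifact, recorded)

print(ok_before, ok_after)  # prints True False
```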

Parameters:

path (Path) – Path to load object from.

Returns:

Loaded object.

Return type:

Self

params_to_tune() → Dict[str, BaseDistribution]

Get default grid for tuning hyperparameters.

This grid tunes parameters: window_size, distance_coef, n_neighbors. Other parameters are expected to be set by the user.

Returns:

Grid to tune.

Return type:

Dict[str, BaseDistribution]

save(path: Path)

Save the object.

Parameters:

path (Path) – Path to save object to.

set_params(**params: dict) → Self

Return new object instance with modified parameters.

The method also allows changing parameters of nested objects within the current object. For example, it is possible to change the parameters of a model in a Pipeline.

Nested parameters are expected to be in a <component_1>.<...>.<parameter> form, where components are separated by a dot.
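The dotted form can be pictured with a small sketch over plain dicts and lists. It only illustrates how such a path addresses a nested value and why a new object is returned; the function with_params and the config layout are hypothetical and are not how etna implements set_params.

```python
import copy
from typing import Any


def with_params(config: Any, **params: Any) -> Any:
    """Return a modified copy of a nested dict/list structure.

    Keys are dot-separated paths; numeric components index into lists.
    """
    updated = copy.deepcopy(config)  # the original stays untouched
    for path, value in params.items():
        *parents, last = path.split(".")
        node = updated
        for key in parents:
            node = node[int(key)] if isinstance(node, list) else node[key]
        if isinstance(node, list):
            node[int(last)] = value
        else:
            node[last] = value
    return updated


config = {
    "model": {"lag": 1},
    "transforms": [{"in_column": "target", "value": 1}],
    "horizon": 3,
}
updated = with_params(config, **{"model.lag": 3, "transforms.0.value": 2})

print(updated["model"]["lag"], updated["transforms"][0]["value"])  # prints 3 2
print(config["model"]["lag"])  # prints 1
```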

Parameters:

**params (dict) – Estimator parameters

Returns:

New instance with changed parameters

Return type:

Self

Examples

>>> from etna.pipeline import Pipeline
>>> from etna.models import NaiveModel
>>> from etna.transforms import AddConstTransform
>>> model = NaiveModel(lag=1)
>>> transforms = [AddConstTransform(in_column="target", value=1)]
>>> pipeline = Pipeline(model, transforms=transforms, horizon=3)
>>> pipeline.set_params(**{"model.lag": 3, "transforms.0.value": 2})
Pipeline(model = NaiveModel(lag = 3, ), transforms = [AddConstTransform(in_column = 'target', value = 2, inplace = True, out_column = None, )], horizon = 3, )

to_dict()

Collect all information about etna object in dict.

transform(ts: TSDataset) → TSDataset

Transform TSDataset inplace.

Parameters:

ts (TSDataset) – Dataset to transform.

Returns:

Transformed TSDataset.

Return type:

TSDataset

property original_values: Dict[str, Series] | None

Backward compatibility property.

property outliers_timestamps: Dict[str, List[Timestamp]] | Dict[str, List[int]] | None

Backward compatibility property.