etna.transforms.MeanEncoderTransform
- class MeanEncoderTransform(in_column: str, out_column: str, mode: EncoderMode | str = 'per-segment', handle_missing: str = MissingMode.category, smoothing: int = 1)
- Bases: IrreversibleTransform
- Encodes a categorical feature.
- For timestamps before the last timestamp seen in fit, the encoding is computed with the formula below:
\[\frac{TargetSum + RunningMean \cdot Smoothing}{FeatureCount + Smoothing}\]
- where
- TargetSum is the sum of the target for the current category over all timestamps before the current one (the current timestamp is not included)
- RunningMean is the mean of the target over all timestamps before the current one (the current timestamp is not included)
- FeatureCount is the number of timestamps before the current one at which the feature has the same category as at the current timestamp (the current timestamp is not included)
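A minimal sketch of the formula above for a single segment, written in plain Python rather than taken from etna's implementation. The function name `running_mean_encoding` and the toy data are hypothetical, and returning NaN at the first timestamp (where no history exists yet) is an assumption, not documented behavior.

```python
from collections import defaultdict


def running_mean_encoding(categories, targets, smoothing=1):
    """Encode each position using only strictly earlier observations."""
    target_sum = defaultdict(float)   # TargetSum per category
    feature_count = defaultdict(int)  # FeatureCount per category
    total_sum, total_count = 0.0, 0   # used to compute RunningMean
    encoded = []
    for category, target in zip(categories, targets):
        if total_count == 0:
            # Assumption: no history yet, so the encoding is undefined.
            encoded.append(float("nan"))
        else:
            running_mean = total_sum / total_count
            encoded.append(
                (target_sum[category] + running_mean * smoothing)
                / (feature_count[category] + smoothing)
            )
        # Update the statistics only after encoding the current timestamp.
        target_sum[category] += target
        feature_count[category] += 1
        total_sum += target
        total_count += 1
    return encoded


encoded = running_mean_encoding(
    ["a", "b", "a", "a", "b"], [1.0, 3.0, 2.0, 6.0, 4.0], smoothing=1
)
```

With this toy input, positions 1 to 4 evaluate to 1.0, 1.5, 5/3 and 3.0. A category that has not been seen before (position 1) falls back to the running mean, because TargetSum and FeatureCount are both zero.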
- For future timestamps:
- for known categories, the encoding is filled with the global mean of the target for that category, calculated during fit
- for unknown categories, the encoding is filled with the global mean of the target over the whole dataset, calculated during fit
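The fallback for future timestamps can be illustrated with a short plain-Python sketch. The helper names `fit_global_means` and `encode_future` are hypothetical, and the sketch assumes plain (unsmoothed) means, which is one reading of the description above rather than a confirmed detail of etna.

```python
def fit_global_means(categories, targets):
    """Compute the per-category mean and the overall mean of the target."""
    sums, counts = {}, {}
    for category, target in zip(categories, targets):
        sums[category] = sums.get(category, 0.0) + target
        counts[category] = counts.get(category, 0) + 1
    category_means = {c: sums[c] / counts[c] for c in sums}
    global_mean = sum(targets) / len(targets)
    return category_means, global_mean


def encode_future(categories, category_means, global_mean):
    """Known categories get their own mean, unknown ones the global mean."""
    return [category_means.get(c, global_mean) for c in categories]


category_means, global_mean = fit_global_means(
    ["a", "b", "a", "a", "b"], [1.0, 3.0, 2.0, 6.0, 4.0]
)
future = encode_future(["a", "c", "b"], category_means, global_mean)
```

Here "c" never appeared during fitting, so it receives the dataset-wide mean, while "a" and "b" receive their own category means.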
- All types of NaN values are considered a single category.
- Init MeanEncoderTransform.
- Parameters:
- in_column (str) – categorical column to apply transform 
- out_column (str) – name of added column 
- mode (EncoderMode | str) – mode to encode segments
- 'per-segment' - statistics are calculated for each segment individually
- 'macro' - statistics are calculated across all segments. In this mode the transform can work with new segments that were not seen during fit
 
- handle_missing (str) – mode to handle missing values in in_column
- 'category' - NaNs are interpreted as a separate category
- 'global_mean' - NaNs are filled with the running mean
 
- smoothing (int) – smoothing parameter: the weight given to RunningMean in the formula above
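The difference between the two mode values can be sketched in plain Python. This only illustrates how the statistics are grouped; the helper `mean_by_category` and the segment names are hypothetical and not part of etna.

```python
def mean_by_category(rows):
    """rows: iterable of (category, target) pairs."""
    sums, counts = {}, {}
    for category, target in rows:
        sums[category] = sums.get(category, 0.0) + target
        counts[category] = counts.get(category, 0) + 1
    return {c: sums[c] / counts[c] for c in sums}


data = {
    "segment_a": [("x", 1.0), ("y", 3.0), ("x", 3.0)],
    "segment_b": [("x", 10.0), ("y", 20.0)],
}

# 'per-segment': one set of statistics for each segment.
per_segment = {seg: mean_by_category(rows) for seg, rows in data.items()}

# 'macro': one set of statistics pooled over all segments.
macro = mean_by_category(row for rows in data.values() for row in rows)
```

The pooled statistics do not depend on the segment name, which is why a macro-style encoder can be applied to a segment that was absent during fitting, whereas `per_segment` has no entry for it.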
 
- Methods
- fit(ts) - Fit the transform.
- fit_transform(ts) - Fit and transform TSDataset.
- get_regressors_info() - Return the list with regressors created by the transform.
- inverse_transform(ts) - Inverse transform TSDataset.
- load(path) - Load an object.
- params_to_tune() - Get default grid for tuning hyperparameters.
- save(path) - Save the object.
- set_params(**params) - Return new object instance with modified parameters.
- to_dict() - Collect all information about etna object in dict.
- transform(ts) - Transform TSDataset inplace.
- Attributes
- This class stores its __init__ parameters as attributes.
- idx
- fit(ts: TSDataset) → Transform
- Fit the transform. - Parameters:
- ts (TSDataset) – Dataset to fit the transform on. 
- Returns:
- The fitted transform instance. 
- Return type:
- Transform 
 
- fit_transform(ts: TSDataset) → TSDataset
- Fit and transform TSDataset.
- May be reimplemented, but this is not recommended.
- classmethod load(path: Path) → Self
- Load an object.
- Warning
- This method uses the dill module, which is not secure. It is possible to construct malicious data which will execute arbitrary code during loading. Never load data that could have come from an untrusted source, or that could have been tampered with.
- Parameters:
- path (Path) – Path to load object from. 
- Returns:
- Loaded object. 
- Return type:
- Self 
 
- params_to_tune() → Dict[str, BaseDistribution]
- Get default grid for tuning hyperparameters.
- This grid tunes the smoothing parameter. Other parameters are expected to be set by the user.
- Returns:
- Grid to tune.
- Return type:
- Dict[str, BaseDistribution]
 
- set_params(**params: dict) → Self
- Return new object instance with modified parameters.
- The method also allows changing parameters of nested objects within the current object. For example, it is possible to change parameters of a model in a Pipeline.
- Nested parameters are expected to be in a <component_1>.<...>.<parameter> form, where components are separated by a dot.
- Parameters:
- **params (dict) – Estimator parameters 
- Returns:
- New instance with changed parameters 
- Return type:
- Self 
- Examples
>>> from etna.pipeline import Pipeline
>>> from etna.models import NaiveModel
>>> from etna.transforms import AddConstTransform
>>> model = NaiveModel(lag=1)
>>> transforms = [AddConstTransform(in_column="target", value=1)]
>>> pipeline = Pipeline(model, transforms=transforms, horizon=3)
>>> pipeline.set_params(**{"model.lag": 3, "transforms.0.value": 2})
Pipeline(model = NaiveModel(lag = 3, ), transforms = [AddConstTransform(in_column = 'target', value = 2, inplace = True, out_column = None, )], horizon = 3, )
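To make the dotted `<component_1>.<...>.<parameter>` form concrete, the sketch below expands such keys into a nested dictionary. The helper `expand_dotted_params` is hypothetical and is not how etna resolves nested parameters internally; it only shows how each dot separates one level of nesting.

```python
def expand_dotted_params(params):
    """Turn {"a.b.c": v} into {"a": {"b": {"c": v}}}."""
    nested = {}
    for dotted_key, value in params.items():
        *parents, leaf = dotted_key.split(".")
        node = nested
        for name in parents:
            # Descend one level per component, creating it if needed.
            node = node.setdefault(name, {})
        node[leaf] = value
    return nested


expanded = expand_dotted_params({"model.lag": 3, "transforms.0.value": 2})
```

In this sketch the list index in `transforms.0.value` stays a string key ("0"); mapping it to a list position is left out for brevity.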