etna.transforms.embeddings.models.TS2VecEmbeddingModel#

class TS2VecEmbeddingModel(input_dims: int, output_dims: int = 320, hidden_dims: int = 64, depth: int = 10, device: Literal['cpu', 'cuda'] = 'cpu', batch_size: int = 16, num_workers: int = 0, max_train_length: int | None = None, temporal_unit: int = 0, is_freezed: bool = False)[source]#

Bases: BaseEmbeddingModel

TS2Vec embedding model.

Even if there are NaNs in the input series, the embeddings will not contain NaNs.

Each subsequent call of the fit method continues training the same model.

For more details read the TS2Vec paper.

Notes

Model’s weights are transferred to cpu during loading.

Init TS2VecEmbeddingModel.

Parameters:
  • input_dims (int) – The input dimension. For a univariate time series, this should be set to 1.

  • output_dims (int) – The representation dimension.

  • hidden_dims (int) – The hidden dimension of the encoder.

  • depth (int) – The number of hidden residual blocks in the encoder.

  • device (Literal['cpu', 'cuda']) – The device used for training and inference. To swap device, change this attribute.

  • batch_size (int) – The batch size. To swap batch_size, change this attribute.

  • num_workers (int) – How many subprocesses to use for data loading. See torch.utils.data.DataLoader. To swap num_workers, change this attribute.

  • max_train_length (int | None) – The maximum allowed sequence length for training. A sequence longer than max_train_length is cropped into several subsequences, each of length at most max_train_length.

  • temporal_unit (int) – The minimum unit to perform temporal contrast. When training on a very long sequence, this parameter helps reduce time and memory costs.

  • is_freezed (bool) – Whether to freeze model in constructor or not. For more details see freeze method.

Notes

For long series, it is recommended to use the max_train_length parameter or to manually split the series into smaller subseries to reduce memory consumption.
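The cropping behaviour of max_train_length can be sketched in plain NumPy (crop_to_max_length is a hypothetical helper for illustration only, not part of the etna API, and the actual cropping logic may differ):

```python
import numpy as np


def crop_to_max_length(series: np.ndarray, max_train_length: int) -> list:
    """Split a (n_timestamps, input_dims) series into chunks of length <= max_train_length."""
    n_timestamps = series.shape[0]
    # number of chunks needed so that every chunk fits into max_train_length
    n_chunks = int(np.ceil(n_timestamps / max_train_length))
    return [series[i * max_train_length:(i + 1) * max_train_length] for i in range(n_chunks)]


# a univariate series of 2500 points cropped with max_train_length=1000
series = np.arange(2500, dtype=float).reshape(2500, 1)
chunks = crop_to_max_length(series, max_train_length=1000)
print([c.shape[0] for c in chunks])  # [1000, 1000, 500]
```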

Methods

encode_segment(x[, mask, sliding_length, ...]) – Create embeddings of the whole series.

encode_window(x[, mask, sliding_length, ...]) – Create embeddings of each series timestamp.

fit(x[, lr, n_epochs, n_iters, verbose]) – Fit TS2Vec embedding model.

freeze([is_freezed]) – Enable or disable skipping training in fit.

list_models() – Return a list of available pretrained models.

load([path, model_name]) – Load an object.

save(path) – Save the object.

set_params(**params) – Return a new object instance with modified parameters.

to_dict() – Collect all information about the etna object in a dict.

Attributes

This class stores its __init__ parameters as attributes.

is_freezed – Return whether to skip training during fit.

encode_segment(x: ndarray, mask: Literal['binomial', 'continuous', 'all_true', 'all_false', 'mask_last'] = 'all_true', sliding_length: int | None = None, sliding_padding: int = 0) ndarray[source]#

Create embeddings of the whole series.

Parameters:
  • x (ndarray) – data with shape (n_segments, n_timestamps, input_dims).

  • mask (Literal['binomial', 'continuous', 'all_true', 'all_false', 'mask_last']) –

    The mask applied by the encoder at inference time. The possible options are:

    • ’binomial’ - mask each timestamp with probability 0.5 (the mask used during training in the paper)

    • ’continuous’ - mask random windows of timestamps

    • ’all_true’ - mask none of the timestamps

    • ’all_false’ - mask all timestamps

    • ’mask_last’ - mask the last timestamp

  • sliding_length (int | None) – the length of the sliding window. When specified, sliding-window inference is applied to the series.

  • sliding_padding (int) – the length of contextual data used for inference of each sliding window.

Returns:

array with embeddings of shape (n_segments, output_dim)

Return type:

ndarray
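A minimal sketch of preparing x in the expected shape, using plain NumPy with two hypothetical univariate segments (input_dims=1):

```python
import numpy as np

# two univariate segments of 100 timestamps each
segment_a = np.sin(np.linspace(0, 10, 100))
segment_b = np.cos(np.linspace(0, 10, 100))

# stack into the expected (n_segments, n_timestamps, input_dims) layout
x = np.stack([segment_a, segment_b])[:, :, np.newaxis]
print(x.shape)  # (2, 100, 1)
```

An array like this could then be passed to encode_segment, which would return embeddings of shape (2, output_dims).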

encode_window(x: ndarray, mask: Literal['binomial', 'continuous', 'all_true', 'all_false', 'mask_last'] = 'all_true', sliding_length: int | None = None, sliding_padding: int = 0, encoding_window: int | None = None) ndarray[source]#

Create embeddings of each series timestamp.

Parameters:
  • x (ndarray) – data with shape (n_segments, n_timestamps, input_dims).

  • mask (Literal['binomial', 'continuous', 'all_true', 'all_false', 'mask_last']) –

    The mask applied by the encoder at inference time. The possible options are:

    • ’binomial’ - mask each timestamp with probability 0.5 (the mask used during training in the paper)

    • ’continuous’ - mask random windows of timestamps

    • ’all_true’ - mask none of the timestamps

    • ’all_false’ - mask all timestamps

    • ’mask_last’ - mask the last timestamp

  • sliding_length (int | None) – the length of the sliding window. When specified, sliding-window inference is applied to the series.

  • sliding_padding (int) – the length of contextual data used for inference of each sliding window.

  • encoding_window (int | None) – when specified, the computed representation is max-pooled over this window. This parameter is ignored when encoding the full series.

Returns:

array with embeddings of shape (n_segments, n_timestamps, output_dim)

Return type:

ndarray
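How sliding_length and sliding_padding partition a series can be illustrated with simple index arithmetic. This is a sketch assuming non-overlapping windows with left-side context padding; the actual implementation may lay windows out differently:

```python
def sliding_windows(n_timestamps: int, sliding_length: int, sliding_padding: int) -> list:
    """Yield (context_start, window_start, window_end) index triples for each window.

    Each window covers sliding_length timestamps; sliding_padding extra
    timestamps before the window serve as context during inference.
    """
    windows = []
    for start in range(0, n_timestamps, sliding_length):
        end = min(start + sliding_length, n_timestamps)
        context_start = max(start - sliding_padding, 0)
        windows.append((context_start, start, end))
    return windows


print(sliding_windows(10, sliding_length=4, sliding_padding=2))
# [(0, 0, 4), (2, 4, 8), (6, 8, 10)]
```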

fit(x: ndarray, lr: float = 0.001, n_epochs: int | None = None, n_iters: int | None = None, verbose: bool | None = None) TS2VecEmbeddingModel[source]#

Fit TS2Vec embedding model.

Parameters:
  • x (ndarray) – data with shape (n_segments, n_timestamps, input_dims).

  • lr (float) – The learning rate.

  • n_epochs (int | None) – The number of epochs. Training stops when this number is reached.

  • n_iters (int | None) – The number of iterations. Training stops when this number is reached. If neither n_epochs nor n_iters is specified, a default is used: n_iters is set to 200 for a dataset with size <= 100000 and to 600 otherwise.

  • verbose (bool | None) – Whether to print the training loss after each epoch.

Return type:

TS2VecEmbeddingModel
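The documented default for n_iters reduces to a one-line rule (default_n_iters is a hypothetical name for illustration):

```python
def default_n_iters(dataset_size: int) -> int:
    """Default number of iterations when neither n_epochs nor n_iters is given."""
    return 200 if dataset_size <= 100000 else 600


print(default_n_iters(100000), default_n_iters(100001))  # 200 600
```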

freeze(is_freezed: bool = True)[source]#

Enable or disable skipping training in fit.

Parameters:

is_freezed (bool) – whether to skip training during fit.

static list_models() List[str][source]#

Return a list of available pretrained models.

Main information about available models:

  • ts2vec_tiny:

    • Number of parameters - 40k

    • Dimension of output embeddings - 16

Returns:

List of available pretrained models.

Return type:

List[str]

classmethod load(path: Path | None = None, model_name: str | None = None) TS2VecEmbeddingModel[source]#

Load an object.

Model’s weights are transferred to cpu during loading.

Parameters:
  • path (Path | None) –

    Path to load object from.

    • if path is not None and model_name is None, load the local model from path.

    • if path is None and model_name is not None, save the external model_name model to the etna folder in the home directory and load it. If path already exists, the external model is not downloaded.

    • if path is not None and model_name is not None, save the external model_name model to path and load it. If path already exists, the external model is not downloaded.

  • model_name (str | None) – Name of external model to load. To get list of available models use list_models method.

Returns:

Loaded object.

Raises:
  • ValueError: – If neither path nor model_name is set.

  • NotImplementedError: – If model_name is not in the list of available model names.

Return type:

TS2VecEmbeddingModel
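The branching described above can be mirrored in a small sketch (resolve_load is purely illustrative; the cache location and return strings are assumptions, not the etna implementation):

```python
from pathlib import Path


def resolve_load(path, model_name):
    """Mirror the documented path/model_name branching of load (illustration only)."""
    if path is None and model_name is None:
        raise ValueError("Either `path` or `model_name` should be specified.")
    if model_name is None:
        # local model: just load from the given path
        return f"load local model from {path}"
    # hypothetical cache target: "the etna folder in the home directory" or the given path
    target = Path(path) if path is not None else Path.home() / "etna" / model_name
    if target.exists():
        # an existing path means the external model is not downloaded again
        return f"load cached model from {target}"
    return f"download {model_name} to {target}, then load"


print(resolve_load("model.pt", None))  # load local model from model.pt
```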

save(path: Path)[source]#

Save the object.

Parameters:

path (Path) – Path to save object to.

set_params(**params: dict) Self[source]#

Return new object instance with modified parameters.

The method also allows changing parameters of nested objects within the current object. For example, it is possible to change parameters of a model inside a Pipeline.

Nested parameters are expected to be in a <component_1>.<...>.<parameter> form, where components are separated by a dot.

Parameters:

**params (dict) – Estimator parameters

Returns:

New instance with changed parameters

Return type:

Self

Examples

>>> from etna.pipeline import Pipeline
>>> from etna.models import NaiveModel
>>> from etna.transforms import AddConstTransform
>>> model = NaiveModel(lag=1)
>>> transforms = [AddConstTransform(in_column="target", value=1)]
>>> pipeline = Pipeline(model, transforms=transforms, horizon=3)
>>> pipeline.set_params(**{"model.lag": 3, "transforms.0.value": 2})
Pipeline(model = NaiveModel(lag = 3, ), transforms = [AddConstTransform(in_column = 'target', value = 2, inplace = True, out_column = None, )], horizon = 3, )

to_dict()[source]#

Collect all information about etna object in dict.

property is_freezed[source]#

Return whether to skip training during fit.