etna.transforms.embeddings.models.TSTCCEmbeddingModel

class TSTCCEmbeddingModel(input_dims: int, output_dims: int = 32, tc_hidden_dim: int = 32, kernel_size: int = 7, dropout: float = 0.35, timesteps: int = 7, heads: int = 1, depth: int = 4, jitter_scale_ratio: float = 1.1, max_seg: int = 4, jitter_ratio: float = 0.8, use_cosine_similarity: bool = True, n_seq_steps: int = 0, device: Literal['cpu', 'cuda'] = 'cpu', batch_size: int = 16, num_workers: int = 0, is_freezed: bool = False)

Bases: BaseEmbeddingModel

TSTCC embedding model.

If there are NaNs in series, embeddings will not contain NaNs.

Each subsequent call to the fit method continues training the same model.

When using a custom output_dims, set it to a value greater than 3 so that the loss is calculated correctly.

For more details read the paper.

Notes

  • This model cannot be fitted with batch_size=1, so it cannot be fitted on a dataset with only 1 segment.

  • The model's weights are transferred to cpu during loading.

Init TSTCCEmbeddingModel.

Parameters:
  • input_dims (int) – The input dimension. For a univariate time series, this should be set to 1.

  • output_dims (int) – The representation dimension.

  • tc_hidden_dim (int) – The output dimension after temporal_contr_model.

  • kernel_size (int) – Kernel size of first convolution in encoder.

  • dropout (float) – Dropout rate in first convolution block in encoder.

  • timesteps (int) – The number of timestamps to predict in temporal contrasting model.

  • heads (int) – Number of heads in attention block in temporal contrasting model. Parameter output_dims must be a multiple of the number of heads.

  • depth (int) – Depth in attention block in temporal contrasting model.

  • n_seq_steps (int) – Max context size in temporal contrasting model.

  • jitter_scale_ratio (float) – Jitter ratio in weak augmentation.

  • max_seg (int) – Number of segments in strong augmentation.

  • jitter_ratio (float) – Jitter ratio in strong augmentation.

  • use_cosine_similarity (bool) – If True, NTXentLoss uses cosine similarity; if False, NTXentLoss uses the dot product.

  • device (Literal['cpu', 'cuda']) – The device used for training and inference. To swap device, change this attribute.

  • batch_size (int) – The batch size (number of segments in a batch). To swap batch_size, change this attribute.

  • num_workers (int) – How many subprocesses to use for data loading. See the torch.utils.data.DataLoader API reference. To swap num_workers, change this attribute.

  • is_freezed (bool) – Whether to freeze model in constructor or not. For more details see freeze method.
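The constraints stated on this page (output_dims greater than 3, output_dims a multiple of heads, no fitting with batch_size=1 or with a single segment) can be checked before a model is constructed. The helper below is a minimal sketch under those assumptions: check_tstcc_params is a hypothetical name, it is not part of etna, and it only mirrors the rules written here without calling the library.

```python
from typing import List


def check_tstcc_params(
    output_dims: int = 32,
    heads: int = 1,
    batch_size: int = 16,
    n_segments: int = 2,
) -> List[str]:
    """Return a list of problems found in a TSTCC configuration.

    Hypothetical helper: it encodes only the constraints documented on
    this page and does not import or call etna.
    """
    problems = []
    if output_dims <= 3:
        problems.append("output_dims should be greater than 3")
    if heads < 1:
        # Assumption: a non-positive number of heads is never meaningful.
        problems.append("heads must be a positive integer")
    elif output_dims % heads != 0:
        problems.append("output_dims must be a multiple of heads")
    if batch_size == 1:
        problems.append("batch_size=1 is not supported for fitting")
    if n_segments == 1:
        problems.append("a dataset with 1 segment cannot be used for fitting")
    return problems
```

An empty list means none of the documented constraints is violated; it does not guarantee that the model will train successfully.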

Methods

encode_segment(x)

Create embeddings of the whole series.

encode_window(x)

Create embeddings of each series timestamp.

fit(x[, n_epochs, lr, temperature, lambda1, ...])

Fit TSTCC embedding model.

freeze([is_freezed])

Enable or disable skipping training in fit.

list_models()

Return a list of available pretrained models.

load([path, model_name])

Load an object.

save(path)

Save the object.

set_params(**params)

Return new object instance with modified parameters.

to_dict()

Collect all information about etna object in dict.

Attributes

This class stores its __init__ parameters as attributes.

is_freezed

Return whether to skip training during fit.

encode_segment(x: ndarray) → ndarray

Create embeddings of the whole series.

Parameters:

x (ndarray) – Data with shape (n_segments, n_timestamps, input_dims).

Returns:

Array with embeddings of shape (n_segments, output_dims).

Return type:

ndarray

encode_window(x: ndarray) → ndarray

Create embeddings of each series timestamp.

Parameters:

x (ndarray) – Data with shape (n_segments, n_timestamps, input_dims).

Returns:

Array with embeddings of shape (n_segments, n_timestamps, output_dims).

Return type:

ndarray
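The input layout expected by encode_segment and encode_window, and the output shapes each one returns, can be sketched without the library. The two functions below are hypothetical helpers written with plain Python lists; they assume univariate series of equal length (so input_dims is 1) and only restate the shapes documented above.

```python
from typing import Dict, List, Sequence, Tuple


def stack_univariate(series: Dict[str, Sequence[float]]) -> List[List[List[float]]]:
    """Pack univariate series into nested lists of shape
    (n_segments, n_timestamps, 1)."""
    lengths = {len(values) for values in series.values()}
    if len(lengths) != 1:
        raise ValueError("series must be non-empty and have equal lengths")
    # Each scalar becomes a length-1 feature vector, so input_dims == 1.
    return [[[float(v)] for v in values] for values in series.values()]


def expected_shapes(
    n_segments: int, n_timestamps: int, output_dims: int
) -> Dict[str, Tuple[int, ...]]:
    """Output shapes as documented for the two encode methods."""
    return {
        "encode_segment": (n_segments, output_dims),
        "encode_window": (n_segments, n_timestamps, output_dims),
    }
```

Because the methods take an ndarray, the nested list would be converted with numpy.asarray before being passed to the model.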

fit(x: ndarray, n_epochs: int = 40, lr: float = 0.001, temperature: float = 0.2, lambda1: float = 1, lambda2: float = 0.7, verbose: bool = False) → TSTCCEmbeddingModel

Fit TSTCC embedding model.

Parameters:
  • x (ndarray) – Data with shape (n_segments, n_timestamps, input_dims).

  • n_epochs (int) – The number of epochs. Training stops once this number is reached.

  • lr (float) – The learning rate.

  • temperature (float) – Temperature in NTXentLoss.

  • lambda1 (float) – The relative weight of the first term in the loss (temporal contrasting loss).

  • lambda2 (float) – The relative weight of the second term in the loss (contextual contrasting loss).

  • verbose (bool) – Whether to print the training loss after each epoch.

Return type:

TSTCCEmbeddingModel
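The roles of lambda1 and lambda2 can be illustrated as a weighted sum of the two loss terms. This is a sketch that assumes the total loss is a plain weighted sum, as the parameter descriptions suggest; combined_loss is a hypothetical name, and how each term is computed inside the model is not shown here.

```python
def combined_loss(
    temporal_loss: float,
    contextual_loss: float,
    lambda1: float = 1.0,
    lambda2: float = 0.7,
) -> float:
    """Weight the temporal and contextual contrasting losses.

    Assumption: the total loss is lambda1 * temporal + lambda2 * contextual.
    The defaults match the defaults of fit.
    """
    return lambda1 * temporal_loss + lambda2 * contextual_loss
```

Setting lambda1 to 0 would leave only the contextual term, and setting lambda2 to 0 would leave only the temporal term.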

freeze(is_freezed: bool = True)

Enable or disable skipping training in fit.

Parameters:

is_freezed (bool) – Whether to skip training during fit.

static list_models() → List[str]

Return a list of available pretrained models.

Main information about available models:

  • tstcc_medium:

    • Number of parameters - 234k

    • Dimension of output embeddings - 16

Returns:

List of available pretrained models.

Return type:

List[str]

classmethod load(path: Path | None = None, model_name: str | None = None) → TSTCCEmbeddingModel

Load an object.

Model’s weights are transferred to cpu during loading.

Parameters:
  • path (Path | None) –

    Path to load object from.

    • if path is not None and model_name is None, load the local model from path.

    • if path is None and model_name is not None, save the external model_name model to the etna folder in the home directory and load it. If path exists, external model will not be downloaded.

    • if path is not None and model_name is not None, save the external model_name model to path and load it. If path exists, external model will not be downloaded.

  • model_name (str | None) – Name of the external model to load. To get the list of available models, use the list_models method.

Returns:

Loaded object.

Raises:
  • ValueError: – If neither path nor model_name is set.

  • NotImplementedError: – If model_name is not in the list of available model names.

Return type:

TSTCCEmbeddingModel
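The combinations of path and model_name described for load can be summarized as a small decision function. This is a sketch of the documented rules only: plan_load, the action labels, and the default_dir argument (standing in for the etna folder in the home directory) are hypothetical, and the order in which the real method checks for errors is an assumption.

```python
from pathlib import Path
from typing import List, Optional, Tuple


def plan_load(
    path: Optional[Path],
    model_name: Optional[str],
    available: List[str],
    default_dir: Path,
) -> Tuple[str, Path]:
    """Return (action, location) for the documented load behaviour.

    Hypothetical helper: it performs no download and loads nothing.
    """
    if model_name is None:
        if path is None:
            raise ValueError("at least one of path and model_name must be set")
        # Only path is given: load the local model from it.
        return ("load_local", path)
    if model_name not in available:
        raise NotImplementedError(f"unknown model name: {model_name}")
    # Assumption: with no path, the file is stored under default_dir
    # using the model name; the real file name may differ.
    target = path if path is not None else default_dir / model_name
    if target.exists():
        # An existing path means the external model is not downloaded again.
        return ("load_existing", target)
    return ("download_then_load", target)
```

In real use, the list of valid names would come from TSTCCEmbeddingModel.list_models().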

save(path: Path)

Save the object.

Parameters:

path (Path) – Path to save object to.

set_params(**params: dict) → Self

Return new object instance with modified parameters.

This method also allows changing parameters of nested objects within the current object. For example, it is possible to change parameters of a model in a Pipeline.

Nested parameters are expected to be in a <component_1>.<...>.<parameter> form, where components are separated by a dot.

Parameters:

**params (dict) – Estimator parameters

Returns:

New instance with changed parameters

Return type:

Self

Examples

>>> from etna.pipeline import Pipeline
>>> from etna.models import NaiveModel
>>> from etna.transforms import AddConstTransform
>>> model = NaiveModel(lag=1)
>>> transforms = [AddConstTransform(in_column="target", value=1)]
>>> pipeline = Pipeline(model, transforms=transforms, horizon=3)
>>> pipeline.set_params(**{"model.lag": 3, "transforms.0.value": 2})
Pipeline(model = NaiveModel(lag = 3, ), transforms = [AddConstTransform(in_column = 'target', value = 2, inplace = True, out_column = None, )], horizon = 3, )

to_dict()

Collect all information about etna object in dict.

property is_freezed

Return whether to skip training during fit.