Embedding models
This notebook contains examples of working with embedding models.
Table of contents
Using embedding models directly
Using embedding models with transforms
Baseline
EmbeddingSegmentTransform
EmbeddingWindowTransform
Saving and loading models
[1]:
import warnings
warnings.filterwarnings("ignore")
1. Using embedding models directly
There are two models for generating time series embeddings: TS2VecEmbeddingModel and TSTCCEmbeddingModel.
Each model has the following methods:
- fit: trains the model.
- encode_segment: generates one embedding for the whole series. These features are regressors.
- encode_window: generates an embedding for each timestamp. These features aren’t regressors, so a lag transformation should be applied to them before they are used in forecasting.
- freeze: enables or disables skipping training in the fit method. It is useful, for example, when you have a pretrained model and only want to generate embeddings without retraining during backtest.
- save and load: save and load pretrained models, respectively.
[2]:
from pytorch_lightning import seed_everything
seed_everything(42, workers=True)
Global seed set to 42
[2]:
42
[3]:
from etna.datasets import TSDataset
from etna.datasets import generate_ar_df
df = generate_ar_df(periods=10, start_time="2001-01-01", n_segments=3)
ts = TSDataset(df, freq="D")
ts.head()
[3]:
segment | segment_0 | segment_1 | segment_2 |
---|---|---|---|
feature | target | target | target |
timestamp | |||
2001-01-01 | 1.624345 | 1.462108 | -1.100619 |
2001-01-02 | 1.012589 | -0.598033 | 0.044105 |
2001-01-03 | 0.484417 | -0.920450 | 0.945695 |
2001-01-04 | -0.588551 | -1.304504 | 1.448190 |
2001-01-05 | 0.276856 | -0.170735 | 2.349046 |
Now let’s work with the models directly.
They expect an array of shape (n_segments, n_timestamps, num_features). The example uses TS2VecEmbeddingModel; TSTCCEmbeddingModel works the same way.
[4]:
x = ts.df.values.reshape(ts.size()).transpose(1, 0, 2)
x.shape
[4]:
(3, 10, 1)
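The reshape and transpose above can be checked on synthetic data. The following is a minimal sketch that uses only NumPy. It assumes the wide table stores its columns segment by segment (all features of the first segment, then the second, and so on); the array name wide and the sizes are made up for illustration.

```python
import numpy as np

n_timestamps, n_segments, n_features = 10, 3, 1

# Wide layout (assumed): one row per timestamp, columns grouped by segment.
wide = np.arange(n_timestamps * n_segments * n_features, dtype=float).reshape(
    n_timestamps, n_segments * n_features
)

# Split the column axis into (segment, feature), then move segments to the front.
x = wide.reshape(n_timestamps, n_segments, n_features).transpose(1, 0, 2)

print(x.shape)  # (3, 10, 1)

# The series of segment 1 is exactly column 1 of the wide table.
assert np.array_equal(x[1, :, 0], wide[:, 1])
```

With more than one feature per segment, the same two steps keep the features of a segment together on the last axis.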
[5]:
from etna.transforms.embeddings.models import TS2VecEmbeddingModel
from etna.transforms.embeddings.models import TSTCCEmbeddingModel
model_ts2vec = TS2VecEmbeddingModel(input_dims=1, output_dims=2)
model_ts2vec.fit(x, n_epochs=1)
segment_embeddings = model_ts2vec.encode_segment(x)
segment_embeddings.shape
[5]:
(3, 2)
Because we are using encode_segment, we get output_dims features, each consisting of one value per segment.
And what about encode_window?
[6]:
window_embeddings = model_ts2vec.encode_window(x)
window_embeddings.shape
[6]:
(3, 10, 2)
We get output_dims features, each consisting of n_timestamps values per segment.
2. Using embedding models with transforms
In this section we will test our models on an example dataset.
[7]:
HORIZON = 6
2.1 Baseline
Before working with embedding features, let’s make forecasts using ordinary lag features only.
[8]:
from etna.datasets import load_dataset
ts = load_dataset("m3_monthly")
ts.drop_features(features=["origin_timestamp"])
ts.df_exog = None
ts.head()
[8]:
segment | M1000_MACRO | M1001_MACRO | M1002_MACRO | M1003_MACRO | M1004_MACRO | M1005_MACRO | M1006_MACRO | M1007_MACRO | M1008_MACRO | M1009_MACRO | ... | M992_MACRO | M993_MACRO | M994_MACRO | M995_MACRO | M996_MACRO | M997_MACRO | M998_MACRO | M999_MACRO | M99_MICRO | M9_MICRO |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
feature | target | target | target | target | target | target | target | target | target | target | ... | target | target | target | target | target | target | target | target | target | target |
timestamp | |||||||||||||||||||||
0 | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | ... | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN |
1 | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | ... | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN |
2 | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | ... | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN |
3 | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | ... | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN |
4 | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | ... | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN |
5 rows × 1428 columns
[9]:
from etna.metrics import SMAPE
from etna.models import CatBoostMultiSegmentModel
from etna.pipeline import Pipeline
from etna.transforms import LagTransform
model = CatBoostMultiSegmentModel()
lag_transform = LagTransform(in_column="target", lags=list(range(HORIZON, HORIZON + 6)), out_column="lag")
pipeline = Pipeline(model=model, transforms=[lag_transform], horizon=HORIZON)
metrics_df, _, _ = pipeline.backtest(ts, metrics=[SMAPE()], n_folds=3)
[Parallel(n_jobs=1)]: Using backend SequentialBackend with 1 concurrent workers.
[Parallel(n_jobs=1)]: Done 1 out of 1 | elapsed: 4.3s remaining: 0.0s
[Parallel(n_jobs=1)]: Done 2 out of 2 | elapsed: 9.0s remaining: 0.0s
[Parallel(n_jobs=1)]: Done 3 out of 3 | elapsed: 13.6s remaining: 0.0s
[Parallel(n_jobs=1)]: Done 3 out of 3 | elapsed: 13.6s finished
[Parallel(n_jobs=1)]: Using backend SequentialBackend with 1 concurrent workers.
[Parallel(n_jobs=1)]: Done 1 out of 1 | elapsed: 0.4s remaining: 0.0s
[Parallel(n_jobs=1)]: Done 2 out of 2 | elapsed: 0.8s remaining: 0.0s
[Parallel(n_jobs=1)]: Done 3 out of 3 | elapsed: 1.2s remaining: 0.0s
[Parallel(n_jobs=1)]: Done 3 out of 3 | elapsed: 1.2s finished
[Parallel(n_jobs=1)]: Using backend SequentialBackend with 1 concurrent workers.
[Parallel(n_jobs=1)]: Done 1 out of 1 | elapsed: 0.0s remaining: 0.0s
[Parallel(n_jobs=1)]: Done 2 out of 2 | elapsed: 0.1s remaining: 0.0s
[Parallel(n_jobs=1)]: Done 3 out of 3 | elapsed: 0.2s remaining: 0.0s
[Parallel(n_jobs=1)]: Done 3 out of 3 | elapsed: 0.2s finished
[10]:
print("SMAPE: ", metrics_df["SMAPE"].mean())
SMAPE: 14.719683971886594
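To help read the scores in this notebook, here is a small sketch of SMAPE. It assumes the common percentage definition that ranges from 0 (perfect forecast) to 200; this is an illustration written for this text, not the code of the SMAPE class used above.

```python
from typing import Sequence


def smape(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Symmetric MAPE in percent: 0 is a perfect forecast, 200 is the maximum."""
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    total = 0.0
    for actual, forecast in zip(y_true, y_pred):
        denominator = abs(actual) + abs(forecast)
        # A 0/0 term is counted as a perfect prediction to avoid division by zero.
        if denominator > 0:
            total += abs(actual - forecast) / denominator
    return 200.0 * total / len(y_true)


print(smape([100.0, 200.0], [110.0, 180.0]))  # roughly 10
```

Under this definition a score above 100 means the typical error is larger than the average magnitude of the actual and forecast values.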
2.2 EmbeddingSegmentTransform
EmbeddingSegmentTransform calls the model’s encode_segment method internally.
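A segment embedding is one vector per series, so it can act as a regressor: the same values are valid for past and future timestamps. The sketch below illustrates that idea with NumPy only. It assumes the vector is simply repeated over every timestamp; the random array stands in for a real encode_segment result, and the sizes are made up.

```python
import numpy as np

n_segments, output_dims = 3, 2
history_len, horizon = 10, 6

# Stand-in for an encode_segment result: one vector per segment.
rng = np.random.default_rng(0)
segment_embeddings = rng.normal(size=(n_segments, output_dims))

# Repeat each segment's vector over all history and future timestamps.
features = np.repeat(
    segment_embeddings[:, np.newaxis, :], history_len + horizon, axis=1
)

print(features.shape)  # (3, 16, 2)

# The feature is constant in time, so it is known for the forecast period too.
assert np.array_equal(features[:, 0, :], features[:, -1, :])
```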
[11]:
from etna.transforms import EmbeddingSegmentTransform
from etna.transforms.embeddings.models import BaseEmbeddingModel
def forecast_with_segment_embeddings(emb_model: BaseEmbeddingModel, training_params: dict) -> float:
    model = CatBoostMultiSegmentModel()
    emb_transform = EmbeddingSegmentTransform(
        in_columns=["target"], embedding_model=emb_model, training_params=training_params, out_column="emb"
    )
    pipeline = Pipeline(model=model, transforms=[lag_transform, emb_transform], horizon=HORIZON)
    metrics_df, _, _ = pipeline.backtest(ts, metrics=[SMAPE()], n_folds=3)
    smape_score = metrics_df["SMAPE"].mean()
    return smape_score
You can inspect the parameters of the model’s fit method to see which training parameters can be passed to the transform.
Let’s begin with TSTCCEmbeddingModel.
[12]:
?TSTCCEmbeddingModel.fit
[13]:
import torch
device = "cuda" if torch.cuda.is_available() else "cpu"
emb_model = TSTCCEmbeddingModel(input_dims=1, tc_hidden_dim=16, depth=3, output_dims=6, device=device)
training_params = {"n_epochs": 10}
smape_score = forecast_with_segment_embeddings(emb_model, training_params)
[Parallel(n_jobs=1)]: Using backend SequentialBackend with 1 concurrent workers.
[Parallel(n_jobs=1)]: Done 1 out of 1 | elapsed: 35.0s remaining: 0.0s
[Parallel(n_jobs=1)]: Done 2 out of 2 | elapsed: 1.2min remaining: 0.0s
[Parallel(n_jobs=1)]: Done 3 out of 3 | elapsed: 1.8min remaining: 0.0s
[Parallel(n_jobs=1)]: Done 3 out of 3 | elapsed: 1.8min finished
[Parallel(n_jobs=1)]: Using backend SequentialBackend with 1 concurrent workers.
[Parallel(n_jobs=1)]: Done 1 out of 1 | elapsed: 1.2s remaining: 0.0s
[Parallel(n_jobs=1)]: Done 2 out of 2 | elapsed: 2.2s remaining: 0.0s
[Parallel(n_jobs=1)]: Done 3 out of 3 | elapsed: 3.2s remaining: 0.0s
[Parallel(n_jobs=1)]: Done 3 out of 3 | elapsed: 3.2s finished
[Parallel(n_jobs=1)]: Using backend SequentialBackend with 1 concurrent workers.
[Parallel(n_jobs=1)]: Done 1 out of 1 | elapsed: 0.0s remaining: 0.0s
[Parallel(n_jobs=1)]: Done 2 out of 2 | elapsed: 0.1s remaining: 0.0s
[Parallel(n_jobs=1)]: Done 3 out of 3 | elapsed: 0.1s remaining: 0.0s
[Parallel(n_jobs=1)]: Done 3 out of 3 | elapsed: 0.1s finished
[14]:
print("SMAPE: ", smape_score)
SMAPE: 14.214904390075835
Better than without embeddings. Let’s try TS2VecEmbeddingModel.
[15]:
emb_model = TS2VecEmbeddingModel(input_dims=1, hidden_dims=16, depth=3, output_dims=6, device=device)
training_params = {"n_epochs": 10}
smape_score = forecast_with_segment_embeddings(emb_model, training_params)
[Parallel(n_jobs=1)]: Using backend SequentialBackend with 1 concurrent workers.
[Parallel(n_jobs=1)]: Done 1 out of 1 | elapsed: 26.8s remaining: 0.0s
[Parallel(n_jobs=1)]: Done 2 out of 2 | elapsed: 55.8s remaining: 0.0s
[Parallel(n_jobs=1)]: Done 3 out of 3 | elapsed: 1.4min remaining: 0.0s
[Parallel(n_jobs=1)]: Done 3 out of 3 | elapsed: 1.4min finished
[Parallel(n_jobs=1)]: Using backend SequentialBackend with 1 concurrent workers.
[Parallel(n_jobs=1)]: Done 1 out of 1 | elapsed: 0.9s remaining: 0.0s
[Parallel(n_jobs=1)]: Done 2 out of 2 | elapsed: 1.8s remaining: 0.0s
[Parallel(n_jobs=1)]: Done 3 out of 3 | elapsed: 2.8s remaining: 0.0s
[Parallel(n_jobs=1)]: Done 3 out of 3 | elapsed: 2.8s finished
[Parallel(n_jobs=1)]: Using backend SequentialBackend with 1 concurrent workers.
[Parallel(n_jobs=1)]: Done 1 out of 1 | elapsed: 0.1s remaining: 0.0s
[Parallel(n_jobs=1)]: Done 2 out of 2 | elapsed: 0.1s remaining: 0.0s
[Parallel(n_jobs=1)]: Done 3 out of 3 | elapsed: 0.2s remaining: 0.0s
[Parallel(n_jobs=1)]: Done 3 out of 3 | elapsed: 0.2s finished
[16]:
print("SMAPE: ", smape_score)
SMAPE: 13.549340740762041
Better still: SMAPE drops to 13.55, compared with 14.72 for the baseline. Now let’s try another transform.
2.3 EmbeddingWindowTransform
EmbeddingWindowTransform calls the model’s encode_window method internally. As discussed above, these features are not regressors: they are unknown for future timestamps, so they should be used only through lags.
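The need for lags can be shown with a toy example in plain Python. A per-timestamp feature exists only for the history, but shifting it by at least the horizon makes every future timestamp point at an already known value. The helper lagged, the value of HORIZON, and the numbers are hypothetical and chosen only for illustration.

```python
from typing import List, Optional

HORIZON = 3  # hypothetical forecast horizon for this sketch

# One embedding coordinate computed for 8 historical timestamps.
window_feature = [0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8]


def lagged(values: List[float], lag: int, n_future: int) -> List[Optional[float]]:
    """Value at time t is values[t - lag]; None where that value is unknown."""
    total = len(values) + n_future
    return [
        values[t - lag] if 0 <= t - lag < len(values) else None
        for t in range(total)
    ]


lag_feature = lagged(window_feature, lag=HORIZON, n_future=HORIZON)
future_part = lag_feature[len(window_feature):]
print(future_part)  # [0.6, 0.7, 0.8]

# With a lag shorter than the horizon, part of the future stays unknown.
too_short = lagged(window_feature, lag=1, n_future=HORIZON)
print(too_short[len(window_feature):])  # [0.8, None, None]
```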
[17]:
from etna.transforms import EmbeddingWindowTransform
from etna.transforms import FilterFeaturesTransform
def forecast_with_window_embeddings(emb_model: BaseEmbeddingModel, training_params: dict) -> float:
    model = CatBoostMultiSegmentModel()
    output_dims = emb_model.output_dims
    emb_transform = EmbeddingWindowTransform(
        in_columns=["target"], embedding_model=emb_model, training_params=training_params, out_column="embedding_window"
    )
    lag_emb_transforms = [
        LagTransform(in_column=f"embedding_window_{i}", lags=[HORIZON], out_column=f"lag_emb_{i}")
        for i in range(output_dims)
    ]
    filter_transforms = FilterFeaturesTransform(exclude=[f"embedding_window_{i}" for i in range(output_dims)])
    transforms = [lag_transform] + [emb_transform] + lag_emb_transforms + [filter_transforms]
    pipeline = Pipeline(model=model, transforms=transforms, horizon=HORIZON)
    metrics_df, _, _ = pipeline.backtest(ts, metrics=[SMAPE()], n_folds=3)
    smape_score = metrics_df["SMAPE"].mean()
    return smape_score
[18]:
emb_model = TSTCCEmbeddingModel(input_dims=1, tc_hidden_dim=16, depth=3, output_dims=6, device=device)
training_params = {"n_epochs": 10}
smape_score = forecast_with_window_embeddings(emb_model, training_params)
[Parallel(n_jobs=1)]: Using backend SequentialBackend with 1 concurrent workers.
[Parallel(n_jobs=1)]: Done 1 out of 1 | elapsed: 43.7s remaining: 0.0s
[Parallel(n_jobs=1)]: Done 2 out of 2 | elapsed: 1.5min remaining: 0.0s
[Parallel(n_jobs=1)]: Done 3 out of 3 | elapsed: 2.3min remaining: 0.0s
[Parallel(n_jobs=1)]: Done 3 out of 3 | elapsed: 2.3min finished
[Parallel(n_jobs=1)]: Using backend SequentialBackend with 1 concurrent workers.
[Parallel(n_jobs=1)]: Done 1 out of 1 | elapsed: 8.9s remaining: 0.0s
[Parallel(n_jobs=1)]: Done 2 out of 2 | elapsed: 17.8s remaining: 0.0s
[Parallel(n_jobs=1)]: Done 3 out of 3 | elapsed: 26.5s remaining: 0.0s
[Parallel(n_jobs=1)]: Done 3 out of 3 | elapsed: 26.5s finished
[Parallel(n_jobs=1)]: Using backend SequentialBackend with 1 concurrent workers.
[Parallel(n_jobs=1)]: Done 1 out of 1 | elapsed: 0.0s remaining: 0.0s
[Parallel(n_jobs=1)]: Done 2 out of 2 | elapsed: 0.1s remaining: 0.0s
[Parallel(n_jobs=1)]: Done 3 out of 3 | elapsed: 0.1s remaining: 0.0s
[Parallel(n_jobs=1)]: Done 3 out of 3 | elapsed: 0.1s finished
[19]:
print("SMAPE: ", smape_score)
SMAPE: 104.68988621650867
Oops… What about TS2VecEmbeddingModel
?
[20]:
emb_model = TS2VecEmbeddingModel(input_dims=1, hidden_dims=16, depth=3, output_dims=6, device=device)
training_params = {"n_epochs": 10}
smape_score = forecast_with_window_embeddings(emb_model, training_params)
[Parallel(n_jobs=1)]: Using backend SequentialBackend with 1 concurrent workers.
[Parallel(n_jobs=1)]: Done 1 out of 1 | elapsed: 34.5s remaining: 0.0s
[Parallel(n_jobs=1)]: Done 2 out of 2 | elapsed: 1.2min remaining: 0.0s
[Parallel(n_jobs=1)]: Done 3 out of 3 | elapsed: 1.8min remaining: 0.0s
[Parallel(n_jobs=1)]: Done 3 out of 3 | elapsed: 1.8min finished
[Parallel(n_jobs=1)]: Using backend SequentialBackend with 1 concurrent workers.
[Parallel(n_jobs=1)]: Done 1 out of 1 | elapsed: 8.7s remaining: 0.0s
[Parallel(n_jobs=1)]: Done 2 out of 2 | elapsed: 17.4s remaining: 0.0s
[Parallel(n_jobs=1)]: Done 3 out of 3 | elapsed: 26.4s remaining: 0.0s
[Parallel(n_jobs=1)]: Done 3 out of 3 | elapsed: 26.4s finished
[Parallel(n_jobs=1)]: Using backend SequentialBackend with 1 concurrent workers.
[Parallel(n_jobs=1)]: Done 1 out of 1 | elapsed: 0.1s remaining: 0.0s
[Parallel(n_jobs=1)]: Done 2 out of 2 | elapsed: 0.1s remaining: 0.0s
[Parallel(n_jobs=1)]: Done 3 out of 3 | elapsed: 0.2s remaining: 0.0s
[Parallel(n_jobs=1)]: Done 3 out of 3 | elapsed: 0.2s finished
[21]:
print("SMAPE: ", smape_score)
SMAPE: 29.776520212845234
Window embeddings don’t help on this dataset: both scores are far worse than the baseline SMAPE of 14.72. In practice, you should try both models and both transforms to find what works best for your data.
3. Saving and loading models
If you have a pretrained embedding model and don’t want it to be retrained when fit is called, you should “freeze” its training loop. This is helpful when the model is used inside transforms, which call the model’s fit method on each fit of the pipeline.
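The effect of freezing can be pictured with a toy stand-in. ToyEmbeddingModel below is a hypothetical class written for this explanation, not part of ETNA; it only mirrors the freeze method and the is_freezed property and counts how many times training really runs.

```python
class ToyEmbeddingModel:
    """Hypothetical stand-in, not an ETNA class: counts real training runs."""

    def __init__(self) -> None:
        self._is_freezed = False
        self.n_trainings = 0

    @property
    def is_freezed(self) -> bool:
        return self._is_freezed

    def freeze(self, is_freezed: bool = True) -> None:
        self._is_freezed = is_freezed

    def fit(self, x) -> "ToyEmbeddingModel":
        # A frozen model ignores the data and keeps its current weights.
        if not self._is_freezed:
            self.n_trainings += 1
        return self


model = ToyEmbeddingModel()
model.fit([1.0, 2.0])      # trains once
model.freeze()
for _ in range(3):         # e.g. three backtest folds, each calling fit
    model.fit([1.0, 2.0])  # skipped because the model is frozen
print(model.n_trainings)   # 1
```

Without the freeze call, the loop would retrain the model on every fold.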
[22]:
MODEL_PATH = "model.zip"
[23]:
emb_model.freeze()
emb_model.save(MODEL_PATH)
Now you are ready to load the pretrained model.
[24]:
model_loaded = TS2VecEmbeddingModel.load(MODEL_PATH)
If you need to fine-tune the pretrained model, you should “unfreeze” its training loop. After that, calling the fit method will train the model again.
[25]:
model_loaded.freeze(is_freezed=False)
To check whether the model is frozen, use the is_freezed property.
[26]:
model_loaded.is_freezed
[26]:
False