etna.datasets.DataFrameFormat
- class DataFrameFormat(value)
Enum for the different kinds of pd.DataFrame that can be used.
Such a dataframe stores:
Timestamps;
Segments;
Features. In this context, ‘target’ is also a feature.
Currently, there are two formats:
Wide
Has index to store timestamps.
Columns have two levels named ‘segment’ and ‘feature’. Each column stores the values of a given feature in a given segment.
List of columns isn’t empty.
The columns contain every (segment, feature) combination.
Long
Has column ‘timestamp’ to store timestamps.
Has column ‘segment’ to store segments.
Has at least one more column besides ‘timestamp’ and ‘segment’.
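The two layouts described above can be sketched with plain pandas. This is an illustrative example only: the segment names (`segment_a`, `segment_b`), the extra feature `price`, and all values are made up, and both frames hold the same data.

```python
import pandas as pd

timestamps = pd.date_range("2021-01-01", periods=3, freq="D")

# Long format: one row per (timestamp, segment) pair, plus feature columns.
long_df = pd.DataFrame(
    {
        "timestamp": list(timestamps) * 2,
        "segment": ["segment_a"] * 3 + ["segment_b"] * 3,
        "target": [1.0, 2.0, 3.0, 10.0, 20.0, 30.0],
        "price": [5.0, 5.5, 6.0, 7.0, 7.5, 8.0],
    }
)

# Wide format: timestamps live in the index, and the columns form a
# two-level MultiIndex named 'segment' and 'feature' that contains
# every (segment, feature) combination.
wide_columns = pd.MultiIndex.from_product(
    [["segment_a", "segment_b"], ["price", "target"]],
    names=["segment", "feature"],
)
wide_df = pd.DataFrame(
    [
        [5.0, 1.0, 7.0, 10.0],
        [5.5, 2.0, 7.5, 20.0],
        [6.0, 3.0, 8.0, 30.0],
    ],
    index=pd.Index(timestamps, name="timestamp"),
    columns=wide_columns,
)
```

Selecting `wide_df[("segment_b", "target")]` returns the same series of values that the long frame holds in the rows where `segment == "segment_b"`.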
Currently, we don’t check the types of columns, to preserve compatibility, but it is expected that:
Timestamps have type int or pd.Timestamp. If they don’t, TSDataset makes the conversion for you.
Segments have type str. If they don’t, TSDataset makes the conversion for you.
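If you prefer to bring a long dataframe to the expected types yourself, a minimal pandas sketch could look like the following. It does not show how TSDataset performs its conversion internally; the column values are invented for illustration.

```python
import pandas as pd

# A long dataframe whose timestamps are strings and whose segments are integers.
raw_df = pd.DataFrame(
    {
        "timestamp": ["2021-01-01", "2021-01-02", "2021-01-01", "2021-01-02"],
        "segment": [1, 1, 2, 2],
        "target": [1.0, 2.0, 10.0, 20.0],
    }
)

# Convert timestamps to pd.Timestamp values and segment labels to str.
clean_df = raw_df.assign(
    timestamp=pd.to_datetime(raw_df["timestamp"]),
    segment=raw_df["segment"].astype(str),
)
```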
Methods
determine(df) – Determine format of the given dataframe.
Attributes
Wide format.
Long format.
- classmethod determine(df: DataFrame) → DataFrameFormat
Determine format of the given dataframe.
- Parameters:
df (DataFrame) – Dataframe whose format should be inferred.
- Returns:
Format of the given dataframe.
- Raises:
ValueError – Given long dataframe doesn’t have required column ‘timestamp’
ValueError – Given long dataframe doesn’t have required column ‘segment’
ValueError – Given long dataframe doesn’t have any columns except for ‘timestamp’ and ‘segment’
ValueError – Given wide dataframe doesn’t have levels of columns [‘segment’, ‘feature’]
ValueError – Given wide dataframe doesn’t have any features
ValueError – Given wide dataframe doesn’t have all combinations of pairs (segment, feature)
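To make the error conditions above concrete, here is a rough pandas-only sketch of equivalent checks. The function `infer_format` is a hypothetical stand-in, not etna's implementation, and it assumes that multi-level columns are what distinguish a wide dataframe from a long one.

```python
import pandas as pd


def infer_format(df: pd.DataFrame) -> str:
    """Hypothetical re-creation of the documented checks; returns 'wide' or 'long'."""
    columns = df.columns
    # Assumption: a MultiIndex on the columns signals the wide format.
    if isinstance(columns, pd.MultiIndex):
        if list(columns.names) != ["segment", "feature"]:
            raise ValueError("Wide dataframe lacks column levels ['segment', 'feature']")
        segments = columns.get_level_values("segment").unique()
        features = columns.get_level_values("feature").unique()
        if len(features) == 0:
            raise ValueError("Wide dataframe doesn't have any features")
        # Every (segment, feature) pair must be present among the columns.
        expected = pd.MultiIndex.from_product(
            [segments, features], names=["segment", "feature"]
        )
        missing = set(expected) - set(columns)
        if missing:
            raise ValueError(f"Wide dataframe lacks pairs: {sorted(missing)}")
        return "wide"

    for required in ("timestamp", "segment"):
        if required not in columns:
            raise ValueError(f"Long dataframe lacks required column '{required}'")
    if len(columns) <= 2:
        raise ValueError("Long dataframe has no columns besides 'timestamp' and 'segment'")
    return "long"


long_df = pd.DataFrame(
    {
        "timestamp": pd.date_range("2021-01-01", periods=2, freq="D"),
        "segment": ["a", "a"],
        "target": [1.0, 2.0],
    }
)

# The pair ('b', 'price') is missing, so this wide dataframe is rejected.
incomplete_wide = pd.DataFrame(
    [[1.0, 2.0, 3.0]],
    columns=pd.MultiIndex.from_tuples(
        [("a", "target"), ("a", "price"), ("b", "target")],
        names=["segment", "feature"],
    ),
)

print(infer_format(long_df))  # long
try:
    infer_format(incomplete_wide)
except ValueError as error:
    print(f"rejected: {error}")
```

With etna installed, the corresponding library call is `DataFrameFormat.determine(df)`, which returns a `DataFrameFormat` member rather than a string.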
- Return type: