Practice
To use TiRex regression, either install it locally in your preferred Python environment or use a hosted Jupyter notebook service such as Google Colab.
1. Install TiRex
```bash
# install with the extra 'regression' for regression support
pip install 'tirex-ts[regression]'
```
Install additional packages (used only for example data).
```bash
pip install aeon
```
2. Import TiRex and supporting libraries
```python
# General imports
import torch
from sklearn.metrics import mean_squared_error, mean_absolute_error, r2_score
from sklearn.model_selection import train_test_split
from aeon.datasets import load_regression

# Import TiRex regressors
from tirex.models.regression import TirexRFRegressor, TirexLinearRegressor, TirexGBMRegressor
```
3. Preprocess the data
```python
# Load dataset
X, y, meta = load_regression("HouseholdPowerConsumption1", return_metadata=True)

# Split dataset into train and test (for example, 80% train, 20% test)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

# Convert to torch tensors
train_X = torch.tensor(X_train, dtype=torch.float32)
test_X = torch.tensor(X_test, dtype=torch.float32)
train_y = torch.tensor(y_train, dtype=torch.float32)
test_y = torch.tensor(y_test, dtype=torch.float32)

print(train_X.shape, train_y.shape)
# torch.Size([1144, 5, 1440]) torch.Size([1144])
```
Note on target format:
- For the Linear regressor: use `train_y` with shape `(num_samples, 1)`; you can reshape with `train_y = train_y.unsqueeze(1)`.
- For the RF/GBM regressors: both `(num_samples,)` and `(num_samples, 1)` work.
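For example, a minimal reshape for the Linear regressor (the `*_2d` variable names are just for illustration):

```python
# Reshape 1-D targets (num_samples,) to (num_samples, 1) for the Linear regressor
train_y_2d = train_y.unsqueeze(1)
test_y_2d = test_y.unsqueeze(1)

print(train_y_2d.shape)
# torch.Size([1144, 1])
```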
4. Initialize TiRex Regressor
Model weights for the TiRex backbone are fetched automatically from Hugging Face.
Option A: Random Forest Regressor
regressor = TirexRFRegressor(
data_augmentation=False,
device="cuda:0",
n_estimators=50,
max_depth=10,
random_state=42
)
- `data_augmentation` (bool): Whether to use additional data augmentation concatenated to the embeddings. Defaults to `False`.
- `device` (str): Device used for embedding computation (for example, `"cuda:0"` for GPU or `"cpu"`). Note: the Random Forest itself always runs on CPU (it uses scikit-learn).
- `compile` (bool): Whether to compile the frozen embedding model. Default: `False`.
- The rest of the parameters are passed as kwargs to scikit-learn's Random Forest. For more details, see the scikit-learn RandomForestRegressor documentation.
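Since the remaining keyword arguments are forwarded to scikit-learn, you can tune the forest directly. A minimal sketch; `min_samples_leaf` and `n_jobs` are standard scikit-learn `RandomForestRegressor` parameters, not TiRex-specific ones:

```python
# Sketch: forward standard scikit-learn kwargs through TirexRFRegressor
regressor = TirexRFRegressor(
    device="cpu",          # embeddings on CPU; the forest runs on CPU either way
    n_estimators=200,
    min_samples_leaf=2,    # forwarded to RandomForestRegressor
    n_jobs=-1,             # train trees in parallel across all CPU cores
    random_state=42,
)
```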
Option B: Torch Linear Regressor
```python
regressor = TirexLinearRegressor(
    data_augmentation=False,
    device="cuda:0",
    max_epochs=10,
    lr=1e-4,
    batch_size=32
)
```
- `data_augmentation` (bool): Whether to use additional data augmentation concatenated to the embeddings. Defaults to `False`.
- `device` (str): Device used for training and inference (for example, `"cuda:0"` for GPU or `"cpu"`).
- `compile` (bool): Whether to compile the frozen embedding model. Default: `False`.
- `max_epochs` (int): Maximum number of training epochs. Default: `10`.
- `lr` (float): Learning rate for the optimizer. Default: `1e-4`.
- `batch_size` (int): Batch size for training and embedding calculations. Default: `512`.
The remaining parameters are described in the API documentation.
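For example, a small sketch that picks the device at runtime (plain PyTorch, nothing TiRex-specific):

```python
# Use a GPU if one is available, otherwise fall back to CPU
device = "cuda:0" if torch.cuda.is_available() else "cpu"

regressor = TirexLinearRegressor(
    device=device,
    max_epochs=20,   # train longer than the default of 10
    lr=1e-4,
    batch_size=512,  # the documented default
)
```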
Option C: Gradient Boosting Regressor
```python
regressor = TirexGBMRegressor(
    data_augmentation=False,
    device="cuda:0",
    batch_size=512,
    early_stopping_rounds=10,
    min_delta=0.0,
    val_split_ratio=0.2,
    random_state=42
)
```
- `data_augmentation` (bool): Whether to use additional data augmentation concatenated to the embeddings. Defaults to `False`.
- `device` (str | None): Device used for embedding computation (for example, `"cuda:0"` for GPU or `"cpu"`). If `None`, uses CUDA if available, else CPU. Note: LightGBM itself always runs on CPU.
- `compile` (bool): Whether to compile the frozen embedding model. Default: `False`.
- `batch_size` (int): Batch size for embedding calculations. Default: `512`.
- `early_stopping_rounds` (int | None): Number of rounds without improvement of all metrics before early stopping. Default: `10`. Set to `None` to disable early stopping.
- `min_delta` (float): Minimum improvement in score to keep training. Default: `0.0`.
- `val_split_ratio` (float): Proportion of training data to use for validation, if validation data are not provided. Default: `0.2`.
- The rest of the parameters are passed as kwargs to LightGBM's LGBMRegressor. For more details, see the LGBMRegressor documentation.
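As with the Random Forest variant, the remaining keyword arguments are forwarded to LightGBM. A minimal sketch; `num_leaves`, `learning_rate`, and `n_estimators` are standard `LGBMRegressor` parameters, not TiRex-specific ones:

```python
# Sketch: forward standard LightGBM kwargs through TirexGBMRegressor
regressor = TirexGBMRegressor(
    device=None,                  # auto-select: CUDA if available, else CPU
    early_stopping_rounds=None,   # disable early stopping entirely
    num_leaves=63,                # forwarded to LGBMRegressor
    learning_rate=0.05,
    n_estimators=500,
    random_state=42,
)
```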
5. Fit the model to the data
```python
regressor.fit((train_X, train_y))
```
For the Linear and GBM regressors, you can optionally provide a validation set.
```python
regressor.fit(train_data=(train_X, train_y), val_data=(val_X, val_y))
```
If you don't provide one, the training set is split into train and validation subsets internally (an explicit split is sketched after the list below).
For both the Linear and Gradient Boosting regressors, two parameters control the train/validation split:
- `val_split_ratio` (float): Size of the validation set (between 0 and 1). Defaults to `0.2`.
- For reproducibility, the Linear regressor uses the `seed` parameter, while the GBM regressor uses the `random_state` parameter from the LightGBM kwargs (if provided).
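If you prefer an explicit validation set, you can carve one out of the training data yourself. A sketch reusing the split from step 3 (variable names are illustrative):

```python
# Carve an explicit validation set out of the training split
X_tr, X_val, y_tr, y_val = train_test_split(
    X_train, y_train, test_size=0.2, random_state=42
)

tr_X = torch.tensor(X_tr, dtype=torch.float32)
tr_y = torch.tensor(y_tr, dtype=torch.float32)
val_X = torch.tensor(X_val, dtype=torch.float32)
val_y = torch.tensor(y_val, dtype=torch.float32)

regressor.fit(train_data=(tr_X, tr_y), val_data=(val_X, val_y))
```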
6. Prediction results
Analyze your prediction results:
```python
pred_y = regressor.predict(test_X)

# Convert to numpy for metric computation
pred_y_np = pred_y.cpu().numpy()
test_y_np = test_y.cpu().numpy()

# Compute regression metrics
print(f"MSE: {mean_squared_error(test_y_np, pred_y_np):.4f}")
print(f"MAE: {mean_absolute_error(test_y_np, pred_y_np):.4f}")
print(f"R² Score: {r2_score(test_y_np, pred_y_np):.4f}")
```