Utilities Module
RecIS’s utilities module provides auxiliary functionality, including a logging system and data processing helpers.
Logging System
Logger
- class recis.utils.logger.Logger(name=__file__, level=logging.DEBUG)[source]
Distributed-training-friendly logger for the RecIS framework.
This logger provides a unified interface for logging with support for distributed training scenarios. It automatically configures separate handlers for stdout and stderr, with appropriate filtering to ensure INFO messages go to stdout and other levels go to stderr.
- Features:
Automatic handler configuration with level-based routing
Rank-aware logging methods for distributed training
Consistent formatting across all messages
Support for all standard logging levels
- Parameters:
name (str) – Name for the logger, typically the __name__ of the calling module (defaults to this module's __file__).
level (int) – Minimum level of messages to log (defaults to logging.DEBUG).
Example
>>> from recis.utils.logger import Logger
>>> # Create logger for current module
>>> logger = Logger(__name__)
>>> # Basic logging
>>> logger.info("Training started")
>>> logger.warning("Learning rate is high")
>>> logger.error("Failed to load checkpoint")
>>> # Distributed training - only rank 0 logs
>>> logger.info_rank0("Epoch completed")  # Only logs on rank 0
>>> logger.warning_rank0("Memory usage high")  # Only logs on rank 0
>>> # Regular logging (all ranks)
>>> logger.info("Processing batch")  # Logs on all ranks
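The rank-aware methods shown above (info_rank0, warning_rank0) avoid every process in a distributed job emitting the same line. A minimal sketch of how such a method can be implemented, assuming the rank is read from torch.distributed (the exact mechanism RecIS uses may differ):

import logging

import torch.distributed as dist


def info_rank0(logger: logging.Logger, *args, **kwargs):
    """Log an info message only on rank 0 (illustrative sketch, not the RecIS source)."""
    # When torch.distributed is not initialized, treat the process as rank 0
    # so single-process runs still log normally.
    rank = dist.get_rank() if dist.is_available() and dist.is_initialized() else 0
    if rank == 0:
        logger.info(*args, **kwargs)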
- __init__(name=__file__, level=logging.DEBUG)[source]
Initialize the logger with configured handlers and formatting.
Sets up a stdout handler for INFO messages and a stderr handler for all other levels, with consistent formatting across both.
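The level-based routing described here can be expressed with standard logging filters. A minimal sketch of the idea; the handler setup and format string are assumptions and may differ from the actual RecIS implementation:

import logging
import sys


def make_handlers():
    """Sketch of INFO-to-stdout / everything-else-to-stderr routing."""
    format_str = "%(asctime)s [%(levelname)s] %(name)s: %(message)s"  # assumed format
    formatter = logging.Formatter(format_str)

    # stdout handler accepts only INFO records
    stdout_handler = logging.StreamHandler(sys.stdout)
    stdout_handler.addFilter(lambda record: record.levelno == logging.INFO)
    stdout_handler.setFormatter(formatter)

    # stderr handler takes everything else (DEBUG, WARNING, ERROR, CRITICAL)
    stderr_handler = logging.StreamHandler(sys.stderr)
    stderr_handler.addFilter(lambda record: record.levelno != logging.INFO)
    stderr_handler.setFormatter(formatter)

    return stdout_handler, stderr_handler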
- error(*args, **kwargs)[source]
Log an error message.
- Parameters:
*args – Positional arguments passed to logger.error().
**kwargs – Keyword arguments passed to logger.error().
Example
>>> logger.error("Failed to load model from %s", model_path)
>>> logger.error("CUDA out of memory: %s", str(e))
- info(*args, **kwargs)[source]
Log an info message.
- Parameters:
*args – Positional arguments passed to logger.info().
**kwargs – Keyword arguments passed to logger.info().
Example
>>> logger.info("Processing batch %d", batch_idx)
>>> logger.info("Training loss: %.4f", loss.item())
- warning(*args, **kwargs)[source]
Log a warning message.
- Parameters:
*args – Positional arguments passed to logger.warning().
**kwargs – Keyword arguments passed to logger.warning().
Example
>>> logger.warning("Learning rate is very high: %.6f", lr)
>>> logger.warning("Memory usage: %.1f%%", memory_percent)
Data Processing Tools
Data Copy Functions
- recis.utils.data_utils.copy_data_to_device(data, device, *args, **kwargs)[source]
Recursively copies data to a specified PyTorch device.
This function handles various data structures and copies them to the target device while preserving their original structure and type. It supports tensors, collections, dataclasses, and nested structures.
- Parameters:
data – The data structure to copy to the device. Can be any of:
- torch.Tensor: moved to the device using .to()
- list/tuple: each element is recursively copied
- dict/Mapping: each value is recursively copied
- namedtuple: reconstructed with copied fields
- dataclass: fields are recursively copied
- any object with a .to() method: copied via .to()
- other types: returned as-is
device (torch.device) – The target device to copy data to.
*args – Additional positional arguments passed to the .to() method.
**kwargs – Additional keyword arguments passed to the .to() method.
- Returns:
The data structure copied to the specified device, maintaining the original structure and type.
Example
>>> import torch
>>> from recis.utils.data_utils import copy_data_to_device
>>> # Copy tensor to GPU
>>> tensor = torch.tensor([1, 2, 3])
>>> device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
>>> tensor_gpu = copy_data_to_device(tensor, device)
>>> # Copy batch dictionary to GPU
>>> batch = {
...     "user_id": torch.tensor([1, 2, 3]),
...     "item_id": torch.tensor([4, 5, 6]),
...     "labels": torch.tensor([0, 1, 0]),
... }
>>> batch_gpu = copy_data_to_device(batch, device)
>>> # Copy nested structure
>>> nested_data = {
...     "features": [torch.tensor([1.0, 2.0]), torch.tensor([3.0, 4.0])],
...     "metadata": {"batch_size": 2, "sequence_length": 10},
... }
>>> nested_gpu = copy_data_to_device(nested_data, device)
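Because extra positional and keyword arguments are forwarded to .to(), transfer options such as non_blocking can be passed through. A hedged usage sketch, assuming the batch tensors live in pinned host memory (where non-blocking copies help):

>>> # Forward .to() options through copy_data_to_device (sketch)
>>> batch_gpu = copy_data_to_device(batch, device, non_blocking=True)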
Note
This function preserves the exact type of input collections
For dataclasses, both init and non-init fields are handled
Objects without .to() method are returned unchanged
The function is recursive and handles arbitrarily nested structures
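The behavior summarized in these notes can be pictured with a simplified recursive sketch. This is an illustration under assumptions, not the RecIS source; dataclass handling is abbreviated to init fields only, and the mapping branch assumes the mapping type accepts a dict in its constructor:

import dataclasses
from collections.abc import Mapping

import torch


def copy_to_device_sketch(data, device, *args, **kwargs):
    """Simplified sketch of recursive device copying (not the RecIS source)."""
    if isinstance(data, torch.Tensor):
        return data.to(device, *args, **kwargs)
    if isinstance(data, tuple) and hasattr(data, "_fields"):  # namedtuple
        return type(data)(*(copy_to_device_sketch(v, device, *args, **kwargs) for v in data))
    if isinstance(data, (list, tuple)):
        # Reconstruct with the exact input type, preserving list vs. tuple
        return type(data)(copy_to_device_sketch(v, device, *args, **kwargs) for v in data)
    if isinstance(data, Mapping):
        return type(data)(
            {k: copy_to_device_sketch(v, device, *args, **kwargs) for k, v in data.items()}
        )
    if dataclasses.is_dataclass(data) and not isinstance(data, type):
        # Abbreviated: copies init fields only (the real function also handles non-init fields)
        return dataclasses.replace(
            data,
            **{
                f.name: copy_to_device_sketch(getattr(data, f.name), device, *args, **kwargs)
                for f in dataclasses.fields(data)
                if f.init
            },
        )
    if hasattr(data, "to"):
        return data.to(device, *args, **kwargs)
    return data  # objects without .to() are returned unchanged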