preprocessor

Feature preprocessor.

Classes:

Name Description
Preprocessor

Preprocessor class to preprocess the raw data.

IdentityPreprocessor

Identity Preprocessor class.

SklearnPreprocessor

Preprocessor class based on sklearn preprocessing classes.

TorchPreprocessor

Preprocessor class based on pytorch.

Preprocessor #

Preprocessor class to preprocess the raw data.

Parameters:

Name Type Description Default

preprocessed_size #

None indicates that the preprocessor was not fitted. Otherwise, it represents the feature size after being preprocessed, without the batch size.

required

Attributes:

Name Type Description
preprocessed_size tuple[int, ...] | None

preprocessed_size: tuple[int, ...] | None = None #

IdentityPreprocessor #

Identity Preprocessor class.

Methods:

Name Description
from_exposed

Unparse the serialized preprocessor to use it on client side.

Attributes:

Name Type Description
as_exposed ExposedIdentityPreprocessor

Parse the preprocessor to send it to xpdeep server.

as_exposed: ExposedIdentityPreprocessor #

Parse the preprocessor to send it to xpdeep server.

from_exposed(exposed_identity_preprocessor: ExposedIdentityPreprocessor) -> Self #

Unparse the serialized preprocessor to use it on client side.

Source code in src/xpdeep/dataset/schema/preprocessor.py
@classmethod
def from_exposed(cls, exposed_identity_preprocessor: ExposedIdentityPreprocessor) -> Self:
    """Unparse the serialized preprocessor to use it on client side."""
    return cls(preprocessed_size=exposed_identity_preprocessor.preprocessed_size)

SklearnPreprocessor #

Preprocessor class based on sklearn preprocessing classes.

Parameters:

Name Type Description Default

preprocess_function #

TransformerMixin | ExposedPreprocessFunction
required

Methods:

Name Description
from_exposed

Unparse the serialized preprocessor to use it on client side.

transform

Transform a feature raw value into its preprocessed value.

inverse_transform

Inverse transform a feature preprocessed value into its raw value.

Attributes:

Name Type Description
preprocess_function TransformerMixin | ExposedPreprocessFunction
as_exposed ExposedNumpyPreprocessor

Parse the preprocessor to send it to xpdeep server.

preprocess_function: TransformerMixin | ExposedPreprocessFunction #

as_exposed: ExposedNumpyPreprocessor #

Parse the preprocessor to send it to xpdeep server.

from_exposed(numpy_preprocessor: ExposedNumpyPreprocessor) -> Self #

Unparse the serialized preprocessor to use it on client side.

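Unparsing requires the external modules the preprocess function was originally built with, such as scikit-learn. If a required module is missing, a warning is emitted and the serialized ExposedPreprocessFunction is kept as-is.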

Source code in src/xpdeep/dataset/schema/preprocessor.py
@classmethod
def from_exposed(cls, numpy_preprocessor: ExposedNumpyPreprocessor) -> Self:
    """Unparse the serialized preprocessor to use it on client side.

    Unparsing requires the external modules the preprocess function was originally built with, such as `scikit-learn`.
    """
    try:
        preprocess_function = numpy_preprocessor.preprocess_function.unparse(None, None)
    except ModuleNotFoundError as err:
        warnings.warn(  # noqa: B028
            "Unable to recreate preprocess_function from ExposedPreprocessFunction. An additional module is "
            f"required to achieve this operation. {err.msg}"
        )
        preprocess_function = numpy_preprocessor.preprocess_function

    return cls(
        preprocessed_size=numpy_preprocessor.preprocessed_size,
        preprocess_function=preprocess_function,
    )
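
The fallback above can be illustrated in isolation. The sketch below is not the xpdeep implementation: `load_or_keep` and the dictionary layout are hypothetical stand-ins, and only the pattern (try to rebuild, warn and keep the serialized form on `ModuleNotFoundError`) mirrors the source.

```python
import importlib
import warnings


def load_or_keep(serialized: dict) -> object:
    """Rebuild the described object, or return the serialized form if its module is missing.

    Hypothetical helper: `serialized` holds a module name and an attribute name.
    """
    try:
        module = importlib.import_module(serialized["module"])
        return getattr(module, serialized["name"])
    except ModuleNotFoundError as err:
        warnings.warn(
            f"Unable to recreate {serialized['name']}. An additional module is required. {err.msg}",
            stacklevel=2,
        )
        return serialized


# A module that is installed gets rebuilt into a usable object.
rebuilt = load_or_keep({"module": "math", "name": "sqrt"})

# A missing module leaves the serialized form untouched and records a warning.
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    kept = load_or_keep({"module": "surely_not_installed_module", "name": "Scaler"})
```

One consequence visible in the source: when the fallback is taken, `preprocess_function` is not a `TransformerMixin`, so later calls to `transform` or `inverse_transform` raise `TypeError`.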

transform(feature_raw_value: object) -> torch.Tensor #

Transform a feature raw value into its preprocessed value.

Source code in src/xpdeep/dataset/schema/preprocessor.py
def transform(self, feature_raw_value: object) -> torch.Tensor:
    """Transform a feature raw value into its preprocessed value."""
    if not isinstance(self.preprocess_function, TransformerMixin):
        msg = f"{self.preprocess_function} was not parsable"
        raise TypeError(msg)
    return self.preprocess_function.transform(feature_raw_value)  # type: ignore[no-any-return]

inverse_transform(preprocessed_value: torch.Tensor) -> object #

Inverse transform a feature preprocessed value into its raw value.

Source code in src/xpdeep/dataset/schema/preprocessor.py
def inverse_transform(self, preprocessed_value: torch.Tensor) -> object:
    """Inverse transform a feature preprocessed value into its raw value."""
    if not isinstance(self.preprocess_function, TransformerMixin):
        msg = f"{self.preprocess_function} was not parsable"
        raise TypeError(msg)

    return self.preprocess_function.inverse_transform(preprocessed_value)
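
Both methods delegate to the wrapped scikit-learn transformer. As a minimal sketch of what that delegation amounts to, the example below uses `StandardScaler` directly as a stand-in for `preprocess_function`; the choice of scaler and the sample data are assumptions made for illustration, and no xpdeep class is involved.

```python
import numpy as np
from sklearn.base import TransformerMixin
from sklearn.preprocessing import StandardScaler

raw = np.array([[1.0], [2.0], [3.0]])

# Plays the role of `preprocess_function`; it must be fitted before use.
scaler = StandardScaler().fit(raw)

# The isinstance guard in transform/inverse_transform passes for any sklearn transformer.
assert isinstance(scaler, TransformerMixin)

preprocessed = scaler.transform(raw)               # raw -> preprocessed
restored = scaler.inverse_transform(preprocessed)  # preprocessed -> raw
```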

TorchPreprocessor(input_size: tuple[int, ...], module_transform: torch.nn.Module | None = None, module_inverse_transform: torch.nn.Module | None = None) #

Preprocessor class based on pytorch.

To customize your preprocessor, inherit from this class and implement the transform and inverse_transform methods. Additionally, you can define module_transform and module_inverse_transform in the init method.

Initialize the preprocessor.

Parameters:

Name Type Description Default

input_size #

tuple[int, ...]

The dimensions of the data that the preprocessor expects, excluding the batch size. input_size must match the dimensions of the data in your dataset.

- Set an empty tuple () if the data has no dimensions (e.g., for scalar values).
- For example, for an array of size (3, 2) in your dataset, set input_size to (3, 2).

required

module_transform #

Module | None

A PyTorch module to preprocess data from the raw input space to the preprocessed space. If transform is not overridden in a subclass, the default transform method applies this module.

None

module_inverse_transform #

Module | None

A PyTorch module to reverse the preprocessing, converting data from the preprocessed space back to the raw input space. If inverse_transform is not overridden in a subclass, the default inverse_transform method applies this module.

None

Methods:

Name Description
forward

Apply transform when ward is True, otherwise inverse_transform.

transform

Preprocess data: take a tensor as input and return the preprocessed tensor.

inverse_transform

Inverse of transform: map a preprocessed tensor back to the raw input space.

from_exposed

Unparse the serialized preprocessor to use it on client side.

Attributes:

Name Type Description
input_size
ward
module_transform
module_inverse_transform
as_exposed ExposedTorchPreprocessor

Parse the preprocessor to send it to xpdeep server.

Source code in src/xpdeep/dataset/schema/preprocessor.py
def __init__(
    self,
    input_size: tuple[int, ...],
    module_transform: torch.nn.Module | None = None,
    module_inverse_transform: torch.nn.Module | None = None,
):
    """Initialize the preprocessor.

    Parameters
    ----------
    input_size:tuple[int, ...]
        The dimensions of the data that the preprocessor expects, excluding the batch size.
        `input_size` must match the dimensions of the data in your dataset.
        - Set an empty tuple `()` if no specific dimensions are provided (e.g., for scalar values).
        - For example, for an array of size `(3, 2)` in your dataset, set `input_size` to `(3, 2)`.
    module_transform:torch.nn.Module | None
        A PyTorch module to preprocess data from the raw input space to the preprocessed space.
        If `transform` is not overridden in a subclass, the default `transform` method applies this module.
    module_inverse_transform:torch.nn.Module | None
        A PyTorch module to reverse the preprocessing, converting data from the preprocessed space
        back to the raw input space.
        If `inverse_transform` is not overridden in a subclass, the default `inverse_transform` method applies this module.
    """
    super().__init__()
    self.input_size = input_size
    self.ward = True
    self.module_transform = module_transform
    self.module_inverse_transform = module_inverse_transform
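
As an illustration of the two module arguments, the sketch below defines a pair of plain PyTorch modules that undo each other. `Scale` and the factor 255 are illustrative choices, not part of xpdeep; the commented-out constructor call follows the signature documented above, and its import path is an assumption based on the source file location.

```python
import torch
from torch import nn


class Scale(nn.Module):
    """Multiply inputs by a fixed factor (illustrative module)."""

    def __init__(self, factor: float) -> None:
        super().__init__()
        self.factor = factor

    def forward(self, inputs: torch.Tensor) -> torch.Tensor:
        return inputs * self.factor


to_preprocessed = Scale(1 / 255)  # raw space -> preprocessed space
to_raw = Scale(255.0)             # preprocessed space -> raw space

# Hypothetical usage (import path assumed, not confirmed by this page):
# from xpdeep.dataset.schema.preprocessor import TorchPreprocessor
# preprocessor = TorchPreprocessor(
#     input_size=(3, 2),
#     module_transform=to_preprocessed,
#     module_inverse_transform=to_raw,
# )

# A batch of 4 samples, each of size (3, 2): input_size excludes the batch dimension.
batch = torch.rand(4, 3, 2) * 255
restored = to_raw(to_preprocessed(batch))
```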

input_size = input_size #

ward = True #

module_transform = module_transform #

module_inverse_transform = module_inverse_transform #

as_exposed: ExposedTorchPreprocessor #

Parse the preprocessor to send it to xpdeep server.

forward(inputs: torch.Tensor) -> torch.Tensor #

Apply transform when ward is True, otherwise inverse_transform.

Source code in src/xpdeep/dataset/schema/preprocessor.py
def forward(self, inputs: torch.Tensor) -> torch.Tensor:
    """Transform."""
    if self.ward:
        return self.transform(inputs)
    return self.inverse_transform(inputs)
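
Judging from the source above, calling the module applies `transform` while `ward` is `True` and `inverse_transform` once it is set to `False`. The stand-in class below reproduces only that dispatch; `DirectionalSketch` and its add-one/subtract-one methods are hypothetical and exist solely to make the behaviour observable.

```python
import torch
from torch import nn


class DirectionalSketch(nn.Module):
    """Stand-in reproducing only the `ward` dispatch of forward()."""

    def __init__(self) -> None:
        super().__init__()
        self.ward = True  # True: forward() transforms; False: forward() inverts

    def forward(self, inputs: torch.Tensor) -> torch.Tensor:
        if self.ward:
            return self.transform(inputs)
        return self.inverse_transform(inputs)

    def transform(self, inputs: torch.Tensor) -> torch.Tensor:
        return inputs + 1.0

    def inverse_transform(self, output: torch.Tensor) -> torch.Tensor:
        return output - 1.0


sketch = DirectionalSketch()
x = torch.zeros(2, 3)

y = sketch(x)        # ward is True, so forward() calls transform
sketch.ward = False
x_back = sketch(y)   # ward is False, so forward() calls inverse_transform
```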

transform(inputs: torch.Tensor) -> torch.Tensor #

Preprocess data: take a tensor as input and return the preprocessed tensor.

Source code in src/xpdeep/dataset/schema/preprocessor.py
def transform(self, inputs: torch.Tensor) -> torch.Tensor:
    """Process data: ie take in input a tensor and return the tensor preprocessed."""
    if self.module_transform is None:
        raise NotImplementedError("Implement this function.")
    return cast(torch.Tensor, self.module_transform(inputs))

inverse_transform(output: torch.Tensor) -> torch.Tensor #

Inverse of transform.

That is, for all x, inverse_transform(transform(x)) = transform(inverse_transform(x)) = x.

Source code in src/xpdeep/dataset/schema/preprocessor.py
def inverse_transform(self, output: torch.Tensor) -> torch.Tensor:
    r"""Reciprocal of preprocess.

    That is, for all x, inverse_transform(transform(x)) = transform(inverse_transform(x)) = x.
    """
    if self.module_inverse_transform is None:
        raise NotImplementedError("implement this function.")
    return cast(torch.Tensor, self.module_inverse_transform(output))
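
The reciprocity property can be checked numerically for a candidate pair before wiring it into a preprocessor. The sketch below uses `log1p` and `expm1` as an assumed example pair; in a `TorchPreprocessor` subclass, the two function bodies would go into the overridden `transform` and `inverse_transform` methods.

```python
import torch


def transform(inputs: torch.Tensor) -> torch.Tensor:
    """Illustrative forward map: log(1 + x)."""
    return torch.log1p(inputs)


def inverse_transform(output: torch.Tensor) -> torch.Tensor:
    """Illustrative inverse map: exp(y) - 1."""
    return torch.expm1(output)


# Double precision keeps the round-trip error far below the default tolerances.
x = torch.rand(4, 3, dtype=torch.float64)

left = inverse_transform(transform(x))   # should recover x
right = transform(inverse_transform(x))  # should recover x as well

assert torch.allclose(left, x)
assert torch.allclose(right, x)
```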

from_exposed(exposed_torch_preprocessor: ExposedTorchPreprocessor) -> Self #

Unparse the serialized preprocessor to use it on client side.

Source code in src/xpdeep/dataset/schema/preprocessor.py
@classmethod
def from_exposed(cls, exposed_torch_preprocessor: ExposedTorchPreprocessor) -> Self:
    """Unparse the serialized preprocessor to use it on client side."""
    inverse_transform = exposed_torch_preprocessor.inverse_preprocess_transformer.to_torch_module()
    if exposed_torch_preprocessor.preprocessed_size is None:
        raise ValueError("Cannot rebuild the preprocessor: preprocessed_size is None (the preprocessor was not fitted).")
    input_size = inverse_transform(torch.randn(size=(2, *exposed_torch_preprocessor.preprocessed_size))).size()[1:]
    return cls(
        input_size=input_size,
        module_transform=exposed_torch_preprocessor.preprocess_transformer.to_torch_module(),
        module_inverse_transform=inverse_transform,
    )
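
The source above recovers `input_size` by pushing a dummy batch of two samples through the inverse module and dropping the batch dimension from the result. A minimal sketch of that probing step, assuming an `nn.Unflatten` inverse and a preprocessed size of `(6,)` purely for illustration:

```python
import torch
from torch import nn

# Assumed values for illustration: flat preprocessed features of size 6,
# mapped back to raw samples of size (3, 2).
preprocessed_size = (6,)
inverse_transform = nn.Unflatten(1, (3, 2))

# Dummy batch of 2 samples in the preprocessed space.
dummy_batch = torch.randn(size=(2, *preprocessed_size))

# Drop the leading batch dimension to obtain the raw feature size.
input_size = inverse_transform(dummy_batch).size()[1:]
```

Here `input_size` is a `torch.Size` equal to `(3, 2)`, matching the raw sample shape.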