preprocessor
The schema package provides tools to infer and build a dataset schema.
Modules:
| Name | Description |
|---|---|
| `preprocessor` | Feature preprocessor. |
| `utils_stable_hash` | Utility for the hash. |
| `zoo` | Preprocessor zoo. |
Classes:
| Name | Description |
|---|---|
| `IdentityPreprocessor` | Identity Preprocessor class. |
| `SklearnPreprocessor` | Preprocessor class based on sklearn preprocessing classes. |
| `TorchPreprocessor` | Preprocessor class based on pytorch. |
__all__ = ['IdentityPreprocessor', 'SklearnPreprocessor', 'TorchPreprocessor']
IdentityPreprocessor
Identity Preprocessor class.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| | `str` | | `None` |
Methods:
| Name | Description |
|---|---|
| `to_model` | Convert to PreprocessorInsert instance. |
| `from_model` | Create the client object from api response. |
| `stable_hash` | Return the hash. |
SklearnPreprocessor
Preprocessor class based on sklearn preprocessing classes.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| | `str` | | `None` |
| | `str` | | required |
Methods:
| Name | Description |
|---|---|
| `transform` | Transform a feature raw value into its preprocessed value. |
| `inverse_transform` | Inverse transform a feature preprocessed value into its raw value. |
| `to_model` | Convert to PreprocessorInsert instance. |
| `from_model` | Create the client object from api response. |
| `stable_hash` | Return the hash. |
Attributes:
| Name | Type | Description |
|---|---|---|
| `preprocess_function` | `TransformerMixin` | |
preprocess_function: TransformerMixin
transform(feature_raw_value: object) -> torch.Tensor
Transform a feature raw value into its preprocessed value.
Source code in src/xpdeep/dataset/preprocessor/preprocessor.py
inverse_transform(preprocessed_value: torch.Tensor) -> object
Inverse transform a feature preprocessed value into its raw value.
Source code in src/xpdeep/dataset/preprocessor/preprocessor.py
to_model() -> PreprocessorInsert
Convert to PreprocessorInsert instance.
Source code in src/xpdeep/dataset/preprocessor/preprocessor.py
from_model(preprocessor_input: PreprocessorSelectInput) -> SklearnPreprocessor
Create the client object from api response.
Source code in src/xpdeep/dataset/preprocessor/preprocessor.py
stable_hash() -> str
Return the hash.
Source code in src/xpdeep/dataset/preprocessor/preprocessor.py
TorchPreprocessor(input_size: tuple[int, ...], module_transform: torch.nn.Module | None = None, module_inverse_transform: torch.nn.Module | None = None, **additional_attributes: object)
Preprocessor class based on pytorch.
To customize your preprocessor, inherit from this class and implement the `transform` and `inverse_transform` methods. Additionally, you can define `module_transform` and `module_inverse_transform` in the `__init__` method.
Initialize the preprocessor.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `input_size` | `tuple[int, ...]` | The dimensions of the data that the preprocessor expects, excluding the batch size. | required |
| `module_transform` | `Module \| None` | A PyTorch module to preprocess data from the raw input space to the preprocessed space. If | `None` |
| `module_inverse_transform` | `Module \| None` | A PyTorch module to reverse the preprocessing, converting data from the preprocessed space back to the raw input space. If | `None` |
| `**additional_attributes` | `object` | Any additional keyword arguments can be passed when instantiating a TorchPreprocessor or a child class. These arguments will be set as class attributes. It can be especially useful to better customize the implementation of | `{}` |
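As an illustration of how extra keyword arguments can become attributes that a child class reads, here is a minimal, torch-free sketch. `AttributePreprocessorSketch` and `ScaleSketch` are hypothetical stand-ins written for this example, not xpdeep classes, and the sketch assumes the arguments end up as attributes on the instance; the real `TorchPreprocessor` may store them differently.

```python
class AttributePreprocessorSketch:
    """Hypothetical stand-in (not the xpdeep class): keeps extra keyword arguments as attributes."""

    def __init__(self, input_size: tuple[int, ...], **additional_attributes: object) -> None:
        self.input_size = input_size
        # Assumption: each extra keyword argument is exposed under its own name.
        for name, value in additional_attributes.items():
            setattr(self, name, value)


class ScaleSketch(AttributePreprocessorSketch):
    """Hypothetical child class whose methods read an attribute supplied as a keyword argument."""

    scale: float  # expected to be passed through **additional_attributes

    def transform(self, inputs: list[float]) -> list[float]:
        # Divide every raw value by the user-supplied scale.
        return [value / self.scale for value in inputs]

    def inverse_transform(self, output: list[float]) -> list[float]:
        # Multiply back to recover the raw values.
        return [value * self.scale for value in output]


preprocessor = ScaleSketch(input_size=(3,), scale=4.0)
scaled = preprocessor.transform([4.0, 8.0, 2.0])
restored = preprocessor.inverse_transform(scaled)
```

With this pattern the child class needs no custom `__init__`: the value it depends on (`scale` here) is supplied at instantiation time.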
Methods:
| Name | Description |
|---|---|
| `forward` | Transform. |
| `transform` | Process data, i.e. take a tensor as input and return the preprocessed tensor. |
| `inverse_transform` | Reciprocal of preprocess. |
| `to_model` | Convert to PreprocessorInsert instance. |
| `stable_hash` | Compute a stable hash for a torch module or exported program. |
| `from_model` | Convert to TorchPreprocessor. |
Attributes:
| Name | Type | Description |
|---|---|---|
| `input_size` | | |
| `ward` | | |
| `module_transform` | | |
| `module_inverse_transform` | | |
Source code in src/xpdeep/dataset/preprocessor/preprocessor.py
input_size = input_size
ward = True
module_transform = module_transform
module_inverse_transform = module_inverse_transform
forward(inputs: torch.Tensor) -> torch.Tensor
transform(inputs: torch.Tensor) -> torch.Tensor
Process data, i.e. take a tensor as input and return the preprocessed tensor.
Source code in src/xpdeep/dataset/preprocessor/preprocessor.py
inverse_transform(output: torch.Tensor) -> torch.Tensor
Reciprocal of preprocess, i.e. for all x, `inverse_transform(transform(x)) = transform(inverse_transform(x)) = x`.
Source code in src/xpdeep/dataset/preprocessor/preprocessor.py
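The round-trip property stated for `inverse_transform` can be checked numerically. The sketch below is a plain-Python illustration under stated assumptions: the `log1p`/`expm1` pair is a hypothetical transform chosen for the example, not one shipped by the package, and equality is tested up to a floating-point tolerance rather than exactly.

```python
import math


def transform(x: float) -> float:
    """Hypothetical forward transform: log(1 + x)."""
    return math.log1p(x)


def inverse_transform(y: float) -> float:
    """Hypothetical inverse transform: exp(y) - 1."""
    return math.expm1(y)


def is_round_trip(values: list[float], tol: float = 1e-9) -> bool:
    """Check both compositions return the input, up to a tolerance."""
    return all(
        math.isclose(inverse_transform(transform(v)), v, rel_tol=tol, abs_tol=tol)
        and math.isclose(transform(inverse_transform(v)), v, rel_tol=tol, abs_tol=tol)
        for v in values
    )


samples = [0.0, 0.5, 3.0, 10.0]
round_trip_ok = is_round_trip(samples)
```

A check of this kind is a cheap sanity test to run on a custom preprocessor before using it, since a mismatched inverse would map preprocessed values back to the wrong raw values.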
to_model() -> PreprocessorInsert
Convert to PreprocessorInsert instance.
Source code in src/xpdeep/dataset/preprocessor/preprocessor.py
stable_hash() -> str
Compute a stable hash for a torch module or exported program.
Returns:
| Type | Description |
|---|---|
| `str` | Hexadecimal digest that is stable across runs for identical parameters, buffers, and shapes. |
Notes
The hash is computed by:
- Obtaining the state dictionary.
- Sorting keys lexicographically.
- Moving tensors to CPU, making them contiguous.
- Hashing key names, dtypes, shapes, and raw tensor bytes.
This makes the hash independent of device placement and dictionary insertion order. It will change whenever any parameter or buffer content or shape changes.
Source code in src/xpdeep/dataset/preprocessor/preprocessor.py
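The steps listed in the notes can be sketched without torch. This is an illustration only, not the package's implementation. It makes three assumptions: the state dictionary is modelled as a plain `dict` mapping each key to a hypothetical `(dtype, shape, raw bytes)` triple, SHA-256 is the digest, and each field is length-prefixed. With a real module, the entries would come from `module.state_dict()` after moving each tensor to CPU and making it contiguous.

```python
import hashlib
from array import array

# Hypothetical stand-in for one tensor: (dtype name, shape, raw bytes).
TensorLike = tuple[str, tuple[int, ...], bytes]


def stable_hash_sketch(state: dict[str, TensorLike]) -> str:
    """Hash key names, dtypes, shapes and raw bytes in sorted key order."""
    hasher = hashlib.sha256()  # assumption: the real digest algorithm may differ
    for key in sorted(state):  # sorting removes any dependence on insertion order
        dtype, shape, raw = state[key]
        for field in (key.encode(), dtype.encode(), repr(shape).encode(), raw):
            # Length-prefix each field so that field boundaries are unambiguous.
            hasher.update(len(field).to_bytes(8, "little"))
            hasher.update(field)
    return hasher.hexdigest()


weights = array("f", [1.0, 2.0, 3.0, 4.0]).tobytes()
bias = array("f", [0.5, -0.5]).tobytes()
other_bias = array("f", [0.5, 0.25]).tobytes()

state_a = {"linear.weight": ("float32", (2, 2), weights), "linear.bias": ("float32", (2,), bias)}
# Same content, different insertion order.
state_b = {"linear.bias": ("float32", (2,), bias), "linear.weight": ("float32", (2, 2), weights)}
# Same keys and shapes, different bias content.
state_c = {"linear.weight": ("float32", (2, 2), weights), "linear.bias": ("float32", (2,), other_bias)}
# Same bytes, different weight shape.
state_d = {"linear.weight": ("float32", (4,), weights), "linear.bias": ("float32", (2,), bias)}

digest_a = stable_hash_sketch(state_a)
```

In this sketch `state_a` and `state_b` produce the same digest, while `state_c` (changed content) and `state_d` (changed shape) each produce a different one, mirroring the behaviour described in the notes.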
from_model(preprocessor_input: PreprocessorSelectInput) -> TorchPreprocessor
Convert to TorchPreprocessor.