feature_types

Feature types module.

Classes:

Name	Description
`NumericalFeature`	Numerical feature.
`CategoricalFeature`	Categorical feature.
`ImageFeature`	Image feature.
`MultivariateNumericalFeature`	Multivariate numerical feature.
`UnivariateTimeSeriesFeature`	Univariate time series feature.
`MultivariateTimeSeriesFeature`	Multivariate time series feature.
`BoundingBoxesFeature`	Feature that represents a list of bounding boxes, for instance the detections in a single image.

Functions:

Name	Description
`feature_type_from_model`	Convert a JSON feature description into the corresponding feature object.

`NumericalFeature` #

Numerical feature.

It represents quantifiable values that can be measured and ordered. Its values may be continuous (e.g., real numbers) or discrete (e.g., integers). For instance, "age" or "price" could be set as numerical features.

Parameters:

Name	Type	Description	Default
`name` #	`Literal['NUMERICAL']`		`'NUMERICAL'`

Methods:

Name	Description
`to_model`	Convert to NumericalFeatureType instance.

Attributes:

Name	Type	Description
`name`	`Literal['NUMERICAL']`

`name: Literal['NUMERICAL'] = 'NUMERICAL'` #

`to_model() -> NumericalFeatureType` #

Convert to NumericalFeatureType instance.

Source code in src/xpdeep/dataset/feature/feature_types.py

def to_model(self) -> NumericalFeatureType:
    """Convert to NumericalFeatureType instance."""
    return NumericalFeatureType(name=self.name)

`CategoricalFeature` #

Categorical feature.

It represents data (bool, int or str only) that can be divided into distinct groups or categories.

It may be nominal or ordinal. For instance, "Education level" or "gender" could be set as categorical features.

Parameters:

Name	Type	Description	Default
`categories` #	`list[str] \| list[int] \| None`	When building an `AnalyzedSchema`: if the preprocessor is a `label_encoder` or an `onehot_encoder`, the categories are inferred automatically from the fitted preprocessor and should not be provided; otherwise, categories must be provided. When building a `FittedSchema`, categories must be provided. The order of the categories must match the preprocessor's processing order. When using the dataset `analyze` and `fit` methods, no categories should be provided.	`None`
`name` #	`Literal['CATEGORICAL']`		`'CATEGORICAL'`

Methods:

Name	Description
`to_model`	Convert to CategoricalFeatureType instance.
`__attrs_post_init__`	Set a placeholder category, to be overwritten by the categories inferred by the fitted preprocessor.

Attributes:

Name	Type	Description
`name`	`Literal['CATEGORICAL']`
`categories`	`list[str] \| list[int] \| None`

`name: Literal['CATEGORICAL'] = 'CATEGORICAL'` #

`categories: list[str] | list[int] | None = None` #

`to_model() -> CategoricalFeatureType` #

Convert to CategoricalFeatureType instance.

Source code in src/xpdeep/dataset/feature/feature_types.py

def to_model(self) -> CategoricalFeatureType:
    """Convert to CategoricalFeatureType instance."""
    return CategoricalFeatureType(
        name=self.name,
        categories=[] if self.categories is None or len(self.categories) == 0 else self.categories,
    )

`__attrs_post_init__() -> None` #

Set a placeholder category, to be overwritten by the categories inferred by the fitted preprocessor. The method returns nothing; it only replaces a `categories` value of `None`.

Source code in src/xpdeep/dataset/feature/feature_types.py

def __attrs_post_init__(self) -> None:
    """Return a fake category to be overwritten by the fitted preprocessor inferred categories."""
    if self.categories is None:
        self.categories = ["fake_categories"]
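
The interplay between `__attrs_post_init__` and `to_model` can be illustrated with a small stand-in. The sketch below does not import xpdeep: `CategoricalFeatureSketch` is a hypothetical dataclass that mirrors only the two listings above, and `to_model_categories` is a hypothetical helper that stands in for the `categories` argument passed to `CategoricalFeatureType`.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass
class CategoricalFeatureSketch:
    """Hypothetical stand-in for CategoricalFeature (not the xpdeep class)."""

    categories: list[str] | list[int] | None = None
    name: str = "CATEGORICAL"

    def __post_init__(self) -> None:
        # Mirrors __attrs_post_init__: a missing value becomes a placeholder.
        if self.categories is None:
            self.categories = ["fake_categories"]

    def to_model_categories(self) -> list[str] | list[int]:
        # Mirrors the expression used in to_model: None or empty becomes [].
        if self.categories is None or len(self.categories) == 0:
            return []
        return self.categories


inferred = CategoricalFeatureSketch()  # categories left to the preprocessor
explicit = CategoricalFeatureSketch(categories=["low", "high"])
empty = CategoricalFeatureSketch(categories=[])
```

In this stand-in, `inferred.categories` holds the placeholder, `explicit` keeps its two categories, and `empty.to_model_categories()` yields an empty list. Because the post-init hook has already replaced `None`, the `None` branch of `to_model` only matters if `categories` is reset to `None` after construction.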

`ImageFeature` #

Image feature.

It represents images. The corresponding data should use the channel-last format, i.e. batch_size x H x W x num_channels.

Parameters:

Name	Type	Description	Default
`name` #	`Literal['IMAGE']`		`'IMAGE'`

Methods:

Name	Description
`to_model`	Convert to ImageFeatureType instance.

Attributes:

Name	Type	Description
`name`	`Literal['IMAGE']`

`name: Literal['IMAGE'] = 'IMAGE'` #

`to_model() -> ImageFeatureType` #

Convert to ImageFeatureType instance.

Source code in src/xpdeep/dataset/feature/feature_types.py

def to_model(self) -> ImageFeatureType:
    """Convert to ImageFeatureType instance."""
    return ImageFeatureType(name=self.name)
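
The channel-last requirement is easy to miss when images come from a channel-first pipeline. The following sketch uses plain nested lists rather than an array library and shows one way to reorder a single C x H x W image into H x W x C. `to_channel_last` is a hypothetical helper, not part of xpdeep.

```python
from __future__ import annotations


def to_channel_last(image: list[list[list[float]]]) -> list[list[list[float]]]:
    """Reorder one image from C x H x W to H x W x C (hypothetical helper)."""
    num_channels = len(image)
    height = len(image[0])
    width = len(image[0][0])
    # For every pixel position, gather its value from each channel.
    return [
        [[image[c][h][w] for c in range(num_channels)] for w in range(width)]
        for h in range(height)
    ]


# A 3-channel image of height 1 and width 2, stored channel-first.
channel_first = [[[1.0, 2.0]], [[3.0, 4.0]], [[5.0, 6.0]]]
channel_last = to_channel_last(channel_first)
```

With NumPy arrays, the same reordering of a single image is an axis transposition such as `numpy.transpose(image, (1, 2, 0))`.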

`MultivariateNumericalFeature` #

Multivariate numerical feature.

It represents numerical data points divided into several channels. Unlike time series features, there is no temporal relationship between the points.

Parameters:

Name	Type	Description	Default
`channel_names` #	`List[str]`	List of channel names used for visualization.	required
`name` #	`Literal['MULTIVARIATE_NUMERICAL']`		`'MULTIVARIATE_NUMERICAL'`

Methods:

Name	Description
`to_model`	Convert to VectorFeatureTypeInput instance.

Attributes:

Name	Type	Description
`channel_names`	`list[str]`
`name`	`Literal['MULTIVARIATE_NUMERICAL']`

`channel_names: list[str]` #

`name: Literal['MULTIVARIATE_NUMERICAL'] = 'MULTIVARIATE_NUMERICAL'` #

`to_model() -> VectorFeatureTypeInput` #

Convert to VectorFeatureTypeInput instance.

Source code in src/xpdeep/dataset/feature/feature_types.py

def to_model(self) -> VectorFeatureTypeInput:
    """Convert to VectorFeatureTypeInput instance."""
    return VectorFeatureTypeInput(
        name="VECTOR", items=NumericalFeatureType(name="NUMERICAL"), channel_names=self.channel_names
    )

`UnivariateTimeSeriesFeature` #

Univariate time series feature.

It represents a time series with a single channel, either synchronized (no dynamic time warping required) or asynchronous.

Dynamic time warping (DTW) is applied automatically when it is required, i.e. when the series is asynchronous.

Parameters:

Name	Type	Description	Default
`asynchronous` #	`bool`	Whether the time series is asynchronous. If it is, dynamic time warping is applied automatically on the server side.	`False`
`channel` #	`tuple[str, str] \| str \| None`	Used in XpViz. If the same channel is used both as an input (a lookback, under a first feature object) and as a target (a horizon to predict, under a second feature object), this parameter may be specified to display both features (the lookback and its corresponding horizon) on the same curve.	`None`
`name` #	`Literal['UNIVARIATE_TIMESERIES']`		`'UNIVARIATE_TIMESERIES'`

Methods:

Name	Description
`to_model`	Convert to TimeseriesFeatureTypeInput instance.

Attributes:

Name	Type	Description
`name`	`Literal['UNIVARIATE_TIMESERIES']`
`asynchronous`	`bool`
`channel`	`tuple[str, str] \| str \| None`

`name: Literal['UNIVARIATE_TIMESERIES'] = 'UNIVARIATE_TIMESERIES'` #

`asynchronous: bool = field(default=False, kw_only=True)` #

`channel: tuple[str, str] | str | None = field(default=None, kw_only=True)` #

`to_model() -> TimeseriesFeatureTypeInput` #

Convert to TimeseriesFeatureTypeInput instance.

Source code in src/xpdeep/dataset/feature/feature_types.py

def to_model(self) -> TimeseriesFeatureTypeInput:
    """Convert to TimeseriesFeatureTypeInput instance."""
    return TimeseriesFeatureTypeInput(
        name="TIMESERIES",
        items=NumericalFeatureType(name="NUMERICAL"),
        asynchronous=self.asynchronous,
        channel=list(self.channel)
        if isinstance(self.channel, tuple)
        # TODO(Tanguy): replace with `else self.channel`
        # https://gitlab.xpdeep.com/xpdeep/xpdeep-client/-/issues/331
        else [self.channel, "0"]
        if self.channel is not None
        else None,
    )
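
The chained conditional that builds `channel` in the listing above can be hard to read. The sketch below restates it as a standalone function with the same three branches. `normalize_channel` is a hypothetical name introduced for illustration, and the channel values are made up.

```python
from __future__ import annotations


def normalize_channel(channel: tuple[str, str] | str | None) -> list[str] | None:
    """Hypothetical helper reproducing the `channel` expression in to_model."""
    if channel is None:
        # No channel given: nothing is sent.
        return None
    if isinstance(channel, tuple):
        # A 2-tuple is forwarded as a list.
        return list(channel)
    # A bare string is paired with "0", as in the listing above.
    return [channel, "0"]


as_tuple = normalize_channel(("temperature", "horizon"))
as_string = normalize_channel("temperature")
as_none = normalize_channel(None)
```

Here `as_tuple` is `["temperature", "horizon"]`, `as_string` is `["temperature", "0"]`, and `as_none` is `None`. The `TODO` in the source suggests the string branch is temporary, so this sketch reflects the listing as shown rather than a stable contract.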

`MultivariateTimeSeriesFeature` #

Multivariate time series feature.

This class represents a multivariate time series (multiple channels), either synchronized (no dynamic time warping required) or asynchronous.

Dynamic time warping (DTW) is applied automatically when it is required, i.e. when the series is asynchronous.

Parameters:

Name	Type	Description	Default
`asynchronous` #	`bool`	Whether the time series is asynchronous. If it is, dynamic time warping is applied automatically on the server side.	`False`
`channel_names` #	`List[str]`	List of channel names used for visualization.	required
`name` #	`Literal['MULTIVARIATE_TIMESERIES']`		`'MULTIVARIATE_TIMESERIES'`

Methods:

Name	Description
`to_model`	Convert to TimeseriesFeatureTypeInput instance.

Attributes:

Name	Type	Description
`channel_names`	`list[str]`
`name`	`Literal['MULTIVARIATE_TIMESERIES']`
`asynchronous`	`bool`

`channel_names: list[str]` #

`name: Literal['MULTIVARIATE_TIMESERIES'] = 'MULTIVARIATE_TIMESERIES'` #

`asynchronous: bool = field(default=False, kw_only=True)` #

`to_model() -> TimeseriesFeatureTypeInput` #

Convert to TimeseriesFeatureTypeInput instance.

Source code in src/xpdeep/dataset/feature/feature_types.py

def to_model(self) -> TimeseriesFeatureTypeInput:
    """Convert to TimeseriesFeatureTypeInput instance."""
    return TimeseriesFeatureTypeInput(
        name="TIMESERIES",
        items=VectorFeatureTypeInput(
            name="VECTOR", items=NumericalFeatureType(name="NUMERICAL"), channel_names=self.channel_names
        ),
        asynchronous=self.asynchronous,
    )
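
The nesting in the listing above (a time series whose items are vectors of numerical values) may be easier to see as plain data. The sketch below builds a dictionary with the same field names. It assumes the model serializes to a dictionary of this shape, which the source does not confirm, and `multivariate_timeseries_payload` is a hypothetical helper.

```python
from __future__ import annotations


def multivariate_timeseries_payload(
    channel_names: list[str], asynchronous: bool = False
) -> dict[str, object]:
    """Hypothetical helper: dict mirroring the fields set in to_model."""
    return {
        "name": "TIMESERIES",
        "items": {
            "name": "VECTOR",
            "items": {"name": "NUMERICAL"},
            # Channel names live on the inner vector, not on the time series.
            "channel_names": channel_names,
        },
        "asynchronous": asynchronous,
    }


payload = multivariate_timeseries_payload(["voltage", "current"], asynchronous=True)
```

This is also the layout that `feature_type_from_model` reads back: it takes `asynchronous` from the top level and `channel_names` from `items`.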

`BoundingBoxesFeature` #

Feature that represents a list of bounding boxes, for instance the detections in a single image.

Parameters:

Name	Type	Description	Default
`categories` #	`list[int] \| list[str]`	The categories that the bounding boxes represent.	required
`name` #	`Literal['BBOX_LIST']`	The feature name.	`'BBOX_LIST'`
`channel_names` #	`tuple[str, ...]`	The expected channel names for a single bounding box. The expected format is its class, its position, and the detection score.	`('class', 'center_x', 'center_y', 'width', 'height', 'score')`

Methods:

Name	Description
`to_model`	Convert to a ListFeatureTypeInput instance whose items are BBoxFeatureType.

Attributes:

Name	Type	Description
`categories`	`list[int] \| list[str]`
`name`	`Literal['BBOX_LIST']`
`channel_names`	`tuple[str, ...]`

`categories: list[int] | list[str]` #

`name: Literal['BBOX_LIST'] = 'BBOX_LIST'` #

`channel_names: tuple[str, ...] = ('class', 'center_x', 'center_y', 'width', 'height', 'score')` #

`to_model() -> ListFeatureTypeInput` #

Convert to a ListFeatureTypeInput instance whose items are BBoxFeatureType.

Source code in src/xpdeep/dataset/feature/feature_types.py

def to_model(self) -> ListFeatureTypeInput:
    """Convert to BBoxListFeatureType instance."""
    return ListFeatureTypeInput(
        name="LIST",
        items=BBoxFeatureType(
            name="BBOX", size=[0, 0], categories=self.categories, channel_names=list(self.channel_names)
        ),
    )
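
To make the default `channel_names` concrete, the sketch below labels the values of one bounding box. It assumes that each box is a flat sequence of six values in the same order as `channel_names`, which the source implies but does not state. `label_box` and the sample values are hypothetical.

```python
from __future__ import annotations

from collections.abc import Sequence

# Default taken from the BoundingBoxesFeature signature above.
DEFAULT_CHANNEL_NAMES = ("class", "center_x", "center_y", "width", "height", "score")


def label_box(
    box: Sequence[float], channel_names: tuple[str, ...] = DEFAULT_CHANNEL_NAMES
) -> dict[str, float]:
    """Hypothetical helper: pair each value of one box with its channel name."""
    if len(box) != len(channel_names):
        msg = f"expected {len(channel_names)} values, got {len(box)}"
        raise ValueError(msg)
    return dict(zip(channel_names, box))


# One detection: class index 2, centered box, with a detection score of 0.9.
labeled = label_box([2, 0.5, 0.5, 0.2, 0.4, 0.9])
```

The length check guards against a mismatch between the data layout and a custom `channel_names` tuple.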

`feature_type_from_model(json_response: dict[str, object]) -> NumericalFeature | CategoricalFeature | MultivariateNumericalFeature | UnivariateTimeSeriesFeature | MultivariateTimeSeriesFeature | ImageFeature | BoundingBoxesFeature` #

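Convert a JSON feature description into the corresponding feature object. The top-level `name` selects the feature class; for `TIMESERIES`, the nested `items` name distinguishes the univariate case (`NUMERICAL`) from the multivariate one (`VECTOR`).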

Source code in src/xpdeep/dataset/feature/feature_types.py

def feature_type_from_model(  # noqa: PLR0911
    json_response: dict[str, object],
) -> (
    NumericalFeature
    | CategoricalFeature
    | MultivariateNumericalFeature
    | UnivariateTimeSeriesFeature
    | MultivariateTimeSeriesFeature
    | ImageFeature
    | BoundingBoxesFeature
):
    """Convert json features into the corresponding Feature."""
    match json_response["name"]:
        case "NUMERICAL":
            return NumericalFeature()
        case "CATEGORICAL":
            return CategoricalFeature(categories=json_response["categories"])  # type: ignore[arg-type]
        case "VECTOR":
            return MultivariateNumericalFeature(channel_names=json_response["channel_names"])  # type: ignore[arg-type]
        case "TIMESERIES":
            match json_response["items"]["name"]:  # type: ignore[index]
                case "NUMERICAL":
                    return UnivariateTimeSeriesFeature(
                        asynchronous=json_response["asynchronous"],  # type: ignore[arg-type]
                        channel=json_response["channel"],  # type: ignore[arg-type]
                    )
                case "VECTOR":
                    return MultivariateTimeSeriesFeature(
                        asynchronous=json_response["asynchronous"],  # type: ignore[arg-type]
                        channel_names=json_response["items"]["channel_names"],  # type: ignore[index]
                    )
                case _:
                    msg = f"Feature type `Timeseries[{json_response['items']['expose_data_type']}]` not recognized"  # type: ignore[index]
                    raise ApiError(msg)
        case "IMAGE":
            return ImageFeature()
        case "LIST":
            return BoundingBoxesFeature(
                categories=json_response["items"]["categories"],  # type: ignore[index]
                channel_names=json_response["items"]["channel_names"],  # type: ignore[index]
            )
        case _:
            msg = f"Feature type `{json_response['expose_data_type']}` not recognized"
            raise ApiError(msg)
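
The two-level dispatch in the listing above can be summarized with a lookup that returns only the name of the class that would be built. This is a sketch, not the xpdeep implementation: `feature_class_name` is a hypothetical helper, and it raises `ValueError` where the source raises `ApiError`.

```python
from __future__ import annotations

# Top-level "name" values that map directly to one feature class.
_SIMPLE = {
    "NUMERICAL": "NumericalFeature",
    "CATEGORICAL": "CategoricalFeature",
    "VECTOR": "MultivariateNumericalFeature",
    "IMAGE": "ImageFeature",
    "LIST": "BoundingBoxesFeature",
}

# For "TIMESERIES", the nested items name decides the class.
_TIMESERIES = {
    "NUMERICAL": "UnivariateTimeSeriesFeature",
    "VECTOR": "MultivariateTimeSeriesFeature",
}


def feature_class_name(json_response: dict) -> str:
    """Hypothetical helper: class that feature_type_from_model would return."""
    name = json_response["name"]
    if name == "TIMESERIES":
        item_name = json_response["items"]["name"]
        if item_name not in _TIMESERIES:
            raise ValueError(f"Feature type `Timeseries[{item_name}]` not recognized")
        return _TIMESERIES[item_name]
    if name not in _SIMPLE:
        raise ValueError(f"Feature type `{name}` not recognized")
    return _SIMPLE[name]


univariate = feature_class_name({"name": "TIMESERIES", "items": {"name": "NUMERICAL"}})
multivariate = feature_class_name({"name": "TIMESERIES", "items": {"name": "VECTOR"}})
```

Note that a top-level `VECTOR` maps to `MultivariateNumericalFeature`, while a `VECTOR` nested inside `TIMESERIES` maps to `MultivariateTimeSeriesFeature`.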
