Post-Hoc Explanations#

Post-Hoc versus Ante-Hoc#

There are two approaches to explainability:

  • ante-hoc: explanations are learned while the model itself is being trained, so the deep model is intrinsically explainable.

  • post-hoc: explanations are produced by analyzing the model after it has been trained.

Xpdeep provides both types of explanations. The ante-hoc approach was presented in the explain section; the post-hoc approach is detailed below.
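The distinction can be sketched with a toy training loop. This is illustrative pseudocode only, not the Xpdeep API: in the ante-hoc setting the model and the explanation parameters are fitted together, while in the post-hoc setting the trained model is frozen and only the explanation is fitted afterwards.

```python
# Illustrative sketch only -- not the Xpdeep API.
# A "model" and an "explainer" are each a single scalar parameter,
# fitted by a crude gradient step on a squared loss.

def fit(param, grad_fn, lr=0.1, steps=50):
    for _ in range(steps):
        param -= lr * grad_fn(param)
    return param

target = 3.0  # what the model should learn

# Ante-hoc: model and explainer are updated together, step by step.
model_w, expl_w = 0.0, 0.0
for _ in range(50):
    model_w -= 0.1 * 2 * (model_w - target)   # model step
    expl_w -= 0.1 * 2 * (expl_w - model_w)    # explainer tracks the model

# Post-hoc: the model is already trained and stays frozen;
# only the explainer is fitted afterwards.
frozen_w = model_w                            # no further model updates
post_expl = fit(0.0, lambda e: 2 * (e - frozen_w))

print(round(frozen_w, 3), round(post_expl, 3))
```

Both routes end with an explainer that matches the trained model; the difference is only in when the explanation is fitted.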

Build the model#

The first step is to design a model and its parameters so that it is suitable for post-hoc explanations. We will follow the previous tutorial on how to create an explainable model, focusing on the design differences between the ante-hoc and post-hoc models.

As with the model creation for ante-hoc explanations, post-hoc explanations require your original model to be converted into an explainable model; please follow how to convert your original model to an XpdeepModel.

Info

Your model weights are preserved, so the converted explainable model yields the same performance as your original model.
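This guarantee can be illustrated with a minimal wrapper sketch in plain Python. The classes below are hypothetical and stand in for the actual conversion: because the wrapper reuses the original weights instead of copying or retraining them, the wrapped model predicts exactly like the original.

```python
# Illustrative sketch only -- not the actual XpdeepModel conversion.
# An "explainable" wrapper that reuses the original model's weights,
# so the wrapped model predicts exactly like the original.

class OriginalModel:
    def __init__(self, weights):
        self.weights = weights

    def predict(self, x):
        # A toy linear model: dot product of weights and inputs.
        return sum(w * xi for w, xi in zip(self.weights, x))

class ExplainableWrapper:
    def __init__(self, model):
        self.model = model  # weights are shared, not copied

    def predict(self, x):
        # Delegates to the original model: identical outputs by construction.
        return self.model.predict(x)

original = OriginalModel([0.5, -1.0, 2.0])
wrapped = ExplainableWrapper(original)

sample = [1.0, 2.0, 3.0]
assert wrapped.predict(sample) == original.predict(sample)
```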

Explainable Model Specifications - PostHoc version#

The model specification for post-hoc explanations needs to be slightly revisited: the model itself is already trained, so only its explanations need training.

We only need to set is_post_hoc to True to specify the post-hoc context.

from xpdeep.model.model_parameters import ModelDecisionGraphParameters
from xpdeep.model.feature_extraction_output_type import FeatureExtractionOutputType

build_configuration = ModelDecisionGraphParameters(
  graph_depth=3,
  target_homogeneity_weight=0.2,
  discrimination_weight=0.2,
  balancing_weight=0.2,
  target_homogeneity_pruning_threshold=0.9,
  population_pruning_threshold=0.1,
  prune_step=5,
  internal_model_complexity=1,
  feature_extraction_output_type=FeatureExtractionOutputType.VECTOR,
  is_post_hoc=True
)

Train the Explanations#

Although the training process appears very close to the original ante-hoc process, the Trainer object requires some adjustments. For convenience, use the PostHocTrainer interface and set its max_epochs parameter, which in a post-hoc context is the number of epochs to train the explanations for.

Internally, Xpdeep uses its own algorithm to compute and train explanations while keeping the original model's parameters and performance intact.

from xpdeep.trainer.trainer import PostHocTrainer

trainer = PostHocTrainer(max_epochs=5)
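What post-hoc training does can be sketched with a toy function. This is a hypothetical stand-in, not the PostHocTrainer internals: for max_epochs iterations, only the explanation parameters are updated, while the trained model's weights are never touched.

```python
# Illustrative sketch only -- not the PostHocTrainer internals.
# Explanation parameters are trained for max_epochs while the
# trained model's weights are left untouched.

def train_explanations(model_weights, max_epochs, lr=0.5):
    explanation = [0.0] * len(model_weights)  # explanation parameters
    for _ in range(max_epochs):
        # Fit the explanation toward the (frozen) model weights.
        explanation = [e + lr * (w - e) for e, w in zip(explanation, model_weights)]
    return explanation

weights = (1.0, -2.0, 0.5)  # frozen: a tuple cannot be mutated
explanation = train_explanations(list(weights), max_epochs=5)
print(explanation)
```

After training, the explanation approximates the frozen model while the model weights are bit-for-bit unchanged.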

Get the Post-Hoc Explanations#

Finally, once trained, the post-hoc explanations can be computed, visualized, and understood using exactly the same process as the ante-hoc explanations.