quantus.metrics.base module

This module implements the base class for creating evaluation metrics.

class quantus.metrics.base.Metric(abs: bool, normalise: bool, normalise_func: Callable | None, normalise_func_kwargs: Dict[str, Any] | None, return_aggregate: bool, aggregate_func: Callable, default_plot_func: Callable | None, disable_warnings: bool, display_progressbar: bool, **kwargs)

Bases: Generic[R]

Interface defining Metrics’ API.

Attributes:

disable_warnings: A helper to avoid polluting test outputs with warnings.
display_progressbar: A helper to avoid polluting test outputs with tqdm progress bars.
get_params: List parameters of metric.

Methods

`__call__`(model, x_batch, y_batch, a_batch, ...)	This implementation represents the main logic of the metric and makes the class object callable.
`batch_preprocess`(data_batch)	If data_batch has no a_batch, will compute explanations.
`custom_batch_preprocess`(*, model, x_batch, ...)	Implement this method if you need custom preprocessing of data or simply for creating/initialising additional attributes or assertions before a data_batch can be evaluated.
`custom_postprocess`(*, model, x_batch, ...)	Implement this method if you need custom postprocessing of results or additional attributes.
`custom_preprocess`(*, model, x_batch, ...)	Implement this method if you need custom preprocessing of data, model alteration or simply for creating/initialising additional attributes or assertions.
`evaluate_batch`(model, x_batch, y_batch, ...)	Evaluates model and attributes on a single data batch and returns the batched evaluation result.
`explain_batch`(model, x_batch, y_batch)	Computes explanations, then normalises them and takes absolute values if so configured at metric initialisation. This method should primarily be used if you need to generate additional explanations in the metric's body. It encapsulates the pre- and postprocessing approach typical for Quantus: call model.shape_input (if a ModelInterface instance was provided), unwrap the model (if a ModelInterface instance was provided), call explain_func, expand the attribution channel, (optionally) normalise a_batch, and (optionally) take np.abs of a_batch.
`general_preprocess`(model, x_batch, y_batch, ...)	Prepares all necessary variables for evaluation.
`generate_batches`(data, batch_size)	Creates iterator to iterate over all batched instances in data dictionary.
`interpret_scores`()	Get an interpretation of the scores.
`plot`([plot_func, show, path_to_save])	Basic plotting functionality for Metric class.

__call__(model, x_batch, y_batch, a_batch, ...)

This implementation represents the main logic of the metric and makes the class object callable. It performs batch-wise evaluation of explanations (a_batch) with respect to input data (x_batch), output labels (y_batch) and a torch or tensorflow model (model).

Calls general_preprocess() with all relevant arguments, calls evaluate_instance() on each instance, and saves results to evaluation_scores. Calls custom_postprocess() afterwards. Finally returns evaluation_scores.

The content of evaluation_scores will be appended to all_evaluation_scores (list) at the end of the evaluation call.
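The call flow described above can be sketched with a toy class. This is a minimal illustration, not the Quantus implementation: `ToyMetric`, its toy scoring rule, and the use of plain lists instead of `np.ndarray` are all assumptions made for the example, and whether the library appends or extends `all_evaluation_scores` is not specified here.

```python
from typing import Any, Dict, Iterator, List


class ToyMetric:
    """Hypothetical stand-in that mirrors the documented call flow only."""

    def __init__(self) -> None:
        self.evaluation_scores: List[Any] = []
        self.all_evaluation_scores: List[Any] = []

    def general_preprocess(self, **data: Any) -> Dict[str, Any]:
        # Placeholder: the real method also wraps the model, reshapes data, etc.
        return data

    def generate_batches(
        self, data: Dict[str, Any], batch_size: int
    ) -> Iterator[Dict[str, Any]]:
        n_instances = len(data["x_batch"])
        for start in range(0, n_instances, batch_size):
            yield {k: v[start:start + batch_size] for k, v in data.items()}

    def evaluate_batch(self, x_batch: List[float], a_batch: List[float]) -> List[float]:
        # Toy score: product of input and attribution for each instance.
        return [x * a for x, a in zip(x_batch, a_batch)]

    def custom_postprocess(self) -> None:
        # Placeholder hook; a child class could post-process scores here.
        pass

    def __call__(
        self, x_batch: List[float], a_batch: List[float], batch_size: int = 2
    ) -> List[float]:
        data = self.general_preprocess(x_batch=x_batch, a_batch=a_batch)
        self.evaluation_scores = []
        for batch in self.generate_batches(data, batch_size):
            self.evaluation_scores.extend(self.evaluate_batch(**batch))
        self.custom_postprocess()
        # Assumption: the scores of this call are stored as one list entry.
        self.all_evaluation_scores.append(self.evaluation_scores)
        return self.evaluation_scores


metric = ToyMetric()
scores = metric(x_batch=[1.0, 2.0, 3.0], a_batch=[0.5, 0.5, 2.0])
```

Calling the instance twice would leave two entries in `all_evaluation_scores`, one per evaluation call.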

Parameters:

model: torch.nn.Module, tf.keras.Model

A torch or tensorflow model that is subject to explanation.

x_batch: np.ndarray

A np.ndarray which contains the input data that are explained.

y_batch: np.ndarray

A np.ndarray which contains the output labels that are explained.

a_batch: np.ndarray, optional

A np.ndarray which contains pre-computed attributions i.e., explanations.

s_batch: np.ndarray, optional

A np.ndarray which contains segmentation masks that matches the input.

channel_first: boolean, optional

Indicates whether the image dimensions are channel first or channel last. Inferred from the input shape if None.

explain_func: callable

Callable generating attributions.

explain_func_kwargs: dict, optional

Keyword arguments to be passed to explain_func on call.

model_predict_kwargs: dict, optional

Keyword arguments to be passed to the model’s predict method.

softmax: boolean

Indicates whether to use softmax probabilities or logits in model prediction. This is used for this __call__ only and won’t be saved as an attribute. If None, self.softmax is used.

device: string

Indicates the device on which a torch.Tensor is or will be allocated: “cpu” or “gpu”.

custom_batch: any

Any object that can be passed to the evaluation process. Gives flexibility to the user to adapt for implementing their own metric.

kwargs: optional

Keyword arguments.

Returns:

evaluation_scores: list: A list of Any with the evaluation scores of the concerned batch.

Examples:

# Minimal imports.
>>> import quantus
>>> from quantus import LeNet
>>> import torch
>>> import torchvision
>>> from captum.attr import Saliency

# Enable GPU.
>>> device = torch.device("cuda:0" if torch.cuda.is_available() else "cpu")

# Load a pre-trained LeNet classification model (architecture at quantus/helpers/models).
>>> model = LeNet()
>>> model.load_state_dict(torch.load("tutorials/assets/pytests/mnist_model"))

# Load MNIST datasets and make loaders (ToTensor is needed so the loader yields tensors).
>>> test_set = torchvision.datasets.MNIST(root='./sample_data', download=True, transform=torchvision.transforms.ToTensor())
>>> test_loader = torch.utils.data.DataLoader(test_set, batch_size=24)

# Load a batch of inputs and outputs to use for XAI evaluation.
>>> x_batch, y_batch = next(iter(test_loader))

# Generate Saliency attributions of the test set batch (on tensors, before converting to numpy).
>>> a_batch_saliency = Saliency(model).attribute(inputs=x_batch, target=y_batch, abs=True).sum(axis=1)
>>> a_batch_saliency = a_batch_saliency.cpu().numpy()
>>> x_batch, y_batch = x_batch.cpu().numpy(), y_batch.cpu().numpy()

# Initialise the metric and evaluate explanations by calling the metric instance.
>>> metric = Metric(abs=True, normalise=False)
>>> scores = metric(model=model, x_batch=x_batch, y_batch=y_batch, a_batch=a_batch_saliency)

__init__(abs: bool, normalise: bool, normalise_func: Callable | None, normalise_func_kwargs: Dict[str, Any] | None, return_aggregate: bool, aggregate_func: Callable, default_plot_func: Callable | None, disable_warnings: bool, display_progressbar: bool, **kwargs)

Initialise the Metric base class.

Each of the metrics defined in Quantus inherits from the Metric base class.

A child metric can benefit from the following class methods:

- __call__(): Will call general_preprocess(), apply evaluate_instance() on each instance and finally call custom_postprocess(). To use this method the child Metric needs to implement evaluate_instance().
- general_preprocess(): Prepares all necessary data structures for evaluation. Will call custom_preprocess() at the end.

The content of evaluation_scores will be appended to all_evaluation_scores (list) at the end of the evaluation call.

Parameters:

abs: boolean: Indicates whether absolute operation is applied on the attribution.
normalise: boolean: Indicates whether normalise operation is applied on the attribution.
normalise_func: callable: Attribution normalisation function applied in case normalise=True.
normalise_func_kwargs: dict: Keyword arguments to be passed to normalise_func on call.
return_aggregate: boolean: Indicates if an aggregated score should be computed over all instances.
aggregate_func: callable: Callable that aggregates the scores given an evaluation call.
default_plot_func: callable: Callable that plots the metrics result.
disable_warnings: boolean: Indicates whether the warnings are printed.
display_progressbar: boolean: Indicates whether a tqdm-progress-bar is printed.
kwargs: optional: Keyword arguments.
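How return_aggregate and aggregate_func interact can be illustrated with a small sketch. The helper name `finalise_scores` is hypothetical, and wrapping the aggregate in a one-element list is an assumption made for the example rather than documented behaviour.

```python
from statistics import mean
from typing import Callable, List, Sequence


def finalise_scores(
    scores: Sequence[float],
    return_aggregate: bool,
    aggregate_func: Callable[[Sequence[float]], float] = mean,
) -> List[float]:
    """Hypothetical helper: optionally collapse per-instance scores to one value."""
    if return_aggregate:
        # Assumption: the aggregate is returned wrapped in a one-element list.
        return [aggregate_func(scores)]
    return list(scores)


per_instance = [1.0, 2.0, 3.0]
aggregated = finalise_scores(per_instance, return_aggregate=True)
unchanged = finalise_scores(per_instance, return_aggregate=False)
largest = finalise_scores(per_instance, return_aggregate=True, aggregate_func=max)
```

Any callable that maps a sequence of scores to a single value can play the role of aggregate_func in this sketch.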

a_axes: Sequence[int]

all_evaluation_scores: Any

final batch_preprocess(data_batch: Dict[str, Any]) → Dict[str, Any]

If data_batch has no a_batch, will compute explanations. This needs to be done at batch level to avoid running out of memory (OOM). Additionally, sets the a_axes property if it is None; this can be done at the earliest once the first a_batch is available.

Parameters:

data_batch: A single entry yielded from the generator returned by self.generate_batches(…)

Returns:

data_batch: Dictionary, which is ready to be passed down to self.evaluate_batch.
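The lazy, per-batch computation of explanations can be sketched as follows. This is an illustration under stated assumptions: `batch_preprocess_sketch` and `toy_explain` are hypothetical names, plain lists stand in for `np.ndarray`, and the a_axes handling is left out.

```python
from typing import Any, Callable, Dict, List


def batch_preprocess_sketch(
    data_batch: Dict[str, Any],
    explain_func: Callable[[Any, List[float], List[int]], List[float]],
) -> Dict[str, Any]:
    """Hypothetical: fill in a_batch for one batch only, if it is missing."""
    if data_batch.get("a_batch") is None:
        data_batch = dict(data_batch)  # avoid mutating the caller's dictionary
        data_batch["a_batch"] = explain_func(
            data_batch["model"], data_batch["x_batch"], data_batch["y_batch"]
        )
    return data_batch


def toy_explain(model: Any, x_batch: List[float], y_batch: List[int]) -> List[float]:
    # Stand-in explainer: one attribution value per input.
    return [2.0 * x for x in x_batch]


batch = {"model": None, "x_batch": [1.0, 3.0], "y_batch": [0, 1], "a_batch": None}
ready = batch_preprocess_sketch(batch, toy_explain)
```

Because explanations are generated one batch at a time, only the attributions for the current batch need to be held in memory.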

custom_batch_preprocess(*, model: ModelInterface, x_batch: ndarray, y_batch: ndarray, a_batch: ndarray, **kwargs) → Dict[str, Any] | None

Implement this method if you need custom preprocessing of data or simply for creating/initialising additional attributes or assertions before a data_batch can be evaluated.

Parameters:

model: A model that is subject to explanation.
x_batch: A np.ndarray which contains the input data that are explained.
y_batch: A np.ndarray which contains the output labels that are explained.
a_batch: A np.ndarray which contains pre-computed attributions i.e., explanations.
kwargs: Optional, metric-specific parameters.

Returns:

dict: Optional dictionary with additional kwargs, which will be passed to self.evaluate_batch(…)

custom_postprocess(*, model: ModelInterface, x_batch: ndarray, y_batch: ndarray | None, a_batch: ndarray | None, s_batch: ndarray | None, **kwargs)

Implement this method if you need custom postprocessing of results or additional attributes.

Parameters:

model: torch.nn.Module, tf.keras.Model: A torch or tensorflow model e.g., torchvision.models that is subject to explanation.
x_batch: np.ndarray: A np.ndarray which contains the input data that are explained.
y_batch: np.ndarray: A np.ndarray which contains the output labels that are explained.
a_batch: np.ndarray, optional: A np.ndarray which contains pre-computed attributions i.e., explanations.
s_batch: np.ndarray, optional: A np.ndarray which contains segmentation masks that matches the input.
kwargs: any, optional: Additional data which was created in custom_preprocess().

Returns:

any: Can optionally be implemented by the child class.

custom_preprocess(*, model: ModelInterface, x_batch: ndarray, y_batch: ndarray, a_batch: ndarray | None, s_batch: ndarray | None, custom_batch: Any, **kwargs) → Dict[str, Any] | None

Implement this method if you need custom preprocessing of data, model alteration or simply for creating/initialising additional attributes or assertions.

If this method returns a dictionary, the keys (strings) will be used as additional arguments for evaluate_instance(). If a key ends with _batch, this suffix will be removed from the respective argument name when passed to evaluate_instance(). If a key corresponds to one of the arguments x_batch, y_batch, a_batch, s_batch, these will be overwritten for passing x, y, a, s to evaluate_instance(). If this method returns None, no additional keyword arguments will be passed to evaluate_instance().
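The key-handling rule can be made concrete with a short sketch. The function `instance_kwargs` is hypothetical and only mirrors the rule as described in this paragraph; it is not taken from the library.

```python
from typing import Any, Dict, Optional


def instance_kwargs(
    custom_output: Optional[Dict[str, Any]],
    index: int,
) -> Dict[str, Any]:
    """Hypothetical: derive per-instance keyword arguments from custom_preprocess output."""
    if custom_output is None:
        return {}
    kwargs: Dict[str, Any] = {}
    for key, value in custom_output.items():
        if key.endswith("_batch"):
            # Batched value: pick this instance and drop the "_batch" suffix.
            kwargs[key[: -len("_batch")]] = value[index]
        else:
            # Single value: passed unchanged to every instance.
            kwargs[key] = value
    return kwargs


extra = {
    "my_new_variable_batch": [10, 11, 12],
    "threshold": 0.5,
    "x_batch": [[0.1], [0.2], [0.3]],
}
first = instance_kwargs(extra, index=0)
nothing = instance_kwargs(None, index=0)
```

In this sketch a returned `x_batch` key ends up as the `x` argument, which is how an overwritten input would reach evaluate_instance().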

Parameters:

model: torch.nn.Module, tf.keras.Model: A torch or tensorflow model e.g., torchvision.models that is subject to explanation.
x_batch: np.ndarray: A np.ndarray which contains the input data that are explained.
y_batch: np.ndarray: A np.ndarray which contains the output labels that are explained.
a_batch: np.ndarray, optional: A np.ndarray which contains pre-computed attributions i.e., explanations.
s_batch: np.ndarray, optional: A np.ndarray which contains segmentation masks that matches the input.
custom_batch: any: Gives flexibility to the inheriting metric to use for evaluation, can hold any variable.
kwargs: Optional, metric-specific parameters.

Returns:

dict, optional: A dictionary which holds (optionally additional) preprocessed data to be included when calling evaluate_instance().

Examples

# Custom Metric definition with additional keyword argument used in evaluate_instance():
>>> def custom_preprocess(
>>>     self,
>>>     model: ModelInterface,
>>>     x_batch: np.ndarray,
>>>     y_batch: Optional[np.ndarray],
>>>     a_batch: Optional[np.ndarray],
>>>     s_batch: np.ndarray,
>>>     custom_batch: Optional[np.ndarray],
>>> ) -> Dict[str, Any]:
>>>     return {'my_new_variable': np.mean(x_batch)}
>>>
>>> def evaluate_instance(
>>>     self,
>>>     model: ModelInterface,
>>>     x: np.ndarray,
>>>     y: Optional[np.ndarray],
>>>     a: Optional[np.ndarray],
>>>     s: np.ndarray,
>>>     my_new_variable: float,
>>> ) -> float:
>>>     ...  # metric-specific scoring logic

# Custom Metric definition with additional keyword argument that ends with _batch
>>> def custom_preprocess(
>>>     self,
>>>     model: ModelInterface,
>>>     x_batch: np.ndarray,
>>>     y_batch: Optional[np.ndarray],
>>>     a_batch: Optional[np.ndarray],
>>>     s_batch: np.ndarray,
>>>     custom_batch: Optional[np.ndarray],
>>> ) -> Dict[str, Any]:
>>>     return {'my_new_variable_batch': np.arange(len(x_batch))}
>>>
>>> def evaluate_instance(
>>>     self,
>>>     model: ModelInterface,
>>>     x: np.ndarray,
>>>     y: Optional[np.ndarray],
>>>     a: Optional[np.ndarray],
>>>     s: np.ndarray,
>>>     my_new_variable: int,
>>> ) -> float:
>>>     ...  # metric-specific scoring logic

# Custom Metric definition with transformation of an existing
# keyword argument from evaluate_instance()
>>> def custom_preprocess(
>>>     self,
>>>     model: ModelInterface,
>>>     x_batch: np.ndarray,
>>>     y_batch: Optional[np.ndarray],
>>>     a_batch: Optional[np.ndarray],
>>>     s_batch: np.ndarray,
>>>     custom_batch: Optional[np.ndarray],
>>> ) -> Dict[str, Any]:
>>>     return {'x_batch': x_batch - np.mean(x_batch, axis=0)}
>>>
>>> def evaluate_instance(
>>>     self,
>>>     model: ModelInterface,
>>>     x: np.ndarray,
>>>     y: Optional[np.ndarray],
>>>     a: Optional[np.ndarray],
>>>     s: np.ndarray,
>>> ) -> float:
>>>     ...  # metric-specific scoring logic

# Custom Metric definition with None returned in custom_preprocess(),
# but with inplace-preprocessing and additional assertion.
>>> def custom_preprocess(
>>>     self,
>>>     model: ModelInterface,
>>>     x_batch: np.ndarray,
>>>     y_batch: Optional[np.ndarray],
>>>     a_batch: Optional[np.ndarray],
>>>     s_batch: np.ndarray,
>>>     custom_batch: Optional[np.ndarray],
>>> ) -> None:
>>>     if np.any(np.all(a_batch < 0, axis=0)):
>>>         raise ValueError("Attributions must not be all negative")
>>>
>>>     x_batch -= np.mean(x_batch, axis=0)
>>>
>>>     return None
>>>
>>> def evaluate_instance(
>>>     self,
>>>     model: ModelInterface,
>>>     x: np.ndarray,
>>>     y: Optional[np.ndarray],
>>>     a: Optional[np.ndarray],
>>>     s: np.ndarray,
>>> ) -> float:
>>>     ...  # metric-specific scoring logic

data_applicability: ClassVar[Set[DataType]]

property disable_warnings: bool: A helper to avoid polluting test outputs with warnings.

property display_progressbar: bool: A helper to avoid polluting test outputs with tqdm progress bars.

abstract evaluate_batch(model: ModelInterface, x_batch: np.ndarray, y_batch: np.ndarray, a_batch: np.ndarray, s_batch: np.ndarray | None, **kwargs) → R

Evaluates model and attributes on a single data batch and returns the batched evaluation result.

This method needs to be implemented to use __call__().

Parameters:

model: ModelInterface: A ModelInterface that is subject to explanation.
x_batch: np.ndarray: The input to be evaluated on a batch-basis.
y_batch: np.ndarray: The output to be evaluated on a batch-basis.
a_batch: np.ndarray: The explanation to be evaluated on a batch-basis.
s_batch: np.ndarray: The segmentation to be evaluated on a batch-basis.

Returns:

np.ndarray: The batched evaluation results.
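A child class only becomes usable once it implements evaluate_batch(). The sketch below shows that pattern with a stand-in base class, so it runs without Quantus installed: `BaseMetricSketch` and `MeanAbsAttribution` are hypothetical names, the scoring rule is a toy, and plain lists stand in for `np.ndarray`.

```python
from abc import ABC, abstractmethod
from typing import Any, Generic, List, Optional, Sequence, TypeVar

R = TypeVar("R")


class BaseMetricSketch(ABC, Generic[R]):
    """Hypothetical stand-in for the base class; not the Quantus implementation."""

    @abstractmethod
    def evaluate_batch(
        self,
        model: Any,
        x_batch: Any,
        y_batch: Any,
        a_batch: Any,
        s_batch: Optional[Any] = None,
        **kwargs: Any,
    ) -> R:
        raise NotImplementedError


class MeanAbsAttribution(BaseMetricSketch[List[float]]):
    """Toy metric: mean absolute attribution value per instance."""

    def evaluate_batch(
        self,
        model: Any,
        x_batch: Any,
        y_batch: Any,
        a_batch: Sequence[Sequence[float]],
        s_batch: Optional[Any] = None,
        **kwargs: Any,
    ) -> List[float]:
        # One score per instance in the batch.
        return [sum(abs(v) for v in a) / len(a) for a in a_batch]


metric = MeanAbsAttribution()
scores = metric.evaluate_batch(
    model=None, x_batch=[[1.0, 2.0]], y_batch=[0], a_batch=[[-1.0, 3.0]]
)
```

The type parameter fixes the return type of evaluate_batch() for the child class, here a list with one float per instance.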

evaluation_category: ClassVar[EvaluationCategory]

evaluation_scores: Any

final explain_batch(model: ModelInterface | keras.Model | nn.Module, x_batch: np.ndarray, y_batch: np.ndarray) → np.ndarray

Computes explanations, then normalises them and takes absolute values if so configured at metric initialisation. This method should primarily be used if you need to generate additional explanations in the metric's body. It encapsulates the pre- and postprocessing approach typical for Quantus. It performs the following steps:

call model.shape_input (if ModelInterface instance was provided)

unwrap model (if ModelInterface instance was provided)

call explain_func

expand attribution channel

(optionally) normalise a_batch

(optionally) take np.abs of a_batch

Parameters:

model: A model that is subject to explanation.
x_batch: A np.ndarray which contains the input data that are explained.
y_batch: A np.ndarray which contains the output labels that are explained.

Returns:

a_batch: Batch of explanations ready to be evaluated.
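The order of the steps listed for explain_batch() can be sketched in plain Python. Everything named here (`ToyModelInterface`, `explain_batch_sketch`, `max_abs_normalise`, `toy_explain`) is hypothetical, nested lists stand in for `np.ndarray`, and the channel-expansion step is omitted because the toy data has no channel axis.

```python
from typing import Any, Callable, List, Optional

Batch = List[List[float]]


class ToyModelInterface:
    """Hypothetical wrapper exposing shape_input and the wrapped model."""

    def __init__(self, model: Any) -> None:
        self.model = model

    def shape_input(self, x_batch: Batch) -> Batch:
        # A real wrapper would reshape inputs for the underlying framework.
        return x_batch


def explain_batch_sketch(
    model: Any,
    x_batch: Batch,
    y_batch: List[int],
    explain_func: Callable[[Any, Batch, List[int]], Batch],
    normalise_func: Optional[Callable[[List[float]], List[float]]] = None,
    take_abs: bool = False,
) -> Batch:
    if isinstance(model, ToyModelInterface):
        x_batch = model.shape_input(x_batch)  # shape the input
        model = model.model                   # unwrap the model
    a_batch = explain_func(model, x_batch, y_batch)
    if normalise_func is not None:            # optional normalisation
        a_batch = [normalise_func(a) for a in a_batch]
    if take_abs:                              # optional absolute values
        a_batch = [[abs(v) for v in a] for a in a_batch]
    return a_batch


def max_abs_normalise(a: List[float]) -> List[float]:
    # Toy normaliser: divide by the largest absolute value (or 1.0 if all zero).
    scale = max(abs(v) for v in a) or 1.0
    return [v / scale for v in a]


def toy_explain(model: Any, x_batch: Batch, y_batch: List[int]) -> Batch:
    return [[-2.0 * v for v in row] for row in x_batch]


a_batch = explain_batch_sketch(
    ToyModelInterface(model=None),
    x_batch=[[1.0, 2.0]],
    y_batch=[0],
    explain_func=toy_explain,
    normalise_func=max_abs_normalise,
    take_abs=True,
)
```

Normalising before taking absolute values follows the order in which the steps are listed above.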

explain_func: Callable

explain_func_kwargs: Dict[str, Any]

general_preprocess(model, x_batch, y_batch, ...)

Prepares all necessary variables for evaluation.

Reshapes data to channel first layout.

Wraps model into ModelInterface.

Creates attributions if necessary.

Expands attributions to data shape (adds channel dimension).

Calls custom_preprocess().

Normalises attributions if desired.

Takes absolute of attributions if desired.

If no segmentation s_batch given, creates list of Nones with as many elements as there are data instances.

If no custom_batch given, creates list of Nones with as many elements as there are data instances.

Parameters:

model: torch.nn.Module, tf.keras.Model: A torch or tensorflow model e.g., torchvision.models that is subject to explanation.
x_batch: np.ndarray: A np.ndarray which contains the input data that are explained.
y_batch: np.ndarray: A np.ndarray which contains the output labels that are explained.
a_batch: np.ndarray, optional: A np.ndarray which contains pre-computed attributions i.e., explanations.
s_batch: np.ndarray, optional: A np.ndarray which contains segmentation masks that matches the input.
channel_first: boolean, optional: Indicates whether the image dimensions are channel first or channel last. Inferred from the input shape if None.
explain_func: callable: Callable generating attributions.
explain_func_kwargs: dict, optional: Keyword arguments to be passed to explain_func on call.
model_predict_kwargs: dict, optional: Keyword arguments to be passed to the model’s predict method.
softmax: boolean: Indicates whether to use softmax probabilities or logits in model prediction. This is used for this __call__ only and won’t be saved as attribute. If None, self.softmax is used.
device: string: Indicates the device on which a torch.Tensor is or will be allocated: “cpu” or “gpu”.
custom_batch: any: Gives flexibility to the user to use for evaluation, can hold any variable.

Returns:

tuple: A general preprocess.

final generate_batches(data: D, batch_size: int) → Generator[D, None, None]

Creates iterator to iterate over all batched instances in data dictionary. Each iterator output element is a keyword argument dictionary with string keys.

Each item key in the input data dictionary has to be of type string.

- If the item value is not a sequence, the respective item key/value pair will be written to each iterator output dictionary.
- If the item value is a sequence and the item key ends with ‘_batch’, a check will be made to make sure length matches number of instances. The values of the batch instances in the sequence will be added to the respective iterator output dictionary with the ‘_batch’ suffix removed.
- If the item value is a sequence but doesn’t end with ‘_batch’, it will be treated as a simple value and the respective item key/value pair will be written to each iterator output dictionary.
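The distinction between batched and single-value entries can be illustrated with a small generator. `generate_batches_sketch` is a hypothetical name, and the sketch keeps the dictionary keys unchanged, which is a simplification relative to the suffix handling described above.

```python
from collections.abc import Sequence
from typing import Any, Dict, Iterator


def generate_batches_sketch(
    data: Dict[str, Any], batch_size: int
) -> Iterator[Dict[str, Any]]:
    """Hypothetical: slice '_batch' sequences, copy everything else as-is."""
    batched = {
        key: value
        for key, value in data.items()
        if key.endswith("_batch")
        and isinstance(value, Sequence)
        and not isinstance(value, str)
    }
    single = {key: value for key, value in data.items() if key not in batched}

    # All batched values must agree on the number of instances.
    lengths = {len(value) for value in batched.values()}
    if len(lengths) != 1:
        raise ValueError("'_batch' values must exist and share one length")
    n_instances = lengths.pop()

    for start in range(0, n_instances, batch_size):
        batch = {k: v[start:start + batch_size] for k, v in batched.items()}
        # Single values are written to every output dictionary.
        yield {**single, **batch}


data = {
    "x_batch": [1, 2, 3],
    "y_batch": [0, 1, 0],
    "model": "m",
    "labels": ["a", "b"],  # a sequence, but the key lacks the '_batch' suffix
}
batches = list(generate_batches_sketch(data, batch_size=2))
```

The `labels` entry shows the third rule: a sequence under a key without the suffix is repeated whole in every batch.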

Parameters:

data: dict[str, any]: The data input dictionary.
batch_size: int: The batch size to be used.

Returns:

iterator: Each iterator output element is a keyword argument dictionary (string keys).

property get_params: Dict[str, Any]

List parameters of metric.

Returns:

dict: A dictionary of the metric's attributes, excluding those on a pre-determined list.
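A property of this kind can be sketched as a filter over the instance attributes. The class `ParamsSketch` and the contents of its exclusion list are assumptions for illustration; the names the library actually excludes may differ.

```python
from typing import Any, Dict, List, Tuple


class ParamsSketch:
    # Hypothetical exclusion list; not taken from the library.
    _EXCLUDED: Tuple[str, ...] = (
        "args",
        "kwargs",
        "evaluation_scores",
        "all_evaluation_scores",
    )

    def __init__(self, abs: bool, normalise: bool) -> None:
        self.abs = abs
        self.normalise = normalise
        self.evaluation_scores: List[Any] = []
        self.all_evaluation_scores: List[Any] = []

    @property
    def get_params(self) -> Dict[str, Any]:
        # Report every instance attribute that is not on the exclusion list.
        return {k: v for k, v in vars(self).items() if k not in self._EXCLUDED}


params = ParamsSketch(abs=True, normalise=False).get_params
```

Since get_params is a property, it is read without parentheses.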

interpret_scores(): Get an interpretation of the scores.

model_applicability: ClassVar[Set[ModelType]]

name: ClassVar[str]

normalise_func: Callable[[ndarray], ndarray] | None

plot(plot_func: Callable | None = None, show: bool = True, path_to_save: str | None = None, *args, **kwargs) → None

Basic plotting functionality for Metric class. The user provides a plot_func (Callable) that contains the actual plotting logic (but returns None).

Parameters:

plot_func: callable: A Callable with the actual plotting logic. Defaults to None, in which case default_plot_func is used.
show: boolean: A boolean to state if the plot shall be shown.
path_to_save: str: A string that specifies the path to save the file.
args: optional: Additional positional arguments.
kwargs: optional: Additional keyword arguments.

Returns:

None

score_direction: ClassVar[ScoreDirection]