quantus.metrics.localisation.focus module

This module contains the implementation of the Focus metric.

final class quantus.metrics.localisation.focus.Focus(mosaic_shape: Any | None = None, abs: bool = False, normalise: bool = True, normalise_func: Callable[[ndarray], ndarray] | None = None, normalise_func_kwargs: Dict[str, Any] | None = None, return_aggregate: bool = False, aggregate_func: Callable | None = None, default_plot_func: Callable | None = None, disable_warnings: bool = False, display_progressbar: bool = False, **kwargs)

Bases: Metric[List[float]]

Implementation of the Focus evaluation strategy by Arias-Duart et al. (2022).

Focus is computed on mosaics of instances from different classes and on the explanations generated for them. Each mosaic contains four images: two belong to the target class (the class the feature attribution method is expected to explain) and the other two are chosen at random from the remaining classes. Focus therefore estimates the reliability of a feature attribution method’s output as the probability that the sampled pixels lie on an image of the mosaic’s target class. This is equivalent to the proportion of positive relevance lying on those images.
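The "proportion of positive relevance" described above can be illustrated with a minimal sketch for a single 2x2 mosaic. This is not the Quantus implementation: the function name `focus_score`, the assumption of a 2D attribution map with even height and width, and the choice to return 0.0 when there is no positive relevance are all illustrative assumptions.

```python
import numpy as np


def focus_score(attribution, target_positions):
    """Share of positive relevance that falls on target-class quadrants.

    attribution: 2D array (H, W) for one 2x2 mosaic (hypothetical layout).
    target_positions: 0/1 flags ordered as
        (top_left, top_right, bottom_left, bottom_right).
    """
    # Keep positive relevance only; negative values are set to zero.
    positive = np.clip(np.asarray(attribution, dtype=float), 0.0, None)
    half_h, half_w = positive.shape[0] // 2, positive.shape[1] // 2

    # Split the mosaic into its four quadrants, in the same order as the flags.
    quadrants = [
        positive[:half_h, :half_w],
        positive[:half_h, half_w:],
        positive[half_h:, :half_w],
        positive[half_h:, half_w:],
    ]

    total = positive.sum()
    if total == 0:
        # Assumption: with no positive relevance at all, report 0.0.
        return 0.0

    on_target = sum(q.sum() for q, flag in zip(quadrants, target_positions) if flag)
    return float(on_target / total)
```

A score of 1.0 means all positive relevance lies on target-class images, while a value near 0.5 is what a uniform attribution map would obtain when two of the four quadrants are targets.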

References:

1) Anna Arias-Duart et al.: “Focus! Rating XAI Methods and Finding Biases” FUZZ-IEEE (2022): 1-8.

Attributes:

_name: The name of the metric.
_data_applicability: The data types that the metric implementation currently supports.
_models: The model types that this metric can work with.
score_direction: How to interpret the scores, i.e., whether higher or lower values are considered better.
evaluation_category: The property or explanation quality that this metric measures.

Attributes:

disable_warnings: A helper to avoid polluting test outputs with warnings.
display_progressbar: A helper to avoid polluting test outputs with tqdm progress bars.
get_params: List parameters of metric.

Methods

`__call__`(model, x_batch, y_batch[, a_batch, ...])	This implementation represents the main logic of the metric and makes the class object callable.
`batch_preprocess`(data_batch)	If data_batch has no a_batch, will compute explanations.
`custom_batch_preprocess`(*, model, x_batch, ...)	Implement this method if you need custom preprocessing of data or simply for creating/initialising additional attributes or assertions before a data_batch can be evaluated.
`custom_postprocess`(*, model, x_batch, ...)	Implement this method if you need custom postprocessing of results or additional attributes.
`custom_preprocess`(model, x_batch, y_batch, ...)	Implementation of custom_preprocess_batch.
`evaluate_batch`(a_batch, c_batch, **kwargs)	This method performs XAI evaluation on a single batch of explanations.
`explain_batch`(model, x_batch, y_batch)	Compute explanations, then normalise them and take absolute values (if configured at metric initialisation). This method should primarily be used if you need to generate additional explanations in the metric's body. It encapsulates the pre- and postprocessing typical for Quantus: (1) call model.shape_input (if a ModelInterface instance was provided); (2) unwrap the model (if a ModelInterface instance was provided); (3) call explain_func; (4) expand the attribution channel; (5) optionally normalise a_batch; (6) optionally take np.abs of a_batch.
`general_preprocess`(model, x_batch, y_batch, ...)	Prepares all necessary variables for evaluation.
`generate_batches`(data, batch_size)	Creates iterator to iterate over all batched instances in data dictionary.
`interpret_scores`()	Get an interpretation of the scores.
`plot`([plot_func, show, path_to_save])	Basic plotting functionality for Metric class.

This implementation represents the main logic of the metric and makes the class object callable. It completes instance-wise evaluation of explanations (a_batch) with respect to input data (x_batch), output labels (y_batch) and a torch or tensorflow model (model).

Calls general_preprocess() with all relevant arguments, evaluates each instance, and saves the results to evaluation_scores. Calls custom_postprocess() afterwards and finally returns evaluation_scores.

For this metric to run, we need the positions of the target class within each mosaic. These should be given as a np.ndarray containing one tuple per sample, where each tuple holds 0/1 values referring to (top_left, top_right, bottom_left, bottom_right).

An example:

>> custom_batch=[(1, 1, 0, 0), (0, 0, 1, 1), (1, 0, 1, 0), (0, 1, 0, 1)]
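As an illustration of how such position tuples relate to the mosaics, the following sketch assembles one 2x2 mosaic from four images and derives its tuple from the image labels. The helper name `build_mosaic` and the channel-first (C, H, W) image layout are assumptions made for this example; Quantus does not necessarily provide such a helper.

```python
import numpy as np


def build_mosaic(images, labels, target_class):
    """Tile four (C, H, W) images into one (C, 2H, 2W) mosaic.

    images and labels are ordered as
    (top_left, top_right, bottom_left, bottom_right).
    Returns the mosaic and the 0/1 position tuple for target_class.
    """
    # Join images side by side (width axis), then stack the two rows (height axis).
    top = np.concatenate([images[0], images[1]], axis=-1)
    bottom = np.concatenate([images[2], images[3]], axis=-1)
    mosaic = np.concatenate([top, bottom], axis=-2)

    # A quadrant is flagged with 1 when its image belongs to the target class.
    positions = tuple(int(label == target_class) for label in labels)
    return mosaic, positions
```

Collecting the returned tuples over all mosaics in a batch gives an array of the form shown in the example above.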

To initialise the metric and evaluate explanations by calling the metric instance:

>> metric = Focus()
>> scores = {
       method: metric(
           model=model,
           x_batch=x_mosaic_batch,
           y_batch=y_mosaic_batch,
           a_batch=None,
           custom_batch=p_mosaic_batch,
           explain_func=explain,
           explain_func_kwargs={
               "method": method,
               "gc_layer": "model._modules.get('conv_2')",
               "pos_only": True,
               "interpolate": (2 * 28, 2 * 28),
               "interpolate_mode": "bilinear",
           },
           device=device,
       )
       for method in ["LayerGradCAM", "IntegratedGradients"]
   }

# Plot example!
>> metric.plot(results=scores)

Parameters:

model: torch.nn.Module, tf.keras.Model: A torch or tensorflow model that is subject to explanation.
x_batch: np.ndarray: A np.ndarray which contains the input data that are explained.
y_batch: np.ndarray: A np.ndarray which contains the output labels that are explained.
a_batch: np.ndarray, optional: A np.ndarray which contains pre-computed attributions i.e., explanations.
s_batch: np.ndarray, optional: A np.ndarray which contains segmentation masks that matches the input.
channel_first: boolean, optional: Indicates whether the image dimensions are channel first or channel last. Inferred from the input shape if None.
explain_func: callable: Callable generating attributions.
explain_func_kwargs: dict, optional: Keyword arguments to be passed to explain_func on call.
model_predict_kwargs: dict, optional: Keyword arguments to be passed to the model’s predict method.
softmax: boolean: Indicates whether to use softmax probabilities or logits in model prediction. This is used for this __call__ only and won’t be saved as an attribute. If None, self.softmax is used.
device: string: Indicates the device on which a torch.Tensor is or will be allocated: “cpu” or “gpu”.
custom_batch: any: Any object that can be passed to the evaluation process. Gives flexibility to the user to adapt for implementing their own metric.
kwargs: optional: Keyword arguments.

Returns:

evaluation_scores: list: a list of Any with the evaluation scores of the concerned batch.

Examples:

# Minimal imports.
>> import quantus
>> from quantus import LeNet
>> import torch
>> import torchvision
>> from captum.attr import Saliency

# Enable GPU.
>> device = torch.device("cuda:0" if torch.cuda.is_available() else "cpu")

# Load a pre-trained LeNet classification model (architecture at quantus/helpers/models).
>> model = LeNet()
>> model.load_state_dict(torch.load("tutorials/assets/pytests/mnist_model"))

# Load MNIST datasets and make loaders.
>> test_set = torchvision.datasets.MNIST(root='./sample_data', download=True, transform=torchvision.transforms.ToTensor())
>> test_loader = torch.utils.data.DataLoader(test_set, batch_size=24)

# Load a batch of inputs and outputs to use for XAI evaluation.
>> x_batch, y_batch = next(iter(test_loader))

# Generate Saliency attributions of the test set batch (on tensors), then convert everything to numpy.
>> a_batch_saliency = Saliency(model).attribute(inputs=x_batch, target=y_batch, abs=True).sum(axis=1)
>> a_batch_saliency = a_batch_saliency.cpu().numpy()
>> x_batch, y_batch = x_batch.cpu().numpy(), y_batch.cpu().numpy()

# Initialise the metric and evaluate explanations by calling the metric instance.
>> metric = Metric(abs=True, normalise=False)
>> scores = metric(model=model, x_batch=x_batch, y_batch=y_batch, a_batch=a_batch_saliency)

__init__(mosaic_shape: Any | None = None, abs: bool = False, normalise: bool = True, normalise_func: Callable[[ndarray], ndarray] | None = None, normalise_func_kwargs: Dict[str, Any] | None = None, return_aggregate: bool = False, aggregate_func: Callable | None = None, default_plot_func: Callable | None = None, disable_warnings: bool = False, display_progressbar: bool = False, **kwargs)

Parameters:

abs: boolean: Indicates whether absolute operation is applied on the attribution, default=False.
normalise: boolean: Indicates whether normalise operation is applied on the attribution, default=True.
normalise_func: callable: Attribution normalisation function applied in case normalise=True. If normalise_func=None, the default value is used, default=normalise_by_max.
normalise_func_kwargs: dict: Keyword arguments to be passed to normalise_func on call, default={}.
return_aggregate: boolean: Indicates if an aggregated score should be computed over all instances.
aggregate_func: callable: Callable that aggregates the scores given an evaluation call.
default_plot_func: callable: Callable that plots the metrics result.
disable_warnings: boolean: Indicates whether the warnings are printed, default=False.
display_progressbar: boolean: Indicates whether a tqdm-progress-bar is printed, default=False.
kwargs: optional: Keyword arguments.
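To make the effect of the abs and normalise options concrete, here is a simplified sketch of this kind of attribution postprocessing. It is not the Quantus code: the function name `postprocess_attributions` is hypothetical, `take_abs` stands in for the `abs` parameter, and dividing each sample by its maximum absolute value is only an approximation of what normalise_by_max may do.

```python
import numpy as np


def postprocess_attributions(a_batch, take_abs=False, normalise=True):
    """Optionally normalise each attribution map, then optionally take absolute values."""
    a = np.asarray(a_batch, dtype=float)

    if normalise:
        # Assumption: scale every sample by its own maximum absolute value.
        sample_axes = tuple(range(1, a.ndim))
        denom = np.max(np.abs(a), axis=sample_axes, keepdims=True)
        # Leave all-zero maps unchanged instead of dividing by zero.
        denom = np.where(denom == 0, 1.0, denom)
        a = a / denom

    if take_abs:
        a = np.abs(a)

    return a
```

With the defaults shown in the signature above (abs=False, normalise=True), negative relevance keeps its sign and each map is rescaled to the range [-1, 1] under this sketch's assumption.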

custom_preprocess(model: ModelInterface, x_batch: ndarray, y_batch: ndarray, custom_batch: ndarray, **kwargs) → Dict[str, Any]

Implementation of custom_preprocess_batch.

Parameters:

model: torch.nn.Module, tf.keras.Model: A torch or tensorflow model e.g., torchvision.models that is subject to explanation.
x_batch: np.ndarray: A np.ndarray which contains the input data that are explained.
y_batch: np.ndarray: A np.ndarray which contains the output labels that are explained.
custom_batch: any: Gives flexibility to the user to use for evaluation, can hold any variable.
kwargs: Unused.

Returns:

dictionary[str, np.ndarray]: Output dictionary with two items: 1) ‘c_batch’ as key and custom_batch as value. 2) ‘custom_batch’ as key and None as value. This results in the keyword argument ‘c’ being passed to evaluate_instance().

data_applicability: ClassVar[Set[DataType]] = {DataType.IMAGE}

evaluate_batch(a_batch: ndarray, c_batch: ndarray, **kwargs) → List[float]

This method performs XAI evaluation on a single batch of explanations. For more information on the specific logic, refer to the metric’s initialisation docstring.

Parameters:

a_batch: A np.ndarray which contains pre-computed attributions i.e., explanations.
c_batch: The custom input to be evaluated on a per-batch basis.
kwargs: Unused.

Returns:

score_batch: Evaluation result for batch.
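A batch-level evaluation of this kind could be vectorised roughly as follows. This is a hedged sketch, not the Quantus implementation: the function name `evaluate_focus_batch` is hypothetical, and it assumes a_batch has shape (N, H, W) with even H and W, c_batch holds one (top_left, top_right, bottom_left, bottom_right) tuple per sample, and samples without positive relevance score 0.0.

```python
import numpy as np


def evaluate_focus_batch(a_batch, c_batch):
    """Return one score per sample: positive relevance on target quadrants / total."""
    a = np.asarray(a_batch, dtype=float)
    c = np.asarray(c_batch, dtype=float)
    n, height, width = a.shape

    # Only positive relevance counts towards the score.
    positive = np.clip(a, 0.0, None)

    # Expand each 2x2 grid of flags into a full-resolution (N, H, W) mask.
    mask = c.reshape(n, 2, 2)
    mask = mask.repeat(height // 2, axis=1).repeat(width // 2, axis=2)

    on_target = (positive * mask).sum(axis=(1, 2))
    totals = positive.sum(axis=(1, 2))

    # Assumption: samples without any positive relevance receive 0.0.
    safe_totals = np.where(totals > 0, totals, 1.0)
    scores = np.where(totals > 0, on_target / safe_totals, 0.0)
    return scores.tolist()
```

The returned list has one float per sample, matching the List[float] return type in the signature above.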

evaluation_category: ClassVar[EvaluationCategory] = 'Localisation'

model_applicability: ClassVar[Set[ModelType]] = {ModelType.TF, ModelType.TORCH}

name: ClassVar[str] = 'Focus'

score_direction: ClassVar[ScoreDirection] = 'higher'