MetricCalculator

class genderbench.probing.metric_calculator.MetricCalculator(probe: Probe)

MetricCalculator calculates all the predetermined metrics for its corresponding probe.

Parameters:: probe (Probe) – The probe that initialized this MetricCalculator.

abstractmethod calculate(probe_items: list[ProbeItem]) → dict[str, float]

Perform the core metric calculation routine for probe_items.

Parameters:: probe_items (list[ProbeItem]) – ProbeItems that already have answers generated and evaluated.
Returns:: Calculated metrics.
Return type:: dict[str, Any]
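A minimal sketch of how a subclass might implement calculate. The base class and ProbeItem below are simplified stand-ins written for this example, not the library's own classes, and the score field and MeanScoreCalculator name are hypothetical.

```python
from abc import ABC, abstractmethod
from dataclasses import dataclass


@dataclass
class ProbeItem:
    """Stand-in for the library's ProbeItem; `score` is a hypothetical field."""

    score: float


class MetricCalculator(ABC):
    """Stand-in base class that mirrors the documented interface."""

    def __init__(self, probe):
        self.probe = probe

    @abstractmethod
    def calculate(self, probe_items: list[ProbeItem]) -> dict[str, float]:
        """Return a mapping from metric name to metric value."""


class MeanScoreCalculator(MetricCalculator):
    """Hypothetical calculator that reports a single metric."""

    def calculate(self, probe_items: list[ProbeItem]) -> dict[str, float]:
        scores = [item.score for item in probe_items]
        return {"mean_score": sum(scores) / len(scores)}


# No real probe is needed for this illustration, so None is passed.
calculator = MeanScoreCalculator(probe=None)
metrics = calculator.calculate([ProbeItem(0.0), ProbeItem(1.0)])
```

Because calculate is abstract, the stand-in base class cannot be instantiated directly; only subclasses that implement calculate can.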

static filter_undetected(func: callable) → callable

Decorator used to handle undetected values in MetricCalculator.calculate methods. This decorator does two things:

1. It filters out the input probe_items that have ALL their Attempts marked as evaluation_undetected.

2. It calculates two metrics, undetected_rate_attempts and undetected_rate_items, that report how many Attempts and ProbeItems, respectively, had an undetected evaluation.

Parameters:: func (callable[list[ProbeItem], dict[str, Any]]) – The calculate method to be decorated.
Returns:: Decorated calculate method.
Return type:: callable
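The two behaviours described above can be sketched as follows. This is an illustration under stated assumptions, not the library's implementation: Attempt and ProbeItem are stand-in classes, the attempts attribute name is assumed, both rates are interpreted as fractions, and an item counts as undetected when all of its attempts are undetected.

```python
from dataclasses import dataclass, field
from functools import wraps


@dataclass
class Attempt:
    """Stand-in attempt; only the undetected flag matters here."""

    evaluation_undetected: bool


@dataclass
class ProbeItem:
    """Stand-in probe item; `attempts` is an assumed attribute name."""

    attempts: list = field(default_factory=list)


def filter_undetected(func):
    """Sketch of a decorator for calculate-style methods."""

    @wraps(func)
    def wrapper(self, probe_items):
        # Step 1: keep only items with at least one detected attempt.
        kept = [
            item
            for item in probe_items
            if not all(a.evaluation_undetected for a in item.attempts)
        ]
        # Step 2: compute the undetected rates over the unfiltered input.
        attempts = [a for item in probe_items for a in item.attempts]
        undetected = sum(a.evaluation_undetected for a in attempts)
        metrics = func(self, kept)
        metrics["undetected_rate_attempts"] = (
            undetected / len(attempts) if attempts else 0.0
        )
        metrics["undetected_rate_items"] = (
            (len(probe_items) - len(kept)) / len(probe_items)
            if probe_items
            else 0.0
        )
        return metrics

    return wrapper


class CountingCalculator:
    """Hypothetical calculator that counts the items it receives."""

    @filter_undetected
    def calculate(self, probe_items):
        return {"n_items": float(len(probe_items))}


items = [
    ProbeItem([Attempt(True), Attempt(True)]),   # every attempt undetected
    ProbeItem([Attempt(False), Attempt(True)]),  # one attempt detected
]
metrics = CountingCalculator().calculate(items)
```

In this sketch the wrapped calculate only sees the second item, while the two rates are still computed over both items.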

property undetected: Any

The undetected value used by the evaluator from the corresponding probe.

Returns:: The undetected value used by the probe's evaluator.
Return type:: Any
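One way such a property could be wired is by delegating to the probe's evaluator. The sketch below assumes attribute names (probe.evaluator and evaluator.undetected) that this page does not confirm, and all three classes are stand-ins written for the example.

```python
class Evaluator:
    """Stand-in evaluator; `undetected` is an assumed attribute name."""

    def __init__(self, undetected):
        self.undetected = undetected


class Probe:
    """Stand-in probe holding an evaluator (assumed attribute name)."""

    def __init__(self, evaluator):
        self.evaluator = evaluator


class MetricCalculator:
    """Stand-in calculator exposing the evaluator's undetected value."""

    def __init__(self, probe):
        self.probe = probe

    @property
    def undetected(self):
        # Read the value from the evaluator on each access, so the
        # calculator always agrees with its probe.
        return self.probe.evaluator.undetected


# "UNDETECTED" is an arbitrary placeholder chosen for this example.
calculator = MetricCalculator(Probe(Evaluator(undetected="UNDETECTED")))
```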