Probe
- class genderbench.probing.probe.Probe(evaluator: Evaluator, metric_calculator: MetricCalculator, num_repetitions: int = 1, sample_k: int | None = None, calculate_cis: bool = True, bootstrap_cycles: int = 1000, bootstrap_alpha: float = 0.95, random_seed: int = 123, log_strategy: Literal['no', 'during', 'after'] = 'after', log_dir: str = None)
A Probe orchestrates the entire probing pipeline to calculate metrics and marks for text generators. Each Probe is designed to quantify one or more harmful behaviors that such text generators might manifest.
The lifecycle of Probe consists of four main steps:
Creating ProbeItems and their Attempts.
Running the generator on all the created Attempts.
Evaluating the generated answers with the evaluator.
Calculating metrics and marks based on the evaluations.
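The four lifecycle steps above, and the status change each one triggers, can be pictured with a small stand-in class. This is only a minimal sketch, not genderbench's implementation: `ToyProbe`, `Status`, the dictionary-based attempts, the `positive_rate` metric, and the lambda generator and evaluator are all hypothetical names chosen for illustration.

```python
from enum import Enum


class Status(Enum):
    NEW = 1
    POPULATED = 2
    GENERATED = 3
    EVALUATED = 4
    FINISHED = 5


class ToyProbe:
    """Hypothetical stand-in that mirrors the four lifecycle steps."""

    def __init__(self, prompts):
        self.prompts = prompts
        self.status = Status.NEW
        self.attempts = []
        self.metrics = {}

    def create_probe_items(self):
        # Step 1: NEW -> POPULATED
        assert self.status is Status.NEW
        self.attempts = [
            {"prompt": p, "answer": None, "evaluation": None}
            for p in self.prompts
        ]
        self.status = Status.POPULATED

    def generate(self, generator):
        # Step 2: POPULATED -> GENERATED
        assert self.status is Status.POPULATED
        for attempt in self.attempts:
            attempt["answer"] = generator(attempt["prompt"])
        self.status = Status.GENERATED

    def evaluate(self, evaluator):
        # Step 3: GENERATED -> EVALUATED
        assert self.status is Status.GENERATED
        for attempt in self.attempts:
            attempt["evaluation"] = evaluator(attempt["answer"])
        self.status = Status.EVALUATED

    def calculate_metrics(self):
        # Step 4: EVALUATED -> FINISHED
        assert self.status is Status.EVALUATED
        evaluations = [a["evaluation"] for a in self.attempts]
        self.metrics = {"positive_rate": sum(evaluations) / len(evaluations)}
        self.status = Status.FINISHED

    def run(self, generator, evaluator):
        # Chain the four steps in order.
        self.create_probe_items()
        self.generate(generator)
        self.evaluate(evaluator)
        self.calculate_metrics()
        return self.metrics


probe = ToyProbe(["Is this fair?", "Describe a nurse."])
metrics = probe.run(
    generator=lambda prompt: prompt.upper(),         # toy text generator
    evaluator=lambda answer: answer.endswith("?"),   # toy evaluator
)
```

Each step checks the status it expects before doing any work, so calling the steps out of order fails early instead of producing partial results.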
- Parameters:
evaluator (Evaluator) – Evaluator used to evaluate generated answers in attempts.
metric_calculator (MetricCalculator) – MetricCalculator used to calculate metrics from the evaluated Attempts.
num_repetitions (int) – How many Attempts are created for each Prompt. Useful to increase the precision of measurements. Defaults to 1.
sample_k (Optional[int], optional) – How many ProbeItems are sampled from the full dataset. When set to None, all the samples are used. Defaults to None.
calculate_cis (bool, optional) – Whether to calculate confidence intervals (via bootstrapping) for metrics or use the raw values. Defaults to True.
bootstrap_cycles (int, optional) – How many resamplings of ProbeItems are done for confidence interval calculations. Defaults to 1000.
bootstrap_alpha (float, optional) – The alpha level for confidence interval calculations. Defaults to 0.95.
random_seed (int, optional) – Random seed used when we create ProbeItems. Defaults to 123.
log_strategy (Literal["after", "during", "no"], optional) –
How often the state of the probe is logged to a file as a JSON line:
"after" - After the entire run lifecycle.
"during" - After each of the 4 steps in the run lifecycle.
"no" - Never.
Defaults to "after".
log_dir (str, optional) – Path to the logging directory. If None, LOG_DIR environment variable is used. Defaults to None.
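To make `bootstrap_cycles` and `bootstrap_alpha` more concrete, here is a minimal sketch of a percentile bootstrap for the mean of a list of scores. It rests on several assumptions: `bootstrap_ci` is a hypothetical helper, the value 0.95 is read as the confidence level of the interval, and the percentile method is only one of several ways to build such an interval. The Probe itself resamples ProbeItems and may compute its intervals differently.

```python
import random
import statistics


def bootstrap_ci(values, cycles=1000, confidence=0.95, seed=123):
    """Percentile bootstrap confidence interval for the mean of `values`.

    Hypothetical helper for illustration; not part of genderbench.
    """
    rng = random.Random(seed)
    n = len(values)
    # Resample with replacement `cycles` times and record each mean.
    means = sorted(
        statistics.fmean(rng.choices(values, k=n)) for _ in range(cycles)
    )
    # Cut off an equal share of the distribution on each side.
    tail = (1.0 - confidence) / 2.0
    lower = means[int(tail * cycles)]
    upper = means[min(cycles - 1, int((1.0 - tail) * cycles))]
    return lower, upper


scores = [0.0, 1.0, 1.0, 0.0, 1.0, 1.0, 0.0, 1.0]
low, high = bootstrap_ci(scores)
```

More cycles give a smoother estimate of the interval bounds at the cost of more computation, and a fixed seed makes the interval reproducible between runs.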
- metrics
Calculated metrics. Available only in status.FINISHED.
- Type:
dict[str, float]
- marks
Calculated marks. Available only in status.FINISHED.
- Type:
dict[str, dict]
- status
Current status of the Probe, one of status.NEW, status.POPULATED, status.GENERATED, status.EVALUATED, status.FINISHED. Status is changed after each of the four steps.
- Type:
status
- uuid
UUID identifier.
- Type:
uuid.UUID
- probe_items
List of current ProbeItems. Available starting from status.POPULATED.
- Type:
list[ProbeItem]
- property attempts: Generator[Attempt, None, None]
Generator of all the attempts that belong to this Probe.
- Yields:
Attempt
- calculate_marks()
Calculate marks and prepare output mark dictionary.
- Returns:
Assessment of the mark based on the corresponding metric value.
- Return type:
dict[str, dict]
- calculate_metrics()
Calculate metrics and marks based on the results of evaluation. This is the fourth and final step in the run lifecycle. Moves the status from EVALUATED to FINISHED.
- create_probe_items()
Populate probe_items with the corresponding prepared ProbeItems. This is the first step in the run lifecycle. Moves the status from NEW to POPULATED.
- evaluate()
Use evaluator to evaluate the generated texts and populate the evaluation field in all the Attempts. This is the third step in the run lifecycle. Moves the status from GENERATED to EVALUATED.
- classmethod from_json_dict(json_dict)
Create a new Probe object from a JSON-serializable dictionary representation.
- Parameters:
json_dict (dict) – JSON-serializable dictionary. Generated by to_json_dict.
- Returns:
Restored Probe object.
- Return type:
Probe
- classmethod from_log_file(log_file: str) -> Probe
Restore a Probe object from a log file.
- Parameters:
log_file (str) – Path to a log file generated by log_current_state.
- Returns:
Restored Probe object.
- Return type:
Probe
- generate(generator: Generator)
Use text generator to generate texts based on all the Attempts from this Probe, and populate their answer field. This is the second step in the run lifecycle. Moves the status from POPULATED to GENERATED.
- Parameters:
generator (Generator) – Text generator that is being probed.
- log_current_state()
Log current state of Probe into a log file.
- property log_file: str
Path to the log file.
- Returns:
str
- run(generator: Generator) -> tuple[dict[str, dict], dict[str, float]]
Run the probing lifecycle to probe the generator for harmful behavior. This is the main entry point of a Probe.
- Parameters:
generator (Generator) – Evaluated text generator.
- Returns:
A tuple containing:
Dictionary describing the calculated marks.
Dictionary with metrics and their values.
- Return type:
tuple[dict[str, dict], dict[str, float]]
- sample(k: int) -> list[ProbeItem]
Sample k existing ProbeItems.
- Parameters:
k (int) – How many ProbeItems are sampled.
- Returns:
Sampled ProbeItems.
- Return type:
list[ProbeItem]
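Sampling a subset of items reproducibly can be sketched with the standard library alone. This is an illustration under assumptions, not the method's actual body: `sample_items` is a hypothetical helper, plain strings stand in for ProbeItems, and sampling without replacement from a seeded generator is assumed.

```python
import random


def sample_items(items, k, seed=123):
    """Return k distinct items, chosen reproducibly for a given seed.

    Hypothetical helper for illustration; not part of genderbench.
    """
    rng = random.Random(seed)  # private RNG, leaves global state untouched
    return rng.sample(items, k)


items = [f"item-{i}" for i in range(10)]
subset = sample_items(items, k=3)
```

Using a dedicated `random.Random` instance keeps the selection stable across runs without reseeding the global random state used elsewhere in a program.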
- to_json_dict() -> dict
Prepare a JSON-serializable dictionary representation. Used for logging.
- Returns:
JSON-serializable dictionary.
- Return type:
dict
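The relationship between to_json_dict, from_json_dict, and a log file made of JSON lines can be sketched as a round trip. Everything here is a hypothetical stand-in rather than the library's code: `ToyState`, `append_state`, and `restore_last_state` are illustrative names, the file name is arbitrary, and restoring from the last line of the file is an assumption about how a restore could work.

```python
import json
import os
import tempfile
import uuid


class ToyState:
    """Hypothetical object with a JSON-serializable representation."""

    def __init__(self, uuid_, status, metrics):
        self.uuid = uuid_
        self.status = status
        self.metrics = metrics

    def to_json_dict(self):
        # UUIDs are not JSON-serializable, so store them as strings.
        return {
            "uuid": str(self.uuid),
            "status": self.status,
            "metrics": self.metrics,
        }

    @classmethod
    def from_json_dict(cls, json_dict):
        return cls(
            uuid.UUID(json_dict["uuid"]),
            json_dict["status"],
            json_dict["metrics"],
        )


def append_state(path, state):
    # One JSON document per line, appended to the end of the file.
    with open(path, "a", encoding="utf-8") as f:
        f.write(json.dumps(state.to_json_dict()) + "\n")


def restore_last_state(path):
    # Assumption: the most recent line holds the latest state.
    with open(path, encoding="utf-8") as f:
        lines = [line for line in f if line.strip()]
    return ToyState.from_json_dict(json.loads(lines[-1]))


with tempfile.TemporaryDirectory() as log_dir:
    log_path = os.path.join(log_dir, "probe.jsonl")
    original = ToyState(uuid.uuid4(), "EVALUATED", {})
    append_state(log_path, original)
    original.status = "FINISHED"
    original.metrics = {"positive_rate": 0.5}
    append_state(log_path, original)
    restored = restore_last_state(log_path)
```

Appending one line per snapshot keeps earlier states in the file, which is convenient when the state is logged after every lifecycle step.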