score_pep_evaluation
- pepbench.evaluation.score_pep_evaluation(pipeline: BasePepExtractionPipeline, datapoint: BasePepDatasetWithAnnotations) -> dict
Run a PEP extraction pipeline on a single datapoint and compute evaluation metrics.
The following evaluation metrics are first averaged over a single datapoint and then aggregated (mean, std) on the total dataset:
- pep_reference_ms: The mean reference PEP in milliseconds.
- pep_estimated_ms: The mean estimated PEP in milliseconds.
- error_ms: The mean error between reference and estimated PEP in milliseconds.
- absolute_error_ms: The mean absolute error between reference and estimated PEP in milliseconds.
- absolute_relative_error_percent: The mean absolute relative error (relative to the reference PEP) between reference and estimated PEP in percent.
The following evaluation metrics are passed on (since they are single values per datapoint) and then aggregated by summing them up (using the sum_aggregator):
- num_pep_total: The total number of PEPs in this datapoint.
- num_pep_valid: The number of valid PEPs in this datapoint.
- num_pep_invalid: The number of invalid PEPs in this datapoint.
The following evaluation metrics are not aggregated but passed on as they are:
- pearson_r: The Pearson correlation coefficient between the reference and estimated PEP values of all matched heartbeats for a single datapoint (excluding NaN values).
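The per-datapoint metrics listed above can be illustrated with a minimal sketch. It is not the pepbench implementation: `summarize_datapoint` and `pearson_r` are hypothetical helpers, the error is assumed to be estimated minus reference, a PEP is assumed to be invalid when either value is NaN, and the means are assumed to be taken over valid heartbeats only. It uses only the standard library to stay self-contained.

```python
import math
from statistics import fmean


def pearson_r(x, y):
    """Pearson correlation of two equal-length sequences (needs >= 2 points, non-zero variance)."""
    mean_x, mean_y = fmean(x), fmean(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    norm_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    norm_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (norm_x * norm_y)


def summarize_datapoint(reference_ms, estimated_ms):
    """Hypothetical helper: per-datapoint metrics from heartbeat-matched PEP values."""
    # Assumption: a heartbeat is "valid" when neither value is NaN.
    pairs = [
        (ref, est)
        for ref, est in zip(reference_ms, estimated_ms)
        if not (math.isnan(ref) or math.isnan(est))
    ]
    ref = [r for r, _ in pairs]
    est = [e for _, e in pairs]
    # Assumption: error is defined as estimated minus reference.
    errors = [e - r for r, e in pairs]
    abs_errors = [abs(err) for err in errors]
    # Relative error is taken with respect to the reference PEP (assumed non-zero).
    abs_rel_errors = [abs(e - r) / r * 100.0 for r, e in pairs]
    return {
        "pep_reference_ms": fmean(ref),
        "pep_estimated_ms": fmean(est),
        "error_ms": fmean(errors),
        "absolute_error_ms": fmean(abs_errors),
        "absolute_relative_error_percent": fmean(abs_rel_errors),
        "num_pep_total": len(reference_ms),
        "num_pep_valid": len(pairs),
        "num_pep_invalid": len(reference_ms) - len(pairs),
        "pearson_r": pearson_r(ref, est),
    }


# Four heartbeats; the third estimate is missing and therefore counted as invalid.
metrics = summarize_datapoint(
    [100.0, 110.0, 120.0, 130.0],
    [102.0, 108.0, float("nan"), 134.0],
)
```

The keys mirror the metric names documented here, but the exact structure of the dictionary returned by `score_pep_evaluation` may differ.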
The following evaluation metrics are not aggregated but passed as per-sample values:
- pep_estimation_per_sample: The estimated and reference PEPs per sample.
The following evaluation metrics are aggregated directly (mean, std) over all samples, without intermediate aggregation on a single datapoint:
- error_per_sample_ms: The error between reference and estimated PEP per sample in milliseconds.
- absolute_error_per_sample_ms: The absolute error between reference and estimated PEP per sample in milliseconds.
- absolute_relative_error_per_sample_percent: The absolute relative error (relative to the reference PEP) between reference and estimated PEP per sample in percent.
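The difference between the two-stage aggregation (per-datapoint mean first, then mean and std over datapoints) and the direct per-sample aggregation can be seen in a small sketch. The data and variable names below are made up for illustration, and the use of the sample standard deviation is an assumption; the source does not state which estimator pepbench uses.

```python
from statistics import fmean, stdev

# Hypothetical per-sample errors (ms) for two datapoints of different length.
errors_by_datapoint = {
    "datapoint_a": [2.0, 4.0],
    "datapoint_b": [10.0, 10.0, 10.0, 10.0],
}

# Two-stage aggregation: average within each datapoint, then aggregate the
# per-datapoint means. Every datapoint carries the same weight.
per_datapoint_means = [fmean(errors) for errors in errors_by_datapoint.values()]
two_stage = {
    "mean": fmean(per_datapoint_means),
    "std": stdev(per_datapoint_means),  # assumption: sample standard deviation
}

# Direct aggregation: pool all samples first. Every heartbeat carries the
# same weight, so longer datapoints contribute more.
pooled_errors = [err for errors in errors_by_datapoint.values() for err in errors]
pooled = {
    "mean": fmean(pooled_errors),
    "std": stdev(pooled_errors),
}

# Count-like metrics are single values per datapoint and are simply summed.
num_pep_total = sum(len(errors) for errors in errors_by_datapoint.values())
```

Here the two-stage mean is 6.5 ms, while the pooled mean is about 7.67 ms, because the longer datapoint dominates the pooled result. This is why both variants are reported as separate metrics.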
- Parameters:
- pipeline
pepbench.pipelines.BasePepExtractionPipeline
A PEP extraction pipeline. The pipeline must be an instance of a subclass of pepbench.pipelines.BasePepExtractionPipeline.
- datapoint
pepbench.datasets.BasePepDatasetWithAnnotations
A single datapoint from the dataset. The datapoint must be an instance of a subclass of pepbench.datasets.BasePepDatasetWithAnnotations.
- Returns:
- dict
A dictionary containing the evaluation metrics.