compute_improvement_pipeline

pepbench.data_handling.compute_improvement_pipeline(data: DataFrame, pipelines: Sequence[str]) → DataFrame

Compute the percentage of samples which showed sign changes in the error metric between two pipelines.

Parameters:

data (pandas.DataFrame): The data containing the PEP extraction results from different pipelines.
pipelines (list of str): The pipelines to compare.

Returns:

pandas.DataFrame: Overview of the percentage of samples which showed a change in the sign of the error metric (i.e., either positive to negative, vice versa, or no change) between two pipelines.

Raises:

ValidationError: If the input data is not a pandas.DataFrame or pandas.Series.
ValueError: If fewer than two pipelines are provided.
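
Example:

The following is a minimal sketch of the idea behind the sign-change overview, written in plain Python so that it runs without pepbench. It is not pepbench's implementation and does not reproduce the layout of the returned DataFrame. The helper names (`sign_label`, `sign_change_percentages`), the category labels, and the sample error values are all hypothetical and chosen only for illustration.

```python
from collections import Counter


def sign_label(value: float) -> str:
    """Map an error value to a coarse sign label (hypothetical helper)."""
    if value > 0:
        return "positive"
    if value < 0:
        return "negative"
    return "zero"


def sign_change_percentages(errors_a, errors_b):
    """Return the percentage of paired samples per sign transition.

    Illustrative only: assumes both sequences hold the per-sample error
    of the same samples, in the same order, for two pipelines.
    """
    if len(errors_a) != len(errors_b):
        raise ValueError("Both pipelines need the same number of samples.")
    if not errors_a:
        raise ValueError("At least one sample is required.")

    transitions = Counter()
    for a, b in zip(errors_a, errors_b):
        label_a, label_b = sign_label(a), sign_label(b)
        key = "no change" if label_a == label_b else f"{label_a} to {label_b}"
        transitions[key] += 1

    total = len(errors_a)
    return {key: 100.0 * count / total for key, count in transitions.items()}


# Made-up per-sample errors for two pipelines (not real PEP results).
errors_pipeline_a = [4.0, -2.0, 3.0, -1.0]
errors_pipeline_b = [-1.0, -3.0, 2.0, 5.0]

result = sign_change_percentages(errors_pipeline_a, errors_pipeline_b)
print(result)
```

With these made-up values, one of the four samples moves from positive to negative, one from negative to positive, and two keep their sign, so the percentages are 25, 25, and 50.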