compute_improvement_pipeline

pepbench.data_handling.compute_improvement_pipeline(data: DataFrame, pipelines: Sequence[str]) -> DataFrame

Compute the percentage of samples which showed sign changes in the error metric between two pipelines.

Parameters:
data : pandas.DataFrame

The data containing the PEP extraction results from different pipelines.

pipelines : list of str

The pipelines to compare.

Returns:
pandas.DataFrame

Overview of the percentage of samples which showed a change in the sign of the error metric (i.e., either positive to negative, vice versa, or no change) between two pipelines.

Raises:
ValidationError

If the input data is not a pandas.DataFrame or pandas.Series.

ValueError

If fewer than two pipelines are provided.
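
The sign-change comparison described above can be illustrated with plain pandas. This is a minimal sketch and not the pepbench implementation: the helper name `sign_change_percentages`, the column names `pipeline_a` and `pipeline_b`, and the one-column-per-pipeline layout of signed errors are all assumptions made for illustration, since this page does not specify the expected structure of `data` or the exact layout of the returned DataFrame.

```python
import pandas as pd

# Hypothetical signed errors per sample, one column per pipeline.
# The real pepbench input layout may differ.
errors = pd.DataFrame(
    {
        "pipeline_a": [-4.0, 2.0, 3.0, -1.0, 5.0],
        "pipeline_b": [1.0, 3.0, -2.0, -6.0, 4.0],
    }
)


def sign_change_percentages(errors: pd.DataFrame, pipelines: list) -> pd.DataFrame:
    """Illustrative helper (not part of pepbench): share of samples per sign-change category."""
    if len(pipelines) != 2:
        raise ValueError("This sketch compares exactly two pipelines.")
    first, second = pipelines

    # Start with every sample labelled "no change", then overwrite the two flip cases.
    category = pd.Series("no change", index=errors.index)
    category.loc[(errors[first] > 0) & (errors[second] < 0)] = "positive to negative"
    category.loc[(errors[first] < 0) & (errors[second] > 0)] = "negative to positive"

    # Relative frequency of each category, expressed in percent.
    percentages = category.value_counts(normalize=True) * 100
    return percentages.rename("percentage").to_frame()


result = sign_change_percentages(errors, ["pipeline_a", "pipeline_b"])
print(result)
```

In this toy data, three of the five samples keep their sign (60 %), one flips from negative to positive, and one flips from positive to negative (20 % each).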