PepEvaluationChallenge

class pepbench.evaluation.PepEvaluationChallenge(*, dataset: pepbench.datasets._base_pep_extraction_dataset.BasePepDatasetWithAnnotations, scoring: collections.abc.Callable = score_pep_evaluation, validate_kwargs: dict | None = None)

Evaluation challenge for PEP extraction pipelines.

This is the tpcp implementation of the evaluation challenge for PEP extraction pipelines. It is used to evaluate the performance of a PEP extraction pipeline on a given dataset.

Methods

clone()

Create a new instance of the class with all parameters copied over.

get_params([deep])

Get parameters for this algorithm.

results_as_df()

Convert the results to pandas DataFrames.

run(pipeline)

Run the evaluation challenge for a given pipeline.

save_results(folder_path, filename_stub)

Save the results of the evaluation to disk.

set_params(**params)

Set the parameters of this Algorithm.

__init__(*, dataset: pepbench.datasets._base_pep_extraction_dataset.BasePepDatasetWithAnnotations, scoring: collections.abc.Callable = score_pep_evaluation, validate_kwargs: dict | None = None) -> None

Initialize a new evaluation challenge.

To initialize a new evaluation challenge, you need to provide a dataset and a scoring function. Afterwards, you can challenge a specific PEP extraction pipeline by passing it to the run method.

Parameters:
dataset : BasePepDatasetWithAnnotations

The dataset to evaluate the pipeline on. It must be an instance of a subclass of BaseUnifiedPepExtractionDataset, which provides the unified interface needed to access the data.

scoring : Callable, optional

The scoring function to use for the evaluation. It should take the pipeline and a datapoint from the dataset as input and return a dictionary with the evaluation results. The default is pepbench.evaluation.score_pep_evaluation.

validate_kwargs : dict, optional

Additional keyword arguments to pass to the tpcp.validate.Scorer class.
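The contract described for scoring (take the pipeline and a datapoint, return a dictionary) can be illustrated with a minimal sketch. ToyDatapoint, ToyPipeline, toy_score, and the metric names are hypothetical stand-ins invented for this example; they are not part of pepbench or tpcp, and the real datapoint and pipeline interfaces may differ.

```python
from dataclasses import dataclass


@dataclass
class ToyDatapoint:
    """Hypothetical stand-in for one datapoint with reference PEP values (ms)."""

    reference_pep_ms: list[float]


class ToyPipeline:
    """Hypothetical stand-in for a PEP extraction pipeline."""

    def run(self, datapoint: ToyDatapoint) -> "ToyPipeline":
        # Pretend every estimate is 2 ms longer than the reference.
        self.estimated_pep_ms_ = [v + 2.0 for v in datapoint.reference_pep_ms]
        return self


def toy_score(pipeline: ToyPipeline, datapoint: ToyDatapoint) -> dict[str, float]:
    """Take (pipeline, datapoint) and return a dictionary of metrics."""
    estimated = pipeline.run(datapoint).estimated_pep_ms_
    errors = [e - r for e, r in zip(estimated, datapoint.reference_pep_ms)]
    abs_errors = [abs(err) for err in errors]
    return {
        "mean_error_ms": sum(errors) / len(errors),
        "mean_absolute_error_ms": sum(abs_errors) / len(abs_errors),
    }


scores = toy_score(ToyPipeline(), ToyDatapoint([100.0, 110.0, 95.0]))
```

A function with this shape would be passed through the scoring argument; whether the default scorer computes comparable metrics is not stated here.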

run(pipeline: BasePepExtractionPipeline) -> Self

Run the evaluation challenge for a given pipeline.

Parameters:
pipeline : BasePepExtractionPipeline

The PEP extraction pipeline to evaluate. The pipeline needs to be a subclass of pepbench.pipelines.BasePepExtractionPipeline and should be able to process the dataset.

Returns:
Self

save_results(folder_path: path_t, filename_stub: str) -> None

Save the results of the evaluation to disk.

Parameters:
folder_path : pathlib.Path or str

The folder path to save the results to.

filename_stub : str

The filename stub to use for the results file.

clone() -> Self

Create a new instance of the class with all parameters copied over.

This creates a new instance of the class itself and of all nested objects.

get_params(deep: bool = True) -> dict[str, Any]

Get parameters for this algorithm.

Parameters:
deep

Only relevant if the object contains nested algorithm objects. If it does and deep is True, the parameters of these nested objects are included in the output using a prefix like nested_object_name__ (note the two underscores at the end).

Returns:
params

Parameter names mapped to their values.
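The double-underscore prefix used with deep=True can be illustrated with a small sketch. ToyAlgorithm is a hypothetical stand-in written for this example, not the tpcp base class, and it only mimics the naming convention described above.

```python
class ToyAlgorithm:
    """Hypothetical stand-in that mimics the nested-parameter naming."""

    def __init__(self, **params):
        self._params = dict(params)

    def get_params(self, deep: bool = True) -> dict:
        out = dict(self._params)
        if deep:
            for name, value in self._params.items():
                if isinstance(value, ToyAlgorithm):
                    # Prefix nested parameters with "<name>__".
                    for sub_name, sub_value in value.get_params(deep=True).items():
                        out[f"{name}__{sub_name}"] = sub_value
        return out


inner = ToyAlgorithm(window_size=5)
outer = ToyAlgorithm(extractor=inner, threshold=0.3)

shallow = outer.get_params(deep=False)  # keys: extractor, threshold
deep = outer.get_params(deep=True)      # additionally: extractor__window_size
```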

set_params(**params: Any) -> Self

Set the parameters of this Algorithm.

To set parameters of nested objects, use nested_object_name__para_name=.
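How a nested_object_name__para_name keyword might be routed to a nested object can be sketched as follows. ToyAlgorithm is again a hypothetical stand-in, not the tpcp implementation; it only shows the key being split at the first double underscore.

```python
class ToyAlgorithm:
    """Hypothetical stand-in that routes "<name>__<param>" to nested objects."""

    def __init__(self, **params):
        for name, value in params.items():
            setattr(self, name, value)

    def set_params(self, **params):
        for key, value in params.items():
            # Split only at the first "__" so deeper nesting is passed along.
            name, sep, rest = key.partition("__")
            if sep:
                getattr(self, name).set_params(**{rest: value})
            else:
                setattr(self, name, value)
        return self


outer = ToyAlgorithm(extractor=ToyAlgorithm(window_size=5), threshold=0.3)
result = outer.set_params(threshold=0.5, extractor__window_size=9)
```

Returning self mirrors the documented Self return type and allows calls to be chained.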

results_as_df() -> Self

Convert the results to pandas DataFrames.

The results are stored as attributes on the object. The following results are created:
  • results_agg_mean_std_: The mean and standard deviation of the aggregated results.

  • results_agg_total_: The total number of valid and invalid PEPs.

  • results_single_: The single (non-aggregated) results for each datapoint.

  • results_per_sample_: The per-sample results for each datapoint.

Returns:
Self
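Taken together, the documented methods suggest a typical call sequence, sketched below. evaluate_pipeline is a hypothetical helper written for this example; it takes the challenge class as an argument so that the sketch itself does not depend on pepbench being installed. Calling results_as_df before save_results is an assumption, since the required order is not stated above.

```python
from pathlib import Path
from typing import Any


def evaluate_pipeline(
    challenge_cls: Any,
    dataset: Any,
    pipeline: Any,
    folder_path: "str | Path",
    filename_stub: str,
) -> Any:
    """Run one pipeline through the documented call sequence."""
    # Keyword-only construction, as in the documented signature.
    challenge = challenge_cls(dataset=dataset)
    # run() and results_as_df() are documented to return the challenge itself.
    challenge.run(pipeline)
    challenge.results_as_df()
    # Assumption: results are converted to DataFrames before being saved.
    challenge.save_results(folder_path, filename_stub)
    return challenge
```

With pepbench installed, PepEvaluationChallenge would be passed as challenge_cls, and the returned object would expose results_agg_mean_std_, results_agg_total_, results_single_, and results_per_sample_.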