PepEvaluationChallenge

class pepbench.evaluation.PepEvaluationChallenge(*, dataset: BasePepDatasetWithAnnotations, scoring: Callable = score_pep_evaluation, validate_kwargs: dict | None = None)

Evaluation challenge for PEP extraction pipelines.

This is the tpcp implementation of the evaluation challenge for PEP extraction pipelines. It is used to evaluate the performance of a PEP extraction pipeline on a given dataset.

Methods

`clone`()	Create a new instance of the class with all parameters copied over.
`get_params`([deep])	Get parameters for this algorithm.
`results_as_df`()	Convert the results to pandas DataFrames.
`run`(pipeline)	Run the evaluation challenge for a given pipeline.
`save_results`(folder_path, filename_stub)	Save the results of the evaluation to disk.
`set_params`(**params)	Set the parameters of this Algorithm.

__init__(*, dataset: BasePepDatasetWithAnnotations, scoring: Callable = score_pep_evaluation, validate_kwargs: dict | None = None) → None

Initialize a new evaluation challenge.

To initialize a new evaluation challenge, you need to provide a dataset and a scoring function. Afterwards, you can challenge a specific PEP extraction pipeline by passing it to the run method.

Parameters:

dataset (BasePepDatasetWithAnnotations): The dataset to evaluate the pipeline on. The dataset needs to be a subclass of BaseUnifiedPepExtractionDataset, which provides the unified interface required to access the data.
scoring (Callable, optional): The scoring function to use for the evaluation. It should take the pipeline and a datapoint from the dataset as input and return a dictionary with the evaluation results. The default scoring function is `pepbench.evaluation.score_pep_evaluation`.
validate_kwargs (dict, optional): Additional keyword arguments to pass to the `tpcp.validate.Scorer` class.
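The contract described for `scoring` (pipeline and datapoint in, dictionary out) can be illustrated with a toy stand-in. This is only a sketch under stated assumptions: `ToyPipeline`, `toy_score`, the dictionary-based datapoint, and the metric name `absolute_error_ms` are hypothetical and not part of pepbench. The default `score_pep_evaluation` may compute entirely different metrics.

```python
from dataclasses import dataclass


@dataclass
class ToyPipeline:
    """Hypothetical stand-in for a PEP extraction pipeline."""

    def estimate_pep_ms(self, datapoint: dict) -> float:
        # Toy "algorithm": simply report the value stored in the datapoint.
        return datapoint["measured_pep_ms"]


def toy_score(pipeline: ToyPipeline, datapoint: dict) -> dict[str, float]:
    """Score one datapoint: takes (pipeline, datapoint), returns a dict of metrics."""
    estimated = pipeline.estimate_pep_ms(datapoint)
    reference = datapoint["reference_pep_ms"]
    return {"absolute_error_ms": abs(estimated - reference)}


# Hypothetical datapoint with an estimated and a reference PEP in milliseconds.
result = toy_score(
    ToyPipeline(),
    {"measured_pep_ms": 104.0, "reference_pep_ms": 100.0},
)
print(result)  # {'absolute_error_ms': 4.0}
```

A real scoring function would receive an actual pipeline and a datapoint of the dataset instead of the dictionary used here.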

run(pipeline: BasePepExtractionPipeline) → Self

Run the evaluation challenge for a given pipeline.

Parameters:

pipeline (BasePepExtractionPipeline): The PEP extraction pipeline to evaluate. The pipeline needs to be a subclass of `pepbench.pipelines.BasePepExtractionPipeline` and should be able to process the dataset.

Returns:

Self
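Because `run` returns the challenge itself, calls can be chained. A minimal sketch of this, using only the names documented on this page: the helper `evaluate_pipeline` is hypothetical, and `dataset` and `pipeline` are assumed to be objects of the types described above that the caller has created elsewhere.

```python
def evaluate_pipeline(dataset, pipeline):
    """Run the challenge for one pipeline and build the result DataFrames."""
    # Imported inside the function so this sketch can be loaded
    # even where pepbench is not installed.
    from pepbench.evaluation import PepEvaluationChallenge

    # Only the dataset is passed, so the default scoring function is used.
    challenge = PepEvaluationChallenge(dataset=dataset)

    # run() and results_as_df() are both documented to return Self,
    # which is what makes this chain possible.
    return challenge.run(pipeline).results_as_df()
```

The returned object is the challenge instance, so the result attributes can be read from it directly.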

save_results(folder_path: path_t, filename_stub: str) → None

Save the results of the evaluation to disk.

Parameters:

folder_path (`pathlib.Path` or str): The folder path to save the results to.
filename_stub (str): The filename stub to use for the results file.
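This page does not state whether `save_results` creates a missing folder. The sketch below therefore creates it defensively before saving. The helper `save_challenge_results` is hypothetical, and `challenge` is assumed to be a challenge whose results are already available.

```python
from pathlib import Path


def save_challenge_results(challenge, folder, filename_stub: str) -> Path:
    """Ensure the output folder exists, then delegate to save_results()."""
    folder_path = Path(folder)  # accepts both str and Path, like folder_path above

    # Assumption: the folder may need to exist beforehand, so create it if missing.
    folder_path.mkdir(parents=True, exist_ok=True)

    challenge.save_results(folder_path, filename_stub)
    return folder_path
```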

clone() → Self

Create a new instance of the class with all parameters copied over.

This will create a new instance of the class itself and of all nested objects.

get_params(deep: bool = True) → dict[str, Any]

Get parameters for this algorithm.

Parameters:

deep: Only relevant if the object contains nested algorithm objects. If it does and deep is True, the parameters of these nested objects are included in the output with a prefix like nested_object_name__ (note the two underscores at the end).

Returns:

params: Parameter names mapped to their values.

set_params(**params: Any) → Self

Set the parameters of this Algorithm.

To set parameters of nested objects, use nested_object_name__para_name=value.
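The double-underscore naming used by `get_params` and `set_params` can be illustrated with a small stand-in. This is not the tpcp implementation: `ToyParams`, `Inner`, and `Outer` are hypothetical classes that only mimic the naming convention described above.

```python
class ToyParams:
    """Hypothetical base class mimicking the nested-parameter convention."""

    _param_names: tuple[str, ...] = ()

    def get_params(self, deep: bool = True) -> dict:
        params = {name: getattr(self, name) for name in self._param_names}
        if deep:
            # Add the parameters of nested objects as "<name>__<nested_param>".
            for name, value in list(params.items()):
                if isinstance(value, ToyParams):
                    nested = value.get_params(deep=True)
                    params.update({f"{name}__{k}": v for k, v in nested.items()})
        return params

    def set_params(self, **params):
        for key, value in params.items():
            name, sep, rest = key.partition("__")
            if sep:
                # "inner__threshold" is forwarded to the nested object "inner".
                getattr(self, name).set_params(**{rest: value})
            else:
                setattr(self, name, value)
        return self


class Inner(ToyParams):
    _param_names = ("threshold",)

    def __init__(self, threshold: float = 0.5):
        self.threshold = threshold


class Outer(ToyParams):
    _param_names = ("inner", "label")

    def __init__(self, inner: Inner, label: str = "demo"):
        self.inner = inner
        self.label = label


outer = Outer(inner=Inner())
print(sorted(outer.get_params(deep=True)))   # ['inner', 'inner__threshold', 'label']
print(sorted(outer.get_params(deep=False)))  # ['inner', 'label']

# Set a nested and a top-level parameter in one call.
outer.set_params(inner__threshold=0.9, label="tuned")
```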

results_as_df() → Self

Convert the results to pandas DataFrames.

The results are stored as attributes on the object. The following results are created:

results_agg_mean_std_: The mean and standard deviation of the aggregated results.
results_agg_total_: The total number of valid and invalid PEPs.
results_single_: The single (non-aggregated) results for each datapoint.
results_per_sample_: The per-sample results for each datapoint.

Returns:

Self
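Since the results are stored as attributes rather than returned, it can be convenient to gather them in one place. A minimal sketch, assuming only the four attribute names listed above: the helper `collect_result_tables` and the constant `RESULT_ATTRIBUTES` are hypothetical and not part of pepbench.

```python
RESULT_ATTRIBUTES = (
    "results_agg_mean_std_",
    "results_agg_total_",
    "results_single_",
    "results_per_sample_",
)


def collect_result_tables(challenge) -> dict:
    """Return the documented result attributes as a name -> object mapping."""
    missing = [name for name in RESULT_ATTRIBUTES if not hasattr(challenge, name)]
    if missing:
        # The attributes are created by results_as_df(), so it has to run first.
        raise AttributeError(f"Missing {missing}; call results_as_df() first.")
    return {name: getattr(challenge, name) for name in RESULT_ATTRIBUTES}
```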
