PepEvaluationChallenge
- class pepbench.evaluation.PepEvaluationChallenge(*, dataset: pepbench.datasets._base_pep_extraction_dataset.BasePepDatasetWithAnnotations, scoring: collections.abc.Callable = <function score_pep_evaluation>, validate_kwargs: dict | None = None)
Evaluation challenge for PEP extraction pipelines.
This is the tpcp implementation of the evaluation challenge for PEP extraction pipelines. It evaluates a given PEP extraction pipeline on a dataset and produces aggregated and per-sample evaluation results, stored as pandas DataFrames on the challenge instance.
- Parameters:
- dataset : BasePepDatasetWithAnnotations
The dataset to evaluate. The dataset must implement the unified interface required by the evaluation utilities.
- scoring : Callable, optional
The scoring function to use for evaluation. It should accept the pipeline and a datapoint and return a dictionary with evaluation outputs. Default is pepbench.evaluation.score_pep_evaluation.
- validate_kwargs : dict, optional
Additional keyword arguments passed to tpcp.validate.Scorer.
- Attributes:
- results_ : dict
Raw results returned by tpcp.validate.validate.
- results_agg_mean_std_ : pandas.DataFrame
Mean and standard deviation of the aggregated results.
- results_agg_total_ : pandas.DataFrame
Total counts of the aggregated results.
- results_single_ : pandas.DataFrame
Single (non-aggregated) results for each datapoint.
- results_per_sample_ : pandas.DataFrame
Per-sample flattened results.
Methods
clone() Create a new instance of the class with all parameters copied over.
get_params([deep]) Get parameters for this algorithm.
results_as_df() Convert the raw validation results to pandas DataFrames and attach them to the instance.
run(pipeline) Run the evaluation challenge for a given pipeline.
save_results(folder_path, filename_stub) Save the results of the evaluation to disk.
set_params(**params) Set the parameters of this Algorithm.
- __init__(*, dataset: pepbench.datasets._base_pep_extraction_dataset.BasePepDatasetWithAnnotations, scoring: collections.abc.Callable = <function score_pep_evaluation>, validate_kwargs: dict | None = None) -> None
Initialize a new evaluation challenge.
To initialize a new evaluation challenge, you need to provide a dataset and, optionally, a scoring function. Afterwards, you can challenge a specific PEP extraction pipeline by passing it to the run method.
- Parameters:
- dataset : BasePepDatasetWithAnnotations
The dataset to evaluate the pipeline on. The dataset needs to be a subclass of BaseUnifiedPepExtractionDataset, which provides the unified interface needed to access the data.
- scoring : Callable, optional
The scoring function to use for the evaluation. It should take the pipeline and a datapoint from the dataset as input and return a dictionary with the evaluation results. The default scoring function is pepbench.evaluation._scoring.score_pep_evaluation.
- validate_kwargs : dict, optional
Additional keyword arguments to pass to the tpcp.validate.Scorer class.
- run(pipeline: BasePepExtractionPipeline) -> Self
Run the evaluation challenge for a given pipeline.
Executes validation using tpcp.validate.validate with a tpcp.validate.Scorer and aggregates timing information.
- Parameters:
- pipeline : BasePepExtractionPipeline
The PEP extraction pipeline to evaluate. The pipeline needs to be a subclass of pepbench.pipelines.BasePepExtractionPipeline and should be able to process the dataset.
- Returns:
- Self
The challenge instance with results stored as attributes (see class docstring).
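The run-then-inspect pattern described above can be sketched without pepbench installed. Everything in the block below is a hypothetical stand-in (ToyChallenge, toy_score, halve, the dataset dictionaries, and the runtime_s_ attribute); it only illustrates a scoring callable that takes a pipeline and a datapoint and returns a dictionary, and a run method that returns the instance itself. It is not the library's implementation.

```python
import time


class ToyChallenge:
    """Hypothetical stand-in that mimics the challenge's call pattern."""

    def __init__(self, *, dataset, scoring):
        self.dataset = dataset
        self.scoring = scoring

    def run(self, pipeline):
        # Score every datapoint and record how long the whole run took.
        start = time.perf_counter()
        self.results_single_ = [self.scoring(pipeline, dp) for dp in self.dataset]
        self.runtime_s_ = time.perf_counter() - start
        # Returning self allows chaining: ToyChallenge(...).run(pipeline)
        return self


def toy_score(pipeline, datapoint):
    """Assumed scoring contract: (pipeline, datapoint) -> dict of results."""
    estimate = pipeline(datapoint["signal"])
    return {"abs_error": abs(estimate - datapoint["reference"])}


def halve(signal):
    """Toy 'pipeline': any callable works for this sketch."""
    return signal / 2


toy_dataset = [
    {"signal": 4.0, "reference": 2.5},
    {"signal": 6.0, "reference": 3.0},
]

challenge = ToyChallenge(dataset=toy_dataset, scoring=toy_score).run(halve)
print(challenge.results_single_)
```

Because run returns the instance, results can be read directly from the chained expression, which mirrors the "results stored as attributes" behaviour documented above.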
- save_results(folder_path: path_t, filename_stub: str) -> None
Save the results of the evaluation to disk.
Saves timing information as JSON and DataFrame results as CSV files using the provided filename stub.
- Parameters:
- folder_path : pathlib.Path or str
Folder path to save the results to.
- filename_stub : str
Filename stub to prefix saved files.
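As a rough illustration of the "timing as JSON, results as CSV, both prefixed by a stub" layout, the following standard-library sketch writes two files into a temporary folder. The helper name save_toy_results, the file suffixes _timing.json and _results.csv, and the example values are assumptions made for this sketch; the actual file names written by save_results are not stated here.

```python
import csv
import json
import tempfile
from pathlib import Path


def save_toy_results(folder_path, filename_stub, timing, rows):
    """Hypothetical helper: write timing as JSON and result rows as CSV."""
    folder = Path(folder_path)
    folder.mkdir(parents=True, exist_ok=True)

    # Both file names share the same stub as prefix (suffixes are assumed).
    timing_path = folder / f"{filename_stub}_timing.json"
    results_path = folder / f"{filename_stub}_results.csv"

    timing_path.write_text(json.dumps(timing, indent=2))
    with results_path.open("w", newline="") as f:
        writer = csv.DictWriter(f, fieldnames=list(rows[0].keys()))
        writer.writeheader()
        writer.writerows(rows)
    return [timing_path, results_path]


with tempfile.TemporaryDirectory() as tmp:
    written = save_toy_results(
        tmp,
        "toy_pipeline",
        timing={"total_runtime_s": 1.5},
        rows=[{"datapoint": "p01", "abs_error_ms": 2.0}],
    )
    written_names = sorted(p.name for p in written)
    reloaded_timing = json.loads(written[0].read_text())

print(written_names)
```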
- clone() -> Self
Create a new instance of the class with all parameters copied over.
This will create a new instance of the class itself and of all nested objects.
- get_params(deep: bool = True) -> dict[str, Any]
Get parameters for this algorithm.
- Parameters:
- deep
Only relevant if the object contains nested algorithm objects. If this is the case and deep is True, the params of these nested objects are included in the output using a prefix like nested_object_name__ (note the two “_” at the end).
- Returns:
- params
Parameter names mapped to their values.
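The double-underscore prefix is easiest to see on a small example. The classes below (ToyParams, ToyFilter, ToyPipeline) are hypothetical and only mimic the naming convention; they treat all instance attributes without a trailing underscore as parameters, which is an assumption of this sketch rather than a description of how tpcp collects parameters.

```python
class ToyParams:
    """Hypothetical base class that mimics the nested-parameter naming."""

    def get_params(self, deep: bool = True) -> dict:
        # Toy assumption: attributes without a trailing "_" are parameters.
        params = {k: v for k, v in vars(self).items() if not k.endswith("_")}
        if deep:
            # Copy the items first because the dict grows inside the loop.
            for name, value in list(params.items()):
                if isinstance(value, ToyParams):
                    for sub_name, sub_value in value.get_params(deep=True).items():
                        params[f"{name}__{sub_name}"] = sub_value
        return params


class ToyFilter(ToyParams):
    def __init__(self, cutoff_hz: float = 40.0):
        self.cutoff_hz = cutoff_hz


class ToyPipeline(ToyParams):
    def __init__(self, filter_algo: ToyFilter, threshold: float = 0.5):
        self.filter_algo = filter_algo
        self.threshold = threshold


pipeline = ToyPipeline(filter_algo=ToyFilter(cutoff_hz=25.0))
deep_params = pipeline.get_params(deep=True)
shallow_params = pipeline.get_params(deep=False)

print(sorted(deep_params))     # includes "filter_algo__cutoff_hz"
print(sorted(shallow_params))  # only "filter_algo" and "threshold"
```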
- set_params(**params: Any) -> Self
Set the parameters of this Algorithm.
To set parameters of nested objects, use nested_object_name__para_name=.
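A minimal sketch of how a nested key such as nested_object_name__para_name can be routed to the right object is shown below. The classes are hypothetical stand-ins, and splitting the key at the first "__" is an assumption chosen for illustration, not a statement about tpcp's internals.

```python
class ToyParams:
    """Hypothetical base class that mimics nested parameter setting."""

    def set_params(self, **params):
        for key, value in params.items():
            # Split "nested__param" into the attribute name and the remainder.
            name, sep, rest = key.partition("__")
            if not hasattr(self, name):
                raise ValueError(f"Unknown parameter: {name!r}")
            if sep:
                # Delegate the remainder to the nested object.
                getattr(self, name).set_params(**{rest: value})
            else:
                setattr(self, name, value)
        return self


class ToyFilter(ToyParams):
    def __init__(self, cutoff_hz: float = 40.0):
        self.cutoff_hz = cutoff_hz


class ToyPipeline(ToyParams):
    def __init__(self, filter_algo: ToyFilter, threshold: float = 0.5):
        self.filter_algo = filter_algo
        self.threshold = threshold


pipeline = ToyPipeline(filter_algo=ToyFilter())
returned = pipeline.set_params(threshold=0.8, filter_algo__cutoff_hz=30.0)

print(pipeline.threshold, pipeline.filter_algo.cutoff_hz)
```

Returning the instance from set_params keeps the call chainable, matching the Self return type in the signature above.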
- results_as_df() -> Self
Convert the raw validation results to pandas DataFrames and attach them to the instance.
The method builds the following DataFrames and stores them as instance attributes:
- results_agg_mean_std_: Mean and standard deviation of aggregated metrics.
- results_agg_total_: Total counts (e.g. total/valid/invalid PEPs).
- results_single_: Single (non-aggregated) results per datapoint.
- results_per_sample_: Per-sample flattened results with multiindex columns.
- Returns:
- Self
The challenge instance with DataFrame attributes populated.
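The relationship between the single, mean/std, and total results can be sketched with plain Python. The metric names (abs_error_ms, num_valid, num_invalid), the example values, and the use of the sample standard deviation are all assumptions for this sketch; the real method returns pandas DataFrames and its exact columns are not listed here.

```python
from statistics import mean, stdev

# Hypothetical per-datapoint results, one dictionary per datapoint.
results_single = [
    {"abs_error_ms": 2.0, "num_valid": 48, "num_invalid": 2},
    {"abs_error_ms": 3.0, "num_valid": 50, "num_invalid": 0},
    {"abs_error_ms": 1.0, "num_valid": 45, "num_invalid": 5},
]

# Error-type metrics are summarised by mean and (sample) standard deviation.
abs_errors = [r["abs_error_ms"] for r in results_single]
results_agg_mean_std = {
    "abs_error_ms": {"mean": mean(abs_errors), "std": stdev(abs_errors)}
}

# Count-type metrics are summed over all datapoints.
results_agg_total = {
    key: sum(r[key] for r in results_single)
    for key in ("num_valid", "num_invalid")
}

print(results_agg_mean_std)
print(results_agg_total)
```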