Evaluation Module
=================

.. py:module:: maldiamrkit.evaluation

AMR-specific evaluation metrics and stratified splitting utilities,
following EUCAST conventions.

.. note::

   ``LabelEncoder`` and ``IntermediateHandling`` moved to the
   :doc:`Susceptibility module ` in v0.15.
   Importing them from ``maldiamrkit.evaluation`` still works but emits a
   :class:`DeprecationWarning`; this import path will be removed in v0.17.

Metrics
-------

.. autofunction:: maldiamrkit.evaluation.very_major_error_rate

.. autofunction:: maldiamrkit.evaluation.major_error_rate

.. autofunction:: maldiamrkit.evaluation.sensitivity_score

.. autofunction:: maldiamrkit.evaluation.specificity_score

.. autofunction:: maldiamrkit.evaluation.categorical_agreement

.. autofunction:: maldiamrkit.evaluation.vme_me_curve

.. autofunction:: maldiamrkit.evaluation.amr_classification_report

.. autofunction:: maldiamrkit.evaluation.amr_multilabel_report

.. autofunction:: maldiamrkit.evaluation.mic_regression_report

Sklearn Scorers
~~~~~~~~~~~~~~~

Pre-built scorers for use with ``cross_val_score`` or ``GridSearchCV``:

.. py:data:: maldiamrkit.evaluation.vme_scorer

   Scorer that minimizes VME (Very Major Error rate).
   Use with ``cross_val_score(pipe, X, y, scoring=vme_scorer)``.

.. py:data:: maldiamrkit.evaluation.me_scorer

   Scorer that minimizes ME (Major Error rate).
   Use with ``cross_val_score(pipe, X, y, scoring=me_scorer)``.

Metrics Example
~~~~~~~~~~~~~~~

.. code-block:: python

   from maldiamrkit.evaluation import (
       very_major_error_rate,
       major_error_rate,
       amr_classification_report,
       vme_scorer,
   )
   from sklearn.model_selection import cross_val_score

   # y_true, y_pred, X, y, and pipe (an sklearn estimator or pipeline)
   # are assumed to be defined elsewhere.

   # Individual metrics
   vme = very_major_error_rate(y_true, y_pred)
   me = major_error_rate(y_true, y_pred)

   # Full report
   report = amr_classification_report(y_true, y_pred)

   # Use scorer in cross-validation
   scores = cross_val_score(pipe, X, y, cv=5, scoring=vme_scorer)

Splitting Utilities
-------------------

.. autofunction:: maldiamrkit.evaluation.stratified_species_drug_split

.. autofunction:: maldiamrkit.evaluation.case_based_split

.. autoclass:: maldiamrkit.evaluation.SpeciesDrugStratifiedKFold
   :members:
   :undoc-members:
   :show-inheritance:

.. autoclass:: maldiamrkit.evaluation.CaseGroupedKFold
   :members:
   :undoc-members:
   :show-inheritance:

Splitting Example
~~~~~~~~~~~~~~~~~

.. code-block:: python

   from maldiamrkit.evaluation import (
       stratified_species_drug_split,
       case_based_split,
       SpeciesDrugStratifiedKFold,
       CaseGroupedKFold,
   )

   # Single split preserving species-drug distributions
   X_train, X_test, y_train, y_test = stratified_species_drug_split(
       X, y, species=species_labels, test_size=0.2, random_state=42
   )

   # Patient-grouped split
   X_train, X_test, y_train, y_test = case_based_split(
       X, y, case_ids=patient_ids, test_size=0.2
   )

   # Sklearn-compatible CV splitters
   cv = SpeciesDrugStratifiedKFold(n_splits=5)
   for train_idx, test_idx in cv.split(X, y, species=species_labels):
       pass

   cv = CaseGroupedKFold(n_splits=5)
   for train_idx, test_idx in cv.split(X, y, groups=patient_ids):
       pass

Multi-Drug Evaluation
---------------------

For predicting resistance to multiple antibiotics simultaneously:

.. code-block:: python

   from maldiamrkit.susceptibility import LabelEncoder
   from maldiamrkit.evaluation import amr_multilabel_report
   from sklearn.multioutput import MultiOutputClassifier
   from sklearn.ensemble import RandomForestClassifier

   # Encode multi-drug labels (intermediate -> NaN)
   enc = LabelEncoder(intermediate="nan")
   y_encoded = enc.fit_transform(data.y)  # DataFrame with one column per drug

   # X_train, X_test, y_train, y_test: a train/test split of the features
   # and y_encoded (split step not shown).

   # Train multi-output model
   clf = MultiOutputClassifier(RandomForestClassifier())
   clf.fit(X_train, y_train)
   y_pred = clf.predict(X_test)

   # Per-drug AMR report
   report = amr_multilabel_report(y_test, y_pred, as_dataframe=True)
   print(report)
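
For readers unfamiliar with the error rates listed under Metrics, the sketch below shows the conventional definitions in plain Python: the very major error (VME) rate is the fraction of truly resistant isolates predicted susceptible, and the major error (ME) rate is the fraction of truly susceptible isolates predicted resistant. This is an illustration only, not the implementation in ``maldiamrkit.evaluation``. The function names ``vme_rate_sketch`` and ``me_rate_sketch`` are hypothetical, and the encoding (``1`` = resistant, ``0`` = susceptible) is an assumption.

```python
def vme_rate_sketch(y_true, y_pred):
    """Fraction of truly resistant isolates (1) predicted susceptible (0).

    Hypothetical helper; assumes labels are encoded 1 = resistant,
    0 = susceptible.
    """
    preds_for_resistant = [p for t, p in zip(y_true, y_pred) if t == 1]
    if not preds_for_resistant:
        return float("nan")  # undefined when there are no resistant isolates
    missed = sum(1 for p in preds_for_resistant if p == 0)
    return missed / len(preds_for_resistant)


def me_rate_sketch(y_true, y_pred):
    """Fraction of truly susceptible isolates (0) predicted resistant (1)."""
    preds_for_susceptible = [p for t, p in zip(y_true, y_pred) if t == 0]
    if not preds_for_susceptible:
        return float("nan")  # undefined when there are no susceptible isolates
    false_alarms = sum(1 for p in preds_for_susceptible if p == 1)
    return false_alarms / len(preds_for_susceptible)


# Toy data: 4 resistant and 6 susceptible isolates.
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 0, 0, 0, 0, 1, 1]

vme = vme_rate_sketch(y_true, y_pred)  # 1 of 4 resistant isolates missed
me = me_rate_sketch(y_true, y_pred)    # 2 of 6 susceptible isolates flagged
print(f"VME = {vme:.3f}, ME = {me:.3f}")
```

Both rates are conditioned on the true class, so with imbalanced data they can differ sharply from overall accuracy; a low VME matters most clinically because a missed resistant isolate can lead to ineffective treatment.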