MaldiAMRKit - Susceptibility & MIC regression

This notebook walks through the maldiamrkit.susceptibility submodule introduced in v0.15, and its regression-evaluation counterpart in maldiamrkit.evaluation:

  • MICEncoder - turns raw MIC strings into a tidy DataFrame with log2_mic, a censoring mask, and (when given a BreakpointTable) the clinical S/I/R category plus the ATU flag.

  • BreakpointTable - clinical breakpoint table loaded from bundled EUCAST YAMLs or user-supplied files.

  • mic_regression_report - regression-style evaluation that reports essential agreement (predictions within 1 log2 dilution of the reference) and, when breakpoints are supplied, clinical categorical agreement. It lives in maldiamrkit.evaluation because it complements amr_classification_report.

Everything below runs on a small synthetic dataset and the bundled example.yaml to keep the notebook self-contained. For real clinical work, drop in a vendored EUCAST YAML produced by the gitignored eucast_converter/ tooling.

Imports

[1]:
from importlib import resources

import numpy as np
import pandas as pd

from maldiamrkit.evaluation import mic_regression_report
from maldiamrkit.susceptibility import BreakpointTable, MICEncoder

Loading a breakpoint table

BreakpointTable has four constructors:

  • BreakpointTable.from_yaml(path) - load any YAML file in the canonical schema.

  • BreakpointTable.from_version("16.0") - load a vendored EUCAST version.

  • BreakpointTable.from_year(2026) - look up by publication year via the bundled manifest.

  • BreakpointTable.from_latest() - return the highest-numbered bundled version.

BreakpointTable.list_available() reports which EUCAST versions ship with the current install.

[2]:
available = BreakpointTable.list_available()
print(f"Bundled EUCAST versions on this install: {available or '[none yet]'}")

example_path = resources.files("maldiamrkit") / "data" / "breakpoints" / "example.yaml"
bp = BreakpointTable.from_yaml(example_path)
print(bp)
print("Species:", bp.species())
print("Drugs:", bp.drugs())
Bundled EUCAST versions on this install: ['1.0', '1.1', '1.2', '1.3', '2.0', '4.0', '5.0', '6.0', '7.1', '8.0', '8.1', '9.0', '10.0', '11.0', '12.0', '13.1', '14.0', '15.0', '16.0']
BreakpointTable(EXAMPLE v0.0, 5 rows)
Species: ['Escherichia coli', 'Klebsiella pneumoniae']
Drugs: ['Ceftriaxone', 'Ciprofloxacin', 'Meropenem', 'Piperacillin-tazobactam']

Categorising a single MIC value

bp.apply(species, drug, mic) returns a BreakpointResult with three fields:

  • category - "S", "I", "R", or None if the lookup failed.

  • atu - True when the MIC sits in the species/drug ATU (Area of Technical Uncertainty) range. This is an assay-quality flag, not a third clinical category.

  • source - provenance, e.g. "EUCAST v16.0".

[3]:
for mic in (0.25, 1.0, 2.0, 4.0, 16.0):
    r = bp.apply("Klebsiella pneumoniae", "Meropenem", mic=mic)
    print(f"MIC={mic:>5} mg/L -> category={r.category!s:<5} atu={r.atu}")
MIC= 0.25 mg/L -> category=S     atu=False
MIC=  1.0 mg/L -> category=S     atu=False
MIC=  2.0 mg/L -> category=S     atu=False
MIC=  4.0 mg/L -> category=I     atu=True
MIC= 16.0 mg/L -> category=R     atu=False

Modern EUCAST treats I as Susceptible, Increased exposure (a real, treatable category) - not as “uncertain”. ATU is the assay-quality flag that runs alongside S/I/R: a Meropenem MIC of 4 here is still clinically I, but the ATU flag tells you the call sits in a zone where assay variability can flip it. Treat ATU-flagged results as “investigate further” rather than discarding them.
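
The S/I/R call and the ATU flag are two independent checks on the same MIC, which a few lines of plain Python can make concrete. This is a minimal sketch, not the BreakpointTable implementation: SketchBreakpoint and categorise are hypothetical names, and the thresholds are illustrative values picked only to mirror the Meropenem output above (they are not read from example.yaml). It assumes the EUCAST convention that S is "MIC <= S breakpoint" and R is "MIC > R breakpoint".

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class SketchBreakpoint:
    """Hypothetical breakpoint row, all values in mg/L."""

    s_max: float  # S when MIC <= s_max
    r_min: float  # R when MIC > r_min
    atu: Optional[Tuple[float, float]] = None  # inclusive ATU range, if any


def categorise(mic: float, row: SketchBreakpoint) -> Tuple[str, bool]:
    """Return (category, atu_flag) for one MIC value."""
    if mic <= row.s_max:
        category = "S"
    elif mic > row.r_min:
        category = "R"
    else:
        category = "I"
    # The ATU flag is evaluated separately and never changes the category.
    in_atu = row.atu is not None and row.atu[0] <= mic <= row.atu[1]
    return category, in_atu


# Illustrative thresholds only (assumption, not the contents of example.yaml).
meropenem = SketchBreakpoint(s_max=2.0, r_min=8.0, atu=(4.0, 4.0))
for mic in (0.25, 1.0, 2.0, 4.0, 16.0):
    category, atu = categorise(mic, meropenem)
    print(f"MIC={mic:>5} mg/L -> category={category} atu={atu}")
```

Keeping the ATU check outside the if/elif chain is the point of the sketch: a flagged result keeps its clinical category and simply carries an extra warning.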

MICEncoder - parse MIC strings into ML targets

MICEncoder is an sklearn-style transformer. Given a DataFrame with a MIC column, it produces:

  • log2_mic - regression target.

  • censored - True where the source MIC used <=, <, >=, or > qualifiers.

  • category, atu, source - populated only when a BreakpointTable is supplied.
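
The parsing behind log2_mic and censored can be sketched with the standard library alone. This is an illustration under stated assumptions, not MICEncoder's actual code: parse_mic is a hypothetical helper, and it assumes a censored value is encoded at its stated bound (so ">16" maps to log2(16)), with the qualifier recorded only in the boolean flag.

```python
import math
import re
from typing import Tuple

# Optional qualifier followed by a plain decimal number, e.g. "<=0.25", ">16", "2".
_MIC_PATTERN = re.compile(r"^\s*(<=|>=|<|>)?\s*(\d+(?:\.\d+)?)\s*$")


def parse_mic(text: str) -> Tuple[float, bool]:
    """Return (log2_mic, censored) for one MIC string (hypothetical helper)."""
    match = _MIC_PATTERN.match(text)
    if match is None:
        raise ValueError(f"Unrecognised MIC string: {text!r}")
    qualifier, value = match.group(1), float(match.group(2))
    if value <= 0:
        raise ValueError(f"MIC must be positive, got {value}")
    # Assumption: the bound itself is used as the numeric value when censored.
    return math.log2(value), qualifier is not None


for raw in ["<=0.25", "0.5", "1", "2", "4", ">16"]:
    print(raw, parse_mic(raw))
```

Working on the log2 scale turns each two-fold dilution step into one unit, which is why it is the natural regression target for MIC data.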

Without breakpoints, you get the regression-only output:

[4]:
df = pd.DataFrame(
    {
        "Species": ["Klebsiella pneumoniae"] * 6,
        "Drug": ["Ceftriaxone"] * 6,
        "MIC": ["<=0.25", "0.5", "1", "2", "4", ">16"],
    }
)

enc = MICEncoder(mic_col="MIC")
out = enc.fit_transform(df)
out
[4]:
log2_mic censored category atu source
0 -2.0 True <NA> <NA> <NA>
1 -1.0 False <NA> <NA> <NA>
2 0.0 False <NA> <NA> <NA>
3 1.0 False <NA> <NA> <NA>
4 2.0 False <NA> <NA> <NA>
5 4.0 True <NA> <NA> <NA>

Wire in a BreakpointTable and the same call also categorises each row:

[5]:
enc = MICEncoder(
    breakpoints=bp,
    mic_col="MIC",
    species_col="Species",
    drug_col="Drug",
)
out = enc.fit_transform(df)
out
[5]:
log2_mic censored category atu source
0 -2.0 True S False MaldiAMRKit synthetic example (NOT FOR CLINICA...
1 -1.0 False S False MaldiAMRKit synthetic example (NOT FOR CLINICA...
2 0.0 False S False MaldiAMRKit synthetic example (NOT FOR CLINICA...
3 1.0 False I False MaldiAMRKit synthetic example (NOT FOR CLINICA...
4 2.0 False R False MaldiAMRKit synthetic example (NOT FOR CLINICA...
5 4.0 True R False MaldiAMRKit synthetic example (NOT FOR CLINICA...

MICEncoder is sklearn-compatible: fit, transform, fit_transform, and get_feature_names_out work as expected, so it slots into Pipeline and ColumnTransformer.

[6]:
enc.get_feature_names_out().tolist()
[6]:
['log2_mic', 'censored', 'category', 'atu', 'source']

mic_regression_report - clinically-grounded evaluation

When a model predicts continuous MIC values, the standard regression suite is not enough on its own. The report combines it with the agreement metrics that clinicians look at:

  • rmse_log2 / mae_log2 / bias_log2 - standard regression diagnostics on the log2 scale (one dilution = 1 log2 unit).

  • essential_agreement - fraction of predictions within +/- 1 log2 dilution. The clinical benchmark for MIC prediction.

  • When breakpoints are supplied, the report also re-bins both y_true and y_pred to S/I/R and reports clinical categorical agreement, very-major-error rate (R predicted as S), and major-error rate (S predicted as R).
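
The log2-scale metrics in the first two bullets reduce to a few NumPy reductions. The sketch below is one plausible reading, not the library's code: log2_metrics is a hypothetical function, bias is assumed to be the mean of (prediction minus truth), and essential agreement is assumed to be computed on the raw continuous predictions without first rounding them to a dilution step.

```python
import numpy as np


def log2_metrics(y_true, y_pred, tolerance=1.0):
    """Regression diagnostics on the log2(MIC) scale (hypothetical helper)."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    err = y_pred - y_true  # assumption: positive error means over-prediction
    return {
        "rmse_log2": float(np.sqrt(np.mean(err**2))),
        "mae_log2": float(np.mean(np.abs(err))),
        "bias_log2": float(np.mean(err)),
        # Fraction of predictions within `tolerance` dilutions of the reference.
        "essential_agreement": float(np.mean(np.abs(err) <= tolerance)),
    }


# Toy values: three predictions within one dilution, one off by 1.5 dilutions.
y_true = np.log2([0.5, 1.0, 2.0, 4.0])
y_pred = y_true + np.array([0.5, -0.5, 1.5, 0.0])
metrics = log2_metrics(y_true, y_pred)
print(metrics)
```

On this toy input the essential agreement is 0.75, because only the third prediction misses the reference by more than one dilution.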

[7]:
rng = np.random.default_rng(seed=0)
y_true_mic = np.array([0.25, 0.5, 1.0, 2.0, 4.0, 8.0, 16.0])
y_true = np.log2(y_true_mic)
y_pred = y_true + rng.normal(0.0, 0.6, size=y_true.shape)

report = mic_regression_report(
    y_true=y_true,
    y_pred=y_pred,
    breakpoints=bp,
    species="Klebsiella pneumoniae",
    drug="Ceftriaxone",
)
for k, v in report.items():
    print(f"{k:>22}: {v}")
                     n: 7
             rmse_log2: 0.3637316429270602
              mae_log2: 0.2746647707298642
             bias_log2: 0.16018918733802207
   essential_agreement: 1.0
 categorical_agreement: 0.7142857142857143
 very_major_error_rate: 0.0
      major_error_rate: 0.0
         n_categorical: 7
      n_resistant_true: 3
    n_susceptible_true: 3
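
In the output above, essential agreement is 1.0 while categorical agreement is 5/7 and both error rates are 0.0. Given the definitions used here (very major error is R predicted as S, major error is S predicted as R), the two disagreements must involve the I category. The sketch below shows how that can happen: a prediction within one dilution of the truth can still land on the other side of a breakpoint. It is an illustration under assumptions, not the library's implementation: to_category and categorical_metrics are hypothetical names, the thresholds (S <= 1 mg/L, R > 2 mg/L) are illustrative, predictions are re-binned from their continuous values, and the error rates are assumed to use the number of truly resistant and truly susceptible isolates as denominators.

```python
import numpy as np


def to_category(log2_mic, s_max, r_min):
    """Map log2(MIC) values to 'S'/'I'/'R' using S <= s_max, R > r_min."""
    mic = 2.0 ** np.asarray(log2_mic, dtype=float)
    return np.where(mic <= s_max, "S", np.where(mic > r_min, "R", "I"))


def categorical_metrics(y_true, y_pred, s_max, r_min):
    """Categorical agreement plus very-major and major error rates."""
    true_cat = to_category(y_true, s_max, r_min)
    pred_cat = to_category(y_pred, s_max, r_min)
    is_r = true_cat == "R"
    is_s = true_cat == "S"
    nan = float("nan")
    return {
        "categorical_agreement": float(np.mean(true_cat == pred_cat)),
        # Assumed denominators: truly resistant / truly susceptible isolates.
        "very_major_error_rate": float(np.mean(pred_cat[is_r] == "S")) if is_r.any() else nan,
        "major_error_rate": float(np.mean(pred_cat[is_s] == "R")) if is_s.any() else nan,
    }


# Every prediction is within half a dilution of the reference value.
y_true = np.log2([1.0, 2.0, 4.0, 8.0])
y_pred = y_true + np.array([0.5, 0.5, -0.5, 0.0])
result = categorical_metrics(y_true, y_pred, s_max=1.0, r_min=2.0)
print(result)
```

Here the true categories are S, I, R, R and the predicted ones are I, R, R, R, so categorical agreement is 0.5 with no very major or major errors, even though essential agreement would be perfect.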