Visualization Module

Standalone plotting functions for spectra, datasets, peaks, and alignment.

All plotting functions use lazy matplotlib imports, so matplotlib is only required when a plot function is actually called.

Exploratory Plots

Dimensionality-reduction scatter plots for exploring datasets, colored by metadata columns such as species or resistance phenotype. PCA and t-SNE are available with no extra dependencies; UMAP requires the optional umap-learn package (pip install maldiamrkit[batch]).

maldiamrkit.visualization.plot_pca(X, color_by=None, n_components=2, *, random_state=42, ax=None, palette=None, label_map=None, title=None, figsize=(8, 6), alpha=0.7, s=None, legend=True, legend_loc='best', grid=True, show=True, **pca_kwargs)

Scatter plot of a PCA embedding colored by metadata.

Axis labels include the percentage of explained variance for each principal component.

Parameters:
  • X (pd.DataFrame) – Feature matrix of shape (n_samples, n_features).

  • color_by (pd.Series, np.ndarray, or None) – Categorical labels used to color points (e.g. species, resistance phenotype, batch).

  • n_components (int, default=2) – Number of principal components.

  • random_state (int or None, default=42) – Random seed forwarded to sklearn.decomposition.PCA for reproducibility. Pass None for non-deterministic results.

  • ax (matplotlib.axes.Axes, optional) – Axes to draw on. If None, a new figure is created.

  • palette (str or dict, optional) – Colormap name or {label: color} mapping.

  • label_map (dict, optional) – Map raw label values to display strings shown in the legend. Defaults to the S/I/R and 0/1 → susceptible/resistant mapping; user values override entries in the default map.

  • title (str, optional) – Plot title. Defaults to "PCA".

  • figsize (tuple of float, default=(8, 6)) – Figure size (only used when ax is None).

  • alpha (float, default=0.7) – Point transparency.

  • s (float, optional) – Marker size. When None (the default), auto-scales with sample count (max(8, 2000/n)).

  • legend (bool, default=True) – Whether to show a legend.

  • legend_loc (str, default="best") – matplotlib legend location string (e.g. "upper right") or "outside" to place the legend outside the axes.

  • grid (bool, default=True) – Draw a faint background grid.

  • show (bool, default=True) – Call plt.show() at the end.

  • **pca_kwargs (dict) – Extra keyword arguments forwarded to sklearn.decomposition.PCA.

Return type:

tuple[Figure, Axes]

Returns:

  • fig (matplotlib.figure.Figure)

  • ax (matplotlib.axes.Axes)

Examples

>>> from maldiamrkit.visualization import plot_pca
>>> fig, ax = plot_pca(dataset.X, color_by=dataset.meta["Species"])
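
The automatic marker size described for the s parameter can be reproduced outside the library. The following is a minimal sketch of the documented formula max(8, 2000/n), not the library's own code: the helper name auto_marker_size is hypothetical, and rejecting a non-positive sample count is an added assumption.

```python
def auto_marker_size(n_samples: int) -> float:
    """Return max(8, 2000 / n_samples), the documented default for ``s``.

    Hypothetical helper: it mirrors the formula in the parameter
    description and is not part of maldiamrkit.
    """
    if n_samples <= 0:
        # Assumption: an empty dataset is rejected rather than plotted.
        raise ValueError("n_samples must be positive")
    return max(8.0, 2000.0 / n_samples)


# Small datasets get large markers; from 250 samples upward the
# size stays at the floor of 8.
for n in (50, 100, 250, 1000):
    print(n, auto_marker_size(n))
```

Passing an explicit value for s skips this scaling, since the automatic size only applies when s is None.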

maldiamrkit.visualization.plot_tsne(X, color_by=None, n_components=2, *, perplexity=30.0, random_state=42, ax=None, palette=None, label_map=None, title=None, figsize=(8, 6), alpha=0.7, s=None, legend=True, legend_loc='best', grid=True, show=True, **tsne_kwargs)

Scatter plot of a t-SNE embedding colored by metadata.

Parameters:
  • X (pd.DataFrame) – Feature matrix of shape (n_samples, n_features).

  • color_by (pd.Series, np.ndarray, or None) – Categorical labels used to color points.

  • n_components (int, default=2) – Number of t-SNE dimensions.

  • perplexity (float, default=30.0) – t-SNE perplexity parameter.

  • random_state (int or None, default=42) – Random seed for reproducibility. Pass None for non-deterministic results.

  • ax (matplotlib.axes.Axes, optional) – Axes to draw on.

  • palette (str or dict, optional) – Colormap name or {label: color} mapping.

  • label_map (dict, optional) – Map raw label values to display strings shown in the legend. Defaults to the S/I/R and 0/1 → susceptible/resistant mapping; user values override entries in the default map.

  • title (str, optional) – Plot title. Defaults to "t-SNE".

  • figsize (tuple of float, default=(8, 6)) – Figure size.

  • alpha (float, default=0.7) – Point transparency.

  • s (float, optional) – Marker size. When None (the default), auto-scales with sample count (max(8, 2000/n)).

  • legend (bool, default=True) – Whether to show a legend.

  • legend_loc (str, default="best") – matplotlib legend location string (e.g. "upper right") or "outside" to place the legend outside the axes.

  • grid (bool, default=True) – Draw a faint background grid.

  • show (bool, default=True) – Call plt.show() at the end.

  • **tsne_kwargs (dict) – Extra keyword arguments forwarded to sklearn.manifold.TSNE.

Return type:

tuple[Figure, Axes]

Returns:

  • fig (matplotlib.figure.Figure)

  • ax (matplotlib.axes.Axes)

Examples

>>> from maldiamrkit.visualization import plot_tsne
>>> fig, ax = plot_tsne(dataset.X, color_by=labels, perplexity=15)
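
The label_map description says that user entries override the default map rather than replace it. A minimal sketch of that merge is shown here, under stated assumptions: the default display strings, the helper name resolve_legend_labels, and the fallback to str() for unmapped labels are all illustrative and may differ from what the library does.

```python
# Assumed default display strings; the documentation only states that
# S/I/R and 0/1 are mapped to susceptible/resistant names.
DEFAULT_LABEL_MAP = {
    "S": "Susceptible",
    "I": "Intermediate",
    "R": "Resistant",
    0: "Susceptible",
    1: "Resistant",
}


def resolve_legend_labels(labels, label_map=None):
    """Translate raw labels to legend strings (hypothetical helper)."""
    # User entries are applied last, so they override the defaults
    # while untouched defaults stay in effect.
    merged = {**DEFAULT_LABEL_MAP, **(label_map or {})}
    # Assumption: labels without a mapping fall back to their string form.
    return [merged.get(label, str(label)) for label in labels]


print(resolve_legend_labels(["S", "R", 1, "ESBL"]))
print(resolve_legend_labels(["S", "R"], label_map={"R": "Non-susceptible"}))
```

In the second call only "R" is overridden, so "S" still uses the default entry.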

maldiamrkit.visualization.plot_umap(X, color_by=None, n_components=2, *, n_neighbors=15, min_dist=0.1, random_state=42, ax=None, palette=None, label_map=None, title=None, figsize=(8, 6), alpha=0.7, s=None, legend=True, legend_loc='best', grid=True, show=True, **umap_kwargs)

Scatter plot of a UMAP embedding colored by metadata.

Requires the optional umap-learn package. Install it with:

pip install maldiamrkit[batch]

Parameters:
  • X (pd.DataFrame) – Feature matrix of shape (n_samples, n_features).

  • color_by (pd.Series, np.ndarray, or None) – Categorical labels used to color points.

  • n_components (int, default=2) – Number of UMAP dimensions.

  • n_neighbors (int, default=15) – UMAP n_neighbors parameter.

  • min_dist (float, default=0.1) – UMAP min_dist parameter.

  • random_state (int or None, default=42) – Random seed for reproducibility. Pass None for non-deterministic results.

  • ax (matplotlib.axes.Axes, optional) – Axes to draw on.

  • palette (str or dict, optional) – Colormap name or {label: color} mapping.

  • label_map (dict, optional) – Map raw label values to display strings shown in the legend. Defaults to the S/I/R and 0/1 → susceptible/resistant mapping; user values override entries in the default map.

  • title (str, optional) – Plot title. Defaults to "UMAP".

  • figsize (tuple of float, default=(8, 6)) – Figure size.

  • alpha (float, default=0.7) – Point transparency.

  • s (float, optional) – Marker size. When None (the default), auto-scales with sample count (max(8, 2000/n)).

  • legend (bool, default=True) – Whether to show a legend.

  • legend_loc (str, default="best") – matplotlib legend location string (e.g. "upper right") or "outside" to place the legend outside the axes.

  • grid (bool, default=True) – Draw a faint background grid.

  • show (bool, default=True) – Call plt.show() at the end.

  • **umap_kwargs (dict) – Extra keyword arguments forwarded to umap.UMAP.

Return type:

tuple[Figure, Axes]

Returns:

  • fig (matplotlib.figure.Figure)

  • ax (matplotlib.axes.Axes)

Raises:

ImportError – If umap-learn is not installed.

Examples

>>> from maldiamrkit.visualization import plot_umap
>>> fig, ax = plot_umap(dataset.X, color_by=dataset.meta["Species"])
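
Because plot_umap raises ImportError when umap-learn is missing, a caller may want to check for the package first and fall back to an embedding that needs no extra dependencies. The sketch below shows one way to do that with the standard library; has_module and pick_embedding are hypothetical helpers, and the fallback to PCA is a design choice of this example, not library behaviour.

```python
import importlib.util


def has_module(name: str) -> bool:
    """Return True if a top-level module is importable, without importing it."""
    return importlib.util.find_spec(name) is not None


def pick_embedding(preferred: str = "umap") -> str:
    """Choose which embedding to plot (hypothetical helper).

    umap-learn installs the top-level module ``umap``. If it is missing,
    fall back to PCA, which needs no extra dependencies.
    """
    if preferred == "umap" and not has_module("umap"):
        return "pca"
    return preferred


# Prints "umap" when umap-learn is installed, otherwise "pca".
print(pick_embedding("umap"))
```

The returned name can then be used to select between plot_umap and plot_pca at the call site.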

Spectrum Plots

maldiamrkit.visualization.plot_spectrum(spectrum, *, stage='binned', peaks=None, highlight_regions=None, ax=None, color=None, figsize=(10, 4), title=None, log_y=False, ylim=None, show=True, binned=None, **kwargs)

Plot a single MALDI-TOF spectrum with a real m/z axis.

Parameters:
  • spectrum (MaldiSpectrum) – Spectrum to plot.

  • stage ({"binned", "preprocessed", "raw"}, default="binned") – Processing stage to render. "binned" uses a bar plot with bar width inferred from the bin spacing; "preprocessed" and "raw" use a line plot.

  • peaks (list of float or ndarray, optional) – If given, draw a scatter marker above the spectrum at each peak m/z.

  • highlight_regions (list of (mz_min, mz_max) tuples, optional) – Shaded m/z bands drawn behind the spectrum (e.g. regions of interest from differential analysis).

  • ax (matplotlib.axes.Axes, optional) – Axes to plot on. If None, creates a new figure.

  • color (str, optional) – Colour for the spectrum (bars / line). Matplotlib default used when None.

  • figsize (tuple of float, default=(10, 4)) – Figure size in inches (only used when ax is None).

  • title (str, optional) – Overrides the auto-generated title ("{spectrum.id} ({stage})").

  • log_y (bool, default=False) – Use a logarithmic y-axis.

  • ylim (tuple of float, optional) – Override y-axis limits. Defaults to matplotlib autoscaling (no clipping of negatives).

  • show (bool, default=True) – Call plt.show() at the end.

  • binned (bool, optional) – Deprecated. Use stage= instead. binned=True maps to stage="binned"; binned=False maps to "preprocessed" if available, else "raw".

  • **kwargs (dict) – Additional keyword arguments forwarded to ax.bar (binned stage) or ax.plot (raw / preprocessed).

Return type:

tuple[Figure, Axes]

Returns:

  • fig (matplotlib.figure.Figure)

  • ax (matplotlib.axes.Axes)
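
The mapping from the deprecated binned flag to stage can be written out as a small function. This is a sketch of the documented mapping only: resolve_stage and its has_preprocessed argument are hypothetical, and both the DeprecationWarning and the rule that binned takes precedence over stage are assumptions, since the documentation does not say how the library handles either.

```python
import warnings


def resolve_stage(stage="binned", binned=None, has_preprocessed=True):
    """Resolve the rendering stage from ``stage`` and the deprecated ``binned``.

    Hypothetical helper. ``has_preprocessed`` stands in for "the spectrum
    carries preprocessed data"; how the library checks this is not documented.
    """
    if binned is None:
        return stage
    # Assumption: the deprecated flag warns and takes precedence over stage.
    warnings.warn(
        "binned= is deprecated; use stage= instead",
        DeprecationWarning,
        stacklevel=2,
    )
    if binned:
        return "binned"
    # binned=False: prefer preprocessed data, else fall back to raw.
    return "preprocessed" if has_preprocessed else "raw"


with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    print(resolve_stage(binned=True))
    print(resolve_stage(binned=False, has_preprocessed=False))
    print(len(caught), "deprecation warnings recorded")
```

New code should pass stage= directly, which avoids the fallback logic entirely.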

maldiamrkit.visualization.plot_pseudogel(dataset, *, antibiotic=None, species=None, regions=None, cmap='inferno', vmin=None, vmax=None, figsize=None, log_scale=True, sort_by='intensity', label_map=None, title=None, show=True, sort_by_intensity=None)

Display a pseudogel heatmap of the spectra.

Creates one subplot per unique value of the antibiotic column, in susceptibility order (S, I, R) with unknown labels appended alphabetically.
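
The subplot order described above (S, I, R first, then any other labels alphabetically) can be illustrated with a short sketch. The helper order_groups is hypothetical, and comparing unknown labels by their string form is an assumption made so that mixed label types sort without error.

```python
SUSCEPTIBILITY_ORDER = ("S", "I", "R")


def order_groups(labels):
    """Return unique labels in S, I, R order, then the rest alphabetically.

    Hypothetical helper illustrating the documented subplot order.
    """
    unique = set(labels)
    # Known susceptibility categories come first, in fixed order.
    ordered = [k for k in SUSCEPTIBILITY_ORDER if k in unique]
    # Assumption: remaining labels are sorted by their string form.
    others = [u for u in unique if u not in SUSCEPTIBILITY_ORDER]
    ordered.extend(sorted(others, key=str))
    return ordered


print(order_groups(["R", "S", "unknown", "I", "S", "ND"]))
```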

Parameters:
  • dataset (MaldiSet) – Dataset to visualize.

  • antibiotic (str, optional) – Target column to group by. Defaults to the first configured antibiotic in the MaldiSet.

  • species (str, optional) – When given, restrict the pseudogel to that species via SpeciesFilter. Default None keeps all samples.

  • regions (tuple or list of tuples, optional) – m/z region(s) to display. None shows all.

  • cmap (str, default="inferno") – Matplotlib colormap name.

  • vmin (float, optional) – Lower colour-scale limit, in raw intensity units. When log_scale=True, the value is mapped through np.log1p before being passed to imshow, so the plotted range matches what was specified.

  • vmax (float, optional) – Upper colour-scale limit, in raw intensity units. Mapped through np.log1p in the same way as vmin when log_scale=True.

  • figsize (tuple, optional) – Figure size. Defaults to (14.0, 2.5 * n_groups) so the m/z axis is wide enough for typical binned data (thousands of columns).

  • log_scale (bool, default=True) – Apply np.log1p to intensities.

  • sort_by ({"intensity", "id", None}, default="intensity") –

    How to order samples within each group:

    • "intensity": sort by mean intensity (descending).

    • "id": sort by the sample’s index value (deterministic).

    • None: keep the order encountered in the MaldiSet.

  • label_map (dict, optional) – Mapping from raw group label to display name. By default, "S" and 0 map to "Susceptible (S)", "I" maps to "Intermediate (I)", and "R" and 1 map to "Resistant (R)"; any other value is stringified as-is. Pass a dict to override.

  • title (str, optional) – Figure title. Defaults to f"Pseudogel: {antibiotic}" when omitted.

  • show (bool, default=True) – Call plt.show() at the end.

  • sort_by_intensity (bool, optional) – Deprecated. Use sort_by= instead. Retained for backwards compatibility; True maps to sort_by="intensity" and False maps to sort_by=None.

Return type:

tuple[Figure, ndarray]

Returns:

  • fig (matplotlib.figure.Figure)

  • axes (ndarray of Axes)

Raises:

ValueError – If the antibiotic column is not defined, if a region has min_mz > max_mz, if no m/z values lie within a specified region, or if sort_by is not one of the recognised values.
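
The vmin and vmax handling is easiest to see with numbers. The sketch below applies the documented log1p mapping to limits given in raw intensity units; color_limits is a hypothetical helper that uses math.log1p in place of np.log1p so it runs without NumPy, and it is not the library's code.

```python
import math


def color_limits(vmin=None, vmax=None, log_scale=True):
    """Map raw-intensity colour limits to the scale that is actually plotted.

    Hypothetical helper: with log_scale=True each limit goes through
    log1p, matching the transform applied to the intensities.
    """

    def _map(value):
        if value is None:
            # Leave the limit unset so the image autoscales.
            return None
        return math.log1p(value) if log_scale else float(value)

    return _map(vmin), _map(vmax)


# Limits of 0 and 1000 raw counts become 0 and log(1001) on the log scale.
print(color_limits(0, 1000))
print(color_limits(0, 1000, log_scale=False))
```

The practical consequence is that callers keep thinking in raw intensities and do not need to transform the limits themselves when log_scale is on.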

Peak Plots

maldiamrkit.visualization.plot_peaks(detector, X, indices=None, *, xlim=None, figsize=None, alpha=0.7, show_axvlines=False, ax=None, show=True)

Plot detected peaks overlaid on original spectra.

Parameters:
  • detector (MaldiPeakDetector) – Fitted peak detector.

  • X (pd.DataFrame or pd.Series) – Input spectra with shape (n_samples, n_bins).

  • indices (int, list of int, "all", or None, default=None) – Indices of spectra to plot. None plots the first spectrum (unchanged from prior behaviour); "all" plots every spectrum in X.

  • xlim (tuple of (float, float), optional) – X-axis limits for zooming into a specific m/z range.

  • figsize (tuple of (float, float), optional) – Figure size in inches. When None, defaults to (14, 3 * n_spectra) so stacking many spectra gives a proportionally tall figure. Ignored when ax is provided.

  • alpha (float, default=0.7) – Transparency for spectrum lines.

  • show_axvlines (bool, default=False) – If True, draw a dashed vertical line at each detected peak. Default off because the scatter markers alone already mark peak positions, and the vertical lines become visually noisy with dense peak sets.

  • ax (matplotlib.axes.Axes, optional) – Pre-existing axes to plot on. Only honoured when exactly one spectrum is plotted; multi-panel calls always create a new Figure.

  • show (bool, default=True) – Call plt.show() at the end.

Raises:

ValueError – If any index in indices is out of bounds for the data, or if an ax is provided together with multiple spectra.

Return type:

tuple[Figure, Any]

Returns:

  • fig (matplotlib.figure.Figure) – The figure holding the peak panels.

  • axes (Axes or ndarray of Axes) – A single Axes when indices resolves to one spectrum, otherwise an ndarray of Axes (one per spectrum, stacked vertically with shared x-axis).
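
How indices and figsize interact can be sketched in a few lines. The helpers resolve_indices and default_figsize are hypothetical and only restate the documented rules; treating negative indices as out of bounds is an assumption, since the documentation does not say whether they are accepted.

```python
def resolve_indices(indices, n_samples):
    """Turn the ``indices`` argument into a list of row positions.

    Hypothetical helper: None selects the first spectrum, "all" selects
    every spectrum, and an int or list of ints selects those rows.
    """
    if indices is None:
        resolved = [0]
    elif isinstance(indices, str):
        if indices != "all":
            raise ValueError(f"unknown indices value: {indices!r}")
        resolved = list(range(n_samples))
    elif isinstance(indices, int):
        resolved = [indices]
    else:
        resolved = list(indices)
    for i in resolved:
        # Assumption: negative indices count as out of bounds.
        if not 0 <= i < n_samples:
            raise ValueError(f"index {i} is out of bounds for {n_samples} spectra")
    return resolved


def default_figsize(n_spectra, figsize=None):
    """Return the documented default (14, 3 * n_spectra) unless overridden."""
    return figsize if figsize is not None else (14, 3 * n_spectra)


selected = resolve_indices("all", 3)
print(selected, default_figsize(len(selected)))
```

With three spectra selected, the default figure is 14 inches wide and 9 inches tall, one 3-inch panel per spectrum.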

Alignment Plots

maldiamrkit.visualization.plot_alignment(warper, X_original, X_aligned=None, indices=None, *, show_peaks=True, show_sample_peaks=False, xlim=None, figsize=None, alpha=0.7, color_reference='black', color_original='red', color_aligned='blue', title=None, show=True)

Plot a comparison of original and aligned spectra against the reference.

Parameters:
  • warper (Warping) – Fitted warping transformer.

  • X_original (pd.DataFrame) – Original (unaligned) spectra.

  • X_aligned (pd.DataFrame, optional) – Aligned spectra. If None, they are computed by calling transform().

  • indices (int or list of int, optional) – Indices of spectra to plot. If None, plots the first spectrum.

  • show_peaks (bool, default=True) – Whether to draw reference peak positions (vertical dashed lines). These are the calibration markers used to judge alignment quality and are on by default.

  • show_sample_peaks (bool, default=False) – If True, additionally draw the peak positions of each original and aligned spectrum as dashed vertical lines. Off by default because dense peak sets clutter the panel.

  • xlim (tuple of (float, float), optional) – X-axis limits for zooming into a specific m/z range.

  • figsize (tuple of (float, float), optional) – Figure size in inches. When None, defaults to (14, 3 * n_spectra).

  • alpha (float, default=0.7) – Transparency for spectrum lines.

  • color_reference (str, default="black") – Line colour for the reference spectrum.

  • color_original (str, default="red") – Line colour for the original (before-alignment) spectrum.

  • color_aligned (str, default="blue") – Line colour for the aligned (after-alignment) spectrum.

  • title (str, optional) – Figure-level title (suptitle). Defaults to f"Warping ({warper.method})".

  • show (bool, default=True) – Call plt.show() at the end.

Return type:

tuple[Figure, ndarray]

Returns:

  • fig (matplotlib.figure.Figure) – The generated figure.

  • axes (ndarray of matplotlib.axes.Axes) – 2-D array of shape (n_spectra, 2): column 0 = before, 1 = after.

Raises:
  • RuntimeError – If the transformer has not been fitted.

  • ValueError – If any index is out of bounds for the data.