topostats.plotting#

Plotting and summary of TopoStats output statistics.

Attributes#

Classes#

TopoSum

Class for summarising grain statistics in plots.

Functions#

toposum(→ dict)

Process plotting and summarisation of data.

run_toposum(→ None)

Run Plotting.

plot_height_profiles(→ tuple)

Plot height profiles.

_pad_array(→ numpy.typing.NDArray)

Pad array so that it matches the largest profile and plots are somewhat aligned.

Module Contents#

topostats.plotting.LOGGER#
class topostats.plotting.TopoSum(df: pandas.DataFrame = None, base_dir: str | pathlib.Path = None, csv_file: str | pathlib.Path = None, stat_to_sum: str = None, molecule_id: str = 'molecule_number', image_id: str = 'image', hist: bool = True, stat: str = 'count', bins: int = 12, kde: bool = True, cut: float = 20, figsize: tuple = (16, 9), alpha: float = 0.5, palette: str = 'deep', savefig_format: str = 'png', output_dir: str | pathlib.Path = '.', var_to_label: dict = None, hue: str = 'basename')#

Class for summarising grain statistics in plots.

Parameters:
  • df (pd.DataFrame) – Pandas data frame of data to be summarised.

  • base_dir (str | Path) – Base directory to which all paths are relative.

  • csv_file (str | Path) – CSV file of data to be summarised.

  • stat_to_sum (str) – Variable to summarise.

  • molecule_id (str) – Variable that uniquely identifies molecules.

  • image_id (str) – Variable that uniquely identifies images.

  • hist (bool) – Whether to plot histograms.

  • stat (str) – Statistic to plot on the histogram: ‘count’ (default), ‘freq’.

  • bins (int) – Number of bins to plot.

  • kde (bool) – Whether to include a Kernel Density Estimate.

  • cut (float) – Cut point for KDE.

  • figsize (tuple) – Figure dimensions.

  • alpha (float) – Opacity to use in plots.

  • palette (str) – Seaborn colour palette to use.

  • savefig_format (str) – File type to save plots as: ‘png’ (default), ‘pdf’, ‘svg’.

  • output_dir (str | Path) – Location to save plots to.

  • var_to_label (dict) – Variable to label dictionary for automatically adding titles to plots.

  • hue (str) – Dataframe column to group plots by.

df#
base_dir#
stat_to_sum#
molecule_id#
image_id#
hist#
bins#
stat#
kde#
cut#
figsize#
alpha#
palette#
savefig_format#
output_dir#
var_to_label#
hue#
melted_data = None#
summary_data = None#
label = None#
_setup_figure()#

Set up Matplotlib figure and axes.

Returns:

Matplotlib fig and ax objects.

Return type:

fig, ax

_outfile(plot_suffix: str) str#

Generate the output file name with the appropriate suffix.

Parameters:

plot_suffix (str) – The suffix to append to the output file.

Returns:

Concatenated string of the outfile and plot_suffix.

Return type:

str

sns_plot() tuple[matplotlib.pyplot.Figure, matplotlib.pyplot.Axes] | None#

Plot the distribution of one or more statistics as either histogram, kernel density estimates or both.

Uses base Seaborn.

Returns:

Tuple of Matplotlib figure and axes if plotting is successful, None otherwise.

Return type:

Optional[Union[Tuple[plt.Figure, plt.Axes], None]]

sns_violinplot() None#

Violin plot of data.

Returns:

Matplotlib fig and ax objects.

Return type:

fig, ax

static melt_data(df: pandas.DataFrame, stat_to_summarize: str, var_to_label: dict) pandas.DataFrame#

Melt a dataframe into long format for plotting with Seaborn.

Parameters:
  • df (pd.DataFrame) – Statistics to melt.

  • stat_to_summarize (str) – Statistics to summarise.

  • var_to_label (dict) – Mapping of variable names to descriptions.

Returns:

Data in long-format with descriptive variable names.

Return type:

pd.DataFrame
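
The reshaping described above can be illustrated with plain pandas. This is a minimal sketch, not the TopoStats implementation: the helper `melt_for_plotting`, the `id_vars` argument and the column names in the example frame are hypothetical and chosen only for illustration.

```python
import pandas as pd

# Hypothetical wide-format grain statistics; column names are illustrative only.
wide = pd.DataFrame(
    {
        "basename": ["sample_a", "sample_a", "sample_b"],
        "area": [10.0, 12.5, 9.0],
        "height_max": [2.1, 2.4, 1.9],
    }
)
labels = {"area": "Area", "height_max": "Maximum height"}


def melt_for_plotting(df, stat, var_to_label, id_vars=("basename",)):
    """Reshape one statistic into long format and attach a descriptive label."""
    # One row per observation, with 'variable' and 'value' columns.
    long_df = pd.melt(df, id_vars=list(id_vars), value_vars=[stat])
    # Replace the raw column name with its human-readable description.
    long_df["variable"] = long_df["variable"].map(var_to_label)
    return long_df


long_area = melt_for_plotting(wide, "area", labels)
print(long_area)
```

Long-format data of this shape is what Seaborn's distribution functions expect when grouping by a column such as `hue`.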

set_xlim(percent: float = 0.1) None#

Set the range of the x-axis.

Parameters:

percent (float) – Percentage of the observed range by which to extend the x-axis. Only used if supplied range is outside the observed values.
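
One way to read the `percent` parameter is as a fraction of the observed range added on each side of the data. The sketch below assumes that interpretation (so the default of 0.1 would mean 10%); the function `extended_limits` is hypothetical and does not reproduce the condition under which TopoStats applies the extension.

```python
def extended_limits(values, percent=0.1):
    """Return (lower, upper) limits widened by a fraction of the observed range."""
    lowest, highest = min(values), max(values)
    # Margin is proportional to the spread of the data, not to its magnitude.
    margin = (highest - lowest) * percent
    return lowest - margin, highest + margin


limits = extended_limits([2.0, 4.0, 12.0], percent=0.1)
print(limits)
```

Limits computed this way could then be passed to Matplotlib's `Axes.set_xlim`.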

set_palette()#

Set the color palette.

save_plot(outfile: pathlib.Path) None#

Save the plot to the output_dir.

Parameters:

outfile (Path) – Output file name to save figure to.

_set_label(var: str)#

Get the label based on the column name(s).

Parameters:

var (str) – The variable for which a label is required.

topostats.plotting.toposum(config: dict) dict#

Process plotting and summarisation of data.

Parameters:

config (dict) – Dictionary of summarisation options.

Returns:

Dictionary of nested dictionaries. Each variable has its own dictionary with keys ‘dist’ and ‘violin’, which contain distribution-like plots and violin plots respectively (if the latter are required). Each ‘dist’ and ‘violin’ is itself a dictionary with two elements, ‘figures’ and ‘axes’, which correspond to the Matplotlib ‘fig’ and ‘ax’ for that plot.

Return type:

dict
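
The nested return structure described above can be walked as follows. This is a sketch under stated assumptions: the variable names (`area`, `height_max`) are hypothetical, placeholder strings stand in for the Matplotlib objects so the example runs without plotting, and `iter_plots` is an illustrative helper rather than part of TopoStats.

```python
# Stand-in for the dictionary returned by toposum(); strings replace fig/ax objects.
figures = {
    "area": {
        "dist": {"figures": "<fig area dist>", "axes": "<ax area dist>"},
        "violin": {"figures": "<fig area violin>", "axes": "<ax area violin>"},
    },
    "height_max": {
        # Violin plots are optional, so a variable may only have 'dist'.
        "dist": {"figures": "<fig height dist>", "axes": "<ax height dist>"},
    },
}


def iter_plots(results):
    """Yield (variable, kind, fig, ax) for every plot that is present."""
    for variable, plots in results.items():
        for kind in ("dist", "violin"):
            if kind in plots:
                yield variable, kind, plots[kind]["figures"], plots[kind]["axes"]


for variable, kind, fig, ax in iter_plots(figures):
    print(variable, kind, fig, ax)
```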

topostats.plotting.run_toposum(args=None) None#

Run Plotting.

Parameters:

args (None) – Arguments to pass and update configuration.

topostats.plotting.plot_height_profiles(height_profiles: list | numpy.typing.NDArray) tuple#

Plot height profiles.

Parameters:

height_profiles (npt.NDArray) – Single height profile (1-D numpy array of heights) or array of height profiles. If the latter, the profiles will be overlaid on one plot.

Returns:

Matplotlib.pyplot figure object and Matplotlib.pyplot axes object.

Return type:

tuple
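
Overlaying one or several profiles on a single set of axes might look like the sketch below. It is not the TopoStats implementation: `overlay_profiles` is a hypothetical helper, the axis labels are assumptions, and the non-interactive `Agg` backend is selected only so the example runs without a display.

```python
import matplotlib

matplotlib.use("Agg")  # headless backend; an assumption made for this example
import matplotlib.pyplot as plt
import numpy as np


def overlay_profiles(height_profiles):
    """Plot a single 1-D profile, or overlay a sequence of 1-D profiles."""
    # Wrap a lone 1-D array so both cases share the same loop.
    if isinstance(height_profiles, np.ndarray) and height_profiles.ndim == 1:
        height_profiles = [height_profiles]
    fig, ax = plt.subplots()
    for profile in height_profiles:
        ax.plot(np.asarray(profile))
    ax.set_xlabel("Position along profile")  # illustrative label
    ax.set_ylabel("Height")  # illustrative label
    return fig, ax


profiles = [np.array([0.0, 1.0, 0.5]), np.array([0.2, 1.4, 1.1, 0.3])]
fig, ax = overlay_profiles(profiles)
print(len(ax.lines))
```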

topostats.plotting._pad_array(profile: numpy.typing.NDArray, max_array_length: int) numpy.typing.NDArray#

Pad array so that it matches the largest profile and plots are somewhat aligned.

Centering is based on the mid-point of the longest grain, and heights of zero (‘0.0’) are used as padding.

Parameters:
  • profile (npt.NDArray) – 1-D Height profile.

  • max_array_length (int) – The longest height profile across a range of detected grains.

Returns:

Array padded to the same length as max_array_length.

Return type:

npt.NDArray
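
Centred zero-padding of this kind can be sketched with `numpy.pad`. The function `pad_profile` below is a hypothetical stand-in, and how it splits an odd amount of padding between the two ends is an assumption rather than documented TopoStats behaviour.

```python
import numpy as np


def pad_profile(profile, max_array_length):
    """Zero-pad a 1-D profile on both sides up to max_array_length."""
    total = max_array_length - len(profile)
    if total < 0:
        raise ValueError("profile is longer than max_array_length")
    # Split the padding as evenly as possible; any odd element goes on the right.
    left = total // 2
    right = total - left
    return np.pad(profile, (left, right), mode="constant", constant_values=0.0)


padded = pad_profile(np.array([1.0, 2.0, 3.0]), 7)
print(padded)
```

Padding every profile to the length of the longest one lets the profiles be stacked into a single 2-D array and plotted roughly aligned about their centres.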