topostats.processing

Functions for processing data.

Attributes

Functions

run_filters(→ numpy.typing.NDArray | None)

Filter and flatten an image. Optionally plots the results, returning the flattened image.

run_grains(→ dict | None)

Identify grains (molecules) and optionally plot the results.

run_grainstats(image, pixel_to_nm_scaling, ...)

Calculate grain statistics for an image and optionally plot the results.

run_disordered_tracing(→ dict)

Skeletonise and prune grains, add results to statistics data frames and optionally plot results.

run_nodestats(→ tuple[dict, pandas.DataFrame])

Analyse crossing points in grains, add results to statistics data frames and optionally plot results.

run_ordered_tracing(→ tuple)

Order coordinates of traces, add results to statistics data frames and optionally plot results.

run_splining(→ tuple)

Smooth the ordered trace coordinates, add results to statistics data frames and optionally plot results.

run_curvature_stats(→ dict | None)

Calculate curvature statistics for the traced DNA molecules.

get_out_paths(image_path, base_dir, output_dir, ...)

Determine components of output paths for a given image and plotting config.

process_scan(→ tuple[dict, pandas.DataFrame, dict])

Process a single image, filtering, finding grains and calculating their statistics.

process_filters(→ tuple[str, bool])

Filter an image, return the flattened image and save it to .topostats.

process_grains(→ tuple[str, bool])

Detect grains in an image, return the flattened image and save it to .topostats.

check_run_steps(→ None)

Check options for running steps (Filter, Grain, Grainstats and DNA tracing) are logically consistent.

completion_message(→ None)

Print a completion message summarising images processed.

Module Contents

topostats.processing.LOGGER
topostats.processing.run_filters(unprocessed_image: numpy.typing.NDArray, pixel_to_nm_scaling: float, filename: str, filter_out_path: pathlib.Path, core_out_path: pathlib.Path, filter_config: dict, plotting_config: dict) → numpy.typing.NDArray | None

Filter and flatten an image. Optionally plots the results, returning the flattened image.

Parameters:
  • unprocessed_image (npt.NDArray) – Image to be flattened.

  • pixel_to_nm_scaling (float) – Scaling factor for converting pixel length scales to nanometres, i.e. the number of pixels per nanometre.

  • filename (str) – File name for the image.

  • filter_out_path (Path) – Output directory for step-by-step flattening plots.

  • core_out_path (Path) – General output directory for outputs such as the flattened image.

  • filter_config (dict) – Dictionary of configuration for the Filters class to use when initialised.

  • plotting_config (dict) – Dictionary of configuration for plotting output images.

Returns:

Either a numpy array of the flattened image, or None if an error occurs or flattening is disabled in the configuration.

Return type:

npt.NDArray | None
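
Because the return value may be None, callers need a guard before handing the result to a later stage. The following is a minimal sketch of that pattern only: `flatten_or_none` is a hypothetical stand-in rather than the TopoStats function, the `"run"` configuration key is an assumption, and nested lists stand in for NumPy arrays.

```python
from typing import Optional

Image = list[list[float]]  # stand-in for numpy.typing.NDArray


def flatten_or_none(image: Image, filter_config: dict) -> Optional[Image]:
    """Hypothetical stand-in that mimics the `NDArray | None` contract."""
    # Assumption: a "run" key switches the stage off.
    if not filter_config.get("run", True):
        return None
    # Toy "flattening": subtract each row's mean (not the TopoStats algorithm).
    return [[value - sum(row) / len(row) for value in row] for row in image]


raw = [[1.0, 3.0], [2.0, 6.0]]
for config in ({"run": True}, {"run": False}):
    flattened = flatten_or_none(raw, config)
    if flattened is None:
        # Nothing to pass on, so later stages are skipped for this image.
        print("No flattened image returned; skipping later stages.")
    else:
        print("Flattened image:", flattened)
```

Checking explicitly for None, rather than relying on truthiness, matters with real arrays because the truth value of a multi-element NumPy array is ambiguous.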

topostats.processing.run_grains(image: numpy.typing.NDArray, pixel_to_nm_scaling: float, filename: str, grain_out_path: pathlib.Path, core_out_path: pathlib.Path, plotting_config: dict, grains_config: dict) → dict | None

Identify grains (molecules) and optionally plot the results.

Parameters:
  • image (npt.NDArray) – 2d numpy array image to find grains in.

  • pixel_to_nm_scaling (float) – Scaling factor for converting pixel length scales to nanometres, i.e. the number of pixels per nanometre.

  • filename (str) – Name of file being processed (used in logging).

  • grain_out_path (Path) – Output path for step-by-step grain finding plots.

  • core_out_path (Path) – General output directory for outputs such as the flattened image with grain masks overlaid.

  • plotting_config (dict) – Dictionary of configuration for plotting images.

  • grains_config (dict) – Dictionary of configuration for the Grains class to use when initialised.

Returns:

Either None, if an error occurs or grain finding is disabled, or a dictionary with keys “above” and/or “below” containing binary masks depicting where grains have been detected.

Return type:

dict | None
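
The returned dictionary can be consumed by looping over whichever of the two documented keys are present. A minimal sketch, assuming each mask is a 2D array of 0/1 values; nested lists are used here in place of NumPy arrays, and `count_masked_pixels` is a hypothetical helper, not part of TopoStats.

```python
def count_masked_pixels(grain_masks):
    """Count the masked pixels for each direction present in the dictionary."""
    counts = {}
    if grain_masks is None:
        # Grain finding failed or was disabled, so there is nothing to count.
        return counts
    for direction in ("above", "below"):
        mask = grain_masks.get(direction)
        if mask is None:
            continue  # this direction was not requested
        counts[direction] = sum(1 for row in mask for pixel in row if pixel)
    return counts


example_masks = {
    "above": [
        [0, 1, 1],
        [0, 0, 1],
        [0, 0, 0],
    ]
}
print(count_masked_pixels(example_masks))
print(count_masked_pixels(None))
```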

topostats.processing.run_grainstats(image: numpy.typing.NDArray, pixel_to_nm_scaling: float, grain_masks: dict, filename: str, basename: pathlib.Path, grainstats_config: dict, plotting_config: dict, grain_out_path: pathlib.Path)

Calculate grain statistics for an image and optionally plot the results.

Parameters:
  • image (npt.NDArray) – 2D numpy array image for grain statistics calculations.

  • pixel_to_nm_scaling (float) – Scaling factor for converting pixel length scales to nanometres, i.e. the number of pixels per nanometre.

  • grain_masks (dict) – Dictionary of grain masks, keys “above” or “below” with values of 2d numpy boolean arrays indicating the pixels that have been masked as grains.

  • filename (str) – Name of the image.

  • basename (Path) – Path to directory containing the image.

  • grainstats_config (dict) – Dictionary of configuration for the GrainStats class to be used when initialised.

  • plotting_config (dict) – Dictionary of configuration for plotting images.

  • grain_out_path (Path) – Directory to save optional grain statistics visual information to.

Returns:

A pandas DataFrame containing the statistics for each grain. The index is the filename and grain number.

Return type:

pd.DataFrame

topostats.processing.run_disordered_tracing(image: numpy.typing.NDArray, grain_masks: dict, pixel_to_nm_scaling: float, filename: str, basename: str, core_out_path: pathlib.Path, tracing_out_path: pathlib.Path, disordered_tracing_config: dict, plotting_config: dict, grainstats_df: pandas.DataFrame = None) → dict

Skeletonise and prune grains, add results to statistics data frames and optionally plot results.

Parameters:
  • image (npt.NDArray) – Image containing the grains to pass to the tracing function.

  • grain_masks (dict) – Dictionary of grain masks, keys “above” or “below” with values of 2D Numpy boolean arrays indicating the pixels that have been masked as grains.

  • pixel_to_nm_scaling (float) – Scaling factor for converting pixel length scales to nanometres, i.e. the number of pixels per nanometre (nm).

  • filename (str) – Name of the image.

  • basename (Path) – Path to directory containing the image.

  • core_out_path (Path) – Path to save the core disordered trace image to.

  • tracing_out_path (Path) – Path to save the optional, diagnostic disordered trace images to.

  • disordered_tracing_config (dict) – Dictionary configuration for obtaining a disordered trace representation of the grains.

  • plotting_config (dict) – Dictionary configuration for plotting images.

  • grainstats_df (pd.DataFrame | None) – The grain statistics dataframe to be added to. This optional argument defaults to None in which case an empty grainstats dataframe is created.

Returns:

Dictionary of “grain_<index>” keys and Nx2 coordinate arrays of the disordered grain trace.

Return type:

dict
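
The documented return shape, “grain_<index>” keys mapping to Nx2 coordinate arrays, can be walked in numeric grain order as sketched below. This is an illustration under stated assumptions: `summarise_traces` is a hypothetical helper, and lists of (row, column) pairs stand in for the NumPy arrays.

```python
def summarise_traces(traces):
    """Return (grain index, number of trace points) pairs in numeric order."""

    def grain_index(key):
        # "grain_12" -> 12; sorting on the integer avoids "grain_10" < "grain_2".
        return int(key.split("_", 1)[1])

    summary = []
    for key in sorted(traces, key=grain_index):
        coords = traces[key]
        # Every entry should be a pair of coordinates (the "x2" in Nx2).
        if any(len(point) != 2 for point in coords):
            raise ValueError(f"{key} does not contain Nx2 coordinates")
        summary.append((grain_index(key), len(coords)))
    return summary


example_traces = {
    "grain_10": [(0, 0), (0, 1)],
    "grain_2": [(5, 5), (5, 6), (6, 6)],
}
print(summarise_traces(example_traces))
```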

topostats.processing.run_nodestats(image: numpy.typing.NDArray, disordered_tracing_data: dict, pixel_to_nm_scaling: float, filename: str, core_out_path: pathlib.Path, tracing_out_path: pathlib.Path, nodestats_config: dict, plotting_config: dict, grainstats_df: pandas.DataFrame = None) → tuple[dict, pandas.DataFrame]

Analyse crossing points in grains, add results to statistics data frames and optionally plot results.

Parameters:
  • image (npt.NDArray) – Image containing the DNA to pass to the tracing function.

  • disordered_tracing_data (dict) – Dictionary of skeletonised and pruned grain masks. Result from “run_disordered_tracing”.

  • pixel_to_nm_scaling (float) – Scaling factor for converting pixel length scales to nanometres, i.e. the number of pixels per nanometre (nm).

  • filename (str) – Name of the image.

  • core_out_path (Path) – Path to save the core NodeStats image to.

  • tracing_out_path (Path) – Path to save optional, diagnostic NodeStats images to.

  • nodestats_config (dict) – Dictionary configuration for analysing the crossing points.

  • plotting_config (dict) – Dictionary configuration for plotting images.

  • grainstats_df (pd.DataFrame | None) – The grain statistics dataframe to be added to. This optional argument defaults to None in which case an empty grainstats dataframe is created.

Returns:

A NodeStats analysis dictionary and grainstats metrics dataframe.

Return type:

tuple[dict, pd.DataFrame]

topostats.processing.run_ordered_tracing(image: numpy.typing.NDArray, disordered_tracing_data: dict, nodestats_data: dict, filename: str, basename: pathlib.Path, core_out_path: pathlib.Path, tracing_out_path: pathlib.Path, ordered_tracing_config: dict, plotting_config: dict, grainstats_df: pandas.DataFrame = None) → tuple

Order coordinates of traces, add results to statistics data frames and optionally plot results.

Parameters:
  • image (npt.NDArray) – Image containing the DNA to pass to the tracing function.

  • disordered_tracing_data (dict) – Dictionary of skeletonised and pruned grain masks. Result from “run_disordered_tracing”.

  • nodestats_data (dict) – Dictionary of images and statistics from the NodeStats analysis. Result from “run_nodestats”.

  • filename (str) – Name of the image.

  • basename (Path) – The path of the file’s parent directory.

  • core_out_path (Path) – Path to save the core ordered tracing image to.

  • tracing_out_path (Path) – Path to save optional, diagnostic ordered trace images to.

  • ordered_tracing_config (dict) – Dictionary configuration for obtaining an ordered trace representation of the skeletons.

  • plotting_config (dict) – Dictionary configuration for plotting images.

  • grainstats_df (pd.DataFrame | None) – The grain statistics dataframe to be added to. This optional argument defaults to None in which case an empty grainstats dataframe is created.

Returns:

A NodeStats analysis dictionary and grainstats metrics dataframe.

Return type:

tuple[dict, pd.DataFrame]

topostats.processing.run_splining(image: numpy.typing.NDArray, ordered_tracing_data: dict, pixel_to_nm_scaling: float, filename: str, core_out_path: pathlib.Path, splining_config: dict, plotting_config: dict, grainstats_df: pandas.DataFrame = None, molstats_df: pandas.DataFrame = None) → tuple

Smooth the ordered trace coordinates, add results to statistics data frames and optionally plot results.

Parameters:
  • image (npt.NDArray) – Image containing the DNA to pass to the tracing function.

  • ordered_tracing_data (dict) – Dictionary of ordered coordinates. Result from “run_ordered_tracing”.

  • pixel_to_nm_scaling (float) – Scaling factor for converting pixel length scales to nanometres, i.e. the number of pixels per nanometre (nm).

  • filename (str) – Name of the image.

  • core_out_path (Path) – Path to save the core splining image to.

  • splining_config (dict) – Dictionary of configuration options for running the splining stage.

  • plotting_config (dict) – Dictionary configuration for plotting images.

  • grainstats_df (pd.DataFrame | None) – The grain statistics dataframe to be added to. This optional argument defaults to None in which case an empty grainstats dataframe is created.

  • molstats_df (pd.DataFrame | None) – The molecule statistics dataframe to be added to. This optional argument defaults to None in which case an empty molstats dataframe is created.

Returns:

A smooth curve analysis dictionary and grainstats metrics dataframe.

Return type:

tuple[dict, pd.DataFrame]

topostats.processing.run_curvature_stats(image: numpy.ndarray, cropped_image_data: dict, grain_trace_data: dict, pixel_to_nm_scaling: float, filename: str, core_out_path: pathlib.Path, tracing_out_path: pathlib.Path, curvature_config: dict, plotting_config: dict) → dict | None

Calculate curvature statistics for the traced DNA molecules.

Currently only works on simple traces, not branched traces.

Parameters:
  • image (np.ndarray) – AFM image, for plotting purposes.

  • cropped_image_data (dict) – Dictionary containing cropped images.

  • grain_trace_data (dict) – Dictionary of grain trace data.

  • pixel_to_nm_scaling (float) – Scaling factor for converting pixel length scales to nanometres, i.e. the number of pixels per nanometre.

  • filename (str) – Name of the image.

  • core_out_path (Path) – Path to save the core curvature image to.

  • tracing_out_path (Path) – Path to save the optional, diagnostic curvature images to.

  • curvature_config (dict) – Dictionary of configuration for running the curvature stats.

  • plotting_config (dict) – Dictionary of configuration for plotting images.

Returns:

Dictionary containing curvature statistics.

Return type:

dict | None

topostats.processing.get_out_paths(image_path: pathlib.Path, base_dir: pathlib.Path, output_dir: pathlib.Path, filename: str, plotting_config: dict)

Determine components of output paths for a given image and plotting config.

Parameters:
  • image_path (Path) – Path of the image being processed.

  • base_dir (Path) – Path of the data folder.

  • output_dir (Path) – Base output directory for output data.

  • filename (str) – Name of the image being processed.

  • plotting_config (dict) – Dictionary of configuration for plotting images.

Returns:

Core output path for general file outputs, filter output path for flattening related files and grain output path for grain finding related files.

Return type:

tuple
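
One plausible way to derive three such paths is to mirror the image's location relative to the data folder underneath the output directory. The sketch below is an assumption-laden illustration, not the TopoStats implementation: `sketch_out_paths` is a hypothetical function, the directory names "processed", "filters" and "grains" are invented for the example, and the plotting configuration is ignored.

```python
from pathlib import PurePosixPath


def sketch_out_paths(image_path, base_dir, output_dir, filename):
    """Return hypothetical (core, filter, grain) output paths for one image."""
    # Where the image sits below the data folder, e.g. "day1" for /data/day1/scan.spm.
    relative_parent = image_path.relative_to(base_dir).parent
    # Assumed layout: general outputs go in a "processed" folder per source directory.
    core_out_path = output_dir / relative_parent / "processed"
    # Assumed layout: step-by-step plots are grouped per image and per stage.
    filter_out_path = core_out_path / filename / "filters"
    grain_out_path = core_out_path / filename / "grains"
    return core_out_path, filter_out_path, grain_out_path


core, filters, grains = sketch_out_paths(
    image_path=PurePosixPath("/data/day1/scan.spm"),
    base_dir=PurePosixPath("/data"),
    output_dir=PurePosixPath("output"),
    filename="scan",
)
print(core)
print(filters)
print(grains)
```

`PurePosixPath` is used so the example behaves identically on every platform; real code would normally work with `pathlib.Path`.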

topostats.processing.process_scan(topostats_object: dict, base_dir: str | pathlib.Path, filter_config: dict, grains_config: dict, grainstats_config: dict, disordered_tracing_config: dict, nodestats_config: dict, ordered_tracing_config: dict, splining_config: dict, curvature_config: dict, plotting_config: dict, output_dir: str | pathlib.Path = 'output') → tuple[dict, pandas.DataFrame, dict]

Process a single image, filtering, finding grains and calculating their statistics.

Parameters:
  • topostats_object (dict[str, Union[npt.NDArray, Path, float]]) – A dictionary with keys ‘image’, ‘img_path’ and ‘pixel_to_nm_scaling’ containing a file or frame’s image, its path and its pixel to nanometre scaling value.

  • base_dir (str | Path) – Directory to recursively search for files. If not specified, the current directory is scanned.

  • filter_config (dict) – Dictionary of configuration options for running the Filter stage.

  • grains_config (dict) – Dictionary of configuration options for running the Grain detection stage.

  • grainstats_config (dict) – Dictionary of configuration options for running the Grain Statistics stage.

  • disordered_tracing_config (dict) – Dictionary configuration for obtaining a disordered trace representation of the grains.

  • nodestats_config (dict) – Dictionary of configuration options for running the NodeStats stage.

  • ordered_tracing_config (dict) – Dictionary configuration for obtaining an ordered trace representation of the skeletons.

  • splining_config (dict) – Dictionary of configuration options for running the splining stage.

  • curvature_config (dict) – Dictionary of configuration options for running the curvature stats stage.

  • plotting_config (dict) – Dictionary of configuration options for plotting figures.

  • output_dir (str | Path) – Directory to save output to. It will be created if it does not exist; if it already exists, existing output may be overwritten.

Returns:

TopoStats dictionary object, DataFrame containing grain statistics and DNA tracing statistics, and dictionary containing general image statistics.

Return type:

tuple[dict, pd.DataFrame, dict]
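
Since `topostats_object` is a plain dictionary, it can be worth confirming that the three documented keys are present before starting a long processing run. A minimal sketch; `missing_keys` is a hypothetical helper, and the nested list and path are placeholder values.

```python
from pathlib import Path

# The three keys named in the parameter description above.
REQUIRED_KEYS = ("image", "img_path", "pixel_to_nm_scaling")


def missing_keys(topostats_object):
    """Return the documented keys that are absent from the dictionary."""
    return [key for key in REQUIRED_KEYS if key not in topostats_object]


candidate = {
    "image": [[0.0, 0.1], [0.2, 0.3]],  # stand-in for a 2D NumPy array
    "img_path": Path("data/scan.spm"),  # placeholder path
}
problems = missing_keys(candidate)
if problems:
    print("Cannot process; missing keys:", ", ".join(problems))
```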

topostats.processing.process_filters(topostats_object: dict, base_dir: str | pathlib.Path, filter_config: dict, plotting_config: dict, output_dir: str | pathlib.Path = 'output') → tuple[str, bool]

Filter an image, return the flattened image and save it to .topostats.

Runs just the first key step of flattening images to remove noise, tilt and optionally scars, saving to .topostats for subsequent processing and analyses.

Parameters:
  • topostats_object (dict[str, Union[npt.NDArray, Path, float]]) – A dictionary with keys ‘image’, ‘img_path’ and ‘pixel_to_nm_scaling’ containing a file or frame’s image, its path and its pixel to nanometre scaling value.

  • base_dir (str | Path) – Directory to recursively search for files. If not specified, the current directory is scanned.

  • filter_config (dict) – Dictionary of configuration options for running the Filter stage.

  • plotting_config (dict) – Dictionary of configuration options for plotting figures.

  • output_dir (str | Path) – Directory to save output to. It will be created if it does not exist; if it already exists, existing output may be overwritten.

Returns:

A tuple of the image name and a boolean indicating if the image was successfully processed.

Return type:

tuple[str, bool]
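
When a function with this `tuple[str, bool]` return type is run over many images, the results can be split into successes and failures. The sketch below assumes the string identifies the image; `split_results` is a hypothetical helper and the result list is made-up example data.

```python
def split_results(results):
    """Separate (name, success) tuples into succeeded and failed name lists."""
    succeeded = [name for name, ok in results if ok]
    failed = [name for name, ok in results if not ok]
    return succeeded, failed


# Made-up results in the documented (str, bool) shape.
example_results = [("scan_a", True), ("scan_b", False), ("scan_c", True)]
succeeded, failed = split_results(example_results)
print(f"{len(succeeded)} of {len(example_results)} images processed successfully")
if failed:
    print("Failed:", ", ".join(failed))
```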

topostats.processing.process_grains(topostats_object: dict, base_dir: str | pathlib.Path, grains_config: dict, plotting_config: dict, output_dir: str | pathlib.Path = 'output') → tuple[str, bool]

Detect grains in an image, return the flattened image and save it to .topostats.

Runs just the grain detection step on an image, saving the results to .topostats for subsequent processing and analyses.

Parameters:
  • topostats_object (dict[str, Union[npt.NDArray, Path, float]]) – A dictionary with keys ‘image’, ‘img_path’ and ‘pixel_to_nm_scaling’ containing a file or frame’s image, its path and its pixel to nanometre scaling value.

  • base_dir (str | Path) – Directory to recursively search for files. If not specified, the current directory is scanned.

  • grains_config (dict) – Dictionary of configuration options for running the Grain detection stage.

  • plotting_config (dict) – Dictionary of configuration options for plotting figures.

  • output_dir (str | Path) – Directory to save output to. It will be created if it does not exist; if it already exists, existing output may be overwritten.

Returns:

A tuple of the image name and a boolean indicating if the image was successfully processed.

Return type:

tuple[str, bool]

topostats.processing.check_run_steps(filter_run: bool, grains_run: bool, grainstats_run: bool, disordered_tracing_run: bool, nodestats_run: bool, ordered_tracing_run: bool, splining_run: bool) → None

Check options for running steps (Filter, Grain, Grainstats and DNA tracing) are logically consistent.

This checks that earlier steps required are enabled.

Parameters:
  • filter_run (bool) – Flag for running Filtering.

  • grains_run (bool) – Flag for running Grains.

  • grainstats_run (bool) – Flag for running GrainStats.

  • disordered_tracing_run (bool) – Flag for running Disordered Tracing.

  • nodestats_run (bool) – Flag for running NodeStats.

  • ordered_tracing_run (bool) – Flag for running Ordered Tracing.

  • splining_run (bool) – Flag for running DNA Tracing.
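
The kind of check described here can be sketched as a walk along the processing order, flagging any enabled step that has a disabled predecessor. This is an illustration, not the library's code: it assumes a strictly linear dependency chain in the order of the parameters, and `find_inconsistencies` and the step names are hypothetical.

```python
# Assumed processing order, taken from the order of the parameters above.
STEP_ORDER = (
    "filter",
    "grains",
    "grainstats",
    "disordered_tracing",
    "nodestats",
    "ordered_tracing",
    "splining",
)


def find_inconsistencies(run_flags):
    """Return a message for each enabled step whose earlier steps are disabled."""
    problems = []
    for position, step in enumerate(STEP_ORDER):
        if not run_flags.get(step, False):
            continue  # disabled steps cannot be inconsistent
        disabled_earlier = [
            earlier for earlier in STEP_ORDER[:position] if not run_flags.get(earlier, False)
        ]
        if disabled_earlier:
            problems.append(f"{step} is enabled but requires: {', '.join(disabled_earlier)}")
    return problems


flags = {"filter": False, "grains": True}
for message in find_inconsistencies(flags):
    print(message)
```

Returning messages instead of raising keeps the sketch easy to test; the documented function returns None, so how it reports problems is not shown here.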

topostats.processing.completion_message(config: dict, img_files: list, summary_config: dict, images_processed: int) → None

Print a completion message summarising images processed.

Parameters:
  • config (dict) – Configuration dictionary.

  • img_files (list) – List of found image paths.

  • summary_config (dict) – Configuration for plotting summary statistics.

  • images_processed (int) – Number of images processed.