topostats.tracing.dnatracing
============================

.. py:module:: topostats.tracing.dnatracing

.. autoapi-nested-parse::

   Perform DNA Tracing

   .. !! processed by numpydoc !!


Attributes
----------

.. autoapisummary::

   topostats.tracing.dnatracing.LOGGER


Classes
-------

.. autoapisummary::

   topostats.tracing.dnatracing.dnaTrace


Functions
---------

.. autoapisummary::

   topostats.tracing.dnatracing.trace_image
   topostats.tracing.dnatracing.round_splined_traces
   topostats.tracing.dnatracing.trim_array
   topostats.tracing.dnatracing.adjust_coordinates
   topostats.tracing.dnatracing.trace_mask
   topostats.tracing.dnatracing.prep_arrays
   topostats.tracing.dnatracing.grain_anchor
   topostats.tracing.dnatracing.trace_grain
   topostats.tracing.dnatracing.crop_array
   topostats.tracing.dnatracing.pad_bounding_box


Module Contents
---------------

.. py:data:: LOGGER

.. py:class:: dnaTrace(image: numpy.ndarray, grain: numpy.ndarray, filename: str, pixel_to_nm_scaling: float, min_skeleton_size: int = 10, convert_nm_to_m: bool = True, skeletonisation_method: str = 'topostats', n_grain: int = None, spline_step_size: float = 7e-09, spline_linear_smoothing: float = 5.0, spline_circular_smoothing: float = 0.0, spline_quiet: bool = True, spline_degree: int = 3)

   This class gets all the useful functions from the old tracing code and staples them together to create an object
   that contains the traces for each DNA molecule in an image and functions to calculate stats from those traces.

   The traces are stored in dictionaries labelled by their Gwyddion-defined grain number and are represented as
   numpy arrays.

   The object also keeps track of the skeletonised plots and other intermediates in case these are useful for other
   things in the future.

   2023-06-09 : This class has undergone some refactoring so that it works with a single grain. The
   ``trace_grain()`` helper function runs the class and returns the expected statistics, whilst the
   ``trace_image()`` function handles processing all detected grains within an image. The original methods of
   skeletonisation are available along with additional methods from scikit-image.

   Some bugs have been identified and corrected; see commits 236750b2 and 2a79c4ff for further details.

   .. !! processed by numpydoc !!

   .. py:attribute:: image

   .. py:attribute:: grain

   .. py:attribute:: filename

   .. py:attribute:: pixel_to_nm_scaling

   .. py:attribute:: min_skeleton_size

   .. py:attribute:: skeletonisation_method

   .. py:attribute:: n_grain

   .. py:attribute:: number_of_rows

   .. py:attribute:: number_of_columns

   .. py:attribute:: sigma

   .. py:attribute:: gauss_image
      :value: None

   .. py:attribute:: disordered_trace
      :value: None

   .. py:attribute:: ordered_trace
      :value: None

   .. py:attribute:: fitted_trace
      :value: None

   .. py:attribute:: splined_trace
      :value: None

   .. py:attribute:: contour_length

   .. py:attribute:: end_to_end_distance

   .. py:attribute:: mol_is_circular

   .. py:attribute:: curvature

   .. py:attribute:: spline_step_size
      :type: float

   .. py:attribute:: spline_linear_smoothing
      :type: float

   .. py:attribute:: spline_circular_smoothing
      :type: float

   .. py:attribute:: spline_quiet
      :type: bool

   .. py:attribute:: spline_degree
      :type: int

   .. py:attribute:: neighbours
      :value: 5

   .. py:method:: trace_dna()

      Perform DNA tracing.

      .. !! processed by numpydoc !!

   .. py:method:: gaussian_filter(**kwargs) -> numpy.array

      Apply Gaussian filter.

      .. !! processed by numpydoc !!

   .. py:method:: get_disordered_trace()

      Create a skeleton for each of the grains in the image.

      Uses my own skeletonisation function from the tracingfuncs module. I will eventually get round to editing this
      function to try to reduce the branching and to better trace looped molecules.

      .. !! processed by numpydoc !!

   .. py:method:: linear_or_circular(traces)

      Determines whether each molecule is circular or linear based on the local environment of each pixel from the
      trace.

      This function is sensitive to branches from the skeleton, so a function to remove them might need to be
      implemented.

      .. !! processed by numpydoc !!

   .. py:method:: get_ordered_traces()

   .. py:method:: get_fitted_traces()

      Create trace coordinates (for each identified molecule) that are adjusted to lie along the highest points of
      each traced molecule.

      .. !! processed by numpydoc !!

   .. py:method:: remove_duplicate_consecutive_tuples(tuple_list: list[Union[tuple, numpy.ndarray]]) -> list[tuple]
      :staticmethod:

      Remove duplicate consecutive tuples from a list.

      E.g. for the list of tuples [(1, 2), (1, 2), (1, 2), (2, 3), (2, 3), (3, 4)], this function will return
      [(1, 2), (2, 3), (3, 4)].

      :param tuple_list: List of tuples or numpy ndarrays to remove consecutive duplicates from.
      :type tuple_list: list[Union[tuple, np.ndarray]]

      :returns: List of tuples with consecutive duplicates removed.
      :rtype: list[tuple]

      .. !! processed by numpydoc !!

   .. py:method:: get_splined_traces() -> None

      Gets a splined version of the fitted trace, useful for finding the radius of gyration etc.

      This function actually calculates the average of several splines, which is important for getting a good fit on
      the lower resolution data.

      .. !! processed by numpydoc !!

   .. py:method:: show_traces()

   .. py:method:: saveTraceFigures(filename: Union[str, pathlib.Path], channel_name: str, vmaxval, vminval, output_dir: Union[str, pathlib.Path] = None)

   .. py:method:: _checkForSaveDirectory(filename, new_output_dir)

   .. py:method:: find_curvature()

   .. py:method:: saveCurvature()

   .. py:method:: plotCurvature(dna_num)

      Plot the curvature of the chosen molecule as a function of the contour length (in metres).

      .. !! processed by numpydoc !!

   .. py:method:: measure_contour_length()

      Measures the contour length for each of the splined traces, taking into account whether the molecule is
      circular or linear.

      Contour length units are nm.

      .. !! processed by numpydoc !!

   .. py:method:: measure_end_to_end_distance()

      Calculate the Euclidean distance between the start and end of linear molecules.

      The hypotenuse is calculated between the start ([0,0], [0,1]) and end ([-1,0], [-1,1]) of linear molecules. If
      the molecule is circular then the distance is set to zero (0).

      .. !! processed by numpydoc !!

.. py:function:: trace_image(image: numpy.ndarray, grains_mask: numpy.ndarray, filename: str, pixel_to_nm_scaling: float, min_skeleton_size: int, skeletonisation_method: str, spline_step_size: float = 7e-09, spline_linear_smoothing: float = 5.0, spline_circular_smoothing: float = 0.0, pad_width: int = 1, cores: int = 1) -> Dict

   Processor function for tracing image.

   :param image: Full image as Numpy Array.
   :type image: np.ndarray
   :param grains_mask: Full image with grains labelled.
   :type grains_mask: np.ndarray
   :param filename: File being processed.
   :type filename: str
   :param pixel_to_nm_scaling: Pixel to nm scaling.
   :type pixel_to_nm_scaling: float
   :param min_skeleton_size: Minimum size of grain in pixels after skeletonisation.
   :type min_skeleton_size: int
   :param skeletonisation_method: Method of skeletonisation, options are 'zhang' (scikit-image) / 'lee' (scikit-image) / 'thin' (scikit-image) or 'topostats' (original TopoStats method).
   :type skeletonisation_method: str
   :param spline_step_size: Step size for spline evaluation in metres.
   :type spline_step_size: float
   :param spline_circular_smoothing: Smoothness of circular splines.
   :type spline_circular_smoothing: float
   :param spline_linear_smoothing: Smoothness of linear splines.
   :type spline_linear_smoothing: float
   :param pad_width: Number of cells to pad arrays by, required to handle instances where grains touch the bounding box edges.
   :type pad_width: int
   :param cores: Number of cores to process with.
   :type cores: int

   :returns: Statistics from skeletonising and tracing the grains in the image.
   :rtype: pd.DataFrame

   .. !! processed by numpydoc !!

.. py:function:: round_splined_traces(splined_traces: list)

   Round a list of floating point coordinates to integer coordinates.

   Note that if a trace has failed and is None, it will be skipped, so the indexes of the returned list will NOT
   match those of the input list.

   :param splined_traces: List of floating point coordinates, or Nones.
   :type splined_traces: list

   :returns: **rounded_splined_traces** -- List of integer coordinates, without Nones.
   :rtype: list

   .. !! processed by numpydoc !!

.. py:function:: trim_array(array: numpy.ndarray, pad_width: int) -> numpy.ndarray

   Trim an array by the specified pad_width.

   Removes a border from an array. Typically this is the second padding that is added to the image/masks for edge
   cases that are near image borders, and removing it means traces will be correctly aligned as a mask for the
   original image.

   :param array: Numpy array to be trimmed.
   :type array: np.ndarray
   :param pad_width: Padding to be removed.
   :type pad_width: int

   :returns: Trimmed array.
   :rtype: np.ndarray

   .. !! processed by numpydoc !!

.. py:function:: adjust_coordinates(coordinates: numpy.ndarray, pad_width: int) -> numpy.ndarray

   Adjust coordinates of a trace by the pad_width.

   A second padding is made to allow for grains that are "edge cases" and close to the bounding box edge. This adds
   the pad_width to the cropped grain array. In order to realign the trace with the original image we need to remove
   this padding so that when the coordinates are combined with the "grain_anchor", which isn't padded twice, the
   coordinates correctly align with the original image.

   :param coordinates: An array of trace coordinates (typically ordered).
   :type coordinates: np.ndarray
   :param pad_width: The amount of padding used.
   :type pad_width: int

   :returns: Array of trace coordinates adjusted for secondary padding.
   :rtype: np.ndarray

   .. !! processed by numpydoc !!

.. py:function:: trace_mask(grain_anchors: List[numpy.ndarray], ordered_traces: List[numpy.ndarray], image_shape: tuple, pad_width: int) -> numpy.ndarray

   Place the traced skeletons into an array of the original image for plotting/overlaying.

   Adjusts the coordinates back to the original position based on each grain's anchor coordinates of the padded
   bounding box. Adjustments are made for the secondary padding that is made.

   :param grain_anchors: List of grain anchors for the padded bounding box.
   :type grain_anchors: List[np.ndarray]
   :param ordered_traces: List of coordinates for each grain's trace.
   :type ordered_traces: List[np.ndarray]
   :param image_shape: Shape of original image.
   :type image_shape: tuple
   :param pad_width: The amount of padding used on the image.
   :type pad_width: int

   :returns: Mask of traces for all grains that can be overlaid on original image.
   :rtype: np.ndarray

   .. !! processed by numpydoc !!

.. py:function:: prep_arrays(image: numpy.ndarray, labelled_grains_mask: numpy.ndarray, pad_width: int) -> Tuple[list, list]

   Takes an image and labelled mask and crops individual grains and original heights to a list.

   A second padding is made after cropping to ensure that "edge cases", where grains are close to bounding box
   edges, are traced correctly. This is accounted for when aligning traces to the whole image mask.

   :param image: Gaussian filtered image. Typically filtered_image.images["gaussian_filtered"].
   :type image: np.ndarray
   :param labelled_grains_mask: 2D Numpy array of labelled grain masks, with each mask being comprised solely of a unique integer (not zero). Typically this will be the ``labelled_region_02`` output from ``grains.directions``.
   :type labelled_grains_mask: np.ndarray
   :param pad_width: Cells by which to pad cropped regions.
   :type pad_width: int

   :returns: Returns a tuple of two lists, each consisting of cropped arrays.
   :rtype: Tuple

   .. !! processed by numpydoc !!

.. py:function:: grain_anchor(array_shape: tuple, bounding_box: list, pad_width: int) -> list

   Extract the anchor (min_row, min_col) from all labelled regions, which is used to align individual traces over
   the original image.

   :param array_shape: Shape of original array.
   :type array_shape: tuple
   :param bounding_box: A list of region properties returned by skimage.measure.regionprops().
   :type bounding_box: list
   :param pad_width: Padding for image.
   :type pad_width: int

   :returns: A list of tuples of the min_row, min_col of each bounding box.
   :rtype: List[Tuple]

   .. !! processed by numpydoc !!

.. py:function:: trace_grain(cropped_image: numpy.ndarray, cropped_mask: numpy.ndarray, pixel_to_nm_scaling: float, filename: str = None, min_skeleton_size: int = 10, skeletonisation_method: str = 'topostats', spline_step_size: float = 7e-09, spline_linear_smoothing: float = 5.0, spline_circular_smoothing: float = 0.0, n_grain: int = None) -> Dict

   Trace an individual grain.

   Tracing involves multiple steps:

   1. Skeletonisation.
   2. Pruning of side branch artefacts from skeletonisation.
   3. Ordering of the skeleton.
   4. Determination of molecule shape.
   5. Jiggling/Fitting.
   6. Splining to improve resolution of image.

   :param cropped_image: Cropped array from the original image defined as the bounding box from the labelled mask.
   :type cropped_image: np.ndarray
   :param cropped_mask: Cropped array from the labelled image defined as the bounding box from the labelled mask. This should have been converted to a binary mask.
   :type cropped_mask: np.ndarray
   :param filename: File being processed.
   :type filename: str
   :param pixel_to_nm_scaling: Pixel to nm scaling.
   :type pixel_to_nm_scaling: float
   :param min_skeleton_size: Minimum size of grain in pixels after skeletonisation.
   :type min_skeleton_size: int
   :param skeletonisation_method: Method of skeletonisation, options are 'zhang' (scikit-image) / 'lee' (scikit-image) / 'thin' (scikit-image) or 'topostats' (original TopoStats method).
   :type skeletonisation_method: str
   :param spline_step_size: Step size for spline evaluation in metres.
   :type spline_step_size: float
   :param spline_circular_smoothing: Smoothness of circular splines.
   :type spline_circular_smoothing: float
   :param spline_linear_smoothing: Smoothness of linear splines.
   :type spline_linear_smoothing: float
   :param n_grain: Grain number being processed.
   :type n_grain: int

   :returns: Dictionary of the contour length, whether the molecule is circular or linear, the end-to-end distance and an array of coordinates.
   :rtype: Dict

   .. !! processed by numpydoc !!

.. py:function:: crop_array(array: numpy.ndarray, bounding_box: tuple, pad_width: int = 0) -> numpy.ndarray

   Crop an array.

   Ideally we pad the array that is being cropped so that we have heights outside of the grain's bounding box.
   However, in some cases, if a grain is near the edge of the image scan, this results in requesting indexes outside
   of the existing image, in which case we get as much of the image padded as possible.

   :param array: 2D Numpy array to be cropped.
   :type array: np.ndarray
   :param bounding_box: Tuple of coordinates to crop, should be of form (min_row, min_col, max_row, max_col).
   :type bounding_box: Tuple
   :param pad_width: Padding to apply to bounding box.
   :type pad_width: int

   :returns: Cropped array.
   :rtype: np.ndarray

   .. !! processed by numpydoc !!

.. py:function:: pad_bounding_box(array_shape: tuple, bounding_box: list, pad_width: int) -> list

   Pad coordinates; if they extend beyond the image boundaries, stop at the boundary.

   :param array_shape: Shape of original image.
   :type array_shape: tuple
   :param bounding_box: List of coordinates min_row, min_col, max_row, max_col.
   :type bounding_box: list
   :param pad_width: Cells to pad arrays by.
   :type pad_width: int

   :returns: List of padded coordinates.
   :rtype: list

   .. !! processed by numpydoc !!
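
The clamping behaviour described for ``pad_bounding_box()`` and ``crop_array()`` can be illustrated with a minimal sketch. The functions below (``pad_bounding_box_sketch`` and ``crop_array_sketch``) are hypothetical stand-ins written for this example, not the TopoStats implementations. They assume a half-open bounding box of the form (min_row, min_col, max_row, max_col), which is the convention used by ``skimage.measure.regionprops()``.

.. code-block:: python

   import numpy as np


   def pad_bounding_box_sketch(array_shape, bounding_box, pad_width):
       """Hypothetical helper: grow a bounding box by pad_width, stopping at the array edges."""
       min_row, min_col, max_row, max_col = bounding_box
       return [
           max(min_row - pad_width, 0),               # never go above the first row
           max(min_col - pad_width, 0),               # never go left of the first column
           min(max_row + pad_width, array_shape[0]),  # never go past the last row
           min(max_col + pad_width, array_shape[1]),  # never go past the last column
       ]


   def crop_array_sketch(array, bounding_box, pad_width=0):
       """Hypothetical helper: crop a 2D array to the (clamped) padded bounding box."""
       min_row, min_col, max_row, max_col = pad_bounding_box_sketch(
           array.shape, bounding_box, pad_width
       )
       return array[min_row:max_row, min_col:max_col]


   image = np.arange(36).reshape(6, 6)

   # A grain in the top-left corner: padding is only possible on two sides.
   corner_box = pad_bounding_box_sketch(image.shape, [0, 0, 3, 3], pad_width=2)
   corner_crop = crop_array_sketch(image, (0, 0, 3, 3), pad_width=2)

   # A grain away from the edges: padding is applied on all four sides.
   interior_box = pad_bounding_box_sketch(image.shape, [2, 2, 4, 4], pad_width=1)

Under these assumptions a grain touching the image edge receives as much padding as the image allows, so the cropped array may be smaller than the bounding box plus twice the ``pad_width``.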