topostats.io
============

.. py:module:: topostats.io

.. autoapi-nested-parse::

   Functions for reading and writing data.

   .. !! processed by numpydoc !!


Attributes
----------

.. autoapisummary::

   topostats.io.LOGGER
   topostats.io.CONFIG_DOCUMENTATION_REFERENCE


Classes
-------

.. autoapisummary::

   topostats.io.LoadScans


Functions
---------

.. autoapisummary::

   topostats.io.read_yaml
   topostats.io.get_date_time
   topostats.io.write_yaml
   topostats.io.write_config_with_comments
   topostats.io.save_array
   topostats.io.load_array
   topostats.io.path_to_str
   topostats.io.get_out_path
   topostats.io.find_files
   topostats.io.save_folder_grainstats
   topostats.io.read_null_terminated_string
   topostats.io.read_u32i
   topostats.io.read_64d
   topostats.io.read_char
   topostats.io.read_gwy_component_dtype
   topostats.io.get_relative_paths
   topostats.io.convert_basename_to_relative_paths
   topostats.io.dict_to_hdf5
   topostats.io.hdf5_to_dict
   topostats.io.save_topostats_file
   topostats.io.save_pkl
   topostats.io.load_pkl
   topostats.io.dict_to_json


Module Contents
---------------

.. py:data:: LOGGER

.. py:data:: CONFIG_DOCUMENTATION_REFERENCE
   :value: Multiline-String

   .. code-block:: python

      """# For more information on configuration and how to use it:
      # https://afm-spm.github.io/TopoStats/main/configuration.html
      """

.. py:function:: read_yaml(filename: str | pathlib.Path) -> dict

   Read a YAML file.

   :param filename: YAML file to read.
   :type filename: Union[str, Path]

   :returns: Dictionary of the file.
   :rtype: Dict

   .. !! processed by numpydoc !!

.. py:function:: get_date_time() -> str

   Get a date and time for adding to generated files or logging.

   :returns: A string of the current date and time, formatted appropriately.
   :rtype: str

   .. !! processed by numpydoc !!

.. py:function:: write_yaml(config: dict, output_dir: str | pathlib.Path, config_file: str = 'config.yaml', header_message: str = None) -> None

   Write a configuration (stored as a dictionary) to a YAML file.

   :param config: Configuration dictionary.
   :type config: dict
   :param output_dir: Directory in which to save the YAML file (named 'config.yaml' unless ``config_file`` is given).
   :type output_dir: Union[str, Path]
   :param config_file: Filename to write to.
   :type config_file: str
   :param header_message: String to write to the header message of the YAML file.
   :type header_message: str

   .. !! processed by numpydoc !!

.. py:function:: write_config_with_comments(args=None) -> None

   Write a sample configuration with in-line comments.

   This function is not designed to be used interactively, but it can be: call it without any arguments and it
   will write a configuration to './config.yaml'.

   :param args: A Namespace object parsed from argparse with values for 'filename'.
   :type args: Namespace

   .. !! processed by numpydoc !!

.. py:function:: save_array(array: numpy.typing.NDArray, outpath: pathlib.Path, filename: str, array_type: str) -> None

   Save a Numpy array to disk.

   :param array: Numpy array to be saved.
   :type array: npt.NDArray
   :param outpath: Location where the array should be saved.
   :type outpath: Path
   :param filename: Filename of the current image from which the array is derived.
   :type filename: str
   :param array_type: Short string describing the array type, e.g. z_threshold. Ideally it should not contain periods or spaces (use underscores '_' instead).
   :type array_type: str

   .. !! processed by numpydoc !!

.. py:function:: load_array(array_path: str | pathlib.Path) -> numpy.typing.NDArray

   Load a Numpy array from file.

   Should have been saved using save_array() or numpy.save().

   :param array_path: Path to the Numpy array on disk.
   :type array_path: Union[str, Path]

   :returns: The loaded Numpy array.
   :rtype: npt.NDArray

   .. !! processed by numpydoc !!

.. py:function:: path_to_str(config: dict) -> dict

   Recursively traverse a dictionary and convert any Path() objects to strings for writing to YAML.

   :param config: Dictionary to be converted.
   :type config: dict

   :returns: The same dictionary with any Path() objects converted to string.
   :rtype: Dict

   .. !! processed by numpydoc !!

.. py:function:: get_out_path(image_path: str | pathlib.Path = None, base_dir: str | pathlib.Path = None, output_dir: str | pathlib.Path = None) -> pathlib.Path

   Add the image path relative to the base directory to the output directory.

   :param image_path: The path of the current image.
   :type image_path: Path
   :param base_dir: Directory to recursively search for files.
   :type base_dir: Path
   :param output_dir: The output directory specified in the configuration file.
   :type output_dir: Path

   :returns: The output path that mirrors the input path structure.
   :rtype: Path

   .. !! processed by numpydoc !!

.. py:function:: find_files(base_dir: str | pathlib.Path = None, file_ext: str = '.spm') -> list

   Recursively scan the specified directory for images with the given file extension.

   :param base_dir: Directory to recursively search for files; if not specified, the current directory is scanned.
   :type base_dir: Union[str, Path]
   :param file_ext: File extension to search for.
   :type file_ext: str

   :returns: List of files found with the extension in the given directory.
   :rtype: List

   .. !! processed by numpydoc !!

..
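The two path helpers documented above, ``find_files`` and ``get_out_path``, are typically used together: files are discovered recursively, then each file's location relative to the base directory is mirrored under the output directory. The following is a minimal sketch of that idea using only ``pathlib``. The names ``find_files_sketch`` and ``mirror_out_path`` are hypothetical and not part of ``topostats.io``, and dropping the file suffix in the mirrored path is an assumption; the library's actual behaviour may differ.

.. code-block:: python

   import tempfile
   from pathlib import Path


   def find_files_sketch(base_dir: Path, file_ext: str = ".spm") -> list[Path]:
       """Recursively collect files under base_dir whose name ends with file_ext."""
       return sorted(p for p in base_dir.rglob(f"*{file_ext}") if p.is_file())


   def mirror_out_path(image_path: Path, base_dir: Path, output_dir: Path) -> Path:
       """Re-root image_path from base_dir onto output_dir (suffix dropped: an assumption)."""
       return output_dir / image_path.relative_to(base_dir).with_suffix("")


   # Build a small throwaway directory tree to search.
   with tempfile.TemporaryDirectory() as tmp:
       base = Path(tmp) / "data"
       (base / "sample_a").mkdir(parents=True)
       (base / "sample_a" / "scan1.spm").touch()
       (base / "scan2.spm").touch()
       (base / "notes.txt").touch()  # ignored: wrong extension

       found = find_files_sketch(base)
       relative = [p.relative_to(base).as_posix() for p in found]
       out_paths = [mirror_out_path(p, base, Path("output")).as_posix() for p in found]

   print(relative)
   print(out_paths)

Mirroring the input layout keeps results from images with the same file name in different sub-directories from overwriting one another.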
.. py:function:: save_folder_grainstats(output_dir: str | pathlib.Path, base_dir: str | pathlib.Path, all_stats_df: pandas.DataFrame) -> None

   Save a data frame of grain and tracing statistics at the folder level.

   :param output_dir: Path of the output directory head.
   :type output_dir: Union[str, Path]
   :param base_dir: Path of the base directory where files were found.
   :type base_dir: Union[str, Path]
   :param all_stats_df: The dataframe containing all sample statistics run.
   :type all_stats_df: pd.DataFrame

   :returns: This only saves the dataframes and does not retain them.
   :rtype: None

   .. !! processed by numpydoc !!

.. py:function:: read_null_terminated_string(open_file: io.TextIOWrapper, encoding: str = 'utf-8') -> str

   Read from the current position in an open binary file until the next null value.

   :param open_file: An open file object.
   :type open_file: io.TextIOWrapper
   :param encoding: Encoding to use when decoding the bytes.
   :type encoding: str

   :returns: String of the decoded bytes before the next null byte.
   :rtype: str

   .. rubric:: Examples

   >>> with open("test.txt", "rb") as f:
   ...     print(read_null_terminated_string(f, encoding="utf-8"))

   .. !! processed by numpydoc !!

.. py:function:: read_u32i(open_file: io.TextIOWrapper) -> str

   Read an unsigned 32 bit integer from an open binary file (in little-endian form).

   :param open_file: An open file object.
   :type open_file: io.TextIOWrapper

   :returns: Python integer type cast from the unsigned 32 bit integer.
   :rtype: int

   .. !! processed by numpydoc !!

.. py:function:: read_64d(open_file: io.TextIOWrapper) -> str

   Read a 64-bit double from an open binary file.

   :param open_file: An open file object.
   :type open_file: io.TextIOWrapper

   :returns: Python float type cast from the double.
   :rtype: float

   .. !! processed by numpydoc !!

.. py:function:: read_char(open_file: io.TextIOWrapper) -> str

   Read a character from an open binary file.

   :param open_file: An open file object.
   :type open_file: io.TextIOWrapper

   :returns: A string type cast from the decoded character.
   :rtype: str

   .. !! processed by numpydoc !!

.. py:function:: read_gwy_component_dtype(open_file: io.TextIOWrapper) -> str

   Read the data type of a `.gwy` file component.

   Possible data types are as follows:

   - 'b': boolean
   - 'c': character
   - 'i': 32-bit integer
   - 'q': 64-bit integer
   - 'd': double
   - 's': string
   - 'o': `.gwy` format object

   Capitalised versions of some of these data types represent arrays of values of that data type. Arrays are stored as
   an unsigned 32 bit integer, describing the size of the array, followed by the unseparated array values:

   - 'C': array of characters
   - 'I': array of 32-bit integers
   - 'Q': array of 64-bit integers
   - 'D': array of doubles
   - 'S': array of strings
   - 'O': array of objects.

   :param open_file: An open file object.
   :type open_file: io.TextIOWrapper

   :returns: Python string (one character long) of the data type of the component's value.
   :rtype: str

   .. !! processed by numpydoc !!

.. py:function:: get_relative_paths(paths: list[pathlib.Path]) -> list[str]

   Extract a list of relative paths, removing the common prefix.

   From a list of paths, create a list where each path is relative to the paths' closest common parent. For example,
   ['a/b/c', 'a/b/d', 'a/b/e/f'] would return ['c', 'd', 'e/f'].

   :param paths: List of string or pathlib paths.
   :type paths: list

   :returns: List of string paths, relative to the common parent.
   :rtype: list

   .. !! processed by numpydoc !!

.. py:function:: convert_basename_to_relative_paths(df: pandas.DataFrame)

   Convert paths in the 'basename' column of a dataframe to relative paths.

   If the 'basename' column has the following paths: ['/usr/topo/data/a/b', '/usr/topo/data/c/d'], the output will be:
   ['a/b', 'c/d'].

   :param df: A pandas dataframe containing a column 'basename' which contains the paths indicating the locations of the image data files.
   :type df: pd.DataFrame

   :returns: A pandas dataframe where the 'basename' column has paths relative to a common parent.
   :rtype: pd.DataFrame

   .. !! processed by numpydoc !!

.. py:class:: LoadScans(img_paths: list[str | pathlib.Path], channel: str)

   Load the image and image parameters from a file path.

   :param img_paths: Paths to valid AFM scans to load.
   :type img_paths: list[str, Path]
   :param channel: Image channel to extract from the scan.
   :type channel: str

   .. !! processed by numpydoc !!

   .. py:attribute:: img_paths

   .. py:attribute:: img_path
      :value: None

   .. py:attribute:: channel

   .. py:attribute:: channel_data
      :value: None

   .. py:attribute:: filename
      :value: None

   .. py:attribute:: image
      :value: None

   .. py:attribute:: pixel_to_nm_scaling
      :value: None

   .. py:attribute:: grain_masks

   .. py:attribute:: grain_trace_data

   .. py:attribute:: img_dict

   .. py:attribute:: MINIMUM_IMAGE_SIZE
      :value: 10

   .. py:method:: load_spm() -> tuple[numpy.typing.NDArray, float]

      Extract image and pixel to nm scaling from the Bruker .spm file.

      :returns: A tuple containing the image and its pixel to nanometre scaling value.
      :rtype: tuple[npt.NDArray, float]

      .. !! processed by numpydoc !!

   .. py:method:: _spm_pixel_to_nm_scaling(channel_data: pySPM.SPM.SPM_image) -> float

      Extract pixel to nm scaling from the SPM image metadata.

      :param channel_data: Channel data from PySPM.
      :type channel_data: pySPM.SPM.SPM_image

      :returns: Pixel to nm scaling factor.
      :rtype: float

      .. !! processed by numpydoc !!

   .. py:method:: load_topostats() -> tuple[numpy.typing.NDArray, float]

      Load a .topostats file (hdf5 format).

      Loads and extracts the image, pixel to nanometre scaling factor and any grain masks.

      Note that grain masks are stored via self.grain_masks rather than returned, because of how information is
      extracted for all other file loading functions.

      :returns: A tuple containing the image and its pixel to nanometre scaling value.
      :rtype: tuple[npt.NDArray, float]

      .. !! processed by numpydoc !!

   ..
   .. py:method:: load_asd() -> tuple[numpy.typing.NDArray, float]

      Extract image and pixel to nm scaling from .asd files.

      :returns: A tuple containing the image and its pixel to nanometre scaling value.
      :rtype: tuple[npt.NDArray, float]

      .. !! processed by numpydoc !!

   .. py:method:: load_ibw() -> tuple[numpy.typing.NDArray, float]

      Load image from Asylum Research (Igor) .ibw files.

      :returns: A tuple containing the image and its pixel to nanometre scaling value.
      :rtype: tuple[npt.NDArray, float]

      .. !! processed by numpydoc !!

   .. py:method:: _ibw_pixel_to_nm_scaling(scan: dict) -> float

      Extract pixel to nm scaling from the IBW image metadata.

      :param scan: The loaded binary wave object.
      :type scan: dict

      :returns: A value corresponding to the real length of a single pixel.
      :rtype: float

      .. !! processed by numpydoc !!

   .. py:method:: load_jpk() -> tuple[numpy.typing.NDArray, float]

      Load image from JPK Instruments .jpk files.

      :returns: A tuple containing the image and its pixel to nanometre scaling value.
      :rtype: tuple[npt.NDArray, float]

      .. !! processed by numpydoc !!

   .. py:method:: _jpk_pixel_to_nm_scaling(tiff_page: tifffile.tifffile.TiffPage) -> float
      :staticmethod:

      Extract pixel to nm scaling from the JPK image metadata.

      :param tiff_page: An image file directory (IFD) of .jpk files.
      :type tiff_page: tifffile.tifffile.TiffPage

      :returns: A value corresponding to the real length of a single pixel.
      :rtype: float

      .. !! processed by numpydoc !!

   .. py:method:: _gwy_read_object(open_file: io.TextIOWrapper, data_dict: dict) -> None
      :staticmethod:

      Parse and extract data from a `.gwy` file object, starting at the current open file read position.

      :param open_file: An open file object.
      :type open_file: io.TextIOWrapper
      :param data_dict: Dictionary of `.gwy` file image properties.
      :type data_dict: dict

      .. !! processed by numpydoc !!

   ..
   .. py:method:: _gwy_read_component(open_file: io.TextIOWrapper, initial_byte_pos: int, data_dict: dict) -> int
      :staticmethod:

      Parse and extract data from a `.gwy` file object, starting at the current open file read position.

      :param open_file: An open file object.
      :type open_file: io.TextIOWrapper
      :param initial_byte_pos: Initial position, as byte.
      :type initial_byte_pos: int
      :param data_dict: Dictionary of `.gwy` file image properties.
      :type data_dict: dict

      :returns: Size of the component in bytes.
      :rtype: int

      .. !! processed by numpydoc !!

   .. py:method:: _gwy_print_dict(gwy_file_dict: dict, pre_string: str) -> None
      :staticmethod:

      Recursively print nested dictionary.

      Can be used to find labels and values of objects / components in the `.gwy` file.

      :param gwy_file_dict: Dictionary of the nested object / component structure of a `.gwy` file.
      :type gwy_file_dict: dict
      :param pre_string: Prefix to use when printing string.
      :type pre_string: str

      .. !! processed by numpydoc !!

   .. py:method:: _gwy_print_dict_wrapper(gwy_file_dict: dict) -> None
      :staticmethod:

      Print dictionaries.

      This is a wrapper for the _gwy_print_dict() method.

      :param gwy_file_dict: Dictionary of the nested object / component structure of a `.gwy` file.
      :type gwy_file_dict: dict

      .. !! processed by numpydoc !!

   .. py:method:: _gwy_get_channels(gwy_file_structure: dict) -> dict
      :staticmethod:

      Extract a list of channels and their corresponding dictionary key ids from the `.gwy` file dictionary.

      :param gwy_file_structure: Dictionary of the nested object / component structure of a `.gwy` file. Where the keys are object names and the values are dictionaries of the object's components.
      :type gwy_file_structure: dict

      :returns: Dictionary where the keys are the channel names and the values are the dictionary key ids.
      :rtype: dict

      .. rubric:: Examples

      # Using a loaded dictionary generated from a `.gwy` file:
      LoadScans._gwy_get_channels(gwy_file_structure=loaded_gwy_file_dictionary)

      .. !! processed by numpydoc !!

   ..
   .. py:method:: load_gwy() -> tuple[numpy.typing.NDArray, float]

      Extract image and pixel to nm scaling from the Gwyddion .gwy file.

      :returns: A tuple containing the image and its pixel to nanometre scaling value.
      :rtype: tuple[npt.NDArray, float]

      .. !! processed by numpydoc !!

   .. py:method:: get_data() -> None

      Extract image, filepath and pixel to nm scaling value, and append these to the img_dict object.

      .. !! processed by numpydoc !!

   .. py:method:: _check_image_size_and_add_to_dict(image: numpy.typing.NDArray, filename: str) -> None

      Check the image is above a minimum size in both dimensions.

      Images that do not meet the minimum size are not included for processing.

      :param image: An array of the extracted AFM image.
      :type image: npt.NDArray
      :param filename: The name of the file.
      :type filename: str

      .. !! processed by numpydoc !!

   .. py:method:: add_to_dict(image: numpy.typing.NDArray, filename: str) -> None

      Add an image and metadata to the img_dict dictionary under the key filename.

      Adds the image and associated metadata, such as any grain masks and the pixel to nanometre scaling factor, to the
      img_dict dictionary, which is used as a place to store the image information for processing.

      :param image: An array of the extracted AFM image.
      :type image: npt.NDArray
      :param filename: The name of the file.
      :type filename: str

      .. !! processed by numpydoc !!

.. py:function:: dict_to_hdf5(open_hdf5_file: h5py.File, group_path: str, dictionary: dict) -> None

   Recursively save a dictionary to an open hdf5 file.

   :param open_hdf5_file: An open hdf5 file object.
   :type open_hdf5_file: h5py.File
   :param group_path: The path to the group in the hdf5 file to start saving data from.
   :type group_path: str
   :param dictionary: A dictionary of the data to save.
   :type dictionary: dict

   .. !! processed by numpydoc !!

.. py:function:: hdf5_to_dict(open_hdf5_file: h5py.File, group_path: str) -> dict

   Read a dictionary from an open hdf5 file.

   :param open_hdf5_file: An open hdf5 file object.
   :type open_hdf5_file: h5py.File
   :param group_path: The path to the group in the hdf5 file to start reading data from.
   :type group_path: str

   :returns: A dictionary of the hdf5 file data.
   :rtype: dict

   .. !! processed by numpydoc !!

.. py:function:: save_topostats_file(output_dir: pathlib.Path, filename: str, topostats_object: dict) -> None

   Save a topostats dictionary object to a .topostats (hdf5 format) file.

   :param output_dir: Directory to save the .topostats file in.
   :type output_dir: Path
   :param filename: File name of the .topostats file.
   :type filename: str
   :param topostats_object: Dictionary of the topostats data to save. Must include a flattened image and pixel to nanometre scaling factor. May also include grain masks.
   :type topostats_object: dict

   .. !! processed by numpydoc !!

.. py:function:: save_pkl(outfile: pathlib.Path, to_pkl: dict) -> None

   Pickle objects for working with later.

   :param outfile: Path and filename to save pickle to.
   :type outfile: Path
   :param to_pkl: Object to be pickled.
   :type to_pkl: dict

   .. !! processed by numpydoc !!

.. py:function:: load_pkl(infile: pathlib.Path) -> Any

   Load data from a pickle.

   :param infile: Path to a valid pickle.
   :type infile: Path

   :returns: Dictionary of generated images.
   :rtype: dict

   .. rubric:: Examples

   from pathlib import Path

   from topostats.io import load_pkl

   pkl_path = "output/distribution_plots.pkl"
   my_plots = load_pkl(pkl_path)

   # Show the type of my_plots, which is a dictionary of nested dictionaries
   type(my_plots)

   # Show the keys at various levels of nesting.
   my_plots.keys()
   my_plots["area"].keys()
   my_plots["area"]["dist"].keys()

   # Get the figure and axis object for a given metric's distribution plot
   figure, axis = my_plots["area"]["dist"].values()

   # Get the figure and axis object for a given metric's violin plot
   figure, axis = my_plots["area"]["violin"].values()

   .. !! processed by numpydoc !!

..
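The pickle helpers above form a save and load pair. As a rough illustration of the round trip, here is a minimal sketch using the standard library ``pickle`` module. The functions ``save_pkl_sketch`` and ``load_pkl_sketch`` and the placeholder strings standing in for figure and axis objects are hypothetical, so this should not be read as the implementation of ``save_pkl`` or ``load_pkl``.

.. code-block:: python

   import pickle
   import tempfile
   from pathlib import Path


   def save_pkl_sketch(outfile: Path, to_pkl: dict) -> None:
       """Serialise an object to outfile in binary mode."""
       with outfile.open("wb") as f:
           pickle.dump(to_pkl, f, protocol=pickle.HIGHEST_PROTOCOL)


   def load_pkl_sketch(infile: Path) -> dict:
       """Load a pickled object; only unpickle files from a trusted source."""
       with infile.open("rb") as f:
           return pickle.load(f)


   # Placeholder strings stand in for matplotlib figure and axis objects.
   plots = {"area": {"dist": {"figure": "fig-placeholder", "axis": "ax-placeholder"}}}

   with tempfile.TemporaryDirectory() as tmp:
       pkl_path = Path(tmp) / "distribution_plots.pkl"
       save_pkl_sketch(pkl_path, plots)
       restored = load_pkl_sketch(pkl_path)

   # Dictionaries preserve insertion order, so unpacking .values() is well defined.
   figure, axis = restored["area"]["dist"].values()

Unpickling can execute arbitrary code, so a pickle should only be loaded if it comes from a trusted source.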
.. py:function:: dict_to_json(data: dict, output_dir: str | pathlib.Path, filename: str | pathlib.Path, indent: int = 4) -> None

   Write a dictionary to a JSON file at the specified location with the given name.

   NB: The `NumpyEncoder` class is used as the default encoder to ensure Numpy dtypes are written as strings (they are
   not serialisable to JSON using the default JSONEncoder).

   :param data: Data as a dictionary that is to be written to file.
   :type data: dict
   :param output_dir: Directory the file is to be written to.
   :type output_dir: str | Path
   :param filename: Name of output file.
   :type filename: str | Path
   :param indent: Spaces to indent JSON with, default is 4.
   :type indent: int

   .. !! processed by numpydoc !!
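The note above explains that a custom encoder is needed because some values cannot be serialised by the default ``JSONEncoder``. The sketch below illustrates the general mechanism, which is to override ``json.JSONEncoder.default``. To stay dependency-free it converts ``pathlib.Path`` objects and sets rather than Numpy types. ``FallbackEncoder`` and ``dict_to_json_sketch`` are hypothetical names, and this is a stand-in for the library's ``NumpyEncoder``, not a copy of it.

.. code-block:: python

   import json
   import tempfile
   from pathlib import Path


   class FallbackEncoder(json.JSONEncoder):
       """Hypothetical stand-in for a custom encoder such as NumpyEncoder."""

       def default(self, o):
           # Called only for objects the base encoder cannot serialise.
           if isinstance(o, Path):
               return o.as_posix()
           if isinstance(o, (set, frozenset)):
               return sorted(o)
           return super().default(o)  # raises TypeError for anything else


   def dict_to_json_sketch(data: dict, output_dir: Path, filename: str, indent: int = 4) -> Path:
       """Write data to output_dir / filename as indented JSON and return the path."""
       output_file = Path(output_dir) / filename
       with output_file.open("w", encoding="utf-8") as f:
           json.dump(data, f, indent=indent, cls=FallbackEncoder)
       return output_file


   data = {"image": Path("data/scan1.spm"), "channels": {"Height"}, "threshold": 1.5}

   with tempfile.TemporaryDirectory() as tmp:
       written = dict_to_json_sketch(data, Path(tmp), "summary.json")
       reloaded = json.loads(written.read_text(encoding="utf-8"))

   print(reloaded)

Passing the encoder through the ``cls`` argument keeps the conversion logic in one place, so callers can hand over dictionaries without pre-processing their values.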