topostats.io
============

.. py:module:: topostats.io

.. autoapi-nested-parse::

   Functions for reading and writing data.

   .. !! processed by numpydoc !!


Attributes
----------

.. autoapisummary::

   topostats.io.LOGGER
   topostats.io.CONFIG_DOCUMENTATION_REFERENCE


Classes
-------

.. autoapisummary::

   topostats.io.LoadScans


Functions
---------

.. autoapisummary::

   topostats.io.read_yaml
   topostats.io.get_date_time
   topostats.io.write_yaml
   topostats.io.write_config_with_comments
   topostats.io.save_array
   topostats.io.load_array
   topostats.io.path_to_str
   topostats.io.get_out_path
   topostats.io.find_files
   topostats.io.save_folder_grainstats
   topostats.io.read_null_terminated_string
   topostats.io.read_u32i
   topostats.io.read_64d
   topostats.io.read_char
   topostats.io.read_gwy_component_dtype
   topostats.io.get_relative_paths
   topostats.io.convert_basename_to_relative_paths
   topostats.io.save_topostats_file
   topostats.io.save_pkl
   topostats.io.load_pkl


Module Contents
---------------

.. py:data:: LOGGER

.. py:data:: CONFIG_DOCUMENTATION_REFERENCE
   :value: Multiline-String

   .. code-block:: python

      """For more information on configuration and how to use it:
      # https://afm-spm.github.io/TopoStats/main/configuration.html
      """

.. py:function:: read_yaml(filename: Union[str, pathlib.Path]) -> Dict

   Read a YAML file.

   :param filename: YAML file to read.
   :type filename: Union[str, Path]

   :returns: Dictionary of the file.
   :rtype: Dict

   .. !! processed by numpydoc !!

.. py:function:: get_date_time() -> str

   Get a date and time for adding to generated files or logging.

   :returns: A string of the current date and time, formatted appropriately.
   :rtype: str

   .. !! processed by numpydoc !!

.. py:function:: write_yaml(config: dict, output_dir: Union[str, pathlib.Path], config_file: str = 'config.yaml', header_message: str = None) -> None

   Write a configuration (stored as a dictionary) to a YAML file.

   :param config: Configuration dictionary.
   :type config: dict
   :param output_dir: Path to save the dictionary to as a YAML file (named 'config.yaml' by default).
   :type output_dir: Union[str, Path]
   :param config_file: Filename to write to.
   :type config_file: str
   :param header_message: String to write to the header message of the YAML file.
   :type header_message: str

   .. !! processed by numpydoc !!

.. py:function:: write_config_with_comments(config: str, output_dir: pathlib.Path, filename: str = 'config.yaml') -> None

   Create a config file, retaining the comments by writing it as a string rather than using a YAML handling package.

   :param config: A string of the entire configuration file to be saved.
   :type config: str
   :param output_dir: A pathlib path of where to create the config file.
   :type output_dir: Path
   :param filename: A name for the configuration file. Can have a ".yaml" on the end.
   :type filename: str

   .. !! processed by numpydoc !!

.. py:function:: save_array(array: numpy.ndarray, outpath: pathlib.Path, filename: str, array_type: str) -> None

   Save a Numpy array to disk.

   :param array: Numpy array to be saved.
   :type array: np.ndarray
   :param outpath: Location the array should be saved to.
   :type outpath: Path
   :param filename: Filename of the current image from which the array is derived.
   :type filename: str
   :param array_type: Short string describing the array type e.g. z_threshold. Ideally should not have periods or
                      spaces in (use underscores '_' instead).
   :type array_type: str

   .. !! processed by numpydoc !!

.. py:function:: load_array(array_path: Union[str, pathlib.Path]) -> numpy.ndarray

   Load a Numpy array from file.

   Should have been saved using save_array() or numpy.save().

   :param array_path: Path to the Numpy array on disk.
   :type array_path: Union[str, Path]

   :returns: Returns the loaded Numpy array.
   :rtype: np.ndarray

   .. !! processed by numpydoc !!

.. py:function:: path_to_str(config: dict) -> Dict

   Recursively traverse a dictionary and convert any Path() objects to strings for writing to YAML.

   :param config: Dictionary to be converted.
   :type config: dict

   :returns: The same dictionary with any Path() objects converted to string.
   :rtype: Dict

   .. !! processed by numpydoc !!

.. py:function:: get_out_path(image_path: Union[str, pathlib.Path] = None, base_dir: Union[str, pathlib.Path] = None, output_dir: Union[str, pathlib.Path] = None) -> pathlib.Path

   Adds the image path relative to the base directory to the output directory.

   :param image_path: The path of the current image.
   :type image_path: Path
   :param base_dir: Directory to recursively search for files.
   :type base_dir: Path
   :param output_dir: The output directory specified in the configuration file.
   :type output_dir: Path

   :returns: The output path that mirrors the input path structure.
   :rtype: Path

   .. !! processed by numpydoc !!

.. py:function:: find_files(base_dir: Union[str, pathlib.Path] = None, file_ext: str = '.spm') -> List

   Recursively scan the specified directory for images with the given file extension.

   :param base_dir: Directory to recursively search for files, if not specified the current directory is scanned.
   :type base_dir: Union[str, Path]
   :param file_ext: File extension to search for.
   :type file_ext: str

   :returns: List of files found with the extension in the given directory.
   :rtype: List

   .. !! processed by numpydoc !!

.. py:function:: save_folder_grainstats(output_dir: Union[str, pathlib.Path], base_dir: Union[str, pathlib.Path], all_stats_df: pandas.DataFrame) -> None

   Saves a data frame of grain and tracing statistics at the folder level.

   :param output_dir: Path of the output directory head.
   :type output_dir: Union[str, Path]
   :param base_dir: Path of the base directory where files were found.
   :type base_dir: Union[str, Path]
   :param all_stats_df: The dataframe containing all sample statistics run.
   :type all_stats_df: pd.DataFrame

   :returns: This only saves the dataframes and does not retain them.
   :rtype: None

   .. !! processed by numpydoc !!

.. py:function:: read_null_terminated_string(open_file: io.TextIOWrapper) -> str

   Read an open file from the current position in the open binary file, until the next null value.

   :param open_file: An open file object.
   :type open_file: io.TextIOWrapper

   :returns: String of the ASCII decoded bytes before the next null byte.
   :rtype: str

   .. !! processed by numpydoc !!

.. py:function:: read_u32i(open_file: io.TextIOWrapper) -> str

   Read an unsigned 32 bit integer from an open binary file (in little-endian form).

   :param open_file: An open file object.
   :type open_file: io.TextIOWrapper

   :returns: Python integer type cast from the unsigned 32 bit integer.
   :rtype: int

   .. !! processed by numpydoc !!

.. py:function:: read_64d(open_file: io.TextIOWrapper) -> str

   Read a 64-bit double from an open binary file.

   :param open_file: An open file object.

   :returns: Python float type cast from the double.
   :rtype: float

   .. !! processed by numpydoc !!

.. py:function:: read_char(open_file: io.TextIOWrapper) -> str

   Read a character from an open binary file.

   :param open_file: An open file object.
   :type open_file: io.TextIOWrapper

   :returns: A string type cast from the decoded character.
   :rtype: str

   .. !! processed by numpydoc !!

.. py:function:: read_gwy_component_dtype(open_file: io.TextIOWrapper) -> str

   Read the data type of a `.gwy` file component.

   Possible data types are as follows:

   - 'b': boolean
   - 'c': character
   - 'i': 32-bit integer
   - 'q': 64-bit integer
   - 'd': double
   - 's': string
   - 'o': `.gwy` format object

   Capitalised versions of some of these data types represent arrays of values of that data type. Arrays are stored as
   an unsigned 32 bit integer, describing the size of the array, followed by the unseparated array values.

   - 'C': array of characters
   - 'I': array of 32-bit integers
   - 'Q': array of 64-bit integers
   - 'D': array of doubles
   - 'S': array of strings
   - 'O': array of objects

   :param open_file: An open file object.
   :type open_file: io.TextIOWrapper

   :returns: Python string (one character long) of the data type of the component's value.
   :rtype: str

   .. !! processed by numpydoc !!

.. py:function:: get_relative_paths(paths: List[pathlib.Path]) -> List[str]

   From a list of paths, create a list of these paths where each path is relative to the paths' closest common
   parent.

   For example, ['a/b/c', 'a/b/d', 'a/b/e/f'] would return ['c', 'd', 'e/f']

   :param paths: List of string or pathlib paths.
   :type paths: list

   :returns: **relative_paths** -- List of string paths, relative to the common parent.
   :rtype: list

   .. !! processed by numpydoc !!

.. py:function:: convert_basename_to_relative_paths(df: pandas.DataFrame)

   Converts the paths in the 'basename' column of a dataframe from absolute paths to paths relative to the deepest
   common parent.

   For example if the 'basename' column has the following paths: ['/usr/topo/data/a/b', '/usr/topo/data/c/d'], the
   output will be: ['a/b', 'c/d'].

   :param df: A pandas dataframe containing a column 'basename' which contains the paths indicating the locations of
              the image data files.
   :type df: pd.DataFrame

   :returns: **df** -- A pandas dataframe where the 'basename' column has paths relative to a common parent.
   :rtype: pd.DataFrame

   .. !! processed by numpydoc !!

.. py:class:: LoadScans(img_paths: list, channel: str)

   Load the image and image parameters from a file path.

   .. !! processed by numpydoc !!

   .. py:attribute:: img_paths

   .. py:attribute:: img_path
      :value: None

   .. py:attribute:: channel

   .. py:attribute:: channel_data
      :value: None

   .. py:attribute:: filename
      :value: None

   .. py:attribute:: image
      :value: None

   .. py:attribute:: pixel_to_nm_scaling
      :value: None

   .. py:attribute:: grain_masks

   .. py:attribute:: img_dict

   .. py:attribute:: MINIMUM_IMAGE_SIZE
      :value: 10

   .. py:method:: load_spm() -> tuple

      Extract image and pixel to nm scaling from the Bruker .spm file.

      :returns: A tuple containing the image and its pixel to nanometre scaling value.
      :rtype: tuple(np.ndarray, float)

      .. !! processed by numpydoc !!

   .. py:method:: _spm_pixel_to_nm_scaling(channel_data: pySPM.SPM.SPM_image) -> float

      Extract pixel to nm scaling from the SPM image metadata.

      :param channel_data: Channel data from PySPM.
      :type channel_data: pySPM.SPM.SPM_image

      :returns: Pixel to nm scaling factor.
      :rtype: float

      .. !! processed by numpydoc !!

   .. py:method:: load_topostats() -> tuple

      Load a .topostats file (hdf5 format), extracting the image, pixel to nanometre scaling factor and any grain
      masks.

      Note that grain masks are stored via self.grain_masks rather than returned due to how we extract information
      for all other file loading functions.

      :returns: A tuple containing the image and its pixel to nanometre scaling value.
      :rtype: tuple(np.ndarray, float)

      .. !! processed by numpydoc !!

   .. py:method:: load_ibw() -> tuple

      Loads image from Asylum Research (Igor) .ibw files.

      :returns: A tuple containing the image and its pixel to nanometre scaling value.
      :rtype: tuple(np.ndarray, float)

      .. !! processed by numpydoc !!

   .. py:method:: _ibw_pixel_to_nm_scaling(scan: dict) -> float

      Extract pixel to nm scaling from the IBW image metadata.

      :param scan: The loaded binary wave object.
      :type scan: dict

      :returns: A value corresponding to the real length of a single pixel.
      :rtype: float

      .. !! processed by numpydoc !!

   .. py:method:: load_jpk() -> tuple

      Loads image from JPK Instruments .jpk files.

      :returns: A tuple containing the image and its pixel to nanometre scaling value.
      :rtype: tuple(np.ndarray, float)

      .. !! processed by numpydoc !!

   .. py:method:: _jpk_pixel_to_nm_scaling(tiff_page: tifffile.tifffile.TiffPage) -> float
      :staticmethod:

      Extract pixel to nm scaling from the JPK image metadata.

      :param tiff_page: An image file directory (IFD) of .jpk files.
      :type tiff_page: tifffile.tifffile.TiffPage

      :returns: A value corresponding to the real length of a single pixel.
      :rtype: float

      .. !! processed by numpydoc !!

   .. py:method:: _gwy_read_object(open_file: io.TextIOWrapper, data_dict: dict) -> None
      :staticmethod:

      Parse and extract data from a `.gwy` file object, starting at the current open file read position.

      :param open_file: An open file object.
      :type open_file: io.TextIOWrapper
      :param data_dict: Dictionary of `.gwy` file image properties.
      :type data_dict: dict

      :rtype: None

      .. !! processed by numpydoc !!

   .. py:method:: _gwy_read_component(open_file: io.TextIOWrapper, initial_byte_pos: int, data_dict: dict) -> int
      :staticmethod:

      Parse and extract data from a `.gwy` file component, starting at the current open file read position.

      :param open_file: An open file object.
      :type open_file: io.TextIOWrapper
      :param data_dict: Dictionary of `.gwy` file image properties.
      :type data_dict: dict

      :returns: Size of the component in bytes.
      :rtype: int

      .. !! processed by numpydoc !!

   .. py:method:: _gwy_print_dict(gwy_file_dict: dict, pre_string: str) -> None
      :staticmethod:

      A developer function to print the nested object / component structure.

      Can be used to find labels and values of objects / components in the `.gwy` file.

      :param gwy_file_dict: Dictionary of the nested object / component structure of a `.gwy` file.
      :type gwy_file_dict: dict

      .. !! processed by numpydoc !!

   .. py:method:: _gwy_print_dict_wrapper(gwy_file_dict: dict) -> None
      :staticmethod:

      Wrapper for the `_gwy_print_dict` function.

      :param gwy_file_dict: Dictionary of the nested object / component structure of a `.gwy` file.
      :type gwy_file_dict: dict

      .. !! processed by numpydoc !!

   .. py:method:: load_gwy() -> tuple

      Extract image and pixel to nm scaling from the Gwyddion .gwy file.

      :returns: A tuple containing the image and its pixel to nanometre scaling value.
      :rtype: tuple(np.ndarray, float)

      .. !! processed by numpydoc !!

   .. py:method:: get_data() -> None

      Method to extract image, filepath and pixel to nm scaling value, and append these to the img_dict object.

      .. !! processed by numpydoc !!

   .. py:method:: _check_image_size_and_add_to_dict() -> None

      Check the image is above a minimum size in both dimensions.

      Images that do not meet the minimum size are not included for processing.

      .. !! processed by numpydoc !!

   .. py:method:: add_to_dict() -> None

      Adds the image, image path and pixel to nanometre scaling value to the img_dict dictionary under the key
      filename.

      :param filename: The filename, ideally without an extension.
      :type filename: str
      :param image: An array of the extracted AFM image.
      :type image: np.ndarray
      :param img_path: The path to the AFM file (with a frame number if applicable).
      :type img_path: str
      :param px_2_nm: The length of a pixel in nm.
      :type px_2_nm: float

      .. !! processed by numpydoc !!

.. py:function:: save_topostats_file(output_dir: pathlib.Path, filename: str, topostats_object: dict) -> None

   Save a topostats dictionary object to a .topostats (hdf5 format) file.

   :param output_dir: Directory to save the .topostats file in.
   :type output_dir: Path
   :param filename: File name of the .topostats file.
   :type filename: str
   :param topostats_object: Dictionary of the topostats data to save. Must include a flattened image and pixel to
                            nanometre scaling factor. May also include grain masks.
   :type topostats_object: dict

   .. !! processed by numpydoc !!

.. py:function:: save_pkl(outfile: pathlib.Path, to_pkl: dict) -> None

   Pickle objects for working with later.

   :param outfile: Path and filename to save pickle to.
   :type outfile: Path
   :param to_pkl: Object to be pickled.
   :type to_pkl: dict

   :rtype: None

   .. !! processed by numpydoc !!

.. py:function:: load_pkl(infile: pathlib.Path) -> Any

   Load data from a pickle.

   :param infile: Path to a valid pickle.
   :type infile: Path

   :returns: Dictionary of generated images.
   :rtype: dict

   .. rubric:: Example

   .. code-block:: python

      from pathlib import Path

      from topostats.io import load_pkl

      pkl_path = Path("output/distribution_plots.pkl")
      my_plots = load_pkl(pkl_path)
      # Show the type of my_plots which is a dictionary of nested dictionaries
      type(my_plots)
      # Show the keys at various levels of nesting.
      my_plots.keys()
      my_plots["area"].keys()
      my_plots["area"]["dist"].keys()
      # Get the figure and axis object for a given metrics distribution plot
      figure, axis = my_plots["area"]["dist"].values()
      # Get the figure and axis object for a given metrics violin plot
      figure, axis = my_plots["area"]["violin"].values()

   .. !! processed by numpydoc !!
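
The pickle helpers documented above (``save_pkl`` and ``load_pkl``) can be pictured as a simple round trip. The following is a minimal sketch using only the standard library, not the TopoStats implementation: ``save_pkl_sketch`` and ``load_pkl_sketch`` are hypothetical stand-ins that mirror the documented signatures, and the nested dictionary holds placeholder strings where real output would hold figure and axis objects.

```python
import pickle
import tempfile
from pathlib import Path
from typing import Any


def save_pkl_sketch(outfile: Path, to_pkl: dict) -> None:
    """Hypothetical stand-in mirroring the documented save_pkl signature."""
    with outfile.open("wb") as handle:
        pickle.dump(to_pkl, handle)


def load_pkl_sketch(infile: Path) -> Any:
    """Hypothetical stand-in mirroring the documented load_pkl signature."""
    with infile.open("rb") as handle:
        return pickle.load(handle)


# Placeholder data: strings stand in for the figure and axis objects.
plots = {"area": {"dist": {"figure": "fig-placeholder", "axis": "ax-placeholder"}}}

with tempfile.TemporaryDirectory() as tmp_dir:
    pkl_path = Path(tmp_dir) / "distribution_plots.pkl"
    save_pkl_sketch(pkl_path, plots)
    restored = load_pkl_sketch(pkl_path)

# Unpack the innermost dictionary in insertion order, as in the example above.
figure, axis = restored["area"]["dist"].values()
```

Unpickling can execute arbitrary code, so pickle files should only be loaded from trusted sources.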