IO Modules
Functions for reading and writing data.
LoadScans
Load the image and image parameters from a file path.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `img_paths` | `list[str \| Path]` | Path to a valid AFM scan to load. | required |
| `config` | `dict[str, Any]` | Dictionary of all configuration options. | required |
| `channel` | `str` | Image channel to extract from the scan. | `None` |
Source code in topostats\io.py
__init__(img_paths: list[str | Path], config: dict[str, Any], channel: str | None = None)
Initialise the class.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `img_paths` | `list[str \| Path]` | Path to a valid AFM scan to load. | required |
| `config` | `dict[str, Any]` | Dictionary of all configuration options. | required |
| `channel` | `str` | Image channel to extract from the scan. | `None` |
Source code in topostats\io.py
add_to_dict(image: npt.NDArray, filename: str) -> None
Add an image and metadata to the img_dict dictionary under the key filename.
Adds the image and associated metadata, such as any grain masks and the pixel to nanometre scaling factor, to the img_dict dictionary, which is used as a place to store the image information for processing.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `image` | `NDArray` | An array of the extracted AFM image. | required |
| `filename` | `str` | The name of the file. | required |
Source code in topostats\io.py
get_data() -> None
Extract the image, filepath and pixel to nm scaling value, and append these to the img_dict object.
Source code in topostats\io.py
load_asd() -> tuple[npt.NDArray, float]
Extract image and pixel to nm scaling from .asd files.
Returns:
| Type | Description |
|---|---|
| `tuple[NDArray, float]` | A tuple containing the image and its pixel to nanometre scaling value. |
Source code in topostats\io.py
load_gwy() -> tuple[npt.NDArray, float]
Extract image and pixel to nm scaling from the Gwyddion .gwy file.
Returns:
| Type | Description |
|---|---|
| `tuple[NDArray, float]` | A tuple containing the image and its pixel to nanometre scaling value. |
Source code in topostats\io.py
load_ibw() -> tuple[npt.NDArray, float]
Load image from Asylum Research (Igor) .ibw files.
Returns:
| Type | Description |
|---|---|
| `tuple[NDArray, float]` | A tuple containing the image and its pixel to nanometre scaling value. |
Source code in topostats\io.py
load_jpk() -> tuple[npt.NDArray, float]
Load image from JPK Instruments .jpk files.
Returns:
| Type | Description |
|---|---|
| `tuple[NDArray, float]` | A tuple containing the image and its pixel to nanometre scaling value. |
Source code in topostats\io.py
load_spm() -> tuple[npt.NDArray, float]
Extract image and pixel to nm scaling from the Bruker .spm file.
Returns:
| Type | Description |
|---|---|
| `tuple[NDArray, float]` | A tuple containing the image and its pixel to nanometre scaling value. |
Source code in topostats\io.py
load_stp() -> tuple[npt.NDArray, float]
Extract image and pixel to nm scaling from the WsXM .stp file.
Returns:
| Type | Description |
|---|---|
| `tuple[NDArray, float]` | A tuple containing the image and its pixel to nanometre scaling value. |
Source code in topostats\io.py
load_top() -> tuple[npt.NDArray, float]
Extract image and pixel to nm scaling from the WsXM .top file.
Returns:
| Type | Description |
|---|---|
| `tuple[NDArray, float]` | A tuple containing the image and its pixel to nanometre scaling value. |
Source code in topostats\io.py
load_topostats() -> dict[str, Any]
Load a .topostats file (hdf5 format) using AFMReader.
AFMReader is general and returns the data as a dictionary. This is converted to the TopoStats class later when building dictionaries of images.
Returns:
| Type | Description |
|---|---|
| `dict[str, Any]` | A dictionary of all previously processed data and configuration options. |
Source code in topostats\io.py
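The per-format `load_*` methods above suggest a simple suffix-to-loader dispatch. A minimal sketch of that idea (the `LOADERS` table and `loader_for` helper are hypothetical illustrations, not part of the TopoStats API):

```python
from pathlib import Path

# Hypothetical mapping from file suffix to the loader method names
# documented above (illustrative only).
LOADERS = {
    ".asd": "load_asd",
    ".gwy": "load_gwy",
    ".ibw": "load_ibw",
    ".jpk": "load_jpk",
    ".spm": "load_spm",
    ".stp": "load_stp",
    ".top": "load_top",
    ".topostats": "load_topostats",
}


def loader_for(path: str) -> str:
    """Return the loader method name for a scan's file suffix."""
    suffix = Path(path).suffix
    if suffix not in LOADERS:
        raise ValueError(f"Unsupported file type: {suffix}")
    return LOADERS[suffix]


print(loader_for("scan.spm"))  # -> load_spm
```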
convert_basename_to_relative_paths(df: pd.DataFrame)
Convert paths in the 'basename' column of a dataframe to relative paths.
If the 'basename' column has the following paths: ['/usr/topo/data/a/b', '/usr/topo/data/c/d'], the output will be: ['a/b', 'c/d'].
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `df` | `DataFrame` | A pandas dataframe containing a column 'basename' which contains the paths indicating the locations of the image data files. | required |

Returns:

| Type | Description |
|---|---|
| `DataFrame` | A pandas dataframe where the 'basename' column has paths relative to a common parent. |
Source code in topostats\io.py
dict_almost_equal(dict1: dict, dict2: dict, abs_tol: float = 1e-09)
Recursively check if two dictionaries are almost equal with a given absolute tolerance.
This should really just be iterative and is an affront to memory usage.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `dict1` | `dict` | First dictionary to compare. | required |
| `dict2` | `dict` | Second dictionary to compare. | required |
| `abs_tol` | `float` | Absolute tolerance to check for equality. | `1e-09` |

Returns:

| Type | Description |
|---|---|
| `bool` | True if the dictionaries are almost equal, False otherwise. |
Source code in topostats\io.py
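The recursive tolerance-based comparison described above can be sketched with `math.isclose`; this is a simplified re-implementation of the documented behaviour, not the TopoStats code:

```python
import math


def dict_almost_equal_sketch(dict1: dict, dict2: dict, abs_tol: float = 1e-9) -> bool:
    """Recursively compare two dictionaries with an absolute tolerance."""
    if dict1.keys() != dict2.keys():
        return False
    for key, value1 in dict1.items():
        value2 = dict2[key]
        if isinstance(value1, dict) and isinstance(value2, dict):
            # Recurse into nested dictionaries.
            if not dict_almost_equal_sketch(value1, value2, abs_tol):
                return False
        elif isinstance(value1, (int, float)) and isinstance(value2, (int, float)):
            # Numbers are compared with an absolute tolerance.
            if not math.isclose(value1, value2, rel_tol=0.0, abs_tol=abs_tol):
                return False
        elif value1 != value2:
            return False
    return True


print(dict_almost_equal_sketch({"a": 1.0}, {"a": 1.0 + 1e-12}))  # -> True
```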
dict_to_hdf5(open_hdf5_file: h5py.File, group_path: str, dictionary: dict) -> None
Recursively save a dictionary to an open hdf5 file.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `open_hdf5_file` | `File` | An open hdf5 file object. | required |
| `group_path` | `str` | The path to the group in the hdf5 file to start saving data from. | required |
| `dictionary` | `dict` | A dictionary of the data to save. | required |
Source code in topostats\io.py
dict_to_json(data: dict, output_dir: str | Path, filename: str | Path, indent: int = 4) -> None
Write a dictionary to a JSON file at the specified location with the given name.
NB : The NumpyEncoder class is used as the default encoder to ensure Numpy dtypes are written as strings (they are
not serialisable to JSON using the default JSONEncoder).
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `data` | `dict` | Data as a dictionary that is to be written to file. | required |
| `output_dir` | `str \| Path` | Directory the file is to be written to. | required |
| `filename` | `str \| Path` | Name of output file. | required |
| `indent` | `int` | Spaces to indent JSON with, default is 4. | `4` |
Source code in topostats\io.py
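A minimal sketch of writing a dictionary to JSON as described; here a `str()` fallback stands in for the NumpyEncoder mentioned above (an assumption, not the TopoStats implementation):

```python
import json
from pathlib import Path
from tempfile import TemporaryDirectory


def dict_to_json_sketch(data: dict, output_dir, filename, indent: int = 4) -> None:
    """Write a dictionary to <output_dir>/<filename> as JSON.

    default=str stands in for the NumpyEncoder described in the docs,
    stringifying anything the default JSONEncoder cannot serialise.
    """
    with (Path(output_dir) / filename).open("w", encoding="utf-8") as f:
        json.dump(data, f, indent=indent, default=str)


with TemporaryDirectory() as tmp:
    dict_to_json_sketch({"threshold": 1.5, "path": Path("out")}, tmp, "config.json")
    print((Path(tmp) / "config.json").read_text())
```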
dict_to_topostats(dictionary: dict[str, Any]) -> TopoStats
Convert a dictionary, typically loaded from HDF5 .topostats file, to TopoStats object.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `dictionary` | `dict[str, Any]` | Dictionary of TopoStats data. This will typically have been loaded from the HDF5 .topostats file. | required |

Returns:

| Type | Description |
|---|---|
| `TopoStats` | A TopoStats object. |
Source code in topostats\io.py
extract_height_profiles(topostats_object_all: dict[str, TopoStats], output_dir: str | Path, filename: str = 'height_profiles.json') -> None
Write height profiles from all grains across all processed images to JSON.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `topostats_object_all` | `dict[str, TopoStats]` | Dictionary of processed TopoStats objects. | required |
| `output_dir` | `str \| Path` | Path to save JSON to. | required |
| `filename` | `str` | Name of output file. | `'height_profiles.json'` |
Source code in topostats\io.py
find_files(base_dir: str | Path = None, file_ext: str = '.spm') -> list
Recursively scan the specified directory for images with the given file extension.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `base_dir` | `Union[str, Path]` | Directory to recursively search for files; if not specified the current directory is scanned. | `None` |
| `file_ext` | `str` | File extension to search for. | `'.spm'` |

Returns:

| Type | Description |
|---|---|
| `List` | List of files found with the extension in the given directory. |
Source code in topostats\io.py
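The recursive scan can be sketched with `pathlib.Path.rglob`; `find_files_sketch` is a hypothetical stand-in for the documented function, not its actual implementation:

```python
from pathlib import Path
from tempfile import TemporaryDirectory


def find_files_sketch(base_dir=None, file_ext: str = ".spm") -> list:
    """Recursively collect files with the given extension.

    Scans the current directory when base_dir is not given, mirroring
    the documented default behaviour.
    """
    base = Path(base_dir) if base_dir is not None else Path(".")
    return sorted(base.rglob(f"*{file_ext}"))


with TemporaryDirectory() as tmp:
    (Path(tmp) / "run1").mkdir()
    (Path(tmp) / "run1" / "scan.spm").touch()
    (Path(tmp) / "notes.txt").touch()
    print([p.name for p in find_files_sketch(tmp)])  # -> ['scan.spm']
```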
get_date_time() -> str
Get a date and time for adding to generated files or logging.
Returns:
| Type | Description |
|---|---|
| `str` | A string of the current date and time, formatted appropriately. |
Source code in topostats\io.py
get_out_path(image_path: str | Path = None, base_dir: str | Path = None, output_dir: str | Path = None) -> Path
Add the image path relative to the base directory to the output directory.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `image_path` | `Path` | The path of the current image. | `None` |
| `base_dir` | `Path` | Directory to recursively search for files. | `None` |
| `output_dir` | `Path` | The output directory specified in the configuration file. | `None` |

Returns:

| Type | Description |
|---|---|
| `Path` | The output path that mirrors the input path structure. |
Source code in topostats\io.py
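The path mirroring described above amounts to re-rooting the image's location under the output directory. A sketch under that assumption (not the TopoStats implementation):

```python
from pathlib import Path


def get_out_path_sketch(image_path, base_dir, output_dir) -> Path:
    """Mirror the image's location relative to base_dir under output_dir."""
    return Path(output_dir) / Path(image_path).relative_to(base_dir)


# data/run1/scan.spm found under base_dir "data" maps to output/run1/scan.spm
print(get_out_path_sketch("data/run1/scan.spm", "data", "output"))
```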
get_relative_paths(paths: list[Path]) -> list[str]
Extract a list of relative paths, removing the common prefix.
From a list of paths, create a list where each path is relative to the paths' closest common parent. For example, ['a/b/c', 'a/b/d', 'a/b/e/f'] would return ['c', 'd', 'e/f'].
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `paths` | `list` | List of string or pathlib paths. | required |

Returns:

| Type | Description |
|---|---|
| `list` | List of string paths, relative to the common parent. |
Source code in topostats\io.py
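The documented example (['a/b/c', 'a/b/d', 'a/b/e/f'] → ['c', 'd', 'e/f']) can be reproduced with `os.path.commonpath`; this is a simplified sketch of the behaviour, not the TopoStats code:

```python
import os
from pathlib import Path


def get_relative_paths_sketch(paths: list) -> list:
    """Express each path relative to the closest common parent of all paths."""
    common = os.path.commonpath([str(p) for p in paths])
    return [Path(p).relative_to(common).as_posix() for p in paths]


print(get_relative_paths_sketch(["a/b/c", "a/b/d", "a/b/e/f"]))  # -> ['c', 'd', 'e/f']
```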
hdf5_to_dict(open_hdf5_file: h5py.File, group_path: str) -> dict
Read a dictionary from an open hdf5 file.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `open_hdf5_file` | `File` | An open hdf5 file object. | required |
| `group_path` | `str` | The path to the group in the hdf5 file to start reading data from. | required |

Returns:

| Type | Description |
|---|---|
| `dict` | A dictionary of the hdf5 file data. |
Source code in topostats\io.py
lists_almost_equal(list1: list, list2: list, abs_tol: float = 1e-09) -> bool
Check if two lists are almost equal with a given absolute tolerance.
Note: Currently the lists must be flat, the same length and contain only numbers (int or float).
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `list1` | `list` | First list to compare. | required |
| `list2` | `list` | Second list to compare. | required |
| `abs_tol` | `float` | Absolute tolerance to check for equality. | `1e-09` |

Returns:

| Type | Description |
|---|---|
| `bool` | True if the lists are almost equal, False otherwise. |

Raises:

| Type | Description |
|---|---|
| `NotImplementedError` | If the items in the lists are not of type int or float. |
Source code in topostats\io.py
load_array(array_path: str | Path) -> npt.NDArray
Load a Numpy array from file.
Should have been saved using save_array() or numpy.save().
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `array_path` | `Union[str, Path]` | Path to the Numpy array on disk. | required |

Returns:

| Type | Description |
|---|---|
| `NDArray` | Returns the loaded Numpy array. |
Source code in topostats\io.py
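As noted above, arrays loaded this way should have been written with save_array() or numpy.save(). The underlying round trip is plain numpy; the TopoStats functions add naming conventions on top (a sketch, not the library code):

```python
from pathlib import Path
from tempfile import TemporaryDirectory

import numpy as np

with TemporaryDirectory() as tmp:
    # Save an array to disk and load it back, mirroring the documented
    # save_array()/load_array() round trip.
    array = np.arange(9, dtype=float).reshape(3, 3)
    array_path = Path(tmp) / "image_z_threshold.npy"
    np.save(array_path, array)
    loaded = np.load(array_path)
    print(np.array_equal(array, loaded))  # -> True
```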
load_pkl(infile: Path) -> Any
Load data from a pickle.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `infile` | `Path` | Path to a valid pickle. | required |

Returns:

| Name | Type | Description |
|---|---|---|
| `dict` | `Any` | Dictionary of generated images. |

Examples:

```python
from pathlib import Path
from topostats.io import load_pkl

pkl_path = "output/distribution_plots.pkl"
my_plots = load_pkl(pkl_path)
# Show the type of my_plots, which is a dictionary of nested dictionaries
type(my_plots)
# Show the keys at various levels of nesting
my_plots.keys()
my_plots["area"].keys()
my_plots["area"]["dist"].keys()
# Get the figure and axis object for a given metric's distribution plot
figure, axis = my_plots["area"]["dist"].values()
# Get the figure and axis object for a given metric's violin plot
figure, axis = my_plots["area"]["violin"].values()
```
Source code in topostats\io.py
path_to_str(config: dict) -> dict
Recursively traverse a dictionary and convert any Path() objects to strings for writing to YAML.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `config` | `dict` | Dictionary to be converted. | required |

Returns:

| Name | Type | Description |
|---|---|---|
| `Dict` | `dict` | The same dictionary with any Path() objects converted to string. |
Source code in topostats\io.py
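The recursive Path-to-string conversion can be sketched in a few lines; `path_to_str_sketch` is a hypothetical re-implementation of the documented behaviour:

```python
from pathlib import Path


def path_to_str_sketch(config: dict) -> dict:
    """Recursively convert Path values to strings for YAML writing."""
    converted = {}
    for key, value in config.items():
        if isinstance(value, dict):
            # Recurse into nested configuration sections.
            converted[key] = path_to_str_sketch(value)
        elif isinstance(value, Path):
            converted[key] = str(value)
        else:
            converted[key] = value
    return converted


print(path_to_str_sketch({"output_dir": Path("out"), "nested": {"base": Path("data")}}))
```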
read_64d(open_file: io.TextIOWrapper) -> float
Read a 64-bit double from an open binary file.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `open_file` | `TextIOWrapper` | An open file object. | required |

Returns:

| Type | Description |
|---|---|
| `float` | Python float type cast from the double. |
Source code in topostats\io.py
read_char(open_file: io.TextIOWrapper) -> str
Read a character from an open binary file.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `open_file` | `TextIOWrapper` | An open file object. | required |

Returns:

| Type | Description |
|---|---|
| `str` | A string type cast from the decoded character. |
Source code in topostats\io.py
read_gwy_component_dtype(open_file: io.TextIOWrapper) -> str
Read the data type of a .gwy file component.
Possible data types are as follows:
- 'b': boolean
- 'c': character
- 'i': 32-bit integer
- 'q': 64-bit integer
- 'd': double
- 's': string
- 'o': .gwy format object
Capitalised versions of some of these data types represent arrays of values of that data type. Arrays are stored as an unsigned 32 bit integer, describing the size of the array, followed by the unseparated array values:
- 'C': array of characters
- 'I': array of 32-bit integers
- 'Q': array of 64-bit integers
- 'D': array of doubles
- 'S': array of strings
- 'O': array of objects.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `open_file` | `TextIOWrapper` | An open file object. | required |

Returns:

| Type | Description |
|---|---|
| `str` | Python string (one character long) of the data type of the component's value. |
Source code in topostats\io.py
read_null_terminated_string(open_file: io.TextIOWrapper, encoding: str = 'utf-8') -> str
Read from the current position in an open binary file until the next null byte.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `open_file` | `TextIOWrapper` | An open file object. | required |
| `encoding` | `str` | Encoding to use when decoding the bytes. | `'utf-8'` |

Returns:

| Type | Description |
|---|---|
| `str` | String of the ASCII decoded bytes before the next null byte. |
Source code in topostats\io.py
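The null-terminated read can be sketched by consuming one byte at a time until a null byte appears; this is an illustrative re-implementation, not the TopoStats code:

```python
import io


def read_null_terminated_string_sketch(open_file, encoding: str = "utf-8") -> str:
    """Read bytes until the next null byte, then decode them."""
    raw = b""
    byte = open_file.read(1)
    while byte and byte != b"\x00":
        raw += byte
        byte = open_file.read(1)
    return raw.decode(encoding)


buffer = io.BytesIO(b"channel_name\x00remaining data")
print(read_null_terminated_string_sketch(buffer))  # -> channel_name
```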
read_u32i(open_file: io.TextIOWrapper) -> int
Read an unsigned 32 bit integer from an open binary file (in little-endian form).
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `open_file` | `TextIOWrapper` | An open file object. | required |

Returns:

| Type | Description |
|---|---|
| `int` | Python integer type cast from the unsigned 32 bit integer. |
Source code in topostats\io.py
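The fixed-width binary readers above (read_64d, read_char, read_u32i) map naturally onto the struct module. A sketch assuming little-endian formats ("<d", a single byte, "<I"), consistent with the descriptions, but not the library's actual code:

```python
import io
import struct


def read_64d_sketch(open_file) -> float:
    """Read a little-endian 64-bit double (8 bytes)."""
    return struct.unpack("<d", open_file.read(8))[0]


def read_char_sketch(open_file) -> str:
    """Read a single byte and decode it as a character."""
    return open_file.read(1).decode("ascii")


def read_u32i_sketch(open_file) -> int:
    """Read a little-endian unsigned 32-bit integer (4 bytes)."""
    return struct.unpack("<I", open_file.read(4))[0]


buffer = io.BytesIO(struct.pack("<d", 2.5) + b"A" + struct.pack("<I", 1024))
print(read_64d_sketch(buffer), read_char_sketch(buffer), read_u32i_sketch(buffer))
# -> 2.5 A 1024
```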
read_yaml(filename: str | Path) -> dict
Read a YAML file.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `filename` | `Union[str, Path]` | YAML file to read. | required |

Returns:

| Type | Description |
|---|---|
| `Dict` | Dictionary of the file. |
Source code in topostats\io.py
save_array(array: npt.NDArray, outpath: Path, filename: str, array_type: str) -> None
Save a Numpy array to disk.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `array` | `NDArray` | Numpy array to be saved. | required |
| `outpath` | `Path` | Location array should be saved. | required |
| `filename` | `str` | Filename of the current image from which the array is derived. | required |
| `array_type` | `str` | Short string describing the array type, e.g. z_threshold. Ideally should not contain periods or spaces (use underscores '_' instead). | required |
Source code in topostats\io.py
save_image_grainstats(output_dir: str | Path, base_dir: str | Path, all_stats_df: pd.DataFrame, stats_filename: str) -> None
Save a data frame of grain and tracing statistics at the folder level.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `output_dir` | `Union[str, Path]` | Path of the output directory head. | required |
| `base_dir` | `Union[str, Path]` | Path of the base directory where files were found. | required |
| `all_stats_df` | `DataFrame` | The dataframe containing all sample statistics run. | required |
| `stats_filename` | `str` | The name of the type of statistics dataframe to be saved. | required |

Returns:

| Type | Description |
|---|---|
| `None` | This only saves the dataframes and does not retain them. |
Source code in topostats\io.py
save_pkl(outfile: Path, to_pkl: dict) -> None
Pickle objects for working with later.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `outfile` | `Path` | Path and filename to save pickle to. | required |
| `to_pkl` | `dict` | Object to be pickled. | required |
Source code in topostats\io.py
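save_pkl and the load_pkl function documented earlier form a round trip through the standard pickle module. A sketch of that round trip (hypothetical re-implementations, not the TopoStats code):

```python
import pickle
from pathlib import Path
from tempfile import TemporaryDirectory


def save_pkl_sketch(outfile, to_pkl) -> None:
    """Pickle an object to outfile."""
    with Path(outfile).open("wb") as f:
        pickle.dump(to_pkl, f)


def load_pkl_sketch(infile):
    """Load a pickled object back from disk."""
    with Path(infile).open("rb") as f:
        return pickle.load(f)


with TemporaryDirectory() as tmp:
    pkl_path = Path(tmp) / "plots.pkl"
    save_pkl_sketch(pkl_path, {"area": {"dist": "figure"}})
    print(load_pkl_sketch(pkl_path))  # -> {'area': {'dist': 'figure'}}
```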
save_topostats_file(output_dir: Path, topostats_object: TopoStats, topostats_version: str = __release__) -> None
Save a `TopoStats` object to a `.topostats` (hdf5 format) file.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `output_dir` | `Path` | Directory to save the .topostats file in. | required |
| `topostats_object` | `dict` | Dictionary of the topostats data to save. Must include a flattened image and pixel to nanometre scaling factor. May also include grain masks. | required |
| `topostats_version` | `str` | Version to save as; defaults to the current release. | `__release__` |
Source code in topostats\io.py
write_csv(df: pd.DataFrame, dataset: str, names: list[str] | None, index: list[str], output_dir: str | Path, base_dir: str | Path) -> pd.DataFrame
Write summary statistics files to CSV.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `df` | `DataFrame` | Dataframe to write to CSV. | required |
| `dataset` | `str` | Type of dataframe, valid values are | required |
| `names` | `list[str]` | List of names to rename current index with. | required |
| `index` | `list[str]` | List of columns to set index to. | required |
| `output_dir` | `str \| Path` | Output directory. | required |
| `base_dir` | `str \| Path` | Base directory. | required |

Returns:

| Type | Description |
|---|---|
| `DataFrame` | Pandas dataframe with index renamed and reset. |
Source code in topostats\io.py
write_yaml(config: dict, output_dir: str | Path, config_file: str = 'config.yaml', header_message: str = None) -> None
Write a configuration (stored as a dictionary) to a YAML file.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `config` | `dict` | Configuration dictionary. | required |
| `output_dir` | `Union[str, Path]` | Path to save the dictionary to as a YAML file (it will be called 'config.yaml'). | required |
| `config_file` | `str` | Filename to write to. | `'config.yaml'` |
| `header_message` | `str` | String to write to the header message of the YAML file. | `None` |