# Configuration

Configuration for TopoStats is done using a [YAML](https://yaml.org/) configuration file that is specified on the command line when invoking `run_topostats`.

A default configuration file is provided in the TopoStats repository at [`topostats/default_config.yaml`](https://github.com/AFM-SPM/TopoStats/blob/dev/topostats/default_config.yaml). The file contains comments indicating valid values for many of the fields. If no configuration file is provided, this default configuration is loaded automatically and used.

You can make a copy of `default_config.yaml` and modify it for your own use. Once saved, you can run TopoStats with this configuration file as shown below.

``` bash
run_topostats --config my_config.yaml
```

On completion, a copy of the configuration that was used is written to the output directory.

## Fields

Aside from the comments in the YAML file itself, the fields are described below.

| Section | Sub-Section | Data Type | Default | Description |
|:-------------|:-------------------------------|:-----------|:---------------|:------------|
| `base_dir` | | string | `./` | Directory to recursively search for files within. |
| `output_dir` | | string | `./output` | Directory that output should be saved to. |
| `warnings` | | string | `ignore` | Turns off warnings being shown. |
| `cores` | | integer | `4` | Number of cores to run parallel processes on. |
| `quiet` | | boolean | `false` | |
| `file_ext` | | string | `.spm` | File extensions to search for. |
| `loading` | `channel` | string | `Height` | The channel of data to be processed. Valid values depend on the file format you are processing and the channel you wish to process. |
| `filter` | `run` | boolean | `true` | Whether to run the filtering stage. Other stages won't run without it, so leave as `true`. |
| | `threshold_method` | string | `std_dev` | Threshold method for filtering. Options: `otsu`, `std_dev`, `absolute`. |
| | `otsu_threshold_multiplier` | float | `1.0` | |
| | `threshold_std_dev` | float | `1.0` | |
| | `threshold_absolute_lower` | float | `-1.0` | |
| | `threshold_absolute_upper` | float | `1.0` | |
| | `gaussian_size` | float | `0.5` | The number of standard deviations used to build the Gaussian kernel, which affects the degree of blurring. See [skimage.filters.gaussian](https://scikit-image.org/docs/dev/api/skimage.filters.html#skimage.filters.gaussian) and `sigma` for more information. |
| | `gaussian_mode` | string | `nearest` | |
| `grains` | `run` | boolean | `true` | Whether to run grain finding. Options: `true`, `false`. |
| | `absolute_smallest_grain_size` | int | `100` | The smallest size of grains to be included (in pixels). Anything smaller than this is considered noise and removed. |
| | `threshold_method` | string | `std_dev` | Threshold method for grain finding. Options: `otsu`, `std_dev`, `absolute`. |
| | `otsu_threshold_multiplier` | | `1.0` | Factor by which the derived Otsu threshold should be scaled. |
| | `threshold_std_dev` | | `1.0` | |
| | `threshold_absolute_lower` | | `1.0` | |
| | `threshold_absolute_upper` | | `1.0` | |
| | `absolute_area_threshold` | dictionary | | |
| | `...upper` | list | `[500,800]` | Height above surface [Low, High] in nm^2 (also takes `null`). |
| | `...lower` | | `[null, null]` | Height below surface [Low, High] in nm^2 (also takes `null`). |
| | `direction` | | `upper` | Defines whether to look for grains above or below thresholds, or both. Options: `upper`, `lower`, `both`. |
| | `background` | float | `0.0` | |
| `grainstats` | `run` | boolean | `true` | Whether to calculate grain statistics. Options: `true`, `false`. |
| | `cropped_size` | float | `40.0` | Side length (in nm) of the square images that grains are cropped to (can take `-1` for a grain-sized box). |
| | `save_cropped_grains` | boolean | `true` | Options: `true`, `false`. |
| `dnatracing` | `run` | boolean | `true` | Whether to run DNA tracing. Options: `true`, `false`. |
| `plotting` | `run` | boolean | `true` | Whether to run plotting. Options: `true`, `false`. |
| | `save_format` | string | `png` | Format to save images in, see [matplotlib.pyplot.savefig](https://matplotlib.org/stable/api/_as_gen/matplotlib.pyplot.savefig.html). |
| | `image_set` | string | `all` | Which images to plot. Options: `all`, `core`. |
| | `zrange` | list | `[0, 3]` | Low and high height range for core images (can take `[null, null]`). |
| | `colorbar` | boolean | `true` | Whether to include the colorbar scale in plots. Options: `true`, `false`. |
| | `axes` | boolean | `true` | Whether to include the axes in the produced plots. |
| | `cmap` | string | `nanoscope` | Colormap to use in plotting. Options: `nanoscope`, `afmhot`. |
| | `histogram_log_axis` | boolean | `false` | Whether to plot histograms using a logarithmic scale. Options: `true`, `false`. |

## Validation

Configuration files are validated against a schema to check that the values in the configuration file are within the expected ranges or are valid parameters. This helps capture problems early and should provide informative messages about what needs correcting if there are errors.
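
To give a feel for the kind of checks schema validation performs, the following is a minimal sketch in plain Python. It is not the TopoStats schema: the function `validate_config` and the constants `THRESHOLD_METHODS` and `DIRECTIONS` are hypothetical names introduced for illustration, and only a handful of the fields from the table above are checked, using the options listed there.

```python
# Illustrative only: a simplified, hypothetical validator. The schema that
# TopoStats actually uses covers more fields and may differ in detail.
THRESHOLD_METHODS = {"otsu", "std_dev", "absolute"}
DIRECTIONS = {"upper", "lower", "both"}


def validate_config(config: dict) -> list[str]:
    """Return a list of human-readable problems found in ``config``."""
    errors = []

    cores = config.get("cores")
    # bool is a subclass of int in Python, so exclude it explicitly.
    if not isinstance(cores, int) or isinstance(cores, bool) or cores < 1:
        errors.append(f"cores: expected a positive integer, got {cores!r}")

    # Both the filter and grains sections take a threshold_method.
    for section in ("filter", "grains"):
        method = config.get(section, {}).get("threshold_method")
        if method not in THRESHOLD_METHODS:
            errors.append(
                f"{section}.threshold_method: expected one of "
                f"{sorted(THRESHOLD_METHODS)}, got {method!r}"
            )

    direction = config.get("grains", {}).get("direction")
    if direction not in DIRECTIONS:
        errors.append(
            f"grains.direction: expected one of {sorted(DIRECTIONS)}, "
            f"got {direction!r}"
        )

    return errors


# A configuration using values from the table above.
good = {
    "cores": 4,
    "filter": {"threshold_method": "std_dev"},
    "grains": {"threshold_method": "otsu", "direction": "upper"},
}

# A configuration with three mistakes, including the typo "ostu".
bad = {
    "cores": 0,
    "filter": {"threshold_method": "ostu"},
    "grains": {"threshold_method": "otsu", "direction": "sideways"},
}

print(validate_config(good))
for problem in validate_config(bad):
    print(problem)
```

Collecting every problem into a list, rather than stopping at the first one, lets a user correct a configuration file in a single pass.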