PrognosAIs.IO package¶
Submodules¶
PrognosAIs.IO.ConfigLoader module¶
PrognosAIs.IO.Configs module¶
-
class
PrognosAIs.IO.Configs.
bias_field_correcting_config
(config_settings: dict)[source]¶ Bases:
PrognosAIs.IO.Configs.config
-
property
mask
¶
-
property
mask_file
¶
-
property
-
class
PrognosAIs.IO.Configs.
general_config
(config_settings: dict)[source]¶ Bases:
PrognosAIs.IO.Configs.config
-
class
PrognosAIs.IO.Configs.
labeling_config
(config_settings: dict)[source]¶ Bases:
PrognosAIs.IO.Configs.config
-
class
PrognosAIs.IO.Configs.
masking_config
(config_settings: dict)[source]¶ Bases:
PrognosAIs.IO.Configs.config
-
property
mask
¶
-
property
mask_file
¶
-
property
-
class
PrognosAIs.IO.Configs.
multi_dimension_extracting_config
(config_settings: dict)[source]¶ Bases:
PrognosAIs.IO.Configs.config
-
class
PrognosAIs.IO.Configs.
normalizing_config
(config_settings: dict)[source]¶ Bases:
PrognosAIs.IO.Configs.config
-
property
mask
¶
-
property
mask_file
¶
-
property
-
class
PrognosAIs.IO.Configs.
patching_config
(config_settings: dict)[source]¶ Bases:
PrognosAIs.IO.Configs.config
-
property
patch_size
¶
-
property
-
class
PrognosAIs.IO.Configs.
rejecting_config
(config_settings: dict)[source]¶ Bases:
PrognosAIs.IO.Configs.config
-
property
mask
¶
-
property
mask_file
¶
-
property
-
class
PrognosAIs.IO.Configs.
resampling_config
(config_settings: dict)[source]¶ Bases:
PrognosAIs.IO.Configs.config
-
class
PrognosAIs.IO.Configs.
saving_config
(config_settings: dict)[source]¶ Bases:
PrognosAIs.IO.Configs.config
PrognosAIs.IO.DataGenerator module¶
-
class
PrognosAIs.IO.DataGenerator.
Augmentor
(example_sample: tensorflow.python.framework.ops.Tensor, brightness_probability: float = 0, brightness_delta: float = 0, contrast_probability: float = 0, contrast_min_factor: float = 1, contrast_max_factor: float = 1, flip_probability: float = 0, to_flip_axis: Union[int, list] = 0, crop_probability: float = 0, crop_size: list = None, rotate_probability: float = 0, max_rotate_angle: float = 0, to_rotate_axis: Union[int, list] = 0)[source]¶ Bases:
object
-
__init__
(example_sample: tensorflow.python.framework.ops.Tensor, brightness_probability: float = 0, brightness_delta: float = 0, contrast_probability: float = 0, contrast_min_factor: float = 1, contrast_max_factor: float = 1, flip_probability: float = 0, to_flip_axis: Union[int, list] = 0, crop_probability: float = 0, crop_size: list = None, rotate_probability: float = 0, max_rotate_angle: float = 0, to_rotate_axis: Union[int, list] = 0) → None[source]¶ Augmentor to randomly augment the features of a sample.
- Parameters
example_sample (tf.Tensor) – Example sample from which settings for augmentation will be derived
brightness_probability (float, optional) – Probability of augmenting brightness. Defaults to 0.
brightness_delta (float, optional) – Brightness will be adjusted with value from -delta to delta. Defaults to 0.
contrast_probability (float, optional) – Probability of augmenting contrast. Defaults to 0.
contrast_min_factor (float, optional) – Minimum contrast adjustment factor. Defaults to 1.
contrast_max_factor (float, optional) – Maximum contrast adjustment factor. Defaults to 1.
flip_probability (float, optional) – Probability of a random flip. Defaults to 0.
to_flip_axis (Union[int, list], optional) – Axis to flip the feature over. Defaults to 0.
crop_probability (float, optional) – Probability of cropping the feature. Defaults to 0.
crop_size (list, optional) – Size to crop the feature to. Defaults to None.
-
apply_augmentation
(augmentation_probability: float, seed: tensorflow.python.framework.ops.Tensor = None) → bool[source]¶ Whether the augmentation step should be applied, based on the probability.
- Parameters
augmentation_probability (float) – The probability with which the step should be applied
seed (tf.Tensor) – Seed to make operation repeatable. Defaults to None.
- Returns
bool – Whether the step should be applied
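The probability gate described above can be pictured as a comparison between a uniform random draw and the configured probability. The following is a plain-Python sketch of that idea, not the PrognosAIs implementation (which works on TensorFlow tensors); the function name should_apply and the use of random.Random as the seeded source are assumptions made for this example.

```python
import random


def should_apply(probability: float, rng: random.Random) -> bool:
    """Return True with the given probability (hypothetical helper)."""
    # rng.random() is uniform on [0, 1), so the comparison is True
    # with chance `probability`: never for 0, always for 1.
    return rng.random() < probability


# Re-creating the generator from the same seed repeats the decision.
first = should_apply(0.5, random.Random(7))
second = should_apply(0.5, random.Random(7))
```

Passing the same seed twice yields the same decision, which is what makes it possible to treat a sample and its mask identically.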
-
augment_sample
(sample: tensorflow.python.framework.ops.Tensor, seed=None, is_mask=False) → tensorflow.python.framework.ops.Tensor[source]¶ Apply random augmentations to the sample based on the config.
- Parameters
sample (tf.Tensor) – sample to be augmented
- Returns
tf.Tensor – augmented sample
-
get_seed
() → tensorflow.python.framework.ops.Tensor[source]¶ Get a random seed that can be used to make other operation repeatable.
- Returns
tf.Tensor – The seed
-
pad_to_original_size
(sample: tensorflow.python.framework.ops.Tensor) → tensorflow.python.framework.ops.Tensor[source]¶ Pad back a (potentially) augmented sample to its original size.
- Parameters
sample (tf.Tensor) – The sample to pad
- Returns
tf.Tensor – The padded sample with the same size as before any augmentation steps
-
random_brightness
(sample: tensorflow.python.framework.ops.Tensor, seed: tensorflow.python.framework.ops.Tensor = None) → tensorflow.python.framework.ops.Tensor[source]¶ Randomly adjusts the brightness of a sample.
Brightness is adjusted by a constant value over the whole image, drawn from a distribution between -delta and delta as set during the initialization of the augmentor.
- Parameters
sample (tf.Tensor) – Sample for which to adjust brightness.
seed (tf.Tensor) – Seed to make operation repeatable. Defaults to None.
- Returns
tf.Tensor – The augmented sample.
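As an illustration of the brightness step, the sketch below adds one randomly drawn offset to every intensity. It uses plain Python lists instead of tensors, and assumes a uniform distribution over [-delta, delta]; the name shift_brightness is hypothetical and not part of PrognosAIs.

```python
import random


def shift_brightness(pixels, delta, rng):
    """Add a single random offset in [-delta, delta] to all pixels."""
    # One offset for the whole sample, so relative intensities are kept.
    offset = rng.uniform(-delta, delta)
    return [p + offset for p in pixels]


brighter = shift_brightness([0.0, 1.0, 2.0], 0.5, random.Random(1))
```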
-
random_contrast
(sample: tensorflow.python.framework.ops.Tensor, seed: tensorflow.python.framework.ops.Tensor = None) → tensorflow.python.framework.ops.Tensor[source]¶ Randomly adjust the contrast of a sample.
The contrast is adjusted by keeping the mean of the sample the same as for the original sample, and squeezing or expanding the distribution of the intensities around the mean. The amount of squeezing or expanding is randomly drawn from the range between the minimum and maximum contrast factors set during initialization.
- Parameters
sample (tf.Tensor) – Sample for which to adjust contrast
seed (tf.Tensor) – Seed to make operation repeatable. Defaults to None.
- Returns
tf.Tensor – The augmented sample
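The mean-preserving contrast adjustment described above can be written as (x - mean) * factor + mean. The sketch below shows this on a plain Python list; the uniform draw of the factor and the name scale_contrast are assumptions for illustration, not the PrognosAIs code.

```python
import random


def scale_contrast(pixels, min_factor, max_factor, rng):
    """Stretch or squeeze intensities around their mean."""
    factor = rng.uniform(min_factor, max_factor)
    mean = sum(pixels) / len(pixels)
    # factor < 1 squeezes values towards the mean, factor > 1 expands them;
    # the mean itself is unchanged.
    return [(p - mean) * factor + mean for p in pixels]


adjusted = scale_contrast([1.0, 2.0, 3.0], 0.5, 1.5, random.Random(0))
```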
-
random_cropping
(sample: tensorflow.python.framework.ops.Tensor, seed: tensorflow.python.framework.ops.Tensor = None) → tensorflow.python.framework.ops.Tensor[source]¶ Randomly crop a part of the sample.
The crop will have the crop size defined upon initialization of the augmentor. The crop is applied to all channels in the same way, and no channels are cropped out. The location of the crop is randomly drawn from the whole image.
- Parameters
sample (tf.Tensor) – The sample to be cropped
seed (tf.Tensor) – Seed to make operation repeatable. Defaults to None.
- Returns
tf.Tensor – The augmented sample
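To make the cropping behaviour concrete, the sketch below crops a 2D image stored as nested lists indexed as image[row][column][channel]. That layout, the uniform choice of the window position, and the name crop_window are assumptions for this example only.

```python
import random


def crop_window(image, crop_size, rng):
    """Cut a random (height, width) window; channels are left untouched."""
    crop_h, crop_w = crop_size
    height, width = len(image), len(image[0])
    # Pick the top-left corner so the window fits inside the image.
    top = rng.randint(0, height - crop_h)
    left = rng.randint(0, width - crop_w)
    # Slicing rows and columns keeps the innermost channel lists intact.
    return [row[left:left + crop_w] for row in image[top:top + crop_h]]


image = [[[r, c] for c in range(5)] for r in range(4)]  # 4 x 5, 2 channels
patch = crop_window(image, (2, 3), random.Random(0))
```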
-
random_flipping
(sample: tensorflow.python.framework.ops.Tensor, seed: tensorflow.python.framework.ops.Tensor = None) → tensorflow.python.framework.ops.Tensor[source]¶ Randomly flip the sample along one or more axes.
- Parameters
sample (tf.Tensor) – Sample for which to apply flipping
seed (tf.Tensor) – Seed to make operation repeatable. Defaults to None.
- Returns
tf.Tensor – The augmented sample
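A minimal sketch of a random flip on a 2D grid of nested lists follows. Whether PrognosAIs flips all configured axes together or decides per axis is not stated here; this example assumes a single decision for all axes, and the name maybe_flip is hypothetical.

```python
import random


def maybe_flip(grid, axes, probability, rng):
    """Flip a 2D grid along the given axes with the given probability."""
    if rng.random() >= probability:
        return grid  # no flip this time
    flipped = grid
    if 0 in axes:
        flipped = flipped[::-1]  # reverse the row order
    if 1 in axes:
        flipped = [row[::-1] for row in flipped]  # reverse each row
    return flipped


mirrored = maybe_flip([[1, 2], [3, 4]], [0, 1], 1.0, random.Random(0))
```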
-
-
class
PrognosAIs.IO.DataGenerator.
HDF5Generator
(root_folder: str, batch_size: int = 16, shuffle: bool = False, max_steps: int = - 1, drop_batch_remainder: bool = True, labels_only: bool = False)[source]¶ Bases:
object
-
__init__
(root_folder: str, batch_size: int = 16, shuffle: bool = False, max_steps: int = - 1, drop_batch_remainder: bool = True, labels_only: bool = False) → None[source]¶ Generate data from HDF5 files to be used in a TensorFlow pipeline.
This generator loads sample data from HDF5 files, and does so efficiently by making use of TensorFlow dataset functions. The inputs and outputs are dicts, which allows for easy use in a multi-input and/or multi-output model.
- Parameters
root_folder (str) – Folder in which the HDF5 files are stored
batch_size (int, optional) – Batch size of the generator. Defaults to 16.
shuffle (bool, optional) – Whether the dataset should be shuffled. Defaults to False.
data_augmentation (bool, optional) – Whether data augmentation should be applied. Defaults to False.
augmentation_factor (int, optional) – Number of times dataset should be repeated for augmentation. Defaults to 5.
augmentation_settings (dict, optional) – Settings for the data augmentation. Defaults to None.
max_steps (int, optional) – Maximum number of (iteration) steps to provide. Defaults to -1, in which case all samples are provided.
drop_batch_remainder (bool, optional) – Whether to drop the remainder of the batch if it does not fit perfectly. Defaults to True.
labels_only (bool, optional) – Whether to only provide labels. Defaults to False.
feature_index (str, optional) – Name of the feature group in the HDF5 file. Defaults to “sample”.
label_index (str, optional) – Name of the label group in the HDF5 file. Defaults to “label”.
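The interaction of batch_size, drop_batch_remainder and max_steps can be illustrated with plain lists. This sketch assumes that one step corresponds to one batch and that a negative max_steps means no limit; batch_samples is a hypothetical helper, not part of the HDF5Generator API.

```python
def batch_samples(samples, batch_size=16, drop_remainder=True, max_steps=-1):
    """Group samples into batches, mimicking the generator settings."""
    batches = [
        samples[i:i + batch_size] for i in range(0, len(samples), batch_size)
    ]
    # Drop the last batch if it is smaller than batch_size.
    if drop_remainder and batches and len(batches[-1]) < batch_size:
        batches.pop()
    # A negative max_steps keeps every batch.
    if max_steps >= 0:
        batches = batches[:max_steps]
    return batches


full_batches = batch_samples(list(range(10)), batch_size=4)
```

With ten samples and a batch size of four, dropping the remainder leaves two full batches; keeping it adds a third batch of two samples.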
-
_get_all_dataset_attributes
(h5py_object: Union[h5py._hl.files.File, h5py._hl.dataset.Dataset, h5py._hl.group.Group]) → dict[source]¶ Run through all groups and datasets to get the attributes.
- Parameters
h5py_object (Union[h5py.File, h5py.Dataset, h5py.Group]) – Object for which to return the attributes
- Returns
dict – Mapping between feature/label name and its attributes
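The recursive walk over groups and datasets can be sketched without h5py by letting nested dicts stand in for groups and a small class stand in for datasets. Everything here is an assumption for illustration: FakeDataset, collect_attributes, the attribute names, and the choice to key the result by the full path.

```python
from dataclasses import dataclass, field


@dataclass
class FakeDataset:
    """Stand-in for an HDF5 dataset: it only carries attributes."""
    attrs: dict = field(default_factory=dict)


def collect_attributes(node, prefix=""):
    """Walk nested groups (dicts) and gather attributes per dataset."""
    found = {}
    for name, child in node.items():
        path = f"{prefix}/{name}" if prefix else name
        if isinstance(child, dict):
            # A group: descend into it and merge what is found below.
            found.update(collect_attributes(child, path))
        else:
            # A dataset: copy its attributes.
            found[path] = dict(child.attrs)
    return found


tree = {
    "sample": {"MR": FakeDataset({"dimensionality": 3})},
    "label": {"grade": FakeDataset({"one_hot": True})},
}
attributes = collect_attributes(tree)
```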
-
_get_dataset_names
(h5py_object: Union[h5py._hl.files.File, h5py._hl.dataset.Dataset, h5py._hl.group.Group]) → list[source]¶ Run through all groups and datasets to get the names.
- Parameters
h5py_object (Union[h5py.File, h5py.Dataset, h5py.Group]) – Object for which to return the dataset names
- Returns
list – Dataset names in object
-
feature_loader
(sample_location: tensorflow.python.framework.ops.Tensor) → dict[source]¶ Load the features from a hdf5 sample file.
This loader only loads the features, instead of the features and labels as done by features_and_labels_loader
- Parameters
sample_location (tf.Tensor) – Location of the sample file
- Returns
dict – Features loaded from the sample file
-
features_and_labels_loader
(sample_location: tensorflow.python.framework.ops.Tensor) → Tuple[dict, dict, tensorflow.python.framework.ops.Tensor][source]¶ Load the features and labels from a hdf5 file to be used in a TensorFlow dataset pipeline.
This loader loads the features and labels from a hdf5 file using TensorFlowIO. The outputs are therefore directly cast to tensors and can be used in a TensorFlow graph. All features and labels from the file are loaded, and a dict is returned mapping the name of each feature and label to its respective value.
- Parameters
sample_location (tf.Tensor) – Location of the sample file
- Returns
Tuple[dict, dict] – The features (first output) and labels (second output) loaded from the sample.
-
get_all_dataset_attributes
(sample_file: str = None) → dict[source]¶ Get the attributes of the features and labels stored in the file.
- Returns
dict – Mapping of the feature/label name to its attributes
-
get_dataset_attribute
(dataset_name: str, attribute_name: str) → Any[source]¶ Get the attribute of a specific dataset
- Parameters
dataset_name (str) – Name of dataset for which to get the attribute
attribute_name (str) – Name of attribute to get
- Returns
Any – The value of the attribute
-
get_dataset_names
() → list[source]¶ Get the names of all datasets in the sample.
- Returns
list – Dataset names in the sample
-
get_feature_attribute
(attribute_name: str) → dict[source]¶ Get a specific attribute for all features.
- Parameters
attribute_name (str) – Name of attribute to get
- Returns
dict – Mapping between feature names and the attribute value
-
get_feature_dimensionality
() → dict[source]¶ Get the dimensionality of each feature.
- Returns
dict – Dimensionality of each feature
-
get_feature_metadata
() → dict[source]¶ Get all metadata of all features.
- Returns
dict – The metadata of all features
-
get_feature_metadata_from_sample
(sample_location: str) → dict[source]¶ Get the feature metadata of a specific sample.
- Parameters
sample_location (str) – The file location of the sample
- Returns
dict – The feature metadata of the sample
-
get_feature_shape
() → dict[source]¶ Get the shape of each feature.
- Returns
dict – Shape of each feature
-
get_feature_size
() → dict[source]¶ Get the size of each feature.
The size of the feature does not take into account the number of channels; it only represents the size of an individual channel of the feature.
- Returns
dict – Size of each feature
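The difference between shape, size and number of channels can be shown on a shape tuple. This sketch assumes a channels-last layout (the channel count is the final entry of the shape), which is not stated in this reference; spatial_size and channel_count are hypothetical names.

```python
def spatial_size(shape):
    """Size of a single channel: the shape without the channel axis."""
    return tuple(shape[:-1])


def channel_count(shape):
    """Number of channels, assuming the last axis holds the channels."""
    return shape[-1]


shape = (30, 30, 30, 2)  # a 3D feature with two channels
size = spatial_size(shape)
channels = channel_count(shape)
```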
-
get_label_attribute
(attribute_name: str) → dict[source]¶ Get a specific attribute for all labels.
- Parameters
attribute_name (str) – Name of attribute to get
- Returns
dict – Mapping between label names and the attribute value
-
get_labels_are_one_hot
() → dict[source]¶ Get whether labels are one-hot encoded.
- Returns
dict – One-hot encoding status of each label
-
get_number_of_channels
() → dict[source]¶ Get the number of feature channels.
- Returns
dict – Number of channels for each feature
-
get_number_of_classes
() → dict[source]¶ Get the number of output classes.
- Returns
dict – Number of output classes for each label
-
get_numpy_iterator
() → numpy.nditer[source]¶ Construct a numpy iterator instead of TensorFlow dataset.
The numpy iterator will provide exactly the same data as the TensorFlow dataset. However, it might be easier to inspect the data when using a numpy iterator instead of a TensorFlow dataset
- Returns
np.nditer – The dataset
-
get_spec
() → dict[source]¶ Get the TensorSpec for all input features.
- Returns
dict – Maps the name of each input feature to the TensorSpec of the input.
-
get_tf_dataset
(num_parallel_calls: int = - 1) → tensorflow.python.data.ops.dataset_ops.DatasetV2[source]¶ Construct a TensorFlow dataset.
The dataset is constructed based on the settings supplied to the DataGenerator. The dataset can then directly be used to train or evaluate a TensorFlow model
- Parameters
num_parallel_calls (int) – Number of parallel processes to use. Defaults to tf.data.experimental.AUTOTUNE.
- Returns
tf.data.Dataset – The constructed dataset
-
label_loader
(sample_location: tensorflow.python.framework.ops.Tensor) → dict[source]¶ Load the labels from a hdf5 sample file.
This loader only loads the labels, instead of the features and labels as done by features_and_labels_loader
- Parameters
sample_location (tf.Tensor) – Location of the sample file
- Returns
dict – Labels loaded from the sample file
-
load_features
(loaded_hdf5: tensorflow_io.core.python.ops.io_tensor.IOTensor) → dict[source]¶ Load the features from a HDF5 tensor.
- Parameters
loaded_hdf5 (tfio.IOTensor) – Tensor from which to load features
- Returns
dict – Mapping between feature names and features
-
load_labels
(loaded_hdf5: tensorflow_io.core.python.ops.io_tensor.IOTensor) → dict[source]¶ Load the labels from a HDF5 tensor.
- Parameters
loaded_hdf5 (tfio.IOTensor) – Tensor from which to load labels
- Returns
dict – Mapping between label names and labels
-
setup_augmentation
(augmentation_factor: int = 1, augmentation_settings: dict = {}) → None[source]¶ Set up data augmentation in the generator.
- Parameters
augmentation_factor (int) – Repeat dataset this many times in augmentation. Defaults to 1.
augmentation_settings (dict) – Settings to pass to the augmentation instance. Defaults to {}.
-
setup_caching
(cache_in_memory: Union[bool, str] = 'AUTO', used_memory: int = 0) → None[source]¶ Set up caching of the dataset in RAM.
- Parameters
cache_in_memory (Union[bool, str]) – Whether dataset should be cached in memory. Defaults to PrognosAIs.Constants.AUTO, in which case the dataset will be cached in memory if it fits, otherwise it will not be cached
used_memory (int) – Amount of RAM (in bytes) that is already being used. Defaults to 0.
- Raises
ValueError – If an unknown cache setting is requested
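The caching decision described above (always, never, or automatically when the dataset fits in memory) can be sketched as below. The helper should_cache and its dataset_bytes and available_bytes arguments are assumptions for illustration; how PrognosAIs measures the dataset size and free RAM is not shown here.

```python
AUTO = "AUTO"  # mirrors the default shown in the signature above


def should_cache(cache_in_memory, dataset_bytes, available_bytes):
    """Decide whether to cache, given sizes in bytes."""
    if cache_in_memory is True:
        return True
    if cache_in_memory is False:
        return False
    if cache_in_memory == AUTO:
        # Cache only when the whole dataset fits in the remaining memory.
        return dataset_bytes < available_bytes
    raise ValueError(f"Unknown cache setting: {cache_in_memory!r}")


fits = should_cache(AUTO, dataset_bytes=10, available_bytes=100)
```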
-
setup_caching_shuffling_steps
(dataset: tensorflow.python.data.ops.dataset_ops.DatasetV2) → tensorflow.python.data.ops.dataset_ops.DatasetV2[source]¶ Set up caching, shuffling and the iteration step in the dataset pipeline.
This function helps to ensure that caching, shuffling and step limiting are done properly and efficiently, no matter where in the dataset pipeline they are included.
- Parameters
dataset (tf.data.Dataset) – Dataset for which to include the steps
- Returns
tf.data.Dataset – Dataset with caching, shuffling and iteration steps included
-
PrognosAIs.IO.LabelParser module¶
-
class
PrognosAIs.IO.LabelParser.
LabelLoader
(label_file: str, filter_missing: bool = False, missing_value: int = - 1, make_one_hot: bool = False, new_root_path: str = None)[source]¶ Bases:
object
-
__init__
(label_file: str, filter_missing: bool = False, missing_value: int = - 1, make_one_hot: bool = False, new_root_path: str = None) → None[source]¶ Create a label loader that can load the image paths and labels from a text file, to be used for a data generator.
- Parameters
label_file – The label file from which to read the labels
filter_missing – Whether missing values should be masked when generating one hot labels and class weights
missing_value – If filter_missing is True, this value is used to mask
make_one_hot – Whether labels should be transformed to one hot labels
new_root_path – If you want to move the files, this will be the new root path
-
encode_labels_one_hot
() → None[source]¶ Encode sample labels as one hot
- Parameters
None
- Returns
None
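For readers unfamiliar with one-hot encoding, the sketch below turns a list of class labels into one-hot vectors. It is a generic illustration with a hypothetical name (one_hot_encode); it does not handle missing values and is not the LabelLoader implementation.

```python
def one_hot_encode(labels):
    """Map each label to a vector with a single 1 at its class position."""
    classes = sorted(set(labels))
    position = {cls: i for i, cls in enumerate(classes)}
    return [
        [1 if position[label] == i else 0 for i in range(len(classes))]
        for label in labels
    ]


encoded = one_hot_encode([0, 2, 1, 0])
```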
-
get_class_weights
(json_serializable=False) → dict[source]¶ Get class weights for unbalanced labels
- Parameters
None
- Returns
Scaled_weights – The weights for each class of each label category, scaled such that the sum over all classes of weight * number of samples in the class approximates the total number of samples
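One common weighting that satisfies the scaling described above is the "balanced" scheme, where each class gets weight N / (number of classes * class count). The sketch below assumes that scheme for a single label category; whether PrognosAIs uses exactly this formula is not confirmed here, and balanced_class_weights is a hypothetical name.

```python
from collections import Counter


def balanced_class_weights(labels):
    """Weight each class inversely to its frequency."""
    counts = Counter(labels)
    n_samples = len(labels)
    n_classes = len(counts)
    # With this scaling, sum(weight * count) over classes equals n_samples.
    return {
        cls: n_samples / (n_classes * count) for cls, count in counts.items()
    }


labels = [0, 0, 0, 1]
weights = balanced_class_weights(labels)
```

The rare class (one sample out of four) receives a weight of 2.0, while the frequent class is weighted below 1.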
-
get_data
() → dict[source]¶ Get all data from the label file
- Parameters
None
- Returns
data – Dictionary mapping each sample to each label
-
get_label_categories
() → list[source]¶ Get categories of labels
- Parameters
None
- Returns
label_categories – Category names
-
get_label_category_type
(category_name: str) → type[source]¶ Get the type of a label of a specific category/class
- Parameters
category_name – Name of the category/class to get type of
- Returns
type – Type of the labels of the category
-
get_label_from_sample
(sample: str) → dict[source]¶ Get label from a sample
- Parameters
sample – The sample from which to get the label
- Returns
label – Label of the sample
-
get_labels
() → list[source]¶ Get all labels of all samples
- Parameters
None
- Returns
labels – List of labels
-
get_labels_from_category
(category_name: str) → list[source]¶ Get labels of a specific category/class
- Parameters
category_name – Name of the category/class to get
- Returns
list – Labels of the category
-
get_number_of_classes
() → dict[source]¶ Get number of classes for all categories
- Parameters
None
- Returns
number_of_classes – The number of classes for each category
-
get_number_of_classes_from_category
(category_name: str) → int[source]¶ Get number of classes for a label category
- Parameters
category_name – Category to get number of classes for
- Returns
number_of_classes – The number of classes for the category
-
get_number_of_samples
() → int[source]¶ Get number of samples
- Parameters
None
- Returns
number_of_samples – The number of samples
-
get_original_label_category_type
(category_name: str) → type[source]¶ Get the original type of a label of a specific category/class
- Parameters
category_name – Name of the category/class to get type of
- Returns
type – Type of the labels of the category
-
get_original_labels_from_category
(category_name: str) → list[source]¶ Get original labels of a specific category/class
- Parameters
category_name – Name of the category/class to get
- Returns
list – Original labels of the category
-
PrognosAIs.IO.utils module¶
-
PrognosAIs.IO.utils.
get_available_ram
(used_memory: int = 0) → int[source]¶ Get the available RAM in bytes.
- Returns
int – Available RAM in bytes
-
PrognosAIs.IO.utils.
get_dir_size
(root_dir)[source]¶ Returns total size of all files in dir (and subdirs)
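A recursive directory size can be computed with the standard library alone. The following is a minimal sketch of the idea using os.walk, under the assumption that only regular file sizes are summed; dir_size is a hypothetical name and not the PrognosAIs function itself.

```python
import os


def dir_size(root_dir):
    """Sum the sizes (in bytes) of all files below root_dir."""
    total = 0
    # os.walk visits root_dir and every subdirectory beneath it.
    for dirpath, _dirnames, filenames in os.walk(root_dir):
        for filename in filenames:
            total += os.path.getsize(os.path.join(dirpath, filename))
    return total
```

Comparing this total against the available RAM is one way to decide whether a dataset can be cached in memory.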
-
PrognosAIs.IO.utils.
get_gpu_compute_capability
(gpu: tensorflow.python.eager.context.PhysicalDevice) → tuple[source]¶
-
PrognosAIs.IO.utils.
gpu_supports_float16
(gpu: tensorflow.python.eager.context.PhysicalDevice) → bool[source]¶