Usage#
OkaPy is designed for processing 3D DICOM images.
There are two primary use cases for OkaPy:
- Conversion to NIfTI Files: OkaPy allows you to convert unsorted DICOM files to NIfTI files, specifically handling 3D images.
- Feature Extraction from DICOM Files: OkaPy can also extract features directly from unsorted DICOM files. In this scenario, it first converts the DICOM files to NIfTI format and then uses PyRadiomics to perform the feature extraction.
Example of the first usage:
from okapy.dicomconverter.converter import NiftiConverter
path_input = "path/to/DICOM/folder"
path_output = "path/to/NIfTI/folder"
converter = NiftiConverter()
result_conversion = converter(path_input, output_folder=path_output)
The result_conversion is a summary of the conversion. Calling okapy.dicomconverter.converter.NiftiConverter like this reads all the DICOM files from path_input, converts all the images to NIfTI, and stores them in the path_output folder.
If RTSTRUCT files are present, each label contained in them is stored in a separate NIfTI file (this behaviour can be controlled with the labels_startswith parameter, which can be passed to the constructor of okapy.dicomconverter.converter.NiftiConverter).
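For instance, a minimal sketch of exporting only the RTSTRUCT labels starting with a given prefix, assuming labels_startswith takes a string prefix:
from okapy.dicomconverter.converter import NiftiConverter

# Hypothetical prefix: only labels starting with "GTV" would be written to NIfTI files
converter = NiftiConverter(labels_startswith="GTV")
result_conversion = converter("path/to/DICOM/folder",
                              output_folder="path/to/NIfTI/folder")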
Example of the second usage:
from okapy.dicomconverter.converter import ExtractorConverter
path_input = "path/to/DICOM/folder"
path_to_params = "path/to/parameters.yaml"
converter = ExtractorConverter.from_params(path_to_params)
result_conversion = converter(path_input)
The result_conversion is a pandas.DataFrame containing all the feature values for each image.
To use the converter this way, one must ensure that a segmentation is present for each study, in the form of SEG or RTSTRUCT files.
For PET images, one might want to use an RTSTRUCT drawn on the CT; this can be achieved with the parameter combine_segmentation: True in the parameter file. More details on the parameter file are provided below.
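Since the result is a plain pandas.DataFrame, it can be inspected and saved like any other. A minimal sketch (the exact column names depend on the chosen result_format):
result_conversion.head()  # inspect the first few rows
result_conversion.to_csv("features.csv", index=False)  # persist the extracted features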
Parameter File#
The parameter file for OkaPy is a YAML file containing configuration settings for various aspects of the processing pipeline. Here is an example:
general:
  padding: 10 # padding in [mm] around the union of the segmentations, to avoid resampling a huge image; "whole_image" is a possible value if you want to resample the whole image
  submodalities: False # used for MR; if True, OkaPy parses the SeriesDescription DICOM tag for ` --- ` and appends the following string to the modality
  combine_segmentation: False # whether the segmentations within a study should be used for all modalities (can be useful for PT when the RTSTRUCT is drawn on the CT)
  result_format: "long" # "long" or "multiindex", "long" should be used
  additional_dicom_tags: # add DICOM tags here; they will be added to the final DataFrame
    - "SeriesInstanceUID"
volume_preprocessing: # all the preprocessing applied to the images, by modality
  common: # applied to all the images
  PT:
    bspline_resampler: # name defined when subclassing `okapy.dicomconverter.volume_processor.VolumeProcessor`
      resampling_spacing: [2.0, 2.0, 2.0]
      order: 3
  CT:
    bspline_resampler:
      resampling_spacing: [1.0, 1.0, 1.0]
      order: 3
  default: # applied if the modality is not defined above
mask_preprocessing: # all the preprocessing applied to the segmentations (RTSTRUCT or SEG); the "pixel_spacing" is inferred from the image each mask corresponds to
  default:
    binary_bspline_resampler:
      order: 3
feature_extraction: # parameters for the feature extraction, can be defined for each modality/submodality
  MR:
    pyradiomics: # the following are the parameters for PyRadiomics; you can paste a parameters.yaml from their GitHub with the right indentation
      original:
        imageType:
          Original: {}
        featureClass:
          shape:
          firstorder:
        setting:
          normalize: False
          normalizeScale: 100 # this allows you to use more or less the same bin width
          binWidth: 5
          voxelArrayShift: 0
          label: 1
  common: # feature extraction applied to all modalities
    riesz: # not supported out of the box, a MATLAB image of QuantImage v1 is used for that
      extractor0:
        RieszOrder: 1
        RieszScales: 4
Other examples can be found in parameters/.
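As a quick sanity check, the parameter file can be loaded with PyYAML and inspected before running the extraction. A minimal sketch, assuming a file following the structure shown above:
import yaml

with open("path/to/parameters.yaml") as f:
    params = yaml.safe_load(f)

# List the modalities that have a dedicated image preprocessing pipeline
print(list(params["volume_preprocessing"].keys()))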
The preprocessing applied to the images and segmentations can be tailored to your needs. This step is abstracted by the class okapy.dicomconverter.volume_processor.VolumeProcessor.
By subclassing this class, you can define your own preprocessing steps.
Subclassing VolumeProcessor#
You can create a custom processor by subclassing okapy.dicomconverter.volume_processor.VolumeProcessor.
When doing so, it’s essential to specify a name parameter for reference in the parameter YAML file. Here’s an example:
class MyImageProcessor(VolumeProcessor, name="my_image_processor"):
    def __init__(self, *args, my_arg1=None, my_arg2=None, **kwargs):
        super().__init__(*args, **kwargs)
        self.my_arg1 = my_arg1
        self.my_arg2 = my_arg2

    def process(self, volume, *args, mask_file=None, **kwargs):
        # your processing
        return volume
The MyImageProcessor example demonstrates subclassing VolumeProcessor, specifying initialization arguments through __init__, and implementing the processing logic in the process method. Be sure to provide the required name parameter.
Requirements for the subclass:
- Implement the __init__() method, specifying the preprocessing arguments.
- Implement the process() method, which receives the image (the volume variable) and the segmentation (the mask_file variable).
The base class, okapy.dicomconverter.volume_processor.VolumeProcessor, provides access to the segmentation resampler through the mask_resampler attribute.
Additional Example: Masked Standardizer#
Here’s an additional example illustrating a processor that standardizes an image within a region defined by a specific label:
import numpy as np


class MaskedStandardizer(VolumeProcessor, name="masked_standardizer"):
    """Standardize the image intensities within the region defined by `mask_label`."""

    def __init__(self, *args, mask_label="", mask_resampler=None, **kwargs):
        super().__init__(*args, **kwargs)
        self.mask_label = mask_label
        if mask_resampler is None:
            raise TypeError("mask_resampler cannot be None")
        self.mask_resampler = mask_resampler

    def _get_mask_array(self, mask_files, reference_frame=None):
        # Look for the requested label among the segmentation files
        mask = None
        for f in mask_files:
            if self.mask_label in f.labels:
                mask = f.get_volume(self.mask_label)
                break
        if mask is None:
            raise MissingSegmentationException(
                f"The label {self.mask_label} was not found.")
        # Resample the mask to the reference frame of the image and binarize it
        return self.mask_resampler(
            mask, new_reference_frame=reference_frame).array != 0

    def process(self, volume, mask_files=None, **kwargs):
        array = volume.array
        mask_array = self._get_mask_array(
            mask_files, reference_frame=volume.reference_frame)
        # Standardize the whole image using statistics computed inside the mask
        mean = np.mean(array[mask_array])
        std = np.std(array[mask_array])
        array = (array - mean) / std
        volume.array = array
        return volume
Now, if you want to use the MaskedStandardizer in your pipeline, simply write this in your YAML file (under the volume_preprocessing section):
volume_preprocessing:
  MR:
    masked_standardizer: # the name you defined when subclassing
      mask_label: "edema" # the argument you defined; here "edema" is used as an example
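Finally, to use a custom processor in the pipeline, the module defining it must be imported before the converter is built so that the subclass gets registered under its name. A minimal sketch, assuming the class lives in a hypothetical module of yours named my_processors:
import my_processors  # hypothetical module containing the MaskedStandardizer subclass

from okapy.dicomconverter.converter import ExtractorConverter

converter = ExtractorConverter.from_params("path/to/parameters.yaml")
results = converter("path/to/DICOM/folder")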