Processing DICOM files containing ECG data

The process_dicom module provides tools for processing DICOM (DCM) files. It establishes an API by mapping DCM file content to the MetaData, WaveForms, and MedianBeats class attributes. Through user-supplied configuration files, both input and output data can be customized using ECGprocess or user-defined functions and methods.

Below we illustrate the core functionality of the module. First we import the relevant functions and classes, as well as some example DICOM data.

[1]:
import ecgprocess.process_dicom as pro_dicom
import ecgprocess.utils.config_tools as config_utils
from tempfile import NamedTemporaryFile
from ecgprocess.example_data.examples import (
    list_dicom_paths as list_data_paths,
    config_file,
)
from ecgprocess.constants import CoreData as Core
# get the privileged lead names
CLeads = Core.Leads

# Example file
path = list_data_paths()['example_dicom_1']

Loading a DICOM file

We will start by loading a DICOM file and then see how its content can be used to define a custom configuration file.

[2]:
reader = pro_dicom.ECGDICOMReader()
parsed_dicom = reader(path, verbose=False)
parsed_dicom.tags[0:10]
[2]:
['Specific Character Set',
 'Instance Creation Date',
 'Instance Creation Time',
 'SOP Class UID',
 'SOP Instance UID',
 'Study Date',
 'Content Date',
 'Acquisition DateTime',
 'Study Time',
 'Content Time']

Creating a configuration file based on the parsed file tags

The DICOM reader class flattens the file content into a single dictionary, where the nested attributes of the DICOM file are concatenated into individual dictionary keys. The data can be accessed directly through parsed_dicom.raw_data, and the dictionary keys (shown above) are available through the tags attribute.
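To make the key naming concrete, here is a minimal sketch of this kind of flattening. It is not the ECGDICOMReader implementation: the flatten helper and the nested input are hypothetical, and the "_<index>" and "." separators are simply inferred from tag names such as "Waveform Sequence_0.Sampling Frequency" used later in this example.

```python
def flatten(node, prefix=""):
    """Flatten nested dicts and lists of dicts into one dict with joined keys.

    Hypothetical helper for illustration only; it mimics the key style
    'Parent_<index>.Child' seen in the parsed DICOM tags.
    """
    flat = {}
    if isinstance(node, dict):
        for key, value in node.items():
            name = f"{prefix}.{key}" if prefix else key
            flat.update(flatten(value, name))
    elif isinstance(node, list) and node and all(isinstance(v, dict) for v in node):
        # Sequence items get a positional suffix, e.g. "Waveform Sequence_0".
        for idx, item in enumerate(node):
            flat.update(flatten(item, f"{prefix}_{idx}"))
    else:
        # Leaf value (scalars and plain lists are stored unchanged).
        flat[prefix] = node
    return flat


# Made-up nested content; the UID is a placeholder, not real data.
nested = {
    "SOP Instance UID": "1.2.3",
    "Software Versions": ["1.02 SP03", "MUSE_9.0.9.18167"],
    "Waveform Sequence": [
        {
            "Sampling Frequency": "500.0",
            "Channel Definition Sequence": [{"Channel Sensitivity": "4.88"}],
        },
    ],
}

flat = flatten(nested)
for key in flat:
    print(key)
```

Each printed key corresponds to one leaf of the nested structure, which is what makes a flat, line-based configuration file sufficient to address any element.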

We will use these tags to create a configuration file that maps DICOM content to the class attributes MetaData, WaveForms, and MedianBeats. The configuration is a dictionary of lists containing tab-delimited strings, where the left-hand side is the internal name used by ECGprocess and the right-hand side is the DICOM tag name. In the current example we write this dictionary to a file and immediately read it back in. In real applications one would typically keep a configuration file on disk and reuse it across multiple analyses.

[3]:
leads = [attr for attr in dir(CLeads) if not attr.startswith("__")]
config_dicom = {
        "WaveForms": [
            f"{getattr(CLeads, lead)}\tWaveformData.ECG_Leads_{idx}"
            for idx, lead in enumerate(leads)],
        "MedianBeats": [
            f"{getattr(CLeads, lead)}\tWaveformData.Median_Beats_{idx}"
            for idx, lead in enumerate(leads)],
        "MetaData": [
            "unique identifier\tSOP Instance UID",
            "number of leads\tWaveform Sequence_0.Number of Waveform Channels",
            "resolution unit (waveforms)\tWaveform Sequence_0.Channel Definition Sequence_0.Channel Sensitivity Units Sequence_0.Code Value",
            "resolution (waveforms)\tWaveform Sequence_0.Channel Definition Sequence_0.Channel Sensitivity",
            "resolution unit (medianbeats)\tWaveform Sequence_1.Channel Definition Sequence_0.Channel Sensitivity Units Sequence_0.Code Value",
            "resolution (medianbeats)\tWaveform Sequence_1.Channel Definition Sequence_0.Channel Sensitivity",
            "sampling frequency (original)\tWaveform Sequence_0.Sampling Frequency",
            "sampling number (waveforms)\tWaveform Sequence_0.Number of Waveform Samples",
            "sampling number (medianbeats)\tWaveform Sequence_1.Number of Waveform Samples",
            "Softwave version\tSoftware Versions",
            "Manufacturer\tManufacturer",
            "Model name\tManufacturer's Model Name",
        ] +\
        [
            f"wave_channel_sens_{idx}\tWaveform Sequence_0.Channel Definition Sequence_{idx}.Channel Sensitivity" for idx, _ in enumerate(leads)] +\
        [
            f"wave_channel_correctionfactor_{idx}\tWaveform Sequence_0.Channel Definition Sequence_{idx}.Channel Sensitivity Correction Factor" for idx, _ in enumerate(leads)] +\
        [
            f"wave_channel_baseline_{idx}\tWaveform Sequence_0.Channel Definition Sequence_{idx}.Channel Baseline" for idx, _ in enumerate(leads)]
    }

with NamedTemporaryFile("w") as tmp_file:
    _ = config_file(path=tmp_file.name, text=config_dicom)
    parser_dicom = config_utils.ConfigParser(tmp_file.name)()
# adding the mapper
parser_dicom.map(mapper=config_utils.DataMap())
print(parser_dicom)
ConfigParser
[WaveForms]
        I                                   WaveformData.ECG_Leads_0
        II                                  WaveformData.ECG_Leads_1
        III                                 WaveformData.ECG_Leads_2
        V1                                  WaveformData.ECG_Leads_3
        V2                                  WaveformData.ECG_Leads_4
        V3                                  WaveformData.ECG_Leads_5
        V4                                  WaveformData.ECG_Leads_6
        V5                                  WaveformData.ECG_Leads_7
        V6                                  WaveformData.ECG_Leads_8
        aVF                                 WaveformData.ECG_Leads_9
        aVL                                 WaveformData.ECG_Leads_10
        aVR                                 WaveformData.ECG_Leads_11

[MedianBeats]
        I                                   WaveformData.Median_Beats_0
        II                                  WaveformData.Median_Beats_1
        III                                 WaveformData.Median_Beats_2
        V1                                  WaveformData.Median_Beats_3
        V2                                  WaveformData.Median_Beats_4
        V3                                  WaveformData.Median_Beats_5
        V4                                  WaveformData.Median_Beats_6
        V5                                  WaveformData.Median_Beats_7
        V6                                  WaveformData.Median_Beats_8
        aVF                                 WaveformData.Median_Beats_9
        aVL                                 WaveformData.Median_Beats_10
        aVR                                 WaveformData.Median_Beats_11

[MetaData]
        unique identifier                   SOP Instance UID
        number of leads                     Waveform Sequence_0.Number of Waveform Channels
        resolution unit (waveforms)         Waveform Sequence_0.Channel Definition Sequence_0.Channel Sensitivity Units Sequence_0.Code Value
        resolution (waveforms)              Waveform Sequence_0.Channel Definition Sequence_0.Channel Sensitivity
        resolution unit (medianbeats)       Waveform Sequence_1.Channel Definition Sequence_0.Channel Sensitivity Units Sequence_0.Code Value
        resolution (medianbeats)            Waveform Sequence_1.Channel Definition Sequence_0.Channel Sensitivity
        sampling frequency (original)       Waveform Sequence_0.Sampling Frequency
        sampling number (waveforms)         Waveform Sequence_0.Number of Waveform Samples
        sampling number (medianbeats)       Waveform Sequence_1.Number of Waveform Samples
        Softwave version                    Software Versions
        Manufacturer                        Manufacturer
        Model name                          Manufacturer's Model Name
        wave_channel_sens_0                 Waveform Sequence_0.Channel Definition Sequence_0.Channel Sensitivity
        wave_channel_sens_1                 Waveform Sequence_0.Channel Definition Sequence_1.Channel Sensitivity
        wave_channel_sens_2                 Waveform Sequence_0.Channel Definition Sequence_2.Channel Sensitivity
        wave_channel_sens_3                 Waveform Sequence_0.Channel Definition Sequence_3.Channel Sensitivity
        wave_channel_sens_4                 Waveform Sequence_0.Channel Definition Sequence_4.Channel Sensitivity
        wave_channel_sens_5                 Waveform Sequence_0.Channel Definition Sequence_5.Channel Sensitivity
        wave_channel_sens_6                 Waveform Sequence_0.Channel Definition Sequence_6.Channel Sensitivity
        wave_channel_sens_7                 Waveform Sequence_0.Channel Definition Sequence_7.Channel Sensitivity
        wave_channel_sens_8                 Waveform Sequence_0.Channel Definition Sequence_8.Channel Sensitivity
        wave_channel_sens_9                 Waveform Sequence_0.Channel Definition Sequence_9.Channel Sensitivity
        wave_channel_sens_10                Waveform Sequence_0.Channel Definition Sequence_10.Channel Sensitivity
        wave_channel_sens_11                Waveform Sequence_0.Channel Definition Sequence_11.Channel Sensitivity
        wave_channel_correctionfactor_0     Waveform Sequence_0.Channel Definition Sequence_0.Channel Sensitivity Correction Factor
        wave_channel_correctionfactor_1     Waveform Sequence_0.Channel Definition Sequence_1.Channel Sensitivity Correction Factor
        wave_channel_correctionfactor_2     Waveform Sequence_0.Channel Definition Sequence_2.Channel Sensitivity Correction Factor
        wave_channel_correctionfactor_3     Waveform Sequence_0.Channel Definition Sequence_3.Channel Sensitivity Correction Factor
        wave_channel_correctionfactor_4     Waveform Sequence_0.Channel Definition Sequence_4.Channel Sensitivity Correction Factor
        wave_channel_correctionfactor_5     Waveform Sequence_0.Channel Definition Sequence_5.Channel Sensitivity Correction Factor
        wave_channel_correctionfactor_6     Waveform Sequence_0.Channel Definition Sequence_6.Channel Sensitivity Correction Factor
        wave_channel_correctionfactor_7     Waveform Sequence_0.Channel Definition Sequence_7.Channel Sensitivity Correction Factor
        wave_channel_correctionfactor_8     Waveform Sequence_0.Channel Definition Sequence_8.Channel Sensitivity Correction Factor
        wave_channel_correctionfactor_9     Waveform Sequence_0.Channel Definition Sequence_9.Channel Sensitivity Correction Factor
        wave_channel_correctionfactor_10    Waveform Sequence_0.Channel Definition Sequence_10.Channel Sensitivity Correction Factor
        wave_channel_correctionfactor_11    Waveform Sequence_0.Channel Definition Sequence_11.Channel Sensitivity Correction Factor
        wave_channel_baseline_0             Waveform Sequence_0.Channel Definition Sequence_0.Channel Baseline
        wave_channel_baseline_1             Waveform Sequence_0.Channel Definition Sequence_1.Channel Baseline
        wave_channel_baseline_2             Waveform Sequence_0.Channel Definition Sequence_2.Channel Baseline
        wave_channel_baseline_3             Waveform Sequence_0.Channel Definition Sequence_3.Channel Baseline
        wave_channel_baseline_4             Waveform Sequence_0.Channel Definition Sequence_4.Channel Baseline
        wave_channel_baseline_5             Waveform Sequence_0.Channel Definition Sequence_5.Channel Baseline
        wave_channel_baseline_6             Waveform Sequence_0.Channel Definition Sequence_6.Channel Baseline
        wave_channel_baseline_7             Waveform Sequence_0.Channel Definition Sequence_7.Channel Baseline
        wave_channel_baseline_8             Waveform Sequence_0.Channel Definition Sequence_8.Channel Baseline
        wave_channel_baseline_9             Waveform Sequence_0.Channel Definition Sequence_9.Channel Baseline
        wave_channel_baseline_10            Waveform Sequence_0.Channel Definition Sequence_10.Channel Baseline
        wave_channel_baseline_11            Waveform Sequence_0.Channel Definition Sequence_11.Channel Baseline
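The listing above pairs each internal name with a DICOM tag. As a rough sketch of how such tab-delimited lines could be split and then looked up in a flattened tag dictionary, consider the following. The helpers parse_mapping and apply_mapping are hypothetical and are not part of ECGprocess; ConfigParser and extract may work differently internally. The tag "Not A Real Tag" is deliberately absent to show a None fallback.

```python
def parse_mapping(lines):
    """Split 'internal name<TAB>DICOM tag' strings into a dict (hypothetical helper)."""
    mapping = {}
    for line in lines:
        internal, tag = line.split("\t", 1)
        mapping[internal.strip()] = tag.strip()
    return mapping


def apply_mapping(mapping, flat):
    """Look up each tag in a flat dict; tags that are absent yield None."""
    return {internal: flat.get(tag) for internal, tag in mapping.items()}


lines = [
    "unique identifier\tSOP Instance UID",
    "Manufacturer\tManufacturer",
    "sampling frequency unit\tNot A Real Tag",  # intentionally missing below
]

# A small stand-in for the flattened file content, using values shown in this example.
flat = {
    "SOP Instance UID": "2.25.269796857626990821315969488216511468638",
    "Manufacturer": "GE Healthcare",
}

mapping = parse_mapping(lines)
extracted = apply_mapping(mapping, flat)
print(extracted)
```

Keeping the mapping as plain text means the same parsing code can serve files from different manufacturers; only the right-hand side of each line needs to change.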

[4]:
### Mapping the file content to the API entry points
extract = parsed_dicom.extract(config=parser_dicom)
### showing the API content
# Notice that lead names which are excluded by the config file are set to `None`. This is ensured by the `DataMap`
# class, which makes sure privileged data such as the leads are always present.
print(f'Metadata:\n{extract.MetaData}\nWaveforms:\n{extract.WaveForms}')
Metadata:
{'unique identifier': '2.25.269796857626990821315969488216511468638', 'number of leads': 12, 'resolution unit (waveforms)': 'uV', 'resolution unit (medianbeats)': 'uV', 'resolution (waveforms)': '4.88', 'resolution (medianbeats)': '4.88', 'sampling frequency (original)': '500.0', 'sampling frequency unit': None, 'sampling number (waveforms)': 5000, 'sampling number (medianbeats)': 600, 'Softwave version': ['1.02 SP03', 'MUSE_9.0.9.18167'], 'Manufacturer': 'GE Healthcare', 'Model name': 'MV360', 'wave_channel_sens_0': '4.88', 'wave_channel_sens_1': '4.88', 'wave_channel_sens_2': '4.88', 'wave_channel_sens_3': '4.88', 'wave_channel_sens_4': '4.88', 'wave_channel_sens_5': '4.88', 'wave_channel_sens_6': '4.88', 'wave_channel_sens_7': '4.88', 'wave_channel_sens_8': '4.88', 'wave_channel_sens_9': '4.88', 'wave_channel_sens_10': '4.88', 'wave_channel_sens_11': '4.88', 'wave_channel_correctionfactor_0': '1.0', 'wave_channel_correctionfactor_1': '1.0', 'wave_channel_correctionfactor_2': '1.0', 'wave_channel_correctionfactor_3': '1.0', 'wave_channel_correctionfactor_4': '1.0', 'wave_channel_correctionfactor_5': '1.0', 'wave_channel_correctionfactor_6': '1.0', 'wave_channel_correctionfactor_7': '1.0', 'wave_channel_correctionfactor_8': '1.0', 'wave_channel_correctionfactor_9': '1.0', 'wave_channel_correctionfactor_10': '1.0', 'wave_channel_correctionfactor_11': '1.0', 'wave_channel_baseline_0': '0.0', 'wave_channel_baseline_1': '0.0', 'wave_channel_baseline_2': '0.0', 'wave_channel_baseline_3': '0.0', 'wave_channel_baseline_4': '0.0', 'wave_channel_baseline_5': '0.0', 'wave_channel_baseline_6': '0.0', 'wave_channel_baseline_7': '0.0', 'wave_channel_baseline_8': '0.0', 'wave_channel_baseline_9': '0.0', 'wave_channel_baseline_10': '0.0', 'wave_channel_baseline_11': '0.0', 'duration (sec)': 10.0, 'sampling frequency (processed)': None}
Waveforms:
{'I': array([  0.  ,   0.  ,  29.28, ..., -48.8 , -53.68, -34.16]), 'II': array([   0.  ,   -9.76,    9.76, ..., -102.48,  -82.96,  -58.56]), 'III': array([  0.  ,  -9.76, -19.52, ..., -53.68, -29.28, -24.4 ]), 'V1': array([  0.  ,   4.88, -19.52, ...,  73.2 ,  68.32,  43.92]), 'V2': array([ 0.  ,  4.88, 24.4 , ...,  0.  , -9.76, -4.88]), 'V3': array([  0.  ,  -9.76,  -4.88, ..., -78.08, -53.68, -39.04]), 'V4': array([-4.88,  0.  , -9.76, ..., 24.4 , 24.4 , 24.4 ]), 'V5': array([ 9.76, 24.4 , 24.4 , ...,  0.  ,  0.  ,  0.  ]), 'V6': array([24.4 , 34.16, 29.28, ...,  0.  ,  0.  ,  0.  ]), 'aVF': array([ 29.28,  34.16,  39.04, ..., -29.28, -34.16, -29.28]), 'aVL': array([-14.64, -14.64,  -9.76, ..., -78.08, -82.96, -78.08]), 'aVR': array([-34.16, -29.28, -24.4 , ..., -87.84, -92.72, -87.84])}
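Several of the extracted metadata values are strings (for example '500.0' and '4.88'), so they need converting before use in arithmetic. The sketch below copies a few values from the output above and checks two things: that the number of samples divided by the sampling frequency reproduces the reported duration, and that the printed amplitudes are consistent with integer sample counts scaled by the channel sensitivity. The raw counts are hypothetical (inferred by dividing the printed amplitudes by 4.88), and the scaling formula is an assumption about the conversion, not a statement of what ECGprocess does internally.

```python
# Values copied from the printed MetaData; note that several arrive as strings.
meta = {
    "sampling number (waveforms)": 5000,
    "sampling frequency (original)": "500.0",
    "wave_channel_sens_0": "4.88",
    "wave_channel_correctionfactor_0": "1.0",
    "wave_channel_baseline_0": "0.0",
}

n_samples = int(meta["sampling number (waveforms)"])
fs = float(meta["sampling frequency (original)"])
duration_sec = n_samples / fs  # samples / (samples per second)

sensitivity = float(meta["wave_channel_sens_0"])
correction = float(meta["wave_channel_correctionfactor_0"])
baseline = float(meta["wave_channel_baseline_0"])

# Hypothetical raw integer counts for lead I (first three and last three samples).
raw_counts = [0, 0, 6, -10, -11, -7]

# Assumed scaling. The baseline is 0.0 in this file, so where it enters
# the formula does not affect the result here.
scaled = [(count + baseline) * sensitivity * correction for count in raw_counts]

print(duration_sec)
print([round(value, 2) for value in scaled])
```

The computed duration agrees with the 'duration (sec)' entry of 10.0, and the rounded amplitudes match the first and last values printed for lead I, in the unit given by 'resolution unit (waveforms)' (uV).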