Processing XML files containing ECG data
The process_xml module offers a robust codebase for validating and processing XML files. It establishes an API by mapping XML content to the MetaData, WaveForms, and MedianBeats class attributes. By utilizing user-supplied configuration files, both input and output data can be fully customized using ECGprocess or user-defined functions and methods.
Below we illustrate the core functionality of the module. First, we import the relevant functions and classes, as well as some example XML data.
[1]:
import ecgprocess.process_xml as pro_xml
import ecgprocess.utils.config_tools as config_utils
from tempfile import NamedTemporaryFile
from ecgprocess.example_data.examples import (
    config_file,
    list_xml_paths,
)
# the XML file and XSD schema paths
path = list_xml_paths()['example_1']
schema = list_xml_paths()['example_1_schema']
Loading an XML file
We will start by loading an XML file and inspecting the tags it contains, which we can then use to define a custom configuration file.
[2]:
reader = pro_xml.ECGXMLReader()
parsed_xml = reader(path, verbose=False)
parsed_xml.tags[0:10]
[2]:
['ObservationType',
'ObservationDateTime.Hour',
'ObservationDateTime.Minute',
'ObservationDateTime.Second',
'ObservationDateTime.Day',
'ObservationDateTime.Month',
'ObservationDateTime.Year',
'UID.DICOMStudyUID',
'ClinicalInfo.ReasonForStudy',
'ClinicalInfo.Technician.FamilyName']
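The dotted tag names above suggest that nested XML elements are flattened into a single path per leaf element. The following is a minimal sketch of that idea using only the standard library. It is not the ECGprocess implementation: the `flatten_tags` helper and the XML fragment (including its values) are hypothetical and only mirror a few of the tag names listed above.

```python
import xml.etree.ElementTree as ET

# Hypothetical XML fragment; element names mirror some of the tags shown above,
# the values are made up for illustration.
EXAMPLE_XML = """
<Root>
  <ObservationType>RestECG</ObservationType>
  <ObservationDateTime>
    <Hour>10</Hour>
    <Minute>31</Minute>
  </ObservationDateTime>
  <UID>
    <DICOMStudyUID>1.2.3</DICOMStudyUID>
  </UID>
</Root>
""".strip()


def flatten_tags(element, prefix=""):
    """Return dotted tag names for every leaf element below `element`."""
    tags = []
    for child in element:
        name = f"{prefix}.{child.tag}" if prefix else child.tag
        if len(child):
            # The child has nested elements, so descend and extend the path.
            tags.extend(flatten_tags(child, name))
        else:
            # Leaf element that holds a value.
            tags.append(name)
    return tags


root = ET.fromstring(EXAMPLE_XML)
flat_tags = flatten_tags(root)
print(flat_tags)
```

Names produced this way identify each value by its position in the XML tree, which is what makes them convenient keys for a configuration file.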
Validating an XML file and applying a minimal amount of data augmentation
The XML reader class can additionally validate the XML file against an XML schema. XML validation is highly recommended to ensure the content matches expectations and to identify files with an unanticipated structure.
Furthermore, if the augmented leads are omitted from the source XML file, the reader can calculate them (augment_leads=True). It can also resample the ECG signals to a standardized frequency of 500 Hz (resample_500=True).
[5]:
reader = pro_xml.ECGXMLReader(augment_leads=True, resample_500=True)
parsed_xml = reader(path, schema=schema, verbose=False)
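To make the two options more concrete, here is a minimal sketch of what such augmentation involves. The augmented limb leads follow from leads I and II through the standard relations aVR = -(I + II)/2, aVL = I - II/2, and aVF = II - I/2. The resampling step below uses plain linear interpolation, which is an assumption made for brevity; ECGprocess may well use a different method. The function names `augment_leads` and `resample_linear` and the sample values are hypothetical and not part of the ECGprocess API.

```python
import numpy as np


def augment_leads(lead_i, lead_ii):
    """Derive the augmented limb leads from leads I and II."""
    lead_i = np.asarray(lead_i, dtype=float)
    lead_ii = np.asarray(lead_ii, dtype=float)
    return {
        "aVR": -(lead_i + lead_ii) / 2,
        "aVL": lead_i - lead_ii / 2,
        "aVF": lead_ii - lead_i / 2,
    }


def resample_linear(signal, source_hz, target_hz=500):
    """Resample a signal to `target_hz` by linear interpolation (illustrative only)."""
    signal = np.asarray(signal, dtype=float)
    # Keep the recording duration fixed and change only the sample spacing.
    n_target = int(round(len(signal) * target_hz / source_hz))
    t_source = np.arange(len(signal)) / source_hz
    t_target = np.arange(n_target) / target_hz
    # np.interp holds the last value for target times past the final source sample.
    return np.interp(t_target, t_source, signal)


# Made-up samples (in mV) recorded at 250 Hz.
lead_i = [0.0, 0.2, 1.0, 0.2]
lead_ii = [0.0, 0.4, 1.6, 0.2]

augmented = augment_leads(lead_i, lead_ii)
resampled_i = resample_linear(lead_i, source_hz=250)
print(augmented["aVF"], len(resampled_i))
```

A quick sanity check on derived leads is that aVR + aVL + aVF sums to zero at every sample.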