Engineering callables in ECGProcess
ecgprocess.utils.engineering_tools provides ready-made callable objects and functions that can be passed directly to ECGTable via the engineer_meta, engineer_wave, and engineer_median parameters. Each callable receives the relevant data structure (a metadata dict or a signals dict) as its first positional argument, and the waveform/median callables also receive the current file’s metadata through a meta_dict keyword argument.
The callables shown here are not exhaustive. Users are encouraged to write (and contribute) their own callables for their specific problems.
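As a minimal sketch of what a user-written callable might look like under the contract described above (signals dict as the first positional argument, metadata supplied through a meta_dict keyword), the hypothetical clip_signals below limits every sample to a fixed range. The function name, the limit parameter, and the use of plain lists instead of NumPy arrays are illustrative assumptions, not part of ecgprocess. Which further keyword arguments ECGTable forwards is not stated here, so the sketch accepts **kwargs defensively.

```python
def clip_signals(signals, limit=5000.0, meta_dict=None, **kwargs):
    """Hypothetical engineer_wave-style callable: clip samples to +/- limit."""
    clipped = {}
    for lead, values in signals.items():
        # Clamp each sample into the interval [-limit, limit]
        clipped[lead] = [max(-limit, min(limit, v)) for v in values]
    return clipped


# Stand-in data; real signals would come from the reader
example = {'I': [0.0, 7500.0, -9000.0], 'II': [10.0, 20.0, 30.0]}
result = clip_signals(example, limit=5000.0, meta_dict={})
print(result['I'])   # [0.0, 5000.0, -5000.0]
print(result['II'])  # [10.0, 20.0, 30.0]
```

A callable of this shape returns a new dict rather than mutating its input, which keeps it safe to reuse for both waveform and median signals.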
[1]:
import numpy as np
from ecgprocess.utils.engineering_tools import (
    metadata_checkversion,
    signal_correction,
    signal_standardise_res,
    LeadMapper,
)
from ecgprocess.example_data.examples import (
    parsed_config,
    list_dicom_paths,
)
from ecgprocess.process_dicom import ECGDICOMReader
from ecgprocess.tabular import ECGTable
from ecgprocess.errors import FileValidationError
# Relevant paths
dicom_paths = [
    list_dicom_paths()['example_dicom_1'],
    list_dicom_paths()['example_dicom_2'],
]
# Parsed config
parser_dicom = parsed_config()['parsed_dicom1']
metadata_checkversion
metadata_checkversion validates three fields in the metadata dict (software version, manufacturer, and model name) against expected values supplied at call time. When validation passes, the original dict is returned unchanged; any mismatch raises a FileValidationError.
The function is designed to be used as the engineer_meta callable in ECGTable so that files from unexpected devices are rejected before signal processing begins.
[2]:
# Direct call: passing validation
meta_valid = {
    'Softwave version': '1.02 SP03',
    'Manufacturer': 'GE Healthcare',
    'Model name': 'MV360',
}
result = metadata_checkversion(
    meta_valid,
    expected_version=['1.02 SP03', 'MUSE_9.0.9.18167'],
    expected_manufacturer='GE Healthcare',
    expected_model='MV360',
)
print('Validation passed; returned dict is same object:', result is meta_valid)
Validation passed; returned dict is same object: True
[3]:
# Direct call: failing validation (wrong manufacturer and model;
# the error below reports the manufacturer mismatch)
meta_invalid = {
    'Softwave version': '1.02 SP03',
    'Manufacturer': 'GE',
    'Model name': 'MAC55',
}
try:
    metadata_checkversion(
        meta_invalid,
        expected_version=['1.02 SP03', 'MUSE_9.0.9.18167'],
        expected_manufacturer='GE Healthcare',
        expected_model='MV360',
    )
except FileValidationError as exc:
    print('FileValidationError:', exc)
FileValidationError: Manufacturer `GE` failed to validate; expected one of ['GE Healthcare'].
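To make the validation behaviour concrete, the following is a rough, dependency-free sketch of the kind of check described in this section. It is not the library's implementation: ValidationError, check_field, and check_metadata are hypothetical stand-ins, the order in which fields are checked is an assumption, and the metadata keys are copied verbatim from the cells above.

```python
class ValidationError(Exception):
    """Hypothetical stand-in for ecgprocess.errors.FileValidationError."""


def check_field(meta, key, expected):
    # Accept either a single expected value or a list of acceptable values
    allowed = [expected] if isinstance(expected, str) else list(expected)
    value = meta.get(key)
    if value not in allowed:
        raise ValidationError(
            f"{key} `{value}` failed to validate; expected one of {allowed}."
        )


def check_metadata(meta, expected_version, expected_manufacturer, expected_model):
    # Assumed order of checks: version, then manufacturer, then model
    check_field(meta, 'Softwave version', expected_version)
    check_field(meta, 'Manufacturer', expected_manufacturer)
    check_field(meta, 'Model name', expected_model)
    return meta  # the same object is returned when every check passes


valid = {'Softwave version': '1.02 SP03', 'Manufacturer': 'GE Healthcare',
         'Model name': 'MV360'}
invalid = {'Softwave version': '1.02 SP03', 'Manufacturer': 'GE',
           'Model name': 'MAC55'}

same_object = check_metadata(valid, ['1.02 SP03'], 'GE Healthcare', 'MV360') is valid
message = None
try:
    check_metadata(invalid, ['1.02 SP03'], 'GE Healthcare', 'MV360')
except ValidationError as exc:
    message = str(exc)
print(same_object)
print(message)
```

Because the sketch stops at the first failing field, only the manufacturer mismatch is reported even though the model name is also wrong, which mirrors the single-field message printed above.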
signal_correction
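signal_correction adjusts each signal array on a per-channel basis: it subtracts the channel's baseline and multiplies the result by the channel's correction factor, i.e. (signal - baseline) × correctionfactor, which is the arithmetic reflected in the direct-call output in this section. The per-channel baseline and correction factor values are read from meta_dict, which ECGTable passes automatically as a keyword argument.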
The expected meta_dict keys follow the naming convention wave_channel_baseline_<i> and wave_channel_correctionfactor_<i> where i is the zero-based channel index.
[4]:
# Direct call: non-zero baseline and correction factor
synthetic_signals = {
    'I': np.array([0.0, 10.0, 20.0]),
    'II': np.array([5.0, 15.0, 25.0]),
}
meta_with_correction = {
    'wave_channel_baseline_0': 10.0,
    'wave_channel_correctionfactor_0': 2.0,
    'wave_channel_baseline_1': 5.0,
    'wave_channel_correctionfactor_1': 1.0,
}
corrected = signal_correction(
    synthetic_signals,
    meta_dict=meta_with_correction,
)
# Lead I: [(0-10)*2, (10-10)*2, (20-10)*2] = [-20, 0, 20]
# Lead II: [(5-5)*1, (15-5)*1, (25-5)*1] = [0, 10, 20]
print('Corrected I:', corrected['I'])
print('Corrected II:', corrected['II'])
Corrected I: [-20. 0. 20.]
Corrected II: [ 0. 10. 20.]
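The printed values correspond to (signal - baseline) × correctionfactor applied per channel. The sketch below reproduces that arithmetic with plain Python lists so it runs without any dependencies. The function name correct_channels is hypothetical, and mapping channel index i to the i-th key of the signals dict (insertion order) is an assumption about how channels are indexed, not a documented guarantee.

```python
def correct_channels(signals, meta_dict):
    """Illustrative per-channel correction: (sample - baseline) * factor."""
    corrected = {}
    # Assumption: channel index i follows the insertion order of the dict
    for i, (lead, values) in enumerate(signals.items()):
        baseline = meta_dict[f'wave_channel_baseline_{i}']
        factor = meta_dict[f'wave_channel_correctionfactor_{i}']
        corrected[lead] = [(v - baseline) * factor for v in values]
    return corrected


signals = {'I': [0.0, 10.0, 20.0], 'II': [5.0, 15.0, 25.0]}
meta = {
    'wave_channel_baseline_0': 10.0,
    'wave_channel_correctionfactor_0': 2.0,
    'wave_channel_baseline_1': 5.0,
    'wave_channel_correctionfactor_1': 1.0,
}
out = correct_channels(signals, meta)
print(out['I'])   # [-20.0, 0.0, 20.0]
print(out['II'])  # [0.0, 10.0, 20.0]
```

With a baseline of zero and a correction factor of one, the expression reduces to the identity, which is why files with zero baselines pass through unchanged.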
[5]:
# Via ECGTable: both example files have zero baselines, so correction
# leaves the signals unchanged; this verifies the pipeline wiring
reader = ECGDICOMReader()
ecgtable = ECGTable(
    reader,
    path_list=dicom_paths,
    engineer_wave=signal_correction,
    engineer_median=signal_correction,
)
table = ecgtable().get_table(parsed_config=parser_dicom)
table.WaveForms.head()
[5]:
| | key | Sampling sequence | Lead | Voltage | Signal type |
|---|---|---|---|---|---|
| 0 | 2.25.269796857626990821315969488216511468638 | 0 | I | 0.00 | WaveForms |
| 1 | 2.25.269796857626990821315969488216511468638 | 1 | I | 0.00 | WaveForms |
| 2 | 2.25.269796857626990821315969488216511468638 | 2 | I | 29.28 | WaveForms |
| 3 | 2.25.269796857626990821315969488216511468638 | 3 | I | 43.92 | WaveForms |
| 4 | 2.25.269796857626990821315969488216511468638 | 4 | I | 48.80 | WaveForms |
signal_standardise_res
signal_standardise_res rescales each signal array so that its values are expressed at target_resolution µV per unit. The per-channel source resolution is read from meta_dict under the key wave_channel_sens_<i>. Each sample is multiplied by target / source, as the direct-call example in this section shows (a sample of 4.88 at 4.88 µV becomes 5.0 at a target of 5.0 µV).
The two example DICOM files have different native resolutions (4.88 µV and 5.0 µV respectively), making them a convenient real-world demonstration.
[6]:
# Direct call: synthetic signal at 4.88 uV rescaled to 5.0 uV
signal_488 = {'I': np.array([0.0, 4.88, 9.76])}
meta_488 = {'wave_channel_sens_0': 4.88}
rescaled = signal_standardise_res(
    signal_488,
    target_resolution=5.0,
    meta_dict=meta_488,
)
print('Input I (4.88 uV/unit):', signal_488['I'])
print('Output I (5.0 uV/unit): ', rescaled['I'])
Input I (4.88 uV/unit): [0. 4.88 9.76]
Output I (5.0 uV/unit): [ 0. 5. 10.]
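The output above is what one obtains by multiplying every sample by target_resolution divided by the channel's source resolution. A dependency-free sketch of that arithmetic follows. The name rescale_channels is hypothetical, the channel index is assumed to follow dict insertion order, and the results are compared with a tolerance because the division is done in floating point.

```python
import math


def rescale_channels(signals, meta_dict, target_resolution=5.0):
    """Illustrative rescaling: sample * (target_resolution / source)."""
    rescaled = {}
    # Assumption: channel index i follows the insertion order of the dict
    for i, (lead, values) in enumerate(signals.items()):
        source = meta_dict[f'wave_channel_sens_{i}']
        factor = target_resolution / source
        rescaled[lead] = [v * factor for v in values]
    return rescaled


demo = rescale_channels(
    {'I': [0.0, 4.88, 9.76]},
    {'wave_channel_sens_0': 4.88},
    target_resolution=5.0,
)
# Round only for display; the raw values may differ in the last digits
print([round(v, 6) for v in demo['I']])
```

When the source resolution already equals the target, the factor is exactly 1.0 and the channel is returned with unchanged values.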
[7]:
# Via ECGTable: example_dicom_1 (4.88 uV) is rescaled;
# example_dicom_2 (5.0 uV) is unchanged
reader = ECGDICOMReader()
ecgtable_raw = ECGTable(reader, path_list=dicom_paths)
ecgtable_res = ECGTable(
    reader,
    path_list=dicom_paths,
    engineer_wave=signal_standardise_res,
    engineer_median=signal_standardise_res,
)
table_raw = ecgtable_raw().get_table(parsed_config=parser_dicom)
table_res = ecgtable_res().get_table(parsed_config=parser_dicom)
# Compare Lead I, first sample for each file. The first sample of the
# 4.88 uV file is 0.0, so the rescaling is not visible in this comparison.
print('Before standardisation:')
print(table_raw.WaveForms.groupby('key').first()[['Lead', 'Voltage']])
print('\nAfter standardisation:')
print(table_res.WaveForms.groupby('key').first()[['Lead', 'Voltage']])
Before standardisation:
Lead Voltage
key
1.3.6.1.4.1.40744.65.22183544986928027811655533... I -105.0
2.25.269796857626990821315969488216511468638 I 0.0
After standardisation:
Lead Voltage
key
1.3.6.1.4.1.40744.65.22183544986928027811655533... I -105.0
2.25.269796857626990821315969488216511468638 I 0.0
LeadMapper
Some DICOM devices write ECG channels in a non-standard order — for example, Lead II data may be stored in the channel-0 slot. LeadMapper corrects this by reading signal name <i> entries from meta_dict, building an actual-device-label → canonical-key mapping, and reassigning the signal arrays accordingly.
LeadMapper is instantiated once with an accepted_mappings dict that maps each canonical key (e.g. 'I') to the list of device labels that are acceptable for that lead. The instance is then used as engineer_wave and/or engineer_median.
[8]:
# Construction and string representations
mapper = LeadMapper({
    'I': ['Lead I', 'I', 'Lead I (Einthoven)'],
    'II': ['Lead II', 'II', 'Lead II (Einthoven)'],
})
print(repr(mapper))
print(str(mapper))
LeadMapper(accepted_mappings={'I': ['Lead I', 'I', 'Lead I (Einthoven)'], 'II': ['Lead II', 'II', 'Lead II (Einthoven)']})
LeadMapper with 2 canonical leads: ['I', 'II']
[9]:
# Synthetic remapping demo
# The device wrote Lead II data into channel-0 (key 'I') and
# Lead I data into channel-1 (key 'II').
lead_II_array = np.array([1.0, 2.0, 3.0])
lead_I_array = np.array([4.0, 5.0, 6.0])
signals_swapped = {'I': lead_II_array, 'II': lead_I_array}
# meta_dict tells us what the device actually stored in each channel
meta_swapped = {
    'signal name 0': 'Lead II',
    'signal name 1': 'Lead I',
}
corrected_signals = mapper(
    signals_swapped,
    meta_dict=meta_swapped,
)
print('After remapping:')
print(' signals["I"] (should be Lead I [4,5,6]):', corrected_signals['I'])
print(' signals["II"] (should be Lead II [1,2,3]):', corrected_signals['II'])
After remapping:
signals["I"] (should be Lead I [4,5,6]): [4. 5. 6.]
signals["II"] (should be Lead II [1,2,3]): [1. 2. 3.]
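The remapping idea can be sketched in a few lines: invert accepted_mappings into a device-label to canonical-key lookup, then file each channel's array under the canonical key of the label the device reported for it. The function remap_leads is a hypothetical illustration and not LeadMapper's actual code. It assumes channel i corresponds to the i-th entry of the signals dict, and it simply raises KeyError for labels it does not know, whereas the real class may handle that case differently.

```python
def remap_leads(signals, meta_dict, accepted_mappings):
    """Illustrative lead remapping driven by 'signal name <i>' entries."""
    # Invert canonical -> [labels] into label -> canonical
    label_to_canonical = {
        label: canonical
        for canonical, labels in accepted_mappings.items()
        for label in labels
    }
    remapped = {}
    # Assumption: channel index i follows the insertion order of the dict
    for i, values in enumerate(signals.values()):
        device_label = meta_dict[f'signal name {i}']
        remapped[label_to_canonical[device_label]] = values
    return remapped


accepted = {
    'I': ['Lead I', 'I', 'Lead I (Einthoven)'],
    'II': ['Lead II', 'II', 'Lead II (Einthoven)'],
}
swapped = {'I': [1.0, 2.0, 3.0], 'II': [4.0, 5.0, 6.0]}
meta = {'signal name 0': 'Lead II', 'signal name 1': 'Lead I'}
fixed = remap_leads(swapped, meta, accepted)
print(fixed['I'])   # [4.0, 5.0, 6.0]
print(fixed['II'])  # [1.0, 2.0, 3.0]
```

Building the inverted lookup once, rather than searching the label lists per channel, keeps the per-file work proportional to the number of channels.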
Chaining callables
Multiple engineering steps can be composed into a single wrapper function and passed as one callable. The wrapper below applies baseline correction followed by resolution standardisation in a single pass.
[10]:
def correct_and_standardise(
    signals: dict,
    target_resolution: float = 5.0,
    verbose: bool = False,
    **kwargs,
) -> dict:
    """Apply baseline correction then resolution standardisation."""
    signals = signal_correction(signals, verbose=verbose, **kwargs)
    return signal_standardise_res(
        signals,
        target_resolution=target_resolution,
        verbose=verbose,
        **kwargs,
    )

reader = ECGDICOMReader()
ecgtable = ECGTable(
    reader,
    path_list=dicom_paths,
    engineer_wave=correct_and_standardise,
    engineer_median=correct_and_standardise,
)
table = ecgtable().get_table(parsed_config=parser_dicom)
table.WaveForms.head()
[10]:
| | key | Sampling sequence | Lead | Voltage | Signal type |
|---|---|---|---|---|---|
| 0 | 2.25.269796857626990821315969488216511468638 | 0 | I | 0.0 | WaveForms |
| 1 | 2.25.269796857626990821315969488216511468638 | 1 | I | 0.0 | WaveForms |
| 2 | 2.25.269796857626990821315969488216511468638 | 2 | I | 30.0 | WaveForms |
| 3 | 2.25.269796857626990821315969488216511468638 | 3 | I | 45.0 | WaveForms |
| 4 | 2.25.269796857626990821315969488216511468638 | 4 | I | 50.0 | WaveForms |
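The wrapper in cell [10] hard-codes two steps. When more steps are involved, a small generic helper can build the combined callable instead. The sketch below is one possible way to do this and is not part of ecgprocess: chain, subtract_offset, and double are hypothetical names, the 'offset' key is invented for the demonstration, and every step is assumed to accept and ignore keyword arguments it does not use.

```python
from functools import reduce


def chain(*steps):
    """Return one callable that applies each step in order."""
    def chained(signals, **kwargs):
        # Feed the output of each step into the next, forwarding kwargs
        return reduce(lambda acc, step: step(acc, **kwargs), steps, signals)
    return chained


# Two stand-in steps with the same calling convention as the callables above
def subtract_offset(signals, meta_dict=None, **kwargs):
    offset = meta_dict['offset']  # hypothetical key, for illustration only
    return {lead: [v - offset for v in vals] for lead, vals in signals.items()}


def double(signals, **kwargs):
    return {lead: [2 * v for v in vals] for lead, vals in signals.items()}


pipeline = chain(subtract_offset, double)
out = pipeline({'I': [1.0, 2.0]}, meta_dict={'offset': 1.0})
print(out)  # {'I': [0.0, 2.0]}
```

Step-specific parameters such as target_resolution could be bound beforehand with functools.partial, so that the chained callable still exposes only the signals argument and the forwarded keywords.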