| dc.description.abstract |
Accurate segmentation of cardiac MRI images is essential for clinical diagnostics, but constrained
edge computing resources make it difficult to use deep learning U-Net models because of their
large memory and computation requirements. Efficient models such as ESPNetv2 provide a
solution for deploying segmentation models to edge devices, but segmentation quality suffers
tremendously, making them clinically impractical.
This work focuses on performant ESPNetv2 models. I present a quantization-driven enhancement
of ESPNetv2 with Quantization Aware Training (QAT) to improve model accuracy while
optimizing resource efficiency. The pipeline was developed and evaluated on the MnMs-2 dataset
using CPU-only inference to mimic low-resource deployment scenarios. Models were evaluated
extensively on segmentation quality (Dice coefficient and Intersection over Union (IoU)) and on
inference latency, parameter count, and RAM usage.
The float32 baseline model, trained on 2D cardiac MRI slices, achieved a Dice score of 0.6862
and an IoU of 0.5725 with only 0.36 million parameters. The model was compact at 1.35 MB and
required a peak RAM usage of 1.06 MB during inference. It processed each image in
approximately 0.0337 seconds. After applying QAT, the model maintained a strong Dice score of
0.6766 (a decrease of 1.4%) and improved the IoU to 0.5949 (an increase of 3.9%), showing that
segmentation performance was robustly retained despite compression. These results were
achieved alongside a 97.2% reduction in parameter count and a 65.3% decrease in RAM usage.
Inference latency increased by 94.6% to 0.0656 seconds per 2D image, which remains within
acceptable bounds for real-time performance in medical scenarios.
Directly incorporating quantization compatibility into model design and training demonstrates the
practical potential of deploying efficient cardiac segmentation AI models in constrained settings.
The proposed architecture, ESPNetv2_QuantLite, shows that clinically relevant performance can
be preserved under extreme memory and computational constraints. |
en_US |