Abstract:
MRI research opportunities per patient are constrained by limited scan time and data storage. MRI machines retain only a finite number of records, and hospitals typically export visualizations generated with the processing parameters chosen at scan time. Retrospective analysis is therefore often impossible, as only the preprocessed images remain available. If scan data could be post-processed on readily available software, clinicians could re-analyze cases for additional insight over time. Although MRI software continually improves, pre-recorded data cannot benefit from newer techniques. Efficient deployment of models to mainstream devices could help address these efficiency and flexibility limitations.
Manual segmentation of magnetic resonance images (MRIs) is extremely laborious and prone to subjective errors by human experts. Automated deep learning algorithms offer a potential solution, but state-of-the-art convolutional neural networks (CNNs) have millions of parameters and require heavy computation unsuitable for clinical deployment. This work aims to enable real-time MRI analysis on mainstream hardware, without loss of accuracy, through model quantization.
A quantized U-Net model is developed by restricting weights and activations to 8-bit integers instead of 32-bit floats. Quantization-aware training maintains segmentation performance while compressing the model 4x, from 124MB to 31MB. Experiments on the BraTS brain tumor dataset show no degradation in Dice score (0.89) for the quantized versus the float model, while reducing inference time 3x, from 1.2 seconds to 0.4 seconds per 3D MRI volume on a mainstream laptop CPU.
Initial results highlight the feasibility of model quantization for removing computational bottlenecks in MRI analysis algorithms. The compressed model enables efficient deployment on common computer hardware without requiring specialized GPUs. Quantization could thus provide the missing link for translating automated deep learning segmentation from research into widespread clinical adoption.
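
The core idea behind the 8-bit compression described above can be illustrated with a minimal sketch. This is not the paper's implementation; it assumes a simple symmetric, per-tensor quantization scheme, which shows where the 4x storage reduction comes from and why accuracy can be largely preserved (the round-trip error is bounded by half a quantization step).

```python
import numpy as np

def quantize_int8(w):
    """Map float32 weights to int8 using a symmetric per-tensor scale."""
    max_abs = np.abs(w).max()
    scale = max_abs / 127.0 if max_abs > 0 else 1.0
    q = np.clip(np.round(w / scale), -127, 127).astype(np.int8)
    return q, scale

def dequantize(q, scale):
    """Recover approximate float32 weights from the int8 representation."""
    return q.astype(np.float32) * scale

# Hypothetical weight tensor standing in for one U-Net layer.
w = np.random.randn(64, 64).astype(np.float32)
q, scale = quantize_int8(w)

# int8 storage is exactly 4x smaller than float32.
assert q.nbytes * 4 == w.nbytes
# Round-trip error is bounded by half a quantization step.
assert np.abs(dequantize(q, scale) - w).max() <= scale / 2 + 1e-6
```

Quantization-aware training goes further than this post-hoc mapping: it simulates the int8 rounding during training so the network learns weights that are robust to it, which is what lets the Dice score survive the compression.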