| dc.description.abstract |
Existing tools and technologies for mental health screening sacrifice either conversational
capability for diagnostic accuracy or vice versa. Moreover, both kinds trade off computing
resources against model capacity. This research aimed to address this problem
by introducing a new model architecture that achieves both human-friendly conversational
capability and diagnostic accuracy under limited computing resources.
To this end, a small language model (SLM) was selected to drive the
conversational progression, and a transformer-based encoder model was fine-tuned for
disorder classification. The training and evaluation data were generated based on the PHQ-9
clinical screening instrument. Both models were trained, deployed, integrated, and run for inference on
cloud-based high-performance VMs. The framework was validated along three dimensions:
diagnostic accuracy, empathic score, and computing resource consumption. Furthermore, the
framework was benchmarked against the popular commercial LLM, ChatGPT-4o-mini.
The framework achieved 98.73% accuracy in classifying depression, an empathic score of 0.84,
and 71.67% accuracy in off-topic handling while being trainable and runnable for inference on an NVIDIA
Tesla T4, a 16 GB GPU, with 16 GB of RAM. In the benchmark against ChatGPT-4o-mini, the framework attained
an accuracy of 86.34%, whereas ChatGPT-4o-mini achieved 100% accuracy. |
en_US |