Article citations

Alsentzer, E., et al. “Publicly available clinical BERT embeddings for clinical NLP.” Proceedings of the Clinical NLP Workshop, 2019.

has been cited by the following article:

Article

Towards Domain-Aware Language Models for Low-Resource Healthcare: Fine-Tuning LLMs and SLMs with Parallel English–Yoruba Maternal Health Data

¹Department of Computer Science, University of Ibadan, Ibadan, Nigeria

²Faculty of Computing, University of Ibadan, Ibadan, Nigeria


Journal of Computer Sciences and Applications. 2025, Vol. 13 No. 2, 54-58
DOI: 10.12691/jcsa-13-2-3
Copyright © 2025 Science and Education Publishing

Cite this paper:
Elizabeth Ogunseye, Ezekiel Oladejo, Isaac Olaleye, Adesesan B. Adeyemo. Towards Domain-Aware Language Models for Low-Resource Healthcare: Fine-Tuning LLMs and SLMs with Parallel English–Yoruba Maternal Health Data. Journal of Computer Sciences and Applications. 2025; 13(2):54-58. doi: 10.12691/jcsa-13-2-3.

Correspondence to: Ezekiel Oladejo, Department of Computer Science, University of Ibadan, Ibadan, Nigeria. Email: eoladejo184@stu.ui.edu.ng

Abstract

Healthcare communication barriers significantly impact maternal health outcomes in multilingual communities, particularly in Sub-Saharan Africa, where indigenous languages dominate daily communication while medical resources remain primarily in colonial languages. This study presents a systematic approach to developing domain-aware language models tailored for maternal health communication in low-resource settings. We introduce a parallel English–Yoruba maternal health dataset comprising 7,000 translated and verified sentence pairs covering prenatal care, childbirth, and postnatal support. Our methodology involves fine-tuning both Large Language Models (LLMs), including GPT-3.5-turbo and LLaMA-2-7B, and Small Language Models (SLMs), such as DistilBERT, mBERT, and XLM-R, and evaluating them across multiple dimensions, including translation quality, domain-specific terminology accuracy, and clinical relevance metrics. Results demonstrate that domain-specific fine-tuning significantly improves performance over general-purpose models, with the GPT-3.5-turbo variant achieving a BLEU score of 0.78 on the held-out test set and medical terminology accuracy of 89.3%. Fine-tuned models also show substantial improvements in handling culture-specific maternal health concepts and traditional medicine terminology. This work contributes to bridging the digital health divide in low-resource settings and provides a replicable framework for developing multilingual healthcare AI systems that can serve diverse linguistic communities while maintaining cultural and clinical sensitivity.

Keywords