1 Department of Computer Science, University of Ibadan, Ibadan, Nigeria
2 Faculty of Computing, University of Ibadan, Ibadan, Nigeria
Journal of Computer Sciences and Applications. 2025, Vol. 13 No. 2, 54-58
DOI: 10.12691/jcsa-13-2-3
Copyright © 2025 Science and Education Publishing
Cite this paper: Elizabeth Ogunseye, Ezekiel Oladejo, Isaac Olaleye, Adesesan B. Adeyemo. Towards Domain-Aware Language Models for Low-Resource Healthcare: Fine-Tuning LLMs and SLMs with Parallel English–Yoruba Maternal Health Data. Journal of Computer Sciences and Applications. 2025; 13(2):54-58. doi: 10.12691/jcsa-13-2-3.
Correspondence to: Ezekiel Oladejo, Department of Computer Science, University of Ibadan, Ibadan, Nigeria. Email: eoladejo184@stu.ui.edu.ng
Abstract
Healthcare communication barriers significantly impact maternal health outcomes in multilingual communities, particularly in Sub-Saharan Africa, where indigenous languages dominate daily communication while medical resources remain primarily in colonial languages. This study presents a systematic approach to developing domain-aware language models tailored for maternal health communication in low-resource settings. We introduce a parallel English–Yoruba maternal health dataset comprising 7,000 translated and verified sentence pairs covering prenatal care, childbirth, and postnatal support. We fine-tune both Large Language Models (LLMs), including GPT-3.5-turbo and LLaMA-2-7B, and Small Language Models (SLMs) such as DistilBERT, mBERT, and XLM-R, and evaluate them along multiple dimensions, including translation quality, domain-specific terminology accuracy, and clinical relevance metrics. Results demonstrate that domain-specific fine-tuning significantly improves performance over general-purpose models, with the GPT-3.5-turbo variant achieving a BLEU score of 0.78 on the held-out test set and medical terminology accuracy of 89.3%. Fine-tuned models also show substantial improvements in handling culture-specific maternal health concepts and traditional medicine terminology. This work contributes to bridging the digital health divide in low-resource settings and provides a replicable framework for developing multilingual healthcare AI systems that serve diverse linguistic communities while maintaining cultural and clinical sensitivity.
Keywords