AppHerb: Language Model for Recommending Traditional Thai Medicine

Faculty of Pharmaceutical Sciences, Khon Kaen University, Khon Kaen, Thailand
AI, 2025
Graphical Abstract

Abstract

Trust in Traditional Thai Medicine (TTM) among Thai people has been reduced due to a lack of objective standards and the susceptibility of the general population to false information. The emergence of generative artificial intelligence (Gen AI) has significantly impacted various industries, including traditional medicine. However, previous Gen AI models have primarily focused on prescription generation based on Traditional Chinese Medicine (TCM), leaving TTM unexplored. To address this gap, we propose a novel fast-learning fine-tuned language model fortified with TTM knowledge. We utilized textual data from two TTM textbooks, Wat Ratcha-orasaram Ratchaworawihan (WRO), and Tamra Osot Phra Narai (NR), to fine-tune Unsloth’s Gemma-2 with 9 billion parameters. We developed two specialized TTM tasks: treatment prediction (TrP) and herbal recipe generation (HRG). The TrP and HRG models achieved precision, recall, and F1 scores of 26.54%, 28.14%, and 24.00%, and 32.51%, 24.42%, and 24.84%, respectively. Performance evaluation against TCM-based generative models showed comparable precision, recall, and F1 results with a smaller knowledge corpus. We further addressed the challenges of utilizing Thai, a low-resource and linguistically complex language. Unlike English or Chinese, Thai lacks explicit sentence boundary markers and employs an abugida writing system without spaces between words, complicating text segmentation and generation. These characteristics pose significant difficulties for machine understanding and limit model accuracy. Despite these obstacles, our work establishes a foundation for further development of AI-assisted TTM applications and highlights both the opportunities and challenges in applying language models to traditional medicine knowledge systems in Thai language contexts.
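
To help interpret the reported precision, recall, and F1, a minimal scoring sketch in Python follows. Treating each generated output and its reference as sets of ingredients or treatments is an assumption made here for illustration; the paper's exact scoring pipeline is not reproduced.

# Hypothetical scoring sketch: set-overlap precision/recall/F1 between a
# generated herb list and a reference list. Ingredient-level set matching
# is an assumption for illustration only.
def set_prf(predicted, reference):
    pred, ref = set(predicted), set(reference)
    overlap = len(pred & ref)
    precision = overlap / len(pred) if pred else 0.0
    recall = overlap / len(ref) if ref else 0.0
    f1 = (2 * precision * recall / (precision + recall)) if (precision + recall) else 0.0
    return precision, recall, f1

# Example: the model proposes four herbs, three of which appear in the reference recipe.
print(set_prf(["ขิง", "ขมิ้นชัน", "ฟ้าทะลายโจร", "กระเทียม"],
              ["ขิง", "ขมิ้นชัน", "ฟ้าทะลายโจร"]))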

To prepare the data for fine-tuning AppHerb, traditional Thai medical knowledge was digitized from two classical texts: Wat Ratcha-orasaram Ratchaworawihan (WRO) and Tamra Osot Phra Narai (NR), through manual transcription into spreadsheets, ensuring cultural and linguistic accuracy. These entries were then cleaned to remove nulls, duplicates, and inconsistencies, with special attention to Thai-specific elements such as numerals and phrase structures. Symptoms linked to each recipe were restructured into clear list variables, and automated Python algorithms were used to assist in formatting, with manual validation ensuring ontological integrity. The resulting datasets were divided into two tailored tasks: TrP (Treatment Prediction) and HRG (Herbal Recipe Generation), each split into 90% training and 10% testing sets.
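
A minimal sketch of this kind of cleaning and splitting pipeline is given below, assuming a pandas-readable spreadsheet with hypothetical column names (recipe_name, symptoms, ingredients); the actual transcription schema, Thai-numeral handling, and manual validation steps used for WRO and NR are not reproduced here.

import pandas as pd

# Hypothetical spreadsheet of manually transcribed WRO/NR entries; the column
# names are placeholders, not the authors' published schema.
df = pd.read_excel("wro_nr_recipes.xlsx")

# Basic cleaning: drop rows with missing fields and exact duplicates.
df = df.dropna(subset=["recipe_name", "symptoms", "ingredients"]).drop_duplicates()

# Normalize Thai numerals (๐-๙) to Arabic digits in the ingredient field.
thai_digits = str.maketrans("๐๑๒๓๔๕๖๗๘๙", "0123456789")
df["ingredients"] = df["ingredients"].str.translate(thai_digits)

# Restructure the free-text symptom field into a list variable
# (assuming comma-separated entries, which is an assumption).
df["symptom_list"] = df["symptoms"].str.split(",").apply(
    lambda xs: [x.strip() for x in xs])

# 90% training / 10% testing split, as described above.
train = df.sample(frac=0.9, random_state=42)
test = df.drop(train.index)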



AppHerb’s fine-tuning process involved not only transcription but also cultural preservation: by digitizing herbal formulations from classical Thai medical texts, we ensured precise rendering of symptoms, remedies, and treatment logic. This figure shows how Gemma-2 learns the Thai language: it illustrates the self-attention matrix generated for the sentence แบบจำลองภาษาสำหรับแนะนำตำรับแผนไทย (“language model for recommending traditional Thai medicine”). The input sentence was tokenized into [แบบ, จำ, ลอง, ภาษา, สำหรับ, แนะนำ, ตำ, รับ, แผน, ไทย], and self-attention was computed to explore the relationships among these tokens within the sentence.
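
A minimal sketch of how such a matrix could be extracted with the Hugging Face transformers library is shown below; the base checkpoint name (google/gemma-2-9b), the eager-attention setting, and the choice of averaging the last layer's heads are assumptions for illustration, not the exact procedure behind the figure.

import torch
from transformers import AutoTokenizer, AutoModelForCausalLM

# "google/gemma-2-9b" is an assumed base checkpoint; the fine-tuned AppHerb
# weights are not referenced here.
name = "google/gemma-2-9b"
tokenizer = AutoTokenizer.from_pretrained(name)
model = AutoModelForCausalLM.from_pretrained(
    name, torch_dtype=torch.float16, attn_implementation="eager")

sentence = "แบบจำลองภาษาสำหรับแนะนำตำรับแผนไทย"
inputs = tokenizer(sentence, return_tensors="pt")
print(tokenizer.convert_ids_to_tokens(inputs["input_ids"][0]))  # inspect the Thai sub-word tokens

with torch.no_grad():
    out = model(**inputs, output_attentions=True)

# out.attentions is a tuple of (batch, heads, seq, seq) tensors, one per layer;
# average the heads of the last layer to get a single token-to-token matrix.
attention_matrix = out.attentions[-1].mean(dim=1)[0]
print(attention_matrix.shape)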


Demo App on Google Colab.
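
A minimal sketch of such a Colab demo is given below, assuming the fine-tuned weights are loaded with Unsloth's FastLanguageModel and exposed through a Gradio ChatInterface; the model path, prompt handling, and generation settings are placeholders, not the released artifact.

from unsloth import FastLanguageModel
import gradio as gr

# Placeholder path; the published AppHerb checkpoint name is not given here.
model, tokenizer = FastLanguageModel.from_pretrained(
    model_name="path/to/appherb-gemma2-9b",
    max_seq_length=2048,
    load_in_4bit=True,
)
FastLanguageModel.for_inference(model)  # enable Unsloth's faster inference mode

def respond(message, history):
    # Naive prompt handling for illustration; the real demo's prompt template is not shown here.
    inputs = tokenizer(message, return_tensors="pt").to(model.device)
    output = model.generate(**inputs, max_new_tokens=256)
    new_tokens = output[0][inputs["input_ids"].shape[1]:]
    return tokenizer.decode(new_tokens, skip_special_tokens=True)

gr.ChatInterface(respond, title="AppHerb demo").launch()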

BibTeX

@Article{ai6080170,
        AUTHOR = {Piyasawetkul, Thanawat and Tiyaworanant, Suppachai and Srisongkram, Tarapong},
        TITLE = {AppHerb: Language Model for Recommending Traditional Thai Medicine},
        JOURNAL = {AI},
        VOLUME = {6},
        YEAR = {2025},
        NUMBER = {8},
        ARTICLE-NUMBER = {170},
        URL = {https://www.mdpi.com/2673-2688/6/8/170},
        ISSN = {2673-2688},
        ABSTRACT = {Trust in Traditional Thai Medicine (TTM) among Thai people has been reduced due to a lack of objective standards and the susceptibility of the general population to false information. The emergence of generative artificial intelligence (Gen AI) has significantly impacted various industries, including traditional medicine. However, previous Gen AI models have primarily focused on prescription generation based on Traditional Chinese Medicine (TCM), leaving TTM unexplored. To address this gap, we propose a novel fast-learning fine-tuned language model fortified with TTM knowledge. We utilized textual data from two TTM textbooks, Wat Ratcha-orasaram Ratchaworawihan (WRO), and Tamra Osot Phra Narai (NR), to fine-tune Unsloth’s Gemma-2 with 9 billion parameters. We developed two specialized TTM tasks: treatment prediction (TrP) and herbal recipe generation (HRG). The TrP and HRG models achieved precision, recall, and F1 scores of 26.54%, 28.14%, and 24.00%, and 32.51%, 24.42%, and 24.84%, respectively. Performance evaluation against TCM-based generative models showed comparable precision, recall, and F1 results with a smaller knowledge corpus. We further addressed the challenges of utilizing Thai, a low-resource and linguistically complex language. Unlike English or Chinese, Thai lacks explicit sentence boundary markers and employs an abugida writing system without spaces between words, complicating text segmentation and generation. These characteristics pose significant difficulties for machine understanding and limit model accuracy. Despite these obstacles, our work establishes a foundation for further development of AI-assisted TTM applications and highlights both the opportunities and challenges in applying language models to traditional medicine knowledge systems in Thai language contexts.},
        DOI = {10.3390/ai6080170}
        }