Automated ICD-11 Coding with GPT pre-trained model: Leveraging Prompt-Based Learning

Tanhaei, Mohammad

doi:10.22034/cmes.2025.2049274.1043

Articles in Press

Article type: Research article

Author

Mohammad Tanhaei

Ilam University

10.22034/cmes.2025.2049274.1043

Abstract

Accurate ICD-11 coding is critical for billing, medical research, and software communication and documentation in clinical settings. Manual coding is unreliable and time-consuming. Little Persian-language data is available for training a language model. We therefore use an English dataset together with a pre-trained model to map Persian sentences into the model's language for ICD-11 prediction. This reduces the need for Persian-language datasets and greatly helps prediction for Persian text.

In this paper, we propose a technique that uses a pre-trained language model (GPT) to automate ICD-11 coding of patient complaints extracted from the MIMIC-IV corpus. We froze part of the pre-trained model and used those layers for language transformation. The model is trained with prompt-based learning and achieves an average F1-score of 0.90 for ICD-11 prediction. We also demonstrate the model's usability by producing output in JSON format for integration into medical tools.
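The abstract does not specify the JSON schema, so the following is only a minimal sketch of how a downstream medical tool might consume such output. The field names (`code`, `title`, `confidence`), the `IcdPrediction` class, the `parse_prediction` helper, and the sample values are all hypothetical and are not taken from the paper.

```python
import json
from dataclasses import dataclass


@dataclass
class IcdPrediction:
    # All three fields are assumed for illustration; the paper's schema may differ.
    code: str          # predicted ICD-11 code
    title: str         # human-readable name of the code
    confidence: float  # model score, assumed to lie in [0, 1]


def parse_prediction(raw: str) -> IcdPrediction:
    """Parse and validate one JSON prediction emitted by the model."""
    data = json.loads(raw)
    if not isinstance(data, dict):
        raise ValueError("prediction must be a JSON object")
    missing = {"code", "title", "confidence"} - set(data)
    if missing:
        raise ValueError(f"missing fields: {sorted(missing)}")
    confidence = float(data["confidence"])
    if not 0.0 <= confidence <= 1.0:
        raise ValueError("confidence must lie in [0, 1]")
    return IcdPrediction(str(data["code"]), str(data["title"]), confidence)


# Illustrative payload only; not an actual model output from the paper.
example = '{"code": "CA40", "title": "Pneumonia", "confidence": 0.93}'
prediction = parse_prediction(example)
```

Validating the payload at the boundary means a malformed model response fails loudly instead of propagating an incomplete code into billing or documentation systems.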

Keywords

GPT4

ICD-11

Transformers

LLM

Article title (English)