Computational Methods in Engineering Sciences


Automated ICD-11 Coding with a Pre-trained GPT Model: Leveraging Prompt-Based Learning

Article type: Research article

Author
Ilam University
Abstract
Accurate ICD-11 coding is critical for billing, medical research, and software communication and documentation in clinical settings. Manual coding is unreliable and time-consuming. Little Persian-language data is available for training a language model. We therefore use a language model trained on an English dataset, together with a pre-trained model that maps Persian sentences into the language of that model, to predict ICD-11 codes. This reduces the need for Persian-language datasets and substantially helps prediction for Persian input.



In this paper, we propose a technique that uses a pre-trained language model (GPT) to automate ICD-11 coding from patient complaints extracted from the MIMIC-IV corpus. We freeze part of the pre-trained model and use those layers for language transformation. The model is trained with prompt-based learning and achieves an average F1-score of 0.90 for ICD-11 prediction. We also demonstrate the usability of the model by producing output in JSON format for integration into medical tools.
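
The abstract does not give the prompt, the JSON schema, or the averaging scheme behind the reported F1-score, so the following Python sketch is illustrative only. The template text, the field names `icd11_code` and `title`, the helper names, and the choice of macro averaging are all assumptions, not the authors' implementation.

```python
import json

# Hypothetical prompt template; the paper's actual prompt is not given.
PROMPT_TEMPLATE = (
    "You are a clinical coding assistant.\n"
    "Patient complaint: {complaint}\n"
    'Answer with JSON: {{"icd11_code": "...", "title": "..."}}'
)


def build_prompt(complaint: str) -> str:
    """Fill the (assumed) template with one patient complaint."""
    return PROMPT_TEMPLATE.format(complaint=complaint.strip())


def parse_prediction(raw: str) -> dict:
    """Extract the JSON object from raw model text and check its fields."""
    start, end = raw.find("{"), raw.rfind("}")
    if start == -1 or end < start:
        raise ValueError("no JSON object in model output")
    record = json.loads(raw[start:end + 1])
    missing = {"icd11_code", "title"} - record.keys()
    if missing:
        raise ValueError(f"missing fields: {sorted(missing)}")
    return record


def macro_f1(y_true: list, y_pred: list) -> float:
    """Unweighted mean of per-code F1 (one possible reading of 'average F1')."""
    labels = set(y_true) | set(y_pred)
    scores = []
    for label in labels:
        tp = sum(t == label and p == label for t, p in zip(y_true, y_pred))
        fp = sum(t != label and p == label for t, p in zip(y_true, y_pred))
        fn = sum(t == label and p != label for t, p in zip(y_true, y_pred))
        denom = 2 * tp + fp + fn
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)


if __name__ == "__main__":
    print(build_prompt("severe headache with nausea"))
    # Illustrative model reply; the code and title are example values only.
    reply = 'Result: {"icd11_code": "8A80", "title": "Migraine"}'
    print(parse_prediction(reply))
```

Validating the fields before passing the record on keeps malformed model output from reaching a downstream medical tool; a real integration would also check the code against the ICD-11 classification itself.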

Keywords

GPT4
ICD-11
Transformers
LLM

Articles in press, accepted
Available online since 08 Ordibehesht 1404

  • Received: 08 Dey 1403
  • Revised: 01 Ordibehesht 1404
  • Accepted: 08 Ordibehesht 1404
  • Published: 08 Ordibehesht 1404