Article type: Research article
Article title [English]
Accurate ICD-11 coding is critical for billing, medical research, and software communication and documentation in clinical settings. Manual coding is unreliable and time-consuming. Little Persian-language data is available for training a language model. We therefore use an English dataset and a pre-trained language model, and map Persian sentences into the model's language for ICD-11 prediction. This reduces the need for Persian-language datasets and substantially aids prediction for Persian text.
In this paper, we propose a technique that uses a pre-trained language model (GPT) to automate ICD-11 coding from patient complaints extracted from the MIMIC-IV corpus. We freeze part of the pre-trained model and use these layers for language transformation. The model is trained with prompt-based training and achieves an average F1-score of 0.90 for ICD-11 prediction. We also demonstrate the model's usability by producing output in JSON format for integration into medical tools.
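As an illustration of the JSON output mentioned above, the following is a minimal sketch of how predictions might be serialized for a downstream medical tool. The field names (`complaint`, `predictions`, `code`, `title`, `confidence`), the `IcdPrediction` class, and the `to_json` helper are assumptions made for this example and are not taken from the paper; the codes shown are placeholders, not real ICD-11 codes.

```python
import json
from dataclasses import dataclass, asdict
from typing import List


@dataclass
class IcdPrediction:
    """One predicted code (hypothetical schema, not the paper's)."""
    code: str          # placeholder string standing in for an ICD-11 code
    title: str         # human-readable label for the code
    confidence: float  # model score in [0, 1]


def to_json(complaint: str, predictions: List[IcdPrediction]) -> str:
    """Serialize a complaint and its predictions, highest confidence first."""
    ranked = sorted(predictions, key=lambda p: p.confidence, reverse=True)
    payload = {
        "complaint": complaint,
        "predictions": [asdict(p) for p in ranked],
    }
    # ensure_ascii=False keeps Persian text readable instead of \u escapes.
    return json.dumps(payload, ensure_ascii=False, indent=2)


if __name__ == "__main__":
    example = to_json(
        "سرفه و تب",  # Persian for "cough and fever"
        [
            IcdPrediction("CODE-B", "placeholder label B", 0.20),
            IcdPrediction("CODE-A", "placeholder label A", 0.90),
        ],
    )
    print(example)
```

Sorting by confidence and keeping non-ASCII text unescaped are design choices made for this sketch so that a consuming tool can take the first entry as the top prediction and display the original Persian complaint directly.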
Keywords [English]