Comparison of machine learning V/S deep learning model to predict ICD9 code using text mining techniques

Authors

Bhat, Akshata

Issue Date

2021

Degree

MSc in Data Analytics

Publisher

Dublin Business School

Rights

Items in eSource are protected by copyright. Previously published items are made available in accordance with the copyright policy of the publisher/copyright holder.

Abstract

Healthcare information is usually collected and stored in the form of numbers, text or images. This data contains important details about patients, such as their visits, symptoms, prescriptions, notes and vital statistics. These documents are voluminous and difficult to maintain or access, so most health institutions keep such details as Electronic Health Records (EHR) to avoid manual errors and redundancy. This dissertation applies text mining techniques to textual notes from a real-world EHR database (MIMIC-III) to identify the most effective vectorization technique for retrieving meaningful information. Machine learning models are then compared with a deep learning model, using the novel H2O framework and RapidMiner, to predict the ICD9 code from the extracted data.