Comparison of machine learning V/S deep learning model to predict ICD9 code using text mining techniques

Bhat, Akshata
MSc in Data Analytics
Dublin Business School
Healthcare information is usually collected and stored in the form of numbers, text, or images. This data contains important patient details such as visits, symptoms, prescriptions, notes, and vital statistics. Such records are voluminous and difficult to maintain or access, so most health institutions store them as Electronic Health Records (EHR) to reduce manual error and redundancy. This dissertation applies text mining techniques to textual notes from a real-world EHR database (MIMIC-III) to identify the most effective vectorization technique for retrieving meaningful information. Machine learning models are then compared with a deep learning model, using the novel H2O framework and RapidMiner, to predict the ICD9 code from the extracted data.
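To make the idea of vectorizing clinical notes and predicting a code from them concrete, the following is a minimal sketch in plain Python. It is not the dissertation's pipeline: it assumes TF-IDF as the vectorization technique and a nearest-neighbour cosine match as a stand-in classifier, and it uses neither H2O nor RapidMiner. The four notes are invented toy sentences rather than MIMIC-III data, the labels are ICD-9 categories chosen for illustration only, and all function names are hypothetical.

```python
import math
from collections import Counter


def tokenize(text):
    """Lowercase, split on whitespace, keep purely alphabetic tokens."""
    return [tok for tok in text.lower().split() if tok.isalpha()]


def fit_idf(docs):
    """Compute a smoothed inverse document frequency for every training term."""
    n_docs = len(docs)
    doc_freq = Counter()
    for doc in docs:
        doc_freq.update(set(tokenize(doc)))
    return {
        term: math.log((1 + n_docs) / (1 + count)) + 1.0
        for term, count in doc_freq.items()
    }


def tfidf_vector(text, idf):
    """Return an L2-normalised sparse TF-IDF vector (dict of term -> weight)."""
    counts = Counter(tok for tok in tokenize(text) if tok in idf)
    vec = {term: count * idf[term] for term, count in counts.items()}
    norm = math.sqrt(sum(w * w for w in vec.values()))
    return {term: w / norm for term, w in vec.items()} if norm else {}


def cosine(a, b):
    """Cosine similarity of two already-normalised sparse vectors."""
    return sum(w * b.get(term, 0.0) for term, w in a.items())


def predict(text, train_vectors, labels, idf):
    """Return the label of the most similar training note."""
    query = tfidf_vector(text, idf)
    scores = [cosine(query, vec) for vec in train_vectors]
    best = max(range(len(scores)), key=scores.__getitem__)
    return labels[best]


# Toy notes (invented, NOT taken from MIMIC-III) with illustrative ICD-9
# categories: 410 = acute myocardial infarction, 250 = diabetes mellitus.
notes = [
    "patient reports chest pain and shortness of breath",
    "chest pain radiating to left arm with sweating",
    "elevated blood glucose and frequent urination",
    "high glucose levels with increased thirst",
]
labels = ["410", "410", "250", "250"]

idf = fit_idf(notes)
train_vectors = [tfidf_vector(note, idf) for note in notes]

print(predict("patient with chest pain", train_vectors, labels, idf))
print(predict("glucose is high", train_vectors, labels, idf))
```

Swapping `tfidf_vector` for another representation (for example raw term counts) while keeping `predict` fixed is one simple way to compare vectorization techniques on the same notes. A real study would also need a held-out test set and a proper classifier.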