Clin Res Cardiol (2023). https://doi.org/10.1007/s00392-023-02180-w

AI-based risk prediction for Heart Failure and Cardiomyopathy patients from multimodal clinical parameters
N. Vetter1, D. H. Lehmann1, H. Hund1, N. A. Geis1, F. André1, U. Köthe2, S. Engelhardt1, V. Heuveline2, B. Meder1
1Klinik für Innere Med. III, Kardiologie, Angiologie u. Pneumologie, Universitätsklinikum Heidelberg, Heidelberg; 2Informatics for Life, Heidelberg

Objective: Cardiovascular diseases are the leading cause of death in industrialized nations. In Germany, heart failure (HF) is also the most common reason for hospitalization of adults. Artificial intelligence (AI) has shown tremendous potential for clinical applications, resulting in various risk scores and AI-assisted diagnosis systems. Large amounts of structured or semi-structured data are crucial for successful AI applications. Therefore, the automated inclusion of clinical data and parameters in a unified database is an important basis for AI research. At the University Hospital of Heidelberg, the Research Data Warehouse (RWH) fulfills that role. In this work, we highlight our multimodal approach to AI-based risk prediction for heart failure and cardiomyopathy patients. Specifically, we present the pipeline that combines clinical data from multimodal sources into AI models.

 

Methods and Results: Processing data from different modalities required a wide variety of methods. After preprocessing of tabular routine clinical data, clustering (t-distributed stochastic neighbor embedding, t-SNE) and principal component analysis (PCA) were performed to identify important distinctive features. Correlations between features were quantified with the Pearson correlation coefficient. The expressiveness of a feature with respect to an outcome or disease can be analyzed with the area under the receiver operating characteristic curve (ROC AUC) and significance tests. To gain a longitudinal understanding of each patient's disease progression, physicians' letters were analyzed using, e.g., natural language processing (NLP), keyword search or entity extraction. For image data, ECGs and ventricular pressure curves, convolutional neural networks (CNNs) were the preferred choice. Additionally, fast Fourier transforms and the manual definition of novel features yielded useful supplementary information. In case of incomplete patient data, multiple imputation by chained equations (MICE) can be used to generate complete datasets. To combine features from different modalities, we used fully connected neural networks, support vector machines and extreme gradient boosting.
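As an illustration of the feature analysis described above, the following is a minimal sketch of how a Pearson correlation between two features and the ROC AUC of one feature with respect to a binary outcome could be computed. It is not the authors' implementation. The function names (pearson, roc_auc), the variable names (feature_a, feature_b, outcome) and all values are hypothetical toy data chosen only for illustration.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equally long numeric sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


def roc_auc(scores, labels):
    """ROC AUC as the fraction of (positive, negative) pairs in which the
    positive case has the higher score; ties count as half a win."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum(
        1.0 if p > q else 0.5 if p == q else 0.0
        for p in pos
        for q in neg
    )
    return wins / (len(pos) * len(neg))


# Hypothetical toy data (assumption, not taken from the study).
feature_a = [25, 30, 35, 45, 55, 60]
feature_b = [4200, 3100, 2500, 900, 300, 150]
outcome = [1, 1, 0, 1, 0, 0]  # 1 = event occurred, 0 = no event

print("Pearson r(feature_a, feature_b):", pearson(feature_a, feature_b))
print("ROC AUC(feature_a -> outcome):", roc_auc(feature_a, outcome))
```

An AUC below 0.5, as for feature_a in this toy data, indicates that lower values of the feature are associated with the outcome. Negating the feature gives the complementary value above 0.5.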

 

Conclusion: Integrating the continuously growing volume of accumulated data exceeds the capacity of an individual physician. From our AI-based analysis of large and complex data sets with multiple data types, we expect a deeper understanding of both the relationships between the numerous parameters and their effects on the development of the above-mentioned diseases.

With this initial exploratory phase, we have laid the foundation for a multimodal risk model. In the future, our models will be developed further and validated against external cohorts (e.g. NOKO, TORCH).


https://dgk.org/kongress_programme/jt2023/aP466.html