About this project

Ischemic heart disease (IHD) is a leading cause of death globally. There is compelling evidence that lifestyle changes are an effective strategy for the secondary prevention of IHD. Because such interventions depend on identifying affected individuals, an efficient tool for IHD screening and early diagnosis is warranted to reduce the burden of IHD mortality.

This project aims to develop an algorithm that uses serum metabolites, cardiometabolic biomarkers, and self-reported phenotypic data to predict IHD status in a European population. I will train the algorithm using four approaches:

  1. Logistic regression with principal components as predictors (PCA + Logistic)
  2. K-nearest neighbors with principal components as predictors (PCA + KNN)
  3. K-nearest neighbors with serum metabolites, cardiometabolic biomarkers, and self-reported phenotypic data as predictors (KNN)
  4. Random forest with serum metabolites, cardiometabolic biomarkers, and self-reported phenotypic data as predictors (RF)
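
The four approaches above could be set up roughly as follows. This is a minimal sketch in Python with scikit-learn, not the project's actual code: the synthetic data from make_classification, the helper build_models, and all hyperparameter values (number of principal components, number of neighbors, number of trees) are assumptions chosen for illustration. Standardizing the predictors before PCA and KNN is also an assumption; it is a common choice because both methods are sensitive to feature scale.

```python
from sklearn.datasets import make_classification
from sklearn.decomposition import PCA
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for the metabolite, biomarker, and phenotypic predictors.
# Assumption: this is NOT the project's data; y plays the role of IHD status (0/1).
X, y = make_classification(
    n_samples=300, n_features=20, n_informative=8, random_state=0
)


def build_models(n_components=5, n_neighbors=5, n_trees=100, seed=0):
    """Return the four candidate models keyed by the labels used in the list.

    All default values are illustrative assumptions, not tuned settings.
    """
    return {
        # 1. Logistic regression on principal components
        "PCA + Logistic": make_pipeline(
            StandardScaler(),
            PCA(n_components=n_components),
            LogisticRegression(max_iter=1000),
        ),
        # 2. K-nearest neighbors on principal components
        "PCA + KNN": make_pipeline(
            StandardScaler(),
            PCA(n_components=n_components),
            KNeighborsClassifier(n_neighbors=n_neighbors),
        ),
        # 3. K-nearest neighbors on the original (standardized) predictors
        "KNN": make_pipeline(
            StandardScaler(),
            KNeighborsClassifier(n_neighbors=n_neighbors),
        ),
        # 4. Random forest on the original predictors (no scaling needed)
        "RF": RandomForestClassifier(n_estimators=n_trees, random_state=seed),
    }


models = build_models()

# Compare the approaches with 5-fold cross-validated accuracy.
scores = {
    name: cross_val_score(model, X, y, cv=5, scoring="accuracy").mean()
    for name, model in models.items()
}

for name, score in scores.items():
    print(f"{name}: mean CV accuracy = {score:.3f}")
```

Wrapping the scaler and PCA inside each pipeline means they are refit on the training folds only, so no information from the held-out fold leaks into the transformation.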