Feature Selection for Polygenic Risk Scores using Genetic Algorithm and Network Science

Authors

  • Zhendong Sha
  • Dr. Ting Hu
  • Dr. Yuanzhu Chen

Abstract

Many human diseases can be attributed to genetic variation. Using population-based data, researchers have identified genetic variants associated with disease risk. With this knowledge, an individual’s genetic liability to a disease can be estimated with a polygenic risk score (PRS), calculated from their genotype profile. However, selecting the most predictive genetic variants is difficult because genomic data are high-dimensional: typically, hundreds of thousands of genetic variants are tested for association with disease risk. Moreover, the effect of one genetic variant on disease risk is often influenced by other variants, so it is their interactions that contribute to the risk. In this research, we propose a feature selection method for PRS assessment that searches for combinations of genetic variants using a genetic algorithm and network science. Our method provides accurate predictive models for PRS computation, as well as useful insights into the intertwined relationships among a large number of genetic variants.
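
To make the idea of searching over variant combinations concrete, the following is a minimal sketch, not the method from the paper. It assumes the common formulation of a PRS as a weighted sum of allele dosages (0, 1, or 2 copies of the risk allele) and uses a generic genetic algorithm over bit masks. The function names, the fitness function (difference in mean PRS between cases and controls), the operators, and the toy data are all hypothetical choices made for illustration; the network science component is not sketched here.

```python
import random


def polygenic_risk_score(dosages, weights, selected):
    """Weighted sum of allele dosages over the selected variant indices."""
    return sum(weights[j] * dosages[j] for j in selected)


def fitness(mask, genotypes, labels, weights):
    """Hypothetical fitness: mean PRS of cases minus mean PRS of controls.

    Assumes labels contain at least one case (1) and one control (0).
    """
    selected = [j for j, bit in enumerate(mask) if bit]
    if not selected:
        return float("-inf")  # an empty variant set cannot score anyone
    scores = [polygenic_risk_score(g, weights, selected) for g in genotypes]
    cases = [s for s, y in zip(scores, labels) if y == 1]
    controls = [s for s, y in zip(scores, labels) if y == 0]
    return sum(cases) / len(cases) - sum(controls) / len(controls)


def evolve(genotypes, labels, weights, pop_size=20, generations=30,
           p_mut=0.1, seed=0):
    """Generic GA over bit masks; needs >= 2 variants and pop_size >= 3."""
    rng = random.Random(seed)
    n = len(weights)

    def score(mask):
        return fitness(mask, genotypes, labels, weights)

    # Each individual is a bit mask: 1 means the variant enters the PRS.
    pop = [[rng.randint(0, 1) for _ in range(n)] for _ in range(pop_size)]
    for _ in range(generations):
        children = [max(pop, key=score)[:]]  # elitism: keep the current best
        while len(children) < pop_size:
            # Tournament selection of two parents (tournament size 3).
            a = max(rng.sample(pop, 3), key=score)
            b = max(rng.sample(pop, 3), key=score)
            cut = rng.randrange(1, n)  # one-point crossover
            child = a[:cut] + b[cut:]
            # Bit-flip mutation with probability p_mut per position.
            child = [1 - bit if rng.random() < p_mut else bit for bit in child]
            children.append(child)
        pop = children
    return max(pop, key=score)


# Toy data (invented): 4 individuals, 3 variants, dosages in {0, 1, 2}.
genotypes = [[2, 0, 1], [1, 0, 2], [0, 2, 0], [0, 1, 1]]
labels = [1, 1, 0, 0]        # 1 = case, 0 = control
weights = [0.5, 0.1, 0.3]    # hypothetical per-variant effect sizes

best_mask = evolve(genotypes, labels, weights)
print(best_mask, fitness(best_mask, genotypes, labels, weights))
```

In this sketch each variant is scored independently once selected, so it does not model interactions between variants; a fitness function based on a predictive model evaluated on held-out data would be a more realistic choice.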

This work was accepted at CEC 2021. [Link]