Bradford Patton

M.S. Data Science

Applying Link Prediction on Knowledge Graph for Biomedical Knowledge Discovery

Understanding complex biological processes and diseases, and advancing drug discovery, requires deciphering the intricate interactions among diverse biological entities such as proteins, drugs, metabolites, and enzymes. Knowledge graphs (KGs), which represent multi-relational entities as nodes and the relations between them as edges, offer a promising way to model heterogeneous biological data in a comprehensible form. However, many crucial links within KGs remain hidden, which limits the understanding and analysis these graphs can support. In this capstone project, we address the task of link prediction within a biomedical knowledge graph to facilitate biomedical knowledge discovery. Our objectives are to curate a comprehensive biomedical knowledge graph, to construct a general link prediction pipeline that employs various knowledge graph embedding (KGE) models, and to apply link prediction to two challenging biomedical problems: drug repurposing and protein function prediction. This project contributes a reusable pipeline for biomedical knowledge discovery and lays the groundwork for future work on link discovery in knowledge graphs, applicable to a wide range of biomedical research tasks.