Wallace helps develop computational methods to infer structure interdependence from protein family MSAs

Predicting protein structure is an important, growing area of biomedical data science and bioinformatics that can be improved through computational science. Combining the two disciplines may accelerate drug discovery. Dr. T.L. Wallace, professor, computational sciences, and colleagues recently co-published work on a new computational method that may help researchers studying proteins and support development of new drug treatment strategies.

“Our team used an embodiment of the k-modes algorithm to develop a software tool that discovers the ranked critical non-proximal and proximal interdependencies in protein structure,” notes Dr. Wallace, who contributed to the math and biostatistics portions.
The authors developed a new computational method to analyze large multiple sequence alignments (MSAs) for proteins of various sizes. It gives researchers an efficient way to discover the key functional and structural sub-molecular components of both globular and intrinsically disordered proteins. The method uses a greedy algorithm and deep information-theoretic analysis with approximations to support explainable machine learning.
The bioinformatics software tool provides ranked results of protein clusters within minutes.
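For readers curious what an information-theoretic look at an alignment can involve, the short Python sketch below scores every pair of alignment columns by mutual information and ranks the pairs. It is only an illustration of the general idea and is not PSICalc's implementation: the function names and the four-sequence toy alignment are hypothetical, and plain pairwise mutual information stands in for the k-modes-based clustering described in the paper.

```python
from collections import Counter
from itertools import combinations
from math import log2


def entropy(symbols):
    """Shannon entropy (in bits) of a sequence of symbols."""
    n = len(symbols)
    return -sum((c / n) * log2(c / n) for c in Counter(symbols).values())


def mutual_information(col_a, col_b):
    """Mutual information between two alignment columns: H(A) + H(B) - H(A, B)."""
    return entropy(col_a) + entropy(col_b) - entropy(list(zip(col_a, col_b)))


def rank_column_pairs(msa):
    """Score every pair of columns and return the pairs from highest to lowest score."""
    columns = list(zip(*msa))  # transpose: rows are sequences, columns are sites
    scores = [
        ((i, j), mutual_information(columns[i], columns[j]))
        for i, j in combinations(range(len(columns)), 2)
    ]
    return sorted(scores, key=lambda item: item[1], reverse=True)


# Hypothetical toy alignment (not real data): four sequences, four sites.
toy_msa = [
    "ACDA",
    "ACEA",
    "GTDA",
    "GTEA",
]

ranked = rank_column_pairs(toy_msa)
for (i, j), score in ranked:
    print(f"columns {i} and {j}: {score:.2f} bits")
```

In this toy alignment, columns 0 and 1 always change together, so that pair ranks first, while the fully conserved last column carries no information about any other site.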
“PSICalc may aid researchers studying proteins of known and unknown 3D structure by identifying relationships that may have structural or functional significance,” says Dr. Wallace. “It has promising potential for developing new drug therapies for various treatment strategies.”
Professor Wallace collaborated with a team that included a former student as well as faculty and researchers from various institutions and scientific backgrounds.
Their paper, “PSICalc: a novel approach to identifying and ranking critical non-proximal interdependencies within the overall protein structure,” has been published in Bioinformatics Advances, Volume 2, Issue 1, 2022.