In humans, thousands of proteins interact in a complex network dubbed the interactome. Those interactions are further complicated when disease-causing mutations are introduced into the genes that code for these proteins. Some genes can be mutated in different ways to cause the same disease, meaning that a single condition can be associated with multiple interactomes. This poses a challenge for drug developers, who are left with thousands of potential disease-causing interactions from which to select therapeutic targets.
But there may now be a way to simplify that task. Scientists from Cleveland Clinic and Cornell University have used artificial intelligence to develop a publicly available computational tool that predicts how genetic mutations influence protein-protein interactions in cancers and other complex diseases. The software and database, called Protein-protein InteractiOn iNtErfacE pRediction, or PIONEER, is described in a new Nature Biotechnology paper titled “A structurally informed human protein-protein interactome reveals proteome-wide perturbations caused by disease mutations.”
Their hope is that understanding the impact of pathogenic mutations on the protein interactome could help shorten the time required for drug development and clinical trials. “In theory, making new medicines based on genetic data is straightforward: mutated genes make mutated proteins,” said Feixiong Cheng, PhD, a co-lead author on the study and director of Cleveland Clinic’s Genome Center. “We try to create molecules that stop these proteins from disrupting critical biological processes by blocking them from interacting with healthy proteins, but in reality, that is much easier said than done.”
PIONEER can help by clearing a path to the most promising protein-protein interactions for drug researchers and developers. To design the tool, Cheng’s lab worked with the group of Haiyuan Yu, PhD, director of Cornell University’s Center for Innovative Proteomics. The scientists amassed data from multiple sources, including genomic sequences from almost 100,000 individuals with disease-causing mutations. They also collected three-dimensional structures of over 16,000 human proteins, information on how gene mutations affect those structures, and data on known interactions between almost 300,000 protein pairs.
This dataset allows scientists to navigate the interactome for more than 10,500 diseases, including various cancers, autoimmune diseases, and cardiovascular diseases. To use PIONEER, scientists can enter a disease-associated mutation of interest and receive a ranked list of protein-protein interactions that contribute to the disease and could potentially be targeted with a drug. Scientists can also search for a disease by name to receive a list of potential disease-causing protein interactions.
The team validated the database’s predictions in the lab by introducing almost 3,000 mutations into more than 1,000 proteins and then testing the mutations’ impact on almost 7,000 protein-protein interaction pairs. Preliminary research based on these findings is already underway to develop and test treatments for lung and endometrial cancers. The model can also predict survival rates and prognoses, as well as responses to anticancer drugs.
The resources needed for interactome studies are “a significant barrier to entry for most genetic researchers,” said Cheng. “We hope PIONEER can overcome these barriers computationally to lessen the burden and grant more scientists with the ability to advance new therapies.”