Ariful Azad, assistant professor of Intelligent Systems Engineering at the Luddy School of Informatics, Computing, and Engineering, has co-authored a paper in the prestigious journal Nature that presents a new approach to discovering previously unknown protein families.
The paper, “Unraveling the functional dark matter through global metagenomics,” focuses on microbial dark matter: the majority of microbial organisms, such as bacteria and archaea, that microbiologists can’t culture in the laboratory because they lack the knowledge or ability to supply the necessary growth conditions.
Azad said he and his colleagues from around the country used a computational approach to discover 106,198 new protein families.
“The exact functions of these newly discovered proteins are unknown,” Azad said. “Extensive research is needed to demystify microbial dark matter.”
Researchers used more than one billion sequences of microbial proteins and clustered them into groups based on their similarities. Each group is often called a protein family.
Azad developed the algorithm and software (HipMCL) used to cluster the proteins. Without his input, the discovery would not have been possible.
“It clustered 1.17 billion protein sequences using more than 100,000 processors on a Berkeley supercomputer,” he said. “This is a massively parallel algorithm. Without it, it would have been impossible to analyse billions of proteins.”
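For readers curious about what this kind of clustering looks like, the toy sketch below may help. HipMCL is a parallel implementation of the Markov clustering (MCL) algorithm, and the sketch runs a bare-bones, single-machine version of MCL on five hypothetical proteins. The similarity scores, the function names and the parameter values are all assumptions made for illustration. They are not taken from the paper or from the HipMCL software, which works on sparse matrices distributed across many processors.

```python
# Toy Markov clustering (MCL) on a tiny protein-similarity matrix.
# Illustrative sketch only: the scores below are invented, and this dense,
# single-process version is not how HipMCL itself is implemented.

def normalize_columns(m):
    """Scale every column so it sums to 1 (a column-stochastic matrix)."""
    n = len(m)
    for j in range(n):
        total = sum(m[i][j] for i in range(n))
        if total > 0:
            for i in range(n):
                m[i][j] /= total
    return m

def matmul(a, b):
    """Plain dense matrix product for small square matrices."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def mcl(similarity, inflation=2.0, iterations=50, eps=1e-6):
    """Return clusters (sorted lists of node indices) found by toy MCL."""
    n = len(similarity)
    m = [[float(similarity[i][j]) for j in range(n)] for i in range(n)]
    for i in range(n):
        m[i][i] = max(m[i][i], 1.0)  # self-loops, a common MCL convention
    m = normalize_columns(m)
    for _ in range(iterations):
        m = matmul(m, m)                                   # expansion
        m = [[x ** inflation for x in row] for row in m]   # inflation
        m = normalize_columns(m)
    # Each row that still carries weight lists the members of one cluster.
    clusters = set()
    for i in range(n):
        members = frozenset(j for j in range(n) if m[i][j] > eps)
        if members:
            clusters.add(members)
    return sorted(sorted(c) for c in clusters)

# Hypothetical pairwise similarity scores for five proteins (0 = unrelated).
similarity = [
    [0.0, 0.9, 0.9, 0.0, 0.0],
    [0.9, 0.0, 0.9, 0.0, 0.0],
    [0.9, 0.9, 0.0, 0.0, 0.0],
    [0.0, 0.0, 0.0, 0.0, 0.8],
    [0.0, 0.0, 0.0, 0.8, 0.0],
]

families = mcl(similarity)
print(families)
```

In this example, proteins 0, 1 and 2 end up in one group and proteins 3 and 4 in another. Each group plays the role of a protein family. The real analysis applies the same basic idea to more than a billion sequences, which is why a massively parallel implementation was needed.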
Nature, founded in 1869, is the world’s leading multidisciplinary science journal. It publishes the best peer-reviewed research in all fields of science and technology, with a focus on originality, importance, timeliness and accessibility.