Automated ‘pipeline’ improves access to advanced microscopy data

April 14, 2020

A new data-processing approach created by scientists at the University of Michigan Life Sciences Institute offers a simpler, faster path to data generated by cryo-electron microscopy (cryo-EM) instruments, removing a barrier to wider adoption of this powerful technique.

Cryo-EM enables scientists to determine the 3-D shape of cellular proteins and other molecules that have been flash-frozen in a thin layer of ice. Advanced microscopes beam high-energy electrons through the ice while capturing thousands of videos. These videos are then averaged to create a 3-D structure of the molecule.

By uncovering the precise structures of these molecules, researchers can answer important questions about how the molecules function in cells and how they might contribute to human health and disease. For example, researchers recently used cryo-EM to reveal how a protein spike on the virus that causes COVID-19 enables it to gain entry into host cells.

Recent advances in cryo-EM technology have rapidly opened this field to new users and increased the rate at which data can be collected. Despite these improvements, however, researchers still face a substantial hurdle in accessing the full potential of this technique: the complex data-processing landscape required to turn the microscope’s terabytes of data into a 3-D structure ready for analysis.

Before researchers can begin analyzing the 3-D structure they want to study, they have to complete a series of preprocessing steps and subjective decisions. Currently, these steps must be supervised by humans — and because researchers use cryo-EM to analyze a huge variety of molecule types, scientists thought that it was nearly impossible to create a general set of guidelines that all researchers could follow for these steps, says Yilai Li, Ph.D., a Willis Life Sciences Fellow at the LSI who led the development of the new program.

“If we can create an automated pipeline for those pre-processing steps, the whole process could be much more user-friendly, especially for newcomers to the field,” Li explains.

Using machine learning, Li and his colleagues in the lab of LSI assistant professor Michael Cianfrocco, Ph.D., have developed just such a pipeline. The program was published April 14 as part of a study in the journal Structure.  

The new program connects several deep-learning and image-analysis tools with existing data-preprocessing algorithms to narrow enormous datasets down to the information that researchers need to begin their analysis.

“This pipeline takes the knowledge that experienced users have gained and puts it into a program that improves accessibility for users from a range of backgrounds,” says Cianfrocco, who is also an assistant professor of biological chemistry in the U-M Medical School. “It really streamlines the process stage so that researchers can jump in and focus on what’s important: the scientific questions they want to ask and answer.”

Top image: A new workflow developed by U-M researchers streamlines the processing of cryo-EM data, enabling researchers to more easily begin analyzing their data to answer scientific questions. Illustration by Rajani Arora, LSI multimedia designer.

Disclosure & Authorship

This research was supported by the National Science Foundation and the National Institutes of Health.

The study authors are Yilai Li, Michael A. Cianfrocco and Jennifer N. Cash of the University of Michigan Life Sciences Institute, and John G. Tesmer of Purdue University.

High-throughput cryo-EM enabled by user-free preprocessing routines, Structure. DOI: 10.1016/j.str.2020.03.008