MiLoPYP: self-supervised molecular pattern mining and particle localization in situ
Cryo-electron tomography (CET) allows the routine visualization of cellular landscapes in three dimensions at nanometer-range resolution. When combined with single-particle tomography (SPT), CET can yield near-atomic resolution structures of frequently occurring macromolecules within their native environment. Two outstanding challenges associated with CET/SPT are the automatic identification and localization of proteins, tasks that are hindered by the molecular crowding inside cells, imaging distortions characteristic of CET tomograms, and the sheer size of tomographic datasets. Current methods suffer from low accuracy, demand extensive and time-consuming manual labeling, or are limited to the detection of specific types of proteins. Here, we present MiLoPYP, a two-step, dataset-specific, contrastive learning-based framework that enables fast molecular pattern mining followed by accurate protein localization. MiLoPYP’s ability to effectively detect and localize a wide range of targets, including globular and tubular complexes as well as large membrane proteins, will help streamline and broaden the applicability of high-resolution workflows for in situ structure determination.