Accurate Detection of Proteins in Cryo-Electron Tomograms from Sparse Labels
Cryo-electron tomography (CET) combined with sub-volume averaging (SVA) is currently the only imaging technique capable of determining the structure of proteins imaged inside cells at molecular resolution. To obtain high-resolution reconstructions, sub-volumes containing randomly distributed copies of the protein of interest need to be identified, extracted, and subjected to SVA, making accurate particle detection a critical step in the CET processing pipeline. Classical template-based methods have high false-positive rates due to the very low signal-to-noise ratios (SNR) typical of CET volumes, while more recent neural-network-based detection algorithms require extensive labeling, are very slow to train, and can take days to run. To address these issues, we propose a novel particle detection framework that uses positive-unlabeled learning and exploits the unique properties of 3D tomograms to improve detection performance. Our end-to-end framework is able to identify particles within minutes when trained using a single partially labeled tomogram. We conducted extensive validation experiments on two challenging CET datasets representing different experimental conditions and observed an improvement of more than 10% in mean average precision (mAP) and F1 scores compared to existing particle-picking methods used in CET. Ultimately, the proposed framework will facilitate the structural analysis of challenging biomedical targets imaged within the native environment of cells.
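To make the positive-unlabeled setting concrete, the sketch below computes one common PU objective, the non-negative PU risk with a sigmoid loss. It is an illustration only: the abstract does not state which PU estimator the framework uses, and the function names, the `prior` parameter (the assumed fraction of true particles among unlabeled locations), and the choice of loss are all assumptions introduced here.

```python
import numpy as np


def sigmoid_loss(scores, label):
    """Sigmoid loss for label +1 or -1; small when the score sign matches the label."""
    return 1.0 / (1.0 + np.exp(label * scores))


def nn_pu_risk(pos_scores, unl_scores, prior):
    """Non-negative PU risk (illustrative, not the paper's stated objective).

    pos_scores: classifier scores at labeled particle locations
    unl_scores: classifier scores at unlabeled locations
    prior:      assumed fraction of positives among the unlabeled data
    """
    pos_scores = np.asarray(pos_scores, dtype=float)
    unl_scores = np.asarray(unl_scores, dtype=float)

    # Cost of labeled positives being scored as background.
    risk_pos_as_pos = prior * sigmoid_loss(pos_scores, +1).mean()
    # Cost the positives would incur if they were treated as negatives.
    risk_pos_as_neg = prior * sigmoid_loss(pos_scores, -1).mean()
    # Cost of treating every unlabeled location as a negative.
    risk_unl_as_neg = sigmoid_loss(unl_scores, -1).mean()

    # The unlabeled set contains hidden positives, so subtract their
    # expected contribution and clamp at zero to avoid a negative risk.
    neg_risk = max(0.0, risk_unl_as_neg - risk_pos_as_neg)
    return risk_pos_as_pos + neg_risk
```

The clamp matters when only a few particles are labeled: without it, a flexible model can drive the estimated negative risk below zero by overfitting the small positive set.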