Mol Syst Biol. 2024 Mar 12. doi: 10.1038/s44320-024-00029-6.
PIFiA: Self-supervised Approach for Protein Functional Annotation from Single-Cell Imaging Data

Anastasia Razdaibiedina1,2,4, Alexander Brechalov1,2, Helena Friesen2, Mojca Mattiazzi Usaj2*, Myra Paz David Masinas2, Harsha Garadi Suresh2, Kyle Wang1,2, Charles Boone1,2,5†, Jimmy Ba3,4†, Brenda Andrews1,2†

  • 1Department of Molecular Genetics, University of Toronto, Toronto ON, Canada
  • 2The Donnelly Centre, University of Toronto, Toronto ON, Canada
  • 3Department of Computer Science, University of Toronto, Toronto ON, Canada
  • 4Vector Institute for Artificial Intelligence, Toronto ON, Canada
  • 5RIKEN Center for Sustainable Resource Science, 2-1 Hirosawa, Wako, Saitama, Japan

  • † Corresponding authors
  • * current address: Department of Chemistry and Biology, Toronto Metropolitan University, Toronto, ON, Canada

  • Correspondence: brenda.andrews@utoronto.ca, jimmy@cs.toronto.edu, charlie.boone@utoronto.ca

Abstract

Fluorescence microscopy data describe protein localization patterns at single-cell resolution and have the potential to reveal whole-proteome functional information with remarkable precision. Yet, extracting biologically meaningful representations from cell micrographs remains a major challenge. Existing approaches often fail to learn robust and noise-invariant features or rely on supervised labels for accurate annotations. We developed PIFiA (Protein Image-based Functional Annotation), a self-supervised approach for protein functional annotation from single-cell imaging data. We imaged the global yeast ORF-GFP collection and applied PIFiA to generate protein feature profiles from single-cell images of fluorescently tagged proteins. We show that PIFiA outperforms existing approaches for molecular representation learning and describe a range of downstream analysis tasks to explore the information content of the feature profiles. Specifically, we cluster extracted features into a hierarchy of functional organization, study cell population heterogeneity, and develop techniques to distinguish multi-localizing proteins and identify functional modules. Finally, we confirm new PIFiA predictions using a colocalization assay, suggesting previously unappreciated biological roles for several proteins. Paired with a fully interactive website (https://thecellvision.org/pifia/), PIFiA is a resource for the quantitative analysis of protein organization within the cell.