Yang Yan
AI for Science | Computer Vision | Representation Learning | Structural Biology
Research Profile
I am a Ph.D. candidate in computer science, jointly trained at Zhejiang University and Westlake University. My research lies at the intersection of AI for Science, computer vision, and structural biology, with a focus on building foundation models for cryo-EM image analysis. I develop self-supervised models, automated pipelines, and open-source tools to enable high-throughput, expert-free structural biology.
My recent work introduced Cryo-IEF, a foundation model for cryo-EM particle processing pretrained on approximately 65 million particle images and transferred to structural classification, pose-aware clustering, particle-quality ranking, and automated reconstruction. Current work extends this direction through CryoDECO, a foundation-prior framework for compositional and conformational heterogeneity.
Research Vision
My long-term research vision is to develop AI systems that understand molecular structure in its native, dynamic context. Building on the CryoDECO idea that foundation-model priors can break the circular dependency between classification and reconstruction, I am interested in next-generation structural world models that learn the latent laws of macromolecular identity, interaction, motion, and cellular context from large-scale cryo-EM/cryo-ET data.
A central direction is adaptive manifold intelligence: models that infer the intrinsic degrees of freedom of a sample, choose appropriate latent capacity, and move beyond static structure determination toward continuous maps of biological state space. I am also interested in agent-based structural-biology systems that autonomously plan reconstruction strategies, diagnose failure modes, select data-processing actions, and connect imaging data with biochemical and multimodal biological evidence.
The broader goal is panoramic structural biology: replacing single-target, purification-heavy workflows with AI-guided discovery of molecular machines, transient interactions, and conformational dynamics directly from complex native mixtures.
Education
Ph.D. Candidate in Computer Science
Zhejiang University / Westlake University
Joint Ph.D. program supervised by Prof. Fajie Yuan and Prof. Huaizong Shen. Research: foundation models for cryo-EM image processing, automated workflows, and heterogeneity analysis.
M.S. in Translational Medicine (Engineering)
Xiamen University
National Institute of Diagnostics and Vaccine Development in Infectious Diseases, supervised by Prof. Ningshao Xia. Research: machine learning for medical image analysis.
B.S. in Electrical Information Engineering
Northeastern University (Qinhuangdao)
Research included machine-learning applications in localization.
Selected Publications
A comprehensive foundation model for cryo-EM image processing
Yang Yan, Shiqi Fan, Fajie Yuan, Huaizong Shen. Nature Methods, 23(1), 88-95, 2026. DOI: 10.1038/s41592-025-02916-8.
Artificial intelligence foundation model automates cryo-EM structure determination
Yang Yan, Huaizong Shen. Nature Methods Research Briefing, 23, 26-27, 2026. DOI: 10.1038/s41592-025-02917-7.
CryoDECO: Deconstructing Extreme Compositional and Conformational Heterogeneity in Cryo-EM via Foundation Model Priors
Yang Yan, Yanwanyu Xi, Shiqi Fan, Yifei Wang, Ziyun Tang, Fajie Yuan, Huaizong Shen. LangTaoSha Preprint Server, 2026. DOI: 10.65215/LTSpreprints.2025.12.30.000075.
Research Highlights
Cryo-EM Foundation Models
Introduced Cryo-IEF as a foundation-model paradigm for automated cryo-EM particle analysis.
Automated Reconstruction
Developed CryoWizard, a fully automated computational pipeline for single-particle cryo-EM reconstruction.
Heterogeneous Reconstruction
Developed CryoDECO to combine foundation-model priors with ab initio reconstruction for complex compositional and conformational heterogeneity.
Open-Source Python Package
Built cryodata, a reusable PyTorch-ready data layer for CryoSPARC particle outputs, MRC/MRCS preprocessing, LMDB datasets, and metadata conversion.
Featured Software
Cryo-IEF
Foundation-model ecosystem for cryo-EM particle processing, including downstream tooling for CryoRanker and CryoClustering.
CryoDECO
Foundation-prior framework for heterogeneous cryo-EM reconstruction and compositional/conformational deconstruction.
cryodata
Open-source data-processing package that turns CryoSPARC particle jobs into reproducible PyTorch-ready datasets.
CryoWizard
End-to-end automated single-particle cryo-EM reconstruction pipeline integrating particle ranking and CryoSPARC workflows.
Posters
Cryo-IEF Poster
Foundation models for cryo-EM particle processing.
CryoDECO Poster
Foundation-prior reconstruction for extreme cryo-EM heterogeneity.
