Yang Yan

AI for Science | Computer Vision | Representation Learning | Structural Biology

Research Profile

I am a Ph.D. candidate in computer science, jointly trained at Zhejiang University and Westlake University. My research lies at the intersection of AI for Science, computer vision, and structural biology, with a focus on building foundation models for cryo-EM image analysis. I develop self-supervised models, automated pipelines, and open-source tools to enable high-throughput, expert-free structural biology.

My recent work introduced Cryo-IEF, a foundation model for cryo-EM particle processing pretrained on approximately 65 million particle images and transferred to structural classification, pose-aware clustering, particle-quality ranking, and automated reconstruction. Current work extends this direction through CryoDECO, a foundation-prior framework for compositional and conformational heterogeneity.

Research Vision

My long-term research vision is to develop AI systems that understand molecular structure in its native, dynamic context. Building on the CryoDECO idea that foundation-model priors can break the circular dependency between classification and reconstruction, I am interested in next-generation structural world models that learn the latent laws of macromolecular identity, interaction, motion, and cellular context from large-scale cryo-EM/cryo-ET data.

A central direction is adaptive manifold intelligence: models that infer the intrinsic degrees of freedom of a sample, choose appropriate latent capacity, and move beyond static structure determination toward continuous maps of biological state space. I am also interested in agent-based structural-biology systems that autonomously plan reconstruction strategies, diagnose failure modes, select data-processing actions, and connect imaging data with biochemical and multimodal biological evidence.

The broader goal is panoramic structural biology: replacing single-target, purification-heavy workflows with AI-guided discovery of molecular machines, transient interactions, and conformational dynamics directly from complex native mixtures.

Education

2023-Present

Ph.D. Candidate in Computer Science

Zhejiang University / Westlake University

Joint Ph.D. program supervised by Prof. Fajie Yuan and Prof. Huaizong Shen. Research: foundation models for cryo-EM image processing, automated workflows, and heterogeneity analysis.

2020-2023

M.S. in Translational Medicine (Engineering)

Xiamen University

National Institute of Diagnostics and Vaccine Development in Infectious Diseases, supervised by Prof. Ningshao Xia. Research: machine learning for medical image analysis.

2016-2020

B.S. in Electrical Information Engineering

Northeastern University (Qinhuangdao)

Research included machine-learning applications in localization.

Selected Publications

A comprehensive foundation model for cryo-EM image processing

Yang Yan, Shiqi Fan, Fajie Yuan, Huaizong Shen. Nature Methods, 23(1), 88-95, 2026. DOI: 10.1038/s41592-025-02916-8.

Artificial intelligence foundation model automates cryo-EM structure determination

Yang Yan, Huaizong Shen. Nature Methods Research Briefing, 23, 26-27, 2026. DOI: 10.1038/s41592-025-02917-7.

CryoDECO: Deconstructing Extreme Compositional and Conformational Heterogeneity in Cryo-EM via Foundation Model Priors

Yang Yan, Yanwanyu Xi, Shiqi Fan, Yifei Wang, Ziyun Tang, Fajie Yuan, Huaizong Shen. LangTaoSha Preprint Server, 2026. DOI: 10.65215/LTSpreprints.2025.12.30.000075.

Research Highlights

Cryo-EM Foundation Models

Introduced Cryo-IEF as a foundation-model paradigm for automated cryo-EM particle analysis.

Automated Reconstruction

Developed CryoWizard, a fully automated computational pipeline for single-particle cryo-EM reconstruction.

Heterogeneous Reconstruction

Developed CryoDECO to combine foundation-model priors with ab initio reconstruction for complex compositional and conformational heterogeneity.

Open-Source Python Package

Built cryodata, a reusable PyTorch-ready data layer for CryoSPARC particle outputs, MRC/MRCS preprocessing, LMDB datasets, and metadata conversion.

Cryo-IEF

Foundation model ecosystem for cryo-EM particle processing, including downstream tooling for CryoRanker and CryoClustering.

CryoDECO

Foundation-prior framework for heterogeneous cryo-EM reconstruction and compositional/conformational deconstruction.

cryodata

Open-source data-processing package that turns CryoSPARC particle jobs into reproducible PyTorch-ready datasets.

CryoWizard

End-to-end automated single-particle cryo-EM reconstruction pipeline integrating particle ranking and CryoSPARC workflows.

Posters

Cryo-IEF Poster

Foundation models for cryo-EM particle processing.

CryoDECO Poster

Foundation-prior reconstruction for extreme cryo-EM heterogeneity.
