My applied work is motivated by scientific problems in which data are large, heterogeneous, and imperfect, and in which classical modeling assumptions are often violated. I use applications in statistical genetics and genomics, electronic health records, epidemiology, microbiome studies, and climate science as testbeds to drive methodological development and to demonstrate practical impact.

Rather than treating applications as isolated case studies, I focus on recurring data challenges—such as distribution shift, outcome heterogeneity, and measurement error—that arise across domains. Selected application-driven papers are listed below. For a complete and up-to-date list, please refer to the Publications page.

Underlining indicates a student co-author under my (co)supervision; separate markers denote an undergraduate student mentee and the corresponding author.


Statistical genetics and genomics

Impact: Scalable discovery and interpretation of heterogeneous genetic effects in large biobank and omics studies.


Electronic health records and epidemiology

Impact: Robust inference and transportable learning in the presence of missing data, cohort shift, and measurement error.


Microbiome and compositional data

Impact: Distribution-aware and robust association analysis for zero-inflated and compositional microbiome data.


Climate science

Impact: Robust detection and attribution of climate signals under measurement error and extreme-value behavior.