As increasingly large and complex data are generated and need to be analyzed, statistics has become highly interdisciplinary and is expanding rapidly across the sciences. My career goals include developing state-of-the-art statistical and machine learning methods for solving data-driven problems, advancing statistical theory in the emerging field of data science, and harnessing real-world evidence for decision-making and precision medicine.

My primary research centers on developing statistical theory and methodology for statistical learning with complex data, with an emphasis on data heterogeneity and model interpretability. Much of my methodological research applies to genetic and genomic data analysis, microbiome data analysis, epidemiologic research, and environmental statistics.

Detecting heterogeneous genetic associations

The first thread of my research focuses on nonlinear, heterogeneous genetic association studies based on quantile regression. This approach complements existing methods based on linear regression and leads to new discoveries of local nonlinear gene-trait associations and to the identification of high-risk subgroups.

  • Liu, Y. and Wang, T. (2024+). “A powerful transformation of quantitative responses for biobank-scale association studies”, under review.
  • Wang, T.*, Ionita-Laza, I., and Wei, Y. (2023+). “A unified quantile framework reveals nonlinear heterogeneous transcriptome-wide associations”, revision submitted.
  • Wang, C., Wang, T., Kiryluk, K., Wei, Y., Aschard, H., and Ionita-Laza, I. (2024). “Genome-wide discovery for biomarkers using quantile regression at biobank scale”. Nature Communications, 15(1), 6460.
  • Wang, T., Ionita-Laza, I., and Wei, Y. (2022). “Integrated Quantile RAnk Test (iQRAT) for gene-level associations”. Annals of Applied Statistics, 16(3), 1423–1444.
  • Wang, T., Ling, W., Plantinga, A., Wu, M., and Zhan, X. (2022). “Testing microbiome association using integrated quantile regression models”. Bioinformatics, 38(2), 419–425.

Modeling zero-inflated data

The second thread of my research focuses on robust statistical modeling of zero-inflated data, which are common in many studies, such as microbiome data analysis. My methods exploit the data-generating mechanism behind the zeros, which improves robustness and yields substantial gains in test power and estimation accuracy.

Measurement error analysis in complex structured data

The third thread of my work focuses on addressing measurement error in complex structured data, as arises in environmental statistics and high-dimensional statistics. My recent work on climate data analysis makes it possible to correctly estimate the scaling factors of fingerprints in the climate system and to provide valid statistical inference for detection and attribution analyses of climate change.

Other topics

I am also interested in a broad range of research topics, such as high-dimensional statistics and case-control studies.

  • Ma, S. and Wang, T. (2023). “The optimal pre-post allocation for randomized clinical trials”. BMC Medical Research Methodology, 23:72. doi: 10.1186/s12874-023-01893-w.
  • Wang, T., Liu, J., and Wu, A. (2024). “Semiparametric Analysis in Case-Control Studies for Gene-Environment Independent Models: Bibliographical Connections and Extensions”. Journal of Data Science, accepted.
  • Wang, T. and Asher, A. (2021). “Improved Semiparametric Analysis of Polygenic Gene-Environment Interactions in Case-Control Studies”. Statistics in Biosciences, 13, 386–401.
  • Gaynanova, I. and Wang, T. (2019). “Sparse quadratic classification rules via linear dimension reduction”. Journal of Multivariate Analysis, 169, 278–299.

Please check here for more of my publications.

Research opportunities are open to highly motivated students. If you are interested, please reach out for more details.