As increasingly large and complex data are generated and need to be analyzed, statistics has become highly interdisciplinary and is rapidly expanding across the sciences. My career goals include developing state-of-the-art statistical and machine learning methods for solving data-driven problems, advancing statistical theory in the emerging field of data science, and harnessing real-world evidence for decision-making and precision medicine.

My primary research centers on developing statistical theory and methodology for statistical learning with complex data, with an emphasis on data heterogeneity and model interpretability. Much of my methodological research applies to genetic and genomic data analysis, microbiome data analysis, epidemiologic research, and environmental statistics.

Quantile-based inference for heterogeneous genetic association studies

The first thread of my research focuses on nonlinear, heterogeneous genetic association studies based on quantile regression. This approach complements existing methods based on linear regression and leads to new discoveries of local nonlinear gene-trait associations and the identification of high-risk subgroups.

  • Wang, T.*, Ionita-Laza, I., and Wei, Y. (2023+). “A unified quantile framework reveals nonlinear heterogeneous transcriptome-wide associations”, under review.
  • Wang, C., Wang, T., Wei, Y., Aschard, H., and Ionita-Laza, I. (2023+). “Quantile Regression for biomarkers in the UK Biobank”, under review.
  • Wang, T.*, Ionita-Laza, I., and Wei, Y. (2022). “Integrated Quantile RAnk Test (iQRAT) for gene-level associations”. Annals of Applied Statistics, 16(3), 1423–1444.
  • Wang, T., Ling, W., Plantinga, A., Wu, M., and Zhan, X. (2022). “Testing microbiome association using integrated quantile regression models”. Bioinformatics, 38(2), 419-425.

Novel statistical methods for handling zero-inflated data

The second thread of my research focuses on robust statistical modeling of zero-inflated data, which arise commonly in many studies, such as microbiome data analysis. My methods explicitly account for the data-generating mechanism behind the zeros, which improves robustness and yields substantial gains in test power and estimation accuracy.

  • Zhao, H. and Wang, T.* (2024+). “A high-dimensional calibration method for log-contrast models subject to measurement errors”, under review.
  • Wang, Z. and Wang, T.* (2023+). “A Semiparametric Quantile Single-Index Model for Zero-Inflated and Overdispersed Outcomes”, under review.
  • Jiang, R., Zhan, X.*, and Wang, T.* (2023). “A Flexible Zero-Inflated Poisson-Gamma Model with Application to Microbiome Read Count Data”, Journal of the American Statistical Association, 118(542), 792–804.
  • Wang, T.*, Zhang, W., and Wei, Y. (2024). “ZIKQ: An innovative centile chart method for utilizing natural history data in rare disease clinical development”, Statistica Sinica, to appear.

Statistical analyses on the errors-in-variables issue with complex data

The third thread of my work addresses errors-in-variables issues in data with complex structure, such as those arising in environmental statistics. My recent work on climate data analysis makes it possible to correctly estimate the scaling factors of fingerprints in the climate system and to provide valid statistical inference for detection and attribution analyses of climate change.

Other topics

I am also interested in a broad range of research topics, such as high-dimensional statistics and case-control studies.

  • Ma, S. and Wang, T.* (2023). “The optimal pre-post allocation for randomized clinical trials”. BMC Medical Research Methodology, 23:72. doi: 10.1186/s12874-023-01893-w.
  • Wang, T., Liu, J., and Wu, A. (2023+). “Semiparametric Analysis in Case-Control Studies for Gene-Environment Independent Models: Bibliographical Connections and Extensions”, under review.
  • Wang, T.* and Asher, A. (2021). “Improved Semiparametric Analysis of Polygenic Gene-Environment Interactions in Case-Control Studies”. Statistics in Biosciences, 13, 386–401.
  • Gaynanova, I. and Wang, T. (2019). “Sparse quadratic classification rules via linear dimension reduction”. Journal of Multivariate Analysis, 169, 278–299.

Please check here for more of my publications.

Research opportunities are open to highly motivated students. Interested individuals are encouraged to reach out for more details.