As increasingly large and complex data are generated and need to be analyzed, statistics has become highly interdisciplinary and is expanding rapidly across the sciences. My career goals include developing state-of-the-art statistical and machine learning methods for solving data-driven problems, advancing statistical theory in the emerging field of data science, and harnessing real-world evidence for decision-making and precision medicine.
My primary research centers on developing statistical theory and methodology for statistical learning with complex data, with an emphasis on data heterogeneity and model interpretability. Much of my methodological research applies to genetic and genomic data analysis, microbiome data analysis, epidemiologic research, and environmental statistics.
Detecting heterogeneous genetic associations
The first thread of my research focuses on nonlinear, heterogeneous genetic association studies based on quantile regression. This approach complements existing methods based on linear regression and leads to discoveries of local nonlinear gene-trait associations and the identification of high-risk subgroups.
- Liu, Y. and Wang, T.✉ (2024+). “A powerful transformation of quantitative responses for biobank-scale association studies”, under review.
- Wang, T.*, Ionita-Laza, I., and Wei, Y. (2023+). “A unified quantile framework reveals nonlinear heterogeneous transcriptome-wide associations”, revision submitted.
- Wang, C., Wang, T., Kiryluk, K., Wei, Y., Aschard, H., and Ionita-Laza, I. (2024). “Genome-wide discovery for biomarkers using quantile regression at biobank scale”, Nature Communications, 15 (1), 6460.
- Wang, T.✉, Ionita-Laza, I., and Wei, Y. (2022). “Integrated Quantile RAnk Test (iQRAT) for gene-level associations”. Annals of Applied Statistics, 16 (3), 1423 - 1444.
- Wang, T., Ling, W., Plantinga, A., Wu, M., and Zhan, X. (2022). “Testing microbiome association using integrated quantile regression models”. Bioinformatics, 38(2), 419-425.
Modeling zero-inflated data
The second thread of my research focuses on robust statistical modeling of zero-inflated data, which are common in many studies, such as microbiome data analysis. My methods exploit the data-generating rationale behind the zeros, significantly improving test power and estimation accuracy.
- Wang, Z.♦, Ling, W., and Wang, T.✉ (2024+). “A Semiparametric Quantile Regression Rank Score Test for Zero-inflated Data”, under review.
- Zhao, H. and Wang, T.✉ (2024+). “A high-dimensional calibration method for log-contrast models subject to measurement errors”, revision submitted.
- Mao, Y., Jiang, Z., Wang, T., and Zhan, X. (2024+). “Tree-guided compositional variable selection analysis of microbiome data”, under review.
- Wang, Z.♦ and Wang, T.✉ (2023+). “A Semiparametric Quantile Single-Index Model for Zero-Inflated and Overdispersed Outcomes”, revision submitted.
- Wang, T.✉, Zhang, W., and Wei, Y. (2024). “ZIKQ: An innovative centile chart method for utilizing natural history data in rare disease clinical development”, Statistica Sinica, to appear.
- Jiang, R.♦, Zhan, X.✉, and Wang, T.✉ (2023). “A Flexible Zero-Inflated Poisson-Gamma Model with Application to Microbiome Read Count Data”, Journal of the American Statistical Association, 118 (542), 792 - 804.
Measurement error analysis in complex structured data
The third thread of my work focuses on addressing measurement errors in complex structured data, such as those arising in environmental statistics and high-dimensional statistics. My recent work on climate data analysis enables correct estimation of the scaling factors of the fingerprints in the climate system and provides valid statistical inference for detection and attribution analyses of climate change.
- Zhao, H. and Wang, T.✉ (2024+). “A high-dimensional calibration method for log-contrast models subject to measurement errors”, revision submitted.
- Li, Y., Wang, T.✉, Yan, J., and Zhang, X. (2024+). “Detection and Attribution Analysis of Regional Temperature with Estimating Equations”, revision submitted.
- Zhao, H. and Wang, T.✉ (2023+). “A pseudo-simulation extrapolation method for misspecified models with errors-in-variables in epidemiological studies”, revision submitted.
- Zhou, S., Pati, D., Wang, T., Yang, Y., and Carroll, R. J. (2023). “Gaussian Processes with Errors in Variables: theory and computation”, Journal of Machine Learning Research, 24, 1-53.
- Lau, Y., Wang, T.✉, Yan, J., and Zhang, X. (2023). “Extreme Value Modeling with Errors-in-Variables in Detection and Attribution of Changes in Climate Extremes”, Statistics and Computing, 33 (6), 125.
- Ma, S., Wang, T.✉, Yan, J., and Zhang, X. (2023). “Optimal Fingerprinting with Estimating Equations”, Journal of Climate, 36(20), 7109-7122.
- Jiang, R.♦, Zhan, X.✉, and Wang, T.✉ (2023). “A Flexible Zero-Inflated Poisson-Gamma Model with Application to Microbiome Read Count Data”, Journal of the American Statistical Association, 118 (542), 792 - 804.
- Blas Achic, B.♯, Wang, T.♯, Su, Y., Kipnis, V., Dodd, K., and Carroll, R. J. (2018). “Categorizing a Continuous Predictor Subject to Measurement Error”. Electronic Journal of Statistics, Vol. 12, No. 2, 4032-4056. (♯ joint first authors).
Other topics
I am also interested in a broad range of research topics, such as high-dimensional statistics and case-control studies.
- Ma, S. and Wang, T.✉ (2023). “The optimal pre-post allocation for randomized clinical trials”. BMC Medical Research Methodology, 23:72, doi: 10.1186/s12874-023-01893-w.
- Wang, T., Liu, J., and Wu, A. (2024). “Semiparametric Analysis in Case-Control Studies for Gene-Environment Independent Models: Bibliographical Connections and Extensions”, Journal of Data Science, accepted.
- Wang, T.✉ and Asher, A. (2021). “Improved Semiparametric Analysis of Polygenic Gene-Environment Interactions in Case-Control Studies”. Statistics in Biosciences, 13, 386–401.
- Gaynanova, I. and Wang, T. (2019). “Sparse quadratic classification rules via linear dimension reduction”. Journal of Multivariate Analysis, 169, 278–299.
Please check here for more of my publications.
Research opportunities are open to highly motivated students. Interested individuals are encouraged to reach out for more details.