Yong Chen, Ph.D.

Dr. Yong Chen’s group focuses on developing statistical and computational methods and software for problems arising from modern data sources, such as electronic health records, vaccine/drug safety reports, Twitter data, mHealth data, and high-dimensional genetic and genomic data. He has a strong interest in risk prediction models, comparative effectiveness research, and precision medicine, as well as in evidence-based medicine. He co-directs a course on systematic review and meta-analysis at the Penn School of Medicine, and he leads the xmeta-project based at Penn Medicine.

Dr. Chen is currently mentoring four postdoctoral fellows and three doctoral students. His research group is supported by two NIH-funded projects, one on risk prediction of post-vaccination events using massive vaccine safety reports (1R01AI130460; 2017-2022) and one on developing bias-reduction methods for systematic reviews (1R01LM012607; 2017-2021), as well as several collaborative grants, including PCORI, NIH R01, and P50 awards. Dr. Chen has served as a scientific merit reviewer for PCORI since 2015. He has served as a dissertation advisor for five doctoral students to date.

Dr. Chen graduated from the Department of Biostatistics at the Johns Hopkins University with the Margaret Merrell Award for excellence in research. He was also a Sommer Scholar (an honor named after the Dean of the School of Public Health of the Johns Hopkins University) during 2005-2010, and he received the Institute of Mathematical Statistics (IMS) Travel Award in 2015.


Active Research Grants (selected)

  1. 1R01AI130460: Dynamic learning for post-vaccine event prediction using temporal information in VAERS.

    Role: Principal Investigator (with Cui Tao at University of Texas). Period: 04/01/2017 - 03/31/2022

  2. 1R01LM012607: A General Framework to Account for Outcome Reporting Bias in Systematic Reviews

    Role: Principal Investigator. Period: 09/08/2017 - 09/07/2021

Postdoc fellow position available

Selected Publications

*: first-authored by a student advised/co-advised by Dr. Chen

  • Statistical inference
    • Chen, Y, Huang, J, Ning, Y, Liang, KY and Lindsay, B. A Conditional Composite Likelihood Ratio Test with Boundary Constraints, Biometrika (2017).
    • Hong, C*, Ning, Y, Wang, S, Wu, H, Carroll, R.J. and Chen, Y. PLEMT: A Novel Pseudolikelihood Based EM Test for Homogeneity in Generalized Exponential Tilt Mixture Models. Journal of the American Statistical Association (2017). (This paper won a 2015 JSM Biometrics Section Byar Award.)
    • Hong, C*, Ning, Y, Wei, P, Cao, Y and Chen, Y. A semiparametric model for vQTL mapping, Biometrics (2017).
    • Chen, Y, Ning, J, Ning, Y, Liang, KY and Bandeen-Roche, K. On the Pseudolikelihood Inference for Semiparametric Models with Boundary Problems, Biometrika (2017).
    • Ning, J, Chen, Y, Cai, C, Huang, X and Wang, MC. On the Dependence Structure of Bivariate Recurrent Event Processes: Inference and Estimation, Biometrika 2015 March, 102 (2), 345–358.
    • Ning, Y and Chen, Y. Test for homogeneity in semiparametric exponential tilt mixture models, Scandinavian Journal of Statistics 2015 June, 41 (2), 504–517.
    • Chen, Y, Ning, J and Cai, C. Regression analysis of longitudinal data with irregular and informative observation times, Biostatistics, March 2015, 16(4): 727-739.
    • Chen, Y, Ning, Y, Hong, C and Wang, S. Semiparametric tests for identifying differentially methylated loci with case-control designs using Illumina arrays, Genetic Epidemiology 2014 Jan 38 (1), 42-50.
    • Chen, Y and Liang, KY. On the asymptotic behaviour of the pseudolikelihood ratio test statistic with boundary problems, Biometrika 2010 June, 97 (3), 603–620.
  • Medical Informatics
    • Huang, J*, Duan, R*, Hubbard, R, Wu, Y, Moore, JH, Xu, H and Chen, Y. A prior knowledge guided integrated likelihood estimation method (PIE) for bias reduction in association studies using electronic health records data, Journal of the American Medical Informatics Association 2017.
    • Duan, R*, Zhang, X, Du, J, Huang, J, Tao, C and Chen, Y. Post-marketing Drug Safety Evaluation using Data Mining Based on FAERS, International Conference on Data Mining and Big Data 2017. Springer, Cham.
    • Huang, J*, Zhang, X, Du, J, Duan, R, Yang, L, Moore, JH, Chen, Y and Tao, C. Comparing difference of adverse effects among multiple drugs using FAERS data, Medinfo 2017.
    • Du, J*, Huang, J*, Duan, R*, Chen, Y and Tao, C. Comparing the Human Papillomavirus Vaccination Opinions Trends from Different Twitter User Groups with a Machine Learning Based System and Semiparametric Nonlinear Regression, Medinfo 2017.
    • Sun, H, Wang, Y, Chen, Y, Li, Y and Wang, S. pETM: a penalized Exponential Tilt Model for analysis of correlated high-dimensional DNA methylation data, Bioinformatics 2017.
    • Duan, R*, Cao, M, Wu, Y, Huang, J, Denny, J, Xu, H and Chen, Y. An Empirical Study for Impacts of Measurement Errors on EHR based Association Studies, AMIA annual symposium proceedings 2016. (This paper won the first prize of “Best of Student Papers in Knowledge Discovery and Data Mining (KDDM)” Awards)
    • Cai, Y*, Du, J, Huang, J, Tao, C and Chen, Y. A Signal Detection Method for Temporal Variation of Adverse Effect with Vaccine Adverse Event Reporting System Data. BMC Medical Informatics and Decision Making 2016.
    • Du J, Cai Y, Chen, Y, He Y, Tao C. Analysis of Individual Differences in Vaccine Pharmacovigilance using VAERS Data and MedDRA System Organ Classes: A Use Case Study with Trivalent Influenza Vaccine, Vaccine 2016.
    • Du J, Cai Y, Chen, Y, Tao C. Trivalent influenza vaccine adverse symptoms analysis based on MedDRA terminology using VAERS data in 2011, Journal of Biomedical Semantics 2016.
    • Tao C, Du J, Cai Y, Chen, Y. Trivalent Influenza Vaccine Adverse Event Analysis Based On MedDRA System Organ Classes Using VAERS Data, Studies in health technology and informatics 2015. 216:1076.
    • Cao M, Chen, Y, Zhu M, Zhang J. Automated Evaluation of Medical Software Usage: Algorithm and Statistical Analyses. Studies in health technology and informatics 2015. 216:965.
    • Tong, P, Chen, Y, Su, X and Coombes, K. SIBER: Systematic Identification of Bimodally Expressed Genes Using RNAseq Data, Bioinformatics 2013.
  • Comparative Effectiveness Research
    • Zhang, J, Ko, CW, Nie, L, Chen, Y and Tiwari, R. Bayesian hierarchical methods for meta-analysis combining randomized-controlled and single-arm studies. Statistical Methods in Medical Research 2018.
    • Hong, C*, Riley, R and Chen, Y. Robust variance estimator for Riley method of the multivariate meta-analysis when within-study correlations are unknown. Research Synthesis Methods 2017.
    • Agarwal, R, Bartsch, SM, Kelly, BJ, Prewitt, M, Liu, YL, Chen, Y and Umscheid, CA. Newer glycopeptide antibiotics for treatment of complicated skin and soft tissue infections: a systematic review, network meta-analysis and cost analysis. Clinical Microbiology and Infection 2017.
    • Huang, J*, Liu, YL*, Vitale, S, Penning, T, Whitehead, A, Vachani, A, Clapper, M, Muscat, J, Lazarus, P, Scheet, P, Moore, JH, Chen, Y and Ying, G. On meta- and mega-analyses for gene-environment interactions. Genetic Epidemiology 2017.
    • Huang, J*, Huang, J, Chen, Y and Ying, G. Evaluation of Approaches to Analyzing Continuous Correlated Eye Data When Sample Size Is Small. Ophthalmic Epidemiology 2017.
    • Ma, X, Lian, X, Chu, H, Ibrahim, J, and Chen, Y. A Bayesian hierarchical model for network meta-analysis of diagnostic tests. Biostatistics 2017.
    • Wang, L, Chen, Y and Zhu, H. Implementing Optimal Allocation in Clinical Trials with Multiple Endpoints. Journal of Statistical Planning and Inference 2017.
    • Ning J, Chen, Y and Piao, J. Maximum likelihood estimation and EM algorithm of Copas selection model for publication bias correction. Biostatistics 2017. *: equally contributed first-author.
    • Li, X*, Chen, Y and Li, R. A frailty model for recurrent events during alternating restraint and non-restraint time periods. Statistics in Medicine 20 February 2017.
    • Liu, Y*, DeSantis, S and Chen, Y. Bayesian network meta-analysis of clinical trials with correlated outcomes subject to publication bias: application to a systematic review of alcohol dependence. Journal of the Royal Statistical Society: Series C 2017.
    • Liu, Y*, Chen, Y and Scheet, P. A Meta-Analytic Framework for Detection of Genetic Interactions, Genetic Epidemiology 2016.
    • Chen, Y, Liu, Y, Chu, H, Lee, M and Schmid, C. A simple and robust method for multivariate meta-analysis of diagnostic test accuracy. Statistics in Medicine 2016.
    • Chahoud, J, Semaan, A, Chen, Y, Cao, M, Rieber, A, Rady, P and Tyring, S. The Association between Beta-genus Human Papillomavirus and Cutaneous Squamous Cell Carcinoma in Immunocompetent Individuals: a Meta-analysis, JAMA Dermatology 2015 Dec.
    • Chen, Y, Cai, Y, Hong, C, and Jackson, D. Inference for correlated effect sizes using multiple univariate meta-analyses, Stat Med. 2015 Nov.
    • Chen, Y, Hong C, Ning Y, Su X. Meta-analysis of studies with bivariate binary outcomes: a marginal beta-binomial model approach. Stat Med. 2015 Aug 24. PubMed PMID: 26303591.
    • Liu, Y*, Chen, Y and Chu H, A unification of models for meta-analysis of diagnostic accuracy studies without a gold standard, Biometrics 2015 Jun;71(2):538-47.
    • Chen, Y, Liu Y, Ning J, Cormier J, Chu H. A hybrid model for combining case-control and cohort studies in systematic reviews of diagnostic tests. J R Stat Soc Ser C Appl Stat. 2015 Apr 1;64(3):469-489.
    • Chen, Y, Hong C, Riley RD. An alternative pseudolikelihood method for multivariate random-effects meta-analysis. Stat Med. 2015 Feb 10;34(3):361-80.
    • Chen, Y, Liu Y, Ning J, Nie L, Zhu H, Chu H. A composite likelihood method for bivariate meta-analysis in diagnostic systematic reviews. Stat Methods Med Res. 2014 Dec 14. PubMed Central PMCID: PMC4466215.
    • Chen, Y, Luo, S, Chu, H, Su, X and Nie, L. An Empirical Bayes Method for Multivariate Meta-analysis with Application in Clinical Trials. Communications in Statistics - Theory and Methods. 2014 43(16), 3536–3551.
    • Ma, X, Chen, Y, Cole, S and Chu, H. A hybrid Bayesian hierarchical model combining cohort and case-control studies for meta-analysis of diagnostic tests: accounting for partial verification bias. Statistical Methods in Medical Research. 2014.
    • Luo S, Chen, Y, Su X, Chu H. mmeta: An R Package for Multivariate Meta-Analysis. J Stat Softw. 2014 Jan 1;56(11):11. PubMed PMID: 24904241; PubMed Central PMCID: PMC4043353.
    • Chen, Y, Luo, S, Chu, H and Wei, P. Bayesian inference on risk differences: an application to multivariate meta-analysis of adverse events in clinical trials. Statistics in Biopharmaceutical Research. 2013 5 (2): 142-155.
    • Nie, L, Soon, G, Qi, K, Chen, Y, and Chu, H. A note on partial covariate-adjustment and design considerations in noninferiority trials when patient-level data are not available. Journal of Biopharmaceutical Statistics. 2013 23 (5), 1042–1053.
    • Chu, H, Nie, L, Chen, Y, Huang, Y and Sun, W. Bivariate random effects models for meta-analysis of comparative studies with binary outcomes: methods for the absolute risk difference and relative risk. Statistical Methods in Medical Research. 2012. 21 (6): 621–633.
    • Chen, Y, Chu H, Luo S, Nie L, Chen S. Bayesian analysis on meta-analysis of case-control studies accounting for within-study correlation. Stat Methods Med Res. 2011 Dec 4. [Epub ahead of print] PubMed PMID: 22143403; PubMed Central PMCID: PMC3683108.

    Statistical software

    • xmeta-project: a platform for conducting comprehensive meta-analysis. https://www.xmeta.wiki/

      R package xmeta: a comprehensive collection of functions for multivariate meta-analyses of continuous or binary outcomes. This package also includes functions and visualization tools for detection and correction of publication bias in multivariate meta-analysis. https://cran.rstudio.com/web/packages/xmeta/index.html

    • R package mmeta: a free, cross-platform and open-source program for pooling summary measures on correlated binary outcomes from multiple studies https://cran.r-project.org/web/packages/mmeta/index.html
    • R package robustETM: Testing homogeneity for the generalized exponential tilt model. This package includes a collection of functions for (1) implementing methods for testing homogeneity in the generalized exponential tilt model; and (2) implementing existing methods for comparison. https://cran.r-project.org/web/packages/robustETM/index.html References: Chen, et al., 2013, Genetic Epidemiology; and Hong, et al. (2017) JASA.
    • R code for YETI in A meta-analytic framework for detection of genetic interactions (2016) Genetic Epidemiology. http://www.yulunliu.com/software.html