OBSC Abstracts

Courtney Montgomery, Ph.D.

Professor, Genes and Human Disease Research Program
Oklahoma Medical Research Foundation

Title: Growing Opportunities for Bioinformatic Advancements in Biomedical Research

Abstract: The last few years have seen advances in technologies to generate whole genome sequencing, spatial and single-cell spatial transcriptomic, high-throughput proteomic, high-resolution molecular, and quantitative imaging data, as well as advances in, and increased accessibility of, artificial intelligence and deep learning methods. These data and methods have the potential to revolutionize our understanding of any number of health-related phenomena but are virtually useless without the ability to process and analyze them efficiently. This offers an unprecedented opportunity to the data science community. The purpose of my talk is to discuss these opportunities, including the emerging technologies that are motivating them and the data they are creating. By presenting this information along with examples of the application of integrative analytical methods, specifically network-based and machine learning approaches to clinical and genomic data, I hope to help data scientists and medical researchers optimize, expand, and innovate their current research practices.

Short Bio: Dr. Montgomery is the Director of the OMRF Center for Biomedical Data Sciences (CBDS), and a Professor in the Genes and Human Disease Research Program at the Oklahoma Medical Research Foundation (OMRF). She is also an Adjunct Professor in the Department of Biostatistics and Epidemiology in the College of Public Health and the School of Computer Science at the University of Oklahoma. She has been engaged in complex disease research for more than 20 years with a strong focus on “big data” and translational research, particularly genetics and genomics. As an early-stage investigator at Case Western Reserve, she worked on the development, refinement, and application of methodology for statistical analysis of genetic data, with a special emphasis on modeling methods for highly correlated data and the implementation of these methods into the SAGE software suite as the Assistant Director of the NIH-funded Human Genetic Analysis Resource. Since then, she has directed Bioinformatics and Data Analysis Cores for both a Center of Biomedical Research Excellence (COBRE) and a Center for Research Translation, and she directed the OMRF Quantitative Analysis Core for over 10 years. She also leads her own laboratory focused on the immunogenetics of sarcoidosis, particularly in African American patients, and, as such, serves as Director of the OMRF Sarcoidosis Research Unit (SRU). Her collaborative work centers on the application of the data sciences to clinical and biological data. She has extensive experience in genomic, immunological, and other large-scale, high-dimensional data, and continues to explore methods of data integration and network development.

David Bard

Professor, Department of Pediatrics

University of Oklahoma Health Sciences Center

Title: Unleashing Machine Learning on Assessment Data to Bolster Alternative-to-Placement Decisions: Lessons Learned from an 8-Year Predictive Risk Model Implementation

Abstract: Judgment and decision-making, even among experts, suffer from human errors propelled by useful but often misleading heuristics and biases. This presentation reviews the development, maintenance, and implementation history of PREMISS, a PREdictive risk case selection Model for Intensive Safety Services in the state of Oklahoma. The model was designed to bolster complex alternative-to-placement decisions traditionally dependent on case worker expertise alone. The presentation chronicles the evolution of modeling and practicalities of embedding model-assisted decision-making into a state-run child welfare system. Lessons learned over an 8+ year implementation period are presented along with recommendations for harnessing the potential of machine learning to aid case decision-making.

Short bio: David Bard is a health services researcher with expertise in implementation science, psychometrics, biostatistics, informatics, and the applied preventative science of Adverse Childhood Experiences (ACEs). He is currently the Chief Research Informatics Officer at the University of Oklahoma (OU) Health Sciences, Professor and Director of the Biomedical and Behavioral Methodology Core in the OU Department of Pediatrics, and a Children’s Health Foundation CMRI Endowed Research Chair. He is also Director of the Center for Environmental and Biological Research on ACEs and Resiliency (EmBRACER). His work on ACEs is motivated by a well-replicated finding that even moderate levels of exposure to commonly occurring childhood adversities can significantly increase an individual’s risk for future health problems. Dr. Bard is active in multiple lines of research aiming to establish the effectiveness and improve the quality of ACEs intervention services. These services directly address prevention of future adverse experiences, like child abuse and neglect, and equip families with caregiving strategies and resources for building child resiliency. Dr. Bard’s team is also working to identify and better understand the long-term biological and social consequences of early adversity exposures and their enduring impact on health and wellness later in life.

Ye Liang

Associate Professor, Department of Statistics

Oklahoma State University

Title: Model-free Variable Selection and Inference for High-dimensional Data

Abstract: Simultaneous variable selection and statistical inference is challenging in high-dimensional data analysis. LASSO-based selection and post-selection inference require an explicitly specified regression model, which is often a linear regression model. Both the selection and inference tend to perform poorly under misspecified nonlinear models. The LASSO-based methods also hinge on the sparsity of regression models, which sometimes cannot be justified in reality. In this paper, we propose a sufficient dimension association (SDA) technique that measures the association between each predictor and the response conditioning on other predictors. Our proposed SDA method requires neither a specific form of regression model nor sparsity in the regression. Instead, our method assumes normalized or Gaussian-distributed predictors with a Markov blanket property. We develop a conditional association measure using sufficient dimension reduction and sliced inverse regression. We propose an estimator for the SDA and prove asymptotic properties for the estimator. For the hypothesis testing and variable selection, we construct test statistics based on the Kolmogorov-Smirnov principle and the Cramér-von Mises principle. A multiplier bootstrap approach is used for computing critical values and $p$-values for each hypothesis test. Extensive simulation studies have been conducted to show the validity and superiority of our SDA method. Gene expression data from the Alzheimer's Disease Neuroimaging Initiative are used to illustrate a real application.

Short Bio: Dr. Liang is an Associate Professor of Statistics at Oklahoma State University. He obtained his B.S. in Mathematics from Nanjing University in 2006 and his Ph.D. in Statistics from the University of Missouri in 2012. His research focuses on Bayesian statistics (modeling, inference, and computation), survival and longitudinal models, nonlinear state-space models, and spatial statistics. He is interested in applications in health sciences and ecology.

Yichuan Zhao

Professor, Department of Mathematics and Statistics

Georgia State University

Title: Smoothed empirical likelihood for the difference of two quantiles with the paired sample

Abstract: In this talk, we propose a novel smoothed empirical likelihood method for the difference of quantiles with paired samples. While the empirical likelihood for the difference of two quantiles with independent samples has been studied, it is crucial to develop a statistical procedure that accounts for the dependence between paired samples. To this end, we propose two estimating equations for the difference of two quantiles and introduce a nuisance parameter in our smoothed empirical likelihood framework. We demonstrate that our approach yields a limiting distribution that follows the standard chi-squared distribution. Extensive simulation studies confirm that our smoothed empirical likelihood method outperforms the normal approximation and method M (Wilcox and Erceg-Hurn, 2012) in most cases. Finally, we illustrate the usefulness of our proposed method by applying it to a real-world data set, estimating the interval of the quantile difference of GDP between different years.

Short bio: Yichuan Zhao is a Professor of Statistics at Georgia State University in Atlanta. He has a joint appointment as associate member of the Neuroscience Institute, and he is also an affiliated faculty member of the School of Public Health at Georgia State University. His current research interests focus on survival analysis, empirical likelihood methods, nonparametric statistics, analysis of ROC curves, bioinformatics, Monte Carlo methods, and statistical modelling of fuzzy systems. He has published more than 100 research articles in statistics and biostatistics, has coedited 6 books on statistics, biostatistics and data science, and has been invited to deliver more than 200 research talks nationally and internationally. Dr. Zhao has organized the Workshop Series on Biostatistics and Bioinformatics since its initiation in 2012. He also organized the 25th ICSA Applied Statistics Symposium in Atlanta as the chair of the organizing committee to great success. In addition, the 6th ICSA China Conference that he organized as the chair of the program committee was a huge success. He is currently serving as associate editor, or on the editorial board, for several statistical journals. Dr. Zhao is a Fellow of the American Statistical Association, and an elected member of the International Statistical Institute.

Changbao Wu

Professor, Department of Statistics and Actuarial Science

University of Waterloo

Title: Statistical Analysis with Non-Probability Survey Samples

Abstract: We provide an overview of recent developments in statistical inference with non-probability survey samples. We discuss issues arising from methodological developments related to inverse probability weighting and model-based prediction, and concerns with practical applications. Three procedures proposed in the recent literature on the estimation of participation probabilities are examined under a joint randomization framework. The implicit impact of the positivity assumption on the model-based prediction approach is examined, and the main issue of undercoverage is highlighted. We discuss potential strategies for dealing with standard assumptions and undercoverage problems in practice.
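As a rough illustration of the inverse probability weighting idea mentioned in this abstract, the following minimal Python sketch computes a weighted mean in which each sampled unit is weighted by the reciprocal of its participation probability. It is only a sketch under stated assumptions: the function name `ipw_mean` is hypothetical, the participation probabilities are taken as already known or estimated, and the ratio-type (Hájek) form is one common choice rather than necessarily the estimator discussed in the talk.

```python
def ipw_mean(y, participation_prob):
    """Ratio-type (Hajek) inverse probability weighted mean.

    y: observed responses from the non-probability sample.
    participation_prob: assumed participation probability of each
        sampled unit (hypothetical input; estimating it is a separate step).
    """
    if len(y) != len(participation_prob):
        raise ValueError("y and participation_prob must have equal length")
    # Positivity: every sampled unit needs a strictly positive probability,
    # otherwise its weight 1/p is undefined.
    if any(p <= 0.0 or p > 1.0 for p in participation_prob):
        raise ValueError("participation probabilities must lie in (0, 1]")
    weights = [1.0 / p for p in participation_prob]
    weighted_total = sum(w * yi for w, yi in zip(weights, y))
    # Dividing by the sum of weights estimates the population size.
    return weighted_total / sum(weights)


# Units that are unlikely to participate receive larger weights.
estimate = ipw_mean([10.0, 20.0], [0.5, 0.25])
```

The explicit check on the probabilities mirrors the positivity assumption: units whose participation probability is zero can never appear in the sample, and no reweighting of the observed units can represent them, which is the undercoverage problem.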

Short Bio: Changbao Wu is Professor and Chair, Department of Statistics and Actuarial Science, University of Waterloo. His main research interest is design and analysis of complex surveys, with extended interests covering calibration techniques, empirical likelihood, resampling methods, missing data problems and causal inference. He is a Fellow of ASA, a Fellow of IMS, an Elected Member of ISI, and the 2012 winner of the CRM-SSC Prize in Statistics. He is the lead author (with Mary Thompson) of the book “Sampling Theory and Practice” published by Springer in 2020.

Pratyaydipta Rudra

Associate Professor, Department of Statistics

Oklahoma State University

Title: Comparing short-term and long-term resource tracking in ecology – A statistical conundrum

Abstract: Migratory birds and animals track resources along their migration pathway. Their ability to predict resource abundance along their route may depend on many factors, including changes brought about by global climate change. It is of utmost interest for ecologists to understand the impact of such factors on the resource tracking ability of migratory animals and hence their survival. Some researchers compare the tracking of long-term average abundances with the tracking of current abundances to understand the resource prediction mechanism of the animals. The full statistical modeling of resource tracking can often be a complicated spatiotemporal question. However, these practitioners often simplify the questions by using linear models and assess the slope terms to shed light on the resource tracking abilities of the animals. In our work, we critically assess these methods for their statistical rigor. We demonstrate that the existing approaches can have some limitations and may lead to misleading conclusions. We propose modifications of the existing approaches to resolve such issues.

Short Bio: Dr. Rudra is an Associate Professor in the Department of Statistics at Oklahoma State University. His primary research interests lie broadly in developing statistical methodologies for large biological data sets such as 'omics' data sets. In particular, his interests span several areas such as tests of association, expression Quantitative Trait Loci (eQTL) analysis, kernel machine methods, multiple hypothesis testing, longitudinal data analysis, and multivariate methods. His collaborative research includes developing appropriate statistical approaches for data from various scientific disciplines including physiology, pharmacology, agriculture, and ecology.

Mike Daniels
Professor and Chair, Andrew Banks Family Endowed Chair
Department of Statistics, University of Florida

Title: A Bayesian nonparametric approach for nonignorable missingness in EHR data

Abstract: We propose an approach for missingness in EHRs using Bayesian nonparametric (BNP) models. We show how to introduce sensitivity parameters corresponding to nonignorable missingness in the outcome and confounders by extracting unidentified distributions from the BNP model and reconstructing the distribution of interest. We also flexibly include auxiliary covariates to move closer to MAR. We use G-computation based on the reconstructed distribution to compute causal estimands of interest. We use our approach to assess the comparative effectiveness of two bariatric surgeries on BMI 18 months after surgery.
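For readers unfamiliar with G-computation, the following minimal Python sketch shows the basic recipe on fully observed data: fit an outcome model, predict every subject's outcome under each treatment, and average the difference. It is an illustration only. It swaps in an ordinary least-squares outcome model with a treatment-by-covariate interaction in place of the Bayesian nonparametric model of the abstract, it ignores missingness and sensitivity parameters entirely, and the function name `g_computation_ate` is hypothetical.

```python
import numpy as np


def g_computation_ate(y, a, x):
    """Average treatment effect via G-computation with a linear outcome model.

    y: outcomes, a: binary treatment indicators (0/1), x: covariates.
    The model E[Y | A, X] = b0 + b1*A + X@b2 + (A*X)@b3 is an assumption
    made for this sketch, not the model used in the talk.
    """
    y = np.asarray(y, dtype=float)
    a = np.asarray(a, dtype=float)
    x = np.asarray(x, dtype=float).reshape(len(y), -1)

    def design(treatment):
        # Intercept, treatment, covariates, treatment-by-covariate interaction.
        return np.column_stack(
            [np.ones_like(y), treatment, x, treatment[:, None] * x]
        )

    # Step 1: fit the outcome model by least squares on the observed data.
    beta, *_ = np.linalg.lstsq(design(a), y, rcond=None)
    # Step 2: predict each subject's outcome with treatment set to 1 and to 0.
    pred_treated = design(np.ones_like(y)) @ beta
    pred_control = design(np.zeros_like(y)) @ beta
    # Step 3: average the predicted difference over the covariate distribution.
    return float(np.mean(pred_treated - pred_control))


# Noise-free toy data generated from y = 1 + 2a + 3x + a*x.
x_toy = np.array([0.0, 1.0, 2.0, 3.0, 4.0, 5.0])
a_toy = np.array([0.0, 1.0, 0.0, 1.0, 1.0, 0.0])
y_toy = 1 + 2 * a_toy + 3 * x_toy + a_toy * x_toy
ate = g_computation_ate(y_toy, a_toy, x_toy)
```

In the toy data the treatment effect for a subject with covariate x is 2 + x, so averaging over the observed covariates (mean 2.5) gives an effect of 4.5.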

Short bio: Michael Daniels, ScD is Professor, Andrew Banks Family Endowed Chair, and Chair in the Department of Statistics at the University of Florida. He received his doctoral degree from Harvard Biostatistics in 1995. He was on the faculty at Iowa State University and the University of Texas at Austin (as chair) before returning to the University of Florida. His research interests focus on Bayesian (nonparametric) approaches for missing data and causal inference. This research has been funded by the US National Institutes of Health since 2001. His current projects include “Bayesian machine learning approaches for complex missing data and causal inference with a focus on cardiovascular and obesity studies” as PI and “Combining longitudinal cohort studies to examine cardiovascular risk factor trajectories across the adult lifespan and their association with disease” as MPI. He served as co-editor of Biometrics from 2015 to 2018. He is a fellow of the American Statistical Association and an elected member of the International Statistical Institute (ISI).

Talayeh Razzaghi

Assistant Professor, School of Industrial and Systems Engineering

Assistant Professor, Data Science and Analytics Institute

Gallogly College of Engineering

University of Oklahoma

Title: Scalable and Trustworthy Deep Neural Networks: Application to Early Prediction of Preeclampsia

Abstract: The rapid increase in data availability has propelled neural networks to the forefront of effective learning algorithms, particularly within healthcare applications. However, training on large datasets necessitates extensive resources and introduces significant challenges, such as addressing class imbalance and ensuring fairness. To tackle these issues, we present a novel Multilevel Deep Neural Network (MLDNN) approach. This innovative method is designed to scale efficiently with large datasets while providing robust solutions to class imbalance and fairness. Our demonstration will highlight MLDNN's superior accuracy and resilience in classification tasks compared to traditional models, utilizing both preeclampsia-specific and benchmark datasets.

Short bio: Talayeh Razzaghi is an Assistant Professor in the School of Industrial and Systems Engineering at the University of Oklahoma (OU). She is also affiliated with the Data Science and Analytics Institute at OU. As a recipient of the National Science Foundation CAREER Award, Talayeh Razzaghi has an active research program dedicated to the development and use of data-driven analytical models to guide decision making for real-world problems, particularly risk stratification models for maternal care. In her research, she primarily applies the theory of machine learning and data mining to settings with imperfect, noisy, and possibly massive datasets.

Ping-Shou Zhong

Associate Professor

Mathematics, Statistics, and Computer Science

University of Illinois Chicago

Title: Causal Inference for Biomarker Identification with High-dimensional Outcome Variables

Abstract: In the fields of genomics, genetics, and neuroimaging, the identification of biomarkers plays a crucial role in detecting disease-caused changes among a multitude of potential candidates. Existing causal inference methods predominantly address low-dimensional outcome variables, rendering them unsuitable for biomarker identification involving a substantial number of candidates. To address this gap, we present a novel causal inference procedure tailored for high-dimensional data when the number of biomarker candidates exceeds the sample size. Our proposed method exhibits doubly robust properties, offering resilience against misspecification in both propensity score functions and outcome regression models. We establish the asymptotic distributions of our proposed statistic, which vary depending on the particular misspecifications in propensity score or outcome regression models. To adaptively estimate moments in the asymptotic distributions, we introduce a bootstrap procedure. We evaluate the finite-sample performance of our approach through comprehensive numerical simulation studies. Additionally, we apply our method to a diffusion MRI dataset, identifying regions of interest with potential as biomarkers for Parkinson's disease.

Short Bio: Ping-Shou Zhong is a professor of Statistics in the Department of Mathematics, Statistics, and Computer Science at the University of Illinois Chicago. He received his Ph.D. in Statistics from Iowa State University. His research interests include statistical inference for high-dimensional data, statistical analysis for longitudinal and functional data, the empirical likelihood method and its applications, missing data problems, and applications in biomedical data. He has published papers in the Annals of Statistics, Journal of Econometrics, Journal of the American Statistical Association, Biometrika, and Journal of the Royal Statistical Society: Series B. He serves as an associate editor for Computational Statistics and Data Analysis and the Journal of the Korean Statistical Society. His research has been supported by NSF and NIH grants.

Danhyang Lee

Assistant Professor, Department of Statistical Science

Baylor University

Title: Enhanced Statistical Inference with Semiparametric Models for Nonignorable Nonresponse and Data Integration Applications

Abstract: In this presentation, we tackle the prevalent issue of missing data in statistical analysis, with a focus on nonignorable nonresponse scenarios. Traditional methodologies frequently fall short due to model misspecification. In response, we propose a semiparametric response model, offering a more adaptable and flexible solution than conventional parametric models. A central aspect of our discussion will be the introduction of an efficient profile maximum likelihood estimator with its asymptotic properties. This innovative approach is not only robust in addressing missing data challenges but is also versatile for integrating data from both probability and nonprobability sources. To address selection bias inherent in nonprobability samples, we utilize a semiparametric propensity score model that moves beyond the standard missing at random framework. We introduce a novel estimation method for this model, ensuring unbiased results irrespective of the model’s specification. The session will cover the asymptotic properties of these estimators and explore variance estimation techniques. Through comparative simulations, we will showcase the superiority of our methodology in effectively managing nonignorable selection bias, especially from nonprobability samples.

Short Bio: Dr. Danhyang Lee currently serves as an Assistant Professor in the Department of Information Systems, Statistics and Management Science at the University of Alabama’s Culverhouse College of Business, a position she has held since 2019. She earned her Ph.D. in Statistics from Iowa State University in 2019.

Dr. Lee's research interests are diverse within the field of applied statistics, encompassing areas such as missing data analysis, data integration, Bayesian modeling, survey sampling, and causal inference. Her significant contributions to the field are evident in her publications in esteemed journals including the Journal of the American Statistical Association, Biometrika, Annals of Applied Statistics, Biometrics, and Scandinavian Journal of Statistics, among others.

Kosuke Morikawa

Assistant Professor, Department of Statistics

Iowa State University

Title: Nonparametric Data Integration for Estimation of Average Treatment Effect by Generalized Entropy Balancing

Abstract: In case-control studies, obtaining case data can be both time-consuming and expensive, while control data are often easily accessible from previous research. However, a significant challenge arises due to potential differences in patient background information between the current and reference studies. We propose a nonparametric method for estimating the average treatment effect that balances these two distributions by using generalized entropy. This method effectively utilizes summary information from the reference study and avoids the assumptions typically associated with regression or propensity score models.

Short Bio: Dr. Morikawa is an Assistant Professor of Statistics at Iowa State University. Prior to joining Iowa State University, he served as an Assistant and Associate Professor at Osaka University in Japan. His research interests include missing data analysis and semi-parametric methods.