Open Access. Powered by Scholars. Published by Universities.®

Physical Sciences and Mathematics Commons

Statistics and Probability

Articles 1051 - 1080 of 13246

Full-Text Articles in Physical Sciences and Mathematics

Robust Uncertainty Quantification With Analysis Of Error In Standard And Non-Standard Quantities Of Interest, Zachary Stevens Aug 2022

Mathematics & Statistics ETDs

This thesis derives two Uncertainty Quantification (UQ) methods for differential equations that depend on random parameters: (i) error bounds for a computed cumulative distribution function and (ii) a multi-level Monte Carlo (MLMC) algorithm with adaptively refined meshes and accurately computed stopping criteria. Both UQ approaches utilize adjoint-based a posteriori error analysis in order to accurately estimate the error in samples of numerically approximated quantities of interest. The adaptive MLMC algorithm developed in this thesis relies on the adjoint-based error analysis to adaptively create meshes and accurately monitor a stopping criterion. This is in contrast to classical MLMC algorithms which employ either a …


Better Understanding Genomic Architecture With The Use Of Applied Statistics And Explainable Artificial Intelligence, Jonathon C. Romero Aug 2022

Doctoral Dissertations

With the continuous improvements in biological data collection, new techniques are needed to better understand the complex relationships in genomic and other biological data sets. Explainable Artificial Intelligence (X-AI) techniques like Iterative Random Forest (iRF) excel at finding interactions within data, such as genomic epistasis. Here, the introduction of new methods to mine for these complex interactions is shown in a variety of scenarios. The application of iRF as a method for Genomic Wide Epistasis Studies shows that the method is robust in finding interacting sets of features in synthetic data, without requiring the exponentially increasing computation time of many …


A Positivity Preserving, Energy Stable Finite Difference Scheme For The Flory-Huggins-Cahn-Hilliard-Navier-Stokes System, Wenbin Chen, Jianyu Jing, Cheng Wang, Xiaoming Wang Aug 2022

Mathematics and Statistics Faculty Research & Creative Works

In this paper, we propose and analyze a finite difference numerical scheme for the Cahn-Hilliard-Navier-Stokes system, with logarithmic Flory-Huggins energy potential. In the numerical approximation to the singular chemical potential, the logarithmic term and the surface diffusion term are implicitly updated, while an explicit computation is applied to the concave expansive term. Moreover, the convective term in the phase field evolutionary equation is approximated in a semi-implicit manner. Similarly, the fluid momentum equation is computed by a semi-implicit algorithm: implicit treatment for the kinematic diffusion term, explicit update for the pressure gradient, combined with semi-implicit approximations to the fluid convection …


A Computationally Efficient Wald Test In M-Estimation, Denisse Urenda Castañeda Aug 2022

Open Access Theses & Dissertations

Under the maximum likelihood framework, three asymptotic overall tests have been well developed in generalized linear models (GLM) for testing the single null hypothesis H₀: θ = θ₀, namely, the Wald test, the Likelihood Ratio Test (LRT), and the Score test, also known as the Lagrange Multiplier (LM) test. Modified versions of the Wald, LR and LM tests can also be found for testing the significance of a portion of the parameter θ, i.e., if θ = (θ₁ᵀ, θ₂ᵀ)ᵀ, it is of interest to test H₀: θ₂ = 0. However, with the constant increase …


A Bayesian Hierarchical Approach For Modeling Virtual Species With Realistic Functional Trait Relationships, Sarah Bogen Aug 2022

All Graduate Plan B and other Reports, Spring 1920 to Spring 2023

Understanding the spatial and temporal dynamics of plant populations has important implications for the fields of ecology and conservation. A rich body of mathematical modeling approaches, including reaction-diffusion equations and integrodifference equations, has been developed to mechanistically model population spread based on species demography and seed dispersal characteristics. However, with over 390,000 plant species on Earth, it is not feasible to collect complete information on all species for the purpose of drawing generalized conclusions. One means of overcoming such a problem is through trait-based modeling, which seeks to represent realistic combinations of organismal traits rather than focusing on individual species. …


Development Of A Reverse Engineered, Parameterized, And Structurally Validated Computational Model To Identify Design Parameters That Influence American Football Faceguard Performance, William Ferriell Aug 2022

All Dissertations

Traumatic brain injury (TBI) continues to have the greatest incidence among athletes participating in American football. The headgear design research community has focused on developing accurate computational and experimental analysis techniques to better assess the ability of headgear technology to attenuate impacts and protect athletes from TBI. Despite efforts to innovate the headgear system, minimal progress has been made to innovate the faceguard. Although the faceguard is not the primary component of the headgear system that contributes to impact attenuation, faceguard performance metrics, such as weight, structural stiffness, and visual field occlusions, have been linked to athlete safety. To improve …


Advanced High Dimensional Regression Techniques, Yuan Yang Aug 2022

All Dissertations

This dissertation focuses on developing high dimensional regression techniques to analyze large scale data using both Bayesian and frequentist approaches, motivated by data sets from various disciplines, such as public health and genetics. More specifically, Chapters 2 and 4 take a Bayesian approach to achieve modeling and parameter estimation simultaneously, while Chapter 3 takes a frequentist approach. The main aspects of these techniques are that they perform variable selection and parameter estimation simultaneously, while also being easily adaptable to large-scale data. In particular, by embedding a logistic model into the traditional spike and slab framework and selecting of proper prior …


Sex And Gender Differences In Symptoms Of Early Psychosis: A Systematic Review And Meta-Analysis, Brooke Carter, Jared Wootten, Suzanne Archie, Amanda L Terry, Kelly K. Anderson Aug 2022

Epidemiology and Biostatistics Publications

First-episode psychosis (FEP) can be quite variable in clinical presentation, and both sex and gender may account for some of this variability. Prior literature on sex or gender differences in symptoms of psychosis has been inconclusive, and a comprehensive summary of evidence on the early course of illness is lacking. The objective of this study was to conduct a systematic review and meta-analysis of the literature to summarize prior evidence on the sex and gender differences in the symptoms of early psychosis. We conducted an electronic database search (MEDLINE, Scopus, PsycINFO, and CINAHL) from 1990 to present to identify quantitative …


Human Perception Of Exponentially Increasing Data Displayed On A Log Scale Evaluated Through Experimental Graphics Tasks, Emily Robinson Aug 2022

Department of Statistics: Dissertations, Theses, and Student Work

Log scales are often used to display data over several orders of magnitude within one graph. We conducted a series of three graphical studies to evaluate the impact displaying data on the log scale has on human perception of exponentially increasing trends compared to displaying data on the linear scale. Each study was related to a different graphical task, each requiring a different level of interaction and cognitive use of the data being presented. The first experiment evaluated whether our ability to perceptually notice differences in exponentially increasing trends is impacted by the choice of scale. Participants were shown a …


Quantum Computing Simulation Of The Hydrogen Molecule System With Rigorous Quantum Circuit Derivations, Yili Zhang Aug 2022

All Graduate Plan B and other Reports, Spring 1920 to Spring 2023

Quantum computing has been an emerging technology in the past few decades. It utilizes the power of programmable quantum devices to perform computation, which can solve, in a feasible time, complex problems that are infeasible for classical computers. Simulating quantum chemical systems using quantum computers is one of the most active research fields in quantum computing. However, due to the novelty of the technology and concept, most materials in the literature are not accessible to newcomers to the field and can sometimes cause ambiguity for practitioners due to missing details.

This report provides a rigorous derivation of simulating quantum chemistry …


Multiple Imputation In High-Dimensional Data With Variable Selection, Qiushuang Li Aug 2022

Legacy Theses & Dissertations (2009 - 2024)

This dissertation focuses on the development of multiple imputation models and algorithms for high-dimensional data with variable selection structures. Leveraging the multivariate linear mixed-effects model with missing responses for clustered data, we incorporate the variable selection routines using spike-and-slab priors within the Bayesian variable selection framework. Specific choices of these priors allow us to "force" variables of importance (e.g. design variables or variables known to play a role in the missingness mechanism) into the imputation models. Our ultimate goal is to improve computational speed by removing unnecessary variables. Markov chain Monte Carlo techniques have been designed to sample from the implied …


Stability And Differential Privacy Of Stochastic Gradient Methods, Zhenhuan Yang Aug 2022

Legacy Theses & Dissertations (2009 - 2024)

Recently, a considerable amount of work has been devoted to the study of algorithmic stability as well as differential privacy (DP) for stochastic gradient methods (SGM). However, most of the existing work focuses on the empirical risk minimization (ERM) and the population risk minimization problems. In this paper, we study two types of optimization problems that enjoy wide applications in modern machine learning, namely the minimax problem and the pairwise learning problem.


Semiparametric Estimation With Clustered Right Censored Data Via Multivariate Gaussian Random Fields, Fathima Zahra Sainul Abdeen Aug 2022

Doctoral Dissertations

Consider a fixed number of clustered areas, identified by their geographical coordinates, that are monitored for the occurrences of an event such as a pandemic, an epidemic, or migration, to name a few. Data collected on units at all areas include time varying covariates and other environmental factors that may affect event occurrences. The event times in every area can be independent. They can also be correlated, with correlation between two units induced by an unobservable frailty. In both cases, the collected data is considered pairwise to account for spatial correlation between all pairs of areas. The pairwise right censored data is probit-transformed …


Survivor Bond Models For Securitizing Longevity Risk, Priscilla Mansah Codjoe Aug 2022

Doctoral Dissertations

Longevity risk is the risk that a reference population’s mortality rates deviate from what is projected from prior life tables. This is due to discoveries in biological sciences, improved public health measures, and nutrition, which have dramatically increased life expectancy. Longevity risk raises life insurers’ liability, increasing product costs and reserves. Securitization through longevity derivatives is a way of dealing with this risk.

To enhance the pricing of life contingent products, we present an additive type mortality model in the style of the Lee-Carter model. This model incorporates policyholder covariates. By using counting processes and martingale machinery, we obtain closed form …


Contributions To Random Forest Variable Importance With Applications In R, Kelvyn K. Bladen Aug 2022

All Graduate Theses and Dissertations, Spring 1920 to Summer 2023

A major focus in statistics is building and improving computational algorithms that can use data to predict a response. Two fundamental camps of research arise from such a goal. The first camp is researching ways to get more accurate predictions. Many sophisticated methods, collectively known as machine learning methods, have been developed for this very purpose. One such method that is widely used across industry and many other areas of investigation is called Random Forests.

The second camp of research is that of improving the interpretability of machine learning methods. This is worthy of attention when analysts desire to optimize …


Bayesian Adaptive Designs For Proof-Of-Concept Trials And Platform Trials, Yujie Zhao Aug 2022

Dissertations & Theses (Open Access)

With the revolutionary achievement in molecular targeted therapies and cancer immunotherapies, the traditional drug development paradigm in phase II trials becomes increasingly inefficient due to its slow progress, high cost, and high failure rate. Fitting one standard strategy to all different trials also harms its reliability in decision-making because it doesn’t fully use all available resources and information in each trial. It’s crucial to develop novel phase II trial designs to accomplish different objectives for different types of trials. This research mainly focuses on Bayesian adaptive designs for phase II trials. Three types of trials are discussed in which traditional …


Efficient Approaches To Steady State Detection In Multivariate Systems, Honglun Xu Aug 2022

Open Access Theses & Dissertations

Steady state detection is critically important in many engineering fields such as fault detection and diagnosis, process monitoring and control. However, most of the existing methods are designed for univariate signals. In this dissertation, we proposed an efficient online steady state detection method for multivariate systems through a sequential Bayesian partitioning approach. The signal is modeled by a Bayesian piecewise constant mean and covariance model, and a recursive updating method is developed to calculate the posterior distributions analytically. The duration of the current segment is utilized to test the steady state. Insightful guidance is provided for hyperparameter selection. The effectiveness …


Computer Aided Diagnosis System For Breast Cancer Using Deep Learning., Asma Baccouche Aug 2022

Electronic Theses and Dissertations

The recent rise of big data technology surrounding the electronic systems and developed toolkits gave birth to new promises for Artificial Intelligence (AI). With the continuous use of data-centric systems and machines in our lives, such as social media, surveys, emails, reports, etc., there is no doubt that data has become the center of attention for scientists and motivated them to provide more decision-making and operational support systems across multiple domains. With the recent breakthroughs in artificial intelligence, the use of machine learning and deep learning models has achieved remarkable advances in computer vision, ecommerce, cybersecurity, and healthcare. Particularly, numerous …


Statistical Methods For Personalized Treatment Selection And Survival Data Analysis Based On Observational Data With High-Dimensional Covariates., Don Ramesh Dinendra Sudaraka Tholkage Aug 2022

Electronic Theses and Dissertations

Due to the wide availability of functional data from multiple disciplines, studies of functional data analysis have become popular in the recent literature. However, the related development in censored survival data has been relatively sparse. In Chapter 2, we consider the problem of analyzing time-to-event data in the presence of functional predictors. We develop a conditional generalized Kaplan Meier (KM) estimator that incorporates functional predictors using kernel weights and rigorously establish its asymptotic properties. In addition, we propose to select the optimal bandwidth based on a time-dependent Brier score. We then carry out extensive numerical studies to examine the …


Defining Areas Of Interest Using Voronoi And Modified Voronoi Tesselations To Analyze Eye-Tracking Data, Joanna D. Coltrin Aug 2022

All Graduate Theses and Dissertations, Spring 1920 to Summer 2023

Eye tracking is a technology used to track where someone is looking. Eye-tracking technology is often used to study what people focus on when looking at a photo of another person. The eye-tracking technology records points on a photo that a person is looking at. When the photo being looked at shows a person, the points can be categorized by body part such as head, right hand, left hand, and torso. This thesis presents the use of partially circular areas to define the body parts of the person in the photo and therefore categorize the points collected by the eye-tracker. …


Effects Of Macronutrients Intake And Physical Activity On Childhood Obesity Of Hispanic Children, Prosanta Barai Aug 2022

Theses and Dissertations

Obesity has become more ubiquitous during the past few decades, and its prevalence is still increasing. It is found in every population and every region of the world, including rural parts of low and middle-income countries. In the USA, regardless of age, the severity of obesity is no different from the global trend. Although a large literature has tried to answer pressing questions such as how obesity can be controlled, little to no work has focused on younger children, especially the 4-6-year-old Hispanic population. Our study aimed to determine the causal path among literature …


Neural Networks And Stochastic Differential Equations, Stephanie L. Flores Aug 2022

Theses and Dissertations

Influenced by the seminal work “Physics Informed Neural Networks” by Raissi et al. (2017), there has been growing interest in recent years in solving, and estimating the parameters of, nonlinear partial differential equations (PDEs) with deep neural networks. In fact, this has broadened the pathways and shed light on deep learning of stochastic differential equations (SDEs) and stochastic PDEs (SPDEs). In this work, we intend to investigate the current approaches to solving, and estimating the parameters of, SDEs/SPDEs with deep neural networks and the possibility of extending them to obtain more accurate/stable solutions with residual systems and/or generative adversarial neural networks. …


Dynamic System Discovery With Recursive Physics-Informed Neural Networks, Jarrod Mau Aug 2022

All Graduate Theses and Dissertations, Spring 1920 to Summer 2023

This thesis presents a novel method, the recursive physics-informed neural network, to learn the right-hand side of differential equations. The neural network takes in data, trains, and then acts as a proxy for the differential equation, which can be used for modeling. We show the theoretical superiority of the recursive approach. We also use computer simulations to demonstrate the proved properties.


Redefining Nba Basketball Positions Through Visualization And Mega-Cluster Analysis, Alexander L. Hedquist Aug 2022

All Graduate Theses and Dissertations, Spring 1920 to Summer 2023

Basketball players have historically been classified based on one of five positions, namely Point Guards, Shooting Guards, Small Forwards, Power Forwards, and Centers. While grouping players into these five categories may provide general descriptions of their perceived role, these standard positions fall short of describing players based on their true abilities and performance. This MS thesis proposes a method to group players of the National Basketball Association (NBA) from the past 20 seasons into more meaningful and specific player positions. We systematically group these players into nine distinct categories, and we draw from a vast array of visualization tools, techniques, and software …


An Introduction To Combinatorics Via Cayley's Theorem, Jaylee Willis Aug 2022

All Graduate Theses and Dissertations, Spring 1920 to Summer 2023

In this paper, we explore some of the methods that are often used to solve combinatorial problems by proving Cayley’s theorem on trees in multiple ways. The intended audience of this paper is undergraduate and graduate mathematics students with little to no experience in combinatorics. This paper could also be used as a supplementary text for an undergraduate combinatorics course.


Geometry- And Accuracy-Preserving Random Forest Proximities With Applications, Jake S. Rhodes Aug 2022

All Graduate Theses and Dissertations, Spring 1920 to Summer 2023

Many machine learning algorithms use calculated distances or similarities between data observations to make predictions, cluster similar data, visualize patterns, or generally explore the data. Most distances or similarity measures do not incorporate known data labels and are thus considered unsupervised. Supervised methods for measuring distance exist which incorporate data labels and thereby exaggerate separation between data points of different classes. This approach tends to distort the natural structure of the data. Instead of following similar approaches, we leverage a popular algorithm used for making data-driven predictions, known as random forests, to naturally incorporate data labels into similarity measures known …


Improving Computation For Hierarchical Bayesian Spatial Gaussian Mixture Models With Application To The Analysis Of Thz Image Of Breast Tumor, Jean Remy Habimana Aug 2022

Graduate Theses and Dissertations

In the first chapter of this dissertation we give a brief introduction to Markov chain Monte Carlo (MCMC) methods and their application in Bayesian inference. In particular, we discuss the Metropolis-Hastings and conjugate Gibbs algorithms and explore the computational underpinnings of these methods. The second chapter discusses how to incorporate spatial autocorrelation in a linear regression model with an emphasis on the computational framework for estimating the spatial correlation patterns.

The third chapter starts with an overview of Gaussian mixture models (GMMs). However, because in the GMM framework the observations are assumed to be independent, GMMs are less effective when …


Ensemble Tree-Based Machine Learning For Imaging Data, Reza Iranzad Aug 2022

Graduate Theses and Dissertations

Medical imaging data in particular, such as positron emission tomography (PET), computed tomography (CT), and fluorescence intravital microscopy (IVM), have become prevalent in a wide variety of applications, from diagnosis, tracking disease progress, and monitoring the effectiveness of treatments to decision-making processes. The detailed information generated by medical imaging has enabled physicians to provide more comprehensive care. Although numerous machine learning algorithms, especially those used for imaging data, have been developed, dealing with unique structures in imaging data remains a big challenge. In this dissertation, we propose novel statistical tree-based methods with more efficient and more …


Hiding In Plain Sight: Accounting For Rate Heterogeneity In Trait Evolution Models, James Boyko Aug 2022

Graduate Theses and Dissertations

Within the last four decades, phylogenetic comparative methods have become the de facto method of analysis for comparative biologists. The availability of high-quality comparative datasets has been matched by an explosion of possible phylogenetic models. In large part, the efforts to increase the realism of phylogenetic comparative methods have been successful, as evidenced by their widespread use. To this extensive literature, my contributions are modest. I have focused my dissertation work on two main themes. First, most phenotypic evolution is not independent of other phenotypes. Changes in a particular character may influence changes in another and modeling these characters in isolation …


Quantile Differences In The Age-Related Decline In Cardiorespiratory Fitness Between Sexes In Adults Without Type 2 Diabetes Mellitus In The United States, Andrew Ortaglia, Melissa Stansbury, Michael David Wirth, Xuemei Sui, Matteo Bottai Aug 2022

Faculty Publications

Objective: To comprehensively assess the extent to which the decline in cardiorespiratory fitness (CRF) with age differs between sexes. Participants and Methods: This study used data from the Aerobics Center Longitudinal Study, conducted between September 1974 and August 2006, consisting primarily of White adults from middle-to-upper socioeconomic strata restricted to adults without type 2 diabetes mellitus (33,742 men and 9,415 women). Quantile regression models were used to estimate the differences in age-associated changes in CRF between the sexes, estimated using a maximal treadmill test. Results: For adults aged up to 45 years, significant differences in slopes relating to age and …