Open Access. Powered by Scholars. Published by Universities.®

Physical Sciences and Mathematics Commons

Statistics and Probability

Articles 571 - 600 of 13246

Full-Text Articles in Physical Sciences and Mathematics

Empirical Likelihood Ratio Tests For Homogeneity Of Distributions Of Component Lifetimes From System Lifetime Data With Known System Structures, Jingjing Qu May 2023

Statistical Science Theses and Dissertations

In system reliability, practitioners may be interested in testing the homogeneity of the component lifetime distributions based on system lifetimes from multiple data sources for various reasons, such as identifying the component supplier that provides the most reliable components.

In the first part of the dissertation, we develop distribution-free hypothesis testing procedures for the homogeneity of the component lifetime distributions based on system lifetime data when the system structures are known. Several nonparametric testing statistics based on the empirical likelihood method are proposed for testing the homogeneity of two or more component lifetime distributions. The computational approaches to obtain the …


Development Of Bayesian Hierarchical Methods Involving Meta-Analysis, Jackson Barth May 2023

Statistical Science Theses and Dissertations

When conducting statistical analysis in the Bayesian paradigm, the most critical decision made by the researcher is the identification of a prior distribution for a parameter. Despite the mathematical soundness of the Bayesian approach, a wrongly specified prior can lead to biased and incorrect results. To avoid this, prior distributions should be based on real data, which are easily accessible in the "big data" era. This dissertation explores two applications of Bayesian hierarchical modelling that incorporate information obtained from a meta-analysis.

The first of these applications is in the normalization of genomics data, specifically for NanoString nCounter datasets. A meta-analysis …


Identifying Key Activity Indicators In Rats' Neuronal Data Using Lasso Regularized Logistic Regression, Avery Woods May 2023

Honors Theses

This thesis aims to identify timestamps of rats’ neuronal activity that best determine behavior using a machine learning model. Neuronal data is a complex and high-dimensional dataset, and identifying the most informative features is crucial for understanding the underlying neuronal processes. The Lasso regularization technique is employed to select the most relevant features of the data to the model’s prediction. The results of this study provide insights into the key activity indicators that are associated with specific behaviors or cognitive processes in rats, as well as the effect that stress can have on neuronal activity and behavior. Ultimately, it was …


Optimizing Tumor Xenograft Experiments Using Bayesian Linear And Nonlinear Mixed Modelling And Reinforcement Learning, Mary Lena Bleile May 2023

Statistical Science Theses and Dissertations

Tumor xenograft experiments are a popular tool of cancer biology research. In a typical such experiment, one implants a set of animals with an aliquot of the human tumor of interest, applies various treatments of interest, and observes the subsequent response. Efficient analysis of the data from these experiments is therefore of utmost importance. This dissertation proposes three methods for optimizing cancer treatment and data analysis in the tumor xenograft context. The first of these is applicable to tumor xenograft experiments in general, and the second two seek to optimize the combination of radiotherapy with immunotherapy in the tumor xenograft …


Drug Ideologies Of The United States, Macy Montgomery May 2023

Helm's School of Government Conference - 2021-2024

The United States has been increasingly creating lenient drug policies. Seventeen states and Washington, the District of Columbia, legalized marijuana, and Oregon decriminalized certain drugs, including methamphetamine, heroin, and cocaine. The medical community has proven that drugs, including marijuana, have myriad adverse health side effects. This leads to two questions: Why does the United States government continue to create lenient drug policies, and what reasons do citizens give for legalizing drugs when the medical community has proven them harmful? The paper hypothesizes that the disadvantages of drug legalization outweigh its benefits because of the numerous harms it causes, such as …


Increasing Racial Diversity In The North American Plant Phenotyping Network Through Conference Participation Support, David Lebauer, Alexander Bucksch, Jennifer Clarke, Jesse Potts, Sonali Roy May 2023

Department of Statistics: Faculty Publications

A key goal of the North American Plant Phenotyping Network (NAPPN) annual conference is to cultivate a new generation of scientists from diverse backgrounds. As part of their effort to diversify the plant phenomics research community, NAPPN acquired funding to cover all attendance costs for participants from historically black colleges and universities (HBCUs) for the 2022 annual meeting. Seven award recipients represented the first attendees from HBCUs in the conference’s 6-year history. In this commentary, we report on the impact of the conference awards, including lessons learned, and the future of the award.


Big Ideas In Sports Analytics And Statistical Tools For Their Investigation, Benjamin S. Baumer, Gregory J. Matthews, Quang Nguyen May 2023

Statistical and Data Sciences: Faculty Publications

Sports analytics, broadly defined as the pursuit of improvement in athletic performance through the analysis of data, has expanded its footprint both in the professional sports industry and in academia over the past 30 years. In this article, we connect four big ideas that are common across multiple sports: the expected value of a game state, win probability, measures of team strength, and the use of sports betting market data. For each, we explore both the similarities and the individual idiosyncrasies of analytical approaches in each sport. While our focus is on the concepts underlying each type of analysis, any implementation necessarily …


An Application Of The PageRank Algorithm To NCAA Football Team Rankings, Morgan Majors May 2023

Honors Theses

We investigate the use of Google’s PageRank algorithm to rank sports teams. The PageRank algorithm is used in web searches to return a list of the websites that are of most interest to the user. The structure of the NCAA FBS football schedule is used to construct a network with a similar structure to the world wide web. Parallels are drawn between pages that are linked in the world wide web with the results of a contest between two sports teams. The teams under consideration here are the members of the 2021 Football Bowl Subdivision. We achieve a total ordering …
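
The analogy described above can be sketched in a few lines of Python. This is a minimal illustration, not the thesis's actual construction: it assumes each game result is treated as a link from the loser to the winner, uses the commonly quoted damping factor of 0.85, and runs on four made-up teams (A to D) rather than the 2021 FBS schedule.

```python
def pagerank(games, damping=0.85, iterations=100):
    """Rank teams from (winner, loser) pairs by power iteration.

    Assumption: each loss acts as a 'link' from the loser to the winner,
    so rank flows toward teams that beat highly ranked opponents.
    """
    teams = sorted({team for game in games for team in game})
    out_links = {team: [] for team in teams}
    for winner, loser in games:
        out_links[loser].append(winner)

    n = len(teams)
    rank = {team: 1.0 / n for team in teams}
    for _ in range(iterations):
        # Every team receives the same "teleport" share.
        new_rank = {team: (1.0 - damping) / n for team in teams}
        for team in teams:
            targets = out_links[team]
            if targets:
                share = damping * rank[team] / len(targets)
                for winner in targets:
                    new_rank[winner] += share
            else:
                # An undefeated team has no outgoing links (a dangling node);
                # spread its rank evenly so the total stays equal to 1.
                share = damping * rank[team] / n
                for other in teams:
                    new_rank[other] += share
        rank = new_rank
    return rank


# Hypothetical results: A is undefeated, D lost every game.
games = [("A", "B"), ("A", "C"), ("B", "C"), ("C", "D"), ("B", "D")]
ranks = pagerank(games)
ordering = sorted(ranks, key=ranks.get, reverse=True)
print(ordering)
```

Handling the undefeated team as a dangling node is one common convention; other choices would change the scores but not the basic idea.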


Movie Recommender System Using Matrix Factorization, Roland Fiagbe May 2023

Data Science and Data Mining

Recommendation systems are a popular and beneficial field that can help people make informed decisions automatically. This technique assists users in selecting relevant information from an overwhelming amount of available data. When it comes to movie recommendations, two common methods are collaborative filtering, which compares similarities between users, and content-based filtering, which takes a user’s specific preferences into account. However, our study focuses on the collaborative filtering approach, specifically matrix factorization. Various similarity metrics are used to identify user similarities for recommendation purposes. Our project aims to predict movie ratings for unwatched movies using the MovieLens rating dataset. We developed …
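
To make the matrix factorization idea concrete, here is a small sketch that learns user and item factor vectors by stochastic gradient descent. The optimizer, the hyperparameters (`k`, `lr`, `reg`, `epochs`) and the eight toy ratings are all assumptions for illustration; the abstract does not say how the authors trained their model, and this does not use the MovieLens data.

```python
import random


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def rmse(ratings, P, Q):
    """Root-mean-square error over the observed (user, item, rating) triples."""
    squared = sum((r - dot(P[u], Q[i])) ** 2 for u, i, r in ratings)
    return (squared / len(ratings)) ** 0.5


def factorize(ratings, n_users, n_items, k=2, lr=0.01, reg=0.02,
              epochs=200, seed=0):
    """Approximate the rating matrix as P @ Q^T using plain SGD."""
    rng = random.Random(seed)
    P = [[rng.uniform(0.0, 0.1) for _ in range(k)] for _ in range(n_users)]
    Q = [[rng.uniform(0.0, 0.1) for _ in range(k)] for _ in range(n_items)]
    for _ in range(epochs):
        for u, i, r in ratings:
            err = r - dot(P[u], Q[i])
            for f in range(k):
                pu, qi = P[u][f], Q[i][f]
                # Gradient step with L2 regularization on both factors.
                P[u][f] += lr * (err * qi - reg * pu)
                Q[i][f] += lr * (err * pu - reg * qi)
    return P, Q


# Hypothetical (user, item, rating) triples on a 1-5 scale.
ratings = [(0, 0, 5), (0, 1, 3), (1, 0, 4), (1, 2, 1),
           (2, 1, 2), (2, 2, 5), (3, 0, 3), (3, 2, 4)]

P0, Q0 = factorize(ratings, n_users=4, n_items=3, epochs=0)
rmse_before = rmse(ratings, P0, Q0)
P, Q = factorize(ratings, n_users=4, n_items=3)
rmse_after = rmse(ratings, P, Q)

# Predicted rating for a movie that user 0 has not rated (item 2).
predicted = dot(P[0], Q[2])
print(rmse_before, rmse_after, predicted)
```

The prediction for an unwatched movie is just the dot product of the learned user and item vectors, which is what lets the model fill in missing entries of the rating matrix.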


Observational Study Of Organisational Responses Of 17 US Hospitals Over The First Year Of The COVID-19 Pandemic, Esther K. Choo, Matthew Strehlow, Marina Del Rios, Evrim Oral, Ruth Pobee, Andrew Nugent, Stephen Lim, Christian Hext, Sarah Newhall, Diana Ko, Srihari V. Chari, Amy Wilson, Joshua J. Baugh, David Callaway, Mucio Kit Delgado, Zoe Glick, Christian J. Graulty, Nicholas Hall, Abdusebur Jemal, Madhav Kc, Aditya Mahadevan, Milap Mehta, Andrew C. Meltzer, Dar'ya Pozhidayeva, Daniel Resnick-Ault May 2023

School of Public Health Faculty Publications

Objectives: The COVID-19 pandemic has required significant modifications of hospital care. The objective of this study was to examine the operational approaches taken by US hospitals over time in response to the COVID-19 pandemic. Design, setting and participants: This was a prospective observational study of 17 geographically diverse US hospitals from February 2020 to February 2021. Outcomes and analysis: We identified 42 potential pandemic-related strategies and obtained week-to-week data about their use. We calculated descriptive statistics for use of each strategy and plotted percent uptake and weeks used. We assessed the relationship between strategy use and hospital type, geographic region …


An Analysis Of All-Cause Mortality On Patients With Sickle Cell Disease And Kidney Disease Using Propensity Score Matching, Adam Garrison May 2023

Electronic Theses and Dissertations

In this work, we provide an overview of the Cox proportional hazards model for time to event or survival analysis and the notion of propensity score matching to deal with confounding factors. A full analysis is reported in Chapter 2 concerning mortality for in-center dialysis patients with sickle cell disease to demonstrate the application of a general analysis strategy that has some logistical benefits over more traditional approaches to accounting for confounding variables. We also provide some insight and discussions on the challenges and future research questions that will emerge when trying to implement this strategy as a monitoring tool …


Near-Term Effects Of Perennial Grasses On Soil Carbon And Nitrogen In Eastern Nebraska, Salvador Ramirez II, Marty R. Schmer, Virginia L. Jin, Robert B. Mitchell, Kent M. Eskridge May 2023

Department of Statistics: Faculty Publications

Incorporating native perennial grasses adjacent to annual row crop systems managed on marginal lands can increase system resiliency by diversifying food and energy production. This study evaluated (1) soil organic C (SOC) and total N stocks (TN) under warm-season grass (WSG) monocultures and a low diversity mixture compared to an adjacent no-till continuous-corn system, and (2) WSG total above-ground biomass (AGB) in response to two levels of N fertilization from 2012 to 2017 in eastern Nebraska, USA. The WSG treatments consisted of (1) switchgrass (SWG), (2) big bluestem (BGB), and (3) low-diversity grass mixture (LDM; big bluestem, Indiangrass, and sideoat …


Inter-Rater Reliability Of Statistics Based On Reconstructed Individual Patient Data From Published Kaplan-Meier Curves, Megan E. Smith May 2023

Theses & Dissertations

Introduction: Time-to-event outcomes include two elements: an indicator variable for whether the event has taken place, and the length of time from some origin point to the occurrence of the event of interest. Due to the complexity of these data, secondary analysis methods, such as indirect comparisons and meta-analysis, are easier to perform when individual-level patient data (IPD) is available.

Objectives: In 2021, an R package IPDfromKM was published, which contains an algorithm for reconstructing IPD from a Kaplan-Meier graph. The current research aimed to investigate the reproducibility of the IPDfromKM algorithm.

Methods: Three statisticians (MS, LS, …
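
As background for the two elements of time-to-event data described in the introduction (an event indicator and a follow-up time), the following sketch computes a Kaplan-Meier survival curve from individual patient records. It is a textbook product-limit estimator on five made-up patients, offered only as an illustration; it is not the IPDfromKM reconstruction algorithm, which works in the opposite direction from a published curve back to patient-level data.

```python
def kaplan_meier(times, events):
    """Product-limit estimate of the survival function.

    times  -- follow-up time for each patient
    events -- 1 if the event occurred at that time, 0 if censored
    Returns a list of (time, survival probability) at each event time.
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        n_events = 0
        n_leaving = 0
        # Group all patients sharing the same follow-up time.
        while i < len(data) and data[i][0] == t:
            n_events += data[i][1]
            n_leaving += 1
            i += 1
        if n_events:
            survival *= 1.0 - n_events / at_risk
            curve.append((t, survival))
        # Both events and censored patients leave the risk set.
        at_risk -= n_leaving
    return curve


# Hypothetical data: patients 3 and 5 are censored.
times = [1, 2, 2, 3, 4]
events = [1, 1, 0, 1, 0]
curve = kaplan_meier(times, events)
print(curve)
```

Censored patients never lower the curve directly, but they shrink the risk set, which makes later drops larger.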


Factors Affecting Apothecia Production And Primary Infection By Monilinia Vaccinii-Corymbosi On Vaccinium Angustifolium, Ian Leonard May 2023

Electronic Theses and Dissertations

Mummy berry, caused by Monilinia vaccinii-corymbosi (MVC), is a prolific disease of Vaccinium angustifolium (wild blueberry) leading to decreased yield in wild blueberry fields throughout the Downeast (DE) and Midcoast (MC) regions of Maine (ME). This study aimed to identify factors affecting primary inoculum production and infection by MVC on wild blueberry, and what bud stages of wild blueberry are most susceptible to infection. Through common garden (CGE), field and incubation experiments conducted in 2021 and 2022, factors affecting carpogenic germination of MVC pseudosclerotia and relationships between susceptible wild blueberry buds and environmental factors were analyzed. The CGE conducted in …


Comparing Hierarchical Data Structures And Hierarchical Data Analysis, Halley Jeanne Dante, Robert Rovetti May 2023

Honors Thesis

Real world data is inherently noisy and data analysis can be especially complex when noise is compounded in hierarchical and multilevel data structures. Since such data structures can be described using multiple approaches, the way data is collapsed and grouped within these structures can influence its resulting interpretation and analyses. To avoid discrepancies in data collapsing and grouping, multiple statistical approaches have been developed specifically to analyze multilevel data structures. Examples of multilevel statistical models are the two-factor ANOVA and the general linear model with repeated-measures (GLM-RR) which is typically used in the context of looking at change over time. …


Developing An Enhanced Forest Inventory In Maine Using Airborne Laser Scanning: The Role Of Calibration Plot Design And Data Quality, Stephanie Willsey May 2023

Electronic Theses and Dissertations

Forests provide essential ecosystem services such as carbon sequestration, clean water, lumber, and more. It is important that foresters be able to collect accurate forest inventories, especially in a changing climate. Foresters need to know what is in the forest not only to manage for the economic benefits, but also to manage for social acceptability and ecological soundness to prevent further degradation of these ecosystem services. One way to collect accurate and precise forest inventories is through the utilization of remote sensing products. These enhanced forest inventories (EFIs) can be done at varying resolutions that are contingent on the plot …


Brief Review: Low Frequency Event Charts (G-Charts) In Healthcare, James Espinosa, David Ho, Alan Lucerna, Henry Schuitema May 2023

Rowan-Virtua Research Day

The ability to determine if a change in a system is actually an improvement—or worsening in function—is one of the essential desiderata of quality improvement efforts. There are many ways to look at the issue. A special problem occurs when the event being studied is low frequency by nature. By way of example, patient falls in a given hospital or division of a hospital may occur in a way that is low frequency—yet each event is important. Process engineering has developed an approach to low frequency events. Part of this approach may involve specialized charts that look at the “time-between-events”—as …


A Probabilistic Exploration Of Food Supplementation And Assistance, Logan Mattingly May 2023

Honors College Theses

Food insecurity is a stark threat that affects households throughout our country. Dietary insufficiency manifests itself in ways that affect health and public safety. According to researchers, individuals who suffer from food insecurity have a higher risk of aggression, anxiety, suicide ideation and depression. These problems tend to fall disproportionately on households with lower income. In this work, an exploratory analysis within these data sets will be performed to examine the socio-economic, biographical, nutritional, and geographical principal components of food insecurity among survey participants and how the US Supplemental Nutrition Assistance Program (SNAP) affects …


Formula 101 Using 2022 Formula One Season Data To Understand The Race Results, Christopher Garcia, Oliver Lopez May 2023

Student Scholar Symposium Abstracts and Posters

I am interested in Formula One because my friend showed me what Formula One was all about. It became interesting to see the action of the sport, including the battles the drivers have during the race and how fast they go through a corner. Also, when qualifying comes around, they push their car to the absolute limit to gain a few seconds off their opponents. Only the drivers in the top 10 receive points, from the winner getting 25 points to the last driver in the top 10 getting 1 point, and those below the top ten …


Fractal Newton Methods, Ali Akgül, David E. Grow May 2023

Mathematics and Statistics Faculty Research & Creative Works

We introduce fractal Newton methods for solving (Formula presented.) that generalize and improve the classical Newton method. We compare the theoretical efficacy of the classical and fractal Newton methods and illustrate the theory with examples.
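
For orientation, the classical Newton iteration that serves as the baseline here can be sketched as below. This is only the textbook method, not the fractal variant the authors introduce. It assumes a scalar equation of the form f(x) = 0 with a known derivative, which is a guess because the source formula is not shown; the test function and starting point are illustrative choices.

```python
def newton(f, df, x0, tol=1e-12, max_iter=50):
    """Classical Newton iteration: x <- x - f(x) / f'(x)."""
    x = x0
    for _ in range(max_iter):
        fx = f(x)
        if abs(fx) < tol:
            return x
        slope = df(x)
        if slope == 0:
            raise ZeroDivisionError("derivative vanished; Newton step undefined")
        x -= fx / slope
    return x


# Illustrative example: the positive root of x^2 - 2.
root = newton(lambda x: x * x - 2.0, lambda x: 2.0 * x, x0=1.0)
print(root)
```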


Explorations In Baseball Analytics: Simulations, Predictions, And Evaluations For Games And Players, Katelyn Mongerson May 2023

Theses and Dissertations

From statistics being reported in newspapers in the 1840s to the present day, baseball has always been one of the most data-driven sports. We make use of the endless publicly available baseball data to build models in R and Python that answer various baseball-related questions regarding predicting and optimizing run production, evaluating player effectiveness, and forecasting the postseason. To predict and optimize run production, we present three models. The first builds a common tool in baseball analysis called a Run Expectancy Matrix which is used to give a value (in terms of runs) to various in-game decisions. The second uses the …
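
A Run Expectancy Matrix of the kind mentioned above is, in its simplest form, a table of average runs scored from each base-out state until the end of the inning. The sketch below builds one from a handful of invented records; the record layout `(outs, bases, runs_rest_of_inning)` and the data are hypothetical and are not taken from the thesis.

```python
from collections import defaultdict


def run_expectancy(events):
    """Average runs scored through the end of the inning for each state.

    events -- iterable of (outs, bases, runs_rest_of_inning), where bases
              is a string such as "---" (empty) or "1-3" (runners on 1st, 3rd).
    """
    totals = defaultdict(float)
    counts = defaultdict(int)
    for outs, bases, runs_rest in events:
        state = (outs, bases)
        totals[state] += runs_rest
        counts[state] += 1
    return {state: totals[state] / counts[state] for state in totals}


# Toy records, invented purely for illustration.
events = [
    (0, "---", 0), (0, "---", 1), (0, "---", 2),
    (1, "1--", 0), (1, "1--", 1),
    (2, "123", 0), (2, "123", 3),
]
matrix = run_expectancy(events)
for state in sorted(matrix):
    print(state, matrix[state])
```

The value of an in-game decision can then be read off as the difference in expected runs between the state before it and the state after it.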


Baseball’s Evolution In The 21st Century, And How It Exemplifies Human Response To Change, Jonathan Sharpe May 2023

Honors Projects

The game of baseball has changed a lot in the past twenty years. The change can be attributed primarily to the explosion in data analytics and how it is used to evaluate baseball players. This led teams to prefer different player profiles and eventually changed how players are developed. As a result, the strategies employed have also evolved, producing a different game from the one seen only a couple of decades ago. This paper will explore the changes that the game has seen. On the other hand, Major League Baseball has also implemented its own changes to try and …


Fully Decoupled Energy-Stable Numerical Schemes For Two-Phase Coupled Porous Media And Free Flow With Different Densities And Viscosities, Yali Gao, Xiaoming He, Tao Lin, Yanping Lin May 2023

Mathematics and Statistics Faculty Research & Creative Works

In this article, we consider a phase field model with different densities and viscosities for the coupled two-phase porous media flow and two-phase free flow, as well as the corresponding numerical simulation. This model consists of three parts: a Cahn-Hilliard-Darcy system with different densities/viscosities describing the porous media flow in the matrix, a Cahn-Hilliard-Navier-Stokes system with different densities/viscosities describing the free fluid in the conduit, and seven interface conditions coupling the flows in the matrix and the conduit. Based on the separate Cahn-Hilliard equations in the porous media region and the free flow region, a weak formulation is proposed to incorporate the …


Identifying And Analyzing Multi-Star Systems Among TESS Planetary Candidates Using Gaia, Katie E. Bailey May 2023

Electronic Theses and Dissertations

Exoplanets represent a young, rapidly advancing subfield of astrophysics where much is still unknown. It is therefore important to analyze trends among their parameters to learn more about these systems. More complexity is added to these systems with the presence of additional stellar companions. To study these complex systems, one can employ programming languages such as Python to parse databases such as those constructed by TESS and Gaia to bridge the gap between exoplanets and stellar companions. Data can then be analyzed for trends in these multi-star exoplanet systems and in juxtaposition to their single-star counterparts. This research was able …


Comparison Of Different Robust Methods In Linear Regression And Applications In Cardiovascular Data, Jagannath Das May 2023

Open Access Theses & Dissertations

Due to advanced technology and the wide range of data sources, high-dimensional data is available in several fields, including healthcare, bioinformatics, medicine, epidemiology, economics, finance, sociology, and climatology. In those datasets, outliers are generally encountered due to technical errors, heterogeneous sources, or the effect of some confounding variables. As outliers are often difficult to detect in high-dimensional data, the standard approaches may fail to model such data and produce misleading information. In this thesis, we studied Huber and Tukey's M-estimators for linear regression that automatically down-weight outliers and provide a good fit. We also investigated two variable selection methods -- LASSO …
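
The down-weighting behaviour of a Huber M-estimator can be illustrated with a one-predictor regression fitted by iteratively reweighted least squares. The sketch makes several assumptions that are not from the thesis: a single covariate, the conventional tuning constant 1.345, a scale estimate based on the median absolute residual, and ten synthetic points with one gross outlier.

```python
from statistics import median


def weighted_line_fit(x, y, w):
    """Weighted least-squares fit of y = intercept + slope * x."""
    sw = sum(w)
    mx = sum(wi * xi for wi, xi in zip(w, x)) / sw
    my = sum(wi * yi for wi, yi in zip(w, y)) / sw
    sxx = sum(wi * (xi - mx) ** 2 for wi, xi in zip(w, x))
    sxy = sum(wi * (xi - mx) * (yi - my) for wi, xi, yi in zip(w, x, y))
    slope = sxy / sxx
    return my - slope * mx, slope


def huber_line_fit(x, y, c=1.345, iterations=50):
    """Huber M-estimate of a line via iteratively reweighted least squares."""
    intercept, slope = weighted_line_fit(x, y, [1.0] * len(x))
    for _ in range(iterations):
        resid = [yi - (intercept + slope * xi) for xi, yi in zip(x, y)]
        # Robust scale from the median absolute residual.
        scale = median(abs(r) for r in resid) / 0.6745
        if scale == 0:
            break
        # Huber weights: 1 inside the threshold, shrinking outside it.
        w = [1.0 if abs(r) <= c * scale else c * scale / abs(r) for r in resid]
        intercept, slope = weighted_line_fit(x, y, w)
    return intercept, slope


# Synthetic data: y = 2 + 3x plus small fixed noise, with one gross outlier.
x = [float(i) for i in range(10)]
noise = [0.1, -0.2, 0.05, 0.15, -0.1, 0.0, -0.05, 0.2, -0.15, 0.1]
y = [2.0 + 3.0 * xi + e for xi, e in zip(x, noise)]
y[-1] += 40.0

ols_intercept, ols_slope = weighted_line_fit(x, y, [1.0] * len(x))
huber_intercept, huber_slope = huber_line_fit(x, y)
print(ols_slope, huber_slope)
```

Ordinary least squares is pulled strongly toward the outlier, while the Huber fit stays closer to the slope of 3 used to generate the data.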


Nonparametric Estimation Of Elliptical Copulas, Panfeng Liang May 2023

Open Access Theses & Dissertations

Elliptical copulas provide flexibility in modeling the dependence structure of a random vector. They are often parameterized with a correlation matrix and a scalar function, called the generator. The estimation of the generator can be challenging, because it is a functional parameter. In this dissertation, we provide a rigorous approach to estimating the generator in a Bayesian framework, which is simpler, more robust, and outperforms existing estimation methods in the literature. Based on the proposed framework in this dissertation, other researchers may modify the model for other types of generators in their own research.


Change Point Detection For A Process Having Several Regimes, Oliver Gerd Meister May 2023

Theses and Dissertations

In this dissertation, possible methods for multiple change point detection on Markov chain processes are studied. Related works for offline and online change point detection are discussed and their applicability to sequential multiple change point detection for several regimes is evaluated. We develop a method for multiple change point detection for a process having three regimes. Its efficiency is then evaluated on simulated Markov chain data by looking into different scenarios such as processes that significantly differ from each other or probability distributions that are slightly similar. This approach is then applied to COVID-19 hospital data. Therefore, the data …


Investigating The Effect Of Greediness On The Coordinate Exchange Algorithm For Generating Optimal Experimental Designs, William Thomas Gullion May 2023

All Graduate Theses and Dissertations, Spring 1920 to Summer 2023

Design of Experiments (DoE) is the field of statistics concerned with helping researchers maximize the amount of information they gain from their experiments. Recently, researchers have been turning to optimal experimental designs instead of classical/catalog experimental designs. One of the most popular algorithms used today to generate optimal designs is the Coordinate Exchange (CEXCH) Algorithm. CEXCH is known to be a greedy algorithm, which means it tends to favor immediate, locally best designs instead of globally optimal designs. Previous research demonstrated that this tradeoff was efficacious in that it reduced the cost of a single run of CEXCH and allowed …


Theoretical And Computational Aspects Of Robust Cluster Analysis For Multivariate And High-Dimensional Datasets, Andrews Tawiah Anum May 2023

Open Access Theses & Dissertations

Multivariate and high-dimensional datasets typically contain subgroups that may not be immediately apparent. To reveal these groups, cluster analysis is performed. Cluster analysis is an unsupervised machine learning technique commonly employed to partition a dataset into distinct categories referred to as clusters. The k-means algorithm is a prominent distance-based clustering method. Despite overwhelming popularity, the algorithm is not invariant under non-singular affine transformations and is not robust, i.e., can be unduly influenced by outliers. To address these deficiencies, we propose an alternative model-based clustering procedure by minimizing a “trimmed” variant of the negative log-likelihood function. We develop a “concentration step”, …


Flexible Models For The Estimation Of Treatment Effect, Habeeb Abolaji Bashir May 2023

Open Access Theses & Dissertations

Estimation of treatment effect is an important problem which is well studied in the literature. While regression models are among the most commonly used techniques for the estimation of treatment effect, they are prone to model misspecification. To minimize the model misspecification bias, flexible nonparametric models are introduced for the estimation. Continuing this line of research, we propose two flexible nonparametric models that allow the treatment effect to vary across different levels of covariates. We provide estimation algorithms for both of these models. Using simulations and data analysis, we illustrate the usefulness of the proposed methods.