Open Access. Powered by Scholars. Published by Universities.®

Physical Sciences and Mathematics Commons

Statistics and Probability

Articles 481 - 510 of 13246

Full-Text Articles in Physical Sciences and Mathematics

Exploring Experimental Design And Multivariate Analysis Techniques For Evaluating Community Structure Of Bacteria In Microbiome Data, Kelsey Karnik Aug 2023

Department of Statistics: Dissertations, Theses, and Student Work

The gut microbiome plays a crucial role in human health, and by working collaboratively with microbiologists, we aim to further our understanding of the human gut and its impact on human health. Promoting a diverse microbiome is emphasized throughout microbiology literature, and involving a statistician in designing experiments to relate gut bacteria and some measured health outcome is crucial for ensuring valid and accurate results. By adopting new experimental design and analysis methods, researchers can begin to gain a deeper understanding of how the genetics of our food affect the composition of taxa within the gut microbiome. This dissertation is …


Causal Inference Methods For Estimation Of Survival And General Health Status Measures Of Alzheimer’s Disease Patients, Ehsan Yaghmaei Aug 2023

Computational and Data Sciences (PhD) Dissertations

Identifying optimal treatment options with respect to the survival of Alzheimer's disease patients is a crucially important and previously uninvestigated research question. Our objective was to estimate the causal effects of the most prevalent classes of Alzheimer’s disease drugs, Donepezil and Memantine, and their combined use on survival and general health status measures of Alzheimer's disease patients for the first five years after initial diagnosis. We carried out a thorough causal inference study using doubly robust estimators, nonparametric bootstrap confidence intervals, and Bonferroni corrections for multiple comparisons, analyzing one of the largest high-quality medical databases containing millions of de-identified electronic health records …
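
As a rough illustration of two of the tools named in the abstract above, the sketch below combines a nonparametric (percentile) bootstrap confidence interval with a Bonferroni-adjusted significance level. It is not the dissertation's method: it uses a plain difference in means rather than a doubly robust estimator, and the function name `bootstrap_diff_ci` and all of its parameters are hypothetical.

```python
import numpy as np

def bootstrap_diff_ci(treated, control, n_comparisons=1, alpha=0.05,
                      n_boot=2000, seed=0):
    """Percentile bootstrap CI for a difference in means.

    Assumption: a simple mean difference stands in for the causal
    estimator; the level alpha is split across n_comparisons
    (Bonferroni), which widens each individual interval.
    """
    rng = np.random.default_rng(seed)
    treated = np.asarray(treated, dtype=float)
    control = np.asarray(control, dtype=float)
    adj_alpha = alpha / n_comparisons  # Bonferroni-adjusted level
    diffs = np.empty(n_boot)
    for b in range(n_boot):
        # Resample each group with replacement and recompute the estimate.
        t = rng.choice(treated, size=treated.size, replace=True)
        c = rng.choice(control, size=control.size, replace=True)
        diffs[b] = t.mean() - c.mean()
    lower, upper = np.quantile(diffs, [adj_alpha / 2, 1 - adj_alpha / 2])
    return float(lower), float(upper)
```

Calling the function with `n_comparisons=3`, for example, would give the interval for one of three simultaneous comparisons at a familywise level of 0.05.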


Geometric Morphometric Analysis Of Modern Viperid Vertebrae Facilitates Identification Of Fossil Specimens, Lance D. Jessee Aug 2023

Electronic Theses and Dissertations

Snake vertebrae are common in the fossil record, whereas cranial remains are generally fragile and rare. Consequently, vertebrae are the most commonly studied fossil element of snakes. However, identification of snake vertebrae can be problematic due to extensive variation. This study utilizes 2-D geometric morphometrics and canonical variates analysis to 1) reveal variation between genera and species and 2) classify vertebrae of modern and fossil eastern North American Agkistrodon and Crotalus. The results show that vertebrae of Agkistrodon and Crotalus can reliably be classified to genus and species using these methods. Based on the statistical analyses, four of the …


Statistical Inference On Lung Cancer Screening Using The National Lung Screening Trial Data, Farhin Rahman Aug 2023

Electronic Theses and Dissertations

This dissertation consists of three research projects on cancer screening probability modeling. In these projects, the three key modeling parameters for cancer screening (sensitivity, sojourn time, transition density) were estimated, and the long-term outcomes (including overdiagnosis as one outcome), the optimal screening time/age, the lead time distribution, and the probability of overdiagnosis at the future screening time were simulated to provide a statistical perspective on the effectiveness of cancer screening programs. In the first part of this dissertation, statistical inference was conducted for male and female smokers using the National Lung Screening Trial (NLST) chest X-ray data. A …


Penalized Bayesian Exponential Random Graph Models, Vicki Modisette Aug 2023

Electronic Theses and Dissertations

Networks have the critical ability to represent the complex interconnectedness of social relationships, biological processes, and the spread of diseases and information. Exponential random graph models (ERGMs) are among the popular statistical methods for analyzing network data. ERGMs, however, struggle with computational challenges and degeneracy issues, further exacerbated by their inability to handle high-dimensional network data. Bayesian techniques provide a promising avenue to overcome these two problems. This paper considers penalized Bayesian exponential random graph models with adaptive lasso and adaptive ridge penalties to perform variable selection and reduce multicollinearity on a variety of networks. The experimental results demonstrate …


Comparative Study Of Supervised Classification Techniques With A Modified KNN Algorithm, Noah Owusu Aug 2023

Open Access Theses & Dissertations

The goal of classification is to develop a model that can be used to accurately assign new observations to labeled classes based on the patterns learned from the training data. The K-nearest Neighbors algorithm (KNN) is a popular and widely used algorithm for classification; however, its performance can be adversely affected by the presence of outliers in a dataset. In this study we have modified the existing KNN algorithm to alleviate the effect of outliers in a dataset, thereby improving the performance of the KNN algorithm. We compared the performances of the Modified KNN method and the Existing KNN algorithm …


A Framework For Statistical Modeling Of Wind Speed And Wind Direction, Eva Murphy Aug 2023

All Dissertations

Atmospheric near-surface wind speed and wind direction play an important role in many applications, ranging from air quality modeling, building design, and wind turbine placement to climate change research. It is therefore crucial to accurately estimate the joint probability distribution of wind speed and direction. This dissertation aims to provide a modeling framework for studying the variation of wind speed and wind direction. To this end, three projects are conducted to address some of the key issues for modeling wind vectors.

First, a conditional decomposition approach is developed to model the joint distribution of wind speed and direction. Specifically, the …


Single-Index Multinomial Model For Analyzing Crime Data, Kwabena Gyamfi Duodu Aug 2023

Open Access Theses & Dissertations

We develop a flexible single-index multinomial model for analyzing crime data. In addition to the number of crimes reported, the data also includes covariates such as location, time of day, weather, and other demographic factors. We provide an estimation algorithm and develop R code for the single-index multinomial model. Using simulations, we evaluate the performance of the proposed estimation algorithm. When applied to crime data, the single-index multinomial model provides important insights into crime trends and risk variables, assisting in the development of tailored crime prevention programs. Policymakers and law enforcement organizations can use the model's projections to more efficiently allocate …


Robust Mahalanobis K-Means Algorithm In Comparison With Other Existing Clustering Methods, Eleazer Tabi Serebour Aug 2023

Open Access Theses & Dissertations

This study enhances K-means Mahalanobis clustering using Density Power Divergence (DPD) for outlier handling and detection. Through the utilization of simulations and the analysis of real-world data, our approach consistently outperforms standard K-means, Mahalanobis K-means, Fuzzy C-means, and others in clustering datasets with outliers. While our method performs similarly to others on spherical datasets, it ranks second to DBSCAN for arbitrary shapes. We showcase its superiority on real-life datasets (Iris flower and wheat seed), demonstrating resilient outlier identification. By navigating various structures and cluster characteristics, our Modified Mahalanobis K-means method proves adaptable and robust, offering insights into diverse clustering scenarios. …


Modeling Biphasic, Non-Sigmoidal Dose-Response Relationships: Comparison Of Brain-Cousens And Cedergreen Models For A Biochemical Dataset, Venkat D. Abbaraju, Tamaraty L. Robinson, Brian P. Weiser Aug 2023

Rowan-Virtua School of Osteopathic Medicine Departmental Research

Biphasic, non-sigmoidal dose-response relationships are frequently observed in biochemistry and pharmacology, but they are not always analyzed with appropriate statistical methods. Here, we examine curve fitting methods for “hormetic” dose-response relationships where low and high doses of an effector produce opposite responses. We provide the full dataset used for modeling, and we provide the code for analyzing the dataset in SAS using two established mathematical models of hormesis, the Brain-Cousens model and the Cedergreen model. We show how to obtain and interpret curve parameters such as the ED50 that arise from modeling, and we discuss how curve parameters might change …
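
For readers unfamiliar with hormetic curves, the following is a minimal sketch of one common parameterization of the Brain-Cousens model, written in Python rather than the SAS used by the authors. The parameter names (`b`, `c`, `d`, `e`, `f`) follow a widely used convention and are an assumption here; the abstract does not state which parameterization the paper adopts.

```python
import numpy as np

def brain_cousens(x, b, c, d, e, f):
    """Brain-Cousens dose-response curve (one common parameterization).

    c and d are the lower and upper asymptotes, b controls steepness,
    and f >= 0 sets the size of the low-dose (hormetic) stimulation.
    Doses x must be strictly positive because of the logarithm.
    """
    x = np.asarray(x, dtype=float)
    return c + (d - c + f * x) / (1.0 + np.exp(b * (np.log(x) - np.log(e))))

# With f = 0 the curve reduces to the four-parameter log-logistic model,
# where e is the dose giving a response halfway between c and d.
halfway = brain_cousens(2.0, b=1.5, c=0.0, d=1.0, e=2.0, f=0.0)

# With f > 0 the response at a low dose can exceed the upper asymptote d.
bump = brain_cousens(2.0, b=3.0, c=0.0, d=1.0, e=10.0, f=0.5)
```

In this parameterization, once `f` is nonzero the parameter `e` no longer coincides with the ED50, which is one reason the interpretation of fitted curve parameters needs care.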


Statistical Modeling Approaches For The Inference Of Cancer Mechanisms, Licai Huang Aug 2023

Dissertations & Theses (Open Access)

The aim of this study was to explore the potential of integrating multi-platform genomic datasets to improve our understanding of the biological mechanisms behind cancer. By merging clinical outcomes with the data obtained from multi-platform genomic studies, we can gain insight into the biological mechanisms behind a patient’s response to treatment. Additionally, the evaluation of the correlations between genetic variations and gene expression provides a better understanding of the functional significance of these variations. Such knowledge has the potential to revolutionize cancer diagnosis and treatment. This thesis describes methods developed to address two related aims. Aim 1: We have developed …


Cannabidiol Tweet Miner: A Framework For Identifying Misinformation In CBD Tweets, Jason Turner Aug 2023

Electronic Theses and Dissertations

As regulations surrounding cannabis continue to develop, the demand for cannabis-based products is on the rise. Despite not producing the psychoactive effects commonly associated with THC, products containing cannabidiol (CBD) have gained immense popularity in recent years as a potential treatment option for a range of conditions, particularly those associated with pain or sleep disorders. However, due to current federal policies, these products have yet to undergo comprehensive safety and efficacy testing. Fortunately, utilizing advanced natural language processing (NLP) techniques, data harvested from social networks have been employed to investigate various social trends within healthcare, such as disease tracking and …


A Data-Driven Multi-Regime Approach For Predicting Real-Time Energy Consumption Of Industrial Machines, Abdulgani Kahraman Aug 2023

Electronic Theses and Dissertations

This thesis focuses on methods for improving energy consumption prediction performance in complex industrial machines. Working with real-world industrial machines brings several challenges, including data access, algorithmic bias, data privacy, and the interpretation of machine learning algorithms. To effectively manage energy consumption in the industrial sector, it is essential to develop a framework that enhances prediction performance, reduces energy costs, and mitigates air pollution in heavy industrial machine operations. This study aims to assist managers in making informed decisions and driving the transition towards green manufacturing. The energy consumption of industrial machinery is substantial, and the recent increase in CO2 …


An Interval-Valued Random Forests, Paul Gaona Partida Aug 2023

All Graduate Theses and Dissertations, Spring 1920 to Summer 2023

There is a growing demand for the development of new statistical models and the refinement of established methods to accommodate different data structures. This need arises from the recognition that traditional statistics often assume the value of each observation to be precise, which may not hold true in many real-world scenarios. Factors such as the collection process and technological advancements can introduce imprecision and uncertainty into the data.

For example, consider data collected over a long period of time, where newer measurement tools may offer greater accuracy and provide more information than previous methods. In such cases, it becomes crucial …


Statistical Graph Quality Analysis Of Utah State University Master Of Science Thesis Reports, Ragan Astle Aug 2023

All Graduate Theses and Dissertations, Spring 1920 to Summer 2023

Graphical software packages have become increasingly popular in our modern world, but there are concerns within the statistical visualization field about the default settings provided by these packages, which can make it challenging to create good quality graphs that align with standard graph principles. In this thesis, we investigate whether the quality of graphs from Utah State University (USU) Plan A Master of Science (MS) thesis reports from the years 1930 to 2019 was affected by the rise of graphical software packages. We collected all data stored on the USU Digital Commons website since November 2021 to determine the specific …


Stressor: An R Package For Benchmarking Machine Learning Models, Samuel A. Haycock Aug 2023

All Graduate Theses and Dissertations, Spring 1920 to Summer 2023

Many discipline-specific researchers need a way to quickly compare the accuracy of their predictive models to other alternatives. However, many of these researchers are not experienced with multiple programming languages. Python has recently been the leader in machine learning functionality, which includes the PyCaret library that allows users to develop high-performing machine learning models with only a few lines of code. The goal of the stressor package is to help users of the R programming language access the advantages of PyCaret without having to learn Python. This allows the user to leverage R’s powerful data analysis workflows, while simultaneously …


Using Natural Language Processing To Quantify The Efficacy Of Language Simplification As A Communication Strategy, Brian Nalley Aug 2023

All Graduate Theses and Dissertations, Spring 1920 to Summer 2023

People with communication disorders often experience difficulties being understood by unfamiliar listeners or in noisy environments. A common strategy for effectively communicating in these scenarios is to use simpler and more predictable language. Despite the prevalence of this strategy, there has been little to no research to date focused on the effectiveness of language simplification as a communication strategy. This study seeks to begin filling that gap by using natural language processing to determine whether speakers with early-stage Parkinson’s disease and age-matched neurotypical speakers are able to successfully simplify their language while still maintaining the original message.

Simplification was measured …


Comparing Predictive Performance Of GARCH And Stochastic Volatility Models, Swapnaneel Nath Aug 2023

Graduate Theses and Dissertations

This paper compares the predictive performance of two commonly used financial models, the Generalized Auto-Regressive Conditional Heteroskedasticity (GARCH) model, and the Stochastic Volatility model. Both techniques are used in the finance literature to model returns on an asset; the main difference between the two is that the former holds volatility as deterministic, whereas the latter treats it as a stochastic component. Three 10-year periods (2006-15, 2008-17, and 2010-19) of returns of the S&P-500 Index are used to train the two models. The parameter estimation is done using Hamiltonian Monte Carlo. Then, using Sequential Monte Carlo updates, returns for 2016, 2018, …
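
The distinction drawn in the abstract above can be made concrete with the textbook forms of the two models. The equations below are a sketch under assumptions: they show a zero-mean GARCH(1,1) model and a basic log-volatility stochastic volatility model, and the symbols are illustrative; the exact specifications used in the thesis are not given in the abstract.

```latex
% GARCH(1,1): given past returns, the variance \sigma_t^2 is a
% deterministic function of r_{t-1} and \sigma_{t-1}^2.
\begin{align*}
r_t &= \sigma_t \varepsilon_t, \qquad \varepsilon_t \sim N(0, 1), \\
\sigma_t^2 &= \omega + \alpha r_{t-1}^2 + \beta \sigma_{t-1}^2 .
\end{align*}

% Stochastic volatility: the log-variance h_t has its own random
% innovation \eta_t, so it is a latent stochastic process.
\begin{align*}
r_t &= \exp(h_t / 2)\, \varepsilon_t, \qquad \varepsilon_t \sim N(0, 1), \\
h_t &= \mu + \phi (h_{t-1} - \mu) + \sigma_\eta \eta_t, \qquad \eta_t \sim N(0, 1).
\end{align*}
```

In the first model the only source of randomness is the return shock, whereas the second adds a separate volatility shock, which is why its latent states typically call for simulation-based estimation.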


A Comparative Study Of Techniques For Non-Monotonic Dependence With Emphasis On Sensitivity To Sample Size, Noise Level And Computational Attributes, Fariha Tasnim Aug 2023

Graduate Theses and Dissertations

Evaluating association between variables is often of interest to researchers. To serve this purpose, different association measures have been developed. However, the type of relation between variables affects the measured degree of relationship. Hence, detection of the relationship between variables is germane to measuring the correlation coefficient. With that mindset, here we explored six measures of non-monotonic association and compared them with three classical approaches. Because the techniques differ in definition and range, it is not feasible to compare the correlation estimates directly, as the nature of their variability differs. Therefore, we used a permutation test based on Monte …
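
A permutation test puts association measures with different ranges on a common footing by comparing each observed statistic with its own null distribution. The sketch below is a generic Monte Carlo version, not the study's procedure: the function names are hypothetical, and the absolute Pearson correlation is only a placeholder for whichever association measure is being evaluated.

```python
import numpy as np

def abs_pearson(x, y):
    """Placeholder association measure: absolute Pearson correlation."""
    return abs(np.corrcoef(x, y)[0, 1])

def permutation_pvalue(x, y, statistic=abs_pearson, n_perm=999, seed=0):
    """Monte Carlo permutation p-value for an association statistic.

    Shuffling y breaks any dependence on x, so the shuffled statistics
    approximate the null distribution of the chosen measure.
    """
    rng = np.random.default_rng(seed)
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    observed = statistic(x, y)
    exceed = 0
    for _ in range(n_perm):
        if statistic(x, rng.permutation(y)) >= observed:
            exceed += 1
    # The +1 terms count the observed arrangement as one permutation.
    return (exceed + 1) / (n_perm + 1)
```

Any other measure can be passed through the `statistic` argument, as long as larger values indicate stronger association.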


Synergy And Antagonism In Log-Linear Models, Md Nahid Hasan Aug 2023

UNLV Theses, Dissertations, Professional Papers, and Capstones

Synergy and antagonism have been extensively studied in the context of the statistical analysis of drug combinations given to treat a disease. “Synergy” (“antagonism”, resp.) in a drug combination occurs when the desirable effect of two drugs given together for treating a disease is greater than (less than, resp.) the effect of each drug given separately.

In this dissertation, however, we study “synergy” and “antagonism” in log-linear models. We also give a thorough review and extend several epidemiological definitions of synergy and antagonism for categorical data, and we connect these definitions to the definitions of synergy and antagonism in log-linear …


Benchmarking And Practical Evaluation Of Machine And Statistical Learning Methods In Credit Scoring: A Method Selection Perspective, Gwen Verbeck Aug 2023

UNLV Theses, Dissertations, Professional Papers, and Capstones

Predictive models are important tools used in all scientific fields. Machine learning (ML) algorithms and statistical models are widely used for decision-making because of their capability to tackle intricate and unique problems. In domains where data are high-dimensional and contain irrelevant and redundant features, ML algorithms are known to have superior performance over traditional (statistical) learning methods. However, researchers and analysts are often faced with a myriad of techniques to choose from, with no clear consensus on which will perform best for their specific task. Considering resource limitations, exhaustive exploration of all available methods is impractical and often fails to …


Development Of A Metapgs For Accurate Prediction Of Osteoporotic Fracture, Xiangxue Xiao Aug 2023

UNLV Theses, Dissertations, Professional Papers, and Capstones

Introduction: Early identification of individuals at high-risk for osteoporotic fractures who may benefit from preventive intervention is essential. However, the predictive accuracy of the currently used fracture risk assessment tool remains suboptimal. The first aim of this research is to construct genome-wide polygenic scores for the femoral neck (PGS_FNBMDidpred) and total body BMD (PGS_TBBMDidpred) and to estimate their potential in identifying individuals with a high risk of osteoporotic fractures. The second aim is to validate the predictive performance of two previously established PGSs (PGS_FNBMDidpred and PGS_TBBMDidpred) in an external cohort …


Copula Based Models For Bivariate Zero-Inflated Count Time Series Data, Dimuthu Fernando Aug 2023

Mathematics & Statistics Theses & Dissertations

Count time series data have multiple applications, which can be found in finance, climate, public health, and crime data analyses. In most scenarios, time is an important part of the data. Time series counts then come as multivariate vectors that exhibit not only serial dependence within each time series but also cross-correlation among the series. When a value, say zero, occurs among these observed counts more often than usual, the analysis presents crucial challenges: there is zero-inflation in the data. The literature on bivariate or multivariate count time series, as well as zero-inflated …


Editorial, Al Asyary Jul 2023

Kesmas

No abstract provided.


Sentiment Analysis Before And During The Covid-19 Pandemic, Emily Musgrove Jul 2023

Mathematics Summer Fellows

This study examines the change in connotative language use before and during the Covid-19 pandemic. By analyzing news articles from several major US newspapers, we found that there is a statistically significant correlation between the sentiment of the text and the publication period. Specifically, we document a large, systematic, and statistically significant decline in the overall sentiment of articles published in major news outlets. While our results do not directly gauge the sentiment of the population, our findings have important implications regarding the social responsibility of journalists and media outlets especially in times of crisis.


Advances In Copula Estimation And Distribution Theory, Yishan Zang Jul 2023

Electronic Thesis and Dissertation Repository

This dissertation initially features distributional results related to copulas. Four distinct copula density estimation methodologies, including Bernstein’s polynomial approximation, are proposed and criteria for the selection of their tuning parameters are provided. These four approaches were found to produce similar density estimates, which validates their suitability. Moreover, the copula associated with a Wiener process and its running maximum is determined, and an illustrative numerical example is presented. The principal properties of Spearman’s, Kendall’s, Blomqvist’s and Hoeffding’s measures of association as well as their representations in terms of copulas are also discussed. Then, a novel method that is based on an …


A Multivariate Investigation Of The Motivational, Academic, And Well-Being Characteristics Of First-Generation And Continuing-Generation College Students, Christopher L. Thomas, Staci Zolkoski Jul 2023

Journal of Research Initiatives

Prior research has noted differences in motivational, academic, and well-being factors between first-generation and continuing-education students. However, past investigations have primarily overlooked the interactive influence of protective and risk factors when comparing the characteristics of first-generation and continuing-education students. Thus, the current study adopted a multivariate approach to gain a more nuanced understanding of the influence of generational status on students' self-regulated learning capabilities, academic anxiety, sense of belonging, academic barriers, mental health concerns, and satisfaction with life. University students (N = 432, 67.46% Caucasian, 87.55% female, Age = 28.10 ± 9.46) completed the Cognitive Test Anxiety Scale-2nd …


Modified Geometries, Clifford Algebras And Graphs: Their Impact On Discreteness, Locality And Symmetr, Roman Sverdlov Jul 2023

Mathematics & Statistics ETDs

In this dissertation I will explore the question of whether various entities commonly used in quantum field theory can be “constructed”. In particular, can spacetime be “constructed” out of building blocks, and can the Berezin integral be “constructed” in terms of Riemann integrals?

“Constructing” spacetime out of building blocks has been attempted by multiple scientific communities, and various models have been proposed. But their common downfall is that they break the principles of relativity. I will explore ways of doing so that respect the principles of relativity. One of my approaches is to replace points …


A Comparison Of Confidence Intervals In State Space Models, Jinyu Du Jul 2023

Statistical Science Theses and Dissertations

This thesis develops general procedures for constructing confidence intervals (CIs) for the error disturbance parameters (standard deviations) and transformations of the error disturbance parameters in time-invariant state space models (SSMs). With only a set of observations, estimating individual error disturbance parameters accurately in the presence of other unknown parameters in an SSM is a very challenging problem. We attempted to construct four different types of confidence intervals (Wald, likelihood ratio, score, and higher-order asymptotic intervals) for both the simple local level model and general time-invariant SSMs. We show that for a simple local level model, both the …
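
For context, the simple local level model mentioned above is usually written as follows. This is the standard textbook form with illustrative notation, given here as an assumption since the abstract does not show the thesis's own equations; the two standard deviations are the error disturbance parameters for which the intervals would be constructed.

```latex
% Local level model: a random-walk level observed with noise.
\begin{align*}
y_t &= \mu_t + \varepsilon_t, & \varepsilon_t &\sim N(0, \sigma_\varepsilon^2), \\
\mu_{t+1} &= \mu_t + \eta_t, & \eta_t &\sim N(0, \sigma_\eta^2).
\end{align*}
% Error disturbance parameters: \sigma_\varepsilon (observation noise)
% and \sigma_\eta (level noise).
```

A transformation of these parameters of the kind the abstract refers to could be, for example, the signal-to-noise ratio $\sigma_\eta / \sigma_\varepsilon$, though the abstract does not say which transformations were studied.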


Kinetic Particle Simulations Of Plasma Charging At Lunar Craters Under Severe Conditions, David Lund, Xiaoming He, Daoru Frank Han Jul 2023

Mathematics and Statistics Faculty Research & Creative Works

This paper presents fully kinetic particle simulations of plasma charging at lunar craters with the presence of lunar lander modules using the recently developed Parallel Immersed-Finite-Element Particle-in-Cell (PIFE-PIC) code. The computation model explicitly includes the lunar regolith layer on top of the lunar bedrock, taking into account the regolith layer thickness and permittivity as well as the lunar lander module in the simulation domain, resolving a nontrivial surface terrain or lunar lander configuration. Simulations were carried out to study the lunar surface and lunar lander module charging near craters at the lunar terminator region under mean and severe plasma environments. …