Open Access. Powered by Scholars. Published by Universities.®

Physical Sciences and Mathematics Commons

Statistics and Probability

Articles 661 - 690 of 13246

Full-Text Articles in Physical Sciences and Mathematics

Interpretable Learning In Multivariate Big Data Analysis For Network Monitoring, José Camacho, Rasmus Bro, David Kotz Apr 2023

Dartmouth Scholarship

There is an increasing interest in the development of new data-driven models useful to assess the performance of communication networks. For many applications, like network monitoring and troubleshooting, a data model is of little use if it cannot be interpreted by a human operator. In this paper, we present an extension of the Multivariate Big Data Analysis (MBDA) methodology, a recently proposed interpretable data analysis tool. In this extension, we propose a solution to the automatic derivation of features, a cornerstone step for the application of MBDA when the amount of data is massive. The resulting network monitoring approach allows …


A New Generalized Gamma-Weibull Distribution And Its Applications, Nihimat Iyebuhola Aleshinloye, Samuel Adewale Aderoju, Alfred Adewole Abiodun, Bako Lukmon Taiwo Apr 2023

Al-Bahir Journal for Engineering and Pure Sciences

In this paper, a New Generalized Gamma-Weibull (NGGW) distribution is developed by compounding the Weibull and generalized gamma distributions. Some mathematical properties such as moments, Rényi entropy and order statistics are derived and discussed. The maximum likelihood estimation (MLE) method is used to estimate the model parameters. The proposed model is applied to two real-life datasets to illustrate its performance and flexibility as compared to some other competing distributions. The results obtained show that the new distribution fits each of the datasets better than the other competing distributions.


Knowledge, Attitude, And Behavioral Intention About Oral Cancer Among Public Health Students In Southeast Georgia, Ravneet Kaur, Gulzar H. Shah Apr 2023

Department of Biostatistics, Epidemiology, and Environmental Health Sciences Faculty Publications

Background: Oral cancer (OC) is a significant public health problem; however, the degree to which the future public health workforce is aware of this issue is not well researched. The purpose of this study is to explore the level of knowledge, attitudes, and behavioral intentions about OC among public health students.
Materials and Methods: A sequential exploratory mixed-method research design was employed for this study. Using quantitative and qualitative measures, a survey was administered to 129 public health students. Subsequently, to understand the quantitative findings, two follow-up focus groups were conducted with survey participants.
Results: We found …


Multiple Endpoints In Randomized Controlled Trials: A Review And An Illustration Of The Global Test, Lindsay Cameron Apr 2023

Electronic Thesis and Dissertation Repository

A randomized controlled trial is often used to provide high quality evidence regarding treatment interventions. Due to the complex nature of many diseases, trials usually select multiple primary outcomes to capture the efficacy of the interventions. In this thesis, we conducted a literature search to determine the prevalence of the different types of multiple outcomes that have been used in randomized controlled trials. We also reviewed the corresponding statistical methods used to deal with such outcomes. In addition, we described the benefits of using global tests as a statistical method when there are multiple primary outcomes in order to answer …


MLB 2023 Season Attendance Predictions, Sophia Andersen, Anna Tollette, Hannah Clinton Apr 2023

Research and Scholarship Symposium Posters

The goal of this project was to predict home game attendance for all 30 Major League Baseball (MLB) teams in their 2023 season. Researching and understanding the data, as well as identifying influential factors of attendance, were key steps before building a predictive model. Both the given material and data sets from MinneMUDAC, the competition organizer, were used, as well as some outside sources. Finally, a predictive model was coded in Python which gave attendance predictions for every MLB game scheduled in 2023. From these results, insights could be offered to Major League Baseball or each team individually, to help …


El Final Report: Undergraduate Summer Research Internships, Sophie Wu Apr 2023

SASAH 4th Year Capstone and Other Projects: Publications

In her final report, Sophie Wu discusses her two Undergraduate Summer Research Internships at Western University: the first in the Statistics and Actuarial Science department, concerning microinsurance, and the second, in the Mathematics department, concerning computational neuroscience.


Classification Of Land Cover On Sand Dunes, Heleyna Tucker, Micah Sterk Apr 2023

22nd Annual Celebration of Undergraduate Research and Creative Activity (2023)

As members of the Hope College Coastal Research Group, we have studied the mechanisms for and effects of sand transport. In particular, we have worked to model vegetation coverage in West Michigan sand dune complexes in order to better understand how sand movement and resident vegetation affect one another. We use aerial drone imagery to develop machine learning algorithms for creating ground cover classification mappings in an automated way. Our team collected drone imagery ranging from high-resolution, low-altitude photographs to high-altitude stitched and rectified orthomosaics. We developed accurate ground cover classification methods for the low-altitude imagery and then explored ways …


Generating Optimal Space-Filling Designs With Particle Swarm Optimization, Rebekah Scott Apr 2023

Student Research Symposium

In 1935, Ronald Fisher published The Design of Experiments, establishing classical designs for various types of experiments. With the rise of computing power came optimal design, where statisticians can better customize designs according to the needs of the researchers running the experiment. This research focuses on generating optimal MaxMin space-filling designs with particle swarm optimization using various distance metrics (Manhattan, Euclidean, etc.). Interestingly, changing the distance metric in the objective function had a minimal effect on the design, except for Aitchison geometry on the simplex. Space-filling designs are optimal for supporting high-order models with only a small sacrifice in prediction …


A Graphical User Interface Using Spatiotemporal Interpolation To Determine Fine Particulate Matter Values In The United States, Kelly M. Entrekin Apr 2023

Honors College Theses

Fine particulate matter, or PM2.5, can be described as a pollution particle that has a diameter of 2.5 micrometers or smaller. These pollution particle values are measured by monitoring sites installed across the United States throughout the year. While these values are helpful, many areas are not accounted for, as scientists are not able to measure all of the United States. Some of these unmeasured regions could be reaching high PM2.5 values over time without anyone being aware of it. These high values can be dangerous by causing or worsening health conditions, such as cardiovascular and lung diseases. Within …


Multilevel Optimization With Dropout For Neural Networks, Gary Joseph Saavedra Apr 2023

Mathematics & Statistics ETDs

Large neural networks have become ubiquitous in machine learning. Despite their widespread use, the optimization process for training a neural network remains computationally expensive and does not necessarily create networks that generalize well to unseen data. In addition, the difficulty of training increases as the size of the neural network grows. In this thesis, we introduce the novel MGDrop and SMGDrop algorithms which use a multigrid optimization scheme with a dropout coarsening operator to train neural networks. In contrast to other standard neural network training schemes, MGDrop explicitly utilizes information from smaller sub-networks which act as approximations of the full …


A New Approach To Proper Orthogonal Decomposition With Difference Quotients, Sarah Locke Eskew, John R. Singler Apr 2023

Mathematics and Statistics Faculty Research & Creative Works

In a recent work (Koc et al., SIAM J. Numer. Anal. 59(4), 2163–2196, 2021), the authors showed that including difference quotients (DQs) is necessary in order to prove optimal pointwise in time error bounds for proper orthogonal decomposition (POD) reduced order models of the heat equation. In this work, we introduce a new approach to including DQs in the POD procedure. Instead of computing the POD modes using all of the snapshot data and DQs, we only use the first snapshot along with all of the DQs and special POD weights. We show that this approach retains all of the …


Rank-Based Inference For Survey Sampling Data, Akim Adekpedjou, Huybrechts F. Bindele Apr 2023

Mathematics and Statistics Faculty Research & Creative Works

For regression models where data are obtained from sampling surveys, the statistical analysis is often based on approaches that are either non-robust or inefficient. The handling of survey data requires more appropriate techniques, as the classical methods usually result in biased and inefficient estimates of the underlying model parameters. This article is concerned with the development of a new approach to obtaining robust and efficient estimates of regression model parameters when dealing with survey sampling data. Asymptotic properties of such estimators are established under mild regularity conditions. To demonstrate the performance of the proposed method, Monte Carlo simulation experiments are …


A Diffusion Network Event History Estimator, Jeffrey J. Harden, Bruce A. Desmarais, Mark Brockway, Frederick J. Boehmke, Scott J. Lacombe, Fridolin Linder, Hanna Wallach Apr 2023

Government: Faculty Publications

Research on the diffusion of political decisions across jurisdictions typically accounts for units’ influence over each other with (1) observable measures or (2) by inferring latent network ties from past decisions. The former approach assumes that interdependence is static and perfectly captured by the data. The latter mitigates these issues but requires analytical tools that are separate from the main empirical methods for studying diffusion. As a solution, we introduce network event history analysis (NEHA), which incorporates latent network inference into conventional discrete-time event history models. We demonstrate NEHA’s unique methodological and substantive benefits in applications to policy adoption in …


A Change-Point Analysis Of Air Pollution Levels In Silao, Mexico And Fresno, California, Rachael Goodwin Apr 2023

WWU Honors College Senior Projects

We analyzed PM10 levels in the city of Silao, Mexico, as well as PM2.5 and PM10 levels in Fresno, California to determine if there was a shift in air pollution levels in either location. A change point based analysis was used to determine if there was a shift in air pollution levels. In the city of Silao, there was a significant increase in PM10 levels, but there was no significant change in Fresno for either pollutant.


Prevalence Of SARS-CoV-2 Antibodies In Liberty University Student Population, Emily Bonus Apr 2023

Senior Honors Theses

In 2020, the virus SARS-CoV-2 gained attention as it spread around the world. Its antibodies are poorly understood, and little research focuses on those with few COVID-19 complications yet large numbers of close contacts: university students. This longitudinal study recorded SARS-CoV-2 antibody presence in 107 undergraduate Liberty University students twice during early 2021. After extensive data cleaning and the application of various statistical tests and ANOVAs, the data seem to show that in the case of COVID-19 infections, SARS-CoV-2 IgM antibodies are immediately produced, and then IgG antibodies follow later. However, the COVID-19 vaccine causes the production of both IgM …


Arousing Motives Or Eliciting Stories? On The Role Of Pictures In A Picture–Story Exercise, Philipp Schäpers, Stefan Krumm, Filip Lievens, Nikola Stenzel Apr 2023

Research Collection Lee Kong Chian School Of Business

Picture–story exercises (PSE) form a popular measurement approach that has been widely used for the assessment of implicit motives. However, current theorizing offers two diverging perspectives on the role of pictures in PSEs: either to elicit stories or to arouse motives. In the current study, we tested these perspectives in an experimental design. We administered a PSE either with or without pictures. Results from N = 281 participants revealed that the experimental manipulation had a medium to large effect for the affiliation and power motive domains, but no effect for the achievement motive domain. We conclude that the herein chosen …


Moral Injury To Inform Analysis Of Post-Traumatic Stress Disorder, Amanda Julia Manea Apr 2023

Senior Theses

Post-traumatic stress disorder (PTSD) is a mental health condition that almost one out of ten veterans struggle with. Although the National Center for PTSD has made extensive progress in characterizing and developing new treatments for PTSD, most veterans still experience symptoms of PTSD following treatment. Novel avenues of investigation, such as developing algorithms to review electronic health record (EHR) data and better understanding moral injury, are being pursued to address the gap that still exists when it comes to treating veterans. Moral injury is the individual evaluation of exposure to a potentially morally injurious event (PMIE) and can lead to …


That’s My Deity: An Examination Of Online Lokean Cultures Through Log-Linear Modeling, Mary Bernstein Apr 2023

Senior Theses

A rise in online religious communities and the growth of so-called ‘Old World’ religions are reflected in the internet’s subcultures of Neopaganism, a growing religious movement that has been documented in America since the 1960s. The religions under this umbrella movement vary drastically and include belief systems such as Wicca, Druidry, and deity worship. Belief systems under this movement lack the traditional hierarchy found in structured religion and lack a singular sacred text. As such, believers usually find and support one another not through a physical sacred place of meeting, but through an online community that acts as sacred space. …


Modeling The Probability Of A Successful Stolen Base Attempt In Major League Baseball, Cade Stanley Apr 2023

Senior Theses

In Major League Baseball (MLB), the outcome of a stolen base attempt has important implications. Success moves the runner closer to scoring, while failure records an out and removes the runner from the basepaths altogether. Therefore, it is important that the decision by a coach or player to steal a base is well-informed. In this thesis, I explore a statistical approach to making this decision. I train logistic regression and random forest models, using data about the game situation and about the runner, pitcher, and catcher involved in the stolen base attempt, to estimate the probability that a stolen base …


Influence Diagnostics For Generalized Estimating Equations Applied To Correlated Categorical Data, Louis Vazquez Apr 2023

Statistical Science Theses and Dissertations

Influence diagnostics in regression analysis allow analysts to identify observations that have a strong influence on model fitted probabilities and parameter estimates. The most common influence diagnostics, such as Cook’s Distance for linear regression, are based on a deletion approach where the results of a model with and without observations of interest are compared. Here, deletion-based influence diagnostics are proposed for generalized estimating equations (GEE) for correlated, or clustered, nominal multinomial responses. The proposed influence diagnostics focus on GEEs with the baseline-category logit link function and a local odds ratio parameterization of the association structure. Formulas for both observation- and …


Wernicke's Encephalopathy: Mapping The Risk Factors Throughout The State Of South Carolina, Shannon M. Rychener Apr 2023

Senior Theses

Wernicke’s Encephalopathy is a consistently underrecognized neurodegenerative brain disorder resulting from prolonged thiamine deficiency. Clinical presentation of the disease results from brain lesions attributable to thiamine deficiency. Because these lesions occur in various locations in the cerebral cortex, symptoms can vary significantly. Varied presentation of symptoms, in addition to the lack of a widely accepted biomarker for the disorder, poses challenges to clinicians when identifying and diagnosing the disorder. Due to these challenges, healthcare providers must heavily rely on patient history and risk factor prevalence when multiple symptoms of the disorder are present. By mapping the prevalence of the four …


High-Dimensional Variable Selection Via Knockoffs Using Gradient Boosting, Amr Essam Mohamed Apr 2023

Dissertations

As data continue to grow rapidly in size and complexity, efficient and effective statistical methods are needed to detect the important variables/features. Variable selection is one of the most crucial problems in statistical applications. This problem arises when one wants to model the relationship between the response and the predictors. The goal is to reduce the number of variables to a minimal set of explanatory variables that are truly associated with the response of interest to improve the model accuracy. Effectively choosing the true influential variables and controlling the False Discovery Rate (FDR) without sacrificing power has been a challenge …


Bayesian Dependence Structure Analysis For Ordinal Data, Yang He Apr 2023

Theses and Dissertations

This dissertation explores different methods to study the dependence structure among many ordinal variables under the Bayesian framework.

Chapter 1 introduces ordinal data analysis methods, and the related literature works are briefly reviewed. An outline of the dissertation is put forward.

In Chapter 2, Gaussian copula graphical models with different priors of graphical Lasso, adaptive graphical Lasso, and spike-and-slab Lasso on the precision matrix are assessed and compared. The proposed models are well illustrated via simulations and a real ordinal survey data analysis.

In Chapter 3, adaptive spike-and-slab Lasso prior is proposed as an extension of Chapter 2. The developed …


Community Participation And Perspectives Of Ambondrolava Mangrove Restoration Project, Nadine Shannon Apr 2023

Independent Study Project (ISP) Collection

Madagascar’s mangrove forests are intertidal ecosystems that provide numerous valuable ecosystem services but are nonetheless under pressure from large amounts of deforestation. On the southwestern coast of Madagascar, the village of Ambondrolava practices community-led management of the mangrove and its resources. This research project studied the evolution of the mangrove area using GIS data, and investigated, through interviews, the relationship between the local community of Ambondrolava and the organizations that manage the mangrove ecosystem. From 2000 to 2018, the zone of the mangrove has experienced a net loss in area every year, despite reforestation efforts. Most community members interviewed …


Changing NFL Playoff Overtime Rules To Create Equal Opportunities To Win A Game, Matthew Silvia Apr 2023

Honors Projects in Mathematics

The NFL has attempted to create fair overtime rules over the course of the past decade; however, this study is interested in determining what playoff overtime rule (or rules) the NFL could implement to result in outcomes where both teams have a relatively equal chance of winning a game. This study aims to find which overtime rules work best at minimizing the differences between teams who possess the ball first versus teams that kick the ball off to start an overtime period. By collecting various NFL statistics from ESPN.com and FantasyOutsiders.com, this study hopes to run multiple simulations of different …


Advancements In Parametric Modal Regression, Qingyang Liu Apr 2023

Theses and Dissertations

This dissertation considers statistical inference methods for parametric modal regression models. In Chapter 1, we motivate the mode as the measure of central tendency instead of the median or the mean with an example. Following the motivational example, we include an overview of existing modal regression models. Later, in the same chapter, we explain advantages of the parametric modal regression models over existing nonparametric modal regression models. In Chapter 2, we address issues in statistical inference brought in by data contaminated with measurement error. With measurement error in covariates, statistical inference methods designed for modal regression models with error-free covariates …


Detecting Spatially Varying Coefficient Effects With Conditional Autoregressive Models: A Simulation Study Using Social Determinants Of Health Screening Data, Reid J. Demass Apr 2023

Theses and Dissertations

Generalized linear models which include spatially varying coefficient terms allow researchers to determine if the association between predictor and outcome variables varies across geographic space. Such models are particularly applicable to research with public health data where interventions and limited health care resources must be allocated carefully. The integrated nested Laplace approximation (INLA) methodology available in the R INLA package is a popular tool to estimate spatially varying coefficients. To assess the performance of the estimation procedure, patient emergency department (ED) visits were simulated from data sourced from a pilot study at Prisma Health. The INLA technique was used to …


Sparse Partitioned Empirical Bayes ECM Algorithms For High-Dimensional Linear Mixed Effects And Heteroscedastic Regression, Anja Zgodic Apr 2023

Theses and Dissertations

Variable selection methods in both the frequentist and Bayesian frameworks are powerful techniques that provide prediction and inference in high-dimensional linear regression models. These methods often assume independence between observations and normally distributed errors with the same variance. In practice, these two assumptions are often violated. To mitigate this, we develop efficient and powerful Bayesian approaches for linear mixed modeling and heteroscedastic linear regression. These methods offer increased flexibility through the development of empirical Bayes estimators for hyperparameters, with computationally efficient estimation through the Expectation Conditional-Minimization (ECM) algorithm. The novelty of these approaches lies in the partitioning and parameter expansion, …


Statistical Clustering Of Networks With Additional Information, Paul Atandoh Apr 2023

Dissertations

As the online market grows rapidly, many companies and researchers are interested in analyzing product review datasets, which include ratings and text review data. In the first project, we mainly focus on analyzing the text review data. In the current literature, it is common to use only text analysis tools to analyze review datasets. But in our work, we propose a method that utilizes both a text analysis method such as topic modeling and a statistical network model to build a network among individuals and find interesting communities. We introduce a promising framework that incorporates a topic modeling technique to define the …


Dynamic Equations, Control Problems On Time Scales, And Chaotic Systems, Martin Bohner Mar 2023

Mathematics and Statistics Faculty Research & Creative Works

The unification of integral and differential calculus with the calculus of finite differences has been rendered possible by providing a formal structure to study hybrid discrete-continuous dynamical systems besides offering applications in diverse fields that require simultaneous modeling of discrete and continuous data concerning dynamic equations on time scales. Therefore, the theory of time scales provides a unification of the calculus of the theory of difference equations with that of the theory of differential equations. In addition, it has become possible to examine diverse application problems more precisely by the use of dynamical systems on time scales whose calculus is made up …