Open Access. Powered by Scholars. Published by Universities.®

Physical Sciences and Mathematics Commons

Statistics

Articles 181 - 210 of 685

Full-Text Articles in Physical Sciences and Mathematics

Samples, Unite! Understanding The Effects Of Matching Errors On Estimation Of Total When Combining Data Sources, Benjamin Williams May 2019

Statistical Science Theses and Dissertations

Much recent research has focused on methods for combining a probability sample with a non-probability sample to improve estimation by making use of information from both sources. If units exist in both samples, it becomes necessary to link the information from the two samples for these units. Record linkage is a technique to link records from two lists that refer to the same unit but lack a unique identifier across both lists. Record linkage assigns a probability to each potential pair of records from the lists so that principled matching decisions can be made. Because record linkage is a probabilistic …


A Quantitative Assessment Of The Diabetes Self-Management Education Program, Grace Mcfarlane May 2019

Scholars Week

A Diabetes Self-Management Education (DSME) program offered in an inner-city health center run by the Cincinnati Health Department, which started in 2014, was created to help those in an underserved population learn how to manage their diabetes. Two key measurements, A1C (glycated hemoglobin) and BMI (body mass index), were taken over time to monitor their progress. In this study, we analyzed quantitatively whether there was a significant improvement in their BMI and A1C values over the two years since they joined the DSME program, as any improvement would imply a potentially healthier lifestyle with regard to their …


Market Research On Student Concert Attendance At Bgsu's College Of Musical Arts, Mary Solomon May 2019

Honors Projects

Bowling Green State University boasts a well-established College of Musical Arts which holds concerts performed by esteemed faculty, prestigious guest artists, and students. The school hosts these events in Kobacker Hall and Bryan Recital Hall, which can accommodate up to 800 and 250 audience members, respectively. However, performances in Kobacker Hall fill only one-fourth of the 800 seats, on average. Why is this so? This project aims to investigate the factors that influence students’ decisions to attend concerts at the College of Musical Arts (CMA). Using survey research and statistical analysis, this project will look into …


Bias Reduction In Machine Learning Classifiers For Spatiotemporal Analysis Of Coral Reefs Using Remote Sensing Images, Justin J. Gapper May 2019

Computational and Data Sciences (PhD) Dissertations

This dissertation is an evaluation of the generalization characteristics of machine learning classifiers as applied to the detection of coral reefs using remote sensing images. Three scientific studies have been conducted as part of this research: 1) Evaluation of Spatial Generalization Characteristics of a Robust Classifier as Applied to Coral Reef Habitats in Remote Islands of the Pacific Ocean 2) Coral Reef Change Detection in Remote Pacific Islands using Support Vector Machine Classifiers 3) A Generalized Machine Learning Classifier for Spatiotemporal Analysis of Coral Reefs in the Red Sea. The aim of this dissertation is to propose and evaluate a …


The Reproducibility Crisis In Scientific Research, Sarah Eline May 2019

Senior Honors Projects, 2010-2019

Following the push for evidence-based practice came a huge proliferation of research journals and journal articles. With this increase in quantity came an increased concern about the quality of the articles being published, which led to a multifield investigation regarding the reproducibility of scientific research. With studies in the fields of psychology and biomedicine only reaching approximately a 30% reproducibility rate, a conversation has been sparked that spans every field of research. Upon further investigation, various causes for this reproducibility crisis have surfaced, which include lack of data sharing/transparency, statistical errors, funding corruption, and the culture surrounding …


Advanced Statistics In Arkansas Sports Reporting, Andrew Lee Epperson May 2019

Graduate Theses and Dissertations

This study seeks to analyze how Arkansas’ sports journalists are adapting to the recent surge in available advanced statistics that are being used by certain national news organizations. Using qualitative research that includes in-depth interviews with a number of individuals in the print, broadcast, and athletics sides of sports coverage, we discover how journalists and coaches use these next-generation analytics, what they fundamentally mean for the evolution of each respective path, and why so few Arkansas reporters and writers use them at the time of this paper’s defense. We see how budgets and deadlines restrict the use of these …


A Self-Contained Course In The Mathematical Theory Of Statistics For Scientists & Engineers With An Emphasis On Predictive Regression Modeling & Financial Applications., Tim Smith Apr 2019

Timothy Smith

Preface & Acknowledgments

This textbook is designed for a higher-level undergraduate, perhaps even first-year graduate, course for engineering or science students who are interested in learning to use data analysis to build predictive models. While no prerequisite statistical knowledge is required to read this book, because the study is designed for the reader to truly understand the underlying theory rather than just learn how to read computer output, it would be best read with some familiarity with elementary statistics. The book is self-contained and the only true prerequisite knowledge is a solid …


The Evolution Of Data Science: A New Mode Of Knowledge Production, Jennifer Lewis Priestley, Robert J. Mcgrath Apr 2019

Faculty Articles

Is data science a new field of study or simply an extension or specialization of a discipline that already exists, such as statistics, computer science, or mathematics? This article explores the evolution of data science as a potentially new academic discipline, which has evolved as a function of new problem sets that established disciplines have been ill-prepared to address. The authors find that this newly-evolved discipline can be viewed through the lens of a new mode of knowledge production and is characterized by transdisciplinarity collaboration with the private sector and increased accountability. Lessons from this evolution can inform knowledge production …


Do Men Matter? In Statistics, Probably, Michael Kelly Apr 2019

WWU Honors College Senior Projects

In statistical genetics, there are several parameters of a dataset which a researcher might want to know, but which are difficult to estimate in practice. In this paper, we will be focusing on allele frequencies, null alleles, inbreeding coefficients and, to a certain extent, beta values. A common technique for obtaining these values, developed by Amy Anderson and her co-workers, is to jointly estimate all of them using an EM-algorithm and the method of maximum likelihood. Despite this technique being effective in general, it is currently unable to deal with males at X-linked markers. The purpose of this project is to modify the …


Sensitivity Analyses For Tumor Growth Models, Ruchini Dilinika Mendis Apr 2019

Masters Theses & Specialist Projects

This study consists of the sensitivity analysis for two previously developed tumor growth models: the Gompertz model and the quotient model. The two models are considered in both continuous and discrete time. In continuous time, model parameters are estimated using the least-squares method, while in discrete time, the partial-sum method is used. Moreover, frequentist and Bayesian methods are used to construct confidence intervals and credible intervals for the model parameters. We apply Markov Chain Monte Carlo (MCMC) techniques with the Random Walk Metropolis algorithm with a non-informative prior and the Delayed Rejection Adaptive Metropolis (DRAM) algorithm to construct parameters' posterior distributions and then …


Daily And Seasonal Variability Of Offshore Wind Power On The Central California Coast And Statewide Demand, Matthew Douglas Kehrli Apr 2019

Physics

No abstract provided.


Reporting Number Needed To Treat In Clinical Trials Published In Physical Therapy Specific Literature 1989 - 2018, Susan Ann Talley Jan 2019

Wayne State University Dissertations

Evidence-based practice requires physical therapists to make clinical decisions about the best intervention to use when providing services to patients/clients. Although null hypothesis significance testing (NHST) is frequently used to interpret the outcome of a clinical trial investigating the comparative effectiveness of an intervention, statistical significance does not directly translate into clinical importance. Number needed to treat (NNT) is a measure of effect size (ES) that may be particularly useful when translating the results from clinical trials to PT clinical practice. The purpose of this study was to conduct a bibliometric content analysis of the methods of reporting research results …
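As a brief illustration of the effect size named in this abstract: NNT is the reciprocal of the absolute risk reduction (ARR), the difference between the event risk in the control group and in the treated group. The sketch below is not taken from the dissertation; the function name and the trial counts are hypothetical, and exact fractions are used only so that the conventional round-up to a whole patient is not disturbed by floating-point error.

```python
import math
from fractions import Fraction


def number_needed_to_treat(control_events, control_n, treated_events, treated_n):
    """Return NNT = 1 / ARR as an exact fraction.

    ARR (absolute risk reduction) = control risk - treated risk.
    A negative result means the treated group fared worse.
    """
    control_risk = Fraction(control_events, control_n)
    treated_risk = Fraction(treated_events, treated_n)
    arr = control_risk - treated_risk
    if arr == 0:
        raise ValueError("no risk difference between groups; NNT is undefined")
    return 1 / arr


# Hypothetical trial: 30 of 100 poor outcomes under control, 20 of 100 under treatment.
nnt = number_needed_to_treat(30, 100, 20, 100)
# NNT is conventionally reported rounded up to the next whole patient.
nnt_reported = math.ceil(nnt)
```

With these made-up counts the ARR is 1/10, so about ten patients would need to be treated for one additional patient to benefit.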


Comparative Analysis Of Students’ Performance Between Online And On Campus In An Introductory Statistics Course, Kendal Mcdonald Jan 2019

The Corinthian

In this research, we compare students’ performance in an online and on-campus introductory statistics and probability course at Georgia College. MyStatLab is the learning management system used in both the online and on-campus courses for homework and quizzes. The online data come from five summer courses from Summer 2014 to Summer 2017, and the on-campus data come from nine on-campus courses from Spring 2014, Spring 2016, and Spring 2017. For homework, the research compares the scores between online and on-campus. For quizzes, we test if there is a difference between the scores and the number of attempts …


Conceptualizing And Interpreting Mean And Median With Future Teachers, Eryn Stehr Maher, Ha Nguyen, Gregory Chamblee, Sharon Taylor Jan 2019

Department of Mathematical Sciences Faculty Publications

Mathematical Education of Teachers II (METII), echoed by the American Statistical Association publication, Statistical Education of Teachers, recommended teacher preparation programs support future teachers in developing deep understandings of mean and median, such that middle grades teachers may use them to “summarize, describe, and compare distributions” (Conference Board of Mathematical Sciences, 2012, p. 44; Franklin et al., 2015). Georgia Standards of Excellence require statistical reasoning from students beginning as early as 6-7 years old, including interpretation of measures of center and statistical reasoning about best measures of center (Georgia Department of Education, 2015). This level of understanding and interpretation of …


Investigating The Factors That Best Describe Student Experience And Performance In College, Abigale Wynn Jan 2019

Undergraduate Honors Thesis Collection

The National Survey of Student Engagement (NSSE) surveys students at four-year institutions around the United States in order to offer Universities accessible ways to evaluate their students' experiences and performance. The NSSE data is collected in the form of a Likert-scale survey geared towards first year and senior year students. It asks questions about how they spend their time throughout the academic year and how they rate their experience. This thesis looks at the NSSE survey data from Butler University in 2016 and attempts to apply classification techniques and predictive models to draw conclusions about student performance. Methods such as …


Estimation And Variable Selection In High-Dimensional Settings With Mismeasured Observations, Michael Byrd Jan 2019

Statistical Science Theses and Dissertations

Understanding high-dimensional data has become essential for practitioners across many disciplines. The general increase in ability to collect large amounts of data has prompted statistical methods to adapt for the rising number of possible relationships to be uncovered. The key to this adaptation has been the notion of sparse models, or, rather, models where most relationships between variables are assumed to be negligible at best. Driving these sparse models have been constraints on the solution set, yielding regularization penalties imposed on the optimization procedure. While these penalties have found great success, they are typically formulated with strong assumptions on the …


Toward Collaborative Open Data Science In Metabolomics Using Jupyter Notebooks And Cloud Computing, Kevin M. Mendez, Leighton Pritchard, Stacey N. Reinke, David I. Broadhurst Jan 2019

Research outputs 2014 to 2021

Background

A lack of transparency and reporting standards in the scientific community has led to increasing and widespread concerns relating to reproduction and integrity of results. As an omics science, which generates vast amounts of data and relies heavily on data science for deriving biological meaning, metabolomics is highly vulnerable to irreproducibility. The metabolomics community has made substantial efforts to align with FAIR data standards by promoting open data formats, data repositories, online spectral libraries, and metabolite databases. Open data analysis platforms also exist; however, they tend to be inflexible and rely on the user to adequately report their methods …


The Effect Of Using A Project-Based Learning (Pbl) Approach To Improve Engineering Students' Understanding Of Statistics, Fionnuala Farrell, Michael Carr Jan 2019

Articles

Over the last number of years we have gradually been introducing a project-based learning approach to the teaching of engineering mathematics in Dublin Institute of Technology. Several projects are now in existence for the teaching of both second-order differential equations and first-order differential equations. We intend to incrementally extend this approach across more of the engineering mathematics curriculum. As part of this ongoing process, practical real-world projects in statistics were incorporated into a second-year ordinary degree mathematics module. This paper provides an overview of these projects and their implementation. As a means to measure the success of this initiative, we …


A Self-Contained Course In The Mathematical Theory Of Statistics For Scientists & Engineers With An Emphasis On Predictive Regression Modeling & Financial Applications., Tim Smith Jan 2019

Open Access Textbooks

Preface & Acknowledgments

This textbook is designed for a higher-level undergraduate, perhaps even first-year graduate, course for engineering or science students who are interested in learning to use data analysis to build predictive models. While no prerequisite statistical knowledge is required to read this book, because the study is designed for the reader to truly understand the underlying theory rather than just learn how to read computer output, it would be best read with some familiarity with elementary statistics. The book is self-contained and the only true prerequisite knowledge is a solid …


Conceptualizing And Interpreting Mean And Median With Future Teachers, Eryn M. Stehr, Ha Nguyen, Gregory Chamblee, Sharon Taylor Jan 2019

Proceedings of the Annual Meeting of the Georgia Association of Mathematics Teacher Educators

Mathematical Education of Teachers II (METII), echoed by the American Statistical Association publication, Statistical Education of Teachers, recommended teacher preparation programs support future teachers in developing deep understandings of mean and median, such that middle grades teachers may use them to “summarize, describe, and compare distributions” (Conference Board of Mathematical Sciences, 2012, p. 44; Franklin et al., 2015). Georgia Standards of Excellence require statistical reasoning from students beginning as early as 6-7 years old, including interpretation of measures of center and statistical reasoning about best measures of center (Georgia Department of Education, 2015). This level of understanding and interpretation of …


Optical Vortex And Poincaré Analysis For Biophysical Dynamics, Anindya Majumdar Jan 2019

Dissertations, Master's Theses and Master's Reports

Coherent light - such as that from a laser - on interaction with biological tissues, undergoes scattering. This scattered light undergoes interference and the resultant field has randomly added phases and amplitudes. This random interference pattern is known as speckles, and has been the subject of multiple applications, including imaging techniques. These speckle fields inherently contain optical vortices, or phase singularities. These are locations where the intensity (or amplitude) of the interference pattern is zero, and the phase is undefined.

In the research presented in this dissertation, dynamic speckle patterns were obtained through computer simulations as well as laboratory setups …


Modeling Stochastically Intransitive Relationships In Paired Comparison Data, Ryan Patrick Alexander Mcshane Jan 2019

Statistical Science Theses and Dissertations

If the Warriors beat the Rockets and the Rockets beat the Spurs, does that mean that the Warriors are better than the Spurs? Sophisticated fans would argue that the Warriors are better by the transitive property, but could Spurs fans make a legitimate argument that their team is better despite this chain of evidence?

We first explore the nature of intransitive (rock-scissors-paper) relationships with a graph theoretic approach to the method of paired comparisons framework popularized by Kendall and Smith (1940). Then, we focus on the setting where all pairs of items, teams, players, or objects have been compared to …
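To make the idea of an intransitive (rock-scissors-paper) relationship concrete, the sketch below counts circular triads in a set of paired comparisons, that is, triples where A beats B, B beats C, and C beats A. It is a brute-force illustration under simple assumptions (every pair compared once, no ties), not the dissertation's method; the function name and the team results are hypothetical.

```python
from itertools import combinations


def count_circular_triads(beats):
    """Count triples {a, b, c} whose results form a cycle.

    beats maps each item to the set of items it defeated.
    Assumes each pair was compared once and there are no ties.
    """
    count = 0
    for a, b, c in combinations(sorted(beats), 3):
        cycle_one = b in beats[a] and c in beats[b] and a in beats[c]
        cycle_two = c in beats[a] and b in beats[c] and a in beats[b]
        if cycle_one or cycle_two:
            count += 1
    return count


# Hypothetical transitive results: Warriors > Rockets > Spurs, and Warriors > Spurs.
transitive = {
    "Warriors": {"Rockets", "Spurs"},
    "Rockets": {"Spurs"},
    "Spurs": set(),
}

# Hypothetical intransitive results: the Spurs beat the Warriors, closing the cycle.
intransitive = {
    "Warriors": {"Rockets"},
    "Rockets": {"Spurs"},
    "Spurs": {"Warriors"},
}

transitive_cycles = count_circular_triads(transitive)
intransitive_cycles = count_circular_triads(intransitive)
```

A count of zero means the observed results admit a consistent ranking; each circular triad is a triple for which no ranking agrees with all three outcomes.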


Basketball Charts, Kevin Lewis Jan 2019

Williams Honors College, Honors Research Projects

The purpose of this project was to develop an interactive web application with access to a self-updating database of basketball statistics. This data would then be used to allow users to generate informative visuals about specific sets of players. Obtaining statistics from the National Basketball Association (NBA) for the 2018-19 season was the original target goal. By utilizing an open source and community driven API, this goal was successfully achieved. With the data in place, the development of the chart building tool that was intended to be the primary functionality of the web application could begin. Highcharts was used as …


Automated Trading Systems Statistical And Machine Learning Methods And Hardware Implementation: A Survey, Boming Huang, Yuziang Huan, Li Da Xu, Lirong Zheng, Zhuo Zou Jan 2019

Information Technology & Decision Sciences Faculty Publications

Automated trading, which is also known as algorithmic trading, is a method of using a predesigned computer program to submit a large number of trading orders to an exchange. It is substantially a real-time decision-making system which is under the scope of Enterprise Information System (EIS). With the rapid development of telecommunication and computer technology, the mechanisms underlying automated trading systems have become increasingly diversified. Considerable effort has been exerted by both academia and trading firms towards mining potential factors that may generate significantly higher profits. In this paper, we review studies on trading systems built using various methods and …


Cramer Type Moderate Deviations For Random Fields And Mutual Information Estimation For Mixed-Pair Random Variables, Aleksandr Beknazaryan Jan 2019

Electronic Theses and Dissertations

In this dissertation we first study Cramer type moderate deviation for partial sums of random fields by applying the conjugate method. In 1938 Cramer published his results on large deviations of sums of i.i.d. random variables after which a lot of research has been done on establishing Cramer type moderate and large deviation theorems for different types of random variables and for various statistics. In particular results have been obtained for independent non-identically distributed random variables for the sum of independent random to estimate the mutual information between two random variables. The estimates enjoy a central limit theorem under some …


Utilizing Multi-Level Classification Techniques To Predict Adverse Drug Effects And Reactions, Victoria Puhl Jan 2019

Undergraduate Honors Thesis Collection

Multi-class classification models are used to predict categorical response variables with more than two possible outcomes. A collection of multi-class classification techniques such as Multinomial Logistic Regression, Naïve Bayes, and Support Vector Machine is used in predicting patients’ drug reactions and adverse drug effects based on patients’ demographics and drug administration. The newly released 2018 data on drug reactions and adverse drug effects from the U.S. Food and Drug Administration are tested with the models. The applicability of model evaluation measures such as sensitivity, specificity and prediction accuracy in multi-class settings is also discussed.
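One common way to carry sensitivity and specificity over to a multi-class setting is one-vs-rest: each class in turn is treated as "positive" and all others as "negative". The sketch below illustrates that convention on a made-up 3-class confusion matrix; it is an assumption about how such measures might be computed, not the thesis's procedure, and the function name and counts are hypothetical.

```python
def one_vs_rest_metrics(confusion):
    """Per-class sensitivity and specificity plus overall accuracy.

    confusion[i][j] is the number of cases of true class i predicted as class j.
    Assumes every class has at least one true case and one true non-case.
    """
    k = len(confusion)
    total = sum(sum(row) for row in confusion)
    per_class = []
    for c in range(k):
        tp = confusion[c][c]
        fn = sum(confusion[c]) - tp                        # true c, predicted other
        fp = sum(confusion[r][c] for r in range(k)) - tp   # true other, predicted c
        tn = total - tp - fn - fp
        per_class.append({
            "sensitivity": tp / (tp + fn),
            "specificity": tn / (tn + fp),
        })
    accuracy = sum(confusion[c][c] for c in range(k)) / total
    return per_class, accuracy


# Hypothetical counts for three reaction categories (rows: true, columns: predicted).
confusion = [
    [8, 1, 1],
    [2, 6, 2],
    [0, 2, 8],
]
per_class, accuracy = one_vs_rest_metrics(confusion)
```

Overall accuracy is a single number, whereas sensitivity and specificity are reported once per class, which is why their interpretation needs more care when there are more than two outcomes.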


Bayesian Hierarchical Meta-Analysis Of Asymptomatic Ebola Seroprevalence, Peter Brody-Moore Jan 2019

CMC Senior Theses

The continued study of asymptomatic Ebolavirus infection is necessary to develop a more complete understanding of Ebola transmission dynamics. This paper conducts a meta-analysis of eight studies that measure seroprevalence (the number of subjects that test positive for anti-Ebolavirus antibodies in their blood) in subjects with household exposure or known case-contact with Ebola, but that have shown no symptoms. In our two random effects Bayesian hierarchical models, we find estimated seroprevalences of 8.76% and 9.72%, significantly higher than the 3.3% found by a previous meta-analysis of these eight studies. We also produce a variation of this meta-analysis where we exclude …


Re-Describing Surface Roughness, Vincent Wagner Dec 2018

Essential Studies UNDergraduate Showcase

The purpose of this project is to explore a non-traditional method of identifying and describing variance in data. The original goal was to provide a more useful description of surface roughness for use in calculating pressure loss due to pipe friction in the oil and gas industry. This approach uses simple trigonometric calculations to capture more information about the point-to-point variance of a given data set, as well as information related to the ratio of measured length vs total contact length. This method utilizes steps similar to the bootstrap method in statistics; however, rather than sampling a data …


Rfviz: An Interactive Visualization Package For Random Forests In R, Christopher Beckett Dec 2018

All Graduate Plan B and other Reports, Spring 1920 to Spring 2023

Random forests are very popular tools for predictive analysis and data science. They work for both classification (where there is a categorical response variable) and regression (where the response is continuous). Random forests provide proximities, and both local and global measures of variable importance. However, these quantities require special tools to be effectively used to interpret the forest. Rfviz is a sophisticated interactive visualization package and toolkit in R, specially designed for interpreting the results of a random forest in a user-friendly way. Rfviz uses a recently developed R package (loon) from the Comprehensive R Archive Network (CRAN) to create …


Comparing Performance Of Gene Set Test Methods Using Biologically Relevant Simulated Data, Richard M. Lambert Dec 2018

All Graduate Theses and Dissertations, Spring 1920 to Summer 2023

Today we know that there are many genetically driven diseases and health conditions. These problems often manifest only when a set of genes are either active or inactive. Recent technology allows us to measure the activity level of genes in cells, which we call gene expression. It is of great interest to society to be able to statistically compare the gene expression of a large number of genes between two or more groups. For example, we may want to compare the gene expression of a group of cancer patients with a group of non-cancer patients to better understand the genetic …