Open Access. Powered by Scholars. Published by Universities.®

Physical Sciences and Mathematics Commons


Statistics



Full-Text Articles in Physical Sciences and Mathematics

Student Fact Book, Fall 2014 - Thirty-Eighth Annual Edition, Wright State University, Office Of Student Information Systems, Wright State University Jan 2014


Wright State University Student Fact Books

The student fact book has general demographic information on all students enrolled at Wright State University for Fall Semester, 2014.


Flint International Statistics Conference Announcement, Kettering University Jan 2014


Flint: One City, 100 Years of Variability

CONFERENCE ANNOUNCEMENT POSTER:

Kettering University is organizing this international conference to celebrate the IYS 2013 and the 175th anniversary of the American Statistical Association.

The main focus of this conference will be on STATISTICAL METHODS & STUDIES OF HISTORICAL DATA.

Participants may use any data. Data on Flint, consisting of up to 100 years of demographic, health, labor, census, and crime records, will be summarized and made available to participants. Sessions will include presentations of statistical achievements and perspectives, followed by several talks on current results.


Lack Of Quantitative Training Among Early-Career Ecologists: A Survey Of The Problem And Potential Solutions, F. Barraquand, T. G. Ezard, P. Søgaard Jørgensen, Naupaka B. Zimmerman, S. Chamberlain, R. Salguero-Gómez, T. J. Curran, T. Poisot Jan 2014


Biology Faculty Publications

Proficiency in mathematics and statistics is essential to modern ecological science, yet few studies have assessed the level of quantitative training received by ecologists. To do so, we conducted an online survey. The 937 respondents were mostly early-career scientists who studied biology as undergraduates. We found a clear self-perceived lack of quantitative training: 75% were not satisfied with their understanding of mathematical models; 75% felt that the level of mathematics was “too low” in their ecology classes; 90% wanted more mathematics classes for ecologists; and 95% more statistics classes. Respondents thought that 30% of classes in ecology-related degrees should be …


Building A Predictive Model For Baseball Games, Jordan Robertson Tait Jan 2014


All Graduate Theses, Dissertations, and Other Capstone Projects

In this paper, we will discuss a method of building a predictive model for Major League Baseball games. We detail the reasoning for pursuing the proposed predictive model in terms of social popularity and the complexity of analyzing individual variables. We apply a coarse-grained outlook inspired by Simon DeDeo's work on human social systems, in particular the open-source website Wikipedia [2], by attempting to quantify the influence of winning and losing streaks instead of analyzing individual performance variables. We will discuss initial findings of data collected from the LA Dodgers and Colorado Rockies and apply further statistical analysis to …


Methods For Clustering Mixed Data, Jeanmarie L. Hendrickson Jan 2014


Theses and Dissertations

We give a brief introduction to cluster analysis and then propose and discuss a few methods for clustering mixed data. In particular, a model-based clustering method for mixed data based on Everitt's (1988) work is described, and we use a simulated annealing method to estimate the parameters for Everitt's model. A penalized log likelihood with the simulated annealing method is proposed as a remedy for the parameter estimates being drawn to extremes. Everitt's approach and the proposed method are compared based on their performance in clustering simulated data. We then use the penalized log likelihood method on a heart disease …


Reference Interval Studies: What Is The Maximum Number Of Samples Recommended?, Robert Hawkins, Tony Badrick Sep 2013


Tony Badrick

Background: Little attention has been paid to the maximum number of specimens for reference interval calculation, i.e., the number of specimens beyond which there is no further benefit in reference interval calculation. We present a model for the estimation of the maximum number of specimens for reference interval studies based on setting the 90% confidence interval of the reference limits to be equal to the analyte reporting interval. Methods: Equations describing the bounds on the upper and lower 90% confidence intervals for logarithmically transformed and untransformed data were derived and applied to determine the maximum number of specimens required to …


A Bayesian Approach To Deriving Ages Of Individual Field White Dwarfs, Erin M. O'Malley, Ted Von Hippel, David A. Van Dyk Aug 2013


Dartmouth Scholarship

We apply a self-consistent and robust Bayesian statistical approach to determine the ages, distances, and zero-age main sequence (ZAMS) masses of 28 field DA white dwarfs (WDs) with ages of approximately 4-8 Gyr. Our technique requires only quality optical and near-infrared photometry to derive ages with <15% uncertainties, generally with little sensitivity to our choice of modern initial-final mass relation. We find that age, distance, and ZAMS mass are correlated in a manner that is too complex to be captured by traditional error propagation techniques. We further find that the posterior distributions of age are often asymmetric, indicating that the standard approach to deriving WD ages can yield misleading results.


Net Reclassification Indices For Evaluating Risk Prediction Instruments: A Critical Review, Kathleen F. Kerr, Zheyu Wang, Holly Janes, Robyn McClelland, Bruce M. Psaty, Margaret S. Pepe Aug 2013


UW Biostatistics Working Paper Series

Background: Net Reclassification Indices (NRI) have recently become popular statistics for measuring the prediction increment of new biomarkers.

Methods: In this review, we examine the various types of NRI statistics and their correct interpretations. We evaluate the advantages and disadvantages of the NRI approach. For pre-defined risk categories, we relate NRI to existing measures of the prediction increment. We also consider statistical methodology for constructing confidence intervals for NRI statistics and evaluate the merits of NRI-based hypothesis testing.

Conclusions: Investigators using NRI statistics should report them separately for events (cases) and nonevents (controls). When there are two risk categories, the …
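
As an illustration of reporting NRI separately for events and nonevents, here is a minimal sketch using the standard category-based definitions: the event NRI is the proportion of events moved to a higher risk category minus the proportion moved lower, and the nonevent NRI is the reverse. The function name `category_nri` and the toy data are hypothetical and are not taken from the paper.

```python
def category_nri(old_cat, new_cat, is_event):
    """Return (event NRI, nonevent NRI) for integer-coded risk categories.

    old_cat / new_cat: risk category under the old and new model
    (higher integer = higher risk). is_event: 1 for cases, 0 for controls.
    """
    up_e = down_e = n_e = 0
    up_n = down_n = n_n = 0
    for old, new, event in zip(old_cat, new_cat, is_event):
        if event:
            n_e += 1
            up_e += new > old      # event correctly moved up
            down_e += new < old    # event incorrectly moved down
        else:
            n_n += 1
            up_n += new > old      # nonevent incorrectly moved up
            down_n += new < old    # nonevent correctly moved down
    nri_event = (up_e - down_e) / n_e
    nri_nonevent = (down_n - up_n) / n_n
    return nri_event, nri_nonevent


# Toy data (hypothetical): two risk categories, four events, four nonevents.
old = [0, 0, 1, 1, 0, 1, 0, 1]
new = [1, 1, 1, 0, 0, 0, 1, 0]
event = [1, 1, 1, 1, 0, 0, 0, 0]
nri_event, nri_nonevent = category_nri(old, new, event)
print(nri_event, nri_nonevent)
```

Keeping the two components apart avoids summing quantities that have different denominators, which is one reason separate reporting is recommended.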


Big Data: Immediate Opportunities And Longer Term Challenges, Jens Pohl, Kym Jason Pohl Jul 2013


Collaborative Agent Design (CAD) Research Center

The transformation of words, locations, and human interactions into digital data forms the basis of trend detection and information extraction opportunities that can be automated with the increasing availability of relatively inexpensive computer storage and processing technology. Trend detection, which focuses on what, is facilitated by the ability to apply analytics to an entire corpus of data instead of a random sample. Since the corpus essentially includes all data within a population there is no need to apply any of the precautions that are in order to ensure the representativeness of a sample in traditional statistical analysis. Several examples are …


Examining Middle School Students' Statistical Thinking While Working In A Technological Environment, Melissa Arnold Scranton Jul 2013


Theses and Dissertations


The purpose of this study was to gain a better understanding of how students think in a technological environment. This was accomplished by exploring the differences in the thinking of students while they worked in a technological environment and comparing this to their work in a paper and pencil environment. The software program TinkerPlots: Dynamic Data Exploration (Konold & Miller, 2005), a construction tool that middle school students use for data analysis, was the technological environment. In both environments, types of critical, creative, and statistical …


Examining Introductory Students’ Attitudes In A Randomization-Based Curriculum, Joshua Ryan Beemer Jun 2013


Statistics

Student attitudes regarding introductory statistics courses are not always positive. The purpose of this research is to use the Survey of Attitudes Toward Statistics to evaluate introductory statistics students’ attitudes before and after the course. Furthermore, comparisons of attitudes within different introductory course curricula across institutions will be made. Various components within the survey, such as difficulty, value, and interest, will be assessed in order to determine where students’ attitudes are affected the most and how they are correlated with other variables such as current GPA and curriculum taught. The outcomes for these models look at demographic predictors that …


P-Values Versus Significance Levels, Phillip I. Good May 2013


Journal of Modern Applied Statistical Methods

In this article Phillip Good responds to Richard Anderson's article Conceptual Distinction between the Critical p Value and the Type I Error Rate in Permutation Testing.


Conceptual Distinction Between The Critical P Value And The Type I Error Rate In Permutation Testing: Author Response To Peer Comments, Richard B. Anderson May 2013


Journal of Modern Applied Statistical Methods

Richard Anderson responds to comments regarding his target article Conceptual Distinction between the Critical p Value and the Type I Error Rate in Permutation Testing.


A Response To Anderson's (2013) Conceptual Distinction Between The Critical P Value And Type I Error Rate In Permutation Testing, Fortunato Pesarin, Stefano Bonnini May 2013


Journal of Modern Applied Statistical Methods

Pesarin and Bonnini respond to Anderson's (2013) Conceptual Distinction between the Critical p Value and Type I Error Rate in Permutation Testing.


Conceptual Distinction Between The Critical P Value And The Type I Error Rate In Permutation Testing, Richard B. Anderson May 2013


Journal of Modern Applied Statistical Methods

To counter past assertions that permutation testing is not distribution-free, this article clarifies that the critical p value (alpha) in permutation testing is not a Type I error rate and that a test's validity is independent of the concept of Type I error.
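
For readers unfamiliar with the procedure under discussion, the following is a minimal sketch of a two-sample permutation test in which the critical p value (alpha) is fixed in advance and compared with the permutation p value. It is a generic Monte Carlo version with hypothetical data and function names, not the specific formulation used in the article.

```python
import random


def permutation_p_value(x, y, n_perm=2000, seed=0):
    """Two-sided Monte Carlo permutation p value for a difference in means."""
    rng = random.Random(seed)
    n_x = len(x)
    pooled = list(x) + list(y)
    n_y = len(pooled) - n_x
    observed = abs(sum(x) / n_x - sum(y) / n_y)
    extreme = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)  # random relabelling of the pooled observations
        diff = abs(sum(pooled[:n_x]) / n_x - sum(pooled[n_x:]) / n_y)
        if diff >= observed - 1e-12:  # small tolerance for float rounding
            extreme += 1
    # Add-one correction: the observed labelling counts as one permutation.
    return (extreme + 1) / (n_perm + 1)


alpha = 0.05  # critical p value, chosen before looking at the data
group_a = [12.1, 11.8, 13.4, 12.9, 12.6]  # hypothetical measurements
group_b = [10.2, 10.9, 11.1, 10.4, 10.7]
p = permutation_p_value(group_a, group_b)
reject = p <= alpha
print(p, reject)
```

The reference distribution here is built only from relabellings of the observed data, which is the sense in which the procedure makes no assumption about the population distribution.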


Customer Age As A Predictor Of Contact Volume, Tollan Renner Apr 2013


Honors Theses and Capstones

A two-stage modeling approach was used to assess customer age as a predictor of contact volume, using a real-world data set of approximately 2,000,000 contacts from a company call center. Two models were constructed in the first stage, one a straightforward regression and the other a series of regressions. One was selected as better performing and scaled up to predict calls received from calls answered. The second stage of the modeling included a day-of-the-week covariate and performed the best of the models created. This model uses age bins as model effects, of which the youngest age …


CSC Senior Project: NLPStats, Michael Mease Mar 2013


Computer Science and Software Engineering

Natural Language Processing has recently increased in popularity. The field of authorship analysis, specifically, uses various characteristics of text quantified by markers. NLPStats serves as a tool designed to streamline marker extraction based on user needs. A flexible query system allows for custom marker requests, adjustment of result formatting, and preprocessing options. Furthermore, an efficiently designed structure ensures that users retrieve information quickly. As a whole, NLPStats enables anyone, regardless of NLP experience, to extract important information about the text of a document.


Views On Sexual Assault Among IFC Fraternities, Steven Legore Mar 2013


Statistics

The data collection and analysis for this project were performed for a consulting client, Cierra, a fourth-year Sociology major who works at the Safer Office on campus. She went to the consulting center on campus for help with the analysis of her project. She wanted to survey the IFC fraternities at Cal Poly on their views on sexual assault and rape. Thirteen IFC fraternities were surveyed, with a total of 488 respondents. The responses to the 30-question True/False survey were used to evaluate the respondents’ empathy towards women, hostility towards women, and sexual aggression. Another research interest was …


Clustering Revisited: A Spectral Analysis Of Microseismic Events, Deborah Fagan, Kasper Van Wijk, James Rutledge Mar 2013


CGISS Publications and Presentations

Identifying individual subsurface faults in a larger fault system is important to characterize and understand the relationship between microseismicity and subsurface processes. This information can potentially help drive reservoir management and mitigate the risks of natural or induced seismicity. We have evaluated a method of statistically clustering power spectra from microseismic events associated with an enhanced oil recovery operation in southeast Utah. Specifically, we were able to provide a clear distinction within a set of events originally designated in the time domain as a single cluster and to identify evidence of en echelon faulting. Subtle time-domain differences between events were …


Spatial Statistics In The Presence Of Location Error With An Application To Remote Sensing Of The Environment, Noel A. Cressie, John Kornak Feb 2013


Professor Noel Cressie

Techniques for the analysis of spatial data have, to date, tended to ignore any effect caused by error in specifying the spatial locations at which measurements are recorded. This paper reviews the methods for adjusting spatial inference in the presence of data-location error, particularly for data that have a continuous spatial index (geostatistical data). New kriging equations are developed and evaluated based on a simulation experiment. They are also applied to remote-sensing data from the Total Ozone Mapping Spectrometer instrument on the Nimbus-7 satellite, where the location error is caused by assignment of the data to their nearest grid-cell centers. …


Size And Power Considerations For Testing Loglinear Models Using Divergence Test Statistics, Noel A. Cressie, L Pardo, M Del Carmen Pardo Feb 2013


Professor Noel Cressie

In this article, we assume that categorical data are distributed according to a multinomial distribution whose probabilities follow a loglinear model. The inference problem we consider is that of hypothesis testing in a loglinear-model setting. The null hypothesis is a composite hypothesis nested within the alternative. Test statistics are chosen from the general class of divergence statistics. This article collects together the operating characteristics of the hypothesis test based on both asymptotic (using large-sample theory) and finite-sample (using a designed simulation study) results. Members of the class of power divergence statistics are compared, and it is found that the Cressie-Read …
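
To make the class of statistics concrete, the sketch below computes the power divergence statistic from its standard textbook definition, 2 / (λ(λ + 1)) · Σ Oᵢ[(Oᵢ/Eᵢ)^λ − 1], for observed counts Oᵢ and expected counts Eᵢ with equal totals. Setting λ = 1 gives Pearson's chi-square, the limit λ → 0 gives the likelihood-ratio statistic, and λ = 2/3 is the value usually called the Cressie-Read statistic. The counts and the function name are hypothetical, and the sketch does not reproduce the loglinear-model fitting or the simulation study of the article.

```python
import math


def power_divergence(observed, expected, lam):
    """Power divergence statistic for counts with equal totals.

    Assumes lam != -1 and all expected counts > 0; lam == 0 is handled
    as the likelihood-ratio (G^2) limit.
    """
    if lam == 0:
        return 2 * sum(o * math.log(o / e)
                       for o, e in zip(observed, expected) if o > 0)
    total = sum(o * ((o / e) ** lam - 1) for o, e in zip(observed, expected))
    return 2 / (lam * (lam + 1)) * total


# Hypothetical counts: four cells, uniform null with the same total (100).
observed = [18, 22, 30, 30]
expected = [25, 25, 25, 25]

pearson = power_divergence(observed, expected, 1)           # Pearson chi-square
g_squared = power_divergence(observed, expected, 0)         # likelihood ratio
cressie_read = power_divergence(observed, expected, 2 / 3)  # lambda = 2/3
print(pearson, g_squared, cressie_read)
```

For a production analysis, `scipy.stats.power_divergence` offers the same family through its `lambda_` argument; the hand-written version is shown only to expose the formula.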


Data Mining Of MISR Aerosol Product Using Spatial Statistics, Tao Shi, Noel A. Cressie Feb 2013


Professor Noel Cressie

In climate models, aerosol forcing is the major source of uncertainty in climate forcing over the industrial period. To reduce this uncertainty, instruments on satellites have been put in place to collect global data. However, missing and noisy observations impose considerable difficulties for scientists researching global aerosol distribution, aerosol transportation, and comparisons between satellite observations and global-climate-model outputs. In this paper, we propose a Spatial Mixed Effects (SME) statistical model to predict the missing values, denoise the observed values, and quantify the spatial-prediction uncertainties. The computations associated with the SME model are linearly scalable to the number of data points, …


Estimation Of HIV Incidence Using Multiple Biomarkers, Ron Brookmeyer, Jacob Konikoff, Oliver Laeyendecker, Susan Eshleman Jan 2013


Ron Brookmeyer

The incidence of human immunodeficiency virus (HIV) is the rate at which new HIV infections occur in populations. The development of accurate, practical, and cost-effective approaches to estimation of HIV incidence is a priority among researchers in HIV surveillance because of limitations with existing methods. In this paper, we develop methods for estimating HIV incidence rates using multiple biomarkers in biological samples collected from a cross-sectional survey. An advantage of the method is that it does not require longitudinal follow-up of individuals. We use assays for BED, avidity, viral load, and CD4 cell count data from clade B samples collected …


Investigation Of A Pregnancy Lifestyle Intervention Using Mediation Analysis And A Power Analysis Simulation, Kelsey Grantham Jan 2013


Statistics

No abstract provided.


Descriptive Statistical Attributes Of Special Education Datasets, Valerie Felder Jan 2013


Wayne State University Dissertations

Descriptive Statistical Attributes of Special Education Data Sets, by Valerie Felder. December 2013. Advisor: Dr. Shlomo Sawilowsky. Major: Educational Evaluation and Research. Degree: Doctor of Philosophy.

Micceri (1989) examined the distributional characteristics of 440 large-sample achievement and psychometric measures. All the distributions were found to be nonnormal at alpha = .01. Micceri indicated three factors that might contribute to a non-Gaussian error distribution in the population. The first factor is subpopulations within a target population. The second factor is ceiling effects and the third factor is treatment effects that may change the location parameter, variability, or shape of the …


Student Fact Book, Fall 2013, Thirty-Seventh Annual Edition, Wright State University, Office Of Student Information Systems, Wright State University Jan 2013


Wright State University Student Fact Books

The student fact book has general demographic information on all students enrolled at Wright State University for Fall Semester, 2013.


Raman Spectroscopy For The Identification Of Body Fluid Traces: Mixtures And Contaminations, Race And Gender Differentiation, Aliaksandra Sikirzhytskaya Jan 2013


Legacy Theses & Dissertations (2009 - 2024)

Body fluid traces are an important type of forensic evidence; they play a significant role in the reconstruction of a violent crime and often help to identify a victim or suspect based on DNA analysis. Despite a great need, there is no single method that can identify multiple body fluids. In addition, the majority of current methods are destructive to the evidence. For about eight years, our laboratory has been working on the development of a new nondestructive method for identification of body fluid traces based on Raman microspectroscopy combined with advanced statistics. High differentiation power of the method has …


The Complete Plus-Minus: A Case Study Of The Columbus Blue Jackets, Nathan Spagnola Jan 2013


Theses and Dissertations

A new hockey statistic termed the Complete Plus-Minus (CPM) was created to calculate the abilities of hockey players in the National Hockey League (NHL). This new statistic was used to analyze the Columbus Blue Jackets for the 2011-2012 season. The CPM for the Blue Jackets was created using two logistic regressions that modeled a goal being scored for and against the Blue Jackets. The responses were whether a goal was scored for or against the team, while events on the ice were the predictors in the model. It was found that the team's poor performance was due to a weak …


Revising Common Core Georgia Performance Standards Statistics Lesson Plans To Better Align With Statistical Practice, Rachel Bonilla Jan 2013


Electronic Theses and Dissertations

In this thesis, lesson plans provided by the Georgia Department of Education are revised to give students better exposure to, and practice working with, real-life data. Three learning tasks and a performance task are presented covering a unit lesson on statistical regression. The development of Georgia statistics curriculum standards is reviewed and presented.


A Method For Generating Realistic Correlation Matrices, Johanna S. Hardin, Stephan Ramon Garcia, David Golan Jan 2013


Pomona Faculty Publications and Research

Simulating sample correlation matrices is important in many areas of statistics. Approaches such as generating Gaussian data and finding their sample correlation matrix or generating random uniform $[-1,1]$ deviates as pairwise correlations both have drawbacks. We develop an algorithm for adding noise, in a highly controlled manner, to general correlation matrices. In many instances, our method yields results which are superior to those obtained by simply simulating Gaussian data. Moreover, we demonstrate how our general algorithm can be tailored to a number of different correlation models. Using our results with a few different applications, we show that simulating correlation matrices …
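
For context, the first baseline the abstract mentions, generating Gaussian data and taking their sample correlation matrix, can be sketched as follows. This is only that baseline, which the abstract describes as having drawbacks; it is not the authors' noise-addition algorithm. The function name and parameters are hypothetical, and the variables are drawn independently, so the population correlation matrix is assumed to be the identity.

```python
import math
import random


def sample_correlation_matrix(n_obs, n_vars, seed=0):
    """Sample correlation matrix of independent standard Gaussian data."""
    rng = random.Random(seed)
    data = [[rng.gauss(0.0, 1.0) for _ in range(n_vars)]
            for _ in range(n_obs)]
    # Center each column, then scale by its Euclidean norm.
    means = [sum(row[j] for row in data) / n_obs for j in range(n_vars)]
    centered = [[row[j] - means[j] for j in range(n_vars)] for row in data]
    norms = [math.sqrt(sum(row[j] ** 2 for row in centered))
             for j in range(n_vars)]
    return [[sum(row[i] * row[j] for row in centered) / (norms[i] * norms[j])
             for j in range(n_vars)]
            for i in range(n_vars)]


corr = sample_correlation_matrix(n_obs=50, n_vars=4)
for row in corr:
    print(["%.3f" % value for value in row])
```

With this approach the size of the off-diagonal noise is governed only indirectly, through the number of observations, which may be part of why a more controlled method is attractive.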