Open Access. Powered by Scholars. Published by Universities.®

Physical Sciences and Mathematics Commons

Machine Learning

Articles 301 - 330 of 826

Full-Text Articles in Physical Sciences and Mathematics

Analyzing Behavioral Adaptation To Covid-19 And Return To Pre-Pandemic Baselines In A Cohort Of College Seniors, Vlado Vojdanovski Jan 2022

Computer Science Senior Theses

As the critical phase of the COVID-19 pandemic seems to be winding down, it is important to analyze how various populations adjusted to COVID-19 and returned to normalcy. In this study we focus on the behavioral adjustments exhibited by a cohort of N=114 college seniors. To infer COVID-19 adjustment, we compare 2021 (the second year of COVID-19) to 2020 (the first year of COVID-19) and 2019 (the pre-pandemic baseline year). We begin with a broad comparison of the second and first COVID-19 years, finding that the second year of COVID-19 shows significant returns to pre-pandemic baselines on multiple …


Learning Robot Motion From Creative Human Demonstration, Charles C. Dietzel Jan 2022

Theses and Dissertations

This thesis presents a learning from demonstration framework that enables a robot to learn and perform creative motions from human demonstrations in real time. To satisfy all of the functional requirements for the framework, the developed technique comprises two modular components that integrate to provide the desired functionality. The first component, called Dancing from Demonstration (DfD), is a kinesthetic learning from demonstration technique. This technique is capable of playing back newly learned motions in real time, as well as combining multiple learned motions in a configurable way, either to reduce trajectory error or to generate entirely …


Novel Natural Language Processing Models For Medical Terms And Symptoms Detection In Twitter, Farahnaz Golrooy Motlagh Jan 2022

Browse all Theses and Dissertations

This dissertation focuses on disambiguation of language use on Twitter about drug use, consumption types of drugs, drug legalization, ontology-enhanced approaches, and data-driven prediction analysis by developing novel NLP models. Three technical aims comprise this work: (a) leveraging pattern recognition techniques to improve the quality and quantity of crawled Twitter posts related to drug abuse; (b) using an expert-curated, domain-specific DsOn ontology model that improves knowledge extraction in the form of drug-to-symptom and drug-to-side effect relations; and (c) modeling the prediction of public perception of drug legalization and the sentiment analysis of drug consumption on Twitter. We collected …


Deep Understanding Of Technical Documents : Automated Generation Of Pseudocode From Digital Diagrams & Analysis/Synthesis Of Mathematical Formulas, Nikolaos Gkorgkolis Jan 2022

Browse all Theses and Dissertations

The technical document is an entity that consists of several essential and interconnected parts, often referred to as modalities. Despite the extensive attention that certain parts have already received, such as the textual information, several aspects remain severely under-researched. Two such modalities are the utility of diagram images and the deep automated understanding of mathematical formulas. Inspired by existing holistic approaches to the deep understanding of technical documents, we develop a novel formal scheme for the modelling of digital diagram images. This extends to a generative framework that allows for the creation of artificial images and their …


Hydrocarbon Pay Zone Prediction Using AI Neural Network Modeling, Darren D. Guedon Jan 2022

Graduate Theses, Dissertations, and Problem Reports

This paper captures the ability of AI neural network technology to analyze petrophysical datasets for pattern recognition and accurate prediction of the pay zone of a vertical well from the Santa Fe field in Kansas.

During this project, data from 10 completed wells in the Santa Fe field were gathered, resulting in a dataset with 25,580 records, ten predictors (log data), and a single binary output (Yes or No) identifying the presence of hydrocarbons over a half-foot depth segment in the well. Several models composed of different predictor combinations were also tested to determine how impactful some logs …


Predicting Outcomes Of El Clásico Using Random Forests And Extreme Gradient Boosting, Emanuel Jarquin Jan 2022

CMC Senior Theses

In the modern era, sports betting is becoming increasingly popular. This is especially true in the realm of soccer (or ‘football’ as it is known outside the United States). As a result, the concept of attempting to predict the outcomes of soccer matches using machine learning has garnered much attention in recent years. In this thesis, I utilize well-known machine learning techniques to predict the outcomes of El Clásico matchups and compare the predictive performance of these techniques. The predictive methods employed for this thesis are random forests using the party package in R and extreme gradient boosting using the …


Exploiting Context In Linear Influence Games: Improved Algorithms For Model Selection And Performance Evaluation, Daniel Little Jan 2022

Honors Projects

In the recent past, extensive experimental work has been performed to predict joint voting outcomes in Congress based on a game-theoretic model of voting behavior known as Linear Influence Games. In this thesis, we improve the model selection and evaluation procedure of these past experiments. First, we implement two methods, Nested Cross-Validation with Tuning (Nested CVT) and Bootstrap Bias Corrected Cross-Validation (BBC-CV), to perform model selection and evaluation with less bias than previous methods. While Nested CVT is a commonly used method, it requires learning a large number of models; BBC-CV is a more recent method with a lower computational cost. …


Smart City Management Using Machine Learning Techniques, Mostafa Zaman Jan 2022

Theses and Dissertations

In response to the growing urban population, "smart cities" are designed to improve people's quality of life by implementing cutting-edge technologies. The concept of a "smart city" refers to an effort to enhance the economic and environmental well-being of a city's residents by implementing a centralized management system. With the use of sensors and actuators, smart cities can collect massive amounts of data, which can be used to improve people's quality of life and to design city services. Although smart cities contain vast amounts of data, only a percentage is used due to the noise and variety of the data sources. Information and communication technology (ICT) …


Reinforcement Learning: Low Discrepancy Action Selection For Continuous States And Actions, Jedidiah Lindborg Jan 2022

Electronic Theses and Dissertations

In reinforcement learning, the process of selecting an action during the exploration or exploitation stage is difficult to optimize. The purpose of this thesis is to create an action selection process for an agent by employing a low discrepancy action selection (LDAS) method. This should allow the agent to quickly determine the utility of its actions by prioritizing actions that are dissimilar to ones that it has already picked. In this way, the learning process should be faster for the agent and result in better policies.
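The abstract does not say how LDAS is constructed, so the following is only an illustrative sketch of the general idea of low-discrepancy sampling, not the thesis's method. It assumes a two-dimensional continuous action box and uses a Halton sequence (a standard low-discrepancy sequence) to propose actions that spread evenly instead of clustering. The function names `van_der_corput` and `halton_actions`, and the bounds used, are hypothetical.

```python
def van_der_corput(index, base):
    """Return element `index` (index >= 1) of the van der Corput sequence in `base`."""
    result = 0.0
    fraction = 1.0 / base
    while index > 0:
        index, digit = divmod(index, base)
        result += digit * fraction   # mirror the base-`base` digits around the radix point
        fraction /= base
    return result


def halton_actions(count, low, high, bases=(2, 3)):
    """Generate `count` low-discrepancy actions inside the box [low, high].

    Assumption: one coprime base per action dimension (here 2 and 3).
    """
    actions = []
    for i in range(1, count + 1):
        unit_point = [van_der_corput(i, b) for b in bases]
        # Rescale each coordinate from [0, 1) to the action bounds.
        actions.append(tuple(lo + u * (hi - lo)
                             for u, lo, hi in zip(unit_point, low, high)))
    return actions


if __name__ == "__main__":
    # Hypothetical 2-D action space, e.g. two controls each in [-1, 1].
    for action in halton_actions(5, low=(-1.0, -1.0), high=(1.0, 1.0)):
        print(action)
```

Each new point in such a sequence tends to fall in the largest gap left by earlier points, which is one simple way to favor actions dissimilar to those already tried.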


Application Of Machine Learning In Geophysics: Ranking Teleseismic Shear Wave Splitting Measurements And Classifying Different Types Of Earthquakes, Yanwei Zhang Jan 2022

Doctoral Dissertations

During the past decades, applications of Machine Learning have been developed explosively to solve various academic and industrial problems, and superhuman performance has been shown in diverse areas. In geophysical research, Machine Learning, especially the Convolutional Neural Network (CNN), has been applied in numerous studies and has demonstrated considerable potential. In this study, we applied CNNs to solve two geophysical problems: ranking teleseismic shear wave splitting (SWS) measurements and classifying different types of earthquakes.

For ranking teleseismic SWS measurements, we utilized a CNN-based method to automatically select reliable SWS measurements. The CNN was trained by human-verified teleseismic SWS measurements and tested using synthetic …


Modeling The Broader Impact Of Science And Health Using Social Media, Abdul Rahman Shaikh Jan 2022

Graduate Research Theses & Dissertations

Research and development have always initiated innovation and breakthroughs in technology. These technological advancements in recent years have provided a global medium for research to be disseminated through online platforms. These web-based platforms and the interactions that take place on them affect the dissemination, impact, and perception of online information. This thesis investigates the broader impact of science and health using social media posts, online patents, videos, and images by building machine learning and topic models. First, this study predicts patent citations to scientific research and identifies important factors essential to economic impact. We found that the citation of research …


Batch Normalization Preconditioning For Neural Network Training, Susanna Luisa Gertrude Lange Jan 2022

Theses and Dissertations--Mathematics

Batch normalization (BN) is a popular and ubiquitous method in deep learning that has been shown to decrease training time and improve generalization performance of neural networks. Despite its success, BN is not theoretically well understood. It is not suitable for use with very small mini-batch sizes or online learning. In this work, we propose a new method called Batch Normalization Preconditioning (BNP). Instead of applying normalization explicitly through a batch normalization layer as is done in BN, BNP applies normalization by conditioning the parameter gradients directly during training. This is designed to improve the Hessian matrix of the loss …


Development Of Accurate And Efficient Computational Methodologies For Predicting Protein-Ligand And Protein-Protein Binding Free Energies, Alexander Hamilton Williams Jan 2022

Theses and Dissertations--Pharmacy

Computational modeling is an invaluable tool in the drug discovery process for both small-ligand and protein therapeutics. The widespread availability of protein X-ray crystal and cryo-electron microscopy (cryo-EM) structures has allowed for more accurate molecular dynamics (MD) simulations that are not reliant on methods such as homology modeling, which may produce structures that require significant computational time to demonstrate their stability. In this thesis we describe several novel methodologies for the computationally efficient modeling of protein/ligand and protein/protein complexes that may be employed within both large-scale virtual screenings and lead compound optimization. These methodologies may also be utilized in …


Genetic Algorithm Representation Selection Impact On Binary Classification Problems, Stephen V. Maldonado Jan 2022

Honors Undergraduate Theses

In this thesis, we explore the impact of problem representation on the ability of genetic algorithms (GAs) to evolve a binary prediction model that predicts whether a physical therapist is paid above or below the median amount from Medicare. We explore three different problem representations: the vector GA (VGA), the binary GA (BGA), and the proportional GA (PGA). We find that all three representations can produce models with high accuracy and low loss that are better than Scikit-Learn’s logistic regression model and that all three representations select the same features; however, the PGA representation tends to create lower weights …


Searching For Anomalous Extensive Air Showers Using The Pierre Auger Observatory Fluorescence Detector, Andrew Puyleart Jan 2022

Dissertations, Master's Theses and Master's Reports

Anomalous extensive air showers have yet to be detected by cosmic ray observatories. Fluorescence detectors provide a way to view the air showers created by cosmic rays with primary energies reaching up to hundreds of EeV. The resulting air showers produced by these highly energetic collisions can contain features that deviate from average air showers. Detection of these anomalous events may provide insight into unknown regions of particle physics, and place constraints on cross-sectional interaction lengths of protons. In this dissertation, I propose measurements of extensive air shower profiles that are used in a machine learning pipeline to distinguish …


Forecasting Bitcoin, Ethereum And Litecoin Prices Using Machine Learning, Sai Prabhu Jaligama Jan 2022

Graduate Research Theses & Dissertations

This research aims to predict the prices of the cryptocurrencies Bitcoin, Litecoin and Ethereum using time series modelling with daily closing-price data from 16th of October 2018 to 9th of September 2021, for a total of 1073 days. The Augmented Dickey-Fuller test was first used to check the stationarity of the time series, then two forecasting algorithms, ARIMA and Prophet, were used to make predictions. The findings show similar results for both models for each of Bitcoin, Ethereum and Litecoin. The results show that modelling volatile cryptocurrencies using a single variable produces satisfactory results.


Integrated Gradients Is A Nonlinear Generalization Of The Industry Standard Approach To Variable Attribution For Credit Risk Models, Jonathan Boardman, Md Shafiul Alam, Xiao Huang, Ying Xie Jan 2022

Published and Grey Literature from PhD Candidates

In modern society, epistemic uncertainty limits trust in financial relationships, necessitating transparency and accountability mechanisms for both consumers and lenders. One upshot is that credit risk assessments must be explainable to the consumer. In the United States regulatory milieu, this entails both the identification of key factors in a decision and the provision of consistent actions that would improve standing. The traditionally accepted approach to explainable credit risk modeling involves generating scores with Generalized Linear Models (GLMs) - usually logistic regression, calculating the contribution of each predictor to the total points lost from the theoretical maximum, and generating reason codes …


Caption And Image Based Next-Word Auto-Completion, Meet Patel Jan 2022

Master's Projects

With the increasing number of choices among entities such as products, movies, and songs now available to users, users try to save time by looking for an application or system that provides automatic recommendations. Recommender systems are automated computing processes that leverage concepts of Machine Learning, Data Mining and Artificial Intelligence towards generating product recommendations based on a user’s preferences. These systems have given a significant boost to businesses across multiple segments as a result of reduced human intervention. One similar aspect of this is content writing. It would save users a lot of time …


Graph Neural Networks For Malware Classification, Vrinda Malhotra Jan 2022

Master's Projects

Malware is a growing threat to the digital world. The first step to managing this threat is malware detection and classification. While traditional techniques rely on static or dynamic analysis of malware, the generation of these features requires expert knowledge. Function call graphs (FCGs) consist of program functions as their nodes and their interprocedural calls as their edges, providing a wealth of knowledge that can be utilized to classify malware without feature extraction that requires experts. This project treats malware classification as a graph classification problem, setting node features using the Local Degree Profile (LDP) model and using different graph …
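As a rough illustration of the node features mentioned above, a Local Degree Profile is commonly described as five numbers per node: the node's own degree plus the minimum, maximum, mean, and standard deviation of its neighbors' degrees. The sketch below is a minimal standard-library version and is not taken from the project itself. Treating the call graph as undirected, using the population standard deviation, and the toy function names in the example graph are all assumptions.

```python
from statistics import mean, pstdev


def local_degree_profile(adjacency):
    """Compute LDP features for an undirected graph.

    adjacency: dict mapping node -> iterable of neighbor nodes.
    Returns: dict mapping node -> (degree, min, max, mean, std of neighbor degrees).
    """
    degree = {node: len(set(neighbors)) for node, neighbors in adjacency.items()}
    features = {}
    for node, neighbors in adjacency.items():
        neighbor_degrees = [degree[n] for n in set(neighbors)]
        if neighbor_degrees:
            features[node] = (
                degree[node],
                min(neighbor_degrees),
                max(neighbor_degrees),
                mean(neighbor_degrees),
                pstdev(neighbor_degrees),  # assumption: population std
            )
        else:
            # Isolated node: no neighbors to summarize.
            features[node] = (0, 0, 0, 0.0, 0.0)
    return features


if __name__ == "__main__":
    # Hypothetical call graph: "main" calls three helper functions.
    call_graph = {
        "main": ["decrypt", "connect", "write_file"],
        "decrypt": ["main"],
        "connect": ["main"],
        "write_file": ["main"],
    }
    for name, profile in local_degree_profile(call_graph).items():
        print(name, profile)
```

Features of this kind need no domain expertise to compute, which is why they suit a pipeline that aims to avoid expert-driven feature extraction.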


A Machine Learning Algorithm Improves Surface Freeze-Thaw Classification, Fredrick Bunt Jan 2022

Graduate Student Theses, Dissertations, & Professional Papers

The frozen or thawed state of the land surface is an important factor affecting a wide range of natural processes such as surface water movement, the carbon cycle, and ecosystem development. It is also important for human endeavors such as permafrost engineering and agricultural planning. This makes having an accurate record important. The Freeze-Thaw (FT) Earth System Data Record (FT-ESDR) is a global, daily product that strives to be a reliable record of the FT ground state. In its current form, the FT-ESDR uses annual regression analysis of reanalysis surface air temperatures (SAT) and brightness temperatures (Tb) at each grid …


Interpretable Machine Learning For Self-Service High-Risk Decision Making, Charles Recaido Jan 2022

All Master's Theses

This research contributes to interpretable machine learning via visual knowledge discovery in General Line Coordinates (GLC). The concepts of hyperblocks as interpretable dataset units and GLC are combined to create a visual self-service machine learning model. Two variants of GLC known as Dynamic Scaffold Coordinates (DSC) are proposed. DSC1 and DSC2 can map in a lossless manner multiple dataset attributes to a single two-dimensional (X, Y) Cartesian plane using a dynamic scaffolding graph construction algorithm.

Hyperblock analysis is used to determine visually appealing dataset attribute orders and to reduce line occlusion. It is shown that hyperblocks can generalize decision tree …


A Novel Handover Method Using Destination Prediction In 5G-V2X Networks, Pooja Shyamsundar Jan 2022

Master's Projects

This paper proposes a novel approach to handover optimization in fifth generation vehicular networks. A key principle in designing fifth generation vehicular network technology is continuous connectivity. This makes it important to ensure that there are no gaps in communication for mobile user equipment. Handovers can cause disruption in connectivity as the process involves switching from one base station to another. Issues in the handover process include poor load management for moving traffic resulting in low bandwidth or connectivity gaps, too many hops resulting in multiple unnecessary handovers, short dwell times and ineffective base station selection resulting in delays and …


From Evaluating The Performance Of Approximations In Density Functional Theory To A Machine Learning Design, Pedram Tavazohi Jan 2022

Graduate Theses, Dissertations, and Problem Reports

Density-functional theory (DFT) has gained popularity because of its ability to predict the properties of a large group of materials a priori. Even though DFT is exact, there are inaccuracies introduced into the theory due to the approximations in the exchange-correlation (XC) functionals. Over the 50 years of its existence, scientists have tried to improve the design of the XC functionals. The errors introduced by these functionals are not consistent across all types of solid-state materials. In this project, a high throughput framework was utilized to compare the theoretical DFT predictions with the experimental results available in the Inorganic Crystal …


Efficacy Of Reported Issue Times As A Means For Effort Estimation, Paul Phillip Maclean Jan 2022

Graduate Theses, Dissertations, and Problem Reports

Software effort is a measure of manpower dedicated to developing and maintaining software. Effort estimation can help project managers monitor their software, teams, and timelines. Conversely, improper effort estimation can result in budget overruns, delays, lost contracts, and accumulated Technical Debt (TD). Issue Tracking Systems (ITS) have become mainstream project management tools, with over 65,000 companies using Jira alone. ITS are an untapped resource for issue resolution effort research. Related work investigates issue effort for specific issue types, usually Bugs or similar. They model their developer-documented issue resolution times using features from the issues themselves. This thesis explores a …


The Burning Bush: Linking Lidar-Derived Shrub Architecture To Flammability, Michelle S. Bester Jan 2022

Graduate Theses, Dissertations, and Problem Reports

Light detection and ranging (LiDAR) and terrestrial laser scanning (TLS) sensors are powerful tools for characterizing vegetation structure and for constructing three-dimensional (3D) models of trees, also known as quantitative structural models (QSM). 3D models and structural traits derived from them provide valuable information for biodiversity conservation, forest management, and fire behavior modeling. However, vegetation studies and 3D modeling methodologies often only focus on the forest canopy, with little attention given to understory vegetation. In particular, 3D structural information of shrubs is limited or not included in fire behavior models. Yet, understory vegetation is an important component of forested ecosystems, …


A Low-Cost Machine Learning Based Network Intrusion Detection System With Data Privacy Preservation, Jyoti Fakirah, Lauhim Mahfuz Zishan, Roshni Mooruth, Michael L. Johnstone, Wencheng Yang Jan 2022

Research outputs 2022 to 2026

Network intrusion is a well-studied area of cyber security. Current machine learning-based network intrusion detection systems (NIDSs) monitor network data and the patterns within those data but at the cost of presenting significant issues in terms of privacy violations which may threaten end-user privacy. Therefore, to mitigate risk and preserve a balance between security and privacy, it is imperative to protect user privacy with respect to intrusion data. Moreover, cost is a driver of a machine learning-based NIDS because such systems are increasingly being deployed on resource-limited edge devices. To solve these issues, in this paper we propose a NIDS …


On The Documentation Of Refactoring Types, Eman Abdullah Alomar, Jiaqian Liu, Kenneth Addo, Mohamed Wiem Mkaouer, Christian D. Newman, Ali Ouni, Zhe Yu Dec 2021

Articles

Commit messages are the atomic level of software documentation. They provide a natural language description of the code change and its purpose. Messages are critical for software maintenance and program comprehension. Unlike documenting feature updates and bug fixes, little is known about how developers document their refactoring activities. Specifically, developers can perform multiple refactoring operations, including moving methods, extracting classes, and renaming attributes, for various reasons, such as improving software quality, managing technical debt, and removing defects. Yet, there is no systematic study that analyzes the extent to which the documentation of refactoring accurately describes the refactoring operations performed at the …


Physics-Informed Machine Learning To Predict Extreme Weather Events, Rthvik Raviprakash, Jonathan Buchanan, Mahdi Bu Ali Dec 2021

Discovery Undergraduate Interdisciplinary Research Internship

Extreme weather events refer to unexpected, severe, or unseasonal weather events, which are dynamically related to specific large-scale atmospheric patterns. These extreme weather events have a significant impact on human society and also natural ecosystems. For example, natural disasters due to extreme weather events caused more than $90 billion in global direct losses in 2015. These extreme weather events are challenging to predict due to the chaotic nature of the atmosphere and are highly correlated with the occurrence of atmospheric blocking. A key aspect of preparedness for and response to extreme climate events is accurate medium-range forecasting of atmospheric blocking events.

Unlike …


Task Classification During Visual Search Using Classic Machine Learning And Deep Learning, Devangi Vilas Chinchankar Dec 2021

Master's Projects

In an average human life, the eyes not only passively scan visual scenes, but most times end up actively performing tasks including, but not limited to, searching, comparing, and counting. As a result of the advances in technology, we are observing a boost in the average screen time. Humans are now looking at an increasing number of screens and in turn images and videos. Understanding what scene a user is looking at and what type of visual task is being performed can be useful in developing intelligent user interfaces, and in virtual reality and augmented reality devices. In this research, …


Identifying Bots On Twitter With Benford’s Law, Sanmesh Bhosale Dec 2021

Master's Projects

Over time Online Social Networks (OSNs) have grown exponentially in terms of active users and have now become an influential factor in the formation of public opinions. Due to this, the use of bots and botnets for spreading misinformation on OSNs has become a widespread concern. The biggest example of this was during the 2016 American Presidential Elections, where Russian bots on Twitter pumped out fake news to influence the election results.

Identifying bots and botnets on Twitter is not just based on visual analysis and can require complex statistical methods to score a profile based on multiple features and …