Open Access. Powered by Scholars. Published by Universities.®

Physical Sciences and Mathematics Commons

Machine Learning

Articles 451 - 480 of 826

Full-Text Articles in Physical Sciences and Mathematics

Approaching Hanabi With Q-Learning And Evolutionary Algorithm, Joseph Palmersten Dec 2020

Culminating Projects in Computer Science and Information Technology

Hanabi is a cooperative card game with hidden information that requires cooperation and communication between the players. For a machine learning agent to be successful at Hanabi, it must learn how to communicate and how to infer information from the communication of other players. To approach the problem of Hanabi, Q-learning and an evolutionary algorithm are proposed as potential solutions. The agents created using these methods are shown not to achieve human levels of communication.


Random Search Plus: A More Effective Random Search For Machine Learning Hyperparameters Optimization, Bohan Li Dec 2020

Masters Theses

Hyperparameter optimization has always been key to improving the performance of machine learning models. There are many methods of hyperparameter optimization; popular ones include grid search, random search, manual search, Bayesian optimization, and population-based optimization. Random search requires less computation than grid search, but at a cost in accuracy. This thesis proposes a more effective random search method, named random search plus, based on traditional random search and hyperparameter space separation. It shows empirically that random search plus is more effective than random search. There are some case …
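As a rough illustration of the trade-off between grid search and plain random search mentioned in this abstract, the sketch below compares the two on a toy problem. It does not implement random search plus; the `objective` function, the hyperparameter names (`lr`, `depth`), and the search ranges are hypothetical choices made only for this example.

```python
import itertools
import random

def objective(lr, depth):
    """Toy validation score (hypothetical); it peaks at lr=0.1, depth=6."""
    return -((lr - 0.1) ** 2) * 100 - ((depth - 6) ** 2) * 0.01

def grid_search(lrs, depths):
    """Evaluate every combination on the grid; cost is len(lrs) * len(depths)."""
    return max(itertools.product(lrs, depths), key=lambda p: objective(*p))

def random_search(n_trials, seed=0):
    """Evaluate n_trials points sampled uniformly from the same ranges."""
    rng = random.Random(seed)
    candidates = [(rng.uniform(0.001, 0.5), rng.randint(1, 12))
                  for _ in range(n_trials)]
    return max(candidates, key=lambda p: objective(*p))

# 4 x 6 = 24 evaluations for the grid, 10 for the random search.
grid_best = grid_search([0.001, 0.01, 0.1, 0.5], [2, 4, 6, 8, 10, 12])
rand_best = random_search(n_trials=10)
```

Here the grid happens to contain the optimum, so it finds it at the cost of 24 evaluations, while random search spends 10 evaluations and returns the best point it sampled, which may be worse.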


Defense By Deception Against Stealthy Attacks In Power Grids, Md Hasan Shahriar Nov 2020

FIU Electronic Theses and Dissertations

Cyber-physical Systems (CPSs) and the Internet of Things (IoT) are converging towards a hybrid platform that is becoming ubiquitous in all modern infrastructures. The integration of the complex and heterogeneous systems creates enormous space for the adversaries to get into the network and inject cleverly crafted false data into measurements, misleading the control center to make erroneous decisions. Besides, the attacker can make a critical part of the system unavailable by compromising the sensor data availability. To obfuscate and mislead the attackers, we propose DDAF, a deceptive data acquisition framework for CPSs' hierarchical communication network. Each switch in the hierarchical …


Towards High Performance Stock Market Prediction Methods, Warren M. Landis, Sangwhan Cha Oct 2020

Other Student Works

Stock markets today rely, and will continue to rely, on the metrics of timeliness and efficiency to reach optimal profits. One way stock investors have continued to strive for the best of these two factors is through the use of predictive machine learning systems to aid in their decision making. However, among the many systems currently in use, it could be said that the myriad of data that they are based on may not be sufficient. In an effort to devise an ensemble learning predictive system that will utilize an array of big …


Using Spatial Analysis And Machine Learning Techniques To Develop A Comprehensive Highway-Rail Grade Crossing Consolidation Model, Samira Soleimani Oct 2020

LSU Doctoral Dissertations

The safety of highway-railroad grade crossings (HRGC) is still an issue in the United States of America (USA). The grade crossing is where a railroad crosses a road at the same level without any over or underpass. To improve the safety of crossings, the crossings’ condition should be explored from several aspects such as engineering design (speed limit, warning signs, etc.), road condition (number of lanes, surface markings, etc.), rail design (the type of track, ballast, etc.), temporal variables (weather, visibility, time of day, lightning, etc.), social variables (population, race, etc.), and last but not least, spatial variables (the type …


Co-Design And Evaluation Of An Intelligent Decision Support System For Stroke Rehabilitation Assessment, Min Hun Lee, Daniel P. Siewiorek, Asim Smailagic, Alexandre Bernardino, Sergi Badia Oct 2020

Research Collection School Of Computing and Information Systems

Clinical decision support systems have the potential to improve the workflows of experts in practice (e.g. a therapist's evidence-based rehabilitation assessment). However, the adoption of these systems is challenging, and their gains have not yet been fully demonstrated. In this paper, we identified the needs of therapists in assessing a patient's functional abilities (e.g. alternative perspectives with quantitative information on the patient's exercise motions). As a result, we co-designed and developed an intelligent decision support system that automatically identifies salient features of assessment using reinforcement learning to assess the quality of motion and generate patient-specific analysis. We evaluated this system with …


Using Object Detection Algorithm And Optical Character Recognition To Read Data From Alphanumeric Tags In Text, Ana Bazerque, Davi Moraes, Marcela Souza Oct 2020

ICT

The present document explores the use of machine learning techniques, specifically supervised learning and classification. It applies those techniques to create a solution for a real world company that provides medical products and services to hospitals. This project will deal with streamlining the calibration of medical weighing scales. The developed application will use object detection and character recognition to identify and classify a digital image of a scale’s tag, and fill in a form with the corresponding data. The main reason for the need of this application is to avoid human errors and automate the collection of data from the …


Forecasting Vegetation Health In The MENA Region By Predicting Vegetation Indicators With Machine Learning Models, Sachi Perera, Wenzhao Li, Erik Linstead, Hesham El-Askary Sep 2020

Mathematics, Physics, and Computer Science Faculty Articles and Research

Machine learning (ML) techniques can be applied to predict and monitor drought conditions due to climate change. Predicting future vegetation health indicators (such as EVI, NDVI, and LAI) is one approach to forecast drought events for hotspots (e.g. Middle East and North Africa (MENA) regions). Recently, ML models were implemented to predict EVI values using parameters such as land types, time series, historical vegetation indices, land surface temperature, soil moisture, evapotranspiration etc. In this work, we collected the MODIS atmospherically corrected surface spectral reflectance imagery with multiple vegetation related indices for modeling and evaluation of drought conditions in the MENA …


Cover Song Identification - A Novel Stem-Based Approach To Improve Song-To-Song Similarity Measurements, Lavonnia Newman, Dhyan Shah, Chandler Vaughn, Faizan Javed Sep 2020

SMU Data Science Review

Music is incorporated into our daily lives, whether intentionally or unintentionally. It evokes responses and behavior, so much so that there is an entire field of study dedicated to the psychology of music. Music creates the mood for dancing, exercising, creative thought or even relaxation. It is a powerful tool that can be used in various venues and through advertisements to influence and guide human reactions. Music is also often "borrowed" in the industry today. The practices of sampling and remixing music in the digital age have made cover song identification an active area of research. While most of this research is focused …


Reducing Age Bias In Machine Learning: An Algorithmic Approach, Adriana Solange Garcia De Alford, Steven K. Hayden, Nicole Wittlin, Amy Atwood Sep 2020

SMU Data Science Review

In this paper, we study the prevalence of bias in machine learning; we explore the life cycle phases where bias is potentially introduced into a machine learning model; and lastly, we present how adversarial learning can be leveraged to measure unwanted bias and unfair behavior from a machine learning algorithm. This study focuses particularly on the topics of age bias in predicting employee attrition and presents a practical approach for how adversarial learning can be successful in mitigating age bias. To measure bias, we calculate group fairness metrics across five-year age groups and evaluate fairness between a baseline predictive model …


Forecasting Spare Parts Sporadic Demand Using Traditional Methods And Machine Learning - A Comparative Study, Bhuvana Adur Kannan, Ganesh Kodi, Oscar Padilla, Dough Gray, Barry C. Smith Sep 2020

SMU Data Science Review

Sporadic demand presents a particular challenge to traditional time forecasting methods. In the past 50 years, there have been developments, such as the Croston Model [3], that have improved forecast performance. With the rise of Machine Learning (ML), there is abundant research on applying ML algorithms to predict sporadic demand [8][12][9]. However, most existing research has analyzed this problem from the demand side [17]. In this paper, we tackle this predictive analytics challenge from the supply side. We perform a comparative analysis utilizing a spare parts demand dataset from an Original Equipment Manufacturer (OEM). Since traditional measurements …


Tag: Automated Image Captioning, Nathan Funckes Sep 2020

McNair Scholars Manuscripts

Many websites remain non-ADA compliant, containing images which lack accompanying textual descriptions. This leaves sight-impaired individuals unable to fully enjoy the rich wonders of the web. To address this inequity, our research aims to create an autonomous system capable of generating semantically accurate descriptions of images. This problem involves two tasks: recognizing an image and linguistically describing it. Our solution uses state-of-the-art deep learning: employing a convolutional neural network that "learns" to understand images and extracts their salient features, and a recurrent neural network that learns to generate structured, coherent sentences. These two networks are merged to create a single …


Creating A Culture Of Data-Driven Decision-Making, Kevin Bryan Rogers Sep 2020

Doctoral Dissertations and Projects

Researchers have consistently shown that a supportive culture is one of the most crucial success factors in the implementation of any big data solution. Creating a culture that supports data-driven decision-making is a difficult but ultimately required step in transforming an organization into one that can readily and successfully adopt business intelligence technologies. The purpose of this qualitative case study was to understand the ways in which organizations can foster a culture of smarter decision-making and accountability so that businesses can improve operational metrics and ultimately profitability. Participants identified three major themes that drive the adoption of a data-driven culture. …


Machine Learning Applications For Drug Repurposing, Hansaim Lim Sep 2020

Dissertations, Theses, and Capstone Projects

The cost of bringing a drug to market is astounding and the failure rate is intimidating. Drug discovery has been of limited success under the conventional reductionist model of one-drug-one-gene-one-disease paradigm, where a single disease-associated gene is identified and a molecular binder to the specific target is subsequently designed. Under the simplistic paradigm of drug discovery, a drug molecule is assumed to interact only with the intended on-target. However, small molecular drugs often interact with multiple targets, and those off-target interactions are not considered under the conventional paradigm. As a result, drug-induced side effects and adverse reactions are often neglected …


Compressed DNA Representation For Efficient AMR Classification, John Partee, Robert Hazell, Anjli Solsi, John Santerre Aug 2020

SMU Data Science Review

In this paper, we explore a representation methodology for the compression of DNA isolates. Using lossless string compression via tokenization of frequently repeated segments of DNA, we reduce the length of the isolates to be counted as k-mers for classification. With this new representation, we apply a previously established feature sampling method to dramatically reduce the feature space. In understanding the genetic diversity, we also look at conserving biological function across these spaces. Using a random forest model we were able to predict the resistance or susceptibility of bacteria with 85-90% accuracy, with a 30-50% reduction in overall isolate length, …
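For readers unfamiliar with k-mer counting, the minimal sketch below shows how a sequence can be turned into counts of overlapping length-k substrings, which can then serve as classifier features. It illustrates only the generic counting step, not the authors' tokenization or feature sampling method; the function name `count_kmers` and the example sequence are hypothetical.

```python
from collections import Counter

def count_kmers(sequence, k):
    """Count every overlapping substring of length k in the sequence."""
    return Counter(sequence[i:i + k] for i in range(len(sequence) - k + 1))

# The 3-mers of "ATGATGC" are ATG, TGA, GAT, ATG, TGC.
kmers = count_kmers("ATGATGC", 3)
```

A sequence of length n yields n - k + 1 k-mers, so shortening the sequence before counting reduces the number of k-mers to process.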


Evaluation Of Standard And Semantically-Augmented Distance Metrics For Neurology Patients, Daniel B. Hier, Jonathan Kopel, Steven U. Brint, Donald C. Wunsch, Gayla R. Olbricht, Sima Azizi, Blaine Allen Aug 2020

Electrical and Computer Engineering Faculty Research & Creative Works

Background: Patient distances can be calculated based on signs and symptoms derived from an ontological hierarchy. There is controversy as to whether patient distance metrics that consider the semantic similarity between concepts can outperform standard patient distance metrics that are agnostic to concept similarity. The choice of distance metric can dominate the performance of classification or clustering algorithms. Our objective was to determine if semantically augmented distance metrics would outperform standard metrics on machine learning tasks.

Methods: We converted the neurological findings from 382 published neurology cases into sets of concepts with corresponding machine-readable codes. We calculated patient distances by …


Routing Optimization In Heterogeneous Wireless Networks For Space And Mission-Driven Internet Of Things (IoT) Environments, Sara El Alaoui Aug 2020

Department of Electrical and Computer Engineering: Dissertations, Theses, and Student Research

As technological advances have made it possible to build cheap devices with more processing power and storage, and that are capable of continuously generating large amounts of data, the network has to undergo significant changes as well. The rising number of vendors and variety in platforms and wireless communication technologies have introduced heterogeneity to networks compromising the efficiency of existing routing algorithms. Furthermore, most of the existing solutions assume and require connection to the backbone network and involve changes to the infrastructures, which are not always possible -- a 2018 report by the Federal Communications Commission shows that over 31% …


Dictionary-Based Data Generation For Fine-Tuning BERT For Adverbial Paraphrasing Tasks, Mark Anthony Carthon Aug 2020

Theses and Dissertations

Recent advances in natural language processing technology have led to the emergence of large and deep pre-trained neural networks. The use and focus of these networks are on transfer learning. More specifically, retraining or fine-tuning such pre-trained networks to achieve state of the art performance in a variety of challenging natural language processing/understanding (NLP/NLU) tasks. In this thesis, we focus on identifying paraphrases at the sentence level using the network Bidirectional Encoder Representations from Transformers (BERT). It is well understood that in deep learning the volume and quality of training data is a determining factor of performance. The objective of …


A Study Of Information Bots And Knowledge Bots, Amartya Hatua Aug 2020

Dissertations

In this dissertation, a study of different aspects of information bots and knowledge bots is done. The research contributes to a better understanding of the various characteristics of information bots as well as the different patterns and factors responsible for the information diffusion in a social network. This research also shows how these factors can be used to predict information diffusion for a particular topic in a social network. The second part of the research is focused on strategies for improving the knowledge base of knowledge bots, where two different approaches are studied. In the first approach, knowledge is transferred …


Machine-Learning-Based Prediction Of Sepsis Events From Vertical Clinical Trial Data: A Naïve Approach, Tyler Michael Gaddis Aug 2020

Theses and Dissertations

Sepsis is a potentially life-threatening condition characterized by a dysregulated, disproportionate immune response to infection by which the afflicted body attacks its own tissues, sometimes to the point of organ failure, and in the worst cases, death. According to the Centers for Disease Control and Prevention (CDC), sepsis is reported to kill upwards of 270,000 Americans annually, though this figure may be greater given certain ambiguities in the current accepted diagnostic framework of the disease.

This study attempted to first establish an understanding of past definitions of sepsis, and to then recommend use of machine learning as integral in an …


An Investigation Into Multi-View Error Correcting Output Code Classifiers Applied To Organ Tissue Classification, Daniel Alvarez Aug 2020

UNLV Theses, Dissertations, Professional Papers, and Capstones

Large amounts of data are generated every day, so much that it is difficult for humans and machines alike to find patterns in order to predict outcomes and make decisions. It would be useful if this data could be simplified using machine learning techniques. For example, biological cell identity is dependent on many factors tied to genetic processes. Such factors include proteins, gene transcription, and gene methylation. Each of these factors is a highly complex mechanism with immense amounts of data. Simplifying these can then be helpful in finding patterns in them. Error-Correcting Output Codes (ECOC) does …


Optimized Machine Learning Models Towards Intelligent Systems, Mohammadnoor Ahmad Mohammad Injadat Jul 2020

Electronic Thesis and Dissertation Repository

The rapid growth of the Internet and related technologies has led to the collection of large amounts of data by individuals, organizations, and society in general [1]. However, this often leads to information overload which occurs when the amount of input (e.g. data) a human is trying to process exceeds their cognitive capacities [2]. Machine learning (ML) has been proposed as one potential methodology capable of extracting useful information from large sets of data [1]. This thesis focuses on two applications. The first is education, namely e-Learning environments. Within this field, this thesis proposes different optimized ML ensemble models to …


Data Mining And Image Classification Using Genetic Programming, Mahsa Shokri Varniab Jul 2020

Master of Science in Computer Science Theses

Genetic programming (GP), a capable machine learning and search method, motivated by Darwinian-evolution, is an evolutionary learning algorithm which automatically evolves computer programs in the form of trees to solve problems. This thesis studies the application of GP for data mining and image processing. Knowledge discovery and data mining have been widely used in business, healthcare, and scientific fields. In data mining, classification is supervised learning that identifies new patterns and maps the data to predefined targets. A GP based classifier is developed in order to perform these mappings. GP has been investigated in a series of studies to classify …


Variability In The Effectiveness Of Psychological Interventions Based On Machine Learning In STEM Education, Mohammad Hasan, Bilal Khan Jul 2020

School of Computing: Faculty Publications

This manuscript presents a framework to investigate the variability in the effectiveness of psychological interventions supported by Machine Learning (ML) based early-warning systems (EWS) in science, technology, engineering, and mathematics education. It emphasizes the importance of investigating the resulting variability and suggests that effective EWS cannot be designed without a deeper understanding of the variability. The framework uses an ML-based model to predict students’ academic performance early in the semester for a Sophomore-level Computer Science course at a public university in the United States. The students were given psychological interventions by sending their end-of-term performance forecast thrice during the semester. …


Automated Anomaly Detection And Localization System For A Microservices Based Cloud System, Priyanka Prakash Naikade Jul 2020

Electronic Thesis and Dissertation Repository

Context: With an increasing number of applications running on a microservices-based cloud system (such as AWS, GCP, IBM Cloud), it is challenging for the cloud providers to offer uninterrupted services with guaranteed Quality of Service (QoS) factors. Problem Statement: Existing monitoring frameworks often do not detect critical defects among the large volume of issues generated, thus affecting recovery response times and the use of human maintenance resources. Also, manually tracing the root causes of the issues requires a significant amount of time. Objective: The objective of this work is to: (i) detect performance anomalies, in real-time, through monitoring KPIs (Key Performance …


Visual Analytics Of Electronic Health Records With A Focus On Acute Kidney Injury, Sheikh S. Abdullah Jul 2020

Electronic Thesis and Dissertation Repository

The increasing use of electronic platforms in healthcare has resulted in the generation of unprecedented amounts of data in recent years. The amount of data available to clinical researchers, physicians, and healthcare administrators continues to grow, which creates an untapped resource with the ability to improve the healthcare system drastically. Despite the enthusiasm for adopting electronic health records (EHRs), some recent studies have shown that EHR-based systems hardly improve the ability of healthcare providers to make better decisions. One reason for this inefficacy is that these systems do not allow for human-data interaction in a manner that fits and supports …


Deep Learning Predictive Modeling With Data Challenges (Small, Big, Or Imbalanced), Renhao Liu Jul 2020

USF Tampa Graduate Theses and Dissertations

In the real world, the data used to build machine learning models varies in size and characteristics. These characteristics, including small, big, and imbalanced datasets, often lead to different challenges when training machine learning models. Models trained on a small number of observations tend to overfit the training data and produce inaccurate results. When it comes to big data, efficiently learning from "huge" datasets in a short time becomes important. With an imbalanced dataset, learning is usually biased towards the majority class in the data and appropriate measurements are needed to check model performance.

As …


An Improved Method For Spectroscopic Quality Classification, Elizabeth G. Mayer Jul 2020

Mathematics & Statistics ETDs

Spectral quality classification is a vital step in data cleaning before the analysis of magnetic resonance spectroscopy (MRS) data can be done. This analysis compares five methods of quality classification; three of these are legacy methods, Maudsley et al. (2006), Zhang et al. (2018), and Bustillo et al. (2020), and two newly created methods that used a random forests classifier (RFC) to inform their classifications. We found that the random forest classifier was the most accurate at predicting spectra quality (balanced accuracy for RF of 88% vs legacy of 70%, 72%, or 72%). A Random-Forests-Informed Filtering method (RFIFM) for quality …
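Balanced accuracy, the metric quoted in this abstract, is the mean of the per-class recalls, which keeps a majority class from dominating the score. The sketch below is a minimal, generic implementation under that standard definition; the labels are made-up illustration data, not results from the thesis.

```python
def balanced_accuracy(y_true, y_pred):
    """Return the mean of per-class recall over the classes present in y_true."""
    recalls = []
    for cls in sorted(set(y_true)):
        # Indices of the samples that truly belong to this class.
        idx = [i for i, t in enumerate(y_true) if t == cls]
        correct = sum(1 for i in idx if y_pred[i] == cls)
        recalls.append(correct / len(idx))
    return sum(recalls) / len(recalls)

# Hypothetical imbalanced example: eight "good" spectra (1), two "bad" (0).
y_true = [1, 1, 1, 1, 1, 1, 1, 1, 0, 0]
y_pred = [1, 1, 1, 1, 1, 1, 1, 1, 1, 0]
score = balanced_accuracy(y_true, y_pred)
plain = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
```

In this example plain accuracy is 0.9, while balanced accuracy is (1.0 + 0.5) / 2 = 0.75, because half of the minority class is misclassified.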


A Study Of The Efficacy Of Machine Learning For Diagnosing Obstructive Coronary Artery Disease In Non-Diabetic Patients, Demond Larae Handley Jul 2020

Theses and Dissertations

According to the Centers for Disease Control and Prevention, about 18.2 million adults age 20 and older have Coronary Artery Disease in the United States. Early diagnosis is therefore of crucial importance to help prevent debilitating consequences, and principally death for many patients. In this study we use data containing gene expression values from peripheral blood samples in 198 non-diabetic patients, with the goal of developing an age and sex gene expression model for diagnosis of Coronary Artery Disease. We employ machine learning methods to obtain a classification based on genetic information, age and sex. Our implementation uses feed forward …


ANTA: Accelerated Network Traffic Analytics, Matthew Grohotolski, Connor Dileo Jul 2020

Summer Scholarship, Creative Arts and Research Projects (SCARP)

Implementing traditional machine learning models and neural networks has become trivial in detecting malicious network traffic and has sparked interest in many researchers investigating this field. Standard implementations include using the baseline models in packages such as sklearn, tensorflow, and keras. In this paper we seek to advance the field of network detection and produce results which will have great benefits in terms of speed and performance of these models. We take advantage of Intel’s DAAL and OpenVINO packages as they are the two best performance enhancing methods which are publicly available today. Furthermore, comparisons will be made to determine …