Open Access. Powered by Scholars. Published by Universities.®

Physical Sciences and Mathematics Commons

Machine Learning

Articles 721 - 750 of 826

Full-Text Articles in Physical Sciences and Mathematics

Adaptive Region-Based Approaches For Cellular Segmentation Of Bright-Field Microscopy Images, Hady Ahmady Phoulady May 2017

USF Tampa Graduate Theses and Dissertations

Microscopy image processing is an emerging and quickly growing field in the medical imaging research area. Recent advancements in technology, including higher computation power, larger and cheaper storage modules, and more efficient and faster data acquisition devices such as whole-slide imaging scanners, contributed to the recent microscopy image processing research advancement. Most of the methods in this research area either focus on automatically processing images and making it easier for pathologists to direct their focus to the important regions in the image, or they aim to automate the whole job of experts including processing and classifying images or tissues that leads …


On The Role Of Genetic Algorithms In The Pattern Recognition Task Of Classification, Isaac Ben Sherman May 2017

Masters Theses

In this dissertation we ask, formulate an apparatus for answering, and answer the following three questions: Where do Genetic Algorithms fit in the greater scheme of pattern recognition? Given primitive mechanics, can Genetic Algorithms match or exceed the performance of theoretically-based methods? Can we build a generic universal Genetic Algorithm for classification? To answer these questions, we develop a genetic algorithm which optimizes MATLAB classifiers and a variable length genetic algorithm which does classification based entirely on boolean logic. We test these algorithms on disparate datasets rooted in cellular biology, music theory, and medicine. We then get results from these …


Parallel Design Of A Product And Internet Of Things (IoT) Architecture To Minimize The Cost Of Utilizing Big Data (BD) For Sustainable Value Creation, Ryan Bradley, Ibrahim S. Jawahir, Niko Murrell, Julie Whitney Apr 2017

Institute for Sustainable Manufacturing Faculty Publications

Information has become today's addictive currency; hence, companies are investing billions in the creation of Internet of Things (IoT) frameworks that gamble on finding trends that reveal sustainability and/or efficiency improvements. This approach to “Big Data” can lead to blind, astronomical costs. Therefore, this paper presents a counter approach aimed at minimizing the cost of utilizing “Big Data” for sustainable value creation. The proposed approach leverages domain/expert knowledge of the system in combination with a machine learning algorithm in order to limit the needed infrastructure and cost. A case study of the approach implemented in a consumer electronics company is …


What Are People Tweeting About Zika? An Exploratory Study Concerning Its Symptoms, Treatment, Transmission, And Prevention, Michele Miller, Tanvi Banerjee, Roopteja Muppalla, William L. Romine, Amit Sheth Apr 2017

Kno.e.sis Publications

Background: In order to harness what people are tweeting about Zika, there needs to be a computational framework that leverages machine learning techniques to recognize relevant Zika tweets and, further, categorize these into disease-specific categories to address specific societal concerns related to the prevention, transmission, symptoms, and treatment of Zika virus.

Objective: The purpose of this study was to determine the relevancy of the tweets and what people were tweeting about the 4 disease characteristics of Zika: symptoms, transmission, prevention, and treatment.

Methods: A combination of natural language processing and machine learning techniques was used to determine what people were …


Deep Learning Approach For Intrusion Detection System (IDS) In The Internet Of Things (IoT) Network Using Gated Recurrent Neural Networks (GRU), Manoj Kumar Putchala Jan 2017

Browse all Theses and Dissertations

The Internet of Things (IoT) is a complex paradigm where billions of devices are connected to a network. These connected devices form an intelligent system of systems that share data without human-to-computer or human-to-human interaction. These systems extract meaningful data that can transform human lives, businesses, and the world in significant ways. However, in reality IoT is prone to countless cyber-attacks in an extremely hostile environment like the internet. The recent hacks of a 2014 Jeep Cherokee, an iStan pacemaker, and a German steel plant are a few notable security breaches. To secure an IoT system, the traditional high-end security …


Context-Aware Debugging For Concurrent Programs, Justin Chu Jan 2017

Theses and Dissertations--Computer Science

Concurrency faults are difficult to reproduce and localize because they usually occur under specific inputs and thread interleavings. Most existing fault localization techniques focus on sequential programs but fail to identify faulty memory access patterns across threads, which are usually the root causes of concurrency faults. Moreover, existing techniques for sequential programs cannot be adapted to identify faulty paths in concurrent programs. While concurrency fault localization techniques have been proposed to analyze passing and failing executions obtained from running a set of test cases to identify faulty access patterns, they primarily focus on using statistical analysis. We present a novel …


Analysing The Effects Of Data Augmentation And Free Parameters For Text Classification With Recurrent Convolutional Neural Networks, Jonathan Quijas Jan 2017

Open Access Theses & Dissertations

Convolutional neural networks have seen much success in computer vision and natural language processing tasks. When training convolutional neural networks for text classification tasks, a common technique is to transform an input sequence of words into a dense matrix of word embeddings, or words represented as dense vectors, using table lookup operations. This enables the inputs to be represented in a way that the well-known convolution/pooling operations can be applied to them in a manner similar to images. These word embeddings may be further incorporated into the neural network itself as a trainable layer to allow fine-tuning, usually leading to …
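The table lookup and convolution/pooling steps described in this abstract can be illustrated with a minimal sketch. The vocabulary, the embedding values, and the function names `embed` and `conv1d_max_pool` are hypothetical and chosen only for illustration; they are not taken from the thesis, and a real model would learn the table and the kernel.

```python
# Hypothetical toy vocabulary and embedding table (3-dimensional vectors).
# In a trained network these rows would be learned or loaded, not hand-written.
VOCAB = {"<unk>": 0, "the": 1, "movie": 2, "was": 3, "great": 4}
EMBEDDINGS = [
    [0.0, 0.0, 0.0],    # <unk>
    [0.1, 0.3, -0.2],   # the
    [0.7, -0.1, 0.4],   # movie
    [0.0, 0.2, 0.1],    # was
    [0.9, 0.8, -0.5],   # great
]


def embed(tokens):
    """Turn a word sequence into a dense matrix: one embedding row per token."""
    unk = VOCAB["<unk>"]
    return [EMBEDDINGS[VOCAB.get(tok, unk)] for tok in tokens]


def conv1d_max_pool(matrix, kernel):
    """Slide a kernel covering len(kernel) consecutive words, then max-pool."""
    width = len(kernel)
    responses = []
    for start in range(len(matrix) - width + 1):
        window = matrix[start:start + width]
        # Element-wise product of kernel and window, summed to one response.
        response = sum(
            k * x
            for k_row, m_row in zip(kernel, window)
            for k, x in zip(k_row, m_row)
        )
        responses.append(response)
    return max(responses)
```

Calling `embed(["the", "movie", "was", "great"])` yields a 4-by-3 matrix, which `conv1d_max_pool` then treats much like a one-dimensional image. Making the table trainable, as the abstract mentions, would mean updating the rows of `EMBEDDINGS` during training.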


Multi-Class Classification Of Textual Data: Detection And Mitigation Of Cheating In Massively Multiplayer Online Role Playing Games, Naga Sai Nikhil Maguluri Jan 2017

Browse all Theses and Dissertations

The success of any multiplayer game depends on the player’s experience. Cheating and hacking undermine the player’s experience and thus the success of that game. Cheaters who use hacks, bots, or trainers ruin the gaming experience of other players and drive them to leave the game. As the video game industry is a constantly growing multibillion-dollar economy, it is crucial to assure and maintain a state of security. Players reflect their gaming experience in one of the following places: multiplayer chat, game reviews, and social media. This thesis is an exploratory study where our goal is to experiment and propose …


Optimized Multilayer Perceptron With Dynamic Learning Rate To Classify Breast Microwave Tomography Image, Chulwoo Pack Jan 2017

Electronic Theses and Dissertations

Most recently developed Computer Aided Diagnosis (CAD) systems and their related research are based on medical images that are usually obtained through conventional imaging techniques such as Magnetic Resonance Imaging (MRI), x-ray mammography, and ultrasound. With the development of a new imaging technology called Microwave Tomography Imaging (MTI), it has become necessary to develop a CAD system that can show promising performance using this new format of data. The platform can gain flexibility in its input by adopting an Artificial Neural Network (ANN) as a classifier. Among the various phases of a CAD system, we have focused on optimizing the classification phase …


Daily Traffic Flow Pattern Recognition By Spectral Clustering, Matthew Aven Jan 2017

CMC Senior Theses

This paper explores the potential applications of existing spectral clustering algorithms to real-life problems through experiments on existing road traffic data. The analysis begins with an overview of previous unsupervised machine learning techniques and constructs an effective spectral clustering algorithm that demonstrates the analytical power of the method. The paper focuses on the spectral embedding method’s ability to project non-linearly separable, high-dimensional data into a more manageable space that allows for accurate clustering. The key step in this method involves solving a normalized eigenvector problem in order to construct an optimal representation of the original data.

While this …
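The normalized eigenvector step mentioned in this abstract can be sketched for the simplest case of two clusters. The sketch below is an assumption-laden illustration, not the thesis's algorithm: it uses a Gaussian affinity with a hypothetical bandwidth `sigma`, a normalized-cut style relaxation, and plain power iteration instead of a numerical library. The function name `spectral_bipartition` is invented for this example.

```python
import math


def spectral_bipartition(points, sigma=1.0, iterations=200):
    """Split points into two groups using the second eigenvector of the
    normalized affinity matrix D^-1/2 W D^-1/2 (illustrative sketch only)."""
    n = len(points)
    # Gaussian affinity between distinct points; zero on the diagonal.
    # Assumes every point has at least one neighbour with non-zero affinity.
    W = [[math.exp(-math.dist(p, q) ** 2 / (2 * sigma ** 2)) if i != j else 0.0
          for j, q in enumerate(points)]
         for i, p in enumerate(points)]
    degree = [sum(row) for row in W]
    inv_sqrt = [1.0 / math.sqrt(d) for d in degree]
    # Shift the normalized matrix M to (M + I) / 2 so its eigenvalues lie
    # in [0, 1]; this keeps power iteration from locking onto negative ones.
    S = [[0.5 * (inv_sqrt[i] * W[i][j] * inv_sqrt[j] + (1.0 if i == j else 0.0))
          for j in range(n)]
         for i in range(n)]
    # The leading eigenvector is known in closed form: proportional to sqrt(degree).
    norm = math.sqrt(sum(degree))
    top = [math.sqrt(d) / norm for d in degree]
    # Power iteration with deflation converges to the second eigenvector.
    v = [float(i + 1) for i in range(n)]
    for _ in range(iterations):
        proj = sum(a * b for a, b in zip(v, top))
        v = [a - proj * b for a, b in zip(v, top)]
        v = [sum(S[i][j] * v[j] for j in range(n)) for i in range(n)]
        length = math.sqrt(sum(a * a for a in v))
        v = [a / length for a in v]
    # Map back to the one-dimensional spectral embedding and split by sign.
    embedding = [a * s for a, s in zip(v, inv_sqrt)]
    return [1 if e > 0 else 0 for e in embedding]
```

Which group receives label 1 is arbitrary, since an eigenvector is only defined up to sign. Handling more than two clusters would typically use several eigenvectors followed by a clustering step such as k-means on the embedded points.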


Soil Hydraulic Property Estimation Under Major Land-Uses In The Shawnee Hills, Trinity Joseph Baker Jan 2017

Theses and Dissertations--Plant and Soil Sciences

The ability to map soil moisture is becoming more important with changing climates, and modeling these effects depends on reliable estimations of hydrologic soil properties under different land managements. This study: 1) tests the application of existing soil hydraulic property estimation methods against in-situ values of six catenas under two covers (forest and grass); 2) validates Random Forest Algorithm (RF) estimates informed from the six catenas on two separate catenas; 3) identifies Rapid Carbon Assessment (RaCA) sites within the Shawnee Hills Region that represent different land-uses (Crop, Conservation Reserve Program (CRP), Forest, and Pasture); 4) applies RF learning tree informed …


Pulsar Search Using Supervised Machine Learning, John M. Ford Jan 2017

CCE Theses and Dissertations

Pulsars are rapidly rotating neutron stars which emit a strong beam of energy through mechanisms that are not entirely clear to physicists. These very dense stars are used by astrophysicists to study many basic physical phenomena, such as the behavior of plasmas in extremely dense environments, behavior of pulsar-black hole pairs, and tests of general relativity. Many of these tasks require information to answer the scientific questions posed by physicists. In order to provide more pulsars to study, there are several large-scale pulsar surveys underway, which are generating a huge backlog of unprocessed data. Searching for pulsars is a very …


Improved Detection For Advanced Polymorphic Malware, James B. Fraley Jan 2017

CCE Theses and Dissertations

Malicious Software (malware) attacks across the internet are increasing at an alarming rate. Cyber-attacks have become increasingly sophisticated and targeted. These targeted attacks are aimed at compromising networks, stealing personal financial information, and removing sensitive data or disrupting operations. Current malware detection approaches work well for previously known signatures. However, malware developers utilize techniques to mutate and change software properties (signatures) to avoid and evade detection. Polymorphic malware is practically undetectable with signature-based defensive technologies. Today’s effective detection rate for polymorphic malware ranges from 68.75% to 81.25%. New techniques are needed to improve malware detection rates. Improved detection …


Performance Envelopes Of Adaptive Ensemble Data Stream Classifiers, Stefan Joe-Yen Jan 2017

CCE Theses and Dissertations

This dissertation documents a study of the performance characteristics of algorithms designed to mitigate the effects of concept drift on online machine learning. Several supervised binary classifiers were evaluated on their performance when applied to an input data stream with a non-stationary class distribution. The selected classifiers included ensembles that combine the contributions of their member algorithms to improve overall performance. These ensembles adapt to changing class definitions, known as “concept drift,” often present in real-world situations, by adjusting the relative contributions of their members. Three stream classification algorithms and three adaptive ensemble algorithms were compared to determine the capabilities …
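The idea of an ensemble that adapts to concept drift by adjusting the relative contributions of its members can be sketched with a multiplicative weight update in the style of the Weighted Majority algorithm. This is a generic illustration under stated assumptions, not one of the algorithms evaluated in the dissertation; the class name `WeightedMajorityEnsemble` and the penalty factor `beta` are hypothetical.

```python
class WeightedMajorityEnsemble:
    """Binary ensemble whose member weights shrink whenever a member errs."""

    def __init__(self, members, beta=0.5):
        # members: callables mapping an input to a label in {0, 1}
        # beta: factor in (0, 1) applied to the weight of a wrong member
        self.members = members
        self.beta = beta
        self.weights = [1.0] * len(members)

    def predict(self, x):
        # Weighted vote: predict 1 if members voting 1 hold at least half the weight.
        vote_for_one = sum(w for w, m in zip(self.weights, self.members) if m(x) == 1)
        return 1 if vote_for_one >= sum(self.weights) / 2 else 0

    def update(self, x, label):
        # Penalize members that misclassified this example, then renormalize
        # so the weights stay on a comparable scale over a long stream.
        self.weights = [
            w * self.beta if m(x) != label else w
            for w, m in zip(self.weights, self.members)
        ]
        total = sum(self.weights)
        self.weights = [w / total for w in self.weights]
```

In a stream setting, `predict` is called on each arriving example and `update` is called once its true label becomes available. When the class definition changes, members that fit the old concept lose weight and members that fit the new one regain influence.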


Autonomous Driving With A Simulation Trained Convolutional Neural Network, Cameron Franke Jan 2017

University of the Pacific Theses and Dissertations

Autonomous vehicles will help society if they can easily support a broad range of driving environments, conditions, and vehicles.

Achieving this requires reducing the complexity of the algorithmic system, easing the collection of training data, and verifying operation using real-world experiments. Our work addresses these issues by utilizing a reflexive neural network that translates images into steering and throttle commands. This network is trained using simulation data from Grand Theft Auto V, which we augment to reduce the number of simulation hours driven. We then validate our work using an RC car system through numerous tests. Our system successfully drive …


Triple Non-Negative Matrix Factorization Technique For Sentiment Analysis And Topic Modeling, Alexander A. Waggoner Jan 2017

CMC Senior Theses

Topic modeling refers to the process of algorithmically sorting documents into categories based on some common relationship between the documents. This common relationship between the documents is considered the “topic” of the documents. Sentiment analysis refers to the process of algorithmically sorting a document into a positive or negative category depending whether this document expresses a positive or negative opinion on its respective topic. In this paper, I consider the open problem of document classification into a topic category, as well as a sentiment category. This has a direct application to the retail industry where companies may want to scour …


Machine Learning And Natural Language Methods For Detecting Psychopathy In Textual Data, Andrew Stephen Henning Jan 2017

Electronic Theses and Dissertations

Among the myriad of mental conditions permeating through society, psychopathy is perhaps the most elusive to diagnose and treat. With the advent of natural language processing and machine learning, however, we have ushered in a new age of technology that provides a fresh toolkit for analyzing text and context. Because text remains the medium of choice for most personal and professional interactions, it may be possible to use textual samples from psychopaths as a means for understanding and ultimately classifying similar individuals based on the content of their language usage. This paper aims to investigate natural language processing and supervised …


Investigating The Impact Of Unsupervised Feature-Extraction From Multi-Wavelength Image Data For Photometric Classification Of Stars, Galaxies And QSOs, Annika Lindh Dec 2016

Conference papers

Accurate classification of astronomical objects currently relies on spectroscopic data. Acquiring this data is time-consuming and expensive compared to photometric data. Hence, improving the accuracy of photometric classification could lead to far better coverage and faster classification pipelines. This paper investigates the benefit of using unsupervised feature-extraction from multi-wavelength image data for photometric classification of stars, galaxies and QSOs. An unsupervised Deep Belief Network is used, giving the model a higher level of interpretability thanks to its generative nature and layer-wise training. A Random Forest classifier is used to measure the contribution of the novel features compared to a set …


Review Classification, Balraj Aujla Dec 2016

Computer Science and Software Engineering

The goal of this project is to find a way to analyze reviews and determine the sentiment of a review. It uses various machine learning techniques, such as SVMs and Naive Bayes, to achieve its goals. Overall the purpose is to learn many different machine learning techniques, determine which ones would be useful for the project, then compare the results. Research is the foremost goal of the project, and it is able to determine the better algorithm for review classification, Naive Bayes or an SVM. In addition, an SVM which actually gave reviews scores rather than just classifying …


Stage-Specific Predictive Models For Cancer Survivability, Elham Sagheb Hossein Pour Dec 2016

Theses and Dissertations

Survivability of cancer strongly depends on the stage of cancer. In most previous works, machine learning survivability prediction models for a particular cancer were trained and evaluated together on all stages of the cancer. In this work, we trained and evaluated survivability prediction models for five major cancers, together on all stages and separately for every stage. We named these models joint and stage-specific models, respectively. The obtained results for the cancers which we investigated reveal that the best model to predict the survivability of the cancer for one specific stage is the model which is specifically built for that …


Fundamentals Of Machine Learning For Neural Machine Translation, John D. Kelleher Oct 2016

Conference papers

This paper presents a short introduction to neural networks and how they are used for machine translation and concludes with some discussion on the current research challenges being addressed by neural machine translation (NMT) research. The primary goal of this paper is to give a no-tears introduction to NMT to readers that do not have a computer science or mathematical background. The secondary goal is to provide the reader with a deep enough understanding of NMT that they can appreciate the strengths and weaknesses of the technology. The paper starts with a brief introduction to standard feed-forward neural networks (what …


Quantitative Metrics For Comparison Of Hyper-Dimensional LSA Spaces For Semantic Differences, John Christopher Martin Aug 2016

Doctoral Dissertations

Latent Semantic Analysis (LSA) is a mathematically based machine learning technology that has demonstrated success in numerous applications in text analytics and natural language processing. The construction of a large hyper-dimensional space, an LSA space, is central to the functioning of this technique, serving to define the relationships between the information items being processed. This hyper-dimensional space serves as a semantic mapping system that represents learned meaning derived from the input content. The meaning represented in an LSA space, and therefore the mappings that are generated and the quality of the results obtained from using the space, is completely dependent …


Significant Permission Identification For Android Malware Detection, Lichao Sun Jul 2016

Department of Computer Science and Engineering: Dissertations, Theses, and Student Research

A recent report indicates that a newly developed malicious app for Android is introduced every 11 seconds. To combat this alarming rate of malware creation, we need a scalable malware detection approach that is effective and efficient. In this thesis, we introduce SigPID, a malware detection system based on permission analysis to cope with the rapid increase in the number of Android malware. Instead of analyzing all 135 Android permissions, our approach applies 3-level pruning by mining the permission data to identify only significant permissions that can be effective in distinguishing benign and malicious apps. Based on the identified significant …


Applying Machine Learning To Predict Stock Value, Joseph Lemley, Yishui Liu, Dipayan Banik, Sadia Afroze May 2016

Symposium Of University Research and Creative Expression (SOURCE)

The purpose of this study was to compare machine learning techniques for short term stock prediction and evaluate their effectiveness. Stock value analysis is an important element of modern economies. The ability to predict future stock prices from historical price values is of tremendous interest to investors. The prediction of stock performance is still an unsolved problem with a variety of techniques being proposed. Real stock values are affected by many elements, some of which cannot be measured. In this study, we limit our analysis to stock closing prices. We use these prices to predict the future stock value using …


Learning With Scalability And Compactness, Wenlin Chen May 2016

McKelvey School of Engineering Theses & Dissertations

Artificial Intelligence has been thriving for decades since its birth. Traditional AI features heuristic search and planning, providing good strategies for tasks that are inherently search-based problems, such as games and GPS searching. In the meantime, machine learning, arguably the hottest subfield of AI, embraces data-driven methodology with great success in a wide range of applications such as computer vision and speech recognition. As a new trend, the applications of both learning and search have shifted toward mobile and embedded devices, which entails not only scalability but also compactness of the models. Under this general paradigm, we propose a series …


Visualization Of Deep Convolutional Neural Networks, Dingwen Li May 2016

McKelvey School of Engineering Theses & Dissertations

Deep learning has achieved great accuracy in large scale image classification and scene recognition tasks, especially after the Convolutional Neural Network (CNN) model was introduced. Although a CNN often demonstrates very good classification results, it is usually unclear how or why a classification result is achieved. The objective of this thesis is to explore several existing visualization approaches which offer intuitive visual results. The thesis focuses on three visualization approaches: (1) image masking, which highlights the region of the image with high influence on the classification, (2) Taylor decomposition back-propagation, which generates a per-pixel heat map that describes each pixel's …


Machine Learning Of Lifestyle Data For Diabetes, Yan Luo Apr 2016

Electronic Thesis and Dissertation Repository

Self-Monitoring of Blood Glucose (SMBG) for Type-2 Diabetes (T2D) remains highly challenging for both patients and doctors due to the complexities of diabetic lifestyle data logging and insufficient short-term and personalized recommendations/advice. Recent mobile diabetes management systems have proven clinically effective in facilitating self-management. However, most such systems have poor usability and are limited in data analytic functionalities. These two challenges are connected and affect each other. The ease of data recording brings better data for applicable data analytic algorithms. On the other hand, irrelevant or inaccurate data input will certainly introduce errors and noise. The …


How To Measure Metallicity From Five-Band Photometry With Supervised Machine Learning Algorithms, Viviana Acquaviva Feb 2016

Publications and Research

We demonstrate that it is possible to measure metallicity from the SDSS five-band photometry to better than 0.1 dex using supervised machine learning algorithms. Using spectroscopic estimates of metallicity as ground truth, we build, optimize and train several estimators to predict metallicity. We use the observed photometry, as well as derived quantities such as stellar mass and photometric redshift, as features, and we build two sample data sets at median redshifts of 0.103 and 0.218 and median r-band magnitude of 17.5 and 18.3, respectively. We find that ensemble methods, such as random forests of trees and extremely randomized trees and …


Care-Chair: Opportunistic Health Assessment With Smart Sensing On Chair Backrest, Rakesh Kumar Jan 2016

Masters Theses

A vast majority of the population spend most of their time in a sedentary position, which potentially makes a chair a huge source of information about a person's daily activity. This information, which often gets ignored, can reveal important health data but the overhead and the time consumption needed to track the daily activity of a person is a major hurdle. Considering this, a simple and cost-efficient sensory system, named Care-Chair, with four square force sensitive resistors on the backrest of a chair has been designed to collect the activity details and breathing rate of the users. The Care-Chair system …


Enabling Machine Science Through Distributed Human Computing, Mark David Wagy Jan 2016

Graduate College Dissertations and Theses

Distributed human computing techniques have been shown to be effective ways of accessing the problem-solving capabilities of a large group of anonymous individuals over the World Wide Web. They have been successfully applied to such diverse domains as computer security, biology and astronomy. The success of distributed human computing in various domains suggests that it can be utilized for complex collaborative problem solving. Thus it could be used for "machine science": utilizing machines to facilitate the vetting of disparate human hypotheses for solving scientific and engineering problems.

In this thesis, we show that machine science is possible through distributed human …