Open Access. Powered by Scholars. Published by Universities.®

Physical Sciences and Mathematics Commons


Machine Learning



Full-Text Articles in Physical Sciences and Mathematics

Computational Modelling Of Human Transcriptional Regulation By An Information Theory-Based Approach, Ruipeng Lu Apr 2018


Electronic Thesis and Dissertation Repository

ChIP-seq experiments can identify the genome-wide binding site motifs of a transcription factor (TF) and determine its sequence specificity. Multiple algorithms were developed to derive TF binding site (TFBS) motifs from ChIP-seq data, including the entropy minimization-based Bipad that can derive both contiguous and bipartite motifs. Prior studies applying these algorithms to ChIP-seq data only analyzed a small number of top peaks with the highest signal strengths, biasing their resultant position weight matrices (PWMs) towards consensus-like, strong binding sites; nor did they derive bipartite motifs, disabling the accurate modelling of binding behavior of dimeric TFs.

This thesis presents a novel …


Detecting Speakers In Video Footage, Michael Williams Apr 2018


Master's Theses

Facial recognition is a powerful tool for identifying people visually. Yet when the end goal is more specific than merely identifying the person in a picture, problems can arise. Speaker identification is one such task: it expects more predictive power from a facial recognition system than the system can provide on its own. Speaker identification is the task of identifying who is speaking in a video, not simply who is present in it. This extra requirement introduces numerous false positives into the facial recognition system, largely due to one main scenario: the person speaking is not on camera. This paper …


A Legal Perspective On The Trials And Tribulations Of AI: How Artificial Intelligence, The Internet Of Things, Smart Contracts, And Other Technologies Will Affect The Law, Iria Giuffrida, Fredric Lederer, Nicolas Vermeys Apr 2018


Faculty Publications

No abstract provided.


Using Autoencoder To Reduce The Length Of The Autism Diagnostic Observation Schedule (ADOS), Sara Hussain Daghustani Mar 2018


Electronic Theses, Projects, and Dissertations

This thesis uses autoencoders to explore the possibility of reducing the length of the Autism Diagnostic Observation Schedule (ADOS), which is a series of tests and observations used to diagnose autism spectrum disorders in children, adolescents, and adults of different developmental levels. The length of the ADOS, directly and indirectly, causes barriers to its access for many individuals, which means that individuals who need testing are unable to get it. Reducing the length of the ADOS without significantly sacrificing its accuracy would increase its accessibility. The autoencoders used in this thesis have specific connections between layers that mimic the sectional …


Multimodal Sensing And Data Processing For Speaker And Emotion Recognition Using Deep Learning Models With Audio, Video And Biomedical Sensors, Farnaz Abtahi Feb 2018


Dissertations, Theses, and Capstone Projects

The focus of the thesis is on Deep Learning methods and their applications on multimodal data, with a potential to explore the associations between modalities and replace missing and corrupt ones if necessary. We have chosen two important real-world applications that need to deal with multimodal data: 1) Speaker recognition and identification; 2) Facial expression recognition and emotion detection.

The first part of our work assesses the effectiveness of speech-related sensory data modalities and their combinations in speaker recognition using deep learning models. First, the role of electromyography (EMG) is highlighted as a unique biometric sensor in improving audio-visual speaker …


Integrated Strategies For Sustainable Wastewater-Based Algal Biofuel Production And Environmental Mitigation In The US, Javad Roostaei Jan 2018


Wayne State University Dissertations

Integration of algae cultivation with wastewater treatment has received increasing interest as a cost-effective strategy for biofuel production. However, there has been no full assessment of algal biofuel production with wastewater at the macro scale that takes into account wastewater resources, land availability, CO2 emission resources, and geographic variation. This research addressed and evaluated the use of wastewater for algae cultivation through modeling and laboratory experiments. The first goal of this research was to develop a spatially explicit life cycle model, by integrating life cycle assessment (LCA) and Geographic Information Systems (GIS) analysis, for the evaluation of the environmental and economic …


Don't Take This Personally: Sentiment Analysis For Identification Of "Subtweeting" On Twitter, Noah L. Segal-Gould Jan 2018


Senior Projects Spring 2018

The purpose of this project is to identify subtweets. The Oxford English Dictionary defines "subtweet" as a "[Twitter post] that refers to a particular user without directly mentioning them, typically as a form of furtive mockery or criticism." This paper details a process for gathering a labeled ground truth dataset, training a classifier, and creating a Twitter bot which interacts with subtweets in real time. The Naive Bayes classifier trained in this project classifies tweets as subtweets and non-subtweets with an average F1 score of 72%.
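As a rough illustration of the metric quoted above, the sketch below computes per-class F1 and its unweighted mean over both classes. The abstract does not say how its average was taken, so averaging across classes is an assumption here, and the labels are hypothetical toy data rather than anything from the project.

```python
def f1_score(y_true, y_pred, positive):
    """F1 for one class: harmonic mean of precision and recall."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    if tp == 0:
        # No true positives: precision and recall are both zero (or undefined).
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def macro_f1(y_true, y_pred, labels):
    """Unweighted mean of the per-class F1 scores (an assumed averaging scheme)."""
    return sum(f1_score(y_true, y_pred, c) for c in labels) / len(labels)


# Hypothetical toy labels: 1 = subtweet, 0 = non-subtweet.
y_true = [1, 1, 1, 0, 0, 0, 1, 0]
y_pred = [1, 0, 1, 0, 1, 0, 1, 0]

subtweet_f1 = f1_score(y_true, y_pred, positive=1)
average_f1 = macro_f1(y_true, y_pred, labels=[0, 1])
```

Because F1 is computed from true positives, false positives, and false negatives only, it ignores true negatives, which makes it a common choice when one class is the class of interest.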


A Framework To Understand Emoji Meaning: Similarity And Sense Disambiguation Of Emoji Using Emojinet, Sanjaya Wijeratne Jan 2018


Browse all Theses and Dissertations

Pictographs, commonly referred to as ‘emoji’, have become a popular way to enhance electronic communications. They are an important component of the language used in social media. Since their introduction in the late 1990s, emoji have been widely used to enhance the sentiment, emotion, and sarcasm expressed in social media messages. They are equally popular across many social media sites including Facebook, Instagram, and Twitter. In 2015, Instagram reported that nearly half of the photo comments posted on Instagram contain emoji, and in the same year, Twitter reported that the ‘face with tears of joy’ emoji has been tweeted 6.6 …


Data-Driven Predictive Framework For Modeling Complex Multi-Physics Engineering Applications, Arturo Schiaffino Bustamante Jan 2018


Open Access Theses & Dissertations

Computational models are often encountered in multiple engineering applications, such as structural design, material science, heat transfer, and fluid dynamics. These simulations offer engineers the capability of understanding complex physical situations before putting them into practice, either through experimentation or prototyping. Current advances in computational sciences, hardware architecture, software development, and big data technology have allowed the construction of robust predictive frameworks for analyzing a wide array of natural phenomena across different disciplines, either through the implementation of statistical methods, such as big data and uncertainty quantification, or through high-performance computing of a numerical model. The objective …


Tracking Topical Evolution In Large Document Collections, Sheikh Motahar Naim Jan 2018


Open Access Theses & Dissertations

A large document collection that builds up over time usually contains a number of different themes. Not all of these themes or topics are equally important at the same time. One topic might have high probabilities in some years due to relevant events, and low probabilities in other years. Analyzing the evolution of such topics has useful applications in a variety of domains, for example, helping researchers to quickly see the changes of research topics in an area, assisting intelligence agents in tracking the activities of a terrorist group, or monitoring damage caused by a natural disaster. In this …


Deep Neural Networks For Multi-Label Text Classification: Application To Coding Electronic Medical Records, Anthony Rios Jan 2018


Theses and Dissertations--Computer Science

Coding Electronic Medical Records (EMRs) with diagnosis and procedure codes is an essential task for billing, secondary data analyses, and monitoring health trends. Both speed and accuracy of coding are critical. While coding errors could lead to more patient-side financial burden and misinterpretation of a patient’s well-being, timely coding is also needed to avoid backlogs and additional costs for the healthcare facility. Therefore, it is necessary to develop automated diagnosis and procedure code recommendation methods that can be used by professional medical coders.

The main difficulty with developing automated EMR coding methods is the nature of the label space. The …


Sports Analytics With Computer Vision, Colby T. Jeffries Jan 2018


Senior Independent Study Theses

Computer vision in sports analytics is a relatively new development. With multi-million dollar systems like STATS’s SportVu, professional basketball teams are able to collect extremely fine-detailed data better than ever before. This concept can be scaled down to provide similar statistics collection to college and high school basketball teams. Here we investigate the creation of such a system using open-source technologies and less expensive hardware. In addition, using a similar technology, we examine basketball free throws to see whether a shooter’s form has a specific relationship to a shot’s outcome. A system that learns this relationship could be used to …


Non-Linear Machine Learning With Active Sampling For MOX Drift Compensation, Tamara Matthews, Muhammad Iqbal, Horacio Gonzalez-Velez Jan 2018


Conference papers

Metal oxide (MOX) gas detectors based on SnO2 provide low-cost solutions for real-time sensing of complex gas mixtures for indoor ambient monitoring. With high sensitivity under ideal conditions, MOX detectors may have poor long-term response accuracy due to environmental factors (humidity and temperature) along with sensor aging, leading to calibration drifts. Finding a simple and efficient solution to correct such calibration drifts has been the subject of numerous studies but remains an open problem. In this work, we present an efficient approach to MOX calibration using active and transfer sampling techniques coupled with non-linear machine learning algorithms, namely neural networks, …


Support Vector Machines For Image Spam Analysis, Aneri Chavda, Katerina Potika, Fabio Di Troia, Mark Stamp Jan 2018


Faculty Publications, Computer Science

Email is one of the most common forms of digital communication. Spam is unsolicited bulk email, while image spam consists of spam text embedded inside an image. Image spam is used as a means to evade text-based spam filters, and hence image spam poses a threat to email-based communication. In this research, we analyze image spam detection using support vector machines (SVMs), which we train on a wide variety of image features. We use a linear SVM to quantify the relative importance of the features under consideration. We also develop and analyze a realistic “challenge” dataset that illustrates the limitations …


Use Of Adaptive Mobile Applications To Improve Mindfulness, Wiehan Boshoff Jan 2018


Browse all Theses and Dissertations

Mindfulness is the state of retaining awareness of what is happening at the current point in time. It has been used in multiple forms to reduce stress, anxiety, and even depression. Mindfulness can be promoted in various ways, but current research shows a trend toward preferring breathing exercises over other methods of reaching a mindful state. Studies have shown that breathing can be used as a tool to promote brain control, specifically in the auditory cortex region. Research pertaining to disorders such as tinnitus, the phantom awareness of sound, could potentially benefit from using these brain control …


A Study Of Neural Networks For The Quantum Many-Body Problem, Liam B. Schramm Jan 2018


Senior Projects Spring 2018

One of the fundamental problems in analytically approaching the quantum many-body problem is the amount of information needed to describe a quantum state. As the number of particles in a system grows, the amount of information needed for a full description of the system increases exponentially. A great deal of work has therefore gone into finding efficient approximate representations of these systems. Among the most popular techniques are Tensor Networks and Quantum Monte Carlo methods. However, one new method with a number of promising theoretical guarantees is the Neural Quantum State. This method is an adaptation of the Restricted …


Machine Learning Techniques Implementation In Power Optimization, Data Processing, And Bio-Medical Applications, Khalid Khairullah Mezied Al-Jabery Jan 2018


Doctoral Dissertations

"The rapid progress and development in machine-learning algorithms becomes a key factor in determining the future of humanity. These algorithms and techniques were utilized to solve a wide spectrum of problems extended from data mining and knowledge discovery to unsupervised learning and optimization. This dissertation consists of two study areas. The first area investigates the use of reinforcement learning and adaptive critic design algorithms in the field of power grid control. The second area in this dissertation, consisting of three papers, focuses on developing and applying clustering algorithms on biomedical data. The first paper presents a novel modelling approach for …


Development Of An Electronic Nose For Olfactory System Modelling Using Artificial Neural Network, Proceso L. Fernandez Jr, Mary Anne Sy Roa Jan 2018


Department of Information Systems & Computer Science Faculty Publications

Electronic nose (e-nose) devices have received considerable attention in the field of sensor technology because of their many potential uses, such as identifying toxic wastes, monitoring air quality, examining odors in infected wounds, and inspecting food. Notwithstanding the vast amount of literature on the usage of e-noses for specific purposes, the technology originally and ultimately aims to mimic the capability of mammals to discriminate odors from all sorts of objects. This study demonstrates the theoretical and practical feasibility of designing an e-nose for general odor classification. A multi-sensor array hardware unit was carefully constructed for data …


Novelty Detection Of Machinery Using A Non-Parametric Machine Learning Approach, Enrique Angola Jan 2018


Graduate College Dissertations and Theses

A novelty detection algorithm inspired by human audio pattern recognition is conceptualized and experimentally tested. This anomaly detection technique can be used to monitor the health of a machine or could also be coupled with a current state of the art system to enhance its fault detection capabilities. Time-domain data obtained from a microphone is processed by applying a short-time FFT, which returns time-frequency patterns. Such patterns are fed to a machine learning algorithm, which is designed to detect novel signals and identify windows in the frequency domain where such novelties occur. The algorithm presented in this paper uses one-dimensional …


Probabilistic Clustering Ensemble Evaluation For Intrusion Detection, Steven M. Mcelwee Jan 2018


CCE Theses and Dissertations

Intrusion detection is the practice of examining information from computers and networks to identify cyberattacks. It is an important topic in practice, since the frequency and consequences of cyberattacks continue to increase and affect organizations. It is important for research, since many problems exist for intrusion detection systems. Intrusion detection systems monitor large volumes of data and frequently generate false positives. This results in additional effort for security analysts to review and interpret alerts. After long hours spent reviewing alerts, security analysts become fatigued and make bad decisions. There is currently no approach to intrusion detection that reduces the workload …


Generating Diverse And Meaningful Captions: Unsupervised Specificity Optimization For Image Captioning, Annika Lindh, Robert J. Ross, Abhijit Mahalunkar, Giancarlo Salton, John D. Kelleher Jan 2018


Conference papers

Image Captioning is a task that requires models to acquire a multi-modal understanding of the world and to express this understanding in natural language text. While the state-of-the-art for this task has rapidly improved in terms of n-gram metrics, these models tend to output the same generic captions for similar images. In this work, we address this limitation and train a model that generates more diverse and specific captions through an unsupervised training approach that incorporates a learning signal from an Image Retrieval model. We summarize previous results and improve the state-of-the-art on caption diversity and novelty.

We make our …


Fuzziness-Based Active Learning Framework To Enhance Hyperspectral Image Classification Performance For Discriminative And Generative Classifiers, Muhammad Ahmad, Stanislav Protasov, Adil Mehmood Khan, Rasheed Hussain, Asad Masood Khattak, Wajahat Ali Khan Jan 2018


All Works

© 2018 Ahmad et al. This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited. Hyperspectral image classification with a limited number of training samples without loss of accuracy is desirable, as collecting such data is often expensive and time-consuming. However, classifiers trained with limited samples usually end up with a large generalization error. To overcome this problem, we propose a fuzziness-based active learning framework (FALF), in which we implement the idea of selecting optimal …


Artificial Neural Network (ANN) In A Small Dataset To Determine Neutrality In The Pronunciation Of English As A Foreign Language In Filipino Call Center Agents, Proceso L. Fernandez Jr, Rey Benjamin M. Baquirin Jan 2018


Department of Information Systems & Computer Science Faculty Publications

Artificial Neural Networks (ANNs) have continued to be efficient models in solving classification problems. In this paper, we explore the use of an ANN with a small dataset to accurately classify whether Filipino call center agents’ pronunciations are neutral or not based on their employer’s standards. Isolated utterances of the ten most commonly used words in the call center were recorded from eleven agents creating a dataset of 110 utterances. Two learning specialists were consulted to establish ground truths and Cohen’s Kappa was computed as 0.82, validating the reliability of the dataset. The first thirteen Mel-Frequency Cepstral Coefficients (MFCCs) were …
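For readers unfamiliar with the agreement statistic mentioned above, the following is a minimal sketch of Cohen's Kappa for two raters. The ratings are hypothetical toy data, not the study's annotations, and the function name is illustrative.

```python
from collections import Counter


def cohens_kappa(rater_a, rater_b):
    """Cohen's kappa: observed agreement corrected for chance agreement."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("raters must label the same non-empty set of items")
    n = len(rater_a)
    # Fraction of items on which the two raters agree.
    observed = sum(1 for a, b in zip(rater_a, rater_b) if a == b) / n
    # Agreement expected by chance, from each rater's label frequencies.
    count_a, count_b = Counter(rater_a), Counter(rater_b)
    labels = set(count_a) | set(count_b)
    expected = sum(count_a[c] * count_b[c] for c in labels) / (n * n)
    if expected == 1.0:
        # Both raters used a single identical label; agreement is trivially perfect.
        return 1.0
    return (observed - expected) / (1 - expected)


# Hypothetical ratings of ten utterances by two specialists.
specialist_1 = ["neutral"] * 5 + ["not neutral"] * 5
specialist_2 = ["neutral"] * 4 + ["not neutral"] * 6

kappa = cohens_kappa(specialist_1, specialist_2)
```

In this toy case the raters agree on nine of ten items while chance agreement is 0.5, so kappa comes out near 0.8, lower than the raw 90% agreement.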


Machine Learning Methods For Septic Shock Prediction, Aiman A. Darwiche Jan 2018


CCE Theses and Dissertations

Sepsis is a life-threatening organ dysfunction caused by a dysregulated body response to infection. Sepsis is difficult to detect at an early stage, and when not detected early, is difficult to treat and results in high mortality rates. Developing improved methods for identifying patients at high risk of suffering septic shock has been the focus of much research in recent years. Building on this body of literature, this dissertation develops an improved method for septic shock prediction. Using data from the MIMIC-III database, an ensemble classifier is trained to identify high-risk patients. A robust prediction model …


Iterative Matrix Factorization Method For Social Media Data Location Prediction, Natchanon Suaysom Jan 2018


HMC Senior Theses

The locations from which users posted their tweets, as collected by social media companies, vary in accuracy, and some are missing. We want to use the tweets with the highest accuracy to help fill in the data for tweets with incomplete information. To test our algorithm, we used sets of social media data from a city, which we separated into training sets, where we know all the information, and testing sets, where we intentionally pretend not to know the location. One prediction method that was used in (Dukler, Han and Wang, 2016) requires appending one-hot …


Offline And Online Density Estimation For Large High-Dimensional Data, Aref Majdara Jan 2018


Dissertations, Master's Theses and Master's Reports

Density estimation has wide applications in machine learning and data analysis techniques including clustering, classification, multimodality analysis, bump hunting, and anomaly detection. In high-dimensional space, the sparsity of data in local neighborhoods makes many parametric and nonparametric density estimation methods mostly inefficient.

This work presents the development of computationally efficient algorithms for high-dimensional density estimation, based on Bayesian sequential partitioning (BSP). A copula transform is used to separate the estimation of marginal and joint densities, with the purpose of reducing the computational complexity and estimation error. Using this separation, a parallel implementation of the density estimation algorithm on a 4-core CPU is …


Machine Learning Based Disease Gene Identification And MHC Immune Protein-Peptide Binding Prediction, Zhonghao Liu Jan 2018


Theses and Dissertations

Machine learning and deep learning methods have been increasingly applied to solve challenging and important bioinformatics problems such as protein structure prediction, disease gene identification, and drug discovery. However, the performance of existing machine learning based predictive models is still not satisfactory. The question of how to exploit the specific properties of bioinformatics data and couple them with the unique capabilities of the learning algorithms remains open. In this dissertation, we propose advanced machine learning and deep learning algorithms to address two important problems: mislocation-related cancer gene identification and major histocompatibility complex-peptide binding affinity prediction. Our first contribution proposes a …


Classification Of EEG Signals Of User States In Gaming Using Machine Learning, Chandana Mallapragada Jan 2018


Masters Theses

"In this research, brain activity of user states was analyzed using machine learning algorithms. When a user interacts with a computer-based system including playing computer games like Tetris, he or she may experience user states such as boredom, flow, and anxiety. The purpose of this research is to apply machine learning models to Electroencephalogram (EEG) signals of three user states -- boredom, flow and anxiety -- to identify and classify the EEG correlates for these user states. We focus on three research questions: (i) How well do machine learning models like support vector machine, random forests, multinomial logistic regression, and …


Application Of Machine Learning On Fracture Interference, Dennis Wayne Chamberlain Jr. Jan 2018


Graduate Theses, Dissertations, and Problem Reports

A method has been developed that locates and determines well-to-well hydraulic fracture interference (frac-hit) in shale plays using hard data. This method uses Artificial Neural Networks (ANN) with designated parameters and target outputs in conjunction with graphs of gas flowrate, tubing pressure, and cumulative gas prediction. The method was created to address the significant increase in frac-hit occurrences due to the infill wells being completed in shale plays. The production data of the well is first cleaned to eliminate outliers in the initial timeframe of the well and periods of no production so that the ANN model can be accurately …


Scalable Feature Selection And Extraction With Applications In Kinase Polypharmacology, Derek Jones Jan 2018


Theses and Dissertations--Computer Science

To reduce the time and costs associated with drug discovery, machine learning is being used to automate much of the work in this process. However, the size and complex nature of molecular data make the application of machine learning especially challenging. Much work must go into engineering the features that are then used to train machine learning models, costing considerable amounts of time and requiring the knowledge of domain experts to be most effective. The purpose of this work is to demonstrate data-driven approaches to perform the feature selection and extraction steps in …