Open Access. Powered by Scholars. Published by Universities.®

Physical Sciences and Mathematics Commons

Machine Learning

Articles 241 - 270 of 826

Full-Text Articles in Physical Sciences and Mathematics

A GPU-Based Machine Learning Approach For Detection Of Botnet Attacks, Michal Motylinski, Áine Macdermott, Farkhund Iqbal, Babar Shah Sep 2022

All Works

Rapid development and adaptation of the Internet of Things (IoT) has created new problems for securing these interconnected devices and networks. There are hundreds of thousands of IoT devices with underlying security vulnerabilities, such as insufficient device authentication/authorisation, making them vulnerable to malware infection. IoT botnets are designed to grow and compete with one another over unsecure devices and networks. Once infected, the device will monitor a Command-and-Control (C&C) server indicating the target of an attack via Distributed Denial of Service (DDoS) attack. These security issues, coupled with the continued growth of IoT, present a much larger attack surface for …


Data Preprocessing For Machine Learning Modules, Rawan El Moghrabi Aug 2022

Undergraduate Student Research Internships Conference

Data preprocessing is an essential step when building machine learning solutions. It significantly impacts the success of machine learning modules and the output of these algorithms. Typically, data preprocessing is made up of data sanitization, feature engineering, normalization, and transformation. This paper outlines the data preprocessing methodology implemented for a data-driven predictive maintenance solution. The project entails acquiring historical electrical data from industrial assets and creating a health index indicating each asset's remaining useful life. This solution is built using machine learning algorithms and requires several data processing steps to increase the solution's accuracy and efficiency. In this project, the …


Integrating Physical Models And Deep Priors For Computational Imaging, Yu Sun Aug 2022

McKelvey School of Engineering Theses & Dissertations

This dissertation addresses integrating physical models and learning priors for computational imaging. The motivation of our work is driven by the recent discussion of learning-based methods that solve the imaging inverse problem by directly learning a measurement-to-image mapping from the existing data: they achieve superior performance over the traditional model-based methods but lack the physical model to impose sufficient interpretation and guarantee of the final image. We adopt the classic statistical inference as the underlying formulation and integrate learning models as implicit image priors, such that our framework is able to simultaneously leverage physical models and learning priors. Additionally, the …


Development Of The Assessment Of Clinical Prediction Model Transportability (APT) Checklist, Sean Chonghwan Yu Aug 2022

McKelvey School of Engineering Theses & Dissertations

Clinical Prediction Models (CPMs) have long been used for Clinical Decision Support (CDS), initially based on simple clinical scoring systems, and increasingly based on complex machine learning models relying on large-scale Electronic Health Record (EHR) data. External implementation – or the application of CPMs on sites where they were not originally developed – is valuable as it reduces the need for redundant de novo CPM development, enables CPM usage by low resource organizations, facilitates external validation studies, and encourages collaborative development of CPMs. Further, adoption of externally developed CPMs has been facilitated by ongoing interoperability efforts in standards, policy, and …


Mathematical Models Yield Insights Into CNNs: Applications In Natural Image Restoration And Population Genetics, Ryan Cecil Aug 2022

Electronic Theses and Dissertations

Due to a rise in computational power, machine learning (ML) methods have become the state-of-the-art in a variety of fields. Known to be black-box approaches, however, these methods are oftentimes not well understood. In this work, we utilize our understanding of model-based approaches to derive insights into Convolutional Neural Networks (CNNs). In the field of Natural Image Restoration, we focus on the image denoising problem. Recent work has demonstrated the potential of mathematically motivated CNN architectures that learn both 'geometric' and nonlinear higher order features and corresponding regularizers. We extend this work by showing that not only can geometric features …


Better Understanding Genomic Architecture With The Use Of Applied Statistics And Explainable Artificial Intelligence, Jonathon C. Romero Aug 2022

Doctoral Dissertations

With the continuous improvements in biological data collection, new techniques are needed to better understand the complex relationships in genomic and other biological data sets. Explainable Artificial Intelligence (X-AI) techniques like Iterative Random Forest (iRF) excel at finding interactions within data, such as genomic epistasis. Here, the introduction of new methods to mine for these complex interactions is shown in a variety of scenarios. The application of iRF as a method for Genomic Wide Epistasis Studies shows that the method is robust in finding interacting sets of features in synthetic data, without requiring the exponentially increasing computation time of many …


Cyberbullying Detection Using Weakly Supervised And Fully Supervised Learning, Abhinav Abhishek Aug 2022

ETD Archive

Machine learning is a very useful tool to solve issues in multiple domains such as sentiment analysis, fake news detection, facial recognition, and cyberbullying. In this work, we have leveraged its ability to understand the nuances of natural language to detect cyberbullying. We have further utilized it to detect the subject of cyberbullying such as age, gender, ethnicity, and religion. Further, we have built another layer to detect the cases of misogyny in cyberbullying. In one of our experiments, we created a three-layered architecture to detect cyberbullying, then to detect if it is gender-based and finally if it …


Predicting Order Status Using XGBoost, Kegan J. Penovich Aug 2022

All Graduate Plan B and other Reports, Spring 1920 to Spring 2023

Invista, a Koch subsidiary, is a multinational producer of fibers, resins, and intermediaries, particularly nylon. Keeping the company operating required it to take over 1.5 million orders over the course of - years, less than a third of which arrived on time. Orders arriving other than when expected can cause many problems for any company. While arriving late is a clear problem, it is also troublesome for them to arrive early. In the face of this, it becomes important to be able to tell a priori whether an order will arrive on time or not.

To address this problem, we made use of …


State-Based Biological Communication, Nathan Clement Aug 2022

All Theses

Allostery (1) is the process through which proteins self-regulate in response to various stimuli. Allosteric interactions occur between nonadjacent spatially distant residues (1), and they are exhibited through the correlated motions (2) and momenta of participating residues. The location of allosteric sites in proteins can be determined experimentally but computational methods to predict the location of allosteric sites are being developed as well (2-4, 10). Experimental and computational methodologies for locating allosteric sites can be used to design specific targeted drug delivery (5-6, 19), but these methods have not yet …


Development Of Graphical Models And Statistical Physics Motivated Approaches To Genomic Investigations, Yashwanth Lagisetty Aug 2022

Dissertations & Theses (Open Access)

Identifying genes involved in disease pathology has been a goal of genomic research since the early days of the field. However, as technology improves and the body of research grows, we are faced with more questions than answers. Among these is the pressing matter of our incomplete understanding of the genetic underpinnings of complex diseases. Many hypotheses offer explanations as to why direct and independent analyses of variants, as done in genome-wide association studies (GWAS), may not fully elucidate disease genetics. These range from pointing out flaws in statistical testing to invoking the complex dynamics of epigenetic processes. In the …


Perturbation Modeling For Molecular Design Of Protein Tyrosine Kinase Inhibitors Using Unsupervised Machine Learning, Keerthi Krishnan Aug 2022

Computational and Data Sciences (MS) Theses

The field of computational drug discovery and development has grown, with the aid of new computational tools for novel molecule discovery. Specifically, generative deep learning models have excelled as tools to aid in navigating the large space of known molecules and in the creation of new molecules. These models are fed various representations of molecules as inputs and learn to perform a variety of tasks, such as the optimization of these molecules towards a targeted property. Ultimately, these generative learning models allow us to build bridges between chemical and continuous spaces to understand the compromise between invoking small incremental …


Hyperspectral Image Analysis Of Food For Nutritional Intake, Shirin Nasr Esfahani Aug 2022

UNLV Theses, Dissertations, Professional Papers, and Capstones

The primary objective of this dissertation is to investigate the application of hyperspectral technology to accommodate the growing demand for automatic dietary assessment applications. Food intake is one of the main factors that contribute to human health. In other words, it is necessary to get information about the amount of nutrition and vitamins that a human body requires through a daily diet. Manual dietary assessments are time-consuming and are also not precise enough, especially when the information is used for the care and treatment of hospitalized patients. Moreover, the data must be analyzed by nutritional experts. Therefore, researchers …


Deep Learning For Detecting Trees In The Urban Environment From LIDAR, Julian R. Rice Aug 2022

Master's Theses

Cataloguing and classifying trees in the urban environment is a crucial step in urban and environmental planning. However, manual collection and maintenance of this data is expensive and time-consuming. Algorithmic approaches that rely on remote sensing data have been developed for tree detection in forests, though they generally struggle in the more varied urban environment. This work proposes a novel method for the detection of trees in the urban environment that applies deep learning to remote sensing data. Specifically, we train a PointNet-based neural network to predict tree locations directly from LIDAR data augmented with multi-spectral imaging. We compare this …


Computational Models To Detect Radiation In Urban Environments: An Application Of Signal Processing Techniques And Neural Networks To Radiation Data Analysis, Jose Nicolas Gachancipa Jul 2022

Beyond: Undergraduate Research Journal

Radioactive sources, such as uranium-235, are nuclides that emit ionizing radiation, and which can be used to build nuclear weapons. In public areas, the presence of a radioactive nuclide can present a risk to the population, and therefore, it is imperative that threats are identified by radiological search and response teams in a timely and effective manner. In urban environments, such as densely populated cities, radioactive sources may be more difficult to detect, since background radiation produced by surrounding objects and structures (e.g., buildings, cars) can hinder the effective detection of unnatural radioactive material. This article presents a computational model …


An Empirical Study Towards An Automatic Phishing Attack Detection Using Ensemble Stacking Model, Mahmoud Othman, Hesham Hassan Jul 2022

Future Computing and Informatics Journal

Phishing attacks have become one of the most common attacks facing internet users, especially after the COVID-19 pandemic, as most organizations have moved part or most of their work and communication online using well-known tools, like email, Zoom, WebEx, etc. Therefore, cyber phishing attacks have become progressively recent, directly and frankly reflecting the designated website, allowing the attacker to observe everything while the victim is exploring Webpages. Hence, utilizing Artificial Intelligence (AI) techniques has become a necessary approach that could be used to detect such attacks automatically. In this paper, we introduce an empirical analysis for automatic phishing detection …


A Smartphone-Based Non-Invasive Measurement System For Blood Constituents From Photoplethysmography (PPG) And Fingertip Videos Illuminated With The Near-Infrared LEDs, Md Hasanul Aziz Jul 2022

Dissertations (1934 -)

At least two billion people are affected by hemoglobin (Hgb), diabetic-related, and other blood-related diseases. Regular clinical assessments of these problems are conducted by analyzing venipuncture-obtained blood samples in laboratories. A non-invasive, cheap, point-of-care, and accurate test is needed everywhere. We started with Hgb measurement, and after an extensive literature survey, we came up with a non-invasive solution with 10-second Smartphone videos of the index fingertips using custom hardware sets to illuminate the fingers. We tested four lighting conditions with wavelengths in the near-infrared spectrum suggested by the absorption properties of two primary components of blood: oxygenated Hgb and plasma. …


Artifact Development For The Prediction Of Stress Levels On Higher Education Students Using Machine Learning, Valentina Quiroga, Alejandra Hurtado, José Rojas Jul 2022

ICT

Stress is an adaptive reaction of an organism, human or not, to the demands of fitting in an environment (Kav Vedhara, 1996). When stress originates in an educational context, it is common to refer to it as a student and their mechanisms to adapt and cope with the academic demand. All humans experience stress during their lifetime, but when this overwhelmed feeling is prolonged, it can affect human behaviour and the ability to deal with physical and emotional pressure, resulting in a range of problems. It is important for higher-level education institutions, such as colleges and universities, to …


Development Of Software Tools For Efficient And Sustainable Process Development And Improvement, Jake P. Stengel Jun 2022

Theses and Dissertations

Infrastructure is a key component in the well-being of our society that leads to its growth, development, and productive operations. A well-built infrastructure allows the community to be more competitive and promotes economic advancement. In 2021, the ASCE (American Society of Civil Engineers) ranked the American infrastructure as substandard, with an overall grade of C-. The overall ranking suffers when key infrastructure categories are not maintained according to the needs of the population. Therefore, there is a need to consider alternative methods to improve our infrastructure and make it more sustainable to enhance the overall grade. One of the challenges …


An Evolutionary Optimization Algorithm For Automated Classical Machine Learning, Leila Zahedi Jun 2022

FIU Electronic Theses and Dissertations

Machine learning is an evolving branch of computational algorithms that allow computers to learn from experiences, make predictions, and solve different problems without being explicitly programmed. However, building a useful machine learning model is a challenging process, requiring human expertise to perform various proper tasks and ensure that the machine learning's primary objective --determining the best and most predictive model-- is achieved. These tasks include pre-processing, feature selection, and model selection. Many machine learning models developed by experts are designed manually and by trial and error. In other words, even experts need the time and resources to create good predictive …


Machine Learning With Big Data For Electrical Load Forecasting, Alexandra L'Heureux Jun 2022

Electronic Thesis and Dissertation Repository

Today, the amount of data collected is exploding at an unprecedented rate due to developments in Web technologies, social media, mobile and sensing devices and the internet of things (IoT). Data is gathered in every aspect of our lives: from financial information to smart home devices and everything in between. The driving force behind these extensive data collections is the promise of increased knowledge. Therefore, the potential of Big Data relies on our ability to extract value from these massive data sets. Machine learning is central to this quest because of its ability to learn from data and provide data-driven …


Machine Learning With Kay, Lasith Niroshan, James Carswell Jun 2022

Conference Papers

Computational power is very important when training Deep Learning (DL) models with large amounts of data (Wooldridge, 2021). Hence, High-Performance Computing (HPC) can be leveraged to reduce computational cost, and the Irish Centre for High-End Computing (ICHEC) provides significant infrastructure and services for research and development to both academia and industry. A portion of ICHEC's HPC system has been allocated for institutional access, and this paper presents a case study of how to use Kay (Ireland's national supercomputer) in the remote sensing domain. Specifically, this study uses clusters of Kay Graphics Processing Units (GPUs) for training DL models to extract …


An Empirical Study On Sampling Approaches For 3D Image Classification Using Deep Learning, Nicholas Michelette Jun 2022

Theses and Dissertations

A 3D classification method requires more training data than a 2D image classification method to achieve good performance. These training data usually come in the form of multiple 2D images (e.g., slices in a CT scan) or point clouds (e.g., 3D CAD modeling) for volumetric object representation. The amount of data required to complete this higher dimension problem comes with the cost of requiring more processing time and space. This problem can be mitigated with data size reduction (i.e., sampling). In this thesis, we empirically study and compare the classification performance and deep learning training time of PointNet utilizing uniform …


A Machine Learning Approach To Revenue Generation Within The Professional Hair Care Industry, Alexander K. Sepenu, Linda Eliasen Jun 2022

SMU Data Science Review

The cosmetic and beauty industry continues to grow and evolve to satisfy its patrons. In the United States, the industry is heavily science-driven, innovative, and fast-paced, suggesting that to remain productive and profitable, companies must seek smart alternatives to their current modus operandi or risk losing out on this multi-billion-dollar industry to fierce competition. In this paper, the authors seek to utilize machine learning models such as clustering and regression to improve the efficiency of current sales and customer segmentation models to help HairCo (pseudonym for confidentiality), a professional hair products manufacturer, strategize their marketing and sales efforts for revenue …


Analysis Of The Electric Power Outage Data And Prediction Of Electric Power Outage For Major Metropolitan Areas In Texas Using Machine Learning And Time Series Methods, Renfeng Wang, Venkata Leela 'Mg' Vanga, Zachary B. Zaiken, Jonathan Bennett Jun 2022

SMU Data Science Review

With growing energy usage, power outages affect millions of households. This case study focuses on gathering power outage historical data, modifying the data to attach weather attributes, and gathering ERCOT energy market conditions for Dallas-Fort Worth and Houston metropolitan areas of Texas. The transformed data is then analyzed using machine learning algorithms including, but not limited to, Regression, Random Forests and XGBoost to consider current weather and ERCOT features and predict power outage percentage for locations. The transformed data is also trained using time series models and serially correlated models including Autoregression and Vector Autoregression. This study also focuses on …


Impact Of Movements On Facial Expression Recognition, Zhebin Yin Jun 2022

Honors Theses

The ability to recognize human emotions can be a useful skill for robots. Emotion recognition can help robots understand our responses to robot movements and actions. Human emotions can be recognized through facial expressions. Facial Expression Recognition (FER) is a well-established research area; however, the majority of prior research is based on static datasets of images. With robots, often the subject is moving, the robot is moving, or both. The purpose of this research is to determine the impact of movement on facial expression recognition. We apply a pre-existing model for FER, which performs around 70.86% on a given …


Towards A Computational Model Of Narrative On Social Media, Anne Bailey Jun 2022

Dartmouth College Undergraduate Theses

This thesis describes a variety of approaches to developing a computational model of narrative on social media. Our goal is to use such a narrative model to identify efforts to manipulate public opinion on social media platforms like Twitter. We present a model in which narratives in a collection of tweets are represented as a graph. Elements from each tweet that are relevant to potential narratives are made into nodes in the graph; for this thesis, we populate graph nodes with tweets’ authors, hashtags, named entities (people, locations, organizations, etc.), and moral foundations (central moral values framing the discussion). Two …


Comparing Learned Representations Between Unpruned And Pruned Deep Convolutional Neural Networks, Parker Mitchell Jun 2022

Master's Theses

While deep neural networks have shown impressive performance in computer vision tasks, natural language processing, and other domains, the sizes and inference times of these models can often prevent them from being used on resource-constrained systems. Furthermore, as these networks grow larger in size and complexity, it can become even harder to understand the learned representations of the input data that these networks form through training. These issues of growing network size, increasing complexity and runtime, and ambiguity in the understanding of internal representations serve as guiding points for this work.

In this thesis, we create a neural network that …


Legislative Language For Success, Sanjana Gundala Jun 2022

Master's Theses

Legislative committee meetings are an integral part of the lawmaking process for local and state bills. The testimony presented during these meetings is a large factor in the outcome of the proposed bill. This research uses Natural Language Processing and Machine Learning techniques to analyze testimonies from California Legislative committee meetings from 2015-2016 in order to identify what aspects of a testimony make it successful. A testimony is considered successful if the alignment of the testimony matches the bill outcome (alignment is "For" and the bill passes or alignment is "Against" and the bill fails). The process of finding what …


Machine Learning And The Network Analysis Of Ethereum Trading Data, Santosh Sivakumar Jun 2022

Dartmouth College Undergraduate Theses

Since their conception, cryptocurrencies have captured the public interest, motivating a growing body of research aimed at exploring blockchain-based transactions. This said, little work has been done to draw conclusions from transaction patterns, particularly in the realm of predicting cryptocurrency price movements. Moreover, research in the cryptocurrency sphere largely focuses on Bitcoin, paying little attention to Ethereum, Bitcoin's second-in-line with respect to market capitalization. In this paper, we construct hourly networks for a year of Ethereum transactions, using computed graph metrics as features in a series of machine learning models. We find that regression-based approaches to predicting Ether prices/price deltas …


Symplectically Integrated Symbolic Regression Of Hamiltonian Dynamical Systems, Daniel Dipietro Jun 2022

Computer Science Senior Theses

Here we present Symplectically Integrated Symbolic Regression (SISR), a novel technique for learning physical governing equations from data. SISR employs a deep symbolic regression approach, using a multi-layer LSTM-RNN with mutation to probabilistically sample Hamiltonian symbolic expressions. Using symplectic neural networks, we develop a model-agnostic approach for extracting meaningful physical priors from the data that can be imposed on-the-fly into the RNN output, limiting its search space. Hamiltonians generated by the RNN are optimized and assessed using a fourth-order symplectic integration scheme; prediction performance is used to train the LSTM-RNN to generate increasingly better functions via a risk-seeking policy gradients …