Open Access. Powered by Scholars. Published by Universities.®

Physical Sciences and Mathematics Commons

Machine learning

Articles 91 - 120 of 1686

Full-Text Articles in Physical Sciences and Mathematics

Dataset Of Arabic Spam And Ham Tweets, Sanaa Kaddoura, Safaa Henno Feb 2024

All Works

This data article provides a dataset of 132421 posts and their corresponding information collected from the Twitter social media platform. The data has two classes, ham or spam, where ham indicates clean, non-spam tweets. The main purpose of this dataset is to support research on automatically classifying whether a post is spam. The data is in the Arabic language only, which makes it essential to researchers in Arabic natural language processing (NLP) due to the lack of resources in this language. The data is made publicly available to allow researchers to use it as a benchmark for …


Self-Optimizing Feature Generation Via Categorical Hashing Representation And Hierarchical Reinforcement Crossing, Wangyang Ying, Dongjie Wang, Kunpeng Liu, Leilei Sun, Yanjie Fu Feb 2024

Computer Science Faculty Publications and Presentations

Feature generation aims to generate new and meaningful features to create a discriminative representation space. A generated feature is meaningful when the generated feature is from a feature pair with inherent feature interaction. In the real world, experienced data scientists can identify potentially useful feature-feature interactions, and generate meaningful dimensions from an exponentially large search space in an optimal crossing form over an optimal generation path. But, machines have limited human-like abilities. We generalize such learning tasks as self-optimizing feature generation. Self-optimizing feature generation imposes several under-addressed challenges on existing systems: meaningful, robust, and efficient generation. To tackle these challenges, …


An Enhanced Deep Autoencoder For Flight Delay Prediction, Desmond B. Bisandu PhD, Dan Andrei Soviani-Sitoiu MSc, Irene Moulitsas PhD Jan 2024

Journal of Aviation/Aerospace Education & Research

Accurate and timely flight delay prediction cannot be overemphasized because of the ever-increasing demand for air travel and its importance in deploying intelligent transportation systems. Nonetheless, there has not been a universal solution to the problem, as more intelligent flight decision systems are required for the aviation industry's future growth. Existing flight delay classification and prediction approaches are mainly shallow traffic models and do not satisfy many applications in the real world. Our motivation to rethink the deep architecture model for predicting flight delays emanates from the problem. In this research, we proposed a technique that modified stacked autoencoder architecture …


Use Of Artificial Intelligence In Drug Development, Louise C. Druedahl, Nicholson Price, Timo Minssen, Dipl Jur, Ameet Sarpatwari Jan 2024

Articles

Considerable focus has been placed on the health care applications of artificial intelligence (AI). Already, machine learning, a subset of AI that involves “the use of data and algorithms to imitate the way that humans learn” has been used to predict diseases, while AI-powered smartphone apps have been developed to promote mental health and weight loss. Owing in part to such successes, the market for AI in health care has been forecasted to increase more than 1000% between 2022 and 2029, from $13.8 billion to $164.1 billion. One area of substantial promise is drug development, which is poised to benefit …


Adaptive Multi-Label Classification On Drifting Data Streams, Martha Roseberry Jan 2024

Theses and Dissertations

Drifting data streams and multi-label data are both challenging problems. When multi-label data arrives as a stream, the challenges of both problems must be addressed along with additional challenges unique to the combined problem. Algorithms must be fast and flexible, able to match both the speed and evolving nature of the stream. We propose four methods for learning from multi-label drifting data streams. First, a multi-label k Nearest Neighbors with Self Adjusting Memory (ML-SAM-kNN) exploits short- and long-term memories to predict the current and evolving states of the data stream. Second, a punitive k nearest neighbors algorithm with a self-adjusting …


Assessing Interatomic Potentials For Molecular Dynamics Simulation Of Soybean Oil Pyrolysis, Tanner Garrett Rust Jan 2024

MSU Graduate Theses

The world today relies on hydrocarbon combustion for many reasons, including its high energy density that provides ease of transportation. However, hydrocarbons sourced from fossil fuels are not expected to last forever. Biodiesel, a renewable alternative, has many attractive benefits but comes with other downsides. Biodiesel can gel in cold environments and may leave residue in an engine. Pyrolysis of biodiesel has shown promise in addressing these common detriments. Inducing pyrolysis on biodiesel feedstock (commonly soybean oil in the USA) would be an attractive option presuming it continues to produce fossil fuel analogs similar to biodiesel pyrolysis. Herein, Langevin molecular …


Applications Of AI/ML In Maritime Cyber Supply Chains, Rafael Diaz, Ricardo Ungo, Katie Smith, Lida Haghnegahdar, Bikash Singh, Tran Phuong Jan 2024

School of Cybersecurity Faculty Publications

Digital transformation is a new trend that describes enterprise efforts in transitioning manual and likely outdated processes and activities to digital formats dominated by the extensive use of Industry 4.0 elements, including the pervasive use of cyber-physical systems to increase efficiency, reduce waste, and increase responsiveness. A new domain that intersects supply chain management and cybersecurity emerges as many processes as possible of the enterprise require the convergence and synchronizing of resources and information flows in data-driven environments to support planning and execution activities. Protecting the information becomes imperative as big data flows must be parsed and translated into actions …


EnhancedBERT: A Feature-Rich Ensemble Model For Arabic Word Sense Disambiguation With Statistical Analysis And Optimized Data Collection, Sanaa Kaddoura, Reem Nassar Jan 2024

All Works

Accurate assignment of meaning to a word based on its context, known as Word Sense Disambiguation (WSD), remains challenging across languages. Extensive research aims to develop automated methods for determining word senses in different contexts. However, the literature lacks datasets built for Arabic WSD. This paper presents a dataset comprising a hundred polysemous Arabic words. Each word in the dataset has 3–8 distinct senses, with ten example sentences per sense. Statistical analyses are conducted to gain insight into the dataset's characteristics and properties. Subsequently, a novel WSD approach is proposed to utilize …


A Comparison Of Lexical Tokenization Methods, Nathan Culmer Jan 2024

Williams Honors College, Honors Research Projects

The purpose of this project was to compare tokenization methods, or methods of breaking up a text into meaningful parts for use in natural language processing. The effectiveness of several commonly used tokenization methods was investigated, including morpheme tokenization, which takes into account the linguistic features of the language. In addition, I proposed and implemented a new technique that considers the capitalization pattern of a word in the tokenization process, so that this process can include more natural language features. The effectiveness of these methods was compared by using them in a sentiment analysis model for various datasets, …
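The abstract does not say how capitalization is encoded, so the following is only a minimal sketch of one plausible way to carry a word's capitalization pattern through tokenization: each word is lowercased and, where relevant, preceded by a marker token. The marker names `<cap>` and `<upper>` and the function `tokenize_with_case` are hypothetical and are not taken from the project.

```python
import re

# Hypothetical marker tokens; the names are illustrative only.
CAP_FIRST = "<cap>"    # word started with a capital letter
CAP_ALL = "<upper>"    # word of two or more letters written in all caps


def tokenize_with_case(text):
    """Split text into lowercase tokens, inserting a marker token before
    any word that was Capitalized or written in ALL CAPS."""
    tokens = []
    # Words are runs of ASCII letters; every other non-space character
    # (punctuation, digits) becomes its own token.
    for word in re.findall(r"[A-Za-z]+|[^\sA-Za-z]", text):
        if word.isalpha() and len(word) > 1 and word.isupper():
            tokens.append(CAP_ALL)
        elif word[0].isupper():
            tokens.append(CAP_FIRST)
        tokens.append(word.lower())
    return tokens
```

With this scheme, "LOVE" and "love" share one vocabulary entry while the emphasis carried by the capitals is still visible to a downstream sentiment model.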


Machine Learning As A Tool For Early Detection: A Focus On Late-Stage Colorectal Cancer Across Socioeconomic Spectrums, Hadiza Galadima, Rexford Anson-Dwamena, Ashley Johnson, Ghalib Bello, Georges Adunlin, James Blando Jan 2024

Community & Environmental Health Faculty Publications

Purpose: To assess the efficacy of various machine learning (ML) algorithms in predicting late-stage colorectal cancer (CRC) diagnoses against the backdrop of socio-economic and regional healthcare disparities. Methods: An innovative theoretical framework was developed to integrate individual- and census tract-level social determinants of health (SDOH) with sociodemographic factors. A comparative analysis of the ML models was conducted using key performance metrics such as AUC-ROC to evaluate their predictive accuracy. Spatio-temporal analysis was used to identify disparities in late-stage CRC diagnosis probabilities. Results: Gradient boosting emerged as the superior model, with the top predictors for late-stage CRC diagnosis being anatomic site, …


Reinforcement Learning: Applying Low Discrepancy Action Selection To Deep Deterministic Policy Gradient, Aleksandr Svishchev Jan 2024

Electronic Theses and Dissertations

Reinforcement learning (RL) is a subfield of machine learning concerned with agents learning to behave optimally by interacting with an environment. One of the most important topics in RL is how the agent should explore, that is, how to choose actions in order to rate their impact on long-term reward. For example, a simple baseline strategy might be uniformly random action selection. This thesis investigates the heuristic idea that agents will learn faster if they explore by factoring the environment’s state into their decision and intentionally choose actions which are as different as possible from what they have previously observed. …


Music Recommendation Using Exemplars And Contrastive Learning, Tina Tran Jan 2024

Honors Undergraduate Theses

The popularity of AI audio applications is growing; they are used in chatbots, automated voice translation, virtual assistants, and text-to-speech translation. Audio classification is crucial in today’s world, with a growing need to sort and classify millions of existing audio recordings and increasing amounts of new data uploaded over time. In the area of classification lies the difficult and lucrative problem of music recommendation. Research in music recommendation has trended over time towards collaborative-based approaches utilizing large amounts of user data. These approaches tend to deal with the cold-start problem of insufficient data and are costly to train. We look …


Towards Machine Learning-Based Control Of Autonomous Vehicles In Solar Panel Cleaning Systems, Farima Hajiahmadi Jan 2024

Theses and Dissertations

This thesis presents a machine learning (ML)-based approach for the intelligent control of Autonomous Vehicles (AVs) utilized in solar panel cleaning systems, aiming to mitigate challenges arising from uncertainties, disturbances, and dynamic environments. Solar panels, predominantly situated in dedicated lands for solar energy production (e.g., agricultural solar farms), are susceptible to dust and debris accumulation, leading to diminished energy absorption. Instead of labor-intensive manual cleaning, robotic cleaners offer a viable solution. AVs equipped to transport and precisely position these cleaning robots are indispensable for efficient navigation among solar panel arrays. However, environmental obstacles (e.g., rough terrain), variations in solar panel …


Accelerating Markov Chain Monte Carlo Sampling With Diffusion Models, N. T. Hunt-Smith, W. Melnitchouk, F. Ringer, N. Sato, A. W. Thomas, M. J. White Jan 2024

Physics Faculty Publications

Global fits of physics models require efficient methods for exploring high-dimensional and/or multimodal posterior functions. We introduce a novel method for accelerating Markov Chain Monte Carlo (MCMC) sampling by pairing a Metropolis-Hastings algorithm with a diffusion model that can draw global samples with the aim of approximating the posterior. We briefly review diffusion models in the context of image synthesis before providing a streamlined diffusion model tailored towards low-dimensional data arrays. We then present our adapted Metropolis-Hastings algorithm which combines local proposals with global proposals taken from a diffusion model that is regularly trained on the samples produced during the …
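The idea of mixing local and global proposals inside one Metropolis-Hastings sampler can be sketched as follows. This is a toy illustration under stated assumptions, not the authors' algorithm: the target is a made-up one-dimensional bimodal density, and a fixed wide Gaussian stands in for the diffusion model that the paper retrains on the chain's samples. All names (`log_target`, `log_global`, `GLOBAL_SIGMA`, `sample`) are hypothetical.

```python
import math
import random

GLOBAL_SIGMA = 4.0  # assumed width of the stand-in global proposal


def log_target(x):
    """Unnormalised log-density of a toy bimodal target (modes at -3 and +3)."""
    a = -0.5 * (x - 3.0) ** 2
    b = -0.5 * (x + 3.0) ** 2
    m = max(a, b)
    return m + math.log(math.exp(a - m) + math.exp(b - m))


def log_global(x):
    """Log-density, up to a constant, of the global proposal N(0, GLOBAL_SIGMA^2)."""
    return -0.5 * (x / GLOBAL_SIGMA) ** 2


def log_accept_ratio(x, y, is_global):
    """Log Metropolis-Hastings ratio for a proposed move x -> y."""
    ratio = log_target(y) - log_target(x)
    if is_global:
        # Independence proposal: correct for the proposal density q.
        ratio += log_global(x) - log_global(y)
    # The local random-walk proposal is symmetric, so no correction is needed.
    return ratio


def sample(n_steps, p_global=0.2, step=0.5, seed=0):
    """Run the chain, choosing a global proposal with probability p_global."""
    rng = random.Random(seed)
    x, chain = 3.0, []
    for _ in range(n_steps):
        is_global = rng.random() < p_global
        y = rng.gauss(0.0, GLOBAL_SIGMA) if is_global else x + rng.gauss(0.0, step)
        if math.log(1.0 - rng.random()) < log_accept_ratio(x, y, is_global):
            x = y
        chain.append(x)
    return chain
```

The local step alone would rarely cross the low-probability region between the two modes; the occasional global draw lets the chain jump between them while the acceptance test keeps the target distribution invariant.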


Adaptable And Trustworthy Machine Learning For Human Activity Recognition From Bioelectric Signals, Morgan S. Stuart Jan 2024

Theses and Dissertations

Enabling machines to learn measures of human activity from bioelectric signals has many applications in human-machine interaction and healthcare. However, labeled activity recognition datasets are costly to collect and highly varied, which challenges machine learning techniques that rely on large datasets. Furthermore, activity recognition in practice needs to account for user trust - models are motivated to enable interpretability, usability, and information privacy. The objective of this dissertation is to improve adaptability and trustworthiness of machine learning models for human activity recognition from bioelectric signals. We improve adaptability by developing pretraining techniques that initialize models for later specialization to unseen …


Machine Learning - Hail Awareness Spatial Analysis Toolkit (HASAT), Haoruo Fu M.S., Joseph P. Hupy Ph.D., Chien-Tsung Lu Ph.D., Zhenglei Ji M.S. Jan 2024

Journal of Aviation/Aerospace Education & Research

The National Airspace System (NAS) is a sophisticated network of air traffic control, navigation, and communication systems that play a critical role in ensuring the safe and efficient flow of air traffic across the United States. However, the occurrence of severe weather conditions, particularly hailstorms, poses a significant threat to flight safety within the NAS. To mitigate the risks associated with hail, aviation organizations have implemented a range of safety measures. This study utilized Esri’s ArcGIS as a mapping software to conduct a geospatial analysis of the impact of severe weather, particularly hail, on the NAS. The Hail Awareness Spatial …


A Survey On Few-Shot Class-Incremental Learning, Songsong Tian, Lusi Li, Weijun Li, Hang Ran, Xin Ning, Prayag Tiwari Jan 2024

Computer Science Faculty Publications

Large deep learning models are impressive, but they struggle when real-time data is not available. Few-shot class-incremental learning (FSCIL) poses a significant challenge for deep neural networks to learn new tasks from just a few labeled samples without forgetting the previously learned ones. This setup can easily lead to catastrophic forgetting and overfitting problems, severely affecting model performance. Studying FSCIL helps overcome deep learning model limitations on data volume and acquisition time, while improving practicality and adaptability of machine learning models. This paper provides a comprehensive survey on FSCIL. Unlike previous surveys, we aim to synthesize few-shot learning and incremental …


Short: Can Citations Tell Us About A Paper's Reproducibility? A Case Study Of Machine Learning Papers, Rochana R. Obadage, Sarah M. Rajtmajer, Jian Wu Jan 2024

Computer Science Faculty Publications

The iterative character of work in machine learning (ML) and artificial intelligence (AI) and reliance on comparisons against benchmark datasets emphasize the importance of reproducibility in that literature. Yet, resource constraints and inadequate documentation can make running replications particularly challenging. Our work explores the potential of using downstream citation contexts as a signal of reproducibility. We introduce a sentiment analysis framework applied to citation contexts from papers involved in Machine Learning Reproducibility Challenges in order to interpret the positive or negative outcomes of reproduction attempts. Our contributions include training classifiers for reproducibility-related contexts and sentiment analysis, and exploring correlations between …


Selecting And Evaluating Key MDS-UPDRS Activities Using Wearable Devices For Parkinson's Disease Self-Assessment, Yuting Zhao, Xulong Wang, Xiyang Peng, Ziheng Li, Fengtao Nan, Menghui Zhuo, Jun Qi, Yun Yang, Zhong Zhao, Lida Xu, Po Yang Jan 2024

Information Technology & Decision Sciences Faculty Publications

Parkinson's disease (PD) is a complex neurodegenerative disease in the elderly. This disease has no cure, but assessing its motor symptoms can help slow its progression. Inertial sensing-based wearable devices (ISWDs) such as mobile phones and smartwatches have been widely employed to analyse the condition of PD patients. However, most studies have focused purely on a single activity or symptom, which may ignore the correlation between activities and complementary characteristics. In this paper, a novel technical pipeline is proposed for fine-grained classification of PD severity grades, which identifies the most representative activities. We also propose a multi-activities combination scheme based …


Trading Cloud Computing Stocks Using SMA, Xianrong Zheng, Lingyu Li Jan 2024

Information Technology & Decision Sciences Faculty Publications

As cloud computing adoption becomes mainstream, the cloud services market offers vast profits. Moreover, serverless computing, the next stage of cloud computing, comes with huge economic potential. To capitalize on this trend, investors are interested in trading cloud stocks. Because cloud stocks are high-growth technology stocks, investing in them is both rewarding and challenging. The research question here is how a trading strategy will perform on cloud stocks. To answer it, this paper employs an effective method, Simple Moving Average (SMA), to trade cloud stocks. To evaluate its performance, we conducted extensive experiments with real market data spanning more than 23 years. Results show …
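The abstract does not state which SMA rule the paper trades on, so the sketch below assumes a common one: a crossover rule that holds the stock while a short-window SMA is above a long-window SMA. The function names and the window lengths are hypothetical choices for illustration.

```python
def sma(prices, window):
    """Simple moving average of the last `window` prices.
    Entries are None until enough prices are available."""
    out = []
    running = 0.0
    for i, price in enumerate(prices):
        running += price
        if i >= window:
            running -= prices[i - window]  # drop the price leaving the window
        out.append(running / window if i >= window - 1 else None)
    return out


def crossover_signals(prices, short=2, long=3):
    """Return 1 (hold the stock) when the short SMA is above the long SMA,
    otherwise 0 (stay out). Assumed rule, not taken from the paper."""
    short_avg = sma(prices, short)
    long_avg = sma(prices, long)
    return [
        1 if a is not None and b is not None and a > b else 0
        for a, b in zip(short_avg, long_avg)
    ]
```

On a steadily rising price series the short average sits above the long one and the rule stays invested; on a falling series it stays out. Real use would need much longer windows and transaction costs, which this sketch ignores.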


Reducing The Uncertainty In Estimating Soil Microbial-Derived Carbon Storage, Han Hu, Chao Qian, Ke Xue, Rainer Georg Jörgensen, Marco Keiluweit, Chao Liang, Xuefeng Zhu, Ji Chen, Yishen Sun, Haowei Ni, Jixian Ding, Weigen Huang, Jingdong Mao, Rong-Xi Tan, Jizhong Zhou, Thomas W. Crowther, Zhi-Hua Zhou, Jiabao Zhang, Yuting Liang Jan 2024

Chemistry & Biochemistry Faculty Publications

Soil organic carbon (SOC) is the largest carbon pool in terrestrial ecosystems and plays a crucial role in mitigating climate change and enhancing soil productivity. Microbial-derived carbon (MDC) is the main component of the persistent SOC pool. However, current formulas used to estimate the proportional contribution of MDC are plagued by uncertainties due to limited sample sizes and the neglect of bacterial group composition effects. Here, we compiled a comprehensive global dataset and employed machine learning approaches to refine our quantitative understanding of MDC contributions to total carbon storage. Our efforts resulted in a reduction in the relative standard errors …


Artificial Intelligence For The Electron Ion Collider (AI4EIC), C. Allaire, R. Ammendola, E.-C. Aschenauer, M. Balandat, M. Battaglieri, J. Bernauer, M. Bondì, N. Branson, T. Britton, A. Butter, I. Chahrour, P. Chatagnon, E. Cisbani, E. W. Cline, S. Dash, C. Dean, W. Deconinck, A. Deshpande, M. Diefenthaler, R. Ent, C. Fanelli, M. Finger, M. Finger Jr., E. Fol, S. Furletov, Y. Gao, J. Giroux, N. C. Gunawardhana Waduge, O. Hassan, P. L. Hegde, R. J. Hernandez-Pinto, A. Hiller Blin, T. Horn, J. Huang, A. Jalotra, D. Jayakodige, B. Joo, M. Junaid, N. Kalantarians, P. Karande, B. Kriesten, R. Kunnawalkam Elayavalli, Y. Li, M. Lin, F. Liu, S. Liuti, G. Matousek, M. McEneaney, D. McSpadden, T. Menzo, T. Miceli, V. Mikuni, R. Montgomery, B. Nachman, R. R. Nair, J. Niestroy, S. A. Ochoa Oregon, J. Oleniacz, J. D. Osborn, C. Paudel, C. Pecar, C. Peng, G. N. Perdue, W. Phelps, M. L. Purschke, H. Rajendran, K. Rajput, Y. Ren, D. F. Renteria-Estrada, D. Richford, B. J. Roy, D. Roy, A. Saini, N. Sato, T. Satogata, G. Sborlini, M. Schram, D. Shih, J. Singh, R. Singh, A. Siodmok, J. Stevens, P. Stone, L. Suarez, K. Suresh, A.-N. Tawfik, F. Torales Acosta, N. Tran, R. Trotta, F. J. Twagirayezu, R. Tyson, S. Volkova, A. Vossen, E. Walter, D. Whiteson, M. Williams, S. Wu, N. Zachariou, P. Zurita Jan 2024

Computer Science Faculty Publications

The Electron-Ion Collider (EIC), a state-of-the-art facility for studying the strong force, is expected to begin commissioning its first experiments in 2028. This is an opportune time for artificial intelligence (AI) to be included from the start at this facility and in all phases that lead up to the experiments. The second annual workshop organized by the AI4EIC working group, which recently took place, centered on exploring all current and prospective application areas of AI for the EIC. This workshop is not only beneficial for the EIC, but also provides valuable insights for the newly established ePIC collaboration at EIC. …


Bringing GANs To Medieval Times: Manuscript Translation Models, Tonilynn M. Holtz Jan 2024

Electronic Theses and Dissertations

Generative Adversarial Networks (GANs) recently emerged as a powerful framework for producing new knowledge from existing knowledge. These models aim to learn patterns from input data and then use that knowledge to generate output data samples that plausibly appear to belong to the same set as the input data. The study of medieval manuscripts has been an important research area in the humanities for many decades. These rare manuscripts are often inaccessible to the general public, including students and scholars, and it is of great interest to provide digital support (including, but not limited to, translation and search) for …


Molecular Understanding And Design Of Deep Eutectic Solvents And Proteins Using Computer Simulations And Machine Learning, Usman Lame Abbas Jan 2024

Theses and Dissertations--Chemical and Materials Engineering

Hydrophobic deep eutectic solvents (DESs) have emerged as excellent extractants. A major challenge is the lack of an efficient tool to discover DES candidates. Currently, the search relies heavily on the researchers’ intuition or a trial-and-error process, which leads to a low success rate or bypassing of promising candidates. DES performance depends on the heterogeneous hydrogen bond environment formed by multiple hydrogen bond donors and acceptors. Understanding this heterogeneous hydrogen bond environment can help develop principles for designing high performance DESs for extraction and other separation applications. This work investigates the structure and dynamics of hydrogen bonds in hydrophobic DESs …


Identifying The Origins Of Business’ Data Breaches Utilizing Covert Timing Channels, Gayle L. Frisbie Jan 2024

Master's Theses and Doctoral Dissertations

Cybersecurity events and data breaches are on the rise and are very costly to businesses. Businesses rely on connectivity and information systems to conduct business, yet those same information systems can be breached and the organization's data exposed. Today, organizations rely heavily on network connections to link the entire organization so that business can be conducted efficiently and from multiple locations. Covert timing channels are a cybersecurity attack method in which malicious actors embed privileged information into normal network traffic without authorization. Malicious actors, by carefully manipulating timing patterns in covert timing channels, can create a hidden …


Probing The Ising Model’s Thermodynamics Through Restricted Boltzmann Machines, Xiaobei (Emma) Zhang Jan 2024

HMC Senior Theses

This thesis explores the connection between physics and machine learning by using Restricted Boltzmann Machines (RBMs) to study the thermodynamic properties of the Ising model. The Ising model is a simple but realistic model that captures the magnetic behavior of a system, where spins occupy a lattice of sites and different spin configurations correspond to different energies. The model exhibits phase transitions between ferromagnetic and paramagnetic phases as a function of temperature. RBMs are two-layered neural networks that can learn probability distributions over binary spins. The study generates 2D Ising model data at different temperatures using Monte Carlo simulations, including …
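Generating 2D Ising configurations by Monte Carlo simulation can be sketched with single-spin-flip Metropolis updates. The abstract does not say which Monte Carlo algorithm the thesis uses, so Metropolis is an assumption here, as are the coupling J = 1, the units with k_B = 1, the periodic boundaries, and all function names.

```python
import math
import random


def energy(spins):
    """Total energy E = -sum over nearest-neighbour bonds of s_i * s_j
    (J = 1), with periodic boundaries. Each bond is counted once by
    looking only at the neighbour below and the neighbour to the right."""
    n = len(spins)
    e = 0
    for i in range(n):
        for j in range(n):
            e -= spins[i][j] * (spins[(i + 1) % n][j] + spins[i][(j + 1) % n])
    return e


def metropolis_sweep(spins, temperature, rng):
    """One Monte Carlo sweep: n*n attempted single-spin flips, in place."""
    n = len(spins)
    for _ in range(n * n):
        i, j = rng.randrange(n), rng.randrange(n)
        neighbours = (spins[(i + 1) % n][j] + spins[(i - 1) % n][j]
                      + spins[i][(j + 1) % n] + spins[i][(j - 1) % n])
        delta_e = 2 * spins[i][j] * neighbours  # energy change if flipped
        # Accept downhill moves always, uphill moves with Boltzmann weight.
        if delta_e <= 0 or rng.random() < math.exp(-delta_e / temperature):
            spins[i][j] = -spins[i][j]


def magnetization(spins):
    """Mean spin per site, between -1 and +1."""
    n = len(spins)
    return sum(map(sum, spins)) / (n * n)
```

Running many sweeps at each temperature and saving the resulting lattices gives binary configurations of the kind an RBM could be trained on; well below the critical temperature the lattice stays almost fully aligned, while at high temperature the magnetization fluctuates around zero.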


Advancing The Understanding Of Clinical Sepsis Using Gene Expression–Driven Machine Learning To Improve Patient Outcomes, Asrar Rashid, Feras Al-Obeidat, Wael Hafez, Govind Benakatti, Rayaz A. Malik, Christos Koutentis, Javed Sharief, Joe Brierley, Nasir Quraishi, Zainab A. Malik, Arif Anwary, Hoda Alkhzaimi, Syed Ahmed Zaki, Praveen Khilnani, Raziya Kadwa, Rajesh Phatak, Maike Schumacher, M. Guftar Shaikh, Ahmed Al-Dubai, Amir Hussain Jan 2024

All Works

Sepsis remains a major challenge that necessitates improved approaches to enhance patient outcomes. This study explored the potential of machine learning (ML) techniques to bridge the gap between clinical data and gene expression information to better predict and understand sepsis. We discuss the application of ML algorithms, including neural networks, deep learning, and ensemble methods, to address key evidence gaps and overcome the challenges in sepsis research. The lack of a clear definition of sepsis is highlighted as a major hurdle, but ML models offer a workaround by focusing on endpoint prediction. We emphasize the significance of gene transcript information …


Deep Transfer Learning-Based Bird Species Classification Using Mel Spectrogram Images, Mrinal Kanti Baowaly, Bisnu Chandra Sarkar, Md. Abul Ala Walid, Md. Martuza Ahamad, Bikash Chandra Singh, Eduardo Silva Alvarado, Imran Ashraf, Md. Abdus Samad Jan 2024

School of Cybersecurity Faculty Publications

The classification of bird species is of significant importance in the field of ornithology, as it plays an important role in assessing and monitoring environmental dynamics, including habitat modifications, migratory behaviors, levels of pollution, and disease occurrences. Traditional methods of bird classification, such as visual identification, were time-intensive and required a high level of expertise. However, audio-based bird species classification is a promising approach that can be used to automate bird species identification. This study aims to establish an audio-based bird species classification system for 264 Eastern African bird species employing modified deep transfer learning. In particular, the pre-trained EfficientNet …


Infusing Machine Learning And Computational Linguistics Into Clinical Notes, Funke V. Alabi, Onyeka Omose, Omotomilola Jegede Jan 2024

Mathematics & Statistics Faculty Publications

Entering free-form text notes into Electronic Health Records (EHR) systems takes a lot of clinicians' time. A large portion of this paperwork is viewed as a burden, which cuts into the amount of time doctors spend with patients and increases the risk of burnout. We will see how machine learning and computational linguistics can be infused into the process of taking clinical notes. We present a new language modeling task that predicts the content of notes conditioned on historical data from a patient's medical record, such as patient demographics, lab results, medications, and previous notes, with the …


Inexact Fixed-Point Proximity Algorithm For The ℓ₀ Sparse Regularization Problem, Ronglong Fang, Yuesheng Xu, Mingsong Yan Jan 2024

Mathematics & Statistics Faculty Publications

We study inexact fixed-point proximity algorithms for solving a class of sparse regularization problems involving the ℓ₀ norm. Specifically, the ℓ₀ model has an objective function that is the sum of a convex fidelity term and a Moreau envelope of the ℓ₀ norm regularization term. Such an ℓ₀ model is non-convex. Existing exact algorithms for solving the problems require the availability of closed-form formulas for the proximity operator of convex functions involved in the objective function. When such formulas are not available, numerical computation of the proximity operator becomes inevitable. This leads to inexact iteration algorithms. We investigate in this …
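For readers unfamiliar with proximity operators, a small example may help. Under the standard definition prox_f(x) = argmin_u { ½‖u − x‖² + f(u) }, the proximity operator of λ‖·‖₀ has a well-known closed form, hard thresholding. The sketch below shows only that building block; it is not the paper's inexact fixed-point algorithm, and the function name `prox_l0` is hypothetical.

```python
import math


def prox_l0(x, lam):
    """Proximity operator of lam * ||.||_0, applied entrywise.

    Keeping an entry x_i costs lam; zeroing it costs x_i**2 / 2.
    So an entry survives only when |x_i| > sqrt(2 * lam)."""
    threshold = math.sqrt(2.0 * lam)
    return [xi if abs(xi) > threshold else 0.0 for xi in x]
```

For example, with `lam = 0.5` the threshold is 1, so small entries are set exactly to zero while large ones pass through unchanged. When a term in the objective has no such closed form, its proximity operator must be computed numerically, which is the situation the abstract describes.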