Open Access. Powered by Scholars. Published by Universities.®
Physical Sciences and Mathematics Commons™
Articles 61 - 90 of 826
Full-Text Articles in Physical Sciences and Mathematics
Enhancing Scanning Tunneling Microscopy With Automation And Machine Learning, Darian Smalley
Graduate Thesis and Dissertation 2023-2024
The scanning tunneling microscope (STM) is one of the most advanced surface science tools capable of atomic resolution imaging and atomic manipulation. Unfortunately, STM has many time-consuming bottlenecks, like probe conditioning, tip instability, and noise artificing, which cause the technique to have low experimental throughput. This dissertation describes my efforts to address these challenges through automation and machine learning. It consists of two main sections, each describing four projects, for a total of eight studies.
The first section details two studies on nanoscale sample fabrication and two studies on STM tip preparation. The first two studies describe the fabrication of …
Improving F-Beta Score In Classifying Shark Data Into Shark Behaviors, Ibrahim M. Ali
CGU Theses & Dissertations
One metric used to measure classification performance in machine learning is the F-beta score. The objective of this thesis is to improve the average F-beta score computed in classifying shark data into shark behaviors, namely Resting, Swimming, Feeding, and Non-Directed Motion (NDM). Synthetic Minority Oversampling Technique (SMOTE) and Adaptive Synthetic Sampling (ADASYN) are utilized to balance the data, from which pre-processed Fast Fourier Transform (FFT), Walsh-Hadamard Transform (WHT), and Autocorrelation (AC) features are extracted and then classified using a Convolutional Neural Network (CNN) and K-Nearest Neighbors (K-NN). All the combinations of the two balancing techniques, the three feature types, and the two machine …
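For readers unfamiliar with the metric named in this abstract, the following is a minimal sketch of how an F-beta score and its per-class average can be computed from confusion counts. The function names `f_beta` and `macro_f_beta`, the count layout, and the example numbers are illustrative assumptions; they are not taken from the thesis, which may average or weight the score differently.

```python
def f_beta(tp, fp, fn, beta=1.0):
    """F-beta from true positives, false positives, and false negatives.

    beta > 1 weights recall more heavily; beta < 1 favors precision.
    """
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = beta ** 2 * precision + recall
    if denom == 0:
        return 0.0  # no correct predictions for this class
    return (1 + beta ** 2) * precision * recall / denom


def macro_f_beta(counts, beta=1.0):
    """Unweighted mean of per-class F-beta scores.

    counts maps a class label to a (tp, fp, fn) tuple (hypothetical layout).
    """
    scores = [f_beta(tp, fp, fn, beta) for tp, fp, fn in counts.values()]
    return sum(scores) / len(scores)


# Hypothetical counts for two of the behavior classes, for illustration only.
example_counts = {"Resting": (8, 2, 2), "Feeding": (6, 2, 6)}
average_f2 = macro_f_beta(example_counts, beta=2.0)
```

An unweighted (macro) average treats rare behaviors the same as common ones, which is one reason it is often paired with class-balancing methods such as those mentioned above.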
Smart Applications And Resource Management In Internet Of Things, Zeinab Akhavan
Computer Science ETDs
Internet of Things (IoT) technologies are currently the principal solutions driving smart cities. New technologies such as cyber-physical systems, 5G, and data analytics have emerged to address various city infrastructure issues, ranging from transportation and energy management to healthcare systems. An IoT setting primarily consists of a wide range of users and devices forming a massive network that interacts with different layers of the city infrastructure, generating a sheer volume of data that enables smart city services. The goal of smart city services is to create value for the entire ecosystem, whether this is health, education, transportation, energy, …
An Empirical Study Of Machine Learning Techniques For Accurate Stock Price Forecasting, Daniel Paliulis, Hari Patchigolla
Honors Scholar Theses
This paper presents a comprehensive approach to predicting future stock prices of companies using machine learning and time series analysis. The research problem is centered around addressing the complexity and emotion-driven nature of stock investment decisions. To create an objective determinant in stock decisions, we propose a machine learning model utilizing time series data from major companies, including Amazon, Apple, Google, Nvidia, Meta, Tesla, Salesforce, Intel, and Microsoft. We explore the use of Long Short-Term Memory (LSTM) neural networks to capture the temporal dynamics of stock prices. These models are designed to process sequential data, maintaining short-term and long …
Investigation Into A Practical Application Of Reinforcement Learning For The Stock Market, Philip Traxler, Sadik Aman, Will Rogers, Allyn Okun
SMU Data Science Review
A major problem in the financial industry is adapting trading strategies at the same rate the market evolves. This paper proposes a solution using existing Reinforcement Learning libraries to help find new strategies at a practical scale. Using a wide domain of ticker symbols, an algorithm is trained in an environment that better represents reality. The supplied decision-making algorithm is tested using recorded data from the U.S. stock market from 2000 through 2022. The results of this research show that existing techniques are statistically better than making decisions at random. With this result, this research shows …
A Prompt Engineering Approach To Creating Automated Commentary For Microsoft Self-Help Documentation Metric Reports Using ChatGPT, Ryan Herrin, Luke Stodgel, Brian Raffety
SMU Data Science Review
Microsoft collects an immense amount of data from the users of its product self-help documentation. Employees use this data to identify these self-help articles' performance trends and measure their impact on business Key Performance Indicators (KPIs). Microsoft uses various tools like Power BI and Python to analyze this data. The problem is that their analysis and findings are summarized manually. Therefore, this research will improve upon their current analysis methods by applying the latest prompt engineering practices and the power of ChatGPT's large language models (LLMs). Using VBA code, Microsoft Excel, and the ChatGPT API as an Excel add-in, this research …
Study Of Augmentations On Historical Manuscripts Using TrOCR, Erez Meoded
Theses and Dissertations
Historical manuscripts are an essential source of original content. For many reasons, it is hard to recognize these manuscripts as text. This thesis used a state-of-the-art Handwritten Text Recognizer, TrOCR, to recognize a 16th-century manuscript. TrOCR uses a vision transformer to encode the input images and a language transformer to decode them back to text. We showed that carefully preprocessed images and designed augmentations can improve the performance of TrOCR. We suggest an ensemble of augmented models to achieve an even better performance.
Generalized Differentiable Neural Architecture Search With Performance And Stability Improvements, Emily J. Herron
Doctoral Dissertations
This work introduces improvements to the stability and generalizability of Cyclic DARTS (CDARTS). CDARTS is a Differentiable Architecture Search (DARTS)-based approach to neural architecture search (NAS) that uses a cyclic feedback mechanism to train search and evaluation networks concurrently, thereby optimizing the search process by enforcing that the networks produce similar outputs. However, the dissimilarity between the loss functions used by the evaluation networks during the search and retraining phases results in a search-phase evaluation network, a sub-optimal proxy for the final evaluation network utilized during retraining. ICDARTS, a revised algorithm that reformulates the search phase loss functions to ensure …
Towards Explaining Neural Networks: Tools For Visualizing Activations And Parameters, Juan Puebla
Open Access Theses & Dissertations
There is a growing number of applications using neural networks for making decisions. However, there is a general lack of understanding of how neural networks work. Neural networks have even been described as black boxes which has led to a lack of trust in artificially intelligent programs. To remedy this, explainable artificial intelligence has risen as a means to validate the decision-making processes and the results of computer programs that use artificial intelligence. The work in this master's thesis is our contribution to explainable artificial intelligence, focusing on neural networks with the goal of helping users make more sense of …
Context-Aware Temporal Embeddings For Text And Video Data, Ahnaf Farhan
Open Access Theses & Dissertations
Recent years have seen an exponential increase in unstructured data, primarily in the form of text, images, and videos. Extracting useful features and trends from large-scale unstructured datasets -- such as news outlets, scientific papers, and videos like security cameras or body cam recordings -- is faced with substantial challenges of volume, scalability, complexity, and semantic understanding. In analyzing trends, comprehending the temporal context is vital for uncovering patterns and narratives that are not apparent from a single video frame or text document. Despite its importance, many existing data mining and machine learning approaches overlook extracting evolutionary contextual features in …
Integrating Machine Learning Methods For Medical Diagnosis, Jazmin Quezada
Open Access Theses & Dissertations
The rapid advancement of machine learning techniques has revolutionized the field of medical diagnosis by offering powerful tools to analyze complex data sets and make accurate predictions. We present a novel approach that integrates machine learning and optimization models to enhance the accuracy of medical diagnoses. Our method focuses on fine-tuning and optimizing the parameters of machine learning algorithms commonly used in medical diagnosis, such as logistic regression, support vector machines, and neural networks. By employing optimization techniques, we systematically explore the parameter space of these algorithms to discover optimal configurations. Moreover, by representing …
Analysis Of Student Behavior And Score Prediction In ASSISTments Online Learning, Aswani Yaramala
All Graduate Theses and Dissertations, Fall 2023 to Present
Understanding and analyzing student behavior is paramount in enhancing online learning, and this thesis delves into the subject by presenting an in-depth analysis of student behavior and score prediction in the ASSISTments online learning platform. We used data from the EDM Cup 2023 Kaggle Competition to answer four key questions. First, we explored how students seeking hints and explanations affect their performance in assignments, shedding light on the role of guidance in learning. Second, we looked at the connection between students mastering specific skills and their performance in related assignments, giving insights into the effectiveness of curriculum alignment. Third, we …
General Population Projection Model With Census Population Data, Takenori Tsuruga
Electronic Theses, Projects, and Dissertations
The US Census Bureau offers a wide range of data, and within this array, the American Community Survey 5-Year Estimate (ACS5) serves as a valuable resource for understanding the US population. This project embarks on an exploration of Machine Learning and the Software Development process with the goal of generating effective population projections from ACS5 data. The project aims to provide methods to make predictions for every city and town in the US, encompassing their total population and population divided into 5-year age groups. It's worth noting that while the generation of these projections is grounded in the generalized statistical …
Hypothyroid Disease Analysis By Using Machine Learning, Sanjana Seelam
Electronic Theses, Projects, and Dissertations
Thyroid illness frequently manifests as hypothyroidism. It is evident that people with hypothyroidism are primarily female. Because the majority of people are unaware of the illness, it is quickly becoming more serious. It is crucial to catch it early on so that medical professionals can treat it more effectively and prevent it from getting worse. Machine learning illness prediction is a challenging task. Disease prediction is aided greatly by machine learning. Once more, unique feature selection strategies have made the process of disease assumption and prediction easier. To properly monitor and cure this illness, accurate detection is essential. In order …
A Design Strategy To Improve Machine Learning Resiliency Of Physically Unclonable Functions Using Modulus Process, Yuqiu Jiang
Theses and Dissertations
Physically unclonable functions (PUFs) are hardware security primitives that utilize non-reproducible manufacturing variations to provide device-specific challenge-response pairs (CRPs). Such primitives are desirable for applications such as communication and intellectual property protection. PUFs have been gaining considerable interest from both the academic and industrial communities because of their simplicity and stability. However, many recent studies have exposed PUFs to machine-learning (ML) modeling attacks. To improve the resilience of a system to general ML attacks instead of a specific ML technique, a common solution is to improve the complexity of the system. Structures, such as XOR-PUFs, can significantly increase the nonlinearity …
Towards Long-Term Fairness In Sequential Decision Making, Yaowei Hu
Graduate Theses and Dissertations
With the development of artificial intelligence, automated decision-making systems are increasingly integrated into various applications, such as hiring, loans, education, recommendation systems, and more. These machine learning algorithms are expected to facilitate faster, more accurate, and impartial decision-making compared to human judgments. Nevertheless, these expectations are not always met in practice due to biased training data, leading to discriminatory outcomes. In contemporary society, countering discrimination has become a consensus among people, leading the EU and the US to enact laws and regulations that prohibit discrimination based on factors such as gender, age, race, and religion. Consequently, addressing algorithmic discrimination has …
Decoding Usage And Adoption Behavior Of The Low-Carbon Transportation Market: An AI-Driven Exploration, Vuban Chowdhury
Graduate Theses and Dissertations
The transportation sector stands as a significant contributor to greenhouse gas emissions in the United States, with its environmental impact steadily escalating over the past few decades. This has prompted government agencies to facilitate the adoption and usage of low-carbon transportation (LCT) options as alternatives to fossil-fuel-powered transportation. LCTs include modes of transportation that minimize the overall carbon footprint of the transportation sector by relying on energy sources that are environmentally sustainable. These sustainable transportation options have also garnered significant interest in the transportation research community. For government agencies and researchers alike, a comprehensive understanding of the adoption and usage …
Demystifying Artificial Intelligence (AI) For Early Childhood And Elementary Education: A Case Study Of Perceptions Of AI Of State Of Missouri Educators, Kathryn Arnone, James Hutson, Karen Woodruff
Faculty Scholarship
Artificial intelligence (AI) and its impact on society have received a great deal of attention in the past five years since the first Stanford AI100 report. AI already globally impacts individuals in critical and personal ways, and many industries will continue to experience disruptions as the full algorithmic effects are understood. However, with regard to education, adoption in disciplines remains limited largely to Computer Science and Information Technology in postsecondary education. Recent advances with technology are especially promising for their potential to create and scale personalized learning for students, to optimize strategies for learning outcomes, and to increase access to …
Generative Adversarial Game With Tailored Quantum Feature Maps For Enhanced Classification, Anais Sandra Nguemto Guiawa
Doctoral Dissertations
In the burgeoning field of quantum machine learning, the fusion of quantum computing and machine learning methodologies has sparked immense interest, particularly with the emergence of noisy intermediate-scale quantum (NISQ) devices. These devices hold the promise of achieving quantum advantage, but they grapple with limitations like constrained qubit counts, limited connectivity, operational noise, and a restricted set of operations. These challenges necessitate a strategic and deliberate approach to crafting effective quantum machine learning algorithms.
This dissertation revolves around an exploration of these challenges, presenting innovative strategies that tailor quantum algorithms and processes to seamlessly integrate with commercial quantum platforms. A …
Implementation Of ADAS And Autonomy On UNLV Campus, Zillur Rahman
UNLV Theses, Dissertations, Professional Papers, and Capstones
The integration of Advanced Driving Assistance Systems (ADAS) and autonomous driving functionalities into contemporary vehicles has notably surged, driven by the remarkable progress in artificial intelligence (AI). These AI systems, capable of learning from real-world data, now exhibit the capability to perceive their surroundings via a suite of sensors, create optimal routes from source to destination, and execute vehicle control akin to a human driver.
Within the context of this thesis, we undertake a comprehensive exploration of three distinct yet interrelated ADAS and Autonomy projects. Our central objective is the implementation of autonomous driving (AD) technology on the UNLV campus, culminating in …
AI Assisted Workflows For Computational Electromagnetics And Antenna Design, Oameed Noakoasteen
Electrical and Computer Engineering ETDs
These days large volumes of data can be recorded and manipulated with relative ease. If valuable information can be extracted from them, these vast amounts of data can be a rich resource not just for the digital economy but also for scientific discovery and development of technology. When it comes to deriving valuable information from data, Machine Learning (ML) emerges as the key solution. To unlock the potential benefits of ML to science and technology, extensive research is needed to explore what algorithms are suitable and how they can be applied.
To shine light on various ways that ML can …
Data-Driven Decision Support Tool Co-Development With A Primary Health Care Practice Based Learning Network, Jacqueline K. Kueper, Jennifer Rayner, Sara Bhatti, Kelly Angevaare, Sandra Fitzpatrick, Paulino Lucamba, Eric Sutherland, Daniel J. Lizotte
Epidemiology and Biostatistics Publications
Background: The Alliance for Healthier Communities is a learning health system that supports Community Health Centres (CHCs) across Ontario, Canada to provide team-based primary health care to people who otherwise experience barriers to care. This case study describes the ongoing process and lessons learned from the first Alliance for Healthier Communities’ Practice Based Learning Network (PBLN) data-driven decision support tool co-development project.
Methods: We employ an iterative approach to problem identification and methods development for the decision support tool, moving between discussion sessions and case studies with CHC electronic health record (EHR) data. We summarize our work to date in …
Deciphering Trends And Tactics: Data-Driven Techniques For Forecasting Information Spread And Detecting Coordinated Campaigns In Social Media, Kin Wai Ng Lugo
USF Tampa Graduate Theses and Dissertations
The main objective of this dissertation is to develop models that predict and investigate the spread of information in social media over time. In this context, we consider topics of discussions as the information that spreads. Thus, we are interested in forecasting the number of messages per day in a future interval of time. We take a data-driven approach, in which we compare our results with real datasets from a multitude of socio-political contexts and from multiple social media platforms, specifically, Twitter and YouTube.
We identified a number of challenges related to forecasting social media time series per topic. First, …
Machine Learning Prediction Of HEA Properties, Nicholas J. Beaver, Nathaniel Melisso, Travis Murphy
College of Engineering Summer Undergraduate Research Program
High-entropy alloys (HEAs) are a very new development in the field of metallurgical materials. Unlike traditional alloys, they are made up of multiple principal atoms, which contributes to their high configurational entropy. The microstructure and properties of HEAs are not well predicted by the models developed for more common engineering alloys, and there is not enough data available on HEAs to fully represent the complex behavior of these alloys. To that end, we explore how machine learning models can be used to model the complex, high-dimensional behavior in the HEA composition space. Based on our …
AI For Search And Rescue - Locating A Missing Person, David Hernandez, Sai Rama Balakrishnan, Timmy Chin, Aditya Manikonda, Vasanth Pugalenthi
College of Engineering Summer Undergraduate Research Program
Building on the work done initially as a SURP 2021 project and continued through 2021-23, the focus for this summer project will be on the use of computer technology for locating a missing person. Over the last year, we developed the digital equivalents of about 30 paper-based S&R forms and the infrastructure to collect the respective information. In their current use, these paper forms are filled out by search teams, collected in a command post, and reviewed by search coordinators. This process is time-consuming, prone to errors and loss of information, and relies heavily on the experience, skills, and mental …
Ethics And Social Justice For AI In Data Science, Arya Ramchander, Kylene Nicole Landenberger
College of Engineering Summer Undergraduate Research Program
The advances of AI raise several critical questions about human values and ethics, highlighting the need for researchers and developers to consider the ethical implications and the risks of neglecting them. In the past few years, student researchers have developed an AI model that allows users to test their surveys for possible breaches of subject confidentiality. This allows the users to gauge the ethicality of their proposal. This summer, we have expanded on this research and launched an interactive model for students and researchers to assess their current work for ethical and social justice implications. Using Langchain and Figma, we …
Your Cursor Reveals: On Analyzing Workers’ Browsing Behavior And Annotation Quality In Crowdsourcing Tasks, Pei-Chi Lo, Ee-Peng Lim
Research Collection School Of Computing and Information Systems
In this work, we investigate the connection between browsing behavior and task quality of crowdsourcing workers performing annotation tasks that require information judgements. Such information judgements are often required to derive ground truth answers to information retrieval queries. We explore the use of workers’ browsing behavior to directly determine their annotation result quality. We hypothesize user attention to be the main factor contributing to a worker’s annotation quality. To predict annotation quality at the task level, we model two aspects of task-specific user attention, also known as general and semantic user attentions. Both aspects of user attention can be …
Improving Semantic Document Classification Accuracy By Integrating Human-Crafted Knowledge, Zachary Weinfeld, Lubomir Stanchev
College of Engineering Summer Undergraduate Research Program
Document classification is a pivotal task in various domains, warranting the development of robust algorithms. Among these, the Bidirectional Encoder Representations from Transformers (BERT) algorithm, introduced by Google, has proven to perform well when fine-tuned for the task at hand. Leveraging transformer architecture, BERT demonstrates stellar language understanding capabilities. However, the integration of BERT with a range of techniques has shown potential for further enhancing classification accuracy. This work investigates several techniques that leverage semantic understanding to improve the performance of document classification models trained with BERT. Specifically, we explore three methods. First, we will balance corpuses afflicted by imbalanced …
Optimization And Application Of Graph Neural Networks, Shuo Zhang
Dissertations, Theses, and Capstone Projects
Graph Neural Networks (GNNs) are widely recognized for their potential in learning from graph-structured data and solving complex problems. However, optimal performance and applicability of GNNs have been an open-ended challenge. This dissertation presents a series of substantial advances addressing this problem. First, we investigate attention-based GNNs, revealing a critical shortcoming: their ignorance of cardinality information that impacts their discriminative power. To rectify this, we propose Cardinality Preserved Attention (CPA) models that can be applied to any attention-based GNNs, which exhibit a marked improvement in performance. Next, we introduce the Directional Node Pair (DNP) descriptor and the Robust Molecular Graph …
TestSGD: Interpretable Testing Of Neural Networks Against Subtle Group Discrimination, Mengdi Zhang, Jun Sun, Jingyi Wang, Bing Sun
Research Collection School Of Computing and Information Systems
Discrimination has been shown in many machine learning applications, which calls for sufficient fairness testing before their deployment in ethic-relevant domains. One widely concerning type of discrimination, testing against group discrimination, mostly hidden, is much less studied, compared with identifying individual discrimination. In this work, we propose TestSGD, an interpretable testing approach which systematically identifies and measures hidden (which we call ‘subtle’) group discrimination of a neural network characterized by conditions over combinations of the sensitive attributes. Specifically, given a neural network, TestSGD first automatically generates an interpretable rule set which categorizes the input space into two groups. Alongside, TestSGD …