Open Access. Powered by Scholars. Published by Universities.®
Physical Sciences and Mathematics Commons™
Articles 451 - 480 of 1687
Full-Text Articles in Physical Sciences and Mathematics
Respiratory Pattern Analysis For Covid-19 Digital Screening Using Ai Techniques, Annita Tahsin Priyoti
Electronic Thesis and Dissertation Repository
Coronavirus disease (COVID-19) is a highly contagious respiratory disease that the World Health Organization (WHO) has declared a worldwide epidemic. The virus has spread across countries worldwide and has caused millions of deaths. To tackle this public health crisis, medical professionals and researchers are working relentlessly, applying different techniques and methods. In terms of diagnosis, respiratory sound has been recognized as an indicator of one’s health condition. Our work is based on cough sound analysis. This study includes an in-depth analysis of the diagnosis of COVID-19 based on human cough sound. Based on cough audio samples from …
Reporting Standards For Machine Learning Research In Type 2 Diabetes, Grace Kang
Undergraduate Student Research Internships Conference
In this project, three people scored 90 papers on machine learning predictive models for type 2 diabetes to assess their adherence to TRIPOD, MI-CLAIM, and DOME reporting guidelines.
A Kuramoto Model Approach To Predicting Chaotic Systems With Echo State Networks, Sophie Wu, Jackson Howe
Undergraduate Student Research Internships Conference
An Echo State Network (ESN) with an activation function based on the Kuramoto model (Kuramoto ESN) is implemented, which can successfully predict the logistic map for a non-trivial number of time steps. The reservoir in the prediction stage exhibits binary dynamics when a good prediction is made, but the oscillators in the reservoir display a larger variability in states as the ESN’s prediction becomes worse. Analytical approaches to quantify how the Kuramoto ESN’s dynamics relate to its prediction are explored, as well as how the dynamics of the Kuramoto ESN relate to another widely studied physical model, the Ising model.
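As a rough illustration of the prediction target named in this abstract, the sketch below generates a logistic-map trajectory and counts how many steps a second trajectory stays close to it. It does not implement the Kuramoto ESN; the function names, the parameter `r = 4.0`, the tolerance, and the use of a slightly perturbed trajectory as a stand-in for a model's prediction are all assumptions made for illustration.

```python
def logistic_map(x0, r=4.0, steps=10):
    """Iterate x_{n+1} = r * x_n * (1 - x_n) and return the full trajectory."""
    xs = [x0]
    for _ in range(steps):
        xs.append(r * xs[-1] * (1.0 - xs[-1]))
    return xs


def horizon(truth, pred, tol=0.05):
    """Count the leading steps for which |truth - pred| stays within tol."""
    n = 0
    for t, p in zip(truth, pred):
        if abs(t - p) > tol:
            break
        n += 1
    return n


# Hypothetical stand-in for a prediction: the same map started 1e-6 away.
truth = logistic_map(0.2, steps=50)
perturbed = logistic_map(0.2 + 1e-6, steps=50)
print("steps within tolerance:", horizon(truth, perturbed))
```

Because the map is chaotic at `r = 4.0`, the two trajectories separate after a limited number of steps, which is one way to read the abstract's phrase "a non-trivial number of time steps".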
Asian Hate Speech Detection On Twitter During Covid-19, Amir Toliyat, Sarah Ita Levitan, Zeng Peng, Ronak Etemadpour
Publications and Research
Coronavirus disease 2019 (COVID-19) started in Wuhan, China, in late 2019, and after spreading widely in Asian countries, it rapidly spread to other countries. This disease caused governments worldwide to declare a public health crisis, with severe measures taken to slow its spread. This pandemic affected the lives of millions of people. Many citizens who lost their loved ones and jobs experienced a wide range of emotions, such as disbelief, shock, concerns about health, fear about food supplies, anxiety, and panic. All of the aforementioned phenomena led to the spread of racism and …
Human-Centered Machine Learning: Algorithm Design And Human Behavior, Wei Tang
McKelvey School of Engineering Theses & Dissertations
Machine learning is increasingly engaged in a large number of important daily decisions and has great potential to reshape various sectors of our modern society. To fully realize this potential, it is important to understand the role that humans play in the design of machine learning algorithms and investigate the impacts of the algorithm on humans.
Towards the understanding of such interactions between humans and algorithms, this dissertation takes a human-centric perspective and focuses on investigating the interplay between human behavior and algorithm design. Accounting for the roles of humans in algorithm design creates unique challenges. For example, humans might …
Design And Analysis Of Strategic Behavior In Networks, Sixie Yu
McKelvey School of Engineering Theses & Dissertations
Networks permeate every aspect of our social and professional life. A networked system with strategic individuals can represent a variety of real-world scenarios with socioeconomic origins. In such a system, the individuals' utilities are interdependent: one individual's decision influences the decisions of others and vice versa. In order to gain insights into the system, the highly complicated interactions necessitate some level of abstraction. To capture the otherwise complex interactions, I use a game-theoretic model called the Networked Public Goods (NPG) game. I develop a computational framework based on NPGs to understand strategic individuals' behavior in networked systems. The framework consists of three …
Determining The Effects Of Elevated Carbon Dioxide On Soil Acidification, Cation Depletion, And Soil Inorganic Carbon And Mapping Soil Carbons Using Artificial Intelligence, Jannatul Ferdush
Theses and Dissertations
Soil carbon is the largest sink and source of the global carbon cycle and is disturbed by several natural, anthropogenic, and environmental factors. The global increase of atmospheric CO2 affects soil carbon cycling through varied biogeochemical processes. The first chapter is a compilation of current information on potential factors triggering soil acidification and weathering mechanisms under elevated CO2 and their consequences on soil inorganic carbon (SIC) pool and quality. Soil water content and precipitation were critical factors influencing elevated CO2 effects on the SIC pool. The second chapter examines a detailed column experiment in which six soils …
Classification Models For 2,4-D Formulations In Damaged Enlist Crops Through The Application Of Ftir Spectroscopy And Machine Learning Algorithms, Benjamin Blackburn
Theses and Dissertations
With new 2,4-Dichlorophenoxyacetic acid (2,4-D) tolerant crops, increases in off-target movement events are expected. New formulations may mitigate these events, but standard lab techniques are ineffective in identifying these 2,4-D formulations. Using Fourier-transform infrared spectroscopy and machine learning algorithms, research was conducted to classify 2,4-D formulations in treated herbicide-tolerant soybeans and cotton and observe the influence of leaf treatment status and collection timing on classification accuracy. Pooled classification models using k-nearest neighbor classified 2,4-D formulations with over 65% accuracy in cotton and soybean. Tissue collected 14 days after treatment (DAT) for cotton and 21 DAT for soybean produced higher accuracies than the …
Artificial Intelligence In The Radiomic Analysis Of Glioblastomas: A Review, Taxonomy, And Perspective, Ming Zhu, Sijia Li, Yu Kuang, Virginia B. Hill, Amy B. Heimberger, Lijie Zhai, Shenjie Zhai
Electrical & Computer Engineering Faculty Research
Radiological imaging techniques, including magnetic resonance imaging (MRI) and positron emission tomography (PET), are the standard-of-care non-invasive diagnostic approaches widely applied in neuro-oncology. Unfortunately, accurate interpretation of radiological imaging data is constantly challenged by the indistinguishable radiological image features shared by different pathological changes associated with tumor progression and/or various therapeutic interventions. In recent years, machine learning (ML)-based artificial intelligence (AI) technology has been widely applied in medical image processing and bioinformatics due to its advantages in implicit image feature extraction and integrative data analysis. Despite its recent rapid development, ML technology still faces many hurdles for its broader applications …
Machine Learning Model Comparison And Arma Simulation Of Exhaled Breath Signals Classifying Covid-19 Patients, Aaron Christopher Segura
Mathematics & Statistics ETDs
This study compared the performance of machine learning models in classifying COVID-19 patients using exhaled breath signals and simulated datasets. Ground truth classification was determined by the gold standard Polymerase Chain Reaction (PCR) test results. A residual bootstrapped method generated the simulated datasets by fitting signal data to Autoregressive Moving Average (ARMA) models. Classification models included neural networks, k-nearest neighbors, naïve Bayes, random forest, and support vector machines. A Recursive Feature Elimination (RFE) study was performed to determine if reducing signal features would improve the classification models' performance using Gini Importance scoring for the two classes. The top 25% of …
Modern Pyromes: Biogeographical Patterns Of Fire Characteristics Across The Contiguous United States, Megan E. Cattau, Adam Mahood, Jennifer K. Balch, Carol Wessman
Human-Environment Systems Research Center Faculty Publications and Presentations
In recent decades, wildfires in many areas of the United States (U.S.) have become larger and more frequent with increasing anthropogenic pressure, including interactions between climate, land-use change, and human ignitions. We aimed to characterize the spatiotemporal patterns of contemporary fire characteristics across the contiguous United States (CONUS). We derived fire variables based on frequency, fire radiative power (FRP), event size, burned area, and season length from satellite-derived fire products and a government records database on a 50 km grid (1984–2020). We used k-means clustering to create a hierarchical classification scheme of areas with relatively homogeneous fire characteristics, or modern …
Tempering The Adversary: An Exploration Into The Applications Of Game Theoretic Feature Selection And Regression, Stephen Mcgee
All Dissertations
Most modern machine learning algorithms tend to focus on an "average-case" approach, where every data point contributes the same amount of influence towards calculating the fit of a model. This "per-data point" error (or loss) is averaged together into an overall loss and typically minimized with an objective function. However, this can be insensitive to valuable outliers. Inspired by game theory, the goal of this work is to explore the utility of incorporating an optimally-playing adversary into feature selection and regression frameworks. The adversary assigns weights to the data elements so as to degrade the modeler's performance in an optimal …
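The contrast this abstract draws between an averaged loss and an adversarially weighted loss can be sketched as follows. This is not the dissertation's formulation: the constraint set (weights sum to 1, each weight at most a hypothetical `cap`) and all function names are assumptions chosen only to make the idea concrete.

```python
def average_loss(losses):
    """Average-case objective: every data point has equal influence."""
    return sum(losses) / len(losses)


def adversarial_weights(losses, cap):
    """Weights maximizing sum(w_i * loss_i) with sum(w) = 1 and 0 <= w_i <= cap.

    The cap is a hypothetical limit on how much the adversary may emphasize
    any single point. A greedy fill from the largest loss downward solves
    this small linear program.
    """
    if cap * len(losses) < 1.0 - 1e-12:
        raise ValueError("cap too small for the weights to sum to 1")
    weights = [0.0] * len(losses)
    budget = 1.0
    order = sorted(range(len(losses)), key=lambda i: losses[i], reverse=True)
    for i in order:
        weights[i] = min(cap, budget)
        budget -= weights[i]
        if budget <= 0.0:
            break
    return weights


def weighted_loss(losses, weights):
    """Loss seen by the modeler once the adversary has chosen the weights."""
    return sum(w * l for w, l in zip(weights, losses))


losses = [1.0, 4.0, 1.0, 2.0]  # illustrative per-point losses
w = adversarial_weights(losses, cap=0.5)
print(average_loss(losses), weighted_loss(losses, w))
```

With `cap` equal to `1 / len(losses)` the adversary is forced back to uniform weights and the two objectives coincide; with `cap = 1.0` the adversary places all weight on the single worst point.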
Understanding Learners' Motivation Through Machine Learning Analysis On Reflection Writing, Elizabeth Pluskwik, Yuezhou Wang, Lauren Singelmann
Integrated Engineering Department Publications
Educational data mining (EDM) is an emerging interdisciplinary field that utilizes machine learning (ML) algorithms to collect and analyze educational data, aiming to better predict students' performance and retention. In this WIP paper, we report our methodology and preliminary results from utilizing an ML program to assess students’ motivation through their upper-division years in the XYZ project-based learning (PBL) program. ML, or more specifically, the clustering algorithm, opens the door to processing large amounts of student-written artifacts, such as reflection journals, project reports, and written assignments, and then identifies keywords that signal their levels of motivation (i.e., extrinsic vs. …
Using Machine Learning To Classify Volleyball Jumps, Miki Jauhiainen
Theses and Dissertations
In this study, inertial measurement units (IMUs) were used to train a random forest classifier to correctly classify different jump types in volleyball. Athlete motion data were collected in a controlled setting using three IMUs, one on the waist and one on each ankle. There were 11 participants who at the time played volleyball at the collegiate level in the United States, seven male and four female. Each performed the same number of jumps across the eight jump types--five BASIC jumps and three each of the other seven--resulting in 26 jumps per subject for a total of 286. The data …
Deep Active Genetic Learning With Evidential Uncertainty For Agriculture Crops And Lake Water Quality Assessment, Oguz M. Aranay
Legacy Theses & Dissertations (2009 - 2024)
Despite significant advancements in the field of machine learning, there are two issues that still require further exploration. First, how to learn from a small dataset; and second, how to select appropriate features from the data. Although there exist many techniques to address these issues, choosing a combination of the techniques from these two groups is challenging, and worth investigating. To address these concerns, this thesis presents a learning framework that is based on a deep learning model utilizing active learning (with evidential uncertainty as a basis for acquisition function) for the first issue and a genetic algorithm for the …
Stability And Differential Privacy Of Stochastic Gradient Methods, Zhenhuan Yang
Legacy Theses & Dissertations (2009 - 2024)
Recently, a considerable amount of work has been devoted to the study of algorithmic stability as well as differential privacy (DP) for stochastic gradient methods (SGM). However, most of the existing work focuses on the empirical risk minimization (ERM) and the population risk minimization problems. In this paper, we study two types of optimization problems that enjoy wide applications in modern machine learning, namely the minimax problem and the pairwise learning problem.
Data Collection And Machine Learning Methods For Automated Pedestrian Facility Detection And Mensuration, Joseph Bailey Luttrell Iv
Dissertations
Large-scale collection of pedestrian facility (crosswalks, sidewalks, etc.) presence data is vital to the success of efforts to improve pedestrian facility management, safety analysis, and road network planning. However, this kind of data is typically not available on a large scale due to the high labor and time costs that are the result of relying on manual data collection methods. Therefore, methods for automating this process using techniques such as machine learning are currently being explored by researchers. In our work, we mainly focus on machine learning methods for the detection of crosswalks and sidewalks from both aerial and street-view …
Solving The Challenges Of Concept Drift In Data Stream Classification, Hanqing Hu
Electronic Theses and Dissertations
The rise of network-connected devices and applications leads to a significant increase in the volume of data that are continuously generated over time, called data streams. In real-world applications, storing the entirety of a data stream for later analysis is often not practical, due to the data stream’s potentially infinite volume. Data stream mining techniques and frameworks are therefore created to analyze streaming data as they arrive. However, compared to traditional data mining techniques, challenges unique to data stream mining also emerge, due to the high arrival rate of data streams and their dynamic nature. In this dissertation, …
Contributions To Random Forest Variable Importance With Applications In R, Kelvyn K. Bladen
All Graduate Theses and Dissertations, Spring 1920 to Summer 2023
A major focus in statistics is building and improving computational algorithms that can use data to predict a response. Two fundamental camps of research arise from such a goal. The first camp is researching ways to get more accurate predictions. Many sophisticated methods, collectively known as machine learning methods, have been developed for this very purpose. One such method that is widely used across industry and many other areas of investigation is called Random Forests.
The second camp of research is that of improving the interpretability of machine learning methods. This is worthy of attention when analysts desire to optimize …
Secrep : A Framework For Automating The Extraction And Prioritization Of Security Requirements Using Machine Learning And Nlp Techniques, Shada Khanneh
Theses, Dissertations and Culminating Projects
Gathering and extracting security requirements adequately requires extensive effort, experience, and time, as large amounts of data need to be analyzed. While many manual and academic approaches have been developed to tackle the discipline of Security Requirements Engineering (SRE), a need still exists for automating the SRE process. This need stems mainly from the difficult, error-prone, and time-consuming nature of traditional and manual frameworks. Machine learning techniques have been widely used to facilitate and automate the extraction of useful information from software requirements documents and artifacts. Such approaches can be utilized to yield beneficial results in automating the process of …
Investigating Toxicity Changes Of Cross-Community Redditors From 2 Billion Posts And Comments, Hind Almerekhi, Haewoon Kwak, Bernard J. Jansen
Research Collection School Of Computing and Information Systems
This research investigates changes in online behavior of users who publish in multiple communities on Reddit by measuring their toxicity at two levels. With the aid of crowdsourcing, we built a labeled dataset of 10,083 Reddit comments, then used the dataset to train and fine-tune a Bidirectional Encoder Representations from Transformers (BERT) neural network model. The model predicted the toxicity levels of 87,376,912 posts from 577,835 users and 2,205,581,786 comments from 890,913 users on Reddit over 16 years, from 2005 to 2020. This study utilized the toxicity levels of user content to identify toxicity changes by the user within the …
Data-Driven Research On Engineering Design Thinking And Behaviors In Computer-Aided Systems Design: Analysis, Modeling, And Prediction, Molla Hafizur Rahman
Graduate Theses and Dissertations
Research on design thinking and design decision-making is vital for discovering and utilizing beneficial design patterns, strategies, and heuristics of human designers in solving engineering design problems. It is also essential for the development of new algorithms embedded with human intelligence and can facilitate human-computer interactions. However, modeling design thinking is challenging because it takes place in the designer’s mind, which is intricate, implicit, and tacit. For an in-depth understanding of design thinking, fine-grained design behavioral data are important because they are the critical link in studying the relationship between design thinking, design decisions, design actions, and design performance. Therefore, …
Directed Acyclic Graph-Based Neural Networks For Tunable Low-Power Computer Vision, Abhinav Goel, Caleb Tung, Nick Eliopoulos, Xiao Hu, George K. Thiruvathukal, James C. Davis, Yung-Hsiang Lu
Computer Science: Faculty Publications and Other Works
Processing visual data on mobile devices has many applications, e.g., emergency response and tracking. State-of-the-art computer vision techniques rely on large Deep Neural Networks (DNNs) that are usually too power-hungry to be deployed on resource-constrained edge devices. Many techniques improve the efficiency of DNNs by compromising accuracy. However, the accuracy and efficiency of these techniques cannot be adapted for diverse edge applications with different hardware constraints and accuracy requirements. This paper demonstrates that a recent, efficient tree-based DNN architecture, called the hierarchical DNN, can be converted into a Directed Acyclic Graph-based (DAG) architecture to provide tunable accuracy-efficiency tradeoff options. We …
Towards Making Transformer-Based Language Models Learn How Children Learn, Yousra Mahdy
Boise State University Theses and Dissertations
Transformer-based Language Models (LMs) learn contextual meanings for words using a huge amount of unlabeled text data. These models show outstanding performance on various Natural Language Processing (NLP) tasks. However, what the LMs learn is far from what the meaning is for humans, partly because humans can differentiate between concrete and abstract words, but language models make no distinction. Concrete words are words that have a physical representation in the world such as “chair”, while abstract words are ideas such as “democracy”. The process of learning word meanings starts from early childhood when children acquire their …
Emotion Detection Using An Ensemble Model Trained With Physiological Signals And Inferred Arousal-Valence States, Matthew Nathanael Gray
Electrical & Computer Engineering Theses & Dissertations
Affective computing is an exciting and transformative field that is gaining in popularity among psychologists, statisticians, and computer scientists. The ability of a machine to infer human emotion and mood, i.e. affective states, has the potential to greatly improve human-machine interaction in our increasingly digital world. In this work, an ensemble model methodology for detecting human emotions across multiple subjects is outlined. The Continuously Annotated Signals of Emotion (CASE) dataset, which is a dataset of physiological signals labeled with discrete emotions from video stimuli as well as subject-reported continuous emotions, arousal and valence, from the circumplex model, is used for …
Developing Artificial Intelligence And Machine Learning To Support Primary Care Research And Practice, Jacqueline K. Kueper
Electronic Thesis and Dissertation Repository
This thesis was motivated by the potential to use "everyday data", especially that collected in electronic health records (EHRs) as part of healthcare delivery, to improve primary care for clients facing complex clinical and/or social situations. Artificial intelligence (AI) techniques can identify patterns or make predictions with these data, producing information to learn about and inform care delivery. Our first objective was to understand and critique the body of literature on AI and primary care. This was achieved through a scoping review wherein we found the field was at an early stage of maturity, primarily focused on clinical decision support …
Profiling A Community-Specific Function Landscape For Bacterial Peptides Through Protein-Level Meta-Assembly And Machine Learning, Mitra Vajjala, Brady Johnson, Lauren Kasparek, Michael Leuze, Qiuming Yao
School of Computing: Faculty Publications
Small proteins, encoded by small open reading frames, are only beginning to emerge with the current advancement of omics technology and bioinformatics. There is increasing evidence that small proteins play roles in diverse critical biological functions, such as adjusting cellular metabolism, regulating other protein activities, controlling cell cycles, and affecting disease physiology. In prokaryotes such as bacteria, the small proteins are largely unexplored for their sequence space and functional groups. For most bacterial species from a natural community, the sample cannot be easily isolated or cultured, and the bacterial peptides must be better characterized in a metagenomic manner. The bacterial …
Robustar: Interactive Toolbox Supporting Precise Data Annotation For Robust Vision Learning, Chonghan Chen, Haohan Wang, Leyang Hu, Yuhao Zhang, Shuguang Lyu, Jingcheng Wu, Xinnuo Li, Linjing Sun, Eric Xing
Machine Learning Faculty Publications
We introduce the initial release of our software Robustar, which aims to improve the robustness of vision classification machine learning models through a data-driven perspective. Building upon the recent understanding that a machine learning model’s lack of robustness stems from its tendency to learn spurious features, we aim to solve this problem at its root, from the data perspective, by removing the spurious features from the data before training. In particular, we introduce software that helps users better prepare the data for training image classification models by allowing them to annotate the spurious features …
Reconstructing Historical Earthquake-Induced Tsunamis: Case Study Of 1820 Event Near South Sulawesi, Indonesia, Taylor Jole Paskett
Theses and Dissertations
We build on the method introduced by Ringer et al., applying it to an 1820 event that happened near South Sulawesi, Indonesia. We utilize other statistical models to aid our Metropolis-Hastings sampler, including a Gaussian process which informs the prior. We apply the method to multiple possible fault zones to determine which fault is the most likely source of the earthquake and tsunami. After collecting nearly 80,000 samples, we find that between the two most likely fault zones, the Walanae fault zone matches the anecdotal accounts much better than Flores. However, to support the anecdotal data, both samplers tend toward …
Learning From Machines: Insights In Forest Transpiration Using Machine Learning Methods, Morgan Tholl
Dissertations and Theses
Machine learning has been used as a tool to model transpiration for individual sites, but few models are capable of generalizing to new locations without calibration to site data. Using the global SAPFLUXNET database, 95 tree sap flow data sites were grouped using three clustering strategies: by biome, by tree functional type, and through use of a k-means unsupervised clustering algorithm. Two supervised machine learning algorithms, a random forest algorithm and a neural network algorithm, were used to build machine learning models that predicted transpiration for each cluster. The performance and feature importance in each model were analyzed and compared …