Open Access. Powered by Scholars. Published by Universities.®
Physical Sciences and Mathematics Commons™
Articles 751 - 780 of 826
Full-Text Articles in Physical Sciences and Mathematics
The New Issues In Classification Problems, Md Mahmudul Hasan
Open Access Theses & Dissertations
The data involved in science and engineering grows larger every day. Studying and organizing a large amount of data is difficult without classification. In machine learning, classification is the problem of assigning a given data item to one of a set of categories. Several classification techniques are in use. In our work we present a sparse representation technique to perform classification. The popularity of this technique motivates us to apply it to our collected samples. To find a sparse representation, we used an $l_1$-minimization algorithm, a convex relaxation algorithm that researchers have shown to be very efficient. The purpose …
Evaluation Of Supervised Machine Learning For Classifying Video Traffic, Farrell R. Taylor
CCE Theses and Dissertations
Operational deployment of machine learning based classifiers in real-world networks has become an important area of research to support automated real-time quality of service decisions by Internet service providers (ISPs) and more generally, network administrators. As the Internet has evolved, multimedia applications, such as voice over Internet protocol (VoIP), gaming, and video streaming, have become commonplace. These traffic types are sensitive to network perturbations, e.g. jitter and delay. Automated quality of service (QoS) capabilities offer a degree of relief by prioritizing network traffic without human intervention; however, they rely on the integration of real-time traffic classification to identify applications. Accordingly, …
Using Diversity Ensembles With Time Limits To Handle Concept Drift, Robert M. Van Camp
CCE Theses and Dissertations
While traditional supervised learning focuses on static datasets, an increasing amount of data comes in the form of streams, where data is continuous and typically processed only once. A common problem with data streams is that the underlying concept we are trying to learn can be constantly evolving. This concept drift has been of interest to researchers over the last few years, and there is a need for improved machine learning algorithms that are capable of dealing with concept drift. A promising approach involves using an ensemble of a diverse set of classifiers. The constituent classifiers are re-trained when a concept …
Radical Recognition In Off-Line Handwritten Chinese Characters Using Non-Negative Matrix Factorization, Xiangying Shuai
Senior Projects Spring 2016
In the past decade, handwritten Chinese character recognition has received renewed interest with the emergence of touch screen devices. Other popular applications include on-line Chinese character dictionary look-up and visual translation in mobile phone applications. Due to the complex structure of Chinese characters, this classification task is not exactly an easy one, as it involves knowledge from mathematics, computer science, and linguistics.
Given a large image database of handwritten character data, the goal of my senior project is to use Non-Negative Matrix Factorization (NMF), a recent method for finding a suitable representation (parts-based representation) of image data, to detect specific …
Idiom Token Classification Using Sentential Distributed Semantics, Giancarlo Salton, Robert J. Ross, John D. Kelleher
Conference papers
Idiom token classification is the task of deciding for a set of potentially idiomatic phrases whether each occurrence of a phrase is a literal or idiomatic usage of the phrase. In this work we explore the use of Skip-Thought Vectors to create distributed representations that encode features that are predictive with respect to idiom token classification. We show that classifiers using these representations have competitive performance compared with the state of the art in idiom token classification. Importantly, however, our models use only the sentence containing the target phrase as input and are thus less dependent on a potentially …
Email Similarity Matching And Automatic Reply Generation Using Statistical Topic Modeling And Machine Learning, Zachery L. Schiller
Electronic Theses and Dissertations
Responding to email is a time-consuming task that is a requirement for most professions. Many people find themselves answering the same questions over and over, repeatedly replying with answers they have written previously either in whole or in part. In this thesis, the Automatic Mail Reply (AMR) system is implemented to help with repeated email response creation. The system uses past email interactions and, through unsupervised statistical learning, attempts to recover relevant information to give to the user to assist in writing their reply.
Three statistical learning models, term frequency-inverse document frequency (tf-idf), Latent Semantic Analysis (LSA), and Latent Dirichlet …
Optical Spectroscopy And Chemometrics For Discrimination Of Dyed Textile Fibers And Magnetic Audio Tapes, Nathan C. Fuenffinger
Theses and Dissertations
This dissertation focuses on the application of both novel and standard chemometric approaches toward societal problems of interest in the areas of forensic science and cultural heritage preservation. Microspectrophotometry (MSP), a technique enabling measurements of absorption of electromagnetic radiation by microscopic materials in the ultraviolet-visible (UV-Vis) region, is widely used by forensic examiners for comparisons of metameric textile fibers. These comparisons are often hindered, however, by the raw or normalized spectra showing little detail or having few points of comparison. Derivative preprocessing can enhance structure in some instances. We have demonstrated through the use of multivariate statistics that derivatives are …
Neuron Clustering For Mitigating Catastrophic Forgetting In Supervised And Reinforcement Learning, Benjamin Frederick Goodrich
Doctoral Dissertations
Neural networks have had many great successes in recent years, particularly with the advent of deep learning and many novel training techniques. One issue that has affected neural networks and prevented them from performing well in more realistic online environments is that of catastrophic forgetting. Catastrophic forgetting affects supervised learning systems when input samples are temporally correlated or are non-stationary. However, most real-world problems are non-stationary in nature, resulting in prolonged periods of time separating inputs drawn from different regions of the input space.
Reinforcement learning represents a worst-case scenario when it comes to precipitating catastrophic forgetting in neural networks. …
Predicting Intraday Financial Market Dynamics Using Takens' Vectors; Incorporating Causality Testing And Machine Learning Techniques, Abubakar-Sadiq Bouda Abdulai
Electronic Theses and Dissertations
Traditional approaches to predicting financial market dynamics tend to be linear and stationary, whereas financial time series data is increasingly nonlinear and non-stationary. Lately, advances in dynamical systems theory have enabled the extraction of complex dynamics from time series data. These developments include theory of time delay embedding and phase space reconstruction of dynamical systems from a scalar time series. In this thesis, a time delay embedding approach for predicting intraday stock or stock index movement is developed. The approach combines methods of nonlinear time series analysis with those of causality testing, theory of dynamical systems and machine learning (artificial …
Ensemble Learning Method On Machine Maintenance Data, Xiaochuang Zhao
USF Tampa Graduate Theses and Dissertations
In the industry, a lot of companies are facing the explosion of big data. With this much information stored, companies want to make sense of the data and use it to help them for better decision making, especially for future prediction. A lot of money can be saved and huge revenue can be generated with the power of big data. When building statistical learning models for prediction, companies in the industry are aiming to build models with efficiency and high accuracy. After the learning models have been developed for production, new data will be generated. With the updated data, the …
Sudden Cardiac Arrest Prediction Through Heart Rate Variability Analysis, Luke Joseph Plewa
Master's Theses
The increase in popularity of wearable technologies (see: Apple Watch and Microsoft Band) has opened the door for an Internet of Things solution to healthcare. One of the most prevalent healthcare problems today is the poor survival rate of out-of-hospital sudden cardiac arrests (9.5% of 360,000 cases in the USA in 2013). It has been proven that heart-rate-derived features can give an early indicator of sudden cardiac arrest, and that providing an early warning has the potential to save many lives. Many of these new wearable devices are capable of providing this warning through their heart rate …
Application Of Machine Learning To Mapping And Simulating Gene Regulatory Networks, Hien-Haw Liow
Arts & Sciences Electronic Theses and Dissertations
This dissertation explores, proposes, and examines methods of applying modern machine learning and Bayesian statistics in the quantitative and qualitative modeling of gene regulatory networks using high-throughput gene expression data. A semi-parametric Bayesian model based on random forest is developed to infer quantitative aspects of gene regulation relations; a parametric model is developed to predict gene expression levels solely from genotype information. Simulation of network behavior is shown to complement regression analysis greatly in capturing the dynamics of gene regulatory networks. Finally, as an application and extension of novel approaches in gene expression analysis, new methods of discovering topological structure of gene …
Modeling Visual Features To Recognize Biological Motion: A Developmental Approach, Giulio Sandini, Nicoletta Noceti, Alessia Vignolo, Alessandra Sciutti, Francesco Rea, Alessandro Verri, Francesca Odone
MODVIS Workshop
In this work we deal with the problem of designing and developing computational vision models – comparable to the early stages of the human development – using coarse low-level information.
More specifically, we consider a binary classification setting to characterize biological movements with respect to non-biological dynamic events. To this purpose, our model builds on top of the optical flow estimation, and abstracts the representation to simulate the limited amount of visual information available at birth. We take inspiration from known biological motion regularities explained by the Two-Thirds Power Law, and design a motion representation that includes different low-level features, …
Hybrid Agent Based Simulation With Adaptive Learning Of Travel Mode Choices For University Commuters (Wip), Nagesh Shukla, Albert Munoz, Jun Ma, Nam Huynh
Nagesh Shukla
This paper presents a methodology for developing a hybrid agent-based micro-simulation model to capture the impacts of commuter travel mode choices on a university campus transport network. The proposed methodology involves: (i) developing a realistic population of commuter agents (students and staff); (ii) assigning activity lists and travel mode choices to agents using a machine learning method; and (iii) traffic micro-simulation of the study area transport network. This furthers the understanding of current transport modal distributions, factors affecting travel mode choice decisions, and network performance through a number of hypothetical travel scenarios.
Geological Object Recognition In Extraterrestrial Environments, Gregory M. Elfers
Electronic Thesis and Dissertation Repository
On July 4, 1997, the landing of NASA’s Pathfinder probe and its rover Sojourner marked the beginning of a new era in space exploration; robots with the ability to move have made up the vanguard of human extraterrestrial exploration ever since. With Sojourner’s landing, for the first time, a ground-traversing robot was at a distance too far from Earth to make direct human control practical. This has given rise to the development of autonomous systems to improve the efficiency of these robots, in both their ability to move and their ability to make decisions regarding their environment. Computer Vision comprises a …
Evaluating Defect Prediction Using A Massive Set Of Metrics, Xiao Xuan, David Lo, Xin Xia, Yuan Tian
Research Collection School Of Computing and Information Systems
To evaluate the performance of a within-project defect prediction approach, people normally use precision, recall, and F-measure scores. However, in machine learning literature, there are a large number of evaluation metrics to evaluate the performance of an algorithm, (e.g., Matthews Correlation Coefficient, G-means, etc.), and these metrics evaluate an approach from different aspects. In this paper, we investigate the performance of within-project defect prediction approaches on a large number of evaluation metrics. We choose 6 state-of-the-art approaches including naive Bayes, decision tree, logistic regression, kNN, random forest and Bayesian network which are widely used in defect prediction literature. And we …
Machine Learning For Predicting Soil Classes In Three Semi-Arid Landscapes, Colby W. Brungard, Janis L. Boettinger, Michael C. Duniway, Skye A. Wills, Thomas C. Edwards Jr.
Plants, Soils, and Climate Faculty Publications
Mapping the spatial distribution of soil taxonomic classes is important for informing soil use and management decisions. Digital soil mapping (DSM) can quantitatively predict the spatial distribution of soil taxonomic classes. Key components of DSM are the method and the set of environmental covariates used to predict soil classes. Machine learning is a general term for a broad set of statistical modeling techniques. Many different machine learning models have been applied in the literature and there are different approaches for selecting covariates for DSM. However, there is little guidance as to which, if any, machine learning model and covariate set …
Effective Auto Encoder For Unsupervised Sparse Representation, Faria Mahnaz
Wayne State University Theses
High dimensionality and the sheer size of unlabeled data available today demand new development in unsupervised learning of sparse representation. Despite recent advances in representation learning, most of the current methods are limited when dealing with large-scale unlabeled data. In this study, we propose a new unsupervised method that is able to learn sparse representation from unlabeled data efficiently. We derive a closed-form solution based on the sequential minimal optimization (SMO) for training an auto encoder-decoder module, which efficiently extracts sparse and compact features from data sets of various sizes. The inference process in the proposed learning …
Unsupervised Learning And Image Classification In High Performance Computing Cluster, Itauma Itauma
Wayne State University Theses
Feature learning and object classification in machine learning have become very active research areas in recent decades. Identifying good features has various benefits for object classification with respect to reducing the computational cost and increasing the classification accuracy. In addition, many research studies have focused on the use of Graphics Processing Units (GPUs) to improve the training time for machine learning algorithms. In this study, the use of an alternative platform, called High Performance Computing Cluster (HPCC), is proposed to handle unsupervised feature learning and image and speech classification, and to reduce the computational cost.
HPCC is a Big Data processing and …
Novel Classification Of Slow Movement Objects In Urban Traffic Environments Using Wideband Pulse Doppler Radar, Berta Rodriguez Hervas
Open Access Theses & Dissertations
Every year thousands of people are involved in traffic accidents, some of which are fatal. An important percentage of these fatalities are caused by human error, which could be prevented by increasing the awareness of drivers and the autonomy of vehicles. Since driver assistance systems have the potential to positively impact tens of millions of people, the purpose of this research is to study the micro-Doppler characteristics of vulnerable urban traffic components, i.e. pedestrians and bicyclists, based on information obtained from radar backscatter, and to develop a classification technique that allows automatic target recognition with a vehicle integrated system. For …
Features For Ranking Tweets Based On Credibility And Newsworthiness, Jacob W. Ross
Browse all Theses and Dissertations
We create a robust and general feature set for learning-to-rank algorithms that rank tweets based on credibility and newsworthiness. Previous work has demonstrated that when the training and testing data come from two distinct time periods, the ranker performs poorly. We improve upon previous work by creating a feature set that does not overfit a particular year or set of topics. This is critical given that the way people use social media changes over time and the topics discussed vary. In addition, we are constantly gaining new tweet data. Thus, it is important to be …
Graph-Based Regularization In Machine Learning: Discovering Driver Modules In Biological Networks, Xi Gao
Theses and Dissertations
Curiosity of human nature drives us to explore the origins of what makes each of us different. From ancient legends and mythology, Mendel's law, Punnett square to modern genetic research, we carry on this old but eternal question. Thanks to technological revolution, today's scientists try to answer this question using easily measurable gene expression and other profiling data. However, the exploration can easily get lost in the data of growing volume, dimension, noise and complexity. This dissertation is aimed at developing new machine learning methods that take data from different classes as input, augment them with knowledge of feature relationships, …
Lexical Mechanics: Partitions, Mixtures, And Context, Jake Ryland Williams
Graduate College Dissertations and Theses
Highly structured for efficient communication, natural languages are complex systems. Unlike in their computational cousins, functions and meanings in natural languages are relative, frequently prescribed to symbols through unexpected social processes. Despite grammar and definition, the presence of metaphor can leave unwitting language users "in the dark," so to speak. This is not problematic, but rather an important operational feature of languages, since the lifting of meaning onto higher-order structures allows individuals to compress descriptions of regularly-conveyed information. This compressed terminology, often only appropriate when taken locally (in context), is beneficial in an enormous world of novel experience. However, what …
Contrast Pattern Aided Regression And Classification, Vahid Taslimitehrani
Browse all Theses and Dissertations
Regression and classification techniques play an essential role in many data mining tasks and have broad applications. However, most state-of-the-art regression and classification techniques are unable to adequately model the interactions among predictor variables in highly heterogeneous datasets. New techniques that can effectively model such complex and heterogeneous structures are needed to significantly improve prediction accuracy. In this dissertation, we propose a novel type of accurate and interpretable regression and classification models, named Pattern Aided Regression (PXR) and Pattern Aided Classification (PXC) respectively. Both PXR and PXC rely on identifying regions in the data space where …
Geographic Relevance For Travel Search: The 2014-2015 Harvey Mudd College Clinic Project For Expedia, Inc., Hannah Long
Scripps Senior Theses
The purpose of this Clinic project is to help Expedia, Inc. expand the search capabilities it offers to its users. In particular, the goal is to help the company respond to unconstrained search queries by generating a method to associate hotels and regions around the world with the higher-level attributes that describe them, such as “family-friendly” or “culturally-rich.” Our team utilized machine-learning algorithms to extract metadata from textual data about hotels and cities. We focused on two machine-learning models: decision trees and Latent Dirichlet Allocation (LDA). The first appeared to be a promising approach, but would require more resources …
Data Analytics For Power Utility Storm Planning, Lan Lin, Aldo Dagnino, Derek Doran, Swapna S. Gokhale
Kno.e.sis Publications
As the world population grows, recent climatic changes seem to bring powerful storms to populated areas. The impact of these storms on utility services is devastating. Hurricane Sandy is a recent example of the enormous damages that storms can inflict on infrastructure, society, and the economy. Quick response to these emergencies represents a big challenge to electric power utilities. Traditionally utilities develop preparedness plans for storm emergency situations based on the experience of utility experts and with limited use of historical data. With the advent of the Smart Grid, utilities are incorporating automation and sensing technologies in their grids and …
Identification Of Informativeness In Text Using Natural Language Stylometry, Rushdi Shams
Electronic Thesis and Dissertation Repository
In this age of information overload, one experiences a rapidly growing over-abundance of written text. To assist with handling this bounty, this plethora of texts is now widely used to develop and optimize statistical natural language processing (NLP) systems. Surprisingly, the use of more fragments of text to train these statistical NLP systems may not necessarily lead to improved performance. We hypothesize that those fragments that help the most with training are those that contain the desired information. Therefore, determining informativeness in text has become a central issue in our view of NLP. Recent developments in this field have spawned …
Complex Network Analysis For Scientific Collaboration Prediction And Biological Hypothesis Generation, Qing Zhang
Theses and Dissertations
With the rapid development of digitalized literature, more and more knowledge has been discovered by computational approaches. This thesis addresses the problem of link prediction in co-authorship networks and protein-protein interaction networks derived from the literature. These networks (and most other types of networks) are growing over time and we assume that a machine can learn from past link creations by examining the network status at the time of their creation. Our goal is to create a computationally efficient approach to recommend new links for a node in a network (e.g., new collaborations in co-authorship networks and new interactions in …
Element Detection In Japanese Comic Book Panels, Toshihiro Kuboi
Master's Theses
Comic books are a unique and increasingly popular form of entertainment combining visual and textual elements of communication. This work pertains to making comic books more accessible. Specifically, this paper explains how we detect elements such as speech bubbles present in Japanese comic book panels. Some applications of the work presented in this paper are automatic detection of text and its transformation into audio or into other languages. Automatic detection of elements can also allow reasoning and analysis at a deeper semantic level than what’s possible today. Our approach uses an expert system and a machine learning system. The expert …
Predicting Music Genre Preferences Based On Online Comments, Andrew J. Sinclair
Master's Theses
Communication Accommodation Theory (CAT) states that individuals adapt to each other’s communicative behaviors. This adaptation is called “convergence.” In this work we explore the convergence of writing styles of users of the online music distribution platform SoundCloud.com. In order to evaluate our system we created a corpus of over 38,000 comments retrieved from SoundCloud in April 2014. The corpus represents comments from 8 distinct musical genres: Classical, Electronic, Hip Hop, Jazz, Country, Metal, Folk, and World. Our corpus contains: short comments, frequent misspellings, little sentence structure, hashtags, emoticons, and URLs. We adapt techniques used by researchers analyzing other …