Open Access. Powered by Scholars. Published by Universities.®

Physical Sciences and Mathematics Commons

Machine Learning

Articles 811 - 826 of 826

Full-Text Articles in Physical Sciences and Mathematics

Data-Intensive Computing For Bioinformatics Using Virtualization Technologies And HPC Infrastructures, Pengfei Xuan Dec 2011

All Theses

Bioinformatics applications often involve many computational components and massive data sets, which makes them very difficult to deploy on a single computing machine. In this thesis, we designed a data-intensive computing platform for bioinformatics applications using virtualization technologies and high performance computing (HPC) infrastructures with the concept of multi-tier architecture, which can seamlessly integrate the web user interface (presentation tier), scientific workflow (logic tier) and computing infrastructure (data/computing tier). We demonstrated our platform on two bioinformatics projects. First, we redesigned and deployed the cotton marker database (CMD) (http://www.cottonmarker.org), a centralized web portal in the cotton research community, using the …


Fast Parallel Machine Learning Algorithms For Large Datasets Using Graphic Processing Unit, Qi Li Nov 2011

Theses and Dissertations

This dissertation develops parallel processing algorithms for Graphics Processing Units (GPUs) in order to solve machine learning problems for large datasets. In particular, it contributes to the development of fast GPU-based algorithms for calculating distance (i.e., similarity, affinity, closeness) matrices. It also presents the algorithm and implementation of a fast parallel Support Vector Machine (SVM) using GPUs. These application tools are developed using Compute Unified Device Architecture (CUDA), which is a popular software framework for general-purpose computing on GPUs (GPGPU). Distance calculation is the core part of all machine learning algorithms because the closer the query …
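As a rough illustration of the distance-matrix computation mentioned in this abstract, the sketch below computes a pairwise Euclidean distance matrix in plain Python. It is not the dissertation's CUDA implementation: the function name `pairwise_distances`, the choice of Euclidean distance, and the sequential loops are all assumptions made for illustration. Each entry depends only on its own pair of points, which is why this kind of computation is commonly assigned one entry per GPU thread.

```python
import math


def pairwise_distances(points):
    """Return the symmetric matrix of Euclidean distances between points.

    Hypothetical helper for illustration only; a GPU version would
    compute each (i, j) entry in its own thread instead of looping.
    """
    n = len(points)
    dist = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            # Each entry depends only on points[i] and points[j].
            d = math.sqrt(sum((a - b) ** 2 for a, b in zip(points[i], points[j])))
            dist[i][j] = d
            dist[j][i] = d  # the matrix is symmetric
    return dist


matrix = pairwise_distances([(0.0, 0.0), (3.0, 4.0), (6.0, 8.0)])
```

The cost grows quadratically with the number of points, which is one reason large datasets motivate a parallel implementation.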


Multivalued Subsets Under Information Theory, Indraneel Dabhade Aug 2011

All Theses

In the fields of finance, engineering and varied sciences, data mining and machine learning have held an eminent position in predictive analysis. Complex algorithms and adaptive decision models have contributed towards streamlining directed research as well as improving the accuracy of forecasting. Researchers in the fields of mathematics and computer science have made significant contributions towards the development of this field. Classification-based modeling, which holds a significant position amongst the different rule-based algorithms, is one of the most widely used decision-making tools. The decision tree has a place of profound significance in classification-based modeling. A number of heuristics …


SampleRank: Training Factor Graphs With Atomic Gradients, Michael Wick, Khashayar Rohanimanesh, Kedar Bellare, Aron Culotta, Andrew McCallum Jan 2011

Andrew McCallum

We present SampleRank, an alternative to contrastive divergence (CD) for estimating parameters in complex graphical models. SampleRank harnesses a user-provided loss function to distribute stochastic gradients across an MCMC chain. As a result, parameter updates can be computed between arbitrary MCMC states. SampleRank is not only faster than CD, but also achieves better accuracy in practice (up to 23% error reduction on noun-phrase coreference).
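The idea of computing a parameter update between two MCMC states can be sketched as a perceptron-style ranking step. This is an assumption about the general shape of such an update, not the authors' exact algorithm: the function name `pairwise_rank_update`, the feature vectors, the learning rate `lr`, and the zero-margin test are all hypothetical. The user-provided loss decides which state is better, and the weights change only when the model's score does not already rank the two states in that order.

```python
def dot(weights, features):
    """Inner product of two equal-length lists of floats."""
    return sum(w * f for w, f in zip(weights, features))


def pairwise_rank_update(weights, feats_a, feats_b, loss_a, loss_b, lr=1.0):
    """Return updated weights for one pair of sampled states.

    Hypothetical sketch: the state with the lower loss is treated as
    the better one, and the weights move toward its features only if
    the current model fails to score it strictly higher.
    """
    if loss_a == loss_b:
        return list(weights)  # the loss expresses no preference
    if loss_a < loss_b:
        better, worse = feats_a, feats_b
    else:
        better, worse = feats_b, feats_a
    diff = [b - w for b, w in zip(better, worse)]
    if dot(weights, diff) > 0:
        return list(weights)  # model already agrees with the loss
    return [w + lr * d for w, d in zip(weights, diff)]


weights = pairwise_rank_update([0.0, 0.0], [1.0, 0.0], [0.0, 1.0], 0.2, 0.5)
```

Because the update needs only the feature difference between two states, it can be applied at every step of a sampler rather than after a full inference pass.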


Fundamental Work Toward An Image Processing-Empowered Dental Intelligent Educational System, Grace Olsen Apr 2010

Theses and Dissertations

Computer-aided education in dental schools is greatly needed in order to reduce the need for human instructors to provide guidance and feedback as students practice dental procedures. A portable computer-aided educational system with advanced digital image processing capabilities would be less expensive than current computer-aided dental educational systems and would also address some of their limitations. This dissertation outlines the development of novel components that would be part of such a system. This research includes the design of a novel image processing technique, the Directed Active Shape Model algorithm, which is used to locate the tooth and drilled preparation from …


Autonomous Geometric Precision Error Estimation In Low-Level Computer Vision Tasks, Andrés Corrada-Emmanuel, Howard Schultz Jul 2008

Andrés Corrada-Emmanuel

Errors in map-making tasks using computer vision are sparse. We demonstrate this by considering the construction of digital elevation models that employ stereo matching algorithms to triangulate real-world points. This sparsity, coupled with a geometric theory of errors recently developed by the authors, allows for autonomous agents to calculate their own precision independently of ground truth. We connect these developments with recent advances in the mathematics of sparse signal reconstruction or compressed sensing. The theory presented here extends the autonomy of 3-D model reconstructions discovered in the 1990s to their errors.


Autonomous Estimates Of Horizontal Decorrelation Lengths For Digital Elevation Models, Andrés Corrada-Emmanuel, Howard Schultz Jan 2008

Andrés Corrada-Emmanuel

The precision errors in a collection of digital elevation models (DEMs) can be estimated in the presence of large but sparse correlations even when no ground truth is known. We demonstrate this by considering the problem of how to estimate the horizontal decorrelation length of DEMs produced by an automatic photogrammetric process that relies on the epipolar constraint equations. The procedure is based on a set of autonomous elevation difference equations recently proposed by us. In this paper we show that these equations can only estimate the precision errors of DEMs. The accuracy errors are unknowable since there is no …


Data Mining Methods For Malware Detection, Muazzam Siddiqui Jan 2008

Electronic Theses and Dissertations

This research investigates the use of data mining methods for malware (malicious program) detection and proposes a framework as an alternative to the traditional signature detection methods. Traditional signature-based approaches fail for new and unknown malware, for which signatures are not available. We present a data mining framework to detect malicious programs. We collected, analyzed and processed several thousand malicious and clean programs to identify the best features and build models that can classify a given program as malware or clean. Our research is closely related to information retrieval …


Computational Intelligence Based Classifier Fusion Models For Biomedical Classification Applications, Xiujuan Chen Nov 2007

Computer Science Dissertations

The generalization abilities of machine learning algorithms often depend on the algorithms’ initialization, parameter settings, training sets, or feature selections. For instance, SVM classifier performance largely relies on whether the selected kernel functions are suitable for real application data. To enhance the performance of individual classifiers, this dissertation proposes classifier fusion models using computational intelligence knowledge to combine different classifiers. The first fusion model, called T1FFSVM, combines multiple SVM classifiers by constructing a fuzzy logic system. T1FFSVM can be improved by tuning the fuzzy membership functions of linguistic variables using genetic algorithms. The improved model is called GFFSVM. To better …


Topic And Role Discovery In Social Networks With Experiments On Enron And Academic Email, Andrew McCallum, Xuerui Wang, Andrés Corrada-Emmanuel Oct 2007

Andrés Corrada-Emmanuel

Previous work in social network analysis (SNA) has modeled the existence of links from one entity to another, but not the attributes such as language content or topics on those links. We present the Author-Recipient-Topic (ART) model for social network analysis, which learns topic distributions based on the direction-sensitive messages sent between entities. The model builds on Latent Dirichlet Allocation (LDA) and the Author-Topic (AT) model, adding the key attribute that the distribution over topics is conditioned distinctly on both the sender and recipient, steering the discovery of topics according to the relationships between people. We give results on both the Enron …


Pedagogical Possibilities For The N-Puzzle Problem, Zdravko Markov, Ingrid Russell, Todd W. Neller, Neli Zlatareva Oct 2006

Computer Science Faculty Publications

In this paper we present work on a project funded by the National Science Foundation with a goal of unifying the Artificial Intelligence (AI) course around the theme of machine learning. Our work involves the development and testing of an adaptable framework for the presentation of core AI topics that emphasizes the relationship between AI and computer science. Several hands-on laboratory projects that can be closely integrated into an introductory AI course have been developed. We present an overview of one of the projects and describe the associated curricular materials that have been developed. The project uses machine learning as …


Granular Support Vector Machines Based On Granular Computing, Soft Computing And Statistical Learning, Yuchun Tang May 2006

Computer Science Dissertations

With the emergence of biomedical informatics, Web intelligence, and E-business, new challenges are arising for knowledge discovery and data mining modeling problems. In this dissertation work, a framework named Granular Support Vector Machines (GSVM) is proposed to systematically and formally combine statistical learning theory, granular computing theory and soft computing theory to address challenging predictive data modeling problems effectively and/or efficiently, with specific focus on binary classification problems. In general, GSVM works in 3 steps. Step 1 is granulation to build a sequence of information granules from the original dataset or from the original feature space. Step 2 is modeling Support …


Enhancing Undergraduate AI Courses Through Machine Learning Projects, Ingrid Russell, Zdravko Markov, Todd W. Neller, Susan Coleman Oct 2005

Computer Science Faculty Publications

It is generally recognized that an undergraduate introductory Artificial Intelligence course is challenging to teach. This is, in part, due to the diverse and seemingly disconnected core topics that are typically covered. The paper presents work funded by the National Science Foundation to address this problem and to enhance the student learning experience in the course. Our work involves the development of an adaptable framework for the presentation of core AI topics through a unifying theme of machine learning. A suite of hands-on semester-long projects is developed, each involving the design and implementation of a learning system that enhances a …


Multizoom Activity Recognition Using Machine Learning, Raymond Smith Jan 2005

Electronic Theses and Dissertations

In this thesis we present a system for detection of events in video. First, a multiview approach to automatically detect and track heads and hands in a scene is described. Then, by making use of epipolar, spatial, trajectory, and appearance constraints, objects are labeled consistently across cameras (zooms). Finally, we demonstrate a new machine learning paradigm, TemporalBoost, that can recognize events in video. One aspect of any machine learning algorithm is the feature set used. The approach taken here is to build a large set of activity features, though TemporalBoost itself is able to work with any feature set …


Unifying An Introduction To Artificial Intelligence Course Through Machine Learning Laboratory Experiences, Ingrid Russell, Zdravko Markov, Todd W. Neller, Michael Georgiopoulos, Susan Coleman Jan 2005

Computer Science Faculty Publications

This paper presents work on a collaborative project funded by the National Science Foundation that incorporates machine learning as a unifying theme to teach fundamental concepts typically covered in introductory Artificial Intelligence courses. The project involves the development of an adaptable framework for the presentation of core AI topics. This is accomplished through the development, implementation, and testing of a suite of adaptable, hands-on laboratory projects that can be closely integrated into the AI course. Through the design and implementation of learning systems that enhance commonly-deployed applications, our model acknowledges that intelligent systems are best taught through their application …


Pattern Recognition Via Machine Learning With Genetic Decision-Programming, Carl C. Hoff Jan 2005

Browse all Theses and Dissertations

At the intersection of pattern recognition, machine learning, and evolutionary computation is a new search technique by which computers might program themselves. That technique is called genetic decision-programming. A computer can gain the ability to distinguish among the things that it needs to recognize by using genetic decision-programming for pattern discovery and concept learning. Those patterns and concepts can be easily encoded in the spines of a decision program (tree or diagram). A spine consists of two parts: (1) the test-outcome pairs along a path from the program's root to any of its leaves and (2) the conclusion in that …