Open Access. Powered by Scholars. Published by Universities.®

Physical Sciences and Mathematics Commons


Machine learning


Articles 1651 - 1680 of 1686

Full-Text Articles in Physical Sciences and Mathematics

SVM-Based Negative Data Mining To Binary Classification, Fuhua Jiang Aug 2006

Computer Science Dissertations

The properties of a training data set, such as its size, distribution, and number of attributes, contribute significantly to the generalization error of a learning machine. A poorly distributed data set is prone to produce a partially overfitted model. Two approaches proposed in this dissertation for binary classification enhance useful data information by mining negative data. First, an error-driven compensating hypothesis approach is based on Support Vector Machines (SVMs) with (1+k)-iteration learning, where the base learning hypothesis is iteratively compensated k times. This approach produces a new hypothesis on the new data set in which each label is …


Particle Swarm Optimization In Dynamic Pricing, Christopher K. Monson, Patrick B. Mullen, Kevin Seppi, Sean C. Warnick Jul 2006


Faculty Publications

Dynamic pricing is a real-time machine learning problem with scarce prior data and a concrete learning cost. While the Kalman Filter can be employed to track hidden demand parameters and extensions to it can facilitate exploration for faster learning, the exploratory nature of Particle Swarm Optimization makes it a natural choice for the dynamic pricing problem. We compare both the Kalman Filter and existing particle swarm adaptations for dynamic and/or noisy environments with a novel approach that time-decays each particle's previous best value; this new strategy provides more graceful and effective transitions between exploitation and exploration, a necessity in the …


Temporal Data Mining In A Dynamic Feature Space, Brent K. Wenerstrom May 2006


Theses and Dissertations

Many interesting real-world applications for temporal data mining are hindered by concept drift. One particular form of concept drift is characterized by changes to the underlying feature space. Seemingly little has been done to address this issue. This thesis presents FAE, an incremental ensemble approach to mining data subject to concept drift. FAE achieves better accuracies over four large datasets when compared with a similar incremental learning algorithm.


Learning Real-World Problems By Finding Correlated Basis Functions, Adam C. Drake Mar 2006


Theses and Dissertations

Learning algorithms based on the Fourier transform attempt to learn functions by approximating the largest coefficients of their Fourier representations. Nearly all previous work in Fourier-based learning has been in the theoretical realm, where properties of the transform have made it possible to prove many interesting learnability results. The real-world usefulness of Fourier-based methods, however, has not been thoroughly explored. This thesis explores methods for the practical application of Fourier-based learning. The primary contribution of this thesis is a new search algorithm for finding the largest coefficients of a function's Fourier representation. Although the search space is exponentially large, empirical …


Surface Realization Using A Featurized Syntactic Statistical Language Model, Thomas L. Packer Mar 2006


Theses and Dissertations

An important challenge in natural language surface realization is the generation of grammatical sentences from incomplete sentence plans. Realization can be broken into a two-stage process consisting of an over-generating rule-based module followed by a ranker that outputs the most probable candidate sentence based on a statistical language model. Thus far, an n-gram language model has been evaluated in this context. More sophisticated syntactic knowledge is expected to improve such a ranker. In this thesis, a new language model based on featurized functional dependency syntax was developed and evaluated. Generation accuracies and cross-entropy for the new language model did not …


K X N Trust-Based Agent Reputation, Christopher Alonzo Parker Jan 2006


Theses and Dissertations

In this research, a multi-agent system called KMAS is presented that models an environment of intelligent, autonomous, rational, and adaptive agents that reason about trust, and adapt trust based on experience. Agents reason and adapt using a modification of the k-Nearest Neighbor algorithm called (k X n) Nearest Neighbor where k neighbors recommend reputation values for trust during each of n interactions. Reputation allows a single agent to receive recommendations about the trustworthiness of others. One goal is to present a recommendation model of trust that outperforms MAS architectures relying solely on direct agent interaction. A second goal is to …


Task Similarity Measures For Transfer In Reinforcement Learning Task Libraries, James Carroll, Kevin Seppi Aug 2005


Faculty Publications

Recent research in task transfer and task clustering has created a need for task similarity measures in reinforcement learning. Determining task similarity is necessary for selective transfer, where only information from relevant tasks and portions of a task is transferred. Which task similarity measure to use is not immediately obvious. It can be shown that no single task similarity measure is uniformly superior. The optimal task similarity measure depends on the task transfer method being employed. We define similarity in terms of tasks, and propose several possible task similarity measures, dT, dp, dQ, and dR, which are based on …


Dynamically Optimized Context In Recommender Systems, Ghim-Eng Yap, Ah-Hwee Tan, Hwee Hwa Pang May 2005


Research Collection School Of Computing and Information Systems

Traditional approaches to recommender systems have not taken into account situational information when making recommendations, and this seriously limits the relevance of the results. This paper advocates context-awareness as a promising approach to enhance the performance of recommenders, and introduces a mechanism to realize this approach. We present a framework that separates the contextual concerns from the actual recommendation module, so that contexts can be readily shared across applications. More importantly, we devise a learning algorithm to dynamically identify the optimal set of contexts for a specific recommendation task and user. An extensive series of experiments has validated that our …


Improving And Extending Behavioral Animation Through Machine Learning, Jonathan J. Dinerstein Apr 2005


Theses and Dissertations

Behavioral animation has become popular for creating virtual characters that are autonomous agents and thus self-animating. This is useful for lessening the workload of human animators, populating virtual environments with interactive agents, etc. Unfortunately, current behavioral animation techniques suffer from three key problems: (1) deliberative behavioral models (i.e., cognitive models) are slow to execute; (2) interactive virtual characters cannot adapt online due to interaction with a human user; (3) programming of behavioral models is a difficult and time-intensive process. This dissertation presents a collection of papers that seek to overcome each of these problems. Specifically, these issues are alleviated …


Evaluating Online Trust Using Machine Learning Methods, Weihua Song Apr 2005


Doctoral Dissertations

Trust plays an important role in e-commerce, P2P networks, and information filtering. Current challenges in trust evaluations include: (1) finding trustworthy recommenders, (2) aggregating heterogeneous trust recommendations of different trust standards based on correlated observations and different evaluation processes, and (3) efficiently managing large trust systems where users may be sparsely connected and have multiple local reputations. The purpose of this dissertation is to provide solutions to these three challenges by applying ordered depth-first search, neural network, and hidden Markov model techniques. It designs an opinion filtered recommendation trust model to derive personal trust from heterogeneous recommendations; develops a reputation …


An Assessment Of Case-Based Reasoning For Spam Filtering, Sarah Jane Delany, Padraig Cunningham, Lorcan Coyle Jan 2005


Articles

Because of the changing nature of spam, a spam filtering system that uses machine learning will need to be dynamic. This suggests that a case-based (memory-based) approach may work well. Case-Based Reasoning (CBR) is a lazy approach to machine learning where induction is delayed to run time. This means that the case base can be updated continuously and new training data is immediately available to the induction process. In this paper we present a detailed description of such a system called ECUE and evaluate design decisions concerning the case representation. We compare its performance with an alternative system that uses …


Learning Discrete Hidden Markov Models From State Distribution Vectors, Luis G. Moscovich Jan 2005


LSU Doctoral Dissertations

Hidden Markov Models (HMMs) are probabilistic models that have been widely applied to a number of fields since their inception in the late 1960s. Computational Biology, Image Processing, and Signal Processing are but a few of the application areas of HMMs. In this dissertation, we develop several new efficient algorithms for learning HMM parameters. First, we propose a new polynomial-time algorithm for supervised learning of the parameters of a first order HMM from a state probability distribution (SD) oracle. The SD oracle provides the learner with the state distribution vector corresponding to a query string. We prove the correctness …


A Bayesian Technique For Task Localization In Multiple Goal Markov Decision Processes, James Carroll, Kevin Seppi Dec 2004


Faculty Publications

In a reinforcement learning task library system for Multiple Goal Markov Decision Processes (MGMDPs), localization in the task space allows the agent to determine whether a given task is already in its library in order to exploit previously learned experience. Task localization in MGMDPs can be accomplished through a Bayesian approach; however, a trivial approach fails when the rewards are not distributed normally. This can be overcome through our Bayesian Task Localization Technique (BTLT).


Vision-Based Human Directed Robot Guidance, Richard B. Arthur Oct 2004


Theses and Dissertations

This paper describes methods to track a user-defined point in the vision of a robot as it drives forward. This tracking allows a robot to keep itself directed at that point while driving so that it can get to that user-defined point. I develop and present two new multi-scale algorithms for tracking arbitrary points between two frames of video, as well as through a video sequence. The multi-scale algorithms do not use the traditional pyramid image, but instead use a data structure called an integral image (also known as a summed area table). The first algorithm uses edge-detection to track …
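The integral image (summed area table) mentioned above can be sketched in a few lines. This is a generic illustration of the data structure only, not the thesis's tracking algorithms; the function names `integral_image` and `box_sum` and the half-open rectangle convention are assumptions made for this example.

```python
def integral_image(img):
    """Build a (h+1) x (w+1) summed area table for a 2D list of numbers.

    Entry sat[y][x] holds the sum of all pixels above and to the left of
    (y, x), so the first row and column are zero padding.
    """
    h = len(img)
    w = len(img[0]) if h else 0
    sat = [[0] * (w + 1) for _ in range(h + 1)]
    for y in range(h):
        row_sum = 0  # running sum of the current row
        for x in range(w):
            row_sum += img[y][x]
            sat[y + 1][x + 1] = sat[y][x + 1] + row_sum
    return sat


def box_sum(sat, top, left, bottom, right):
    """Sum of the pixels in rows top..bottom-1 and columns left..right-1.

    Uses four table lookups, so the cost is constant regardless of box size.
    """
    return sat[bottom][right] - sat[top][right] - sat[bottom][left] + sat[top][left]
```

Because any rectangular sum costs four lookups, box averages at several scales can be read from one table, which is presumably why it can stand in for an image pyramid.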


Using Permutations Instead Of Student’s t Distribution For P-Values In Paired-Difference Algorithm Comparisons, Tony R. Martinez, Joshua Menke Jul 2004

Faculty Publications

The paired-difference t-test is commonly used in the machine learning community to determine whether one learning algorithm is better than another on a given learning task. This paper suggests the use of the permutation test instead because it calculates the exact p-value instead of an estimate. The permutation test is also distribution free and the time complexity is trivial for the commonly used 10-fold cross-validation paired-difference test. Results of experiments on real-world problems suggest it is not uncommon to see the t-test estimate deviate up to 30-50% from the exact p-value.
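A minimal sketch of an exact paired permutation test follows. The abstract does not give the test statistic, so this example assumes the common sign-flip formulation: under the null hypothesis each paired difference is equally likely to be positive or negative, and the p-value is the fraction of sign assignments whose absolute summed difference is at least the observed one. The function name `paired_permutation_p_value` is hypothetical.

```python
from itertools import product


def paired_permutation_p_value(diffs):
    """Exact two-sided p-value for paired differences via sign flipping.

    Enumerates all 2**n sign assignments, which is only 1024 cases for
    the ten differences of a 10-fold cross-validation.
    """
    observed = abs(sum(diffs))
    extreme = 0
    total = 0
    for signs in product((1, -1), repeat=len(diffs)):
        total += 1
        stat = abs(sum(s * d for s, d in zip(signs, diffs)))
        # Small tolerance so that ties are not lost to rounding error.
        if stat >= observed - 1e-12:
            extreme += 1
    return extreme / total
```

With four paired differences that are all equal to 1, only the all-positive and all-negative assignments are as extreme as the observation, giving 2/16 = 0.125.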


Solving Large MDPs Quickly With Partitioned Value Iteration, David Wingate Jun 2004

Theses and Dissertations

Value iteration is not typically considered a viable algorithm for solving large-scale MDPs because it converges too slowly. However, its performance can be dramatically improved by eliminating redundant or useless backups, and by backing up states in the right order. We present several methods designed to help structure value dependency, and present a systematic study of companion prioritization techniques which focus computation in useful regions of the state space. In order to scale to solve ever larger problems, we evaluate all enhancements and methods in the context of parallelizability. Using the enhancements, we discover that in many instances the limiting …
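The effect of backup order can be seen even in plain value iteration. The sketch below is ordinary in-place (Gauss-Seidel) value iteration with a caller-supplied state ordering; it is not the partitioned or prioritized method of the thesis. The MDP representation (`transitions[s][a]` as a list of `(probability, next_state, reward)` tuples) and all names are assumptions made for this example.

```python
def value_iteration(transitions, order, gamma=0.9, tol=1e-8):
    """In-place value iteration that backs up states in the given order.

    transitions[s][a] is a list of (probability, next_state, reward).
    A state with no actions is treated as terminal with value 0.
    Returns the value table and the number of sweeps performed.
    """
    values = {s: 0.0 for s in transitions}
    sweeps = 0
    while True:
        sweeps += 1
        delta = 0.0
        for s in order:
            actions = transitions[s]
            if not actions:
                continue  # terminal state, nothing to back up
            # Bellman backup: best expected one-step return over actions.
            best = max(
                sum(p * (r + gamma * values[s2]) for p, s2, r in outcomes)
                for outcomes in actions.values()
            )
            delta = max(delta, abs(best - values[s]))
            values[s] = best
        if delta < tol:
            return values, sweeps


# A three-state chain: a -> b -> goal, with reward 1 on reaching the goal.
chain = {
    "a": {"go": [(1.0, "b", 0.0)]},
    "b": {"go": [(1.0, "goal", 1.0)]},
    "goal": {},
}
forward_values, forward_sweeps = value_iteration(chain, ["a", "b", "goal"])
backward_values, backward_sweeps = value_iteration(chain, ["goal", "b", "a"])
```

On this chain, sweeping from the goal backwards converges in fewer sweeps than sweeping forwards, because each backup already sees the updated value of its successor.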


Machine Learning Techniques For Characterizing IEEE 802.11b Encrypted Data Streams, Michael J. Henson Mar 2004

Theses and Dissertations

As wireless networks become an increasingly common part of the infrastructure in industrialized nations, the vulnerabilities of this technology need to be evaluated. Even though there have been major advancements in encryption technology, security protocols and packet header obfuscation techniques, other distinguishing characteristics do exist in wireless network traffic. These characteristics include packet size, signal strength, channel utilization and others. Using these characteristics, windows of size 11, 31, and 51 packets are collected and machine learning (ML) techniques are trained to classify applications accessing the 802.11b wireless channel. The four applications used for this study included E-Mail, FTP, HTTP, and …


Using Symbolic Knowledge In The UMLS To Disambiguate Words In Small Datasets With A Naive Bayes Classifier, Gondy Leroy, Thomas C. Rindflesch Jan 2004

CGU Faculty Publications and Research

Current approaches to word sense disambiguation use and combine various machine-learning techniques. Most refer to characteristics of the ambiguous word and surrounding words and are based on hundreds of examples. Unfortunately, developing large training sets is time-consuming. We investigate the use of symbolic knowledge to augment machine-learning techniques for small datasets. UMLS semantic types assigned to concepts found in the sentence and relationships between these semantic types form the knowledge base. A naïve Bayes classifier was trained for 15 words with 100 examples for each. The most frequent sense of a word served as the baseline. The effect of increasingly …


Visual Expectations: Using Machine Learning To Identify Patterns In Psychological Data, Skyler Place Jan 2004


Honors Theses

The goal of this project was to utilize the tools of machine learning to evaluate the data obtained through experiments in psychology. Advanced pattern finding algorithms are an effective approach to analyzing large sets of data from any domain of science. Consequently, we have a psychological question and hypothesis, and a separate machine learning technique to assess these claims. The realm of psychology that I focused on is visual cognition, and how an individual's knowledge affects how they see the world. This alteration of visual data is a part of perception: when the brain enhances the data coming in from …


On Machine Learning Methods For Chinese Document Classification, Ji He, Ah-Hwee Tan, Chew-Lim Tan May 2003


Research Collection School Of Computing and Information Systems

This paper reports our comparative evaluation of three machine learning methods, namely k Nearest Neighbor (kNN), Support Vector Machines (SVM), and Adaptive Resonance Associative Map (ARAM) for Chinese document categorization. Based on two Chinese corpora, a series of controlled experiments evaluated their learning capabilities and efficiency in mining text classification knowledge. Benchmark experiments showed that their predictive performance was roughly comparable, especially on clean and well organized data sets. While kNN and ARAM yielded better performance than SVM on small and clean data sets, SVM and ARAM significantly outperformed kNN on noisy data. Comparing efficiency, kNN was notably more costly …


Machine Learning Approaches For Determining Effective Seeds For K-Means Algorithm, Kaveephong Lertwachara Apr 2003

Doctoral Dissertations

In this study, I investigate and conduct an experiment on two-stage clustering procedures (hybrid models), both in simulated environments where conditions such as collinearity problems and cluster structures are controlled, and in real-life problems where conditions are not controlled. The first hybrid model (NK) is an integration of a neural network (NN) and the k-means algorithm (KM), where NN screens seeds and passes them to KM. The second hybrid (GK) uses a genetic algorithm (GA) instead of the neural network. Both NN and GA used in this study are in their simplest possible forms.

In the simulated data sets, I investigate two …


Machine Learning Techniques For Efficient Query Processing In Knowledge Base Systems, Kevin Paul Grant Jan 2003

LSU Doctoral Dissertations

In this dissertation we propose a new technique for efficient query processing in knowledge base systems. Query processing in knowledge base systems poses strong computational challenges because of the presence of combinatorial explosion. This arises because at any point during query processing there may be too many subqueries available for further exploration. Overcoming this difficulty requires effective mechanisms for choosing from among these subqueries good subqueries for further processing. Inspired by existing works on stochastic logic programs, compositional modeling and probabilistic heuristic estimates we create a new, nondeterministic method to accomplish the task of subquery selection for query processing. Specifically, …


Machine-Learned Contexts For Linguistic Operations In German Sentence Realization, Eric K. Ringger, Simon Corston-Oliver, Michael Gamon, Robert Moore Jul 2002


Faculty Publications

We show that it is possible to learn the contexts for linguistic operations which map a semantic representation to a surface syntactic tree in sentence realization with high accuracy. We cast the problem of learning the contexts for the linguistic operations as classification tasks, and apply straightforward machine learning techniques, such as decision tree learning. The training data consist of linguistic features extracted from syntactic and semantic representations produced by a linguistic analysis system. The target features are extracted from links to surface syntax trees. Our evidence consists of four examples from the German sentence realization system code-named Amalgam: case …


Modular Machine Learning Methods For Computer-Aided Diagnosis Of Breast Cancer, Mia Kathleen Markey '94 Jun 2002


Doctoral Dissertations

The purpose of this study was to improve breast cancer diagnosis by reducing the number of benign biopsies performed. To this end, we investigated modular and ensemble systems of machine learning methods for computer-aided diagnosis (CAD) of breast cancer. A modular system partitions the input space into smaller domains, each of which is handled by a local model. An ensemble system uses multiple models for the same cases and combines the models' predictions.

Five supervised machine learning techniques (LDA, SVM, BP-ANN, CBR, CART) were trained to predict the biopsy outcome from mammographic findings (BIRADS™) and patient age based on a …


An Analysis Of The Effectiveness Of A Constructive Induction-Based Virus Detection Prototype, Kevin T. Damp Apr 2000


Theses and Dissertations

Computer viruses remain a tangible threat to systems both within the Department of Defense and throughout the greater international data communications infrastructure on which the DoD increasingly depends. This threat is exacerbated continually, as new viruses are introduced at an alarming rate by the growing collection of connected machines and their operators. Unfortunately, current antivirus solutions are ill-equipped to address these issues in the long term. This thesis documents an investigation into the use of constructive induction, a form of machine learning, as a supplemental antivirus technique theoretically capable of detecting previously unknown viruses through generalized decision-making techniques. A group …


Multiple Stochastic Learning Automata For Vehicle Path Control In An Automated Highway System, Cem Unsal, Pushkin Kachroo, John S. Bay Jan 1999


Electrical & Computer Engineering Faculty Research

This paper suggests an intelligent controller for an automated vehicle planning its own trajectory based on sensor and communication data. The intelligent controller is designed using the learning stochastic automata theory. Using the data received from on-board sensors, two automata (one for lateral actions, one for longitudinal actions) can learn the best possible action to avoid collisions. The system has the advantage of being able to work in unmodeled stochastic environments, unlike adaptive control methods or expert systems. Simulations for simultaneous lateral and longitudinal control of a vehicle provide encouraging results.
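To illustrate how a stochastic learning automaton adapts its action probabilities, here is a minimal sketch of one classic reinforcement scheme, linear reward-inaction. The abstract does not say which scheme the authors use, so the update rule, the learning rate, and the toy environment (fixed reward probabilities per action) are all assumptions made for this example.

```python
import random


def update_lri(probs, action, rewarded, rate=0.1):
    """Linear reward-inaction update; returns a new probability vector.

    On reward, probability mass moves toward the chosen action.
    On penalty, the probabilities are left unchanged.
    """
    if not rewarded:
        return list(probs)
    return [
        p + rate * (1.0 - p) if i == action else p * (1.0 - rate)
        for i, p in enumerate(probs)
    ]


def run_automaton(reward_probs, steps=2000, rate=0.05, seed=0):
    """Run one automaton against a hypothetical stationary environment.

    reward_probs[i] is the assumed chance that action i is rewarded.
    """
    rng = random.Random(seed)
    probs = [1.0 / len(reward_probs)] * len(reward_probs)
    for _ in range(steps):
        action = rng.choices(range(len(probs)), weights=probs)[0]
        rewarded = rng.random() < reward_probs[action]
        probs = update_lri(probs, action, rewarded, rate)
    return probs
```

In a two-automaton setup like the one described, one such probability vector would cover lateral actions and another longitudinal actions, each updated from its own sensor-derived feedback.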


Simulation Study Of Learning Automata Games In Automated Highway Systems, Cem Unsal, Pushkin Kachroo, John S. Bay Nov 1997


Electrical & Computer Engineering Faculty Research

One of the most important issues in Automated Highway System (AHS) deployment is intelligent vehicle control. While the technology to safely maneuver vehicles exists, the problem of making intelligent decisions to improve a single vehicle’s travel time and safety while optimizing the overall traffic flow is still a stumbling block. We propose an artificial intelligence technique called stochastic learning automata to design an intelligent vehicle path controller. Using the information obtained by on-board sensors and local communication modules, two automata are capable of learning the best possible (lateral and longitudinal) actions to avoid collisions. This learning method is capable of …


Utilizing Data And Knowledge Mining For Probabilistic Knowledge Bases, Daniel J. Stein III Dec 1996

Theses and Dissertations

Problems can arise whenever inferencing is attempted on a knowledge base that is incomplete. Our work shows that data mining techniques can be applied to fill in incomplete areas in Bayesian Knowledge Bases (BKBs), as well as in other knowledge-based systems utilizing probabilistic representations. The problem of inconsistency in BKBs has been addressed in previous work, where reinforcement learning techniques from neural networks were applied. However, the issue of automatically solving incompleteness in BKBs has yet to be addressed. Presently, incompleteness in BKBs is repaired through the application of traditional knowledge acquisition techniques. We show how association rules can be …


On The Impact Of Forgetting On Learning Machines, Rūsiņš Freivalds, Efim Kinber, Carl H. Smith Nov 1995


School of Computer Science & Engineering Faculty Publications

People tend not to have perfect memories when it comes to learning, or to anything else for that matter. Most formal studies of learning, however, assume a perfect memory. Some approaches have restricted the number of items that could be retained. We introduce a complexity theoretic accounting of memory utilization by learning machines. In our new model, memory is measured in bits as a function of the size of the input. There is a hierarchy of learnability based on increasing memory allotment. The lower bound results are proved using an unusual combination of pumping and mutual recursion theorem arguments. For …


Intelligent Control Of Vehicles: Preliminary Results On The Application Of Learning Automata Techniques To Automated Highway System, Cem Unsal, John S. Bay, Pushkin Kachroo Nov 1995


Electrical & Computer Engineering Faculty Research

We suggest an intelligent controller for an automated vehicle to plan its own trajectory based on sensor and communication data received. Our intelligent controller is based on an artificial intelligence technique called learning stochastic automata. The automaton can learn the best possible action to avoid collisions using the data received from on-board sensors. The system has the advantage of being able to work in unmodeled stochastic environments. Simulations for the lateral control of a vehicle using this AI method provide encouraging results.