Open Access. Powered by Scholars. Published by Universities.®

Physical Sciences and Mathematics Commons

Databases and Information Systems

Articles 4951 - 4980 of 6716

Full-Text Articles in Physical Sciences and Mathematics

Player Performance Prediction In Massively Multiplayer Online Role-Playing Games (MMORPGs), Kyong Jin Shim, Richa Sharan, Jaideep Srivastava Jun 2010

Research Collection School Of Computing and Information Systems

In this study, we propose a comprehensive performance management tool for measuring and reporting operational activities of game players. This study uses performance data of game players in EverQuest II, a popular MMORPG developed by Sony Online Entertainment, to build performance prediction models for game players. The prediction models provide a projection of a player’s future performance based on his past performance, which is expected to be a useful addition to existing player performance monitoring tools. First, we show that variations of PECOTA [2] and MARCEL [3], two of the most popular baseball home run prediction methods, can be used for game player performance …


Weakly-Supervised Hashing In Kernel Space, Yadong Mu, Jialie Shen, Shuicheng Yan Jun 2010

Research Collection School Of Computing and Information Systems

The explosive growth of vision data motivates recent studies on efficient data indexing methods such as locality-sensitive hashing (LSH). Most existing approaches perform hashing in an unsupervised way. In this paper we move one step forward and propose a supervised hashing method, i.e., the LAbel-regularized Max-margin Partition (LAMP) algorithm. The proposed method generates hash functions in a weakly-supervised setting, where a small portion of sample pairs are manually labeled to be “similar” or “dissimilar”. We formulate the task as a Constrained Convex-Concave Procedure (CCCP), which can be relaxed into a series of convex sub-problems solvable with efficient Quadratic Programming (QP). …


OTL: A Framework Of Online Transfer Learning, Peilin Zhao, Steven C. H. Hoi Jun 2010

Research Collection School Of Computing and Information Systems

In this paper, we investigate a new machine learning framework called Online Transfer Learning (OTL) that aims to transfer knowledge from some source domain to an online learning task on a target domain. We do not assume the target data follows the same class or generative distribution as the source data, and our key motivation is to improve a supervised online learning task in a target domain by exploiting the knowledge that had been learned from a large amount of training data in source domains. OTL is in general challenging since data in both domains not only can be different in …


Efficient Mutual Nearest Neighbor Query Processing For Moving Object Trajectories, Yunjun Gao, Baihua Zheng, Gencai Chen, Qing Li, Chun Chen, Gang Chen Jun 2010

Research Collection School Of Computing and Information Systems

Given a set D of trajectories, a query object q, and a query time extent Γ, a mutual (i.e., symmetric) nearest neighbor (MNN) query over trajectories finds from D the set of trajectories that are among the k1 nearest neighbors (NNs) of q within Γ and, at the same time, have q as one of their k2 NNs. This type of query is useful in many applications such as decision making, data mining, and pattern recognition, as it considers both the proximity of the trajectories to q and the proximity of q to the trajectories. In this paper, we first formalize MNN search …
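
The MNN definition above can be illustrated with a brute-force sketch. This is not the paper's algorithm: it assumes static 2D points instead of trajectories, ignores the time extent Γ, and scans all data rather than using an index. The function names `knn` and `mutual_nn` are hypothetical.

```python
import math

def knn(target, candidates, k):
    """Return the k candidates closest to target (Euclidean distance)."""
    return sorted(candidates, key=lambda p: math.dist(target, p))[:k]

def mutual_nn(data, q, k1, k2):
    """Points of data that are among q's k1 NNs and have q among their own k2 NNs.

    Assumption: static points stand in for trajectories; no time extent is modeled.
    """
    result = []
    for p in knn(q, data, k1):
        # Neighbors of p are searched among all other data points plus q itself.
        others = [o for o in data if o != p] + [q]
        if q in knn(p, others, k2):
            result.append(p)
    return result

data = [(0.0, 0.0), (1.0, 0.0), (5.0, 0.0), (5.5, 0.0)]
q = (1.8, 0.0)
print(mutual_nn(data, q, k1=2, k2=1))
print(mutual_nn(data, q, k1=2, k2=2))
```

With k2 = 1, the point (0, 0) is dropped even though it is one of q's two nearest neighbors, because its own nearest neighbor is (1, 0) rather than q. This asymmetry is what the mutual condition filters out.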


Variance Reduction Techniques For Estimating Quantiles And Value-At-Risk, Fang Chu May 2010

Dissertations

Quantiles, as a performance measure, arise in many practical contexts. In finance, quantiles are called values-at-risk (VARs), and they are widely used in the financial industry to measure portfolio risk. When the cumulative distribution function is unknown, the quantile cannot be computed exactly and must be estimated. In addition to computing a point estimate for the quantile, it is important to also provide a confidence interval for the quantile as a way of indicating the error in the estimate. A problem with crude Monte Carlo is that the resulting confidence interval may be large, which is often the case …
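
As a rough illustration of the crude Monte Carlo baseline mentioned above (not the variance reduction techniques the dissertation develops), the sketch below estimates a quantile from samples and attaches an approximate distribution-free confidence interval built from order statistics. The function name `quantile_with_ci`, the normal approximation to the binomial, and the default z = 1.96 are assumptions made for illustration.

```python
import math

def quantile_with_ci(samples, p, z=1.96):
    """Point estimate and approximate confidence interval for the p-quantile.

    Assumption: the interval uses order statistics at ranks n*p +/- z*sqrt(n*p*(1-p)),
    a standard normal approximation to the binomial count of samples below the quantile.
    """
    xs = sorted(samples)
    n = len(xs)
    k = max(1, math.ceil(n * p))              # rank of the point estimate (1-based)
    half = z * math.sqrt(n * p * (1 - p))     # half-width of the interval in ranks
    lo = max(1, math.floor(n * p - half))
    hi = min(n, math.ceil(n * p + half))
    return xs[k - 1], (xs[lo - 1], xs[hi - 1])

estimate, interval = quantile_with_ci(range(1, 101), p=0.5)
print(estimate, interval)
```

The width of the interval in ranks grows with the square root of n, so narrowing it by brute force requires many more samples, which is the motivation for variance reduction.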


Iris : Digital Scholarship, Publishing, And Preservation At Northeastern University, Hillary Corbett May 2010

Hillary Corbett

No abstract provided.


Sense Of Place In Virtual World Learning Environments: A Conceptual Exploration, Vipin Arora, Deepak Khazanchi May 2010

Information Systems and Quantitative Analysis Faculty Proceedings & Presentations

In this paper we conceptually explore the notion of sense of place and its potential use in the design of a ‘place for learning’ in 3D immersive environments such as virtual worlds. We draw from earlier research in the fields of environmental psychology, social psychology and Human Computer Interaction. Our goal in this paper is to summarize the conceptual foundations that will form the basis for further empirical research aimed to inform institutions aspiring to create learning spaces in 3D virtual worlds.


Distance-Based Measures Of Inconsistency And Incoherency For Description Logics, Yue Ma, Pascal Hitzler May 2010

Computer Science and Engineering Faculty Publications

Inconsistency and incoherency are two sorts of erroneous information in a DL ontology which have been widely discussed in ontology-based applications. For example, they have been used to detect modeling errors during ontology construction. To provide more informative metrics which can tell the differences between inconsistent ontologies and between incoherent terminologies, there has been some work on measuring inconsistency of an ontology and on measuring incoherency of a terminology. However, most of these approaches focus on measuring either inconsistency or incoherency and offer no clear idea of how to extend them to allow for the other. In this paper, …


Capacity-Driven Pricing Mechanism In Special Service Industries, Lijian Chen, Suraj M. Alexander May 2010

MIS/OM/DS Faculty Publications

We propose a capacity-driven pricing mechanism for several service industries in which the customer behavior, the price-demand relationship, and the competition are significantly distinct from other industries. According to our observations, the price-demand relationship in these industries cannot be modeled by fitted curves; the customers would neither plan in advance nor purchase the service strategically; and the competition would be largely local. We analyze both risk-neutral and risk-averse pricing models and conclude the proposed capacity-driven model would be the optimal solution under mild assumptions. The resulting pricing mechanism has been implemented at …


Some Trust Issues In Social Networks And Sensor Networks, Krishnaprasad Thirunarayan, Pramod Anantharam, Cory Andrew Henson, Amit P. Sheth May 2010

Kno.e.sis Publications

Trust and reputation are becoming increasingly important in diverse areas such as search, e-commerce, social media, semantic sensor networks, etc. We review past work and explore future research issues relevant to trust in social/sensor networks and interactions. We advocate a balanced, iterative approach to trust that marries both theory and practice. On the theoretical side, we investigate models of trust to analyze and specify the nature of trust and trust computation. On the practical side, we propose to uncover aspects that provide a basis for trust formation and techniques to extract trust information from concrete social/sensor networks and interactions. We …


Personalization By Website Transformation: Theory And Practice, Saverio Perugini May 2010

Computer Science Faculty Publications

We present an analysis of a progressive series of out-of-turn transformations on a hierarchical website to personalize a user’s interaction with the site. We formalize the transformation in graph-theoretic terms and describe a toolkit we built that enumerates all of the traversals enabled by every possible complete series of these transformations in any site and computes a variety of metrics while simulating each traversal therein to qualify the relationship between a site’s structure and the cumulative effect of support for the transformation in a site. We employed this toolkit in two websites. The results indicate that the transformation enables users …


Linked Sensor Data, Harshal Kamlesh Patni, Cory Andrew Henson, Amit P. Sheth May 2010

Kno.e.sis Publications

A number of government, corporate, and academic organizations are collecting enormous amounts of data provided by environmental sensors. However, this data is too often locked within organizations and underutilized by the greater community. In this paper, we present a framework to make this sensor data openly accessible by publishing it on the Linked Open Data (LOD) Cloud. This is accomplished by converting raw sensor observations to RDF and linking with other datasets on LOD. With such a framework, organizations can make large amounts of sensor data openly accessible, thus allowing greater opportunity for utilization and analysis.


A Comparative Study On Text Categorization, Aditya Chainulu Karamcheti May 2010

UNLV Theses, Dissertations, Professional Papers, and Capstones

Automated text categorization is a supervised learning task, defined as assigning category labels to new documents based on likelihood suggested by a training set of labeled documents. Two examples of methodology for text categorizations are Naive Bayes and K-Nearest Neighbor.

In this thesis, we implement two categorization engines based on Naive Bayes and K-Nearest Neighbor methodology. We then compare the effectiveness of these two engines by calculating standard precision and recall for a collection of documents. We will further report on time efficiency of these two engines.
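
For reference, the standard precision and recall measures mentioned above can be computed per category as in the minimal sketch below. The function name `precision_recall` and the toy labels are hypothetical and are not taken from the thesis.

```python
def precision_recall(true_labels, predicted_labels, category):
    """Precision and recall of one category, given parallel lists of labels."""
    pairs = list(zip(true_labels, predicted_labels))
    true_positives = sum(1 for t, p in pairs if t == category and p == category)
    predicted = sum(1 for _, p in pairs if p == category)  # documents assigned to the category
    actual = sum(1 for t, _ in pairs if t == category)     # documents truly in the category
    precision = true_positives / predicted if predicted else 0.0
    recall = true_positives / actual if actual else 0.0
    return precision, recall

truth = ["sport", "sport", "politics", "sport"]
guess = ["sport", "politics", "politics", "sport"]
print(precision_recall(truth, guess, "sport"))
```

In this toy example every document labeled "sport" by the classifier is correct (precision 1.0), but one of the three true "sport" documents is missed (recall 2/3).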


Two-View Transductive Support Vector Machines, Guangxia Li, Steven C. H. Hoi, Kuiyu Chang May 2010

Research Collection School Of Computing and Information Systems

Obtaining high-quality and up-to-date labeled data can be difficult in many real-world machine learning applications, especially for Internet classification tasks like review spam detection, which changes at a very brisk pace. For some problems, there may exist multiple perspectives, so-called views, of each data sample. For example, in text classification, the typical view contains a large number of raw content features such as term frequency, while a second view may contain a small but highly informative number of domain-specific features. We thus propose a novel two-view transductive SVM that takes advantage of both the abundant amount of unlabeled data …


Exploiting Query Logs For Cross-Lingual Query Suggestions, Wei Gao, Cheng Niu, Jian-Yun Nie, Ming Zhou, Kam-Fai Wong, Hsiao-Wuen Hon May 2010

Research Collection School Of Computing and Information Systems

Query suggestion aims to suggest relevant queries for a given query, which helps users better specify their information needs. Previous work on query suggestion has been limited to the same language. In this article, we extend it to cross-lingual query suggestion (CLQS): for a query in one language, we suggest similar or relevant queries in other languages. This is very important to the scenarios of cross-language information retrieval (CLIR) and other related cross-lingual applications. Instead of relying on existing query translation technologies for CLQS, we present an effective means to map the input query of one language to queries of …


Enterprise Users And Web Search Behavior, April Ann Lewis May 2010

Masters Theses

This thesis describes an analysis of user web query behavior associated with Oak Ridge National Laboratory’s (ORNL) Enterprise Search System (hereafter, ORNL Intranet). The ORNL Intranet provides users a means to search all kinds of data stores for relevant business and research information using a single query. The Global Intranet Trends for 2010 Report suggests the biggest current obstacle for corporate intranets is “findability and Siloed content”. Intranets differ from internets in the way they create, control, and share content, which can often make it difficult and sometimes impossible for users to find information. Stenmark (2006) first noted studies of corporate …


Exclusive Lasso For Multi-Task Feature Selection, Yang Zhou, Rong Jin, Steven C. H. Hoi May 2010

Research Collection School Of Computing and Information Systems

We propose a novel group regularization which we call exclusive lasso. Unlike the group lasso regularizer that assumes co-varying variables in groups, the proposed exclusive lasso regularizer models the scenario when variables in the same group compete with each other. Analysis is presented to illustrate the properties of the proposed regularizer. We present a framework of kernel-based multi-task feature selection algorithm based on the proposed exclusive lasso regularizer. An efficient algorithm is derived to solve the related optimization problem. Experiments with document categorization show that our approach outperforms state-of-the-art algorithms for multi-task feature selection.
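
The contrast between the two regularizers can be sketched numerically. The abstract does not state the formulas, so the forms below are assumptions based on how these penalties are commonly written: group lasso as the sum over groups of the within-group L2 norm, and exclusive lasso as the sum over groups of the squared within-group L1 norm. The function names and the toy weight vectors are hypothetical.

```python
import math

def group_lasso_penalty(w, groups):
    """Sum over groups of the L2 norm of the weights in each group (assumed form)."""
    return sum(math.sqrt(sum(w[j] ** 2 for j in g)) for g in groups)

def exclusive_lasso_penalty(w, groups):
    """Sum over groups of the squared L1 norm of the weights in each group (assumed form)."""
    return sum(sum(abs(w[j]) for j in g) ** 2 for g in groups)

groups = [[0, 1], [2, 3]]
spread = [3.0, 0.0, 0.0, 4.0]   # one active weight per group
packed = [3.0, 4.0, 0.0, 0.0]   # both active weights in the same group

for name, w in (("spread", spread), ("packed", packed)):
    print(name, group_lasso_penalty(w, groups), exclusive_lasso_penalty(w, groups))
```

Under these assumed forms, the group lasso penalty is smaller when the active weights share a group (5 versus 7), while the exclusive lasso penalty is smaller when they sit in different groups (25 versus 49), which matches the description of variables in the same group competing with each other.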


Learning User Profiles For Personalized Information Dissemination, Ah-Hwee Tan, Christine Teo May 2010

Research Collection School Of Computing and Information Systems

Personalized information systems represent the recent effort of delivering information to users more effectively in the modern electronic age. This paper illustrates how a supervised Adaptive Resonance Theory (ART) system, known as fuzzy ARAM, can be used to learn user profiles for personalized information dissemination. ARAM learning is on-line, fast, and incremental. Acquisition of new knowledge does not require re-training on previously learned cases. ARAM integrates both user-defined and system-learned knowledge in a single framework. Therefore inconsistency between the two knowledge sources will not arise. ARAM has been used to develop a personalized news system known as PIN. Preliminary experiments …


Rapport: Semantic-Sensitive Namespace Management In Large-Scale File Systems, Yu Hua, Hong Jiang, Yifeng Zhu, Dan Feng Apr 2010

CSE Technical Reports

Explosive growth in volume and complexity of data exacerbates the key challenge to effectively and efficiently manage data in a way that fundamentally improves the ease and efficacy of their use. Existing large-scale file systems rely on hierarchically structured namespace that leads to severe performance bottlenecks and renders it impossible to support real-time queries on multi-dimensional attributes. This paper proposes a novel semantic-sensitive scheme, called Rapport, to provide dynamic and adaptive namespace management and support complex queries. The basic idea is to build files’ namespace by utilizing their semantic correlation and exploiting dynamic evolution of attributes to support namespace management. …


Trust In Social And Sensor Networks, Pramod Anantharam, Krishnaprasad Thirunarayan, Cory Andrew Henson, Amit P. Sheth Apr 2010

Kno.e.sis Publications

Trust can be defined as the perception of the trustor about the degree to which the trustee would satisfy an expectation about a transaction constituting risk. Trust plays a pivotal role when the risk in believing incorrect information is high. With Web 2.0 where user generated content and real time interactions dominate, the openness of data contribution may hinder the quality of information we can get.


Semantics-Empowered Text Exploration For Knowledge Discovery, Delroy H. Cameron, Pablo N. Mendes, Amit P. Sheth, Victor Chan Apr 2010

Kno.e.sis Publications

The interaction paradigm offered by most contemporary Web Information Systems is a search-and-sift paradigm in which users manually seek information using hyperlinked documents. This paradigm is derived from a document-centric model that gives users minimal support for scanning through high volumes of text. We present a novel information exploration paradigm based on a data-centric view of corpora, along with a prototype implementation that demonstrates the value in content-driven navigation. We leverage semantic metadata to link data in documents by exploiting named relationships between entities. We also present utilities for gathering user generated navigation trails, critical for knowledge discovery. We discuss …


Dynamic Associative Relationships On The Linked Open Data Web, Pablo N. Mendes, Pavan Kapanipathi, Delroy H. Cameron, Amit P. Sheth Apr 2010

Kno.e.sis Publications

We provide a definition of context based on theme, time and location, and propose a mixed retrieval/extraction model for the dynamic suggestion of trending relationships to LOD resources.


What Goes Around Comes Around - Improving Linked Open Data Through On-Demand Model Creation, Christopher Thomas, Wenbo Wang, Pankaj Mehra, Delroy H. Cameron, Pablo N. Mendes, Amit P. Sheth Apr 2010

Kno.e.sis Publications

Web 2.0 has changed the way we share and keep up with information. We communicate through social media platforms and make the information we exchange to a large extent publicly available. Linked Open Data (LOD) follows the same paradigm of sharing information but also makes it machine accessible. LOD provides an abundance of structured information albeit in a less formally rigorous form than would be desirable for Semantic Web applications. Nevertheless, most of the LOD assertions are community reviewed and we can rely on their accuracy to a large extent. In this work we want to follow the Web 2.0 …


A PDA Intervention To Sustain Smoking Cessation In Clients With Socioeconomic Vulnerability, Lynne Buchanan, Deepak Khazanchi Apr 2010

Information Systems and Quantitative Analysis Faculty Publications

This article describes a pilot study to explore use of a personal digital assistant (PDA) to sustain smoking cessation after discharge in clients with socioeconomic vulnerability. The major aim is to describe technology acceptance (perceived ease of use, usefulness, and attitude), portability, technical difficulty, satisfaction, and use time. The sample includes 31 medical surgical clients with average age of 47.35 (±13.3), average household income of $13,629 (±8,204), average number in the household of 2.67 (±2.22), and average education of 11th grade. The results demonstrate mean use time of 9.28 (±3.23) hr, or about 1 hr over 8 weeks. Technology acceptance …


Efficient Skyline Maintenance For Streaming Data With Partially-Ordered Domains, Yuan Fang, Chee-Yong Chan Apr 2010

Research Collection School Of Computing and Information Systems

We address the problem of skyline query processing for a count-based window of continuous streaming data that involves both totally- and partially-ordered attribute domains. In this problem, a fixed-size buffer of the N most recent tuples is dynamically maintained and the key challenge is how to efficiently maintain the skyline of the sliding window of N tuples as new tuples arrive and old tuples expire. We identify the limitations of the state-of-the-art approach STARS, and propose two new approaches, STARS+ and SkyGrid, to address its drawbacks. STARS+ is an enhancement of STARS with three new optimization techniques, while SkyGrid is …


Data Mining Based Predictive Models For Overall Health Indices, Ridhima Rajkumar, Kyong Jin Shim, Jaideep Srivastava Apr 2010

Research Collection School Of Computing and Information Systems

In this study, we infer health care indices of individuals using their pharmacy medical and prescription claims. Specifically, we focus on the widely used Charlson Index. We use data mining techniques to formulate the problem of classifying Charlson Index (CI) and build predictive models to predict individual health index score. First, we present comparative analyses of several classification algorithms. Second, our study shows that certain ensemble algorithms lead to higher prediction accuracy in comparison to base algorithms. Third, we introduce cost-sensitive learning to the classification algorithms and show that the inclusion of cost-sensitive learning leads to improved prediction accuracy. The …


Algorithms For Constrained K-Nearest Neighbor Queries Over Moving Object Trajectories, Yunjun Gao, Baihua Zheng, Gencai Chen, Qing Li, Chun Chen Apr 2010

Research Collection School Of Computing and Information Systems

An important query for spatio-temporal databases is to find nearest trajectories of moving objects. Existing work on this topic focuses on the closest trajectories in the whole data space. In this paper, we introduce and solve constrained k-nearest neighbor (CkNN) queries and historical continuous CkNN (HCCkNN) queries on R-tree-like structures storing historical information about moving object trajectories. Given a trajectory set D, a query object (point or trajectory) q, a temporal extent T, and a constrained region CR, (i) a CkNN query over trajectories retrieves from D within T, the k (≥ 1) trajectories that lie closest to q and …


Adaptive Ensemble Classification In P2P Networks, Hock Hee Ang, Vivekanand Gopalkrishnan, Steven C. H. Hoi, Wee Keong Ng Apr 2010

Research Collection School Of Computing and Information Systems

Classification in P2P networks has become an important research problem in data mining due to the popularity of P2P computing environments. It remains an open and difficult research problem due to a variety of challenges, such as non-i.i.d. data distribution, skewed or disjoint class distribution, scalability, peer dynamism and asynchronism. In this paper, we present a novel P2P Adaptive Classification Ensemble (PACE) framework to perform classification in P2P networks. Unlike regular ensemble classification approaches, our new framework adapts to the test data distribution and dynamically adjusts the voting scheme by combining a subset of classifiers/peers according to the test data …


Managing Media Rich Geo-Spatial Annotations For A Map-Based Mobile Application Using Clustering, Khasfariyati Razikin, Dion Hoe-Lian Goh, Ee Peng Lim, Aixin Sun, Yin-Leng Theng, Thi Nhu Quynh Kim, Kalyani Chatterjea, Chew-Hung Chang Apr 2010

Research Collection School Of Computing and Information Systems

With the prevalence of mobile devices that are equipped with wireless Internet capabilities and Global Positioning System (GPS) functionality, the creation and access of user-generated content are extended to users on the go. Such content is tied to real-world objects, in the form of geospatial annotations, and it is only natural that these annotations are visualized using a map-based approach. However, viewing maps that are filled with annotations could hinder the serendipitous discovery of data, especially on the small screens of mobile devices. This creates a need to manage the annotations. In this paper, we introduce a mobile …


Optimal Matching Between Spatial Datasets Under Capacity Constraints, Hou U Leong, Kyriakos Mouratidis, Man Lung Yiu, Nikos Mamoulis Apr 2010

Research Collection School Of Computing and Information Systems

Consider a set of customers (e.g., WiFi receivers) and a set of service providers (e.g., wireless access points), where each provider has a capacity and the quality of service offered to its customers is inversely proportional to their distance. The capacity constrained assignment (CCA) is a matching between the two sets such that (i) each customer is assigned to at most one provider, (ii) every provider serves no more customers than its capacity, (iii) the maximum possible number of customers are served, and (iv) the sum of Euclidean distances within the assigned provider-customer pairs is minimized. Although max-flow algorithms are applicable …