Open Access. Powered by Scholars. Published by Universities.®

Physical Sciences and Mathematics Commons

Databases and Information Systems

Articles 5881 - 5910 of 6720

Full-Text Articles in Physical Sciences and Mathematics

Effects Of Information And Machine Learning Algorithms On Word Sense Disambiguation With Small Datasets, Gondy Leroy, Thomas C. Rindflesch Aug 2005

CGU Faculty Publications and Research

Current approaches to word sense disambiguation use (and often combine) various machine learning techniques. Most refer to characteristics of the ambiguity and its surrounding words and are based on thousands of examples. Unfortunately, developing large training sets is burdensome, and in response to this challenge, we investigate the use of symbolic knowledge for small datasets. A naïve Bayes classifier was trained for 15 words with 100 examples for each. Unified Medical Language System (UMLS) semantic types assigned to concepts found in the sentence and relationships between these semantic types form the knowledge base. The most frequent sense of a word …
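The small-data setup described above can be illustrated with a minimal sketch of a naive Bayes sense classifier next to a most-frequent-sense baseline. This is not the authors' implementation: the feature strings stand in for the UMLS semantic types mentioned in the abstract, and the toy training set, the function names, and the add-one smoothing are all assumptions made for illustration.

```python
import math
from collections import Counter, defaultdict


def train_naive_bayes(examples):
    """Count senses and per-sense features from (features, sense) pairs."""
    sense_counts = Counter(sense for _, sense in examples)
    feature_counts = defaultdict(Counter)
    vocabulary = set()
    for features, sense in examples:
        feature_counts[sense].update(features)
        vocabulary.update(features)
    return sense_counts, feature_counts, vocabulary


def classify(model, features):
    """Return the sense with the highest log-posterior (add-one smoothing)."""
    sense_counts, feature_counts, vocabulary = model
    total = sum(sense_counts.values())
    best_sense, best_score = None, -math.inf
    for sense, count in sense_counts.items():
        score = math.log(count / total)
        denom = sum(feature_counts[sense].values()) + len(vocabulary)
        for feature in features:
            if feature in vocabulary:  # features never seen in training are ignored
                score += math.log((feature_counts[sense][feature] + 1) / denom)
        if score > best_score:
            best_sense, best_score = sense, score
    return best_sense


def most_frequent_sense(examples):
    """Baseline: always predict the sense seen most often in training."""
    return Counter(sense for _, sense in examples).most_common(1)[0][0]


# Hypothetical training data for the ambiguous word "cold".
TRAIN = [
    (["virus", "symptom"], "illness"),
    (["symptom", "fever"], "illness"),
    (["virus", "fever"], "illness"),
    (["temperature", "weather"], "low_temperature"),
    (["temperature", "water"], "low_temperature"),
]
MODEL = train_naive_bayes(TRAIN)
```

With this toy data the baseline always answers `illness`, while the classifier can recover the minority sense when the context features point to it.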


Web Mining - The Ontology Approach, Ee Peng Lim, Aixin Sun Aug 2005

Research Collection School Of Computing and Information Systems

The World Wide Web today provides users access to an extremely large number of Web sites, many of which contain information of educational and commercial value. Due to the unstructured and semi-structured nature of Web pages and the design idiosyncrasy of Web sites, it is a challenging task to develop digital libraries for organizing and managing digital content from the Web. Web mining research, in its last 10 years, has on the other hand made significant progress in categorizing and extracting content from the Web. In this paper, we represent ontology as a set of concepts and their inter-relationships relevant to …


Medoid Queries In Large Spatial Databases, Kyriakos Mouratidis, Dimitris Papadias, Spiros Papadimitriou Aug 2005

Research Collection School Of Computing and Information Systems

Assume that a franchise plans to open k branches in a city, so that the average distance from each residential block to the closest branch is minimized. This is an instance of the k-medoids problem, where residential blocks constitute the input dataset and the k branch locations correspond to the medoids. Since the problem is NP-hard, research has focused on approximate solutions. Despite an avalanche of methods for small and moderate size datasets, currently there exists no technique applicable to very large databases. In this paper, we provide efficient algorithms that utilize an existing data-partition index to achieve low CPU …
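The objective in the franchise example can be made concrete with a small sketch. The code below is a plain PAM-style swap search over an in-memory list, not the index-based algorithms the paper proposes; the function names and the toy coordinates are assumptions for illustration, and the search only reaches a local optimum.

```python
import math
from itertools import product


def average_distance(points, medoids):
    """Average distance from each point to its closest medoid (the k-medoids cost)."""
    return sum(min(math.dist(p, m) for m in medoids) for p in points) / len(points)


def k_medoids(points, k):
    """Local search: start from the first k points, swap while the cost improves."""
    medoids = list(points[:k])
    best = average_distance(points, medoids)
    improved = True
    while improved:
        improved = False
        for i, candidate in product(range(k), points):
            if candidate in medoids:
                continue
            trial = medoids[:i] + [candidate] + medoids[i + 1:]
            cost = average_distance(points, trial)
            if cost < best - 1e-12:  # accept only strict improvements
                medoids, best = trial, cost
                improved = True
    return medoids, best


# Hypothetical residential blocks forming two well-separated groups.
BLOCKS = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
BRANCHES, COST = k_medoids(BLOCKS, 2)
```

Each pass evaluates every possible swap against the full dataset, which is exactly the cost that becomes prohibitive for the very large databases the abstract targets.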


WmXML: A System For Watermarking XML Data, Xuan Zhou, Hwee Hwa Pang, Kian-Lee Tan, Dhruv Mangla Aug 2005

Research Collection School Of Computing and Information Systems

As an increasing amount of data is published in the form of XML, copyright protection of XML data is becoming an important requirement for many applications. While digital watermarking is a widely used measure to protect digital data from copyright offences, the complex and flexible construction of XML data poses a number of challenges to digital watermarking, such as re-organization and alteration attacks. To overcome these challenges, the watermarking scheme has to be based on the usability of data and the underlying semantics like key attributes and functional dependencies. In this paper, we describe WmXML, a system for watermarking XML documents. …


Constrained Shortest Path Computation, Manolis Terrovitis, Spiridon Bakiras, Dimitris Papadias, Kyriakos Mouratidis Aug 2005

Research Collection School Of Computing and Information Systems

This paper proposes and solves a-autonomy and k-stops shortest path problems in large spatial databases. Given a source s and a destination d, an a-autonomy query retrieves a sequence of data points connecting s and d, such that the distance between any two consecutive points in the path is not greater than a. A k-stops query retrieves a sequence that contains exactly k intermediate data points. In both cases our aim is to compute the shortest path subject to these constraints. Assuming that the dataset is indexed by a data-partitioning method, the proposed techniques initially compute a sub-optimal path by …
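The a-autonomy constraint can be illustrated with a brute-force sketch: treat every pair of points at distance at most a as an edge and run Dijkstra's algorithm. This is an in-memory baseline under assumed names and toy coordinates, not the index-based technique the paper describes.

```python
import heapq
import math


def a_autonomy_path(source, dest, points, a):
    """Shortest path from source to dest in which no single hop exceeds a."""
    nodes = [source, dest] + list(points)
    dist = {source: 0.0}
    prev = {}
    heap = [(0.0, source)]
    done = set()
    while heap:
        d, u = heapq.heappop(heap)
        if u in done:
            continue
        done.add(u)
        if u == dest:
            break
        for v in nodes:
            if v in done:
                continue
            hop = math.dist(u, v)
            # Only hops within the autonomy limit count as edges.
            if hop <= a and d + hop < dist.get(v, math.inf):
                dist[v] = d + hop
                prev[v] = u
                heapq.heappush(heap, (d + hop, v))
    if dest not in dist:
        return None, math.inf  # no feasible path under this limit
    path = [dest]
    while path[-1] != source:
        path.append(prev[path[-1]])
    return path[::-1], dist[dest]


# Hypothetical data: the direct hop (length 6) exceeds the limit a = 3.5.
PATH, LENGTH = a_autonomy_path((0, 0), (6, 0), [(3, 0), (3, 4)], 3.5)
```

The sketch examines all pairs of points, so it scales quadratically; avoiding that cost on large datasets is what a spatial index is for.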


GeogDL: A Web-Based Approach To Geography Examination, Ee Peng Lim, Dion Hoe-Lian Goh, Yin-Leng Theng Aug 2005

Research Collection School Of Computing and Information Systems

The traditional educational approach with students as passive recipients has been the subject of criticism. A constructivist learner-centered approach towards education has been argued to produce greater internalization and application of knowledge compared to the traditional teacher-centered, transmission-oriented approach. Nevertheless, contemporary instructional design models argue for the use and integration of both approaches especially in complex learning tasks. This paper describes GeogDL, a Web-based application developed above a digital library of geographical resources for Singapore students preparing to take a national examination in geography. GeogDL is unique in that it not only provides an environment for active learning, it also …


Peer-To-Peer Discovery Of Semantic Associations, Matthew Perry, Maciej Janik, Cartic Ramakrishnan, Conrad Ibanez, I. Budak Arpinar, Amit P. Sheth Jul 2005

Kno.e.sis Publications

The Semantic Web vision promises an extension of the current Web in which all data is annotated with machine understandable metadata. The relationship-centric nature of this data has led to the definition of Semantic Associations, which are complex relationships between resources. Semantic Associations attempt to answer queries of the form “how are resource A and resource B related?” Knowing how two entities are related is a crucial question in knowledge discovery applications. Much the same way humans collaborate and interact to form new knowledge, discovery of Semantic Associations across repositories on a peer-to-peer network can allow peers to share their …


Suitability Of The NIST Shop Data Model As A Neutral File Format For Simulation, Gregory Brent Harward Jul 2005

Theses and Dissertations

Due to their successful application in Internet-related fields, Extensible Markup Language (XML) and its related technologies are being explored as a revolutionary software file format technology used to provide increased interoperability in the discrete-event simulation (DES) arena. The National Institute of Standards and Technology (NIST) has developed an XML-based information model (XSD) called the Shop Data Model (SDM), which is used to describe the contents of a neutral file format (NFF) that is being promoted as a means to make manufacturing simulation technology more accessible to a larger group of potential customers. Using a two-step process, this thesis …


A Semantic Template Based Designer For Semantic Web Processes, Ranjit Mulye, John A. Miller, Kunal Verma, Karthik Gomadam, Amit P. Sheth Jul 2005

Kno.e.sis Publications

The growing popularity of service oriented computing based on Web services standards is creating a need for paradigms to represent and design business processes. Significant work has been done in the representation aspects with regards to WSBPEL. However, design and modeling of business processes is still an open issue. In this paper, we present a novel designer for business processes, which allows for intuitive modeling of Web processes, as well as using a template based approach for semi-automatically integrating partners either at design time or at deployment time. This work has been done as part of the METEOR-S project, which …


On Embedding Machine-Processable Semantics Into Documents, Krishnaprasad Thirunarayan Jul 2005

Kno.e.sis Publications

Most Web and legacy paper-based documents are available in human comprehensible text form, not readily accessible to or understood by computer programs. Here, we investigate an approach to amalgamate XML technology with programming languages for representational purposes that can enhance traceability, thereby facilitating semiautomatic extraction and update. Specifically, we propose a modular technique to embed machine-processable semantics into a text document with tabular data via annotations, resulting sometimes in ill-formed XML fragments, and evaluate this technique vis a vis document querying, manipulation, and integration. The ultimate aim is to be able to author and extract human-readable and machine-comprehensible parts of …


Faster OWL Using Split Programs, Denny Vrandecic, Pascal Hitzler Jul 2005

Computer Science and Engineering Faculty Publications

Knowledge representation and reasoning on the Semantic Web is done by means of ontologies. While the quest for suitable ontology languages is still ongoing, OWL [5] has been established as a core standard. It comes in three flavours, as OWL Full, OWL DL and OWL Lite, where OWL Full contains OWL DL, which in turn contains OWL Lite. The latter two coincide semantically with certain description logics and can thus be considered fragments of first-order predicate logic.


Morphisms In Context, Markus Krotzsch, Guo-Qiang Zhang, Pascal Hitzler Jul 2005

Computer Science and Engineering Faculty Publications

Morphisms constitute a general tool for modelling complex relationships between mathematical objects in a disciplined fashion. In Formal Concept Analysis (FCA), morphisms can be used for the study of structural properties of knowledge represented in formal contexts, with applications to data transformation and merging. In this paper we present a comprehensive treatment of some of the most important morphisms in FCA and their relationships, including dual bonds, scale measures, infomorphisms, and their respective relations to Galois connections. We summarize our results in a concept lattice that cumulates the relationships among the considered morphisms. The purpose of this work is to …


A Modular Approach To Document Indexing And Semantic Search, Dhanya Ravishankar, Krishnaprasad Thirunarayan, Trivikram Immaneni Jul 2005

Kno.e.sis Publications

This paper develops a modular approach to improving the effectiveness of searching documents for information by reusing and integrating mature software components such as Lucene APIs, WORDNET, LSA techniques, and domain-specific controlled vocabulary. To evaluate the practical benefits, the prototype was used to query the MEDLINE database, and to locate domain-specific controlled vocabulary terms in Materials and Process Specifications. Its extensibility has been demonstrated by incorporating a spell-checker for the input query, and by structuring the retrieved output into hierarchical collections for quicker assimilation. It is also being used to experimentally explore the relationship between LSA and document clustering using 20-mini-newsgroups and …


Co-Clustering Of Time-Evolving News Story With Transcript And Keyframe, Xiao Wu, Chong-Wah Ngo, Qing Li Jul 2005

Research Collection School Of Computing and Information Systems

This paper presents techniques for clustering same-topic news stories according to event themes. We model the relationship of stories with textual and visual concepts as a bipartite graph. The textual and visual concepts are extracted from speech transcripts and keyframes, respectively. A co-clustering algorithm based on spectral graph partitioning is employed to exploit the duality of stories and textual-visual concepts. Experimental results on the TRECVID-2004 corpus show that co-clustering news stories with textual-visual concepts is significantly better than co-clustering with either textual or visual concepts alone.


Social Network Discovery By Mining Spatio-Temporal Events, Hady Lauw, Ee Peng Lim, Hwee Hwa Pang, Teck-Tim Tan Jul 2005

Research Collection School Of Computing and Information Systems

Knowing patterns of relationship in a social network is very useful for law enforcement agencies to investigate collaborations among criminals, for businesses to exploit relationships to sell products, or for individuals who wish to network with others. After all, it is not just what you know, but also whom you know, that matters. However, finding out who is related to whom on a large scale is a complex problem. Asking every single individual would be impractical, given the huge number of individuals and the changing dynamics of relationships. Recent advancement in technology has allowed more data about activities of individuals …


Hot Event Detection And Summarization By Graph Modeling And Matching, Yuxin Peng, Chong-Wah Ngo Jul 2005

Research Collection School Of Computing and Information Systems

This paper proposes a new approach for hot event detection and summarization of news videos. The approach is mainly based on two graph algorithms: optimal matching (OM) and normalized cut (NC). Initially, OM is employed to measure the visual similarity between all pairs of events under the one-to-one mapping constraint among video shots. Then, news events are represented as a complete weighted graph and NC is carried out to globally and optimally partition the graph into event clusters. Finally, based on the cluster size and globality of events, hot events can be automatically detected and selected as the summaries of …


An Informatization Of Society Approach To E-Government: Analyzing Singapore’s E-Government Efforts, Calvin M. L. Chan, Yi Meng Lau Jul 2005

Research Collection School Of Computing and Information Systems

Despite the much publicized benefits of e-government, many countries are experiencing difficulty in yielding success in their e-government initiatives. Studies which adopt the national e-government initiatives as the unit of analysis remain largely rare. This paper aspires to provide an analysis of Singapore’s widely acclaimed success in the e-government effort at the national level to allow other countries to learn and gain from its experience. Using the ‘Conceptual Framework for the Informatization of Society’ in facilitating the data analysis, implications are drawn to offer insights for the considerations of e-government practitioners. Theoretical implications are also derived through positing that the …


Multibiometrics Based On Palmprint And Handgeometry, Xiao-Yong Wei, Dan Xu, Chong-Wah Ngo Jul 2005

Research Collection School Of Computing and Information Systems

This paper describes our approach to multibiometrics in a single image. First, a new method for capturing the key points of hand geometry is proposed. Then, we describe our new method of palmprint feature extraction. By using projection transform and wavelet transform, this method considers both the global features and the local details of a palmprint texture and proposes a new kind of palmprint feature. We also propose a twice segmentation method for hand geometry feature extraction. In the process of feature matching, we analyze the weakness of the traditional Euclidean Square Norm method and introduce an improved method. The experimental results …


Lifecycle Of Semantic Web Processes, Jorge Cardoso, Christoph Bussler, Amit P. Sheth Jun 2005

Kno.e.sis Publications

This tutorial presents what can be achieved by symbiotic synthesis of two of the most important research and technology application areas: Web Services and the Semantic Web. It presents the more recent evolution of the Web Service platform towards rich Web Service and process model annotation, and explores some of the promises and challenges in applying semantics to each of the steps in the Semantic Web Process lifecycle.


Web Service Semantics - WSDL-S, Rama Akkiraju, Joel Farrell, John A. Miller, Meenakshi Nagarajan, Amit P. Sheth, Kunal Verma Jun 2005

Kno.e.sis Publications

Web services have primarily been designed for providing inter-operability between business applications. Current technologies assume a large amount of human interaction for integrating two applications. This is primarily due to the fact that business process integration requires understanding of data and functions of the involved entities. Semantic Web technologies, powered by description logic based languages like OWL[1], aim to add greater meaning to Web content by annotating the data with ontologies. Ontologies provide shared conceptualizations of domains. This allows agents to get an understanding of users’ Web content and greatly reduces human interaction for meaningful Web …


GA-Facilitated Classifier Optimization With Varying Similarity Measures, Michael R. Peterson, Travis E. Doom, Michael L. Raymer Jun 2005

Kno.e.sis Publications

Genetic algorithms (GAs) are powerful tools for k-nearest neighbors (kNN) classification. Traditional kNN classifiers employ Euclidean distance to assess neighbor similarity, though other measures may also be used. GAs can search for optimal linear weights of features to improve kNN performance using both Euclidean distance and cosine similarity. GAs also optimize additive feature offsets in search of an optimal point of reference for assessing angular similarity using the cosine measure. This poster explores weight and offset optimization for kNN with varying similarity measures, including Euclidean distance (weights only), cosine similarity, and Pearson correlation. The use of offset optimization …
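The two kinds of parameters mentioned above, feature weights and additive offsets, can be sketched as follows. The genetic search itself is omitted: the weights and offsets are passed in by hand, and all function names and data values are assumptions for illustration rather than the authors' code.

```python
import math
from collections import Counter


def weighted_euclidean(x, y, weights):
    """Euclidean distance with one non-negative weight per feature."""
    return math.sqrt(sum(w * (a - b) ** 2 for w, a, b in zip(weights, x, y)))


def offset_cosine_similarity(x, y, offsets):
    """Cosine of the angle between x and y as seen from the point `offsets`."""
    xs = [a - o for a, o in zip(x, offsets)]
    ys = [b - o for b, o in zip(y, offsets)]
    norm = math.hypot(*xs) * math.hypot(*ys)
    return sum(a * b for a, b in zip(xs, ys)) / norm if norm else 0.0


def knn_predict(train, query, k, distance):
    """Majority label among the k training items closest to the query."""
    neighbors = sorted(train, key=lambda item: distance(item[0], query))[:k]
    return Counter(label for _, label in neighbors).most_common(1)[0][0]


# Hypothetical data: feature 0 separates the classes, feature 1 is noise.
TRAIN = [
    ((1.0, 10.0), "a"),
    ((1.2, 50.0), "a"),
    ((3.0, 12.0), "b"),
    ((3.1, 48.0), "b"),
]
QUERY = (1.1, 47.0)

unweighted = knn_predict(TRAIN, QUERY, 1, lambda x, y: weighted_euclidean(x, y, (1, 1)))
reweighted = knn_predict(TRAIN, QUERY, 1, lambda x, y: weighted_euclidean(x, y, (1, 0)))
```

With equal weights the noisy feature dominates and the query is labeled `b`; with the noisy feature weighted to zero it is labeled `a`. A search procedure such as a GA would explore weight and offset vectors like these and score each by classification accuracy. To use the cosine measure with `knn_predict`, pass a distance such as one minus the similarity.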


Description Logic Programs: A Practical Choice For The Modelling Of Ontologies, Rudi Studer, York Sure, Pascal Hitzler Jun 2005

Computer Science and Engineering Faculty Publications

Knowledge representation using ontologies constitutes the heart of semantic technologies. Despite successful standardization efforts by the W3C, however, there are still numerous different ontology representation languages being used, and interoperability between them is in general not given. The problem is aggravated by the fact that current standards lay foundations only and are well-known to be insufficient for the modelling of finer details. Thus, a plethora of extensions of the basic languages is being proposed, rendering the picture of ontology representation languages to be chaotic, to say the least. While semantic technologies start to become applicable and are being applied in …


Integrating Semantic Web Services For Mobile Access, Anupriya Ankolekar, Pascal Hitzler, Holger Lewen, Daniel Oberle, Rudi Studer Jun 2005

Computer Science and Engineering Faculty Publications

We present our work in integrating Semantic Web services for access via mobile devices. We have developed a system, the WebServiceAccessComponent, that transforms a user request for a service on a mobile device, to a Web service request and then selects a matching service from the existing Web services of the Deutsche Telekom, which provide navigational and weather information. In this poster, we present the requirements and design of the WebServiceAccessComponent.


Automatically Discovering The Number Of Clusters In Web Page Datasets, Zhongmei Yao Jun 2005

Computer Science Faculty Publications

Clustering is well-suited for Web mining by automatically organizing Web pages into categories, each of which contains Web pages having similar contents. However, one problem in clustering is the lack of general methods to automatically determine the number of categories or clusters. For the Web domain in particular, currently there is no such method suitable for Web page clustering. In an attempt to address this problem, we discover a constant factor that characterizes the Web domain, based on which we propose a new method for automatically determining the number of clusters in Web page data sets. We discover that the …


Aggregate Nearest Neighbor Queries In Spatial Databases, Dimitris Papadias, Yufei Tao, Kyriakos Mouratidis, Chun Kit Hui Jun 2005

Research Collection School Of Computing and Information Systems

Given two spatial datasets P (e.g., facilities) and Q (queries), an aggregate nearest neighbor (ANN) query retrieves the point(s) of P with the smallest aggregate distance(s) to points in Q. Assuming, for example, n users at locations $q_1, \ldots, q_n$, an ANN query outputs the facility $p \in P$ that minimizes the sum of distances $|pq_i|$ for $1 \le i \le n$ that the users have to travel in order to meet there. Similarly, another ANN query may report the point $p \in P$ that minimizes the maximum distance that …
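The two aggregates named above, sum and maximum, can be compared with a brute-force sketch that scans every facility. It only illustrates the query definition; the function name and coordinates are assumptions, and nothing here reflects the paper's index-based processing.

```python
import math


def aggregate_nn(facilities, queries, aggregate=sum):
    """Facility whose aggregate distance to all query points is smallest."""
    return min(
        facilities,
        key=lambda p: aggregate(math.dist(p, q) for q in queries),
    )


# Hypothetical facilities (P) and user locations (Q).
FACILITIES = [(0, 0), (5, 5), (10, 0)]
USERS = [(0, 0), (1, 0), (10, 0)]

by_sum = aggregate_nn(FACILITIES, USERS, sum)  # minimizes total travel
by_max = aggregate_nn(FACILITIES, USERS, max)  # minimizes the longest trip
```

On this data the sum picks `(0, 0)`, where two users already are, while the maximum picks `(5, 5)`, which keeps the farthest user's trip shortest.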


Semantic Management Of Web Services Using The Core Ontology Of Services, Daniel Oberle, Steffen Lamparter, Andreas Eberhart, Stephan Grimm, Sudhir Agarwal, Rudi Studer, Pascal Hitzler Jun 2005

Kno.e.sis Publications

Different Web Service standards like WSDL, WS-Security, WS-Policy etc., henceforth referred to as WS*, factorize Web Service management tasks into different aspects, such as input/output, workflow, or security. The advantages of WS* are multiple and have already achieved industrial impact. WS* descriptions are exchangeable and developers may use different implementations for the same Web Service description. The disadvantages of WS*, however, are also apparent: even though the different standards are complementary, they must overlap and one may produce models composed of different WS* descriptions, which are inconsistent with each other, but the reasons for the inconsistencies are not easily determined. …


Capturing Design Patterns For Performance Issues In Database-Driven Web Applications, Osama Mabroul Khaled Jun 2005

Archived Theses and Dissertations

Design patterns are a new research topic that aims to help communicate technical knowledge in a standard non-technical format. People coming from different technical backgrounds can share this knowledge and apply it in their own way. For example, pieces of designs could be the same for different applications but they get implemented using different programming languages. On the other hand, web applications are becoming more widely spread, especially e-commerce ones, which make light returns on investment and achieve good relations between the companies and the customers. To stabilize this relationship, a web application must have a good …


A Semi-Supervised Active Learning Framework For Image Retrieval, Steven Hoi, Michael R. Lyu Jun 2005

Research Collection School Of Computing and Information Systems

Although recent studies have shown that unlabeled data are beneficial to boosting the image retrieval performance, very few approaches for image retrieval can learn with labeled and unlabeled data effectively. This paper proposes a novel semi-supervised active learning framework comprising a fusion of semi-supervised learning and support vector machines. We provide theoretical analysis of the active learning framework and present a simple yet effective active learning algorithm for image retrieval. Experiments are conducted on real-world color images to compare with traditional methods. The promising experimental results show that our proposed scheme significantly outperforms the previous approaches.


DSI: A Fully Distributed Spatial Index For Wireless Data Broadcast, Wang-Chien Lee, Baihua Zheng Jun 2005

Research Collection School Of Computing and Information Systems

The recent announcement of the MSN Direct Service has demonstrated the feasibility of, and industrial interest in, utilizing wireless broadcast for pervasive information services. To support location-based services in wireless data broadcast systems, a distributed spatial index (called DSI) is proposed in this paper. DSI is highly efficient because it has a linear yet fully distributed structure that facilitates multiple search paths to be naturally mixed together by sharing links. Moreover, DSI is very resilient in error-prone wireless communication environments. Search algorithms for two classical location-based queries, window queries and kNN queries, based on DSI are presented. Performance evaluation of DSI shows …


Conceptual Partitioning: An Efficient Method For Continuous Nearest Neighbor Monitoring, Kyriakos Mouratidis, Marios Hadjieleftheriou, Dimitris Papadias Jun 2005

Research Collection School Of Computing and Information Systems

Given a set of objects P and a query point q, a k nearest neighbor (k-NN) query retrieves the k objects in P that lie closest to q. Even though the problem is well-studied for static datasets, the traditional methods do not extend to highly dynamic environments where multiple continuous queries require real-time results, and both objects and queries receive frequent location updates. In this paper we propose conceptual partitioning (CPM), a comprehensive technique for the efficient monitoring of continuous NN queries. CPM achieves low running time by handling location updates only from objects that fall in the vicinity of …
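To make the monitoring setting concrete, here is a naive baseline that recomputes the k-NN result from scratch after every location update. It is not CPM: the class name, identifiers, and coordinates are assumptions, and the full rescan on each update is the kind of work the abstract says CPM is designed to limit.

```python
import heapq
import math


class NaiveKnnMonitor:
    """Keeps object positions and rescans all of them after each update."""

    def __init__(self, query_point, k):
        self.query_point = query_point
        self.k = k
        self.positions = {}

    def update(self, object_id, position):
        """Record a new position for an object and return the fresh k-NN result."""
        self.positions[object_id] = position
        return self.result()

    def result(self):
        """Identifiers of the k objects closest to the query point, nearest first."""
        nearest = heapq.nsmallest(
            self.k,
            self.positions.items(),
            key=lambda item: math.dist(item[1], self.query_point),
        )
        return [object_id for object_id, _ in nearest]


# Hypothetical moving objects around a query point at (0.5, 0.0).
monitor = NaiveKnnMonitor((0.5, 0.0), k=2)
monitor.update("a", (0.0, 0.0))
monitor.update("b", (2.0, 0.0))
before = monitor.update("c", (5.0, 0.0))  # "c" is far away
after = monitor.update("c", (0.6, 0.0))   # "c" moves next to the query
```

Every update here costs a scan over all objects, including updates from objects nowhere near the query.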