Open Access. Powered by Scholars. Published by Universities.®

Physical Sciences and Mathematics Commons

Databases and Information Systems

Articles 4681 - 4710 of 6727

Full-Text Articles in Physical Sciences and Mathematics

Link Type Based Pre-Cluster Pair Model For Coreference Resolution, Yang Song, Houfeng Wang, Jing Jiang Jun 2011

Research Collection School Of Computing and Information Systems

This paper presents our participation in the CoNLL-2011 shared task, Modeling Unrestricted Coreference in OntoNotes. Coreference resolution, as a difficult and challenging problem in NLP, has attracted a lot of attention in the research community for a long time. Its objective is to determine whether two mentions in a piece of text refer to the same entity. In our system, we implement mention detection and coreference resolution separately. For mention detection, a simple classification-based method combined with several effective features is developed. For coreference resolution, we propose a link type based pre-cluster pair model. In this model, pre-clustering of …


Continuous Visible Nearest Neighbor Query Processing In Spatial Databases, Yunjun Gao, Baihua Zheng, Gencai Chen, Qing Li, Xiaofa Guo Jun 2011

Research Collection School Of Computing and Information Systems

In this paper, we identify and solve a new type of spatial query, called continuous visible nearest neighbor (CVNN) search. Given a data set P, an obstacle set O, and a query line segment q in a two-dimensional space, a CVNN query returns a set of $\langle p, R\rangle$ tuples such that $p \in P$ is the nearest neighbor to every point r along the interval $R \subseteq q$ and p is visible to r. Note that p may be NULL, meaning that all points in P are invisible to all points in R due to the obstruction of …


Developing Digital Field Guides For Plants: A Study From The Perspective Of Users, Emily Roseanne Schwarz Jun 2011

Master's Theses

A field guide is a tool to identify an object of natural history. Field guides cover a wide range of topics from plants to fungi, birds to mammals, and shells to minerals. Traditionally, field guides are books, usually small enough to be carried outdoors. They enjoy wide popularity in modern life; almost every American home and library owns at least one field guide, and the same is also true for other areas of the world.


At this time, companies, non-profits, and universities are developing computer technologies to replace printed field guides for identifying plants. This thesis examines the state …


Publishing Survey Articles On Information Retrieval Topics, Douglas Oard, Fabrizio Sebastiani, Jonathan Furner, Gary Marchionini May 2011

Jonathan Furner

Survey articles are an important way of sharing knowledge among interested researchers and contributing to the growth of a field. This brief note identifies several outlets for survey articles on information retrieval, and identifies some reasons to write articles of this type.


Active Caching For Recommender Systems, Muhammad Umar Qasim May 2011

Dissertations

Web users are often overwhelmed by the amount of information available while carrying out browsing and searching tasks. Recommender systems substantially reduce the information overload by suggesting a list of similar documents that users might find interesting. However, generating these ranked lists requires an enormous amount of resources that often results in access latency. Caching frequently accessed data has been a useful technique for reducing stress on limited resources and improving response time. Traditional passive caching techniques, where the focus is on answering queries based on temporal locality or popularity, achieve a very limited performance gain. In this dissertation, we …


Getting To Know Social Media Analytics, Tin Seong Kam May 2011

Research Collection School Of Computing and Information Systems

Over the last five years, the unprecedented development and use of social mediating technologies such as blogs, wikis, Facebook, and Twitter have engendered radically new ways of working, playing, and creating meaning, leaving an indelible mark on nearly every domain imaginable. Despite the growing ubiquity of social mediating technologies, their potential has hardly been tapped. Effectively using data collected from social mediating technologies by the business community is far from trivial. This is mainly due to the general lack of awareness of Social Network Analysis (SNA) techniques and technologies among business analysts and practitioners. This presentation aims to provide …


Putting Artists On The Map: A Five Part Study Of Greater Cleveland Artists' Location Decisions - Part 5: Properties Analysis - Artist Housing Characteristics, Mark Salling, Gregory Soltis, Charles Post, Sharon Bliss, Ellen Cyran May 2011

All Maxine Goodman Levin School of Urban Affairs Publications

A series of reports detailing the residential and work space location preferences of Cuyahoga County's artists.


Rethinking Our Mobility: Supporting Our Patrons Where They Live, Alexandra Gomes, Elizabeth Palena Hall, Laura E. Abate May 2011

Himmelfarb Library Faculty Posters and Presentations

In anticipation of the release of a mobile VPN to access the George Washington University wireless network, the Himmelfarb Health Sciences Library began developing materials and services to provide support to patrons. This poster is an outline of the mobile services that were implemented to reach patrons where they live.


Trust Networks: Interpersonal, Social, And Sensor, Krishnaprasad Thirunarayan, Pramod Anantharam May 2011

Kno.e.sis Publications

Trust relationships occur naturally in many diverse contexts such as e-commerce, interpersonal interactions, social networks, and the sensor web. As agents providing content and services become increasingly removed from the agents that consume them, the issue of robust trust inference and update becomes critical. Unfortunately, there is neither a universal notion of trust that is applicable to all domains nor a clear explication of its semantics or computation in many situations. In this beginner's level tutorial, we motivate the trust problem, explain the relevant concepts, summarize research in modeling trust and gleaning trustworthiness, and discuss challenges confronting us in this process.


What's Happening In Semantic Web ... And What FCA Could Have To Do With It, Pascal Hitzler May 2011

Computer Science and Engineering Faculty Publications

The Semantic Web is gaining momentum. Driven by over 10 years of focused project funding in the US and the EU, Semantic Web Technologies are now entering application areas in industry, academia, government, and the open Web.

The Semantic Web is based on the idea of describing the meaning - or semantics - of data on the Web using metadata - data that describes other data - in the form of ontologies, which are represented using logic-based knowledge representation languages. Central to the transfer of Semantic Web into practice is the Linked Open Data effort, which has already resulted in …


Double Updating Online Learning, Peilin Zhao, Steven C. H. Hoi, Rong Jin May 2011

Research Collection School Of Computing and Information Systems

In most kernel-based online learning algorithms, when an incoming instance is misclassified, it is added to the pool of support vectors and assigned a weight, which often remains unchanged during the rest of the learning process. This is clearly insufficient, since when a new support vector is added, we generally expect the weights of the other existing support vectors to be updated in order to reflect the influence of the added support vector. In this paper, we propose a new online learning method, termed Double Updating Online Learning, or DUOL for short, that explicitly addresses this problem. …


Jerasure 2.0, Parth Deshmukh, Sean Maginnis, Josh Chandler May 2011

Chancellor’s Honors Program Projects

No abstract provided.


Empirical Methods For Predicting Student Retention- A Summary From The Literature, Matt Bogard May 2011

Economics Faculty Publications

The vast majority of the literature related to the empirical estimation of retention models includes a discussion of the theoretical retention framework established by Bean, Braxton, Tinto, Pascarella, Terenzini and others (see Bean, 1980; Bean, 2000; Braxton, 2000; Braxton et al., 2004; Chapman and Pascarella, 1983; Pascarella and Terenzini, 1978; St. John and Cabrera, 2000; Tinto, 1975). This body of research provides a starting point for the consideration of which explanatory variables to include in any model specification, as well as identifying possible data sources. The literature separates itself into two major camps including research related to the hypothesis testing …


Efficient Schema Extraction From A Collection Of XML Documents, Vijayeandra Parthepan May 2011

Masters Theses & Specialist Projects

The eXtensible Markup Language (XML) has become the standard format for data exchange on the Internet, providing interoperability between different business applications. Such wide use results in large volumes of heterogeneous XML data, i.e., XML documents conforming to different schemas. Although schemas are important in many business applications, they are often missing in XML documents. In this thesis, we present a suite of algorithms that are effective in extracting schema information from a large collection of XML documents. We propose using the cost of NFA simulation to compute the Minimum Length Description to rank the inferred schema. We also studied …


Continuous Nearest Neighbor Search In The Presence Of Obstacles, Yunjun Gao, Baihua Zheng, Gang Chen, Chun Chen, Qing Li May 2011

Research Collection School Of Computing and Information Systems

Despite the ubiquity of physical obstacles (e.g., buildings, hills, and blindages) in the real world, most spatial queries ignore the obstacles. In this article, we study a novel form of continuous nearest-neighbor queries in the presence of obstacles, namely continuous obstructed nearest-neighbor (CONN) search, which considers the impact of obstacles on the distance between objects. Given a data set P, an obstacle set O, and a query line segment q in a two-dimensional space, a CONN query retrieves the nearest neighbor p ∈ P of each point p′ on q according to the obstructed distance, the shortest path between …


Putting Artists On The Map: A Five Part Study Of Greater Cleveland Artists' Location Decisions - Part 4: Predictive Analysis - Regression Modeling, Mark Salling, Gregory Soltis, Charles Post, Sharon Bliss, Ellen Cyran Apr 2011

All Maxine Goodman Levin School of Urban Affairs Publications

A series of reports detailing the residential and work space location preferences of Cuyahoga County's artists.


Putting Artists On The Map: A Five Part Study Of Greater Cleveland Artists' Location Decisions - Part 3: Attitudinal Analysis - Artist Housing And Space Survey, Mark Salling, Gregory Soltis, Charles Post, Sharon Bliss, Ellen Cyran Apr 2011

All Maxine Goodman Levin School of Urban Affairs Publications

A series of reports detailing the residential and work space location preferences of Cuyahoga County's artists.


Adaptation Of The Nevada Climate Change Data Portal Web Interface To Small-Screen Mobile Devices, Tsvetan Komarov Apr 2011

Festival of Communities: UG Symposium (Posters)

Robust and convenient access to the Nevada Climate Change Data Portal is vital for the project’s success, because of the researchers’ need to gather and analyze large volumes of data with minimal effort. However, the current version of the data portal web interface is not optimized for small-screen mobile devices such as mobile phones, PDAs, iPads, NetBooks, and others. The proposed research will address this issue by exploring the current methods for creating a client-aware web interface adaptable to the variety of small-screen devices, designing and implementing the most appropriate solution, and finally, user testing of the implemented solution.


Object-Based Caching For MPI-IO, Phillip M. Dickens Apr 2011

University of Maine Office of Research Administration: Grant Reports

As the size of the data sets manipulated by data-intensive scientific applications approaches the petabyte level and beyond, the need for scalable I/O techniques becomes increasingly important and difficult. Much of the research on this issue has been performed within the context of MPI-IO: the de facto standard parallel I/O interface for data-intensive applications. Its popularity stems from the fact that MPI-IO provides to applications a rich and flexible parallel I/O API coupled with highly efficient implementations of this API. This problem is being further addressed by the development of powerful parallel I/O subsystems, and state-of-the-art file systems that can efficiently …


Assessing Differences Between Physician's Realized And Anticipated Gains From Electronic Health Record Adoption, Lori T. Peterson, Eric W. Ford, John Eberhardt, T. R. Huerta Apr 2011

Business Faculty Publications

Return on investment (ROI) concerns related to Electronic Health Records (EHRs) are a major barrier to the technology’s adoption. Physicians generally rely upon early adopters to vet new technologies prior to putting them into widespread use. Therefore, early adopters’ experiences with EHRs play a major role in determining future adoption patterns. The paper’s purposes are: (1) to map the EHR value streams that define the ROI calculation; and (2) to compare Current Users’ and Intended Adopters’ perceived value streams to identify similarities, differences and governing constructs. Primary data was collected by the Texas Medical Association, which surveyed 1,772 physicians on …


Efficient Topological OLAP On Information Networks, Qiang Qu, Feida Zhu, Xifeng Yan, Jiawei Han, Philip Yu, Hongyan Li Apr 2011

Research Collection School Of Computing and Information Systems

We propose a framework for efficient OLAP on information networks with a focus on the most interesting kind, the topological OLAP (called “T-OLAP”), which incurs topological changes in the underlying networks. T-OLAP operations generate new networks from the original ones by rolling up a subset of nodes chosen by certain constraint criteria. The key challenge is to efficiently compute measures for the newly generated networks and handle user queries with varied constraints. Two effective computational techniques, T-Distributiveness and T-Monotonicity are proposed to achieve efficient query processing and cube materialization. We also provide a T-OLAP query processing framework into which these …


MKBoost: A Framework Of Multiple Kernel Boosting, Hao Xia, Steven C. H. Hoi Apr 2011

Research Collection School Of Computing and Information Systems

Multiple kernel learning (MKL) has been shown to be a promising machine learning technique for data mining tasks, integrating multiple diverse kernel functions. Traditional MKL methods often formulate the problem as an optimization task of learning both the optimal combination of kernels and the classifiers, and attempt to resolve the challenging optimization task by various techniques. Unlike the existing MKL methods, in this paper, we investigate a boosting framework for exploring multiple kernel learning for classification tasks. In particular, we present a novel framework of Multiple Kernel Boosting (MKBoost), which applies boosting techniques for learning kernel-based classifiers with multiple kernels. Based …


Weight-Based Boosting Model For Cross-Domain Relevance Ranking Adaptation, Peng Cai, Wei Gao, Kam-Fai Wong, Aoying Zhou Apr 2011

Research Collection School Of Computing and Information Systems

Adaptation techniques based on importance weighting have been shown to be effective for RankSVM and RankNet: each training instance is assigned a target weight denoting its importance to the target domain, and this weight is incorporated into the loss functions. In this work, we extend RankBoost using an importance weighting framework for ranking adaptation. We find it non-trivial to incorporate the target weight into boosting-based ranking algorithms because it plays a contradictory role against the innate weight of boosting, namely the source weight that focuses on adjusting source-domain ranking accuracy. Our experiments show that among three variants, the additive weight-based RankBoost, which dynamically balances the two types …


Learning Feature Dependencies For Noise Correction In Biomedical Prediction, Ghim-Eng Yap, Ah-Hwee Tan, Hwee Hwa Pang Apr 2011

Research Collection School Of Computing and Information Systems

The presence of noise or errors in the stated feature values of biomedical data can lead to incorrect prediction. We introduce a Bayesian Network-based Noise Correction framework named BN-NC. After data preprocessing, a Bayesian Network (BN) is learned to capture the feature dependencies. Using the BN to predict each feature in turn, BN-NC estimates a feature's error rate as the deviation between its predicted and stated values in the training data, and allocates the appropriate uncertainty to its subsequent findings during prediction. BN-NC automatically generates a probabilistic rule to explain BN prediction on the class variable using the feature values …


Predicting Item Adoption Using Social Correlation, Freddy Chong-Tat Chua, Hady W. Lauw, Ee Peng Lim Apr 2011

Research Collection School Of Computing and Information Systems

Users face a dazzling array of choices on the Web when it comes to choosing which product to buy, which video to watch, etc. The trend of social information processing means users increasingly rely not only on their own preferences, but also on friends when making various adoption decisions. In this paper, we investigate the effects of social correlation on users’ adoption of items. Given a user-user social graph and an item-user adoption graph, we seek to answer the following questions: 1) whether the items adopted by a user correlate to items adopted by her friends, and 2) how to …


Users Positions In Social Networks, Jasim Qazi Apr 2011

Master's Projects

Social networks are a new phase in human interaction: using technology to connect people online, the social networks of today have become a central part of the lives of millions of people. People use social networks for sharing various information with their friends and family. This information can take the form of text, video, images, sound, etc., and it is what forms the collection of data in social networks.

As social networks gain popularity and as more and more people start using social networks, it has become more important now to understand the inner structures of social networks and understand …


Recipe Suggestion Tool, Sakuntala Padmapriya Gangaraju Apr 2011

Master's Projects

There is currently a great need for a tool to search cooking recipes based on ingredients. Current search engines do not provide this feature. Most of the recipe search results in current websites are not efficiently clustered based on relevance or categories resulting in a user getting lost in the huge search results presented.
Clustering in information retrieval is used for higher efficiency and better presentation of information to the user. Clustering puts similar documents in the same cluster. If a document is relevant to a query, then the documents in the same cluster are also relevant.
The goal …


Smart Search: A Firefox Add-On To Compute A Web Traffic Ranking, Vijaya Pamidi Apr 2011

Master's Projects

Search engine results are typically ordered according to some notion of the importance of a web page as well as the relevance of its content to a query. Web page importance is usually calculated based on some graph-theoretic properties of the web. Another common technique to measure page importance is to make use of the traffic that goes to a particular web page as measured by a browser toolbar. Currently, there are some traffic ranking tools available, like www.alexa.com, www.ranking.com, and www.compete.com, that give analytics such as the number of users who visit a web site. Alexa …


Medical Analysis Question And Answering Application For Internet Enabled Mobile Devices, Loc Nguyen Apr 2011

Master's Projects

Mobile devices such as smartphones, the iPhone, and the iPad have become more popular in recent years. With access to the Internet through cellular or Wi-Fi networks, these mobile devices can make use of the great source of information available on the Internet. Unlike a desktop or laptop computer, an Internet-enabled mobile device is designed to be carried around and available to the owner almost instantly at any moment of the day. Despite having such great advantages and potential, searching for information with a mobile device remains a difficult task. Mobile device users have to juggle between different …


Cloud Computing: Architectural And Policy Implications, Christopher S. Yoo Apr 2011

All Faculty Scholarship

Cloud computing has emerged as perhaps the hottest development in information technology. Despite all of the attention that it has garnered, existing analyses focus almost exclusively on the issues that surround data privacy without exploring cloud computing’s architectural and policy implications. This article offers an initial exploratory analysis in that direction. It begins by introducing key cloud computing concepts, such as service-oriented architectures, thin clients, and virtualization, and discusses the leading delivery models and deployment strategies that are being pursued by cloud computing providers. It next analyzes the economics of cloud computing in terms of reducing costs, transforming capital expenditures …