Open Access. Powered by Scholars. Published by Universities.®

Physical Sciences and Mathematics Commons


Research Collection School Of Computing and Information Systems


Articles 5521 - 5550 of 6891

Full-Text Articles in Physical Sciences and Mathematics

Effective Music Tagging Through Advanced Statistical Modeling, Jialie Shen, Meng Wang, Shuicheng Yan, Hwee Hwa Pang, Xian-Sheng Hua Jul 2010

Music information retrieval (MIR) holds great promise as a technology for managing large music archives. One of the key components of MIR that has been actively researched is music tagging. While significant progress has been achieved, most existing systems still adopt a simple classification approach and apply machine learning classifiers directly to low-level acoustic features. Consequently, they suffer from (1) poor accuracy, (2) a lack of comprehensive evaluation results and associated analysis based on large-scale datasets, and (3) incomplete content representation, arising from the lack of multimodal and temporal information integration. In this …


Generating Templates Of Entity Summaries With An Entity-Aspect Model And Pattern Mining, Peng Li, Jing Jiang, Yinglin Wang Jul 2010

In this paper, we propose a novel approach to the automatic generation of summary templates from given collections of summary articles. Such summary templates can be useful in various applications. We first develop an entity-aspect LDA model to simultaneously cluster both sentences and words into aspects. We then apply frequent subtree pattern mining on the dependency parse trees of the clustered and labeled sentences to discover sentence patterns that represent the aspects well. Key features of our method include automatic grouping of semantically related sentence patterns and automatic identification of template slots that need to be filled in. We …


On The Sampling Of Web Images For Learning Visual Concept Classifiers, Shiai Zhu, Gang Wang, Chong-Wah Ngo, Yu-Gang Jiang Jul 2010

Visual concept learning often requires a large set of training images. In practice, however, acquiring noise-free training labels with sufficient positive examples is always expensive. A plausible solution for training data collection is to sample the widely available user-tagged images from social media websites. Given the general belief that the probability of correct tagging is higher than that of incorrect tagging, such a solution often sounds feasible, though it is not without challenges. First, user tags can be subjective and, to a certain extent, ambiguous. For instance, an image tagged with “whales” may simply be a picture of an ocean museum. Learning concept …


Hybrid Time-Frequency Domain Analysis For Inverter-Fed Induction Motor Fault Detection, T. W. Chua, W. W. Tan, Zhaoxia Wang, C. S. Chang Jul 2010

The detection of faults in an induction motor is an important part of preventive maintenance. Stator current is one of the most popular signals used for utility-supplied induction motor fault detection, as a current sensor can be installed nonintrusively. In variable-speed operation, the use of an inverter to drive the induction motor introduces noise into the stator current, so stator-current-based fault detection techniques become less reliable. This paper presents a hybrid algorithm, which combines time and frequency domain analysis, for broken rotor bar and bearing fault detection. Cluster information obtained by using Independent Component Analysis (ICA) …


A Heuristic Algorithm For Trust-Oriented Service Provider Selection In Complex Social Networks, Guanfeng Liu, Yan Wang, Mehmet A. Orgun, Ee Peng Lim Jul 2010

In a service-oriented online social network consisting of service providers and consumers, a service consumer can search for trustworthy service providers via the social network. This requires evaluating the trustworthiness of a service provider along a certain social trust path from the service consumer to the service provider. However, there are usually many social trust paths between participants in social networks. Thus, a challenging problem is determining which social trust path is the optimal one, that is, the one that yields the most trustworthy evaluation result. In this paper, we first present a novel complex social network structure and a new concept, Quality …


Effective Heuristic Methods For Finding Non-Optimal Solutions Of Interest In Constrained Optimization Models, Steven O. Kimbrough, Ann Kuo, Hoong Chuin Lau Jul 2010

This paper introduces the SoI problem, that of finding nonoptimal solutions of interest for constrained optimization models. SoI problems subsume finding FoIs (feasible solutions of interest), and IoIs (infeasible solutions of interest). In all cases, the interest addressed is post-solution analysis in one form or another. Post-solution analysis of a constrained optimization model occurs after the model has been solved and a good or optimal solution for it has been found. At this point, sensitivity analysis and other questions of import for decision making (discussed in the paper) come into play and for this purpose the SoIs can be of …


Faceted Topic Retrieval Of News Video Using Joint Topic Modeling Of Visual Features And Speech Transcripts, Kong-Wah Wan, Ah-Hwee Tan, Joo-Hwee Lim, Liang-Tien Chia Jul 2010

Because of the inherent ambiguity in user queries, an important task of modern retrieval systems is faceted topic retrieval (FTR), which relates to the goal of returning diverse or novel information elucidating the wide range of topics or facets of the query need. We introduce a generative model for hypothesizing facets in the (news) video domain by combining the complementary information in the visual keyframes and the speech transcripts. We evaluate the efficacy of our multimodal model on the standard TRECVID-2005 video corpus annotated with facets. We find that: (1) the joint modeling of the visual and text (speech transcripts) …


Self-Organizing Neural Networks For Behavior Modeling In Games, Shu Feng, Ah-Hwee Tan Jul 2010

This paper proposes self-organizing neural networks for modeling behavior of non-player characters (NPC) in first person shooting games. Specifically, two classes of self-organizing neural models, namely Self-Generating Neural Networks (SGNN) and Fusion Architecture for Learning and Cognition (FALCON) are used to learn non-player characters' behavior rules according to recorded patterns. Behavior learning abilities of these two models are investigated by learning specific sample Bots in the Unreal Tournament game in a supervised manner. Our empirical experiments demonstrate that both SGNN and FALCON are able to recognize important behavior patterns and learn the necessary knowledge to operate in the Unreal environment. …


Towards Probabilistic Memetic Algorithm: An Initial Study On Capacitated Arc Routing Problem, Liang Feng, Yew-Soon Ong, Quang Huy Nguyen, Ah-Hwee Tan Jul 2010

Capacitated arc routing problem (CARP) has attracted much attention due to its generality to many real world problems. Memetic algorithm (MA), among other metaheuristic search methods, has been shown to achieve competitive performances in solving CARP ranging from small to medium size. In this paper we propose a formal probabilistic memetic algorithm for CARP that is equipped with an adaptation mechanism to control the degree of global exploration against local exploitation while the search progresses. Experimental study on benchmark instances of CARP showed that the proposed probabilistic scheme led to improved search performances when introduced into a recently proposed state-of-the-art …


Design And Implementation Of A Middleware For Easy Development And Provision Of Stream-Based Services, Seungwoo Kang, Youngki Lee, Sunghwan Ihm, Souneil Park, Su-Myeon Kim, Junehwa Song Jul 2010

This paper proposes MISSA, a novel middleware to facilitate the development and provision of stream-based services in emerging pervasive environments. The stream-based services utilize voluminous and continuously updated data streams as their input. The characteristics of data streams bring new requirements on the development and provision of the services. To satisfy the requirements, a unique service model and a runtime system are designed in MISSA. The key concept of our service model is to separate service logic from handling data streams. This significantly mitigates the burden on service developers by allowing them to only concentrate on the service logic. Job …


A Self-Organizing Approach To Episodic Memory Modeling, Wenwen Wang, Budhitama Subagdja, Ah-Hwee Tan Jul 2010

This paper presents a neural model that learns episodic traces in response to a continual stream of sensory input and feedback received from the environment. The proposed model, based on a fusion Adaptive Resonance Theory (fusion ART) network, extracts key events and encodes spatiotemporal relations between events by creating cognitive nodes dynamically. The model further incorporates a novel memory search procedure, which performs parallel search of stored episodic traces continuously. Compared with prior systems, the proposed episodic memory model presents a robust approach to encoding key events and episodes and recalling them using partial and erroneous cues. We present experimental studies, …


Extracting Common Emotions From Blogs Based On Fine-Grained Sentiment Clustering, Shi Feng, Daling Wang, Ge Yu, Wei Gao, Kam-Fai Wong Jul 2010

Recently, blogs have emerged as the major platform for people to express their feelings and sentiments in the age of Web 2.0. Common emotions, which reflect people’s collective and overall sentiments, are becoming a major concern for governments, businesses, and individual users. Unlike previous literature on sentiment classification and summarization, the major issue in common emotion extraction is to find people’s collective sentiments and their corresponding distributions on the Web. Most existing blog clustering methods take into account keywords, stories, or timelines but neglect the embedded sentiments, which are considered very important features of blogs. In …


Learning To Rank Only Using Training Data From Related Domain, Wei Gao, Peng Cai, Kam-Fai Wong, Aoying Zhou Jul 2010

Like traditional supervised and semi-supervised algorithms, learning to rank for information retrieval requires document annotations provided by domain experts. It is costly to annotate training data for different search domains and tasks. We propose to exploit training data annotated for a related domain to learn to rank retrieved documents in the target domain, in which no labeled data is available. We present a simple yet effective approach based on an instance-weighting scheme. Our method first estimates the importance of each related-domain document relative to the target domain. Then heuristics are studied to transform the importance of individual documents to the pairwise …


Efficient Mutual Nearest Neighbor Query Processing For Moving Object Trajectories, Yunjun Gao, Baihua Zheng, Gencai Chen, Qing Li, Chun Chen, Gang Chen Jun 2010

Given a set D of trajectories, a query object q, and a query time extent Γ, a mutual (i.e., symmetric) nearest neighbor (MNN) query over trajectories finds, from D, the set of trajectories that are among the k1 nearest neighbors (NNs) of q within Γ and, at the same time, have q as one of their k2 NNs. This type of query is useful in many applications such as decision making, data mining, and pattern recognition, as it considers both the proximity of the trajectories to q and the proximity of q to the trajectories. In this paper, we first formalize MNN search …
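The mutual-neighbor condition in the definition above can be illustrated with a minimal brute-force sketch. It makes two simplifying assumptions that are not in the abstract: objects are static points rather than trajectories (so the time extent Γ is ignored), and distance is Euclidean. The function names `k_nearest` and `mutual_nearest_neighbors` are hypothetical and are not taken from the paper.

```python
import math


def k_nearest(target, candidates, k):
    """Return the k candidates closest to target (Euclidean distance)."""
    return sorted(candidates, key=lambda c: math.dist(target, c))[:k]


def mutual_nearest_neighbors(data, q, k1, k2):
    """Brute-force MNN over static points (a simplification of trajectories).

    A point p qualifies if it is among the k1 nearest neighbors of q
    and q is among the k2 nearest neighbors of p.
    """
    result = []
    for p in k_nearest(q, data, k1):
        # Neighbors of p are drawn from the other data points plus q itself.
        others = [o for o in data if o != p] + [q]
        if q in k_nearest(p, others, k2):
            result.append(p)
    return result


data = [(1, 0), (3, 0), (4, 0), (10, 0)]
q = (0, 0)
strict = mutual_nearest_neighbors(data, q, k1=2, k2=1)
relaxed = mutual_nearest_neighbors(data, q, k1=2, k2=3)
```

With `k2=1` only `(1, 0)` qualifies, because the nearest neighbor of `(3, 0)` is `(4, 0)` rather than `q`; relaxing to `k2=3` admits `(3, 0)` as well. This sketch costs one full scan per candidate and is meant only to make the symmetric condition concrete.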


A Social Transitivity-Based Data Dissemination Scheme For Opportunistic Networks, Jaesung Ku, Yangwoo Ko, Jisun An, Dongman Lee Jun 2010

A social-based routing protocol for opportunistic networks considers direct delivery as its forwarding metric. By ignoring indirect delivery through intermediate nodes, it misses chances to find paths that are better in terms of delivery ratio and time. To overcome this limitation, we propose to incorporate transitivity, which considers indirect delivery through intermediate nodes, as one of the forwarding metrics. We also found that some message forwards do not improve the delivery performance. To reduce the number of these useless forwards, the proposed scheme forwards messages to an encountered node when the increase of total utility value is greater …


Using Hadoop And Cassandra For Taxi Data Analytics: A Feasibility Study, Alvin Jun Yong Koh, Xuan Khoa Nguyen, C. Jason Woodard Jun 2010

This paper reports on a preliminary study to assess the feasibility of using the Open Cirrus Cloud Computing Research testbed to provide offline and online analytical support for taxi fleet operations. In the study, we benchmarked the performance gains from distributing the offline analysis of GPS location traces over multiple virtual machines using the Apache Hadoop implementation of the MapReduce paradigm. We also explored the use of the Apache Cassandra distributed database system for online retrieval of vehicle trace data. While configuring the testbed infrastructure was straightforward, we encountered severe I/O bottlenecks in running the benchmarks due to the lack …


Semantic Context Modeling With Maximal Margin Conditional Random Fields For Automatic Image Annotation, Yu Xiang, Xiangdong Zhou, Zuotao Liu, Tat-Seng Chua, Chong-Wah Ngo Jun 2010

Context modeling for Vision Recognition and Automatic Image Annotation (AIA) has attracted increasing attention in recent years. Among the various kinds of contextual information and resources, semantic context has been exploited in AIA and has brought promising results. However, previous work either cast the problem as structural classification or adopted multi-layer modeling, both of which suffer from problems of scalability or model efficiency. In this paper, we propose a novel discriminative Conditional Random Field (CRF) model for semantic context modeling in AIA, which is built over semantic concepts and treats an image as a whole observation without segmentation. Our model captures the interactions between semantic …


Anytime Planning For Decentralized POMDPs Using Expectation Maximization, Akshat Kumar, Shlomo Zilberstein Jun 2010

Decentralized POMDPs provide an expressive framework for multi-agent sequential decision making. While finite-horizon DEC-POMDPs have enjoyed significant success, progress remains slow for the infinite-horizon case mainly due to the inherent complexity of optimizing stochastic controllers representing agent policies. We present a promising new class of algorithms for the infinite-horizon case, which recasts the optimization problem as inference in a mixture of DBNs. An attractive feature of this approach is the straightforward adoption of existing inference techniques in DBNs for solving DEC-POMDPs and supporting richer representations such as factored or continuous states and actions. We also derive the Expectation Maximization (EM) …


Satrap: Data And Network Heterogeneity Aware P2P Data-Mining, Hock Kee Ang, Vivekanand Gopalkrishnan, Anwitaman Datta, Wee Keong Ng, Steven C. H. Hoi Jun 2010

Distributed classification aims to build an accurate classifier by learning from distributed data while reducing computation and communication cost. A P2P network, where numerous users come together to share resources like data content, bandwidth, storage space, and CPU resources, is an excellent platform for distributed classification. However, two important aspects of the learning environment have often been overlooked by other works, viz., 1) the location of the peers, which results in variable communication cost, and 2) the heterogeneity of the peers' data, which can help reduce redundant communication. In this paper, we examine the properties of network and data heterogeneity and propose …


OTL: A Framework Of Online Transfer Learning, Peilin Zhao, Steven C. H. Hoi Jun 2010

In this paper, we investigate a new machine learning framework called Online Transfer Learning (OTL) that aims to transfer knowledge from some source domain to an online learning task on a target domain. We do not assume the target data follows the same class or generative distribution as the source data, and our key motivation is to improve a supervised online learning task in a target domain by exploiting the knowledge that has been learned from a large amount of training data in source domains. OTL is in general challenging since data in both domains not only can be different in …


A Hybrid Method To Detect Deflation Fraud In Cost-Per-Action Online Advertising, Xuhua Ding Jun 2010

Web advertisers prefer the cost-per-action (CPA) advertisement model, whereby an advertiser pays a web publisher according to the actual number of transactions rather than the volume of advertisement clicks. The main obstacle to wide deployment of this model is deflation fraud: a dishonest advertiser under-reports the transaction count in order to pay less. In this paper, we present a mechanism to detect such fraud using a hybrid of cryptography and probability tools. With the assistance from a small number of users, the publisher can detect deflation fraud with a success probability growing exponentially with the fraud …


Prediction Of Protein Subcellular Localization: A Machine Learning Approach, Kyong Jin Shim Jun 2010

Subcellular localization is a key functional characteristic of proteins. Optimally combining available information is one of the key challenges in today's knowledge-based subcellular localization prediction approaches. This study explores machine learning approaches for the prediction of protein subcellular localization that use resources concerning Gene Ontology and secondary structures. Using the spectrum kernel for feature representation of amino acid sequences and secondary structures, we explore an SVM-based learning method that classifies six subcellular localization sites: endoplasmic reticulum, extracellular, Golgi, membrane, mitochondria, and nucleus.
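As a rough illustration of the spectrum kernel mentioned above, the sketch below uses its standard definition: each sequence is mapped to the counts of its length-k substrings (k-mers), and the kernel value is the inner product of the two count vectors. The function names and the toy sequences are assumptions for illustration; the abstract does not state the value of k or any implementation details.

```python
from collections import Counter


def spectrum(seq, k):
    """Count every contiguous length-k substring (k-mer) of seq."""
    return Counter(seq[i:i + k] for i in range(len(seq) - k + 1))


def spectrum_kernel(a, b, k):
    """Inner product of the k-mer count vectors of sequences a and b."""
    sa, sb = spectrum(a, k), spectrum(b, k)
    # Counter returns 0 for k-mers that do not occur in b.
    return sum(count * sb[kmer] for kmer, count in sa.items())


# Toy sequences over a two-letter alphabet (not real protein data).
cross = spectrum_kernel("ABAB", "BABA", 2)
self_sim = spectrum_kernel("ABAB", "ABAB", 2)
```

Here "ABAB" has 2-mer counts AB:2, BA:1 and "BABA" has BA:2, AB:1, so `cross` is 2·1 + 1·2 = 4 and `self_sim` is 2·2 + 1·1 = 5. A kernel matrix built this way could then be passed to any SVM implementation that accepts precomputed kernels.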


Income, Endogenous Market Structure And Innovation, Mei Lin, Shaojin Li, Andrew B. Whinston Jun 2010

We investigate the effect of income distribution on R&D in a dynamic framework. Our model captures both the infinite R&D race among heterogeneous innovators and a market where successful innovators generate revenues. The market structure of successful innovations is endogenous – firms produce vertically differentiated substitute goods and compete in price. Based on firms' equilibrium market revenues, we derive numerical solutions of the Markov perfect equilibrium innovation rate of the dynamic problem. A key insight in our results is that explicitly modeling price competition and the market structure plays an important role in evaluating the impact of rising income inequality …


Z-Sky: An Efficient Skyline Query Processing Framework Based On Z-Order, Ken C. K. Lee, Wang-Chien Lee, Baihua Zheng, Huajing Li, Yuan Tian Jun 2010

Given a set of data points in a multidimensional space, a skyline query retrieves those data points that are not dominated by any other point in the same dataset. Observing that the properties of Z-order space-filling curves (or Z-order curves) match the dominance relationships among data points in a geometrical data space, we develop and present in this paper a novel and efficient processing framework to evaluate skyline queries and their variants, and to support skyline result updates based on Z-order curves. This framework consists of ZBtree, i.e., an index structure to organize a source dataset and …
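The link between dominance and Z-order can be made concrete with a small sketch. It assumes non-negative integer coordinates where smaller values are better, and it relies on one property of bit interleaving: if p is no larger than q in every dimension, then the Z-address of p is no larger than that of q. Scanning points in Z-order therefore guarantees that a point can only be dominated by points seen earlier. This is a plain in-memory scan for illustration, not the ZBtree index or the algorithms of the paper; all function names are hypothetical.

```python
def dominates(p, q):
    """p dominates q: no worse in every dimension, strictly better in one."""
    return (all(a <= b for a, b in zip(p, q))
            and any(a < b for a, b in zip(p, q)))


def z_address(point, bits=16):
    """Interleave the bits of the coordinates into a single Z-order key."""
    code = 0
    dims = len(point)
    for bit in range(bits):
        for d, coord in enumerate(point):
            code |= ((coord >> bit) & 1) << (bit * dims + d)
    return code


def skyline_z_order(points, bits=16):
    """Compute the skyline by scanning distinct points in Z-order.

    Because a dominating point always has a smaller Z-address, each
    point only needs to be checked against the skyline found so far.
    """
    skyline = []
    for p in sorted(set(points), key=lambda pt: z_address(pt, bits)):
        if not any(dominates(s, p) for s in skyline):
            skyline.append(p)
    return skyline


points = [(1, 9), (3, 3), (2, 8), (5, 5), (9, 1), (4, 7)]
result = skyline_z_order(points)
```

For this input the skyline consists of `(1, 9)`, `(2, 8)`, `(3, 3)`, and `(9, 1)`; the points `(5, 5)` and `(4, 7)` are both dominated by `(3, 3)`.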


On Trustworthiness Of CPU Usage Metering And Accounting, Mei Liu, Xuhua Ding Jun 2010

In the envisaged utility computing paradigm, a user taps a service provider’s computing resources to accomplish her tasks, without deploying the needed hardware and software in her own IT infrastructure. To make the service profitable, the service provider charges the user based on the resources consumed. A commonly billed resource is CPU usage. A key factor to ensure the success of such a business model is the trustworthiness of the resource metering scheme. In this paper, we provide a systematic study on the trustworthiness of CPU usage metering. Our results show that the metering schemes in commodity operating systems should …


Revisiting Unpredictability-Based RFID Privacy Models, Junzuo Lai, Robert Huijie Deng, Yingjiu Li Jun 2010

Recently, there have been several attempts in establishing formal RFID privacy models in the literature. These models mainly fall into two categories: one based on the notion of indistinguishability of two RFID tags, denoted as ind-privacy, and the other based on the unpredictability of the output of an RFID protocol, denoted as unp-privacy. Very recently, at CCS’09, Ma et al. proposed a modified unp-privacy model, referred to as unp -privacy. In this paper, we first revisit the existing RFID privacy models and point out their limitations. We then propose a new RFID privacy model, denoted as …


Visualizing And Exploring Evolving Information Networks In Wikipedia, Ee Peng Lim, Agus Trisnajaya Kwee, Nelman Lubis Ibrahim, Aixin Sun, Anwitaman Datta, Kuiyu Chang, Maureen Maureen Jun 2010

Information networks in Wikipedia evolve as users collaboratively edit the articles that embed the networks. These information networks represent both the structure and content of a community’s knowledge, and the networks evolve as the knowledge gets updated. By observing how the networks evolve and finding their evolution patterns, one can gain higher-order knowledge about the networks and conduct longitudinal network analysis to detect events and summarize trends. In this paper, we present SSNetViz+, a visual analytic tool to support visualization and exploration of Wikipedia’s information networks. SSNetViz+ supports time-based network browsing, content browsing and search. Using a terrorism information network as an …


Do Wikipedians Follow Domain Experts? A Domain-Specific Study On Wikipedia Contribution, Yi Zhang, Aixin Sun, Anwitaman Datta, Kuiyu Chang, Ee Peng Lim Jun 2010

Wikipedia is one of the most successful online knowledge bases, attracting millions of visits daily. Not surprisingly, its huge success has in turn led to immense research interest in better understanding the collaborative knowledge building process. In this paper, we perform a domain-specific (terrorism) case study, comparing and contrasting the knowledge evolution in Wikipedia with a knowledge base created by domain experts. Specifically, we used the Terrorism Knowledge Base (TKB) developed by experts at MIPT. We identified 409 Wikipedia articles matching TKB records and studied them from three aspects: creation, revision, and link evolution. We …


Stevent: Spatio-Temporal Event Model For Social Network Discovery, Hady W. Lauw, Ee Peng Lim, Hwee Hwa Pang, Teck-Tim Tan Jun 2010

Spatio-temporal data concerning the movement of individuals over space and time contains latent information on the associations among these individuals. Sources of spatio-temporal data include usage logs of mobile and Internet technologies. This article defines a spatio-temporal event by the co-occurrences among individuals that indicate potential associations among them. Each spatio-temporal event is assigned a weight based on the precision and uniqueness of the event. By aggregating the weights of events relating two individuals, we can determine the strength of association between them. We conduct extensive experimentation to investigate both the efficacy of the proposed model as well as the …


Efficient Processing Of Exact Top-K Queries Over Disk-Resident Sorted Lists, Hwee Hwa Pang, Xuhua Ding, Baihua Zheng Jun 2010

The top-k query is employed in a wide range of applications to generate a ranked list of data that have the highest aggregate scores over certain attributes. As the pool of attributes for selection by individual queries may be large, the data are indexed with per-attribute sorted lists, and a threshold algorithm (TA) is applied on the lists involved in each query. The TA executes in two phases: it first finds a cut-off threshold for the top-k result scores, then evaluates all the records that could score above the threshold. In this paper, we focus on exact top-k queries that involve monotonic linear …
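For readers unfamiliar with threshold algorithms, the following is a minimal in-memory sketch of the classic textbook form of the TA over per-attribute sorted lists, with a weighted-sum (monotonic linear) scoring function. It is not the disk-oriented two-phase variant studied in the paper. It assumes non-negative weights and that every record appears in every list; the function name and the sample data are hypothetical.

```python
import heapq


def threshold_algorithm(sorted_lists, weights, k):
    """Textbook threshold algorithm for exact top-k with a weighted sum.

    sorted_lists: one list per attribute of (record_id, score) pairs,
                  each sorted by descending score and covering all records.
    weights:      non-negative weight per attribute.
    Returns the top-k (aggregate_score, record_id) pairs, best first.
    """
    lookup = [dict(lst) for lst in sorted_lists]  # simulates random access
    seen = set()
    top = []  # min-heap holding the best k aggregates found so far
    for pos in range(len(sorted_lists[0])):
        # Sorted access: read one entry from each list at this depth.
        for lst in sorted_lists:
            rid, _ = lst[pos]
            if rid not in seen:
                seen.add(rid)
                total = sum(w * table[rid]
                            for w, table in zip(weights, lookup))
                heapq.heappush(top, (total, rid))
                if len(top) > k:
                    heapq.heappop(top)
        # No unseen record can score above this bound.
        threshold = sum(w * lst[pos][1]
                        for w, lst in zip(weights, sorted_lists))
        if len(top) == k and top[0][0] >= threshold:
            break
    return sorted(top, reverse=True)


list_a = [("r1", 9), ("r2", 8), ("r3", 1), ("r4", 0)]
list_b = [("r2", 9), ("r3", 7), ("r1", 2), ("r4", 1)]
top2 = threshold_algorithm([list_a, list_b], weights=(1, 1), k=2)
```

On this sample the scan stops at depth 3, where the threshold (1 + 2 = 3) falls below the second-best aggregate (11), and returns `r2` with score 17 followed by `r1` with score 11 without ever reading `r4`.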