Open Access. Powered by Scholars. Published by Universities.®

Physical Sciences and Mathematics Commons

Databases and Information Systems

Articles 4591 - 4620 of 6727

Full-Text Articles in Physical Sciences and Mathematics

Towards Better Quality Specification Miners, David Lo, Siau-Cheng Khoo Nov 2011

David LO

Software is often built without specifications. Tools to automatically extract specifications from software are needed, and many techniques have been proposed. One type of specification, the temporal API specification, is often expressed as an automaton (i.e., FSA/PFSA). There has been much work on mining temporal software specifications using dynamic analysis techniques, i.e., analysis of program traces. Unfortunately, the scalability, robustness and accuracy of these techniques have not been comprehensively addressed. In this paper, we describe a framework that enables assessment of the performance of a specification miner in generating temporal specification of software …


Mining Interesting Link Formation Rules In Social Networks, Cane Wing-Ki Leung, Ee Peng Lim, David Lo, Jianshu Weng Nov 2011

David LO

Link structures are important patterns one looks out for when modeling and analyzing social networks. In this paper, we propose the task of mining interesting Link Formation rules (LF-rules) containing link structures known as Link Formation patterns (LF-patterns). LF-patterns capture various dyadic and/or triadic structures among groups of nodes, while LF-rules capture the formation of a new link from a focal node to another node as a postcondition of existing connections between the two nodes. We devise a novel LF-rule mining algorithm, known as LFR-Miner, based on frequent subgraph mining for our task. In addition to using a support-confidence framework …


Mining Closed Discriminative Dyadic Sequential Patterns, David Lo, Hong Cheng, Lucia Nov 2011

David LO

Much data comes in sequential formats. In this study, we are interested in sequential data that comes in pairs. There are many interesting datasets in this format from various domains, including parallel textual corpora, duplicate bug reports, and other pairs of related sequences of events. Our goal is to mine a set of closed discriminative dyadic sequential patterns from a database of sequence pairs, each belonging to one of two classes, +ve and -ve. These dyadic sequential patterns characterize the discriminating facets contrasting the two classes. They are potentially good features to be used for the …


Mining Patterns And Rules For Software Specification Discovery, David Lo, Siau-Cheng Khoo Nov 2011

David LO

Software specifications in industry are often lacking, incomplete or outdated. Lacking and incomplete specifications cause various software engineering problems. Studies have shown that program comprehension takes up to 45% of software development costs. One of the root causes of the high cost is the lack of documented specifications. Outdated and incomplete specifications may also cause bugs and compatibility issues. In this paper, we describe novel data mining techniques to mine or reverse engineer these specifications from the pool of software engineering data. A large amount of software data is available for analysis. One form of software data is program …


Mining Specifications In Diversified Formats From Execution Traces, David Lo Nov 2011

David LO

Software evolves; this phenomenon causes increased maintenance effort, problems in comprehending the ever-changing code base and difficulty in verifying software correctness. As software changes, the documented specification is often not updated. An outdated specification makes it harder to understand the code base during maintenance tasks. Software changes might also induce bugs, anomalies and even security threats. To address these issues, we propose an array of specification mining techniques to mine software specifications in diversified formats from program execution traces. Case studies on various systems show that the extracted specifications shed light on the behaviors of systems under analysis. …


Data Mining For Software Engineering, Tao Xie, Suresh Thummalapenta, David Lo, Chao Liu Nov 2011

David LO

To improve software productivity and quality, software engineers are increasingly applying data mining algorithms to various software engineering tasks. However, mining SE data poses several challenges. The authors present various algorithms to effectively mine sequences, graphs, and text from such data.


Specification Mining: A Concise Introduction, David Lo, Siau-Cheng Khoo, Chao Liu, Jiawei Han Nov 2011

David LO

No abstract provided.


Terapixel Imaging Of Cosmological Simulations, Yu Feng, Rupert Croft, Tiziana Di Matteo, Nishikanta Khandai, Randy Sargent, Illah Nourbakhsh, Paul Dille, Chris Bartley, Volker Springel, Anirban Jana, Jeffrey Gardner Nov 2011

Randy Sargent

The increasing size of cosmological simulations has led to the need for new visualization techniques. We focus on smoothed particle hydrodynamic (SPH) simulations run with the GADGET code and describe methods for visually accessing the entire simulation at full resolution. The simulation snapshots are rastered and processed on supercomputers into images that are ready to be accessed through a Web interface (GigaPan). This allows any scientist with a Web browser to interactively explore simulation data sets in both spatial and temporal dimensions and data sets which in their native format can be hundreds of terabytes in size or more. We …


On-Line Banking Systems: Are They Sustainable?, Satish Mahadevan Srinivasan, Sachin Pawaskar, Abhishek Tripathi, Lotfollah Najjar Nov 2011

Information Systems and Quantitative Analysis Faculty Proceedings & Presentations

Although on-line banking has grown in recent years, customers have not participated enthusiastically, either in the past or at present. The sustainability of a bank offering an on-line banking service depends on its capacity to attract new customers, retain existing customers, and extend its services to its current and future customer base. This investigation examines whether there is any significant difference among the factors, namely transactional security, information design, navigational design, visual design, web site trust, web site satisfaction and e-loyalty, over sustainability of on-line banking for …


A Visual Analytics System For Metropolitan Transportation, Siyuan Liu, Ce Liu, Qiong Luo, Lionel M. Ni, Huamin Qu Nov 2011

LARC Research Publications

With the increasing availability of metropolitan transportation data, such as those from vehicle GPS (Global Positioning System) devices and road-side sensors, it becomes viable for authorities, operators, and individuals to analyze the data for a better understanding of the transportation system and possibly improved utilization and planning of the system. We report our experience in building the VAST (Visual Analytics for Smart Transportation) system. Our key observation is that metropolitan transportation data are inherently visual as they are spatiotemporal around road networks. Therefore, we visualize traffic data together with digital maps and support analytical queries through this interactive visual …


Coping With Distance: An Empirical Study Of Communication On The Jazz Platform, Renuka Sindhgatta, Bikram Sengupta, Subhajit Datta Nov 2011

Research Collection School Of Computing and Information Systems

Global software development, which is characterized by teams separated by physical distance and/or time-zone differences, has traditionally posed significant communication challenges. Often these have caused delays in completing tasks, or created misalignment across sites leading to re-work. In recent years, however, a new breed of development environments with rich collaboration features has emerged to facilitate cross-site work in distributed projects. In this paper we revisit the question "does distance matter?" in the context of the IBM Jazz Platform, a state-of-the-art collaborative development environment. We study the ecosystem of a large distributed team of around 300 members across 35 …


Enabling GPU Acceleration With Messaging Middleware, Randall E. Duran, Li Zhang, Tom Hayhurst Nov 2011

Research Collection School Of Computing and Information Systems

Graphics processing units (GPUs) offer great potential for accelerating processing for a wide range of scientific and business applications. However, complexities associated with using GPU technology have limited its use in applications. This paper reviews earlier approaches improving GPU accessibility, and explores how integration with middleware messaging technologies can further improve the accessibility and usability of GPU-enabled platforms. The results of a proof-of-concept integration between an open-source messaging middleware platform and a general-purpose GPU platform using the CUDA framework are presented. Additional applications of this technique are identified and discussed as potential areas for further research.


Learning Human Emotion Patterns For Modeling Virtual Humans, Shu Feng, Ah-Hwee Tan Nov 2011

Research Collection School Of Computing and Information Systems

Emotion modeling is a crucial part in modeling virtual humans. Although various emotion models have been proposed, most of them focus on designing specific appraisal rules. As there is no unified framework for emotional appraisal, the appraisal variables have to be defined beforehand and evaluated in a subjective way. In this paper, we propose an emotion model based on machine learning methods by taking the following position: an emotion model should mirror actual human emotion in the real world and connect tightly with human inner states, such as drives, motivations and personalities. Specifically, a self-organizing neural model called Emotional Appraisal …


Consistent Community Identification In Complex Networks, Haewoon Kwak, Young-Ho Eom, Yoonchan Choi, Hawoong Jeong Nov 2011

Research Collection School Of Computing and Information Systems

We have found that known community identification algorithms produce inconsistent communities when the node ordering changes at input. We use the pairwise membership probability and consistency to quantify the level of consistency across multiple runs of an algorithm. Based on these two metrics, we address the consistency problem without compromising the modularity. The key insight of the algorithm is to use pairwise membership probabilities as link weights. It offers a new tool in the study of community structures and their evolutions.
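As a rough illustration of the pairwise membership probability mentioned in this abstract, the sketch below computes, for every node pair, the fraction of runs in which the two nodes land in the same community. This is only one plausible reading of the metric; the function name `pairwise_membership_probability`, the input format (one node-to-label dictionary per run), and the toy data are assumptions for illustration and are not taken from the paper.

```python
from itertools import combinations


def pairwise_membership_probability(partitions):
    """Return, for each node pair, the fraction of runs that co-cluster it.

    `partitions` is a list of dicts (one per run) mapping node -> community
    label. This input format is a hypothetical choice for the sketch.
    """
    nodes = sorted(partitions[0])
    counts = {pair: 0 for pair in combinations(nodes, 2)}
    for partition in partitions:
        for u, v in counts:
            # Labels are only comparable within a single run.
            if partition[u] == partition[v]:
                counts[(u, v)] += 1
    runs = len(partitions)
    return {pair: count / runs for pair, count in counts.items()}


# Toy data: four runs of some community detection algorithm on nodes a, b, c.
runs = [
    {"a": 0, "b": 0, "c": 1},
    {"a": 0, "b": 1, "c": 1},
    {"a": 1, "b": 1, "c": 0},
    {"a": 0, "b": 0, "c": 1},
]
probs = pairwise_membership_probability(runs)
```

The resulting values could then serve as link weights for a further run of a weighted community detection algorithm, which is the key insight the abstract describes; that step is not shown here.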


Are There Contagion Effects In Information Technology And Business Process Outsourcing?, Arti Mann, Robert J. Kauffman, Kunsoo Han, Barrie R. Nault Nov 2011

Research Collection School Of Computing and Information Systems

We model the diffusion of IT outsourcing using announcements about IT outsourcing deals. We estimate a lognormal diffusion curve to test whether IT outsourcing follows a pure diffusion process or there are contagion effects involved. The methodology permits us to study the consequences of outsourcing events, especially mega-deals with IT contract amounts that exceed US$1 billion. Mega-deals act, we theorize, as precipitating events that create a strong basis for contagion effects and are likely to affect decision-making by other firms in an industry. Then, we evaluate the role of different communication channels in the diffusion process of IT outsourcing by …


A Brain-Inspired Model Of Hierarchical Planner, Budhitama Subagdja, Ah-Hwee Tan Nov 2011

Research Collection School Of Computing and Information Systems

Hierarchical planning is an approach to planning that composes and executes hierarchically arranged plans to solve problems. Most symbolic-based hierarchical planners have been devised to allow the knowledge to be described expressively. However, a great challenge is to automatically seek and acquire new plans on the fly. This paper presents a novel neural-based model of hierarchical planning that can seek and acquire new plans on-line if the necessary knowledge is lacking. Inspired by findings in neuropsychology, plans can be inherently learnt, retrieved, and manipulated simultaneously rather than discretely processed as in most symbolic approaches. Using a multi-channel adaptive resonance …


Software Process Evaluation: A Machine Learning Approach, Ning Chen, Steven C. H. Hoi, Xiaokui Xiao Nov 2011

Research Collection School Of Computing and Information Systems

Software process evaluation is essential to improve software development and the quality of software products in an organization. Conventional approaches based on manual qualitative evaluations (e.g., artifacts inspection) are deficient in the sense that (i) they are time-consuming, (ii) they suffer from the authority constraints, and (iii) they are often subjective. To overcome these limitations, this paper presents a novel semi-automated approach to software process evaluation using machine learning techniques. In particular, we formulate the problem as a sequence classification task, which is solved by applying machine learning algorithms. Based on the framework, we define a new quantitative indicator to …


Finding Relevant Answers In Software Forums, Swapna Gottopati, David Lo, Jing Jiang Nov 2011

Research Collection School Of Computing and Information Systems

Online software forums provide a huge amount of valuable content. Developers and users often ask questions and receive answers from such forums. The availability of a vast amount of thread discussions in forums provides ample opportunities for knowledge acquisition and summarization. For a given search query, current search engines use a traditional information retrieval approach to extract webpages containing relevant keywords. However, in software forums, there are often many threads containing similar keywords, and each thread can contain many posts, sometimes 1,000 or more. Manually finding relevant answers from these long threads is a painstaking task to …


Unsupervised Multiple Kernel Learning, Jinfeng Zhuang, Jialei Wang, Steven C. H. Hoi, Xiangyang Lan Nov 2011

Research Collection School Of Computing and Information Systems

Traditional multiple kernel learning (MKL) algorithms are essentially supervised learning in the sense that the kernel learning task requires the class labels of training data. However, class labels may not always be available prior to the kernel learning task in some real world scenarios, e.g., an early preprocessing step of a classification task or an unsupervised learning task such as dimension reduction. In this paper, we investigate a problem of Unsupervised Multiple Kernel Learning (UMKL), which does not require class labels of training data as needed in a conventional multiple kernel learning task. Since a kernel essentially defines pairwise similarity …


The Knowledge-Driven Exploration Of Integrated Biomedical Knowledge Sources Facilitates The Generation Of New Hypotheses, Vinh Nguyen, Olivier Bodenreider, Todd Minning, Amit P. Sheth Oct 2011

Kno.e.sis Publications

Knowledge gained from the scientific literature can complement newly obtained experimental data in helping researchers understand the pathological processes underlying diseases. However, unless the scientific literature and experimental data are semantically integrated, it is generally difficult for scientists to exploit the two sources effectively. We argue that, in addition to the semantic integration of heterogeneous knowledge sources, the usability of the integrated resource by scientists is dependent upon the availability of knowledge visualization and exploration tools. Moreover, the integration techniques must be scalable and the exploration interfaces must be easy to use by bench scientists. The end goal of such …


Demonstration: Real-Time Semantic Analysis Of Sensor Streams, Harshal Patni, Cory Andrew Henson, Michael Cooney, Amit P. Sheth, Krishnaprasad Thirunarayan Oct 2011

Kno.e.sis Publications

The emergence of dynamic information sources, including sensor networks, has led to large streams of real-time data on the Web. Research studies suggest that these dynamic networks have created more data in the last three years than in the entire history of civilization, and this trend will only increase in the coming years [1]. With this coming data explosion, real-time analytics software must either adapt or die [2]. This paper focuses on the task of integrating and analyzing multiple heterogeneous streams of sensor data with the goal of creating meaningful abstractions, or features. These features are then temporally aggregated …


Demonstration: Secure - Semantics Empowered Rescue Environment, Pratikkumar Desai, Cory Andrew Henson, Pramod Anantharam, Amit P. Sheth Oct 2011

Kno.e.sis Publications

This paper demonstrates a Semantic Web enabled system for collecting and processing sensor data within a rescue environment. The real-time system collects heterogeneous raw sensor data from rescue robots through a wireless sensor network. The raw sensor data is converted to RDF using the Semantic Sensor Network (SSN) ontology and further processed to generate abstractions used for event detection in emergency scenarios.


Identifying Social Influence In Networks Using Randomized Experiments, Sinan Aral, Dylan Walker Oct 2011

Business Faculty Articles and Research

The recent availability of massive amounts of networked data generated by email, instant messaging, mobile phone communications, micro blogs, and online social networks is enabling studies of population-level human interaction on scales orders of magnitude greater than what was previously possible. One important goal of applying statistical inference techniques to large networked datasets is to understand how behavioral contagions spread in human social networks. More precisely, understanding how people influence or are influenced by their peers can help us understand the ebb and flow of market trends, product adoption and diffusion, the spread of health behaviors such as smoking and …


Sempush: Privacy-Aware And Scalable Broadcasting For Semantic Microblogging, Pavan Kapanipathi, Julia Anaya, Alexandre Passant Oct 2011

Kno.e.sis Publications

Users of traditional microblogging platforms such as Twitter face drawbacks in terms of (1) privacy of status updates as a followee, which may reach undesired people, and (2) information overload as a follower, who receives uninteresting microposts from followees. In this paper we demonstrate distributed and user-controlled dissemination of microposts using SMOB (semantic microblogging framework) and Semantic Hub (privacy-aware implementation of the PuSH protocol). The approach leverages users' Social Graph to dynamically create groups of followers who are eligible to receive a micropost. The restrictions used to create the groups are provided by the followee based on the hashtags in the micropost. Both SMOB …


Semantic Annotation And Search For Resources In The Next Generation Web With SA-REST, Ajith H. Ranabahu, Amit P. Sheth, Maryam Panahiazar, Sanjaya Wijeratne Oct 2011

Kno.e.sis Publications

SA-REST, the W3C member submission, can be used for supporting a wide variety of Plain Old Semantic HTML (POSH) annotation capabilities on any type of Web resource. The Kino framework and tools provide the capabilities needed to realize SA-REST's promised value. These tools include (a) a browser plugin to support annotation of a Web resource (including services) with respect to an ontology, domain model or vocabulary, (b) an annotation-aware indexing engine and (c) faceted search and selection of the Web resources. At one end of the spectrum, we present KinoE (aka Kino for Enterprise) which uses NCBO formal ontologies and …


A Domain Specific Language For Enterprise Grade Cloud-Mobile Hybrid Applications, Ajith H. Ranabahu, E. Michael Maximilien, Amit P. Sheth, Krishnaprasad Thirunarayan Oct 2011

Kno.e.sis Publications

Cloud computing has changed the technology landscape by offering flexible and economical computing resources to the masses. However, vendor lock-in makes the migration of applications and data across clouds an expensive proposition. The lock-in is especially serious when considering the new technology trend of combining cloud with mobile devices.

In this paper, we present a domain specific language (DSL) that is purposely created for generating hybrid applications spanning across mobile devices as well as computing clouds. We propose a model-driven development process that makes use of a DSL to provide sufficient programming abstractions over both cloud and mobile features. We …


Personalized Filtering Of The Twitter Stream, Pavan Kapanipathi, Fabrizio Orlandi, Amit P. Sheth, Alexandre Passant Oct 2011

Kno.e.sis Publications

With the rapid growth in users on social networks, there is a corresponding increase in user-generated content, in turn resulting in information overload. On Twitter, for example, users tend to receive uninteresting information because their interests do not overlap with those of the people they follow. In this paper we present a Semantic Web approach to filter public tweets matching interests from personalized user profiles. Our approach includes automatic generation of multi-domain and personalized user profiles, filtering the Twitter stream based on the generated profiles and delivering the matching tweets in real time. Given that users' interests and personalization needs change with time, we also discuss …


Efficient Evaluation Of Continuous Text Search Queries, Kyriakos Mouratidis, Hwee Hwa Pang Oct 2011

Research Collection School Of Computing and Information Systems

Consider a text filtering server that monitors a stream of incoming documents for a set of users, who register their interests in the form of continuous text search queries. The task of the server is to constantly maintain for each query a ranked result list, comprising the recent documents (drawn from a sliding window) with the highest similarity to the query. Such a system underlies many text monitoring applications that need to cope with heavy document traffic, such as news and email monitoring. In this paper, we propose the first solution for processing continuous text queries efficiently. Our objective is to …
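To make the problem setting concrete, here is a deliberately naive baseline that rescans the whole sliding window each time a result list is requested. It is not the solution the paper proposes (the abstract is truncated before describing it); the class name `NaiveContinuousSearch`, the count-based sliding window, and the term-frequency dot-product similarity are all assumptions chosen for illustration.

```python
import heapq
from collections import Counter, deque


class NaiveContinuousSearch:
    """Brute-force top-k over a count-based sliding window (illustrative only)."""

    def __init__(self, window_size, k):
        # deque(maxlen=...) silently drops the oldest document on overflow.
        self.window = deque(maxlen=window_size)
        self.k = k
        self.queries = {}

    def register(self, query_id, text):
        # A query is stored as a bag of lower-cased terms.
        self.queries[query_id] = Counter(text.lower().split())

    def add_document(self, doc_id, text):
        self.window.append((doc_id, Counter(text.lower().split())))

    def results(self, query_id):
        # Score every document in the window with a term-frequency dot
        # product (an assumed similarity), keeping only positive scores.
        query = self.queries[query_id]
        scored = (
            (sum(query[t] * tf[t] for t in query), doc_id)
            for doc_id, tf in self.window
        )
        top = heapq.nlargest(self.k, (s for s in scored if s[0] > 0))
        return [doc_id for _, doc_id in top]


server = NaiveContinuousSearch(window_size=2, k=1)
server.register("q", "stock market")
server.add_document("d1", "stock prices fall")
server.add_document("d2", "market and stock rally")
first = server.results("q")
server.add_document("d3", "weather report")   # d1 slides out of the window
server.add_document("d4", "sports news")      # d2 slides out of the window
second = server.results("q")
```

Every call to `results` costs time proportional to the window size per query, which is the kind of overhead an efficient method for heavy document traffic would need to avoid.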


A Survey Of Information Diffusion Models And Relevant Problems, Minh Duc Luu, Tuan Anh Hoang, Ee-Peng Lim Oct 2011

Research Collection School Of Computing and Information Systems

There has been tremendous interest in the diffusion of innovations or information in a social system. Nowadays, social networks (offline as well as online) are considered an important medium for diffusion, and a large amount of research has been conducted to understand the dynamics of diffusion in social networks. In this work, we review some of the models proposed for diffusion in social networks. We also highlight the major features of these models by dividing the surveyed models into two categories: non-network and network diffusion models. The former refers to user communities without any knowledge about the user relationship network and the …


Active Multiple Kernel Learning For Interactive 3D Object Retrieval Systems, Steven C. H. Hoi, Rong Jin Oct 2011

Research Collection School Of Computing and Information Systems

An effective relevance feedback solution plays a key role in interactive intelligent 3D object retrieval systems. In this work, we investigate the relevance feedback problem for interactive intelligent 3D object retrieval, with a focus on studying effective machine learning algorithms for improving the user's interaction in the retrieval task. One of the key challenges is to learn an appropriate kernel similarity measure between 3D objects through the relevance feedback interaction with users. We address this challenge by presenting a novel framework of Active Multiple Kernel Learning (AMKL), which exploits multiple kernel learning techniques for relevance feedback in interactive 3D object retrieval. …