Open Access. Powered by Scholars. Published by Universities.®

Physical Sciences and Mathematics Commons

Databases and Information Systems

Articles 3511 - 3540 of 6722

Full-Text Articles in Physical Sciences and Mathematics

Exploring Discriminative Features For Anomaly Detection In Public Spaces, Shriguru Nayak, Archan Misra, Kasthuri Jeyarajah, Philips Kokoh Prasetyo, Ee-Peng Lim Apr 2015

Research Collection School Of Computing and Information Systems

Context data, collected either from mobile devices or from user-generated social media content, can help identify abnormal behavioural patterns in public spaces (e.g., shopping malls, college campuses or downtown city areas). Spatiotemporal analysis of such data streams provides a compelling new approach towards automatically creating real-time urban situational awareness, especially about events that are unanticipated or that evolve very rapidly. In this work, we use real-life datasets collected via SMU's LiveLabs testbed or via SMU's Palanteer software, to explore various discriminative features (both spatial and temporal - e.g., occupancy volumes, rate of change in topic-specific tweets or probabilistic distribution of …


Best Upgrade Plans For Single And Multiple Source-Destination Pairs, Yimin Lin, Kyriakos Mouratidis Apr 2015

Research Collection School Of Computing and Information Systems

In this paper, we study Resource Constrained Best Upgrade Plan (BUP) computation in road network databases. Consider a transportation network (weighted graph) G where a subset of the edges are upgradable, i.e., for each such edge there is a cost that, if spent, reduces the weight of the edge to a specific new value. In the single-pair version of BUP, the input includes a source and a destination in G, and a budget B (resource constraint). The goal is to identify which upgradable edges should be upgraded so that the shortest path distance between source and …
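
To make the single-pair setting in this abstract concrete, here is a minimal brute-force sketch. It is not the authors' algorithm. It assumes an undirected graph, assumes the objective is to minimize the source-to-destination shortest-path distance without exceeding the budget B, and uses hypothetical names (`shortest_distance`, `best_upgrade_plan`, `upgrades`) that do not appear in the paper.

```python
import heapq
from itertools import combinations
from math import inf


def shortest_distance(n, edges, source, target):
    """Dijkstra on an undirected graph given as (u, v, weight) triples."""
    adj = [[] for _ in range(n)]
    for u, v, w in edges:
        adj[u].append((v, w))
        adj[v].append((u, w))
    dist = [inf] * n
    dist[source] = 0
    heap = [(0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist[u]:
            continue  # stale queue entry
        if u == target:
            return d
        for v, w in adj[u]:
            if d + w < dist[v]:
                dist[v] = d + w
                heapq.heappush(heap, (d + w, v))
    return dist[target]


def best_upgrade_plan(n, edges, upgrades, source, target, budget):
    """Try every affordable subset of upgradable edges (assumed objective:
    minimize the source-to-target distance).

    upgrades maps an edge index to (cost, new_weight). The search is
    exponential in the number of upgradable edges, so it only suits
    tiny illustrative inputs.
    """
    best = (shortest_distance(n, edges, source, target), ())
    candidates = sorted(upgrades)
    for r in range(1, len(candidates) + 1):
        for subset in combinations(candidates, r):
            if sum(upgrades[i][0] for i in subset) > budget:
                continue  # plan exceeds the budget
            upgraded = list(edges)
            for i in subset:
                u, v, _ = edges[i]
                upgraded[i] = (u, v, upgrades[i][1])
            d = shortest_distance(n, upgraded, source, target)
            if d < best[0]:
                best = (d, subset)
    return best
```

The function returns a pair of the best distance found and the tuple of upgraded edge indices. The exhaustive search is only a baseline for checking small cases by hand.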


Measuring User Influence, Susceptibility And Cynicalness In Sentiment Diffusion, Roy Ka-Wei Lee, Ee Peng Lim Apr 2015

Research Collection School Of Computing and Information Systems

Diffusion in social networks has recently become an important research topic because of the massive amount of information shared on social media and the Web. As information diffuses, users express sentiments which can affect the sentiments of others. In this paper, we analyze how users reinforce or modify the sentiment of one another based on a set of inter-dependent latent user factors as they engage in the diffusion of event information. We introduce these sentiment-based latent user factors, namely influence, susceptibility and cynicalness. We also propose the ISC model to relate the three factors together and develop an iterative computation approach to …


Theory Identity: A Machine-Learning Approach, Kai Larsen, Dirk Hovorka, Jevin West, James Birt, James Pfaff, Trevor Chambers, Zebula Sampedro, Nick Zager, Bruce Vanstone Mar 2015

Bruce Vanstone

Theory identity is a fundamental problem for researchers seeking to determine theory quality, create theory ontologies and taxonomies, or perform focused theory-specific reviews and meta-analyses. We demonstrate a novel machine-learning approach to theory identification based on citation data and article features. The multi-disciplinary ecosystem of articles which cite a theory's originating paper is created and refined into the network of papers predicted to contribute to, and thus identify, a specific theory. We provide a 'proof-of-concept' for a highly-cited theory. Implications for cross-disciplinary theory integration and the identification of theories for a rapidly expanding scientific literature are discussed.


Temporal Mining For Distributed Systems, Yexi Jiang Mar 2015

FIU Electronic Theses and Dissertations

Many systems and applications are continuously producing events. These events are used to record the status of the system and trace the behaviors of the systems. By examining these events, system administrators can check the potential problems of these systems. If the temporal dynamics of the systems are further investigated, the underlying patterns can be discovered. The uncovered knowledge can be leveraged to predict the future system behaviors or to mitigate the potential risks of the systems. Moreover, the system administrators can utilize the temporal patterns to set up event management rules to make the system more intelligent.

With the …


Unknown Threat Detection With Honeypot Ensemble Analysis Using Big Data Security Architecture, Michael Eugene Sanders Mar 2015

Theses and Dissertations

The amount of data being generated continues to grow rapidly in size and complexity. Frameworks such as Apache Hadoop and Apache Spark are evolving at a rapid rate as organizations build data-driven applications to gain competitive advantages. Data analytics frameworks decompose problems so that applications can go beyond inference and help make predictions as well as prescriptions in real time instead of in batch processes.

Information Security is becoming more important to organizations as the Internet and cloud technologies become more integrated with their internal processes. The number of attacks and …


Using Software Defined Networking To Solve Missed Firewall Architecture In Legacy Networks, Jared Dean Vogel Mar 2015

Theses and Dissertations

This study is concerned with migrating traditional networks and their inherent firewall architecture to Software Defined Networking (SDN) architecture to provide an initial attempt at preventing application downtime due to hidden firewall domain rules. In legacy organization environments the networking engineers, firewall teams, and application analysts are often silo groups, but Software Defined Networking (SDN) can blur the lines between these group silos.

This thesis outlines, in order, the inner workings of SDN, traditional firewall architecture and how it interacts with SDN, an implementation experiment, and the resulting conclusions.

Testing with SDN shows we are approaching new environments where the edges …


Sensitivity Analysis For The Winning Algorithm In Knowledge Discovery And Data Mining (KDD) Cup Competition 2014, Fakhri Ghassan Abbas Mar 2015

Theses and Dissertations

This thesis applies multi-way sensitivity analysis to the winning algorithm in the Knowledge Discovery and Data Mining (KDD) Cup competition 2014, 'Predicting Excitement at Donors.org'. Because of the highly advanced nature of this competition, analyzing the winning solution under a variety of different conditions provides insight about each of the models the winning team used in the competition. The study follows the Cross Industry Standard Process (CRISP) for data mining to study the steps taken to prepare, model and evaluate the model. The thesis focuses on a gradient boosting model. After careful examination of the models created by the researchers …


The Symbiotic Relationship Between Information Retrieval And Informetrics, Dietmar Wolfram Mar 2015

School of Information Studies Faculty Articles

Informetrics and information retrieval (IR) represent fundamental areas of study within information science. Historically, researchers have not fully capitalized on the potential research synergies that exist between these two areas. Data sources used in traditional informetrics studies have their analogues in IR, with similar types of empirical regularities found in IR system content and use. Methods for data collection and analysis used in informetrics can help to inform IR system development and evaluation. Areas of application have included automatic indexing, index term weighting and understanding user query and session patterns through the quantitative analysis of user transaction logs. Similarly, developments …


Wikipedia And Medicine: Quantifying Readership, Editors, And The Significance Of Natural Language, James M. Heilman, Andrew G. West Mar 2015

Andrew G. West

BACKGROUND: Wikipedia is a collaboratively edited encyclopedia. One of the most popular websites on the Internet, it is known to be a frequently used source of healthcare information by both professionals and the lay public.

OBJECTIVE: This document quantifies: (1) the amount of medical content on Wikipedia, (2) the citations supporting Wikipedia’s medical content, (3) the readership of medical content, and (4) the quantity/characteristics of Wikipedia’s medical contributors.

METHODS: Using a well-defined categorization infrastructure we identify medically pertinent English Wikipedia articles and links to their foreign language equivalents (Objective 1). With these, Wikipedia’s API can be queried to produce metadata …


Leading Undergraduate Students To Big Data Generation, Jianjun Yang, Ju Shen Mar 2015

Computer Science Faculty Publications

People are facing a flood of data today. Data are being collected at unprecedented scale in many areas, such as networking, image processing, virtualization, scientific computation, and algorithms. Such huge data sets are now called Big Data. Big Data is an all-encompassing term for any collection of data sets so large and complex that it becomes difficult to process them using traditional data processing applications. In this article, the authors present a unique approach that uses a network simulator and image processing tools to train students' abilities to learn, analyze, manipulate, and apply Big Data. Thus they develop students' hands-on …


Kinesic Patterning In Deceptive And Truthful Interactions, Judee K. Burgoon, Ryan M. Schuetzler, David W. Wilson Mar 2015

Information Systems and Quantitative Analysis Faculty Publications

A persistent question in the deception literature has been the extent to which nonverbal behaviors can reliably distinguish between truth and deception. It has been argued that deception instigates cognitive load and arousal that are betrayed through visible nonverbal indicators. Yet, empirical evidence has often failed to find statistically significant or strong relationships. Given that interpersonal message production is characterized by a high degree of simultaneous and serial patterning among multiple behaviors, it may be that patterns of behaviors are more diagnostic of veracity. Or it may be that the theorized linkage between internal states of arousal, cognitive taxation, and …


Assessing The Emphasis On Information Security In The Systems Analysis And Design Course, William David Salisbury, Thomas W. Ferratt, Donald E. Wynn Mar 2015

MIS/OM/DS Faculty Publications

Due to several recent highly publicized information breaches, information security has gained a higher profile. Hence, it is reasonable to expect that information security would receive an equally significant emphasis in the education of future systems professionals. A variety of security standards that various entities (e.g., NIST, COSO, ISACA-COBIT, ISO) have put forth emphasize the importance of information security from the very beginning of the system development lifecycle (SDLC) to avoid significant redesign in later phases. To determine the emphasis on security in typical systems analysis and design (SA&D) courses, we examine (1) to what extent security is emphasized in …


Reconstruction Privacy: Enabling Statistical Learning, Ke Wang, Chao Han, Ada Waichee Fu, Raymond C. Wong, Philip S. Yu Mar 2015

Research Collection School Of Computing and Information Systems

Non-independent reasoning (NIR) allows the information about one record in the data to be learnt from the information of other records in the data. Most posterior/prior based privacy criteria consider NIR a privacy violation and require smoothing the distribution of published data to avoid sensitive NIR. The drawback of this approach is that it limits the utility of learning statistical relationships. The differential privacy criterion does not consider NIR a privacy violation and therefore enables learning statistical relationships, but at the cost of potential disclosures through NIR. A question is whether it is possible to (1) allow learning statistical relationships, …


On Efficient K-Optimal-Location-Selection Query Processing In Metric Spaces, Yunjun Gao, Shuyao Qi, Lu Chen, Baihua Zheng, Xinhan Li Mar 2015

Research Collection School Of Computing and Information Systems

This paper studies the problem of k-optimal-location-selection (kOLS) retrieval in metric spaces. Given a set D_A of customers, a set D_B of locations, a constrained region R, and a critical distance d_c, a metric kOLS (MkOLS) query retrieves k locations in D_B that are outside R but have the maximal optimality scores. Here, the optimality score of a location l ∈ D_B located outside R is defined as the number of the customers in D_A that are inside R and meanwhile have their distances to l bounded by …


Joint Search By Social And Spatial Proximity, Kyriakos Mouratidis, Jing Li, Yu Tang, Nikos Mamoulis Mar 2015

Research Collection School Of Computing and Information Systems

The diffusion of social networks introduces new challenges and opportunities for advanced services, especially so with their ongoing addition of location-based features. We show how applications like company and friend recommendation could significantly benefit from incorporating social and spatial proximity, and study a query type that captures these two-fold semantics. We develop highly scalable algorithms for its processing, and enhance them with elaborate optimizations. Finally, we use real social network data to empirically verify the efficiency and efficacy of our solutions.


Beyond Support And Confidence: Exploring Interestingness Measures For Rule-Based Specification Mining, Bui Tien Duy Le, David Lo Mar 2015

Research Collection School Of Computing and Information Systems

Numerous rule-based specification mining approaches have been proposed in the literature. Many of these approaches analyze a set of execution traces to discover interesting usage rules, e.g., whenever lock() is invoked, eventually unlock() is invoked. These techniques often generate and enumerate a set of candidate rules and compute some interestingness scores. Rules whose interestingness scores are above a certain threshold are then output. In past studies, two well-known measures, support and confidence, are often used to compute these scores. However, aside from these two, many other interestingness measures have been proposed. It is thus unclear …
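
To illustrate how support and confidence might be computed for a rule such as "whenever lock() is invoked, eventually unlock() is invoked", here is a minimal sketch. It is not the authors' tool. It assumes one common trace-level definition: support is the number of traces in which the premise occurs and every occurrence is eventually followed by the consequent, and confidence is that count divided by the number of traces containing the premise. The function names are hypothetical.

```python
from typing import List, Sequence, Tuple


def rule_holds(trace: Sequence[str], pre: str, post: str) -> bool:
    """True if every occurrence of `pre` is eventually followed by `post`."""
    pending = False
    for event in trace:
        if event == pre:
            pending = True   # an obligation is opened
        elif event == post:
            pending = False  # all open obligations are discharged
    return not pending


def support_confidence(
    traces: List[Sequence[str]], pre: str, post: str
) -> Tuple[int, float]:
    """Trace-level support and confidence of the rule pre -> eventually post."""
    containing = [t for t in traces if pre in t]
    satisfied = sum(1 for t in containing if rule_holds(t, pre, post))
    confidence = satisfied / len(containing) if containing else 0.0
    return satisfied, confidence


traces = [
    ["lock", "read", "unlock"],
    ["lock", "read"],                  # lock is never released
    ["open", "close"],                 # premise does not occur
    ["lock", "unlock", "lock", "unlock"],
]
support, confidence = support_confidence(traces, "lock", "unlock")
```

A miner in this style would keep the rule only if both scores clear chosen thresholds. Other interestingness measures could be substituted at the point where `confidence` is computed.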


Nirmal: Automatic Identification Of Software Relevant Tweets Leveraging Language Model, Abishek Sharma, Yuan Tian, David Lo Mar 2015

Research Collection School Of Computing and Information Systems

Twitter is one of the most widely used social media platforms today. It enables users to share and view short 140-character messages called 'tweets'. About 284 million active users generate close to 500 million tweets per day. Such rapid generation of user generated content in large magnitudes results in the problem of information overload. Users who are interested in information related to a particular domain have limited means to filter out irrelevant tweets and tend to get lost in the huge amount of data they encounter. A recent study by Singer et al. found that software developers use Twitter to …


Prediction Of Venues In Foursquare Using Flipped Topic Models, Wen Haw Chong, Bing Tian Dai, Ee Peng Lim Mar 2015

Research Collection School Of Computing and Information Systems

Foursquare is a highly popular location-based social platform, where users indicate their presence at venues via check-ins and/or provide venue-related tips. On Foursquare, we explore Latent Dirichlet Allocation (LDA) topic models for venue prediction: predicting venues that a user is likely to visit, given the user's history of other visited venues. However, we depart from prior works which regard the users as documents and their visited venues as terms. Instead, we ‘flip’ LDA models such that we regard venues as documents that attract users, which are now the terms. Flipping is simple and requires no changes to the LDA mechanism. Yet …


A Web-Based Temperature Monitoring System For The College Of Arts And Letters, Rigoberto Solorio Mar 2015

Electronic Theses, Projects, and Dissertations

In general, server rooms have restricted access requiring that staff possess access codes, keys, etc. Normally, only administrators are provided access to protect the physical hardware and the data stored in the servers. Servers also have firewalls to restrict outsiders from accessing them via the Internet. Servers also cost a lot of money. For this reason, server rooms also need to be protected against overheating. This will prolong the lifecycle of the units and can prevent data loss from hardware failure.

At California State University, San Bernardino (CSUSB), the College of Arts and Letters server room specifically has faced power …


A Study Of Out-Of-Turn Interaction In Menu-Based, IVR, Voicemail Systems, Saverio Perugini, Taylor J. Anderson, William F. Moroney Feb 2015

William F. Moroney

We present the first user study of out-of-turn interaction in menu-based, interactive voice-response systems. Out-of-turn interaction is a technique which empowers the user (unable to respond to the current prompt) to take the conversational initiative by supplying information that is currently unsolicited, but expected later in the dialog. The technique permits the user to circumvent any flows of navigation hardwired into the design and navigate the menus in a manner which reflects their model of the task. We conducted a laboratory experiment to measure the effect of the use of out-of-turn interaction on user performance and preference in a …


Collaborative Research: HECURA: A New Semantic-Aware Metadata Organization For Improved File-System Performance And Functionality In High-End Computing, Yifeng Zhu Feb 2015

University of Maine Office of Research Administration: Grant Reports

Existing data storage systems based on the hierarchical directory-tree organization do not meet the scalability and functionality requirements for exponentially growing datasets and increasingly complex metadata queries in large-scale Exabyte-level file systems with billions of files. This project focuses on a new decentralized semantic-aware metadata organization that exploits semantics of file metadata to improve system scalability, reduce query latency for complex data queries, and enhance file system functionality.

The research has four major components:

1) exploit metadata semantic-correlation to organize metadata in a scalable way,

2) exploit the semantic and scalable nature of the new metadata organization to significantly speed …


Smart Data - How You And I Will Exploit Big Data For Personalized Digital Health And Many Other Activities, Amit P. Sheth Feb 2015

Kno.e.sis Publications

No abstract provided.


Hole Detection And Shape-Free Representation And Double Landmarks Based Geographic Routing In Wireless Sensor Networks, Jianjun Yang, Zongming Fei, Ju Shen Feb 2015

Computer Science Faculty Publications

In wireless sensor networks, an important issue in geographic routing is the “local minimum” problem, which is caused by a “hole” that blocks the greedy forwarding process. Existing geographic routing algorithms use perimeter routing strategies to find a long detour path when such a situation occurs. To avoid the long detour path, recent research focuses on detecting the hole in advance, so that the nodes located on the boundary of the hole can advertise the hole information to the nodes near the hole. Hence the long detour path can be avoided in future routing. We propose a heuristic hole detecting algorithm which identifies …


On Using Synthetic Social Media Stimuli In An Emergency Preparedness Functional Exercise, Andrew Hampton, Shreyansh Bhatt, Gary Alan Smith, Jeremy S. Brunn, Hemant Purohit, Valerie L. Shalin, John M. Flach, Amit P. Sheth Feb 2015

Kno.e.sis Publications

This paper details the creation and use of a massive (over 32,000 messages) artificially constructed 'Twitter' microblog stream for a regional emergency preparedness functional exercise. By combining microblog conversion, manual production, and a control set, we created a web based information stream providing valid, misleading, and irrelevant information to public information officers (PIOs) representing hospitals, fire departments, the local Red Cross, and city and county government officials. PIOs searched, monitored, and (through conventional channels) verified potentially actionable information that could then be redistributed through a personalized screen name. Our case study of a key PIO reveals several capabilities that social …


A Simulation-Based Approach To Solve A Specific Type Of Chance Constrained Optimization, Lijian Chan Feb 2015

MIS/OM/DS Faculty Publications

We solve chance constrained optimization problems with a convex feasible set by approximating the chance constraint with another convex smooth function. The approximation is based on the numerical properties of the Bernstein polynomial, which is capable of effectively controlling the approximation error for both function value and gradient. Thus, we adopt a first-order algorithm to reach a satisfactory solution which is expected to be optimal. When the explicit expression of the joint distribution is not available, we use a Monte Carlo approach to numerically evaluate the chance constraint to obtain an optimal solution by probability. Numerical results for known problem instances are …


Empyreal Radiance: An Application Of Sonification In The Field Of Astrophysics, Ryan Loth Feb 2015

Undergraduate Distinction Papers

Broadly, this paper discusses the application of sonification and its potential for increasing knowledge. The paper is broken up into three sections: the theory of sonification, sonification for artistic purposes, and lastly an extensive look at one process of sonification dealing with solar winds in space. Concerning the theory of sonification, the paper will delve into the process of sonification and ask questions about its limitations as well. The second section discusses how sonification is a way to build the curiosity of not just scientists, but also the general public. The final section addresses my composition Empyreal Radiance …


SimApp: A Framework For Detecting Similar Mobile Applications By Online Kernel Learning, Ning Chen, Steven C. H. Hoi, Shaohua Li, Xiaokui Xiao Feb 2015

Research Collection School Of Computing and Information Systems

With the popularity of smart phones and mobile devices, the number of mobile applications (a.k.a. "apps") has been growing rapidly. Detecting semantically similar apps from a large pool of apps is a basic and important problem, as it is beneficial for various applications, such as app recommendation, app search, etc. However, there is no systematic and comprehensive work so far that focuses on addressing this problem. In order to fill this gap, in this paper, we explore multi-modal heterogeneous data in app markets (e.g., description text, images, user reviews, etc.), and present "SimApp" -- a novel framework for detecting similar …


Use Of A High-Value Social Audience Index For Target Audience Identification On Twitter, Siaw Ling Lo, David Cornforth, Raymond Chiong Feb 2015

Research Collection School Of Computing and Information Systems

With the large and growing user base of social media, it is no easy feat to identify potential customers for a business. This is mainly due to the challenge of extracting commercially viable content from the vast amount of free-form conversations. In this paper, we analyse the Twitter content of an account owner and its list of followers through various text mining methods and segment the list of followers via an index. We term this index the High-Value Social Audience (HVSA) index. This HVSA index enables a company or organisation to devise their marketing and engagement plan according …


Privacycanary: Privacy-Aware Recommenders With Adaptive Input Obfuscation, Thivya Kandappu, Arik Friedman, Roksan Borelli, Vijay Sivaraman Feb 2015

Research Collection School Of Computing and Information Systems

Recommender systems are widely used by online retailers to promote products and content that are most likely to be of interest to a specific customer. In such systems, users often implicitly or explicitly rate products they have consumed, and some form of collaborative filtering is used to find other users with similar tastes to whom the products can be recommended. While users can benefit from more targeted and relevant recommendations, they are also exposed to greater risks of privacy loss, which can lead to undesirable financial and social consequences. The use of obfuscation techniques to preserve the privacy of user …