Open Access. Powered by Scholars. Published by Universities.®

Physical Sciences and Mathematics Commons

Databases and Information Systems

Full-Text Articles in Physical Sciences and Mathematics

Dark Hazard: Large-Scale Discovery Of Unknown Hidden Sensitive Operations In Android Apps, Xiaorui Pan, Xueqiang Wang, Yue Duan, Xiaofeng Wang, Heng Yin Mar 2017

Research Collection School Of Computing and Information Systems

Hidden sensitive operations (HSO) such as stealing private user data upon receiving an SMS message are increasingly utilized by mobile malware and other potentially-harmful apps (PHAs) to evade detection. Identification of such behaviors is hard, due to the challenge in triggering them during an app’s runtime. Current static approaches rely on the trigger conditions or hidden behaviors known beforehand and therefore cannot capture previously unknown HSO activities. Also, these techniques tend to be computationally intensive and therefore less suitable for analyzing a large number of apps. As a result, our understanding of real-world HSO today is still limited, not to …


Social Tag Relevance Learning Via Ranking-Oriented Neighbor Voting, Chaoran Cui, Jialie Shen, Jun Ma, Tao Lian Mar 2017

Research Collection School Of Computing and Information Systems

High-quality tags play a critical role in applications involving online multimedia search, such as social image annotation, sharing and browsing. However, user-generated tags in the real world are often too imprecise and incomplete to describe the image contents, which severely degrades the performance of current search systems. To improve the descriptive powers of social tags, a fundamental issue is tag relevance learning, which concerns how to interpret the relevance of a tag with respect to the contents of an image effectively. In this paper, we investigate the problem from a new perspective of learning to rank, and develop a novel approach …


Scalable Image Retrieval By Sparse Product Quantization, Qingqun Ning, Jianke Zhu, Zhiyuan Zhong, Steven C. H. Hoi, Chun Chen Mar 2017

Research Collection School Of Computing and Information Systems

Fast approximate nearest neighbor (ANN) search technique for high-dimensional feature indexing and retrieval is the crux of large-scale image retrieval. A recent promising technique is product quantization, which attempts to index high-dimensional image features by decomposing the feature space into a Cartesian product of low-dimensional subspaces and quantizing each of them separately. Despite the promising results reported, their quantization approach follows the typical hard assignment of traditional quantization methods, which may result in large quantization errors, and thus, inferior search performance. Unlike the existing approaches, in this paper, we propose a novel approach called sparse product quantization (SPQ) to encoding …


Infographics: A Practical Guide For Librarians, Darren Sweeper Feb 2017

Sprague Library Scholarship and Creative Works

No abstract provided.


Intelligent Web Crawler For Semantic Search Engine, Shujia Zhang Feb 2017

Master's Projects

A Semantic Search Engine (SSE) is a program that produces semantic-oriented concepts from the Internet. A web crawler is the front end of our SSE; its primary goal is to supply important and necessary information to the data analysis component of SSE. The main function of the analysis component is to produce the concepts (moderately frequent finite sequences of keywords) from the input; it uses some variants of TF-IDF as a primary tool to remove stop words. However, it is a very expensive way to filter out stop words using the idea of TF-IDF. The goal of this project is …


Modeling Adoption Dynamics In Social Networks, Minh Duc Luu Feb 2017

Dissertations and Theses Collection

This dissertation studies the modeling of user-item adoption dynamics where an item can be an innovation, a piece of contagious information or a product. By “adoption dynamics” we refer to the process of users making decision choices to adopt items based on a variety of user and item factors. In the context of social networks, “adoption dynamics” is closely related to “item diffusion”. When a user in a social network adopts an item, she may influence her network neighbors to adopt the item. Those neighbors of her who adopt the item then continue to trigger more adoptions. As this progress …


Streaming Classification With Emerging New Class By Class Matrix Sketching, Xin Mu, Feida Zhu, Juan Du, Ee-Peng Lim, Zhi-Hua Zhou Feb 2017

Research Collection School Of Computing and Information Systems

Streaming classification with emerging new class is an important problem of great research challenge and practical value. In many real applications, the task often needs to handle large-matrix issues, such as textual data in the bag-of-words model and large-scale image analysis. However, the methodologies and approaches adopted by the existing solutions, most of which involve massive distance calculation, have so far fallen short of successfully addressing a real-time requested task. In this paper, the proposed method dynamically maintains two low-dimensional matrix sketches to 1) detect emerging new classes; 2) classify known classes; and 3) update the model in the …


Collaboration Trumps Homophily In Urban Mobile Crowd-Sourcing, Thivya Kandappu, Archan Misra, Randy Tandriansyah Daratan Feb 2017

Research Collection School Of Computing and Information Systems

This paper establishes the power of dynamic collaborative task completion among workers for urban mobile crowdsourcing. Collaboration is defined via the notion of peer referrals, whereby a worker who has accepted a location-specific task, but is unlikely to visit that location, offloads the task to a willing friend. Such a collaborative framework might be particularly useful for task bundles, especially for bundles that have higher geographic dispersion. The challenge, however, comes from the high similarity observed in the spatiotemporal pattern of task completion among friends. Using extensive real-world crowd-sourcing studies conducted over 7 weeks and 1000+ workers on a campus-based …


Harnessing Twitter To Support Serendipitous Learning Of Developers, Abhishek Sharma, Yuan Tian, Agus Sulistya, David Lo, Aiko Yamashita Feb 2017

Research Collection School Of Computing and Information Systems

Developers often rely on various online resources, such as blogs, to keep themselves up-to-date with the fast pace at which software technologies are evolving. Singer et al. found that developers tend to use channels such as Twitter to keep themselves updated and support learning, often in an undirected or serendipitous way, coming across things that they may not apply presently, but which should be helpful in supporting their developer activities in future. However, identifying relevant and useful articles among the millions of pieces of information shared on Twitter is a non-trivial task. In this work to support serendipitous discovery of …


Crowdsensing And Analyzing Micro-Event Tweets For Public Transportation Insights, Thoong Hoang, Pei Hua (Xu Peihua) Cher, Philips Kokoh Prasetyo, Ee-Peng Lim Feb 2017

Research Collection School Of Computing and Information Systems

An efficient and commuter-friendly public transportation system is a critical part of a thriving and sustainable city. As cities experience fast-growing resident populations, their public transportation systems will have to cope with more demands for improvements. In this paper, we propose a crowdsensing and analysis framework to gather and analyze real-time commuter feedback from Twitter. We perform a series of text mining tasks: identifying feedback comments that capture bus-related micro-events; extracting relevant entities; and predicting event and sentiment labels. We conduct a series of experiments involving more than 14K labeled tweets. The experiments show that incorporating domain knowledge …


Why And How Developers Fork What From Whom In GitHub, Jing Jiang, David Lo, Jiahuan He, Xin Xia, Pavneet Singh Kochhar, Li Zhang Feb 2017

Research Collection School Of Computing and Information Systems

Forking is the creation of a new software repository by copying another repository. Though forking is controversial in the traditional open source software (OSS) community, it is encouraged and is a built-in feature in GitHub. Developers freely fork repositories, use the code as their own and make changes. A deep understanding of repository forking can provide important insights for the OSS community and GitHub. In this paper, we explore why and how developers fork what from whom in GitHub. We collect a dataset containing 236,344 developers and 1,841,324 forks. We conduct surveys, and analyze programming languages and owners of forked repositories. Our main …


Active Video Summarization: Customized Summaries Via On-Line Interaction With The User, Ana Garcia Del Molino, Xavier Boix, Joo-Hwee Lim, Ah-Hwee Tan Feb 2017

Research Collection School Of Computing and Information Systems

To facilitate the browsing of long videos, automatic video summarization provides an excerpt that represents their content. In the case of egocentric and consumer videos, due to their personal nature, adapting the summary to a specific user’s preferences is desirable. Current approaches to customizable video summarization obtain the user’s preferences prior to the summarization process. As a result, the user needs to manually modify the summary to further meet the preferences. In this paper, we introduce Active Video Summarization (AVS), an interactive approach to gather the user’s preferences while creating the summary. AVS asks questions about the summary to update it …


Using An Online Tutorial To Teach REA Data Modeling In Accounting Information Systems Courses, Poh Sun Seow, Gary Pan Feb 2017

Research Collection School Of Accountancy

Online learning has been gaining widespread adoption due to its success in enhancing student-learning outcomes and improving student academic performance. This paper describes an online tutorial to teach resource-event-agent (REA) data modeling in an undergraduate accounting information systems course. The REA online tutorial reflects a self-study application designed to help students improve their understanding of the REA data model. As such, the tutorial acts as a supplement to lectures by reinforcing the concepts and incorporating practices to assess student understanding. Instructors can access the REA online tutorial at http://smu.asg/rea. An independent survey by the University’s Centre for Teaching Excellence found a significant increase in students’ perceived knowledge of REA …


Detecting Similar Repositories On GitHub, Yun Zhang, David Lo, Pavneet Singh Kochhar, Xin Xia, Quanlai Li, Jianling Sun Feb 2017

Research Collection School Of Computing and Information Systems

GitHub contains millions of repositories, many of which are similar to one another (i.e., having similar source code or implementing similar functionalities). Finding similar repositories on GitHub can be helpful for software engineers as it can help them reuse source code, build prototypes, identify alternative implementations, explore related projects, find projects to contribute to, and discover code theft and plagiarism. Previous studies have proposed techniques to detect similar applications by analyzing API usage patterns and software tags. However, these prior studies either only make use of a limited source of information or use information not available for projects on GitHub. …


Attribute-Based Secure Messaging In The Public Cloud, Zhi Yuan Poh, Hui Cui, Robert H. Deng, Yingjiu Li Feb 2017

Research Collection School Of Computing and Information Systems

Messaging systems operating within the public cloud are gaining popularity. To protect message confidentiality from the public cloud including the public messaging servers, we propose to encrypt messages in messaging systems using Attribute-Based Encryption (ABE). ABE is a one-to-many public key encryption system in which data are encrypted with access policies and only users with attributes that satisfy the access policies can decrypt the ciphertexts, and hence is considered a promising solution for realizing expressive and fine-grained access control of encrypted data in public servers. Our proposed system, called Attribute-Based Secure Messaging System with Outsourced Decryption (ABSM-OD), has three …


SOAL: Second-Order Online Active Learning, Shuji Hao, Peilin Zhao, Jing Lu, Steven C. H. Hoi, Chunyan Miao, Chi Zhang Feb 2017

Research Collection School Of Computing and Information Systems

This paper investigates the problem of online active learning for training classification models from sequentially arriving data. This is more challenging than conventional online learning tasks since the learner not only needs to figure out how to effectively update the classifier but also needs to decide when is the best time to query the label of an incoming instance given a limited label budget. The existing online active learning approaches are often based on first-order online learning methods, which generally suffer from slow convergence rates and suboptimal exploitation of available information when querying the labeled data. To overcome the limitations, …


Recurrent Neural Networks With Auxiliary Labels For Cross-Domain Opinion Target Extraction, Ying Ding, Jianfei Yu, Jing Jiang Feb 2017

Research Collection School Of Computing and Information Systems

Opinion target extraction is a fundamental task in opinion mining. In recent years, neural network based supervised learning methods have achieved competitive performance on this task. However, as with any supervised learning method, neural network based methods for this task cannot work well when the training data comes from a different domain than the test data. On the other hand, some rule-based unsupervised methods have been shown to be robust when applied to different domains. In this work, we use rule-based unsupervised methods to create auxiliary labels and use neural network models to learn a hidden representation that works well for …


Unsupervised Visual Hashing With Semantic Assistant For Content-Based Image Retrieval, Lei Zhu, Jialie Shen, Liang Xie, Zhiyong Cheng Feb 2017

Research Collection School Of Computing and Information Systems

As an emerging technology to support scalable content-based image retrieval (CBIR), hashing has recently received great attention and become a very active research domain. In this study, we propose a novel unsupervised visual hashing approach called semantic-assisted visual hashing (SAVH). Distinguished from semi-supervised and supervised visual hashing, its core idea is to effectively extract the rich semantics latently embedded in auxiliary texts of images to boost the effectiveness of visual hashing without any explicit semantic labels. To achieve the target, a unified unsupervised framework is developed to learn hash codes by simultaneously preserving visual similarities of images, integrating the semantic …


Maximizing The Probability Of Arriving On Time: A Practical Q-Learning Method, Zhiguang Cao, Hongliang Guo, Jie Zhang, Frans Oliehoek, Ulrich Fastenrath Feb 2017

Research Collection School Of Computing and Information Systems

The stochastic shortest path problem is of crucial importance for the development of sustainable transportation systems. Existing methods based on the probability tail model seek the path that maximizes the probability of arriving at the destination before a deadline. However, they suffer from low accuracy and/or high computational cost. We design a novel Q-learning method where the converged Q-values have a practical meaning as the actual probabilities of arriving on time, so as to improve accuracy. By further adopting dynamic neural networks to learn the value function, our method can scale well to large road networks with arbitrary deadlines. …


Discovering Burst Patterns Of Burst Topic In Twitter, Guozhong Dong, Wu Yang, Feida Zhu, Wei Wang Feb 2017

Research Collection School Of Computing and Information Systems

Twitter has become one of the largest social networks for users to broadcast burst topics. There have been many studies on how to detect burst topics. However, mining burst patterns in burst topics has not been solved by the existing works. In this paper, we investigate the problem of mining burst patterns of burst topics in Twitter. A burst topic user graph model is proposed, which can represent the topology structure of burst topic propagation across a large number of Twitter users. Based on the model, hierarchical clustering is applied to cluster burst topics and reveal burst patterns from the macro …


Assessing Differences Between Physician's Realized And Anticipated Gains From Electronic Health Record Adoption, Lori T. Peterson, Eric W. Ford, John Eberhardt, T. R. Huerta Jan 2017

Lori Peterson

Return on investment (ROI) concerns related to Electronic Health Records (EHRs) are a major barrier to the technology’s adoption. Physicians generally rely upon early adopters to vet new technologies prior to putting them into widespread use. Therefore, early adopters’ experiences with EHRs play a major role in determining future adoption patterns. The paper’s purposes are: (1) to map the EHR value streams that define the ROI calculation; and (2) to compare Current Users’ and Intended Adopters’ perceived value streams to identify similarities, differences and governing constructs. Primary data was collected by the Texas Medical Association, which surveyed 1,772 physicians on …


Let’s Try Something New: Service Learning In Boise State's Computer Science Department, Daniel Kondratyuk Jan 2017

International Journal of Undergraduate Community Engagement

In this article I explain how a group of Computer Science students at Boise State University participated in a new service learning project. I provide a few testimonials on the students’ experiences and describe the rewarding aspects of service learning in the greater Computer Science community.


Knowledge Activation For Patient Centered Care: Bridging The Health Information Technology Divide, Sajda Qureshi, Cherie Notebloom Jan 2017

Information Systems and Quantitative Analysis Faculty Proceedings & Presentations

The provision of healthcare is a collaborative process. It follows evidence based treatments which are becoming increasingly data driven and focusing on the best clinical outcomes. Patient centered care requires participation of patients in the decision making of the best treatment options. Healthcare provision requires both evidence based and patient centered care. In practice, these two perspectives conflict with each other due to the use of an information technology designed primarily for billing purposes. Using the knowledge activation framework developed by Qureshi and Keen [25], we analyze data from two hospitals in the Midwest that aim to achieve quality of …


Understanding The Role Of Information Technology In The Development Of Micro-Enterprises: Concepts To Study In Making A Better World, Sajda Qureshi, Jason Jie Xiong Jan 2017

Information Systems and Quantitative Analysis Faculty Proceedings & Presentations

The concept of Development has eluded scholars and practitioners as information technology becomes prevalent. The majority of research in the Information Technology for Development (ICT4D) field is considered to be practice intended to make the world better with Information and Communications Technologies (ICTs). In addition, a majority of well-intentioned ICT4D projects tend to fail, often due to unrealistic expectations set by development agencies responding to their political objectives. At the same time, Information Systems (IS) research is rife with well-studied concepts that do little to make a better world. This paper investigates ICT interventions in three case studies of micro-enterprises …


An Annotated Corpus With Nanomedicine And Pharmacokinetic Parameters, Nastassja Lewinski, Ivan Jimenez, Bridget McInnes Jan 2017

Chemical and Life Science Engineering Publications

A vast amount of data on nanomedicines is being generated and published, and natural language processing (NLP) approaches can automate the extraction of unstructured text-based data. Annotated corpora are a key resource for NLP and information extraction methods which employ machine learning. Although corpora are available for pharmaceuticals, resources for nanomedicines and nanotechnology are still limited. To foster nanotechnology text mining (NanoNLP) efforts, we have constructed a corpus of annotated drug product inserts taken from the US Food and Drug Administration’s Drugs@FDA online database. In this work, we present the development of the Engineered Nanomedicine Database corpus to support the …


Health And Safety Monitoring System, Cailin Simpson Jan 2017

Summer Community of Scholars Posters (RCEU and HCR Combined Programs)

No abstract provided.


Parsing MetaMap Files In Hadoop, Amy Olex, Alberto Cano, Bridget T. McInnes Jan 2017

Computer Science Publications

The UMLS::Association CUICollector module identifies UMLS Concept Unique Identifier bigrams and their frequencies in a biomedical text corpus. CUICollector was re-implemented in Hadoop MapReduce to improve algorithm speed, flexibility, and scalability. Evaluation of the Hadoop implementation compared to the serial module produced equivalent results and achieved a 28x speedup on a single-node Hadoop system.


Data Mining By Grid Computing In The Search For Extrasolar Planets, Oisin Creaner [Thesis] Jan 2017

Doctoral

A system is presented here to provide improved precision in ensemble differential photometry. This is achieved by using the power of grid computing to analyse astronomical catalogues. This produces new catalogues of optimised pointings for each star, which maximise the number and quality of reference stars available. Astronomical phenomena such as exoplanet transits and small-scale structure within quasars may be observed by means of millimagnitude photometric variability on the timescale of minutes to hours. Because of atmospheric distortion, ground-based observations of these phenomena require the use of differential photometry whereby the target is compared with one or more reference stars. …


Essays On Software Development Projects: Impact Of Social And Technological Factors On Project Performance And Co-Diffusion Of Software Sourcing Arrangements, Niharika Dayyala Jan 2017

Open Access Theses & Dissertations

Software development is a complicated process which is accomplished by the combined effort of the key elements such as people, processes, and technology. Software development firms are on a constant quest to identify the best practices for managing the key elements to improve their software development project outcomes and to deliver successful software projects. This Dissertation aims to study key elements that can influence the software development project performance through three distinctive studies. The first essay of this Dissertation investigates the impact of people factors on software project outcomes. Specifically, it examines the impact of team characteristics on the software …


Exploring Strategies For Implementing Data Governance Practices, Ashley Cave Jan 2017

Walden Dissertations and Doctoral Studies

Data governance reaches across the field of information technology and is increasingly important for big data efforts, regulatory compliance, and ensuring data integrity. The purpose of this qualitative case study was to explore strategies for implementing data governance practices. This study was guided by institutional theory as the conceptual framework. The study's population consisted of informatics specialists from a small hospital, which is also a research institution in the Washington, DC, metropolitan area. This study's data collection included semistructured, in-depth individual interviews (n = 10), focus groups (n = 3), and the analysis of organizational documents (n = 19). …