Open Access. Powered by Scholars. Published by Universities.®

Physical Sciences and Mathematics Commons

Databases and Information Systems

Articles 1561 - 1590 of 6720

Full-Text Articles in Physical Sciences and Mathematics

Visual Commonsense Representation Learning Via Causal Inference, Tan Wang, Jianqiang Huang, Hanwang Zhang, Qianru Sun Jun 2020

Research Collection School Of Computing and Information Systems

We present a novel unsupervised feature representation learning method, Visual Commonsense Region-based Convolutional Neural Network (VC R-CNN), to serve as an improved visual region encoder for high-level tasks such as captioning and VQA. Given a set of detected object regions in an image (e.g., using Faster R-CNN), the proxy training objective of VC R-CNN, like that of other unsupervised feature learning methods (e.g., word2vec), is to predict the contextual objects of a region. However, they are fundamentally different: VC R-CNN makes its prediction using causal intervention, P(Y|do(X)), while the others use the conventional likelihood, P(Y|X). We extensively apply …
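
The contrast between the two quantities in this abstract can be made concrete with the standard backdoor adjustment. This is a sketch under an assumption not stated in the truncated abstract: a discrete set of confounders z (a symbol introduced here for illustration) that satisfies the backdoor criterion for X and Y.

```latex
\begin{align}
P(Y \mid X) &= \sum_{z} P(Y \mid X, z)\, P(z \mid X) \\
P(Y \mid do(X)) &= \sum_{z} P(Y \mid X, z)\, P(z)
\end{align}
```

The two sums differ only in how each z is weighted: the likelihood weights z by how often it co-occurs with X, while the intervention weights z by its overall prior, so z can no longer bias the prediction through its association with X.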


NTIRE 2020 Challenge On Video Quality Mapping: Methods And Results, D. Fuoli, Zhiwu Huang, M. Danelljan, R. Timofte, H. Wang, L. Jin, D. Su, J. Liu, J. Lee, M. Kudelski, L. Bala, D. Hryboy, M. Mozejko, M. Li, S. Li, B. Pang, C. Lu, Li C., He D., Li F. Jun 2020

Research Collection School Of Computing and Information Systems

This paper reviews the NTIRE 2020 challenge on video quality mapping (VQM), which addresses the issues of quality mapping from source video domain to target video domain. The challenge includes both a supervised track (track 1) and a weakly-supervised track (track 2) for two benchmark datasets. In particular, track 1 offers a new Internet video benchmark, requiring algorithms to learn the map from more compressed videos to less compressed videos in a supervised training manner. In track 2, algorithms are required to learn the quality mapping from one device to another when their quality varies substantially and weakly-aligned video pairs …


GPU-Accelerated Subgraph Enumeration On Partitioned Graphs, Wentian Guo, Yuchen Li, Mo Sha, Bingsheng He, Xiaokui Xiao, Kian-Lee Tan Jun 2020

Research Collection School Of Computing and Information Systems

Subgraph enumeration is important for many applications such as network motif discovery and community detection. Recent works utilize graphics processing units (GPUs) to parallelize subgraph enumeration, but they can only handle graphs that fit into the GPU memory. In this paper, we propose a new approach for GPU-accelerated subgraph enumeration that can efficiently scale to large graphs beyond the GPU memory. Our approach divides the graph into partitions, each of which fits into the GPU memory. The GPU processes one partition at a time and searches the matched subgraphs of a given pattern (i.e., instances) within the partition as in …


Mining User-Generated Content Of Mobile Patient Portal: Dimensions Of User Experience, Mohammad Al-Ramahi, Cherie Noteboom Jun 2020

Research & Publications

Patient portals are positioned as a central component of patient engagement through the potential to change the physician-patient relationship and enable chronic disease self-management. The incorporation of patient portals promises to deliver excellent quality at optimized costs while improving the health of the population. This study extends the existing literature by extracting dimensions related to mobile patient portal use. We use a topic modeling approach to systematically analyze users’ feedback from the actual use of a common mobile patient portal, Epic’s MyChart. Comparing results of Latent Dirichlet Allocation analysis with those of human analysis validated the extracted …


Improved Chinese Language Processing For An Open Source Search Engine, Xianghong Sun May 2020

Master's Projects

Natural Language Processing (NLP) is the process of computers analyzing human languages. NLP spans many areas, including speech recognition, natural language understanding, and natural language generation.

Information retrieval and natural language processing for Asian languages have their own unique set of challenges not present for Indo-European languages. Some of these are text segmentation, named entity recognition in unsegmented text, and part-of-speech tagging. In this report, we describe our implementation of and experiments with improving the Chinese language processing sub-component of an open source search engine, Yioop. In particular, we rewrote …


An Inventory Of Existing Neuroprivacy Controls, Dustin Steinhagen, Houssain Kettani May 2020

Research & Publications

Brain-Computer Interfaces (BCIs) facilitate communication between brains and computers. As these devices become increasingly popular outside of the medical context, research interest in brain privacy risks and countermeasures has bloomed. Several neuroprivacy threats have been identified in the literature, including brain malware, personal data contained in collected brainwaves, and the inadequacy of legal regimes with regard to neural data protection. Dozens of controls have been proposed or implemented for protecting neuroprivacy, although it has not been immediately apparent what the landscape of neuroprivacy controls consists of. This paper inventories the implemented and proposed neuroprivacy risk mitigation techniques from open …


Benchmarking MongoDB Multi-Document Transactions In A Sharded Cluster, Tushar Panpaliya May 2020

Master's Projects

Relational databases like Oracle, MySQL, and Microsoft SQL Server offer transaction processing as an integral part of their design. These databases have been a primary choice among developers for business-critical workloads that need the highest form of consistency. On the other hand, the distributed nature of NoSQL databases makes them suitable for scenarios needing scalability, faster data access, and flexible schema design. Recent developments in the NoSQL database community show that NoSQL databases have started to incorporate transactions in their drivers to let users work on business-critical scenarios without compromising the power of distributed NoSQL features [1].

MongoDB is …


Server Score, Zachary Buresh May 2020

Student Academic Conference

This presentation covers "Server Score", an Android mobile application that I developed in the Kotlin programming language. The app helps waiters and waitresses calculate, track, and predict performance-related statistics on the job.


Predictive Modeling Of Asynchronous Event Sequence Data, Jin Shang May 2020

LSU Doctoral Dissertations

Large volumes of temporal event data, such as online check-ins and electronic records of hospital admissions, are becoming increasingly available in a wide variety of applications including healthcare analytics, smart cities, and social network analysis. These temporal events are often asynchronous, interdependent, and self-exciting. For example, among a patient's diagnosis events, a patient who has recently been at risk remains at elevated risk. Machine learning that leverages event sequence data can improve the prediction accuracy of future events and provide valuable services. For example, in e-commerce and network traffic diagnosis, the analysis of user activities can be …
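
The "self-exciting" property mentioned in this abstract is commonly modeled with a Hawkes process, in which each past event temporarily raises the rate of future events. The sketch below is a generic illustration and is not taken from the dissertation; the function name, the exponential kernel, and the parameter values `mu`, `alpha`, and `beta` are all assumptions chosen for the example.

```python
import math


def hawkes_intensity(t, history, mu=0.1, alpha=0.5, beta=1.0):
    """Conditional intensity of a Hawkes process with an exponential kernel.

    mu    : baseline event rate (hypothetical value)
    alpha : jump in intensity caused by each past event (hypothetical value)
    beta  : decay rate of that excitation (hypothetical value)
    Only events strictly before time t contribute.
    """
    excitation = sum(
        alpha * math.exp(-beta * (t - t_i)) for t_i in history if t_i < t
    )
    return mu + excitation


# Illustrative event times, e.g. hospital admissions for one patient.
admissions = [1.0, 1.2, 4.0]

shortly_after = hawkes_intensity(1.3, admissions)   # soon after two events
long_after = hawkes_intensity(3.9, admissions)      # excitation has decayed
no_history = hawkes_intensity(3.9, [])              # baseline only
```

With these values the intensity is highest right after a burst of events and decays back toward the baseline `mu`, which mirrors the abstract's example of elevated risk for a patient who was recently at risk.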


Building A Library Search Infrastructure With Elasticsearch, Kim Pham, Fernando Reyes, Jeff Rynhart May 2020

University Libraries: Faculty Scholarship

This article discusses our implementation of an Elastic cluster to address our search, search administration, and indexing needs, describes how it integrates into our technology infrastructure, and finally takes a close look at the way that we built a reusable, dynamic search engine that powers our digital repository search. We cover the lessons learned from our early implementations and how to address them to lay the groundwork for a scalable, networked search environment that can also be applied to alternative search engines such as Solr.


Improved User News Feed Customization For An Open Source Search Engine, Timothy Chow May 2020

Master's Projects

Yioop is an open source search engine project hosted on the site of the same name. It offers several features outside of searching, with one such feature being a news feed. The current news feed system aggregates articles from a curated list of news sites determined by the owner. However, in its current state, the feed list is limited in size, constrained by the hardware that the aggregator is run on. The goal of my project was to overcome this limit by improving the current storage method used. The solution was derived by making use of IndexArchiveBundles and IndexShards, both of …


Developing A MongoDB Monitoring System Using NoSQL Databases For Monitored Data Management, Anjitha Karattu Thodi May 2020

Master's Projects

MongoDB is a NoSQL database, specifically used to efficiently store and access a large quantity of unstructured data over a distributed cluster of nodes. As the number of nodes in the cluster increases, it becomes difficult to manually monitor different components of the database. This poses an interesting problem of monitoring the MongoDB database to view the state of the system at any point. Although a few proprietary monitoring tools exist to monitor MongoDB clusters, they are not freely available for use in academia. Therefore, the focus of this project is to create a monitoring system that is completely built …


The Use Of Digital Millenium Copyright Act To Stifle Speech Through Non-Copyright Related Takedowns, Miller Freeman May 2020

Seattle Journal of Technology, Environmental, & Innovation Law

In 1998, Congress passed the Digital Millennium Copyright Act. This law provided new methods of protecting copyright in online media. These protections shift the normal judicial process that would stop the publication of infringing materials to private actors: the online platforms. As a result, online platforms receive notices of infringement and issue takedowns of allegedly copyrighted works without the judicial process which normally considers the purpose of the original notice of infringement. In at least one case, discussed in detail below, this has resulted in a notice and takedown against an individual for reasons not related to the purpose of …


Secure And Efficient Models For Retrieving Data From Encrypted Databases In Cloud, Sultan Ahmed A Almakdi May 2020

Graduate Theses and Dissertations

Recently, database users have begun to use cloud database services to outsource their databases. The reason for this is the high computation speed and the huge storage capacity that cloud owners provide at low prices. However, despite the attractiveness of the cloud computing environment to database users, privacy issues remain a cause for concern for database owners since data access is out of their control. Encryption is the only way of assuaging users’ fears surrounding data privacy, but executing Structured Query Language (SQL) queries over encrypted data is a challenging task, especially if the data are encrypted by a randomized …


Dynamic Fraud Detection Via Sequential Modeling, Panpan Zheng May 2020

Graduate Theses and Dissertations

The impacts of information revolution are omnipresent from life to work. The web services have significantly changed our living styles in daily life, such as Facebook for communication and Wikipedia for knowledge acquirement. Besides, varieties of information systems, such as data management system and management information system, make us work more efficiently. However, it is usually a double-edged sword. With the popularity of web services, relevant security issues are arising, such as fake news on Facebook and vandalism on Wikipedia, which definitely impose severe security threats to OSNs and their legitimate participants. Likewise, office automation incurs another challenging security issue, …


Data Breach Consequences And Responses: A Multi-Method Investigation Of Stakeholders, Hamid Reza Nikkhah May 2020

Graduate Theses and Dissertations

The role of information in today’s economy is essential as organizations that can effectively store and leverage information about their stakeholders can gain an advantage in their markets. The extensive digitization of business information can make organizations vulnerable to data breaches. A data breach is the unauthorized access to sensitive, protected, or confidential data resulting in the compromise of information security. Data breaches affect not only the breached organization but also various related stakeholders. After a data breach, stakeholders of the breached organizations show negative behaviors, which causes the breached organizations to face financial and non-financial costs. As such, the …


The Ensemble MeSH-Term Query Expansion Models Using Multiple LDA Topic Models And ANN Classifiers In Health Information Retrieval, Sukjin You May 2020

Theses and Dissertations

Information retrieval in the health field has several challenges. Health information terminology is difficult for consumers (laypeople) to understand. Formulating a query with professional terms is not easy for consumers because health-related terms are more familiar to health professionals. If health terms related to a query are automatically added, it would help consumers to find relevant information. The proposed query expansion (QE) models show how to expand a query using MeSH (Medical Subject Headings) terms. The documents were represented by MeSH terms (i.e. Bag-of-MeSH), which were included in the full-text articles. And then the MeSH terms were used to generate …


Applications Of Digital Remote Sensing To Quantify Glacier Change In Glacier And Mount Rainier National Parks, Brianna Clark May 2020

Electronic Theses and Dissertations

Digital remote sensing and geographic information systems were employed in performing area and volume calculations on glacial landscapes. Characteristics of glaciers from two geographic regions, the Intermountain Region (between the Rocky Mountain and Cascade Ranges) and the Pacific Northwest, were estimated for the years 1985, 2000, and 2015. Glacier National Park was studied for the Intermountain Region, whereas Mount Rainier National Park was representative of the glaciers in the Pacific Northwest. Within the thirty-year period of the study, the glaciers in Glacier National Park decreased in area by 27.5 percent while those on Mount Rainier only decreased by 5.7 …


Hierarchical Reinforcement Learning With Integrated Discovery Of Salient Subgoals, Shubham Pateria, Budhitama Subagdja, Ah-Hwee Tan May 2020

Research Collection School Of Computing and Information Systems

Hierarchical Reinforcement Learning (HRL) is a promising approach to solving more complex tasks that may be challenging for traditional reinforcement learning. HRL achieves this by decomposing a task into shorter-horizon subgoals which are simpler to achieve. Autonomous discovery of such subgoals is an important part of HRL. Recently, end-to-end HRL methods have been used to reduce the overhead from offline subgoal discovery by seeking the useful subgoals while simultaneously learning optimal policies in a hierarchy. However, these methods may still suffer from slow learning when the search space used by a high-level policy to find the subgoals is …


Platform Pricing With Strategic Buyers: The Impact Of Future Production Cost, Mei Lin, Xiajun Amy Pan, Quan Zheng May 2020

Research Collection School Of Computing and Information Systems

Two-sided platforms are often coupled with exclusive hardware products that connect two sides of users, the consumers of the hardware product (i.e., buyers) and the application developers (i.e., sellers). The hardware product in the platform business model introduces three important issues that are not yet well understood in the literature of platform pricing: potentially downward-trending production cost, product quality improvements, and consumers' strategic behaviors. Using analytical modeling, our study explicitly factors in these issues in analyzing a monopoly platform owner's two-sided pricing problem. The platform sequentially introduces and prices quality-improving hardware products, for which the costliness of quality may decrease. …


Chaff From The Wheat: Characterizing And Determining Valid Bug Reports, Yuanrui Fan, Xin Xia, David Lo, Ahmed E. Hassan May 2020

Research Collection School Of Computing and Information Systems

Developers use bug reports to triage and fix bugs. When triaging a bug report, developers must decide whether the bug report is valid (i.e., a real bug). A large number of bug reports are submitted every day, and many of them end up being invalid. Manually determining whether a bug report is valid is a difficult and tedious task. Thus, an approach that can automatically analyze the validity of a bug report and determine whether a report is valid can help developers prioritize their triaging tasks and avoid wasting time and effort on invalid bug reports. In this study, motivated by the …


Cornac: A Comparative Framework For Multimodal Recommender Systems, Aghiles Salah, Quoc Tuan Truong, Hady W. Lauw May 2020

Research Collection School Of Computing and Information Systems

Cornac is an open-source Python framework for multimodal recommender systems. In addition to core utilities for accessing, building, evaluating, and comparing recommender models, Cornac is distinctive in putting emphasis on recommendation models that leverage auxiliary information in the form of a social network, item textual descriptions, product images, etc. Such multimodal auxiliary data supplement user-item interactions (e.g., ratings, clicks), which tend to be sparse in practice. To facilitate broad adoption and community contribution, Cornac is publicly available at https://github.com/PreferredAI/cornac, and it can be installed via Anaconda or the Python Package Index (pip). Not only is it well-covered by unit tests …


A 2020 Perspective On "Client Risk Informedness In Brokered Cloud Services: An Experimental Pricing Study", Di Shang, Robert J. Kauffman May 2020

Research Collection School Of Computing and Information Systems

Cloud computing and the cloud services market have advanced in the past ten years. Cloud services now include most information technology (IT) services from fundamental computing services to more cutting-edge artificial intelligence (AI) services. Accordingly, opportunities have emerged for research on the design of new market features to improve the cloud services market to benefit providers and users. Based on our observation of the recent development of cloud services, in this short research commentary, we share our agenda for future studies of this important sector of IT services.


Robust Graph Learning From Noisy Data, Zhao Kang, Haiqi Pan, Steven C. H. Hoi, Zenglin Xu May 2020

Research Collection School Of Computing and Information Systems

Learning graphs from data automatically has shown encouraging performance on clustering and semi-supervised learning tasks. However, real data are often corrupted, which may cause the learned graph to be inexact or unreliable. In this paper, we propose a novel robust graph learning scheme to learn reliable graphs from real-world noisy data by adaptively removing noise and errors in the raw data. We show that our proposed model can also be viewed as a robust version of manifold regularized robust principal component analysis (RPCA), where the quality of the graph plays a critical role. The proposed model is able to …


JPLink: On Linking Jobs To Vocational Interest Types, Amila Silva, Pei Chi Lo, Ee-Peng Lim May 2020

Research Collection School Of Computing and Information Systems

Linking job seekers with relevant jobs requires matching based on not only skills, but also personality types. Although the Holland Code, also known as RIASEC, has frequently been used to group people by their suitability for six different categories of occupations, the RIASEC category labels of individual jobs are often not found in job posts. This is attributed to the significant manual effort required to assign RIASEC labels to job posts. To cope with assigning RIASEC labels to a massive number of jobs, we propose JPLink, a machine learning approach using the text content in job titles and job descriptions. JPLink exploits …


Predicting Disease Progression Using Deep Recurrent Neural Networks And Longitudinal Electronic Health Record Data, Seunghwan Kim May 2020

McKelvey School of Engineering Theses & Dissertations

Electronic Health Records (EHR) are widely adopted and used throughout healthcare systems and are able to collect and store longitudinal information data that can be used to describe patient phenotypes. From the underlying data structures used in the EHR, discrete data can be extracted and analyzed to improve patient care and outcomes via tasks such as risk stratification and prospective disease management. Temporality in EHR is innately present given the nature of these data, however, and traditional classification models are limited in this context by the cross-sectional nature of training and prediction processes. Finding temporal patterns in EHR is especially …


Vzwam Web-Based Lookup, Ruben Claudio May 2020

Masters Theses & Doctoral Dissertations

This web-based lookup will allow V employees to find territory sales reps much faster. It will simplify the process and eliminate manual processes.

At the moment, a combination of multiple manual processes is needed to find territory sales reps. The company’s CRM does not make it possible to find sales reps quickly. When an in-house sales representative is talking to a prospect, this sales rep has to go through a series of steps to find an outside or territory sales rep, who is usually needed to schedule in-person meetings. This results in delays in transactions with prospects. Besides, because …


The Future Of Work Now: Cyber Threat Attribution At FireEye, Thomas H. Davenport, Steven M. Miller May 2020

Research Collection School Of Computing and Information Systems

One of the most frequently-used phrases at business events these days is “the future of work.” It’s increasingly clear that artificial intelligence and other new technologies will bring substantial changes in work tasks and business processes. But while these changes are predicted for the future, they’re already present in many organizations for many different jobs. The job and incumbent described below is an example of this phenomenon. It’s a clear example of an existing job that’s been transformed by AI and related tools.


Retrofitting Embeddings For Unsupervised User Identity Linkage, Tao Zhou, Ee-Peng Lim, Roy Ka-Wei Lee, Feida Zhu, Jiuxin Cao May 2020

Research Collection School Of Computing and Information Systems

User Identity Linkage (UIL) is the problem of matching user identities that belong to the same person across multiple online social networks (OSNs). Solutions to the UIL problem facilitate cross-platform research on OSN users and enable many useful applications such as user profiling and recommendation. As UIL labeled data are often lacking and costly to obtain, learning user embeddings for matching user identities using an unsupervised approach is highly desirable. In this paper, we propose a novel unsupervised UIL framework for enhancing existing user embedding-based UIL methods. Our proposed framework incorporates two key ideas, user-discriminative features and retrofitting …


Privacy-Preserving Protocol For Atomic Swap Between Blockchains, Kiran Gurung May 2020

Boise State University Theses and Dissertations

Atomic swap facilitates fair exchange of cryptocurrencies without the need for a trusted authority. It is regarded as one of the prominent technologies for the cryptocurrency ecosystem, helping to realize the idea of a decentralized blockchain introduced by Bitcoin. However, due to the heterogeneity of the cryptocurrency systems, developing efficient and privacy-preserving atomic swap protocols has proven challenging. In this thesis, we propose a generic framework for atomic swap, called PolySwap, that enables fair exchange of assets between two heterogeneous sets of blockchains. Our construction 1) does not require a trusted third party, 2) preserves the anonymity of the swap …
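
For readers unfamiliar with how an atomic swap can work without a trusted authority, the sketch below models the widely used hash time-locked approach in plain Python. It is not PolySwap and makes no claim about the thesis's construction; the class `HashTimeLock`, its fields, the party names, and the integer clock are all hypothetical and chosen only for illustration.

```python
import hashlib


class HashTimeLock:
    """Toy escrow: funds go to the recipient if the secret behind the hash
    is revealed before the timeout, otherwise back to the sender."""

    def __init__(self, sender, recipient, hashlock, timeout):
        self.sender = sender
        self.recipient = recipient
        self.hashlock = hashlock  # SHA-256 digest of the secret
        self.timeout = timeout    # time from which a refund is allowed
        self.owner = None         # None while the funds are still locked

    def claim(self, preimage, now):
        """Release funds to the recipient if the preimage matches in time."""
        matches = hashlib.sha256(preimage).digest() == self.hashlock
        if self.owner is None and now < self.timeout and matches:
            self.owner = self.recipient
            return True
        return False

    def refund(self, now):
        """Return funds to the sender once the timeout has passed."""
        if self.owner is None and now >= self.timeout:
            self.owner = self.sender
            return True
        return False


# Alice picks a secret; both escrows are locked with the same hash.
secret = b"alice-secret"
lock = hashlib.sha256(secret).digest()
on_chain_a = HashTimeLock("alice", "bob", lock, timeout=200)  # Alice pays Bob
on_chain_b = HashTimeLock("bob", "alice", lock, timeout=100)  # Bob pays Alice

# Alice claims on chain B, which reveals the secret; Bob reuses it on chain A.
alice_claimed = on_chain_b.claim(secret, now=50)
bob_claimed = on_chain_a.claim(secret, now=60)
```

The shorter timeout on Bob's escrow is the usual design choice: Alice must reveal the secret early enough that Bob still has time to claim on the other chain, and if she never reveals it, both parties can refund.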