Open Access. Powered by Scholars. Published by Universities.®
Physical Sciences and Mathematics Commons™
Articles 2491 - 2520 of 6720
Full-Text Articles in Physical Sciences and Mathematics
Design And Implementation Of A Stand-Alone Tool For Metabolic Simulations, Milad Ghiasi Rad
Department of Computer Science and Engineering: Dissertations, Theses, and Student Research
In this thesis, we present the design and implementation of a stand-alone tool for metabolic simulations. The system integrates custom-built SBML models with external user input and produces estimates of any reactants participating in the chain of reactions in the provided model (e.g., ATP, glucose, insulin) over a given duration using numerical analysis and simulation. The tool accepts food intake arguments so that personalized metabolic characteristics are considered in the simulations. It has also been generalized to take temporal genomic information into consideration and be flexible for simulation …
On Modeling Sense Relatedness In Multi-Prototype Word Embedding, Yixin Cao, Juanzi Li, Jiaxin Shi, Zhiyuan Liu, Chengjiang Li
Research Collection School Of Computing and Information Systems
To enhance the expression ability of distributional word representation learning model, many researchers tend to induce word senses through clustering, and learn multiple embedding vectors for each word, namely multi-prototype word embedding model. However, most related work ignores the relatedness among word senses which actually plays an important role. In this paper, we propose a novel approach to capture word sense relatedness in multi-prototype word embedding model. Particularly, we differentiate the original sense and extended senses of a word by introducing their global occurrence information and model their relatedness through the local textual context information. Based on the idea of …
Leveraging Auxiliary Tasks For Document-Level Cross-Domain Sentiment Classification, Jianfei Yu, Jing Jiang
Research Collection School Of Computing and Information Systems
In this paper, we study domain adaptation with a state-of-the-art hierarchical neural network for document-level sentiment classification. We first design a new auxiliary task based on sentiment scores of domain-independent words. We then propose two neural network architectures to respectively induce document embeddings and sentence embeddings that work well for different domains. When these document and sentence embeddings are used for sentiment classification, we find that with both pseudo and external sentiment lexicons, our proposed methods can perform similarly to or better than several highly competitive domain adaptation methods on a benchmark dataset of product reviews.
Using Teaching Cases For Achieving Bloom’s High-Order Cognitive Levels: An Application In Technically-Oriented Information Systems Course, Kar Way Tan
Research Collection School Of Computing and Information Systems
Case-teaching has been an attractive pedagogical method for bringing real-world examples into the classroom. However, it is challenging to introduce cases that address high-order cognitive skills, such as analyzing and creating new IT solutions, in a technically-oriented computing course. In this research, we present our experience in introducing three types of case studies (Story-Telling case, Design-and-Problem-Solving case, and Create-Design-Implement case) to a course in an undergraduate Information Systems programme. For each case study, we plan and map the learning objectives to address various cognitive levels in the revised Bloom’s Taxonomy. Using surveys conducted over two academic years, we show …
Disease Gene Classification With Metagraph Representations, Sezin Kircali Ata, Yuan Fang, Min Wu, Xiao-Li Li, Xiaokui Xiao
Research Collection School Of Computing and Information Systems
Protein-protein interaction (PPI) networks play an important role in studying the functional roles of proteins, including their association with diseases. However, protein interaction networks are not sufficient without the support of additional biological knowledge for proteins such as their molecular functions and biological processes. To complement and enrich PPI networks, we propose to exploit biological properties of individual proteins. More specifically, we integrate keywords describing protein properties into the PPI network, and construct a novel PPI-Keywords (PPIK) network consisting of both proteins and keywords as two different types of nodes. As disease proteins tend to have similar topological characteristics …
Inferring Social Media Users’ Demographics From Profile Pictures: A Face++ Analysis On Twitter Users, Soon-Gyo Jung, Jisun An, Haewoon Kwak, Joni Salminen, Bernard J. Jansen
Research Collection School Of Computing and Information Systems
In this research, we evaluate the applicability of using facial recognition of social media account profile pictures to infer the demographic attributes of gender, race, and age of the account owners leveraging a commercial and well-known image service, specifically Face++. Our goal is to determine the feasibility of this approach for actual system implementation. Using a dataset of approximately 10,000 Twitter profile pictures, we use Face++ to classify this set of images for gender, race, and age. We determine that about 30% of these profile pictures contain identifiable images of people using the current state-of-the-art automated means. We then employ …
Using Data Analytics For Discovering Library Resource Insights: Case From Singapore Management University, Ning Lu, Rui Song, Dina Li Gwek Heng, Swapna Gottipati, Aaron Tay
Research Collection School Of Computing and Information Systems
Library resources are critical in supporting teaching, research and learning processes. Several universities have employed online platforms and infrastructure to provide online services to students, faculty and staff. To provide efficient services by understanding and predicting user needs, libraries are looking into the area of data analytics. Library analytics at Singapore Management University is a project committed to providing an interface for data-intensive project collaboration, while supporting one of the library’s key pillars: its commitment to collaborate on initiatives with SMU Communities and external groups. In this paper, we study the transaction logs for user behavior analysis that …
Ethics And Bias In Machine Learning: A Technical Study Of What Makes Us “Good”, Ashley Nicole Shadowen
Student Theses
The topic of machine ethics is growing in recognition and energy, but bias in machine learning algorithms outpaces it to date. Bias is a complicated term with good and bad connotations in the field of algorithmic prediction making. Especially in circumstances with legal and ethical consequences, we must study the results of these machines to ensure fairness. This paper attempts to address ethics at the algorithmic level of autonomous machines. There is no one solution to machine bias; it depends on the context of the given system and the most reasonable way to avoid biased decisions while maintaining the …
Utilizing Consumer Health Posts For Pharmacovigilance: Identifying Underlying Factors Associated With Patients’ Attitudes Towards Antidepressants, Maryam Zolnoori
Theses and Dissertations
Non-adherence to antidepressants is a major obstacle to antidepressants’ therapeutic benefits, resulting in increased risk of relapse, emergency visits, and significant burden on individuals and the healthcare system. Several studies showed that non-adherence is weakly associated with personal and clinical variables, but strongly associated with patients’ beliefs and attitudes towards medications. The traditional methods for identifying the key dimensions of patients’ attitudes towards antidepressants have methodological limitations, such as concern about confidentiality of personal information. In this study, attempts have been made to address these limitations by utilizing patients’ self-reported experiences in online healthcare forums to …
A Novel Density Peak Clustering Algorithm Based On Squared Residual Error, Milan Parmar, Di Wang, Ah-Hwee Tan, Chunyan Miao, Jianhua Jiang, You Zhou
Research Collection School Of Computing and Information Systems
The density peak clustering (DPC) algorithm is designed to quickly identify intricate-shaped clusters with high dimensionality by finding high-density peaks in a non-iterative manner and using only one threshold parameter. However, DPC has certain limitations in processing low-density data points because it only takes the global data density distribution into account. As such, DPC may be limited in forming low-density data clusters; in other words, DPC may fail to detect anomalies and borderline points. In this paper, we analyze the limitations of DPC and propose a novel density peak clustering algorithm to better handle low-density clustering tasks. Specifically, our algorithm …
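As a rough illustration of the baseline DPC idea mentioned in this abstract, the sketch below computes the two classic quantities, a cutoff-based local density (rho) and the distance to the nearest denser point (delta), and picks the points with the largest rho × delta as peaks. It is a minimal sketch of the commonly described original DPC, not the squared-residual-error variant this paper proposes; the function name `density_peaks`, the cutoff `d_c`, and the toy points are assumptions made for illustration.

```python
import math

def density_peaks(points, d_c):
    """Return (rho, delta) per point using the classic cutoff-kernel DPC definitions."""
    n = len(points)
    dist = [[math.dist(p, q) for q in points] for p in points]
    # rho_i: number of other points closer than the single threshold d_c
    rho = [sum(1 for j in range(n) if j != i and dist[i][j] < d_c) for i in range(n)]
    delta = []
    for i in range(n):
        denser = [dist[i][j] for j in range(n) if rho[j] > rho[i]]
        # The densest point has no denser neighbour; by convention it gets its max distance.
        delta.append(min(denser) if denser else max(dist[i]))
    return rho, delta

# Hypothetical toy data: two small groups plus one isolated point (index 7).
points = [
    (0.0, 0.0), (0.3, 0.0), (-0.3, 0.0), (0.0, 0.3),
    (5.0, 5.0), (5.3, 5.0), (4.7, 5.0),
    (2.5, 9.0),
]
rho, delta = density_peaks(points, d_c=0.35)

# Peaks are the points where both density and separation are large.
gamma = [r * d for r, d in zip(rho, delta)]
peaks = sorted(sorted(range(len(points)), key=lambda i: gamma[i], reverse=True)[:2])
```

In this toy data the isolated point has zero density but a large delta, which is the kind of low-density point the abstract says plain DPC handles poorly.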
Secure Server-Aided Top-K Monitoring, Yujue Wang, Hwee Hwa Pang, Yanjiang Yang, Xuhua Ding
Research Collection School Of Computing and Information Systems
In a data streaming model, a data owner releases records or documents to a set of users with matching interests, in such a way that the match in interest can be calculated from the correlation between each pair of document and user query. For scalability and availability reasons, this calculation is delegated to third-party servers, which gives rise to the need to protect the integrity and privacy of the documents and user queries. In this paper, we propose a server-aided data stream monitoring scheme (DSM) to address the aforementioned integrity and privacy challenges, so that the users are able to …
BTCI: A New Framework For Identifying Congestion Cascades Using Bus Trajectory Data, Meng-Fen Chiang, Ee Peng Lim, Wang-Chien Lee, Agus Trisnajaya Kwee
Research Collection School Of Computing and Information Systems
The knowledge of traffic health status is essential to the general public and urban traffic management. To identify congestion cascades, an important phenomenon of traffic health, we propose a Bus Trajectory based Congestion Identification (BTCI) framework that explores the anomalous traffic health status and structure properties of congestion cascades using bus trajectory data. BTCI consists of two main steps: congested segment extraction and congestion cascades identification. The former constructs path speed models from historical vehicle transitions and designs a non-parametric Kernel Density Estimation (KDE) function to derive a measure of congestion score. The latter aggregates congested segments (i.e., those with …
Analyzing The E-Learning Video Environment Requirements Of Generation Z Students Using Echo360 Platform, Swapna Gottipati, Venky Shankararaman
Research Collection School Of Computing and Information Systems
As with any other generational cohort, Generation Z students have their own unique characteristics that influence their approach to the learning process. They are the future workforce, and several efforts are being undertaken by government and education institutes to consider the characteristics of Gen-Z in developing the curriculum and teaching environment suitable for these students. E-learning plays a key role in students’ learning process and has been widely adopted by many education institutions. In particular, videos play a major role in the learning process of Gen-Z students. The purpose of this paper is to focus on the requirements of Gen-Z students and to provide suggestions for how to create an e-learning video …
D-Watch: Embracing “Bad” Multipaths For Device-Free Localization With COTS RFID Devices, Ju Wang, Jie Xiong, Hongbo Jiang, Xiaojiang Chen, Dingyi Fang
Research Collection School Of Computing and Information Systems
Device-free localization, which does not require any device attached to the target, plays a critical role in many applications, such as intrusion detection and elderly monitoring. This paper introduces D-Watch, a device-free system built on top of low-cost commodity-off-the-shelf RFID hardware. Unlike previous works which consider multipaths detrimental, D-Watch leverages the “bad” multipaths to provide decimeter-level localization accuracy without offline training. D-Watch harnesses the angle-of-arrival information from the RFID tags' backscatter signals. The key intuition is that whenever a target blocks a signal's propagation path, the signal power experiences a drop which can be …
Leveraging The Trade-Off Between Accuracy And Interpretability In A Hybrid Intelligent System, Di Wang, Chai Quek, Ah-Hwee Tan, Chunyan Miao, Geok See Ng, You Zhou
Research Collection School Of Computing and Information Systems
Neural Fuzzy Inference System (NFIS) is a widely adopted paradigm for developing data-driven learning systems. This hybrid system has been widely adopted due to its accurate reasoning procedure and comprehensible inference rules. Although most NFISs primarily focus on accuracy, we have observed an ever-increasing demand for improving the interpretability of NFISs and other types of machine learning systems. In this paper, we illustrate how we leverage the trade-off between accuracy and interpretability in an NFIS called Genetic Algorithm and Rough Set Incorporated Neural Fuzzy Inference System (GARSINFIS). In a nutshell, GARSINFIS self-organizes its network structure with a small …
Who Are Your Users? Comparing Media Professionals' Preconception Of Users To Data-Driven Personas, Lene Nielsen, Soon-Gyu Jung, Jisun An, Joni Salminen, Haewoon Kwak, Bernard J. Jansen
Research Collection School Of Computing and Information Systems
One of the reasons for using personas is to align user understandings across project teams and sites. As part of a larger persona study, at Al Jazeera English (AJE), we conducted 16 qualitative interviews with media producers, the end users of persona descriptions. We asked the participants about their understanding of a typical AJE media consumer, and the variety of answers shows that the understandings are not aligned and are built on a mix of own experiences, own self, assumptions, and data given by the company. The answers are sometimes aligned with the data-driven personas and sometimes not. The end …
The Graph Database: Jack Of All Trades Or Just Not SQL?, George F. Hurlburt, Maria R. Lee, George K. Thiruvathukal
Computer Science: Faculty Publications and Other Works
This special issue of IT Professional focuses on the graph database. The graph database, a relatively new phenomenon, is well suited to the burgeoning information era in which we are increasingly becoming immersed. Here, the guest editors briefly explain how a graph database works, its relation to the relational database management system (RDBMS), and its quantitative and qualitative pros and cons, including how graph databases can be harnessed in a hybrid environment. They also survey the excellent articles submitted for this special issue.
NBPMF: Novel Network-Based Inference Methods For Peptide Mass Fingerprinting, Zhewei Liang
Electronic Thesis and Dissertation Repository
Proteins are large, complex molecules that perform a vast array of functions in every living cell. A proteome is a set of proteins produced in an organism, and proteomics is the large-scale study of proteomes. Several high-throughput technologies have been developed in proteomics, where the most commonly applied are mass spectrometry (MS) based approaches. MS is an analytical technique for determining the composition of a sample. Recently it has become a primary tool for protein identification, quantification, and post translational modification (PTM) characterization in proteomics research. There are usually two different ways to identify proteins: top-down and bottom-up. Top-down approaches …
Multi-Step Tokenization Of Automated Clearing House Payment Transactions, Privin Alexander
USF Tampa Graduate Theses and Dissertations
Since its beginnings in 1974, the Automated Clearing House (ACH) network has grown into one of the largest, safest, and most efficient payment systems in the world. An ACH transaction is an electronic funds transfer between bank accounts using a batch processing system.
Currently, the ACH Network moves almost $43 trillion and 25 billion electronic financial transactions each year. With the increasing movement toward an electronic, interconnected and mobile infrastructure, it is critical that electronic payments work safely and efficiently for all users. ACH transactions carry sensitive data, such as a consumer's name, account number, tax identification number, account holder …
A Study On The Practical Use Of Operations Research And Vessels Big Data In Benefit Of Efficient Ports Utilization In Panama, Gabriel Fuentes Lezcano
World Maritime University Dissertations
No abstract provided.
Constructing A Clinical Research Data Management System, Michael C. Quintero
USF Tampa Graduate Theses and Dissertations
Clinical study data is usually collected without knowing in advance what kind of data is going to be collected. In addition, the set of all possible data points that can apply to a patient in any given clinical study is almost always a superset of the data points that are actually recorded for a given patient. As a result, clinical data resembles a set of sparse data with an evolving data schema. To help researchers at the Moffitt Cancer Center better manage clinical data, a tool was developed called GURU that uses the Entity Attribute Value model to handle …
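To make the Entity Attribute Value idea concrete, here is a minimal sketch of how sparse records with an evolving schema can be stored as (entity, attribute, value) rows and pivoted back into per-entity records. It illustrates the general EAV pattern only; the row contents, attribute names, and the `pivot` helper are hypothetical and are not taken from GURU.

```python
from collections import defaultdict

# Each observation is one (entity, attribute, value) row.
# Data points that were never recorded for a patient are simply absent.
eav_rows = [
    ("patient-1", "age", 54),
    ("patient-1", "tumor_stage", "II"),
    ("patient-2", "age", 61),
    ("patient-2", "smoker", True),
]

def pivot(rows):
    """Group EAV rows into one sparse dictionary per entity."""
    records = defaultdict(dict)
    for entity, attribute, value in rows:
        records[entity][attribute] = value
    return dict(records)

records = pivot(eav_rows)

# A new kind of data point needs no schema change: it is just another row.
extended = pivot(eav_rows + [("patient-1", "ki67_index", 0.2)])
```

The trade-off of this layout is that reading a full record requires a pivot step like the one above, in exchange for never having to alter a table definition when a study introduces a new attribute.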
Uncovering User-Triggered Privacy Leaks In Mobile Applications And Their Utility In Privacy Protection, Joo Keng Joseph Chan
Dissertations and Theses Collection
Mobile applications are increasingly popular, and help mobile users in many aspects of their lifestyle. Applications have access to a wealth of information about the user through powerful developer APIs. It is known that most applications, even popular and highly regarded ones, utilize and leak privacy data to the network. It is also common for applications to over-access privacy data that does not fit the functionality profile of the application. Although there are available privacy detection tools, they might not provide sufficient context to help users better understand the privacy behaviours of their applications. In this dissertation, I present the …
Indexable Bayesian Personalized Ranking For Efficient Top-K Recommendation, Dung D. Le, Hady W. Lauw
Research Collection School Of Computing and Information Systems
Top-k recommendation seeks to deliver a personalized recommendation list of k items to a user. The dual objectives are (1) accuracy in identifying the items a user is likely to prefer, and (2) efficiency in constructing the recommendation list in real time. One direction towards retrieval efficiency is to formulate retrieval as approximate k nearest neighbor (kNN) search aided by indexing schemes, such as locality-sensitive hashing, spatial trees, and inverted index. These schemes, applied on the output representations of recommendation algorithms, speed up the retrieval process by automatically discarding a large number of potentially irrelevant items when given a user …
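The retrieval setting described in this abstract can be sketched as follows: an exact top-k list requires scoring every item, whereas an index such as locality-sensitive hashing discards most items before scoring. This is a minimal illustration under assumptions, not the paper's method: the item and user vectors are made up, the hyperplanes are fixed rather than random so the result is reproducible, and sign-based hashing groups vectors by direction while the ranking here uses inner products, so the two need not always agree.

```python
import heapq

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def top_k_exact(user, items, k):
    """Score every item by inner product and keep the k best (linear scan)."""
    return heapq.nlargest(k, items, key=lambda name: dot(user, items[name]))

def signature(vec, hyperplanes):
    """Sign-projection hash: one bit per hyperplane."""
    return tuple(dot(vec, h) >= 0 for h in hyperplanes)

def build_index(items, hyperplanes):
    """Bucket item names by their hash signature."""
    index = {}
    for name, vec in items.items():
        index.setdefault(signature(vec, hyperplanes), []).append(name)
    return index

def top_k_indexed(user, items, index, hyperplanes, k):
    """Score only the items that share the user's bucket."""
    candidates = index.get(signature(user, hyperplanes), [])
    return heapq.nlargest(k, candidates, key=lambda name: dot(user, items[name]))

# Hypothetical latent vectors, e.g. as produced by some recommendation model.
items = {
    "a": (0.9, 0.1),
    "b": (0.8, 0.4),
    "c": (-0.7, 0.2),
    "d": (-0.1, -0.9),
    "e": (0.2, 0.8),
}
hyperplanes = [(1.0, 0.0), (0.0, 1.0)]  # fixed here; normally drawn at random
user = (1.0, 0.5)

index = build_index(items, hyperplanes)
exact = top_k_exact(user, items, k=2)
approx = top_k_indexed(user, items, index, hyperplanes, k=2)
scanned = len(index[signature(user, hyperplanes)])
```

In this toy case the indexed search scores three of the five items and returns the same list as the full scan; in general the indexed result is only approximate.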
Color-Sketch Simulator: A Guide For Color-Based Visual Known-Item Search, Jakub Lokoč, Anh Nguyen Phuong, Marta Vomlelová, Chong-Wah Ngo
Research Collection School Of Computing and Information Systems
In order to evaluate the effectiveness of a color-sketch retrieval system for a given multimedia database, tedious evaluations involving real users are required, as users are at the center of query sketch formulation. However, without any prior knowledge about the bottlenecks of the underlying sketch-based retrieval model, the evaluations may focus on wrong settings and thus miss the desired effect. Furthermore, users usually have no clues or recommendations for drawing color-sketches effectively. In this paper, we aim at a preliminary analysis to identify potential bottlenecks of a flexible color-sketch retrieval model. We present a formal framework based on position-color feature …
An Integrated Framework For Modeling And Predicting Spatiotemporal Phenomena In Urban Environments, Tuc Viet Le
Dissertations and Theses Collection (Open Access)
This thesis proposes a general solution framework that integrates methods in machine learning in creative ways to solve a diverse set of problems arising in urban environments. It particularly focuses on modeling spatiotemporal data for the purpose of predicting urban phenomena. Concretely, the framework is applied to solve three specific real-world problems: human mobility prediction, traffic speed prediction and incident prediction. For human mobility prediction, I use visitor trajectories collected at a large theme park in Singapore as a simplified microcosm of an urban area. A trajectory is an ordered sequence of attraction visits and corresponding timestamps produced by a visitor. …
Scalable Online Kernel Learning, Jing Lu
Dissertations and Theses Collection (Open Access)
One critical deficiency of traditional online kernel learning methods is their increasing and unbounded number of support vectors (SV’s), making them inefficient and non-scalable for large-scale applications. Recent studies on budget online learning have attempted to overcome this shortcoming by bounding the number of SV’s. Despite being extensively studied, budget algorithms usually suffer from several drawbacks.
First of all, although existing algorithms attempt to bound the number of SV’s at each iteration, most of them fail to bound the number of SV’s for the final averaged classifier, which is commonly used for online-to-batch conversion. To solve this problem, we propose …
Selective Value Coupling Learning For Detecting Outliers In High-Dimensional Categorical Data, Guansong Pang, Hongzuo Xu, Cao Longbing, Wentao Zhao
Research Collection School Of Computing and Information Systems
This paper introduces a novel framework, namely SelectVC and its instance POP, for learning selective value couplings (i.e., interactions between the full value set and a set of outlying values) to identify outliers in high-dimensional categorical data. Existing outlier detection methods work on a full data space or feature subspaces that are identified independently from subsequent outlier scoring. As a result, they are significantly challenged by overwhelming irrelevant features in high-dimensional data due to the noise brought by the irrelevant features and its huge search space. In contrast, SelectVC works on a clean and condensed data space spanned by selective …
Unsupervised Topic Hypergraph Hashing For Efficient Mobile Image Retrieval, Lei Zhu, Jialie Shen, Liang Xie, Zhiyong Cheng
Research Collection School Of Computing and Information Systems
Hashing compresses high-dimensional features into compact binary codes. It is one of the promising techniques to support efficient mobile image retrieval, due to its low data transmission cost and fast retrieval response. However, most existing hashing strategies simply rely on low-level features. Thus, they may generate hashing codes with limited discriminative capability. Moreover, many of them fail to exploit complex and high-order semantic correlations that inherently exist among images. Motivated by these observations, we propose a novel unsupervised hashing scheme, called topic hypergraph hashing (THH), to address the limitations. THH effectively mitigates the semantic shortage of hashing codes by …
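The first sentence of this abstract, compressing features into compact binary codes for fast retrieval, can be illustrated with a small sketch: each feature vector is reduced to a few bits and the database is ranked by Hamming distance to the query code. The per-dimension thresholding used here is a deliberately simple stand-in chosen for illustration; THH learns its codes in a different way that the truncated abstract does not fully describe. All names and values are hypothetical.

```python
def binarize(features, thresholds):
    """Pack one bit per dimension into an integer code (1 if the feature exceeds its threshold)."""
    code = 0
    for x, t in zip(features, thresholds):
        code = (code << 1) | (1 if x > t else 0)
    return code

def hamming(a, b):
    """Number of differing bits between two integer codes."""
    return bin(a ^ b).count("1")

# Hypothetical 4-dimensional image features.
database = {
    "img1": (0.9, 0.2, 0.7, 0.1),
    "img2": (0.8, 0.1, 0.6, 0.9),
    "img3": (0.1, 0.9, 0.2, 0.8),
}
thresholds = (0.5, 0.5, 0.5, 0.5)

codes = {name: binarize(vec, thresholds) for name, vec in database.items()}
query = binarize((0.7, 0.3, 0.9, 0.2), thresholds)

# Only the short codes need to be transmitted and compared, not the raw features.
ranked = sorted(codes, key=lambda name: hamming(query, codes[name]))
```

Comparing integer codes with XOR and a bit count is what makes this kind of retrieval cheap on a mobile device, at the cost of whatever information the binarization discards.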
Predicting Indoor Crowd Density Using Column-Structured Deep Neural Network, Akihito Sudo, Teck Hou (Deng Dehao) Teng, Hoong Chuin Lau, Yoshihide Sekimoto
Research Collection School Of Computing and Information Systems
This work proposes a deep neural network approach known as the column-structured deep neural network (COL-DNN-R) for predicting crowd density in an indoor environment using historical Wi-Fi traces of individual visitors. With a structure designed to minimize feature engineering, COL-DNN accepts raw features such as crowd density, opening and closing hours and peak visitor counts for extracting features. The extracted features are used by a regression model R for predicting the crowd densities. Standard regression models such as MLP, RF and SVM can be used as R. Experiments are performed to investigate the effect of feature representation and model structure …
SourceVote: Fusing Multi-Valued Data Via Inter-Source Agreements, Xiu Susie Fang, Quan Z. Sheng, Xianzhi Wang, Mahmoud Barhamgi, Lina Yao, Anne H.H. Ngu
Research Collection School Of Computing and Information Systems
Data fusion is a fundamental research problem of identifying true values of data items of interest from conflicting multi-sourced data. Although considerable research efforts have been conducted on this topic, existing approaches generally assume every data item has exactly one true value, which fails to reflect the real world where data items with multiple true values widely exist. In this paper, we propose a novel approach, SourceVote, to estimate value veracity for multi-valued data items. SourceVote models the endorsement relations among sources by quantifying their two-sided inter-source agreements. In particular, two graphs are constructed to model inter-source relations. Then two aspects of source reliability are derived from these graphs and …