Open Access. Powered by Scholars. Published by Universities.®

Physical Sciences and Mathematics Commons


Singapore Management University


Articles 2821 - 2850 of 7471

Full-Text Articles in Physical Sciences and Mathematics

Socially-Enriched Multimedia Data Co-Clustering, Ah-Hwee Tan May 2019


Research Collection School Of Computing and Information Systems

Heterogeneous data co-clustering is a commonly used technique for tapping the rich meta-information of multimedia web documents, including category, annotation, and description, for associative discovery. However, most co-clustering methods proposed for heterogeneous data do not consider the representation problem of short and noisy text and their performance is limited by the empirical weighting of the multimodal features. This chapter explains how to use the Generalized Heterogeneous Fusion Adaptive Resonance Theory (GHF-ART) for clustering large-scale web multimedia documents. Specifically, GHF-ART is designed to handle multimedia data with an arbitrarily rich level of meta-information. For handling …


Bilateral Liability-Based Contracts In Information Security Outsourcing, Kai-Lung Hui, Ping Fan Ke, Yuxi Yao, Wei Thoo Yue May 2019


Research Collection School Of Computing and Information Systems

We study the efficiency of bilateral liability-based contracts in managed security services (MSSs). We model MSS as a collaborative service with the protection quality shaped by the contribution of both the service provider and the client. We adopt the negligence concept from the legal profession to design two novel contracts: threshold-based liability contract and variable liability contract. We find that they can achieve the first best outcome when postbreach effort verification is feasible. More importantly, they are more efficient than a multilateral contract when the MSS provider assumes limited liability. Our results show that bilateral liability-based contracts can work in …


Graph Based Optimization For Multiagent Cooperation, Arambam James Singh, Akshat Kumar May 2019


Research Collection School Of Computing and Information Systems

We address the problem of solving math programs defined over a graph where nodes represent agents and edges represent interaction among agents. The objective and constraint functions of this program model the task the agent team must perform and the domain constraints. In this multiagent setting, no single agent observes the complete objective and all the constraints of the program. Thus, we develop a distributed message-passing approach to solve this optimization problem. We focus on the class of graph structured linear and quadratic programs (LPs/QPs) which can model important multiagent coordination frameworks such as distributed constraint optimization (DCOP). For DCOPs, our …


PPTDS: A Privacy-Preserving Truth Discovery Scheme In Crowd Sensing Systems, Chuan Zhang, Liehuang Zhu, Chang Xu, Kashif Sharif, Ximeng Liu May 2019


Research Collection School Of Computing and Information Systems

Benefiting from the fast development of human-carried mobile devices, crowd sensing has become an emerging paradigm to sense and collect data. However, reliability of sensory data provided by participating users is still a major concern. To address this reliability challenge, truth discovery is an effective technology to improve data accuracy, and has garnered significant attention. Nevertheless, many state-of-the-art works in truth discovery either fail to address the protection of participants' privacy or incur tremendous overhead on the user side. In this paper, we first propose a privacy-preserving truth discovery scheme, named PPTDS-I, which is implemented on two …


Robust Factorization Machine: A Doubly Capped Norms Minimization, Chenghao Liu, Teng Zhang, Jundong Li, Jianwen Yin, Peilin Zhao, Jianling Sun, Steven C. H. Hoi May 2019


Research Collection School Of Computing and Information Systems

Factorization Machine (FM) is a general supervised learning framework for many AI applications due to its powerful capability of feature engineering. Despite being extensively studied, existing FM methods have several limitations in common. First of all, most existing FM methods adopt the squared loss in the modeling process, which can be very sensitive when the data for learning contains noise and outliers. Second, some recent FM variants explore the low-rank structure of the feature interactions matrix by relaxing the low-rank minimization problem as a trace norm minimization, which cannot always achieve a tight approximation to the original one. …


Practitioners' Views On Good Software Testing Practices, Pavneet S. Kochhar, Xin Xia, David Lo May 2019


Research Collection School Of Computing and Information Systems

Software testing is an integral part of the software development process. Unfortunately, for many projects, bugs are prevalent despite testing effort, and testing continues to cost a significant amount of time and resources. This brings forward the issue of test case quality and prompts us to investigate what makes good test cases. To answer this important question, we interview 21 and survey 261 practitioners, who come from many small to large companies and open source projects distributed in 27 countries, to create and validate 29 hypotheses that describe characteristics of good test cases and testing practices. These characteristics span multiple dimensions including …


How Practitioners Perceive Coding Proficiency, Xin Xia, Zhiyuan Wan, Pavneet S. Kochhar, David Lo May 2019


Research Collection School Of Computing and Information Systems

Coding proficiency is essential to software practitioners. Unfortunately, our understanding of coding proficiency often translates to vague stereotypes, e.g., “able to write good code”. The lack of specificity hinders employers from measuring a software engineer’s coding proficiency, and software engineers from improving their coding proficiency skills. This raises an important question: what skills matter to improve one’s coding proficiency? To answer this question, we perform an empirical study by surveying 340 software practitioners from 33 countries across 5 continents. We first identify 38 coding proficiency skills grouped into nine categories by interviewing 15 developers from three companies. We then ask …


DeepJIT: An End-To-End Deep Learning Framework For Just-In-Time Defect Prediction, Thong Hoang, Hoa Khanh Dam, Yasutaka Kamei, David Lo, Naoyasu Ubayashi May 2019


Research Collection School Of Computing and Information Systems

Software quality assurance efforts often focus on identifying defective code. To find likely defective code early, change-level defect prediction, a.k.a. Just-In-Time (JIT) defect prediction, has been proposed. JIT defect prediction models identify likely defective changes and they are trained using machine learning techniques with the assumption that historical changes are similar to future ones. Most existing JIT defect prediction approaches make use of manually engineered features. Unlike those approaches, in this paper, we propose an end-to-end deep learning framework, named DeepJIT, that automatically extracts features from commit messages and code changes and uses them to identify defects. Experiments …


Towards Personalized Data-Driven Bundle Design With QoS Constraint, Mustafa Misir, Hoong Chuin Lau May 2019


Research Collection School Of Computing and Information Systems

In this paper, we study the bundle design problem for offering personalized bundles of services using historical consumer redemption data. The problem studied here is for an operator managing multiple service providers, each responsible for an attraction, in a leisure park. Given the specific structure of interactions between service providers, consumers and the operator, a bundle of services is beneficial for the operator when the bundle is underutilized by service consumers. Such revenue structure is commonly seen in the cable television and leisure industries, creating strong incentives for the operator to design bundles containing lots of not-so-popular services. However, as …


The Challenges Of Creating Engaging Content: Results From A Focus Group Study Of A Popular News Media Organization, Kholoud Khalil Aldous, Jisun An, Bernard J. Jansen May 2019


Research Collection School Of Computing and Information Systems

The process of content creation for distribution via social media platforms is not a trivial one for social media editors as the goal of creating both serious and engaging content is challenging, with no clear or differing guidelines or rules across and between platforms. For creators of serious content, such as news organizations, advertisers, or educational institutions, engagement has a deeper meaning beyond likes, shares, etc. that is aimed at the audience actually processing the underlying content associated with a social media post. In this research, we report findings from a group study that aimed to understand the process and …


PeerLens: Peer-Inspired Interactive Learning Path Planning In Online Question Pool, Meng Xia, Mingfei Sun, Huan Wei, Qing Chen, Yong Wang, Lei Shi, Huamin Qu, Xiaojuan Ma May 2019


Research Collection School Of Computing and Information Systems

Online question pools like LeetCode provide hands-on exercises of skills and knowledge. However, due to the large volume of questions and the intent of hiding the tested knowledge behind them, many users find it hard to decide where to start or how to proceed based on their goals and performance. To overcome these limitations, we present PeerLens, an interactive visual analysis system that enables peer-inspired learning path planning. PeerLens can recommend a customized, adaptable sequence of practice questions to individual learners, based on the exercise history of other users in a similar learning scenario. We propose a new way to …


Learning Two-Layer Neural Networks With Symmetric Inputs, Rong Ge, Rohith Kuditipudi, Zhize Li, Xiang Wang May 2019


Research Collection School Of Computing and Information Systems

We give a new algorithm for learning a two-layer neural network under a very general class of input distributions. Assuming there is a ground-truth two-layer network $y = A \sigma(Wx) + \xi$, where A, W are weight matrices, $\xi$ represents noise, and the number of neurons in the hidden layer is no larger than the input or output, our algorithm is guaranteed to recover the parameters A, W of the ground-truth network. The only requirement on the input x is that it is symmetric, which still allows highly complicated and structured input. Our algorithm is based on the method-of-moments framework …


Predicting Good Configurations For GitHub And Stack Overflow Topic Models, Christoph Treude, Markus Wagner May 2019


Research Collection School Of Computing and Information Systems

Software repositories contain large amounts of textual data, ranging from source code comments and issue descriptions to questions, answers, and comments on Stack Overflow. To make sense of this textual data, topic modelling is frequently used as a text-mining tool for the discovery of hidden semantic structures in text bodies. Latent Dirichlet allocation (LDA) is a commonly used topic model that aims to explain the structure of a corpus by grouping texts. LDA requires multiple parameters to work well, and there are only rough and sometimes conflicting guidelines available on how these parameters should be set. In this paper, we …


SOTorrent: Studying The Origin, Evolution, And Usage Of Stack Overflow Code Snippets, Sebastian Baltes, Christoph Treude, Stephan Diehl May 2019


Research Collection School Of Computing and Information Systems

Stack Overflow (SO) is the most popular question-and-answer website for software developers, providing a large amount of copyable code snippets. Like other software artifacts, code on SO evolves over time, for example when bugs are fixed or APIs are updated to the most recent version. To be able to analyze how code and the surrounding text on SO evolve, we built SOTorrent, an open dataset based on the official SO data dump. SOTorrent provides access to the version history of SO content at the level of whole posts and individual text and code blocks. It connects code snippets from SO …


Designated-Server Identity-Based Authenticated Encryption With Keyword Search For Encrypted Emails, Hongbo Li, Qiong Huang, Jian Shen, Guomin Yang, Willy Susilo May 2019


Research Collection School Of Computing and Information Systems

In encrypted email systems, how to search over encrypted cloud emails without decryption is an important and practical problem. Public key encryption with keyword search (PEKS) is an efficient solution to it. However, PEKS suffers from the complex key management problem in the public key infrastructure. Its variant in the identity-based setting addresses the drawback; however, almost all the schemes do not resist offline keyword guessing attacks (KGA) by inside adversaries. In this work we introduce the notion of designated-server identity-based authenticated encryption with keyword search (dIBAEKS), in which the email sender authenticates the message while encrypting so that …


CURE: Flexible Categorical Data Representation By Hierarchical Coupling Learning, Songlei Jian, Guansong Pang, Longbing Cao, Kai Lu, Hang Gao May 2019


Research Collection School Of Computing and Information Systems

The representation of categorical data with hierarchical value coupling relationships (i.e., various value-to-value cluster interactions) is very critical yet challenging for capturing complex data characteristics in learning tasks. This paper proposes a novel and flexible coupled unsupervised categorical data representation (CURE) framework, which not only captures the hierarchical couplings but is also flexible enough to be instantiated for contrastive learning tasks. CURE first learns the value clusters of different granularities based on multiple value coupling functions and then learns the value representation from the couplings between the obtained value clusters. With two complementary value coupling functions, CURE is instantiated into …


Distilling Managerial Insights And Lessons From AI Projects At Singapore's Changi Airport (Part 2), Steve Lee, Steven M. Miller May 2019


Asian Management Insights

Since 2017, Changi Airport Group (CAG) has initiated a host of pilot projects that use connective and intelligent technologies to enable its move towards digital transformation and its SMART Airport Vision. This has resulted in a first wave of deployment of AI and Machine Learning-enabled applications across various functions that can better sense, analyse, predict, and interact with people.


PinchList: Leveraging Pinch Gestures For Hierarchical List Navigation On Smartphones, Teng Han, Jie Liu, Khalad Hasan, Mingming Fan, Junhyeok Kim, Jiannan Li, Xiangmin Fan, Feng Tian, Edward Lank, Pourang Irani May 2019


Research Collection School Of Computing and Information Systems

Intensive exploration and navigation of hierarchical lists on smartphones can be tedious and time-consuming as it often requires users to frequently switch between multiple views. To overcome this limitation, we present PinchList, a novel interaction design that leverages pinch gestures to support seamless exploration of multi-level list items in hierarchical views. With PinchList, sub-lists are accessed with a pinch-out gesture whereas a pinch-in gesture navigates back to the previous level. Additionally, pinch and flick gestures are used to navigate lists consisting of more than two levels. We conduct a user study to refine the design parameters of PinchList such as …


Unifying Knowledge Graph Learning And Recommendation: Towards A Better Understanding Of User Preferences, Yixin Cao, Xiang Wang, Xiangnan He, Zikun Hu, Tat-Seng Chua May 2019


Research Collection School Of Computing and Information Systems

Incorporating a knowledge graph (KG) into a recommender system is promising for improving recommendation accuracy and explainability. However, existing methods largely assume that a KG is complete and simply transfer the “knowledge” in the KG at the shallow level of entity raw data or embeddings. This may lead to suboptimal performance, since a practical KG can hardly be complete, and it is common that a KG has missing facts, relations, and entities. Thus, we argue that it is crucial to consider the incomplete nature of a KG when incorporating it into a recommender system. In this paper, we jointly learn the model of recommendation …


A Blockchain-Based Location Privacy-Preserving Crowdsensing System, Mengmeng Yang, Tianqing Zhu, Kaitai Liang, Wanlei Zhou, Robert H. Deng May 2019


Research Collection School Of Computing and Information Systems

With the support of portable electronic devices and crowdsensing, a new class of mobile applications based on the Internet of Things (IoT) is emerging. Crowdsensing enables workers with mobile devices to travel to specified locations and collect data, then send it back to the requester for rewards. However, the majority of the existing crowdsensing systems are based on centralized servers, which are prone to attack, intrusion, and manipulation. Further, during the process of transmitting information to and from the service server, the worker's location is usually exposed. This raises the potential risk of a privacy …


The Challenge Of Collaborative IoT-Based Inferencing In Adversarial Settings, Archan Misra, Dulanga Kaveesha Weerakoon Weerakoon Mudiyanselage, Kasthuri Jayarajah May 2019


Research Collection School Of Computing and Information Systems

In many practical environments, resource-constrained IoT nodes are deployed with varying degrees of redundancy/overlap--i.e., their data streams possess significant spatiotemporal correlation. We posit that collaborative inferencing, whereby individual nodes adjust their inferencing pipelines to incorporate such correlated observations from other nodes, can improve both inferencing accuracy and performance metrics (such as latency and energy overheads). However, such collaborative models are vulnerable to adversarial behavior by one or more nodes, and thus require mechanisms that identify and inoculate against such malicious behavior. We use a dataset of 8 outdoor cameras to (a) demonstrate that such collaborative inferencing can improve people counting …


How To Derive Causal Insights For Digital Commerce In China? A Research Commentary On Computational Social Science Methods, David C.W. Phang, Kanliang Wang, Qiu-Hong Wang, Robert John Kauffman, Maurizio Naldi May 2019


Research Collection School Of Computing and Information Systems

The transformation of empirical research due to the arrival of big data analytics and data science, together with the new availability of methods that emphasize causal inference, is moving forward at full speed. In this Research Commentary, we examine the extent to which this has the potential to influence how e-commerce research is conducted. China offers the ultimate in data-at-scale settings, and the construction of real-world natural experiments. Chinese e-commerce includes some of the largest firms involved in e-commerce, mobile commerce, social media and social networks. This article was written to encourage young faculty and doctoral students to engage …


Emerging App Issue Identification From User Feedback: Experience On WeChat, Cuiyun Gao, Wujie Zheng, Yuetang Deng, David Lo, Jichuan Zeng, Michael R. Lyu, Irwin King May 2019


Research Collection School Of Computing and Information Systems

It is vital for popular mobile apps with large numbers of users to release updates with rich features while keeping a stable user experience. Timely and accurately locating emerging app issues can greatly help developers to maintain and update apps. User feedback (i.e., user reviews) is a crucial channel between app developers and users, delivering a stream of information about bugs and features that concern users. Methods to identify emerging issues based on user feedback have been proposed in the literature; however, their applicability in industry has not been explored. We apply the recent method IDEA to WeChat, a popular messenger …


On Reliability Of Patch Correctness Assessment, Xuan-Bach D. Le, Lingfeng Bao, David Lo, Xin Xia, Shanping Li, Corina S. Pasareanu May 2019


Research Collection School Of Computing and Information Systems

Current state-of-the-art automatic software repair (ASR) techniques rely heavily on incomplete specifications, or test suites, to generate repairs. This, however, may cause ASR tools to generate repairs that are incorrect and hard to generalize. To assess patch correctness, researchers have been following two methods separately: (1) Automated annotation, wherein patches are automatically labeled by an independent test suite (ITS) – a patch passing the ITS is regarded as correct or generalizable, and incorrect otherwise, (2) Author annotation, wherein authors of ASR techniques manually annotate the correctness labels of patches generated by their and competing tools. While automated annotation cannot ascertain …


PatchNet: A Tool For Deep Patch Classification, Thong Hoang, Julia Lawall, Richard J. Oentaryo, Yuan Tian, David Lo May 2019


Research Collection School Of Computing and Information Systems

This work proposes PatchNet, an automated tool based on hierarchical deep learning for classifying patches by extracting features from commit messages and code changes. PatchNet contains a deep hierarchical structure that mirrors the hierarchical and sequential structure of a code change, differentiating it from the existing deep learning models on source code. PatchNet provides several options allowing users to select parameters for the training process. The tool has been validated in the context of automatic identification of stable-relevant patches in the Linux kernel and is potentially applicable to automate other software engineering tasks that can be formulated as patch classification …


Detect Rumors On Twitter By Promoting Information Campaigns With Generative Adversarial Learning, Jing Ma, Wei Gao, Kam-Fai Wong May 2019


Research Collection School Of Computing and Information Systems

Rumors can cause devastating consequences to individuals and/or society. Analysis shows that the wide spread of rumors typically results from deliberately promoted information campaigns which aim to shape collective opinions on the concerned news events. In this paper, we attempt to fight such chaos with itself to make automatic rumor detection more robust and effective. Our idea is inspired by the adversarial learning method that originated from Generative Adversarial Networks (GAN). We propose a GAN-style approach, where a generator is designed to produce uncertain or conflicting voices, complicating the original conversational threads in order to pressurize the discriminator to learn stronger rumor indicative representations …


Beyond Autonomy: The Self And Life Of Social Agents, Budhitama Subagdja, Ah-Hwee Tan May 2019


Research Collection School Of Computing and Information Systems

Agents have gained popularity nowadays as virtual assistants and companions of their human users supporting daily activities in many aspects of personal life. Designed to be sociable, an agent engages its user(s) to communicate and even develop friendships. Rather than just as a lifeless toy, it is supposed to be perceived as an individual with its own personality, experiences, and social life. In this paper, we seek to highlight self-hood as another dimension that characterizes an agent. Besides levels of autonomy and reasoning, an agent can be defined based on its capacity to process and reflect on its own self …


AI Gets Real At Singapore's Changi Airport (Part 1), Steve Lee, Steven M. Miller May 2019


Asian Management Insights

Ranked as the best airport for seven consecutive years, Singapore’s Changi Airport is lauded the world over for the efficient, safe, pleasurable and seamless service it offers the millions of passengers that pass through its facilities annually. Much of Changi Airport’s success can be attributed to the organisation’s customer-oriented business focus and deeply embedded culture of service excellence, combined with a host of advanced technologies operating invisibly in the background. The framework for this technology enablement is Changi Airport Group’s (CAG’s) SMART Airport Vision—an enterprise-wide approach to connective technologies that leverages sensors, data fusion, data analytics, and artificial intelligence (AI), …


Adaptive Resonance Theory (ART) For Social Media Analytics, Lei Meng, Ah-Hwee Tan, Donald C. Wunsch II May 2019


Research Collection School Of Computing and Information Systems

The last decade has witnessed how social media in the era of Web 2.0 reshapes the way people communicate, interact, and entertain in daily life and incubates the prosperity of various user-centric platforms, such as social networking, question answering, massive open online courses (MOOC), and e-commerce platforms. The available rich user-generated multimedia data on the web has evolved traditional ways of understanding multimedia research and has led to numerous emerging topics on human-centric analytics and services, such as user profiling, social network mining, crowd behavior analysis, and personalized recommendation. Clustering, as an important tool for mining information groups and in-group …


Clustering And Its Extensions In The Social Media Domain, Lei Meng, Ah-Hwee Tan, Donald C. Wunsch May 2019


Research Collection School Of Computing and Information Systems

This chapter summarizes existing clustering and related approaches for the identified challenges as described in Sect. 1.2 and presents the key branches of social media mining applications where clustering holds a potential. Specifically, several important types of clustering algorithms are first illustrated, including clustering, semi-supervised clustering, heterogeneous data co-clustering, and online clustering. Subsequently, Sect. 2.5 presents a review on existing techniques that help decide the value of the predefined number of clusters (required by most clustering algorithms) automatically and highlights the clustering algorithms that do not require such a parameter. It better illustrates the challenge of input parameter sensitivity of …