Open Access. Powered by Scholars. Published by Universities.®

Physical Sciences and Mathematics Commons

Databases and Information Systems

Articles 301 - 330 of 6717

Full-Text Articles in Physical Sciences and Mathematics

Semantically Constitutive Entities In Knowledge Graphs, Chong Cher Chia, Maksim Tkachenko, Hady Wirawan Lauw Aug 2023

Research Collection School Of Computing and Information Systems

Knowledge graphs are repositories of facts about a world. In this work, we seek to distill the set of entities or nodes in a knowledge graph into a specified number of constitutive nodes, whose embeddings would be retained. Intuitively, the remaining accessory nodes could have their original embeddings “forgotten”, and yet reconstitutable from those of the retained constitutive nodes. The constitutive nodes thus represent the semantically constitutive entities, which retain the core semantics of the knowledge graph. We propose a formulation as well as algorithmic solutions to minimize the reconstitution errors. The derived constitutive nodes are validated empirically both in …


An Adaptive Large Neighborhood Search For Heterogeneous Vehicle Routing Problem With Time Windows, Minh Pham Kien Nguyen, Aldy Gunawan, Vincent F. Yu, Mustafa Misir Aug 2023

Research Collection School Of Computing and Information Systems

The heterogeneous vehicle routing problem with time windows (HVRPTW) employs various vehicles with different capacities to serve upcoming pickup and delivery orders. We introduce an HVRPTW variant that reflects the practical needs of crowd-shipping by considering mass-rapid-transit stations as additional terminal points. A mixed integer linear programming model is formulated. An Adaptive Large Neighborhood Search (ALNS) based meta-heuristic is also developed, using a basic probabilistic selection strategy (roulette wheel) and Simulated Annealing. The proposed approach is empirically evaluated on a new set of benchmark instances. The computational results revealed that ALNS shows its clear advantage on the …


Sparsity Brings Vulnerabilities: Exploring New Metrics In Backdoor Attacks, Jianwen Tian, Kefan Qiu, Debin Gao, Zhi Wang, Xiaohui Kuang, Gang Zhao Aug 2023

Research Collection School Of Computing and Information Systems

Using AI-based detectors to keep pace with the rapid iteration of malware has attracted great attention. However, most AI-based malware detectors use features with vast sparse subspaces to characterize applications, which brings significant vulnerabilities to the model. To exploit this sparsity-related vulnerability, we propose a clean-label backdoor attack consisting of a dissimilarity metric-based candidate selection and a variation ratio-based trigger construction. The proposed backdoor is verified on different datasets, including a Windows PE dataset, an Android dataset with numerical and boolean feature values, and a PDF dataset. The experimental results show that the attack can slash the accuracy …


Balancing Utility And Fairness In Submodular Maximization, Yanhao Wang, Yuchen Li, Francesco Bonchi, Ying Wang Aug 2023

Research Collection School Of Computing and Information Systems

Submodular function maximization is a fundamental combinatorial optimization problem with plenty of applications – including data summarization, influence maximization, and recommendation. In many of these problems, the goal is to find a solution that maximizes the average utility over all users, for each of whom the utility is defined by a monotone submodular function. However, when the population of users is composed of several demographic groups, another critical problem is whether the utility is fairly distributed across different groups. Although the utility and fairness objectives are both desirable, they might contradict each other, and, to the best of our knowledge, …


Evolve Path Tracer: Early Detection Of Malicious Addresses In Cryptocurrency, Ling Cheng, Feida Zhu, Yong Wang, Ruicheng Liang, Huiwen Liu Aug 2023

Research Collection School Of Computing and Information Systems

With the boom of cryptocurrency and its concomitant financial risk concerns, detecting fraudulent behaviors and associated malicious addresses has been drawing significant research effort. Most existing studies, however, rely on full-history features or full-fledged address transaction networks, both of which are unavailable in the problem of early malicious address detection, making these studies unsuitable for the task. To detect fraudulent behaviors of malicious addresses in the early stage, we present Evolve Path Tracer, which consists of Evolve Path Encoder LSTM, Evolve Path Graph GCN, and Hierarchical Survival Predictor. Specifically, in addition to the general address features, we propose …


Three Essays On Artificial Intelligence In Business And Healthcare, Zongxi Liu Aug 2023

Theses and Dissertations

The big data era has provided researchers with challenges and opportunities for data-centric research. On the one hand, recent developments in AI technology have allowed advanced techniques to process text/image/audio/video and graph-structured data, providing new opportunities to employ big data for explanatory and predictive analytics in information systems research. On the other hand, the field requires a new level of artificial intelligence–transparent, robust, and ethical AI–to facilitate reliable business decision-making. My three dissertation essays apply, develop, and enhance state-of-the-art AI methods, leveraging various data sources as well as domain knowledge synthesis, to deal with issues in business and healthcare fields. …


Deep Weakly-Supervised Anomaly Detection, Guansong Pang, Chunhua Shen, Huidong Jin, Anton Van Den Hengel Aug 2023

Research Collection School Of Computing and Information Systems

Recent semi-supervised anomaly detection methods that are trained using small labeled anomaly examples and large unlabeled data (mostly normal data) have shown largely improved performance over unsupervised methods. However, these methods often focus on fitting abnormalities illustrated by the given anomaly examples only (i.e., seen anomalies), and consequently they fail to generalize to those that are not, i.e., new types/classes of anomaly unseen during training. To detect both seen and unseen anomalies, we introduce a novel deep weakly-supervised approach, namely Pairwise Relation prediction Network (PReNet), that learns pairwise relation features and anomaly scores by predicting the relation of any two …


Single-View View Synthesis With Self-Rectified Pseudo-Stereo, Yang Zhou, Hanjie Wu, Wenxi Liu, Zheng Xiong, Jing Qin, Shengfeng He Aug 2023

Research Collection School Of Computing and Information Systems

Synthesizing novel views from a single view image is a highly ill-posed problem. We discover an effective solution to reduce the learning ambiguity by expanding the single-view view synthesis problem to a multi-view setting. Specifically, we leverage the reliable and explicit stereo prior to generate a pseudo-stereo viewpoint, which serves as an auxiliary input to construct the 3D space. In this way, the challenging novel view synthesis process is decoupled into two simpler problems of stereo synthesis and 3D reconstruction. In order to synthesize a structurally correct and detail-preserved stereo image, we propose a self-rectified stereo synthesis to amend erroneous …


Survey On Sentiment Analysis: Evolution Of Research Methods And Topics, Jingfeng Cui, Zhaoxia Wang, Seng-Beng Ho, Erik Cambria Aug 2023

Research Collection School Of Computing and Information Systems

Sentiment analysis, one of the research hotspots in the natural language processing field, has attracted the attention of researchers, and research papers on the field are increasingly published. Many literature reviews on sentiment analysis involving techniques, methods, and applications have been produced using different survey methodologies and tools, but there has not been a survey dedicated to the evolution of research methods and topics of sentiment analysis. There have also been few survey works leveraging keyword co-occurrence on sentiment analysis. Therefore, this study presents a survey of sentiment analysis focusing on the evolution of research methods and topics. It incorporates …


The 4th International Workshop On Talent And Management Computing (TMC'2023): Editorial, Hengshu Zhu, Hui Xiong, Yong Ge, Ee-Peng Lim Aug 2023

Research Collection School Of Computing and Information Systems

In today's competitive and fast-evolving business environment, it is a critical time for organizations to rethink how to deal with talent and management related tasks in a quantitative manner. Indeed, thanks to the era of big data, the availability of large-scale talent data provides unparalleled opportunities for business leaders to understand the rules of talent and management, which in turn deliver intelligence for effective decision making and management for their organizations. In the past few years, talent and management computing has increasingly attracted attention from KDD communities, and a number of research/applied data science efforts have been devoted to it. To …


Knowledge Representation For Conceptual, Motivational, And Affective Processes In Natural Language Communication, Seng Beng Ho, Zhaoxia Wang, Boon-Kiat Quek, Erik Cambria Aug 2023

Research Collection School Of Computing and Information Systems

Natural language communication is an intricate and complex process. The speaker usually begins with an intention and motivation of what is to be communicated, and what outcomes are expected from the communication, while taking into consideration the listener’s mental model to concoct an appropriate sentence. Likewise, the listener has to interpret the speaker’s message, and respond accordingly, also with the speaker’s mental model in mind. Doing this successfully entails the appropriate representation of the conceptual, motivational, and affective processes that underlie language generation and understanding. Whereas big-data approaches in language processing (such as chatbots and machine translation) have performed well, …


A Survey On Proactive Dialogue Systems: Problems, Methods, And Prospects, Yang Deng, Wenqiang Lei, Wai Lam, Tat-Seng Chua Aug 2023

Research Collection School Of Computing and Information Systems

Proactive dialogue systems, related to a wide range of real-world conversational applications, equip the conversational agent with the capability of leading the conversation direction towards achieving pre-defined targets or fulfilling certain goals from the system side. It is empowered by advanced techniques to progress to more complicated tasks that require strategical and motivational interactions. In this survey, we provide a comprehensive overview of the prominent problems and advanced designs for conversational agent's proactivity in different types of dialogues. Furthermore, we discuss challenges that meet the real-world application needs but require a greater research focus in the future. We hope that …


Corporate Trade War Uncertainty And Patent Bubble, Xu Yang, Nan Hu, Peng Liang Aug 2023

Research Collection School Of Computing and Information Systems

This paper draws upon resource dependence theory and investigates how trade policy uncertainty affects firm strategic innovation management in China. Adopting a machine learning approach called Word2Vec from computational linguistics, we construct and validate a measure of firm-level managers’ perceived trade war uncertainty (TWU). We find that TWU has a positive effect on the number of total patent applications, but this positive effect is totally driven by low-quality patents instead of high-quality patents. Moreover, we document that firms have stronger incentives for such strategic innovation behavior when the underlying firms are more financially constrained, and/or when the management is more …


Characterization And Estimation Of Musculoskeletal Pain Using Machine Learning, Boluwatife Faremi Jul 2023

Master's Theses

Traditional scales utilized for recording pain are known to be highly subjective and biased due to inaccuracies in recollecting actual pain intensities. As a result, machine learning (ML) models that are trained using these scores as ground truth are reported to have low performance for objective pain classification because of the huge disparity between what was felt in moments of pain and the scores recorded afterward.

In the present study, two devices were designed for gathering real-time, continuous in-session subjective pain scores and recording autonomic nervous system (ANS)-altered electrodermal activity (EDA). Twenty-four participants were recruited to …


Interposition Based Container Optimization For Data Intensive Applications, Rohan Tikmany Jul 2023

College of Computing and Digital Media Dissertations

Reproducibility of applications is paramount in several scenarios such as collaborative work and software testing. Containers provide an easy way of addressing reproducibility by packaging the application's software and data dependencies into one executable unit, which can be executed multiple times in different environments. With the increased use of containers in industry as well as academia, current research has examined the provisioning and storage cost of containers and has shown that container deployments often include unnecessary software packages. Current methods to optimize the container size prune unnecessary data at the granularity of files and thus make binary decisions. We show …


How Technology May Be Used For Future Disease Predictions, Rich P. Manprisio Jul 2023

Journal of Applied Disciplines

Exasperated by the ongoing global pandemic, the healthcare system is grappling with the formidable challenges posed by proper and effective disease treatments. Nevertheless, amidst these growing difficulties, the healthcare field has witnessed significant technological advancements, offering promising avenues for disease prediction. Notably, a positive correlation exists between the utilization of technologies and their potential to serve as valuable tools for disease prediction. As our reliance on technological sophistication continues progressing, current research highlights numerous viable options to augment the healthcare sector. This review explores the current state of utilizing technologies and their potential to enhance healthcare, shedding light on their …


Product Question Answering In E-Commerce: A Survey, Yang Deng, Wenxuan Zhang, Qian Yu, Wai Lam Jul 2023

Research Collection School Of Computing and Information Systems

Product question answering (PQA), aiming to automatically provide instant responses to customer’s questions in E-Commerce platforms, has drawn increasing attention in recent years. Compared with typical QA problems, PQA exhibits unique challenges such as the subjectivity and reliability of user-generated contents in E-commerce platforms. Therefore, various problem settings and novel methods have been proposed to capture these special characteristics. In this paper, we aim to systematically review existing research efforts on PQA. Specifically, we categorize PQA studies into four problem settings in terms of the form of provided answers. We analyze the pros and cons, as well as present existing …


Beyond Anthropomorphism: Unraveling The True Priorities Of Chatbot Usage In SMEs, Tamas Makany, Sungjong Roh, Kotaro Hara, Jie Min Hua, Felicia Si Ying Goh, Wilson Yang Jie Teh Jul 2023

Research Collection Lee Kong Chian School Of Business

This study examined business communication practices with chatbots among various Small and Medium Enterprise (SME) stakeholders in Singapore, including business owners/employees, customers, and developers. Through qualitative interviews and chatbot transcript analysis, we investigated two research questions: (1) How do the expectations of SME stakeholders compare to the conversational design of SME chatbots? and (2) What are the business reasons for SMEs to add human-like features to their chatbots? Our findings revealed that functionality is more crucial than anthropomorphic characteristics, such as personality and name. Stakeholders preferred chatbots that explicitly identified themselves as machines to set appropriate expectations. Customers prioritized efficiency, …


Analyzing Taxi Drivers’ Decision-Making And Recommending Strategies For Enhanced Performance: A Data-Driven Approach, Mengyu Ji Jul 2023

Dissertations and Theses Collection (Open Access)

This thesis focuses on analyzing the decision-making process of taxi drivers and providing data-driven strategies to enhance their performance. By examining comprehensive historical data encompassing passenger demand patterns, drivers’ spatial dynamics, and fare structures, valuable insights are gained into drivers’ choices regarding optimal routes, timing, and areas with high demand. Integrating real-time information sources, such as GPS data and passenger updates, allows drivers to adapt their strategies dynamically to changing traffic conditions and emerging demand patterns. Predictive analytics models, including ARIMA, XGBoost, and Linear Regression, are utilized to forecast demand flow at key locations, enabling proactive decision-making and …


Mitigating Adversarial Attacks On Data-Driven Invariant Checkers For Cyber-Physical Systems, Rajib Ranjan Maiti, Cheah Huei Yoong, Venkata Reddy Palleti, Arlindo Silva, Christopher M. Poskitt Jul 2023

Research Collection School Of Computing and Information Systems

The use of invariants in developing security mechanisms has become an attractive research area because of their potential to both prevent attacks and detect attacks in Cyber-Physical Systems (CPS). In general, an invariant is a property that is expressed using design parameters along with Boolean operators and which always holds in normal operation of a system, in particular, a CPS. Invariants can be derived by analysing operational data of various design parameters in a running CPS, or by analysing the system's requirements/design documents, with both of the approaches demonstrating significant potential to detect and prevent cyber-attacks on a CPS. While …


A Data-Driven Approach For Scheduling Bus Services Subject To Demand Constraints, Brahmanage Janaka Chathuranga Thilakarathna, Thivya Kandappu, Baihua Zheng Jul 2023

Research Collection School Of Computing and Information Systems

Passenger satisfaction is extremely important for the success of a public transportation system. Many studies have shown that passenger satisfaction strongly depends on the time they have to wait at the bus stop (waiting time) to get on a bus. To be specific, user satisfaction drops faster as the waiting time increases. Therefore, service providers want to provide a bus to the waiting passengers within a threshold to keep them satisfied. It is a two-pronged problem: (a) to satisfy more passengers the transport planner may increase the frequency of the buses, and (b) in turn, the increased frequency may impact …


Conference Report On 2022 IEEE Symposium Series On Computational Intelligence (IEEE SSCI 2022), Ah-Hwee Tan, Dipti Srinivasan, Chunyan Miao Jul 2023

Research Collection School Of Computing and Information Systems

On behalf of the organizing committee, we are delighted to deliver this conference report for the 2022 IEEE Symposium Series on Computational Intelligence (SSCI 2022), which was held in Singapore from 4th to 7th December 2022. IEEE SSCI is an established flagship annual international series of symposia on computational intelligence (CI) sponsored by the IEEE Computational Intelligence Society (CIS) to promote and stimulate discussions on the latest theory, algorithms, applications, and emerging topics on computational intelligence. After two years of virtual conferences due to the global pandemic, IEEE SSCI returned as an in-person meeting with online elements in 2022.


Augmenting Low-Resource Text Classification With Graph-Grounded Pre-Training And Prompting, Zhihao Wen, Yuan Fang Jul 2023

Research Collection School Of Computing and Information Systems

Text classification is a fundamental problem in information retrieval with many real-world applications, such as predicting the topics of online articles and the categories of e-commerce product descriptions. However, low-resource text classification, with few or no labeled samples, poses a serious concern for supervised learning. Meanwhile, many text data are inherently grounded on a network structure, such as a hyperlink/citation network for online articles, and a user-item purchase network for e-commerce products. These graph structures capture rich semantic relationships, which can potentially augment low-resource text classification. In this paper, we propose a novel model called Graph-Grounded Pre-training and Prompting (G2P2) …


Do-Good: Towards Distribution Shift Evaluation For Pre-Trained Visual Document Understanding Models, Jiabang He, Yi Hu, Lei Wang, Xing Xu, Ning Liu, Hui Liu Jul 2023

Research Collection School Of Computing and Information Systems

Numerous pre-training techniques for visual document understanding (VDU) have recently shown substantial improvements in performance across a wide range of document tasks. However, these pre-trained VDU models cannot guarantee continued success when the distribution of test data differs from the distribution of training data. In this paper, to investigate how robust existing pre-trained VDU models are to various distribution shifts, we first develop an out-of-distribution (OOD) benchmark termed Do-GOOD for the fine-Grained analysis on Document image-related tasks specifically. The Do-GOOD benchmark defines the underlying mechanisms that result in different distribution shifts and contains 9 OOD datasets covering 3 VDU related …


Multi-Target Backdoor Attacks For Code Pre-Trained Models, Yanzhou Li, Shangqing Liu, Kangjie Chen, Xiaofei Xie, Tianwei Zhang, Yang Liu Jul 2023

Research Collection School Of Computing and Information Systems

Backdoor attacks for neural code models have gained considerable attention due to the advancement of code intelligence. However, most existing works insert triggers into task-specific data for code-related downstream tasks, thereby limiting the scope of attacks. Moreover, the majority of attacks for pre-trained models are designed for understanding tasks. In this paper, we propose task-agnostic backdoor attacks for code pre-trained models. Our backdoored model is pre-trained with two learning strategies (i.e., Poisoned Seq2Seq learning and token representation learning) to support the multi-target attack of downstream code understanding and generation tasks. During the deployment phase, the implanted backdoors in the victim …


Few-Shot Event Detection: An Empirical Study And A Unified View, Yubo Ma, Zehao Wang, Yixin Cao, Aixin Sun Jul 2023

Research Collection School Of Computing and Information Systems

Few-shot event detection (ED) has been widely studied, but this has brought noticeable discrepancies, e.g., in motivations, tasks, and experimental settings, that hinder the understanding of models for future progress. This paper presents a thorough empirical study, a unified view of ED models, and a better unified baseline. For fair evaluation, we compare 12 representative methods on three datasets, which are roughly grouped into prompt-based and prototype-based models for detailed analysis. Experiments consistently demonstrate that prompt-based methods, including ChatGPT, still significantly trail prototype-based methods in terms of overall performance. To investigate their superior performance, we break down their design elements along …


Take A Break In The Middle: Investigating Subgoals Towards Hierarchical Script Generation, Xinze Li, Yixin Cao, Muhao Chen, Aixin Sun Jul 2023

Research Collection School Of Computing and Information Systems

Goal-oriented Script Generation is a new task of generating a list of steps that can fulfill the given goal. In this paper, we propose to extend the task from the perspective of cognitive theory. Instead of a simple flat structure, the steps are typically organized hierarchically: humans often decompose a complex task into subgoals, where each subgoal can be further decomposed into steps. To establish the benchmark, we contribute a new dataset, propose several baseline methods, and set up evaluation metrics. Both automatic and human evaluation verify the high quality of the dataset, as well as the effectiveness of incorporating subgoals …


Large-Scale Correlation Analysis Of Automated Metrics For Topic Models, Jia Peng Lim, Hady Wirawan Lauw Jul 2023

Research Collection School Of Computing and Information Systems

Automated coherence metrics constitute an important and popular way to evaluate topic models. Previous works present a mixed picture of their presumed correlation with human judgement. In this paper, we conduct a large-scale correlation analysis of coherence metrics. We propose a novel sampling approach to mine topics for the purpose of metric evaluation, and conduct the analysis via three large corpora showing that certain automated coherence metrics are correlated. Moreover, we extend the analysis to measure topical differences between corpora. Lastly, we examine the reliability of human judgement by conducting an extensive user study, which is designed as an amalgamation …


Reducing Spatial Labeling Redundancy For Active Semi-Supervised Crowd Counting, Yongtuo Liu, Sucheng Ren, Liangyu Chai, Hanjie Wu, Dan Xu, Jing Qin, Shengfeng He Jul 2023

Research Collection School Of Computing and Information Systems

Labeling is onerous for crowd counting, as every individual in each crowd image must be annotated. Recently, several methods have been proposed for semi-supervised crowd counting to reduce the labeling effort. Given a limited labeling budget, they typically select a few crowd images and densely label all individuals in each of them. Despite the promising results, we argue that the None-or-All labeling strategy is suboptimal, as the densely labeled individuals in each crowd image usually appear similar while the massive unlabeled crowd images may contain entirely diverse individuals. To this end, we propose to break the labeling chain of previous methods and …

