Open Access. Powered by Scholars. Published by Universities.®

Physical Sciences and Mathematics Commons

Singapore Management University

Articles 391 - 420 of 7446

Full-Text Articles in Physical Sciences and Mathematics

Large Language Model Is Not A Good Few-Shot Information Extractor, But A Good Reranker For Hard Samples!, Yubo Ma, Yixin Cao, Yongchin Hong, Aixin Sun Dec 2023

Research Collection School Of Computing and Information Systems

Large Language Models (LLMs) have made remarkable strides in various tasks. However, whether they are competitive few-shot solvers for information extraction (IE) tasks and surpass fine-tuned small Pre-trained Language Models (SLMs) remains an open problem. This paper aims to provide a thorough answer to this problem, and moreover, to explore an approach towards effective and economical IE systems that combine the strengths of LLMs and SLMs. Through extensive experiments on nine datasets across four IE tasks, we show that LLMs are not effective few-shot information extractors in general, given their unsatisfactory performance in most settings and the high latency and …


Neural Multi-Objective Combinatorial Optimization With Diversity Enhancement, Jinbiao Chen, Zizhen Zhang, Zhiguang Cao, Yaoxin Wu, Yining Ma, Te Ye, Jiahai Wang Dec 2023

Research Collection School Of Computing and Information Systems

Most existing neural methods for multi-objective combinatorial optimization (MOCO) problems rely solely on decomposition, which often leads to repetitive solutions for the respective subproblems and thus a limited Pareto set. Beyond decomposition, we propose a novel neural heuristic with diversity enhancement (NHDE) to produce more Pareto solutions from two perspectives. On the one hand, to hinder duplicated solutions for different subproblems, we propose an indicator-enhanced deep reinforcement learning method to guide the model, and design a heterogeneous graph attention mechanism to capture the relations between the instance graph and the Pareto front graph. On the other hand, to excavate more …


Designing An Overseas Experiential Course In Data Science, Hua Leong Fwa, Graham Ng Dec 2023

Research Collection School Of Computing and Information Systems

Unprecedented demand for data science professionals in the industry has led to many educational institutions launching new data science courses. It is, however, imperative that students of data science programmes learn through execution of real-world, authentic projects on top of acquiring foundational knowledge on the basics of data science. In the process of working on authentic, real-world projects, students not only create new knowledge but also learn to solve open, sophisticated, and ill-structured problems in an inter-disciplinary fashion. In this paper, we detail our approach to designing a data science curriculum premised on learners solving authentic data science problems sourced …


Learning To Search Feasible And Infeasible Regions Of Routing Problems With Flexible Neural k-Opt, Yining Ma, Zhiguang Cao, Yew Meng Chee Dec 2023

Research Collection School Of Computing and Information Systems

In this paper, we present Neural k-Opt (NeuOpt), a novel learning-to-search (L2S) solver for routing problems. It learns to perform flexible k-opt exchanges based on a tailored action factorization method and a customized recurrent dual-stream decoder. As a pioneering work to circumvent the pure feasibility masking scheme and enable the autonomous exploration of both feasible and infeasible regions, we then propose the Guided Infeasible Region Exploration (GIRE) scheme, which supplements the NeuOpt policy network with feasibility-related features and leverages reward shaping to steer reinforcement learning more effectively. Besides, we further equip NeuOpt with dynamic data augmentations during inference for more …


DeepACO: Neural-Enhanced Ant Systems For Combinatorial Optimization, Haoran Ye, Jiarui Wang, Zhiguang Cao, Helan Liang, Yong Li Dec 2023

Research Collection School Of Computing and Information Systems

Ant Colony Optimization (ACO) is a meta-heuristic algorithm that has been successfully applied to various Combinatorial Optimization Problems (COPs). Traditionally, customizing ACO for a specific problem requires the expert design of knowledge-driven heuristics. In this paper, we propose DeepACO, a generic framework leveraging deep reinforcement learning to automate heuristic designs. DeepACO serves to strengthen the heuristic measures of existing ACO algorithms and dispense with laborious manual design in future ACO applications. As a neural-enhanced meta-heuristic, DeepACO consistently outperforms its ACO counterparts on eight COPs using a single neural model and a single set of hyperparameters. As a Neural Combinatorial Optimization …


Efficient Meta Neural Heuristic For Multi-Objective Combinatorial Optimization, Jinbiao Chen, Zizhen Zhang, Te Ye, Zhiguang Cao, Siyuan Chen, Jiahai Wang Dec 2023

Research Collection School Of Computing and Information Systems

Recently, neural heuristics based on deep reinforcement learning have exhibited promise in solving multi-objective combinatorial optimization problems (MOCOPs). However, they are still struggling to achieve high learning efficiency and solution quality. To tackle this issue, we propose an efficient meta neural heuristic (EMNH), in which a meta model is first trained and then fine-tuned with a few steps to solve corresponding single-objective subproblems. Specifically, for the training process, a (partial) architecture-shared multi-task model is leveraged to achieve parallel learning for the meta model, so as to speed up the training; meanwhile, a scaled symmetric sampling method with respect to the …


MetaBox: A Benchmark Platform For Meta-Black-Box Optimization With Reinforcement Learning, Zeyuan Ma, Hongshu Guo, Jiacheng Chen, Zhenrui Li, Guojun Peng, Yue-Jiao Gong, Yining Ma, Zhiguang Cao Dec 2023

Research Collection School Of Computing and Information Systems

Recently, Meta-Black-Box Optimization with Reinforcement Learning (MetaBBO-RL) has showcased the power of leveraging RL at the meta-level to mitigate manual fine-tuning of lower-level black-box optimizers. However, this field is hindered by the lack of a unified benchmark. To fill this gap, we introduce MetaBox, the first benchmark platform expressly tailored for developing and evaluating MetaBBO-RL methods. MetaBox offers a flexible algorithmic template that allows users to effortlessly implement their unique designs within the platform. Moreover, it provides a broad spectrum of over 300 problem instances, collected from synthetic to realistic scenarios, and an extensive library of 19 baseline methods, including …


Truncated Affinity Maximization: One-Class Homophily Modeling For Graph Anomaly Detection, Hezhe Qiao, Guansong Pang Dec 2023

Research Collection School Of Computing and Information Systems

We reveal a one-class homophily phenomenon, a prevalent property that we find empirically in real-world graph anomaly detection (GAD) datasets: normal nodes tend to have strong connection/affinity with each other, while the homophily in abnormal nodes is significantly weaker than that in normal nodes. However, this anomaly-discriminative property is ignored by existing GAD methods that are typically built using a conventional anomaly detection objective, such as data reconstruction. In this work, we explore this property to introduce a novel unsupervised anomaly scoring measure for GAD – local node affinity – that assigns a larger anomaly score to nodes that are …


A Poisson-Based Distribution Learning Framework For Short-Term Prediction Of Food Delivery Demand Ranges, Jian Liang, Jintao Ke, Hai Wang, Hongbo Ye, Jinjun Tang Dec 2023

Research Collection School Of Computing and Information Systems

The COVID-19 pandemic has caused a dramatic change in the demand composition of restaurants and, at the same time, catalyzed on-demand food delivery (OFD) services—such as DoorDash, Grubhub, and Uber Eats—to a large extent. With massive amounts of data on customers, drivers, and merchants, OFD platforms can achieve higher efficiency with better strategic and operational decisions; these include dynamic pricing, order bundling and dispatching, and driver relocation. Some of these decisions, and especially proactive decisions in real time, rely on accurate and reliable short-term predictions of demand ranges or distributions. In this paper, we develop a Poisson-based distribution prediction (PDP) …


Spatial-Temporal Episodic Memory Modeling For ADLs: Encoding, Retrieval, And Prediction, Xinjing Song, Di Wang, Chai Quek, Ah-Hwee Tan, Yanjiang Wang Dec 2023

Research Collection School Of Computing and Information Systems

Activities of daily living (ADLs) relate to people’s daily self-care activities, which reflect their living habits and lifestyle. A prior study presented a neural network model called STADLART for ADL routine learning. In this paper, we propose a cognitive model named Spatial-Temporal Episodic Memory for ADL (STEM-ADL), which extends STADLART to encode event sequences in the form of distributed episodic memory patterns. Specifically, STEM-ADL encodes each ADL and its associated contextual information as an event pattern and encodes all events in a day as an episode pattern. By explicitly encoding the temporal characteristics of events as activity gradient patterns, STEM-ADL …


Exploring Students' Adoption Of ChatGPT As A Mentor For Undergraduate Computing Projects: PLS-SEM Analysis, Gottipati Swapna, Kyong Jin Shim, Venky Shankararaman Dec 2023

Research Collection School Of Computing and Information Systems

As computing projects increasingly become a core component of undergraduate courses, effective mentorship is crucial for supporting students' learning and development. Our study examines the adoption of ChatGPT as a mentor for undergraduate computing projects. It explores the impact of ChatGPT mentorship, specifically skills development and mentor responsiveness, i.e., ChatGPT's responsiveness to students' needs and requests. We utilize PLS-SEM to investigate the interrelationships between different factors and develop a model that captures their contribution to the effectiveness of ChatGPT as a mentor. The findings suggest that mentor responsiveness and technical/design support are key factors for the adoption of AI tools …


Offline RL With Discrete Proxy Representations For Generalizability In POMDPs, Pengjie Gu, Xinyu Cai, Dong Xing, Xinrun Wang, Mengchen Zhao, Bo An Dec 2023

Research Collection School Of Computing and Information Systems

Offline Reinforcement Learning (RL) has demonstrated promising results in various applications by learning policies from previously collected datasets, reducing the need for online exploration and interactions. However, real-world scenarios usually involve partial observability, which poses crucial challenges for the deployment of offline RL methods: i) the policy trained on data with full observability is not robust against the masked observations during execution, and ii) the information of which parts of observations are masked is usually unknown during training. In order to address these challenges, we present Offline RL with DiscrEte pRoxy representations (ORDER), a probabilistic framework that leverages novel state …


Cue-CoT: Chain-Of-Thought Prompting For Responding To In-Depth Dialogue Questions With LLMs, Hongru Wang, Rui Wang, Fei Mi, Yang Deng, Zezhong Wang, Bin Liang, Ruifeng Xu, Kam-Fai Wong Dec 2023

Research Collection School Of Computing and Information Systems

Large Language Models (LLMs), such as ChatGPT, greatly empower dialogue systems with strong language understanding and generation capabilities. However, most previous work prompts the LLMs to directly generate a response based on the dialogue context, overlooking the underlying linguistic cues about the user status exhibited in the context. In such in-depth dialogue scenarios, it is challenging for existing LLMs to figure out the user’s hidden needs and respond satisfactorily through a single-step inference. To this end, we propose a novel linguistic cue-based chain-of-thoughts (Cue-CoT), which enhances the LLMs' inference with an intermediate reasoning step to find cues exhibited in the …


FlowPG: Action-Constrained Policy Gradient With Normalizing Flows, Brahmanage Janaka Chathuranga Thilakarathna, Jiajing Ling, Akshat Kumar Dec 2023

Research Collection School Of Computing and Information Systems

Action-constrained reinforcement learning (ACRL) is a popular approach for solving safety-critical and resource-allocation related decision making problems. A major challenge in ACRL is to ensure that the agent takes a valid action satisfying the constraints in each RL step. The commonly used approach of adding a projection layer on top of the policy network requires solving an optimization program, which can result in longer training time, slow convergence, and the zero-gradient problem. To address this, we first use a normalizing flow model to learn an invertible, differentiable mapping between the feasible action space and the support of a simple distribution on a latent variable, …


Generative Modelling Of Stochastic Actions With Arbitrary Constraints In Reinforcement Learning, Changyu Chen, Ramesha Karunasena, Thanh Hong Nguyen, Arunesh Sinha, Pradeep Varakantham Dec 2023

Research Collection School Of Computing and Information Systems

Many problems in Reinforcement Learning (RL) seek an optimal policy with large discrete multidimensional yet unordered action spaces; these include problems in randomized allocation of resources such as placements of multiple security resources and emergency response units, etc. A challenge in this setting is that the underlying action space is categorical (discrete and unordered) and large, for which existing RL methods do not perform well. Moreover, these problems require validity of the realized action (allocation); this validity constraint is often difficult to express compactly in a closed mathematical form. The allocation nature of the problem also prefers stochastic optimal policies, …


From Asset Flow To Status, Action And Intention Discovery: Early Malice Detection In Cryptocurrency, Ling Cheng, Feida Zhu, Yong Wang, Ruicheng Liang, Huiwen Liu Dec 2023

Research Collection School Of Computing and Information Systems

Cryptocurrency has been subject to illicit activities probably more often than traditional financial assets due to the pseudo-anonymous nature of its transacting entities. An ideal detection model is expected to achieve all three critical properties of early detection, good interpretability, and versatility for various illicit activities. However, existing solutions cannot meet all these requirements, as most of them heavily rely on deep learning without interpretability and are only available for retrospective analysis of a specific illicit type. To tackle all these challenges, we propose Intention Monitor for early malice detection in Bitcoin, where the on-chain record data for a certain …


Mitigating Membership Inference Attacks Via Weighted Smoothing, Minghan Tan, Xiaofei Xie, Jun Sun, Tianhao Wang Dec 2023

Research Collection School Of Computing and Information Systems

Recent advancements in deep learning have spotlighted a crucial privacy vulnerability to membership inference attacks (MIAs), where adversaries can determine if specific data was present in a training set, thus potentially revealing sensitive information. In this paper, we introduce a technique, weighted smoothing (WS), to mitigate MIA risks. Our approach is anchored on the observation that training samples differ in their vulnerability to MIA, primarily based on their distance to clusters of similar samples. The intuition is that clusters make model predictions more confident and thereby increase MIA risks. Thus, WS strategically introduces noise to training samples, depending on whether they …


Transformer-Based Multi-Task Learning For Crisis Actionability Extraction, Yuhao Zhang, Siaw Ling Lo, Phyo Yi Win Myint Dec 2023

Research Collection School Of Computing and Information Systems

Social media has become a valuable information source for crisis informatics. While various methods were proposed to extract relevant information during a crisis, their adoption by field practitioners remains low. In recent fieldwork, actionable information was identified as the primary information need for crisis responders and a key component in bridging the significant gap in existing crisis management tools. In this paper, we proposed a Crisis Actionability Extraction System for filtering, classification, phrase extraction, severity estimation, localization, and aggregation of actionable information altogether. We examined the effectiveness of transformer-based LSTM-CRF architecture in Twitter-related sequence tagging tasks and simultaneously extracted actionable …


C³: Code Clone-Based Identification Of Duplicated Components, Yanming Yang, Ying Zou, Xing Hu, David Lo, Chao Ni, John C. Grundy, Xin Xia Dec 2023

Research Collection School Of Computing and Information Systems

Reinventing the wheel is a detrimental programming practice in software development that frequently results in the introduction of duplicated components. This practice not only leads to increased maintenance and labor costs but also poses a higher risk of propagating bugs throughout the system. Despite numerous issues introduced by duplicated components in software, the identification of component-level clones remains a significant challenge that existing studies struggle to effectively tackle. Specifically, existing methods face two primary limitations that are challenging to overcome: 1) Measuring the similarity between different components presents a challenge due to the significant size differences among them; 2) Identifying …


Prompting And Evaluating Large Language Models For Proactive Dialogues: Clarification, Target-Guided, And Non-Collaboration, Yang Deng, Lizi Liao, Liang Chen, Hongru Wang, Wenqiang Lei, Tat-Seng Chua Dec 2023

Research Collection School Of Computing and Information Systems

Conversational systems based on Large Language Models (LLMs), such as ChatGPT, show exceptional proficiency in context understanding and response generation. However, they still have limitations, such as failing to ask clarifying questions in response to ambiguous queries or to refuse users’ unreasonable requests, both of which are considered key aspects of a conversational agent’s proactivity. This raises the question of whether LLM-based conversational systems are equipped to handle proactive dialogue problems. In this work, we conduct a comprehensive analysis of LLM-based conversational systems, specifically focusing on three key aspects of proactive dialogues: clarification, target-guided, and non-collaborative dialogues. To trigger the proactivity of …


A Black-Box Attack On Code Models Via Representation Nearest Neighbor Search, Jie Zhang, Wei Ma, Qiang Hu, Shangqing Liu, Xiaofei Xie, Yves Le Traon, Yang Liu Dec 2023

Research Collection School Of Computing and Information Systems

Existing methods for generating adversarial code examples face several challenges: limited availability of substitute variables, high verification costs for these substitutes, and the creation of adversarial samples with noticeable perturbations. To address these concerns, our proposed approach, RNNS, uses a search seed based on historical attacks to find potential adversarial substitutes. Rather than being used directly, the discrete substitutes are mapped to a continuous vector space using a pre-trained variable name encoder. Based on the vector representation, RNNS predicts and selects better substitutes for attacks. We evaluated the performance of RNNS across six coding tasks encompassing three programming languages: Java, …


A Big Data Approach To Augmenting The Huff Model With Road Network And Mobility Data For Store Footfall Prediction, Ming Hui Tan, Kar Way Tan, Hoong Chuin Lau Dec 2023

Research Collection School Of Computing and Information Systems

Conventional methodologies for new retail store catchment area and footfall estimation rely on ground surveys, which are costly and time-consuming. This study augments existing research in footfall estimation through the innovative integration of mobility data and road network data to create population-weighted centroids and delineate residential neighbourhoods via a community detection algorithm. Our findings are then used to enhance the Huff Model, which is commonly used in site selection and footfall estimation. Our approach demonstrates the vast potential of big data: by harnessing mobility data and road network information, it offers a cost-effective and scalable alternative. It obviates …


Class Participation: Using Technology To Enhance Efficiency And Fairness, Benjamin Gan, Eng Lieh Ouh Dec 2023

Research Collection School Of Computing and Information Systems

Class participation can be considered as contribution to discussion, attendance, presentations, unsolicited responses, questions, comments, etc. What counts may vary across individual teachers. The more students participate, the less memorization they do, and the more they engage in higher levels of thinking, including interpretation, analysis, and synthesis. However, only a handful of students in many classrooms participate regularly, a phenomenon dubbed "consolidation of responsibility". This study provides a literature review of in-class participation, as well as pedagogies and technologies that enhance participation. The pedagogies include active learning, group learning, project-based learning, and the flipped classroom. Technologies to automate attendance taking, …


Ethical Considerations For Artificial Intelligence In Educational Assessments, Lim Ming Soon Tristan, Gottipati Swapna, Michelle L. F. Cheong Dec 2023

Research Collection School Of Computing and Information Systems

In the vital context of education, the application of artificial intelligence (AI) to assessments necessitates a nuanced examination of the boundaries between ethically permissible and impermissible practices. In this chapter, the authors applied a systematic literature mapping methodology to scour extant research, so as to holistically structure the landscape into explicit topical research clusters. Through topic modelling and network analyses, the research mapped key ethical principles to different assessment phases in a triadic ontological framework. The chapter looks to provide researchers and practitioners with insights into the ethical challenges that exist across an end-to-end assessment pipeline.


Interpreting CodeBERT For Semantic Code Clone Detection, Shamsa Abid, Xuemeng Cai, Lingxiao Jiang Dec 2023

Research Collection School Of Computing and Information Systems

Accurate detection of semantic code clones has many applications in software engineering but is challenging because of lexical, syntactic, or structural dissimilarities in code. CodeBERT, a popular pre-trained code model based on deep neural networks, can detect code clones with high accuracy. However, its performance on unseen data is reported to be lower. A challenge is to interpret CodeBERT's clone detection behavior and isolate the causes of mispredictions. In this paper, we evaluate CodeBERT and interpret its clone detection behavior on the SemanticCloneBench dataset focusing on Java and Python clone pairs. We introduce the use of a black-box model interpretation …


Software Composition Analysis For Vulnerability Detection: An Empirical Study On Java Projects, Lida Zhao, Sen Chen, Zhengzi Xu, Lyuye Zhang, Jiahui Wu, Jun Sun, Yang Liu Dec 2023

Research Collection School Of Computing and Information Systems

Software composition analysis (SCA) tools are proposed to detect potential vulnerabilities introduced by open-source software (OSS) imported as third-party libraries (TPL). With the increasing complexity of software functionality, SCA tools may encounter various scenarios during the dependency resolution process, such as diverse formats of artifacts, diverse dependency imports, and diverse dependency specifications. However, a comprehensive evaluation of SCA tools for Java that takes the above scenarios into account is still lacking. This could lead to a confined interpretation of comparisons and improper use of tools, and could hinder further improvements of the tools. To fill this gap, we proposed an Evaluation Model …


DeepArc: Modularizing Neural Networks For The Model Maintenance, Xiaoning Ren, Yun Lin, Yinxing Xue, Ruofan Liu, Jun Sun, Zhiyong Feng, Jinsong Dong Dec 2023

Research Collection School Of Computing and Information Systems

Neural networks are an emerging data-driven programming paradigm widely used in many areas. Unlike traditional software systems consisting of decomposable modules, a neural network is usually delivered as a monolithic package, raising challenges for some maintenance tasks such as model restructure and re-adaption. In this work, we propose DeepArc, a novel modularization method for neural networks, to reduce the cost of model maintenance tasks. Specifically, DeepArc decomposes a neural network into several consecutive modules, each of which encapsulates consecutive layers with similar semantics. The network modularization facilitates practical tasks such as refactoring the model to preserve existing features (e.g., model …


Customer Cybersecurity And Supplier Cost Management Strategy, Xu Yang, Peng Liang, Nan Hu, Fujing Xue Dec 2023

Research Collection School Of Computing and Information Systems

In this paper, we explore the spillover effect of customer firms’ data breaches on their upstream supplier firms’ cost management strategies, proxied by cost stickiness. Our primary analyses suggest that data breaches suffered by customer firms are associated with a decrease in cost stickiness among supplier firms. Furthermore, the reductions in supplier cost stickiness are stronger if suppliers are managed by CEOs from national cultural groups with high uncertainty avoidance, low long-term orientations, and/or low individualism. In sum, the findings contribute to both Information Systems (IS) and Operations Management (OM) disciplines in terms of data breach, cost management strategy, and …


Novus Ex Machina: Realise Your Organisation’s Creative Potential With AI, Adam Tatarynowicz, Utz Claassen Nov 2023

Asian Management Insights

Innovation managers must learn how to harness AI’s transformative potential.