Open Access. Powered by Scholars. Published by Universities.®

Physical Sciences and Mathematics Commons

Databases and Information Systems

Articles 991 - 1020 of 6720

Full-Text Articles in Physical Sciences and Mathematics

Is Multi-Hop Reasoning Really Explainable? Towards Benchmarking Reasoning Interpretability, Xin Lv, Yixin Cao, Lei Hou, Juanzi Li, Zhiyuan Liu, Yichi Zhang, Zelin Dai Nov 2021

Research Collection School Of Computing and Information Systems

Multi-hop reasoning has been widely studied in recent years to obtain more interpretable link prediction. However, we find in experiments that many paths given by these models are actually unreasonable, while little work has been done on interpretability evaluation for them. In this paper, we propose a unified framework to quantitatively evaluate the interpretability of multi-hop reasoning models so as to advance their development. Specifically, we define three metrics for evaluation (path recall, local interpretability, and global interpretability) and design an approximate strategy to calculate these metrics using the interpretability scores of rules. We manually annotate all possible …


Topic Modeling For Multi-Aspect Listwise Comparison, Delvin Ce Zhang, Hady W. Lauw Nov 2021

Research Collection School Of Computing and Information Systems

As a well-established probabilistic method, topic models seek to uncover latent semantics from plain text. In addition to having textual content, we observe that documents are usually compared in listwise rankings based on their content. For instance, countries worldwide are compared in an international ranking in terms of electricity production based on their national reports. Such document comparisons constitute additional information that reveals documents' relative similarities. Incorporating them into topic modeling could yield comparative topics that help to differentiate and rank documents. Furthermore, based on different comparison criteria, the observed document comparisons usually cover multiple aspects, each expressing a distinct …


An Economic Analysis Of Rebates Conditional On Positive Reviews, Jianqing Chen, Zhiling Guo, Jian Huang Nov 2021

Research Collection School Of Computing and Information Systems

Strategic sellers on some online selling platforms have recently been using a conditional-rebate strategy to manipulate product reviews under which only purchasing consumers who post positive reviews online are eligible to redeem the rebate. A key concern for the conditional rebate is that it can easily induce fake reviews, which might be harmful to consumers and society. We develop a microbehavioral model capturing consumers’ review-sharing benefit, review-posting cost, and moral cost of lying to examine the seller’s optimal pricing and rebate decisions. We derive three equilibria: the no-rebate, organic-review equilibrium; the low-rebate, boosted-authentic-review equilibrium; and the high-rebate, partially-fake-review equilibrium. We …


Factual Consistency Evaluation For Text Summarization Via Counterfactual Estimation, Yuexiang Xie, Fei Sun, Yang Deng, Yaliang Li, Bolin Ding Nov 2021

Research Collection School Of Computing and Information Systems

Although significant progress has been achieved in text summarization, factual inconsistency in generated summaries still severely limits its practical applications. Among the key factors to ensure factual consistency, a reliable automatic evaluation metric is the first and the most crucial one. However, existing metrics either neglect the intrinsic cause of the factual inconsistency or rely on auxiliary tasks, leading to an unsatisfactory correlation with human judgments or increasing the inconvenience of usage in practice. In light of these challenges, we propose a novel metric to evaluate the factual consistency in text summarization via counterfactual estimation, which formulates the causal relationship …


Exploiting Reasoning Chains For Multi-Hop Science Question Answering, Weiwen Xu, Yang Deng, Huihui Zhang, Deng Cai, Wai Lam Nov 2021

Research Collection School Of Computing and Information Systems

We propose a novel Chain Guided Retriever-reader (CGR) framework to model the reasoning chain for multi-hop Science Question Answering. Our framework is capable of performing explainable reasoning without the need for any corpus-specific annotations, such as the ground-truth reasoning chain or human-annotated entity mentions. Specifically, we first generate reasoning chains from a semantic graph constructed by Abstract Meaning Representation of retrieved evidence facts. A Chain-aware loss, concerning both local and global chain information, is also designed to enable the generated chains to serve as distant supervision signals for training the retriever, where reinforcement learning is also adopted to maximize the …


Information Extraction And Classification On Journal Papers, Lei Yu Nov 2021

Department of Computer Science and Engineering: Dissertations, Theses, and Student Research

The importance of journals for diffusing the results of scientific research has increased considerably. In the digital era, Portable Document Format (PDF) became the established format of electronic journal articles. This structured form, combined with regular and wide dissemination, spreads scientific advancements easily and quickly. However, the rapidly increasing number of published scientific articles requires more time and effort for systematic literature reviews, searches and screens. The comprehension and extraction of useful information from digital documents is also a challenging task, due to the complex structure of PDF.

To help a soil science team from the United States …


Transforming Businesses With E-Commerce Intelligence, Yuanto Kusnadi, Gary Pan Nov 2021

Research Collection School Of Accountancy

2020 was an extraordinary year as the Covid-19 pandemic struck almost all countries in the world and had an extraordinary impact on businesses worldwide. Singapore and many other Southeast Asian countries were not spared and had to implement lockdowns swiftly. To cope with physical store closures and the increased volume of online transactions, most businesses tried to revamp their business models and set up online stores to capitalise on the rise of the e-commerce wave. With the growing trend of online transactions, it has become imperative for companies operating in the Fast Moving Consumer Goods (FMCG) industry to track …


Aspect-Based Sentiment Analysis In Question Answering Forums, Wenxuan Zhang, Yang Deng, Xin Li, Lidong Bing, Wai Lam Nov 2021

Research Collection School Of Computing and Information Systems

Aspect-based sentiment analysis (ABSA) typically focuses on extracting aspects and predicting their sentiments on individual sentences such as customer reviews. Recently, another kind of opinion sharing platform, namely the question answering (QA) forum, has gained increasing popularity and accumulates a large number of user opinions towards various aspects. This motivates us to investigate the task of ABSA on QA forums (ABSA-QA), aiming to jointly detect the discussed aspects and their sentiment polarities for a given QA pair. Unlike review sentences, a QA pair is composed of two parallel sentences, which requires interaction modeling to align the aspect mentioned in the question …


Aspect Sentiment Quad Prediction As Paraphrase Generation, Wenxuan Zhang, Yang Deng, Xin Li, Yifei Yuan, Lidong Bing, Wai Lam Nov 2021

Research Collection School Of Computing and Information Systems

Aspect-based sentiment analysis (ABSA) has been extensively studied in recent years, which typically involves four fundamental sentiment elements, including the aspect category, aspect term, opinion term, and sentiment polarity. Existing studies usually consider the detection of partial sentiment elements, instead of predicting the four elements in one shot. In this work, we introduce the Aspect Sentiment Quad Prediction (ASQP) task, aiming to jointly detect all sentiment elements in quads for a given opinionated sentence, which can reveal a more comprehensive and complete aspect-level sentiment structure. We further propose a novel Paraphrase modeling paradigm to cast the ASQP task to a …


Stock Market Trend Forecasting Based On Multiple Textual Features: A Deep Learning Method, Zhenda Hu, Zhaoxia Wang, Seng-Beng Ho, Ah-Hwee Tan Nov 2021

Research Collection School Of Computing and Information Systems

Stock market trend forecasting is a valuable and challenging research task for both industry and academia. In order to explore the influence of stock news information on the stock market trend, a textual embedding construction method is proposed to encode multiple textual features, including topic features, sentiment features, and semantic features extracted from stock news textual content. In addition, a deep learning method is designed by using financial data and multiple textual features obtained from multiple news textual embeddings for short-term stock market trend prediction. For evaluation, extensive experiments on real stock market data are conducted. The experimental results illustrate …


Contrastive Pre-Training Of GNNs On Heterogeneous Graphs, Xunqiang Jiang, Yuanfu Lu, Yuan Fang, Chuan Shi Nov 2021

Research Collection School Of Computing and Information Systems

While graph neural networks (GNNs) emerge as the state-of-the-art representation learning methods on graphs, they often require a large amount of labeled data to achieve satisfactory performance, which is often expensive or unavailable. To relieve the label scarcity issue, some pre-training strategies have been devised for GNNs, to learn transferable knowledge from the universal structural properties of the graph. However, existing pre-training strategies are only designed for homogeneous graphs, in which each node and edge belongs to the same type. In contrast, a heterogeneous graph embodies rich semantics, as multiple types of nodes interact with each other via different kinds …


Finding A Needle In A Haystack: Automatic Mining Of Silent Vulnerability Fixes, Jiayuan Zhou, Michael Pacheco, Zhiyuan Wan, Xin Xia, David Lo, Yuan Wang, Ahmed E. Hassan Nov 2021

Research Collection School Of Computing and Information Systems

Following the coordinated vulnerability disclosure model, a vulnerability in open source software (OSS) is suggested to be fixed “silently”, without disclosing the fix until the vulnerability is disclosed. Yet, it is crucial for OSS users to be aware of vulnerability fixes as early as possible, as once a vulnerability fix is pushed to the source code repository, a malicious party could probe for the corresponding vulnerability to exploit it. In practice, OSS users often rely on the vulnerability disclosure information from security advisories (e.g., National Vulnerability Database) to sense vulnerability fixes. However, the time between the availability of a vulnerability …


Learning To Teach And Learn For Semi-Supervised Few-Shot Image Classification, Xinzhe Li, Jianqiang Huang, Yaoyao Liu, Qin Zhou, Shibao Zheng, Bernt Schiele, Qianru Sun Nov 2021

Research Collection School Of Computing and Information Systems

This paper presents a novel semi-supervised few-shot image classification method named Learning to Teach and Learn (LTTL) to effectively leverage unlabeled samples in small-data regimes. Our method is based on self-training, which assigns pseudo labels to unlabeled data. However, the conventional pseudo-labeling operation relies heavily on the initial model trained using a handful of labeled data and may produce many noisily labeled samples. We propose to solve the problem in three steps: first, cherry-picking searches for valuable samples from pseudo-labeled data by using a soft weighting network; then, cross-teaching allows the classifiers to teach mutually for rejecting more noisy …


Transfer-Learned Pruned Deep Convolutional Neural Networks For Efficient Plant Classification In Resource-Constrained Environments, Martinson Ofori Nov 2021

Masters Theses & Doctoral Dissertations

Traditional means of on-farm weed control mostly rely on manual labor. This process is time-consuming, costly, and contributes to major yield losses. Further, the conventional application of chemical weed control can be economically and environmentally inefficient. Site-specific weed management (SSWM) counteracts this by reducing the amount of chemical application with localized spraying of weed species. To solve this using computer vision, precision agriculture researchers have used remote sensing weed maps, but this has been largely ineffective for early season weed control due to problems such as solar reflectance and cloud cover in satellite imagery. With the current advances in artificial …


On Aggregating Salaries Of Occupations From Job Post And Review Data, Chih-Chieh Hung, Ee-Peng Lim Nov 2021

Research Collection School Of Computing and Information Systems

The popularity of job websites has significantly changed the way people learn about different occupations. Among the insights offered by these websites are the statistics of occupation salaries which are useful information for job seekers, career coaches, graduating students, and labor related government agencies. Such statistics include the distribution of job salaries of each occupation, such as average or quantiles. However, significant variability in salary (and review salary) can be found among jobs of the same occupation as we gather job post and review data from job websites. Such variability shows the existence of biases, including salary competitiveness in job …


K-Sums Clustering: A Stochastic Optimization Approach, Zhao Wan-Lei, Shi Ying Lan, Run-Qing Chen, Chong-Wah Ngo Nov 2021

Research Collection School Of Computing and Information Systems

In this paper, we revisit the decades-old clustering method k-means. The chicken-and-egg loop in traditional k-means has been replaced by a pure stochastic optimization procedure. The optimization is undertaken from the perspective of each individual sample. Different from existing incremental k-means, an individual sample is tentatively joined into a new cluster to evaluate its distance to the corresponding new centroid, in which the contribution from this sample is accounted for. The sample is actually moved to this new cluster only after we find that the reallocation makes the sample closer to the new centroid than it is to the current one. Compared …
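
The reallocation test described in this abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `stochastic_reassign`, the random choice of candidate cluster, the round-robin initialization, and the rule that a cluster is never emptied are all assumptions made only for illustration.

```python
import random


def sq_dist(p, q):
    """Squared Euclidean distance between two equal-length points."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def stochastic_reassign(points, k, epochs=20, seed=0):
    """Hypothetical sketch of sample-wise reallocation (not the paper's code).

    A sample is moved to a candidate cluster only if it would be closer to
    that cluster's centroid recomputed WITH the sample than it is to the
    centroid of its current cluster.
    """
    if len(points) < k:
        raise ValueError("need at least k points")
    rng = random.Random(seed)
    dim = len(points[0])

    # Assumed initialization: round-robin labels, shuffled, so that every
    # cluster starts with at least one member.
    labels = [i % k for i in range(len(points))]
    rng.shuffle(labels)

    # Keep per-cluster coordinate sums and counts so centroids are cheap.
    sums = [[0.0] * dim for _ in range(k)]
    counts = [0] * k
    for p, c in zip(points, labels):
        counts[c] += 1
        for d in range(dim):
            sums[c][d] += p[d]

    order = list(range(len(points)))
    for _ in range(epochs):
        rng.shuffle(order)  # visit samples in random order
        for i in order:
            x, a = points[i], labels[i]
            if counts[a] == 1:
                continue  # assumption: never empty a cluster
            b = rng.randrange(k)  # assumption: one random candidate cluster
            if b == a:
                continue
            current = [s / counts[a] for s in sums[a]]
            # Tentative centroid of b, with x's contribution accounted for.
            tentative = [(s + v) / (counts[b] + 1) for s, v in zip(sums[b], x)]
            if sq_dist(x, tentative) < sq_dist(x, current):
                for d in range(dim):
                    sums[a][d] -= x[d]
                    sums[b][d] += x[d]
                counts[a] -= 1
                counts[b] += 1
                labels[i] = b
    return labels
```

Calling `stochastic_reassign(points, k=2)` on a list of coordinate tuples returns one cluster label per point. Maintaining running sums and counts lets each tentative centroid be computed without rescanning the cluster.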


Automating User Notice Generation For Smart Contract Functions, Xing Hu, Zhipeng Gao, Xin Xia, David Lo, Xiaohu Yang Nov 2021

Research Collection School Of Computing and Information Systems

Smart contracts have attracted much attention and are crucial for automatic financial and business transactions. End-users who have never seen the source code can read the user notice shown in the end-user client to understand what a transaction of a smart contract function does. However, due to time constraints or lack of motivation, user notices are often missing during the development of smart contracts. End-users who lack the information in user notices have no easy way to check the code semantics of the smart contracts. Thus, in this paper, we propose a new approach …


Investigating The Effects Of Dimension-Specific Sentiments On Product Sales: The Perspective Of Sentiment Preferences, Cuiqing Jiang, Jianfei Wang, Qian Tang, Xiaozhong Lyu Nov 2021

Research Collection School Of Computing and Information Systems

While literature has reached a consensus on the awareness effect of online word-of-mouth (eWOM), this paper studies its persuasive effect, specifically, the dimension-specific sentiment effects on product sales. We allow the sentiment information in eWOM along different product dimensions to have different persuasive effects on consumers’ purchase decisions. This occurs because of consumers’ sentiment preference, which is defined as the relative importance consumers place on various dimension-specific sentiments. We use an aspect-level sentiment analysis to derive the dimension-specific sentiments and PVAR (panel vector auto-regression) models to estimate their effects on product sales using a movie panel dataset. The findings show …


Representation Learning On Multi-Layered Heterogeneous Network, Delvin Ce Zhang, Hady W. Lauw Nov 2021

Research Collection School Of Computing and Information Systems

Network data can often be represented in a multi-layered structure with rich semantics. One example is e-commerce data, containing a user-user social network layer and an item-item context layer, with cross-layer user-item interactions. Given the dual characteristics of homogeneity within each layer and heterogeneity across layers, we seek to learn node representations from such a multi-layered heterogeneous network while jointly preserving structural information and network semantics. In contrast, previous works on network embedding mainly focus on single-layered or homogeneous networks with one type of nodes and links. In this paper we propose intra- and cross-layer proximity concepts. Intra-layer proximity simulates propagation along …


Building Legal Datasets, Jerrold Soh Nov 2021

Research Collection Yong Pung How School Of Law

Data-centric AI calls for better, not just bigger, datasets. As data protection laws with extra-territorial reach proliferate worldwide, ensuring datasets are legal is an increasingly crucial yet overlooked component of “better”. To help dataset builders become more willing and able to navigate this complex legal space, this paper reviews key legal obligations surrounding ML datasets, examines the practical impact of data laws on ML pipelines, and offers a framework for building legal datasets.


Expediting The Accuracy-Improving Process Of SVMs For Class Imbalance Learning, Bin Cao, Yuqi Liu, Chenyu Hou, Jing Fan, Baihua Zheng, Jianwei Jin Nov 2021

Research Collection School Of Computing and Information Systems

To improve the classification performance of support vector machines (SVMs) on imbalanced datasets, cost-sensitive learning methods have been proposed, e.g., DEC (Different Error Costs) and FSVM-CIL (Fuzzy SVM for Class Imbalance Learning). They relocate the hyperplane by adjusting the costs associated with misclassifying samples. However, the error costs are determined either empirically or by performing an exhaustive search in the parameter space. Neither strategy can guarantee effectiveness and efficiency simultaneously. In this paper, we propose ATEC, a solution that can efficiently find a preferable hyperplane by automatically tuning the error cost for between-class samples. ATEC distinguishes itself from all …


On A Multistage Discrete Stochastic Optimization Problem With Stochastic Constraints And Nested Sampling, Thuy Anh Ta, Tien Mai, Fabian Bastin, Pierre L'Ecuyer Nov 2021

Research Collection School Of Computing and Information Systems

We consider a multistage stochastic discrete program in which constraints on any stage might involve expectations that cannot be computed easily and are approximated by simulation. We study a sample average approximation (SAA) approach that uses nested sampling, in which at each stage, a number of scenarios are examined and a number of simulation replications are performed for each scenario to estimate the next-stage constraints. This approach provides an approximate solution to the multistage problem. To establish the consistency of the SAA approach, we first consider a two-stage problem and show that in the second-stage problem, given a scenario, the …


Can We Make It Better? Assessing And Improving Quality Of GitHub Repositories, Gede Artha Azriadi Prana Nov 2021

Dissertations and Theses Collection (Open Access)

The code hosting platform GitHub has gained immense popularity worldwide in recent years, with over 200 million repositories hosted as of June 2021. Due to its popularity, it has great potential to facilitate widespread improvements across many software projects. Naturally, GitHub has attracted much research attention, and the source code in the various repositories it hosts also provides an opportunity to apply techniques and tools developed by software engineering researchers over the years. However, much of the existing body of research applicable to GitHub focuses on the code quality of software projects and ways to improve it. Fewer works focus on potential …


Informing Complexity: The Business Case For Managing Digital Twins Of Complex Process Facilities As A Valuable Asset, William Randell McNair Oct 2021

USF Tampa Graduate Theses and Dissertations

The digital twins of complex facilities, specifically the 3D models created during their design, are a potentially valuable information asset. This three-article dissertation explores the business case for firms in the petrochemical process industry to manage them throughout the facility lifecycle. A maturity model is provided to illustrate the stages of digital twin evolution and serves as a tool to help communicate each of the five levels of digital twin maturity achievable in various use cases. An industry analysis reviews existing literature and proposes a model to assess informing or insight value of digital twins from three perspectives. Next, an empirical …


Residential Curbside Recycle Context Analysis, Ntchanang Mpafe Oct 2021

USF Tampa Graduate Theses and Dissertations

Curbside recycling as a preferred mode of residential and municipal sustainability goals seems to have an overwhelming acceptance and adoption in the US. About 69.8 million out of 97.3 million (72%) single-family households in the United States have access to curbside recycling services (State of Curbside Recycling Report, 2020). Collectively, the programs divert about nine million tons of recyclables from landfill disposal each year (Cottom, 2019).

For a design that started in the 1980s in the US, its rapid universal adoption seems to have precluded a concerted effort in examining the coproduced nature (Households: service receptors and Municipalities: service providers) …


Detection Of Dental Apical Lesions Using CNNs On Periapical Radiograph, Chun-Wei Li, Szu-Yin Lin, He-Sheng Chou, Tsung-Yi Chen, Yu-An Chen, Sheng-Yu Liu, Yu-Lin Liu, Chiung-An Chen, Yen-Cheng Huang, Shih-Lun Chen, Yi-Cheng Mao, Patricia Angela R. Abu, Wei-Yuan Chiang, Wen-Shen Lo Oct 2021

Department of Information Systems & Computer Science Faculty Publications

Apical lesions, the general term for chronic infectious diseases, are very common dental diseases in modern life, and are caused by various factors. The current prevailing endodontic treatment makes use of X-ray photography taken from patients, where the lesion area is marked manually, which is time-consuming. Additionally, for some images the significant details might not be recognizable due to the different shooting angles or doses. To make the diagnosis process shorter and more efficient, repetitive tasks should be performed automatically to allow the dentists to focus more on the technical and medical diagnosis, such as treatment, tooth cleaning, or …


Appendix: Essential Aspects Of Physical Design And Implementation Of Relational Databases, Tatiana Malyuta, Ashwin Satyanarayana Oct 2021

Open Educational Resources

No abstract provided.


(2021 Revision) Chapter 4: Essential Aspects Of Physical Design And Implementation Of Relational Databases, Tatiana Malyuta, Ashwin Satyanarayana Oct 2021

Open Educational Resources

No abstract provided.


(2021 Revision) Chapter 5: Essential Aspects Of Physical Design And Implementation Of Relational Databases, Tatiana Malyuta, Ashwin Satyanarayana Oct 2021

Open Educational Resources

No abstract provided.


(2021 Revision) Chapter 1: Essential Aspects Of Physical Design And Implementation Of Relational Databases, Tatiana Malyuta, Ashwin Satyanarayana Oct 2021

Open Educational Resources

No abstract provided.