Open Access. Powered by Scholars. Published by Universities.®

Physical Sciences and Mathematics Commons

Databases and Information Systems

Full-Text Articles in Physical Sciences and Mathematics

Deep Learning In Indus Valley Script Digitization, Deva Munikanta Reddy Atturu May 2024

Theses and Dissertations

This research introduces ASR-net (Ancient Script Recognition), a groundbreaking system that automatically digitizes ancient Indus seals by converting them into coded text, similar to Optical Character Recognition for modern languages. ASR-net, with a 95% success rate in identifying individual symbols, aims to address the crucial need for automated techniques in deciphering the enigmatic Indus script. Initially, YOLOv3 is used to create bounding boxes around each grapheme present in an Indus Valley seal. In addition, we created the M-net (Mahadevan) model to encode the graphemes. Beyond digitization, the paper proposes a new research challenge called the Motif Identification Problem (MIP) related …


Quantum Machine Learning For Credit Scoring, Nikolaos Schetakis, Davit Aghamalyan, Micheael Boguslavsky, Agnieszka Rees, Marc Rakotomalala, Paul Robert Griffin May 2024

Research Collection School Of Computing and Information Systems

This study investigates the integration of quantum circuits with classical neural networks for enhancing credit scoring for small- and medium-sized enterprises (SMEs). We introduce a hybrid quantum–classical model, focusing on the synergy between quantum and classical rather than comparing the performance of separate quantum and classical models. Our model incorporates a quantum layer into a traditional neural network, achieving notable reductions in training time. We apply this innovative framework to a binary classification task with a proprietary real-world classical credit default dataset for SMEs in Singapore. The results indicate that our hybrid model achieves efficient training, requiring significantly fewer epochs …


Knowledge Enhanced Multi-Intent Transformer Network For Recommendation, Ding Zou, Wei Wei, Feida Zhu, Chuanyu Xu, Tao Zhang, Chengfu Huo May 2024

Research Collection School Of Computing and Information Systems

Incorporating Knowledge Graphs (KGs) into Recommendation has attracted growing attention in industry, due to the great potential of KG in providing abundant supplementary information and interpretability for the underlying models. However, simply integrating KG into recommendation usually brings in negative feedback in industry, mainly due to the ignorance of the following two factors: i) users' multiple intents, which involve diverse nodes in KG. For example, in e-commerce scenarios, users may exhibit preferences for specific styles, brands, or colors. ii) knowledge noise, which is a prevalent issue in Knowledge Enhanced Recommendation (KGR) and even more severe in industry scenarios. The irrelevant …


From Tweets To Token Sales: Assessing ICO Success Through Social Media Sentiments, Donghao Huang, S. Samuel, Quoc Toan Huynh, Zhaoxia Wang May 2024

Research Collection School Of Computing and Information Systems

With the advent of social network technology, the influence of collective opinions has significantly impacted business, marketing, and fundraising. Particularly in the blockchain space, Initial Coin Offerings (ICOs) gain substantial exposure across various online platforms. Yet, the intricate relationships among these elements remain largely unexplored. This study aims to investigate the relationships between social media sentiment, engagement metrics, and ICO success. We hypothesize a positive correlation between favorable sentiment in ICO-related tweets and overall project success. Additionally, we recognize social media engagement indicators (mentions, retweets, likes, follower counts) as critical factors affecting ICO performance. Employing machine learning techniques, we conduct …


Intriguing Properties Of Data Attribution On Diffusion Models, Xiaosen Zheng, Tianyu Pang, Chao Du, Jing Jiang, Xiaosen Zheng May 2024

Research Collection School Of Computing and Information Systems

Data attribution seeks to trace model outputs back to training data. With the recent development of diffusion models, data attribution has become a desired module to properly assign valuations for high-quality or copyrighted training samples, ensuring that data contributors are fairly compensated or credited. Several theoretically motivated methods have been proposed to implement data attribution, in an effort to improve the trade-off between computational scalability and effectiveness. In this work, we conduct extensive experiments and ablation studies on attributing diffusion models, specifically focusing on DDPMs trained on CIFAR-10 and CelebA, as well as a Stable Diffusion model LoRA-finetuned on ArtBench. …


Develop An Interactive Python Dashboard For Analyzing EZproxy Logs, Andy Huff, Matthew Roth, Weiling Liu Apr 2024

Faculty and Staff Scholarship

This paper describes the development of an interactive dashboard in Python with EZproxy log data. Hopefully, this dashboard will help improve the evidence-based decision-making process in electronic resources management and explore the impact of library use.


A Design Science Approach To Investigating Decentralized Identity Technology, Janelle Krupicka Apr 2024

Cybersecurity Undergraduate Research Showcase

The internet needs secure forms of identity authentication to function properly, but identity authentication is not a core part of the internet’s architecture. Instead, approaches to identity verification vary, often using centralized stores of identity information that are targets of cyber attacks. Decentralized identity is a secure way to manage identity online that puts users’ identities in their own hands and that has the potential to become a core part of cybersecurity. However, decentralized identity technology is new and continually evolving, which makes implementing this technology in an organizational setting challenging. This paper suggests that, in the future, decentralized identity …


Binder, Tyler A. Peaster, Lindsey M. Davenport, Madelyn Little, Alex Bales Apr 2024

ATU Research Symposium

Binder is a mobile application that aims to introduce readers to a book recommendation service that appeals to devoted and casual readers. The main goal of Binder is to enrich book selection and reading experience. This project was created in response to deficiencies in the mobile space for book suggestions, library management, and reading personalization. The tools we used to create the project include Visual Studio, .NET MAUI, C#, XAML, CSS, MongoDB, NoSQL, Git, GitHub, and Figma. The project’s selection of books was sourced from the Google Books repository. Binder aims to provide an intuitive interface that allows users …


Kalamazoo Nature Center Mobile Application, Jacob Tebben Apr 2024

Honors Theses

This project aimed to address the challenge of enhancing visitor engagement and information dissemination at the Kalamazoo Nature Center (KNC) through the development of an integrated mobile and desktop application system. This initiative arose due to the limitations posed by traditional mobile applications which often become outdated and need to be updated by a dedicated software team. This project was designed for any user of the KNC desktop app to be able to update content on the mobile app, without the need of a dedicated software team.

The mobile application was designed for visitor use, enabling them to access up-to-date …


Immersive Japanese Language Learning Web Application Using Spaced Repetition, Active Recall, And An Artificial Intelligent Conversational Chat Agent Both In Voice And In Text, Marc Butler Apr 2024

MS in Computer Science Project Reports

In the last two decades various human language learning applications, spaced repetition software, online dictionaries, and artificial intelligent chat agents have been developed. However, there is no solution to cohesively combine these technologies into a comprehensive language learning application including skills such as speaking, typing, listening, and reading. Our contribution is to provide an immersive language learning web application to the end user which combines spaced repetition, a study technique used to review information at systematic intervals, and active recall, the process of purposely retrieving information from memory during a review session, with an artificial intelligent conversational chat agent both …


What Students Have To Say On Data Privacy For Educational Technology, Stephanie Choi Apr 2024

Cybersecurity Undergraduate Research Showcase

The literature on data privacy in terms of educational technology is a growing area of study. The perspective of educators has been captured extensively. However, the literature on students’ perspectives is missing, which is what we explore in this paper. We use a pragmatic qualitative approach with an experiential lens to capture students’ attitudes towards data privacy in terms of educational technology. We identified preliminary, common themes that appeared in the survey responses. The paper concludes by calling for more research on how students perceive data privacy in terms of educational technology.


Artificial Intelligence Could Probably Write This Essay Better Than Me, Claire Martino Apr 2024

Augustana Center for the Study of Ethics Essay Contest

No abstract provided.


Unearthing The Past: A Comprehensive Study Of Natural And Anthropogenic Changes At An Archaeological Site Through Hydrogeologic Connectivity Utilizing GIS, Mehlich II Phosphorus Extractant, And pH, Dana L. F. Herren Apr 2024

Theses

This thesis aims to thoroughly analyze the Mehlich II Phosphorus Extractant and pH levels at the Bains Gap Village Site in Anniston, AL, while examining the impact of various environmental factors and human activities on them. Phosphorus is often used in archaeology as an indicator of human activity. Soil core samples were collected to analyze anomalies in phosphorus levels.

To establish any relationships, phosphorus and pH levels from soil cores were correlated with findings from past excavation units and features. The potential effects of hydrogeologic connectivity on soil phosphorus and pH levels were investigated. Geospatial technologies were used to manage …


FlGan: GAN-Based Unbiased Federated Learning Under Non-IID Settings, Zhuoran Ma, Yang Liu, Yinbin Miao, Guowen Xu, Ximeng Liu, Jianfeng Ma, Robert H. Deng Apr 2024

Research Collection School Of Computing and Information Systems

Federated Learning (FL) suffers from low convergence and significant accuracy loss due to local biases caused by non-Independent and Identically Distributed (non-IID) data. To enhance the non-IID FL performance, a straightforward idea is to leverage the Generative Adversarial Network (GAN) to mitigate local biases using synthesized samples. Unfortunately, existing GAN-based solutions have inherent limitations, which do not support non-IID data and even compromise user privacy. To tackle the above issues, we propose a GAN-based unbiased FL scheme, called FlGan, to mitigate local biases using synthesized samples generated by GAN while preserving user-level privacy in the FL setting. Specifically, FlGan first …


Impact Of Government Outsourcing Contracts On High-Tech Vendors: An Empirical Study, Yi Dong, Nan Hu, Yonghua Ji, Chenkai Ni, Jing Xie Apr 2024

Research Collection School Of Computing and Information Systems

Outsourcing is an important strategic decision of high-tech firms. However, while the research has extensively studied the implications of outsourcing to high-tech clients, its impact on high-tech vendors remains underexplored. This study empirically estimates the impact of government outsourcing contracts on high-tech vendors. Employing the earnings-return analyses framework, we find that, for high-tech vendors engaged in government outsourcing contracts, the stock market places a higher value on each unit of unexpected earnings compared to other firms. Additionally, this impact becomes stronger for contracts with longer terms, for contracts outsourced by the U.S. government or by countries with better political and …


Exploring The Potential Of ChatGPT In Automated Code Refinement: An Empirical Study, Guo Qi, Junming Cao, Xiaofei Xie, Shangqing Liu, Xiaohong Li, Bihuan Chen, Xin Peng Apr 2024

Research Collection School Of Computing and Information Systems

Code review is an essential activity for ensuring the quality and maintainability of software projects. However, it is a time-consuming and often error-prone task that can significantly impact the development process. Recently, ChatGPT, a cutting-edge language model, has demonstrated impressive performance in various natural language processing tasks, suggesting its potential to automate code review processes. However, it is still unclear how well ChatGPT performs in code review tasks. To fill this gap, in this paper, we conduct the first empirical study to understand the capabilities of ChatGPT in code review tasks, specifically focusing on automated code refinement based on given …


Context-Aware Representation: Jointly Learning Item Features And Selection From Triplets, Rodrigo Alves, Antoine Ledent Apr 2024

Research Collection School Of Computing and Information Systems

In areas of machine learning such as cognitive modeling or recommendation, user feedback is usually context-dependent. For instance, a website might provide a user with a set of recommendations and observe which (if any) of the links were clicked by the user. Similarly, there is growing interest in the so-called “odd-one-out” learning setting, where human participants are provided with a basket of items and asked which is the most dissimilar to the others. In both of those cases, the presence of all the items in the basket can influence the final decision. In this article, we consider a classification task …


Test Optimization In DNN Testing: A Survey, Qiang Hu, Yuejun Guo, Xiaofei Xie, Maxime Cordy, Lei Ma, Mike Papadakis, Yves Le Traon Apr 2024

Research Collection School Of Computing and Information Systems

This article presents a comprehensive survey on test optimization in deep neural network (DNN) testing. Here, test optimization refers to testing with low data labeling effort. We analyzed 90 papers, including 43 from the software engineering (SE) community, 32 from the machine learning (ML) community, and 15 from other communities. Our study: (i) unifies the problems as well as terminologies associated with low-labeling cost testing, (ii) compares the distinct focal points of SE and ML communities, and (iii) reveals the pitfalls in existing literature. Furthermore, we highlight the research opportunities in this domain.


Elevating Academic Administration: A Comprehensive Faculty Dashboard For Tracking Student Evaluations And Research, Musa M. Azeem Apr 2024

Senior Theses

The USC Faculty Dashboard is a web application designed to revolutionize how department heads, professors, and instructors monitor progress and make decisions, providing a centralized hub for efficient data storage and analysis. Currently, there’s a gap in tools tailored for department heads to concisely manage the performance of their department, which our platform aims to fill. The USC Faculty Dashboard offers easy access to upload and view student evaluation and research information, empowering department heads to evaluate the performance of faculty members and seamlessly track their research grants, publications, and expenditures. Furthermore, professors and instructors gain personalized performance analysis tools, …


Home Is Where The Work Is: How Biases In Managers’ Resource Allocation Decisions Affect Task Performance In Remote Work Environments, Richard D. Mautz III Mar 2024

USF Tampa Graduate Theses and Dissertations

As the use of remote and hybrid work arrangements continues to grow, it is important to understand how these arrangements can yield performance. In this paper, I conduct two studies to examine how the remote work environment affects managers’ task assignment decisions across different task types and how those decisions affect workers’ task performance. First, I survey managers, in both a cross-section of industries and specifically in accounting, to study the effect of remote work on their task assignment decisions. Consistent with prior literature and economic theory, I predict and find that managers are more inclined to assign generative tasks …


The Effect Of Internet Firms’ Data Analytics Capability On Their Innovation Speed And Innovation Quality: A Dynamic Capability Perspective, Yeyu Hua Mar 2024

Dissertations and Theses Collection (Open Access)

With the advent of the big data era, data plays a pivotal role in sustaining firms’ competitive advantages. Although a few studies have shown that data analytics capability contributes to firms’ innovative performance, these studies either focus on general innovative performance or specific types of innovation, such as incremental innovation, radical innovation, and supply chain innovation. In this thesis, I enrich this stream of literature by conducting two studies to further examine the relationship between data analytics capability and innovation speed as well as innovation quality. This thesis consists of two studies. Study 1 is a survey study, in which I investigate the relationship between data analytics …


Simulated Annealing With Reinforcement Learning For The Set Team Orienteering Problem With Time Windows, Vincent F. Yu, Nabila Y. Salsabila, Shih-W Lin, Aldy Gunawan Mar 2024

Research Collection School Of Computing and Information Systems

This research investigates the Set Team Orienteering Problem with Time Windows (STOPTW), a new variant of the well-known Team Orienteering Problem with Time Windows and Set Orienteering Problem. In the STOPTW, customers are grouped into clusters. Each cluster is associated with a profit attainable when a customer in the cluster is visited within the customer's time window. A Mixed Integer Linear Programming model is formulated for STOPTW to maximize total profit while adhering to time window constraints. Since STOPTW is an NP-hard problem, a Simulated Annealing with Reinforcement Learning (SARL) algorithm is developed. The proposed SARL incorporates the core concepts …


Non-Monotonic Generation Of Knowledge Paths For Context Understanding, Pei-Chi Lo, Ee-Peng Lim Mar 2024

Research Collection School Of Computing and Information Systems

Knowledge graphs can be used to enhance text search and access by augmenting textual content with relevant background knowledge. While many large knowledge graphs are available, using them to make semantic connections between entities mentioned in the textual content remains a difficult task. In this work, we therefore introduce contextual path generation (CPG), which refers to the task of generating knowledge paths, called contextual paths, to explain the semantic connections between entities mentioned in textual documents with a given knowledge graph. To perform the CPG task well, one has to address its three challenges, namely path relevance, incomplete knowledge graph, and …


Screening Through A Broad Pool: Towards Better Diversity For Lexically Constrained Text Generation, Changsen Yuan, Heyan Huang, Yixin Cao, Qianwen Cao Mar 2024

Research Collection School Of Computing and Information Systems

Lexically constrained text generation (CTG) aims to generate text that contains given constrained keywords. However, the text diversity of existing models is still unsatisfactory. In this paper, we propose a lightweight dynamic refinement strategy that aims at increasing the randomness of inference to improve generation richness and diversity while maintaining a high level of fluidity and integrity. Our basic idea is to enlarge the number and length of candidate sentences in each iteration, and choose the best for subsequent refinement. On the one hand, different from previous works, which carefully insert one token between two words per action, we insert …


Hypergraphs With Attention On Reviews For Explainable Recommendation, Theis E. Jendal, Trung Hoang Le, Hady Wirawan Lauw, Matteo Lissandrini, Peter Dolog, Katja Hose Mar 2024

Research Collection School Of Computing and Information Systems

Given a recommender system based on reviews, the challenges are how to effectively represent the review data and how to explain the produced recommendations. We propose a novel review-specific Hypergraph (HG) model, and further introduce a model-agnostic explainability module. The HG model captures high-order connections between users, items, aspects, and opinions while maintaining information about the review. The explainability module can use the HG model to explain a prediction generated by any model. We propose a path-restricted review-selection method biased by the user preference for item reviews and propose a novel explanation method based on a review graph. Experiments on …


Meta-Interpretive Learning With Reuse, Rong Wang, Jun Sun, Cong Tian, Zhenhua Duan Mar 2024

Research Collection School Of Computing and Information Systems

Inductive Logic Programming (ILP) is a research field at the intersection between machine learning and logic programming, focusing on developing a formal framework for inductively learning relational descriptions in the form of logic programs from examples and background knowledge. As an emerging method of ILP, Meta-Interpretive Learning (MIL) leverages the specialization of a set of higher-order metarules to learn logic programs. In MIL, the input includes a set of examples, background knowledge, and a set of metarules, while the output is a logic program. MIL executes a depth-first traversal search, where its program search space expands polynomially with the number …


Temporal Implicit Multimodal Networks For Investment And Risk Management, Meng Kiat Gary Ang, Ee-Peng Lim Mar 2024

Research Collection School Of Computing and Information Systems

Many deep learning works on financial time-series forecasting focus on predicting future prices/returns of individual assets with numerical price-related information for trading, and hence propose models designed for univariate, single-task, and/or unimodal settings. Forecasting for investment and risk management involves multiple tasks in multivariate settings: forecasts of expected returns and risks of assets in portfolios, and correlations between these assets. As different sources/types of time-series influence future returns, risks, and correlations of assets in different ways, it is also important to capture time-series from different modalities. Hence, this article addresses financial time-series forecasting for investment and risk management in a …


PA2BLO: Low-Power, Personalized Audio Badge, Hemanth Sabbella, Dulaj Sanjaya Weerakoon, Manoj Gulati, Archan Misra Mar 2024

Research Collection School Of Computing and Information Systems

We present the hardware design and software pipeline for an ultra-low power device, in the form factor of a wearable badge, that supports energy efficient sensing, processing and wireless transfer of human voice commands and interactions. The proposed system, called PA2BLO, is envisioned to support both: (a) real-time, scalable, authorized voice based interaction and control of devices and appliances, and (b) longitudinal, low-power logging of natural voice interactions. PA2BLO introduces two key novel capabilities. First, it includes a low power, low-complexity voice authentication module that is able to reliably authenticate an authorized user only using low sampling rate (500 Hz) …


xFuzz: Machine Learning Guided Cross-Contract Fuzzing, Yinxing Xue, Jiaming Ye, Wei Zhang, Jun Sun, Lei Ma, Haijun Wang, Jianjun Zhao Mar 2024

Research Collection School Of Computing and Information Systems

Smart contract transactions are increasingly interleaved by cross-contract calls. While many tools have been developed to identify a common set of vulnerabilities, the cross-contract vulnerability is overlooked by existing tools. Cross-contract vulnerabilities are exploitable bugs that manifest in the presence of more than two interacting contracts. Existing methods are, however, limited to analyzing a maximum of two contracts at the same time. Detecting cross-contract vulnerabilities is highly non-trivial. With multiple interacting contracts, the search space is much larger than that of a single contract. To address this problem, we present xFuzz, a machine learning guided smart contract fuzzing framework. …


Community Similarity Based On User Profile Joins, Konstantinos Theocharidis, Hady Wirawan Lauw Mar 2024

Research Collection School Of Computing and Information Systems

Similarity joins on multidimensional data are crucial operators for recommendation purposes. The classic ε-join problem finds all pairs of points within ε distance to each other among two d-dimensional datasets. In this paper, we consider a novel and alternative version of ε-join named community similarity based on user profile joins (CSJ). The aim of CSJ problem is, given two communities having a set of d-dimensional users, to find how similar are the communities by matching every single pair of users (a user can be matched with at most one other user) having an absolute difference of at most ε per …
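
As a rough illustration of the classic join that this abstract starts from (not the CSJ method itself, whose details are truncated here), the sketch below pairs up points from two datasets that lie within a distance threshold of each other. The function name `epsilon_join`, the brute-force strategy, the use of Euclidean distance, and the sample points are all assumptions made for the example.

```python
from math import dist  # Euclidean distance between two points (Python 3.8+)


def epsilon_join(A, B, eps):
    """Brute-force join: return index pairs (i, j) with dist(A[i], B[j]) <= eps.

    Assumes Euclidean distance and equal dimensionality for all points.
    Runs in O(len(A) * len(B)) time, so it is only suitable for small inputs.
    """
    return [
        (i, j)
        for i, a in enumerate(A)
        for j, b in enumerate(B)
        if dist(a, b) <= eps
    ]


# Hypothetical 2-dimensional sample data.
A = [(0.0, 0.0), (5.0, 5.0)]
B = [(0.0, 1.0), (5.0, 5.5), (9.0, 9.0)]
pairs = epsilon_join(A, B, eps=1.0)
print(pairs)
```

The one-to-one matching with a per-dimension threshold that the abstract goes on to describe would need a different predicate and a matching step on top of this.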