Open Access. Powered by Scholars. Published by Universities.®
Physical Sciences and Mathematics Commons™
Articles 751 - 780 of 6720
Full-Text Articles in Physical Sciences and Mathematics
Unified Route Planning For Shared Mobility: An Insertion-Based Framework, Yongxin Tong, Yuxiang Zeng, Zimu Zhou, Lei Chen, Ke Xu
Research Collection School Of Computing and Information Systems
There has been a dramatic growth of shared mobility applications such as ride-sharing, food delivery, and crowdsourced parcel delivery. Shared mobility refers to transportation services that are shared among users, where a central issue is route planning. Given a set of workers and requests, route planning finds for each worker a route, i.e., a sequence of locations to pick up and drop off passengers/parcels that arrive from time to time, with different optimization objectives. Previous studies lack practicability due to their conflicted objectives and inefficiency in inserting a new request into a route, a basic operation called insertion. In addition, …
Guided Attention Multimodal Multitask Financial Forecasting With Inter-Company Relationships And Global And Local News, Meng Kiat Gary Ang, Ee-Peng Lim
Research Collection School Of Computing and Information Systems
Most works on financial forecasting use information directly associated with individual companies (e.g., stock prices, news on the company) to predict stock returns for trading. We refer to such company-specific information as local information. Stock returns may also be influenced by global information (e.g., news on the economy in general), and inter-company relationships. Capturing such diverse information is challenging due to the low signal-to-noise ratios, different time-scales, sparsity and distributions of global and local information from different modalities. In this paper, we propose a model that captures both global and local multimodal information for investment and risk management-related forecasting tasks. …
Uipdroid: Unrooted Dynamic Monitor Of Android App Uis For Fine-Grained Permission Control, Mulin Duan, Lingxiao Jiang, Lwin Khin Shar, Debin Gao
Research Collection School Of Computing and Information Systems
Proper permission controls in Android systems are important for protecting users' private data when running applications installed on the devices. Currently, Android systems require apps to obtain authorization from users the first time they try to access users' sensitive data, but every permission is only managed at the application level, allowing apps to (mis)use permissions granted by users at the beginning for different purposes subsequently without informing users. Based on privacy-by-design principles, this paper develops a new permission manager, named UIPDroid, that (1) enforces the users' basic right-to-know through user interfaces whenever an app uses permissions, and (2) …
Do Pre-Trained Models Benefit Knowledge Graph Completion? A Reliable Evaluation And A Reasonable Approach, Xin Lv, Yankai Lin, Yixin Cao, Lei Hou, Juanzi Li, Zhiyuan Liu, Peng Li, Jie Zhou
Research Collection School Of Computing and Information Systems
In recent years, pre-trained language models (PLMs) have been shown to capture factual knowledge from massive texts, which encourages the proposal of PLM-based knowledge graph completion (KGC) models. However, these models are still quite behind the SOTA KGC models in terms of performance. In this work, we find two main reasons for the weak performance: (1) Inaccurate evaluation setting. The evaluation setting under the closed-world assumption (CWA) may underestimate the PLM-based KGC models since they introduce more external knowledge; (2) Inappropriate utilization of PLMs. Most PLM-based KGC models simply splice the labels of entities and relations as inputs, leading to …
Translate-Train Embracing Translationese Artifacts, Sicheng Yu, Qianru Sun, Hao Zhang, Jing Jiang
Research Collection School Of Computing and Information Systems
Translate-train is a general training approach to multilingual tasks. The key idea is to use the translator of the target language to generate training data to mitigate the gap between the source and target languages. However, its performance is often hampered by the artifacts in the translated texts (translationese). We discover that such artifacts have common patterns in different languages and can be modeled by deep learning, and subsequently propose an approach to conduct translate-train using Translationese Embracing the effect of Artifacts (TEA). TEA learns to mitigate such effect on the training data of a source language (whose original and …
Optimal In‐Place Suffix Sorting, Zhize Li, Jian Li, Hongwei Huo
Research Collection School Of Computing and Information Systems
The suffix array is a fundamental data structure for many applications that involve string searching and data compression. Designing time/space-efficient suffix array construction algorithms has attracted significant attention and considerable advances have been made for the past 20 years. We obtain the first in-place suffix array construction algorithms that are optimal both in time and space for (read-only) integer alphabets. Concretely, we make the following contributions: 1. For integer alphabets, we obtain the first suffix sorting algorithm which takes linear time and uses only $O(1)$ workspace (the workspace is the total space needed beyond the input string and the output …
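For readers unfamiliar with the data structure named in the abstract above, the following is a minimal sketch of what a suffix array contains. It is a naive construction that sorts suffix start positions by comparing the suffixes directly; it is not the in-place, linear-time algorithm the paper describes, and it uses far more than $O(1)$ workspace. The function name `naive_suffix_array` is a hypothetical name chosen for illustration.

```python
from typing import List


def naive_suffix_array(text: str) -> List[int]:
    """Return the start positions of all suffixes of `text` in sorted order.

    Illustration only: slicing creates a copy of every suffix, so this takes
    quadratic extra space and roughly O(n^2 log n) time in the worst case.
    """
    return sorted(range(len(text)), key=lambda start: text[start:])


if __name__ == "__main__":
    word = "banana"
    for start in naive_suffix_array(word):
        # Print each suffix next to its start position, in lexicographic order.
        print(start, word[start:])
```

For the input "banana" the sorted suffixes are "a", "ana", "anana", "banana", "na", "nana", so the suffix array is [5, 3, 1, 0, 4, 2].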
Two Project On Information Systems Capabilities And Organizational Performance, Giridhar Reddy Bojja
Masters Theses & Doctoral Dissertations
Information systems (IS), as a multi-disciplinary research area, emphasizes the complementary relationship between people, organizations, and technology and has evolved dramatically over the years. IS and the underlying Information Technology (IT) application and research play a crucial role in transforming the business world and research within the management domain. Consistent with this evolution and transformation, I develop a two-project dissertation on Information systems capabilities and organizational outcomes.
Project 1 examines the role of hospital operational effectiveness on the link between information systems capabilities and hospital performance. This project examines the cross-lagged effects on a sample of 217 hospitals measured over …
Mmekg: Multi-Modal Event Knowledge Graph Towards Universal Representation Across Modalities, Yubo Ma, Zehao Wang, Mukai Li, Yixin Cao, Meiqi Chen, Xinze Li, Wenqi Sun, Kunquan Deng, Kun Wang, Aixin Sun, Jing Shao
Research Collection School Of Computing and Information Systems
Events are fundamental building blocks of real-world happenings. In this paper, we present a large-scale, multi-modal event knowledge graph named MMEKG. MMEKG unifies different modalities of knowledge via events, which complement and disambiguate each other. Specifically, MMEKG incorporates (i) over 990 thousand concept events with 644 relation types to cover most types of happenings, and (ii) over 863 million instance events connected through 934 million relations, which provide rich contextual information in texts and/or images. To collect billion-scale instance events and relations among them, we additionally develop an efficient yet effective pipeline for textual/visual knowledge extraction. We also develop …
Optimized Damage Assessment And Recovery Through Data Categorization In Critical Infrastructure System., Shruthi Ramakrishnan
Graduate Theses and Dissertations
Critical infrastructures (CI) play a vital role in the majority of fields and sectors worldwide. They contribute substantially to the economies of nations and to the wellbeing of society. They are highly coupled and interconnected, and their interdependencies make them complex systems. Thus, when damage occurs in a CI system, its complex interdependencies subject it to cascading effects that propagate quickly from one infrastructure to another, resulting in widespread service degradation, which in turn causes economic and societal effects. The propagation of cascading effects of disruptive events could be handled efficiently if the assessment and …
Supervised Representation Learning For Improving Prediction Performance In Medical Decision Support Applications, Phawis Thammasorn
Graduate Theses and Dissertations
Machine learning approaches for prediction play an integral role in modern-day decision support systems. An integral part of the process is extracting variables of interest, or features, to describe the input data. Then, the variables are utilized for training machine-learning algorithms to map from the variables to the target output. After the training, the model is validated with either validation or testing data before making predictions with a new dataset. Despite the straightforward workflow, the process relies heavily on good feature representation of data. Engineering suitable representation eases the subsequent actions and copes with many practical issues that potentially prevent the …
A Novel Data Lineage Model For Critical Infrastructure And A Solution To A Special Case Of The Temporal Graph Reachability Problem, Ian Moncur
Graduate Theses and Dissertations
Rapid and accurate damage assessment is crucial to minimize downtime in critical infrastructure. Dependency on modern technology requires fast and consistent techniques to prevent damage from spreading while also minimizing the impact of damage on system users. One technique to assist in assessment is data lineage, which involves tracing a history of dependencies for data items. The goal of this thesis is to present one novel model and an algorithm that uses data lineage with the goal of being fast and accurate. In function this model operates as a directed graph, with the vertices being data items and edges representing …
Deep Depression Prediction On Longitudinal Data Via Joint Anomaly Ranking And Classification, Guansong Pang, Ngoc Thien Anh Pham, Emma Baker, Rebecca Bentley, Anton Van Den Hengel
Research Collection School Of Computing and Information Systems
A wide variety of methods have been developed for identifying depression, but they focus primarily on measuring the degree to which individuals are suffering from depression currently. In this work we explore the possibility of predicting future depression using machine learning applied to longitudinal socio-demographic data. In doing so we show that data such as housing status, and the details of the family environment, can provide cues for predicting future psychiatric disorders. To this end, we introduce a novel deep multi-task recurrent neural network to learn time-dependent depression cues. The depression prediction task is jointly optimized with two auxiliary anomaly …
Learning Transferable Perturbations For Image Captioning, Hanjie Wu, Yongtuo Liu, Hongmin Cai, Shengfeng He
Research Collection School Of Computing and Information Systems
Present studies have discovered that state-of-the-art deep learning models can be attacked by small but well-designed perturbations. Existing attack algorithms for the image captioning task are time-consuming, and their generated adversarial examples cannot transfer well to other models. To generate stronger adversarial examples faster, we propose to learn the perturbations by a generative model that is governed by three novel loss functions. Image feature distortion loss is designed to maximize the encoded image feature distance between original images and the corresponding adversarial examples at the image domain, and local-global mismatching loss is introduced to separate the mapping encoding representation …
Storm The Capitol: Linking Offline Political Speech And Online Twitter Extra-Representational Participation On Qanon And The January 6 Insurrection, Claire Seungeun Lee, Juan Merizalde, John D. Colautti, Jisun An, Haewoon Kwak
Research Collection School Of Computing and Information Systems
The transfer of power stemming from the 2020 presidential election occurred during an unprecedented period in United States history. Uncertainty from the COVID-19 pandemic, ongoing societal tensions, and a fragile economy increased societal polarization, exacerbated by the outgoing president's offline rhetoric. As a result, online groups such as QAnon engaged in extra political participation beyond the traditional platforms. This research explores the link between offline political speech and online extra-representational participation by examining Twitter within the context of the January 6 insurrection. Using a mixed-methods approach of quantitative and qualitative thematic analyses, the study combines offline speech information with Twitter …
Unified And Incremental Simrank: Index-Free Approximation With Scheduled Principle (Extended Abstract), Fanwei Zhu, Yuan Fang, Kai Zhang, Kevin Chen-Chuan Chang, Hongtai Cao, Zhen Jiang, Minghui Wu
Research Collection School Of Computing and Information Systems
SimRank is a popular link-based similarity measure on graphs. It enables a variety of applications with different modes of querying. In this paper, we propose UISim, a unified and incremental framework for all SimRank modes based on a scheduled approximation principle. UISim processes queries with incremental and prioritized exploration of the entire computation space, and thus allows flexible tradeoff of time and accuracy. On the other hand, it creates and shares common “building blocks” for online computation without relying on indexes, and thus is efficient to handle both static and dynamic graphs. Our experiments on various real-world graphs show that …
Detecting False Alarms From Automatic Static Analysis Tools: How Far Are We?, Hong Jin Kang, Khai Loong Aw, David Lo
Research Collection School Of Computing and Information Systems
Automatic static analysis tools (ASATs), such as Findbugs, have a high false alarm rate. The large number of false alarms produced poses a barrier to adoption. Researchers have proposed the use of machine learning to prune false alarms and present only actionable warnings to developers. The state-of-the-art study has identified a set of “Golden Features” based on metrics computed over the characteristics and history of the file, code, and warning. Recent studies show that machine learning using these features is extremely effective and that they achieve almost perfect performance. We perform a detailed analysis to better understand the strong performance …
An Exploratory Study On Code Attention In Bert, Rishab Sharma, Fuxiang Chen, Fatemeh H. Fard, David Lo
Research Collection School Of Computing and Information Systems
Many recent models in software engineering introduce deep neural models based on the Transformer architecture or use transformer-based Pre-trained Language Models (PLM) trained on code. Although these models achieve state-of-the-art results in many downstream tasks such as code summarization and bug detection, they are based on Transformer and PLM, which are mainly studied in the Natural Language Processing (NLP) field. The current studies rely on the reasoning and practices from NLP for these models in code, despite the differences between natural languages and programming languages. There is also limited literature on explaining how code is modeled. …
Learning Semantically Rich Network-Based Multi-Modal Mobile User Interface Embeddings, Meng Kiat Gary Ang, Ee-Peng Lim
Research Collection School Of Computing and Information Systems
Semantically rich information from multiple modalities - text, code, images, categorical and numerical data - co-exist in the user interface (UI) design of mobile applications. Moreover, each UI design is composed of inter-linked UI entities which support different functions of an application, e.g., a UI screen comprising a UI taskbar, a menu and multiple button elements. Existing UI representation learning methods unfortunately are not designed to capture multi-modal and linkage structure between UI entities. To support effective search and recommendation applications over mobile UIs, we need UI representations that integrate latent semantics present in both multi-modal information and linkages between …
Prompt For Extraction? Paie: Prompting Argument Interaction For Event Argument Extraction, Yubo Ma, Zehao Wang, Yixin Cao, Mukai Li, Meiqi Chen, Kun Wang, Jing Shao
Research Collection School Of Computing and Information Systems
In this paper, we propose an effective yet efficient model PAIE for both sentence-level and document-level Event Argument Extraction (EAE), which also generalizes well when there is a lack of training data. On the one hand, PAIE utilizes prompt tuning for extractive objectives to take full advantage of Pre-trained Language Models (PLMs). It introduces two span selectors based on the prompt to select start/end tokens among input texts for each role. On the other hand, it captures argument interactions via multi-role prompts and conducts joint optimization with optimal span assignments via a bipartite matching loss. Also, with a flexible …
Neighbor-Anchoring Adversarial Graph Neural Networks (Extended Abstract), Zemin Liu, Yuan Fang, Yong Liu, Vincent W. Zheng
Research Collection School Of Computing and Information Systems
While graph neural networks (GNNs) exhibit strong discriminative power, they often fall short of learning the underlying node distribution for increased robustness. To deal with this, inspired by generative adversarial networks (GANs), we investigate the problem of adversarial learning on graph neural networks, and propose a novel framework named NAGNN (i.e., Neighbor-anchoring Adversarial Graph Neural Networks) for graph representation learning, which trains not only a discriminator but also a generator that compete with each other. In particular, we propose a novel neighbor-anchoring strategy, where the generator produces samples with explicit features and neighborhood structures anchored on a reference real node, …
Static Inference Meets Deep Learning: A Hybrid Type Inference Approach For Python, Yun Peng, Cuiyun Gao, Zongjie Li, Bowei Gao, David Lo, Qirun Zhang, Michael R. Lyu
Research Collection School Of Computing and Information Systems
Type inference for dynamic programming languages such as Python is an important yet challenging task. Static type inference techniques can precisely infer variables with enough static constraints but are unable to handle variables with dynamic features. Deep learning (DL) based approaches are feature-agnostic, but they cannot guarantee the correctness of the predicted types. Their performance significantly depends on the quality of the training data (i.e., DL models perform poorly on some common types that rarely appear in the training dataset). It is interesting to note that the static and DL-based approaches offer complementary benefits. Unfortunately, to our knowledge, precise type …
Automated Identification Of Libraries From Vulnerability Data: Can We Do Better?, Stefanus A. Haryono, Hong Jin Kang, Abhishek Sharma, Asankhaya Sharma, Andrew E. Santosa, Ming Yi Ang, David Lo
Research Collection School Of Computing and Information Systems
Software engineers depend heavily on software libraries and have to update their dependencies once vulnerabilities are found in them. Software Composition Analysis (SCA) helps developers identify vulnerable libraries used by an application. A key challenge is the identification of libraries related to a given reported vulnerability in the National Vulnerability Database (NVD), which may not explicitly indicate the affected libraries. Recently, researchers have tried to address the problem of identifying the libraries from an NVD report by treating it as an extreme multi-label learning (XML) problem, characterized by its large number of possible labels and severe data sparsity. As input, …
Structure-Aware Visualization Retrieval, Haotian Li, Yong Wang, Wu Aoyu, Huan Wei, Huamin Qu
Research Collection School Of Computing and Information Systems
With the wide usage of data visualizations, a huge number of Scalable Vector Graphic (SVG)-based visualizations have been created and shared online. Accordingly, there has been an increasing interest in exploring how to retrieve perceptually similar visualizations from a large corpus, since it can benefit various downstream applications such as visualization recommendation. Existing methods mainly focus on the visual appearance of visualizations by regarding them as bitmap images. However, the structural information intrinsically existing in SVG-based visualizations is ignored. Such structural information can delineate the spatial and hierarchical relationship among visual elements, and characterize visualizations thoroughly from a new perspective. …
Causality-Based Neural Network Repair, Bing Sun, Jun Sun, Long H. Pham, Jie Shi
Research Collection School Of Computing and Information Systems
Neural networks have had discernible achievements in a wide range of applications. The wide-spread adoption also raises the concern of their dependability and reliability. Similar to traditional decision-making programs, neural networks can have defects that need to be repaired. The defects may cause unsafe behaviors, raise security concerns, or have unjust societal impacts. In this work, we address the problem of repairing a neural network for desirable properties such as fairness and the absence of backdoors. The goal is to construct a neural network that satisfies the property by (minimally) adjusting the given neural network's parameters (i.e., weights). Specifically, we propose …
Data Pricing In Machine Learning Pipelines, Zicun Cong, Xuan Luo, Jian Pei, Feida Zhu, Yong Zhang
Research Collection School Of Computing and Information Systems
Machine learning is disruptive. At the same time, machine learning can only succeed by collaboration among many parties in multiple steps naturally as pipelines in an eco-system, such as collecting data for possible machine learning applications, collaboratively training models by multiple parties and delivering machine learning services to end users. Data are critical and penetrating in the whole machine learning pipelines. As machine learning pipelines involve many parties and, in order to be successful, have to form a constructive and dynamic eco-system, marketplaces and data pricing are fundamental in connecting and facilitating those many parties. In this article, we survey …
Adaptive Task Planning For Large-Scale Robotized Warehouses, Dingyuan Shi, Yongxin Tong, Zimu Zhou, Ke Xu, Wenzhe Tan, Hongbo Li
Research Collection School Of Computing and Information Systems
Robotized warehouses are deployed to automatically distribute millions of items brought by the massive logistic orders from e-commerce. A key to automated item distribution is to plan paths for robots, also known as task planning, where each task is to deliver racks with items to pickers for processing and then return the rack back. Prior solutions are unfit for large-scale robotized warehouses due to the inflexibility to time-varying item arrivals and the low efficiency for high throughput. In this paper, we propose a new task planning problem called TPRW, which aims to minimize the end-to-end makespan that incorporates the entire …
Exploring And Adapting Chinese Gpt To Pinyin Input Method, Minghuan Tan, Yong Dai, Duyu Tang, Zhangyin Feng, Guoping Huang, Jing Jiang, Jiwei Li, Shuming Shi
Research Collection School Of Computing and Information Systems
While GPT has become the de-facto method for text generation tasks, its application to the pinyin input method remains unexplored. In this work, we make the first exploration to leverage Chinese GPT for the pinyin input method. We find that a frozen GPT achieves state-of-the-art performance on perfect pinyin. However, the performance drops dramatically when the input includes abbreviated pinyin. A reason is that an abbreviated pinyin can be mapped to many perfect pinyin, which links to an even larger number of Chinese characters. We mitigate this issue with two strategies, including enriching the context with pinyin and optimizing the training process to …
Cancel Culture: Who Or What Will Be Next?, Christine Trumper
Honors Projects in Data Science
This paper utilizes data science and applied statistics techniques to perform an analytical dive into Cancel Culture as it is referenced and used on Twitter. The research focuses on analyzing how Cancel Culture has affected the sentiment of Twitter, specifically how it impacts prominent topics in the media that occurred between February 2021 and September 2021. The development of a topic and sentiment analysis will be based on 1,302,844 Tweets collected using Twitter’s API. Cancel Culture became popularized on social media in the past few years and there is little concrete information regarding its process and the demographics it …
Performance Comparison Of The Filesystem And Embedded Key-Value Databases, Jesse Hines, Nicholas Cunningham
Campus Research Day
A common scenario when developing local applications is storing many records and then retrieving them by ID. A developer can simply save the records as files or use an embedded database. Large numbers of files can slow down filesystems, but developers may want to avoid a dependency on an embedded database if it offers little benefit for their use case. We will compare the performance for the insert, update, get and delete operations and the space efficiency of storing records as files vs. using key-value embedded databases including RocksDB, LevelDB, Berkeley DB, and SQLite.
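The two storage options contrasted in the abstract above can be sketched as follows. This is a minimal illustration, not the authors' benchmark: the class names `FileStore` and `SQLiteStore`, the helper `time_puts`, the table layout, and the record payload are all assumptions made for the example, and only SQLite (via Python's standard `sqlite3` module) stands in for the embedded databases.

```python
import sqlite3
import tempfile
import time
from pathlib import Path
from typing import Optional


class FileStore:
    """Hypothetical store: one file per record, named after the record ID."""

    def __init__(self, root: Path) -> None:
        self.root = root
        self.root.mkdir(parents=True, exist_ok=True)

    def put(self, record_id: str, value: bytes) -> None:
        # Insert and update are the same operation: overwrite the file.
        (self.root / record_id).write_bytes(value)

    def get(self, record_id: str) -> Optional[bytes]:
        path = self.root / record_id
        return path.read_bytes() if path.exists() else None

    def delete(self, record_id: str) -> None:
        (self.root / record_id).unlink(missing_ok=True)


class SQLiteStore:
    """Hypothetical store: a single two-column table used as a key-value map."""

    def __init__(self, db_path: Path) -> None:
        self.conn = sqlite3.connect(str(db_path))
        self.conn.execute(
            "CREATE TABLE IF NOT EXISTS records (id TEXT PRIMARY KEY, value BLOB)"
        )
        self.conn.commit()

    def put(self, record_id: str, value: bytes) -> None:
        self.conn.execute(
            "INSERT OR REPLACE INTO records (id, value) VALUES (?, ?)",
            (record_id, value),
        )
        self.conn.commit()  # Commit per record; batching commits would be faster.

    def get(self, record_id: str) -> Optional[bytes]:
        row = self.conn.execute(
            "SELECT value FROM records WHERE id = ?", (record_id,)
        ).fetchone()
        return row[0] if row else None

    def delete(self, record_id: str) -> None:
        self.conn.execute("DELETE FROM records WHERE id = ?", (record_id,))
        self.conn.commit()

    def close(self) -> None:
        self.conn.close()


def time_puts(store, count: int) -> float:
    """Return the seconds taken to insert `count` small records into `store`."""
    start = time.perf_counter()
    for i in range(count):
        store.put(f"rec{i}", b"payload")
    return time.perf_counter() - start


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        files = FileStore(Path(tmp) / "records")
        table = SQLiteStore(Path(tmp) / "records.db")
        print("files :", time_puts(files, 1000))
        print("sqlite:", time_puts(table, 1000))
        table.close()
```

Timings from a sketch like this depend heavily on the filesystem, record size, and commit policy, so they should not be read as indicative of the study's results.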
Database Query Execution Through Virtual Reality, Logan Bateman, Marc Butler
Campus Research Day
Building database queries often requires technical knowledge of a query language. However, company employees, such as executives, managers, and others (outside of software research and development, generally) may not have the prerequisite knowledge to accurately construct and execute database queries. This paper proposes an approach to constructing database queries using virtual reality. This approach utilizes natural hand or controller gestures which map to various components of building and visualizing database queries.