UMBC Information Systems Department

Permanent URI for this collection: http://hdl.handle.net/11603/51

Recent Submissions

Now showing 1 - 20 of 1068
  • Item
    Assessing Annotation Accuracy in Ice Sheets Using Quantitative Metrics
    (IEEE, 2024-09-05) Tama, Bayu Adhi; Janeja, Vandana; Purushotham, Sanjay
    The increasing threat of sea level rise due to climate change necessitates a deeper understanding of ice sheet structures. This study addresses the need for accurate ice sheet data interpretation by introducing a suite of quantitative metrics designed to validate ice sheet annotation techniques. Focusing on both manual and automated methods, including ARESELP and its modified version, MARESELP, we assess their accuracy against expert annotations. Our methodology incorporates several computer vision metrics, traditionally under-utilized in glaciological research, to evaluate the continuity and connectivity of ice layer annotations. The results demonstrate that while manual annotations provide invaluable expert insights, automated methods, particularly MARESELP, improve layer continuity and alignment with expert labels.
  • Item
    Creating Geospatial Trajectories from Human Trafficking Text Corpora
    (2024-05-09) Karabatis, Saydeh N.; Janeja, Vandana
    Human trafficking is a crime that affects the lives of millions of people across the globe. Traffickers exploit the victims through forced labor, involuntary sex, or organ harvesting. Migrant smuggling could also be seen as a form of human trafficking when the migrant fails to pay the smuggler and is forced into coerced activities. Several news agencies and anti-trafficking organizations have reported trafficking survivor stories that include the names of locations visited along the trafficking route. Identifying such routes can provide knowledge that is essential to preventing such heinous crimes. In this paper we propose a Narrative to Trajectory (N2T) information extraction system that analyzes reported narratives, extracts relevant information through the use of Natural Language Processing (NLP) techniques, and applies geospatial augmentation in order to automatically plot trajectories of human trafficking routes. We evaluate N2T on human trafficking text corpora and demonstrate that our approach of utilizing data preprocessing and augmenting database techniques with NLP libraries outperforms existing geolocation detection methods.
  • Item
    Harnessing Feature Clustering For Enhanced Anomaly Detection With Variational Autoencoder And Dynamic Threshold
    (IEEE, 2024-09-05) Ale, Tolulope; Janeja, Vandana; Schlegel, Nicole-Jeanne
    We introduce an anomaly detection method for multivariate time series data with the aim of identifying critical periods and features influencing extreme climate events like snowmelt in the Arctic. This method leverages a Variational Autoencoder (VAE) integrated with dynamic thresholding and correlation-based feature clustering. This framework enhances the VAE’s ability to identify localized dependencies and learn the temporal relationships in climate data, thereby improving the detection of anomalies as demonstrated by its higher F1-score on benchmark datasets. The study’s main contributions include the development of a robust anomaly detection method, improving feature representation within VAEs through clustering, and creating a dynamic threshold algorithm for localized anomaly detection. This method offers explainability of climate anomalies across different regions.
  • Item
    Deep Learning for Antarctic Sea Ice Anomaly Detection and Prediction: A Two-Module Framework
    (ACM, 2024-11-06) Devnath, Maloy Kumar; Chakraborty, Sudip; Janeja, Vandana
    The Antarctic sea ice cover plays a crucial role in regulating global climate and sea level rise. The recent retreat of the Antarctic Sea Ice Extent and the accelerated melting of ice sheets (which causes sea level rise) raise concerns about the impact of climate change. Understanding the spatial patterns of anomalous melting events in sea ice is crucial for improving climate models and predicting future sea level rise, as sea ice serves as a protective barrier for ice sheets. This paper proposes a two-module framework based on Deep Learning that utilizes satellite imagery to identify and predict non-anomalous and anomalous melting regions in Antarctic sea ice. The first module focuses on identifying non-anomalous and anomalous melting regions in the current day by analyzing the difference between consecutive satellite images over time. The second module then leverages the current day's information and predicts the next day's non-anomalous and anomalous melting regions. This approach aims to improve our ability to monitor and predict critical changes in the Antarctic sea ice cover.
  • Item
    Organizing for More Just and Inclusive Futures: A Community Discussion
    (ACM, 2024-11-13) Fernandes, Kim; Alharbi, Rahaf; Sum, Cella; Kameswaran, Vaishnav; Spektor, Franchesca; Thuppilikkat, Ashique Ali; Petterson, Adrian; Marathe, Megh; Hamidi, Foad; Chandra, Priyank
    This Special Interest Group brings together researchers and practitioners to examine the critical questions, innovative methods and emerging possibilities that arise from an orientation toward disability justice within CSCW research particularly and HCI research more broadly. We will focus on how digital technologies influence the ways disabled people organize and advocate for their rights, and how disabled people influence and configure technologies as well. By attending to the intersections of technology, disability justice, and social movements, we aim to explore how HCI and CSCW research can support the organizing efforts of disabled communities. This SIG emphasizes the ways in which disabled people and communities have been organizing and are continuing to organize in response to various forms of oppression. The SIG will provide a platform for scholars and activists to engage in conversations around technologies, disability justice, and social movements. By centering disability justice as a framework, we hope to foster a deeper understanding of how HCI and CSCW research can support and amplify the efforts of disabled communities. Participants will share their insights, collaborate on research ideas, and contribute to a collective vision of a more inclusive and justice-oriented HCI and CSCW. Through these discussions, we aim to generate actionable strategies for future research and practice in supporting organizing efforts.
  • Item
    Toward Transdisciplinary Approaches to Audio Deepfake Discernment
    (2024-11-08) Janeja, Vandana; Mallinson, Christine
    This perspective calls for scholars across disciplines to address the challenge of audio deepfake detection and discernment through an interdisciplinary lens across Artificial Intelligence methods and linguistics. With an avalanche of tools for the generation of realistic-sounding fake speech on one side, the detection of deepfakes is lagging on the other. Particularly hindering audio deepfake detection is the fact that current AI models lack a full understanding of the inherent variability of language and the complexities and uniqueness of human speech. We see the promising potential in recent transdisciplinary work that incorporates linguistic knowledge into AI approaches to provide pathways for expert-in-the-loop and to move beyond expert agnostic AI-based methods for more robust and comprehensive deepfake detection.
  • Item
    Unsupervised Domain Adaptation for Action Recognition via Self-Ensembling and Conditional Embedding Alignment
    (2024-10-23) Ghosh, Indrajeet; Chugh, Garvit; Faridee, Abu Zaher Md; Roy, Nirmalya
    Recent advancements in deep learning-based wearable human action recognition (wHAR) have improved the capture and classification of complex motions, but adoption remains limited due to the lack of expert annotations and domain discrepancies from user variations. Limited annotations hinder the model's ability to generalize to out-of-distribution samples. While data augmentation can improve generalizability, unsupervised augmentation techniques must be applied carefully to avoid introducing noise. Unsupervised domain adaptation (UDA) addresses domain discrepancies by aligning conditional distributions with labeled target samples, but vanilla pseudo-labeling can lead to error propagation. To address these challenges, we propose μDAR, a novel joint optimization architecture comprised of three functions: (i) consistency regularizer between augmented samples to improve model classification generalizability, (ii) temporal ensemble for robust pseudo-label generation and (iii) conditional distribution alignment to improve domain generalizability. The temporal ensemble works by aggregating predictions from past epochs to smooth out noisy pseudo-label predictions, which are then used in the conditional distribution alignment module to minimize kernel-based class-wise conditional maximum mean discrepancy (kCMMD) between the source and target feature space to learn a domain invariant embedding. The consistency-regularized augmentations ensure that multiple augmentations of the same sample share the same labels; this results in (a) strong generalization with limited source domain samples and (b) consistent pseudo-label generation in target samples. The novel integration of these three modules in μDAR results in a range of ≈4-12% average macro-F1 score improvement over six state-of-the-art UDA methods in four benchmark wHAR datasets.
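The temporal-ensemble idea in the abstract above (aggregating predictions from past epochs to smooth noisy pseudo-labels) can be illustrated with a bias-corrected exponential moving average. This is only a minimal sketch: the function names, the `alpha` value, and the choice of an exponential moving average are assumptions for illustration, and the paper's actual aggregation rule may differ.

```python
def temporal_ensemble(epoch_probs, alpha=0.5):
    """Return the bias-corrected moving average of class probabilities per epoch.

    epoch_probs: list of per-epoch probability vectors for ONE sample
    alpha: weight kept from the running average (hypothetical default)
    """
    ema = [0.0] * len(epoch_probs[0])
    smoothed = []
    for t, probs in enumerate(epoch_probs, start=1):
        # Blend the running average with this epoch's prediction.
        ema = [alpha * e + (1.0 - alpha) * p for e, p in zip(ema, probs)]
        # Correct the start-up bias caused by initialising the average at zero.
        smoothed.append([e / (1.0 - alpha ** t) for e in ema])
    return smoothed


def pseudo_label(probs):
    """Pick the class index with the highest smoothed probability."""
    return max(range(len(probs)), key=probs.__getitem__)


# One sample whose raw prediction flips in the second epoch.
history = [[0.9, 0.1], [0.2, 0.8], [0.9, 0.1]]
smoothed = temporal_ensemble(history, alpha=0.5)
final_label = pseudo_label(smoothed[-1])
```

In this toy history the single noisy epoch does not decide the final pseudo-label, because the earlier and later epochs outweigh it in the average.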
  • Item
    SERN: Simulation-Enhanced Realistic Navigation for Multi-Agent Robotic Systems in Contested Environments
    (2024-10-22) Hossain, Jumman; Dey, Emon; Chugh, Snehalraj; Ahmed, Masud; Anwar, Mohammad Saeid; Faridee, Abu Zaher Md; Hoppes, Jason; Trout, Theron; Basak, Anjon; Chowdhury, Rafidh; Mistry, Rishabh; Kim, Hyun; Freeman, Jade; Suri, Niranjan; Raglin, Adrienne; Busart, Carl; Gregory, Timothy; Ravi, Anuradha; Roy, Nirmalya
    The increasing deployment of autonomous systems in complex environments necessitates efficient communication and task completion among multiple agents. This paper presents SERN (Simulation-Enhanced Realistic Navigation), a novel framework integrating virtual and physical environments for real-time collaborative decision-making in multi-robot systems. SERN addresses key challenges in asset deployment and coordination through a bi-directional communication framework using the AuroraXR ROS Bridge. Our approach advances the SOTA through accurate real-world representation in virtual environments using Unity high-fidelity simulator; synchronization of physical and virtual robot movements; efficient ROS data distribution between remote locations; and integration of SOTA semantic segmentation for enhanced environmental perception. Our evaluations show a 15% to 24% improvement in latency and up to a 15% increase in processing efficiency compared to traditional ROS setups. Real-world and virtual simulation experiments with multiple robots demonstrate synchronization accuracy, achieving less than 5 cm positional error and under 2-degree rotational error. These results highlight SERN's potential to enhance situational awareness and multi-agent coordination in diverse, contested environments.
  • Item
    TS-ACL: A Time Series Analytic Continual Learning Framework for Privacy-Preserving and Class-Incremental Pattern Recognition
    (2024-10-21) Fan, Kejia; Li, Jiaxu; Lai, Songning; Lv, Linpu; Liu, Anfeng; Tang, Jianheng; Song, Houbing; Zhuang, Huiping
    Class-incremental Learning (CIL) in Time Series Classification (TSC) aims to incrementally train models using the streaming time series data that arrives continuously. The main problem in this scenario is catastrophic forgetting, i.e., training models with new samples inevitably leads to the forgetting of previously learned knowledge. Among existing methods, the replay-based methods achieve satisfactory performance but compromise privacy, while exemplar-free methods protect privacy but suffer from low accuracy. However, more critically, owing to their reliance on gradient-based update techniques, these existing methods fundamentally cannot solve the catastrophic forgetting problem. In TSC scenarios with continuously arriving data and temporally shifting distributions, these methods become even less practical. In this paper, we propose a Time Series Analytic Continual Learning framework, called TS-ACL. Inspired by analytical learning, TS-ACL transforms neural network updates into gradient-free linear regression problems, thereby fundamentally mitigating catastrophic forgetting. Specifically, employing a pre-trained and frozen feature extraction encoder, TS-ACL only needs to update its analytic classifier recursively in a lightweight manner that is highly suitable for real-time applications and large-scale data processing. Additionally, we theoretically demonstrate that the model obtained recursively through the TS-ACL is exactly equivalent to a model trained on the complete dataset in a centralized manner, thereby establishing the property of absolute knowledge memory. Extensive experiments validate the superior performance of our TS-ACL.
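The abstract above describes replacing gradient updates with a recursively updated linear regression whose result matches training on the full dataset. A minimal sketch of that general idea, assuming a ridge-regularised least-squares classifier updated one sample at a time via the Sherman-Morrison identity, is shown below. The function names and the `ridge` parameter are hypothetical, and this is not the authors' implementation.

```python
def rls_init(dim, n_outputs, ridge=1.0):
    """Start with zero weights and P = (ridge * I)^-1."""
    P = [[(1.0 / ridge if i == j else 0.0) for j in range(dim)] for i in range(dim)]
    W = [[0.0] * n_outputs for _ in range(dim)]
    return W, P


def rls_update(W, P, x, y):
    """Fold one (feature vector, target vector) pair into W without gradients."""
    dim, n_outputs = len(x), len(y)
    Px = [sum(P[i][j] * x[j] for j in range(dim)) for i in range(dim)]
    denom = 1.0 + sum(x[i] * Px[i] for i in range(dim))
    # Sherman-Morrison rank-1 update of the inverse regularised Gram matrix
    # (P is symmetric, so P x x^T P has entries Px[i] * Px[j]).
    P_new = [[P[i][j] - Px[i] * Px[j] / denom for j in range(dim)] for i in range(dim)]
    gain = [sum(P_new[i][j] * x[j] for j in range(dim)) for i in range(dim)]
    pred = [sum(W[i][k] * x[i] for i in range(dim)) for k in range(n_outputs)]
    # Correct the weights by the prediction error on the new sample only.
    W_new = [[W[i][k] + gain[i] * (y[k] - pred[k]) for k in range(n_outputs)]
             for i in range(dim)]
    return W_new, P_new


def fit_stream(samples, dim, n_outputs, ridge=1.0):
    """Process samples one at a time; earlier samples are never revisited."""
    W, P = rls_init(dim, n_outputs, ridge)
    for x, y in samples:
        W, P = rls_update(W, P, x, y)
    return W


# Toy one-dimensional stream: the recursive result equals the closed-form
# ridge solution sum(x*y) / (sum(x*x) + ridge) computed on all data at once.
stream = [([1.0], [2.0]), ([2.0], [4.0]), ([3.0], [6.0])]
W_stream = fit_stream(stream, dim=1, n_outputs=1, ridge=1.0)
```

Because each update is exact rather than approximate, the order in which samples arrive does not change the final weights, which is the sense in which nothing already learned is forgotten.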
  • Item
    Tutorial on Causal Inference with Spatiotemporal Data
    (ACM, 2024-11-04) Ali, Sahara; Wang, Jianwu
    Spatiotemporal data, which captures how variables evolve across space and time, is ubiquitous in fields such as environmental science, epidemiology, and urban planning. However, identifying causal relationships in these datasets is challenging due to the presence of spatial dependencies, temporal autocorrelation, and confounding factors. This tutorial provides a comprehensive introduction to spatiotemporal causal inference, offering both theoretical foundations and practical guidance for researchers and practitioners. We explore key concepts such as causal inference frameworks, the impact of confounding in spatiotemporal settings, and the challenges posed by spatial and temporal dependencies. The paper covers synthetic spatiotemporal benchmark data generation, widely used spatiotemporal causal inference techniques, including regression-based, propensity score-based, and deep learning-based methods, and demonstrates their application using synthetic datasets. Through step-by-step examples, readers will gain a clear understanding of how to address common challenges and apply causal inference techniques to spatiotemporal data. This tutorial serves as a valuable resource for those looking to improve the rigor and reliability of their causal analyses in spatiotemporal contexts.
  • Item
    Week 2: Linear Classifiers, Logistic Regression, Bias-Variance Trade-off, and Regularization
    (2024) Rahman, Mohammad Saidur; Rahman, Mohammad Ishtiaque
    This week, we will explore fundamental machine learning techniques that are widely used for classification tasks: Linear Classifiers and Logistic Regression. Additionally, we will cover core concepts like the Bias-Variance Trade-off and Regularization, which help in understanding the performance and generalization of machine learning models. These concepts are essential for building accurate and interpretable models that can classify data and predict outcomes in various fields. Understanding when and why to use these techniques is key to solving different types of problems in machine learning.
  • Item
    QuasiNav: Asymmetric Cost-Aware Navigation Planning with Constrained Quasimetric Reinforcement Learning
    (2024-10-22) Hossain, Jumman; Faridee, Abu Zaher Md; Asher, Derrik; Freeman, Jade; Trout, Theron; Gregory, Timothy; Roy, Nirmalya
    Autonomous navigation in unstructured outdoor environments is inherently challenging due to the presence of asymmetric traversal costs, such as varying energy expenditures for uphill versus downhill movement. Traditional reinforcement learning methods often assume symmetric costs, which can lead to suboptimal navigation paths and increased safety risks in real-world scenarios. In this paper, we introduce QuasiNav, a novel reinforcement learning framework that integrates quasimetric embeddings to explicitly model asymmetric costs and guide efficient, safe navigation. QuasiNav formulates the navigation problem as a constrained Markov decision process (CMDP) and employs quasimetric embeddings to capture directionally dependent costs, allowing for a more accurate representation of the terrain. This approach is combined with adaptive constraint tightening within a constrained policy optimization framework to dynamically enforce safety constraints during learning. We validate QuasiNav across three challenging navigation scenarios (undulating terrains, asymmetric hill traversal, and directionally dependent terrain traversal), demonstrating its effectiveness in both simulated and real-world environments. Experimental results show that QuasiNav significantly outperforms conventional methods, achieving higher success rates, improved energy efficiency, and better adherence to safety constraints.
  • Item
    The Impact of Medicaid Expansion on Medicare Quality Measures
    (2024-11-05) Algrain, Hala; Cardosa, Elizabeth; Desai, Shekha; Fong, Eugene; Ringoir, Tanguy; Ashqar, Huthaifa
    The Affordable Care Act was signed into law in 2010, expanding Medicaid and improving access to care for millions of low-income Americans. Fewer uninsured individuals reduced the cost of uncompensated care, consequently improving the financial health of hospitals. We hypothesize that this amelioration in hospital finances resulted in a marked improvement of quality measures in states that chose to expand Medicaid. To our knowledge, the impact of Medicaid expansion on the Medicare population has not been investigated. Using a difference-in-difference analysis, we compare readmission rates for four measures from the Hospital Readmission Reduction Program: acute myocardial infarction, pneumonia, heart failure, and coronary artery bypass graft surgery. Our analysis provides evidence that between 2013 and 2021 expansion states improved hospital quality relative to non-expansion states as it relates to acute myocardial infarction readmissions (p = 0.015) and coronary artery bypass graft surgery readmissions (p = 0.039). Our analysis provides some evidence that expanding Medicaid improved hospital quality, as measured by a reduction in readmission rates. Using visualizations, we provide some evidence that hospital quality improved for the other two measures as well. We believe that a refinement of our estimation method and an improved dataset will increase our chances of finding significant results for these two other measures.
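For readers unfamiliar with the difference-in-difference comparison mentioned in the abstract above, the basic two-group, two-period estimate can be sketched as follows. The function name and all readmission figures are invented purely for illustration and are not taken from the study, which would typically use a regression formulation with additional controls.

```python
from statistics import mean


def did_estimate(treated_pre, treated_post, control_pre, control_post):
    """Two-group, two-period difference-in-differences estimate.

    Subtracting the control group's change removes trends shared by both
    groups, leaving the change associated with the treatment.
    """
    treated_change = mean(treated_post) - mean(treated_pre)
    control_change = mean(control_post) - mean(control_pre)
    return treated_change - control_change


# Hypothetical readmission rates (percent); NOT data from the paper.
expansion_pre = [16.0, 18.0]
expansion_post = [14.0, 15.0]
non_expansion_pre = [17.0, 17.0]
non_expansion_post = [16.0, 17.0]

effect = did_estimate(expansion_pre, expansion_post,
                      non_expansion_pre, non_expansion_post)
```

A negative value here would indicate that readmission rates fell more in the hypothetical expansion group than in the comparison group over the same period.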
  • Item
    CMAD: Advancing Understanding of Geospatial Clusters of Anomalous Melt Events in Sea Ice Extent
    (ACM, 2024-11-22) Devnath, Maloy Kumar; Chakraborty, Sudip; Janeja, Vandana
    Traditional statistical analyses do not reveal the spatial locations and the temporal occurrences of clusters of anomalous events that are responsible for a significant loss of sea ice extent. To address this problem, we present a novel method named Convolution Matrix Anomaly Detection (CMAD). The onset and progression of clusters of anomalous melting events over the Antarctic Sea ice are studied as loss in sea ice extent, which are essentially negative values, where the traditional convolutional operation of the Convolutional Neural Network (CNN) approach is ineffective. CMAD is based on an inverse max pooling concept in the convolutional operation of CNN to address this gap. CMAD is developed to offer a solution without using a neural network, and unlike a full CNN, it doesn’t require any training or testing processes. Satellite images are utilized to establish the loss in the Antarctic region. Our analysis shows that anomalous melting patterns have significantly affected the Weddell and the Ross Sea regions more than any other regions of the Antarctic, consistent with the largest disappearance in sea ice extent over these two regions. These findings bolster the applicability of the inverse max pooling based CMAD in detecting the spatiotemporal evolution of clusters of anomalous melting events over the Antarctic region. The anomalous melting process was first noticed along the outer boundary of the sea ice extent in early September 2022 and gradually engulfed the entire sea ice region by February 2023 - in tandem with the scientific literature. These findings indicate that there is a necessity to delve deeper into the role of the anomalous melting process on sea ice retreat for a better understanding of the sea ice retreat process. The nature of the problem is to detect clusters of contiguous grids of anomalous melting events rather than detecting discrete grid points. 
    CMAD’s ability to perform both data clustering and anomaly detection via the pooling operations allows for a more comprehensive analysis of sea ice melt patterns, facilitating the pinpointing of areas with potentially significant melt events. This method has the potential to be applied in other fields of study where anomalous events are detected in clusters. The inverse max pooling concept has successfully detected clusters of anomalous events in sea ice and demonstrated the capability to detect anomalies with 87% accuracy in benchmark data. In contrast to well-established conventional methods such as DBSCAN, HDBSCAN, K-Means, Bisecting K-Means, BIRCH, Agglomerative Clustering, OPTICS, and Gaussian Mixtures, when applied to dynamic multidimensional data, CMADBenchmark (which is a variation of CMAD) exhibits superior capabilities in detecting extreme events. The comparative analysis reveals that CMADBenchmark outperforms these traditional approaches, showcasing its heightened sensitivity and efficacy in capturing significant variations within evolving multidimensional datasets over time. This heightened detection accuracy positions CMAD as a valuable tool for discerning extreme events in the context of dynamic and changing multidimensional data.
  • Item
    Exploring Affective Dimension Perception from Bodily Expressions and Electrodermal Activity in Paramedic Simulation Training
    (IEEE, 2022-11-25) Surely, Akiri; Taherzadeh, Sanaz; Joshi, Vasundhara; Kleinsmith, Andrea
    Paramedics are often involved in varied and complex, emotionally provoking emergency calls which can result in difficulty controlling their affective experience. As a result, their internal physiological state may “leak” out through their external visual and auditory behaviors which can affect patient care. This research aims to identify how this ‘leakage’ may be perceived by observers, what commonalities exist in how the affective dimensions seem to be expressed through the body and the relationship between these dimensions and trainees' electrodermal activity (EDA). We conducted a preliminary study with a small set of knowledgeable observers to continuously rate trainees' valence, arousal and dominance from behavioral data in thin slices of simulation videos. We analyzed the relationship between trainees' EDA and observers' independent ratings. Our findings show a significant agreement on and correlation between the observers' ratings for all dimensions and preliminary modeling indicates a significant relationship.
  • Item
    ALDAS: Audio-Linguistic Data Augmentation for Spoofed Audio Detection
    (2024-10-21) Khanjani, Zahra; Mallinson, Christine; Foulds, James; Janeja, Vandana
    Spoofed audio, i.e. audio that is manipulated or AI-generated deepfake audio, is difficult to detect when only using acoustic features. Some recent innovative work involving AI-spoofed audio detection models augmented with phonetic and phonological features of spoken English, manually annotated by experts, led to improved model performance. While this augmented model produced substantial improvements over traditional acoustic features based models, a scalability challenge motivates inquiry into auto labeling of features. In this paper we propose an AI framework, Audio-Linguistic Data Augmentation for Spoofed audio detection (ALDAS), for auto labeling linguistic features. ALDAS is trained on linguistic features selected and extracted by sociolinguistics experts; these auto labeled features are used to evaluate the quality of ALDAS predictions. Findings indicate that while the detection enhancement is not as substantial as when involving the pure ground truth linguistic features, there is improvement in performance while achieving auto labeling. Labels generated by ALDAS are also validated by the sociolinguistics experts.
  • Item
    ETBP-TD: An Efficient and Trusted Bilateral Privacy-Preserving Truth Discovery Scheme for Mobile Crowdsensing
    (IEEE, 2024-10-31) Bai, Jing; Gui, Jinsong; Wang, Tian; Song, Houbing; Liu, Anfeng; Xiong, Neal N.
    Mobile Crowdsensing (MCS) has emerged as a promising sensing paradigm for accomplishing large-scale tasks by leveraging ubiquitously distributed mobile workers. Due to the variability in sensory data provided by different workers, identifying truth values from them has garnered wide attention. However, existing truth discovery schemes either offer limited privacy protection or incur high participation costs and lower data aggregation quality due to malicious workers. In this paper, we propose an Efficient and Trusted Bilateral Privacy-preserving Truth Discovery scheme (ETBP-TD) to obtain high-quality truth values while preventing privacy leakage from both workers and the data requester. Specifically, a matrix encryption-based protocol is introduced to the whole truth discovery process, which keeps locations and data related to tasks and workers secret from other entries. Additionally, trust-based worker recruitment and trust update mechanisms are first integrated within a privacy-preserving truth discovery scheme to enhance truth value accuracy and reduce unnecessary participation costs. Our theoretical analyses on the security and regret of ETBP-TD, along with extensive simulations on real-world datasets, demonstrate that ETBP-TD effectively preserves workers' and tasks' privacy while reducing the estimated error by up to 84.40% and participation cost by 54.72%.
  • Item
    Parsing Post-deployment Students’ Feedback: Towards a Student-Centered Intelligent Monitoring System to Support Self-regulated Learning
    (Springer Nature, 2024-07-02) Javiya, Prachee; Kleinsmith, Andrea; Karen Chen, Lujie; Fritz, John
    A student-centered intelligent monitoring system collects data from students and provides insights and feedback to students about various aspects of the learning process. It often leverages data collected from educational technology systems, such as Learning Management Systems (LMS), to support students’ self-regulated learning. A well-designed system needs to strike a delicate balance between the power of machine intelligence (e.g., automatic characterization and inference of students’ behaviors) and the need to promote students’ agency (e.g., the desire to be responsible and take control of their own behaviors). This paper presents a comprehensive qualitative analysis of anonymous student survey data collected from over 500 students with experience with a Learning Activity Monitoring System (LAMS), which has been in operation for over a decade in a public minority-serving higher education institute in the US. The study offers valuable insights into the effectiveness and user perceptions of LAMS they use in their learning context. This analysis reveals the sense-making process or lack thereof, and its implication in exploiting the potential utility of the LAMS. The findings also highlight students’ varied expectations and requirements, providing critical insights for the ongoing development and refinement of the LAMS system toward an intelligent monitoring system that truly centers students’ agency and promotes self-regulated learning. This study contributes to the growing body of work in hearing and understanding students’ genuine voices and the sparse literature of large-scale qualitative analysis of students’ feedback during the post-deployment phase on student-facing data-driven monitoring systems in an ecologically valid context in higher education.
  • Item
    Automating Quantum Software Maintenance: Flakiness Detection and Root Cause Analysis
    (2024-10-31) Sivaloganathan, Janakan; Jamshidi, Ainaz; Miranskyy, Andriy; Zhang, Lei
    Flaky tests, which pass or fail inconsistently without code changes, are a major challenge in software engineering in general and in quantum software engineering in particular due to their complexity and probabilistic nature, leading to hidden issues and wasted developer effort. We aim to create an automated framework to detect flaky tests in quantum software and an extended dataset of quantum flaky tests, overcoming the limitations of manual methods. Building on prior manual analysis of 14 quantum software repositories, we expanded the dataset and automated flaky test detection using transformers and cosine similarity. We conducted experiments with Large Language Models (LLMs) from the OpenAI GPT and Meta LLaMA families to assess their ability to detect and classify flaky tests from code and issue descriptions. Embedding transformers proved effective: we identified 25 new flaky tests, expanding the dataset by 54%. Top LLMs achieved an F1-score of 0.8871 for flakiness detection but only 0.5839 for root cause identification. We introduced an automated flaky test detection framework using machine learning, showing promising results but highlighting the need for improved root cause detection and classification in large quantum codebases. Future work will focus on improving detection techniques and developing automatic flaky test fixes.
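The abstract above mentions detecting flaky tests with embeddings and cosine similarity. One plausible reading, sketched here under stated assumptions, is to compare the embedding of a candidate issue or test against embeddings of known flaky cases and flag close matches. The vectors, the function names, and the 0.8 threshold are all hypothetical, and the embedding model itself is out of scope for this sketch.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # Treat a zero vector as dissimilar to everything.
    return dot / (norm_a * norm_b)


def flag_similar(candidates, known_flaky, threshold=0.8):
    """Return names of candidates close to any known flaky-test embedding.

    candidates: dict mapping a name to its embedding vector
    known_flaky: list of embedding vectors for confirmed flaky cases
    threshold: hypothetical similarity cut-off
    """
    flagged = []
    for name, vec in candidates.items():
        best = max((cosine_similarity(vec, ref) for ref in known_flaky), default=0.0)
        if best >= threshold:
            flagged.append(name)
    return flagged


# Toy two-dimensional "embeddings" standing in for real model output.
known = [[1.0, 0.0]]
to_check = {"issue_a": [1.0, 0.1], "issue_b": [0.0, 1.0]}
suspects = flag_similar(to_check, known)
```

Flagged candidates would still need manual or model-based confirmation, since similarity to a known flaky case is only a signal and not a diagnosis.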