UMBC Information Systems Department
Permanent URI for this collection: http://hdl.handle.net/11603/51
Recent Submissions
Item: On Testing and Debugging Quantum Software (2021-03-16)
Miranskyy, Andriy; Zhang, Lei; Doliskani, Javad
Quantum computers are becoming more mainstream. As more programmers are starting to look at writing quantum programs, they need to test and debug their code. In this paper, we discuss various use cases for quantum computers, either standalone or as part of a System of Systems. Based on these use cases, we discuss some testing and debugging tactics that one can leverage to ensure the quality of the quantum software. We also highlight quantum-computer-specific issues and list novel techniques that are needed to address these issues. Practitioners can readily apply some of these tactics to their process of writing quantum programs, while researchers can learn about opportunities for future work.

Item: On Testing Quantum Programs (IEEE, 2019-05)
Miranskyy, Andriy; Zhang, Lei
A quantum computer (QC) can solve many computational problems more efficiently than a classic one. The field of QCs is growing: companies (such as D-Wave, IBM, Google, and Microsoft) are building QC offerings. We position that software engineers should look into defining a set of software engineering practices that apply to QC software. To start this process, we give examples of challenges associated with testing such software and sketch potential solutions to some of these challenges.

Item: Making Existing Software Quantum Safe: Lessons Learned (Elsevier, 2023-05-23)
Zhang, Lei; Miranskyy, Andriy; Rjaibi, Walid; Stager, Greg; Gray, Michael; Peck, John
The software engineering community is facing challenges from quantum computers (QCs). In the era of quantum computing, Shor's algorithm running on QCs can break asymmetric encryption algorithms that classical computers practically cannot. Though the exact date when QCs will become "dangerous" for practical problems is unknown, the consensus is that this future is near.
Thus, the software engineering community needs to start making software ready for quantum attacks and to ensure quantum safety proactively. We argue that the problem of evolving existing software into quantum-safe software is very similar to the Y2K bug. We therefore leverage some best practices from the Y2K remediation effort and propose a roadmap, called 7E, which gives developers a structured way to prepare for quantum attacks. It is intended to help developers plan for the creation of new software and the evolution of cryptography in existing software. In this paper, we use a case study to validate the viability of 7E. Our software under study is the IBM Db2 database system. We upgrade its current cryptographic schemes to post-quantum ones (using the Kyber and Dilithium schemes) and report our findings and lessons learned. We show that the 7E roadmap effectively plans the evolution of existing software security features toward quantum safety, although it does require minor revisions; we incorporate our experience with IBM Db2 into the revised 7E roadmap. The U.S. Department of Commerce's National Institute of Standards and Technology is finalizing the post-quantum cryptographic standard, and the software engineering community needs to start preparing for the quantum-advantage era. We hope that our experiential study with IBM Db2 and the 7E roadmap will help the community prepare existing software for quantum attacks in a structured manner.

Item: LLM-based Corroborating and Refuting Evidence Retrieval for Scientific Claim Verification (2025-03-11)
Wang, Siyuan; Foulds, James; Gani, Md Osman; Pan, Shimei
In this paper, we introduce CIBER (Claim Investigation Based on Evidence Retrieval), an extension of the Retrieval-Augmented Generation (RAG) framework designed to identify corroborating and refuting documents as evidence for scientific claim verification.
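As a toy illustration only (not CIBER's actual scoring method), the idea of probing a model several times about the same claim and summarizing how consistently it answers can be sketched as a majority vote with an agreement score; the verdict labels here are hypothetical:

```python
from collections import Counter

def consistency_verdict(responses):
    # responses: the verdicts ("support" / "refute" / "neutral") a model gave
    # to the same claim under differently phrased probes.
    # Returns the majority verdict and the fraction of probes that agree with it.
    counts = Counter(responses)
    verdict, freq = counts.most_common(1)[0]
    return verdict, freq / len(responses)

verdict, agreement = consistency_verdict(["support", "support", "refute"])
```

A low agreement score flags claims where the model's answers are unstable across phrasings, which is the kind of behavioral signal the abstract describes.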
CIBER addresses the inherent uncertainty in Large Language Models (LLMs) by evaluating response consistency across diverse interrogation probes. Because it relies on behavioral analysis of LLMs rather than access to their internal state, CIBER is applicable to both white-box and black-box models. Furthermore, CIBER operates in an unsupervised manner, enabling easy generalization across scientific domains. Comprehensive evaluations using LLMs with varying levels of linguistic proficiency reveal CIBER's superior performance compared to conventional RAG approaches. These findings not only highlight the effectiveness of CIBER but also provide valuable insights for future advancements in LLM-based scientific claim verification.

Item: Is your quantum program bug-free? (ACM, 2020-09-18)
Miranskyy, Andriy; Zhang, Lei; Doliskani, Javad
Quantum computers are becoming more mainstream. As more programmers begin writing quantum programs, they face the inevitable task of debugging their code. How should programs for quantum computers be debugged? In this paper, we discuss existing debugging tactics used in developing programs for classic computers and show which ones can be readily adopted. We also highlight quantum-computer-specific debugging issues and list novel techniques that are needed to address these issues. Practitioners can readily apply some of these tactics to their process of writing quantum programs, while researchers can learn about opportunities for future work.

Item: Introduction to Correlation and Regression Analysis (SAS Institute, 2008)
Stockwell, Ian
SAS® has many tools that can be used for data analysis. From Freqs and Means to Tabulates and Univariates, SAS can present a synopsis of data values relatively easily. However, there is a difference between what the data are and what the data mean.
In order to take this next step, I would like to go beyond the basics and introduce correlation and hypothesis testing using regression models. A brief statistical background will be included, along with coding examples for correlation and linear regression.

Item: Integrating Frequency-Domain Representations with Low-Rank Adaptation in Vision-Language Models (2025-03-08)
Khan, Md Azim; Gangopadhyay, Aryya; Wang, Jianwu; Erbacher, Robert F.
Situational awareness applications rely heavily on real-time processing of visual and textual data to provide actionable insights. Vision-language models (VLMs) have become essential tools for interpreting complex environments by connecting visual inputs with natural language descriptions. However, these models often face computational challenges, especially when required to perform efficiently in real environments. This research presents a novel VLM framework that leverages frequency-domain transformations and low-rank adaptation (LoRA) to enhance feature extraction, scalability, and efficiency. Unlike traditional VLMs, which rely solely on spatial-domain representations, our approach incorporates Discrete Fourier Transform (DFT) based low-rank features while retaining pretrained spatial weights, enabling robust performance in noisy or low-visibility scenarios. We evaluated the proposed model on caption generation and Visual Question Answering (VQA) tasks using benchmark datasets with varying levels of Gaussian noise. Quantitative results demonstrate that our model achieves evaluation metrics comparable to state-of-the-art VLMs, such as CLIP ViT-L/14 and SigLIP.
Qualitative analysis further reveals that our model provides more detailed and contextually relevant responses, particularly for real-world images captured by a RealSense camera mounted on an Unmanned Ground Vehicle (UGV).

Item: Impact of increased anthropogenic Amazon wildfires on Antarctic Sea ice melt via albedo reduction (Cambridge University Press, 2025-03-10)
Chakraborty, Sudip; Devnath, Maloy Kumar; Jabeli, Atefeh; Kulkarni, Chhaya; Boteju, Gehan; Wang, Jianwu; Janeja, Vandana
This study shows the impact of black carbon (BC) aerosol atmospheric rivers (AAR) on Antarctic Sea ice retreat. We detect that a higher number of BC AARs arrived in the Antarctic region due to increased anthropogenic wildfire activity in the Amazon in 2019 compared to 2018. Our analyses suggest that the BC AARs led to a reduction in sea ice albedo, increased the amount of sunlight absorbed at the surface, and a significant reduction of sea ice over the Weddell, Ross Sea (Ross), and Indian Ocean (IO) regions in 2019. The Weddell region experienced the largest amount of sea ice retreat ( km2) during the presence of BC AARs as compared to km2 during non-BC days. We used a suite of data science techniques, including random forest, elastic net regression, matrix profile, canonical correlations, and causal discovery analyses, to discover the effects and validate them. Random forest, elastic net regression, and causal discovery analyses show that the shortwave upward radiative flux (the reflected sunlight), temperature, and longwave upward energy from the earth are the most important features that affect sea ice extent. Canonical correlation analysis confirms that aerosol optical depth is negatively correlated with albedo, positively correlated with shortwave energy absorbed at the surface, and negatively correlated with sea ice extent. The relationship is stronger in 2019 than in 2018.
This study also employs the matrix profile and the convolution operation of a Convolutional Neural Network (CNN) to detect anomalous events in sea ice loss. These methods show that a higher number of anomalous melting events were detected over the Weddell and Ross regions.

Item: Immutable Log Storage as a Service on Private and Public Blockchains (IEEE, 2023-01)
Pourmajidi, William; Zhang, Lei; Steinbacher, John; Erwin, Tony; Miranskyy, Andriy
Service Level Agreements (SLAs) are employed to ensure the performance of Cloud solutions. When a component fails, the importance of logs increases significantly. All departments may turn to logs to determine the cause of the issue and find the party at fault, and the party at fault may be motivated to tamper with the logs to hide their role. We argue that the critical nature of Cloud logs calls for an immutability and verification mechanism without the presence of a single trusted party. This article proposes such a mechanism by describing a blockchain-based log storage system, called Logchain, which can be integrated with existing private and public blockchain solutions. Logchain uses the immutability feature of blockchain to provide a tamper-resistant platform for log storage. Additionally, we propose a hierarchical structure to address blockchains' scalability issues. To validate the mechanism, we integrate Logchain into Ethereum and IBM Blockchain. We show that the solution is scalable, and we perform a cost-of-ownership analysis to help a reader select an implementation that would address their needs. Logchain's scalability improvement on a blockchain is achieved without any alteration of the blockchain's fundamental architecture.
As shown in this work, it can function on private and public blockchains and, therefore, can be a suitable alternative for organizations that need a secure, immutable log storage platform.

Item: Immutable Log Storage as a Service (IEEE, 2019-05)
Pourmajidi, William; Zhang, Lei; Steinbacher, John; Erwin, Tony; Miranskyy, Andriy
Logs contain critical information about the quality of the rendered services on the Cloud and can be used as digital evidence. Hence, we argue that the critical nature of logs calls for an immutability and verification mechanism without the presence of a single trusted party. In this paper, we propose a blockchain-based log system, called Logchain, which can be integrated with existing private and public blockchains. To validate the mechanism, we create Logchain as a Service (LCaaS) by integrating it with the Ethereum public blockchain network. We show that the solution is scalable (able to process 100 log files per second) and fast (able to "seal" a log file in 23 seconds, on average).

Item: GPT's Devastated and LLaMA's Content: Emotion Representation Alignment in LLMs for Keyword-based Generation (2025-03-14)
Choudhury, Shadab Hafiz; Kumar, Asha; Martin, Lara J.
In controlled text generation using large language models (LLMs), gaps arise between the language model's interpretation and human expectations. We look at the problem of controlling emotions in keyword-based sentence generation for both GPT-4 and LLaMA-3. We selected four emotion representations: Words, Valence-Arousal-Dominance (VAD) dimensions expressed in both Lexical and Numeric forms, and Emojis. Our human evaluation looked at the Human-LLM alignment for each representation, as well as the accuracy and realism of the generated sentences. While representations like VAD break emotions into easy-to-compute components, our findings show that people agree more with how LLMs generate when conditioned on English words (e.g., "angry") rather than VAD scales.
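The numeric-to-lexical VAD conversion mentioned in the abstract can be sketched as a simple binning of each dimension's score into a label; the thresholds and label set below are hypothetical illustrations, not the paper's actual mapping:

```python
def vad_to_lexical(score):
    # Bin a numeric VAD dimension score (assumed here to lie on a -5..+5
    # scale) into a lexical label, in the spirit of "+4.0 becomes 'High'".
    # Thresholds and labels are illustrative assumptions.
    if score >= 3.0:
        return "High"
    if score >= 1.0:
        return "Moderately High"
    if score > -1.0:
        return "Neutral"
    if score > -3.0:
        return "Moderately Low"
    return "Low"

label = vad_to_lexical(4.0)  # a high-valence score maps to "High"
```

Conditioning the LLM on labels like "High" rather than raw numbers like "+4.0" is the kind of change the study found improves Human-LLM agreement.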
This difference is especially visible when comparing Numeric VAD to words. However, we found that converting the originally numeric VAD scales to Lexical scales (e.g., +4.0 becomes "High") dramatically improved agreement. Furthermore, the perception of how much a generated sentence conveys an emotion is highly dependent on the LLM, the representation type, and which emotion it is.

Item: Estimating the Costs to Mississippi Medicaid Attributable to Tobacco Using Paid Amounts to Providers for Tobacco-Related Illnesses (The Hilltop Institute, 2019-06-03)
Woodcock, Cynthia; Stockwell, Ian; Middleton, Alice; Idala, David; Betley, Charles
Research Objective: Estimating the costs of tobacco-related illness incurred by...

Item: Design as Hope: Reimagining Futures for Seemingly Doomed Problems (2025-03-13)
Kim, JaeWon; Liu, Jiaying "Lizzy"; Pyle, Cassidy; Somanath, Sowmya; Popowski, Lindsay; Shen, Hua; Fiesler, Casey; Hayes, Gillian R.; Hiniker, Alexis; Ju, Wendy; Mueller, Florian "Floyd"; Arif, Ahmer; Kotturi, Yasmine
Design has the power to cultivate hope, especially in the face of seemingly intractable societal challenges. This one-day workshop explores how design methodologies -- ranging from problem reframing to participatory, speculative, and critical design -- can empower research communities to drive meaningful real-world change. By aligning design thinking with hope theory -- a framework that views hope as "goal-directed," "pathways," and "agentic" thinking processes -- we aim to examine how researchers can move beyond harm mitigation and instead reimagine alternative futures. Through hands-on activities, participants will engage in problem reframing, develop a taxonomy of design methods related to hope, and explore how community-driven design approaches can sustain efforts toward societal and individual hope. The workshop also interrogates the ethical and practical boundaries of leveraging hope in design research.
By the end of the session, participants will leave with concrete strategies for integrating a hopeful design approach into their research, as well as a network for ongoing collaboration. Ultimately, we position hopeful design not just as a practical tool for action and problem-solving but as a catalyst for cultivating resilience and envisioning transformative futures.

Item: Cost-Effective Care Coordination for People With Dementia at Home (Oxford University Press, 2020-05-01)
Willink, Amber; Davis, Karen; Johnston, Deirdre M.; Black, Betty; Reuland, Melissa; Stockwell, Ian; Amjad, Halima; Lyketsos, Constantine G.; Samus, Quincy M.
People with dementia (PWD) represent some of the highest-need and highest-cost individuals living in the community. Maximizing Independence (MIND) at Home is a potentially cost-effective and scalable home-based dementia care coordination program that uses trained, nonclinical community workers as the primary contact between the PWD and their care partner, supported by a multidisciplinary clinical team with expertise in dementia care. The cost of care management services, based on actual time spent by care management personnel over the first 12 months of the MIND at Home intervention, was calculated for 342 MIND at Home recipients from Baltimore, Maryland and surrounding areas participating in a Centers for Medicare and Medicaid Services (CMS) funded Health Care Innovation Award demonstration project. We performed a difference-in-differences analysis of claims-based Medicaid spending of 120 dually eligible MIND at Home participants against their propensity-score-matched comparison group (n = 360). The average cost per enrollee per month was $110, or $1,320 per annum. Medicaid expenditures of dually eligible participants grew 1.12 percentage points per quarter more slowly than those of the matched comparison group. Most savings came from slower growth in inpatient and long-term nursing home use.
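The per-enrollee cost figures quoted above follow from simple annualization; a minimal arithmetic check:

```python
def annualize(monthly_usd, months=12):
    # Annualize a monthly per-enrollee cost figure.
    return monthly_usd * months

# $110 per enrollee per month, as reported for MIND at Home care management,
# works out to $1,320 per annum.
annual = annualize(110)
```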
Net of the cost of the 5-year MIND at Home intervention, 5-year Medicaid savings are estimated at $7,052 per beneficiary, a 1.12-fold return on investment. Managed care plans with the flexibility to engage community health workers could benefit from a low-cost, high-touch intervention to meet the needs of enrollees with dementia. Limitations on using and reimbursing community health workers exist in Medicare fee-for-service, which CMS should address to maximize benefit for PWD.

Item: Correlation to Causation: A Causal Deep Learning Framework for Arctic Sea Ice Prediction (2025-03-03)
Hossain, Emam; Ferdous, Muhammad Hasan; Wang, Jianwu; Subramanian, Aneesh; Gani, Md Osman
Traditional machine learning and deep learning techniques rely on correlation-based learning, often failing to distinguish spurious associations from true causal relationships, which limits robustness, interpretability, and generalizability. To address these challenges, we propose a causality-driven deep learning framework that integrates the Multivariate Granger Causality (MVGC) and PCMCI+ causal discovery algorithms with a hybrid deep learning architecture. Using 43 years (1979-2021) of daily and monthly Arctic Sea Ice Extent (SIE) and ocean-atmospheric datasets, our approach identifies causally significant factors, prioritizes features with direct influence, reduces feature overhead, and improves computational efficiency. Experiments demonstrate that integrating causal features enhances the deep learning model's predictive accuracy and interpretability across multiple lead times. Beyond SIE prediction, the proposed framework offers a scalable solution for dynamic, high-dimensional systems, advancing both theoretical understanding and practical applications in predictive modeling.

Item: Community laboratories in the United States: BioMakerspaces for life science learning (Sage, 2022-12-14)
Walker, Justice T.; Stamato, Lydia; Asgarali-Hoffman, S. Nisa; Hamidi, Foad; Scheifele, Lisa Z.
Informal learning environments play a critical role in science, technology, engineering, and mathematics learning across the lifespan and are consequential in informing public understanding and engagement. This can be difficult to accomplish in life science, where expertise thresholds and the logistics involved with handling biological materials can restrict access. Community laboratories are informal learning environments that provide access to the resources necessary to carry out pursuits using enabling biotechnologies. We investigate a group of these spaces in order to ascertain how this occurs, with specific attention to how material and intellectual resources are structured and shape learning. Using surveys and focus group interviews, we explore a group of these spaces located in the United States. We found that the spaces examined offer learning activities that are sufficiently scaffolded and flexible as to promote personalized and community-driven practice. We discuss these findings in relation to informal learning environment design and learning.

Item: CAM-Seg: A Continuous-valued Embedding Approach for Semantic Image Generation (2025-03-19)
Ahmed, Masud; Hasan, Zahid; Haque, Syed Arefinul; Faridee, Abu Zaher Md; Purushotham, Sanjay; You, Suya; Roy, Nirmalya
Traditional transformer-based semantic segmentation relies on quantized embeddings. However, our analysis reveals that autoencoder accuracy on segmentation masks using quantized embeddings (e.g., VQ-VAE) is 8% lower than with continuous-valued embeddings (e.g., KL-VAE). Motivated by this, we propose a continuous-valued embedding framework for semantic segmentation. By reformulating semantic mask generation as a continuous image-to-embedding diffusion process, our approach eliminates the need for discrete latent representations while preserving fine-grained spatial and semantic details.
Our key contribution includes a diffusion-guided autoregressive transformer that learns a continuous semantic embedding space by modeling long-range dependencies in image features. Our framework contains a unified architecture combining a VAE encoder for continuous feature extraction, a diffusion-guided transformer for conditioned embedding generation, and a VAE decoder for semantic mask reconstruction. Our setting facilitates zero-shot domain adaptation capabilities, enabled by the continuity of the embedding space. Experiments across diverse datasets (e.g., Cityscapes and domain-shifted variants) demonstrate state-of-the-art robustness to distribution shifts, including adverse weather (e.g., fog, snow) and viewpoint variations. Our model also exhibits strong noise resilience, achieving robust performance (≈ 95% AP compared to baseline) under Gaussian noise, moderate motion blur, and moderate brightness/contrast variations, while experiencing only a moderate impact (≈ 90% AP compared to baseline) from 50% salt-and-pepper noise, saturation, and hue shifts. Code available: this https URL

Item: Approximate Mean Value Analysis for multi-core systems (IEEE, 2015-07)
Zhang, Lei; Down, Douglas G.
Mean Value Analysis (MVA) has long been a standard approach for performance analysis of computer systems. While the exact load-dependent MVA algorithm is an efficient technique for computer system performance modeling, it fails to address several features of multi-core platforms. In addition, the load-dependent MVA algorithm suffers from numerical difficulties under heavy load conditions. The goal of our paper is to find an efficient and robust method that is easy to use in practice and also achieves accurate performance predictions for multi-core platforms. Our contributions are as follows. We present a flow-equivalent performance model designed specifically to address multi-core computer systems.
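For background, the classical exact MVA recurrence that this line of work builds on can be sketched in a few lines; this is the textbook load-independent version for a closed queueing network without think time, not the authors' load-dependent or CMVA variants:

```python
def mva(demands, n_customers):
    """Exact Mean Value Analysis for a closed network of load-independent
    queueing stations (no think time).

    demands: mean service demand D_k of one customer at station k.
    Returns (throughput, residence_times, queue_lengths) at n_customers.
    """
    queue = [0.0] * len(demands)  # mean queue length at each station
    throughput, residence = 0.0, list(demands)
    for n in range(1, n_customers + 1):
        # Arrival theorem: an arriving customer sees the station as it looks
        # in the same network with one fewer customer circulating.
        residence = [d * (1.0 + q) for d, q in zip(demands, queue)]
        throughput = n / sum(residence)          # Little's law over the cycle
        queue = [throughput * r for r in residence]
    return throughput, residence, queue

x, r, q = mva([0.2, 0.1], 20)  # throughput saturates near 1/max(D) = 5
```

Because the recurrence walks population levels 1..N, throughput is non-decreasing in N and bounded by the bottleneck station's 1/D, which is also where the numerical fragility of the load-dependent extension (the paper's motivation) arises.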
We identify the influence of Dynamic Frequency Scaling (DFS) and Hyper-Threading Technology (HTT) on CPU demand. We adopt an approximation technique to estimate resource demands and parameterize the MVA algorithm. We use a modified Conditional MVA (CMVA) algorithm to address the potential numerical instability. To validate the application of our method, we investigate a case study of an e-commerce web server that serves diverse classes of user requests. We show that our method achieves better accuracy compared with other commonly used MVA algorithms.

Item: xIDS-EnsembleGuard: An Explainable Ensemble Learning-based Intrusion Detection System (2025-03-01)
Adil, Muhammad; Jan, Mian Ahmad; Hakim, Safayat Bin; Song, Houbing; Jin, Zhanpeng
In this paper, we focus on addressing the challenges of detecting malicious attacks in networks by designing an advanced Explainable Intrusion Detection System (xIDS). Existing machine learning and deep learning approaches have hidden limitations, such as potential biases in predictions, a lack of interpretability, and the risk of overfitting to training data. These issues can create doubt about their usefulness and transparency and decrease trust among stakeholders. To overcome these challenges, we propose an ensemble learning technique called "EnsembleGuard." This approach uses the predicted outputs of multiple models, including tree-based methods (LightGBM, GBM, Bagging, XGBoost, CatBoost) and deep learning models such as LSTM (long short-term memory) and GRU (gated recurrent unit), to maintain a balance and achieve trustworthy results. Our work is unique because it combines both tree-based and deep learning models to design an interpretable and explainable meta-model through model distillation. By considering the predictions of all individual models, our meta-model effectively addresses key challenges and ensures both explainable and reliable results.
We evaluate our model using well-known datasets, including UNSW-NB15, NSL-KDD, and CIC-IDS-2017, to assess its reliability against various types of attacks. During this analysis, we found that our model outperforms both the individual tree-based models and other comparative approaches in different attack scenarios.

Item: VIVAR: learning view-invariant embedding for video action recognition (SPIE, 2025-03-10)
Hasan, Zahid; Ahmed, Masud; Faridee, Abu Zaher Md; Purushotham, Sanjay; Lee, Hyungtae; Kwon, Heesung; Roy, Nirmalya
Deep learning has achieved state-of-the-art video action recognition (VAR) performance by comprehending action-related features from raw video. However, these models often learn to jointly encode auxiliary view information (viewpoints and sensor properties) with primary action features, leading to performance degradation under novel views and to security concerns, since they reveal sensor types and locations. Here, we systematically study these shortcomings of VAR models and develop a novel approach, VIVAR, to learn view-invariant spatiotemporal action features by removing view information. In particular, we leverage contrastive learning to separate actions and jointly optimize an adversarial loss that aligns view distributions to remove auxiliary view information in the deep embedding space, using unlabeled synchronous multiview (MV) video to learn a view-invariant VAR system. We evaluate VIVAR using our in-house large-scale time-synchronous MV video dataset containing 10 actions with three angular viewpoints and sensors in diverse environments. VIVAR successfully captures view-invariant action features, improves inter- and intra-action cluster quality, and consistently outperforms SoTA models with 8% higher accuracy. We additionally perform extensive studies with our datasets, model architectures, multiple contrastive learning variants, and view distribution alignments to provide insights into VIVAR. We open-source our code and dataset to facilitate further research in view-invariant systems.
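The contrastive-learning component described in the VIVAR abstract above can be illustrated with a generic InfoNCE-style loss over embeddings (a pure-Python sketch of the general technique, not VIVAR's actual implementation; the loss drops as same-action embeddings move together and different-action embeddings move apart):

```python
import math

def cosine(u, v):
    # Cosine similarity between two (assumed nonzero) embedding vectors.
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv)

def info_nce(anchor, positive, negatives, tau=0.1):
    # InfoNCE-style contrastive loss: the anchor should be closer to the
    # positive (same action, e.g. another view) than to the negatives.
    sims = [cosine(anchor, positive)] + [cosine(anchor, n) for n in negatives]
    exps = [math.exp(s / tau) for s in sims]
    return -math.log(exps[0] / sum(exps))

# Well-separated clusters yield a small loss; a mismatched positive a large one.
tight = info_nce([1.0, 0.0], [1.0, 0.0], [[0.0, 1.0]])
loose = info_nce([1.0, 0.0], [0.0, 1.0], [[1.0, 0.0]])
```

VIVAR pairs a loss of this kind with an adversarial view-alignment term so that the embedding space separates actions while carrying no view information.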