UMBC College of Engineering and Information Technology Dean's Office
Permanent URI for this collection: http://hdl.handle.net/11603/7919
Recent Submissions
Item LLM-based Corroborating and Refuting Evidence Retrieval for Scientific Claim Verification (2025-03-11) Wang, Siyuan; Foulds, James; Gani, Md Osman; Pan, Shimei
In this paper, we introduce CIBER (Claim Investigation Based on Evidence Retrieval), an extension of the Retrieval-Augmented Generation (RAG) framework designed to identify corroborating and refuting documents as evidence for scientific claim verification. CIBER addresses the inherent uncertainty in Large Language Models (LLMs) by evaluating response consistency across diverse interrogation probes. By focusing on the behavioral analysis of LLMs without requiring access to their internal information, CIBER is applicable to both white-box and black-box models. Furthermore, CIBER operates in an unsupervised manner, enabling easy generalization across various scientific domains. Comprehensive evaluations conducted using LLMs with varying levels of linguistic proficiency reveal CIBER's superior performance compared to conventional RAG approaches. These findings not only highlight the effectiveness of CIBER but also provide valuable insights for future advancements in LLM-based scientific claim verification.
Item Integrating Frequency-Domain Representations with Low-Rank Adaptation in Vision-Language Models (2025-03-08) Khan, Md Azim; Gangopadhyay, Aryya; Wang, Jianwu; Erbacher, Robert F.
Situational awareness applications rely heavily on real-time processing of visual and textual data to provide actionable insights. Vision language models (VLMs) have become essential tools for interpreting complex environments by connecting visual inputs with natural language descriptions. However, these models often face computational challenges, especially when required to perform efficiently in real environments. This research presents a novel vision language model (VLM) framework that leverages frequency domain transformations and low-rank adaptation (LoRA) to enhance feature extraction, scalability, and efficiency.
Unlike traditional VLMs, which rely solely on spatial-domain representations, our approach incorporates Discrete Fourier Transform (DFT) based low-rank features while retaining pretrained spatial weights, enabling robust performance in noisy or low visibility scenarios. We evaluated the proposed model on caption generation and Visual Question Answering (VQA) tasks using benchmark datasets with varying levels of Gaussian noise. Quantitative results demonstrate that our model achieves evaluation metrics comparable to state-of-the-art VLMs, such as CLIP ViT-L/14 and SigLIP. Qualitative analysis further reveals that our model provides more detailed and contextually relevant responses, particularly for real-world images captured by a RealSense camera mounted on an Unmanned Ground Vehicle (UGV).
Item Delivery of Tempol from Polyurethane Nanocapsules to Address Oxidative Stress Post-Injury (ACS, 2025-02-08) Ale, Temitope; Ale, Tolulope; Baker, Kimberly J.; Zuniga, Kameel M.; Hutcheson, Jack; Lavik, Erin
Traumatic brain injuries (TBIs) result in significant morbidity and mortality due to the cascade of secondary injuries involving oxidative stress and neuroinflammation. The development of effective therapeutic strategies to mitigate these effects is critical. This study explores the fabrication and characterization of polyurethane nanocapsules for the sustained delivery of Tempol, a potent antioxidant. The nanocapsules were designed to extend the release of Tempol over a 30-day period, addressing the prolonged oxidative stress observed post-TBI. Tempol-loaded polyurethane nanocapsules were synthesized using interfacial polymerization and nanoemulsion techniques. Two generations of nanocapsules were produced, differing in Tempol loading and PEGylation levels. The first generation, with lower Tempol loading, exhibited an average size of 159.8 ± 12.61 nm and a Z-average diameter of 771.9 ± 87.95 nm.
The second generation, with higher Tempol loading, showed an average size of 141.4 ± 6.13 nm and a Z-average diameter of 560.7 ± 171.1 nm. The zeta potentials were -18.9 ± 5.02 mV and -11.9 ± 3.54 mV for the first and second generations, respectively. Both generations demonstrated the presence of urethane linkages, confirmed by Fourier Transform Infrared Spectroscopy (FTIR). Loading studies revealed Tempol concentrations of 61.94 ± 3.04 μg/mg for the first generation and 77.61 ± 3.04 μg/mg for the second generation nanocapsules. Release profiles indicated an initial burst followed by a sustained, nearly linear release over 30 days. The higher PEGylation in the second generation nanocapsules is advantageous for intravenous administration, potentially enhancing their therapeutic efficacy in TBI treatment. This study demonstrates the feasibility of using polyurethane nanocapsules for the prolonged delivery of Tempol, offering a promising approach to manage oxidative stress and improve outcomes in TBI patients. Future work will include testing these nanocapsules in vivo to determine their potential at modulating recovery from TBI.
Item Bioconjugates for Cancer Prevention: Opportunities for Impact (ACS, 2024-08-21) Lavik, Erin; Minasian, Lori
Cancer prevention encompasses both screening strategies to find cancers early when they are likely to be most treatable and prevention and interception strategies to reduce the risk of developing cancers. Bioconjugates, here defined broadly as materials and molecules that have synthetic and biological components, have roles to play across the cancer-prevention spectrum. In particular, bioconjugates may be developed as affordable, accessible, and effective screening strategies or as novel vaccines and drugs to reduce one’s risk of developing cancers. Developmental programs are available for taking novel technologies and evaluating them for clinical use in cancer screening and prevention.
While a variety of different challenges exist in implementing cancer-prevention interventions, a thoughtful approach to bioconjugates could improve the delivery and acceptability of the interventions.
Item Polyurethane Nanocapsules Incorporating Epigallocatechin Gallate, A Green Tea Extract (Wiley, 2025-02-26) Ale, Temitope; Ghunney, Nhyira; Pandala, Narendra; Tucker, Budd; McFadden, Kassandra; Hutcheson, Jack; Lavik, Erin
Explosions cause 79% of combat-related injuries, often leading to traumatic brain injury (TBI) and hemorrhage. Epigallocatechin gallate (EGCG), a green tea polyphenol, aids neuroprotection and wound healing. In this work, we sought to investigate the fabrication and characterization of polyurethane nanocapsules encapsulating EGCG, demonstrating controlled, on-demand release, and highlighting their potential for targeted therapeutic delivery in trauma care.
Item Delivery of Tempol from Polyurethane Nanocapsules to Address Oxidative Stress Post-Injury (ACS, 2025-02-19) Ale, Temitope; Ale, Tolulope; Baker, Kimberly J.; Zuniga, Kameel M.; Hutcheson, Jack; Lavik, Erin
Traumatic brain injuries (TBIs) result in significant morbidity and mortality due to the cascade of secondary injuries involving oxidative stress and neuroinflammation. The development of effective therapeutic strategies to mitigate these effects is critical. This study explores the fabrication and characterization of polyurethane nanocapsules for the sustained delivery of Tempol, a potent antioxidant. The nanocapsules were designed to extend the release of Tempol over a 30-day period, addressing the prolonged oxidative stress observed post-TBI. Tempol-loaded polyurethane nanocapsules were synthesized using interfacial polymerization and nanoemulsion techniques. Two generations of nanocapsules were produced, differing in Tempol loading and PEGylation levels.
The first generation, with lower Tempol loading, exhibited an average size of 159.8 ± 12.61 nm and a Z-average diameter of 771.9 ± 87.95 nm. The second generation, with higher Tempol loading, showed an average size of 141.4 ± 6.13 nm and a Z-average diameter of 560.7 ± 171.1 nm. The zeta potentials were -18.9 ± 5.02 mV and -11.9 ± 3.54 mV for the first and second generations, respectively. Both generations demonstrated the presence of urethane linkages, confirmed by Fourier Transform Infrared Spectroscopy (FTIR). Loading studies revealed Tempol concentrations of 61.94 ± 3.04 μg/mg for the first generation and 77.61 ± 3.04 μg/mg for the second generation nanocapsules. Release profiles indicated an initial burst followed by a sustained, nearly linear release over 30 days. The higher PEGylation in the second generation nanocapsules is advantageous for intravenous administration, potentially enhancing their therapeutic efficacy in TBI treatment. This study demonstrates the feasibility of using polyurethane nanocapsules for the prolonged delivery of Tempol, offering a promising approach to manage oxidative stress and improve outcomes in TBI patients. Future work will include testing these nanocapsules in vivo to determine their potential at modulating recovery from TBI.
Item Advancing climate model interpretability: Feature attribution for Arctic melt anomalies (2025-02-11) Ale, Tolulope; Schlegel, Nicole-Jeanne; Janeja, Vandana
The focus of our work is improving the interpretability of anomalies in climate models and advancing our understanding of Arctic melt dynamics. The Arctic and Antarctic ice sheets are experiencing rapid surface melting and increased freshwater runoff, contributing significantly to global sea level rise. Understanding the mechanisms driving snowmelt in these regions is crucial. ERA5, a widely used reanalysis dataset in polar climate studies, offers extensive climate variables and global data assimilation.
However, its snowmelt model employs an energy imbalance approach that may oversimplify the complexity of surface melt. In contrast, the Glacier Energy and Mass Balance (GEMB) model incorporates additional physical processes, such as snow accumulation, firn densification, and meltwater percolation/refreezing, providing a more detailed representation of surface melt dynamics. In this research, we focus on analyzing surface snowmelt dynamics of the Greenland Ice Sheet using feature attribution for anomalous melt events in ERA5 and GEMB models. We present a novel unsupervised attribution method leveraging a counterfactual explanation method to analyze detected anomalies in ERA5 and GEMB. Our anomaly detection results are validated using MEaSUREs ground-truth data, and the attributions are evaluated against established feature ranking methods, including XGBoost, Shapley values, and Random Forest. Our attribution framework identifies the physics behind each model and the climate features driving melt anomalies. These findings demonstrate the utility of our attribution method in enhancing the interpretability of anomalies in climate models and advancing our understanding of Arctic melt dynamics.
Item Polyurethane Nanocapsules Incorporating Epigallocatechin Gallate, A Green Tea Extract (Wiley, 2025-02-26) Ale, Temitope; Ghunney, Nhyira; Pandala, Narendra; Tucker, Budd; McFadden, Kassandra; Hutcheson, Jack; Lavik, Erin
Explosions cause 79% of combat-related injuries, often leading to traumatic brain injury (TBI) and hemorrhage. Epigallocatechin gallate (EGCG), a green tea polyphenol, aids neuroprotection and wound healing.
In this work, we sought to investigate the fabrication and characterization of polyurethane nanocapsules encapsulating EGCG, demonstrating controlled, on-demand release, and highlighting their potential for targeted therapeutic delivery in trauma care.
Item Strengthening Workforce Education: Excellence in Programming Securely (SWEEPS) (ACM, 2025-02-18) Kariuki, Deborah; Ngambeki, Ida; Dai, Jun; Bishop, Matt; Sun, Xiaoyan; Dark, Melissa; Daugherty, Jenny; Lowrie, Alex; Geissler, Markus; Nico, Phillip; Noor, Arshad
This paper presents and advocates for an initiative to expand access to secure programming education. The Strengthening Workforce Education: Excellence in Programming Securely (SWEEPS) initiative, funded by the National Centers of Academic Excellence in Cybersecurity (NCAE-C) program, seeks to advance secure programming and help achieve security aims. SWEEPS establishes a secure programming curriculum and workforce development coalition of seven institutions across two CAE (Center of Academic Excellence) regions (Northeast and Southwest) and five states (California, Massachusetts, Maryland, Indiana, and North Carolina). This coalition includes industry-based stakeholders collaborating with the US Army and government agencies on various projects. SWEEPS draws on prior work establishing critical concepts in secure programming, assessment tools, learning aids, and system infrastructure. The initiative offers a series of interconnected, stackable learning experiences tailored for early to mid-career professionals looking to enhance their cybersecurity skills. These experiences, which include practical one-day workshops and comprehensive year-long graduate certificates, provide a reassuring path for upskilling in secure programming.
This paper recommends the efficacy of stackable training approaches in secure programming by exploring the practices of targeting and training individuals with diverse proficiency levels of programming experience who would benefit from increased knowledge and training.
Item A Beam-Search Based Method to Select Classification and Imputation Methods for Fair and Accurate Data Analysis (IEEE, 2024-12) Mowoh, Dodavah; Chen, Zhiyuan
Members from disadvantaged or minority groups are often more likely to have missing values in their records. Imputation is a common approach to deal with missing values before the data is analyzed. Several studies have found an interplay between imputation methods and classification methods with respect to accuracy and fairness: different combinations of imputation and classification methods will lead to different accuracy and fairness results. However, it is unclear how to choose the combination of imputation method and classification method to optimize the tradeoff between accuracy and fairness. An exhaustive search approach will be too expensive because it needs to check all combinations and measure both accuracy and fairness for every combination. This paper proposes a beam-search based method to select the optimal combination of imputation methods and classification methods. An empirical study was also conducted to compare the performance of the proposed method to exhaustive search. The proposed solution achieves the same result as the exhaustive search method but with much lower search cost.
Item Qualitative Research Methods in Software Engineering: Past, Present, and Future (IEEE, 2025) Seaman, Carolyn; Hoda, Rashina; Feldt, Robert
The paper entitled “Qualitative Methods in Empirical Studies of Software Engineering” by Carolyn Seaman was published in TSE in 1999. It has been chosen as one of the most influential papers from the third decade of TSE’s 50-year history.
In this retrospective, the authors discuss the evolution of the use of qualitative methods in software engineering research, the impact it’s had on research and practice, and reflections on what is coming and deserves attention.
Item A LSTM with Dual-stage Attention Method to Predict Amine Emissions for Carbon Dioxide Capture and Storage (IEEE, 2025-01-16) Rapelli, Sai Rajesh; Chen, Zhiyuan; Lu, Wei
To mitigate climate change impacts, carbon capture technologies have been implemented at significant CO2 emission points, such as industrial sites and electric power generation facilities. Solvent-based carbon capture solutions are pivotal in reducing atmospheric CO2 levels and enhancing air quality by capturing harmful pollutants. Amine-based solvents, favored for their efficiency in post-combustion CO2 capture, are susceptible to thermal and oxidative degradation, leading to complex emissions profiles that demand comprehensive management strategies. We develop a Machine Learning model designed to predict future amine emissions in real-time, thereby assisting in the formulation of mitigation strategies required for the operation of capture plants. We conducted an experiment using data from test campaigns run at the Technology Centre Mongstad (TCM). We employed a Long Short-Term Memory (LSTM) autoencoder model with dual-stage attention mechanisms to predict amine emissions using historical data. The results were quite promising: we achieved a mean absolute percentage error ranging from 5.8% to 6.8% for the real-time prediction of amine emissions. The results are better than existing approaches using simpler machine learning models as well as the standard LSTM autoencoder model.
Item Can Generative AI be Egalitarian? (IEEE, 2024-10) Feldman, Philip; Foulds, James; Pan, Shimei
The recent explosion of “foundation” generative AI models has been built upon the extensive extraction of value from online sources, often without corresponding reciprocation.
This pattern mirrors and intensifies the extractive practices of surveillance capitalism [46], while the potential for enormous profit has challenged technology organizations’ commitments to responsible AI practices, raising significant ethical and societal concerns. However, a promising alternative is emerging: the development of models that rely on content willingly and collaboratively provided by users. This article explores this “egalitarian” approach to generative AI, taking inspiration from the successful model of Wikipedia. We explore the potential implications of this approach for the design, development, and constraints of future foundation models. We argue that such an approach is not only ethically sound but may also lead to models that are more responsive to user needs, more diverse in their training data, and ultimately more aligned with societal values. Furthermore, we explore potential challenges and limitations of this approach, including issues of scalability, quality control, and potential biases inherent in volunteer-contributed content.
Item Fair Inference for Discrete Latent Variable Models: An Intersectional Approach (ACM, 2024-09-04) Islam, Rashidul; Pan, Shimei; Foulds, James
It is now widely acknowledged that machine learning models, trained on data without due care, often exhibit discriminatory behavior. Traditional fairness research has mainly focused on supervised learning tasks, particularly classification. While fairness in unsupervised learning has received some attention, the literature has primarily addressed fair representation learning of continuous embeddings. This paper, however, takes a different approach by investigating fairness in unsupervised learning using graphical models with discrete latent variables. We develop a fair stochastic variational inference method for discrete latent variables.
Our approach uses a fairness penalty on the variational distribution that reflects the principles of intersectionality, a comprehensive perspective on fairness from the fields of law, social sciences, and humanities. Intersectional fairness brings the challenge of data sparsity in minibatches, which we address via a stochastic approximation approach. We first show the utility of our method in improving equity and fairness for clustering using naïve Bayes and Gaussian mixture models on benchmark datasets. To demonstrate the generality of our approach and its potential for real-world impact, we then develop a specialized graphical model for criminal justice risk assessments, and use our fairness approach to prevent the inferences from encoding unfair societal biases.
Item RAGged Edges: The Double-Edged Sword of Retrieval-Augmented Chatbots (2024-06-12) Feldman, Philip; Foulds, James; Pan, Shimei
Large language models (LLMs) like ChatGPT demonstrate the remarkable progress of artificial intelligence. However, their tendency to hallucinate -- generate plausible but false information -- poses a significant challenge. This issue is critical, as seen in recent court cases where ChatGPT's use led to citations of non-existent legal rulings. This paper explores how Retrieval-Augmented Generation (RAG) can counter hallucinations by integrating external knowledge with prompts. We empirically evaluate RAG against standard LLMs using prompts designed to induce hallucinations. Our results show that RAG increases accuracy in some cases, but can still be misled when prompts directly contradict the model's pre-trained understanding. These findings highlight the complex nature of hallucinations and the need for more robust solutions to ensure LLM reliability in real-world applications.
We offer practical recommendations for RAG deployment and discuss implications for the development of more trustworthy LLMs.
Item GenderAlign: An Alignment Dataset for Mitigating Gender Bias in Large Language Models (2024-12-16) Zhang, Tao; Zeng, Ziqian; Xiao, Yuxiang; Zhuang, Huiping; Chen, Cen; Foulds, James; Pan, Shimei
Large Language Models (LLMs) are prone to generating content that exhibits gender biases, raising significant ethical concerns. Alignment, the process of fine-tuning LLMs to better align with desired behaviors, is recognized as an effective approach to mitigate gender biases. Although proprietary LLMs have made significant strides in mitigating gender bias, their alignment datasets are not publicly available. The commonly used and publicly available alignment dataset, HH-RLHF, still exhibits gender bias to some extent. There is a lack of publicly available alignment datasets specifically designed to address gender bias. Hence, we developed a new dataset named GenderAlign, aiming at mitigating a comprehensive set of gender biases in LLMs. This dataset comprises 8k single-turn dialogues, each paired with a "chosen" and a "rejected" response. Compared to the "rejected" responses, the "chosen" responses demonstrate lower levels of gender bias and higher quality. Furthermore, we categorized the gender biases in the "rejected" responses of GenderAlign into 4 principal categories.
The experimental results show the effectiveness of GenderAlign in reducing gender bias in LLMs.
Item Foad Hamidi Launches New Projects To Expand Technology-rich Learning Opportunities For Youth In Baltimore (UMBC News, 2024-12-17) Meyers, Catherine
Foad Hamidi, an assistant professor in the Department of Information Systems, has won funding from the National Science Foundation (NSF) to support two new projects offering technology-rich informal learning opportunities to diverse populations in Baltimore and beyond.
Item Mohamed Younis Honored For Contributions To Modern Communication Technologies (UMBC News, 2024-12-13) Meyers, Catherine; Demond, Marlayna
Mohamed Younis, professor and chair of the Department of Computer Science and Electrical Engineering, has been honored by the Institute of Electrical and Electronics Engineers (IEEE) Communications Society for his significant and lasting contributions to the advancement of modern communication technologies. The award was announced December 9 at the society's Global Communications Conference in Cape...
Item What is the Point? Evaluating the Structure, Color, and Semantic Traits of Computer Vision Point Clouds of Vegetation (MDPI, 2017-04-09) Dandois, Jonathan P.; Baker, Matthew; Olano, Marc; Parker, Geoffrey G.; Ellis, Erle C.
Remote sensing of the structural and spectral traits of vegetation is being transformed by structure from motion (SFM) algorithms that combine overlapping images to produce three-dimensional (3D) red-green-blue (RGB) point clouds. However, much remains unknown about how these point clouds are used to observe vegetation, limiting the understanding of the results and future applications. Here, we examine the content and quality of SFM point cloud 3D-RGB fusion observations. An SFM algorithm using the Scale Invariant Feature Transform (SIFT) feature detector was applied to create the 3D-RGB point clouds of a single tree and forest patches.
The fusion quality was evaluated using targets placed within the tree and was compared to fusion measurements from terrestrial LIDAR (TLS). K-means clustering and manual classification were used to evaluate the semantic content of SIFT features. When targets were fully visible in the images, SFM assigned color in the correct place with a high accuracy (93%). The accuracy was lower when targets were shadowed or obscured (29%). Clustering and classification revealed that the SIFT features highlighted areas that were brighter or darker than their surroundings, showing little correspondence with canopy objects like leaves or branches, though the features showed some relationship to landscape context (e.g., canopy, pavement). Therefore, the results suggest that feature detectors play a critical role in determining how vegetation is sampled by SFM. Future research should consider developing feature detectors that are optimized for vegetation mapping, including extracting elements like leaves and flowers. Features should be considered the fundamental unit of SFM mapping, like the pixel in optical imaging and the laser pulse of LIDAR. Under optimal conditions, SFM fusion accuracy exceeded that of TLS, and the two systems produced similar representations of the overall tree shape. SFM is the lower-cost solution for obtaining accurate 3D-RGB fusion measurements of the outer surfaces of vegetation, the critical zone of interaction between vegetation, light, and the atmosphere from leaf to canopy scales.
Item Mitigating Demographic Bias in AI-based Resume Filtering (ACM, 2020-07-13) Deshpande, Ketki V.; Pan, Shimei; Foulds, James
With increasing diversity in the labor market as well as the work force, employers receive resumes from an increasingly diverse population. However, studies and field experiments have confirmed the presence of bias in the labor market based on gender, race, and ethnicity. Many employers use automated resume screening to filter the many possible matches.
Depending on how the automated screening algorithm is trained, it can potentially exhibit bias towards a particular population by favoring certain socio-linguistic characteristics. The resume writing style and socio-linguistics are a potential source of bias as they correlate with protected characteristics such as ethnicity. A biased dataset is often translated into biased AI algorithms, and de-biasing algorithms are being contemplated. In this work, we study the effects of socio-linguistic bias on resume-to-job-description matching algorithms. We develop a simple technique, called fair-tf-idf, to match resumes with job descriptions in a fair way by mitigating the socio-linguistic bias.