Browsing by Subject "Interpretability"
Now showing 1 - 2 of 2
Item INTERPRETABLE DEEP LEARNING MODELS FOR ELECTRONIC HEALTH RECORDS (2022-01-01) Shi, Peichang; Gangopadhyay, Aryya; Information Systems
Analysis of healthcare data can help reduce costs, improve patient outcomes, and identify best practices related to diseases. However, with the rapid growth of health-related data such as Electronic Health Records (EHRs), high dimensionality and large sample sizes have become challenges for traditional statistical approaches. Deep learning models have proven to be powerful tools in computer vision and in machine learning for healthcare. Yet despite their superior performance over traditional statistical methods, it remains difficult to understand their inner mechanisms because of their black-box nature. A variety of interpretability algorithms have been developed to help explain deep learning models. However, due to the trade-off between model accuracy and complexity, current interpretability algorithms perform worse than the original deep learning models, which raises concerns for high-stakes healthcare applications. Moreover, most interpretability algorithms focus on correlation-based interpretation, and highly correlated features can lead to biased causal inference, which may be even more important in healthcare. In this dissertation, we propose a new ensemble approach for deep learning interpretation, Local surrogate Interpretable model-agnostic Visualizations and Explanations (LIVE), in which we assume that the predictions of a deep learning model form a mixture of a finite number of Gaussian distributions with unknown parameters. We apply ensemble trees to obtain the mixing coefficients, and the rule sets from the trees are used to build an interpretable model through a randomized experimental design. LIVE was validated on different types of datasets (image and structured data) with different deep learning architectures. Our experiments showed that LIVE can not only help improve model accuracy but also provide visual interpretations.
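The abstract describes LIVE only at a high level. As a rough illustration of that kind of pipeline, the Python sketch below fits a Gaussian mixture to the black-box predictions, recovers the mixing structure with an ensemble of trees, and fits a simple linear model on the resulting leaf (rule) indicators; every name and hyperparameter here, including live_surrogate, n_components, and the choice of a random forest plus ridge surrogate, is an assumption rather than the dissertation's actual implementation.

```python
# Hypothetical sketch of a LIVE-style surrogate. Assumes a trained black-box
# `model` exposing .predict(X) and a 2-D feature matrix X; all choices below
# are illustrative, not the dissertation's implementation.
import numpy as np
from sklearn.mixture import GaussianMixture
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import Ridge

def live_surrogate(model, X, n_components=3, n_trees=50):
    # 1. Treat the black-box predictions as draws from a finite Gaussian mixture.
    preds = model.predict(X).reshape(-1, 1)
    gmm = GaussianMixture(n_components=n_components).fit(preds)
    component = gmm.predict(preds)              # mixture component per sample

    # 2. Use an ensemble of trees to relate the inputs to the mixing structure.
    forest = RandomForestClassifier(n_estimators=n_trees, max_depth=3).fit(X, component)

    # 3. Encode each sample by the leaves (rule sets) it falls into, then fit a
    #    simple, inspectable linear model on those rule indicators.
    leaves = forest.apply(X)                    # (n_samples, n_trees) leaf indices
    rule_features = np.concatenate(
        [np.eye(t.tree_.node_count)[leaves[:, i]] for i, t in enumerate(forest.estimators_)],
        axis=1,
    )
    surrogate = Ridge(alpha=1.0).fit(rule_features, preds.ravel())
    return gmm, forest, surrogate
```

The linear weights over the leaf indicators then give a human-readable surrogate whose rules can be inspected alongside the original model's predictions.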
Item Learning Explainable Models using Self-Supervised Learning (2023-01-01) Pillai, Vipin; Pirsiavash, Hamed; Computer Science and Electrical Engineering; Computer Science
The last decade has witnessed an exponential rise in the research and deployment of Deep Neural Networks (DNNs) for applications spanning domains such as computer vision, natural language processing, speech recognition, statistical analysis, and, most recently, generative AI. Given such wide-ranging deployments affecting the day-to-day lives of people across the globe, it is imperative to develop mechanisms for understanding the decision-making process of the underlying DNNs. Moreover, safety-critical deployments such as medical diagnosis, self-driving cars, and law enforcement applications make it crucial to understand and explain each individual decision rather than relying on these models as black-box algorithms. For computer vision applications such as image classification, various explainability algorithms have been introduced in recent years for attributing DNN decisions back to input image regions. In this dissertation, we scrutinize the reliability of existing explanation algorithms and push the state of the art by introducing novel methods for learning models that are not only accurate but also explainable by design.
We first study the reliability of existing explanation algorithms and observe that they may not always explain the true cause of a network's prediction. Although DNN decisions are known to be vulnerable to adversarial attacks, we show that it is possible to craft adversarial patches that not only fool the prediction but also change what we interpret as the cause of that prediction. We introduce our attack as a controlled setting for measuring the accuracy of interpretation algorithms and benchmark the resiliency of explanation algorithms on the ImageNet and PASCAL-VOC datasets. We then explore methods for improving the interpretability of DNNs by learning explainable models. Obtaining annotations with which to train explanation algorithms is not trivial, since an explanation depends on both the input and the model under consideration. To this end, we introduce an algorithm that improves the interpretability of deep neural networks for a given explanation method. Our method encourages the network to learn consistent interpretations while maximizing the log-likelihood of the correct class. We also introduce new evaluation metrics to benchmark the quality of explanation heatmaps and show that our method outperforms the baseline on the ImageNet and MS-COCO datasets. Building on this work, we introduce another method that trains models to produce consistent explanations across image transformations. Self-supervised training has emerged as a viable alternative to supervised training in the absence of ground truth, leveraging large-scale unlabeled data to learn features that generalize across tasks. Since obtaining ground truth for a desired model explanation is not a well-defined task, we adopt ideas from contrastive self-supervised learning and apply them to the model's interpretations rather than its embeddings. Extensive experiments show that our method yields models with improved interpretability while also acting as a regularizer and improving accuracy in limited-data, fine-grained classification settings. We believe our methods will serve as a strong foundation and encourage the community to develop models that are not just accurate, but also explainable by design.
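To give a flavor of what training toward consistent interpretations can look like, here is a minimal PyTorch sketch. It uses plain input-gradient saliency, a horizontal flip as the image transformation, and a simple mean-squared-error consistency penalty in place of the full contrastive formulation described in the abstract; the function names and the loss weight lam are illustrative assumptions, not the dissertation's method.

```python
# Hypothetical sketch of a training step with an interpretation-consistency
# term, assuming a PyTorch classifier `net`; the saliency choice (raw input
# gradients) and the weight `lam` are illustrative assumptions.
import torch
import torch.nn.functional as F

def saliency(net, x, y):
    # Simple gradient-based heatmap: |d logit_y / d x|, summed over channels.
    x = x.clone().requires_grad_(True)
    logits = net(x)
    score = logits.gather(1, y.unsqueeze(1)).sum()
    grad, = torch.autograd.grad(score, x, create_graph=True)
    return grad.abs().sum(dim=1)                       # (B, H, W)

def consistency_step(net, x, y, lam=1.0):
    # A transformation with a known inverse: here, a horizontal flip.
    x_t = torch.flip(x, dims=[3])

    s = saliency(net, x, y)
    s_t = torch.flip(saliency(net, x_t, y), dims=[2])  # undo the flip on the heatmap

    ce = F.cross_entropy(net(x), y)                    # usual log-likelihood term
    cons = F.mse_loss(s, s_t)                          # interpretations should agree
    return ce + lam * cons
```

In a full contrastive version, heatmaps from transformed views of the same image would be pulled together while heatmaps from different images are pushed apart; the sketch keeps only the "pull together" half to stay short.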