Computational Design Map for Heterogeneous Experimental Studies





This work was written as part of one of the author's official duties as an Employee of the United States Government and is therefore a work of the United States Government. In accordance with 17 U.S.C. 105, no copyright protection is available for such works under U.S. Law.
Public Domain Mark 1.0



This paper focuses on discovering a computational design map of disparate, heterogeneous outcomes from bioinformatics experiments in porcine (pig) studies, with the goal of identifying the key variables that affect experimental outcomes. Specifically, we aim to connect discoveries from disparate laboratory experiments in the areas of trauma, blood loss, and blood clotting using data science methods in a collaborative ensemble setting. Grave trauma-related injuries cause exsanguination and death, accounting for up to 50% of deaths, especially in the armed forces. Restricting blood loss in such scenarios usually requires the presence of first responders, which is not always feasible. Moreover, a traumatic event may lead to a cytokine storm, which is reflected in the cytokine variables. Hemostatic nanoparticles have been developed to address such cases of trauma and blood loss. This paper highlights a collaborative effort to apply data science methods to the outcomes of a lab study in order to better understand the efficacy of the nanoparticles. Hemostatic nanoparticles were administered intravenously to pigs subjected to hemorrhagic shock, and blood loss, immune response variables, and cytokine response variables were measured. Because several hemostatic nanoparticles were used in the intervention, the experiments produce multiple data outcomes, and it becomes critical to understand which nanoparticles matter most and which variables are key to studying further variations in the lab. We propose a collaborative data mining framework that combines the results of multiple data mining methods to discover impactful features. We first mine frequent patterns from the experimental data, and then validate the connections among the resulting frequent rules by comparing them with decision trees and feature ranking.
Both the frequent patterns and the decision trees help us identify the critical variables that stand out in the lab studies and that need further validation and follow-up in future studies. The outcomes of the data mining methods help produce a computational design map of the experimental results. Our preliminary results from this computational design map provide insight into which features can help in designing the most effective hemostatic nanoparticles.
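
The idea of combining several data mining methods and keeping only the features on which they agree can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes already-discretized records, uses made-up toy data with hypothetical variable names (`particle`, `il6`, `blood_loss`), mines frequent itemsets by brute-force counting, and substitutes information gain (the split criterion of ID3-style decision trees) for a full decision tree and feature ranker. The thresholds `min_support` and `min_gain` are arbitrary illustrative choices.

```python
from collections import Counter
from itertools import combinations
from math import log2

# Toy, made-up records (NOT data from the study). Each record is a set of
# discretized "variable=level" items plus a hypothetical outcome label.
RECORDS = [
    ({"particle=A", "il6=high", "blood_loss=high"}, "poor"),
    ({"particle=A", "il6=high", "blood_loss=high"}, "poor"),
    ({"particle=A", "il6=low", "blood_loss=low"}, "good"),
    ({"particle=B", "il6=low", "blood_loss=low"}, "good"),
    ({"particle=B", "il6=low", "blood_loss=low"}, "good"),
    ({"particle=B", "il6=high", "blood_loss=high"}, "poor"),
]


def frequent_itemsets(records, min_support, max_size=2):
    """Brute-force count of itemsets (up to max_size) meeting min_support."""
    n = len(records)
    counts = Counter()
    for items, label in records:
        basket = sorted(items | {f"outcome={label}"})
        for size in range(1, max_size + 1):
            for combo in combinations(basket, size):
                counts[frozenset(combo)] += 1
    return {s: c / n for s, c in counts.items() if c / n >= min_support}


def features_in_outcome_patterns(itemsets):
    """Variables that co-occur with an outcome item in a frequent itemset."""
    found = set()
    for itemset in itemsets:
        if any(i.startswith("outcome=") for i in itemset):
            found |= {i.split("=")[0] for i in itemset
                      if not i.startswith("outcome=")}
    return found


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    n = len(labels)
    return -sum((c / n) * log2(c / n) for c in Counter(labels).values())


def information_gain(records, feature):
    """Reduction in outcome entropy after splitting on one variable."""
    labels = [label for _, label in records]
    groups = {}
    for items, label in records:
        value = next(i for i in items if i.startswith(feature + "="))
        groups.setdefault(value, []).append(label)
    remainder = sum(len(g) / len(labels) * entropy(g) for g in groups.values())
    return entropy(labels) - remainder


def design_map(records, min_support=0.4, min_gain=0.5):
    """Flag variables supported by both frequent patterns and information gain."""
    pattern_feats = features_in_outcome_patterns(
        frequent_itemsets(records, min_support))
    features = sorted({i.split("=")[0] for items, _ in records for i in items})
    result = {}
    for f in features:
        gain = information_gain(records, f)
        result[f] = {
            "in_frequent_pattern": f in pattern_feats,
            "gain": round(gain, 3),
            "agreed": f in pattern_feats and gain >= min_gain,
        }
    return result


if __name__ == "__main__":
    for name, row in design_map(RECORDS).items():
        print(name, row)
```

On this toy data, `il6` and `blood_loss` are flagged by both views, while `particle` is flagged by neither. Requiring agreement between two independent views is one simple way to mirror the cross-validation of frequent rules against decision trees described above; a real analysis would more likely rely on an established Apriori implementation and a trained tree model.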