Bayesian Analysis of Singly Imputed Partially Synthetic Data Generated by Plug-in Sampling and Posterior Predictive Sampling Under the Multiple Linear Regression Model

Date

2021-08-25

Citation of Original Publication

Guin, Abhishek; Roy, Anindya; Sinha, Bimal; Bayesian Analysis of Singly Imputed Partially Synthetic Data Generated by Plug-in Sampling and Posterior Predictive Sampling Under the Multiple Linear Regression Model; Research Report Series (Statistics #2021-02), 25 August 2021; https://www.census.gov/content/dam/Census/library/working-papers/2021/adrm/RRS2021-02.pdf

Rights

This item is likely protected under Title 17 of the U.S. Copyright Law. Unless on a Creative Commons license, for uses protected by Copyright Law, contact the copyright holder or the author.
Public Domain Mark 1.0
This is a work of the United States Government. In accordance with 17 U.S.C. 105, no copyright protection is available for such works under U.S. Law.

Abstract

In this paper we develop Bayesian inference based on singly imputed partially synthetic data when the original data are derived from a multiple linear regression model. We assume that the synthetic data are generated using one of two methods: plug-in sampling, where unknown parameters in the data model are set equal to the observed values of their point estimators based on the original data, and synthetic data are drawn from this estimated version of the model; and posterior predictive sampling, where an imputed posterior distribution of the unknown parameters is used to generate a posterior draw, which in turn is plugged into the original model to generate synthetic data. Simulation results are presented to demonstrate how the proposed methodology performs compared to the theoretical predictions. We outline some ways to extend the proposed methodology to certain scenarios where the required set of conditions does not hold.
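
The two generation schemes named in the abstract can be illustrated with a minimal Python sketch. It is not taken from the report. It assumes that the response `y` is the sensitive variable being synthesized while the covariate matrix `X` is released unchanged, that the point estimators are the usual least-squares ones, and that the imputer uses the standard noninformative prior proportional to 1/sigma^2; the report may use a different prior. The function names `fit_ols`, `plug_in_sample`, and `posterior_predictive_sample` are hypothetical.

```python
import numpy as np


def fit_ols(X, y):
    """Least-squares estimates for y = X beta + e, with e ~ N(0, sigma^2 I)."""
    n, p = X.shape
    XtX_inv = np.linalg.inv(X.T @ X)
    beta_hat = XtX_inv @ X.T @ y
    resid = y - X @ beta_hat
    s2 = resid @ resid / (n - p)  # unbiased estimate of sigma^2
    return beta_hat, s2, XtX_inv


def plug_in_sample(X, y, rng):
    """Plug-in sampling: draw y* from N(X beta_hat, s2 I)."""
    beta_hat, s2, _ = fit_ols(X, y)
    return X @ beta_hat + np.sqrt(s2) * rng.standard_normal(len(y))


def posterior_predictive_sample(X, y, rng):
    """Posterior predictive sampling under the assumed prior 1/sigma^2."""
    n, p = X.shape
    beta_hat, s2, XtX_inv = fit_ols(X, y)
    # sigma^2 | y is distributed as (n - p) s2 / chi-square(n - p)
    sigma2_star = (n - p) * s2 / rng.chisquare(n - p)
    # beta | sigma^2, y is N(beta_hat, sigma^2 (X'X)^{-1}); draw via Cholesky
    L = np.linalg.cholesky(XtX_inv)
    beta_star = beta_hat + np.sqrt(sigma2_star) * (L @ rng.standard_normal(p))
    # plug the posterior draw back into the model to generate y*
    return X @ beta_star + np.sqrt(sigma2_star) * rng.standard_normal(n)


# Illustrative use on simulated "original" data (all values are made up).
rng = np.random.default_rng(0)
X = np.column_stack([np.ones(50), rng.standard_normal((50, 2))])
y = X @ np.array([1.0, 2.0, -1.0]) + rng.standard_normal(50)
y_plug = plug_in_sample(X, y, rng)
y_pps = posterior_predictive_sample(X, y, rng)
```

In both cases a single synthetic vector is released in place of `y`, which corresponds to the singly imputed setting. The posterior predictive draw carries extra variability because the parameters themselves are sampled rather than fixed at their estimates.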