Generation and Analysis of Synthetic Data for Privacy Protection Under the Multivariate Linear Regression Model

Distribution Rights granted to UMBC by the author.
Access limited to the UMBC community. Item may possibly be obtained via Interlibrary Loan through a local library, pending author/copyright holder's permission.
This item is likely protected under Title 17 of the U.S. Copyright Law. Unless it is under a Creative Commons license, for uses protected by Copyright Law, contact the copyright holder or the author.

Subjects

Bottom Coding
Likelihood
Linear Regression
Privacy Protection
Synthetic Data
Top Coding

Abstract

In this dissertation, the author derives likelihood-based exact inference for multiply imputed synthetic data under the multiple (p>1) univariate linear regression model and for singly and multiply imputed data under the multivariate linear regression model. In the former, the synthetic data are generated under plug-in sampling, where unknown parameters in the model are set equal to observed values of point estimators. In the latter, synthetic data are also generated under posterior predictive sampling, where they are drawn from a posterior predictive distribution. Simulations are presented to confirm that the methodology performs as the theory predicts and to evaluate privacy protection. Robustness studies are also given. In the final chapter, a new privacy protection method similar to bottom- and top-coding is proposed and its inferential properties are explored.
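
As a rough illustration of the plug-in sampling idea described in the abstract, the following sketch fits a multivariate linear regression by least squares and then draws synthetic responses with the unknown parameters replaced by their point estimates. It is not taken from the dissertation: the function name plug_in_synthetic, the toy data, and the choice of the residual covariance divided by n - p as the scale estimator are all assumptions made here for illustration.

```python
import numpy as np

def plug_in_synthetic(X, Y, n_copies=1, rng=None):
    """Draw synthetic response matrices under Y = X B + E, rows of E ~ N(0, Sigma),
    with B and Sigma replaced by point estimates (plug-in sampling).
    Hypothetical helper; the estimators used here are assumptions."""
    rng = np.random.default_rng(rng)
    n, p = X.shape
    m = Y.shape[1]
    # Least-squares estimate of the p x m coefficient matrix B.
    B_hat, *_ = np.linalg.lstsq(X, Y, rcond=None)
    resid = Y - X @ B_hat
    # Assumed scale estimator: residual cross-product divided by n - p.
    Sigma_hat = resid.T @ resid / (n - p)
    fitted = X @ B_hat
    # Each synthetic copy is the fitted mean plus fresh Gaussian noise.
    return [
        fitted + rng.multivariate_normal(np.zeros(m), Sigma_hat, size=n)
        for _ in range(n_copies)
    ]

# Toy data, made up purely to exercise the function.
toy_rng = np.random.default_rng(0)
X = np.column_stack([np.ones(50), toy_rng.normal(size=(50, 2))])
B_true = np.array([[1.0, -1.0], [0.5, 2.0], [-0.3, 0.7]])
Y = X @ B_true + toy_rng.normal(size=(50, 2))

synthetic = plug_in_synthetic(X, Y, n_copies=3, rng=1)
```

Releasing several copies (n_copies > 1) corresponds to the multiply imputed setting, and a single copy to the singly imputed one. Posterior predictive sampling would instead draw the parameters from their posterior distribution before generating each copy; that step is not shown here.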

