A Compression Framework for Query Results

Rights

This item is likely protected under Title 17 of the U.S. Copyright Law. Unless on a Creative Commons license, for uses protected by Copyright Law, contact the copyright holder or the author.

Abstract

Decision-support applications in emerging environments require that entire SQL query results be shipped to clients for further analysis and presentation. These clients may use low-bandwidth connections (such as modems) or have severe memory restrictions (such as palmtops). Consequently, there is a need to compress the results of a query for efficient transfer and client-side storage. This paper explores a variety of techniques that address this issue. We model the problem as the choice of an appropriate compression plan and present a framework to model acceptable compression plans. The factors that influence this choice include schema information and statistics on stored tables. Importantly, we demonstrate that the query itself and its evaluation plan can provide semantic information that can be used to compress the result. We show that these techniques can result in 75% greater compression than standard compression tools like WinZip on queries adapted from the TPC-D benchmark. We identify two topics for future research: the choice of an optimal compression plan, and the integration of query result compression into the regular query evaluation plan.
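To illustrate how a query can supply semantic information for compression, the following minimal sketch packs an integer result column into fixed-width bit fields, given a value range that a predicate is assumed to guarantee. It is not the paper's method: the function names (`bits_needed`, `pack_column`, `unpack_column`) and the predicate `quantity BETWEEN 1 AND 50` are hypothetical, chosen only for illustration.

```python
def bits_needed(lo: int, hi: int) -> int:
    """Bits required to encode any integer in [lo, hi] as an offset from lo."""
    return max(1, (hi - lo).bit_length())


def pack_column(values, lo, hi):
    """Pack integers known to lie in [lo, hi] into fixed-width bit fields.

    Assumption: the bounds come from the query itself, for example a
    hypothetical predicate `quantity BETWEEN 1 AND 50`.
    """
    width = bits_needed(lo, hi)
    acc = 0
    for v in values:
        if not lo <= v <= hi:
            raise ValueError(f"{v} is outside [{lo}, {hi}]")
        # Append the offset from lo as the next `width` bits.
        acc = (acc << width) | (v - lo)
    total_bits = width * len(values)
    return acc.to_bytes((total_bits + 7) // 8, "big")


def unpack_column(data, count, lo, hi):
    """Invert pack_column; the client must know count, lo, and hi."""
    width = bits_needed(lo, hi)
    acc = int.from_bytes(data, "big")
    mask = (1 << width) - 1
    out = []
    for i in range(count):
        # The first value packed occupies the most significant field.
        shift = width * (count - 1 - i)
        out.append(((acc >> shift) & mask) + lo)
    return out


# Hypothetical result column: 1000 quantities, each between 1 and 50.
quantities = [(i * 7) % 50 + 1 for i in range(1000)]
packed = pack_column(quantities, 1, 50)
restored = unpack_column(packed, len(quantities), 1, 50)
```

Because the range 1 to 50 spans 50 distinct values, each entry fits in 6 bits, so the 1000 values occupy 750 bytes. A general-purpose compressor that sees only the serialized result has no direct access to this bound, which is the kind of advantage the abstract attributes to query-derived information.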