Query optimization in compressed database systems

dc.contributor.author: Chen, Zhiyuan
dc.contributor.author: Gehrke, Johannes
dc.contributor.author: Korn, Flip
dc.date.accessioned: 2025-06-05T14:02:52Z
dc.date.available: 2025-06-05T14:02:52Z
dc.date.issued: 2001-05-01
dc.description: SIGMOD '01: Proceedings of the 2001 ACM SIGMOD international conference on Management of data
dc.description.abstract: Over the last decades, improvements in CPU speed have outpaced improvements in main memory and disk access rates by orders of magnitude, enabling the use of data compression techniques to improve the performance of database systems. Previous work describes the benefits of compression for numerical attributes, where data is stored in compressed format on disk. Despite the abundance of string-valued attributes in relational schemas, there is little work on compression for string attributes in a database context. Moreover, none of the previous work suitably addresses the role of the query optimizer: during query execution, data is either eagerly decompressed when it is read into main memory, or data lazily stays compressed in main memory and is decompressed on demand only. In this paper, we present an effective approach for database compression based on lightweight, attribute-level compression techniques. We propose a Hierarchical Dictionary Encoding strategy that intelligently selects the most effective compression method for string-valued attributes. We show that eager and lazy decompression strategies produce sub-optimal plans for queries involving compressed string attributes. We then formalize the problem of compression-aware query optimization and propose one provably optimal and two fast heuristic algorithms for selecting a query plan for relational schemas with compressed attributes; our algorithms can easily be integrated into existing cost-based query optimizers. Experiments using TPC-H data demonstrate the impact of our string compression methods and show the importance of compression-aware query optimization. Our approach results in up to an order of magnitude speedup over existing approaches.
dc.description.uri: https://dl.acm.org/doi/10.1145/375663.375692
dc.format.extent: 12 pages
dc.genre: conference papers and proceedings
dc.genre: postprints
dc.identifier: doi:10.13016/m292i6-ictu
dc.identifier.citation: Chen, Zhiyuan, Johannes Gehrke, and Flip Korn. “Query Optimization in Compressed Database Systems.” Proceedings of the 2001 ACM SIGMOD International Conference on Management of Data, SIGMOD ’01, May 1, 2001, 271–82. https://doi.org/10.1145/375663.375692.
dc.identifier.uri: https://doi.org/10.1145/375663.375692
dc.identifier.uri: http://hdl.handle.net/11603/38614
dc.language.iso: en_US
dc.publisher: ACM
dc.relation.isAvailableAt: The University of Maryland, Baltimore County (UMBC)
dc.relation.ispartof: UMBC College of Engineering and Information Technology Dean's Office
dc.relation.ispartof: UMBC Information Systems Department
dc.rights: This item is likely protected under Title 17 of the U.S. Copyright Law. Unless on a Creative Commons license, for uses protected by Copyright Law, contact the copyright holder or the author.
dc.subject: UMBC Mobile, Pervasive and Sensor Computing Lab (MPSC Lab)
dc.subject: UMBC Cybersecurity Institute
dc.subject: UMBC Accelerated Cognitive Cybersecurity Laboratory
dc.title: Query optimization in compressed database systems
dc.type: Text
dcterms.creator: https://orcid.org/0000-0002-6984-7248

Files

Original bundle

Name: queryoptimizationincompressed.pdf
Size: 277.87 KB
Format: Adobe Portable Document Format