Counting twig matches in a tree

dc.contributor.authorChen, Zhiyuan
dc.contributor.authorJagadish, H. V.
dc.contributor.authorKorn, Flip
dc.contributor.authorKoudas, N.
dc.contributor.authorMuthukrishnan, S.
dc.contributor.authorNg, R.
dc.contributor.authorSrivastava, D.
dc.date.accessioned2025-06-05T14:03:46Z
dc.date.available2025-06-05T14:03:46Z
dc.date.issued2001-04
dc.description.abstractDescribes efficient algorithms for accurately estimating the number of matches of a small node-labeled tree, i.e. a twig, in a large node-labeled tree, using a summary data structure. This problem is of interest for queries on XML and other hierarchical data, to provide query feedback and for cost-based query optimization. Our summary data structure scalably represents approximate frequency information about twiglets (i.e. small twigs) in the data tree. Given a twig query, the number of matches is estimated by creating a set of query twiglets, and combining two complementary approaches: set hashing, used to estimate the number of matches of each query twiglet, and maximal overlap, used to combine the query twiglet estimates into an estimate for the twig query. We propose several estimation algorithms that apply these approaches on query twiglets formed using variations on different twiglet decomposition techniques. We present an extensive experimental evaluation using several real XML data sets, with a variety of twig queries. Our results demonstrate that accurate and robust estimates can be achieved, even with limited space.
dc.description.sponsorship. V. Jagadish was supported in part by NSF under grant IIS0002356.
dc.description.urihttps://ieeexplore.ieee.org/abstract/document/914874
dc.format.extent10 pages
dc.genreconference papers and proceedings
dc.genrepostprints
dc.identifierdoi:10.13016/m2jyid-mfl1
dc.identifier.citationChen, Zhiyuan, H.V. Jagadish, Flip Korn, N. Koudas, S. Muthukrishnan, R. Ng, and D. Srivastava. “Counting Twig Matches in a Tree.” Proceedings 17th International Conference on Data Engineering, April 2001, 595–604. https://doi.org/10.1109/ICDE.2001.914874.
dc.identifier.urihttps://doi.org/10.1109/ICDE.2001.914874
dc.identifier.urihttp://hdl.handle.net/11603/38754
dc.language.isoen_US
dc.publisherIEEE
dc.relation.isAvailableAtThe University of Maryland, Baltimore County (UMBC)
dc.relation.ispartofUMBC College of Engineering and Information Technology Dean's Office
dc.relation.ispartofUMBC Information Systems Department
dc.rights© 2001 IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, in any current or future media, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works, for resale or redistribution to servers or lists, or reuse of any copyrighted component of this work in other works.
dc.subjectTree data structures
dc.subjectUMBC Mobile, Pervasive and Sensor Computing Lab (MPSC Lab)
dc.subjectUMBC Cybersecurity Institute
dc.subjectFile systems
dc.subjectRobustness
dc.subjectDatabase languages
dc.subjectFrequency
dc.subjectData structures
dc.subjectBooks
dc.subjectFeedback
dc.subjectXML
dc.subjectQuery processing
dc.subjectUMBC Accelerated Cognitive Cybersecurity Laboratory
dc.titleCounting twig matches in a tree
dc.typeText
dcterms.creatorhttps://orcid.org/0000-0002-6984-7248

Files

Original bundle

Now showing 1 - 1 of 1
Loading...
Thumbnail Image
Name:
countingtwigs.pdf
Size:
277.99 KB
Format:
Adobe Portable Document Format