Counting twig matches in a tree
| dc.contributor.author | Chen, Zhiyuan | |
| dc.contributor.author | Jagadish, H. V. | |
| dc.contributor.author | Korn, Flip | |
| dc.contributor.author | Koudas, N. | |
| dc.contributor.author | Muthukrishnan, S. | |
| dc.contributor.author | Ng, R. | |
| dc.contributor.author | Srivastava, D. | |
| dc.date.accessioned | 2025-06-05T14:03:46Z | |
| dc.date.available | 2025-06-05T14:03:46Z | |
| dc.date.issued | 2001-04 | |
| dc.description.abstract | Describes efficient algorithms for accurately estimating the number of matches of a small node-labeled tree, i.e. a twig, in a large node-labeled tree, using a summary data structure. This problem is of interest for queries on XML and other hierarchical data, to provide query feedback and for cost-based query optimization. Our summary data structure scalably represents approximate frequency information about twiglets (i.e. small twigs) in the data tree. Given a twig query, the number of matches is estimated by creating a set of query twiglets, and combining two complementary approaches: set hashing, used to estimate the number of matches of each query twiglet, and maximal overlap, used to combine the query twiglet estimates into an estimate for the twig query. We propose several estimation algorithms that apply these approaches on query twiglets formed using variations on different twiglet decomposition techniques. We present an extensive experimental evaluation using several real XML data sets, with a variety of twig queries. Our results demonstrate that accurate and robust estimates can be achieved, even with limited space. | |
| dc.description.sponsorship | . V. Jagadish was supported in part by NSF under grant IIS0002356. | |
| dc.description.uri | https://ieeexplore.ieee.org/abstract/document/914874 | |
| dc.format.extent | 10 pages | |
| dc.genre | conference papers and proceedings | |
| dc.genre | postprints | |
| dc.identifier | doi:10.13016/m2jyid-mfl1 | |
| dc.identifier.citation | Chen, Zhiyuan, H.V. Jagadish, Flip Korn, N. Koudas, S. Muthukrishnan, R. Ng, and D. Srivastava. “Counting Twig Matches in a Tree.” Proceedings 17th International Conference on Data Engineering, April 2001, 595–604. https://doi.org/10.1109/ICDE.2001.914874. | |
| dc.identifier.uri | https://doi.org/10.1109/ICDE.2001.914874 | |
| dc.identifier.uri | http://hdl.handle.net/11603/38754 | |
| dc.language.iso | en_US | |
| dc.publisher | IEEE | |
| dc.relation.isAvailableAt | The University of Maryland, Baltimore County (UMBC) | |
| dc.relation.ispartof | UMBC College of Engineering and Information Technology Dean's Office | |
| dc.relation.ispartof | UMBC Information Systems Department | |
| dc.rights | © 2001 IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, in any current or future media, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works, for resale or redistribution to servers or lists, or reuse of any copyrighted component of this work in other works. | |
| dc.subject | Tree data structures | |
| dc.subject | UMBC Mobile, Pervasive and Sensor Computing Lab (MPSC Lab) | |
| dc.subject | UMBC Cybersecurity Institute | |
| dc.subject | File systems | |
| dc.subject | Robustness | |
| dc.subject | Database languages | |
| dc.subject | Frequency | |
| dc.subject | Data structures | |
| dc.subject | Books | |
| dc.subject | Feedback | |
| dc.subject | XML | |
| dc.subject | Query processing | |
| dc.subject | UMBC Accelerated Cognitive Cybersecurity Laboratory | |
| dc.title | Counting twig matches in a tree | |
| dc.type | Text | |
| dcterms.creator | https://orcid.org/0000-0002-6984-7248 |
Files
Original bundle
1 - 1 of 1
