Swinburne
Browse
- No file added yet -

Fast XML structural join algorithms by partitioning

Download (1.1 MB)
journal contribution
posted on 2024-07-12, 13:14 authored by Nan Tang, Jeffrey Xu Yu, Kam-Fai Wong, Jianxin Li
An XML structural join evaluates structural relationships (e.g. parent-child or ancestor-descendant) between XML elements. It serves as an important computation unit in XML pattern matching. Several classical structural join algorithms have been proposed such as Stack-tree join and XR-Tree join. In this paper, we consider to answer the problem of structural join by partitioning. The Dietz numbering scheme is used for encoding since nodes with the Dietz encodings could be well distributed on a plane. We first extend the relationships between nodes to the relationships between partitions on a plane and obtain some observations and properties about the relationships between partitions. We then propose a new partition-based method, named P-Join for structural join between ancestor and descendant nodes based on the properties derived from our observations. Moreover, we present an enhanced partitioned-based structural join algorithm and two optimised methods. Extensive experiments show that the performance of our proposed algorithms outperform that of Stack-tree and XR-Tree algorithms. In order to store the partitioning results, we design a simple but efficient index structure, called PSS-tree. The experimental result shows that it has less maintenance overhead than XR-Tree.

History

Available versions

PDF (Published version)

ISSN

1443-458X

Journal title

Journal of Research and Practice in Information Technology

Volume

40

Issue

1

Pagination

20 pp

Publisher

Australian Computer Society

Copyright statement

Copyright © 2008 Australian Computer Society Inc. General permission to republish, but not for profit, all or part of this material is granted, provided that the JRPIT copyright notice is given and that reference is made to the publication, to its date of issue, and to the fact that reprinting privileges were granted by permission of the Australian Computer Society Inc.

Language

eng

Usage metrics

    Publications

    Categories

    No categories selected

    Keywords

    Exports

    RefWorks
    BibTeX
    Ref. manager
    Endnote
    DataCite
    NLM
    DC