Optimized Cluster-based Filtering Algorithm for Graph Metadata

Haifeng Liu, Milenko Petrovic, Hans-Arno Jacobsen, and Zhaohui Wu.

Information Sciences, 181(24)5468-5484, December 2011.


With the increasing amount of information on the Web and the proliferation of RSS offerings, efficient graph-based metadata filtering algorithm for large scale information dissemination is very important today. Matching graph-based documents is expensive due to the expressiveness of the language. The centralized architecture does not work well for the large scale information dissemination service. To address these problems, in this paper we develop a cluster-based publish/subscribe system for filtering graph-based RSS documents. Essentially, we develop two indexing algorithms to enable workload distribution and cluster-based filtering. Furthermore, we proposed an optimized graph matching algorithm which speeds up the constraint evaluation for subscriptions. The experimental results show that we can support one million subscriptions on a compute cluster with 5-20 nodes and the throughput scales linearly with the number of cluster nodes.


Tags: graph-based pub/sub

Readers who enjoyed the above work, may also like the following:

  • Efficient and Scalable Filtering of Graph-based Metadata.
    Haifeng Liu, Milenko Petrovic, and Hans-Arno Jacobsen.
    J. Web Sem., 3(4)294-310, 2005.
    Tags: graph-based pub/sub
  • G-ToPSS: Fast Filtering of Graph-based Metadata.
    Milenko Petrovic, Haifeng Liu, and Hans-Arno Jacobsen.
    In World Wide Web Conference, pages 539-547, Chiba, Japan, May 2005.
    Nominated for best paper award, i.e., one of four finalist papers. Acceptance rate: 14%. Number of submissions: 550.
    Tags: algorithms, content-based publish/subscribe, event processing, publish/subscribe, topss, graph-based pub/sub