We consider the problem of "progressively" joining relations whose records are continuously retrieved from remote sources over an unstable network that may suffer temporary failures. The objectives are to (i) start reporting the first output tuples as soon as possible, before the participating relations have been completely received, and (ii) produce the remaining results at a fast rate. We develop a new algorithm, RPJ (Rate-based Progressive Join), grounded in theoretical analysis. RPJ maximizes the output rate by optimizing its execution according to the characteristics of the joined relations (e.g., data distribution and tuple arrival pattern). Extensive experiments show that our technique delivers results significantly faster than previous methods.
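To make the idea of "progressive" output concrete, the following is a minimal sketch of a plain in-memory symmetric hash join, a standard technique that emits each match as soon as both of its tuples have arrived. It is not RPJ and does not reproduce RPJ's rate-based optimization. The function name `symmetric_hash_join`, the source labels `"R"` and `"S"`, and the `(key, payload)` row layout are assumptions made for illustration only.

```python
from collections import defaultdict
from typing import Hashable, Iterable, Iterator, Tuple

Row = Tuple[Hashable, str]  # hypothetical layout: (join key, payload)


def symmetric_hash_join(
    arrivals: Iterable[Tuple[str, Row]],
) -> Iterator[Tuple[Hashable, str, str]]:
    """Yield (key, r_payload, s_payload) as soon as a match is possible.

    `arrivals` yields ("R", row) or ("S", row) in network arrival order.
    Assumption: both hash tables fit in memory.
    """
    tables = {"R": defaultdict(list), "S": defaultdict(list)}
    for source, (key, payload) in arrivals:
        other = "S" if source == "R" else "R"
        # Store the new tuple so later arrivals from the other side can find it.
        tables[source][key].append(payload)
        # Probe the other side immediately; output does not wait for end of input.
        for match in tables[other].get(key, []):
            if source == "R":
                yield (key, payload, match)
            else:
                yield (key, match, payload)


arrivals = [
    ("R", (1, "r1")),
    ("S", (2, "s2")),
    ("S", (1, "s1")),   # first result can be reported here
    ("R", (2, "r2")),
    ("R", (1, "r1b")),
]
results = list(symmetric_hash_join(arrivals))
print(results)
```

In this sketch the first result appears after the third arrival, well before either input is complete, which corresponds to objective (i). It does nothing about objective (ii) beyond probing eagerly, and it ignores memory limits and network stalls.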
Number of pages: 12
Journal: Proceedings of the ACM SIGMOD International Conference on Management of Data
Publication status: Published - 1 Dec 2005
Event: SIGMOD 2005: ACM SIGMOD International Conference on Management of Data - Baltimore, MD, United States
Duration: 14 Jun 2005 → 16 Jun 2005
ASJC Scopus subject areas
- Information Systems