Abstract
In relational databases, a composition requires three operations: join, projection and duplicate elimination. An expensive external sort is required to eliminate duplicates in a large file. Most conventional composition algorithms take join and duplicate elimination as separate operations to reduce overhead on either operation, but not both. Direct composition algorithms outperform all these algorithms by executing the composition as a single primitive. In this paper, we show that the direct composition can even outperform its component operation, join, under various conditions. Moreover, when the density of the joining attribute is high enough, the hot-spot direct composition may even run faster as the operand relations become bigger. These results can encourage real DBMSs to remove duplicates in some relational operations, and thus, to preserve the closure property of the relational data model.
Original language | English |
---|---|
Pages (from-to) | 204-209 |
Number of pages | 6 |
Journal | Proceedings - IEEE Computer Society's International Computer Software and Applications Conference |
State | Published - 1997 |
Event | Proceedings of the 1997 21st Annual International Computer Software & Applications Conference, COMPSAC'97 - Washington, DC, USA Duration: 13 Aug 1997 → 15 Aug 1997 |