TY - GEN
T1 - Towards query pricing on incomplete data (extended abstract)
AU - Miao, Xiaoye
AU - Gao, Yunjun
AU - Chen, Lu
AU - Peng, Huanhuan
AU - Yin, Jianwei
AU - Li, Qing
N1 - Funding Information:
Next, Fig. 2(c) plots the history-aware prices on the benchmark dataset SSB with eight consecutive queries. First, the gap between the history-aware QUCA price and the QUCA price grows with more query instances. This is because these eight queries differ only in their query parameters, so more tuples become free as more query instances are processed. We can also see that, as the data update rate grows, more and more tuples in the collected lineage sets expire and thus cannot be reused. This causes the history-aware QUCA price to approach the (history-oblivious) QUCA price. Hence, the history-aware QUCA price is cost-effective in most cases (except when the data is updated extremely quickly). ACKNOWLEDGMENTS This work was supported in part by the NSFC under Grants No.61902343, No.62025206, No.61972338, No.61825205, No.61772459, the National Key R&D Program of China under Grants No.2019YFE0126200 and No.2017YFB1400601, and the Zhejiang Provincial Natural Science Foundation under Grant No.LR21F020005. Yunjun Gao is the corresponding author of the work.
Publisher Copyright:
© 2021 IEEE.
PY - 2021/4
Y1 - 2021/4
N2 - As data markets have started to receive much attention from both industry and academia, how to price tradable data has become an indispensable problem. Pricing incomplete data is more practical and challenging, due to the pervasiveness of incomplete data. In this paper, we explore the pricing problem for queries over incomplete data. We propose a sophisticated pricing mechanism, termed iDBPricer, which considers a series of essential factors, including data contribution/usage, data completeness, and query quality. We present two novel price functions, namely, the usage and completeness-aware price function (UCA price for short) and the quality, usage, and completeness-aware price function (QUCA price for short). Moreover, we develop efficient algorithms for deriving the query prices. Extensive experiments using both real and benchmark datasets confirm the superiority of iDBPricer over the state-of-the-art price functions.
AB - As data markets have started to receive much attention from both industry and academia, how to price tradable data has become an indispensable problem. Pricing incomplete data is more practical and challenging, due to the pervasiveness of incomplete data. In this paper, we explore the pricing problem for queries over incomplete data. We propose a sophisticated pricing mechanism, termed iDBPricer, which considers a series of essential factors, including data contribution/usage, data completeness, and query quality. We present two novel price functions, namely, the usage and completeness-aware price function (UCA price for short) and the quality, usage, and completeness-aware price function (QUCA price for short). Moreover, we develop efficient algorithms for deriving the query prices. Extensive experiments using both real and benchmark datasets confirm the superiority of iDBPricer over the state-of-the-art price functions.
UR - http://www.scopus.com/inward/record.url?scp=85112869070&partnerID=8YFLogxK
U2 - 10.1109/ICDE51399.2021.00260
DO - 10.1109/ICDE51399.2021.00260
M3 - Conference article published in proceeding or book
AN - SCOPUS:85112869070
T3 - Proceedings - International Conference on Data Engineering
SP - 2348
EP - 2349
BT - Proceedings - 2021 IEEE 37th International Conference on Data Engineering, ICDE 2021
PB - IEEE Computer Society
T2 - 37th IEEE International Conference on Data Engineering, ICDE 2021
Y2 - 19 April 2021 through 22 April 2021
ER -