TY - JOUR
T1 - Frequent Itemsets Mining with Differential Privacy over Large-Scale Data
AU - Xiong, Xinyu
AU - Chen, Fei
AU - Huang, Peizhi
AU - Tian, Miaomiao
AU - Hu, Xiaofang
AU - Chen, Badong
AU - Qin, Jing
N1 - Funding Information:
This work was supported in part by the Innovation and Technology Fund of Hong Kong under Project ITS/304/16, in part by The Hong Kong Polytechnic University under Project 1-ZE8J, in part by the National Natural Science Foundation of China under Grants 61502314 and 61601376, in part by the Science and Technology Plan Projects of Shenzhen under Grants JCYJ20160307115030281 and JCYJ20170302145623566, in part by the Fundamental Science and Advanced Technology Research Foundation of Chongqing under Grant cstc2016jcyjA0547, in part by the Chongqing Postdoctoral Science Foundation Special Fund under Grant Xm2017039, and in part by the Doctoral Foundation of Southwest University under Grant SWU116005.
Publisher Copyright:
© 2013 IEEE.
PY - 2018/5/22
Y1 - 2018/5/22
N2 - Frequent itemsets mining with differential privacy refers to the problem of mining all frequent itemsets whose supports are above a given threshold in a given transactional dataset, with the constraint that the mined results should not break the privacy of any single transaction. Current solutions to this problem do not balance efficiency, privacy, and data utility well over large-scale data. To address this, we propose an efficient, differentially private frequent itemsets mining algorithm for large-scale data. Based on the ideas of sampling and transaction truncation using length constraints, our algorithm reduces computational intensity, reduces mining sensitivity, and thus improves data utility given a fixed privacy budget. Experimental results show that our algorithm achieves better performance than prior approaches on multiple datasets.
AB - Frequent itemsets mining with differential privacy refers to the problem of mining all frequent itemsets whose supports are above a given threshold in a given transactional dataset, with the constraint that the mined results should not break the privacy of any single transaction. Current solutions to this problem do not balance efficiency, privacy, and data utility well over large-scale data. To address this, we propose an efficient, differentially private frequent itemsets mining algorithm for large-scale data. Based on the ideas of sampling and transaction truncation using length constraints, our algorithm reduces computational intensity, reduces mining sensitivity, and thus improves data utility given a fixed privacy budget. Experimental results show that our algorithm achieves better performance than prior approaches on multiple datasets.
KW - differential privacy
KW - frequent itemsets mining
KW - sampling
KW - string matching
KW - transaction truncation
UR - https://www.scopus.com/pages/publications/85047618952
U2 - 10.1109/ACCESS.2018.2839752
DO - 10.1109/ACCESS.2018.2839752
M3 - Journal article
AN - SCOPUS:85047618952
SN - 2169-3536
VL - 6
SP - 28877
EP - 28889
JO - IEEE Access
JF - IEEE Access
ER -