TY - GEN
T1 - A Pattern-Based Framework for Addressing Data Representational Inconsistency
AU - Yi, Bingyu
AU - Hua, Wen
AU - Sadiq, Shazia
N1 - Funding Information:
This work was supported by the grant DP140103171 (Declaration, Exploration, Enhancement and Provenance: The DEEP Approach to Data Quality Management Systems) from the Australian Research Council.
Publisher Copyright:
© Springer International Publishing AG 2016.
PY - 2016
Y1 - 2016
N2 - Data representational inconsistency, where data has diverse formats or structures, is a crucial data quality problem. Existing fixing approaches either target a specific domain or require massive information from users. In this work, we propose a user-friendly pattern-based framework for addressing data representational inconsistency. Our framework consists of three modules: pattern design, pattern detection, and pattern unification. We identify several challenges in all three tasks in order to handle an inconsistent dataset both accurately and efficiently. We propose various techniques to tackle these issues, and our experimental results on real-life datasets demonstrate better performance of our proposals compared with existing methods.
AB - Data representational inconsistency, where data has diverse formats or structures, is a crucial data quality problem. Existing fixing approaches either target a specific domain or require massive information from users. In this work, we propose a user-friendly pattern-based framework for addressing data representational inconsistency. Our framework consists of three modules: pattern design, pattern detection, and pattern unification. We identify several challenges in all three tasks in order to handle an inconsistent dataset both accurately and efficiently. We propose various techniques to tackle these issues, and our experimental results on real-life datasets demonstrate better performance of our proposals compared with existing methods.
UR - http://www.scopus.com/inward/record.url?scp=84990062975&partnerID=8YFLogxK
U2 - 10.1007/978-3-319-46922-5_31
DO - 10.1007/978-3-319-46922-5_31
M3 - Conference article published in proceeding or book
AN - SCOPUS:84990062975
SN - 9783319469218
T3 - Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
SP - 395
EP - 406
BT - Databases Theory and Applications - 27th Australasian Database Conference, ADC 2016, Proceedings
A2 - Cheema, Muhammad Aamir
A2 - Zhang, Wenjie
A2 - Chang, Lijun
PB - Springer Verlag
T2 - 27th Australasian Database Conference on Databases Theory and Applications, ADC 2016
Y2 - 28 September 2016 through 29 September 2016
ER -