TY - GEN
T1 - A spatial insight for UGC Apps
T2 - 20th International Conference on Mobile Data Management, MDM 2019
AU - Li, Zhe
AU - Li, Yu
AU - Yiu, Man Lung
PY - 2019/6
Y1 - 2019/6
N2 - In the era of smartphones, massive amounts of data are generated with geo-related information. A large portion of this data comes from UGC applications (e.g., Twitter, Instagram), where the content providers are the users themselves. Such applications are highly attractive for targeted marketing and recommendation, which have been well studied in recommender systems. In this paper, we consider this from a new spatial perspective using UGC content only. To do this, we first represent each message as a point whose location is given by its geo information, and then group the points by their keywords to form multiple point groups. We formulate a similarity search problem: given a query keyword, find the k keywords with the most similar distribution of locations. Our case study shows that keywords with similar distributions are highly likely to have semantic connections. However, the performance of existing solutions degrades when different point groups overlap significantly, which happens frequently in UGC content. We propose efficient techniques to process similarity search on this kind of point group. Experimental results on Twitter data demonstrate that our solution is up to 6 times faster than the state of the art.
AB - In the era of smartphones, massive amounts of data are generated with geo-related information. A large portion of this data comes from UGC applications (e.g., Twitter, Instagram), where the content providers are the users themselves. Such applications are highly attractive for targeted marketing and recommendation, which have been well studied in recommender systems. In this paper, we consider this from a new spatial perspective using UGC content only. To do this, we first represent each message as a point whose location is given by its geo information, and then group the points by their keywords to form multiple point groups. We formulate a similarity search problem: given a query keyword, find the k keywords with the most similar distribution of locations. Our case study shows that keywords with similar distributions are highly likely to have semantic connections. However, the performance of existing solutions degrades when different point groups overlap significantly, which happens frequently in UGC content. We propose efficient techniques to process similarity search on this kind of point group. Experimental results on Twitter data demonstrate that our solution is up to 6 times faster than the state of the art.
KW - Hausdorff distance
KW - Similarity Searching
KW - Spatio-Textual Searching
UR - http://www.scopus.com/inward/record.url?scp=85070973156&partnerID=8YFLogxK
U2 - 10.1109/MDM.2019.00-26
DO - 10.1109/MDM.2019.00-26
M3 - Conference article published in proceeding or book
AN - SCOPUS:85070973156
T3 - Proceedings - IEEE International Conference on Mobile Data Management
SP - 371
EP - 372
BT - Proceedings - 2019 20th International Conference on Mobile Data Management, MDM 2019
PB - Institute of Electrical and Electronics Engineers Inc.
Y2 - 10 June 2019 through 13 June 2019
ER -