TY - GEN
T1 - On interpretation of network embedding via taxonomy induction
AU - Liu, Ninghao
AU - Huang, Xiao
AU - Li, Jundong
AU - Hu, Xia
PY - 2018/7/19
Y1 - 2018/7/19
AB - Network embedding has been increasingly used in many network analytics applications to generate low-dimensional vector representations, so that off-the-shelf models can be applied to a wide variety of data mining tasks. However, as with many other machine learning methods, network embedding results remain hard for users to understand. Each dimension in the embedding space usually has no specific meaning, so it is difficult to comprehend how the embedding instances are distributed in the reconstructed space. In addition, heterogeneous content information may be incorporated into network embedding, which makes it challenging to specify which source of information is effective in generating the embedding results. In this paper, we investigate the interpretation of network embedding, aiming to understand how instances are distributed in the embedding space and to explore the factors that lead to the embedding results. We adopt a post-hoc interpretation scheme, so that our approach can be applied to different types of embedding methods. Specifically, the interpretation of network embedding is presented in the form of a taxonomy. Effective objectives and corresponding algorithms are developed for building the taxonomy. We also design several metrics to evaluate interpretation results. Experiments on real-world datasets from different domains demonstrate that, compared with state-of-the-art alternatives, our approach produces effective and meaningful interpretations of embedding results.
KW - Machine Learning Interpretation
KW - Network Embedding
KW - Taxonomy
UR - http://www.scopus.com/inward/record.url?scp=85051482600&partnerID=8YFLogxK
U2 - 10.1145/3219819.3220001
DO - 10.1145/3219819.3220001
M3 - Conference article published in proceeding or book
AN - SCOPUS:85051482600
SN - 9781450355520
T3 - Proceedings of the ACM SIGKDD International Conference on Knowledge Discovery and Data Mining
SP - 1812
EP - 1820
BT - KDD 2018 - Proceedings of the 24th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining
PB - Association for Computing Machinery
T2 - 24th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, KDD 2018
Y2 - 19 August 2018 through 23 August 2018
ER -