TY - GEN
T1 - Bayesian heteroscedastic matrix factorization for conversion rate prediction
AU - Yang, Hongxia
N1 - Publisher Copyright: © 2017 Association for Computing Machinery.
PY - 2017/11/6
Y1 - 2017/11/6
AB - Display advertising has generated billions in revenue and given rise to hundreds of scientific papers and patents, yet the accuracy of prediction technologies leaves much to be desired. Conversion rate (CVR) prediction can often be formulated as a matrix or tensor completion problem in which each dimension consists of thousands or even hundreds of thousands of levels. Observed entries are typically extremely sparse, comprising only 0.01% to 1% of the entire matrix or tensor, and conversions and impression sizes are highly unevenly distributed. To deal with these issues, we propose an extension of matrix factorization, namely Bayesian Heteroscedastic Matrix Factorization (BHMF), with three key features. First, BHMF accounts for the fact that each observed entry of a matrix has a different magnitude of error depending on the corresponding impression size. We extend previous research on empirical instance-wise weighted matrix factorization [10] with a rigorous probabilistic modelling framework. Second, BHMF is amenable to an efficient Bayesian inference algorithm that scales to high-dimensional matrices. Compared to optimization-based training, it is more robust to the choice of latent factor dimension and regularization parameters. Last, the Bayesian approach provides predictive uncertainty estimates for unseen entries, which makes it capable of dealing with cold-start problems. This can potentially affect a significant amount of revenue in the real-time bidding (RTB) environment. We focus on matrix CVR prediction in this paper, but the proposed BHMF can be naturally extended and applied to higher-dimensional tensors. We demonstrate a substantial improvement in the predictive capability of our model on BrightRoll, the Yahoo! demand-side platform (DSP).
KW - Bayesian matrix factorization
KW - Conversion rate prediction
KW - Heteroscedastic
UR - https://www.scopus.com/pages/publications/85037343599
U2 - 10.1145/3132847.3133076
DO - 10.1145/3132847.3133076
M3 - Conference article published in proceeding or book
AN - SCOPUS:85037343599
T3 - International Conference on Information and Knowledge Management, Proceedings
SP - 2407
EP - 2410
BT - CIKM 2017 - Proceedings of the 2017 ACM Conference on Information and Knowledge Management
PB - Association for Computing Machinery
T2 - 26th ACM International Conference on Information and Knowledge Management, CIKM 2017
Y2 - 6 November 2017 through 10 November 2017
ER -