Healthcare question answering (HQA) system plays a vital role in encouraging patients to inquire for professional consultation. However, there are some challenging factors in learning and representing the question corpus of HQA datasets, such as high dimensionality, sparseness, noise, nonprofessional expression, etc. To address these issues, we propose an inception convolutional autoencoder model for Chinese healthcare question clustering (ICAHC). First, we select a set of kernels with different sizes using convolutional autoencoder networks to explore both the diversity and quality in the clustering ensemble. Thus, these kernels encourage to capture diverse representations. Second, we design four ensemble operators to merge representations based on whether they are independent, and input them into the encoder using different skip connections. Third, it maps features from the encoder into a lower-dimensional space, followed by clustering. We conduct comparative experiments against other clustering algorithms on a Chinese healthcare dataset. Experimental results show the effectiveness of ICAHC in discovering better clustering solutions. The results can be used in the prediction of patients’ conditions and the development of an automatic HQA system.
- Kernel , Medical services , Feature extraction , Deep learning , Pediatrics , Knowledge discovery , Indexes