This study examines the potential of smart card data from public transit systems to infer passengers’ demographic attributes, thereby enabling human-centered public transport service design while reducing reliance on expensive and time-consuming travel surveys. This is challenging because travel behaviors vary significantly across the population, space, and time, and developing meaningful links between these behaviors and passengers’ demographic attributes is not trivial. To address this, we conduct an extensive analysis of spatio-temporal travel behavior patterns using smart card data from the Greater Sydney area, based on which we develop an end-to-end Hybrid Spatial-Temporal Neural Network. In particular, we first empirically analyze passenger mobility patterns from both spatial and temporal perspectives and design a set of discriminative features to characterize them. We then propose a novel Product-based Spatial-Temporal module, which encodes the relationships across a variety of features and harnesses them collectively under an Auto-Encoder Compression module to predict passengers’ demographic information. The experiments are conducted on a large-scale real-world public transportation dataset covering 171.77 million users. The experimental results demonstrate the effectiveness of the proposed method against a number of established methods from the literature.