This article presents a multi-object tracking method for autonomous vehicles based on sensor fusion of a monocular camera and a 3-D LiDAR. Specifically, several pairwise costs are designed from 3-D cues such as object locations, motions, and poses. These costs complement one another to reduce matching errors during tracking, and they are inexpensive enough to compute online on embedded hardware. The pairwise costs are fed into a data-association framework based on the Hungarian algorithm, followed by back-end fusion of the tracking results. Experiments on our autonomous sightseeing car demonstrate that the proposed method achieves accurate and robust tracking in real-world traffic scenarios.
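To make the data-association step concrete, the sketch below combines location, motion, and pose costs into a single matrix and solves the assignment with the Hungarian algorithm. The cost terms, weights, gating threshold, and the `associate` helper are illustrative assumptions, not the paper's exact formulation:

```python
import numpy as np
from scipy.optimize import linear_sum_assignment

def associate(tracks, detections, weights=(0.5, 0.3, 0.2), gate=1.0):
    """Associate tracks to detections by combining pairwise costs.

    Each track/detection is a dict with illustrative 3-D cues:
    'pos' (x, y, z location), 'vel' (velocity), 'yaw' (pose angle).
    The specific cost terms and weights are assumptions made for
    this sketch, not the paper's exact design.
    """
    cost = np.zeros((len(tracks), len(detections)))
    for i, t in enumerate(tracks):
        for j, d in enumerate(detections):
            # Location, movement, and pose costs complement each other.
            c_loc = np.linalg.norm(np.array(t['pos']) - np.array(d['pos']))
            c_mov = np.linalg.norm(np.array(t['vel']) - np.array(d['vel']))
            c_pose = abs((t['yaw'] - d['yaw'] + np.pi) % (2 * np.pi) - np.pi)
            cost[i, j] = weights[0]*c_loc + weights[1]*c_mov + weights[2]*c_pose
    rows, cols = linear_sum_assignment(cost)  # Hungarian algorithm
    # Reject matches whose combined cost exceeds the gating threshold.
    return [(i, j) for i, j in zip(rows, cols) if cost[i, j] <= gate]
```

Because each cost term is a simple vector distance over per-object cues, the full matrix stays cheap to recompute every frame, which is what makes online operation on embedded hardware plausible.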