Many organizations and companies have deployed not only datacenters but also a large number of geo-distributed, heterogeneous edges to provide fast data analytics services. Because transmitting large volumes of data across the WAN can be costly, existing work mainly focuses on pre-processing data in place to avoid transmission. However, the heterogeneity of edges in either local computing capacity or network bandwidth limits the efficient use of scarce resources, which may result in long task completion times. To cope with dynamic demands on scarce resources, we take the heterogeneity of both the computing capacity and the network bandwidth of geo-distributed edges into consideration when assigning data analytics tasks and their associated data between the central datacenter and the edges, so that the overall latency is reduced. We formulate the geo-distributed data-task joint scheduling problem (GJS), show that it is NP-hard, and propose a near-optimal randomized scheduling algorithm (ran-GJS). Using martingale analysis, we prove that the result of ran-GJS is concentrated around its optimal value with high probability, i.e., $1 - O(e^{-t^2})$, where $t$ is the concentration bound. Experimental results obtained from both extensive simulations and a Yarn-based prototype show that ran-GJS significantly speeds up geo-distributed analytics, with a gain in average completion time of at least 28% over state-of-the-art baseline algorithms.