Traditional photo selection in mobile crowdsensing focuses on selecting which photos are sent from participants to a server. As a result, the server may hold a very large number of photos for a given area. This raises a new problem: how to select a set of photos to send from the server to a smartphone user who requests to view an area (e.g., a hot spot). The challenge is that the selected photo set should achieve both photo coverage and view quality (e.g., clearly showing Points of Interest). However, the contribution of these geo-tagged photos to a target area can be uncertain, because the photo shooting direction is unavailable and there are no reference photos. In this paper, we propose a novel and generic server-to-requester photo selection approach. Our approach uses a utility measure to quantify the contribution of a photo set, jointly exploiting the photos' spatial distribution and visual correlation to evaluate their performance on photo coverage and view quality. Finding the photo set with the maximum utility is proven to be NP-hard. We therefore propose an approximation algorithm based on a greedy strategy, supported by rigorous theoretical analysis. The effectiveness of our approach is demonstrated on real-world datasets. The results show that our approach outperforms other approaches, achieving much higher photo coverage and better view quality.
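
To make the greedy strategy concrete, the following is a minimal sketch of utility-driven greedy selection, not the algorithm as specified in the paper. It assumes a budget of k photos and uses a toy utility (the number of distinct area cells a photo set covers) as a stand-in for the utility measure described above, which also accounts for visual correlation. The names `greedy_select`, `coverage_utility`, and `cells` are hypothetical.

```python
from typing import Callable, Dict, Hashable, Iterable, List, Set


def greedy_select(
    candidates: Iterable[Hashable],
    utility: Callable[[List[Hashable]], float],
    k: int,
) -> List[Hashable]:
    """Pick up to k photos, each time adding the one with the largest utility gain."""
    selected: List[Hashable] = []
    remaining = list(candidates)
    current = utility(selected)
    while remaining and len(selected) < k:
        best, best_gain = None, 0.0
        for photo in remaining:
            # Marginal gain of adding this photo to the current selection.
            gain = utility(selected + [photo]) - current
            if gain > best_gain:
                best, best_gain = photo, gain
        if best is None:
            # No remaining photo improves the utility, so stop early.
            break
        selected.append(best)
        remaining.remove(best)
        current += best_gain
    return selected


# Toy data (assumption): each photo id maps to the area cells it covers.
cells: Dict[str, Set[int]] = {
    "a": {1, 2, 3},
    "b": {3, 4},
    "c": {4, 5, 6, 7},
    "d": {1, 2},
}


def coverage_utility(photos: List[str]) -> float:
    """Placeholder utility: count of distinct cells covered by the photo set."""
    covered: Set[int] = set()
    for photo in photos:
        covered |= cells[photo]
    return float(len(covered))


chosen = greedy_select(cells.keys(), coverage_utility, k=2)
print(chosen)  # ['c', 'a']
```

In this toy run, photo "c" is picked first because it covers four cells, and "a" follows because it adds three cells not yet covered. Any other set utility could be passed in place of `coverage_utility`; whether a greedy choice carries an approximation guarantee depends on the properties of the utility actually used.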