Detecting anomalous Border Gateway Protocol (BGP) traffic is significantly important in improving both security and robustness of the Internet. Existing solutions apply classic classifiers to make real-time decision based on the traffic features of present moment. However, due to the frequently happening burst and noise in dynamic Internet traffic, the decision based on short-term features is not reliable. To address this problem, we propose MS-LSTM, a multi-scale Long Short-Term Memory (LSTM) model to consider the Internet flow as a multi-dimensional time sequence and learn the traffic pattern from historical features in a sliding time window. In addition, we find that adopting different time scale to preprocess the traffic flow has great impact on the performance of all classifiers. In this paper, comprehensive experiments are conducted and the results show that a proper time scale can improve about 10% accuracy of LSTM as well as all conventional machine learning methods. Particularly, MS-LSTM with optimal time scale 8 can achieve 99.5% accuracy in the best case.