TY - JOUR
T1 - Analyzing Spatial-Temporal Distribution of Natural Hazards in China by Mining News Sources
AU - Liu, Xiao
AU - Guo, Haixiang
AU - Lin, Yu Ru
AU - Li, Yijing
AU - Hou, Jundong
N1 - Funding Information:
This research has been supported by the National Natural Science Foundation of China under Grant No. 71573237; New Century Excellent Talents in University of China under Grant No. NCET-13-1012; the Research Foundation of Humanities and Social Sciences of Ministry of Education of China No. 15YJA630019; Special Funding for Basic Scientific Research of Chinese Central University under Grant Nos. CUG120111, CUG110411, G2012002A, CUG140604, and CUG160605; Natural Science Foundation of Hubei Province of China No. 2016CFB503; and China Institute of Geo-Environment Monitoring No. 0001212016CC60013.
Publisher Copyright:
© 2018 American Society of Civil Engineers.
PY - 2018/8/1
Y1 - 2018/8/1
N2 - Natural hazards cause severe consequences to society, the economy, and the environment. However, it is difficult to analyze natural hazard occurrences in China because there is no complete natural hazard database in China, and it is difficult to gather all conventional natural hazard data because they are kept by many different departments. To resolve this problem, this paper proposes a social media data mining methodology. Because social media is a real-time data source, it is an effective channel for up-to-date information about the characteristics of disasters/hazards. News about natural hazards from 2008 to 2017 is mined from a news organization in China as the key data. Text mining, descriptive statistics, association rule mining, and other methods are used to extract the natural hazard events and hazard characteristics (type, time, and location) for the analysis. First, from an analysis of the news headlines, each hazard-focused news event is identified and its time and location are extracted. Second, the spatial-temporal distributions of the natural hazards are analyzed using statistical analysis and network visualization, from which it is found that rainstorms, floods, wind and hail, and other meteorological hazards are the main natural hazard types in China. The high co-occurrence of meteorological hazards and geological hazards indicates that the government needs to pay more attention to geological hazards if there is also a meteorological hazard, especially in mountainous areas. Most hazards are found to have an obvious time distribution, with the high-frequency period being from April to September. Yunnan, Sichuan, and Guizhou Provinces are found to suffer most frequently from a range of different hazards. An analysis of the associations between hazard regions finds that the southern Chinese regions are strongly related, especially Guizhou, Sichuan, Hubei, and Hunan. The results of this study offer insights into the identification of hazard risks and assist in the development of effective hazard prevention and mitigation programs.
AB - Natural hazards cause severe consequences to society, the economy, and the environment. However, it is difficult to analyze natural hazard occurrences in China because there is no complete natural hazard database in China, and it is difficult to gather all conventional natural hazard data because they are kept by many different departments. To resolve this problem, this paper proposes a social media data mining methodology. Because social media is a real-time data source, it is an effective channel for up-to-date information about the characteristics of disasters/hazards. News about natural hazards from 2008 to 2017 is mined from a news organization in China as the key data. Text mining, descriptive statistics, association rule mining, and other methods are used to extract the natural hazard events and hazard characteristics (type, time, and location) for the analysis. First, from an analysis of the news headlines, each hazard-focused news event is identified and its time and location are extracted. Second, the spatial-temporal distributions of the natural hazards are analyzed using statistical analysis and network visualization, from which it is found that rainstorms, floods, wind and hail, and other meteorological hazards are the main natural hazard types in China. The high co-occurrence of meteorological hazards and geological hazards indicates that the government needs to pay more attention to geological hazards if there is also a meteorological hazard, especially in mountainous areas. Most hazards are found to have an obvious time distribution, with the high-frequency period being from April to September. Yunnan, Sichuan, and Guizhou Provinces are found to suffer most frequently from a range of different hazards. An analysis of the associations between hazard regions finds that the southern Chinese regions are strongly related, especially Guizhou, Sichuan, Hubei, and Hunan. The results of this study offer insights into the identification of hazard risks and assist in the development of effective hazard prevention and mitigation programs.
KW - Association rule
KW - Natural hazards
KW - Network visualization
KW - Spatial-temporal distribution
KW - Text mining
UR - http://www.scopus.com/inward/record.url?scp=85045061592&partnerID=8YFLogxK
U2 - 10.1061/(ASCE)NH.1527-6996.0000291
DO - 10.1061/(ASCE)NH.1527-6996.0000291
M3 - Journal article
AN - SCOPUS:85045061592
SN - 1527-6988
VL - 19
JO - Natural Hazards Review
JF - Natural Hazards Review
IS - 3
M1 - 04018006
ER -