Because of hardware and wireless conditions in wireless sensor networks (WSNs), raw sensory data often suffer notable loss and corruption. Existing studies mainly consider interpolating randomly missing data in the absence of data corruption, and offer no strategy for handling successive missing data. To address these problems, this paper proposes a novel approach based on matrix completion (MC) to recover successive missing and corrupted data. By analyzing a large set of weather data collected from 196 sensors in Zhu Zhou, China, we verify that weather data exhibit low rank, temporal stability, and spatial correlation. Moreover, simulations on the real weather data show that, when matrix completion is applied in the traditional way, successive data corruption not only seriously degrades the recovery accuracy for missing and corrupted data but also pollutes the normal data. Motivated by these observations, we propose a novel Principal Component Analysis (PCA)-based scheme to efficiently identify the existence of data corruption. We further propose a two-phase MC-based data recovery scheme, named MC-Two-Phase, which applies matrix completion to fully exploit the inherent features of environmental data and recover a data matrix degraded by either missing or corrupted data. Finally, extensive simulations with real-world sensory data demonstrate that the proposed MC-Two-Phase approach achieves very high recovery accuracy in the presence of successively missing and corrupted data.
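To illustrate the low-rank property that matrix completion relies on, the following is a minimal sketch of generic low-rank completion by iterated truncated SVD. It is not the MC-Two-Phase scheme: it assumes a known rank, randomly (not successively) missing entries, no corruption, and synthetic data in place of the weather readings. The function name `complete_low_rank` and all parameters are hypothetical.

```python
import numpy as np

def complete_low_rank(observed, mask, rank, n_iters=500):
    """Fill unobserved entries by iterating a truncated SVD (illustrative only).

    observed: 2-D array of readings (values where mask is False are ignored)
    mask:     boolean array, True where a reading was actually received
    rank:     assumed rank of the underlying data matrix
    """
    mask = mask.astype(bool)
    # Start with zeros in the missing positions.
    estimate = np.where(mask, observed, 0.0)
    for _ in range(n_iters):
        u, s, vt = np.linalg.svd(estimate, full_matrices=False)
        # Keep only the leading `rank` singular components.
        low_rank = (u[:, :rank] * s[:rank]) @ vt[:rank]
        # Keep received readings, take the low-rank guess elsewhere.
        estimate = np.where(mask, observed, low_rank)
    return estimate

# Synthetic stand-in for a sensors-by-time matrix (assumption: exact rank 2).
rng = np.random.default_rng(0)
truth = rng.normal(size=(30, 2)) @ rng.normal(size=(2, 20))
mask = rng.random(truth.shape) < 0.7  # roughly 70% of entries observed

recovered = complete_low_rank(truth, mask, rank=2)
rel_err = np.linalg.norm(recovered - truth) / np.linalg.norm(truth)
print(f"relative recovery error: {rel_err:.4f}")
```

Because this sketch trusts every observed entry, a corrupted reading would be propagated into the low-rank estimate, which is consistent with the abstract's observation that corruption pollutes normal data under traditional matrix completion.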
Keywords
- Corrupted data recovery
- Matrix completion
- Wireless sensor networks
ASJC Scopus subject areas
- Computer Networks and Communications
- Electrical and Electronic Engineering