Semi-supervised segmentation of echocardiography videos via noise-resilient spatiotemporal semantic calibration and fusion

Huisi Wu, Jiasheng Liu, Fangyan Xiao, Zhenkun Wen, Lan Cheng, Jing Qin

Research output: Journal article publication › Journal article › Academic research › peer-review


We present a novel model for left ventricle endocardium segmentation from echocardiography video, a task of great clinical significance that remains challenging because of (1) the severe speckle noise in echocardiography videos, (2) the irregular motion of pathological hearts, and (3) the limited training data caused by high annotation cost. The proposed model has three key characteristics. First, we propose a novel adaptive spatiotemporal semantic calibration method to align the feature maps of consecutive frames, in which the spatiotemporal correspondences are computed from feature maps instead of pixels, thereby mitigating the adverse effects of speckle noise on the calibration. Second, we learn, from the temporal perspective, the importance of each neighbouring frame's feature map to the current frame, so that temporal information is weighted selectively rather than uniformly to handle irregular and anisotropic motion. Third, we integrate these techniques into the mean teacher semi-supervised architecture to leverage a large amount of unlabeled data and improve segmentation accuracy. We extensively evaluate the proposed method on two public echocardiography video datasets (EchoNet-Dynamic and CAMUS), where the average Dice coefficient for left ventricular endocardium segmentation reaches 92.87% and 93.79%, respectively. Comparisons with state-of-the-art methods further demonstrate the effectiveness of the proposed method, which achieves better segmentation performance at a faster speed.
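
As a reading aid only, the two generic building blocks named in the abstract can be sketched in a few lines of plain Python: the Dice coefficient used as the evaluation metric, and the exponential-moving-average (EMA) weight update that defines a mean teacher. This is not the authors' implementation. The function names, the flat-list mask representation, the dictionary of scalar weights, the empty-mask convention, and the value of `alpha` are all assumptions made for illustration.

```python
def dice_coefficient(pred, target):
    """Dice = 2 * |A ∩ B| / (|A| + |B|) for two flat binary masks."""
    if len(pred) != len(target):
        raise ValueError("masks must have the same length")
    # Count pixels that are foreground in both masks.
    intersection = sum(1 for p, t in zip(pred, target) if p and t)
    # Sum of the foreground sizes of the two masks.
    total = sum(1 for p in pred if p) + sum(1 for t in target if t)
    # Assumed convention: two empty masks count as a perfect match.
    return 1.0 if total == 0 else 2.0 * intersection / total


def ema_update(teacher, student, alpha=0.99):
    """Mean-teacher step: teacher <- alpha * teacher + (1 - alpha) * student.

    `teacher` and `student` are hypothetical dicts mapping a parameter
    name to a scalar weight; a real model would hold tensors instead.
    """
    return {
        name: alpha * teacher[name] + (1.0 - alpha) * student[name]
        for name in teacher
    }


if __name__ == "__main__":
    predicted_mask = [1, 1, 0, 0]
    reference_mask = [1, 0, 0, 0]
    print("Dice:", dice_coefficient(predicted_mask, reference_mask))

    teacher_weights = {"w": 1.0}
    student_weights = {"w": 0.0}
    print("Teacher after EMA:", ema_update(teacher_weights, student_weights, alpha=0.9))
```

In a mean teacher setup, only the student is trained by gradient descent; the teacher follows it through the EMA step and supplies targets for the unlabeled frames, which is how unlabeled video can contribute to training.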

Original language: English
Pages (from-to): 18-26
Number of pages: 9
Publication status: Published - 7 Jun 2022


  • Deep learning
  • Echocardiography
  • Spatiotemporal semantic calibration
  • Spatiotemporal semantic fusion
  • Temporal context extraction
  • Video semantic segmentation

ASJC Scopus subject areas

  • Computer Science Applications
  • Cognitive Neuroscience
  • Artificial Intelligence

