Abstract
Transcription factor binding sites (TFBSs) play an important role in regulating gene expression. Many computational methods for TFBS prediction require sufficient labeled data, yet many transcription factors (TFs) lack labeled data in some cell types. We propose a novel method, referred to as DANN_TF, for TFBS prediction. DANN_TF consists of a feature extractor, a label predictor, and a domain classifier. The feature extractor and the domain classifier constitute an adversarial network, which ensures that the learned features are common across different cell types. DANN_TF is evaluated on five TFs in five cell types, for a total of 25 cell-type TF pairs, and compared to a baseline method that does not use the adversarial network. For both data augmentation and cross-cell-type prediction, DANN_TF performs better than the baseline method on most cell-type TF pairs. DANN_TF is further evaluated on an additional 13 TFs in the five cell types, for a total of 65 cell-type TF pairs. Results show that DANN_TF achieves significantly higher AUC than the baseline method on 96.9% of the 65 cell-type TF pairs. This is a strong indication that DANN_TF can indeed learn common features for cross-cell-type TFBS prediction.
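The three-part architecture described above can be illustrated with a minimal sketch. The abstract does not say how the adversarial training is implemented; the code below assumes the gradient-reversal scheme commonly used in domain-adversarial neural networks, and uses a toy one-layer feature extractor rather than a convolutional network. All names (`dann_step`, `w_f`, `w_y`, `w_d`, `lam`) are hypothetical and chosen for illustration only.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def dot(a, b):
    return sum(ai * bi for ai, bi in zip(a, b))


def dann_step(w_f, w_y, w_d, x, y, d, lam=1.0, lr=0.1):
    """One SGD step on a single example (illustrative, not the paper's code).

    w_f: feature extractor weights, one row per hidden unit
    w_y: label predictor weights (bound / not bound)
    w_d: domain classifier weights (which cell type the example came from)
    x:   input vector; y: binding label (0/1); d: domain label (0/1)
    lam: strength of the assumed gradient reversal
    """
    # Forward pass: shared features feed both heads.
    h = [math.tanh(dot(row, x)) for row in w_f]
    p_y = sigmoid(dot(w_y, h))
    p_d = sigmoid(dot(w_d, h))

    # Gradients of binary cross-entropy with respect to each logit.
    g_y = p_y - y
    g_d = p_d - d

    # Gradient reaching the features: the label gradient is kept,
    # the domain gradient is reversed (multiplied by -lam).
    g_h = [g_y * wy - lam * g_d * wd for wy, wd in zip(w_y, w_d)]

    # Each head minimizes its own loss.
    new_w_y = [w - lr * g_y * hi for w, hi in zip(w_y, h)]
    new_w_d = [w - lr * g_d * hi for w, hi in zip(w_d, h)]

    # The feature extractor minimizes label loss but maximizes domain loss.
    new_w_f = [
        [w - lr * gh * (1.0 - hi * hi) * xi for w, xi in zip(row, x)]
        for row, gh, hi in zip(w_f, g_h, h)
    ]
    return new_w_f, new_w_y, new_w_d, p_y, p_d
```

In this sketch, setting `lam=0` switches off the adversarial term, so the feature extractor is trained on the label loss alone; that loosely mirrors a baseline without the adversarial network, though the paper's actual baseline may differ.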
Original language | English |
---|---|
Article number | 3425 |
Pages (from-to) | 1-20 |
Journal | International Journal of Molecular Sciences |
Volume | 20 |
Issue number | 14 |
Publication status | Published - 2 Jul 2019 |
Keywords
- Adversarial network
- Convolutional neural network
- Cross-cell-type
- Deep learning
- TF-binding site
ASJC Scopus subject areas
- Catalysis
- Molecular Biology
- Spectroscopy
- Computer Science Applications
- Physical and Theoretical Chemistry
- Organic Chemistry
- Inorganic Chemistry