A Network-Based Approach for Improving Annotation of Transcription Factor Functions and Binding Sites in Arabidopsis thaliana

dc.contributor.author: Najnin, Tanzira
dc.contributor.author: Saimon, Sakhawat Hossain
dc.contributor.author: Sunter, Garry
dc.contributor.author: Ruan, Jianhua
dc.description.abstract: Transcription factors are an integral component of the cellular machinery responsible for regulating many biological processes, and they recognize distinct DNA sequence patterns as well as internal/external signals to mediate target gene expression. The functional roles of an individual transcription factor can be traced back to the functions of its target genes. While such functional associations can be inferred through the use of binding evidence from high-throughput sequencing technologies available today, including chromatin immunoprecipitation sequencing, such experiments can be resource-consuming. On the other hand, exploratory analysis driven by computational techniques can alleviate this burden by narrowing the search scope, but the results are often deemed low-quality or non-specific by biologists. In this paper, we introduce a data-driven, statistics-based strategy to predict novel functional associations for transcription factors in the model plant Arabidopsis thaliana. To achieve this, we leverage one of the largest available gene expression compendia to build a genome-wide transcriptional regulatory network and infer regulatory relationships among transcription factors and their targets. We then use this network to build a pool of likely downstream targets for each transcription factor and query each target pool for functionally enriched gene ontology terms. The results exhibited sufficient statistical significance to annotate most of the transcription factors in Arabidopsis with highly specific biological processes. We also perform DNA binding motif discovery for transcription factors based on their target pool. We show that the predicted functions and motifs strongly agree with curated databases constructed from experimental evidence. In addition, statistical analysis of the network revealed interesting patterns and connections between network topology and system-level transcriptional regulation properties. We believe that the methods demonstrated in this work can be extended to other species to improve the annotation of transcription factors and understand transcriptional regulation on a system level.
dc.description.department: Computer Science
dc.identifier: doi: 10.3390/genes14020282
dc.identifier.citation: Genes 14 (2): 282 (2023)
dc.rights: Attribution 4.0 United States
dc.subject: transcriptional regulatory network
dc.subject: network topology
dc.subject: gene expression pattern
dc.subject: gene ontology
dc.subject: functional annotation
dc.subject: motif discovery
dc.title: A Network-Based Approach for Improving Annotation of Transcription Factor Functions and Binding Sites in Arabidopsis thaliana

