Assessment of transfer methods for comparative genomics of regulatory networks in bacteria

dc.contributor.author: Kılıç, Sefa
dc.contributor.author: Erill, Ivan
dc.date.accessioned: 2021-03-05T20:21:34Z
dc.date.available: 2021-03-05T20:21:34Z
dc.date.issued: 2016-08-31
dc.description.abstract: Background: Comparative genomics can leverage the vast amount of available genomic sequences to reconstruct and analyze transcriptional regulatory networks in Bacteria, but the efficacy of this approach hinges on the ability to transfer regulatory network information from reference species to the genomes under analysis. Several methods have been proposed to transfer regulatory information between bacterial species, but the paucity and distributed nature of experimental information on bacterial transcriptional networks have prevented their systematic evaluation.
Results: We report the compilation of a large catalog of transcription factor-binding sites across Bacteria and its use to systematically benchmark proposed transfer methods across pairs of bacterial species. We evaluate motif- and accuracy-based metrics to assess the results of regulatory network transfer and we identify the precision-recall area-under-the-curve as the best metric for this purpose due to the highly class-imbalanced nature of the problem. Methods assuming conservation of the transcription factor-binding motif (motif-based) are shown to substantially outperform those assuming conservation of regulon composition (network-based), even though their efficiency can decrease sharply with increasing phylogenetic distance. Variations of the basic motif-based transfer method do not yield significant improvements in transfer accuracy. Our results indicate that detection of a large enough number of regulated orthologs is critical for network-based transfer methods, but that relaxing orthology requirements does not improve results. Using the transcriptional regulators LexA and Fur as case examples, we also show how DNA-binding domain sequence similarity can yield confounding results as an indicator of transfer efficiency for motif-based methods.
Conclusions: Counter to standard practice, our evaluation of metrics to assess the efficiency of methods for regulatory network information transfer reveals that the area under precision-recall (PR) curves is a more precise and informative metric than that of receiver-operating-characteristic (ROC) curves, confirming similar findings in other class-imbalanced settings. Our systematic assessment of transfer methods reveals that simple approaches to both motif- and network-based transfer of regulatory information provide equal or better results than more elaborate methods. We also show that there are no effective predictors of transfer efficacy, substantiating the long-standing practice of manual curation in comparative genomics analyses. [en_US]
dc.description.sponsorship: The authors wish to thank Patrick O’Neill for insightful discussions and assistance in the data mining process, and Dinara Sagitova and the CollecTF team for their annotation work. This work, including the publication, was funded by the U.S. National Science Foundation award MCB-1158056. [en_US]
dc.description.uri: https://bmcbioinformatics.biomedcentral.com/articles/10.1186/s12859-016-1113-7 [en_US]
dc.format.extent: 10 pages [en_US]
dc.genre: journal articles [en_US]
dc.identifier: doi:10.13016/m2n7he-kzuo
dc.identifier.citation: Kılıç, S., Erill, I. Assessment of transfer methods for comparative genomics of regulatory networks in bacteria. BMC Bioinformatics 17, 277 (2016). https://doi.org/10.1186/s12859-016-1113-7 [en_US]
dc.identifier.uri: https://doi.org/10.1186/s12859-016-1113-7
dc.identifier.uri: http://hdl.handle.net/11603/21086
dc.language.iso: en_US [en_US]
dc.publisher: BMC [en_US]
dc.relation.isAvailableAt: The University of Maryland, Baltimore County (UMBC)
dc.relation.ispartof: UMBC Biological Sciences Department Collection
dc.relation.ispartof: UMBC Faculty Collection
dc.relation.ispartof: UMBC Student Collection
dc.rights: This item is likely protected under Title 17 of the U.S. Copyright Law. Unless on a Creative Commons license, for uses protected by Copyright Law, contact the copyright holder or the author.
dc.rights: Attribution 4.0 International [*]
dc.rights.uri: https://creativecommons.org/licenses/by/4.0/ [*]
dc.title: Assessment of transfer methods for comparative genomics of regulatory networks in bacteria [en_US]
dc.type: Text [en_US]

Files

Original bundle

Name: s12859-016-1113-7.pdf
Size: 1.38 MB
Format: Adobe Portable Document Format

Name: Supplimentary.pdf
Size: 960.79 KB
Format: Adobe Portable Document Format

License bundle

Name: license.txt
Size: 2.56 KB
Format: Item-specific license agreed upon to submission