A NOVEL METHODOLOGY FOR DEFINING THE SEQUENCE SPECIFICITIES OF DNA-BINDING PROTEINS: ANALYSIS OF GCN4 AND NEW GENE PRODUCT SBF

Department

Hood College Biology

Program

Biomedical and Environmental Science

Abstract

Transcriptional regulation of genes is exerted in part by transactivating DNA-binding proteins, which interact with DNA at cis-elements to positively or negatively affect gene expression. A novel methodology was developed that identifies the nucleotide sequence specificity of DNA-binding proteins. The technique is based on the ability of a DNA-binding protein to bind a specific nucleotide sequence present among oligonucleotides containing random combinations of base sequences. These specific sequences are then amplified by polymerase chain reaction (PCR) for subsequent re-selection, cloning, sequencing and analysis. To evaluate the feasibility of this methodology, the previously characterized yeast transactivating protein GCN4 was used as a model system. The nucleotide consensus sequence of binding was determined to be TGACTCA, in agreement with previously reported data. A common variation in the consensus, TGACTAN, was also bound with high affinity; it involved specificity for an A in position +2 and a loss of specificity in position +3. Moreover, this methodology allowed the identification of additional DNA sequences that were recognized by GCN4 with lower affinity; these binding sites deviated from the consensus at one or sometimes two positions 3' to the central C residue. These data suggest that although the recognition sequences were palindromic, binding to the two sides was not equivalent. In addition, the consensus binding sequence of a transactivating protein, SBF (S-element binding factor), was determined. An SBF cDNA clone was identified from an expression library by its binding to an oligonucleotide sequence present in the ETS-2 promoter, and was further characterized as encoding a helix-loop-helix leucine-zipper DNA-binding protein. The optimal DNA-binding sequence was determined to be CACGTG.
Like GCN4, the SBF protein dimer binds the two sides of the palindrome with different affinities, asymmetrically allowing variations in the +1 or +2 position. SBF was also shown to bind specifically to sequences that had mutations in the palindromic NCANNTGN motif. This population of mutant sequences was represented by the consensus PuTCAPyPuAGG.
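
The analysis step described above, in which cloned and sequenced binding sites are reduced to a consensus, can be illustrated with a minimal sketch. It is not the procedure used in the thesis: the aligned sites below are invented for illustration, and the function names and the 0.6 frequency threshold are assumptions. The sketch takes the most frequent base at each position and reports N where no base is frequent enough. It also checks palindromy by reverse complement.

```python
from collections import Counter

# Hypothetical aligned binding sites, invented for illustration only;
# these are NOT sequences reported in the thesis.
SELECTED_SITES = [
    "TGACTCA",
    "TGACTCA",
    "TGACTCA",
    "TGACTAA",
    "TGACTAT",
    "TGACTCA",
]


def consensus(sites, threshold=0.6):
    """Return a per-position consensus string.

    A position is reported as 'N' when the most frequent base occurs in
    fewer than `threshold` of the sites (the threshold is an assumption).
    """
    length = len(sites[0])
    if any(len(site) != length for site in sites):
        raise ValueError("sites must be aligned to equal length")
    result = []
    for column in zip(*sites):  # iterate over alignment columns
        base, count = Counter(column).most_common(1)[0]
        result.append(base if count / len(sites) >= threshold else "N")
    return "".join(result)


def reverse_complement(seq):
    """Return the reverse complement of an uppercase DNA sequence."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


if __name__ == "__main__":
    print(consensus(SELECTED_SITES))
    print(consensus(SELECTED_SITES, threshold=0.7))
    print(reverse_complement("CACGTG"))
    print(reverse_complement("TGACTCA"))
```

With these invented sites, a stricter threshold turns position +2 into N, which shows how the apparent specificity at a position depends on the cutoff chosen. The reverse-complement check shows that CACGTG is identical on both strands, whereas TGACTCA has half-sites (TGA and TCA) that are reverse complements of each other around the central C.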