A NOVEL METHODOLOGY FOR DEFINING THE SEQUENCE SPECIFICITIES OF DNA-BINDING PROTEINS: ANALYSIS OF GCN4 AND NEW GENE PRODUCT SBF

Date

1992-10

Department

Hood College Biology

Program

Biomedical and Environmental Science

Abstract

Transcriptional regulation of genes is exerted in part by transactivating DNA-binding proteins that interact with DNA at cis-elements to positively or negatively affect gene expression. A novel methodology was developed that identifies the nucleotide sequence specificity of DNA-binding proteins. This technique is based upon the phenomenon whereby a DNA-binding protein can bind a specific nucleotide sequence present among oligonucleotides containing random combinations of base sequences. These specific sequences are then amplified by polymerase chain reaction (PCR) for subsequent reselection, cloning, sequencing, and analysis. To evaluate the feasibility of this methodology, the previously characterized yeast transactivating protein GCN4 was used as a model system. The nucleotide consensus sequence of binding was determined to be TGACTCA, in agreement with previously reported data. A common variant of the consensus, TGACTAN, was also bound with high affinity; it showed specificity for an A at position +2 and a loss of specificity at position +3. Moreover, this methodology allowed the identification of additional DNA sequences that were recognized by GCN4 with lower affinity; these binding sites deviated from the consensus at one or sometimes two positions 3' to the central C residue. These data suggest that although the recognition sequences were palindromic, binding to the two sides was not equivalent. In addition, the consensus binding sequence of the transactivating protein SBF (S-element binding factor) was determined. An SBF cDNA clone was identified from an expression library by its binding to an oligonucleotide sequence present in the ETS-2 promoter, and was further characterized as encoding a helix-loop-helix leucine-zipper DNA-binding protein. The optimal DNA-binding sequence was determined to be CACGTG.
Like GCN4, the SBF protein dimer binds the two sides of the palindrome with different affinities, asymmetrically allowing variations at the +1 or +2 position. SBF was also shown to bind specifically to sequences that had mutations in the palindromic NCANNTGN motif. This population of mutant sequences was represented by the consensus PuTCAPyPuAGG.
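
As an illustration of the analysis step only, the following Python sketch shows one simple way a consensus could be read off a set of aligned, selected binding sites, and how the palindromic character of a site can be checked by reverse complementation. It is not the procedure used in the thesis: the function names, the 60% threshold, and the five example sites are hypothetical and chosen purely for demonstration.

```python
from collections import Counter


def consensus(sites, min_fraction=0.6):
    """Return a consensus for equal-length, pre-aligned sites.

    A position is reported as 'N' when no single base reaches
    min_fraction of the sites (the threshold is an arbitrary assumption).
    """
    length = len(sites[0])
    if any(len(site) != length for site in sites):
        raise ValueError("all sites must be aligned to the same length")
    result = []
    for column in zip(*sites):
        base, count = Counter(column).most_common(1)[0]
        result.append(base if count / len(sites) >= min_fraction else "N")
    return "".join(result)


def reverse_complement(seq):
    """Return the reverse complement of an uppercase DNA sequence."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


# Hypothetical selected sites, invented for illustration (not data from the thesis).
selected = ["TGACTCA", "TGACTCA", "TGACTCA", "TGACTAA", "TGACTCT"]

print(consensus(selected))           # consensus of the invented sites
print(reverse_complement("CACGTG"))  # a true palindrome equals its own reverse complement
print(reverse_complement("TGA"))     # left half-site of TGACTCA maps onto the right half-site
```

With these invented inputs, the 7-bp site is symmetric only in its half-sites (TGA and TCA flank the central C), whereas CACGTG is identical to its own reverse complement.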