A NOVEL METHODOLOGY FOR DEFINING THE SEQUENCE SPECIFICITIES OF DNA-BINDING PROTEINS: ANALYSIS OF GCN4 AND NEW GENE PRODUCT SBF
Department
Hood College Biology
Program
Biomedical and Environmental Science
Abstract
Transcriptional regulation of genes is exerted in part by
transactivating DNA-binding proteins which interact with DNA at
cis-elements to positively or negatively affect gene expression.
A novel methodology was developed that identifies the nucleotide
sequence specificity of DNA-binding proteins. This technique is
based upon the phenomenon whereby a DNA-binding protein can bind a
specific nucleotide sequence present among oligonucleotides
containing random combinations of base sequences. These specific
sequences are then amplified by polymerase chain reaction (PCR)
for subsequent re-selection, cloning, sequencing and analysis.
To evaluate the feasibility of this methodology, the
previously characterized yeast transactivating protein, GCN4, was
used as a model system. The nucleotide consensus sequence of
binding was determined to be TGACTCA in agreement with previously
reported data. A common variant of the consensus, TGACTAN, which
was also bound with high affinity, involved specificity for an A at
position +2 and a loss of specificity at position +3. Moreover,
this methodology allowed the identification of additional DNA
sequences which were recognized by GCN4 with lower affinity; these
binding sites deviated from the consensus by one or sometimes two
positions 3' to the central C residue. These data suggest that
although the recognition sequence(s) were palindromic, binding to
both sides was not equivalent.
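
As an illustration of how a consensus such as TGACTCA or TGACTAN can be read off a set of aligned, selected binding sites, the following minimal Python sketch takes the most frequent base in each column and reports N where no base reaches a chosen fraction. The function name `consensus`, the `min_fraction` threshold, and the toy input sequences are assumptions made for this example; they are not the sequences or the analysis procedure of this work.

```python
from collections import Counter


def consensus(sites, min_fraction=0.5):
    """Return a per-column consensus for equal-length, pre-aligned sites.

    A column is reported as its most frequent base when that base occurs
    in at least `min_fraction` of the sites, and as 'N' otherwise.
    """
    if len({len(s) for s in sites}) != 1:
        raise ValueError("sites must be non-empty and of equal length")
    result = []
    for column in zip(*sites):
        base, count = Counter(column).most_common(1)[0]
        result.append(base if count / len(column) >= min_fraction else "N")
    return "".join(result)


# Hypothetical toy inputs, invented for illustration only.
toy_sites = ["TGACTCA", "TGACTCA", "TGACTCA", "TGACTAT", "TGACTAG"]
toy_variants = ["TGACTAT", "TGACTAG", "TGACTAC"]

print(consensus(toy_sites))
print(consensus(toy_variants))
```

With these toy inputs the first call yields a fully specified heptamer, while the second yields N in the last position because no single base dominates there.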
In addition, the consensus binding sequence of a
transactivating protein SBF (S-element binding factor) was
determined. An SBF cDNA clone was identified from an expression
library by its binding to an oligonucleotide sequence present in
the ETS-2 promoter, and was further characterized as encoding a
helix-loop-helix leucine-zipper DNA binding protein. The optimal
DNA-binding sequence was determined to be CACGTG. Like GCN4, the
SBF protein dimer binds both sides of the palindrome with different
affinities, asymmetrically allowing variations in the +1 or +2
position. Also, SBF was shown to specifically bind sequences which
had mutations in the palindromic NCANNTGN motif. This population
of mutant sequences was represented by the consensus PuTCAPyPuAGG.
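
The motif notation used above can be made concrete with a short Python sketch. It assumes the standard IUPAC reading of the abbreviations (Pu = R = A or G, Py = Y = C or T, N = any base), so that PuTCAPyPuAGG is written RTCAYRAGG. The helper names and the example sequences are hypothetical and serve only to show what "palindromic" and "matches the motif" mean for a DNA site.

```python
import re

# Subset of IUPAC nucleotide codes needed for the motifs in the abstract.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "[AG]",    # purine (Pu)
    "Y": "[CT]",    # pyrimidine (Py)
    "N": "[ACGT]",  # any base
}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an A/C/G/T sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def is_palindromic(seq):
    """A DNA site is palindromic if it equals its reverse complement."""
    return seq == reverse_complement(seq)


def matches(motif, seq):
    """Check whether `seq` matches an IUPAC `motif` over its full length."""
    pattern = "".join(IUPAC[code] for code in motif)
    return re.fullmatch(pattern, seq) is not None


print(is_palindromic("CACGTG"))          # optimal SBF site
print(is_palindromic("TGACTCA"))         # GCN4 consensus, odd length
print(matches("NCANNTGN", "ACACGTGT"))   # CACGTG with hypothetical flanks
print(matches("RTCAYRAGG", "GTCACGAGG")) # hypothetical instance of PuTCAPyPuAGG
```

Under this strict definition CACGTG is a perfect palindrome, whereas the seven-base TGACTCA is not, because its half-sites are arranged around a central C whose complement is G.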