How long is a TATA box?

We define a homology group, as in the EPD, as sequence similarity due to common phylogenetic origin. However, as the definition of homologous promoters is based only on similarity of DNA sequence in the promoter region, they can be either orthologs or paralogs. We then deleted from each set multiple sequences coming from the same homology group, as well as TATA boxes in the transcribed region.

Thus, the dataset now corresponds to a representative set of promoters that are not closely related.

To the summation we add a constant C0, chosen such that the best binding site scores zero and poorer sites score positively.

A standardized score for each motif is calculated by subtracting from the observed number of occurrences of the motif the expected number of occurrences, based on the mononucleotide frequency of the component base pairs, and then dividing this value by the expected SD.

We have grouped the ten TATA boxes studied here into two groups.
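The standardized motif count described above (observed minus expected occurrences, divided by the expected SD) can be sketched as follows. This is only an illustration under stated assumptions: the function name `motif_z_score` is hypothetical, the mononucleotide frequencies are pooled over the input sequences, and the SD is taken from a binomial model, since the text does not say how the expected SD is obtained.

```python
import math
from collections import Counter


def motif_z_score(sequences, motif):
    """Return (observed - expected) / SD for `motif` in upper-case DNA `sequences`."""
    # Mononucleotide frequencies pooled over all sequences.
    counts = Counter("".join(sequences))
    total = sum(counts.values())
    freq = {base: counts[base] / total for base in "ACGT"}

    # Probability of the motif at a single start position, assuming
    # independent bases (mononucleotide model).
    p = math.prod(freq[base] for base in motif)

    # Number of positions at which the motif could start.
    n_positions = sum(max(len(s) - len(motif) + 1, 0) for s in sequences)

    # Observed count; overlapping occurrences are counted.
    observed = sum(
        1
        for s in sequences
        for i in range(len(s) - len(motif) + 1)
        if s[i:i + len(motif)] == motif
    )

    expected = n_positions * p
    # Assumption: binomial SD. The source does not specify the SD formula.
    sd = math.sqrt(n_positions * p * (1.0 - p))
    if sd == 0:
        raise ValueError("SD is zero; the motif cannot be scored with these sequences")
    return (observed - expected) / sd
```

A positive value indicates that the motif occurs more often than the base composition alone would predict, and a negative value indicates that it occurs less often.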

All sequences in group I have a central A4-A5 step, whereas all sequences in group II have a central A4-T5 step. This observation is still valid, but less distinctive, when we analyze the co-crystal structures corresponding to the present ten sequences (Table 1). Looking at Table 1, we observe that the sequences belonging to group I all harbor an A-tract, defined as a DNA region consisting of four or more A's in a row. A-tracts are known to adopt a dominant unique structure, distinct from that of generic B-DNA (37), which is invariant and sequence-context independent (38). A-tracts may even confer unique structural properties on sequences adjacent to them. Thus, the variability in roll angle at these positions is 5-fold larger for group II sequences than for group I sequences (compare the SD values of group I with those of group II).

Moreover, the average deviation of the roll angle along any one crystal structure corresponding to the sequences studied here is also larger for group II than for group I. Large roll fluctuations are commonly associated with conformational flexibility.

We have used the values given by Packer et al. We calculated the flexibility with respect to slide of the ten sequences studied here by summing the values of the component tetranucleotides in a sliding window along the core 8 bp of each sequence. We then averaged the values within each five-sequence group.
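The sliding-window sum described above can be sketched as follows. This is a minimal illustration, not the authors' code: the function names are hypothetical, and the lookup table `tetra_flex` must be filled with the published tetranucleotide values, which are not reproduced in this text.

```python
def slide_flexibility(core, tetra_flex):
    """Sum the per-tetranucleotide values over all 4-bp windows of `core`."""
    window = 4
    return sum(
        tetra_flex[core[i:i + window]]
        for i in range(len(core) - window + 1)
    )


def group_mean(cores, tetra_flex):
    """Average the summed flexibility over the sequences of one group."""
    scores = [slide_flexibility(core, tetra_flex) for core in cores]
    return sum(scores) / len(scores)


# Usage with a placeholder table: every tetranucleotide is given the
# dummy value 1.0 (NOT real data), so the result only counts windows.
core = "TATAAAAG"  # illustrative 8-bp core
placeholder = {core[i:i + 4]: 1.0 for i in range(len(core) - 3)}
print(slide_flexibility(core, placeholder))  # an 8-bp core has 5 windows
```

An 8-bp core contains five overlapping tetranucleotides, so each sequence score is the sum of five table entries.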

Thus, group I is significantly more rigid than group II, also with respect to slide.

Figure legend: Shown are the relative mobilities of the bound DNA divided by the relative mobilities of the free DNA, as a function of the linker length. The values shown are from one representative experiment out of 3 to 4 independent experiments. The line is the best fit to a cosine function.

These results are similar to those obtained in the study of Wu et al. They contrast with the crystallographic study of Patikoglou et al. Two different explanations are possible for this discrepancy.

First, the difference may be due, at least partly, to the writhed DNA structure observed in the crystalline state. Variations in electrophoretic mobility between DNA fragments of the same length are related to differences in the mean square end-to-end distance of the molecules. The difference between the crystallographic and solution results may thus be partly attributed to the difference in the outcome of projecting a 3D curve onto a 2D plane in the two methods.

However, no sequence-dependent differences in writhe have been observed in the crystallographic studies. The second explanation rests on findings reported by Wu et al. The data presented here support this option and point to the last base-pair step, position 7-8, as the origin of the sequence-dependent pattern.

We have determined the rate of dissociation of yTBPc from all variants studied here by gel electrophoresis (Figure 2), as previously described. However, we could not fit the data of two of the variants (T7 and T8) well to a one-phase dissociation equation, as we did in our previous study. Hence, we analyzed these two variants with a two-phase kinetic equation (Figure 3).
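For orientation, a two-phase dissociation model of the kind referred to above is commonly written as a sum of two exponential decays. The sketch below assumes that form. The parameter names (`fast_fraction`, `k_fast`, `k_slow`) are hypothetical, and the text does not state the authors' exact parameterization or fitting procedure.

```python
import math


def fraction_bound(t, fast_fraction, k_fast, k_slow):
    """Fraction bound at time t relative to time zero, two-phase model.

    fast_fraction is the share of complexes that dissociate with rate
    k_fast; the remainder dissociates with rate k_slow. Setting
    fast_fraction to 0 recovers the one-phase (single exponential) model.
    """
    return (fast_fraction * math.exp(-k_fast * t)
            + (1.0 - fast_fraction) * math.exp(-k_slow * t))


def half_life(k):
    """Half-life of a single exponential phase with rate constant k."""
    return math.log(2) / k
```

In practice the three parameters would be estimated by least-squares fitting of this function to the measured fractions, for example with a general-purpose curve-fitting routine.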

Consequently, we have also re-analyzed the remaining data with a two-phase kinetic equation (Figure 3).

Figure legend (gels): The number below each gel denotes the time after adding competitor DNA.

Figure legend (plots): The fraction of molecules bound to consensus-like TATA-box variants at time t, divided by the fraction of molecules bound at time zero, is plotted as a function of time. The lines are the best fit to a double exponential curve. The experimental points shown are from only one experiment, out of 3 to 6 independent experiments conducted with each DNA target. Hence, they may deviate slightly from the averaged values presented in Table 1.

There are a priori two likely interpretations for the observed behavior. The first is that the two phases reflect dissociation from the core TATA box and dissociation from the sequences flanking it. According to this hypothesis, the T7 and T8 variants have very low sequence specificity, so dissociation from their TATA-box region and dissociation from the flanking sequences have similar, short half-lives.

A second possible explanation is that dissociation proceeds through a complex mechanism with several intermediates (see, for example, Parkhurst et al.). To differentiate between these two possibilities, we have studied TATA-box variants with methylated cytosines in their flanking sequences.

If the biphasic behavior observed for the T7 and T8 variants is due to dissociation from specific sequences (the core TATA box) versus nonspecific sequences (the flanking sequences), then methylation of the flanking sequences would be expected to change the dissociation behavior. However, if the biphasic behavior of the T7 and T8 variants is due to different dissociation events from the TATA box itself, then we would expect similar dissociation behavior of methylated and unmethylated sequences.

MLP represents the TATA-box variants that undergo mainly the slow process, whereas T7 represents the variants that undergo mainly the fast process.

Figure legend: Dissociation kinetics experiments using methylated DNA targets. Right: stem of the DNA hairpin containing the T7 target with methylated cytosine residues. For other details see Figure 2.

These results indicate no significant difference between the dissociation kinetics of yTBPc complexes with regular TATA boxes and those with TATA boxes containing methylated cytosines in their flanking sequences. Thus, we can conclude that the two processes are dissociation events from complexes having different intercalation states.

Figure legend: The fraction of yTBPc molecules bound at time t divided by the fraction of molecules bound at time 0 is plotted as a function of time. For other details see Figure 3.

In the binding scheme of Powell et al., the next step in the AdE4 target is the ordering of the flexible T-A steps and the intercalation of the second phenylalanine pair between the nucleotides at positions 7 and 8. The interaction is essentially complete before the final step, which consists of further structural and energetic adjustments.

In the more rigid AdMLP target, the second step is not as facile, and thus the intercalation step is delayed until the final step. We propose that in group I, the second intercalation event is delayed relative to that in group II. This is similar to the relationship found by Starr et al.

However, in group II other structural correlations emerge. Here the most rigid dinucleotide A-A forms the most stable complex with TBP, and the most flexible dinucleotide T-G forms the weakest complex.

In group II, tetranucleotides with the lowest conformational energy (minimized with respect to all six base-pair step parameters for the central dinucleotide) form the most stable complexes with TBP. Since sequences in group II are on the whole more flexible than canonical B-DNA, and in particular more flexible than those of group I, it is logical that binding stability correlates with lower conformational energy, as well as with greater rigidity with respect to slide.

In group I, the sequences are on the whole more rigid. Thus, the most stable binder is also the most bent one, which probably forms the most intimate interface. The sequences in group II, however, are highly flexible and are thus found in a variety of conformations. They may not be correctly oriented after the first intercalation event, and so they need a firm anchor to grip for the second intercalation event to occur. Thus, the most rigid sequence, the one with the AAA tract, is the most stable binder.

Hence, in this more flexible group the correlation is between binding stability and rigidity relative to the other group members. Berg and von Hippel (30, 31) were the first to link the statistics of binding-site occurrences to their binding free energy.

In group I we see such correlations, both at the 8-bp level and at the dinucleotide level (Table 1). No such correlation is observed among group II sequences. The matrix elements are the log-odds ratios per base pair and per position. This degenerate consensus sequence includes all high- and moderate-probability mononucleotide combinations appearing in the base frequency table of Bucher. Such sequences are found in the EPD when we take only identified sequences belonging to known homology groups, and only one sequence per homology group.

The matrix elements are the maximum probability estimates of the binding energy contribution of each base at each position, under the assumption that each position contributes independently to the total binding energy. The sum of the element-wise product between this matrix and a matrix containing only 0's and 1's, which encodes a sequence studied here, gives an informational score for that sequence, which is the calculated total binding energy for that sequence. If the additivity assumption holds true for the studied binding sites, the informational score of these sequences should correlate with their measured binding affinity.
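The scoring step described above, a position-specific matrix multiplied element-wise by a 0/1 encoding of the sequence and then summed, can be sketched as follows. This is an illustration under assumptions: the function names are hypothetical, the matrix is built here as simple log-odds values with a pseudocount, and the background frequencies are supplied by the caller. The sign convention (higher score means a closer match) is also an assumption; an energy-like convention would flip the sign.

```python
import math

BASES = "ACGT"


def log_odds_matrix(sites, background, pseudocount=1.0):
    """Build a per-position log-odds matrix from aligned, equal-length sites."""
    matrix = []
    denom = len(sites) + pseudocount * len(BASES)
    for pos in range(len(sites[0])):
        column = [site[pos] for site in sites]
        row = {
            base: math.log(((column.count(base) + pseudocount) / denom)
                           / background[base])
            for base in BASES
        }
        matrix.append(row)
    return matrix


def one_hot(seq):
    """Encode a sequence as rows of 0's and 1's, one row per position."""
    return [[1 if base == b else 0 for b in BASES] for base in seq]


def information_score(matrix, seq):
    """Sum of the element-wise product of the matrix and the 0/1 encoding."""
    encoded = one_hot(seq)
    return sum(
        matrix[pos][base] * encoded[pos][j]
        for pos in range(len(seq))
        for j, base in enumerate(BASES)
    )
```

Because each position contributes one term, this score is additive by construction. A dinucleotide version would use a matrix indexed by adjacent base pairs instead of single bases.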

When we have indications of non-additivity in protein-DNA interactions, we can correct for nearest-neighbor interactions by calculating the dinucleotide information score for these sequences.

We thank Andrey Shlyakhtenko for the detailed explanation of their statistical algorithm, Fuchun Huang for helpful discussion, and Elicia Lanham for improving the English of this paper.

Correspondence to Wei Shi. WS conceived the study, implemented the algorithm and drafted the manuscript. WZ supervised the project. All authors contributed to, read and approved the final manuscript.

Shi, W. BMC Bioinformatics 7, S2 (Volume 7, Supplement 4). Published: 12 December.

Results (abstract): In this study, we conduct a dedicated analysis of the frequency distribution of the TATA box and its extension sequences in human promoters.

Background: Transcription factor binding sites (TFBSs) play a very important role in the regulation of gene expression.

Results: Promoters that have been aligned to their TSSs are divided into a number of bins, each of which contains 20 bp from each gene.
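The binning scheme described above can be sketched as follows. This is a minimal illustration with assumptions: the function name is hypothetical, all promoters are taken to be equal-length strings already aligned to the TSS (so the same index means the same position relative to the TSS), and a motif occurrence is assigned to the bin that contains its first base.

```python
def motif_counts_per_bin(promoters, motif, bin_size=20):
    """Count occurrences of `motif` in each `bin_size`-bp bin across promoters."""
    n_bins = len(promoters[0]) // bin_size
    counts = [0] * n_bins
    for seq in promoters:
        for start in range(len(seq) - len(motif) + 1):
            if seq[start:start + len(motif)] == motif:
                # Assumption: assign the hit to the bin of its start position.
                bin_index = start // bin_size
                if bin_index < n_bins:
                    counts[bin_index] += 1
    return counts
```

Dividing each count by the number of promoters gives a per-bin frequency that can be compared across data sets of different sizes.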

Frequency distribution of A, T, G and C in human promoters

First of all, we determine the distribution of each of the four single bases A, T, G and C in each of the six data sets.

Conclusion: This paper uses a statistical approach to analyze the frequency distribution of TATA elements and TATA extension sequences on the promoters of human and three other organisms.

Human housekeeping genes and tissue-specific genes: Housekeeping genes and tissue-specific genes were collected from the literature [6-9].

The results from the transcription assays performed with the whole-cell extract show that transcription levels normalized to the wild-type TATA box were overall higher than in the purified system. This pattern was largely replicated using the same promoter in a reporter assay system in vivo.

The source of this difference is unknown, but one explanation could be that there is a factor in vivo which has a negative effect on initiation of certain weak TATA boxes.

Perhaps this factor is not present in its active form in the extract that was used in our experiments. Given the extensive coupling between mRNA transcription and processing (61), it is also possible that weak TATA boxes could lead to less efficiently processed, and hence less stable, transcripts in vivo. Lastly, chromatin-based mechanisms of repression have been shown to suppress transcription initiation from weak promoter-like elements (62). These processes would, of course, not be reflected in our nonchromatin transcription system in vitro.

The relative lack of sensitivity to nonconsensus TATA mutants in our extract and transcription assays in vivo is somewhat surprising given the greater sensitivity to TATA box mutations observed in some other yeast studies.

The ability of the extract to support higher levels of relative transcription than the purified system for nonconsensus TATA boxes suggests that there are additional factors that assist in transcription of weak TATA-containing or TATA-less promoters. Fractionating these extracts and adding the fractions back to our purified system is a promising approach for future attempts to identify these factors.

We thank Dr S. We also thank Benjamin Guidi for assistance in the preparation of purified general transcription factors.

Gudrun Bjornsdottir and Lawrence C. Nucleic Acids Res. Published online Apr 1.

G-less cassette transcription assays

Transcription activities of CYC1 TATA box mutants with purified factors were measured in a G-less cassette assay as previously described (47), with the following modifications.

Mediator increases the selectivity for consensus bases at the sixth and eighth position of the TATA box, but neither Mediator nor Gal4-VP16 specifically compensates for low transcription levels of weak TATA boxes

The minimal purified basal transcription system is not capable of responding to DNA-bound transcriptional activators.

Transcription activities of TATA box mutants in vivo correlate with their activities in an extract-based system

To determine whether the results of either the extract transcription system or the purified transcription system accurately reflected the discrimination of TATA boxes in vivo, we performed a reporter assay with the CYC1 promoter TATA box and mutant variants.

Conflict of interest statement. None declared.



