CAAT box gene transcriptions


Editor-In-Chief: Henry A. Hoff

As a representative of the Metazoa, here is an image of a twaite shad (File:Alosa fallax.jpg). Credit: Hans Hillewaert.

A "CCAAT box (also sometimes abbreviated a CAAT box or CAT box) is a distinct pattern of nucleotides"[1] along the template strand of DNA in eukaryotes.

Boxes

A "repeating sequence of nucleotides that forms a transcription or a regulatory signal"[2] is a box.

Consensus sequences

In the direction of transcription on the template strand, the consensus sequence for a CAAT box is GGCCAATCT.[1]

On the coding strand "(T/C)G ATTGG (T/C)(T/C)(A/G) was the sequence that favored CBF binding [in the mouse pro-α2(1) collagen promoter]."[3] On the template strand, this is (C/T)(A/G)(A/G)CCAATC(A/G). "[T]he favorable sequence for CBF binding was TG ATTGG (T/C)(T/C)(A/G)."[3]
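The conversion between the two strands quoted above can be checked mechanically. The following is a minimal Python sketch, not the Basic programs used on this page; the function names tokenize, reverse_complement and render are illustrative assumptions. It reverses the order of the positions in the bracketed consensus and complements every allowed base.

```python
import re

# Watson-Crick complements for the four unambiguous bases.
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def tokenize(consensus):
    """Split a consensus such as '(T/C)G ATTGG' into one set of allowed bases per position."""
    pairs = re.findall(r"\(([ACGT](?:/[ACGT])+)\)|([ACGT])", consensus.replace(" ", ""))
    return [set(group.split("/")) if group else {single} for group, single in pairs]


def reverse_complement(consensus):
    """Reverse the order of the positions and complement every allowed base."""
    return [{COMPLEMENT[base] for base in position} for position in reversed(tokenize(consensus))]


def render(positions):
    """Write positions back in bracketed notation, alternatives sorted alphabetically."""
    return "".join(
        next(iter(p)) if len(p) == 1 else "(" + "/".join(sorted(p)) + ")"
        for p in positions
    )


print(render(reverse_complement("(T/C)G ATTGG (T/C)(T/C)(A/G)")))
# prints (C/T)(A/G)(A/G)CCAATC(A/G)
```

The printed string is the same ten-position pattern given above for the template strand.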

The upstream activating sequence (UAS) for the [heme-activated protein] Hap4p is CCAAT.[4]

Core promoters

Notation: let the symbol CBF represent the CAAT-box binding factor.

A CAAT box when present occurs "upstream by 75-80 bases to the initial transcription site."[1]
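To make the distance statement concrete, here is a small Python sketch that reports where CCAAT (or ATTGG) motifs lie relative to a transcription start site. The toy promoter string, the start-site index and the convention that the start site is +1 with no position 0 are all assumptions made for illustration; the default window of 75 to 80 bases upstream is the one quoted above.

```python
def relative_hits(sequence, tss_index, motifs=("CCAAT", "ATTGG")):
    """Return (motif, position) pairs, with positions counted relative to the start site.

    tss_index is the 0-based index of the first transcribed base (+1);
    the base just before it is -1 (assumed convention, no position 0).
    """
    hits = []
    for motif in motifs:
        start = sequence.find(motif)
        while start != -1:
            offset = start - tss_index
            hits.append((motif, offset if offset < 0 else offset + 1))
            start = sequence.find(motif, start + 1)
    return sorted(hits, key=lambda hit: hit[1])


def in_window(position, far=-80, near=-75):
    """True when a motif starts between `far` and `near` bases upstream (inclusive)."""
    return far <= position <= near


# Hypothetical promoter: filler bases with one CCAAT placed 80 bases upstream.
promoter = "G" * 20 + "CCAAT" + "G" * 75 + "A" + "G" * 10
for motif, position in relative_hits(promoter, tss_index=100):
    print(motif, position, in_window(position))
```

Passing other bounds to in_window allows the same check against a different upstream range.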

"In many eukaryotic class II promoters, CCAAT motifs are often found between 50 and 100 nucleotides upstream of the transcription start site (17-20), and these motifs are recognized by different classes of CCAAT-binding proteins, one of which is CBF."[5]

"In many higher eukaryotic class II promoters, CCAAT motifs (or ATTGG motifs in the opposite strand), are often found between −50 and −110 relative to the start of transcription (1-4). The precise location of these CCAAT motifs and the promoter sequences around the motif of a specific gene are highly conserved during evolution."[3]

"In metazoa, the CBF-DNA complex is characterized by its requirement for a high degree of conservation within the binding motif CCAAT (7, 21, 22), and sequences surrounding the pentameric motif contribute to the binding specificity (Ref. 16 and references therein)."[5]

"Computer analysis of 502 unrelated RNA polymerase II promoter regions showed that approximately 30% of the promoters contained a CCAAT sequence (or ATTGG sequence on the complementary strand) and that in a large number of vertebrate promoters the CCAAT motif was located around nucleotide −80 upstream of the transcription start site (4)."[3]

"[I]n most of these promoters the flanking sequences of ATTGG were TG on the 5′ side and (T/C)(T/C)(A/G) on the 3′ side".[3]

"[T]he CCAAT-flanking sequences [occur] around the CCAAT motifs in most eukaryotic promoters harboring a CCAAT sequence in these proximal promoters."[3]

"In contrast to many animal CCAAT motifs, the majority of the plant sequences contain only one C or lack a CAAT-box completely."[5]

Gene transcriptions

"Genes that have this element seem to require it for the gene to be transcribed in sufficient quantities. It is frequently absent from genes that encode proteins used in virtually all cells. This box along with the GC box is known for binding general transcription factors. CAAT and GC are primarily located in the region from 100-150bp upstream from the TATA box. Both of these consensus sequences belong to the regulatory promoter. Full gene expression occurs when transcription activator proteins bind to each module within the regulatory promoter. Protein specific binding is required for the CCAAT box activation. These proteins are known as CCAAT box binding proteins/CCAAT box binding factors."[1]

Cadherins

"Transcriptional downregulation of E-cadherin appears to be an important event in the progression of various epithelial tumors. SIP1 (ZEB-2) is a Smad-interacting, multi-zinc finger protein that shows specific DNA binding activity. [Expression] of wild-type but not of mutated SIP1 downregulates mammalian E-cadherin transcription via binding to both conserved E2 boxes of the minimal E-cadherin promoter."[6]

"Analysis of mouse and human E-cadherin promoters revealed a conserved modular structure with positive regulatory elements including two E2 boxes (CACCTG) with a potential repressor role Behrens et al. 1991, Giroldi et al. 1997."[6]

"The two E2 boxes in the mouse and human E-cadherin promoter sequences were demonstrated to play a crucial role in the epithelial-specific expression of E-cadherin Behrens et al. 1991, Giroldi et al. 1997. Mutation of these sequence elements results in upregulation of the E-cadherin promoter in dedifferentiated cancer cells, whereas the wild-type promoter shows low activity in such cells. Recently, it was shown that the zinc finger transcriptional repressor Snail can downregulate E-cadherin by binding to the E boxes in the E-cadherin promoter Batlle et al. 2000, Cano et al. 2000. Human Snail belongs to a family of zinc finger proteins, which contain four or five zinc finger domains of the C2H2 type at their C-terminal end. These zinc fingers bind to the CANNTG sequence in E box motifs."[6]

"δEF1 and SIP1 have been shown to bind spaced CACCT DNA sequences, including E2 boxes (CACCTG), by their zinc finger clusters (Remacle et al., 1999)."[6]

"To address the specificity of SIP1 action, mutagenesis of the E-cadherin promoter in either its upstream E2 box 1 (−75) or its downstream E2 box 3 (−25), or in both E2 boxes was performed [...]."[6]

Wild-type "SIP1 represses the E-cadherin promoter, likely through binding via both zinc finger clusters to spaced E2 boxes as demonstrated previously (Remacle et al., 1999) and confirmed here by a DNA-mediated pull-down assay of SIP1 protein [...]. Wild-type but not mutated SIP1 from transfected human cells could be efficiently precipitated by biotinylated E-cadherin promoter oligonucleotides, comprising two wild-type E2 box sequences. Mutation of the E2 boxes resulted in the loss of SIP1 binding."[6]

Human E2 boxes are E2-box 1 (GCAGGTGA), E2-box 2 (TGGCCGGC) and E2-box 3 (TCACCTGG).[6]
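As a quick check of the three human sequences just listed, the sketch below tests each one for the general E box pattern CANNTG, for the E2 box CACCTG, and for its reverse complement CAGGTG. It is an illustrative Python fragment; the names E_BOX, human_e2_boxes and describe are assumptions, not part of the cited work.

```python
import re

E_BOX = re.compile(r"CA[ACGT]{2}TG")  # general E box, CANNTG
E2_BOX = "CACCTG"                      # E2 box as quoted above
E2_BOX_RC = "CAGGTG"                   # reverse complement of CACCTG

human_e2_boxes = {
    "E2-box 1": "GCAGGTGA",
    "E2-box 2": "TGGCCGGC",
    "E2-box 3": "TCACCTGG",
}


def describe(sequence):
    """Report which of the three patterns occur in the given sequence."""
    return {
        "CANNTG": bool(E_BOX.search(sequence)),
        "CACCTG": E2_BOX in sequence,
        "CAGGTG": E2_BOX_RC in sequence,
    }


for name, sequence in human_e2_boxes.items():
    print(name, sequence, describe(sequence))
```

On the sequences as listed, E2-box 3 contains CACCTG, E2-box 1 contains the reverse complement CAGGTG, and E2-box 2 contains no CANNTG hexamer.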

"Alignment of the E-cadherin promoter sequences of dog, mouse, and man. Conserved regulatory elements are indicated: E2 boxes 1 and 3, CCAAT box, and GC box. The E2 box 2 has been described as part of a palindromic E-pal sequence in the mouse E-cadherin promoter (Behrens et al., 1991), but is conserved neither in canine nor in human sequences."[6]

Human NeuroD (BETA2/BHF1) genes

"There was no consensus CAAT box. [...] In addition, we performed mutation analyses of the E2 box and the E3 box to evaluate whether the E2 and E3 boxes regulate the transcriptional activity of the human NeuroD gene [...]."[7]

Human glucocerebrosidase genes

The "5′ genomic sequences revealed promoter elements containing a TATA box at nucleotides −23 to −27 and a CAAT box between nucleotides [...] and an E2 box [...]."[8]

Cap signal elements

"Studies have reported that the cap signal element with the TATA-box, CAAT-box, and GC-box is the most general element of the POL II promoter and exists in major protein [...]."[9]

Hypotheses

  1. A1BG is not transcribed by a CAAT box.

A1BG samplings

A CCAAT box (also sometimes abbreviated a CAAT box or CAT box) is a distinct pattern of nucleotides along the template strand of DNA in eukaryotes.

On the template strand, the CAAT box consensus sequence is 3'-(C/T)(A/G)(A/G)CCAATC(A/G)-5'.

The Basic programs (starting with SuccessablesCAAT.bas) compare nucleotide sequences with the sequences on either the template strand (-) or the coding strand (+) of the DNA, in the negative direction (-) or the positive direction (+). In the results, the first sign gives the strand and the second sign gives the direction. The sequences the programs look for, and the numbers found, are:

  1. CAAT - 3'-(C/T)(A/G)(A/G)CCAATC(A/G)-5', -- there are zero, -+ there are zero, +- there are zero, ++ there are zero.
  2. CAAT - 3'-(A/G)(C/T)(C/T)GGTTAG(C/T)-5', complement, -- there are zero, -+ there are zero, +- there are zero, and ++ there are zero.
  3. CAAT - 3'-(A/G)-C-T-A-A-C-C-(A/G)-(A/G)-(C/T)-5', inverse, -- there are zero, -+ there are zero, +- there are zero, and ++ there are zero.
  4. CAAT - 3'-(C/T)-G-A-T-T-G-G-(C/T)-(C/T)-(A/G)-5', complement inverse, -- there are zero, -+ there are zero, +- there are zero, and ++ there are zero.
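The four patterns in the list above can be derived from the first one and searched for with regular expressions. The following Python sketch is only an illustration of that idea and is not the SuccessablesCAAT.bas code; the helper names, the toy sequence and the 1-based position convention are assumptions.

```python
import re

COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}

# The consensus from item 1, one string of allowed bases per position, in written order.
CAAT = ["CT", "AG", "AG", "C", "C", "A", "A", "T", "C", "AG"]


def complement(positions):
    """Complement every allowed base, keeping the order of positions."""
    return ["".join(sorted(COMPLEMENT[base] for base in p)) for p in positions]


def inverse(positions):
    """Reverse the order of positions without complementing."""
    return list(reversed(positions))


def to_regex(positions):
    """Turn the positions into a regular expression with character classes."""
    return "".join(p if len(p) == 1 else "[" + p + "]" for p in positions)


def find_all(positions, sequence):
    """Return 1-based start positions of all matches, including overlapping ones."""
    pattern = re.compile("(?=(" + to_regex(positions) + "))")
    return [match.start() + 1 for match in pattern.finditer(sequence.upper())]


variants = {
    "direct": CAAT,
    "complement": complement(CAAT),
    "inverse": inverse(CAAT),
    "complement inverse": complement(inverse(CAAT)),
}

toy_sequence = "AATGATTGGCCATT"  # hypothetical test string, not A1BG sequence
for name, positions in variants.items():
    print(name, to_regex(positions), find_all(positions, toy_sequence))
```

Running the four searches over each strand and direction of a real sequence would give sixteen counts corresponding to those listed above.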

When each SuccessablesCAAT.bas program is extended from 958 to 4445 nucleotides (nts), starting just beyond ZNF497, the results do not change.

Copying the consensus sequence 5'-CAAT-3' and pasting it into the "⌘F" find function finds seven locations between ZNF497 and A1BG and no locations between ZSCAN22 and A1BG, as can be found by the computer programs.

Transcribed CAAT boxes

Gene ID: 1051 is CEBPB CCAAT enhancer binding protein beta aka TCF5; NF-IL6; TF5: "This intronless gene encodes a transcription factor that contains a basic leucine zipper (bZIP) domain. The encoded protein functions as a homodimer but can also form heterodimers with CCAAT/enhancer-binding proteins alpha, delta, and gamma. Activity of this protein is important in the regulation of genes involved in immune and inflammatory responses, among other processes. The use of alternative in-frame AUG start codons results in multiple protein isoforms, each with distinct biological functions."[10]

  1. NP_001272807.1 CCAAT/enhancer-binding protein beta isoform b: "Transcript Variant: This variant (1) encodes multiple isoforms through the use of alternative translation initiation codons. The isoform [b, also known as LAP (liver activating protein)] represented in this RefSeq results from translation initiation at a downstream AUG start codon. Isoform b has a shorter N-terminus, compared to isoform a."[10]
  2. NP_001272808.1 CCAAT/enhancer-binding protein beta isoform c: "Transcript Variant: This variant (1) encodes multiple isoforms through the use of alternative translation initiation codons. The isoform [c, also known as LIP (liver inhibitory protein)] represented in this RefSeq results from translation initiation at a downstream AUG start codon. Isoform c has a shorter N-terminus, compared to isoform a."[10]
  3. NP_005185.2 CCAAT/enhancer-binding protein beta isoform a: "Transcript Variant: This variant (1) encodes multiple isoforms through the use of alternative translation initiation codons. The isoform (a, also known as LAP*) represented in this RefSeq results from translation initiation at the 5' most AUG start codon and is the longest isoform."[10]

Heme-activated protein (Hap) samplings

Copying the consensus sequence for the Hap4p CCAAT and pasting it into the "⌘F" find function finds one location between ZNF497 and A1BG and no locations between ZSCAN22 and A1BG, as can be found by the computer programs.

The Basic programs testing the consensus sequence CCAAT (starting with Successables CCAAT.bas) compare nucleotide sequences with the sequences on either the template strand (-) or the coding strand (+) of the DNA, in the negative direction (-) or the positive direction (+). The sequences the programs look for, and the numbers and positions found, are:

  1. negative strand, negative direction, looking for CCAAT, 0.
  2. positive strand, negative direction, looking for CCAAT, 1, CCAAT at 2848.
  3. negative strand, positive direction, looking for CCAAT, 1, CCAAT at 2908.
  4. positive strand, positive direction, looking for CCAAT, 1, CCAAT at 3024.
  5. inverse complement, negative strand, negative direction, looking for ATTGG, 1, ATTGG at 614.
  6. inverse complement, positive strand, negative direction, looking for ATTGG, 3, ATTGG at 3529, ATTGG at 1045, ATTGG at 643.
  7. inverse complement, negative strand, positive direction, looking for ATTGG, 1, ATTGG at 24.
  8. inverse complement, positive strand, positive direction, looking for ATTGG, 0.

Hap (4560-2846) UTRs

  1. Positive strand, negative direction: ATTGG at 3529, CCAAT at 2848.

Hap negative direction (2596-1) distal promoters

  1. Negative strand, negative direction: ATTGG at 614.
  2. Positive strand, negative direction: ATTGG at 1045, ATTGG at 643.

Hap positive direction (4050-1) distal promoters

  1. Negative strand, positive direction: CCAAT at 2908, ATTGG at 24.
  2. Positive strand, positive direction: CCAAT at 3024.
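The grouping of hits by region can be sketched as a simple lookup. The region ranges below are taken from the section headings on this page (for example 4560-2846 for UTRs and 2596-1 for negative-direction distal promoters). Assigning each shared boundary value to the region on its lower side is an assumption, as are the names REGIONS and classify.

```python
# Ranges from the section headings; boundary handling is an assumption:
# a shared endpoint such as 2596 is assigned to the lower region only.
REGIONS = {
    "negative": [
        ("distal", 1, 2596),
        ("proximal", 2597, 2811),
        ("core", 2812, 2846),
        ("UTR", 2847, 4560),
    ],
    "positive": [
        ("distal", 1, 4050),
        ("proximal", 4051, 4265),
        ("core", 4266, 4445),
    ],
}


def classify(position, direction):
    """Return the region name for a position, or None if it lies outside all ranges."""
    for name, low, high in REGIONS[direction]:
        if low <= position <= high:
            return name
    return None


# Positions of the real CCAAT and ATTGG hits reported above.
for position in (3529, 2848, 614, 1045, 643):
    print(position, "negative", classify(position, "negative"))
for position in (2908, 24, 3024):
    print(position, "positive", classify(position, "positive"))
```

With these ranges the real hits fall into the UTR and distal groups only, matching the lists above.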

Hap random dataset samplings

  1. CCAATr0: 7, CCAAT at 4457, CCAAT at 3878, CCAAT at 3351, CCAAT at 2733, CCAAT at 2614, CCAAT at 2492, CCAAT at 2231.
  2. CCAATr1: 5, CCAAT at 3303, CCAAT at 2599, CCAAT at 2563, CCAAT at 786, CCAAT at 296.
  3. CCAATr2: 7, CCAAT at 4397, CCAAT at 3960, CCAAT at 3882, CCAAT at 2975, CCAAT at 1612, CCAAT at 1461, CCAAT at 1354.
  4. CCAATr3: 9, CCAAT at 4252, CCAAT at 3740, CCAAT at 3367, CCAAT at 3043, CCAAT at 1728, CCAAT at 1670, CCAAT at 1277, CCAAT at 830, CCAAT at 236.
  5. CCAATr4: 7, CCAAT at 4145, CCAAT at 2480, CCAAT at 2325, CCAAT at 2107, CCAAT at 1784, CCAAT at 1761, CCAAT at 717.
  6. CCAATr5: 8, CCAAT at 4474, CCAAT at 4156, CCAAT at 2635, CCAAT at 2225, CCAAT at 1869, CCAAT at 1463, CCAAT at 911, CCAAT at 192.
  7. CCAATr6: 10, CCAAT at 4188, CCAAT at 3103, CCAAT at 3037, CCAAT at 2486, CCAAT at 2228, CCAAT at 1955, CCAAT at 1646, CCAAT at 1626, CCAAT at 740, CCAAT at 692.
  8. CCAATr7: 8, CCAAT at 4452, CCAAT at 4262, CCAAT at 3779, CCAAT at 3386, CCAAT at 2924, CCAAT at 2355, CCAAT at 1561, CCAAT at 952.
  9. CCAATr8: 11, CCAAT at 3938, CCAAT at 3898, CCAAT at 3047, CCAAT at 2713, CCAAT at 2420, CCAAT at 2219, CCAAT at 1451, CCAAT at 1420, CCAAT at 1266, CCAAT at 1129, CCAAT at 604.
  10. CCAATr9: 3, CCAAT at 3562, CCAAT at 3429, CCAAT at 1526.
  11. CCAATr0ci: 5, ATTGG at 3851, ATTGG at 3203, ATTGG at 1776, ATTGG at 379, ATTGG at 81.
  12. CCAATr1ci: 3, ATTGG at 3814, ATTGG at 2647, ATTGG at 314.
  13. CCAATr2ci: 4, ATTGG at 3592, ATTGG at 1415, ATTGG at 1335, ATTGG at 755.
  14. CCAATr3ci: 11, ATTGG at 2942, ATTGG at 2842, ATTGG at 2826, ATTGG at 2389, ATTGG at 2353, ATTGG at 2050, ATTGG at 1231, ATTGG at 1163, ATTGG at 1048, ATTGG at 666, ATTGG at 182.
  15. CCAATr4ci: 4, ATTGG at 3576, ATTGG at 3097, ATTGG at 708, ATTGG at 651.
  16. CCAATr5ci: 3, ATTGG at 3810, ATTGG at 3172, ATTGG at 1888.
  17. CCAATr6ci: 10, ATTGG at 4430, ATTGG at 4099, ATTGG at 3233, ATTGG at 2516, ATTGG at 2495, ATTGG at 2489, ATTGG at 1389, ATTGG at 1326, ATTGG at 1026, ATTGG at 604.
  18. CCAATr7ci: 3, ATTGG at 4180, ATTGG at 3162, ATTGG at 2457.
  19. CCAATr8ci: 7, ATTGG at 4023, ATTGG at 4007, ATTGG at 3239, ATTGG at 3117, ATTGG at 1977, ATTGG at 855, ATTGG at 204.
  20. CCAATr9ci: 7, ATTGG at 4435, ATTGG at 3992, ATTGG at 3722, ATTGG at 2889, ATTGG at 2851, ATTGG at 2329, ATTGG at 1804.
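How the random datasets on this page were generated is not stated. As a hedged illustration only, the Python sketch below builds ten random sequences and lists CCAAT and ATTGG positions in the same style as above. Uniform base frequencies, the length of 4560 nucleotides (taken from the UTR range in the section headings), the seeds and the function names are all assumptions, so the counts it prints are not expected to reproduce the lists above.

```python
import random


def random_sequence(length, seed):
    """Build a reproducible random nucleotide sequence (uniform A, C, G, T assumed)."""
    rng = random.Random(seed)
    return "".join(rng.choice("ACGT") for _ in range(length))


def positions_of(motif, sequence):
    """Return 1-based start positions of every occurrence of motif."""
    hits = []
    start = sequence.find(motif)
    while start != -1:
        hits.append(start + 1)
        start = sequence.find(motif, start + 1)
    return hits


datasets = {f"CCAATr{i}": random_sequence(4560, seed=i) for i in range(10)}
for name, sequence in datasets.items():
    direct = sorted(positions_of("CCAAT", sequence), reverse=True)
    inverse_complement = sorted(positions_of("ATTGG", sequence), reverse=True)
    print(name, len(direct), direct)
    print(name + "ci", len(inverse_complement), inverse_complement)
```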

Hapr arbitrary (evens) (4560-2846) UTRs

  1. CCAATr0: CCAAT at 4457, CCAAT at 3878, CCAAT at 3351.
  2. CCAATr2: CCAAT at 4397, CCAAT at 3960, CCAAT at 3882, CCAAT at 2975.
  3. CCAATr4: CCAAT at 4145.
  4. CCAATr6: CCAAT at 4188, CCAAT at 3103, CCAAT at 3037.
  5. CCAATr8: CCAAT at 3938, CCAAT at 3898, CCAAT at 3047.
  6. CCAATr0ci: ATTGG at 3851, ATTGG at 3203.
  7. CCAATr2ci: ATTGG at 3592.
  8. CCAATr4ci: ATTGG at 3576, ATTGG at 3097.
  9. CCAATr6ci: ATTGG at 4430, ATTGG at 4099, ATTGG at 3233.
  10. CCAATr8ci: ATTGG at 4023, ATTGG at 4007, ATTGG at 3239, ATTGG at 3117.

Hapr alternate (odds) (4560-2846) UTRs

  1. CCAATr1: CCAAT at 3303.
  2. CCAATr3: CCAAT at 4252, CCAAT at 3740, CCAAT at 3367, CCAAT at 3043.
  3. CCAATr5: CCAAT at 4474, CCAAT at 4156.
  4. CCAATr7: CCAAT at 4452, CCAAT at 4262, CCAAT at 3779, CCAAT at 3386, CCAAT at 2924.
  5. CCAATr9: CCAAT at 3562, CCAAT at 3429, CCAAT at 1526.
  6. CCAATr1ci: ATTGG at 3814.
  7. CCAATr3ci: ATTGG at 2942.
  8. CCAATr5ci: ATTGG at 3810, ATTGG at 3172.
  9. CCAATr7ci: ATTGG at 4180, ATTGG at 3162.
  10. CCAATr9ci: ATTGG at 4435, ATTGG at 3992, ATTGG at 3722, ATTGG at 2889, ATTGG at 2851.

Hapr alternate negative direction (odds) (2846-2811) core promoters

  1. CCAATr3ci: ATTGG at 2842, ATTGG at 2826.

Hapr arbitrary positive direction (odds) (4445-4265) core promoters

  1. CCAATr9ci: ATTGG at 4435.

Hapr alternate positive direction (evens) (4445-4265) core promoters

  1. CCAATr2: CCAAT at 4397.
  2. CCAATr6ci: ATTGG at 4430.

Hapr arbitrary negative direction (evens) (2811-2596) proximal promoters

  1. CCAATr0: CCAAT at 2733, CCAAT at 2614.
  2. CCAATr8: CCAAT at 2713.

Hapr alternate negative direction (odds) (2811-2596) proximal promoters

  1. CCAATr1: CCAAT at 2599.
  2. CCAATr5: CCAAT at 2635.
  3. CCAATr1ci: ATTGG at 2647.

Hapr arbitrary positive direction (odds) (4265-4050) proximal promoters

  1. CCAATr3: CCAAT at 4252.
  2. CCAATr5: CCAAT at 4156.
  3. CCAATr7: CCAAT at 4262.
  4. CCAATr7ci: ATTGG at 4180.

Hapr alternate positive direction (evens) (4265-4050) proximal promoters

  1. CCAATr4: CCAAT at 4145.
  2. CCAATr6: CCAAT at 4188.
  3. CCAATr6ci: ATTGG at 4099.

Hapr arbitrary negative direction (evens) (2596-1) distal promoters

  1. CCAATr0: CCAAT at 2492, CCAAT at 2231.
  2. CCAATr2: CCAAT at 1612, CCAAT at 1461, CCAAT at 1354.
  3. CCAATr4: CCAAT at 2480, CCAAT at 2325, CCAAT at 2107, CCAAT at 1784, CCAAT at 1761, CCAAT at 717.
  4. CCAATr6: CCAAT at 2486, CCAAT at 2228, CCAAT at 1955, CCAAT at 1646, CCAAT at 1626, CCAAT at 740, CCAAT at 692.
  5. CCAATr8: CCAAT at 2420, CCAAT at 2219, CCAAT at 1451, CCAAT at 1420, CCAAT at 1266, CCAAT at 1129, CCAAT at 604.
  6. CCAATr0ci: ATTGG at 1776, ATTGG at 379, ATTGG at 81.
  7. CCAATr2ci: ATTGG at 1415, ATTGG at 1335, ATTGG at 755.
  8. CCAATr4ci: ATTGG at 708, ATTGG at 651.
  9. CCAATr6ci: ATTGG at 2516, ATTGG at 2495, ATTGG at 2489, ATTGG at 1389, ATTGG at 1326, ATTGG at 1026, ATTGG at 604.
  10. CCAATr8ci: ATTGG at 1977, ATTGG at 855, ATTGG at 204.

Hapr alternate negative direction (odds) (2596-1) distal promoters

  1. CCAATr1: CCAAT at 2563, CCAAT at 786, CCAAT at 296.
  2. CCAATr3: CCAAT at 1728, CCAAT at 1670, CCAAT at 1277, CCAAT at 830, CCAAT at 236.
  3. CCAATr5: CCAAT at 2225, CCAAT at 1869, CCAAT at 1463, CCAAT at 911, CCAAT at 192.
  4. CCAATr7: CCAAT at 2355, CCAAT at 1561, CCAAT at 952.
  5. CCAATr9: CCAAT at 1526.
  6. CCAATr1ci: ATTGG at 314.
  7. CCAATr3ci: ATTGG at 2389, ATTGG at 2353, ATTGG at 2050, ATTGG at 1231, ATTGG at 1163, ATTGG at 1048, ATTGG at 666, ATTGG at 182.
  8. CCAATr5ci: ATTGG at 1888.
  9. CCAATr7ci: ATTGG at 2457.
  10. CCAATr9ci: ATTGG at 2329, ATTGG at 1804.

Hapr arbitrary positive direction (odds) (4050-1) distal promoters

  1. CCAATr1: CCAAT at 3303, CCAAT at 2599, CCAAT at 2563, CCAAT at 786, CCAAT at 296.
  2. CCAATr3: CCAAT at 3740, CCAAT at 3367, CCAAT at 3043, CCAAT at 1728, CCAAT at 1670, CCAAT at 1277, CCAAT at 830, CCAAT at 236.
  3. CCAATr5: CCAAT at 2635, CCAAT at 2225, CCAAT at 1869, CCAAT at 1463, CCAAT at 911, CCAAT at 192.
  4. CCAATr7: CCAAT at 3779, CCAAT at 3386, CCAAT at 2924, CCAAT at 2355, CCAAT at 1561, CCAAT at 952.
  5. CCAATr9: CCAAT at 3562, CCAAT at 3429, CCAAT at 1526.
  6. CCAATr1ci: ATTGG at 3814, ATTGG at 2647, ATTGG at 314.
  7. CCAATr3ci: ATTGG at 2942, ATTGG at 2842, ATTGG at 2826, ATTGG at 2389, ATTGG at 2353, ATTGG at 2050, ATTGG at 1231, ATTGG at 1163, ATTGG at 1048, ATTGG at 666, ATTGG at 182.
  8. CCAATr5ci: ATTGG at 3810, ATTGG at 3172, ATTGG at 1888.
  9. CCAATr7ci: ATTGG at 3162, ATTGG at 2457.
  10. CCAATr9ci: ATTGG at 3992, ATTGG at 3722, ATTGG at 2889, ATTGG at 2851, ATTGG at 2329, ATTGG at 1804.

Hapr alternate positive direction (evens) (4050-1) distal promoters

  1. CCAATr0: CCAAT at 3878, CCAAT at 3351, CCAAT at 2733, CCAAT at 2614, CCAAT at 2492, CCAAT at 2231.
  2. CCAATr2: CCAAT at 3960, CCAAT at 3882, CCAAT at 2975, CCAAT at 1612, CCAAT at 1461, CCAAT at 1354.
  3. CCAATr4: CCAAT at 2480, CCAAT at 2325, CCAAT at 2107, CCAAT at 1784, CCAAT at 1761, CCAAT at 717.
  4. CCAATr6: CCAAT at 3103, CCAAT at 3037, CCAAT at 2486, CCAAT at 2228, CCAAT at 1955, CCAAT at 1646, CCAAT at 1626, CCAAT at 740, CCAAT at 692.
  5. CCAATr8: CCAAT at 3938, CCAAT at 3898, CCAAT at 3047, CCAAT at 2713, CCAAT at 2420, CCAAT at 2219, CCAAT at 1451, CCAAT at 1420, CCAAT at 1266, CCAAT at 1129, CCAAT at 604.
  6. CCAATr0ci: ATTGG at 3851, ATTGG at 3203, ATTGG at 1776, ATTGG at 379, ATTGG at 81.
  7. CCAATr2ci: ATTGG at 3592, ATTGG at 1415, ATTGG at 1335, ATTGG at 755.
  8. CCAATr4ci: ATTGG at 3576, ATTGG at 3097, ATTGG at 708, ATTGG at 651.
  9. CCAATr6ci: ATTGG at 3233, ATTGG at 2516, ATTGG at 2495, ATTGG at 2489, ATTGG at 1389, ATTGG at 1326, ATTGG at 1026, ATTGG at 604.
  10. CCAATr8ci: ATTGG at 4023, ATTGG at 4007, ATTGG at 3239, ATTGG at 3117, ATTGG at 1977, ATTGG at 855, ATTGG at 204.

Hap analysis and results

The upstream activating sequence (UAS) for the [heme-activated protein] Hap4p is CCAAT.[4]

Reals or randoms | Promoters | Direction          | Numbers | Strands | Occurrences | Averages (± 0.1)
Reals            | UTR       | negative           | 2       | 2       | 1           | 1 ± 1 (--0,+-1)
Randoms          | UTR       | arbitrary negative | 26      | 10      | 2.6         | 2.6
Randoms          | UTR       | alternate negative | 26      | 10      | 2.6         | 2.6
Reals            | Core      | negative           | 0       | 2       | 0           | 0
Randoms          | Core      | arbitrary negative | 2       | 10      | 0.2         | 0.1
Randoms          | Core      | alternate negative | 0       | 10      | 0           | 0.1
Reals            | Core      | positive           | 0       | 2       | 0           | 0
Randoms          | Core      | arbitrary positive | 1       | 10      | 0.1         | 0.15
Randoms          | Core      | alternate positive | 2       | 10      | 0.2         | 0.15
Reals            | Proximal  | negative           | 0       | 2       | 0           | 0
Randoms          | Proximal  | arbitrary negative | 3       | 10      | 0.3         | 0.3
Randoms          | Proximal  | alternate negative | 3       | 10      | 0.3         | 0.3
Reals            | Proximal  | positive           | 0       | 2       | 0           | 0
Randoms          | Proximal  | arbitrary positive | 4       | 10      | 0.4         | 0.35
Randoms          | Proximal  | alternate positive | 3       | 10      | 0.3         | 0.35
Reals            | Distal    | negative           | 3       | 2       | 1.5         | 1.5 ± 0.5 (--1,+-2)
Randoms          | Distal    | arbitrary negative | 43      | 10      | 4.3         | 3.65
Randoms          | Distal    | alternate negative | 30      | 10      | 3.0         | 3.65
Reals            | Distal    | positive           | 3       | 2       | 1.5         | 1.5 ± 0.5 (-+2,++1)
Randoms          | Distal    | arbitrary positive | 54      | 10      | 5.4         | 6.0
Randoms          | Distal    | alternate positive | 66      | 10      | 6.6         | 6.0
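The arithmetic behind the Occurrences and Averages columns can be reproduced as follows. This Python sketch assumes, from the numbers in the table, that Occurrences is Numbers divided by Strands and that each random average is the mean of the arbitrary and alternate occurrences; the variable and function names are illustrative.

```python
STRANDS_RANDOM = 10  # random datasets per group, from the Strands column
STRANDS_REAL = 2     # real strands per direction, from the Strands column

# (promoter, direction): (arbitrary count, alternate count), copied from the table.
random_counts = {
    ("UTR", "negative"): (26, 26),
    ("Core", "negative"): (2, 0),
    ("Core", "positive"): (1, 2),
    ("Proximal", "negative"): (3, 3),
    ("Proximal", "positive"): (4, 3),
    ("Distal", "negative"): (43, 30),
    ("Distal", "positive"): (54, 66),
}

# (promoter, direction): total real count, copied from the table.
real_counts = {
    ("UTR", "negative"): 2,
    ("Core", "negative"): 0,
    ("Core", "positive"): 0,
    ("Proximal", "negative"): 0,
    ("Proximal", "positive"): 0,
    ("Distal", "negative"): 3,
    ("Distal", "positive"): 3,
}


def occurrences(count, strands):
    """Occurrences per strand or per dataset."""
    return count / strands


def random_average(arbitrary, alternate, strands=STRANDS_RANDOM):
    """Mean of the arbitrary and alternate occurrences."""
    return (arbitrary + alternate) / (2 * strands)


for key, (arbitrary, alternate) in random_counts.items():
    real = occurrences(real_counts[key], STRANDS_REAL)
    average = round(random_average(arbitrary, alternate), 2)
    print(key, "real:", real, "random average:", average, "real lower:", real < average)
```

For the UTR and distal rows the real occurrences (1 and 1.5) are lower than the random averages (2.6, 3.65 and 6.0).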

Comparison:

The real Hap UTRs and distal promoters have fewer occurrences than the random datasets. This suggests that the real Haps are likely active or activable.

Acknowledgements

The content on this page was first contributed by: Henry A. Hoff.

Initial content for this page in some instances came from Wikiversity.

References

  1. "CAAT box". San Francisco, California: Wikimedia Foundation, Inc. April 8, 2013. Retrieved 2013-04-14.
  2. "Box (disambiguation)". San Francisco, California: Wikimedia Foundation, Inc. May 23, 2013. Retrieved 2013-06-15.
  3. Weimin Bi, Ling Wu, Françoise Coustry, Benoit de Crombrugghe and Sankar N. Maity (October 17, 1997). "DNA Binding Specificity of the CCAAT-binding Factor CBF/NF-Y". The Journal of Biological Chemistry. 272 (42): 26562–72. doi:10.1074/jbc.272.42.26562. Retrieved 2013-04-14.
  4. Hongting Tang, Yanling Wu, Jiliang Deng, Nanzhu Chen, Zhaohui Zheng, Yongjun Wei, Xiaozhou Luo, and Jay D. Keasling (6 August 2020). "Promoter Architecture and Promoter Engineering in Saccharomyces cerevisiae". Metabolites. 10 (8): 320–39. doi:10.3390/metabo10080320. PMID 32781665. Retrieved 18 September 2020.
  5. Victor Kusnetsov, Martin Landsberger, Jörg Meurer and Ralf Oelmüller (December 10, 1999). "The Assembly of the CAAT-box Binding Complex at a Photosynthesis Gene Promoter Is Regulated by Light, Cytokinin, and the Stage of the Plastids". The Journal of Biological Chemistry. 274 (50): 36009–14. doi:10.1074/jbc.274.50.36009. Retrieved 2013-04-14.
  6. Joke Comijn, Geert Berx, Petra Vermassen, Kristin Verschueren, Leo van Grunsven, Erik Bruyneel, Marc Mareel, Danny Huylebroeck, Frans van Roy (June 2001). "The Two-Handed E Box Binding Zinc Finger Protein SIP1 Downregulates E-Cadherin and Induces Invasion". Molecular Cell. 7 (6): 1267–78. doi:10.1016/S1097-2765(01)00260-X. Retrieved 11 January 2019.
  7. Takafumi Miyachi, Hirofumi Maruyama, Takeshi Kitamura, Shigenobu Nakamura and Hideshi Kawakami (8 June 1999). "Structure and regulation of the human NeuroD (BETA2/BHF1) gene". Molecular Brain Research. 69 (2): 223–231. doi:10.1016/S0169-328X(99)00112-6. Retrieved 2 February 2019.
  8. Dan Moran, Emilia Galperin and Mia Horowitz (31 July 1997). "Identification of factors regulating the expression of the human glucocerebrosidase gene". Gene. 194 (2): 201–213. Retrieved 2 February 2019.
  9. Hyun-Jun Jang, Jin Won Choi, Young Min Kim, Sang Su Shin, Kichoon Lee and Jae Yong Han (November 2011). "Reactivation of Transgene Expression by Alleviating CpG Methylation of the Rous sarcoma virus Promoter in Transgenic Quail Cells". Molecular Biotechnology. 49 (3): 222–228. doi:10.1007/s12033-011-9393-7. Retrieved 2 February 2019.
  10. RefSeq (October 2013). "CEBPB CCAAT enhancer binding protein [ Homo sapiens (human) ]". 8600 Rockville Pike, Bethesda MD, 20894 USA: National Center for Biotechnology Information, U.S. National Library of Medicine. Retrieved 2 May 2020.
