Cell-cycle box gene transcriptions


Associate Editor(s)-in-Chief: Henry A. Hoff

"The 5' non-coding part contains the sequence elements characteristic for eukaryotic promoters such as TATA and CAAT boxes as well as an inverted motif typical for cell-cycle regulated genes named "cell-cycle box" (CCB). The consensus sequence of CCB is CACGAAAA (Nasmyth, 1985), however, more relaxed variants such as CACGAAA, ACGAAA and C-CGAAA were described in budding yeast CLN1 and CLN2 (Ogas et al, 1991)."[1]

Human genes

Gene expressions

Interactions

Consensus sequences

Binding site for

Inverse copies

"In our soybean genomic clone CCB is represented as an inverted motif TTTTGGTG at the -66 position. (Breeden and Nasmyth (1987) suggested that the "cell cycle box" is functional in either orientation, acting as an enhancer."[1]

Enhancer activity

Promoter occurrences

Hypotheses

  1. A1BG has no regulatory elements in either promoter.
  2. A1BG is not transcribed by a regulatory element.
  3. No regulatory element participates in the transcription of A1BG.

CCB Samplings

Copying the cell-cycle box consensus sequence CACGAAAA and putting it in "⌘F" finds no occurrences between ZNF497 and A1BG or between ZSCAN22 and A1BG, as the computer programs also find.

The Basic programs testing the consensus sequence CACGAAAA (starting with SuccessablesCCB.bas) were written to compare nucleotide sequences with the sequences on either the template strand (-) or the coding strand (+) of the DNA, in the negative direction (-) or the positive direction (+). The programs, the sequence each looked for, and the number found are:

  1. negative strand, negative direction, looking for CACGAAAA, 0.
  2. positive strand, negative direction, looking for CACGAAAA, 0.
  3. positive strand, positive direction, looking for CACGAAAA, 0.
  4. negative strand, positive direction, looking for CACGAAAA, 0.
  5. complement, negative strand, negative direction, looking for GTGCTTTT, 0.
  6. complement, positive strand, negative direction, looking for GTGCTTTT, 0.
  7. complement, positive strand, positive direction, looking for GTGCTTTT, 0.
  8. complement, negative strand, positive direction, looking for GTGCTTTT, 0.
  9. inverse complement, negative strand, negative direction, looking for TTTTCGTG, 0.
  10. inverse complement, positive strand, negative direction, looking for TTTTCGTG, 0.
  11. inverse complement, positive strand, positive direction, looking for TTTTCGTG, 0.
  12. inverse complement, negative strand, positive direction, looking for TTTTCGTG, 0.
  13. inverse negative strand, negative direction, looking for AAAAGCAC, 0.
  14. inverse positive strand, negative direction, looking for AAAAGCAC, 0.
  15. inverse positive strand, positive direction, looking for AAAAGCAC, 0.
  16. inverse negative strand, positive direction, looking for AAAAGCAC, 0.
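
The four search strings in the list above (direct, complement, inverse complement and inverse) can be derived mechanically from the consensus sequence. The following is a minimal Python sketch of that derivation, not the original Basic programs; the function names `search_forms` and `count_occurrences` are hypothetical and chosen only for illustration.

```python
# Illustrative sketch only; names are assumptions, not taken from SuccessablesCCB.bas.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def search_forms(consensus):
    """Return the four search strings derived from a consensus sequence."""
    complement = consensus.translate(COMPLEMENT)
    return {
        "direct": consensus,
        "complement": complement,
        "inverse complement": complement[::-1],
        "inverse": consensus[::-1],
    }


def count_occurrences(sequence, motif):
    """Count occurrences of motif in sequence, allowing overlapping matches."""
    count = 0
    start = sequence.find(motif)
    while start != -1:
        count += 1
        start = sequence.find(motif, start + 1)
    return count


forms = search_forms("CACGAAAA")
for name, motif in forms.items():
    print(name, motif)
```

Each of the four strings would then be counted on each strand and in each direction, which gives the sixteen combinations listed above.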

CCB variant samplings

Copying the relaxed consensus sequences CACGAAA, ACGAAA and C-CGAAA and putting each in "⌘F" finds none between ZNF497 and A1BG or between ZSCAN22 and A1BG, as can also be checked with the computer programs. To conform, an actual sequence should contain CACGAAA, ACGAAA or CCGAAA.

The Basic programs testing the general consensus sequence NNCGAAAN (starting with SuccessablesCCBV.bas) were written to compare nucleotide sequences with the sequences on either the template strand (-) or the coding strand (+) of the DNA, in the negative direction (-) or the positive direction (+). The programs, the number found, and the sequences found are:

  1. negative strand, negative direction: 0.
  2. positive strand, negative direction: 3, GGCGAAAT at 2158, TACGAAAA at 495, GACGAAAC at 313.
  3. negative strand, positive direction: 0.
  4. positive strand, positive direction: 3, ACCGAAAG at 2164, TCCGAAAG at 1180, TCCGAAAG at 1096.
  5. inverse complement, negative strand, negative direction: 6, TTTTCGTT at 2480, TTTTCGTT at 2474, CTTTCGCC at 1680, TTTTCGGT at 946, TTTTCGTA at 187, CTTTCGAC at 139.
  6. inverse complement, positive strand, negative direction: 0.
  7. inverse complement, negative strand, positive direction: 2, TTTTCGGG at 1752, CTTTCGTC at 1183.
  8. inverse complement, positive strand, positive direction: 3, CTTTCGTG at 3600, GTTTCGCA at 2538, TTTTCGTC at 2007.
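
A search with a degenerate consensus such as NNCGAAAN can be sketched with a regular expression in which N matches any nucleotide. The Python below is an illustration under assumptions, not the original SuccessablesCCBV.bas: the helper names are hypothetical, the toy sequence is made up, and positions are reported as 1-based start indices, which may differ from the position convention used by the Basic programs.

```python
import re

# N stands for any nucleotide; other IUPAC codes are omitted in this sketch.
IUPAC = {"A": "A", "C": "C", "G": "G", "T": "T", "N": "[ACGT]"}
COMPLEMENT = str.maketrans("ACGTN", "TGCAN")


def inverse_complement(consensus):
    """Complement each base, then reverse the string."""
    return consensus.translate(COMPLEMENT)[::-1]


def find_hits(sequence, consensus):
    """Return (matched text, 1-based start) pairs, including overlapping matches."""
    body = "".join(IUPAC[base] for base in consensus)
    pattern = re.compile("(?=(" + body + "))")  # lookahead keeps overlapping hits
    return [(m.group(1), m.start() + 1) for m in pattern.finditer(sequence)]


toy = "TTGACGAAACTTTTCGTAGG"  # made-up sequence, not from A1BG
direct_hits = find_hits(toy, "NNCGAAAN")
inverse_hits = find_hits(toy, inverse_complement("NNCGAAAN"))
print(direct_hits)
print(inverse_hits)
```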

CCB negative direction (2596-1) distal promoters

  1. Negative strand, negative direction (inverse complements): TTTTCGTT at 2480, TTTTCGTT at 2474, CTTTCGCC at 1680, TTTTCGGT at 946, TTTTCGTA at 187, CTTTCGAC at 139.
  2. Positive strand, negative direction: GGCGAAAT at 2158, TACGAAAA at 495, GACGAAAC at 313.

CCB positive direction (4050-1) distal promoters

  1. Negative strand, positive direction (inverse complements): TTTTCGGG at 1752, CTTTCGTC at 1183.
  2. Positive strand, positive direction: ACCGAAAG at 2164, TCCGAAAG at 1180, TCCGAAAG at 1096.
  3. Positive strand, positive direction (inverse complements): CTTTCGTG at 3600, GTTTCGCA at 2538, TTTTCGTC at 2007.

Cell-cycle box variants (CCBV) random dataset samplings

  1. CCBVr0: 4, GCCGAAA at 4295, CCCGAAA at 4138, CCCGAAA at 1527, ACCGAAA at 251.
  2. CCBVr1: 4, CCCGAAA at 4438, AACGAAA at 2588, CCCGAAA at 2456, AACGAAA at 1407.
  3. CCBVr2: 5, TTCGAAA at 3775, GGCGAAA at 3652, ATCGAAA at 3440, GCCGAAA at 3077, AACGAAA at 2951.
  4. CCBVr3: 6, CGCGAAA at 3966, ACCGAAA at 3939, GTCGAAA at 3481, TACGAAA at 2461, CCCGAAA at 1979, AGCGAAA at 1781.
  5. CCBVr4: 6, TGCGAAA at 4469, TCCGAAA at 3876, TCCGAAA at 3768, AACGAAA at 3485, GGCGAAA at 528, GGCGAAA at 222.
  6. CCBVr5: 6, AGCGAAA at 4334, GGCGAAA at 4283, AACGAAA at 4210, TACGAAA at 2281, GGCGAAA at 1032, CCCGAAA at 992.
  7. CCBVr6: 2, CCCGAAA at 1979, GGCGAAA at 1236.
  8. CCBVr7: 5, AACGAAA at 4545, CCCGAAA at 4331, CGCGAAA at 3960, CTCGAAA at 3484, CGCGAAA at 864.
  9. CCBVr8: 4, ATCGAAA at 3076, GACGAAA at 2977, GCCGAAA at 2522, AACGAAA at 2406.
  10. CCBVr9: 6, CCCGAAA at 2490, CACGAAA at 2087, CCCGAAA at 1793, CACGAAA at 1376, TGCGAAA at 882, CCCGAAA at 179.
  11. CCBVr0ci: 4, TTTCGGC at 2347, TTTCGGG at 1279, TTTCGCT at 1170, TTTCGCA at 479.
  12. CCBVr1ci: 6, TTTCGTA at 3412, TTTCGAC at 2678, TTTCGAC at 2397, TTTCGGT at 1497, TTTCGGC at 477, TTTCGAT at 108.
  13. CCBVr2ci: 12, TTTCGGC at 4206, TTTCGAT at 4171, TTTCGCC at 3957, TTTCGCG at 3668, TTTCGAG at 3420, TTTCGTC at 3385, TTTCGCA at 1411, TTTCGCG at 1148, TTTCGTC at 721, TTTCGCG at 589, TTTCGAC at 537, TTTCGAG at 296.
  14. CCBVr3ci: 7, TTTCGCC at 4076, TTTCGCT at 4044, TTTCGCA at 3697, TTTCGCT at 3071, TTTCGGC at 3004, TTTCGTA at 2228, TTTCGGG at 1545.
  15. CCBVr4ci: 8, TTTCGGA at 4217, TTTCGGC at 3021, TTTCGGA at 2794, TTTCGGC at 2680, TTTCGCG at 2550, TTTCGGG at 1601, TTTCGCG at 1084, TTTCGTT at 369.
  16. CCBVr5ci: 2, TTTCGTA at 3659, TTTCGAT at 2501.
  17. CCBVr6ci: 5, TTTCGGA at 3298, TTTCGGG at 3018, TTTCGGT at 1475, TTTCGGG at 1263, TTTCGGA at 406.
  18. CCBVr7ci: 10, TTTCGGG at 3463, TTTCGTT at 3279, TTTCGGG at 3178, TTTCGGT at 2521, TTTCGGC at 2494, TTTCGCA at 2390, TTTCGTT at 2248, TTTCGTC at 2122, TTTCGGG at 1994, TTTCGCC at 1347.
  19. CCBVr8ci: 7, TTTCGAA at 3419, TTTCGGA at 3005, TTTCGAG at 2547, TTTCGAG at 1892, TTTCGGT at 1655, TTTCGGG at 1554, TTTCGTG at 1534.
  20. CCBVr9ci: 5, TTTCGGA at 4163, TTTCGCT at 2872, TTTCGAG at 1330, TTTCGCG at 1128, TTTCGCC at 722.

CCBVr arbitrary (evens) (4560-2846) UTRs

  1. CCBVr0: GCCGAAA at 4295, CCCGAAA at 4138.
  2. CCBVr2: TTCGAAA at 3775, GGCGAAA at 3652, ATCGAAA at 3440, GCCGAAA at 3077, AACGAAA at 2951.
  3. CCBVr4: TGCGAAA at 4469, TCCGAAA at 3876, TCCGAAA at 3768, AACGAAA at 3485.
  4. CCBVr8: ATCGAAA at 3076, GACGAAA at 2977.
  5. CCBVr2ci: TTTCGGC at 4206, TTTCGAT at 4171, TTTCGCC at 3957, TTTCGCG at 3668, TTTCGAG at 3420, TTTCGTC at 3385.
  6. CCBVr4ci: TTTCGGA at 4217, TTTCGGC at 3021.
  7. CCBVr6ci: TTTCGGA at 3298, TTTCGGG at 3018.
  8. CCBVr8ci: TTTCGAA at 3419, TTTCGGA at 3005.

CCBVr alternate (odds) (4560-2846) UTRs

  1. CCBVr1: CCCGAAA at 4438.
  2. CCBVr3: CGCGAAA at 3966, ACCGAAA at 3939, GTCGAAA at 3481.
  3. CCBVr5: AGCGAAA at 4334, GGCGAAA at 4283, AACGAAA at 4210.
  4. CCBVr7: AACGAAA at 4545, CCCGAAA at 4331, CGCGAAA at 3960, CTCGAAA at 3484.
  5. CCBVr1ci: TTTCGTA at 3412.
  6. CCBVr3ci: TTTCGCC at 4076, TTTCGCT at 4044, TTTCGCA at 3697, TTTCGCT at 3071, TTTCGGC at 3004.
  7. CCBVr5ci: TTTCGTA at 3659.
  8. CCBVr7ci: TTTCGGG at 3463, TTTCGTT at 3279, TTTCGGG at 3178.
  9. CCBVr9ci: TTTCGGA at 4163, TTTCGCT at 2872.

CCBVr arbitrary positive direction (odds) (4445-4265) core promoters

  1. CCBVr1: CCCGAAA at 4438.
  2. CCBVr5: AGCGAAA at 4334, GGCGAAA at 4283.
  3. CCBVr7: CCCGAAA at 4331.

CCBVr alternate positive direction (evens) (4445-4265) core promoters

  1. CCBVr0: GCCGAAA at 4295.

CCBVr arbitrary negative direction (evens) (2811-2596) proximal promoters

  1. CCBVr4ci: TTTCGGA at 2794, TTTCGGC at 2680.

CCBVr alternate negative direction (odds) (2811-2596) proximal promoters

  1. CCBVr1ci: TTTCGAC at 2678.

CCBVr arbitrary positive direction (odds) (4265-4050) proximal promoters

  1. CCBVr5: AACGAAA at 4210.
  2. CCBVr3ci: TTTCGCC at 4076.
  3. CCBVr9ci: TTTCGGA at 4163.

CCBVr alternate positive direction (evens) (4265-4050) proximal promoters

  1. CCBVr0: CCCGAAA at 4138.
  2. CCBVr2ci: TTTCGGC at 4206, TTTCGAT at 4171.
  3. CCBVr4ci: TTTCGGA at 4217.

CCBVr arbitrary negative direction (evens) (2596-1) distal promoters

  1. CCBVr0: CCCGAAA at 1527, ACCGAAA at 251.
  2. CCBVr4: GGCGAAA at 528, GGCGAAA at 222.
  3. CCBVr6: CCCGAAA at 1979, GGCGAAA at 1236.
  4. CCBVr8: GCCGAAA at 2522, AACGAAA at 2406.
  5. CCBVr0ci: TTTCGGC at 2347, TTTCGGG at 1279, TTTCGCT at 1170, TTTCGCA at 479.
  6. CCBVr2ci: TTTCGCA at 1411, TTTCGCG at 1148, TTTCGTC at 721, TTTCGCG at 589, TTTCGAC at 537, TTTCGAG at 296.
  7. CCBVr4ci: TTTCGCG at 2550, TTTCGGG at 1601, TTTCGCG at 1084, TTTCGTT at 369.
  8. CCBVr6ci: TTTCGGT at 1475, TTTCGGG at 1263, TTTCGGA at 406.
  9. CCBVr8ci: TTTCGAG at 2547, TTTCGAG at 1892, TTTCGGT at 1655, TTTCGGG at 1554, TTTCGTG at 1534.

CCBVr alternate negative direction (odds) (2596-1) distal promoters

  1. CCBVr1: AACGAAA at 2588, CCCGAAA at 2456, AACGAAA at 1407.
  2. CCBVr3: TACGAAA at 2461, CCCGAAA at 1979, AGCGAAA at 1781.
  3. CCBVr5: TACGAAA at 2281, GGCGAAA at 1032, CCCGAAA at 992.
  4. CCBVr7: CGCGAAA at 864.
  5. CCBVr9: CCCGAAA at 2490, CACGAAA at 2087, CCCGAAA at 1793, CACGAAA at 1376, TGCGAAA at 882, CCCGAAA at 179.
  6. CCBVr1ci: TTTCGAC at 2397, TTTCGGT at 1497, TTTCGGC at 477, TTTCGAT at 108.
  7. CCBVr3ci: TTTCGTA at 2228, TTTCGGG at 1545.
  8. CCBVr5ci: TTTCGAT at 2501.
  9. CCBVr7ci: TTTCGGT at 2521, TTTCGGC at 2494, TTTCGCA at 2390, TTTCGTT at 2248, TTTCGTC at 2122, TTTCGGG at 1994, TTTCGCC at 1347.
  10. CCBVr9ci: TTTCGAG at 1330, TTTCGCG at 1128, TTTCGCC at 722.

CCBVr arbitrary positive direction (odds) (4050-1) distal promoters

  1. CCBVr1: AACGAAA at 2588, CCCGAAA at 2456, AACGAAA at 1407.
  2. CCBVr3: CGCGAAA at 3966, ACCGAAA at 3939, GTCGAAA at 3481, TACGAAA at 2461, CCCGAAA at 1979, AGCGAAA at 1781.
  3. CCBVr5: TACGAAA at 2281, GGCGAAA at 1032, CCCGAAA at 992.
  4. CCBVr7: CGCGAAA at 3960, CTCGAAA at 3484, CGCGAAA at 864.
  5. CCBVr9: CCCGAAA at 2490, CACGAAA at 2087, CCCGAAA at 1793, CACGAAA at 1376, TGCGAAA at 882, CCCGAAA at 179.
  6. CCBVr1ci: TTTCGTA at 3412, TTTCGAC at 2678, TTTCGAC at 2397, TTTCGGT at 1497, TTTCGGC at 477, TTTCGAT at 108.
  7. CCBVr3ci: TTTCGCT at 4044, TTTCGCA at 3697, TTTCGCT at 3071, TTTCGGC at 3004, TTTCGTA at 2228, TTTCGGG at 1545.
  8. CCBVr5ci: TTTCGTA at 3659, TTTCGAT at 2501.
  9. CCBVr7ci: TTTCGGG at 3463, TTTCGTT at 3279, TTTCGGG at 3178, TTTCGGT at 2521, TTTCGGC at 2494, TTTCGCA at 2390, TTTCGTT at 2248, TTTCGTC at 2122, TTTCGGG at 1994, TTTCGCC at 1347.
  10. CCBVr9ci: TTTCGCT at 2872, TTTCGAG at 1330, TTTCGCG at 1128, TTTCGCC at 722.

CCBVr alternate positive direction (evens) (4050-1) distal promoters

  1. CCBVr0: CCCGAAA at 1527, ACCGAAA at 251.
  2. CCBVr2: TTCGAAA at 3775, GGCGAAA at 3652, ATCGAAA at 3440, GCCGAAA at 3077, AACGAAA at 2951.
  3. CCBVr4: TCCGAAA at 3876, TCCGAAA at 3768, AACGAAA at 3485, GGCGAAA at 528, GGCGAAA at 222.
  4. CCBVr6: CCCGAAA at 1979, GGCGAAA at 1236.
  5. CCBVr8: ATCGAAA at 3076, GACGAAA at 2977, GCCGAAA at 2522, AACGAAA at 2406.
  6. CCBVr0ci: TTTCGGC at 2347, TTTCGGG at 1279, TTTCGCT at 1170, TTTCGCA at 479.
  7. CCBVr2ci: TTTCGCC at 3957, TTTCGCG at 3668, TTTCGAG at 3420, TTTCGTC at 3385, TTTCGCA at 1411, TTTCGCG at 1148, TTTCGTC at 721, TTTCGCG at 589, TTTCGAC at 537, TTTCGAG at 296.
  8. CCBVr6ci: TTTCGGA at 3298, TTTCGGG at 3018, TTTCGGT at 1475, TTTCGGG at 1263, TTTCGGA at 406.
  9. CCBVr8ci: TTTCGAA at 3419, TTTCGGA at 3005, TTTCGAG at 2547, TTTCGAG at 1892, TTTCGGT at 1655, TTTCGGG at 1554, TTTCGTG at 1534.
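
The assignment of each random-dataset hit to UTR, core, proximal or distal promoter in the lists above follows from the position ranges given in the section headings. A minimal Python sketch of that binning is shown here; the dictionary and function names are hypothetical, and treating both ends of each range as inclusive is an assumption.

```python
# Position ranges (low, high) copied from the section headings above.
# Treating both ends as inclusive is an assumption of this sketch.
REGIONS = {
    "UTR": (2846, 4560),
    "core promoter, positive direction": (4265, 4445),
    "proximal promoter, negative direction": (2596, 2811),
    "proximal promoter, positive direction": (4050, 4265),
    "distal promoter, negative direction": (1, 2596),
    "distal promoter, positive direction": (1, 4050),
}


def regions_for(position):
    """Return every region whose range contains the given position."""
    return [name for name, (low, high) in REGIONS.items() if low <= position <= high]


# The four CCBVr0 hits listed in the random dataset samplings.
ccbvr0 = [("GCCGAAA", 4295), ("CCCGAAA", 4138), ("CCCGAAA", 1527), ("ACCGAAA", 251)]
for motif, position in ccbvr0:
    print(motif, position, regions_for(position))
```

Because the ranges overlap, a single hit can fall in more than one region, which is why the same hit appears in several of the lists above.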

Cell-cycle boxes analysis and results

The real promoters have been examined for the CCB variants CACGAAA, ACGAAA and C-CGAAA, where C-C indicates CC with the likely A being absent (CCGAAA). The inverse complements are TTTCGTG, TTTCGT and TTTCG-G (TTTCGG). The search for these CCB variants was performed using the general consensus sequence NNCGAAA, in which the expected variants should occur if present. The real promoters have CCB variants only in the distal promoters. In the negative direction, the variants that occur are ACGAAA at 494 and ACGAAA at 312. The actual general consensus sequence occurrences are GGCGAAA at 2157, TACGAAA at 494, and GACGAAA at 312. Neither CACGAAA nor CCGAAA occurred. The inverse complements occur in both directions. In the negative direction: TTTCGT at 2479, TTTCGT at 2473 and TTTCGT at 186. In the positive direction: TTTCGGG at 1752 on the negative strand, and TTTCGTG at 3600 and TTTCGT at 2006 on the positive strand. The occurrences are 2.5 and 1.5 per direction, respectively.
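
Deciding which general consensus hits (NNCGAAA or TTTCGNN) are also specific CCB variants can be sketched as a suffix or prefix check. The Python below is an illustration only; the function name `classify` is hypothetical, and checking the longest variant first is a design choice of this sketch rather than something stated in the source.

```python
# Longest variant first, so CACGAAA is not reported as the shorter ACGAAA.
DIRECT_VARIANTS = ("CACGAAA", "ACGAAA", "CCGAAA")
INVERSE_VARIANTS = ("TTTCGTG", "TTTCGT", "TTTCGG")


def classify(hit):
    """Return the CCB variant contained in a 7-nucleotide hit, or None."""
    for variant in DIRECT_VARIANTS:
        if hit.endswith(variant):  # direct variants align with the end of NNCGAAA
            return variant
    for variant in INVERSE_VARIANTS:
        if hit.startswith(variant):  # inverse complements align with the start of TTTCGNN
            return variant
    return None


# The three negative-direction hits quoted in the paragraph above.
negative_hits = ["GGCGAAA", "TACGAAA", "GACGAAA"]
print([classify(hit) for hit in negative_hits])
```

For the three negative-direction hits quoted above, only TACGAAA and GACGAAA classify as the ACGAAA variant, in line with the text.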

The random datasets had twenty-three UTR general consensus sequences for an occurrence of 2.3. Of these the CCB variants had the following frequencies: CACGAAA (0), ACGAAA (2) and CCGAAA (5). The inverse complements had TTTCGTG (0), TTTCGT (1) and TTTCGG (6). The remaining nine were of the general consensus sequence of NNCGAAA or TTTCGNN.

The random datasets had three general consensus sequences in the core promoters, all in the arbitrary positive direction, for an occurrence of 0.3 (two strands, one direction) or 0.15 (four strands, both directions). The proximal general consensus sequences (five) had occurrences of 0.2 and 0.3.

The distal promoters for the random datasets had twenty-six in the arbitrary negative direction for an occurrence of 2.6 and forty-two in the positive direction for an occurrence of 4.2.

As the choices for direction are arbitrary for the random datasets an average occurrence would be 3.25. Even separately the real occurrences are lower than the random ones albeit not by much for the real negative direction (2.5) vs. the arbitrarily chosen negative direction (2.6) for the randoms. The randoms also had UTR, core and proximal promoter occurrences where the reals have none.

These results suggest that the real variants ACGAAA (2), TTTCGT (5) and TTTCGG (1) are likely active or activable.

"The consensus sequence of CCB is CACGAAAA (Nasmyth, 1985), however, more relaxed variants such as CACGAAA, ACGAAA and C-CGAAA were described in budding yeast CLN1 and CLN2 (Ogas et al, 1991)."[1] The specific relaxed variant tested for in A1BG is ACGAAA.

Reals or randoms  Promoters  Direction           Numbers  Strands  Occurrences  Averages (± 0.1)
Reals             UTR        negative            0        2        0            0
Randoms           UTR        arbitrary negative  25       10       2.5          2.4
Randoms           UTR        alternate negative  23       10       2.3          2.4
Reals             Core       negative            0        2        0            0
Randoms           Core       arbitrary negative  0        10       0            0
Randoms           Core       alternate negative  0        10       0            0
Reals             Core       positive            0        2        0            0
Randoms           Core       arbitrary positive  4        10       0.4          0.25
Randoms           Core       alternate positive  1        10       0.1          0.25
Reals             Proximal   negative            0        2        0            0
Randoms           Proximal   arbitrary negative  2        10       0.2          0.15
Randoms           Proximal   alternate negative  1        10       0.1          0.15
Reals             Proximal   positive            0        2        0            0
Randoms           Proximal   arbitrary positive  3        10       0.3          0.35
Randoms           Proximal   alternate positive  4        10       0.4          0.35
Reals             Distal     negative            9        2        4.5          4.5 ± 1.5 (--6,+-3)
Randoms           Distal     arbitrary negative  30       10       3            3.15
Randoms           Distal     alternate negative  33       10       3.3          3.15
Reals             Distal     positive            8        2        4            4 ± 2 (-+2,++6)
Randoms           Distal     arbitrary positive  49       10       4.9          4.65
Randoms           Distal     alternate positive  44       10       4.4          4.65
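
The Occurrences and Averages columns of the table above appear to follow from simple arithmetic: occurrences are the number of hits divided by the number of strands, and the random averages are the mean of the arbitrary and alternate occurrences. The Python sketch below reproduces a few rows under that reading. The function names are hypothetical, and interpreting the ± value for the reals as half the difference between the two per-strand counts is an assumption that happens to reproduce 4.5 ± 1.5 from 6 and 3.

```python
def occurrence(number, strands):
    """Hits per strand."""
    return number / strands


def mean_and_half_range(first, second):
    """Mean of two values and half their absolute difference."""
    return (first + second) / 2, abs(first - second) / 2


# Randoms, UTR: 25 (arbitrary) and 23 (alternate) hits over 10 strands each.
utr_arbitrary = occurrence(25, 10)
utr_alternate = occurrence(23, 10)
utr_average, _ = mean_and_half_range(utr_arbitrary, utr_alternate)

# Reals, distal negative: 6 hits on the negative strand, 3 on the positive strand.
distal_negative = mean_and_half_range(6, 3)

print(round(utr_arbitrary, 2), round(utr_alternate, 2), round(utr_average, 2))
print(distal_negative)
```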

Comparison:

The occurrences of the real CCB distals lie outside those of the randoms or overlap them only at the low end, as in the negative-direction distals. This suggests that the real CCBs are likely active or activable.

Acknowledgements

The content on this page was first contributed by: Henry A. Hoff.

See also

References

  1. Joanna Deckert, Natasha Taranenko, and Peter M. Gresshoff (1994). "Cell Cycle Genes and Their Plant Homologues". In: Peter M. Gresshoff (ed.), Plant Genome Analysis: Current Topics in Plant Molecular Biology. 2000 Corporate Blvd., N.W., Boca Raton, Florida 33431: CRC Press, Inc. pp. 169–194. ISBN 0-8493-8264-5. Retrieved 19 April 2021.

External links