CCCTC-binding factor gene transcriptions: Difference between revisions

Revision as of 20:08, 9 May 2023

Associate Editor(s)-in-Chief: Henry A. Hoff

Consensus sequences

"Experiments using chromatin immunoprecipitation exonuclease (ChIP-exo) uncovered a broad CTCF-binding motif that contains a 12–15 bp consensus sequence, 5′-NCA-NNA-G(G/A)N-GGC-(G/A)(C/G)(T/C)-3′ (Nakahashi et al., 2013, Rhee and Pugh, 2011) [...]."[1]

Hashimoto samplings

Copying the CTCF consensus 5'-CACCAGG-3' into a "⌘F" text search finds no locations between ZSCAN22 and A1BG, and 5'-CACCAGGAGG-3' likewise finds no locations between ZNF497 and A1BG, as the computer programs can also find.

The Basic programs (SuccessablesCFbox.bas) were written to compare nucleotide sequences with the sequences on either the template strand (-) or the coding strand (+) of the DNA, in the negative direction (-) or the positive direction (+). The programs, the sequences they look for, and the number of occurrences found are:

  1. negative strand, negative direction, looking for 5'-NCA-NNA-G(A/G)N-GGC-(A/G)(C/G)(C/T)-3'[1], 0.
  2. negative strand, positive direction, looking for 5'-NCA-NNA-G(A/G)N-GGC-(A/G)(C/G)(C/T)-3', 0.
  3. positive strand, negative direction, looking for 5'-NCA-NNA-G(A/G)N-GGC-(A/G)(C/G)(C/T)-3', 0.
  4. positive strand, positive direction, looking for 5'-NCA-NNA-G(A/G)N-GGC-(A/G)(C/G)(C/T)-3', 0.
  5. complement, negative strand, negative direction, looking for 5'-NGT-NNT-C(C/T)N-CCG-(C/T)(C/G)(A/G)-3', 0.
  6. complement, negative strand, positive direction, looking for 5'-NGT-NNT-C(C/T)N-CCG-(C/T)(C/G)(A/G)-3', 0.
  7. complement, positive strand, negative direction, looking for 5'-NGT-NNT-C(C/T)N-CCG-(C/T)(C/G)(A/G)-3', 0.
  8. complement, positive strand, positive direction, looking for 5'-NGT-NNT-C(C/T)N-CCG-(C/T)(C/G)(A/G)-3', 0.
  9. inverse complement, negative strand, negative direction, looking for 5'-(A/G)(C/G)(C/T)-GCC-N(C/T)C-TNN-TGN-3', 0.
  10. inverse complement, negative strand, positive direction, looking for 5'-(A/G)(C/G)(C/T)-GCC-N(C/T)C-TNN-TGN-3', 0.
  11. inverse complement, positive strand, negative direction, looking for 5'-(A/G)(C/G)(C/T)-GCC-N(C/T)C-TNN-TGN-3', 0.
  12. inverse complement, positive strand, positive direction, looking for 5'-(A/G)(C/G)(C/T)-GCC-N(C/T)C-TNN-TGN-3', 0.
  13. inverse, negative strand, negative direction, looking for 5'-(C/T)(C/G)(A/G)-CGG-N(A/G)G-ANN-ACN-3', 0.
  14. inverse, negative strand, positive direction, looking for 5'-(C/T)(C/G)(A/G)-CGG-N(A/G)G-ANN-ACN-3', 0.
  15. inverse, positive strand, negative direction, looking for 5'-(C/T)(C/G)(A/G)-CGG-N(A/G)G-ANN-ACN-3', 0.
  16. inverse, positive strand, positive direction, looking for 5'-(C/T)(C/G)(A/G)-CGG-N(A/G)G-ANN-ACN-3', 0.
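
The four orientations in the list above (consensus, complement, inverse complement, inverse) can be derived mechanically from the consensus sequence. The following is a minimal Python sketch of that derivation, not the SuccessablesCFbox.bas code; the function names (`parse`, `complement`, `inverse`, `to_regex`, `count_matches`) are hypothetical, and treating N as "any of A, C, G, T" is an assumption.

```python
import re

# Consensus as listed above; bracketed alternatives are allowed bases.
CONSENSUS = "NCA-NNA-G(A/G)N-GGC-(A/G)(C/G)(C/T)"
PAIR = {"A": "T", "T": "A", "C": "G", "G": "C"}


def parse(consensus):
    """Split a consensus such as 'G(A/G)N' into a list of base sets."""
    tokens = re.findall(r"\(([ACGT/]+)\)|([ACGTN])", consensus)
    positions = []
    for group, single in tokens:
        if group:
            positions.append(frozenset(group.split("/")))
        elif single == "N":
            # Assumption: N stands for any of the four bases.
            positions.append(frozenset("ACGT"))
        else:
            positions.append(frozenset(single))
    return positions


def complement(positions):
    """Swap every allowed base for its Watson-Crick partner."""
    return [frozenset(PAIR[b] for b in p) for p in positions]


def inverse(positions):
    """Reverse the order of the positions."""
    return list(reversed(positions))


def to_regex(positions):
    """Build a regular expression with one character class per position."""
    return "".join("[" + "".join(sorted(p)) + "]" for p in positions)


def count_matches(positions, sequence):
    """Count overlapping matches using a zero-width lookahead."""
    return len(re.findall("(?=" + to_regex(positions) + ")", sequence))
```

Applying `inverse(complement(parse(CONSENSUS)))` gives the same base sets as parsing the inverse complement pattern written out in the list, which is a quick consistency check on the sixteen entries.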

CTCF samplings

CCCTC-binding factor (CTCF) was initially discovered as a negative regulator of the chicken c-myc gene. The protein was found to bind three regularly spaced repeats of the core sequence CCCTC and was therefore named CCCTC-binding factor.[2]

The Basic programs testing the consensus sequence CCCTC (starting with SuccessablesCTCF.bas) were written to compare nucleotide sequences with the sequences on either the template strand (-) or the coding strand (+) of the DNA, in the negative direction (-) or the positive direction (+). The programs and the occurrences they found are:

  1. Negative strand, negative direction: 22, CCCTC at 4560, CCCTC at 4549, CCCTC at 4497, CCCTC at 4303, CCCTC at 4271, CCCTC at 4153, CCCTC at 4002, CCCTC at 3989, CCCTC at 3888, CCCTC at 3752, CCCTC at 3714, CCCTC at 3080, CCCTC at 2701, CCCTC at 2221, CCCTC at 2104, CCCTC at 1962, CCCTC at 1930, CCCTC at 1795, CCCTC at 1018, CCCTC at 686, CCCTC at 550, CCCTC at 413.
  2. Positive strand, negative direction: 1, CCCTC at 2626.
  3. Negative strand, positive direction: 7, CCCTC at 4294, CCCTC at 3502, CCCTC at 3355, CCCTC at 3206, CCCTC at 1772, CCCTC at 95, CCCTC at 87.
  4. Positive strand, positive direction: 16, CCCTC at 4432, CCCTC at 4044, CCCTC at 3978, CCCTC at 3673, CCCTC at 3657, CCCTC at 3453, CCCTC at 3184, CCCTC at 2288, CCCTC at 1896, CCCTC at 1782, CCCTC at 1683, CCCTC at 661, CCCTC at 493, CCCTC at 382, CCCTC at 368, CCCTC at 313.
  5. inverse complement, negative strand, negative direction: 3, GAGGG at 4258, GAGGG at 1507, GAGGG at 388.
  6. inverse complement, positive strand, negative direction: 5, GAGGG at 4558, GAGGG at 3652, GAGGG at 2699, GAGGG at 1673, GAGGG at 88.
  7. inverse complement, negative strand, positive direction: 16, GAGGG at 4434, GAGGG at 3980, GAGGG at 3906, GAGGG at 3879, GAGGG at 3653, GAGGG at 3479, GAGGG at 3182, GAGGG at 2796, GAGGG at 2655, GAGGG at 2290, GAGGG at 1898, GAGGG at 311, GAGGG at 258, GAGGG at 245, GAGGG at 181, GAGGG at 18.
  8. inverse complement, positive strand, positive direction: 9, GAGGG at 4296, GAGGG at 3554, GAGGG at 3332, GAGGG at 3198, GAGGG at 3077, GAGGG at 2531, GAGGG at 2395, GAGGG at 2382, GAGGG at 465.
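
A search of the kind listed above can be sketched in a few lines of Python. This is an illustration only, not the SuccessablesCTCF.bas program: the helper names are hypothetical, the toy sequence is made up, and reporting 1-based start positions is an assumption about the coordinate convention used on this page.

```python
def reverse_complement(motif):
    """Return the inverse complement of a motif, e.g. CCCTC -> GAGGG."""
    pair = {"A": "T", "T": "A", "C": "G", "G": "C"}
    return "".join(pair[b] for b in reversed(motif))


def find_all(motif, sequence):
    """Return 1-based start positions of every (possibly overlapping) match."""
    positions = []
    start = sequence.find(motif)
    while start != -1:
        positions.append(start + 1)
        start = sequence.find(motif, start + 1)
    return positions


# Toy sequence for illustration; it is not taken from the A1BG region.
toy = "AACCCTCGGAGGGTCCCTC"
print(find_all("CCCTC", toy))
print(find_all(reverse_complement("CCCTC"), toy))
```

Each of the eight entries in the list corresponds to running such a search for CCCTC or GAGGG on one strand in one direction.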

CTCF (4560-2846) UTRs

  1. Negative strand, negative direction: CCCTC at 4560, CCCTC at 4549, CCCTC at 4497, CCCTC at 4303, CCCTC at 4271, CCCTC at 4153, CCCTC at 4002, CCCTC at 3989, CCCTC at 3888, CCCTC at 3752, CCCTC at 3714, CCCTC at 3080.
  2. Negative strand, negative direction: GAGGG at 4258.
  3. Positive strand, negative direction: GAGGG at 4558, GAGGG at 3652.

CTCF positive direction (4445-4265) core promoters

  1. Negative strand, positive direction: CCCTC at 4294.
  2. Negative strand, positive direction: GAGGG at 4434.
  3. Positive strand, positive direction: CCCTC at 4432.
  4. Positive strand, positive direction: GAGGG at 4296.

CTCF negative direction (2811-2596) proximal promoters

  1. Negative strand, negative direction: CCCTC at 2701.
  2. Positive strand, negative direction: CCCTC at 2626.
  3. Positive strand, negative direction: GAGGG at 2699.

CTCF positive direction (4265-4050) proximal promoters

  1. Negative strand, positive direction: CCCTC at 4294.

CTCF negative direction (2596-1) distal promoters

  1. Negative strand, negative direction: CCCTC at 2221, CCCTC at 2104, CCCTC at 1962, CCCTC at 1930, CCCTC at 1795, CCCTC at 1018, CCCTC at 686, CCCTC at 550, CCCTC at 413.
  2. Negative strand, negative direction: GAGGG at 1507, GAGGG at 388.
  3. Positive strand, negative direction: GAGGG at 1673, GAGGG at 88.

CTCF positive direction (4050-1) distal promoters

  1. Negative strand, positive direction: CCCTC at 3502, CCCTC at 3355, CCCTC at 3206, CCCTC at 1772, CCCTC at 95, CCCTC at 87.
  2. Negative strand, positive direction: GAGGG at 4434, GAGGG at 3980, GAGGG at 3906, GAGGG at 3879, GAGGG at 3653, GAGGG at 3479, GAGGG at 3182, GAGGG at 2796, GAGGG at 2655, GAGGG at 2290, GAGGG at 1898, GAGGG at 311, GAGGG at 258, GAGGG at 245, GAGGG at 181, GAGGG at 18.
  3. Positive strand, positive direction: CCCTC at 4044, CCCTC at 3978, CCCTC at 3673, CCCTC at 3657, CCCTC at 3453, CCCTC at 3184, CCCTC at 2288, CCCTC at 1896, CCCTC at 1782, CCCTC at 1683, CCCTC at 661, CCCTC at 493, CCCTC at 382, CCCTC at 368, CCCTC at 313.
  4. Positive strand, positive direction: GAGGG at 3554, GAGGG at 3332, GAGGG at 3198, GAGGG at 3077, GAGGG at 2531, GAGGG at 2395, GAGGG at 2382, GAGGG at 465.
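
Sorting the reported positions into UTR, core, proximal, and distal regions is a simple range lookup. The sketch below uses the negative-direction boundaries from the headings above; treating each range as open at the low end and closed at the high end is an assumption, and the names `NEGATIVE_REGIONS` and `bin_positions` are hypothetical.

```python
# Negative-direction region boundaries as given in the section headings.
# Treating each range as (low, high] is an assumption made here.
NEGATIVE_REGIONS = [
    ("UTR", 2846, 4560),
    ("core", 2811, 2846),
    ("proximal", 2596, 2811),
    ("distal", 0, 2596),
]


def bin_positions(positions, regions):
    """Group positions by the first region whose (low, high] range holds them."""
    bins = {name: [] for name, _, _ in regions}
    for pos in positions:
        for name, low, high in regions:
            if low < pos <= high:
                bins[name].append(pos)
                break
    return bins


# CCCTC positions reported for the negative strand, negative direction.
negative_negative = [
    4560, 4549, 4497, 4303, 4271, 4153, 4002, 3989, 3888, 3752, 3714,
    3080, 2701, 2221, 2104, 1962, 1930, 1795, 1018, 686, 550, 413,
]
bins = bin_positions(negative_negative, NEGATIVE_REGIONS)
for name, hits in bins.items():
    print(name, len(hits))
```

Under these assumptions the 22 negative strand, negative direction positions split into 12 UTR, 0 core, 1 proximal, and 9 distal, which matches the CCCTC entries in the lists above.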

CTCF random dataset samplings

  1. CTCFr0: 5, CCCTC at 2873, CCCTC at 2302, CCCTC at 1863, CCCTC at 587, CCCTC at 26.
  2. CTCFr1: 8, CCCTC at 2735, CCCTC at 2580, CCCTC at 2300, CCCTC at 1369, CCCTC at 859, CCCTC at 424, CCCTC at 217, CCCTC at 127.
  3. CTCFr2: 8, CCCTC at 4478, CCCTC at 3917, CCCTC at 3842, CCCTC at 3643, CCCTC at 956, CCCTC at 894, CCCTC at 692, CCCTC at 670.
  4. CTCFr3: 5, CCCTC at 3829, CCCTC at 3519, CCCTC at 2497, CCCTC at 1521, CCCTC at 1429.
  5. CTCFr4: 8, CCCTC at 4306, CCCTC at 2619, CCCTC at 2356, CCCTC at 1908, CCCTC at 1525, CCCTC at 1521, CCCTC at 1177, CCCTC at 1058.
  6. CTCFr5: 7, CCCTC at 4416, CCCTC at 4356, CCCTC at 3244, CCCTC at 2134, CCCTC at 1860, CCCTC at 1853, CCCTC at 1405.
  7. RDr6: 0.
  8. RDr7: 0.
  9. RDr8: 0.
  10. RDr9: 0.
  11. RDr0ci: 0.
  12. RDr1ci: 0.
  13. RDr2ci: 0.
  14. RDr3ci: 0.
  15. RDr4ci: 0.
  16. RDr5ci: 0.
  17. RDr6ci: 0.
  18. RDr7ci: 0.
  19. RDr8ci: 0.
  20. RDr9ci: 0.
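
For context on the random counts, the expected number of occurrences of a fixed 5-base motif in a random sequence can be estimated directly. The sketch below assumes a sequence length of 4560 (the largest coordinate on this page) and equal probability for each base; neither assumption is stated for the random datasets, and the function names are hypothetical.

```python
import random


def expected_occurrences(length, motif_length):
    """Expected count of one fixed motif in a uniform random sequence."""
    windows = length - motif_length + 1
    return windows * 0.25 ** motif_length


def count_in_random_sequence(motif, length, seed):
    """Count overlapping occurrences of motif in one seeded random sequence."""
    rng = random.Random(seed)
    sequence = "".join(rng.choice("ACGT") for _ in range(length))
    return sum(
        1 for i in range(length - len(motif) + 1)
        if sequence.startswith(motif, i)
    )


# Assumed length 4560 and motif CCCTC; both are illustrative choices.
print(round(expected_occurrences(4560, 5), 2))
print(count_in_random_sequence("CCCTC", 4560, seed=0))
```

Under these assumptions the expectation is about 4.4 occurrences per random sequence, the same order as the 5 to 8 occurrences listed for CTCFr0 through CTCFr5.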

CTCFr arbitrary (evens) (4560-2846) UTRs

  1. CTCFr0: CCCTC at 2873.
  2. CTCFr4: CCCTC at 4306.

CTCFr alternate (odds) (4560-2846) UTRs

  1. CTCFr3: CCCTC at 3829, CCCTC at 3519.
  2. CTCFr5: CCCTC at 4416, CCCTC at 4356, CCCTC at 3244.

RDr arbitrary negative direction (evens) (2846-2811) core promoters

RDr alternate negative direction (odds) (2846-2811) core promoters

CTCFr arbitrary positive direction (odds) (4445-4265) core promoters

  1. CTCFr5: CCCTC at 4416, CCCTC at 4356.

CTCFr alternate positive direction (evens) (4445-4265) core promoters

  1. CTCFr4: CCCTC at 4306.

RDr arbitrary negative direction (evens) (2811-2596) proximal promoters

CTCFr alternate negative direction (odds) (2811-2596) proximal promoters

  1. CTCFr1: CCCTC at 2735.

RDr arbitrary positive direction (odds) (4265-4050) proximal promoters

RDr alternate positive direction (evens) (4265-4050) proximal promoters

CTCFr arbitrary negative direction (evens) (2596-1) distal promoters

  1. CTCFr0: CCCTC at 2302, CCCTC at 1863, CCCTC at 587, CCCTC at 26.
  2. CTCFr4: CCCTC at 2356, CCCTC at 1908, CCCTC at 1525, CCCTC at 1521, CCCTC at 1177, CCCTC at 1058.

CTCFr alternate negative direction (odds) (2596-1) distal promoters

  1. CTCFr1: CCCTC at 2580, CCCTC at 2300, CCCTC at 1369, CCCTC at 859, CCCTC at 424, CCCTC at 217, CCCTC at 127.
  2. CTCFr3: CCCTC at 2497, CCCTC at 1521, CCCTC at 1429.
  3. CTCFr5: CCCTC at 2134, CCCTC at 1860, CCCTC at 1853, CCCTC at 1405.

CTCFr arbitrary positive direction (odds) (4050-1) distal promoters

  1. CTCFr1: CCCTC at 2735, CCCTC at 2580, CCCTC at 2300, CCCTC at 1369, CCCTC at 859, CCCTC at 424, CCCTC at 217, CCCTC at 127.
  2. CTCFr3: CCCTC at 3829, CCCTC at 3519, CCCTC at 2497, CCCTC at 1521, CCCTC at 1429.
  3. CTCFr5: CCCTC at 3244, CCCTC at 2134, CCCTC at 1860, CCCTC at 1853, CCCTC at 1405.

CTCFr alternate positive direction (evens) (4050-1) distal promoters

  1. CTCFr0: CCCTC at 2873, CCCTC at 2302, CCCTC at 1863, CCCTC at 587, CCCTC at 26.
  2. CTCFr4: CCCTC at 2619, CCCTC at 2356, CCCTC at 1908, CCCTC at 1525, CCCTC at 1521, CCCTC at 1177, CCCTC at 1058.

CTCF analysis and results

The protein was found to bind three regularly spaced repeats of the core sequence CCCTC and was therefore named CCCTC-binding factor.[2]

Reals or randoms  Promoters  Direction           Numbers  Strands  Occurrences  Averages (± 0.1)
Reals             UTR        negative            15       2        7.5          7.5 ± 5.5 (--13,+-2)
Randoms           UTR        arbitrary negative  0        10       0            0
Randoms           UTR        alternate negative  0        10       0            0
Reals             Core       negative            0        2        0            0
Randoms           Core       arbitrary negative  0        10       0            0
Randoms           Core       alternate negative  0        10       0            0
Reals             Core       positive            4        2        2            2 ± 0 (-+2,++2)
Randoms           Core       arbitrary positive  0        10       0            0
Randoms           Core       alternate positive  0        10       0            0
Reals             Proximal   negative            3        2        1.5          1.5 ± 0.5 (--1,+-2)
Randoms           Proximal   arbitrary negative  0        10       0            0
Randoms           Proximal   alternate negative  0        10       0            0
Reals             Proximal   positive            1        2        0.5          0.5 ± 0.5 (-+1,++0)
Randoms           Proximal   arbitrary positive  0        10       0            0
Randoms           Proximal   alternate positive  0        10       0            0
Reals             Distal     negative            13       2        6.5          6.5 ± 4.5 (--11,+-2)
Randoms           Distal     arbitrary negative  0        10       0            0
Randoms           Distal     alternate negative  0        10       0            0
Reals             Distal     positive            45       2        22.5         22.5 ± 0.5 (-+22,++23)
Randoms           Distal     arbitrary positive  0        10       0            0
Randoms           Distal     alternate positive  0        10       0            0
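
The values in the Averages column are consistent with taking the mean of the two per-strand counts, plus or minus half their difference. That reading is an assumption inferred from the numbers, not a stated method, and the function name below is hypothetical.

```python
def mean_and_spread(per_strand_counts):
    """Return (mean, half-range) of the per-strand counts."""
    mean = sum(per_strand_counts) / len(per_strand_counts)
    spread = (max(per_strand_counts) - min(per_strand_counts)) / 2
    return mean, spread


# Per-strand counts taken from the parentheses in the Averages column.
print(mean_and_spread([13, 2]))   # UTR negative
print(mean_and_spread([11, 2]))   # Distal negative
print(mean_and_spread([22, 23]))  # Distal positive
```

For the UTR negative row, 13 and 2 give a mean of 7.5 and a half-range of 5.5, matching the tabulated 7.5 ± 5.5.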

Comparison:

The occurrences of real CTCF motifs are greater than those of the randoms. This suggests that the real CTCFs are likely active or activable.

Acknowledgements

The content on this page was first contributed by: Henry A. Hoff.

References

  1. Hideharu Hashimoto, Dongxue Wang, John R. Horton, Xing Zhang, Victor G. Corces and Xiaodong Cheng (1 June 2017). "Structural Basis for the Versatile and Methylation-Dependent Binding of CTCF to DNA". Molecular Cell. 66 (5): 711–720.e3. doi:10.1016/j.molcel.2017.05.004. PMID 28529057. Retrieved 28 August 2020.
  2. Lobanenkov VV, Nicolas RH, Adler VV, Paterson H, Klenova EM, Polotskaja AV, Goodwin GH (December 1990). "A novel sequence-specific DNA binding protein which interacts with three regularly spaced direct repeats of the CCCTC-motif in the 5'-flanking sequence of the chicken c-myc gene". Oncogene. 5 (12): 1743–53. PMID 2284094.
