Reb1 general regulatory factor gene transcriptions

Associate Editor(s)-in-Chief: Henry A. Hoff

Purified "Reb1 bound [...] exact TTACCCK occurrences [...] with >60% of 780 occurrences at promoters. [And can have] the extended motif VTTACCCGNH (IUPAC nomenclature) (Rhee and Pugh 2011)."[1] In IUPAC nomenclature, K = G or T, V = not T (A, C, or G), N = any base, and H = not G (A, C, or T).
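The IUPAC codes in these motifs can be expanded mechanically. The following is a minimal sketch, not taken from the cited paper: the helper name `iupac_to_regex` and the reduced code table are illustrative assumptions. It converts a degenerate motif into a regular expression and checks that ATTACCCGAA is one instance of the extended motif.

```python
import re

# Subset of the IUPAC nucleotide codes needed for the motifs quoted above.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "K": "GT",    # G or T
    "V": "ACG",   # not T
    "H": "ACT",   # not G
    "N": "ACGT",  # any base
}

def iupac_to_regex(motif):
    """Translate a degenerate motif into a regular-expression string (hypothetical helper)."""
    parts = []
    for code in motif:
        bases = IUPAC[code]
        parts.append(bases if len(bases) == 1 else "[" + bases + "]")
    return "".join(parts)

core = iupac_to_regex("TTACCCK")
extended = iupac_to_regex("VTTACCCGNH")
print(core)                                        # TTACCC[GT]
print(extended)                                    # [ACG]TTACCCG[ACGT][ACT]
print(bool(re.fullmatch(extended, "ATTACCCGAA")))  # True
```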

Human genes

Consensus sequences

The apparent consensus sequence for Reb1 is 5'-TTACCC(G/T)-3'. The extended motif VTTACCCGNH is sampled here with one specific instance, 5'-ATTACCCGAA-3'.

Reb1 samplings

Copying the apparent consensus sequence for Reb1, TTACCC(G/T), and searching for it with "⌘F" finds one occurrence between ZSCAN22 and A1BG and none between ZNF497 and A1BG; the same search can be run with the computer programs.

The Basic programs testing the consensus sequence TTACCC(G/T) (starting with SuccessablesREB.bas) were written to compare nucleotide sequences with the sequences on either the template strand (-) or the coding strand (+) of the DNA, in the negative direction (-) or the positive direction (+). Each entry below gives the strand and direction searched, the sequence looked for, the number of occurrences found, and their positions:

  1. negative strand, negative direction, looking for TTACCC(G/T), 1, TTACCCT at 3661.
  2. positive strand, negative direction, looking for TTACCC(G/T), 0.
  3. positive strand, positive direction, looking for TTACCC(G/T), 2, TTACCCT at 3170, TTACCCG at 2912.
  4. negative strand, positive direction, looking for TTACCC(G/T), 0.
  5. inverse complement, negative strand, negative direction, looking for (A/C)GGGTAA, 0.
  6. inverse complement, positive strand, negative direction, looking for (A/C)GGGTAA, 0.
  7. inverse complement, positive strand, positive direction, looking for (A/C)GGGTAA, 0.
  8. inverse complement, negative strand, positive direction, looking for (A/C)GGGTAA, 0.
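The kind of search listed above can be sketched in a few lines. This is an illustrative Python version, not the SuccessablesREB.bas program itself: the function name `find_motif`, the short test sequence, and the 1-based position convention are all assumptions made for the example.

```python
import re

def find_motif(sequence, pattern):
    """Return (matched text, 1-based start) for every match, overlaps included."""
    lookahead = re.compile("(?=(" + pattern + "))")
    return [(m.group(1), m.start() + 1) for m in lookahead.finditer(sequence)]

# Hypothetical 30-nt test sequence, NOT taken from the A1BG region.
sequence = "GGTTACCCTAACGGGTAATTTTACCCGAGG"

# Consensus TTACCC(G/T) and its inverse complement (A/C)GGGTAA.
print(find_motif(sequence, "TTACCC[GT]"))  # [('TTACCCT', 3), ('TTACCCG', 21)]
print(find_motif(sequence, "[AC]GGGTAA"))  # [('CGGGTAA', 12)]
```

Searching one strand for the inverse complement is equivalent to searching the opposite strand for the consensus itself, which is why both patterns are run against the same string here.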

Reb1 (4560-2846) UTRs

  1. Negative strand, negative direction: TTACCCT at 3661.

Reb1 positive direction (4050-1) distal promoters

  1. Positive strand, positive direction: TTACCCT at 3170, TTACCCG at 2912.
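Assigning a hit to a region is a range lookup on its position. The sketch below uses only the nucleotide ranges that appear in the section headings of this page; the names `REGIONS` and `classify` are hypothetical, and giving a shared end point to the first region listed is an assumption, since the page does not say how boundaries are handled.

```python
# Region boundaries as stated in the headings of this page (low, high).
# Ranges the page does not state explicitly are left out on purpose.
REGIONS = {
    "negative": [
        ("UTR", 2846, 4560),
        ("proximal promoter", 2596, 2811),
        ("distal promoter", 1, 2596),
    ],
    "positive": [
        ("core promoter", 4265, 4445),
        ("proximal promoter", 4050, 4265),
        ("distal promoter", 1, 4050),
    ],
}

def classify(position, direction):
    """Return the first listed region containing the position, or None."""
    for name, low, high in REGIONS[direction]:
        if low <= position <= high:
            return name
    return None

print(classify(3661, "negative"))  # UTR
print(classify(3170, "positive"))  # distal promoter
print(classify(2912, "positive"))  # distal promoter
```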

Reb1 random dataset samplings

  1. REBr0: 2, TTACCCG at 4135, TTACCCT at 1015.
  2. REBr1: 2, TTACCCG at 4416, TTACCCG at 2965.
  3. REBr2: 0.
  4. REBr3: 0.
  5. REBr4: 0.
  6. REBr5: 0.
  7. REBr6: 0.
  8. REBr7: 0.
  9. REBr8: 0.
  10. REBr9: 1, TTACCCT at 4250.
  11. REBr0ci: 2, CGGGTAA at 973, AGGGTAA at 321.
  12. REBr1ci: 1, CGGGTAA at 1815.
  13. REBr2ci: 2, CGGGTAA at 3979, AGGGTAA at 3112.
  14. REBr3ci: 1, AGGGTAA at 1812.
  15. REBr4ci: 1, AGGGTAA at 1210.
  16. REBr5ci: 0.
  17. REBr6ci: 1, AGGGTAA at 399.
  18. REBr7ci: 0.
  19. REBr8ci: 0.
  20. REBr9ci: 1, AGGGTAA at 1107.
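A random control of this kind can be imitated as follows. The page does not state how its random datasets were generated, so this sketch makes several assumptions: bases are drawn uniformly and independently, each dataset is 4560 nt long (the largest coordinate used on this page), and the "ci" entries are the same datasets searched for the inverse complement. The counts it produces will not reproduce the numbers listed above.

```python
import random
import re

def random_sequence(length, rng):
    """Uniformly random DNA string (an assumed model, not the page's generator)."""
    return "".join(rng.choice("ACGT") for _ in range(length))

def count_matches(sequence, pattern):
    """Count matches of the pattern, overlapping ones included."""
    return sum(1 for _ in re.finditer("(?=" + pattern + ")", sequence))

rng = random.Random(0)  # fixed seed so a run is repeatable
datasets = [random_sequence(4560, rng) for _ in range(10)]

forward = [count_matches(s, "TTACCC[GT]") for s in datasets]
inverse_complement = [count_matches(s, "[AC]GGGTAA") for s in datasets]

print("consensus counts:         ", forward)
print("inverse complement counts:", inverse_complement)
```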

Reb1r arbitrary (evens) (4560-2846) UTRs

  1. REBr0: TTACCCG at 4135.
  2. REBr2ci: CGGGTAA at 3979, AGGGTAA at 3112.

Reb1r alternate (odds) (4560-2846) UTRs

  1. REBr1: TTACCCG at 4416, TTACCCG at 2965.
  2. REBr9: TTACCCT at 4250.

Reb1r arbitrary positive direction (odds) (4445-4265) core promoters

  1. REBr1: TTACCCG at 4416.

Reb1r alternate negative direction (odds) (2811-2596) proximal promoters

  1. REBr1: TTACCCG at 2965.

Reb1r arbitrary positive direction (odds) (4265-4050) proximal promoters

  1. REBr9: TTACCCT at 4250.

Reb1r alternate positive direction (evens) (4265-4050) proximal promoters

  1. REBr0: TTACCCG at 4135.

Reb1r arbitrary negative direction (evens) (2596-1) distal promoters

  1. REBr0: TTACCCT at 1015.
  2. REBr0ci: CGGGTAA at 973, AGGGTAA at 321.
  3. REBr4ci: AGGGTAA at 1210.
  4. REBr6ci: AGGGTAA at 399.

Reb1r alternate negative direction (odds) (2596-1) distal promoters

  1. REBr1ci: CGGGTAA at 1815.
  2. REBr3ci: AGGGTAA at 1812.
  3. REBr9ci: AGGGTAA at 1107.

Reb1r arbitrary positive direction (odds) (4050-1) distal promoters

  1. REBr1: TTACCCG at 2965.
  2. REBr1ci: CGGGTAA at 1815.
  3. REBr3ci: AGGGTAA at 1812.
  4. REBr9ci: AGGGTAA at 1107.

Reb1r alternate positive direction (evens) (4050-1) distal promoters

  1. REBr0: TTACCCT at 1015.
  2. REBr0ci: CGGGTAA at 973, AGGGTAA at 321.
  3. REBr2ci: CGGGTAA at 3979, AGGGTAA at 3112.
  4. REBr4ci: AGGGTAA at 1210.

Reb1 analysis and results

Purified "Reb1 bound [...] exact TTACCCK occurrences [...] with >60% of 780 occurrences at promoters."[1]

| Reals or randoms | Promoters | Direction | Numbers | Strands | Occurrences | Averages (± 0.1) |
|---|---|---|---|---|---|---|
| Reals | UTR | negative | 1 | 2 | 0.5 | 0.5 ± 0.5 (--1,+-0) |
| Randoms | UTR | arbitrary negative | 3 | 10 | 0.3 | 0.3 |
| Randoms | UTR | alternate negative | 3 | 10 | 0.3 | 0.3 |
| Reals | Core | negative | 0 | 2 | 0 | 0 |
| Randoms | Core | arbitrary negative | 0 | 10 | 0 | 0 |
| Randoms | Core | alternate negative | 0 | 10 | 0 | 0 |
| Reals | Core | positive | 0 | 2 | 0 | 0 |
| Randoms | Core | arbitrary positive | 1 | 10 | 0.1 | 0.05 |
| Randoms | Core | alternate positive | 0 | 10 | 0 | 0.05 |
| Reals | Proximal | negative | 0 | 2 | 0 | 0 |
| Randoms | Proximal | arbitrary negative | 0 | 10 | 0 | 0.05 |
| Randoms | Proximal | alternate negative | 1 | 10 | 0.1 | 0.05 |
| Reals | Proximal | positive | 0 | 2 | 0 | 0 |
| Randoms | Proximal | arbitrary positive | 1 | 10 | 0.1 | 0.1 |
| Randoms | Proximal | alternate positive | 1 | 10 | 0.1 | 0.1 |
| Reals | Distal | negative | 0 | 2 | 0 | 0 |
| Randoms | Distal | arbitrary negative | 5 | 10 | 0.5 | 0.4 |
| Randoms | Distal | alternate negative | 3 | 10 | 0.3 | 0.4 |
| Reals | Distal | positive | 2 | 2 | 1 | 1 ± 1 (-+0,++2) |
| Randoms | Distal | arbitrary positive | 4 | 10 | 0.4 | 0.5 |
| Randoms | Distal | alternate positive | 6 | 10 | 0.6 | 0.5 |
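The derived columns of the table can be reproduced with simple arithmetic. The reading used here is an inference from the numbers, not something the page states: Occurrences is taken as Numbers divided by Strands, the random Averages as the mean of the arbitrary and alternate rows, and the ± value for the reals as half the range of the per-strand counts. The function names are hypothetical.

```python
def occurrences(numbers, strands):
    """Occurrences per strand (or per random dataset)."""
    return numbers / strands

def mean_and_half_range(counts):
    """Mean of the per-strand counts and half of their range."""
    mean = sum(counts) / len(counts)
    half_range = (max(counts) - min(counts)) / 2
    return mean, half_range

# Reals, UTR, negative direction: 1 on the negative strand, 0 on the positive.
print(mean_and_half_range([1, 0]))  # (0.5, 0.5)

# Reals, distal, positive direction: 0 on the negative strand, 2 on the positive.
print(mean_and_half_range([0, 2]))  # (1.0, 1.0)

# Randoms, distal, negative direction: arbitrary 5/10 and alternate 3/10.
arbitrary = occurrences(5, 10)
alternate = occurrences(3, 10)
print(round((arbitrary + alternate) / 2, 2))  # 0.4
```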

Comparison:

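The occurrences of real Reb1s are greater than the randoms in the UTR (0.5 versus 0.3) and in the positive-direction distal promoters (1 versus 0.5); in the remaining regions no real occurrences were found. This suggests that the real Reb1s in those two regions are likely active or activable.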

Extended Reb1 samplings

A search for an extended Reb1 (ATTACCCGAA) finds no occurrences between ZSCAN22 and A1BG or between ZNF497 and A1BG.

The Basic programs testing the consensus sequence ATTACCCGAA (starting with SuccessablesREBE.bas) were written to compare nucleotide sequences with the sequences on either the template strand (-) or the coding strand (+) of the DNA, in the negative direction (-) or the positive direction (+). Each entry below gives the strand and direction searched, the sequence looked for, and the number of occurrences found:

  1. negative strand, negative direction, looking for ATTACCCGAA, 0.
  2. positive strand, negative direction, looking for ATTACCCGAA, 0.
  3. positive strand, positive direction, looking for ATTACCCGAA, 0.
  4. negative strand, positive direction, looking for ATTACCCGAA, 0.
  5. complement, negative strand, negative direction, looking for TAATGGGCTT, 0.
  6. complement, positive strand, negative direction, looking for TAATGGGCTT, 0.
  7. complement, positive strand, positive direction, looking for TAATGGGCTT, 0.
  8. complement, negative strand, positive direction, looking for TAATGGGCTT, 0.
  9. inverse complement, negative strand, negative direction, looking for TTCGGGTAAT, 0.
  10. inverse complement, positive strand, negative direction, looking for TTCGGGTAAT, 0.
  11. inverse complement, positive strand, positive direction, looking for TTCGGGTAAT, 0.
  12. inverse complement, negative strand, positive direction, looking for TTCGGGTAAT, 0.
  13. inverse negative strand, negative direction, looking for AAGCCCATTA, 0.
  14. inverse positive strand, negative direction, looking for AAGCCCATTA, 0.
  15. inverse positive strand, positive direction, looking for AAGCCCATTA, 0.
  16. inverse negative strand, positive direction, looking for AAGCCCATTA, 0.
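The four search strings used above follow from the consensus by complementing and reversing it. A minimal sketch of that derivation (the function name `variants` is a hypothetical choice for this example):

```python
# Watson-Crick base pairing as a character translation table.
COMPLEMENT = str.maketrans("ACGT", "TGCA")

def variants(motif):
    """Return the consensus, complement, inverse, and inverse complement of a motif."""
    complement = motif.translate(COMPLEMENT)
    return {
        "consensus": motif,
        "complement": complement,
        "inverse": motif[::-1],
        "inverse complement": complement[::-1],
    }

for name, sequence in variants("ATTACCCGAA").items():
    print(name, sequence)
# consensus ATTACCCGAA
# complement TAATGGGCTT
# inverse AAGCCCATTA
# inverse complement TTCGGGTAAT
```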

Random dataset samplings

  1. REBEr0: 0.
  2. REBEr1: 0.
  3. REBEr2: 0.
  4. REBEr3: 0.
  5. REBEr4: 0.
  6. REBEr5: 0.
  7. REBEr6: 0.
  8. REBEr7: 0.
  9. REBEr8: 0.
  10. REBEr9: 0.
  11. REBEr0ci: 0.
  12. REBEr1ci: 0.
  13. REBEr2ci: 0.
  14. REBEr3ci: 0.
  15. REBEr4ci: 0.
  16. REBEr5ci: 0.
  17. REBEr6ci: 0.
  18. REBEr7ci: 0.
  19. REBEr8ci: 0.
  20. REBEr9ci: 0.
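A rough expectation helps put the empty lists in context. The sketch below assumes uniformly random, independent bases and a sequence length of 4560 nt (the largest coordinate used on this page); neither assumption comes from the page, and real genomic sequence is not uniform. Under that model a fixed 10-mer is expected far less than once per sequence, whereas the 7-mer TTACCC(G/T) is expected about once in every two sequences.

```python
def expected_count(length, allowed_per_position):
    """Expected matches of a motif in one random strand under a uniform base model.

    allowed_per_position lists, for each motif position, how many of the
    four bases are accepted there (1 for a fixed base, 2 for G/T, and so on).
    """
    probability = 1.0
    for allowed in allowed_per_position:
        probability *= allowed / 4
    windows = length - len(allowed_per_position) + 1
    return windows * probability

LENGTH = 4560  # assumed sequence length

# TTACCC(G/T): six fixed positions and one two-fold degenerate position.
print(round(expected_count(LENGTH, [1, 1, 1, 1, 1, 1, 2]), 3))  # 0.556

# ATTACCCGAA: ten fixed positions.
print(round(expected_count(LENGTH, [1] * 10), 5))               # 0.00434
```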

Discussion

The Reb1 consensus sequence TTACCC(G/T) has three occurrences around A1BG: one in the UTR at 3661 and two in the distal promoters at 3170 and 2912 in the positive direction, all more than halfway from either zinc finger gene.

Using the random datasets: within the UTR there are three sequences in two datasets, TTACCCG at 4135 and CGGGTAA at 3979 with AGGGTAA at 3112; the core promoters contained only TTACCCG at 4416, in the positive direction; the proximal promoters contained only TTACCCT at 4250, in the positive direction; and the distal promoters contained four to five sequences. Of these, the five in the negative direction are all more than halfway to ZSCAN22, and of the four in the positive direction, one (TTACCCG at 2965) is more than halfway toward A1BG and the remaining three are less than halfway.

The extended Reb1 consensus sequence ATTACCCGAA had no occurrences in either direction in the real sequences, and none in the random datasets, whether searched as the extended consensus sequence or as its inverse complement.

Acknowledgements

The content on this page was first contributed by: Henry A. Hoff.

References

  1. Matthew J. Rossi, William K.M. Lai and B. Franklin Pugh (21 March 2018). "Genome-wide determinants of sequence-specific DNA binding of general regulatory factors". Genome Research. 28: 497–508. doi:10.1101/gr.229518.117. PMID 29563167. Retrieved 31 August 2020.