Enhancer box gene transcriptions

Editor-In-Chief: Henry A. Hoff

File:Klatre08.jpg
This is an image of Dendromus mystacalis, the chestnut climbing mouse. Credit: Kenneth Worm.

"An E-box (Enhancer Box) is a DNA sequence which usually lies upstream of a gene in a promoter region."[1]

Enhancers

File:Gene enhancer.svg
The illustration shows a DNA enhancer near a gene.

"An enhancer is a short region of DNA that can be bound with proteins (namely, the trans-acting factors, much like a set of transcription factors) to enhance transcription levels of genes (hence the name) in a gene cluster. While enhancers are usually cis-acting, an enhancer does not need to be particularly close to the genes it acts on, and sometimes need not be located on the same chromosome.[2]

In eukaryotic cells, the chromatin complex of DNA is folded in such a way that, although the enhancer DNA is far from the gene in terms of the number of nucleotides, it is geometrically close to the promoter and gene.

An enhancer may be located upstream or downstream of the gene it regulates.

Enhancers do not act on the promoter region itself, but are bound by activator proteins. These activator proteins interact with the mediator complex, which recruits polymerase II and the general transcription factors, which then begin transcribing the genes. Enhancers can also be found within introns. An enhancer's orientation may even be reversed without affecting its function. Additionally, an enhancer may be excised and inserted elsewhere in the chromosome and still affect gene transcription.

Def. a "short region of DNA that can increase transcription of genes"[3] is called an enhancer.

Enhancer activity depends on two copies of the AGC box AGCCGCC in the promoters of several ethylene-responsive genes.[4]

Boxes

A "repeating sequence of nucleotides that forms a transcription or a regulatory signal"[5] is a box.

Immunoglobulin domains

The immunoglobulin domain is a type of protein domain that consists of a 2-layer sandwich of between 7 and 9 antiparallel β-strands arranged in two β-sheets with a Greek key topology.[6][7]

The E-box is a control element in immunoglobulin heavy-chain promoters.[8]

Consensus sequences

The consensus sequence for the E-box element is CANNTG, with a palindromic canonical sequence of CACGTG.[9]
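As a minimal sketch of how the CANNTG consensus could be searched for, the following Python snippet scans a sequence with a regular expression. The function name find_e_boxes, the example sequence, and the 1-based position convention are assumptions made for illustration; they are not taken from any program used in this resource.

```python
import re

# Consensus E-box: CA, any two nucleotides, then TG.
# The lookahead lets overlapping hexamers be reported separately.
E_BOX = re.compile(r"(?=(CA[ACGT]{2}TG))")


def find_e_boxes(seq):
    """Return (1-based start position, hexamer) for every CANNTG match in seq."""
    seq = seq.upper()
    return [(m.start() + 1, m.group(1)) for m in E_BOX.finditer(seq)]


# Made-up sequence for illustration only.
example = "TTCACGTGAACAGATGCC"
hits = find_e_boxes(example)  # [(3, 'CACGTG'), (11, 'CAGATG')]
```

The lookahead matters because two E-boxes can overlap: CACATGTG contains CACATG and, two positions later, CATGTG.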

Proximal promoters

"[T]he proximal sequence upstream of the gene that tends to contain primary regulatory elements" is a proximal promoter.[10]

It is "[a]pproximately 250 base pairs [or nucleotides, nts] upstream of the [transcription] start site".[10]

There may be an E box in the proximal promoter of some genes.[9]

Distal promoters

File:Enhancer Nucleotide Sequence.svg
Within this DNA sequence, protein(s) known as transcription factor(s) bind to the enhancer and increase the activity of the promoter. Credit: Jon Cheff.

An E-box usually lies within the distal promoter starting at or near -300 nts, the proximal promoter, or both.[9]

Hypotheses

  1. A1BG is not transcribed by an enhancer box.
  2. Existence of an enhancer box on either side of A1BG does not prove that it is actively used to transcribe A1BG.
  3. A1BG is not transcribed by a downstream enhancer box.

Enhancer box samplings

Regarding hypothesis 1:

A1BG has four possible transcription directions:

  1. on the negative strand from ZSCAN22 to A1BG,
  2. on the positive strand from ZSCAN22 to A1BG,
  3. on the negative strand from ZNF497 to A1BG, and
  4. on the positive strand from ZNF497 to A1BG.

For each transcription promoter that interacts directly with RNA polymerase II holoenzyme, the four possible consensus sequences need to be tested on the four possible transcription directions, even though some genes may only be transcribed from the negative strand in the 3'-direction on the transcribed strand.

The Basic programs (starting with SuccessablesE.bas) were written to compare nucleotide sequences with the sequences on either the template strand (-) or the coding strand (+) of the DNA, in the negative direction (-) or the positive direction (+). The programs searched for the consensus sequence and found:

  1. Negative strand, negative direction: 10, CAGATG at 4212, CATTTG at 3482, CAGATG at 2988, CACCTG at 2116, CAGTTG at 1513, CAGATG at 1224, CAAGTG at 1179, CACATG at 797, CAGATG at 481, CACATG at 324.
  2. Positive strand, negative direction: 21, CACTTG at 4011, CACCTG at 3969, CAGGTG at 3953, CAGATG at 3919, CAACTG at 3850, CAGATG at 3627, CAGATG at 3620, CACTTG at 3241, CACTTG at 3102, CACTTG at 2920, CACATG at 2667, CACTTG at 2579, CAGGTG at 2570, CACTTG at 2126, CAGGTG at 2079, CAAATG at 1579, CACCTG at 1172, CACCTG at 1130, CACCTG at 393, CATTTG at 364, CATATG at 41.
  3. Negative strand, positive direction: 26, CACTTG at 4015, CATGTG at 3958, CACATG at 3956, CATGTG at 3902, CAGCTG at 3777, CACATG at 3742, CACATG at 3707, CAGATG at 3475, CATCTG at 3404, CAGCTG at 3241, CAGGTG at 3149, CACCTG at 3046, CACCTG at 2568, CAAGTG at 2510, CACCTG at 2432, CAGCTG at 2404, CAGGTG at 2374, CACCTG at 2249, CAGGTG at 2127, CAGCTG at 2054, CACATG at 2031, CAGGTG at 1968, CACCTG at 958, CACCTG at 858, CACGTG at 570, CAGGTG at 196.
  4. Positive strand, positive direction: 11, CAAGTG at 4202, CACTTG at 3936, CACGTG at 3884, CAGGTG at 3086, CACGTG at 2961, CAGGTG at 2028, CAGGTG at 1843, CACGTG at 1219, CATGTG at 567, CACGTG at 547, CACCTG at 186.

The complement inverse is the same as the direct and the inverse is the same as the complement.
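The statement above can be checked by enumerating all sixteen CANNTG hexamers. The sketch below is an illustration only: the helper names complement and inverse are chosen here to mirror the wording of the text and are not taken from the Basic programs.

```python
from itertools import product

# Translation table for the Watson-Crick complement.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def complement(seq):
    """Complement each base without changing the reading order."""
    return seq.translate(COMPLEMENT)


def inverse(seq):
    """Reverse the reading order without complementing."""
    return seq[::-1]


# All sixteen hexamers matching CANNTG.
e_boxes = {"CA" + a + b + "TG" for a, b in product("ACGT", repeat=2)}

# The complement inverse of any E-box is again an E-box (same as the direct set).
complement_inverse_is_direct = {inverse(complement(h)) for h in e_boxes} == e_boxes

# The inverse set and the complement set are both GTNNAC.
inverse_is_complement = {inverse(h) for h in e_boxes} == {complement(h) for h in e_boxes}
```

Both comparisons evaluate to True, so a search for the direct pattern also covers the complement inverse, and a search for the inverse covers the complement.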

Enhancer box (4560-2846) UTRs

  1. Negative strand, negative direction: CAGATG at 4212, CATTTG at 3482, CAGATG at 2988.
  2. Positive strand, negative direction: CACTTG at 4011, CACCTG at 3969, CAGGTG at 3953, CAGATG at 3919, CAACTG at 3850, CAGATG at 3627, CAGATG at 3620, CACTTG at 3241, CACTTG at 3102, CACTTG at 2920.

Enhancer box negative direction (2811-2596) proximal promoters

  1. Positive strand, negative direction: CACATG at 2667.

Enhancer box positive direction (4265-4050) proximal promoters

  1. Positive strand, positive direction: CAAGTG at 4202.

Enhancer box negative direction (2596-1) distal promoters

  1. Negative strand, negative direction: CACCTG at 2116, CAGTTG at 1513, CAGATG at 1224, CAAGTG at 1179, CACATG at 797, CAGATG at 481, CACATG at 324.
  2. Positive strand, negative direction: CACTTG at 2579, CAGGTG at 2570, CACTTG at 2126, CAGGTG at 2079, CAAATG at 1579, CACCTG at 1172, CACCTG at 1130, CACCTG at 393, CATTTG at 364, CATATG at 41.

Enhancer box positive direction (4050-1) distal promoters

  1. Negative strand, positive direction: CACTTG at 4015, CATGTG at 3958, CACATG at 3956, CATGTG at 3902, CAGCTG at 3777, CACATG at 3742, CACATG at 3707, CAGATG at 3475, CATCTG at 3404, CAGCTG at 3241, CAGGTG at 3149, CACCTG at 3046, CACCTG at 2568, CAAGTG at 2510, CACCTG at 2432, CAGCTG at 2404, CAGGTG at 2374, CACCTG at 2249, CAGGTG at 2127, CAGCTG at 2054, CACATG at 2031, CAGGTG at 1968, CACCTG at 958, CACCTG at 858, CACGTG at 570, CAGGTG at 196.
  2. Positive strand, positive direction: CACTTG at 3936, CACGTG at 3884, CAGGTG at 3086, CACGTG at 2961, CAGGTG at 2028, CAGGTG at 1843, CACGTG at 1219, CATGTG at 567, CACGTG at 547, CACCTG at 186.
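The grouping of hits into UTR, proximal, and distal regions can be sketched as a simple range lookup. The boundaries below are read from the negative-direction section headings above; treating them as inclusive, and assigning a shared boundary such as 2596 to the first matching region, are assumptions. The name bin_hits is hypothetical.

```python
# Negative-direction regions as (low, high) positions, taken from the headings.
NEGATIVE_REGIONS = {
    "UTR": (2846, 4560),
    "proximal": (2596, 2811),
    "distal": (1, 2596),
}


def bin_hits(positions, regions):
    """Assign each position to the first region whose inclusive range contains it."""
    binned = {name: [] for name in regions}
    for pos in positions:
        for name, (low, high) in regions.items():
            if low <= pos <= high:
                binned[name].append(pos)
                break  # positions outside every range are simply not binned
    return binned


# Negative strand, negative direction positions from the sampling list.
negative_strand_hits = [4212, 3482, 2988, 2116, 1513, 1224, 1179, 797, 481, 324]
binned = bin_hits(negative_strand_hits, NEGATIVE_REGIONS)
counts = {name: len(found) for name, found in binned.items()}
# counts == {'UTR': 3, 'proximal': 0, 'distal': 7}
```

These counts agree with the three UTR hits and seven distal hits listed for the negative strand in the negative direction.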

Enhancer box random dataset samplings

  1. Er0: 12, CACGTG at 4343, CATGTG at 3956, CAATTG at 3880, CAACTG at 3533, CAACTG at 3467, CAGTTG at 3440, CAGGTG at 3398, CAATTG at 3202, CAATTG at 2233, CATATG at 2151, CATGTG at 1059, CACCTG at 999.
  2. Er1: 10, CAATTG at 4110, CATTTG at 4051, CATGTG at 3891, CAGTTG at 3388, CAAATG at 3372, CAACTG at 2752, CATATG at 2101, CATATG at 1605, CATCTG at 1131, CAAATG at 263.
  3. Er2: 15, CAGTTG at 4536, CAGCTG at 4212, CAAATG at 3829, CACATG at 3734, CAGTTG at 3245, CATTTG at 2627, CAAGTG at 2604, CAGGTG at 2198, CACCTG at 2124, CAGTTG at 1987, CAGATG at 1826, CATATG at 1757, CAAATG at 427, CATCTG at 203, CATGTG at 166.
  4. Er3: 13, CACGTG at 3769, CAGGTG at 3527, CATCTG at 3286, CAAGTG at 3239, CAGGTG at 2880, CACTTG at 2805, CATCTG at 1770, CATGTG at 1134, CAAATG at 1055, CAGATG at 298, CAAGTG at 266, CAAATG at 158, CATGTG at 93.
  5. Er4: 18, CAGTTG at 4419, CAGATG at 4202, CATTTG at 2905, CATCTG at 2584, CACGTG at 2287, CATGTG at 2243, CATGTG at 2224, CACCTG at 1958, CACCTG at 1913, CAAATG at 1809, CATATG at 1685, CAGTTG at 1578, CAAATG at 1363, CACTTG at 1162, CACCTG at 1123, CACATG at 909, CATTTG at 836, CACTTG at 646.
  6. Er5: 12, CACTTG at 3937, CACCTG at 3116, CATTTG at 2790, CAATTG at 2227, CAGGTG at 2213, CAGCTG at 2162, CATATG at 1779, CAATTG at 1579, CATTTG at 1204, CAACTG at 1180, CAGGTG at 697, CACGTG at 59.
  7. Er6: 16, CACCTG at 4358, CAACTG at 3821, CAGGTG at 3800, CAGTTG at 3329, CAAATG at 3078, CACGTG at 2905, CAGCTG at 2881, CAGTTG at 2638, CACATG at 2601, CAATTG at 2488, CATGTG at 2330, CACTTG at 2062, CAACTG at 1809, CATGTG at 1342, CACGTG at 654, CATATG at 245.
  8. Er7: 11, CATTTG at 4381, CACTTG at 3970, CACTTG at 3118, CAAATG at 3034, CATTTG at 2446, CAATTG at 2357, CACGTG at 1856, CACCTG at 1452, CATTTG at 1177, CATCTG at 1159, CAACTG at 20.
  9. Er8: 12, CACTTG at 4159, CAGATG at 3821, CAACTG at 3658, CAGATG at 2726, CATTTG at 2428, CAGTTG at 2300, CATCTG at 1711, CAAATG at 1376, CACTTG at 1254, CAAATG at 963, CAGGTG at 292, CAAATG at 213.
  10. Er9: 14, CAGATG at 4485, CACATG at 4071, CAGATG at 4027, CACTTG at 3646, CATTTG at 3619, CATTTG at 3477, CAAATG at 2077, CATGTG at 1975, CACCTG at 1898, CAACTG at 1697, CATTTG at 1448, CACGTG at 1187, CACCTG at 922, CAACTG at 121.
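For orientation, the number of CANNTG matches expected by chance can be estimated with a short calculation. This is only a sketch under stated assumptions: a sequence length of 4560 nts (taken from the region headings) and independent, equally frequent bases. How the random datasets here were actually generated is not stated, so the figure is not a description of them.

```python
def expected_motif_count(length, fixed_positions=4, motif_length=6, base_probability=0.25):
    """Expected number of windows matching a motif with the given number of fixed bases."""
    windows = length - motif_length + 1
    return windows * base_probability ** fixed_positions


# Counts reported for Er0 through Er9 in the list above.
observed = [12, 10, 15, 13, 18, 12, 16, 11, 12, 14]
observed_mean = sum(observed) / len(observed)

# CANNTG fixes four of six positions: C, A, T, G.
expected = expected_motif_count(4560)
```

Under these assumptions the expectation is about 17.8 matches per dataset, while the ten listed counts average 13.3; a different sequence length or base composition in the random datasets would change the expected value.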

Enhancerr arbitrary (evens) (4560-2846) UTRs

  1. Er0: CACGTG at 4343, CATGTG at 3956, CAATTG at 3880, CAACTG at 3533, CAACTG at 3467, CAGTTG at 3440, CAGGTG at 3398, CAATTG at 3202.
  2. Er2: CAGTTG at 4536, CAGCTG at 4212, CAAATG at 3829, CACATG at 3734, CAGTTG at 3245.
  3. Er4: CAGTTG at 4419, CAGATG at 4202, CATTTG at 2905.
  4. Er6: CACCTG at 4358, CAACTG at 3821, CAGGTG at 3800, CAGTTG at 3329, CAAATG at 3078, CACGTG at 2905, CAGCTG at 2881.
  5. Er8: CACTTG at 4159, CAGATG at 3821, CAACTG at 3658.

Enhancerr alternate (odds) (4560-2846) UTRs

  1. Er1: CAATTG at 4110, CATTTG at 4051, CATGTG at 3891, CAGTTG at 3388, CAAATG at 3372.
  2. Er3: CACGTG at 3769, CAGGTG at 3527, CATCTG at 3286, CAAGTG at 3239, CAGGTG at 2880.
  3. Er5: CACTTG at 3937, CACCTG at 3116.
  4. Er7: CATTTG at 4381, CACTTG at 3970, CACTTG at 3118, CAAATG at 3034.
  5. Er9: CAGATG at 4485, CACATG at 4071, CAGATG at 4027, CACTTG at 3646, CATTTG at 3619, CATTTG at 3477.

Enhancerr arbitrary positive direction (odds) (4445-4265) core promoters

  1. Er7: CATTTG at 4381.

Enhancerr alternate positive direction (evens) (4445-4265) core promoters

  1. Er0: CACGTG at 4343.
  2. Er4: CAGTTG at 4419.
  3. Er6: CACCTG at 4358.

Enhancerr arbitrary negative direction (evens) (2811-2596) proximal promoters

  1. Er2: CATTTG at 2627, CAAGTG at 2604.
  2. Er6: CAGTTG at 2638, CACATG at 2601.
  3. Er8: CAGATG at 2726.

Enhancerr alternate negative direction (odds) (2811-2596) proximal promoters

  1. Er1: CAACTG at 2752.
  2. Er3: CACTTG at 2805.
  3. Er5: CATTTG at 2790.

Enhancerr arbitrary positive direction (odds) (4265-4050) proximal promoters

  1. Er1: CAATTG at 4110, CATTTG at 4051.
  2. Er9: CACATG at 4071.

Enhancerr alternate positive direction (evens) (4265-4050) proximal promoters

  1. Er2: CAGCTG at 4212.
  2. Er4: CAGATG at 4202.
  3. Er8: CACTTG at 4159.

Enhancerr arbitrary negative direction (evens) (2596-1) distal promoters

  1. Er0: CAATTG at 2233, CATATG at 2151, CATGTG at 1059, CACCTG at 999.
  2. Er2: CAGGTG at 2198, CACCTG at 2124, CAGTTG at 1987, CAGATG at 1826, CATATG at 1757, CAAATG at 427, CATCTG at 203, CATGTG at 166.
  3. Er4: CATCTG at 2584, CACGTG at 2287, CATGTG at 2243, CATGTG at 2224, CACCTG at 1958, CACCTG at 1913, CAAATG at 1809, CATATG at 1685, CAGTTG at 1578, CAAATG at 1363, CACTTG at 1162, CACCTG at 1123, CACATG at 909, CATTTG at 836, CACTTG at 646.
  4. Er6: CAATTG at 2488, CATGTG at 2330, CACTTG at 2062, CAACTG at 1809, CATGTG at 1342, CACGTG at 654, CATATG at 245.
  5. Er8: CATTTG at 2428, CAGTTG at 2300, CATCTG at 1711, CAAATG at 1376, CACTTG at 1254, CAAATG at 963, CAGGTG at 292, CAAATG at 213.

Enhancerr alternate negative direction (odds) (2596-1) distal promoters

  1. Er1: CATATG at 2101, CATATG at 1605, CATCTG at 1131, CAAATG at 263.
  2. Er3: CATCTG at 1770, CATGTG at 1134, CAAATG at 1055, CAGATG at 298, CAAGTG at 266, CAAATG at 158, CATGTG at 93.
  3. Er5: CAATTG at 2227, CAGGTG at 2213, CAGCTG at 2162, CATATG at 1779, CAATTG at 1579, CATTTG at 1204, CAACTG at 1180, CAGGTG at 697, CACGTG at 59.
  4. Er7: CATTTG at 2446, CAATTG at 2357, CACGTG at 1856, CACCTG at 1452, CATTTG at 1177, CATCTG at 1159, CAACTG at 20.
  5. Er9: CAAATG at 2077, CATGTG at 1975, CACCTG at 1898, CAACTG at 1697, CATTTG at 1448, CACGTG at 1187, CACCTG at 922, CAACTG at 121.

Enhancerr arbitrary positive direction (odds) (4050-1) distal promoters

  1. Er1: CATGTG at 3891, CAGTTG at 3388, CAAATG at 3372, CAACTG at 2752, CATATG at 2101, CATATG at 1605, CATCTG at 1131, CAAATG at 263.
  2. Er3: CACGTG at 3769, CAGGTG at 3527, CATCTG at 3286, CAAGTG at 3239, CAGGTG at 2880, CACTTG at 2805, CATCTG at 1770, CATGTG at 1134, CAAATG at 1055, CAGATG at 298, CAAGTG at 266, CAAATG at 158, CATGTG at 93.
  3. Er5: CACTTG at 3937, CACCTG at 3116, CATTTG at 2790, CAATTG at 2227, CAGGTG at 2213, CAGCTG at 2162, CATATG at 1779, CAATTG at 1579, CATTTG at 1204, CAACTG at 1180, CAGGTG at 697, CACGTG at 59.
  4. Er7: CACTTG at 3970, CACTTG at 3118, CAAATG at 3034, CATTTG at 2446, CAATTG at 2357, CACGTG at 1856, CACCTG at 1452, CATTTG at 1177, CATCTG at 1159, CAACTG at 20.
  5. Er9: CAGATG at 4027, CACTTG at 3646, CATTTG at 3619, CATTTG at 3477, CAAATG at 2077, CATGTG at 1975, CACCTG at 1898, CAACTG at 1697, CATTTG at 1448, CACGTG at 1187, CACCTG at 922, CAACTG at 121.

Enhancerr alternate positive direction (evens) (4050-1) distal promoters

  1. Er0: CATGTG at 3956, CAATTG at 3880, CAACTG at 3533, CAACTG at 3467, CAGTTG at 3440, CAGGTG at 3398, CAATTG at 3202, CAATTG at 2233, CATATG at 2151, CATGTG at 1059, CACCTG at 999.
  2. Er2: CAAATG at 3829, CACATG at 3734, CAGTTG at 3245, CATTTG at 2627, CAAGTG at 2604, CAGGTG at 2198, CACCTG at 2124, CAGTTG at 1987, CAGATG at 1826, CATATG at 1757, CAAATG at 427, CATCTG at 203, CATGTG at 166.
  3. Er4: CATTTG at 2905, CATCTG at 2584, CACGTG at 2287, CATGTG at 2243, CATGTG at 2224, CACCTG at 1958, CACCTG at 1913, CAAATG at 1809, CATATG at 1685, CAGTTG at 1578, CAAATG at 1363, CACTTG at 1162, CACCTG at 1123, CACATG at 909, CATTTG at 836, CACTTG at 646.
  4. Er6: CAACTG at 3821, CAGGTG at 3800, CAGTTG at 3329, CAAATG at 3078, CACGTG at 2905, CAGCTG at 2881, CAGTTG at 2638, CACATG at 2601, CAATTG at 2488, CATGTG at 2330, CACTTG at 2062, CAACTG at 1809, CATGTG at 1342, CACGTG at 654, CATATG at 245.
  5. Er8: CAGATG at 3821, CAACTG at 3658, CAGATG at 2726, CATTTG at 2428, CAGTTG at 2300, CATCTG at 1711, CAAATG at 1376, CACTTG at 1254, CAAATG at 963, CAGGTG at 292, CAAATG at 213.

Enhancer box analysis and results

The consensus sequence for the Enhancer box element is CANNTG, with a palindromic canonical sequence of CACGTG.[9]

Reals or randoms  Promoters  Direction           Numbers  Strands  Occurrences  Averages (± 0.1)
Reals             UTR        negative            13       2        6.5          6.5 ± 3.5 (--3,+-10)
Randoms           UTR        arbitrary negative  26       10       2.6          2.4
Randoms           UTR        alternate negative  22       10       2.2          2.4
Reals             Core       negative            0        2        0            0
Randoms           Core       arbitrary negative  0        10       0            0
Randoms           Core       alternate negative  0        10       0            0
Reals             Core       positive            0        2        0            0
Randoms           Core       arbitrary positive  1        10       0.1          0.2
Randoms           Core       alternate positive  3        10       0.3          0.2
Reals             Proximal   negative            1        2        0.5          0.5 ± 0.5 (--0,+-1)
Randoms           Proximal   arbitrary negative  5        10       0.5          0.4
Randoms           Proximal   alternate negative  3        10       0.3          0.4
Reals             Proximal   positive            1        2        0.5          0.5 ± 0.5 (-+0,++1)
Randoms           Proximal   arbitrary positive  3        10       0.3          0.3
Randoms           Proximal   alternate positive  3        10       0.3          0.3
Reals             Distal     negative            17       2        8.5          8.5 ± 1.5 (--7,+-10)
Randoms           Distal     arbitrary negative  42       10       4.2          3.85
Randoms           Distal     alternate negative  35       10       3.5          3.85
Reals             Distal     positive            36       2        18           18 ± 8 (-+26,++10)
Randoms           Distal     arbitrary positive  55       10       5.5          6.05
Randoms           Distal     alternate positive  66       10       6.6          6.05
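The arithmetic behind the Occurrences and Averages columns can be sketched as follows. The reading of "±" as half the range between the two strand counts is an inference from the tabulated values, not a stated definition, and the function names are hypothetical.

```python
def mean_and_half_range(counts):
    """Mean of the per-strand counts and half the spread between them."""
    mean = sum(counts) / len(counts)
    half_range = (max(counts) - min(counts)) / 2
    return mean, half_range


def random_average(arbitrary_total, alternate_total, datasets=10):
    """Average the arbitrary and alternate occurrences, each total divided by ten."""
    arbitrary = arbitrary_total / datasets
    alternate = alternate_total / datasets
    return (arbitrary + alternate) / 2


# Reals, distal, positive direction: 26 on the negative strand, 10 on the positive.
distal_positive_reals = mean_and_half_range([26, 10])  # (18.0, 8.0)

# Randoms, distal, positive direction: 55 arbitrary and 66 alternate hits.
distal_positive_randoms = random_average(55, 66)  # about 6.05
```

The same two functions reproduce the other rows, for example 6.5 ± 3.5 for the real UTR counts of 3 and 10.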

Comparison:

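In the UTR and the distal promoters, real enhancer boxes occur more often per strand than in the random datasets (for example, 18 versus about 6 in the positive-direction distal promoters). In the proximal promoters the real and random occurrences are comparable (0.5 versus 0.3 to 0.4), and no real enhancer boxes fall in the core promoters. This suggests that the enhancer box consensus sequences are likely active or activable.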

MITF E-boxes

Microphthalmia-associated transcription factor (MITF) is a basic helix-loop-helix leucine zipper transcription factor involved in lineage-specific pathway regulation of many types of cells including melanocytes, osteoclasts, and mast cells.[11]

In human subjects, because it is known that MITF controls the expression of various genes that are essential for normal melanin synthesis in melanocytes, mutations of MITF can lead to diseases such as melanoma, Waardenburg syndrome, and Tietz syndrome.[12]

MITF recognizes the E-box (CAYRTG) and M-box (TCAYRTG or CAYRTGA) sequences in the promoter regions of target genes.[13]

MITF E-box (CAYRTG) samplings

(CAYRTG) = CA(C/T)(A/G)TG.
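The expansion above follows the standard IUPAC ambiguity codes (Y = C or T, R = A or G, N = any base). As a hedged sketch, the codes can be translated into a regular expression and used to filter a list of E-box hexamers; the dictionary covers only the codes needed here, and the names iupac_to_regex and MITF_E_BOX are illustrative.

```python
import re

# Subset of the IUPAC nucleotide codes used by the motifs in this section.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "[AG]",    # purine
    "Y": "[CT]",    # pyrimidine
    "N": "[ACGT]",  # any base
}


def iupac_to_regex(motif):
    """Translate an IUPAC motif such as CAYRTG into a regex pattern string."""
    return "".join(IUPAC[base] for base in motif.upper())


MITF_E_BOX = re.compile(iupac_to_regex("CAYRTG"))  # pattern CA[CT][AG]TG

# A few of the CANNTG hexamers found on the negative strand, negative direction.
hexamers = ["CAGATG", "CATTTG", "CACCTG", "CAGTTG", "CAAGTG", "CACATG"]
mitf_hits = [h for h in hexamers if MITF_E_BOX.fullmatch(h)]  # ['CACATG']
```

Only CACATG passes the filter, which is consistent with the two CACATG hits reported below for the negative strand in the negative direction. The M-box patterns can be built the same way, for example iupac_to_regex("TCAYRTG").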

  1. Negative strand, negative direction: 2, CACATG at 797, CACATG at 324.
  2. Positive strand, negative direction: 2, CACATG at 2667, CATATG at 41.
  3. Negative strand, positive direction: 7, CATGTG at 3958, CACATG at 3956, CATGTG at 3902, CACATG at 3742, CACATG at 3707, CACATG at 2031, CACGTG at 570.
  4. Positive strand, positive direction: 5, CACGTG at 3884, CACGTG at 2961, CACGTG at 1219, CATGTG at 567, CACGTG at 547.

MITF E-box negative direction (2811-2596) proximal promoters

  1. Positive strand, negative direction: CACATG at 2667.

MITF E-box negative direction (2596-1) distal promoters

  1. Negative strand, negative direction: CACATG at 797, CACATG at 324.
  2. Positive strand, negative direction: CATATG at 41.

MITF E-box positive direction (4050-1) distal promoters

  1. Negative strand, positive direction: CATGTG at 3958, CACATG at 3956, CATGTG at 3902, CACATG at 3742, CACATG at 3707, CACATG at 2031, CACGTG at 570.
  2. Positive strand, positive direction: CACGTG at 3884, CACGTG at 2961, CACGTG at 1219, CATGTG at 567, CACGTG at 547.

MITF E-box random dataset samplings

  1. Er0: 4, CACGTG at 4343, CATGTG at 3956, CATATG at 2151, CATGTG at 1059.
  2. Er1: 3, CATGTG at 3891, CATATG at 2101, CATATG at 1605.
  3. Er2: 3, CACATG at 3734, CATATG at 1757, CATGTG at 166.
  4. Er3: 3, CACGTG at 3769, CATGTG at 1134, CATGTG at 93.
  5. Er4: 5, CACGTG at 2287, CATGTG at 2243, CATGTG at 2224, CATATG at 1685, CACATG at 909.
  6. Er5: 2, CATATG at 1779, CACGTG at 59.
  7. Er6: 6, CACGTG at 2905, CACATG at 2601, CATGTG at 2330, CATGTG at 1342, CACGTG at 654, CATATG at 245.
  8. Er7: 1, CACGTG at 1856.
  9. Er8: 0.
  10. Er9: 3, CACATG at 4071, CATGTG at 1975, CACGTG at 1187.

MITF E-boxr arbitrary (evens) (4560-2846) UTRs

  1. Er0: CACGTG at 4343, CATGTG at 3956.
  2. Er2: CACATG at 3734.
  3. Er6: CACGTG at 2905.

MITF E-boxr alternate (odds) (4560-2846) UTRs

  1. Er1: CATGTG at 3891.
  2. Er3: CACGTG at 3769.
  3. Er9: CACATG at 4071.

MITF E-boxr arbitrary negative direction (evens) (2811-2596) proximal promoters

  1. Er6: CACATG at 2601.

MITF E-boxr arbitrary positive direction (odds) (4265-4050) proximal promoters

  1. Er9: CACATG at 4071.

MITF E-boxr arbitrary negative direction (evens) (2596-1) distal promoters

  1. Er0: CATATG at 2151, CATGTG at 1059.
  2. Er2: CATATG at 1757, CATGTG at 166.
  3. Er4: CACGTG at 2287, CATGTG at 2243, CATGTG at 2224, CATATG at 1685, CACATG at 909.
  4. Er6: CATGTG at 2330, CATGTG at 1342, CACGTG at 654, CATATG at 245.

MITF E-boxr alternate negative direction (odds) (2596-1) distal promoters

  1. Er1: CATATG at 2101, CATATG at 1605.
  2. Er3: CATGTG at 1134, CATGTG at 93.
  3. Er5: CATATG at 1779, CACGTG at 59.
  4. Er7: CACGTG at 1856.
  5. Er9: CATGTG at 1975, CACGTG at 1187.

MITF E-boxr arbitrary positive direction (odds) (4050-1) distal promoters

  1. Er1: CATGTG at 3891, CATATG at 2101, CATATG at 1605.
  2. Er3: CACGTG at 3769, CATGTG at 1134, CATGTG at 93.
  3. Er5: CATATG at 1779, CACGTG at 59.
  4. Er7: CACGTG at 1856.
  5. Er9: CATGTG at 1975, CACGTG at 1187.

MITF E-boxr alternate positive direction (evens) (4050-1) distal promoters

  1. Er0: CATGTG at 3956, CATATG at 2151, CATGTG at 1059.
  2. Er2: CACATG at 3734, CATATG at 1757, CATGTG at 166.
  3. Er4: CACGTG at 2287, CATGTG at 2243, CATGTG at 2224, CATATG at 1685, CACATG at 909.
  4. Er6: CACGTG at 2905, CACATG at 2601, CATGTG at 2330, CATGTG at 1342, CACGTG at 654, CATATG at 245.

MITF E-box analysis and results

MITF recognizes E-box (CAYRTG) and M-box (TCAYRTG or CAYRTGA) sequences in the promoter regions of target genes.[13]

Reals or randoms  Promoters  Direction           Numbers  Strands  Occurrences  Averages (± 0.1)
Reals             UTR        negative            0        2        0            0
Randoms           UTR        arbitrary negative  4        10       0.4          0.35
Randoms           UTR        alternate negative  3        10       0.3          0.35
Reals             Core       negative            0        2        0            0
Randoms           Core       arbitrary negative  0        10       0            0
Randoms           Core       alternate negative  0        10       0            0
Reals             Core       positive            0        2        0            0
Randoms           Core       arbitrary positive  0        10       0            0
Randoms           Core       alternate positive  0        10       0            0
Reals             Proximal   negative            1        2        0.5          0.5 ± 0.5 (--0,+-1)
Randoms           Proximal   arbitrary negative  1        10       0.1          0.05
Randoms           Proximal   alternate negative  0        10       0            0.05
Reals             Proximal   positive            0        2        0            0
Randoms           Proximal   arbitrary positive  1        10       0.1          0.05
Randoms           Proximal   alternate positive  0        10       0            0.05
Reals             Distal     negative            3        2        1.5          1.5 ± 0.5 (--2,+-1)
Randoms           Distal     arbitrary negative  13       10       1.3          1.1
Randoms           Distal     alternate negative  9        10       0.9          1.1
Reals             Distal     positive            12       2        6.0          6.0 ± 1.0 (-+7,++5)
Randoms           Distal     arbitrary positive  11       10       1.1          1.4
Randoms           Distal     alternate positive  17       10       1.7          1.4

Comparison:

The occurrences of real MITF E-boxes in the negative-direction proximal promoters and the positive-direction distal promoters are greater than the randoms, while the negative-direction distals overlap the high randoms. This suggests that the real MITF E-boxes are likely active or activable.

Transcribed enhancer boxes

"MYC is a basic helix-loop-helix transcription factor, evolutionarily conserved in all vertebrates with a considerable amount of sequence similarity (Atchley & Fitch, 1995). It binds to thousands of promoters in mammalian cells as MYC-MAX heterodimer (Blackwood & Eisenman, 1991; C. Y. Lin et al., 2012). In particular it binds the motif CACGTG of the enhancer box (E-box) in the core promoter of active genes. Depending on the target gene, MYC can act as transcriptional activator or repressor, and, can affect transcription at both initiation and elongation steps (Rahl et al., 2010)."[14]

"MYC mediates the transcriptional response of growth-factors stimulation. Importantly, MYC does not only regulate the expression of mRNA(s), it also regulates ribosomal and tRNA genes, transcribed by the RNA Pol I and RNA Pol III respectively (Campbell & White, 2014; Dai, Sun, & Lu, 2010; Mitchell et al., 2015). Amongst the major gene ontology categories of protein-coding genes under the control of MYC there are: ribosome biogenesis, apoptosis, cell adhesion, cell size, angiogenesis and metabolic pathways (Nieminen, Partanen, & Klefstrom, 2007; Peterson & Ayer, 2011; A. M. Singh & Dalton, 2009; Uslu et al., 2014; van Riggelen, Yetil, & Felsher, 2010)."[14]

"The ATA box [AAATAT], GC box [GGCGGG], CArG box [CCTATTATGCG], [two E boxes CAGTTG] and M-CAT [CATTCCT] consensus sequences are [described from the mouse dystrophin promoter]."[15]

"The E box [ enhancer box ] sites that are most important are those of the E2 box class (GCAGXTGG/T). Two E2 box sites are present in the immunoglobulin heavy chain gene enhancer [...] and one is present in the kappa enhancer, designated KE2 [29-31]."[16]

"The developmental regulation of Ig gene expression is dependent on various sequences in the Ig enhancer. One class of such sequence elements is the E boxes. They share as a consensus sequence NNCANNTGNN. The E-box sites were first identified by dimethylsulfate protection experiments (6, 12). Factors were found to protect certain sequences from methylation in the Ig heavy- and light-chain enhancer in B cells but not in non-B cells (6,12). That the E-box elements are critical for B-cell-specific gene expression became evident from mutational analysis. Mutation of E-box sites caused a significant decrease in Ig transcription (18, 21). The most dramatic impact on Ig expression was found in mutations of elements that contain an E2 box (G/ACAGNTGT/G) (21). The E2 boxes are particularly interesting because they are also present in muscle-and pancreas-specific enhancers (3,4,32). Mutation of the E2-box elements present in these enhancers revealed the crucial role of these elements in regulating muscle- and pancreas-specific genes (16, 22, 26, 27, 32)."[17]

"The two E2 boxes in the mouse and human E-cadherin promoter sequences were demonstrated to play a crucial role in the epithelial-specific expression of E-cadherin Behrens et al. 1991, Giroldi et al. 1997. Mutation of these sequence elements results in upregulation of the E-cadherin promoter in dedifferentiated cancer cells, whereas the wild-type promoter shows low activity in such cells. Recently, it was shown that the zinc finger transcriptional repressor Snail can downregulate E-cadherin by binding to the E boxes in the E-cadherin promoter Batlle et al. 2000, Cano et al. 2000. Human Snail belongs to a family of zinc finger proteins, which contain four or five zinc finger domains of the C2H2 type at their C-terminal end. These zinc fingers bind to the CANNTG sequence in E box motifs."[18]

The CArG boxes occur between -400 and -200 nts, between the E boxes and the TCE element.[19]

The "isolated mouse chromogranin B promoter [specifically] the proximal chromogranin B promoter (from −216 to −91 bp); [...] contains an E box (at [−206 bp]CACCTG[−201 bp]), four G/C-rich regions (at [−196 bp]CCCCGC[−191 bp], [−134 bp]CCGCCCGC[−127 bp], [−125 bp]GGCGCCGCC[−117 bp], and [−115 bp]CGGGGC[−110 bp]), and a cAMP response element (CRE; at [−102 bp]TGACGTCA[−95 bp]). A 60-bp core promoter region, defined by an internal deletion from −134 to −74 bp upstream of the cap site and spanning the CRE and three G/C-rich regions, directed tissue-specific expression of the gene. The CRE motif directed cell type-specific expression of the chromogranin B gene in neurons, whereas three of the G/C-rich regions played a crucial role in neuroendocrine cells. Both the endogenous chromogranin B gene and the transfected chromogranin B promoter were induced by preganglionic secretory stimuli (pituitary adenylyl cyclase-activating polypeptide, vasoactive intestinal peptide, or a nicotinic cholinergic agonist), establishing stimulus-transcription coupling for this promoter. The adenylyl cyclase activator forskolin, nerve growth factor, and retinoic acid also activated the chromogranin B gene. Secretagogue-inducible expression of chromogranin B also mapped onto the proximal promoter; inducible expression was entirely lost upon internal deletion of the 60-bp core (from −134 to −74 bp). [...] CRE and G/C-rich domains are crucial determinants of both cell type-specific and secretagogue-inducible expression of the chromogranin B gene."[20]

"TCF4 is a member of the family of basic helix-loop-helix (bHLH) TFs. These proteins have a DNA-binding domain and form homo- or heterodimers to regulate gene expression. The dimers of TFs can exert different functions depending on their dimerization partners (Jones, 2004). Within the family of bHLH TFs, TCF4 belongs to the subgroup of E-proteins, which share the recognition of the pseudo-palindromic Ephrussi box (E-box) DNA element (Massari & Murre, 2000). TCF4 can dimerize with numerous other TFs, and interactions of TCF4 with ATHO1 (MATH1), HASH1 and NEUROD1 have been described in the brain (Flora, Garcia, Thaller, & Zoghbi, 2007; Navarrete et al., 2013)."[21]

Gene ID: 4609 is MYC MYC proto-oncogene, bHLH transcription factor on 8q24.21: "This gene is a proto-oncogene and encodes a nuclear phosphoprotein that plays a role in cell cycle progression, apoptosis and cellular transformation. The encoded protein forms a heterodimer with the related transcription factor MAX. This complex binds to the E box DNA consensus sequence and regulates the transcription of specific target genes. Amplification of this gene is frequently observed in numerous human cancers. Translocations involving this gene are associated with Burkitt lymphoma and multiple myeloma in human patients. There is evidence to show that translation initiates both from an upstream, in-frame non-AUG (CUG) and a downstream AUG start site, resulting in the production of two isoforms with distinct N-termini."[22]

  1. NP_001341799.1 myc proto-oncogene protein isoform 2.[22]
  2. NP_002458.2 myc proto-oncogene protein isoform 1.[22]

Gene ID: 6925 is TCF4 transcription factor 4 on 18q21.2: "This gene encodes transcription factor 4, a basic helix-loop-helix transcription factor. The encoded protein recognizes an Ephrussi-box ('E-box') binding site ('CANNTG') - a motif first identified in immunoglobulin enhancers. This gene is broadly expressed, and may play an important role in nervous system development. Defects in this gene are a cause of Pitt-Hopkins syndrome. In addition, an intronic CTG repeat normally numbering 10-37 repeat units can expand to >50 repeat units and cause Fuchs endothelial corneal dystrophy. Multiple alternatively spliced transcript variants that encode different proteins have been described."[23]

  1. NP_001077431.1 transcription factor 4 isoform a: "Transcript Variant: This variant (1) differs in the 5' UTR and coding sequence compared to variant 3. The resulting isoform (a, also known as TCF4-B+) is shorter at the N-terminus compared to isoform c."[23]
  2. NP_001230155.2 transcription factor 4 isoform c: "Transcript Variant: This variant (3) encodes the longest isoform (c)."[23]
  3. NP_001230156.1 transcription factor 4 isoform d: "Transcript Variant: This variant (4) differs in the 5' UTR and coding sequence compared to variant 3. The resulting isoform (d) is shorter at the N-terminus compared to isoform c."[23]
  4. NP_001230157.1 transcription factor 4 isoform e: "Transcript Variant: This variant (5) differs in the 5' UTR and coding sequence and uses an alternate in-frame splice site at the 5' end of an exon compared to variant 3. The resulting isoform (e) is shorter at the N-terminus and contains an alternate internal segment compared to isoform c."[23]
  5. NP_001230159.1 transcription factor 4 isoform f: "Transcript Variant: This variant (6) differs in the 5' UTR and coding sequence and uses an alternate in-frame splice site at the 3' end of an exon compared to variant 3. The resulting isoform (f, also known as TCF4-E-) has a shorter and distinct N-terminus and lacks an alternate internal segment compared to isoform c."[23]
  6. NP_001230160.1 transcription factor 4 isoform g: "Transcript Variant: This variant (7) differs in the 5' UTR and coding sequence and uses an alternate in-frame splice site at the 3' end of an exon compared to variant 3. The resulting isoform (g) has a shorter and distinct N-terminus and lacks an alternate internal segment compared to isoform c."[23]
  7. NP_001230161.1 transcription factor 4 isoform h: "Transcript Variant: This variant (8) differs in the 5' UTR and coding sequence and uses an alternate in-frame splice site at the 5' end of an exon compared to variant 3. The resulting isoform (h) has a shorter and distinct N-terminus and contains an alternate internal segment compared to isoform c."[23]
  8. NP_001230162.1 transcription factor 4 isoform i: "Transcript Variant: This variant (9) differs in the 5' UTR and coding sequence and uses an alternate in-frame splice site at the 3' end of an exon compared to variant 3. The resulting isoform (i) is shorter at the N-terminus and lacks an alternate internal segment compared to isoform c."[23]
  9. NP_001230163.1 transcription factor 4 isoform j: "Transcript Variant: This variant (10) differs in the 5' UTR and coding sequence compared to variant 3. The resulting isoform (j, also known as TCF4-A+) has a shorter and distinct N-terminus compared to isoform c."[23]
  10. NP_001230164.1 transcription factor 4 isoform k: "Transcript Variant: This variant (11) differs in the 5' UTR and coding sequence and uses an alternate in-frame splice site at the 3' end of an exon compared to variant 3. The resulting isoform (k, also known as TCF4-A-) has a shorter and distinct N-terminus and lacks an alternate internal segment compared to isoform c."[23]
  11. NP_001230165.1 transcription factor 4 isoform l: "Transcript Variant: This variant (12) differs in the 5' UTR and coding sequence and uses an alternate in-frame splice site at the 3' end of an exon compared to variant 3. The resulting isoform (l) has a shorter and distinct N-terminus and lacks an alternate internal segment compared to isoform c."[23]
  12. NP_001293136.1 transcription factor 4 isoform m: "Transcript Variant: This variant (13) differs in the 5' UTR and coding sequence, and uses an alternate in-frame splice site in the 3' coding region, compared to variant 3. The encoded isoform (m) has a shorter N-terminus compared to isoform c."[23]
  13. NP_001293137.1 transcription factor 4 isoform n: "Transcript Variant: This variant (14) differs in the 5' UTR and coding sequence, and uses two alternate in-frame splice sites in the coding region, compared to variant 3. The encoded isoform (n) has a shorter and distinct N-terminus compared to isoform c."[23]
  14. NP_001317533.1 transcription factor 4 isoform o: "Transcript Variant: This variant (15) differs in the 5' UTR, lacks a portion of the 5' coding region, initiates translation at a downstream start codon, and uses an alternate in-frame splice site in the central coding region, compared to variant 3. The resulting isoform (o) is shorter at the N-terminus and lacks an internal aa compared to isoform c."[23]
  15. NP_001317534.1 transcription factor 4 isoform p: "Transcript Variant: This variant (16) differs in the 5' UTR, lacks a portion of the 5' coding region, and initiates translation at a downstream start codon, compared to variant 3. The resulting isoform (p) is shorter at the N-terminus compared to isoform c. Variants 16 and 19 encode the same isoform (p)."[23]
  16. NP_001335140.1 transcription factor 4 isoform q: "Transcript Variant: This variant (17) contains an alternate exon in the 5' UTR, lacks a portion of the 5' coding region, and initiates translation at an alternate start codon, compared to variant 3. The resulting isoform (q) is shorter at the N-terminus compared to isoform c."[23]
  17. NP_001335141.1 transcription factor 4 isoform i: "Transcript Variant: This variant (18) differs in the 5' UTR, lacks a portion of the 5' coding region, initiates translation at a downstream start codon, and uses an alternate in-frame splice site in the 3' coding region, compared to variant 3. The resulting isoform (i) is shorter at the N-terminus and lacks a small internal segment compared to isoform c. Variants 9 and 18 encode the same isoform (i)."[23]
  18. NP_001335142.1 transcription factor 4 isoform p: "Transcript Variant: This variant (19) differs in the 5' UTR, lacks a portion of the 5' coding region, and initiates translation at a downstream start codon, compared to variant 3. The resulting isoform (p) is shorter at the N-terminus compared to isoform c. Variants 16 and 19 encode the same isoform (p)."[23]
  19. NP_001335143.1 transcription factor 4 isoform r: "Transcript Variant: This variant (20) contains an alternate exon in the 5' UTR, lacks a portion of the 5' coding region, initiates translation at an alternate start codon, and uses two alternate in-frame splice sites, compared to variant 3. The resulting isoform (r) is shorter at the N-terminus and lacks several internal amino acids compared to isoform c."[23]
  20. NP_001335144.1 transcription factor 4 isoform s: "Transcript Variant: This variant (21) differs in the 5' UTR, lacks a portion of the 5' coding region, and initiates translation at a downstream start codon, compared to variant 3. The resulting isoform (s) is shorter at the N-terminus compared to isoform c."[23]
  21. NP_001335145.1 transcription factor 4 isoform t: "Transcript Variant: This variant (22) contains an alternate exon in the 5' UTR, lacks a portion of the 5' coding region, and initiates translation at an alternate start codon, compared to variant 3. The resulting isoform (t) is shorter at the N-terminus compared to isoform c."[23]
  22. NP_001335146.1 transcription factor 4 isoform d: "Transcript Variant: This variant (23) differs in the 5' UTR, lacks a portion of the 5' coding region, and initiates translation at a downstream start codon, compared to variant 3. The resulting isoform (d) is shorter at the N-terminus compared to isoform c. Variants 4, 23 and 24 all encode the same isoform (d)."[23]
  23. NP_001335147.1 transcription factor 4 isoform d: "Transcript Variant: This variant (24) differs in the 5' UTR, lacks a portion of the 5' coding region, and initiates translation at a downstream start codon, compared to variant 3. The resulting isoform (d) is shorter at the N-terminus compared to isoform c. Variants 4, 23 and 24 all encode the same isoform (d)."[23]
  24. NP_001335148.1 transcription factor 4 isoform m: "Transcript Variant: This variant (25) differs in the 5' UTR, lacks a portion of the 5' coding region, initiates translation at a downstream start codon, and uses an alternate in-frame splice site in the 3' coding region compared to variant 3. The resulting isoform (m) is shorter at the N-terminus and lacks an alternate internal segment compared to isoform c. Variants 13 and 25 encode the same isoform (m)."[23]
  25. NP_001335149.1 transcription factor 4 isoform u: "Transcript Variant: This variant (26) differs in the 5' UTR, lacks a portion of the 5' coding region, initiates translation at a downstream start codon, and uses two alternate in-frame splice sites compared to variant 3. The resulting isoform (u) is shorter at the N-terminus and lacks several internal amino acids compared to isoform c."[23]
  26. NP_001356496.1 transcription factor 4 isoform a [variant 27].[23]
  27. NP_001356497.1 transcription factor 4 isoform a [variant 28].[23]
  28. NP_001356498.1 transcription factor 4 isoform v [variant 29].[23]
  29. NP_001356499.1 transcription factor 4 isoform v [variant 30].[23]
  30. NP_001356500.1 transcription factor 4 isoform w [variant 31].[23]
  31. NP_001356501.1 transcription factor 4 isoform w [variant 32].[23]
  32. NP_001356502.1 transcription factor 4 isoform x [variant 33].[23]
  33. NP_001356503.1 transcription factor 4 isoform 27 [variant 34].[23]
  34. NP_001356504.1 transcription factor 4 isoform d [variant 35].[23]
  35. NP_001356505.1 transcription factor 4 isoform y [variant 36].[23]
  36. NP_001356506.1 transcription factor 4 isoform 28 [variant 37].[23]
  37. NP_001356507.1 transcription factor 4 isoform y [variant 38].[23]
  38. NP_001356508.1 transcription factor 4 isoform 28 [variant 39].[23]
  39. NP_001356509.1 transcription factor 4 isoform 28 [variant 40].[23]
  40. NP_001356510.1 transcription factor 4 isoform y [variant 41].[23]
  41. NP_001356511.1 transcription factor 4 isoform m [variant 42].[23]
  42. NP_001356512.1 transcription factor 4 isoform m [variant 43].[23]
  43. NP_001356513.1 transcription factor 4 isoform u [variant 44].[23]
  44. NP_001356514.1 transcription factor 4 isoform u [variant 45].[23]
  45. NP_001356515.1 transcription factor 4 isoform z [variant 46].[23]
  46. NP_003190.1 transcription factor 4 isoform b: "Transcript Variant: This variant (2) differs in the 5' UTR and coding sequence and uses an alternate in-frame splice site at the 3' end of an exon compared to variant 3. The resulting isoform (b, also known as TCF4-B-) is shorter at the N-terminus and lacks an alternate internal segment compared to isoform c."[23]

Gene ID: 6927 is HNF1A HNF1 homeobox A aka TCF1 on 12q24.31: "The protein encoded by this gene is a transcription factor required for the expression of several liver-specific genes. The encoded protein functions as a homodimer and binds to the inverted palindrome 5'-GTTAATNATTAAC-3'. Defects in this gene are a cause of maturity onset diabetes of the young type 3 (MODY3) and also can result in the appearance of hepatic adenomas. Alternative splicing results in multiple transcript variants encoding different isoforms."[24]

  1. NP_000536.6 hepatocyte nuclear factor 1-alpha isoform 2 [variant 2].[24]
  2. NP_001293108.2 hepatocyte nuclear factor 1-alpha isoform 1: "Transcript Variant: This variant (1) represents the longer transcript and encodes the longer isoform (1)."[24]
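
The HNF1A summary above describes its binding site as an inverted palindrome. A minimal Python sketch of what that means follows: the site reads the same as its own reverse complement. The helper names `reverse_complement` and `is_inverted_palindrome` are illustrative only and do not come from the source.

```python
# Complement table for DNA bases; N (any base) is treated as pairing with N.
COMPLEMENT = str.maketrans("ACGTN", "TGCAN")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (A, C, G, T, N only)."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def is_inverted_palindrome(seq: str) -> bool:
    """True when the sequence equals its own reverse complement."""
    return seq.upper() == reverse_complement(seq)


site = "GTTAATNATTAAC"  # HNF1A binding site quoted in the gene summary
print(reverse_complement(site))
print(is_inverted_palindrome(site))
```

Because the site is its own reverse complement, each half can be contacted in the same way by one subunit of the homodimer, which is consistent with the summary's statement that the protein functions as a homodimer.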

Gene ID: 6929 is TCF3 transcription factor 3 aka immunoglobulin transcription factor 1 on 19p13.3: "This gene encodes a member of the E protein (class I) family of helix-loop-helix transcription factors. E proteins activate transcription by binding to regulatory E-box sequences on target genes as heterodimers or homodimers, and are inhibited by heterodimerization with inhibitor of DNA-binding (class IV) helix-loop-helix proteins. E proteins play a critical role in lymphopoiesis, and the encoded protein is required for B and T lymphocyte development. Deletion of this gene or diminished activity of the encoded protein may play a role in lymphoid malignancies. This gene is also involved in several chromosomal translocations that are associated with lymphoid malignancies including pre-B-cell acute lymphoblastic leukemia (t(1;19), with PBX1), childhood leukemia (t(19;19), with TFPT) and acute leukemia (t(12;19), with ZNF384). Alternatively spliced transcript variants encoding multiple isoforms have been observed for this gene, and a pseudogene of this gene is located on the short arm of chromosome 9."[25]

  1. NP_001129611.1 transcription factor E2-alpha isoform E47: "Transcript Variant: This variant (2) differs in the 3' UTR, lacks an exon and includes an alternate exon in the 3' coding region, but maintains the reading frame, compared to variant 1. The encoded isoform (E47, also known as Pan-1) is shorter than isoform E12. Variants 2 and 4 encode the same isoform (E47)."[25]
  2. NP_001338707.1 transcription factor E2-alpha isoform 3: "Transcript Variant: This variant (3) uses an alternate in-frame splice site in the central coding region, and contains an alternate splice structure in the 3' region, resulting in differences in the 3' UTR. The encoded isoform (3) has the same N- and C-termini, but is one aa shorter than isoform 1."[25]
  3. NP_001338708.1 transcription factor E2-alpha isoform E47: "Transcript Variant: This variant (4) contains an alternate penultimate exon compared to variant 1. The encoded isoform (E47, also known as Pan-1) is shorter than isoform E12. Variants 2 and 4 encode the same isoform (E47)."[25]
  4. NP_003191.1 transcription factor E2-alpha isoform E12: "Transcript Variant: This variant (1) represents the longest transcript and encodes the longest isoform (E12). This isoform is also known as Pan-2."[25]

Gene ID: 6935 is ZEB1 zinc finger E-box binding homeobox 1 on 10p11.22: "This gene encodes a zinc finger transcription factor. The encoded protein likely plays a role in transcriptional repression of interleukin 2. Mutations in this gene have been associated with posterior polymorphous corneal dystrophy-3 and late-onset Fuchs endothelial corneal dystrophy. Alternatively spliced transcript variants encoding different isoforms have been described."[26]

  1. NP_001121600.1 zinc finger E-box-binding homeobox 1 isoform a [variant 1].[26]
  2. NP_001167564.1 zinc finger E-box-binding homeobox 1 isoform c [variant 6].[26]
  3. NP_001167565.1 zinc finger E-box-binding homeobox 1 isoform d [variant 7].[26]
  4. NP_001167566.1 zinc finger E-box-binding homeobox 1 isoform e [variant 8].[26]
  5. NP_001167567.1 zinc finger E-box-binding homeobox 1 isoform f [variant 9].[26]
  6. NP_001310567.1 zinc finger E-box-binding homeobox 1 isoform g [variant 3].[26]
  7. NP_001310570.1 zinc finger E-box-binding homeobox 1 isoform g [variant 4].[26]
  8. NP_001310571.1 zinc finger E-box-binding homeobox 1 isoform g [variant 5].[26]
  9. NP_001310572.1 zinc finger E-box-binding homeobox 1 isoform g [variant 10].[26]
  10. NP_001310573.1 zinc finger E-box-binding homeobox 1 isoform g [variant 11].[26]
  11. NP_001310574.1 zinc finger E-box-binding homeobox 1 isoform g [variant 12].[26]
  12. NP_001310575.1 zinc finger E-box-binding homeobox 1 isoform g [variant 13].[26]
  13. NP_001310576.1 zinc finger E-box-binding homeobox 1 isoform g [variant 14].[26]
  14. NP_001310577.1 zinc finger E-box-binding homeobox 1 isoform g [variant 15].[26]
  15. NP_001310578.1 zinc finger E-box-binding homeobox 1 isoform g [variant 16].[26]
  16. NP_001310579.1 zinc finger E-box-binding homeobox 1 isoform g [variant 17].[26]
  17. NP_001310580.1 zinc finger E-box-binding homeobox 1 isoform g [variant 18].[26]
  18. NP_001310581.1 zinc finger E-box-binding homeobox 1 isoform g [variant 19].[26]
  19. NP_001310582.1 zinc finger E-box-binding homeobox 1 isoform g [variant 20].[26]
  20. NP_001310583.1 zinc finger E-box-binding homeobox 1 isoform g [variant 21].[26]
  21. NP_001310584.1 zinc finger E-box-binding homeobox 1 isoform g [variant 22].[26]
  22. NP_001310585.1 zinc finger E-box-binding homeobox 1 isoform g [variant 23].[26]
  23. NP_001310586.1 zinc finger E-box-binding homeobox 1 isoform g [variant 24].[26]
  24. NP_001310587.1 zinc finger E-box-binding homeobox 1 isoform g [variant 25].[26]
  25. NP_001310588.1 zinc finger E-box-binding homeobox 1 isoform g [variant 26].[26]
  26. NP_001310589.1 zinc finger E-box-binding homeobox 1 isoform g [variant 27].[26]
  27. NP_001310590.1 zinc finger E-box-binding homeobox 1 isoform g [variant 28].[26]
  28. NP_001310591.1 zinc finger E-box-binding homeobox 1 isoform g [variant 29].[26]
  29. NP_001310592.1 zinc finger E-box-binding homeobox 1 isoform g [variant 30].[26]
  30. NP_001310593.1 zinc finger E-box-binding homeobox 1 isoform g [variant 31].[26]
  31. NP_001310594.1 zinc finger E-box-binding homeobox 1 isoform g [variant 32].[26]
  32. NP_001310595.1 zinc finger E-box-binding homeobox 1 isoform g [variant 33].[26]
  33. NP_001310600.1 zinc finger E-box-binding homeobox 1 isoform g [variant 34].[26]
  34. NP_001310601.1 zinc finger E-box-binding homeobox 1 isoform g [variant 35].[26]
  35. NP_001310602.1 zinc finger E-box-binding homeobox 1 isoform g [variant 36].[26]
  36. NP_001310603.1 zinc finger E-box-binding homeobox 1 isoform h [variant 37].[26]
  37. NP_001310604.1 zinc finger E-box-binding homeobox 1 isoform i [variant 38].[26]
  38. NP_001310605.1 zinc finger E-box-binding homeobox 1 isoform j [variant 39].[26]
  39. NP_001310606.1 zinc finger E-box-binding homeobox 1 isoform k [variant 40].[26]
  40. NP_001310607.1 zinc finger E-box-binding homeobox 1 isoform l [variant 41].[26]
  41. NP_110378.3 zinc finger E-box-binding homeobox 1 isoform b [variant 2].[26]

Binding specificity

E-boxes with different functions differ in the number and type of their binding factors.[27]

Comparisons of Enhancers for negative direction

Chaudhary (1999) Long (2020) Long (2020) Deckert (1994) Murre (1992) Zia (2020)
CANNTG ATCTTG TCCGCC NNCGAAAN (G/A)CAGNTGN YTA(A/T)4TAR
UTR nn(4560-2846) UTR nn(4560-2846) UTR nn(4560-2846) UTR nn(4560-2846) UTR nn(4560-2846) UTR nn(4560-2846)
CAGATG at 4212 - - - ACAGATGT at 4213 TTATTATTAA at 4226
- - TCCGCC at 3999 - - -
CATTTG at 3482 - - - - -
- - TCCGCC at 3090 - - -
CAGATG at 2988 - - - ACAGATGT at 2989 -
Core nn(2846-2811) Core nn(2846-2811) Core nn(2846-2811) Core nn(2846-2811) Core nn(2846-2811) Core nn(2846-2811)
- - - - - -
- - - - - -
- - - - - -
Proximal nn(2811-2596) Proximal nn(2811-2596) Proximal nn(2811-2596) Proximal nn(2811-2596) Proximal nn(2811-2596) Proximal nn(2811-2596)
- - - - - -
- - - - - -
- - - - - -
Distal nn(2596-1) Distal nn(2596-1) Distal nn(2596-1) Distal nn(2596-1) Distal nn(2596-1) Distal nn(2596-1)
- - - ciTTTTCGTT at 2480 - -
- - - ciTTTTCGTT at 2474 - -
- - TCCGCC at 2392 - - -
- - TCCGCC at 2231 - - -
CACCTG at 2116 - - - ciCCACCTGT at 2117 -
- - TCCGCC at 1752 - - -
- - ciCTTTCGCC at 1680 - - -
- - - - - CTATATATAA at 1601
CAGTTG at 1513 - GGCGGA at 1504 - GCAGTTGG at 1514 -
CAGATG at 1224 - - - ACAGATGT at 1225 -
CAAGTG at 1179 - - - - -
- - - ciTTTTCGGT at 946 - -
- - TCCGCC at 856 - - -
CACATG at 797 - - - - -
- - TCCGCC at 700 - - -
CAGATG at 481 - - - - -
- - TCCGCC at 427 - - -
CACATG at 324 - - - - -
- ATCTTG at 286 - - - -
- - - ciTTTTCGTA at 187 - -
- - - ciCTTTCGAC at 139 - -
- - - - ACAGATGT at 48 -
UTR pn(4560-2846) UTR pn(4560-2846) UTR pn(4560-2846) UTR pn(4560-2846) UTR pn(4560-2846) UTR pn(4560-2846)
CACTTG at 4011 - - - - -
CACCTG at 3969 - - - ciACACCTGT at 3970 -
CAGGTG at 3953 - - - - -
CAGATG at 3919 - - - ACAGATGA at 3920 -
CAACTG at 3850 - - - ciGCAACTGC at 3851 -
CAGATG at 3627 - - - - -
CAGATG at 3620 - - - - -
- ciCAAGAT at 3275 - - - -
CACTTG at 3241 - - - - -
CACTTG at 3102 - - - - -
CACTTG at 2920 - - - - -
Core pn(2846-2811) Core pn(2846-2811) Core pn(2846-2811) Core pn(2846-2811) Core pn(2846-2811) Core pn(2846-2811)
- - - - - -
- - - - - -
- - - - - -
- - - - - -
Proximal pn(2811-2596) Proximal pn(2811-2596) Proximal pn(2811-2596) Proximal pn(2811-2596) Proximal pn(2811-2596) Proximal pn(2811-2596)
- - GGCGGA at 2728 - - -
CACATG at 2667 - - - - -
- - - - - -
- - - - - -
Distal pn(2596-1) Distal pn(2596-1) Distal pn(2596-1) Distal pn(2596-1) Distal pn(2596-1) Distal pn(2596-1)
CACTTG at 2579 - - - - -
CAGGTG at 2570 - - - GCAGGTGG at 2571 -
- - GGCGGA at 2393 - - -
CACTTG at 2126 - - GGCGAAAT at 2158 - -
CAGGTG at 2079 - - - - -
- - GGCGGA at 1810 - - -
CAAATG at 1579 - TCCGCC at 1503 - - -
CACCTG at 1172 - - - - -
CACCTG at 1130 - - - ciACACCTGT at 1131 -
- ciCAAGAT at 876 - - - -
- - GGCGGA at 857 - - -
- - GGCGGA at 701 - - -
- - - TACGAAAA at 495 - -
CACCTG at 393 - GGCGGA at 428 - ciCCACCTGT at 394 -
CATTTG at 364 - - GACGAAAC at 313 - -
CATATG at 41 - - - - -
- - - - - -
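
The consensus notations in the table header, such as CANNTG, (G/A)CAGNTGN and YTA(A/T)4TAR, can be turned into regular expressions for checking individual hits. The Python sketch below assumes that a single letter is an IUPAC nucleotide code and that a parenthesised group such as (A/T)4 means "one of these bases, repeated 4 times". The function name `consensus_to_regex` is hypothetical and is not one of the programs used for the page.

```python
import re

# IUPAC nucleotide codes mapped to regular-expression fragments.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "[AG]", "Y": "[CT]", "W": "[AT]", "S": "[CG]",
    "K": "[GT]", "M": "[AC]", "N": "[ACGT]",
}

# A token is either "(A/T)" with an optional repeat count, or a single letter.
TOKEN = re.compile(r"\(([ACGT](?:/[ACGT])+)\)(\d*)|([A-Z])")


def consensus_to_regex(consensus: str) -> str:
    """Translate a consensus such as 'YTA(A/T)4TAR' into a regex string."""
    parts = []
    expected_start = 0
    for match in TOKEN.finditer(consensus):
        if match.start() != expected_start:
            raise ValueError(f"unrecognised text in {consensus!r}")
        expected_start = match.end()
        alternatives, repeat, letter = match.groups()
        if letter is not None:
            if letter not in IUPAC:
                raise ValueError(f"unknown nucleotide code {letter!r}")
            parts.append(IUPAC[letter])
        else:
            bases = alternatives.replace("/", "")
            count = "{" + repeat + "}" if repeat else ""
            parts.append("[" + bases + "]" + count)
    if expected_start != len(consensus):
        raise ValueError(f"unrecognised text in {consensus!r}")
    return "".join(parts)


# Check a few hits from the table against their column's consensus.
for consensus, hit in [("CANNTG", "CAGATG"),
                       ("(G/A)CAGNTGN", "ACAGATGT"),
                       ("YTA(A/T)4TAR", "TTATTATTAA")]:
    pattern = consensus_to_regex(consensus)
    print(consensus, pattern, bool(re.fullmatch(pattern, hit)))
```

Hits marked "ci" in the table are complement inverses, so they would need to be reverse-complemented before being checked in this way.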

Comparisons of Enhancers for positive direction

Chaudhary (1999) Long (2020) Long (2020) Deckert (1994) Murre (1992) Zia (2020)
CANNTG ATCTTG TCCGCC NNCGAAAN (G/A)CAGNTGN YTA(A/T)4TAR
Core np(4445-4265) Core np(4445-4265) Core np(4445-4265) Core np(4445-4265) Core np(4445-4265) Core np(4445-4265)
- - - - - -
- - - - - -
- - - - - -
- - - - - -
Proximal np(4265-4050) Proximal np(4265-4050) Proximal np(4265-4050) Proximal np(4265-4050) Proximal np(4265-4050) Proximal np(4265-4050)
- - - - - -
- - - - - -
- - - - - -
- - - - - -
Distal np(4050-1) Distal np(4050-1) Distal np(4050-1) Distal np(4050-1) Distal np(4050-1) Distal np(4050-1)
CACTTG at 4015 - - - - -
CATGTG at 3958 - - - - -
CACATG at 3956 - - - - -
CATGTG at 3902 - - - - -
CAGCTG at 3777 - - - - -
CACATG at 3742 - - - - -
CACATG at 3707 - - - - -
CAGATG at 3475 - - - - -
CATCTG at 3404 - - - - -
CAGCTG at 3241 - - - - -
CAGGTG at 3149 - - - - -
CACCTG at 3046 - - - - -
CACCTG at 2568 - - - - -
CAAGTG at 2510 - - - - -
CACCTG at 2432 - - - - CTAATTTTAA at 2443
CAGCTG at 2404 - - - - -
CAGGTG at 2374 - - - - -
CACCTG at 2249 - - - - -
CAGGTG at 2127 - - - - -
CAGCTG at 2054 - - - - -
CACATG at 2031 - - - - -
CAGGTG at 1968 - - - - -
- - - TTTTCGGG at 1752 - -
- - GGCGGA at 1203 CTTTCGTC at 1183 - -
CACCTG at 958 - - - - -
CACCTG at 858 - - - - -
CACGTG at 570 - - - - -
- - GGCGGA at 357 - - -
CAGGTG at 196 - - - - -
- - - - GCAGATGA at 37 -
Core pp(4445-4265) Core pp(4445-4265) Core pp(4445-4265) Core pp(4445-4265) Core pp(4445-4265) Core pp(4445-4265)
- - - - - -
- - - - - -
- - - - - -
- - - - - -
Proximal pp(4265-4050) Proximal pp(4265-4050) Proximal pp(4265-4050) Proximal pp(4265-4050) Proximal pp(4265-4050) Proximal pp(4265-4050)
- - GGCGGA at 4239 - - -
CAAGTG at 4202 - - - - -
- - - - - CTAATATTAA at 4169
- ciCAAGAT at 4075 - - - -
- ATCTTG at 4067 - - - -
- - - - - -
Distal pp(4050-1) Distal pp(4050-1) Distal pp(4050-1) Distal pp(4050-1) Distal pp(4050-1) Distal pp(4050-1)
CACTTG at 3936 - - - - -
CACGTG at 3884 - - - - -
- - - ciCTTTCGTG at 3600 - -
- - TCCGCC at 3487 - - -
CAGGTG at 3086 - - - - -
CACGTG at 2961 - - - - -
- - TCCGCC at 2485 ciGTTTCGCA at 2538 - -
- - - ACCGAAAG at 2164 - -
CAGGTG at 2028 - - ciTTTTCGTC at 2007 - -
CAGGTG at 1843 - - - - -
CACGTG at 1219 - - TCCGAAAG at 1180 - -
- - - TCCGAAAG at 1096 - -
- - GGCGGA at 905 - - -
CATGTG at 567 - - - - -
CACGTG at 547 - - - - -
CACCTG at 186 - - - - -
- - - - - -

Non-canonical E-boxes

The consensus sequence of the E-box is usually CANNTG; however, other E-boxes with similar sequences, called noncanonical E-boxes, also exist. They include, but are not limited to:

  • The CACGTT sequence 20 bp upstream of the mouse Period2 (PER2) gene, which regulates its expression[28]
  • The CAGCTT sequence found within the MyoD core enhancer[29]
  • The CACCTCGTGAC sequence in the proximal promoter region of human and rat APOE, whose product is a protein component of lipoproteins[30]
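
To see how the six-base noncanonical E-boxes listed above depart from the canonical consensus, each can be compared position by position with CANNTG. This is a minimal Python sketch; the function name `consensus_mismatches` is an assumption made for illustration, and N is taken to match any base.

```python
def consensus_mismatches(hexamer: str, consensus: str = "CANNTG") -> list:
    """Return the 1-based positions where hexamer departs from the consensus.

    Positions holding N in the consensus accept any base.
    """
    if len(hexamer) != len(consensus):
        raise ValueError("sequence and consensus must have the same length")
    return [
        index + 1
        for index, (base, expected) in enumerate(zip(hexamer.upper(), consensus))
        if expected != "N" and base != expected
    ]


for candidate in ("CACGTG", "CACGTT", "CAGCTT"):
    print(candidate, consensus_mismatches(candidate))
```

Both CACGTT and CAGCTT differ from CANNTG only at the sixth position, where T replaces G, while CACGTG has no mismatches.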

CACGTT enhancer boxes

Basic programs testing the consensus sequence CACGTT (starting with SuccessablesPhop.bas) were written to compare nucleotide sequences with the sequences on either the template strand (-) or the coding strand (+) of the DNA, in the negative direction (-) or the positive direction (+). The programs looked for CACGTT and its inverse complement and found:

  1. Negative strand, negative direction: 2, CACGTT at 2864, CACGTT at 1536.
  2. Positive strand, negative direction: 1, CACGTT at 343.
  3. Negative strand, positive direction: 0.
  4. Positive strand, positive direction: 2, CACGTT at 2801, CACGTT at 2335.
  5. inverse complement, negative strand, negative direction: 4, AACGTG at 3288, AACGTG at 1718, AACGTG at 1346, AACGTG at 1338.
  6. inverse complement, positive strand, negative direction: 0.
  7. inverse complement, negative strand, positive direction: 0.
  8. inverse complement, positive strand, positive direction: 1, AACGTG at 3342.
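
The kind of search summarised in the list above can be sketched in Python. This is not the original Basic code: the function names are hypothetical, the toy sequence is invented, and the position convention (1-based start of each match) is an assumption that may differ from the numbering the Basic programs report.

```python
# Complement table for the four DNA bases.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def inverse_complement(motif: str) -> str:
    """Return the inverse (reverse) complement, e.g. CACGTT -> AACGTG."""
    return motif.translate(COMPLEMENT)[::-1]


def find_all(sequence: str, motif: str) -> list:
    """Return the 1-based start of every occurrence, overlaps included."""
    hits = []
    start = sequence.find(motif)
    while start != -1:
        hits.append(start + 1)
        start = sequence.find(motif, start + 1)
    return hits


def scan(sequence: str, motif: str) -> dict:
    """Search one strand for a motif and for its inverse complement."""
    inverse = inverse_complement(motif)
    return {motif: find_all(sequence, motif), inverse: find_all(sequence, inverse)}


# Invented toy sequence, not taken from any real promoter.
toy = "TTCACGTTGGAACGTGCC"
print(scan(toy, "CACGTT"))
```

Running the same scan on each strand and in each direction would give the eight categories listed above.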

CACGTT (4560-2846) UTR promoters

  1. Negative strand, negative direction: AACGTG at 3288, CACGTT at 2864.
  2. inverse complement, negative strand, negative direction: AACGTG at 3288.

CACGTT negative direction (2596-1) distal promoters

  1. Negative strand, negative direction: CACGTT at 1536.
  2. inverse complement, negative strand, negative direction: AACGTG at 1718, AACGTG at 1346, AACGTG at 1338.
  3. Positive strand, negative direction: CACGTT at 343.

CACGTT positive direction (4050-1) distal promoters

  1. Positive strand, positive direction: CACGTT at 2801, CACGTT at 2335.
  2. inverse complement, positive strand, positive direction: AACGTG at 3342.
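
The groupings above assign each hit to a promoter region by its position. A minimal Python sketch of that assignment follows, using the region limits given in the comparison tables (for example UTR 4560-2846 and distal 2596-1 in the negative direction). Adjacent regions share a boundary number, so giving a boundary position to the more distal region is an assumption of this sketch, and the names `REGIONS` and `classify_position` are hypothetical.

```python
# Region limits copied from the comparison tables, listed from most distal
# to least distal. Assumption: a position on a shared boundary is assigned
# to the first (more distal) region that contains it.
REGIONS = {
    "negative": [("Distal", 1, 2596), ("Proximal", 2596, 2811),
                 ("Core", 2811, 2846), ("UTR", 2846, 4560)],
    "positive": [("Distal", 1, 4050), ("Proximal", 4050, 4265),
                 ("Core", 4265, 4445)],
}


def classify_position(position: int, direction: str):
    """Return the region name for a hit, or None if it lies outside all regions."""
    for name, low, high in REGIONS[direction]:
        if low <= position <= high:
            return name
    return None


# Real CACGTT hits reported above.
for position, direction in [(2864, "negative"), (1536, "negative"),
                            (343, "negative"), (2801, "positive")]:
    print(position, direction, classify_position(position, direction))
```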

CACGTT random dataset samplings

CACGTTr arbitrary UTRs

  1. Phopr4: CACGTT at 4411.
  2. Phopr6: CACGTT at 3434.
  3. Phopr8: CACGTT at 2985.
  4. Phopr0ci: AACGTG at 3209.

CACGTTr alternate UTRs

  1. Phopr5: CACGTT at 4068.
  2. Phopr1ci: AACGTG at 4030.
  3. Phopr7ci: AACGTG at 3534.

CACGTTr alternate positive direction core promoters

  1. Phopr4: CACGTT at 4411.

CACGTTr alternate negative direction proximal promoters

  1. Phopr3: CACGTT at 2751.

CACGTTr arbitrary positive direction proximal promoters

  1. Phopr5: CACGTT at 4068.

CACGTTr arbitrary negative direction distal promoters

  1. Phopr0: CACGTT at 1357.
  2. Phopr8: CACGTT at 898.
  3. Phopr0ci: AACGTG at 2399, AACGTG at 1935.
  4. Phopr2ci: AACGTG at 1697.

CACGTTr alternate negative direction distal promoters

  1. Phopr1: CACGTT at 192.
  2. Phopr5: CACGTT at 844.
  3. Phopr7: CACGTT at 1392.
  4. Phopr9: CACGTT at 875.
  5. Phopr3ci: AACGTG at 1437.
  6. Phopr9ci: AACGTG at 797.

CACGTTr arbitrary positive direction distal promoters

  1. Phopr1: CACGTT at 192.
  2. Phopr3: CACGTT at 2751.
  3. Phopr5: CACGTT at 844.
  4. Phopr7: CACGTT at 1392.
  5. Phopr9: CACGTT at 875.
  6. Phopr1ci: AACGTG at 4030.
  7. Phopr3ci: AACGTG at 1437.
  8. Phopr7ci: AACGTG at 3534.
  9. Phopr9ci: AACGTG at 797.

CACGTTr alternate positive direction distal promoters

  1. Phopr0: CACGTT at 1357.
  2. Phopr6: CACGTT at 3434.
  3. Phopr8: CACGTT at 2985, CACGTT at 898.
  4. Phopr0ci: AACGTG at 3209, AACGTG at 2399, AACGTG at 1935.
  5. Phopr2ci: AACGTG at 1697.

CACGTT analysis and results

In the promoters of HIS4 and PHO5, the upstream activating sequence (UAS) for Pho4p is CAC(A/G)T(T/G); it acts under phosphate limitation in the regulation of the purine and histidine biosynthesis pathways.[31] CACGTT is one of the sequences that match this pattern.

Reals or randoms Promoters direction Numbers Strands Occurrences Averages (± 0.1)
Reals UTR negative 3 2 1.5 1.5 ± 1.5 (+-0,--3)
Randoms UTR arbitrary negative 4 10 0.4 0.35 ± 0.05
Randoms UTR alternate negative 3 10 0.3 0.35 ± 0.05
Reals Core negative 0 2 0 0
Randoms Core arbitrary negative 0 10 0 0
Randoms Core alternate negative 0 10 0 0
Reals Core positive 0 2 0 0
Randoms Core arbitrary positive 0 10 0 0.05 ± 0.05
Randoms Core alternate positive 1 10 0.1 0.05 ± 0.05
Reals Proximal negative 0 2 0 0
Randoms Proximal arbitrary negative 0 10 0 0.05 ± 0.05
Randoms Proximal alternate negative 1 10 0.1 0.05 ± 0.05
Reals Proximal positive 0 2 0 0
Randoms Proximal arbitrary positive 1 10 0.1 0.05 ± 0.05
Randoms Proximal alternate positive 0 10 0 0.05 ± 0.05
Reals Distal negative 5 2 2.5 2.5 ± 1.5 (+-1,--4)
Randoms Distal arbitrary negative 5 10 0.5 0.55 ± 0.05
Randoms Distal alternate negative 6 10 0.6 0.55 ± 0.05
Reals Distal positive 3 2 1.5 1.5 ± 1.5 (++3,-+0)
Randoms Distal arbitrary positive 9 10 0.9 0.85 ± 0.05
Randoms Distal alternate positive 8 10 0.8 0.85 ± 0.05
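
The arithmetic behind the Occurrences and Averages columns can be reproduced with a short Python sketch. It assumes that Occurrences is Numbers divided by Strands, and that each average is the midpoint of the two values being combined, with the ± term equal to half their difference. That reading is inferred from rows such as the distal negative randoms (0.5 and 0.6 giving 0.55 ± 0.05) and is not stated by the source; the function names are hypothetical.

```python
def occurrences_per_strand(numbers: int, strands: int) -> float:
    """Occurrences column: total hits divided by the number of strands sampled."""
    return numbers / strands


def midpoint_and_spread(first: float, second: float):
    """Averages column (assumed): midpoint of two values and half their difference."""
    return (first + second) / 2, abs(first - second) / 2


# Randoms, distal negative: 5 hits and 6 hits, each over 10 strands.
arbitrary = occurrences_per_strand(5, 10)
alternate = occurrences_per_strand(6, 10)
mean, spread = midpoint_and_spread(arbitrary, alternate)
print(round(mean, 2), round(spread, 2))

# Reals, UTR negative: 0 hits on one strand and 3 on the other.
mean, spread = midpoint_and_spread(0, 3)
print(mean, spread)
```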

Comparison:

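In the UTR and distal promoters, the real CACGTTs occur more often per strand (1.5 to 2.5) than the randoms (0.3 to 0.9). This suggests that the real CACGTTs in these regions are likely active or activable. No real CACGTTs occur in the core or proximal promoters.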

CAGCTT samplings

  1. Negative strand, negative direction: 0.
  2. Positive strand, negative direction: 1.
  3. Negative strand, positive direction: 0.
  4. Positive strand, positive direction: 0.
  5. inverse complement, negative strand, negative direction: 0.
  6. inverse complement, positive strand, negative direction: 2.
  7. inverse complement, negative strand, positive direction: 1.
  8. inverse complement, positive strand, positive direction: 0.

CT-Rich Regions

The CT-rich region (CTRR), located about 23 nucleotides upstream of the E-box, is important for E-box binding, transactivation (an increased rate of gene expression), and transcription of circadian genes by the BMAL1/NPAS2 and BMAL1/CLOCK complexes.[32]

Acknowledgements

The content on this page was first contributed by: Henry A. Hoff.

Initial content for this page in some instances came from Wikiversity.

References

  1. "E-box". San Francisco, California: Wikimedia Foundation, Inc. April 13, 2013. Retrieved 2013-04-17.
  2. Charalampos G. Spilianakis, Maria D. Lalioti, Terrence Town, Gap Ryol Lee, Richard A. Flavell (2005). "Interchromosomal associations between alternatively expressed loci". Nature. 435 (7042): 637–45. doi:10.1038/nature03574. PMID 15880101.
  3. SemperBlotto (16 January 2011). "enhancer". San Francisco, California: Wikimedia Foundation, Inc. Retrieved 2017-08-25.
  4. Gerhard Leubner-Metzger, Luciana Petruzzelli, Rosa Waldvogel, Regina Vögeli-Lange, and Frederick Meins, Jr. (November 1998). "Ethylene-responsive element binding protein (EREBP) expression and the transcriptional regulation of class I β-1, 3-glucanase during tobacco seed germination". Plant Molecular Biology 38 (5): 785-95. doi:10.1023/A:1006040425383.
  5. "Box (disambiguation)". San Francisco, California: Wikimedia Foundation, Inc. May 23, 2013. Retrieved 2013-06-15.
  6. Bork P, Holm L, Sander C (September 1994). "The immunoglobulin fold. Structural classification, sequence patterns and common core". Journal of Molecular Biology. 242 (4): 309–20. doi:10.1006/jmbi.1994.1582. PMID 7932691.
  7. Brümmendorf T, Rathjen FG (1995). "Cell adhesion molecules 1: immunoglobulin superfamily". Protein Profile. 2 (9): 963–1108. PMID 8574878.
  8. George M. Church, Anne Ephrussi, Walter Gilbert, Susumu Tonegawa (1985). "Cell-type-specific contacts to immunoglobulin enhancers in nuclei" (PDF). Nature. 313 (6005): 798–801.
  9. Jaideep Chaudhary and Michael K. Skinner (May 1999). "Basic Helix-Loop-Helix Proteins Can Act at the E-Box within the Serum Response Element of the c-fos Promoter to Influence Hormone-Induced Promoter Activation in Sertoli Cells". Molecular Endocrinology. 13 (5): 774–86. doi:10.1210/me.13.5.774. PMID 10319327. Retrieved 2013-06-14.
  10. "Promoter (genetics)". San Francisco, California: Wikimedia Foundation, Inc. June 14, 2013. Retrieved 2013-06-15.
  11. Hershey CL, Fisher DE (April 2004). "Mitf and Tfe3: members of a b-HLH-ZIP transcription factor family essential for osteoclast development and function". Bone. 34 (4): 689–96. doi:10.1016/j.bone.2003.08.014. PMID 15050900.
  12. "MITF gene". Genetics Home Reference. National Institutes of Health, U.S. Department of Health & Human Services.
  13. Keith S. Hoek, Natalie C. Schlegel, Ossia M. Eichhoff, Daniel S. Widmer, Christian Praetorius, Steingrimur O. Einarsson, Sigridur Valgeirsdottir, Kristin Bergsteinsdottir, Alexander Schepsky, Reinhard Dummer, Eirikur Steingrimsson (2008). "Novel MITF targets identified using a two-step DNA microarray strategy". Pigment Cell & Melanoma Research. 21 (6): 665–76. doi:10.1111/j.1755-148X.2008.00505.x. PMID 19067971.
  14. Massimo Petretich (20 September 2016). Importance of Chromosomal Architecture to Organize Promoter-Enhancer Long-Range Interactions in c-Myc locus (PDF). Heidelberg, Germany: Ruperto-Carola University of Heidelberg. p. 195. Retrieved 2017-09-05.
  15. Shigemi Kimura, Kuniya Abe, Misao Suzuki, Masakatsu Ogawa, Kowashi Yoshioka, Tadasi Kaname, Teruhisa Miike, Ken‐ichi Yamamura (June 1997). "A 900 bp genomic region from the mouse dystrophin promoter directs lacZ reporter expression only to the right heart of transgenic mice". Development, Growth & Differentiation. 39 (1): 257–265. doi:10.1046/j.1440-169X.1997.t01-2-00001.x. Retrieved 25 March 2019.
  16. Cornelis Murre, Gretchen Bain, Marc A. van Dijk, Isaac Engel, Beth A. Furnari, Mark E. Massari, James R. Matthews, Melanie W. Quong, Richard R. Rivera, Maarten H. Stuiver (June 1994). "Structure and function of helix-loop-helix proteins". Biochimica et Biophysica Acta (BBA) - Gene Structure and Expression. 1218 (2): 129–35. Retrieved 2017-02-08.
  17. Gretchen Bain, Stefan Gruenwald, and Cornelis Murre (June 1993). "E2A and E2-2 are subunits of B-cell-specific E2-box DNA-binding proteins" (PDF). Molecular and Cellular Biology. 13 (6): 3522–3529. doi:10.1128/MCB.13.6.3522. Retrieved 2 February 2019.
  18. Joke Comijn, Geert Berx, Petra Vermassen, Kristin Verschueren, Leo van Grunsven, Erik Bruyneel, Marc Mareel, Danny Huylebroeck, Frans van Roy (June 2001). "The Two-Handed E Box Binding Zinc Finger Protein SIP1 Downregulates E-Cadherin and Induces Invasion". Molecular Cell. 7 (6): 1267–78. doi:10.1016/S1097-2765(01)00260-X. Retrieved 11 January 2019.
  19. Oliver G. McDonald, Brian R. Wamhoff, Mark H. Hoofnagle, and Gary K. Owens (January 4, 2006). "Control of SRF binding to CArG box chromatin regulates smooth muscle gene expression in vivo". The Journal of Clinical Investigation. 116 (1): 36–48. Retrieved 2014-06-05.
  20. Nitish R. Mahapatra, Manjula Mahata, Arun K. Datta, Hans-Hermann Gerdes, Wieland B. Huttner, Daniel T. O’Connor, Sushil K. Mahata (1 October 2000). "Neuroendocrine Cell Type-Specific and Inducible Expression of the Chromogranin B Gene: Crucial Role of the Proximal Promoter". Endocrinology. 141 (10): 3668–3678. doi:10.1210/endo.141.10.7725. Retrieved 15 September 2018.
  21. Melanie Schoof, Malte Hellwig, Luke Harrison, Dörthe Holdhof, Marlen C. Lauffer, Judith Niesen, Sanamjeet Virdi, Daniela Indenbirken and Ulrich Schüller (9 January 2020). "The basic helix-loop-helix transcription factor TCF4 impacts brain architecture as well as neuronal morphology and differentiation" (PDF). European Journal of Neuroscience. 00 (14674): 1–17. doi:10.1111/ejn.14674. Retrieved 29 April 2020.
  22. 22.0 22.1 22.2 RefSeq (August 2017). "MYC MYC proto-oncogene, bHLH transcription factor [ Homo sapiens (human) ]". 8600 Rockville Pike, Bethesda MD, 20894 USA: National Center for Biotechnology Information, U.S. National Library of Medicine. Retrieved 1 May 2020.
  23. RefSeq (July 2016). "TCF4 transcription factor 4 [ Homo sapiens (human) ]". 8600 Rockville Pike, Bethesda MD, 20894 USA: National Center for Biotechnology Information, U.S. National Library of Medicine. Retrieved 5 April 2020.
  24. RefSeq (April 2015). "HNF1A HNF1 homeobox A [ Homo sapiens (human) ]". 8600 Rockville Pike, Bethesda MD, 20894 USA: National Center for Biotechnology Information, U.S. National Library of Medicine. Retrieved 7 November 2018.
  25. RefSeq (September 2011). "TCF3 transcription factor 3 [ Homo sapiens (human) ]". 8600 Rockville Pike, Bethesda MD, 20894 USA: National Center for Biotechnology Information, U.S. National Library of Medicine. Retrieved 29 April 2020.
  26. RefSeq (March 2010). "ZEB1 zinc finger E-box binding homeobox 1 [ Homo sapiens (human) ]". 8600 Rockville Pike, Bethesda MD, 20894 USA: National Center for Biotechnology Information, U.S. National Library of Medicine. Retrieved 1 May 2020.
  27. Bose, Sudeep; Boockfor, Fredric R. (2010). "Episodes of prolactin gene expression in GH3 cells are dependent on selective promoter binding of multiple circadian elements". Endocrinology. 151 (5): 2287–2296. doi:10.1210/en.2009-1252. PMC 2869263. PMID 20215567.
  28. Yoo, S.H.; Ko, C.H.; Lowrey, P.L.; Buhr, E.D.; Song, E.J.; Chang, S.; Yoo, O.J.; Yamazaki, S.; Lee, C. (2005). "A noncanonical E-box enhancer drives mouse Period2 circadian oscillations in vivo". Proceedings of the National Academy of Sciences USA. 102 (7): 2608–2613. Bibcode:2005PNAS..102.2608Y. doi:10.1073/pnas.0409763102. PMC 548324. PMID 15699353.
  29. Zhang, X.; Patel, S. P.; McCarthy, J. J.; Rabchevsky, A. G.; Goldhamer, D. J.; Esser, K. A. (2012). "A non-canonical E-box within the MyoD core enhancer is necessary for circadian expression in skeletal muscle". Nucleic Acids Research. 40 (8): 3419–3430. doi:10.1093/nar/gkr1297. PMC 3333858. PMID 22210883.
  30. Salero, Enrique; Giménez, Cecilio; Zafra, Francisco (15 March 2003). "Identification of a non-canonical E-box motif as a regulatory element in the proximal promoter region of the apolipoprotein E gene". The Biochemical Journal. 370 (3): 979–986. doi:10.1042/BJ20021142. PMC 1223214. PMID 12444925.
  31. Hongting Tang, Yanling Wu, Jiliang Deng, Nanzhu Chen, Zhaohui Zheng, Yongjun Wei, Xiaozhou Luo, and Jay D. Keasling (6 August 2020). "Promoter Architecture and Promoter Engineering in Saccharomyces cerevisiae". Metabolites. 10 (8): 320–39. doi:10.3390/metabo10080320. PMID 32781665.
  32. Muñoz, Estela; Brewer, Michelle; Baler, Ruben (2006). "Modulation of BMAL/CLOCK/E-Box complex activity by a CT-rich cis-acting element". Molecular and Cellular Endocrinology. 252 (1–2): 74–81. doi:10.1016/j.mce.2006.03.007. PMID 16650525.

Further reading

External links