A1BG regulatory elements and regions


In the present era of functional genomics, the challenge remains to elucidate gene function, such as that of A1BG, together with its likely regulatory networks and signaling pathways.[1] "Since regulation of gene expression in vivo mainly occurs at the transcriptional level, identifying the location of genetic regulatory elements is a key to understanding the machinery regulating gene transcription. A major goal of current genome research is to identify the locations of all gene regulatory elements, including promoters, enhancers, silencers, insulators and boundary elements, and to analyze their relationship to the current annotation of human genes."[2][3] Although "many genome-wide strategies have been developed for identifying functional elements", "no method yet has the resolution to precisely identify all regulatory elements or can be readily applied to the entire human genome."[4]

"The experimental evidence demonstrates that genome binding specificity is achieved through the interplay of at least three factors: DNA sequence; DNA shape; and occlusion by chromatin."[5]

There is one CRISPRi-validated cis-regulatory element on 19q13.43: Gene ID: 116286197 LOC116286197. There are also four Sharpr-MPRA regulatory regions: (1) Gene ID: 112553117 LOC112553117 Sharpr-MPRA regulatory region 1998, (2) Gene ID: 112553119 LOC112553119 Sharpr-MPRA regulatory region 10473, (3) Gene ID: 112577453 LOC112577453 Sharpr-MPRA regulatory region 7872, and (4) Gene ID: 112577454 Sharpr-MPRA regulatory region 9894.

Def. nucleotide "sequences, usually upstream, which are recognized by specific regulatory transcription factors, thereby causing gene response to various regulatory agents", [that] "may be found in both promoter and enhancer regions"[6] are called response elements.

Heterodimers

Some bZIP proteins, "including LIP19, OsZIP-2a, and OsZIP-2b, do not bind to DNA sequences. Instead, these bZIP proteins form heterodimers with other bZIPs to regulate transcriptional activities (Nantel and Quatrano, 1996; Shimizu et al., 2005)."[7]

DNase I hypersensitive sites

"This genomic region represents a DNase I hypersensitive site (DHS) that was predicted to be an enhancer by the ENCODE (ENCyclopedia Of DNA Elements) project based on various combinations of H3K27 acetylation and binding of p300, GATA1 and RNA polymerase II in K562 erythroleukemia cells. It was validated as a high-confidence cis-regulatory element for the ZNF582 (zinc finger protein 582) gene on chromosome 19 based on multiplex CRISPR/Cas9-mediated perturbation in K562 cells."[8]

Gene ID: 116286197 CRISPRi-validated cis-regulatory element chr19.6329 is at NC_000019.10 (56186901..56187499).[8]

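Gene ID: 147948 ZNF582 is at NC_000019.10 (56382751..56393585, complement).[9] Measured between lower coordinates, the CRISPRi-validated cis-regulatory element chr19.6329 is (56382751 - 56186901) = 195850 nts from ZNF582; because ZNF582 is on the complement strand, 56382751 is the annotated 3' end of the gene rather than its beginning.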
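The subtraction above can be reproduced with a short sketch. The coordinates are the ones quoted from NC_000019.10; the function names are hypothetical helpers, and treating the coordinates as 1-based and inclusive is an assumption. Because ZNF582 is annotated on the complement strand, the offset to its higher coordinate is shown as well.

```python
# Coordinates quoted above (NC_000019.10); treated here as 1-based, inclusive.
ELEMENT = (56186901, 56187499)  # CRISPRi-validated element chr19.6329
ZNF582 = (56382751, 56393585)   # annotated on the complement strand


def offset_to_low_end(element, gene):
    """Lower coordinate of the gene minus lower coordinate of the element."""
    return gene[0] - element[0]


def offset_to_high_end(element, gene):
    """Higher coordinate of the gene minus lower coordinate of the element."""
    return gene[1] - element[0]


print(offset_to_low_end(ELEMENT, ZNF582))   # 195850, the figure in the text
print(offset_to_high_end(ELEMENT, ZNF582))  # 206684
```

Which of the two offsets is the more meaningful depends on whether the distance is wanted to the nearest edge of the gene or to its transcription start.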

Transcriptional regulatory regions

"This genomic sequence was predicted to be a transcriptional regulatory region based on chromatin state analysis from the ENCODE (ENCyclopedia Of DNA Elements) project. It was validated as a functional enhancer by the Sharpr-MPRA technique (Systematic high-resolution activation and repression profiling with reporter tiling using massively parallel reporter assays) in K562 erythroleukemia cells (group: K562 Activating DNase unmatched - State 1:Tss, active promoter, TSS/CpG island region), with weaker activation in HepG2 liver carcinoma cells (group: HepG2 Activating DNase matched - State 1:Tss)."[10]

"This genomic sequence was predicted to be a transcriptional regulatory region based on chromatin state analysis from the ENCODE (ENCyclopedia Of DNA Elements) project. It was validated as a functional enhancer by the Sharpr-MPRA technique (Systematic high-resolution activation and repression profiling with reporter tiling using massively parallel reporter assays) in HepG2 liver carcinoma cells (group: HepG2 Activating DNase matched - State 5:Enh, candidate strong enhancer, open chromatin). It also displayed weak repressive activity by Sharpr-MPRA in K562 erythroleukemia cells (group: K562 Repressive non-DNase unmatched - State 24:Quies, heterochromatin/dead zone)."[11]

"This genomic sequence was predicted to be a transcriptional regulatory region based on chromatin state analysis from the ENCODE (ENCyclopedia Of DNA Elements) project. It was validated as a functional enhancer by the Sharpr-MPRA technique (Systematic high-resolution activation and repression profiling with reporter tiling using massively parallel reporter assays) in both HepG2 liver carcinoma cells (group: HepG2 Activating DNase unmatched - State 1:Tss, active promoter, TSS/CpG island region) and K562 erythroleukemia cells (group: K562 Activating DNase unmatched - State 1:Tss)."[12]

"This genomic sequence was predicted to be a transcriptional regulatory region based on chromatin state analysis from the ENCODE (ENCyclopedia Of DNA Elements) project. It was validated as a functional enhancer by the Sharpr-MPRA technique (Systematic high-resolution activation and repression profiling with reporter tiling using massively parallel reporter assays) in K562 erythroleukemia cells (group: K562 Activating DNase unmatched - State 1:Tss, active promoter, TSS/CpG island region), with weaker activation in HepG2 liver carcinoma cells (group: HepG2 Activating DNase matched - State 1:Tss)."[13]

"The growth hormone-regulated transcription factors STAT5 and BCL6 coordinately regulate sex differences in mouse liver, primarily through effects in male liver, where male-biased genes are upregulated and many female-biased genes are actively repressed."[14] "CUX2, a highly female-specific liver transcription factor, contributes to an analogous regulatory network in female liver. Adenoviral overexpression of CUX2 in male liver induced 36% of female-biased genes and repressed 35% of male-biased genes. In female liver, CUX2 small interfering RNA (siRNA) preferentially induced genes repressed by adenovirus expressing CUX2 (adeno-CUX2) in male liver, and it preferentially repressed genes induced by adeno-CUX2 in male liver. CUX2 binding in female liver chromatin was enriched at sites of male-biased DNase hypersensitivity and at genomic regions showing male-enriched STAT5 binding. CUX2 binding was also enriched near genes repressed by adeno-CUX2 in male liver or induced by CUX2 siRNA in female liver but not at genes induced by adeno-CUX2, indicating that CUX2 binding is preferentially associated with gene repression. Nevertheless, direct CUX2 binding was seen at several highly female-specific genes that were positively regulated by CUX2, including A1bg [A1BG in humans], Cyp2b9, Cyp3a44, Tox [TOX in humans], and Trim24 [TRIM24 in humans]."[14]

ABA-response elements

"The ABA responsive element (ABRE) is a key cis‐regulatory element in ABA signalling. However, its consensus sequence (ACGTG(G/T)C) is present in the promoters of only about 40% of ABA‐induced genes in rice aleurone cells, suggesting other ABREs may exist."[15]

"Many ABA‐inducible genes in various species contain a conserved cis‐regulatory ABA responsive element (ABRE) with the consensus sequence ACGTG(G/T)C (Hattori et al. 2002; Shen et al. 2004)."[15]

ABRE core promoters

Positive strand, positive direction: 5'-ACGTGGC-3' at 4344 and complement.

ABRE proximal promoters

Positive strand, negative direction: 5'-ACGTGGC-3' at 4239 and complement.

ABRE distal promoters

Negative strand, negative direction: 5'-GACACGT-3' at 3429 and complement.

Positive strand, positive direction: 5'-GACACGT-3' at 2960, 5'-ACGTGTC-3' at 1823 and complements.

Abf1 regulatory factors

Abfm regulatory factor distal promoters

Positive strand, negative direction: 5'-CGTTCTTTATGAT-3' at 352 and complement.

Positive strand, positive direction: 5'-CGTCACCGGTGAC-3' at 2073, 5'-CGTTCGGTGTGAC-3' at 346 and complements.

A boxes

"Most bZIP proteins show high binding affinity for the ACGT motifs, which include CACGTG (G box), GACGTC (C box), TACGTA (A box), AACGTT (T box), and a GCN4 motif, namely TGA(G/C)TCA (Landschulz et al., 1988;[16] Nijhawan et al., 2008[17])."[7]

"The human TGF-β1 promoter region contains two binding sequences for AP-1, designated AP-1 box A (TGACTCT) and box B (TGTCTCA), which mediate the up-regulation of promoter activity after [High glucose] HG stimulation."[18]

A box proximal promoters

Negative direction: 5'-TACGTA-3' at 4246 and complement.

A box distal promoters

Positive direction: 5'-TACGTA-3' at 3071 and complement.

Box A distal promoters

Negative direction: 5'-TGACTCT-3' at 2788 and complement.

Positive direction: 5'-TCTCAGT-3' at 2613 and complement.

Abscisic acid-responsive elements

The abscisic acid-responsive element has the sequence CACGTG.[19]

"The [palindromic E-box motif (CACGTG)] motif is bound by the transcription factor Pho4, [and has the] class of basic helix-loop-helix DNA binding domain and core recognition sequence (Zhou and O'Shea 2011)."[5]

The Pho4 homodimer binds to DNA sequences containing the bHLH binding site 5'-CACGTG-3'.[20]

The upstream activating sequence (UAS) for Pho4p is 5'-CAC(A/G)T(T/G)-3' in the promoters of HIS4 and PHO5; it is associated with the response to phosphate limitation and with regulation of the purine and histidine biosynthesis pathways [66].[21]

Phop core promoters

Positive strand, negative direction: 5'-CACATT-3' at 4533 and complement.

Pho4 distal promoters

Negative strand, positive direction: 5'-CACGTG-3' at 570 and complement.

Positive strand, positive direction: 5'-CACGTG-3' at 3884, 5'-CACGTG-3' at 2961, 5'-CACGTG-3' at 1219, and 5'-CACGTG-3' at 547 and complements.

Phop distal promoters

Negative strand, negative direction: 5'-TTACAC-3' at 4091, 5'-AACGTG-3' at 3288, 3'-CACGTT-5' at 2864, 3'-CACATT-5' at 2087, 5'-TTACAC-3' at 2064, 5'-AACGTG-3' at 1718, 3'-CACGTT-5' at 1536, 5'-AACGTG-3' at 1346, 5'-AACGTG-3' at 1338, 3'-CACATG-5' at 797, 3'-CACATT-5' at 612, and 3'-CACATG-5' at 324 and complements.

Positive strand, negative direction: 5'-CACATG-3' at 2667, 5'-CACGTT-3' at 343 and complements.

Negative strand, positive direction: 5'-CATGTG-3' at 3958, 5'-CACATG-3' at 3956, 5'-CATGTG-3' at 3902, 5'-CACATG-3' at 3742, 5'-CACATG-3' at 3707, 5'-CACATG-3' at 2031, 5'-CACGTG-3' at 570, 5'-AATGTG-3' at 229 and complements.

Positive strand, positive direction: 5'-CACGTG-3' at 3884, 5'-CACGTG-3' at 2961, 5'-CACGTT-3' at 2801, 5'-CACGTT-3' at 2335, 5'-CACGTG-3' at 1219, 5'-CACGTG-3' at 547 and complements.

ACA boxes

The "3' end of mature hTR (45) has an ACA trinucleotide 3 nt upstream of its 3' end. In addition, the 3' region of hTR contains a single H box consensus sequence (5'-AGAGGA-3')."[22]

H and ACA box core promoters

Positive strand, negative direction: 5'-AGGACA-3' at 4468 and complement.

H and ACA box proximal promoters

Negative strand, positive direction: 3'-AGGACA-5' at 4252 and complement.

H and ACA box distal promoters

Negative strand, negative direction: 5'-AGGACA-3' at 1911 and complement.

Negative strand, positive direction: 5'-ACAGGA-3' at 3572, 3'-AGGACA-5' at 3131, 3'-AGGACA-5' at 2460 and complements.

Positive strand, negative direction: 5'-AGGACA-3' at 3756, 5'-AGGACA-3' at 3389, 5'-ACAGGA-3' at 2690 and complements.

Positive strand, positive direction: 5'-AGGACA-3' at 3622, 5'-ACAGGA-3' at 3620, 5'-AGGACA-3' at 144 and complements.

ACGT-containing elements

A box proximal promoters

Negative direction: 5'-TACGTA-3' at 4246 and complement.

A box distal promoters

Positive direction: 5'-TACGTA-3' at 3071 and complement.

ABRE core promoters

Positive strand, positive direction: 5'-ACGTGGC-3' at 4344 and complement.

ABRE proximal promoters

Positive strand, negative direction: 5'-ACGTGGC-3' at 4239 and complement.

ABRE distal promoters

Negative strand, negative direction: 5'-GACACGT-3' at 3429 and complement.

Positive strand, positive direction: 5'-GACACGT-3' at 2960, 5'-ACGTGTC-3' at 1823 and complements.

ACE proximal promoters

Negative strand, negative direction: 5'-ACGTG-3' at 4339 and complement.

Positive strand, negative direction: 5'-ACGTG-3' at 4237 and complement.

ACE distal promoters

Negative strand, negative direction: 5'-CACGT-3' at 3429, 5'-ACGTG-3' at 3288, 5'-CACGT-3' at 2863, 5'-ACGTG-3' at 2760, 5'-ACGTG-3' at 2425, 5'-CACGT-3' at 2081, 5'-ACGTG-3' at 1999, 5'-ACGTG-3' at 1718, 5'-CACGT-3' at 1535, 5'-CACGT-3' at 1470, 5'-ACGTG-3' at 1346, 5'-ACGTG-3' at 1338 and complements.

Negative strand, positive direction: 5'-CACGT-3' at 3254, 5'-ACGTG-3' at 570, 5'-CACGT-3' at 569, and complements.

Positive strand, negative direction: 5'-CACGT-3' at 1772, 5'-CACGT-3' at 531, 5'-CACGT-3' at 342, and complements.

Positive strand, positive direction: 5'-CACGT-3' at 3960, 5'-ACGTG-3' at 3884, 5'-CACGT-3' at 3883, 5'-CACGT-3' at 3464, 5'-ACGTG-3' at 3342, 5'-ACGTG-3' at 3321, 5'-ACGTG-3' at 2961, 5'-CACGT-3' at 2960, 5'-CACGT-3' at 2800, 5'-CACGT-3' at 2681, 5'-CACGT-3' at 2334, 5'-CACGT-3' at 2326, 5'-CACGT-3' at 2063, 5'-ACGTG-3' at 1821, 5'-CACGT-3' at 1786, 5'-ACGTG-3' at 1471, 5'-ACGTG-3' at 1371, 5'-ACGTG-3' at 1219, 5'-CACGT-3' at 1218, 5'-CACGT-3' at 783, 5'-ACGTG-3' at 547, 5'-CACGT-3' at 546, and complements.

Activating transcription factor (Burton) distal promoters

Negative strand, positive direction: 5'-TGACGTAAG-3' at 2207 and complement.

cAMP response element proximal promoters

Negative strand, negative direction: 5'-TGACGTCA-3' at 4317.

C-box (Song) core promoters

Positive strand, positive direction: 5'-GACGTC-3' at 4316, and complement.

C-box (Song) proximal promoters

Negative strand, negative direction: 5'-GACGTC-3' at 4316 and complement.

C-box (Song) distal promoters

Positive strand, positive direction: 5'-GACGTC-3' at 3280, 5'-GACGTC-3' at 3231, 5'-GACGTC-3' at 2858, 5'-GACGTC-3' at 1506, 5'-GACGTC-3' at 1120, 5'-GACGTC-3' at 532, 5'-GACGTC-3' at 437, 5'-GACGTC-3' at 193, and complements.

C/G-box hybrid (Song) distal promoters

Positive strand, positive direction: 5'-ACACGTCA-3' at 3962 and complement.

CRE box proximal promoters

Negative strand, negative direction: 5'-TGACGTCA-3' at 4317 and complement.

Enhancer box distal promoters

Negative strand, positive direction: 5'-CACGTG-3' at 570 and complement.

Positive strand, positive direction: 5'-CACGTG-3' at 3884, 5'-CACGTG-3' at 2961, 5'-CACGTG-3' at 1219, 5'-CACGTG-3' at 547, and complements.

Initiator element (YYANWYY) core promoters

Positive strand, positive direction: 5'-GACGTGG-3' at 4343 and complement.

Initiator element (YYANWYY) proximal promoters

Negative strand, negative direction: 5'-GACGTGA-3' at 4340, and complement.

Initiator element (YYANWYY) distal promoters

Negative strand, negative direction: 5'-TTACGTC-3' at 3772, 5'-AACGTGA-3' at 3289, 5'-GACGTGG-3' at 2761, 5'-GACGTGA-3' at 2426, 5'-CCACGTC-3' at 2082, 5'-GACGTGA-3' at 2000, 5'-TCACGTT-3' at 1536, 5'-TCACGTC-3' at 1471, 5'-AACGTGA-3' at 1347, 5'-AACGTGG-3' at 1339, 5'-GACGTAA-3' at 152, and complements.

Positive strand, negative direction: 5'-GACGTGG-3' at 4238, 5'-TCACGTC-3' at 1773, and complements.

Negative strand, positive direction: 5'-TCACGTC-3' at 3255, and complement.

Positive strand, positive direction: 5'-TCACGTC-3' at 3465, 5'-CTACGTC-3' at 3460, 5'-AACGTAG-3' at 3402, 5'-AACGTGA-3' at 3343, 5'-GACGTGG-3' at 3322, 5'-CCACGTT-3' at 2801, 5'-CCACGTT-3' at 2335, 5'-TCACGTC-3' at 2327, 5'-TCACGTC-3' at 2064, 5'-GACGTAA-3' at 2206, 5'-TCACGTC-3' at 1787, 5'-GACGTGA-3' at 1472, 5'-GACGTGA-3' at 1372, 5'-CCACGTC-3' at 784, and complements.

Initiator element (BBCABW) core promoters

Positive strand, positive direction: 5'-TGACGT-3' at 4341, 5'-TGACGT-3' at 4338, 5'-TGACGT-3' at 4330, 5'-ACGTCT-3' at 4317, and complements.

Initiator element (BBCABW) proximal promoters

Negative strand, negative direction: 5'-ACGTGA-3' at 4340, 5'-ACGTCA-3' at 4317, 5'-TGACGT-3' at 4315, and complements.

Initiator element (BBCABW) distal promoters

Negative strand, negative direction: 5'-TTACGT-3' at 3771, 5'-ACGTCT-3' at 3431, 5'-ACACGT-3' at 3429, 5'-ACGTGA-3' at 3289, 5'-ACACGT-3' at 2863, 5'-TGACGT-3' at 2759, 5'-ACGTCA-3' at 2737, 5'-ACGTGA-3' at 2426, 5'-TGACGT-3' at 2424, 5'-ACGTCA-3' at 2402, 5'-ACGTCA-3' at 2083, 5'-ACGTGA-3' at 2000, 5'-TGACGT-3' at 1998, 5'-ACGTCA-3' at 1976, 5'-ACGTGT-3' at 1719, 5'-TCACGT-3' at 1535, 5'-TGACGT-3' at 1494, 5'-ACGTCA-3' at 1472, 5'-ACGTGA-3' at 1347, 5'-ACGTCA-3' at 1323, 5'-ACGTCA-3' at 1032, 5'-ACGTAA-3' at 152, and complements.

Positive strand, negative direction: 5'-AGACGT-3' at 4236, 5'-ACGTCT-3' at 1774, 5'-TCACGT-3' at 1772, 5'-ACGTAA-3' at 533, 5'-ACACGT-3' at 531, 5'-ACACGT-3' at 342, and complements.

Negative strand, positive direction: 5'-ACGTCT-3' at 3256, 5'-TCACGT-3' at 3254, 5'-ACACGT-3' at 569, and complements.

Positive strand, positive direction: 5'-ACGTCA-3' at 3962, 5'-ACACGT-3' at 3960, 5'-ACGTCT-3' at 3831, 5'-TCACGT-3' at 3464, 5'-ACGTCA-3' at 3461, 5'-ACGTGA-3' at 3343, 5'-TGACGT-3' at 3320, 5'-ACGTCA-3' at 3281, 5'-AGACGT-3' at 3279, 5'-AGACGT-3' at 3268, 5'-ACGTCA-3' at 3232, 5'-ACGTAA-3' at 3072, 5'-TTACGT-3' at 3070, 5'-AGACGT-3' at 3061, 5'-ACGTGT-3' at 2962, 5'-ACACGT-3' at 2960, 5'-ACGTCT-3' at 2859, 5'-AGACGT-3' at 2857, 5'-ACGTCT-3' at 2721, 5'-ACACGT-3' at 2681, 5'-ACGTCA-3' at 2328, 5'-TCACGT-3' at 2326, 5'-ACGTAA-3' at 2206, 5'-TGACGT-3' at 2204, 5'-ACGTCA-3' at 2065, 5'-TCACGT-3' at 2063, 5'-ACGTCT-3' at 1937, 5'-ACGTGT-3' at 1822, 5'-TCACGT-3' at 1786, 5'-TGACGT-3' at 1505, 5'-ACGTGA-3' at 1472, 5'-ACGTGA-3' at 1372, 5'-ACGTGT-3' at 1220, 5'-ACGTGT-3' at 548, 5'-ACGTCT-3' at 438, 5'-AGACGT-3' at 224, and complements.

MRE proximal promoters

Negative strand, negative direction: 5'-ACGTGAG-3' at 4341 and complement.

MRE distal promoters

Negative strand, negative direction: 5'-ACGTGAG-3' at 3290, 5'-CACACGT-3' at 2863, 5'-ACGTGGG-3' at 2762, 5'-ACGTGAG-3' at 2427, 5'-ACGTGAG-3' at 2001, 5'-CTCACGT-3' at 1470, 5'-ACGTGAG-3' at 1348, and complements.

Positive strand, negative direction: 5'-ACGTGGG-3' at 3323, 5'-ACGTGTG-3' at 2963, 5'-CTCACGT-3' at 1772, 5'-ACGTGAG-3' at 1473, 5'-ACGTGAG-3' at 1373, 5'-ACGTGTG-3' at 1221, 5'-ACGTGTG-3' at 549, 5'-CACACGT-3' at 531, and complements.

Positive strand, positive direction: 5'-CCCACGT-3' at 3883, 5'-CCCACGT-3' at 2800, 5'-CTCACGT-3' at 2326, 5'-CTCACGT-3' at 1786, 5'-CGCACGT-3' at 1218, 5'-CGCACGT-3' at 546, and complements.

Phop distal promoters

Negative strand, negative direction: 5'-AACGTG-3' at 3288, 3'-CACGTT-5' at 2864, 5'-AACGTG-3' at 1718, 3'-CACGTT-5' at 1536, 5'-AACGTG-3' at 1346, 5'-AACGTG-3' at 1338 and complements.

Positive strand, negative direction: 5'-CACGTT-3' at 343 and complements.

Negative strand, positive direction: 5'-CACGTG-3' at 570 and complement.

Positive strand, positive direction: 5'-CACGTG-3' at 3884, 5'-CACGTG-3' at 2961, 5'-CACGTT-3' at 2801, 5'-CACGTT-3' at 2335, 5'-CACGTG-3' at 1219, 5'-CACGTG-3' at 547 and complements.

Z box distal promoters

Positive strand, positive direction: 5'-ACACGTGT-3' at 2962 and complement.

Activating protein 2

"AP-2 proteins can bind to G/C-rich elements, such as 5’-[G/C]CCN(3,4)GG[G/C]-3’ (41, 42)."[23]

A consensus sequence for activating protein 2 (AP-2) is GCCTGGCC.[24]
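The G/C-rich AP-2 element quoted above, 5'-[G/C]CCN(3,4)GG[G/C]-3', has a variable-length spacer, which a plain text search cannot express. A minimal sketch using a regular expression follows; `find_ap2` is a hypothetical helper, positions are reported as 1-based starts (an assumption, since the numbering convention used on this page is not stated), and only one match is reported per start position.

```python
import re

# [G/C]CCN(3,4)GG[G/C]; the lookahead lets overlapping sites be reported.
AP2 = re.compile(r"(?=([GC]CC[ACGT]{3,4}GG[GC]))")


def find_ap2(seq):
    """Return (1-based start, matched site) for each AP-2-like site in seq."""
    return [(m.start() + 1, m.group(1)) for m in AP2.finditer(seq.upper())]


# A toy sequence containing one of the 9-mers listed on this page.
print(find_ap2("AACCCTGGGGCTT"))
```

To search the opposite strand, the same function can be applied to the reverse complement of the sequence.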

Activating protein (Murata) core promoters

Negative strand, positive direction: 5'-CCCTGGGGC-3' at 4427, 5'-CCCTTGGGG-3' at 4302 and complement.

Activating protein (Murata) proximal promoters

Negative strand, positive direction: 5'-CCCATGGGG-3' at 4224, 5'-CCCCATGGG-3' at 4223, and complements.

Activating protein (Murata) distal promoters

Negative strand, negative direction: 5'-CCCTGCGGC-3' at 1154 and complement.

Negative strand, positive direction: 5'-GCCCTGGGC-3' at 3498, 5'-GCCAATGGG-3' at 2911, 5'-GCCTCTGGC-3' at 2884, 5'-CCCTTAGGG-3' at 2766, 5'-GCCACCGGC-3' at 1547, 5'-GCCACCGGC-3' at 1295, 5'-GCCAGCGGC-3' at 332, 5'-CCCTCAGGC-3' at 91, and complements.

Positive strand, negative direction: 5'-CCCAAGGGC-3' at 1820 and complement.

Positive strand, positive direction: 5'-CCCGTTGGC-3' at 3912, 5'-CCCTGTGGG-3' at 3533, 5'-GCCAACGGG-3' at 3493, 5'-CCCAGAGGC-3' at 1961, 5'-GCCGGTGGG-3' at 1852, 5'-GCCCGCGGG-3' at 1770, 5'-CCCGGCGGC-3' at 1758, 5'-GCCCCCGGC-3' at 1647, 5'-CCCGACGGC-3' at 483, 5'-CCCTCCGGG-3' at 372, and complements.

Activating protein (Cohen) distal promoters

Negative strand, negative direction: 5'-CCGGTCCG-3' at 4103, 5'-CGGACCGG-3' at 3130, 5'-CCGGTCCG-3' at 2520, 5'-CGGACCGG-3' at 1200, 5'-CCGGTCCG-3' at 649 and complements.

Negative strand, positive direction: 5'-GCCTGGCC-3' at 3681, 5'-GCCTGGCC-3' at 2990, 5'-GGCCAGGC-3' at 1176 and complements.

Activating protein (Cohen2) distal promoters

Positive strand, positive direction: 5'-TCCCCCGCCC-3' at 4440 and complement.

Activating protein (Yao1) proximal promoters

Positive strand, positive direction: 5'-CCCTTCT-3' at 4264 and complement.

Activating protein (Yao1) distal promoters

Negative strand, negative direction: 5'-TCTTCCC-3' at 1657, 5'-GGGAAGA-3' at 620 and complements.

Activating protein (Yao2) core promoters

Negative strand, negative direction: 5'-ACCCTC-3' at 4549, 5'-ACCCTC-3' at 4497 and complements.

Activating protein (Yao2) proximal promoters

Negative strand, negative direction: 5'-ACCCTC-3' at 4303, 5'-ACCCTC-3' at 4271, 5'-GAGGGT-3' at 4259, 5'-ACCCTC-3' at 4153 and complements.

Activating protein (Yao2) distal promoters

Negative strand, negative direction: 5'-ACCCTC-3' at 3989, 5'-ACCCTC-3' at 3752, 5'-ACCCTC-3' at 3714, 5'-ACCCTC-3' at 3080, 5'-ACCCTC-3' at 2221, 5'-ACCCTC-3' at 2104, 5'-ACCCTC-3' at 1962, 5'-ACCCTC-3' at 1930, 5'-ACCCTC-3' at 1795, 5'-ACCCTC-3' at 1018, 5'-ACCCTC-3' at 686, 5'-ACCCTC-3' at 550, 5'-ACCCTC-3' at 413, 5'-GAGGGT-3' at 389 and complements.

Negative strand, positive direction: 5'-CTCCCA-3' at 3333, 5'-CTCCCA-3' at 2532, 5'-CTCCCA-3' at 2396, 5'-CTCCCA-3' at 2383, 5'-TGGGAG-3' at 1782, 5'-CTCCCA-3' at 466 and complements.

Positive strand, positive direction: 5'-CTCCCA-3' at 3880, 5'-CTCCCA-3' at 2797, 5'-CTCCCA-3' at 182.

Activating transcription factors

"The ATF4 binding consensus sequence has been reported as (G/A/C)TT(G/A/T)C(G/A)TCA (38), which matches the ChIP-seq data."[25]

Combined consensus sequences are XTTXCATCA (where X = G, A or T), TTTTCATCA, and (G/A/C)TT(G/A/T)C(G/A)TCA to produce 5'-NTT(A/G/T)C(A/G)TCA-3'.
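The combined consensus can be checked against candidate sites on either strand. The sketch below converts the bracket notation used here into a regular expression; all function names are hypothetical, and the example sites are the ones listed under the activating transcription factor headings on this page.

```python
import re

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def bracket_to_regex(consensus):
    """Convert notation such as 'NTT(A/G/T)C(A/G)TCA' to a regex; N = any base."""
    def expand(match):
        return "[" + match.group(1).replace("/", "") + "]"
    pattern = re.sub(r"\(([ACGT/]+)\)", expand, consensus.upper())
    return pattern.replace("N", "[ACGT]")


def reverse_complement(seq):
    return seq.upper().translate(COMPLEMENT)[::-1]


def matches_either_strand(site, consensus):
    """True if the site or its reverse complement fits the consensus exactly."""
    rx = re.compile(bracket_to_regex(consensus))
    site = site.upper()
    return bool(rx.fullmatch(site) or rx.fullmatch(reverse_complement(site)))


ATF4 = "NTT(A/G/T)C(A/G)TCA"
for site in ("ATTTCATCA", "TGACGAAAC", "CTTGCGTCA", "TGACGTAAG"):
    print(site, matches_either_strand(site, ATF4))
```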

Copying the ATF4 consensus variants 5'-TTTTCA-3', 5'-CTTTCGTCA-3', 5'-GTTTCA-3', 5'-GTTTCATC-3', or 5'-ATTTCAT-3' and putting each sequence in "⌘F" finds no, no, no, no, and no locations between ZSCAN22 and A1BG and no, one, no, no, no, and no locations between ZNF497 and A1BG, as can be found by the computer programs.

Activating transcription factor (Burton) distal promoters

Positive strand, negative direction: 5'-ATTTCATCA-3' at 2888, 5'-TGACGAAAC-3' at 313 and complement.

Negative strand, positive direction: 5'-CTTGCGTCA-3' at 2423, 5'-TGACGTAAG-3' at 2207, 5'-TGATGAAAC-3' at 2147, 5'-CTTTCGTCA-3' at 1184 and complements.

Activating transcription factor (Kilberg) distal promoters

Positive strand, negative direction: 5'-ATTTCATCA-3' at 2888 and complement.

Negative strand, positive direction: 5'-TGATGAAAC-3' at 2147 and complement.

Adr1ps

The upstream activating sequence (UAS) for Adr1p is 5'-TTGGGG-3' or 5'-TTGG(A/G)G-3'.[21]

Copying 5'-TTGGGG-3' in "⌘F" yields six between ZSCAN22 and A1BG and one between ZNF497 and A1BG as can be found by the computer programs.

Aft1ps

The upstream activating sequence (UAS) for Aft1p is 5'-PyPuCACCCPu-3' or 5'-(C/T)(A/G)CACCC(A/G).[21]

Copying 5'-TGCACCC-3' in "⌘F" yields none between ZSCAN22 and A1BG and one 5'-TGCACCCG-3' between ZNF497 and A1BG as can be found by the computer programs.

AGC boxes

"The GCC box, also referred to as the AGC box (10), GCC element (11), or AGCCGCC sequence (13), is an ethylene-responsive element found in the promoters of a large number of [pathogenesis related] PR genes whose expression is up-regulated following pathogen attack."[26]

AGC box distal promoters

Negative strand, negative direction: 5'-CCGCCGA-3' at 1754 nts and complement.

Angiotensinogen core promoter elements

The consensus sequence is 5'-A/C-T-C/T-3'.[27] The core nucleotides for AGCE1 include 5'-A/C-T-C/T-G-T-G-3', "located between the TATA box and transcription initiation site (positions −25 to −1) is an authentic regulator of human AG transcription."[28]
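Because the AGCE1 core 5'-A/C-T-C/T-G-T-G-3' is degenerate at two positions, it stands for four concrete hexamers, each with its own reverse complement. A small sketch that enumerates them is given below; the names `AGCE1`, `expand`, `forward` and `reverse` are illustrative only.

```python
from itertools import product

COMPLEMENT = str.maketrans("ACGT", "TGCA")

# One string of allowed bases per position of 5'-A/C-T-C/T-G-T-G-3'.
AGCE1 = ["AC", "T", "CT", "G", "T", "G"]


def expand(positions):
    """List every concrete sequence described by per-position alternatives."""
    return ["".join(bases) for bases in product(*positions)]


forward = expand(AGCE1)
reverse = [s.translate(COMPLEMENT)[::-1] for s in forward]

print(forward)  # ['ATCGTG', 'ATTGTG', 'CTCGTG', 'CTTGTG']
print(reverse)  # ['CACGAT', 'CACAAT', 'CACGAG', 'CACAAG']
```

These eight hexamers are the same set of sequences that appear in the AGCE promoter lists.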

AGCE core promoters

Negative direction: 5'-CACGAG-3' at 4472 and complement.

Positive direction: 5'-CTCGTG-3' at 4376 and complement.

AGCE proximal promoters

Negative direction: 5'-CACGAG-3' at 4403 and complement.

AGCE distal promoters

Negative direction: 5'-CTCGTG-3' at 3914, 5'-CTTGTG-3' at 3669, 5'-CACAAG-3' at 3634, 5'-CACAAT-3' at 3515, 5'-CACGAG-3' at 3232, 5'-CACAAG-3' at 2244, 5'-ATCGTG-3' at 2096, 5'-CACAAT-3' at 1721, 5'-CACGAG-3' at 1182, 5'-CACGAG-3' at 708, 5'-CACGAG-3' at 572, 5'-CACGAG-3' at 435, 5'-ATTGTG-3' at 340, 5'-CACGAT-3' at 336 and complements.

Positive direction: 5'-CTCGTG-3' at 3739, 5'-CACGAG-3' at 3152, 5'-CTTGTG-3' at 3095, 5'-ATTGTG-3' at 2679, 5'-CACGAG-3' at 2090, 5'-CTCGTG-3' at 1627, 5'-CTCGTG-3' at 1207, 5'-CTCGTG-3' at 955, 5'-CTCGTG-3' at 855, 5'-CACGAG-3' at 243, 5'-CACAAG-3' at 107 and complements.

ATA boxes

"The 3' flanking area contained the highly conserved hexanucleotide sequence A-A-T-A-A-A found in eukaryotic messages between the terminator codon and the polyadenylylation site (44)."[29]

ATA core promoters

Negative direction: 5'-AAATAA-3' at 4537 and complement.

ATA proximal promoters

Negative direction: 5'-AAATAA-3' at 4221 and complement.

ATA distal promoters

Negative direction: 5'-AAATAA-3' at 4075, 5'-AATAAA-3' at 4072, 5'-AAATAA-3' at 4071, 5'-AATAAA-3' at 3335, 5'-AAATAA-3' at 3334, 5'-AATAAA-3' at 3014, 5'-AAATAA-3' at 3013, 5'-AATAAA-3' at 1726 and complements.

Positive direction: 5'-AAATAA-3' at 703 and complement.

Auxin response factors

The "genome binding of two [auxin response factors] ARFs (ARF2 and ARF5/Monopteros [MP]) differ largely because these two factors have different preferred ARF binding site (ARFbs) arrangements (orientation and spacing)."[30] "ARFbs were originally defined as TGTCTC (Ulmasov et al., 1995, Guilfoyle et al., 1998), [...]. More recently, protein binding microarray (PBM) experiments suggested that TGTCGG are preferred ARFbs, [...] (Boer et al., 2014, Franco-Zorrilla et al., 2014, Liao et al., 2015)."[30]

A more general consensus sequence may be 1(C/G/T)-2N-3(G/T)-4G-5(C/T)-6(C/T)-7N-8N-9N-10N, where ARF2[b] is 1(C/G/T)-2(A/C/T)-3(G/T)-4G-5(C/T)-6(C/T)-7(G/T)-8(C/G)-9(A/C/T)-10(A/G/T) and ARF5/MP[b] is 1(C/G/T)-2N-3(G/T)-4G-5T-6C-7(G/T)-8N-9-10N.[30] ARF1[b] has 4G.[30]

Copying an auxin response factor consensus sequence 5'-TGTCGG-3' and putting the sequence in "⌘F" finds no locations between ZNF497 and A1BG and one location between ZSCAN22 and A1BG as can be found by the computer programs.

B boxes

While there appear to be at least two B boxes, TGGGCA is one B-box,[31] where the "mP2 EB fragment used for binding was the 118 nucleotide fragment extending from the Dde I site at position -140 to the Dde I site at position -23 [...]. This fragment contains the GC, E, B, CAAT, and TATA boxes."[31]

B box proximal promoters

Negative direction: 5'-TGCCCA-3' at 4251 and complement.

Positive direction: 5'-TGGGCA-3' at 4180 and complement.

B box distal promoters

Negative direction: 5'-TGGGCA-3' at 4191, 5'-TGGGCA-3' at 4040, 5'-TGCCCA-3' at 3883, 5'-TGCCCA-3' at 3854, 5'-TGGGCA-3' at 3301, 5'-TGGGCA-3' at 2773, 5'-TGGGCA-3' at 2438, 5'-TGCCCA-3' at 1458, 5'-TGGGCA-3' at 1359, 5'-TGGGCA-3' at 1114, 5'-TGGGCA-3' at 902, 5'-TGGGCA-3' at 462, and complements.

Positive direction: 5'-TGCCCA-3' at 3750, 5'-TGCCCA-3' at 3377, 5'-TGCCCA-3' at 3237, 5'-TGGGCA-3' at 2894, 5'-TGGGCA-3' at 1945, 5'-TGGGCA-3' at 27 and complements.

The other B box is associated with the human transforming growth factor β1 (TGF-β1) binding sequences[32] and has the consensus sequence 5'-TGTCTCA-3'. Let it be designated the B1 box.

B1 box proximal promoters

Negative direction: 5'-TGTCTCA-3' at 4373 and complement.

B1 box distal promoters

Negative direction: 5'-TGTCTCA-3' at 3323, 5'-TGTCTCA-3' at 2445, 5'-TGTCTCA-3' at 2033, 5'-TGAGACA-3' at 2029, 5'-TGTCTCA-3' at 1089, 5'-TGAGACA-3' at 1085, 5'-TGTCTCA-3' at 1075, 5'-TGTCTCA-3' at 923, 5'-TGAGACA-3' at 919 and complements.

Positive direction: 5'-TGTCTCA-3' at 2468, 5'-TGAGACA-3' at 2308, 5'-TGTCTCA-3' at 2174 and complements.

B recognition elements

The factor II B recognition element is BREu.

"The transcription factor II B recognition elements BREu "CGACGCA" and BREd "ATGGTTG" were upstream (− 279 to − 273 of the transcript) and downstream (− 165 to − 159 of the transcript) of the TATA box, respectively."[33]

The general consensus sequence using degenerate nucleotides is 5’-SSRCGCC-3’, where S = G or C and R = A or G.[34]

The consensus sequence is 5’-G/C G/C G/A C G C C-3’.[35]
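The degenerate-nucleotide form 5'-SSRCGCC-3' can be turned into a search pattern with the standard IUPAC ambiguity codes. The sketch below is one way to do this; `iupac_to_regex` and `BREU` are hypothetical names, and only the forward orientation is tested.

```python
import re

# Standard IUPAC nucleotide ambiguity codes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "[AG]", "Y": "[CT]", "S": "[CG]", "W": "[AT]",
    "K": "[GT]", "M": "[AC]", "B": "[CGT]", "D": "[AGT]",
    "H": "[ACT]", "V": "[ACG]", "N": "[ACGT]",
}


def iupac_to_regex(motif):
    """Translate an IUPAC motif into an equivalent regular expression."""
    return "".join(IUPAC[base] for base in motif.upper())


BREU = re.compile(iupac_to_regex("SSRCGCC"))

for site in ("CCACGCC", "CCGCGCC", "CGACGCC", "CCTGCGG"):
    print(site, bool(BREU.fullmatch(site)))
```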

BREu distal promoters

Negative strand, negative direction: 5'-CCACGCC-3' at 2197, 5'-CCGCGCC-3' at 1762, 5'-CCTGCGG-3' at 1153, 5'-CCACGCC-3' at 380, and complements.

Positive strand, negative direction: 5'-GGCGTGG-3' at 3047, 5'-GGCGTGG-3' at 1897, 5'-GGCGTGG-3' at 1244, and complements.

Negative strand, positive direction: 5'-GGCGCCC-3' at 1770, 5'-GGGCGCC-3' at 1769, 5'-GGACGCC-3' at 1672, 5'-GCACGCC-3' at 1302 and complements.

Positive strand, positive direction: 5'-GGCGTGG-3' at 2566, 5'-CCACGCC-3' at 1764, 5'-GGCGCCG-3' at 1438, 5'-GGCGCCG-3' at 1338, 5'-CGACGCC-3' at 1033, 5'-GGCGCGC-3' at 682, 5'-CCACGCC-3' at 489, and complements.

CadC binding domains

"Altogether, the specific contacts observed suggest a consensus binding motif of 5′-T-T-A-x-x-x-x-T-3′."[36] "Dimerization of [cadaverine C-terminal] CadC enables the binding of two DBDs to the two Cad1 consensus target sites."[36] "The DNA consensus sequence 5′-T-T-A-x-x-x-x-T-3′ is present once in the quasi-palindromic Cad1 17-mer DNA, consistent with the formation of a 1:1 complex. However, a second consensus facilitates the formation of the 2:1 complex of CadC with Cad1 41-mer DNA as evidenced by the CadC model with the minimal Cad1 26-mer DNA that spans the two AT-rich regions, i.e. consensus sites."[36]

Copying the cadaverine C-terminal binding domain consensus sequence 5'-T-T-A-x-x-x-x-T-3' and putting the sequence in "⌘F" finds one location between ZNF497 and A1BG and four locations between ZSCAN22 and A1BG as can be found by the computer programs.
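A browser search cannot treat the "x" positions of 5'-T-T-A-x-x-x-x-T-3' as wildcards, so a scripted search is one option. The following is a minimal sketch, assuming "x" means any of A, C, G or T; `wildcard_search` is a hypothetical helper, and the test sequence is a toy example rather than the A1BG region.

```python
import re


def wildcard_search(motif, seq):
    """Find motif in seq, reading 'x' as any base; overlapping hits included.

    Returns a list of (1-based start, matched sequence).
    """
    body = "".join("[ACGT]" if ch in "xX" else ch.upper() for ch in motif)
    pattern = re.compile("(?=(" + body + "))")
    return [(m.start() + 1, m.group(1)) for m in pattern.finditer(seq.upper())]


# Two overlapping sites in a made-up sequence.
print(wildcard_search("TTAxxxxT", "TTATTACTTTT"))
```

The lookahead form is used so that a site beginning inside another site is still counted.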

Carbohydrate response elements

A high glucose "HG environment promotes [carbohydrate response element-binding protein] ChREBP translocation to the nucleus leading to formation of a heterodimeric complex with MLX (Max-like protein X) and binding to the carbohydrate response elements (ChoRE) of ChREBP target genes in the nucleus (14-16)."[37]

"The putative ChREBP binding sites ChoRE1 (CACGTGACCGGATCTTG, -324 to -308) and ChoRE2 (TCCGCCCCCATCACGTG, -298 to - 282) were mutated into CACGTGACGGATCTTG and TCCGCCCCATCACGTG respectively, where the 5-nt spacer between the two E-boxes in ChoRE motifs were shortened to 4-nt (underlined) as previously studies showed (10,35)."[37]

The E-boxes are CACGTG and ATCTTG in ChoRE1 and TCCGCC and CACGTG in ChoRE2.[37]
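The spacer shortening described in the quotation can be made explicit by splitting each ChoRE into its two 6-nt boxes and the spacer between them. The sketch below assumes 6-nt boxes at both ends, as listed above; `split_chore` and the `CHORE` table are illustrative names.

```python
def split_chore(site, box_len=6):
    """Split a ChoRE into (left box, spacer, right box)."""
    return site[:box_len], site[box_len:-box_len], site[-box_len:]


# (wild-type, mutated) sequences as quoted in the text.
CHORE = {
    "ChoRE1": ("CACGTGACCGGATCTTG", "CACGTGACGGATCTTG"),
    "ChoRE2": ("TCCGCCCCCATCACGTG", "TCCGCCCCATCACGTG"),
}

for name, (wild, mutant) in CHORE.items():
    print(name, split_chore(wild), split_chore(mutant))
```

In both cases the boxes are unchanged and the spacer drops from 5 nt to 4 nt.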

Putting the E-box sequences in "⌘F" finds: CACGTG, none between ZSCAN22 and A1BG and one between ZNF497 and A1BG; ATCTTG, none between ZSCAN22 and A1BG and none between ZNF497 and A1BG; and TCCGCC, one between ZSCAN22 and A1BG and none between ZNF497 and A1BG, as can be found by the computer programs.

Copying the ChoRE consensus sequence 5'-ACCGGATCTTG-3' or 5'-TCCGCCCCCAT-3' and putting the sequence in "⌘F" finds none between ZNF497 and A1BG and none between ZSCAN22 and A1BG as can be found by the computer programs.

Copying the core ChoRE consensus sequence 5'-ACCGG-3' or 5'-CCCAT-3' and putting the sequence in "⌘F" finds one and five, respectively, between ZSCAN22 and A1BG and six and three, respectively, between ZNF497 and A1BG as can be found by the computer programs.

CAREs

A CARE occurs in the negative direction: 5'-CAACTC-3' at 86 possibly associated with ZSCAN22. But inverse CAREs occur 5'-CTCAAC-3' at 1406, 5'-CTCAAC-3' at 2592, 5'-CTCAAC-3' at 2704, 5'-CTCAAC-3' at 3115, and 5'-CTCAAC-3' at 4096.

A CARE occurs in the positive direction: 5'-CAACTC-3' at 3292. But inverse CAREs occur: 5'-CTCAAC-3' at 1406, 5'-CTCAAC-3' at 1621, and 5'-CTCAAC-3' at 3290.

CArG boxes

"RIN [Ripening Inhibitor] binds to DNA sequences known as the CA/T-rich-G (CArG) box, which is the general target of MADS box proteins (Ito et al., 2008)."[38]

"MADS-box proteins bind to a consensus sequence, the CArG box, that has the core motif CC(A/T)6GG (15)."[39]

"Of the [Flowering Locus C] FLC binding sites, 69% contained at least one CArG-box motif with the core consensus sequence CCAAAAAT(G/A)G and an AAA extension at the 3′ end [...]."[39]

Three "other MADS-box flowering-time regulators, SOC1, SVP, and AGAMOUS-LIKE 24 (AGL24), bind to two different CArG-box motifs at 502 bp (CTAAATATGG) and 287 bp (CAATAATTGG) upstream of the translation start in the SEP3 gene (24), consistent with different specificities for the different MADS-box proteins."[39] These together with the core motif CC(A/T)6GG (15) suggest a more general CArG-box motif of (C(C/A/T)(A/T)6(A/G)G).
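One way to confirm that the more general motif covers every site quoted above is to write it as a regular expression. This is a Python sketch; the names GENERAL_CARG and CORE_CARG are illustrative, and (A/T)6 is read as six consecutive A or T positions.

```python
import re

# (C(C/A/T)(A/T)6(A/G)G) written as a regular expression.
GENERAL_CARG = re.compile(r"C[CAT][AT]{6}[AG]G")
# The narrower core motif CC(A/T)6GG, for comparison.
CORE_CARG = re.compile(r"CC[AT]{6}GG")

# Sites taken from the quotations: both forms of CCAAAAAT(G/A)G
# and the two SEP3 upstream motifs.
quoted_sites = ["CCAAAAATGG", "CCAAAAATAG", "CTAAATATGG", "CAATAATTGG"]
for site in quoted_sites:
    print(site,
          bool(GENERAL_CARG.fullmatch(site)),
          bool(CORE_CARG.fullmatch(site)))
```

The general pattern accepts all four sites, whereas the core pattern accepts only the first.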

CArG box distal promoters

Positive strand, negative direction: 5'-CATTAAAAGG-3' at 3441, 5'-CAAAAAAAAG-3' at 1399, and complements.

CAT boxes

"The M-CAT consensus sequence [is] CATTCCT".[40]

"A [chloramphenicol acetyltransferase] CAT-box-like element, GCCATT [34], adjacent to the GC-box, is conserved in the three promoters."[40]

C boxes

"Most bZIP proteins show high binding affinity for the ACGT motifs, which include [...] GACGTC (C-box) [...]."[7]

Analysis "of the recombinant (soybean [Glycine max] TGACG-motif binding factor 1) STF1 protein revealed the C-box (nGACGTCn) to be a high-affinity binding site (Cheong et al., 1998). [...] To test whether STF1 and HY5 have similar DNA-binding properties, the binding properties of each were compared with eight different DNA sequences that represent G-, C-, and C/G-box motifs [TGACGTGT]. C-box sequences carrying the mammalian cAMP responsive element (CRE; TGACGTCA) motif and the Hex sequence (TGACGTGGC), a hybrid C/G-box (Cheong et al., 1998), were high-affinity binding sites for both proteins [...]."[41]

The human ribosomal protein L11 gene (HRPL11) has [...] two potential snRNA-coding sequences in intron 4: the C box beginning at +4131 (GGTGATG), [...] a D box beginning at +4237 (TCCTG), [...].[42]

"Members of the box C/D snoRNA family, which are the subject of the present report, possess characteristic sequence elements known as box C (UGAUGA) and box D (GUCUGA)."[43]

Substituting T for U yields C box = 5'-AGTAGT-3' in the translation direction on the template strand.
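The substitution can be sketched in Python, assuming the AGTAGT form is obtained by replacing U with T and then reading the box in the opposite direction. The helper names rna_to_dna and inverse are hypothetical.

```python
def rna_to_dna(rna):
    """Replace U with T to express an RNA box as DNA."""
    return rna.upper().replace("U", "T")

def inverse(seq):
    """Read the sequence in the opposite direction (no complementing)."""
    return seq[::-1]

box_c = rna_to_dna("UGAUGA")  # box C from the snoRNA quotation
box_d = rna_to_dna("GUCUGA")  # box D from the snoRNA quotation
print(box_c, inverse(box_c))
print(box_d, inverse(box_d))
```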

C-box core promoters

Positive strand, positive direction: 5'-GACGTC-3' at 4316,[41] and complement.

C-box proximal promoters

Negative strand, negative direction: 5'-GACGTC-3' at 4316,[41] and complement.

C box distal promoters

Negative strand, negative direction: 5'-AGTAGT-3' at 3521, 5'-AGTAGT-3' at 3418, 5'-AGTAGT-3' at 2944, 5'-AGTAGT-3' at 2888,[43] and complements.

Negative strand, positive direction: 5'-TGTGCAGT-3' at 3962,[41] (hybrid C/G-box) and complement.

Positive strand, negative direction: 5'-GGTGATG-3' at 3798,[42] and complement.

Positive strand, positive direction: 5'-ACTACT-3' at 2144, 5'-AGTAGT-3' at 3251,[43] and complements.

Positive strand, positive direction: 5'-GACGTC-3' at 3280, 5'-GACGTC-3' at 3231, 5'-GACGTC-3' at 2858, 5'-GACGTC-3' at 1506, 5'-GACGTC-3' at 1120, 5'-GACGTC-3' at 532, 5'-GACGTC-3' at 437, 5'-GACGTC-3' at 193,[41] (C-box) and complements.

CCAAT-box-binding transcription factors

CAAT boxes: CCAAT-box-binding transcription factor, TGGCA-binding protein are used by some nuclear factors.[44]

Copying the consensus sequence for the Hap4p 5'-CCAAT-3' and putting the sequence in "⌘F" finds one location between ZNF497 and A1BG or no locations between ZSCAN22 and A1BG as can be found by the computer programs.

CGCG boxes

Negative strand in the negative direction there are 2: 5'-GCGCGT-3', 161, 5'-CCGCGC-3', 1761, in the distal promoter.

Positive strand in the negative direction there is 1: 5'-GCGCGG-3', 1762, in the distal promoter.

Negative strand in the positive direction there are 8: between 543 and 1650, in the distal promoter.

Positive strand in the positive direction there are 22: between 161 and 1769, in the distal promoter.

Cold-responsive elements

A "putative cold-responsive element (CRE) [...] is specified by a conserved 5-bp core sequence (CCGAC) typical for C-repeat (CRT)/dehydration-responsive elements (DRE) that are recognized by cold-specific transcription factors (TFs) [16]."[45]

Copying the consensus of the CRE: 5'-CCGAC-3' and putting the sequence in "⌘F" finds 21 locations between ZSCAN22 and A1BG and five locations between ZNF497 and A1BG as can be found by the computer programs.

Coupling elements

"In barley, the combination of an ABRE and one of two known coupling elements CE1 (TGCCACCGG) and CE3 (GCGTGTC) constitutes an ABA responsive complex (ABRC) in the regulation of the ABA‐inducible genes HVA1 and HVA22 (Shen and Ho 1995; Shen et al. 1996)."[15]

"In Arabidopsis, the CE3 element is practically absent; thus, Arabidopsis relies on paired ABREs to form ABRCs (Gomez‐Porras et al. 2007) or on the coupling of a DRE (TACCGACAT) with ABRE (Narusaka et al. 2003; Nakashima et al. 2006)."[15]

CRE boxes

"Within the cAMP-responsive element of the somatostatin gene, we observed an 8-base palindrome, 5'-TGACGTCA-3', which is highly conserved in many other genes whose expression is regulated by cAMP."[46]

The upstream activating sequence (UAS) for the Aca1p, the basic "leucine zipper (bZIP) transcription factor [55] involved in carbon source utilization" is 5'-TGACGTCA-3'[21] the same as a CRE.

The upstream activating sequence (UAS) for the Sko1p, involved "in osmotic and oxidative stress responses" is 5'-TGACGTCA-3'[21] the same as a CRE.
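The palindromic character of 5'-TGACGTCA-3' means the sequence equals its own reverse complement, so it reads the same on both strands. A short Python check follows, with illustrative function names.

```python
COMPLEMENT = str.maketrans("ACGTN", "TGCAN")

def reverse_complement(seq):
    """Complement each base, then reverse the result."""
    return seq.upper().translate(COMPLEMENT)[::-1]

def is_palindromic(seq):
    """True when the sequence equals its own reverse complement."""
    return seq.upper() == reverse_complement(seq)

print(is_palindromic("TGACGTCA"))
```

For such a site, a hit on one strand implies a hit at the same location on the other strand.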

CRE box proximal promoters

Negative strand in the negative direction there is 1: 5'-TGACGTCA-3', 4317, and complement.

D boxes

There is one D box[43] in the distal promoter: 5'-AGTCTG-3' at 2947 on the negative strand in the negative direction and its complement on the positive strand.

Positive strand in the negative direction there is 1: 5'-AGTCTG-3', 1355.

Inverse complement, positive strand, negative direction there are 2: 5'-CAGACT-3', 15, 5'-CAGACT-3', 1616.

There is one D box in the distal promoter: 5'-AGTCTG-3' at 3923 on the negative strand in the positive direction and its complement on the positive strand.

Inverse complement, negative strand, positive direction there are 2: 5'-CAGACT-3', 1744, 5'-CAGACT-3', 2416.

Inverse complement, positive strand, positive direction there are 3: 5'-CAGACT-3', 2943, 5'-CAGACT-3', 3006, 5'-CAGACT-3', 3924.

The human ribosomal protein L11 gene (HRPL11) has two potential snRNA-coding sequences in intron 4: a D box beginning at +4237 (TCCTG).[42]

Copying the consensus of the D-box: 5'-TCCTG-3'[42] and putting the sequence in "⌘F" finds three locations between ZSCAN22 and A1BG and nine locations between ZNF497 and A1BG as can be found by the computer programs.

D-box (TGAGTGG).[47]

Copying the consensus of the D-box: 5'-TGAGTGG-3' and putting the sequence in "⌘F" finds no locations between ZSCAN22 and A1BG and one between ZNF497 and A1BG as can be found by the computer programs.

Downstream B recognition elements

  1. negative strand in the negative direction, looking for 5'-A/G-T-A/G/T-G/T-G/T-G/T-G/T-3', 59: between 68 and 4458 and their complements.
  2. negative strand in the positive direction, looking for 5'-A/G-T-A/G/T-G/T-G/T-G/T-G/T-3', 11: between 56 and 4397 and their complements.
  3. positive strand in the negative direction, looking for 5'-A/G-T-A/G/T-G/T-G/T-G/T-G/T-3', 31: between 43 and 4110 and their complements.
  4. positive strand in the positive direction, looking for 5'-A/G-T-A/G/T-G/T-G/T-G/T-G/T-3', 19: between 72 and 4328 and their complements.
  5. inverse, negative strand, negative direction, is SuccessablesdBREi--.bas, looking for 5'-G/T-G/T-G/T-G/T-A/G/T-T-A/G-3', 44: between 230 and 4454 and their complements.
  6. inverse, negative strand, positive direction, is SuccessablesdBREi-+.bas, looking for 5'-G/T-G/T-G/T-G/T-A/G/T-T-A/G-3', 16: between 59 and 4398 and their complements.
  7. inverse, positive strand, negative direction, is SuccessablesdBREi+-.bas, looking for 5'-G/T-G/T-G/T-G/T-A/G/T-T-A/G-3', 16: between 217 and 3945 and their complements.
  8. inverse, positive strand, positive direction, is SuccessablesdBREi++.bas, looking for 5'-G/T-G/T-G/T-G/T-A/G/T-T-A/G-3', 14: between 72 and 4287 and their complements.
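The slash-and-hyphen consensus notation used in the list above can be turned into a search pattern mechanically. The Python sketch below is an illustration only and does not reproduce the .bas programs named in the list. The function names are hypothetical, and the inverse is taken to be the same positions in reverse order, as the list itself shows.

```python
import re

def _positions(consensus):
    """Strip the 5'- and -3' markers and split into per-position choices."""
    body = consensus.replace("5'-", "").replace("-3'", "")
    return body.split("-")

def consensus_to_regex(consensus):
    """Convert e.g. 5'-A/G-T-A/G/T-3' to the regex string [AG]T[AGT]."""
    parts = []
    for position in _positions(consensus):
        bases = position.split("/")
        parts.append(bases[0] if len(bases) == 1 else "[" + "".join(bases) + "]")
    return "".join(parts)

def inverse_consensus(consensus):
    """Return the consensus with its positions in reverse order."""
    return "5'-" + "-".join(reversed(_positions(consensus))) + "-3'"

dbre = "5'-A/G-T-A/G/T-G/T-G/T-G/T-G/T-3'"
print(consensus_to_regex(dbre))
print(inverse_consensus(dbre))
# Example check of one candidate 7-mer against the pattern.
print(bool(re.fullmatch(consensus_to_regex(dbre), "ATAGGTT")))
```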

Downstream core elements

In the negative direction on the negative strand, the A1BG transcription start site is at 4460 nucleotides from the last nucleotide of the gene ZSCAN22. In the positive direction on the negative strand, the A1BG transcription start site is at 4300 from well within the gene ZNF497. Downstream core elements are expected downstream of these TSSs. Occurrences before the TSSs can be found on Downstream core element gene transcriptions.

  1. positive strand, negative direction, looking for DCE SI: 5'-CTTC-3' at 4528.
  1. negative strand, negative direction, looking for DCE SII: 5'-CTGT-3', 2, 5'-CTGT-3' at 4468, 5'-CTGT-3' at 4507.
  2. negative strand, positive direction, looking for DCE SII: 5'-CTGT-3', 1, 5'-CTGT-3' at 4392.
  3. positive strand, positive direction, looking for DCE SII: 5'-CTGT-3', 1, 5'-CTGT-3' at 4332.
  1. negative strand, positive direction, looking for DCE SIII: 5'-AGC-3', 1, 5'-AGC-3' at 4352.
  2. positive strand, negative direction, looking for DCE SIII: 5'-AGC-3', 3, 5'-AGC-3' at 4480, 5'-AGC-3' at 4489, 5'-AGC-3' at 4520.
  3. positive strand, positive direction, looking for DCE SIII: 5'-AGC-3', 1, 5'-AGC-3' at 4374.

Complements

  1. negative strand, negative direction, looking for DCE SIc: 5'-GAAG-3', 1, 5'-GAAG-3' at 4528.
  1. negative strand, positive direction, looking for DCE SIIc: 5'-GACA-3', 1, 5'-GACA-3' at 4332.
  2. positive strand, negative direction, looking for DCE SIIc: 5'-GACA-3', 2, 5'-GACA-3' at 4468, 5'-GACA-3' at 4507.
  3. positive strand, positive direction, looking for DCE SIIc: 5'-GACA-3', 1, 5'-GACA-3' at 4392.
  1. negative strand, negative direction, looking for DCE SIIIc: 5'-TCG-3', 3, 5'-TCG-3' at 4480, 5'-TCG-3' at 4489, 5'-TCG-3' at 4520.
  2. negative strand, positive direction, looking for DCE SIIIc: 5'-TCG-3', 1, 5'-TCG-3' at 4374.
  3. positive strand, positive direction, looking for DCE SIIIc: 5'-TCG-3', 1, 5'-TCG-3' at 4352.

Inverse complements

  1. looking for DCE SIci: 5'-GAAG-3', same as the complements.
  1. positive strand, negative direction, looking for DCE SIIci: 5'-ACAG-3', 1, 5'-ACAG-3' at 4517.
  2. positive strand, positive direction, looking for DCE SIIci: 5'-ACAG-3', 1, 5'-ACAG-3' at 4366.
  1. negative strand, negative direction, looking for DCE SIIIci: 5'-GCT-3', 1, 5'-GCT-3' at 4471.
  2. negative strand, positive direction, looking for DCE SIIIci: 5'-GCT-3', 4, 5'-GCT-3' at 4312, 5'-GCT-3' at 4321, 5'-GCT-3' at 4372, 5'-GCT-3' at 4390.
  3. positive strand, positive direction, looking for DCE SIIIci: 5'-GCT-3', 1, 5'-GCT-3' at 4356.

Inverses

  1. looking for DCE SIi: 5'-CTTC-3', same as the direct transcript.
  1. negative strand, negative direction, looking for DCE SIIi: 5'-TGTC-3', 1, 5'-TGTC-3' at 4517.
  2. negative strand, positive direction, looking for DCE SIIi: 5'-TGTC-3', 1, 5'-TGTC-3' at 4366.
  1. negative strand, positive direction, looking for DCE SIIIi: 5'-CGA-3', 1, 5'-CGA-3' at 4356.
  2. positive strand, negative direction, looking for DCE SIIIi: 5'-CGA-3', 1, 5'-CGA-3' at 4471.
  3. positive strand, positive direction, looking for DCE SIIIi: 5'-CGA-3', 4, 5'-CGA-3' at 4312, 5'-CGA-3' at 4321, 5'-CGA-3' at 4372, 5'-CGA-3' at 4390.
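The four forms searched for each DCE subelement (direct, complement, inverse complement, inverse) can be derived from the direct sequence. A minimal Python sketch, with the function name orientations chosen for illustration:

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")

def orientations(seq):
    """Return the direct, complement, inverse complement and inverse forms."""
    comp = seq.translate(COMPLEMENT)
    return {
        "direct": seq,
        "complement": comp,
        "inverse complement": comp[::-1],
        "inverse": seq[::-1],
    }

for name, seq in [("SI", "CTTC"), ("SII", "CTGT"), ("SIII", "AGC")]:
    print(name, orientations(seq))
```

For SI the inverse equals the direct sequence and the inverse complement equals the complement, which is why the lists above refer back to earlier entries for SIci and SIi.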

Downstream promoter elements

  1. negative strand in the negative direction (from ZSCAN22 to A1BG) is SuccessablesDPE--.bas, looking for 5'-A/G-G-A/T-C/T-A/C/G-3', 163: between 35 and 4546, and their complements.
  2. negative strand in the positive direction (from ZNF497 to A1BG) is SuccessablesDPE-+.bas, looking for 5'-A/G-G-A/T-C/T-A/C/G-3', 73: between 37 and 4420, and their complements.
  3. positive strand in the negative direction is SuccessablesDPE+-.bas, looking for 5'-A/G-G-A/T-C/T-A/C/G-3', 101: between 32 and 4507, and their complements.
  4. positive strand in the positive direction is SuccessablesDPE++.bas, looking for 5'-A/G-G-A/T-C/T-A/C/G-3', 159: between 8 and 4424, and their complements.
  5. inverse, negative strand, negative direction, is SuccessablesDPEi--.bas, looking for 5'-A/C/G-C/T-A/T-G-A/G-3', 58: between 32 and 4476.
  6. inverse, negative strand, positive direction, is SuccessablesDPEi-+.bas, looking for 5'-A/C/G-C/T-A/T-G-A/G-3', 152: between 8 and 4424.
  7. inverse, positive strand, negative direction, is SuccessablesDPEi+-.bas, looking for 5'-A/C/G-C/T-A/T-G-A/G-3', 174: between 13 and 4546.
  8. inverse, positive strand, positive direction, is SuccessablesDPEi++.bas, looking for 5'-A/C/G-C/T-A/T-G-A/G-3', 95: between 30 and 4420.

Consensus sequence for the DPE is 5'-AGTCTC-3'.[33]

E2 boxes

Negative strand in the negative direction there are 5: 5'-ACAGATGT-3', 482, 5'-ACAGATGT-3', 1225, 5'-GCAGTTGG-3', 1514, 5'-ACAGATGT-3', 2989, 5'-ACAGATGT-3', 4213, in the distal promoter.

Positive strand in the negative direction there are 2: 5'-GCAGGTGG-3', 2571, 5'-ACAGATGA-3', 3920.

Inverse complement, negative strand, negative direction there is 1: 5'-CCACCTGT-3', 2117.

Inverse complement, positive strand, negative direction there are 4: 5'-CCACCTGT-3', 394, 5'-ACACCTGT-3', 1131, 5'-GCAACTGC-3', 3851, 5'-ACACCTGT-3', 3970.

Negative strand in the positive direction there is 1: 5'-GCAGATGA-3', 37.

EIN3 binding sites

"We scanned the ORE1 promoter and found a putative EIN3 binding site (EBS), ATGAACCT, located 1056~1064 bp upstream from the start codon (ATG) of the gene [...]."[48]

"EIN3/EIL1 transcription factors were reported to bind to a consensus DNA sequence of A[CT]G[AT]A[CT]CT [34,35]."[48]

Endosperm expression

Endosperm expression (TGTGTCA).[19]

Copying an apparent consensus sequence of TGTGTCA and putting it in "⌘F" finds one located between ZSCAN22 and A1BG and none between ZNF497 and A1BG as can be found by the computer programs.

Enhancer boxes

Core promoters

Proximal promoters

Negative strand, negative direction there is 1: 5'-CAGATG-3' at 4212.

Positive strand, positive direction there is 1: 5'-CAAGTG-3' at 4202.

Distal promoters

Negative strand in the negative direction there are 9: between 324 and 3482.

Positive strand in the negative direction there are 21: between 41 and 4011.

Negative strand in the positive direction there are 26: between 196 and 4015.

Positive strand in the positive direction there are 10: between 186 and 3936.

Ethylene responsive elements

Ethylene responsive elements (ATTTCAAA).[19]

Copying an apparent consensus sequence of ATTTCAAA and putting it in "⌘F" finds none located between ZSCAN22 and A1BG and one between ZNF497 and A1BG as can be found by the computer programs.

F boxes

"Male sex determination in the Caenorhabditis elegans hermaphrodite germline requires translational repression of tra-2 mRNA by the [Germ Line Development] GLD-1 RNA binding protein."[49]

Skp, Cullin, F-box containing complex (or SCF complex) is a multi-protein E3 ubiquitin ligase complex that catalyzes the ubiquitination of proteins destined for 26S proteasomal degradation.[50]

"Canonical F-box proteins act as bridging components of the SCF ubiquitin ligase complex; the N-terminal F-box binds a Skp1 homolog, recruiting ubiquination machinery, while a C-terminal protein-protein interaction domain binds a specific substrate for degradation."[49]

GAAC elements

  1. negative strand in the negative direction, looking for 5'-GAACT-3', 13: between 843 and 4294 and complements,
  2. negative strand in the positive direction, looking for 5'-GAACT-3', 1, 5'-GAACT-3', 609 and complement,
  3. positive strand in the negative direction, looking for 5'-GAACT-3', 2, 5'-GAACT-3', 1685, 5'-GAACT-3', 3460 and complements,
  4. positive strand in the positive direction, looking for 5'-GAACT-3', 2, 5'-GAACT-3', 577, 5'-GAACT-3', 692 and complements,
  5. inverse complement, negative strand, negative direction, looking for 5'-AGTTC-3', 3: between 3844 and 4178 and complements,
  6. inverse complement, negative strand, positive direction, looking for 5'-AGTTC-3', 1, 5'-AGTTC-3', 761 and complement,
  7. inverse complement, positive strand, negative direction, looking for 5'-AGTTC-3', 6: between 253 and 4417.

GA responsive elements

Only one GARE (an inverse) occurs: between ZSCAN22 and A1BG 5'-AAACAAT-3' at 230 nts and its complement.

GATA boxes

GTGA-box has the consensus sequence GATA.[51]

Proximal promoters

Inverse complement, negative strand, positive direction there is 1: 5'-TTTATCAC-3', 4125.

Distal promoters

Positive strand in the negative direction there are 2: 5'-GGGATAGA-3', 100, 5'-ATGATAGA-3', 355.

Inverse complement, negative strand, negative direction there is 1: 5'-GTTATCAT-3', 2500.

Inverse complement, positive strand, negative direction there is 1: 5'-TTTATCTT-3', 1732.

Inverse complement, negative strand, positive direction there is 1: 5'-GTTATCCC-3', 3385.

Inverse complement, positive strand, positive direction there are 2: 5'-GCTATCAG-3', 1840, 5'-TTTATCTT-3', 2628.

GC boxes

GC box (GGGCGG).[52]

Positive strand in the negative direction there are 2: 5'-TGGGCGTGGT-3', 1898, 5'-TGGGCGTGGT-3', 3048, in the distal promoter.

Inverse complement, negative strand, negative direction there is 1: 5'-ACTCCGCCCA-3', 3092.

Inverse complement, positive strand, negative direction there is 1: 5'-GCTCCGCCTC-3', 1505.

Negative strand in the positive direction there is 1: 5'-TGGGCGGGAC-3', 409.

Inverse complement, positive strand, positive direction there is 1: 5'-GCCACGCCCC-3', 491.

Gibberellin responsive elements

Gibberellin responsive elements (CCTTTTG, AAACAGA).[19]

Copying an apparent consensus sequence of CCTTTTG, AAACAGA and putting it in "⌘F" finds one located between ZSCAN22 and A1BG and two between ZNF497 and A1BG as can be found by the computer programs.

Glucocorticoid response elements

"DNA-binding by the GR-DBD has been well-characterized; it is highly sequence-specific, directly recognizing invariant guanine nucleotides of two AGAACA [TGTTCT] half sites called the glucocorticoid response element (GRE), and binds as a dimer in head-to-head orientation with mid-nanomolar affinity (4,12–18). [...] The consensus DNA glucocorticoid response element (GRE) is comprised of two half-sites (AGAACA) separated by a three base-pair spacer (13,15,60,61)."[53]

Copying an apparent consensus sequence of AGAACA and putting it in "⌘F" finds one located between ZSCAN22 and A1BG and two between ZNF497 and A1BG as can be found by the computer programs.
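Searching for a single half-site does not test for the full element described in the quotation. The Python sketch below assumes the full GRE is the AGAACA half-site, any three nucleotides, and then the reverse complement TGTTCT (the bracketed form in the quotation). That arrangement, the function name inverted_repeat_regex, and the toy sequence are assumptions for illustration.

```python
import re

COMPLEMENT = str.maketrans("ACGT", "TGCA")

def inverted_repeat_regex(half_site, spacer_len):
    """Build half-site + N{spacer_len} + reverse complement of the half-site."""
    rc = half_site.translate(COMPLEMENT)[::-1]
    return re.compile(half_site + "[ACGT]{%d}" % spacer_len + rc)

GRE = inverted_repeat_regex("AGAACA", 3)
print(GRE.pattern)

toy = "CCAGAACATTTTGTTCTGG"  # hypothetical sequence, not from A1BG
print([m.start() + 1 for m in GRE.finditer(toy)])  # 1-based start positions
```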

Gcr1ps

The upstream activating sequence (UAS) for Gcr1p is 5'-CTTCC-3' for the transcriptional activator involved in the regulation of glycolysis [77].[21]

Copying an apparent consensus sequence of 5'-CTTCC-3' and putting it in "⌘F" finds none located between ZSCAN22 and A1BG and six between ZNF497 and A1BG as can be found by the computer programs.

H boxes

Core promoters

Between ZSCAN22 and A1BG: There is one inverse and its complement 5'-AGGAGA-3' at 4428 nts.

Between ZNF497 and A1BG: There is an inverse and its complement 5'-AGGACA-3' at 4252. There are five after the TSS: between 4387 and 4392 and their complements.

Proximal promoters

Between ZSCAN22 and A1BG: There is one H box (5'-ANANNA-3'): negative direction, negative strand, 5'-ACACGA-3' at 4402. On the positive strand in the negative direction there are 16: between 4216 and 4395, with their complements on the negative strand, negative direction.

Between ZNF497 and A1BG: There is one H box (5'-ANANNA-3'): 5'-AGAGAA-3' at 4387 in the proximal promoter, negative strand, positive direction. There are four: between 4365 and 4392 and their complements in the positive direction.

Distal promoters

Between ZSCAN22 and A1BG, negative strand, negative direction: 5'-AGAGGA-3' at 3387, 5'-AGAGGA-3' at 3638, and 5'-AGAGGA-3' at 3675. One inverse and its complement 5'-AGGAGA-3' at 3790. There are 14 H boxes: between 788 and 4124.

On the positive strand, negative direction, there are 127 H boxes: between 608 and 4395.

Between ZNF497 and A1BG: There are two H boxes after nucleotide number 2300 in the negative strand and positive direction: between 420 and 530, and 5'-ACACCA-3' at 2603 and 5'-ACACCA-3' at 3825.

There are two H boxes after nucleotide number 2300 in the positive strand and positive direction: 5'-ACACCA-3' at 204, 5'-ACACCA-3' at 528, 5'-ACACCA-3' at 3643 and 5'-ACACCA-3' at 3967.

Regarding 5'-ANANNA-3', on the negative strand, positive direction, there are 25 H boxes: between 2591 and 4154.

On the positive strand, positive direction there are 20 H boxes: between 2347 and 4168.

There are 31 inverse H boxes on the negative strand in the positive direction: between 2412 and 4166.

HMG boxes

"Most HMG box proteins contain two or more HMG boxes and appear to bind DNA in a relatively sequence-aspecific manner (5, 13, 15, 16 and references therein). [...] they all appear to bind to the minor groove of the A/T A/T C A A A G-motif (10, 14, 18-20)."[54]

Copying an apparent consensus sequence of (A/T)(A/T)CAAAG and putting it in "⌘F" finds none located between ZSCAN22 and A1BG and none between ZNF497 and A1BG as can be found by the computer programs.

HNFs

Gene ID: 6927 is HNF1A HNF1 homeobox A aka TCF1 on 12q24.31: "The protein encoded by this gene is a transcription factor required for the expression of several liver-specific genes. The encoded protein functions as a homodimer and binds to the inverted palindrome 5'-GTTAATNATTAAC-3'. Defects in this gene are a cause of maturity onset diabetes of the young type 3 (MODY3) and also can result in the appearance of hepatic adenomas. Alternative splicing results in multiple transcript variants encoding different isoforms."[55]

"Canonical Wnt signaling results in the accumulation and binding of β-catenin to DNA-binding partner TCF1."[56] TCF-1 binding site is CCTTTGA.[56]

"HNF3 can bind to the site in the absence of HNF6 (Lahuna et al. 1997)."[57]

HNF6 core promoters

Inverse complement, positive strand, negative direction there is 1: 5'-TTATTAATTC-3', 4542.

HNF6 proximal promoters

Negative strand in the negative direction there is 1: 5'-TTATTAATCG-3', 4229.

Negative strand in the positive direction there are 2: 5'-TTATTAATCA-3', 4147, 5'-TTATTGATTA-3', 4164.

Inverse complement, positive strand, positive direction there is 1: 5'-ATATTAACAA-3', 4172.

HNF6 distal promoters

Negative strand in the negative direction there are 2: 5'-GTGTTAATAA-3', 1725, 5'-TAGTTGATAA-3', 3527.

Positive strand in the negative direction there is 1: 5'-AAATTGATAA-3', 3361.

Inverse complement, negative strand, negative direction there are 2: 5'-ACATGGACAT-3', 802, 5'-TAATGAACTT-3', 1301.

Inverse complement, positive strand, negative direction there are 2: 5'-AAATTGATAA-3', 3361, 5'-TCATCAACTA-3', 3525.

Negative strand in the positive direction there is 1: 5'-ATGTCCATGG-3', 3581.

Positive strand in the positive direction there is 1: 5'-GAGTCCATTG-3', 3732.

Inverse complement, positive strand, positive direction there is 1: 5'-CCATTGACTC-3', 3736.

Homeoboxes

"Transcription factors Pax-4 and Pax-6 are known to be key regulators of pancreatic cell differentiation and development. [...] The gene-targeting experiments revealed that Pax-4 and Pax-6 cannot substitute for each other in tissue with overlapping expression of both genes. [The] DNA-binding specificities of Pax-4 and Pax-6 are similar. The Pax-4 homeodomain [HD] was shown to preferentially dimerize on DNA sequences consisting of an inverted TAAT motif, separated by 4-nucleotide spacing."[58]

The "crucial difference between the binding sites of Antennapedia class and TTF-1 HDs is in the motifs 5'-TAAT-3', recognized by Antennapedia [a Hox gene, a subset of homeobox genes, first discovered in Drosophila which controls the formation of legs during development], and 5'-CAAG-3', preferentially bound by TTF-1. [The] binding of wild type and mutants TTF-1 HD to oligonucleotides containing either 5'-TAAT-3' or 5'-CAAG-3' indicate that only in the presence of the latter motif the Gln50 in TTF-1 HD is utilized for DNA recognition."[59]

Copying a portion of the homeobox motif of CAAG and putting it in "⌘F" finds eight located between ZSCAN22 and A1BG and 21 between ZNF497 and A1BG as can be found by the computer programs.

Hsf1ps

The upstream activating sequence (UAS) for the Hsf1p is 5'-NGAAN-3' or 5'-(A/C/G/T)GAA(A/C/G/T)-3'.[21]

Copying 5'-TGAAA-3' in "⌘F" yields twelve between ZSCAN22 and A1BG and 5'-CGAAC-3' one between ZNF497 and A1BG as can be found by the computer programs.

Hypoxia response elements

"The hypoxia response element (HRE) and estrogen response element (ERE) were located on −154 to −150 "ACGTG", and −94 to −80 "AGGTTATTGCCTCCT" on the transcript, respectively."[33]

Copying 5'-ACGTG-3' in "⌘F" yields eight between ZSCAN22 and A1BG and 5'-CGAAC-3' one between ZNF497 and A1BG as can be found by the computer programs.

HY boxes

Core promoters

Positive strand in the negative direction there is 1: 5'-TGAGGG-3' at 4558.

Inverse complement, negative strand, negative direction there is 1: 5'-CCCTCA-3', 4498.

Negative strand in the positive direction there is 1: 5'-TGTGGG-3', 4395.

Distal promoters

Negative strand in the negative direction there is 1: 5'-TGTGGG-3' at 749.

Positive strand in the negative direction there are 4: between 88 and 3712.

Inverse complement, negative strand, negative direction there are 3: between 2702 and 3889.

Positive strand in the positive direction there are 2: 5'-TGTGGG-3', 2965, 5'-TGTGGG-3', 3533.

Negative strand in the positive direction there are 3: between 258 and 3879.

Inverse complement, negative strand, positive direction there are 3: between 88 and 3503.

Inverse complement, positive strand, positive direction there are 5: between 494 and 3185.

Initiator elements (YYANWYY)

Core promoters

There is the following Inr in the core promoter, negative strand, negative direction: 5'-TTACTCC-3' at 4557.

There are four Inrs in the core promoter, positive strand, negative direction: between 4425 and 4542.

There is the following Inr in the core promoter, negative strand, positive direction: 5'-CTGCACC-3' at 4343.

There are two Inrs in the core promoter, positive strand, positive direction: 5'-CCACTCC-3' at 4401 and 5'-CCAGACC-3' at 4416.

Proximal promoters

There are eight Inrs on the negative strand in the negative direction: between 4202 and 4557.

There are seven Inrs on the positive strand in the negative direction: between 4327 and 4542.

There is one Inr on the negative strand in the positive direction: 5'-CTGCACC-3' at 4343.

There are two Inrs on the positive strand in the positive direction: 5'-CCACTCC-3' at 4401 and 5'-CCAGACC-3' at 4416.

Distal promoters

Negative strand in the negative direction there are 87: between 71 and 4188.

Positive strand in the negative direction there are 40: between 20 and 3967.

Inverse complement, negative strand, negative direction there are 32: between 213 and 3967.

Negative strand in the positive direction there are 45: between 115 and 4139.

Positive strand in the positive direction there are 75: between 40 and 4136.

Inverse complement, negative strand, positive direction there are 61: between 53 and 4136.

Inverse complement, positive strand, negative direction there are 100: between 17 and 4177.

Inverse complement, positive strand, positive direction there are 75: between 524 and 4138.

Initiator elements (BBCABW)

Core promoters

There are five Inrs, positive strand, negative direction: between 4423 and 4531.

There are five Inrs, negative strand, positive direction: between 4271 and 4338.

There are four Inrs, positive strand, positive direction: between 4269 and 4414.

Proximal promoters

There are five Inrs on the negative strand in the negative direction: between 4200 and 4359.

There are nine Inrs on the positive strand in the negative direction: between 4233 and 4531.

There are six Inrs on the negative strand in the positive direction: between 4195 and 4338.

There are four Inrs on the positive strand in the positive direction: between 4269 and 4414.

Distal promoters

Negative strand in the negative direction there are 44: between 179 and 3939.

Positive strand in the negative direction there are 59: between 39 and 3965.

Inverse complement, negative strand, negative direction there are 46: 5'-TCTGAC-3', 16: between 62 and 3983.

Inverse complement, positive strand, negative direction there are 54, 5'-ACTGAA-3', 18: between 78 and 4093.

Negative strand in the positive direction there are 87: between 15 and 4013.

Positive strand in the positive direction there are 40: between 153 and 4056.

Inverse complement, negative strand, positive direction there are 94: between 54 and 4095.

Inverse complement, positive strand, positive direction there are 47: between 236 and 4127.

Initiator-like element, TCT

Consensus sequence for an Inr-like/TCT is 5'-TTCTCT-3'.[33]

Copying 5'-TTCTCT-3' in "⌘F" yields three between ZSCAN22 and A1BG and three between ZNF497 and A1BG as can be found by the computer programs.

Jasmonic acid-responsive elements

Jasmonic acid-responsive elements (TGACG, CGTCA).[19]

Copying an apparent consensus sequence for the jasmonic acid-responsive element (JARE)[60] of TGACG and putting it in "⌘F" finds eight located between ZSCAN22 and A1BG and one between ZNF497 and A1BG as can be found by the computer programs.

Krüppel-like factors

"Krüppel-like factor 1 (KLF1/EKLF) is a transcription factor that globally activates genes involved in erythroid cell development. [...] KLF1 belongs to the KLF family of transcription factors that binds the G-rich strand of so-called CACCC-box motifs located in regulatory regions of numerous erythroid genes."[61]

"Using the in vitro CASTing method, we identified a new set of sequences bound by [congenital dyserythropoietic anemia] CDA-KLF1, and based on them we defined the consensus binding site as 5′-NGG-GG(T/G)-(T/G)(T/G)(T/G)-3′. It differs from the consensus binding sites for [wild-type] WT-KLF1, 5′-NGG-G(C/T)G-(T/G)GG-3′, and for [neonatal anemia] Nan-KLF1, 5′-NGG-G(C/A)N-(T/G)GG-3′, as well."[61]

An apparent consensus is GGG(A/C/G/T)(A/C/G/T)(G/T)(G/T)(G/T).

Copying an apparent consensus sequence for the KLF of GGGTCGTG and putting it in "⌘F" finds six located between ZSCAN22 and A1BG and none between ZNF497 and A1BG as can be found by the computer programs.

M35 boxes

Negative strand in the negative direction (from ZSCAN22 to A1BG) is SuccessablesM35--.bas, looking for 5'-TTGACA-3', 2, 5'-TTGACA-3', 477, 5'-TTGACA-3', 4399.

Metal responsive elements

MRE proximal promoters

Positive strand, negative direction: 5'-TGCACTC-3' at 4341.

MRE distal promoters

Negative strand, negative direction: 5'-GAGTGCA-3', 1772, 5'-GTGTGCA-3', 531, and complements.

Negative strand, positive direction: 5'-GGGTGCA-3' at 3883, 5'-TGCACCC-3' at 3323, 5'-TGCACAC-3' at 2963, 5'-GGGTGCA-3' at 2800, 5'-GAGTGCA-3' at 2326, 5'-GAGTGCA-3' at 1786, 5'-TGCGCCC-3' at 1657, 5'-GTGCGCA-3' at 1523, 5'-TGCGCCC-3' at 1499, 5'-TGCACTC-3' at 1473, 5'-TGCGCCC-3' at 1399, 5'-TGCACTC-3' at 1373, 5'-TGCGCCC-3' at 1247, 5'-TGCACAC-3' at 1221, 5'-GCGTGCA-3' at 1218, 5'-GGGCGCA-3' at 976, 5'-GGGCGCA-3' at 876, 5'-GCGCGCA-3' at 684, 5'-TGCACAC-3' at 549, 5'-GCGTGCA-3' at 546, 5'-TGCGCCC-3' at 453, and complements.

Positive strand, negative direction: 5'-TGCACTC-3' at 3290, 5'-GTGTGCA-3' at 2863, 5'-TGCACCC-3' at 2762, 5'-TGCACTC-3' at 2427, 5'-TGCACTC-3' at 2001, 5'-GAGTGCA-3' at 1470, 5'-TGCACTC-3' at 1348, 5'-TGCGCTC-3' at 891, and complements.

Positive strand, positive direction: 5'-TGCGCCC-3' at 972, 5'-TGCGCCC-3' at 872, and complements.

Mig1ps

The upstream activating sequence (UAS) for the Mig1p transcription factor is 5'-SYGGGG-3' or 5'-(C/G)(C/T)GGGG-3'.[21]

Copying 5'-CTGGGG-3' in "⌘F" yields none between ZSCAN22 and A1BG and four between ZNF497 and A1BG as can be found by the computer programs.

Msn2,4p

The upstream activating sequence (UAS) for the Msn2,4p transcription factor is 5'-CCCCT-3'.[21]

Copying 5'-CCCCT-3' in "⌘F" yields one between ZSCAN22 and A1BG and three between ZNF497 and A1BG as can be found by the computer programs.

MYB recognition elements

"These elements fit the type II MYB consensus sequence A(A/C)C(A/T)A(A/C)C, suggesting that they are MYB recognition elements (MREs)."[62]

MYB binding site involved in drought induction (TAACTG).[19]

Copying an apparent core consensus sequence for the MYBRE of AACAAAC or TAACTG and putting it in "⌘F" finds none located between ZSCAN22 and A1BG and none or one, respectively, between ZNF497 and A1BG as can be found by the computer programs.

Myocyte enhancer factor 2 (MEF2)

Myocyte enhancer factor-2 (MEF2) proteins are a family of transcription factors which through control of gene expression are important regulators of cellular differentiation and consequently play a critical role in embryonic development.[63] In adult organisms, Mef2 proteins mediate the stress response in some tissues.[63]

"The current study delineates the conformational paradigm, clustered recognition, and comparative DNA binding preferences for MEF2A and MEF2B-specific MADS-box/MEF2 domains at the YTA(A/T)4TAR consensus motif."[64] Y = (C/T) and R = (A/G). The consensus sequence is (C/T)TA(A/T)(A/T)(A/T)(A/T)TA(A/G).[64]

Copying an apparent consensus sequence for MEF2 of TTATATATA or CTAATTTTAA and putting it in "⌘F" finds none (TTATATATA) located between ZSCAN22 and A1BG and one (CTAATTTTAA) between ZNF497 and A1BG as can be found by the computer programs.
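The IUPAC shorthand in the MEF2 consensus can be expanded mechanically into a pattern that accepts every allowed sequence rather than one chosen representative. The Python sketch below assumes the standard IUPAC meanings of Y, R, W, S and N; the function name `iupac_to_regex` is an illustrative assumption. It checks the two test strings used above against the full ten-nucleotide motif.

```python
import re

# Subset of the IUPAC nucleotide codes, mapped to regex character classes.
IUPAC = {"A": "A", "C": "C", "G": "G", "T": "T",
         "R": "[AG]", "Y": "[CT]", "W": "[AT]", "S": "[CG]", "N": "[ACGT]"}

def iupac_to_regex(motif: str) -> re.Pattern:
    """Translate an IUPAC motif such as 'YTAWWWWTAR' into a compiled regex."""
    return re.compile("".join(IUPAC[base] for base in motif))

mef2 = iupac_to_regex("YTAWWWWTAR")  # YTA(A/T)4TAR written out in full

for candidate in ("CTAATTTTAA", "TTATATATA"):
    print(candidate, "fits the full motif:", bool(mef2.fullmatch(candidate)))
```

The ten-nucleotide CTAATTTTAA fits the whole motif, whereas the nine-nucleotide TTATATATA can only fit its first nine positions, so a search with the shorter string is somewhat less restrictive than the consensus itself.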

Nuclear factor kappa-light-chain-enhancer of activated B cells

The "natural 11 bp κB binding site MHC H-2 [is 5'-CCCCTAAGGGG-3'] which is well ordered in our structure."[65]

Binding site for NFκB in humans (GGAATTCCCC) with a core of (GAATTC).[52]

Copying an apparent core consensus sequence for NFκB (GAATTC) and putting it in "⌘F" finds three cores located between ZSCAN22 and A1BG and none between ZNF497 and A1BG as can be found by the computer programs.

Nuclear factor of activated T cell transcriptions

Mutation "of the core NFATp binding sequence (GGAAAA) in the IL2 promoter NFAT site entirely eliminates the function of the site, as does mutation of an adjacent non-canonical AP-1 site that is not essential for NFATp binding but that is required for formation of the NFATp-Fos-Jun complex(6, 15)."[66]

Copying an apparent consensus sequence for NFAT (GGAAAA) and putting it in "⌘F" finds none located between ZSCAN22 and A1BG and one between ZNF497 and A1BG as can be found by the computer programs.

Nuclear factor 1

Nuclear factor 1 (NF-1) is a family of closely related transcription factors that constitutively bind as dimers to specific sequences of DNA with high affinity.[67] Family members contain an unusual DNA binding domain that binds to the recognition sequence 5'-TTGGCXXXXXGCCAA-3'.[68]

Consensus sequences for the nuclear factor 1 are TGGCA, TGGCG and TGGAA.[69]

An apparent consensus sequence for the NF1 is TGG(A/C)(A/G).

Copying an apparent consensus sequence for NF1 (TGGCA) and putting it in "⌘F" finds none located between ZSCAN22 and A1BG and five between ZNF497 and A1BG as can be found by the computer programs.

ORE1 binding sites

"As a transcription factor, ORE1 was reported to bind to consensus DNA sequences of [ACG][CA]GT[AG]N{5,6}[CT]AC[AG] [29] or T[TAG][GA]CGT[GA][TCA][TAG] [37]."[48]

Consensus sequences are 5'-(A/C/G)(A/C)GT(A/G)N{5,6}(C/T)AC(A/G)-3' or 5'-T(A/G/T)(A/G)CGT(A/G)(A/C/T)(A/G/T)-3'.[48]
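The bracket notation in the quoted ORE1 motifs is already close to regular-expression syntax, so a search over all allowed variants needs only N expanded to any base. The Python sketch below is a minimal illustration under that assumption; the toy sequences and the names `bracket_to_regex` and `hits` are hypothetical. A lookahead is used so that overlapping matches are also reported.

```python
import re

def bracket_to_regex(motif: str) -> re.Pattern:
    """Compile a bracket-style motif, expanding N to any base.
    The enclosing lookahead lets finditer report overlapping matches."""
    return re.compile("(?=(" + motif.replace("N", "[ACGT]") + "))")

ore1_a = bracket_to_regex("[ACG][CA]GT[AG]N{5,6}[CT]AC[AG]")
ore1_b = bracket_to_regex("T[TAG][GA]CGT[GA][TCA][TAG]")

def hits(pattern: re.Pattern, seq: str) -> list[tuple[int, str]]:
    """Return (1-based start, matched text) for each match in seq."""
    return [(m.start() + 1, m.group(1)) for m in pattern.finditer(seq)]

toy_a = "ACGTACCCCCCACA"   # hypothetical sequence with a five-base spacer
toy_b = "CCTGACGTGAGCC"    # hypothetical sequence containing TGACGTGAG
print(hits(ore1_a, toy_a))
print(hits(ore1_b, toy_b))
```

Searching with the full pattern covers every variant at once, whereas each literal string such as TGACGTGAG or TAACGTGAG tests only one member of the motif family.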

Copying 5'-TTACGTG-3' in "⌘F" yields none between ZSCAN22 and A1BG and none between ZNF497 and A1BG as can be found by the computer programs.

Copying 5'-TGACGTGAG-3' in "⌘F" yields three between ZSCAN22 and A1BG and none between ZNF497 and A1BG as can be found by the computer programs.

Copying 5'-TAGCGT-3' in "⌘F" yields none between ZSCAN22 and A1BG and none between ZNF497 and A1BG as can be found by the computer programs.

Copying 5'-TAACGTGAG-3' in "⌘F" yields two between ZSCAN22 and A1BG and none between ZNF497 and A1BG as can be found by the computer programs.

p53 response elements

"A p53 consensus DNA RE is composed of a tandem of two decameric palindromic sequences (half-sites) 5′-RRRCWWGYYY-3′, where R = purine, Y = pyrimidine and W is either A or T. There is a variability in composition of p53 REs, thus two half-sites can be separated by a spacer DNA, typically 0–13 bp in length and many p53 DNA REs have varying numbers of half-sites (19,20,22,33–37)."[70]

Positive strand in the negative direction using SuccessablesPRE+-.bas, looking for 5'-(A/G)(A/G)(A/G)C(A/T)(A/T)G(C/T)(C/T)(C/T)-3', found 3: 5'-AGGCAAGCCT-3' at 457, 5'-AGACTTGCCT-3' at 1621, and 5'-AGACAAGCTT-3' at 4186, and their complements with the transcription start site at 4460 from ZSCAN22 toward A1BG.
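The half-site 5'-RRRCWWGYYY-3' and the two-half-site arrangement with a 0 to 13 bp spacer can both be written as regular expressions. The Python sketch below is an illustration, not the original SuccessablesPRE program; it checks that the three reported hits each fit the half-site, and shows how a full response element could be tested on a hypothetical sequence.

```python
import re

HALF_SITE = "[AG]{3}C[AT]{2}G[CT]{3}"  # 5'-RRRCWWGYYY-3'
half_site = re.compile(HALF_SITE)
# Two half-sites separated by a spacer of 0 to 13 bp, per the quoted description.
full_site = re.compile(f"{HALF_SITE}[ACGT]{{0,13}}{HALF_SITE}")

reported = ["AGGCAAGCCT", "AGACTTGCCT", "AGACAAGCTT"]
for site in reported:
    print(site, "fits the half-site:", bool(half_site.fullmatch(site)))

# Hypothetical example: two half-sites joined by a 4 bp spacer.
example = reported[0] + "ACGT" + reported[1]
print("full RE found:", bool(full_site.search(example)))
```

The three reported half-sites lie at 457, 1621 and 4186, far more than 13 bp apart, so by this pattern none of them would pair with another into a full two-half-site element.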

p63 DNA-binding sites

"p63 bound preferentially to DNA fragments conforming to the 20 bp sequence 5'-RRRC(A/G)(A/T)GYYYRRRC(A/T)(C/T)GYYY-3'."[71]

The apparent consensus sequence is (A/G)(A/G)(A/G)C(A/G)(A/T)G(C/T)(C/T)(C/T).

Copying an apparent consensus sequence for p63 (GAGCGAGCCT) and putting it in "⌘F" finds none located between ZSCAN22 and A1BG and one between ZNF497 and A1BG as can be found by the computer programs.

P boxes

"As VRI [target gene: vrille (VRI)] accumulates in the nucleus during the mid to late day, it binds VRI/PDP1ϵ binding sites (V/P-boxes) [consensus of V box: A(/G)TTA(/T)T(/C), of P-box: GTAAT(/C)], to repress Clk and cry transcription (Hardin, 2004)."[72]

Copying the apparent consensus sequence for the P box (GTAA(T/C)) and putting it in "⌘F" finds seven located between ZSCAN22 and one between ZNF497 and A1BG as can be found by the computer programs.

Negative strand in the negative direction (from ZSCAN22 to A1BG) using SuccessablesPbox--.bas, looking for 5'-(A/G)T(A/T)(A/T)(T/C)-3' found 84.

Peroxisome proliferator hormone response elements

"After activation by ligands, PPARs/RXRs heterodimers bind to PPRE consensus sequence (AGGTCANAGGTCA) in the promoter of their target genes."[73]

The DNA consensus sequence is AGGTCANAGGTCA, with N being any nucleotide.

Peroxisome proliferator hormone response elements (PPREs) consensus sequences are AGGGGA and TCCCCT.[74]

Copying the apparent consensus sequence for the PPRE (AGGGGA) and putting it in "⌘F" finds none located between ZSCAN22 and A1BG and three between ZNF497 and A1BG as can be found by the computer programs.

Phosphate starvation-response transcription factors

"The [palindromic E-box motif (CACGTG)] motif is bound by the transcription factor Pho4, [and has the] class of basic helix-loop-helix DNA binding domain and core recognition sequence (Zhou and O'Shea 2011)."[5]

The Pho4 homodimer binds to DNA sequences containing the bHLH binding site 5'-CACGTG-3'.[20]

Copying the apparent consensus sequence for Pho4 (CACGTG) and putting it in "⌘F" finds none located between ZSCAN22 and A1BG and one between ZNF497 and A1BG as can be found by the computer programs.

Pollen1 elements

"Electrophoretic mobility shift assays identified a pollen-specific cis-acting element POLLEN1 (AGAAA) mapped at AtACBP4 (−157/−153) which interacted with nuclear proteins from flower and this was substantiated by DNase I footprinting."[51]

"Given that AtACBP4pro::GUS (−156/−67) could drive promoter activity for pollen expression, [electrophoretic mobility shift assays] EMSAs were carried out to investigate the role of the putative POLLEN1 cis-element, AGAAA (−150/−146), and its adjacent co-dependent regulatory element TCCACCATA (–141/–133)."[51]

"POLLEN1 and the TCCACCATA element are co-dependent regulatory elements responsible for pollen-specific activation of tomato LAT52 (Bate and Twell 1998)."[51]

Copying the consensus for POLLEN1, 5'-AGAAA-3', and putting the sequence in "⌘F" finds many locations for this sequence in both directions toward A1BG as can be found by the computer programs.

Pribnow boxes

  1. negative strand in the negative direction, looking for 5'-TATAAT-3', found 2: 5'-TATAAT-3' at 3454 and 5'-TATAAT-3' at 3468,
  2. negative strand in the positive direction, looking for 5'-TATAAT-3', found 1: 5'-TATAAT-3' at 729,
  3. complement, positive strand, negative direction, looking for 5'-ATATTA-3', found 2: 5'-ATATTA-3' at 3454 and 5'-ATATTA-3' at 3468,
  4. complement, positive strand, positive direction, looking for 5'-ATATTA-3', found 1: 5'-ATATTA-3' at 729,
  5. inverse complement, negative strand, negative direction, looking for 5'-ATTATA-3', found 2: 5'-ATTATA-3' at 272 and 5'-ATTATA-3' at 603,
  6. inverse complement, negative strand, positive direction, looking for 5'-ATTATA-3', found 1: 5'-ATTATA-3' at 727,
  7. inverse, positive strand, negative direction, looking for 5'-TAATAT-3', found 2: 5'-TAATAT-3' at 272 and 5'-TAATAT-3' at 603,
  8. inverse, positive strand, positive direction, looking for 5'-TAATAT-3', found 1: 5'-TAATAT-3' at 727.
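The four search strings used in this list (the motif, its complement, its inverse, and its inverse complement) can be derived from the motif alone. The Python sketch below shows the derivation for 5'-TATAAT-3'; the function name `orientations` is an illustrative assumption, and the strings are written left to right as in the list above.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")

def orientations(motif: str) -> dict[str, str]:
    """Return the motif together with its complement, inverse and inverse complement."""
    complement = motif.translate(COMPLEMENT)
    return {
        "motif": motif,
        "complement": complement,               # base-by-base pairing partner
        "inverse": motif[::-1],                 # same bases, read backwards
        "inverse complement": complement[::-1],
    }

for name, seq in orientations("TATAAT").items():
    print(f"{name}: 5'-{seq}-3'")
```

Because the complement pairs base by base with the motif, a complement hit is reported at the same position as the motif hit on the other strand, which is why items 1 and 3, and items 5 and 7, list identical positions.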

Prolamin boxes

  1. negative strand in the negative direction, found 1: 5'-TGTAAAG-3' at 2884,
  2. negative strand in the positive direction, found 1: 5'-TGAAAAG-3' at 489,
  3. positive strand in the negative direction, found 1: 5'-TGAAAAG-3' at 1627.

Pyrimidine boxes

Pyrimidine boxes and their complements occur in the negative direction: 5'-CCTTTT-3' at 2459, 5'-CCTTTT-3' at 2927, and 5'-CCTTTT-3' at 2968. Inverse pyrimidine boxes and their complements also occur: 5'-AAAAGG-3' at 105, 5'-AAAAGG-3' at 1107, 5'-AAAAGG-3' at 3345, and 5'-AAAAGG-3' at 3441.

Pyrimidine boxes in the positive direction: 5'-CCTTTT-3' at 135 and 5'-CCTTTT-3' at 291 and their complements are close to ZNF497.

Q elements

"The basal regulatory elements identified include a putative TATA-box (−30/−24) for RNA polymerase binding and a CAAT box (−64/−61; [...]). Several putative floral expression-related cis-elements identified included a putative 6-nucleotide Q element (−770/−665), three GTGA boxes (−372/−369, −209/−206 and −164/−161) and four putative highly-conserved POLLEN1 boxes (−737/−733, −711/−707, −150/−146 and −36/−32; [...])."[51]

The consensus sequence for a Q element is 5'-AGGTCA-3'.[51]

Copying the apparent consensus sequence for the QE (AGGTCA) and putting it in "⌘F" finds two located between ZSCAN22 and A1BG and three between ZNF497 and A1BG as can be found by the computer programs.

Rap1 regulatory factors

Consensus sequences: C(A/C/G)(A/C/G)(A/G)(C/G/T)C(A/C/T)(A/G/T)(C/G/T)(A/G/T)(A/C/G)(A/C)(A/C/T)(A/C/T).[5]

"Rap1 is another GRF that organizes chromatin, binds promoters of genes that encode ribosomal and glycolytic proteins, and binds telomeres (Shore 1994; Ganapathi et al. 2011; Hughes and de Boer 2013). [...] DNA shape analysis revealed that Rap1 motifs possess an intrinsically wide minor groove spanning the central degenerate region of the motif that was wider at binding-competent sites [...]. A clear trend was observed between increased width of the minor groove in the central degenerate region of the motif and increased Rap1 binding in vitro."[5]

Copying an apparent consensus sequence for Rap1 (CCCACCAACAAAA) and putting it in "⌘F" finds none located between ZSCAN22 and A1BG and none between ZNF497 and A1BG as can be found by the computer programs.

Reb1 general regulatory factors

Purified "Reb1 bound [...] exact TTACCCK occurrences [...] with >60% of 780 occurrences at promoters. [And can have] the extended motif VTTACCCGNH (IUPAC nomenclature) (Rhee and Pugh 2011)."[5]

Copying the apparent consensus sequence for Reb1 (TTACCC(G/T)) and putting it in "⌘F" finds one located between ZSCAN22 and A1BG and none between ZNF497 and A1BG as can be found by the computer programs. However, an extended Reb1 (ATTACCCGAA) finds none located between ZSCAN22 and A1BG or between ZNF497 and A1BG.

Retinoblastoma control elements

"Robbins et al. (18) have reported that expression of pRB in mouse fibroblasts suppresses transcription of c-fos and have identified an element, termed the retinoblastoma control element (RCE), in the c-fos promoter necessary for this suppression. More recently, sequences homologous to the RCE have been identified in the TGF-β1, -β2, and -β3 promoters by Kim et al. (19)."[75]

"Comparison of the sequence of the newly cloned mouse MMP-9 promoter region with our previous human isolate revealed that [...] four units of GGGG(T/A)GGGG sequence (GT box) were conserved between the two species."[52]

"Expression of some matrix metalloproteinases (MMPs) are regulated by cytokines and tumor promoters, namely tumor necrosis factor-α (TNF-α), epidermal growth factor, interleukin-1, and 12-O-tetradecanoylphorbol-13-acetate (TPA) (15-20)."[52]

Expression "of v-Src induces the synthesis of MMP-9, which is mediated by alterations in activity of binding factors for the AP-1 site and the sequence motif GGGGTGGGG (GT box). This GT box is homologous to the so-called retinoblastoma (Rb) control element (RCE) (29,30), and Rb can produce an anti-oncogene or tumor suppressor gene product (31-38) which is involved in regulating transcription of certain genes."[52]

Binding site for NFκB in humans (GGAATTCCCC) with a core of (GAATTC), Sp-1 (CCGCCCC), 12-O-tetradecanoylphorbol-13-acetate (TPA) responsive element (TRE) (TGAGTCA), and GC box (GGGCGG).[52]

"Angiotensin II (Ang II) up-regulates plasminogen-activator inhibitor type-1 (PAI-1) expression in mesangial cells to enhance extracellular matrix formation. The proximal promoter region (bp -87 to -45) of the human PAI-1 gene contains several potent binding sites for transcription factors [two phorbol-ester-response-element (TRE)-like sequences; D-box (-82 to -76) and P-box (-61 to -54), and one Sp1 binding site-like sequence, Sp1-box 1 (-72 to -67)]."[47]

"The methylation-interference experiment demonstrated that human recombinant Sp1 bound to the so-called GT box (TGGGTGGGGCT, -78 to -69), which contains the Sp1-box 1."[47]

D-box (TGAGTGG), Sp1-box 1 (GGGGCT), P-box (TGAGTTCA), Sp1-box 2 (CTGCCC), and TATA box (TATAAA).[47]

Copying the apparent consensus sequence for the RCE, the GT box (GGGGTGGGG), and putting it in "⌘F" finds none located between ZSCAN22 and A1BG or between ZNF497 and A1BG as can be found by the computer programs. However, the RCE variant (GGGGAGGGG) finds none located between ZSCAN22 and A1BG and one between ZNF497 and A1BG.

Retinoic acid response elements

Retinoic acid response elements (RAREs).

"Retinoic acid is considered as the earliest factor for regulating anteroposterior axis of neural tube and positioning of structures in developing brain through retinoic acid response elements (RARE) consensus sequence (5′–AGGTCA–3′) in promoter regions of retinoic acid-dependent genes."[76]

"Several studies have suggested that the target gene of the RA signal generally contains two direct-repeat half sites of the consensus sequence AGGTCA that are spaced by one to five base pairs (14,16,32,38)."[77]

"Xavier-Neto’s review demonstrated that the magic AGGTCA has high affinity but poor specificity (16). Some other [nuclear receptors] NRs also utilized the RARE with the same spacer models that are used by RXRs/RARs, for example, orphan receptors, vitamin D receptors (VDR) and peroxisome proliferator-activated receptors (PPAR) (32,39). Identifying a bona fide RARE is more difficult than a simple inspection. In order to attribute the RARE in Cx43 to a candidate sequence, some observations have been conducted in our study using molecular, biological and biophysical methods and functional approaches. In a ligand-dependent luciferase assay, RARE was located between the −1,426 to −341 base pair position. The constitutively active mutant Cx43 RARE represses the luciferase activity in the absence of the ligand and has no response to the 9cRA. Our findings indicate that RARE in the Cx43 promoter is a functional element."[77]

Additional response elements that include the 5'-AGGTCA-3' are Q elements, ROR-response elements and Thyroid hormone response elements.

A likely general consensus sequence may be 5'-AG(A/G)TCA-3'.[77]

Copying the apparent consensus sequence for the RARE (AGGTCA) and putting it in "⌘F" finds two located between ZSCAN22 and A1BG and three between ZNF497 and A1BG as can be found by the computer programs.

Root specific elements

The consensus sequence for root specific elements (RSEs) is TGACGTCA.[19]

Copying the apparent consensus sequence for the RSE (TGACGTCA) and putting it in "⌘F" finds one located between ZSCAN22 and A1BG and none between ZNF497 and A1BG as can be found by the computer programs.

ROR-response elements

RAR-related orphan receptor "ROR-γ binds DNA with specific sequence motifs AA/TNTAGGTCA (the classic RORE motif) or CT/AG/AGGNCA (the variant RORE motif)13, 31."[78]

Copying the apparent consensus sequence for the RORE (ATATAGGTCA) and putting it in "⌘F" finds one located between ZSCAN22 and A1BG and none between ZNF497 and A1BG as can be found by the computer programs.

Copying the apparent consensus sequence for the variant RORE (CTGGGACA) and putting it in "⌘F" finds two located between ZSCAN22 and A1BG and one between ZNF497 and A1BG as can be found by the computer programs.

R response elements

The consensus sequence for the RRE is 5'-CATCTG-3'.[79]

Copying the apparent consensus sequence for the RRE (CATCTG) and putting it in "⌘F" finds none located between ZSCAN22 and A1BG and one between ZNF497 and A1BG as can be found by the computer programs.

Serum response elements

The SRE wild type (SREwt) contains the nucleotide sequence ACAGGATGTCCATATTAGGACATCTGC, of which CCATATTAGG is the CArG box, TTAGGACAT is the C/EBP box, and CATCTG is the E box.[80]

5'-CCATATTAGG-3' is a CArG box that does not occur in either promoter of A1BG.

5'-CATCTG-3' is an E box that does not occur in either promoter of A1BG.

5'-TTAGGACAT-3' is a C/EBP box that does not occur in either promoter of A1BG using "⌘F".

5'-ACAGGATGT-3', the first nine nucleotides of the SREwt sequence above, occurs once between ZNF497 and A1BG and not at all between ZSCAN22 and A1BG using "⌘F".

Servenius sequences

The "positive effect of W element may result from cooperative interactions between Z and other downstream elements such as the Servenius sequence, GGACCCT, located from -131 to -125 bp(28,38)."[81]

Copying the apparent consensus sequence for Servenius (GGACCCT) and putting it in "⌘F" finds three located between ZSCAN22 and A1BG and one between ZNF497 and A1BG as can be found by the computer programs.

Specificity proteins

Sp1-box 1 (GGGGCT) and Sp1-box 2 (CTGCCC).[47]

"Sp3 has been shown to repress transcriptional activity of Sp1 [9]."[47]

Sp-1 (CCGCCCC).[52]

Sp1 (GCGGC).[69]

An apparent consensus sequence for Sp1 (GGGGCT), (CTGCCC) or (CCGCCCC) is 5'-(C/G)(C/G/T)G(C/G)C(C/T)-3'. Or, each must be considered separately.

Copying the apparent consensus sequences for Sp1 (GGGGCT), (CTGCCC) or (CCGCCCC) and putting each sequence in "⌘F" finds none located between ZSCAN22 and A1BG and four, two, or none, respectively, between ZNF497 and A1BG as can be found by the computer programs.

STATs

A "homologous IFN-γ activation site (GAS) element, having the consensus sequence TTC/ANNNG/TAA, is found in the promoters of several [interferon-stimulated genes] ISG.(37–40)"[82] Consensus sequences: STAT1 - TTCC(C/G)GGAA, STAT3 - TTCC(C/G)GGAA, STAT4 - TTCCGGAA, STAT5 - TTCNNNGAA and STAT6 - TTCNNNNGAA.[82]

"The GAS element is palindromic and the sequence TTCN(2-4)GAA defines the optimal binding site for all STATs, with the exception of STAT2 which appears to be defective in GAS-DNA binding [...]."[83]

Proximal promoters

Negative strand in the positive direction there is 1: 5'-TTCCGGGAA-3' at 4247.

Distal promoters

Positive strand in the negative direction there are 2: 5'-TTCGTTGAA-3' at 3506 and 5'-TTCCCTGAA-3' at 3782.

Positive strand in the positive direction there is 1: 5'-TTCCATGAA-3' at 128.
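The quoted general GAS pattern TTCN(2-4)GAA can be tested directly against each reported STAT site. The Python sketch below, with the hypothetical helper name `gas_spacer`, reports the spacer length for the four sites listed in this section.

```python
import re

def gas_spacer(site: str):
    """Return the spacer length if the whole site fits TTCN(2-4)GAA, else None."""
    match = re.fullmatch(r"TTC([ACGT]{2,4})GAA", site)
    return len(match.group(1)) if match else None

for site in ("TTCCGGGAA", "TTCGTTGAA", "TTCCCTGAA", "TTCCATGAA"):
    print(site, "spacer:", gas_spacer(site))
```

All four reported sites carry a three-nucleotide spacer, the TTCNNNGAA form listed above for STAT5.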

Ste12p

The upstream activating sequence (UAS) for Ste12p is 5'-TGAAAC-3'.[21]

Copying 5'-TGAAAC-3' in "⌘F" yields eleven between ZSCAN22 and A1BG and one between ZNF497 and A1BG as can be found by the computer programs.

Synaptic Activity-Responsive Elements

"A unique synaptic activity-responsive element (SARE) sequence, composed of the consensus binding sites for SRF, MEF2 and CREB, is necessary for control of transcriptional upregulation of the Arc gene in response to synaptic activity."[84]

"Within the cAMP-responsive element of the somatostatin gene, we observed an 8-base palindrome, 5'-TGACGTCA-3', which is highly conserved in many other genes whose expression is regulated by cAMP."[46]

The consensus sequence for the myocyte enhancer factor 2 (MEF2) is (C/T)TA(A/T)(A/T)(A/T)(A/T)TA(A/G).[64]

The SRE wild type (SREwt) contains the nucleotide sequence ACAGGATGTCCATATTAGGACATCTGC, of which CCATATTAGG is the CArG box, TTAGGACAT is the C/EBP box, and CATCTG is the E box.[80]

Copying each of the consensus sequences into "⌘F", or running the computer programs, finds one MEF2 (CTAATTTTAA) between ZNF497 and A1BG and one 5'-TGACGTCA-3' at 4317 between ZSCAN22 and A1BG. The CArG box 5'-CCATATTAGG-3' and the C/EBP box 5'-TTAGGACAT-3' do not occur in either promoter of A1BG.

TACTAAC boxes

"A consensus sequence TACTAA(C/T) was derived for the branch site of Dictyostelium introns."[85]

  1. positive strand in the positive direction using SuccessablesTACT++.bas, looking for 5'-TACTAA(C/T)-3', found 1: 5'-TACTAAT-3' at 718,
  2. complement, negative strand, positive direction using SuccessablesTACTc-+.bas, looking for 5'-ATGATT(A/G)-3', found 1: 5'-ATGATTA-3' at 718,
  3. inverse complement, positive strand, positive direction using SuccessablesTACTci++.bas, looking for 5'-(A/G)TTAGTA-3', found 1: 5'-ATTAGTA-3' at 709,
  4. inverse, negative strand, positive direction using SuccessablesTACTi-+.bas, looking for 5'-(C/T)AATCAT-3', found 1: 5'-TAATCAT-3' at 709.

TAGteams

The "heptamer consensus sequence CAGGTAG (i.e., the TAGteam) is overrepresented in regulatory regions of the earliest expressed zygotic genes [2]."[86]

Copying the consensus TAGteam, 5'-CAGGTAG-3', and putting the sequence in "⌘F" finds one location between ZNF497 and A1BG and no locations between ZSCAN22 and A1BG as can be found by the computer programs.

Tapetum boxes

The consensus sequence for the TAPETUM box is TCGTGT.[51]

Copying the consensus Tapetum box: 5'-TCGTGT-3' and putting the sequence in "⌘F" finds one location between ZNF497 and A1BG and one between ZSCAN22 and A1BG as can be found by the computer programs.

TATA boxes

Negative strand in the negative direction there are 2: 5'-TATATATA-3' at 1600 (or -2860 nts upstream from the TSS) and 5'-TATATAAA-3' at 1602 (or -2858 nts).

Positive strand in the negative direction there are 3: 5'-TATAAAAG-3' at 184 (or -4276 nts), 5'-TATAAAAG-3' at 223 (or -4237 nts), and 5'-TATATAAA-3' at 2874 (or -1586 nts).

Inverse complement, negative strand, negative direction there are 2: 5'-TATATATA-3' at 1600 and 5'-TTTATATA-3' at 2871.

Inverse complement, positive strand, negative direction there is 1: 5'-TTTTTATA-3' at 219.
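The offsets given in parentheses for the TATA boxes follow from subtracting the apparent TSS position, stated elsewhere on this page as 4460 nts from ZSCAN22, from each hit position. A minimal Python sketch of the arithmetic, with the hypothetical helper name `relative_to_tss`:

```python
TSS = 4460  # apparent TSS position in nts from ZSCAN22, as stated in the text

def relative_to_tss(position: int, tss: int = TSS) -> int:
    """Convert a position counted from ZSCAN22 into an offset from the TSS.
    Negative values lie upstream of the TSS."""
    return position - tss

for pos in (1600, 1602, 184, 223, 2874):
    print(f"position {pos}: {relative_to_tss(pos)} nts")
```

For example, 1600 - 4460 = -2860, the value given for 5'-TATATATA-3'.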

TAT boxes

Only an inverse and its complement occur between ZSCAN22 and A1BG: 5'-TACCTAT-3' at 2996 nts from ZSCAN22.

T boxes

"The different inducing activities of Xbra, VegT and Eomesodermin suggest that the proteins might recognise different DNA target sequences. [...] All three proteins prove to recognise the same core sequence of TCACACCT with some differences in flanking nucleotides."[87]

"Most bZIP proteins show high binding affinity for the ACGT motifs, which include [...] AACGTT (T box) [...]."[7]

"Despite sequence variations within the Tbox DBD between family members, all members of the family appear to bind to the same DNA consensus sequence, TCACACCT. In several in vitro binding-site selection studies, members of the Tbox family were found to bind preferentially sequences containing two or more of these core motifs arranged in various orientations; however, the significance of such double sites in vivo is uncertain, as most Tbox target gene sites have been found to contain only a single consensus motif (18)."[88]

Copying the consensus T boxes 5'-TCACACCT-3' or 5'-AACGTT-3' and putting each sequence in "⌘F" finds two locations for the first and none for the second between ZSCAN22 or ZNF497 and A1BG as can be found by the computer programs.

Telomeric repeat DNA-binding factors

Copying the consensus telomeric repeat DNA-binding factor (TRF) sequence 5'-TTAGGG-3' and putting it in "⌘F" locates ten occurrences between ZSCAN22 and A1BG in the negative direction and two between ZNF497 and A1BG as can be found by the computer programs.

Among the nucleotides between ZSCAN22 and A1BG there are ten 5'-TTAGGG-3', beginning about 300 nucleotides from ZSCAN22 and ending at about 3900 nts. There are two among the nucleotides between ZNF497 and A1BG as A1BG is approached from ZNF497.

Homo sapiens genes containing these are found using Homo sapiens "TRF (TTAGGG repeat binding factor)".[89]

Thyroid hormone response elements

"The arrangement of TREs within the promoter might regulate THR action by determining THR isoform binding, THR dimerization, and coregulators binding. In the classic view of how TH and its receptor stimulate gene expression, the gene promoter contains TREs consisting of a 6-bp consensus sequence (AGGTCA) organized as a direct repeat separated by 4 bp (DR4), a palindrome without spacing (PAL), or an inverted palindrome (LAP) separated by 4 to 6 bp (10–13)."[90]

Copying the consensus sequence for the TRE, 5'-AGGTCA-3', and putting the sequence in "⌘F" finds no locations between ZNF497 and A1BG and two locations between ZSCAN22 and A1BG as can be found by the computer programs.
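A search for the lone half-site AGGTCA does not distinguish the arrangements named in the quotation (DR4, PAL, LAP). The Python sketch below expresses each arrangement as a pattern. The orientation of the half-sites in PAL (half-site followed directly by its reverse complement) and in LAP (reverse complement, 4 to 6 bp spacer, then half-site) is an assumption made for illustration; the test sequences are hypothetical.

```python
import re

HALF = "AGGTCA"
HALF_RC = "TGACCT"  # reverse complement of the half-site

# Assumed layouts for the three arrangements named in the quoted text.
ARRANGEMENTS = {
    "DR4": re.compile(f"{HALF}[ACGT]{{4}}{HALF}"),          # direct repeat, 4 bp spacer
    "PAL": re.compile(f"{HALF}{HALF_RC}"),                  # palindrome, no spacer
    "LAP": re.compile(f"{HALF_RC}[ACGT]{{4,6}}{HALF}"),     # inverted palindrome, 4-6 bp
}

def classify(seq: str) -> list[str]:
    """Return the names of all arrangements found anywhere in seq."""
    return [name for name, pattern in ARRANGEMENTS.items() if pattern.search(seq)]

print(classify("AGGTCACAGGAGGTCA"))    # hypothetical DR4
print(classify("AGGTCATGACCT"))        # hypothetical PAL
print(classify("TGACCTACGTAAGGTCA"))   # hypothetical LAP, 5 bp spacer
print(classify("AGGTCA"))              # lone half-site
```

A lone half-site, which is what the "⌘F" search counts, matches none of the three arrangements.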

Upstream stimulating factors

"The helix-loop-helix transcription factor USF (upstream stimulating factor) binds to a regulatory sequence of the human insulin gene enhancer."[91]

"The regulation of insulin gene expression is dependent on sequences located upstream of the transcription start site (Clark and Docherty, 1992). Two important cis-acting elements, the insulin enhancer binding site 1 (IEBI) or NIR box and the IEB2 or FAR box, have been identified in the rat insulin I gene (Karlsson et al., 1987, 1989). Located at positions -104 (IEBI/NIR) and -233 (IEB2/FAR), these elements share an identical 8 bp sequence, GCCATCTG, which contains a consensus sequence, CANNTG, characteristic of E-box elements (Kingston, 1989). E boxes are present in enhancers from a variety of genes, including immunoglobulin and muscle-specific genes, where they interact with transcription factors containing a helix-loop-helix (HLH) dimerization domain (Murre et al., 1989)."[91]

"The IEB1 box is highly conserved among insulin genes, and is thus likely to play an important role in controlling transcription. The IEB2 site is not well conserved; in the rat insulin 2 gene the equivalent sequence is GCCACCCAGGAG, and in the human insulin gene the homologous sequence, which has been previously designated the GC2 box (Boam et al., 1990a), is GCCACCGG."[91]

"Confirmation that USF bound at the IEB2 site was obtained using an oligonucleotide containing the USF binding site from the adenovirus MLP."[91]

A likely general USF box consensus sequence may be 5'-GCC(A/T)NN(C/G/T)(A/G)-3'.

Those containing an E-box (CANNTG)

  1. Negative strand in the positive direction (from ZNF497 to A1BG) using SuccessablesUSFbox-+.bas, looking for 5'-GCC(A/T)NN(C/G/T)(A/G)-3', found 1: 5'-GCCACATG-3' at 3707.
  2. inverse complement, positive strand, negative direction using SuccessablesUSFboxci+-.bas, looking for 5'-(C/T)(A/C/G)NN(A/T)GGC-3', found 1: 5'-CAGATGGC-3' at 3629.
  3. inverse complement, positive strand, positive direction using SuccessablesUSFboxci++.bas, looking for 5'-(C/T)(A/C/G)NN(A/T)GGC-3', found 1: 5'-CAGGTGGC-3' at 1845.

Those containing an E-box (GTNNAC)

  1. inverse, negative strand, negative direction using SuccessablesUSFboxi--.bas, looking for 5'-(A/G)(C/G/T)NN(A/T)CCG-3', found 1: 5'-GTCTACCG-3' at 3629.
  2. inverse, negative strand, positive direction using SuccessablesUSFboxi-+.bas, looking for 5'-(A/G)(C/G/T)NN(A/T)CCG-3', found 1: 5'-GTCCACCG-3' at 1845.
  3. inverse, positive strand, positive direction using SuccessablesUSFboxi++.bas, looking for 5'-(A/G)(C/G/T)NN(A/T)CCG-3', found 3: 5'-GTCCACCG-3' at 198, 5'-GTGGACCG-3' at 2570 and 5'-GTAGACCG-3' at 3406.
  4. complement, negative strand, negative direction using SuccessablesUSFboxc--.bas, looking for 5'-CGG(A/T)NN(A/C/G)(C/T)-3', found 2: 5'-CGGTCCAC-3' at 2079 and 5'-CGGTCCAC-3' at 3953.
  5. complement, positive strand, positive direction using SuccessablesUSFboxc++.bas, looking for 5'-CGG(A/T)NN(A/C/G)(C/T)-3', found 1: 5'-CGGTGTAC-3' at 3707.
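The complement, inverse and inverse complement search patterns used in these lists can be derived from the degenerate USF consensus by complementing each set of allowed bases and, where needed, reversing the order of positions. The Python sketch below illustrates this; the helper names `parse`, `render`, `complement` and `fits` are assumptions and are not taken from the Successables programs.

```python
import re

COMP = {"A": "T", "C": "G", "G": "C", "T": "A"}

def parse(pattern: str) -> list[set[str]]:
    """Split e.g. 'GCC(A/T)NN' into one set of allowed bases per position."""
    out = []
    for group, single in re.findall(r"\(([ACGT/]+)\)|([ACGTN])", pattern):
        out.append(set(group.split("/")) if group
                   else set("ACGT") if single == "N" else {single})
    return out

def render(positions: list[set[str]]) -> str:
    """Write the position sets back in the (A/T) and N notation used in the text."""
    parts = []
    for allowed in positions:
        bases = sorted(allowed)
        parts.append(bases[0] if len(bases) == 1
                     else "N" if len(bases) == 4 else "(" + "/".join(bases) + ")")
    return "".join(parts)

def complement(positions: list[set[str]]) -> list[set[str]]:
    """Complement every allowed base at every position."""
    return [{COMP[base] for base in allowed} for allowed in positions]

def fits(positions: list[set[str]], seq: str) -> bool:
    """True if seq has the right length and an allowed base at each position."""
    return len(seq) == len(positions) and all(b in p for b, p in zip(seq, positions))

usf = parse("GCC(A/T)NN(C/G/T)(A/G)")
print("complement:        ", render(complement(usf)))
print("inverse:           ", render(usf[::-1]))
print("inverse complement:", render(complement(usf)[::-1]))
print("GCCACATG fits:     ", fits(usf, "GCCACATG"))
```

The three derived patterns agree with the search strings given in the lists above.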

V boxes

"As VRI accumulates in the nucleus during the mid to late day, it binds VRI/PDP1ϵ binding sites (V/P-boxes) [consensus V box:A(/G)TTA(/T)T(/C), P box:GTAAT(/C)], to repress Clk and cry transcription (Hardin, 2004)."[72]

In the negative direction (from ZSCAN22 to A1BG) there are up to 81 V boxes, 28 to 4538 nts from ZSCAN22 with the apparent TSS at 4460 nts.

In the positive direction (from ZNF497 to A1BG) there are up to 21 V boxes, 23 to 4310 nts from ZNF497 with the known TSS at 4300 nts.

W boxes

Proximal promoters

Inverse W boxes occur in the negative strand, negative direction of A1BG: 5'-GGTCAA-3' at 4416 and 5'-GGTCAA-3' at 4308.

W boxes occur in the positive direction, positive strand of A1BG: 5'-CTGACC-3' and its complement at 4216 and inverse W boxes occur 5'-GGTCAG-3' and its complement at 4270.

Distal promoters

W boxes occur in the negative strand, negative direction, along with their complements: 5'-CTGACC-3' at 3749; 5'-CTGACT-3' at 17, 5'-TTGACT-3' at 130, 5'-TTGACT-3' at 307, and 5'-CTGACC-3' at 734, which are close to ZSCAN22; and 5'-CTGACT-3' at 1935, which could be associated with ZSCAN22 or with an unknown gene between it and A1BG.

Inverse complement, positive strand, negative direction there are 5: 5'-GGTCAG-3' at 440, 5'-GGTCAG-3' at 577, 5'-GGTCAG-3' at 713, 5'-GGTCAG-3' at 2249, and 5'-GGTCAG-3' at 2586.

A W box inverse occurs in the negative direction: 5'-GGTCAG-3' at 1353.

W boxes occur in the positive direction, along with their complements: 5'-AGTCAG-3' at 2101, 5'-GGTCAG-3' at 2221, 5'-AGTCAG-3' at 2608, 5'-AGTCAA-3' at 2614, and 5'-AGTCAG-3' at 2619.

W boxes in the positive direction occur 5'-CTGACC-3' at 1662, 5'-CTGACC-3' at 2213, 5'-TTGACC-3' at 2873, 5'-CTGACT-3' at 2945, and 5'-TTGACC-3' at 4018 that could be associated with A1BG, along with 5'-TTGACC-3' at 1953, 5'-CTGACT-3' at 2674, and 5'-TTGACT-3' at 3735.

Inverse complement, positive strand, positive direction there are 6: 5'-GGTCAG-3' at 2025, 5'-AGTCAG-3' at 2099, 5'-GGTCAG-3' at 2606, 5'-GGTCAG-3' at 2997, 5'-GGTCAG-3' at 3083, and 5'-GGTCAA-3' at 3380.

X core promoter elements

  1. negative strand in the negative direction, looking for 5'-(A/G/T)(C/G)G(C/T)GG(A/G)A(C/G)(A/C)-3', found 1: 5'-TGGTGGGACC-3' at 3744 and complement,
  2. inverse complement, positive strand, negative direction, looking for 5'-(G/T)(C/G)T(C/T)CC(A/G)C(C/G)(A/C/T)-3', found 1: 5'-GCTCCCACCT-3' at 392 and complement, and
  3. inverse, negative strand, positive direction, looking for 5'-(A/C)(C/G)A(A/G)GG(C/T)G(C/G)(A/G/T)-3', found 1: 5'-CCAGGGTGGG-3' at 102.

Z boxes

"The HY5 protein interacts with both the G- (CACGTG) and Z- (ATACGTGT) boxes of the light-regulated promoter of RbcS1A (ribulose bisphosphate carboxylase small subunit) and the CHS (chalcone synthase) genes (Ang et al., 1998; Chattopadhyay et al., 1998; Yadav et al., 2002)."[41]

Z-boxes 1-3 contain 5'-AGGTG-3'.[92]

Z box distal promoters

Positive strand, negative direction: 3'-ACACCTGT-5' at 3970, 3'-ATACCTAT-5' at 2996, 3'-ACACCTGT-5' at 1131 and complements.

Negative strand, positive direction: 5'-ACAGGTGT-3' at 1969 and complement.

Positive strand, positive direction: 5'-ACACGTGT-3' at 2962 and complement.
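The sequence reported for the positive strand, positive direction differs from the quoted Z box (ATACGTGT) at one position. One way to find such near matches is a sliding-window mismatch count, sketched below in Python. This is an illustration only: whether a mismatch tolerance was used for the results above is not stated, and the names `mismatches` and `near_matches` as well as the demonstration sequence are hypothetical.

```python
Z_BOX = "ATACGTGT"  # Z box as quoted above


def mismatches(a, b):
    """Count positions at which two equal-length sequences differ."""
    if len(a) != len(b):
        raise ValueError("sequences must have the same length")
    return sum(x != y for x, y in zip(a, b))


def near_matches(sequence, motif, max_mismatches):
    """Return (1-based position, window, mismatch count) for each window
    of the sequence that differs from the motif at no more than
    max_mismatches positions."""
    sequence = sequence.upper()
    width = len(motif)
    hits = []
    for start in range(len(sequence) - width + 1):
        window = sequence[start:start + width]
        distance = mismatches(window, motif)
        if distance <= max_mismatches:
            hits.append((start + 1, window, distance))
    return hits


print(mismatches("ACACGTGT", Z_BOX))            # 1
print(near_matches("GGACACGTGTCC", Z_BOX, 1))   # [(3, 'ACACGTGT', 1)]
```

Setting `max_mismatches` to 0 reduces the scan to an exact search for the motif.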

Response element negative results

Response elements not occurring in promoters near A1BG
Name of elements Consensus sequences Testing Notes
ABA-response elements 5'-GATCGATC-3', 5'-CGATCGAT-3', 5'-ACGTGTCC-3', 5'-GATCGAT-3' 16 ABREN, CGATCGAT motif, ABRE, and core of ABREN and CGATCGAT motif.[15]
Abf1 regulatory factors 5'-CGTCCTCTACG-3' 16 5'-CGTNNNNNACGAT-3'[5]
Activating proteins 5'-GCCCACGGG-3' 16 Activating protein 2 (AP-2)[23]
Activating proteins 5'-GGCCAA-3' 16 Activating protein 2 (AP-2)[74]
Alpha-amylase conserved elements 5'-TATCCA-3' ⌘F 5'-TATCCATCCATCC-3'[19]
Amino acid response elements 5'-ATTGCATCA-3' ⌘F AARE1 (5'-ATTGCATCA-3')[93]
Amino acid response elements 5'-TTTGCATCA-3' ⌘F 5'-TTTGCATCA-3'.[94][95]
AARE-like 5'-TGGTGAAAG-3' ⌘F AARE-like sequence (5′- TGGTGAAAG-3′, named AARE3)[93]
Androgen response elements 5'-GGTACA-3' ⌘F 5′-GGTACACGGTGTTCT-3′[96]
Androgen response elements 5'-TGATTCGTGAG-3' ⌘F 5'-(A/T)(A/G)(A/C/G)(C/T)(C/G/T)(A/C/G)(C/G)(A/C/T)(A/C/G)(A/T)G(A/G/T)(A/G)(C/G)(A/C/T)-3'[97]
Antioxidant-electrophile responsive elements 5'-GTGAGGTCGC-3' ⌘F 5'-GTGAGGTCGC-3'[98] or 5'-GCTGAGT-3', 5'-GCAGGCT-3' of 5'-GC(A/C/T)(A/G/T)(A/G/T)(C/G/T)T(A/C)A-3'[99], an antioxidant response element (ARE)
CAAT boxes 5'-(C/T)(A/G)(A/G)CCAATC(A/G)-3' 16 consensus sequence for the CCAAT-enhancer-binding site (C/EBP) is TAGCATT
Calcineurin-responsive transcription factor gene transcriptions (Crz1ps) 5'-TGCGCCCC-3' ⌘F 5'-TG(A/C)GCCNC-3'[21]
Calcium-response elements 5'-CTATTTCGAG-3' ⌘F CaRE1 5'-CTATTTCGAG-3'[100]
Cat8ps 5'-CGGTCCGC-3' ⌘F 5'-CGGNBNVMHGGA-3', 5'-CGG(A/C/G/T)(C/G/T)(A/C/G/T)(A/C/G)(A/C)(A/C/T)GGA-3'
Cbf1 regulatory factors 5'-TCACGTGA-3' 8 strongly bound Cbf1 motifs enriched at both ends with a "T" on the 5′ and "A" on the 3′ end
C-boxes 5'-GAGGCCATCT-3' 16 5'-GAGGCCATCT-3'[31]
C/A hybrid boxes 5'-TGACGTAT-3' 16 5'-TGACGTAT-3'[41]
C/T hybrid boxes 5'-TGACGTTA-3' 16 5'-TGACGTTA-3'[41]
C/EBP boxes 5'-TTAGGACAT-3', or 5'-TAGCATT-3' ⌘F CCAAT-enhancer-binding site (C/EBP) is TAGCATT
Cell cycle regulation 5'-CCCAACGGT-3' ⌘F tomato genome-wide analysis
CENP-B boxes 5'-TTTCGTTGGAAGCGGGA-3' 16 specifically localized at the centromere
Circadian control elements 5'-CAACTTTA-3' ⌘F CCE
CCCTC-binding factors (CTCF) 5'-NCA-NNA-G(A/G)N-GGC-(A/G)(C/G)(C/T)-3' 16 5′-NCA-NNA-G(G/A)N-GGC-(G/A)(C/G)(T/C)-3′[101]
DAF-16-associated elements 5'-TGATAAG-3' ⌘F DAF-16-associated element (DAE)[102]
DAF-16 binding elements 5'-GTAAACA-3' ⌘F DAF-16 binding element (DBE)[102]
D boxes 5'-GTTGTATAAC-3' ⌘F 5′-CTTATGTAAA-3′[103]
D-boxes 5'-TCTCACA-3' ⌘F TCTCACATT(A/C)AATAAGTCA is a D-box.[31]
Defense and stress-responsive elements 5'-ATTTTCTTCA-3' ⌘F Defense and stress-responsive elements (DREs)
DNA damage response elements (DREs) 5'-TAGCCGCCG-3' or 5'-TTTCAAT-3' ⌘F in the upstream repression sequence (URS)
DNA replication-related elements 5'-TATCGATA-3' ⌘F DNA replication-related element (DRE)[104]
DREB boxes 5'-TACCGACAT-3' 16 CRT/DREB box
EIF4E basal elements 5'-TTACCCCCCCTT-3' 16 poly(C) motif
Endoplasmic reticulum stress response elements 5'-CCAAT-3' ⌘F 5'-CCAATGGGCTGAAAC-3' between ZNF497 and A1BG
Estrogen response elements 5'-AGGTTA-3' or 5'-GGTCAGGAT-3' ⌘F 5'-AGGTTATTGCCTCCT-3' or 5'-GGTCAGGATGAC-3'
Forkhead boxes 5'-(A/G)(C/T)AAA(C/T)A-3' ⌘F 5'-GTAAACAA-3' FOXO1
Gal4ps 5'-CGGACCGC-3' ⌘F 5'-CGG(A/G)NN(A/G)C(C/T)N(C/T)NCNCCG-3'
G boxes 5'-(G/T)CCACGTG(G/T)C-3' ⌘F no "perfect palindrome" G boxes in either promoter
GCN4 motifs 5'-TGACTCA-3', 5'-TGAGTCA-3' ⌘F ACGT motif
Gcn4ps 5'-ATGACTCTT-3' ⌘F GCN4 motifs
GLM boxes 5′-(G/A)TGA(G/C)TCA(T/C)-3′ 16 GCN4-like motif
γ-interferon activated sequences (GAS) 5'-TTCCTAGAA-3' ⌘F ALS-GAS1 between nt −633 and nt −625
Grainy head transcription factor binding sites 5'-AACCGGTT-3' ⌘F also 5'-GACTGGTT-3'
GT boxes 5'-GGGGTGGGG-3' ⌘F (-78 to -69)
Hac1ps 5'-CAGCGTG-3' ⌘F Regulates the unfolded protein response
Heat-responsive elements 5'-AAAAAATTTC-3' ⌘F four nGAAn motifs
Hex sequences 5'-TGACGTGGC-3' ⌘F the Hex sequence (TGACGTGGC)[41]
HMG boxes 5'-(A/T)(A/T)CAAAG-3' ⌘F two or more HMG boxes
Hybrid C, A boxes 5'-TGACGTAT-3' ⌘F A at the 12 position
Hybrid C, G boxes 5'-TGACGTGT-3' ⌘F G at the 12 position
Hybrid C, T boxes 5'-TGACGTTA-3' ⌘F T at the 12 position
Hypoxia-inducible factors 5'-GCCCTACGT-3' ⌘F composed of HIF-1α and HIF-1β
I boxes 5'-GATAAG-3' ⌘F 5'-GGATGAGATAAGA-3'
Inositol, choline-responsive element 5'-TYTTCACATGY-3' ⌘F 5'-TCTTCAC, TCTTCACAT-3'
Kozak sequences 5'-(GCC)GCC(A/G)CCATGG-3' ⌘F 5'-GAAAATGG-3'.[33]
L boxes 5'-TAAATG(A/C/G)A-3' ⌘F L1 box
MAREs 5'-TGCTGA(G/C)TCAGCA-3' ⌘F and 5'-TGCTGA(GC/CG)TCAGCA-3'
M boxes 5'-GTCATGTGCT-3' ⌘F upstream of the TATA box
Mcm1 regulatory factors 5'-(A/C/T)(A/C/T)NC(C/T)(A/C/T)(A/C/T)(A/T)(A/C/T)(A/C/T)N(A/G)(C/G/T)(A/C/T)-3' ⌘F Genome-wide determinant search
Met31ps 5'-AAACTGTGG-3' ⌘F Sulfur amino acid metabolism[72]
Middle sporulation elements 5'-C(A/G)CAAA(A/T)-3' ⌘F 5'-ACACAAA-3' (2017)
Motif ten elements 5'-C-C/G-A-A/G-C-C/G-C/G-A-A-C-G-C/G-3' 16 Gene ID: 6309
Ndt80ps 5'-TCCGCA-3' ⌘F 5'-DNCRCAAAW-3'
Nuclear factor Y 5'-TACCGACAT-3' ⌘F NF-Y is a trimeric complex
Nutrient-sensing response element 1 5'-GTTTCATCA-3' ⌘F only one nucleotide difference between the SESN2 CARE and the ASNS
Oaf1ps 5'-(A/C/G/T)(A/C/G/T)(A/C/G/T)T(A/C/G/T)A(A/C/G/T)-3' ⌘F 5'-CGG(A/C/G/T)3T(A/C/G/T)A(A/C/G/T)9-12CCG-3'
Pdr1p/Pdr3ps 5'-TCCGCGGA-3' ⌘F Pdr1p/Pdr3p response element (PDRE)
Polycomb response elements 5'-CGCCATTT-3' ⌘F closely resembles the extended Pho-Phol consensus sequence
Rap1 regulatory factors 5'-C(A/C/G)(A/C/G)(A/G)(C/G/T)C(A/C/T)(A/G/T)(C/G/T)(A/G/T)(A/C/G)(A/C)(A/C/T)(A/C/T)-3' ⌘F Rap1 (CCCACCAACAAAA) none
Rgt1ps 5'-CGGACCA-3' ⌘F Glucose-responsive transcription factor
Rlm1ps 5'-CTATATATAG-3' ⌘F CTA(T/A)4TAG
Rox1ps 5'-GGGTAA-3' ⌘F Heme-dependent repressor of hypoxic genes[78]
Rpn4ps 5'-GGTGGCAAA-3' ⌘F proteasome genes
Seed-specific elements 5'-CATGCATG-3' ⌘F SRE consensus: 5'-CAGCAGATTGCG-3' is none
Shoot specific elements 5'-GATAATGATG-3' ⌘F SRE consensus: 5'-CAGCAGATTGCG-3' is none
Sip4ps 5'-CCGTCCGT-3' ⌘F 5'-CC(C/G)T(C/T)C(C/G)TCCG-3'
Smp1ps 5'-ACTACTA-3' ⌘F 5'-ACTACTA(T/A)4TAG-3'
Sterol response elements 5'-TCGTATA-3' ⌘F perhaps plant specific
TATCCAC boxes 5'-TATCCAC-3' 16 GA responsive complex component
TCCACCATA elements 5'-TCCACCATA-3' ⌘F adjacent co-dependent regulatory element of POLLEN1
Tec1ps 5'-GAATGT-3' ⌘F Ste12p cofactor
Tetradecanoylphorbol-13-acetate response elements (TREs) 5'-TGA(G/C)TCA-3' 16 cis-regulatory element of the human metallothionein IIa (hMTIIa) promoter and SV40
TGF-β control elements (TCEs) 5'-GAGTGGGGCG-3' ⌘F in mouse and rat, 5'-GCGTGGGGGA-3' in humans
TGF-β inhibitory elements (TIEs) 5'-GAGTGGTGA-3' 16 in the rat transin/stromelysin promoter
Thyroid hormone response elements (TREs) 5'-AGGTCA-3' ⌘F See VDREs, X boxes
Unfolded protein response elements (UPREs) 5'-TGACGTG(G/A)-3' ⌘F XBP1 binds to UPRE
Vhr1ps 5'-AATCA-N8-TGA(C/T)T-3' ⌘F Response to low biotin concentrations[71]
Vitamin D response elements (VDREs) 5'-(A/G)G(G/T)(G/T)CA-3' ⌘F 5'-AGGTCA-3' not ⌘F
X boxes 5'-GTTGGCATGGCAAC-3' 16 X2 box is 5'-AGGTCCA-3' not ⌘F
Xbp1ps 5'-GcCTCGA(G/A)G(C/A)g(a/g)-3' ⌘F Transcriptional repressor
Xenobiotic response elements (XREs) 5'-(T/G)NGCGTG(A/C)(G/C)A-3' ⌘F contains the core sequence 5'-GCGTG-3'
Yap1p,2ps 5'-TTACTAA-3' ⌘F Yap1p binding sites
Y boxes 5'-(A/G)CTAACC(A/G)(A/G)(C/T)-3' 16 inverted CAAT box
Zap1ps 5'-ACCCTCA-3' ⌘F 5'-ACC(C/T)(C/T)(A/C/G/T)AAGGT-3'
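A negative result of the kind tabulated above can be checked programmatically. The Python sketch below converts a consensus written in the table's notation into a regular expression and reports which elements are absent from both strands of a sequence. It is a sketch under stated assumptions, not the testing method recorded in the table: parenthesized groups such as (A/G) are read as alternatives, single letters such as N or Y are read as standard IUPAC nucleotide codes, hyphens are treated as visual separators, and repeat notations such as N8 or (T/A)4 are not handled. The names `to_regex`, `occurs` and `absent_elements`, the choice of four table rows, and the demonstration sequence are all hypothetical.

```python
import re

COMPLEMENT = str.maketrans("ACGT", "TGCA")

# Standard IUPAC ambiguity codes expressed as regex character classes.
IUPAC = {
    "R": "[AG]", "Y": "[CT]", "S": "[CG]", "W": "[AT]", "K": "[GT]", "M": "[AC]",
    "B": "[CGT]", "D": "[AGT]", "H": "[ACT]", "V": "[ACG]", "N": "[ACGT]",
}


def to_regex(consensus):
    """Convert e.g. '(A/G)(C/T)AAA(C/T)A' or 'TYTTCACATGY' to a regex string."""
    body = consensus.replace("-", "")
    body = re.sub(
        r"\(([ACGT/]+)\)",
        lambda m: "[" + m.group(1).replace("/", "") + "]",
        body,
    )
    return "".join(IUPAC.get(ch, ch) for ch in body)


def occurs(consensus, sequence):
    """True if the consensus matches the sequence or its inverse complement."""
    pattern = re.compile(to_regex(consensus))
    forward = sequence.upper()
    inverse_complement = forward.translate(COMPLEMENT)[::-1]
    return bool(pattern.search(forward) or pattern.search(inverse_complement))


def absent_elements(elements, sequence):
    """Return the names of elements whose consensus occurs on neither strand."""
    return [name for name, consensus in elements.items() if not occurs(consensus, sequence)]


# Four consensus sequences copied from the table above.
ELEMENTS = {
    "Forkhead boxes": "(A/G)(C/T)AAA(C/T)A",
    "Hex sequences": "TGACGTGGC",
    "Cbf1 regulatory factors": "TCACGTGA",
    "Xenobiotic response elements (XREs)": "(T/G)NGCGTG(A/C)(G/C)A",
}

# Invented demonstration sequence, not an A1BG promoter.
demo = "CCGTAAACAGG"
print(absent_elements(ELEMENTS, demo))
```

With the invented sequence, only the Forkhead box consensus is found (as GTAAACA), so the other three names are returned; run against a real promoter sequence, the same function would list the elements to be entered as negative results.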

Hypotheses

  1. Downstream core promoters may work as transcription factors even as their complements or inverses.
  2. In addition to the DNA binding sequences listed above, the transcription factors that can open up and attach through the local epigenome need to be known and specified.
  3. Each DNA binding domain serving as a transcription factor for the promoter of any immunoglobulin supergene family member, also serves or is present in the promoters for A1BG.
  4. The function of A1BG is the same as other immunoglobulin genes possessing the immunoglobulin domain cl11960 and/or any of three immunoglobulin-like domains: pfam13895, cd05751 and smart00410 in the order and nucleotide sequence: cd05751 Location: 401 → 493, smart00410 Location: 218 → 280, pfam13895 Location: 210 → 301 and cl11960 Location: 28 → 110.

References

  1. Francis S Collins, Eric D Green, Alan E Guttmacher, Mark S Guyer (24 April 2003). "A vision for the future of genomics research". Nature. 422 (6934): 835–47. doi:10.1038/nature01626. PMID 12695777. Retrieved 9 August 2020.
  2. The ENCODE Project Consortium (22 October 2004). "The ENCODE (ENCyclopedia of DNA Elements) Project". Science. 306 (5696): 636–640. doi:10.1126/science.1105136. PMID 15499007. Retrieved 9 August 2020.
  3. The ENCODE Project Consortium (14 June 2007). "Identification and analysis of functional elements in 1% of the human genome by the ENCODE pilot project". Nature. 447 (7146): 799–816. doi:10.1038/nature05874. PMID 17571346. Retrieved 9 August 2020.
  4. Ya-Mei Wang, Ping Zhou, Li-Yong Wang, Zhen-Hua Li, Yao-Nan Zhang, and Yu-Xiang Zhang (10 August 2012). "Correlation Between DNase I Hypersensitive Site Distribution and Gene Expression in HeLa S3 Cells". PLoS One. 7 (8): e42414. doi:10.1371/journal.pone.0042414. PMID 22900019. Retrieved 9 August 2020.
  5. 5.0 5.1 5.2 5.3 5.4 5.5 5.6 Matthew J. Rossi, William K.M. Lai and B. Franklin Pugh (21 March 2018). "Genome-wide determinants of sequence-specific DNA binding of general regulatory factors". Genome Research. 28: 497–508. doi:10.1101/gr.229518.117. PMID 29563167. Retrieved 31 August 2020.
  6. MeSH (8 July 2008). "Response Elements". U.S. National Library of Medicine, 8600 Rockville Pike, Bethesda, MD 20894: National Institutes of Health, Health & Human Services. Retrieved 2 September 2020.
  7. 7.0 7.1 7.2 7.3 ZG E, YP Z, JH Zhou and L W (16 April 2014). "Roles of the bZIP gene family in rice". Genetics and Molecular Research. 13 (2): 3025–36. doi:10.4238/2014.April.16.11. PMID 24782137.
  8. 8.0 8.1 RefSeq (November 2019). "LOC116286197 CRISPRi-validated cis-regulatory element chr19.6329 [ Homo sapiens (human) ]". 8600 Rockville Pike, Bethesda MD, 20894 USA: National Center for Biotechnology Information, U.S. National Library of Medicine. Retrieved 25 July 2020.
  9. RefSeq (February 2016). "ZNF582 zinc finger protein 582 [ Homo sapiens (human) ]". 8600 Rockville Pike, Bethesda MD, 20894 USA: National Center for Biotechnology Information, U.S. National Library of Medicine. Retrieved 28 May 2020.
  10. RefSeq (June 2018). "LOC112553117 Sharpr-MPRA regulatory region 1998 [ Homo sapiens (human) ]". 8600 Rockville Pike, Bethesda MD, 20894 USA: National Center for Biotechnology Information, U.S. National Library of Medicine. Retrieved 25 July 2020.
  11. RefSeq (June 2018). "Sharpr-MPRA regulatory region 10473 [ Homo sapiens (human) ]". 8600 Rockville Pike, Bethesda MD, 20894 USA: National Center for Biotechnology Information, U.S. National Library of Medicine. Retrieved 16 July 2020.
  12. RefSeq (June 2018). "Sharpr-MPRA regulatory region 7872 [ Homo sapiens (human) ]". 8600 Rockville Pike, Bethesda MD, 20894 USA: National Center for Biotechnology Information, U.S. National Library of Medicine. Retrieved 1 August 2020.
  13. RefSeq (June 2018). "Sharpr-MPRA regulatory region 9894 [ Homo sapiens (human) ]". 8600 Rockville Pike, Bethesda MD, 20894 USA: National Center for Biotechnology Information, U.S. National Library of Medicine. Retrieved 16 July 2020.
  14. 14.0 14.1 Tara L. Conforto, Yijing Zhang, Jennifer Sherman, and David J. Waxman (November 2012). "Impact of CUX2 on the Female Mouse Liver Transcriptome: Activation of Female-Biased Genes and Repression of Male-Biased Genes" (PDF). Molecular and Cellular Biology. 32 (22): 4611–4627. doi:10.1128/MCB.00886-12. PMID 22966202. Retrieved 8 August 2020.
  15. 15.0 15.1 15.2 15.3 15.4 Kenneth A. Watanabe, Arielle Homayouni, Lingkun Gu, Kuan‐Ying Huang, Tuan‐Hua David Ho, Qingxi J. Shen (18 June 2017). "Transcriptomic analysis of rice aleurone cells identified a novel abscisic acid response element". Plant, Cell & Environment. 40 (9): 2004–2016. doi:10.1111/pce.13006. Retrieved 5 October 2020.
  16. Landschulz WH, Johnson PF, McKnight SL (June 1988). "The leucine zipper: a hypothetical structure common to a new class of DNA binding proteins". Science. 240 (4860): 1759–64. Bibcode:1988Sci...240.1759L. doi:10.1126/science.3289117. PMID 3289117.
  17. Nijhawan A, Jain M, Tyagi AK, Khurana JP (February 2008). "Genomic survey and gene expression analysis of the basic leucine zipper transcription factor family in rice". Plant Physiology. 146 (2): 333–50. doi:10.1104/pp.107.112821. PMID 18065552.
  18. Keiko Kokoroishi, Ayumu Nakashima, Shigehiro Doi, Toshinori Ueno, Toshiki Doi, Yukio Yokoyama, Kiyomasa Honda, Masami Kanawa, Yukio Kato, Nobuoki Kohno & Takao Masaki (28 May 2015). "High glucose promotes TGF-β1 production by inducing FOS expression in human peritoneal mesothelial cells". Clinical and Experimental Nephrology. 20 (1): 30–8. doi:10.1007/s10157-015-1128-9. PMID 26018137. Retrieved 14 August 2020.
  19. 19.0 19.1 19.2 19.3 19.4 19.5 19.6 19.7 Bhaskar Sharma & Joemar Taganna (12 June 2020). "Genome-wide analysis of the U-box E3 ubiquitin ligase enzyme gene family in tomato". Scientific Reports. 10 (9581). doi:10.1038/s41598-020-66553-1. PMID 32533036. Retrieved 27 August 2020.
  20. 20.0 20.1 Dalei Shao, Caretha L. Creasy, Lawrence W. Bergman (1 February 1998). "A cysteine residue in helixII of the bHLH domain is essential for homodimerization of the yeast transcription factor Pho4p". Nucleic Acids Research. 26 (3): 710–4. doi:10.1093/nar/26.3.710. PMC 147311. PMID 9443961.
  21. 21.00 21.01 21.02 21.03 21.04 21.05 21.06 21.07 21.08 21.09 21.10 Hongting Tang, Yanling Wu, Jiliang Deng, Nanzhu Chen, Zhaohui Zheng, Yongjun Wei, Xiaozhou Luo, and Jay D. Keasling (6 August 2020). "Promoter Architecture and Promoter Engineering in Saccharomyces cerevisiae". Metabolites. 10 (8): 320–39. doi:10.3390/metabo10080320. PMID 32781665. Retrieved 18 September 2020.
  22. James R. Mitchell, Jeffrey Cheng, and Kathleen Collins (January 1999). "A Box H/ACA Small Nucleolar RNA-Like Domain at the Human Telomerase RNA 3' End" (PDF). Molecular and Cellular Biology. 19 (1): 567–576. doi:10.1128/mcb.19.1.567. PMID 9858580. Retrieved 5 November 2018.
  23. 23.0 23.1 Takayuki Murata, Chieko Noda, Yohei Narita, Takahiro Watanabe, Masahiro Yoshida, Keiji Ashio, Yoshitaka Sato, Fumi Goshima, Teru Kanda, Hironori Yoshiyama, Tatsuya Tsurumi, and Hiroshi Kimura (27 January 2016). "Induction of Epstein-Barr Virus Oncoprotein Latent Membrane Protein 1 (LMP1) by Transcription Factors Activating Protein 2 (AP-2) and Early B Cell Factor (EBF)" (PDF). Journal of Virology. doi:10.1128/JVI.03227-15. Retrieved 4 October 2020.
  24. Isabelle R. Cohen, Susanne Grässel, Alan D. Murdoch, and Renat V. Iozzo (1 November 1993). "Structural characterization of the complete human perlecan gene and its promoter" (PDF). Proceedings of the National Academy of Sciences USA. 90 (21): 10404–10408. doi:10.1073/pnas.90.21.10404. PMID 8234307. Retrieved 6 September 2020.
  25. Thomas D. Burton, Anthony O. Fedele, Jianling Xie, Lauren Sandeman and Christopher G. Proud (22 May 2020). "The gene for the lysosomal protein LAMP3 is a direct target of the transcription factor ATF4" (PDF). Journal of Biological Chemistry. 295 (21): 7418. doi:10.1074/jbc.RA119.011864. PMID 32312748. Retrieved 5 September 2020.
  26. Michael Büttner and Karam B. Singh (May 27, 1997). "Arabidopsis thaliana ethylene-responsive element binding protein (AtEBP), an ethylene-inducible, GCC box DNA-binding protein interacts with an ocs element binding protein". Proceedings of the National Academy of Sciences of the United States of America. 94 (11): 5961–6. Retrieved 2014-05-02.
  27. Noriyuki Sato; Tomohiro Katsuya; Hiromi Rakugi; Seiju Takami; Yukiko Nakata; Tetsuro Miki; Jitsuo Higaki; Toshio Ogihara (September 1997). "Association of Variants in Critical Core Promoter Element of Angiotensinogen Gene With Increased Risk of Essential Hypertension in Japanese". Hypertension. 30 (3 Pt 1): 321–5. doi:10.1161/01.HYP.30.3.321. PMID 9314411. Retrieved 2012-02-20.
  28. Kazuyuki Yanai, Tomoko Saito, Keiko Hirota, Hideyuki Kobayashi, Kazuo Murakami and Akiyoshi Fukamizu (28 November 1997). "Molecular Variation of the Human Angiotensinogen Core Promoter Element Located between the TATA Box and Transcription Initiation Site Affects Its Transcriptional Activity". The Journal of Biological Chemistry. 272 (48): 30558–62. PMID 9374551. Retrieved 2012-02-20.
  29. Stephen A. Liebhaber, Michel J. Goossens, and Yuet Wai Kan (December 1980). "Cloning and complete nucleotide sequence of human 5'-α-globin gene" (PDF). Proceedings of the National Academy of Sciences USA. 77 (12): 7054–8. Retrieved 2013-06-28.
  30. 30.0 30.1 30.2 30.3 Arnaud Stigliani, Raquel Martin-Arevalillo, Jérémy Lucas, Adrien Bessy, Thomas Vinos-Poyo, Victoria Mironova, Teva Vernoux, Renaud Dumas and François Parcy (3 June 2019). "Capturing Auxin Response Factors Syntax Using DNA Binding Models". Molecular Plant. 12 (6): 822–832. doi:10.1016/j.molp.2018.09.010. PMID 30336329. Retrieved 29 August 2020.
  31. 31.0 31.1 31.2 31.3 PA Johnson, D Bunick, NB Hecht (1991). "Protein Binding Regions in the Mouse and Rat Protamine-2 Genes" (PDF). Biology of Reproduction. 44 (1): 127–134. doi:10.1095/biolreprod44.1.127. PMID 2015343. Retrieved 6 April 2019.
  32. Amber Paratore Sanchez and Kumar Sharma (July 2009). "Transcription factors in the pathogenesis of diabetic nephropathy". Expert Reviews in Molecular Medicine. 11: e13. doi:10.1017/S1462399409001057. PMID 19397838. Retrieved 1 October 2018.
  33. 33.0 33.1 33.2 33.3 33.4 Takuya Matsumoto, Saemi Kitajima, Chisato Yamamoto, Mitsuru Aoyagi, Yoshiharu Mitoma, Hiroyuki Harada and Yuji Nagashima (9 August 2020). "Cloning and tissue distribution of the ATP-binding cassette subfamily G member 2 gene in the marine pufferfish Takifugu rubripes" (PDF). Fisheries Science. 86: 873–887. doi:10.1007/s12562-020-01451-z. Retrieved 27 September 2020.
  34. Chuhu Yang, Eugene Bolotin, Tao Jiang, Frances M. Sladek, Ernest Martinez. (March 7, 2007). "Prevalence of the initiator over the TATA box in human and yeast genes and identification of DNA motifs enriched in human TATA-less core promoters". Gene. 389 (1): 52–65. doi:10.1016/j.gene.2006.09.029. PMID 17123746.
  35. Alan K. Kutach, James T. Kadonaga (July 2000). "The Downstream Promoter Element DPE Appears To Be as Widely Used as the TATA Box in Drosophila Core Promoters" (PDF). Molecular and Cellular Biology. 20 (13): 4754–64. PMID 10848601. Retrieved 2012-07-15.
  36. 36.0 36.1 36.2 Andreas Schlundt, Sophie Buchner, Robert Janowski, Thomas Heydenreich, Ralf Heermann, Jürgen Lassak, Arie Geerlof, Ralf Stehle, Dierk Niessing, Kirsten Jung & Michael Sattler (21 April 2017). "Structure-function analysis of the DNA-binding domain of a transmembrane transcriptional activator". Scientific Reports. 7: 1051. doi:10.1038/s41598-017-01031-9. PMID 28432336. Retrieved 28 August 2020.
  37. 37.0 37.1 37.2 Jianyin Long, Daniel L. Galvan, Koki Mise, Yashpal S. Kanwar, Li Li, Naravat Poungavrin, Paul A. Overbeek, Benny H. Chang, and Farhad R. Danesh (28 May 2020). "Role for carbohydrate response element-binding protein (ChREBP) in high glucose-mediated repression of long noncoding RNA Tug1" (PDF). Journal of Biological Chemistry. 5 (28). doi:10.1074/jbc.RA120.013228. Retrieved 6 October 2020.
  38. Masaki Fujisawa, Toshitsugu Nakano, Yoko Shima and Yasuhiro Ito (5 February 2013). "A large-scale identification of direct targets of the tomato MADS box transcription factor RIPENING INHIBITOR reveals the regulation of fruit ripening". The Plant Cell. 25 (2): 371–86. doi:10.1105/tpc.112.108118. PMID 23386264. Retrieved 2017-02-19.
  39. 39.0 39.1 39.2 Weiwei Deng, Hua Ying, Chris A. Helliwell, Jennifer M. Taylor, W. James Peacock, and Elizabeth S. Dennis (19 April 2011). "FLOWERING LOCUS C (FLC) regulates development pathways throughout the life cycle of Arabidopsis". Proceedings of the National Academy of Sciences United States of America. 108 (16): 6680–6685. doi:10.1073/pnas.1103175108. Retrieved 2017-09-17.
  40. 40.0 40.1 Christof Berberich, Ingolf Dürr, Michael Koenen and Veit Witzemann (September 1993). "Two adjacent E box elements and a M‐CAT box are involved in the muscle‐specific regulation of the rat acetylcholine receptor β subunit gene". European Journal of Biochemistry. 216 (2): 395–404. doi:10.1111/j.1432-1033.1993.tb18157.x. Retrieved 27 December 2019.
  41. 41.0 41.1 41.2 41.3 41.4 41.5 41.6 41.7 41.8 Young Hun Song, Cheol Min Yoo, An Pio Hong, Seong Hee Kim, Hee Jeong Jeong, Su Young Shin, Hye Jin Kim, Dae-Jin Yun, Chae Oh Lim, Jeong Dong Bahk, Sang Yeol Lee, Ron T. Nagao, Joe L. Key, and Jong Chan Hong (April 2008). "DNA-Binding Study Identifies C-Box and Hybrid C/G-Box or C/A-Box Motifs as High-Affinity Binding Sites for STF1 and LONG HYPOCOTYL5 Proteins" (PDF). Plant Physiology. 146 (4): 1862–1877. doi:10.1104/pp.107.113217. PMID 18287490. Retrieved 26 March 2019.
  42. 42.0 42.1 42.2 42.3 E. N. Voronina, T. D. Kolokol’tsova, E. A. Nechaeva, and M. L. Filipenko (2003). "Structural–Functional Analysis of the Human Gene for Ribosomal Protein L11" (PDF). Molecular Biology. 37 (3): 362–371. Retrieved 11 April 2019.
  43. 43.0 43.1 43.2 43.3 Dmitry A. Samarsky, Maurille J.Fournier, Robert H.Singer and Edouard Bertrand (1 July 1998). "The snoRNA box C/D motif directs nucleolar targeting and also couples snoRNA synthesis and localization" (PDF). The European Molecular Biology Organization (EMBO) Journal. 17 (13): 3747–3757. doi:10.1093/emboj/17.13.3747. PMID 9649444. Retrieved 2017-02-04.
  44. RefSeq (September 2011). "NFIA nuclear factor I A [ Homo sapiens (human) ]". 8600 Rockville Pike, Bethesda MD, 20894 USA: National Center for Biotechnology Information, U.S. National Library of Medicine. Retrieved 4 May 2020.
  45. Björn Pietzenuk, Catarine Markus, Hervé Gaubert, Navratan Bagwan, Aldo Merotto, Etienne Bucher & Ales Pecinka (11 October 2016). "Recurrent evolution of heat-responsiveness in Brassicaceae COPIA elements". Genome Biology. 17: 209. doi:10.1186/s13059-016-1072-3. Retrieved 14 September 2020.
  46. 46.0 46.1 Marc R. Montminy, Kevin A. Sevarino, John A. Wagner, Gail Mandel, and Richard H. Goodman (September 1986). "Identification of a cyclic-AMP-responsive element within the rat somatostatin gene" (PDF). Proceedings of the National Academy of Sciences of the USA. 83 (18): 6382–6. PMID 2875459. Retrieved 17 September 2018.
  47. 47.0 47.1 47.2 47.3 47.4 47.5 Masaru Motojima, Takao Ando and Toshimasa Yoshioka (10 July 2000). "Sp1-like activity mediates angiotensin-II-induced plasminogen-activator inhibitor type-1 (PAI-1) gene expression in mesangial cells" (PDF). Biochemical Journal. 349 (2): 435–441. doi:10.1042/0264-6021:3490435. PMID 10880342. Retrieved 13 August 2020.
  48. 48.0 48.1 48.2 48.3 Kai Qiu, Zhongpeng Li, Zhen Yang, Junyi Chen, Shouxin Wu, Xiaoyu Zhu, Shan Gao, Jiong Gao, Guodong Ren, Benke Kuai, and Xin Zhou (July 2015). "EIN3 and ORE1 Accelerate Degreening during Ethylene-Mediated Leaf Senescence by Directly Activating Chlorophyll Catabolic Genes in Arabidopsis". PLoS Genetics. 11 (7): e1005399. doi:10.1371/journal.pgen.1005399. PMID 26218222. Retrieved 4 October 2020.
  49. 49.0 49.1 Robert Clifford, Min-Ho Lee, Sudhir Nayak, Mitsue Ohmachi, Flav Giorgini and Tim Schedl (December 2000). "FOG-2, a novel F-box containing protein, associates with the GLD-1 RNA binding protein and directs male sex determination in the C. elegans hermaphrodite germline" (PDF). Development. 127 (24): 5265–76. PMID 11076749. Retrieved 10 August 2020.
  50. Ou, Young; Rattner, J.B. (2004). "The Centrosome in Higher Organisms: Structure, Composition, and Duplication". International Review of Cytology. 238: 119–182. doi:10.1016/s0074-7696(04)38003-4. ISBN 978-0-12-364642-2. PMID 15364198.
  51. 51.0 51.1 51.2 51.3 51.4 51.5 51.6 Zi-Wei Ye, Jie Xu, Jianxin Shi, Dabing Zhang and Mee-Len Chye (January 2017). "Kelch-motif containing acyl-CoA binding proteins AtACBP4 and AtACBP5 are differentially expressed and function in floral lipid metabolism" (PDF). Plant Molecular Biology. 93: 209–225. doi:10.1007/s11103-016-0557-5. PMID 27826761. Retrieved 7 May 2020.
  52. 52.0 52.1 52.2 52.3 52.4 52.5 52.6 Hiroshi Sato, Megumi Kita, and Motoharu Seiki (5 November 1993). "v-Src Activates the Expression of 92-kDa Type IV Collagenase Gene through the AP-1 Site and the GT Box Homologous to Retinoblastoma Control Elements" (PDF). The Journal of Biological Chemistry. 268 (31): 23460–8. PMID 8226872. Retrieved 13 August 2020.
  53. Nicholas V Parsonnet, Nickolaus C Lammer, Zachariah E Holmes, Robert T Batey, Deborah S Wuttke (5 September 2019). "The glucocorticoid receptor DNA-binding domain recognizes RNA hairpin structures with high affinity". Nucleic Acids Research. 47 (15): 8180–8192. doi:10.1093/nar/gkz486. PMID 31147715. Retrieved 28 August 2020.
  54. Vincent Laudet, Dominique Stehelin and Hans Clevers (1993). "Ancestry and diversity of the HMG box superfamily" (PDF). Nucleic Acids Research. 21 (10): 2493–501. doi:10.1093/nar/21.10.2493. PMID 8506143. Retrieved 2017-04-05.
  55. RefSeq (April 2015). HNF1A HNF1 homeobox A [ Homo sapiens (human) ]. 8600 Rockville Pike, Bethesda MD, 20894 USA: National Center for Biotechnology Information, U.S. National Library of Medicine. Retrieved 7 November 2018.
  56. 56.0 56.1
  57. Cissi Gardmo and Agneta Mode (1 December 2006). "In vivo transfection of rat liver discloses binding sites conveying GH-dependent and female-specific gene expression". Journal of Molecular Endocrinology. 37 (3): 433–441. doi:10.1677/jme.1.02116. PMID 17170084. Retrieved 2017-09-01.
  58. Anna Kalousová, Vladimı́r Beneš, Jan Pačes, Václav Pačes and Zbyněk Kozmik (June 1999). "DNA Binding and Transactivating Properties of the Paired and Homeobox Protein Pax4". Biochemical and Biophysical Research Communications. 259 (3): 510–518. PMID 10364449. Retrieved 6 May 2020.
  59. G. Damante, D. Fabbro, L. Pelizari, D. Civitareale, S. Guazzi, M. Polycarpou-Schwartz, S. Cauci, F. Quadrifoglio, S. Formisano and R. Di Lauro (20 June 1994). "Sequence-specific DNA recognition by the thyroid transcription factor-1 homeodomain" (PDF). Nucleic Acids Research. 22 (15): 3075–83. doi:10.1093/nar/22.15.3075. PMID 7915030. Retrieved 6 May 2020.
  60. Young Jin Kim, Dong Gwan Kim, Sun Hi Lee and Incheol Lee (February 2006). "Wound-induced expression of the ferulate 5-hydroxylase gene in Camptotheca acuminata". Biochimica et Biophysica Acta (BBA) - General Subjects. 1760 (2): 182–190. doi:10.1016/j.bbagen.2005.08.015. PMID 16332414. Retrieved 9 September 2020.
  61. 61.0 61.1 Klaudia Kulczynska, James J. Bieker, Miroslawa Siatecka (12 February 2020). "A Krüppel-like factor 1 (KLF1) Mutation Associated with Severe Congenital Dyserythropoietic Anemia Alters Its DNA-Binding Specificity". Molecular and Cellular Biology. 40 (5): e00444–19. doi:10.1128/MCB.00444-19. PMID 31818881.
  62. Paul J Rushton and Imre E Somssich (August 1998). "Transcriptional control of plant genes responsive to pathogens" (PDF). Current Opinion in Plant Biology. 1 (4): 311–5. doi:10.1016/1369-5266(88)80052-9. PMID 10066598. Retrieved 5 November 2018.
  63. 63.0 63.1 Potthoff MJ, Olson EN (December 2007). "MEF2: a central regulator of diverse developmental programs". Development. 134 (23): 4131–40. doi:10.1242/dev.008367. PMID 17959722.
  64. 64.0 64.1 64.2 Ayisha Zia, Muhammad Imran, and Sajid Rashid (7 February 2020). "In Silico Exploration of Conformational Dynamics and Novel Inhibitors for Targeting MEF2-Associated Transcriptional Activity". Journal of Chemical Information and Modeling. 60 (3): 1892–1909. doi:10.1021/acs.jcim.0c00008. Retrieved 10 September 2020.
  65. Patrick Cramer, Christopher J. Larson, Gregory L. Verdine and Christoph W. Müller (1 December 1997). "Structure of the human NF‐κB p52 homodimer‐DNA complex at 2.1 Å resolution". The EMBO Journal. 16 (23): 7078–90. doi:10.1093/emboj/16.23.7078. PMID 9384586. Retrieved 3 May 2020.
  66. Jugnu Jain, Emmanuel Burgeon, Tina M. Badalian, Patrick G. Hogan and Anjana Rao (24 February 1995). "A Similar DNA-binding Motif in NFAT Family Proteins and the Rel Homology Region" (PDF). Journal of Biological Chemistry. 270 (8): 4138–4145. doi:10.1074/jbc.270.8.4138. PMID 7876165. Retrieved 15 August 2020.
  67. Blomquist P, Belikov S, Wrange O (January 1999). "Increased nuclear factor 1 binding to its nucleosomal site mediated by sequence-dependent DNA structure". Nucleic Acids Research. 27 (2): 517–25. doi:10.1093/nar/27.2.517. PMC 148209. PMID 9862974.
  68. Walter F. Boron (2003). Medical Physiology: A Cellular And Molecular Approach. Elsevier/Saunders. pp. 125–126. ISBN 1-4160-2328-3.
  69. 69.0 69.1 D. W. Yao, J. Luo, Q. Y. He, J. Li, H. Wang, H. B. Shi, H. F. Xu, M. Wang and J. J. Loor (May 2016). "Characterization of the liver X receptor-dependent regulatory mechanism of goat stearoyl-coenzyme A desaturase 1 gene by linoleic acid". Journal of Dairy Science. 99 (5): 3945–3957. doi:10.3168/jds.2015-10601. PMID 26947306. Retrieved 5 September 2020.
  70. Sinéad Kearns, Rudi Lurz, Elena V. Orlova, Andrei L. Okorokov (27 July 2016). "Two p53 tetramers bind one consensus DNA response element". Nucleic Acids Research. 44 (13): 6185–6199. doi:10.1093/nar/gkw215. Retrieved 6 October 2020.
  71. C A Perez, J Ott, D J Mays & J A Pietenpol (15 November 2007). "p63 consensus DNA-binding site: identification, analysis and application into a p63MH algorithm". Oncogene. 26 (52): 7363–70. doi:10.1038/sj.onc.1210561. PMID 17563751. Retrieved 28 August 2020.
  72. 72.0 72.1 Wangjie Yu and Paul E. Hardin (2006). "Circadian oscillators of Drosophila and mammals". Journal of Cell Science. 119: 4793–5. doi:10.1242/jcs.03174. PMID 17130292. Retrieved 2017-02-19.
  73. Mengli You, Shuping Yuan, Juanjuan Shi, Yongzhong Hou (1 June 2015). "PPARδ signaling regulates colorectal cancer". Current Pharmaceutical Design. 21 (21): 2956–2959. doi:10.2174/1381612821666150514104035. PMID 26004416. Retrieved 10 September 2020.
  74. 74.0 74.1 Yao EF, Denison MS (June 1992). "DNA sequence determinants for binding of transformed Ah receptor to a dioxin-responsive enhancer". Biochemistry. 31 (21): 5060–7. doi:10.1021/bi00136a019. PMID 1318077.
  75. Jennifer A. Pietenpol, Karl Munger, Peter M. Howley, Roland W. Stein and Harold L. Moses (November 15, 1991). "Factor-binding element in the human c-myc promoter involved in transcriptional regulation by transforming growth factor β1 and by the retinoblastoma gene product" (PDF). Proceedings of the National Academy of Sciences USA. 88 (22): 10227–10231. doi:10.1073/pnas.88.22.10227. PMID 1946442. Retrieved 5 December 2018.
  76. Ashutosh Kumar, Himanshu N. Singh, Vikas Pareek, Khursheed Raza, Subrahamanyam Dantham, Pavan Kumar, Sankat Mochan and Muneeb A. Faiq (9 August 2016). "A Possible Mechanism of Zika Virus Associated Microcephaly: Imperative Role of Retinoic Acid Response Element (RARE) Consensus Sequence Repeats in the Viral Genome". Frontiers in Human Neuroscience. 10: 403. doi:10.3389/fnhum.2016.00403. PMID 27555815. Retrieved 7 September 2020.
  77. 77.0 77.1 77.2 Ruoyi Gu, Jun Xu, Yixiang Lin, Jing Zhang, Huijun Wang, Wei Sheng, Duan Ma, Xiaojing Ma & Guoying Huang (July 2016). "Liganded retinoic acid X receptor α represses connexin 43 through a potential retinoic acid response element in the promoter region". Pediatric Research. 80 (1): 159–168. doi:10.1038/pr.2016.47. PMID 26991262. Retrieved 7 September 2020.
  78. Junjian Wang, June X. Zou, Xiaoqian Xue, Demin Cai, Yan Zhang, Zhijian Duan, Qiuping Xiang, Joy C. Yang, Maggie C. Louie, Alexander D. Borowsky, Allen C. Gao, Christopher P. Evans, Kit S. Lam, Jianzhen Xu, Hsing-Jien Kung, Ronald M. Evans, Yong Xu, and Hong-Wu Chen (May 2016). "ROR-γ drives androgen receptor expression and represents a therapeutic target in castration-resistant prostate cancer". Nature Medicine. 22 (5): 488–496. doi:10.1038/nm.4070. PMID 27019329. Retrieved 6 September 2020.
  79. Ulrike Hartmann, Martin Sagasser, Frank Mehrtens, Ralf Stracke and Bernd Weisshaar (January 2005). "Differential combinatorial interactions of cis-acting elements recognized by R2R3-MYB, BZIP, and BHLH factors control light-responsive and tissue-specific activation of phenylpropanoid biosynthesis genes" (PDF). Plant Molecular Biology. 57 (2): 155–171. doi:10.1007/s11103-004-6910-0. PMID 15821875. Retrieved 10 November 2018.
  80. Ravi P. Misra, Azad Bonni, Cindy K. Miranti, Victor M. Rivera, Morgan Sheng, and Michael E. Greenberg (14 October 1994). "L-type Voltage-sensitive Calcium Channel Activation Stimulates Gene Expression by a Serum Response Factor-dependent Pathway" (PDF). The Journal of Biological Chemistry. 269 (41): 25483–25493. PMID 7929249. Retrieved 7 December 2019.
  81. John P. Cogswell, Patricia V. Basta, and Jenny P.-Y. Ting (October 1990). "X-box-binding proteins positively and negatively regulate transcription of the HLA-DRA gene through interaction with discrete upstream W and V elements" (PDF). Proceedings of the National Academy of Sciences USA. 87 (19): 7703–7707. doi:10.1073/pnas.87.19.7703. PMID 2120707. Retrieved 20 August 2020.
  82. Julien J. Ghislain, Thomas Wong, Melody Nguyen, and Eleanor N. Fish (June 2001). "The Interferon-Inducible Stat2:Stat1 Heterodimer Preferentially Binds In Vitro to a Consensus Element Found in the Promoters of a Subset of Interferon-Stimulated Genes" (PDF). Journal of Interferon and Cytokine Research. 21 (6): 379–388. doi:10.1089/107999001750277. PMID 11440635. Retrieved 15 August 2020.
  83. Joanna Wesoly, Zofia Szweykowska-Kulinska and Hans A R Bluyssen (31 March 2007). "STAT activation and differential complex formation dictate selectivity of interferon responses". Acta Biochimica Polonica. 54 (1): 27–38. doi:10.18388/abp.2007_3266. PMID 17351669. Retrieved 15 August 2020.
  84. Fernanda M. Rodríguez-Tornos, Iñigo San Aniceto, Beatriz Cubelos, Marta Nieto (31 January 2013). "Enrichment of Conserved Synaptic Activity-Responsive Element in Neuronal Genes Predicts a Coordinated Response of MEF2, CREB and SRF". PLoS ONE. 8 (1): e53848. doi:10.1371/journal.pone.0053848. PMID 23382855. Retrieved 12 November 2018.
  85. Francisco Rivero (2002). "mRNA processing in Dictyostelium: sequence requirements for termination and splicing" (PDF). Protist. 153 (2): 169–76. doi:10.1078/1434-4610-00095. PMID 12125758. Retrieved 5 April 2017.
  86. Rodrigo Nunes da Fonseca and Thiago M. Venancio (1 March 2018). "Maternal or zygotic: Unveiling the secrets of the Pancrustacea transcription factor zelda". PLoS Genetics. 14 (3): e1007201. doi:10.1371/journal.pgen.1007201. PMID 29494591. Retrieved 5 September 2020.
  87. Frank L. Conlon, Lynne Fairclough, Brenda M. J. Price, Elena S. Casey and J. C. Smith (2001). "Determinants of T box protein specificity" (PDF). Development. 128 (19): 3749–3758. PMID 11585801. Retrieved 17 November 2018.
  88. Ce Feng Liu, Gabriel S. Brandt, Quyen Q. Hoang, Natalia Naumova, Vanja Lazarevic, Eun Sook Hwang, Job Dekker, Laurie H. Glimcher, Dagmar Ringe, and Gregory A. Petsko (25 October 2016). "Crystal structure of the DNA binding domain of the transcription factor T-bet suggests simultaneous recognition of distant genome sites". Proceedings of the National Academy of Sciences of the USA. 113 (43): E6572–E6581. doi:10.1073/pnas.1613914113. PMID 27791029. Retrieved 28 August 2020.
  89. Yoshiro Maru (2016). Basic Research, In: "Inflammation and Metastasis". Tokyo: Springer. pp. 193–231. doi:10.1007/978-4-431-56024-1_10. ISBN 978-4-431-56022-7. Retrieved 28 August 2020.
  90. Vitor M S Pinto, Svetlana Minakhina, Shuiqing Qiu, Aniket Sidhaye, Michael P Brotherton, Amy Suhotliv, Fredric E Wondisford (1 September 2017). "Naturally Occurring Amino Acids in Helix 10 of the Thyroid Hormone Receptor Mediate Isoform-Specific TH Gene Regulation". Endocrinology. 158 (9): 3067–3078. doi:10.1210/en.2017-00314. PMID 28911178. Retrieved 5 September 2020.
  91. Martin L. Read, Andrew R. Clark and Kevin Docherty (1993). "The helix-loop-helix transcription factor USF (upstream stimulating factor) binds to a regulatory sequence of the human insulin gene enhancer" (PDF). Biochemical Journal. 295: 233–237. doi:10.1042/bj2950233. PMID 8216223. Retrieved 14 August 2020.
  92. Jakob Mejlvang, Marina Kriajevska, Cindy Vandewalle, Tatyana Chernova, A. Emre Sayan, Geert Berx, J. Kilian Mellon, and Eugene Tulchinsky (November 2007). "Direct Repression of Cyclin D1 by SIP1 Attenuates Cell Cycle Progression in Cells Undergoing an Epithelial Mesenchymal Transition". Molecular Biology of the Cell. 18 (11): 4615–4624. doi:10.1091/mbc.e07-05-0406. PMID 17855508. Retrieved 15 November 2018.
  93. Ryuto Maruyama, Makoto Shimizu, Juan Li, Jun Inoue & Ryuichiro Sato (24 March 2016). "Fibroblast growth factor 21 induction by activating transcription factor 4 is regulated through three amino acid response elements in its promoter region". Bioscience, Biotechnology, and Biochemistry. 80 (5): 929–934. doi:10.1080/09168451.2015.1135045. Retrieved 4 October 2020.
  94. Angelika Bröer, Gregory Gauthier-Coles, Farid Rahimi, Michelle van Geldermalsen, Dieter Dorsch, Ansgar Wegener, Jeff Holst, and Stefan Bröer (15 March 2019). "Ablation of the ASCT2 (SLC1A5) gene encoding a neutral amino acid transporter reveals transporter plasticity and redundancy in cancer cells" (PDF). Journal of Biological Chemistry. 294 (11): 4012–4026. doi:10.1074/jbc.RA118.006378. Retrieved 4 October 2020.
  95. Alisa A. Garaeva, Irina E. Kovaleva, Peter M. Chumakov & Alexandra G. Evstafieva (15 January 2016). "Mitochondrial dysfunction induces SESN2 gene expression through Activating Transcription Factor 4". Cell Cycle. 15 (1): 64–71. doi:10.1080/15384101.2015.1120929. PMID 26771712. Retrieved 5 September 2020.
  96. S Kouhpayeh, AR Einizadeh, Z Hejazi, M Boshtam, L Shariati, M Mirian, L Darzi, M Sojoudi, H Khanahmad and A Rezaei (1 July 2016). "Antiproliferative effect of a synthetic aptamer mimicking androgen response elements in the LNCaP cell line" (PDF). Cancer Gene Therapy. 23: 254–257. doi:10.1038/cgt.2016.26. Retrieved 3 October 2020.
  97. Stephen Wilson, Jianfei Qi & Fabian V. Filipp (14 September 2016). "Refinement of the androgen response element based on ChIP-Seq in androgen-insensitive and androgen-responsive prostate cancer cell lines". Scientific Reports. 6: 32611. doi:10.1038/srep32611. Retrieved 3 October 2020.
  98. Akihito Otsuki, Mikiko Suzuki, Fumiki Katsuoka, Kouhei Tsuchida, Hiromi Suda, Masanobu Morita, Ritsuko Shimizu, Masayuki Yamamoto (February 2016). "Unique cistrome defined as CsMBE is strictly required for Nrf2-sMaf heterodimer function in cytoprotection". Free Radical Biology and Medicine. 91: 45–57. doi:10.1016/j.freeradbiomed.2015.12.005. PMID 26677805. Retrieved 21 August 2020.
  99. Sarah E. Lacher, Daniel C. Levings, Samuel Freeman, Matthew Slattery (October 2018). "Identification of a functional antioxidant response element at the HIF1A locus". Redox Biology. 19: 401–411. doi:10.1016/j.redox.2018.08.014. Retrieved 6 October 2020.
  100. Xu Tao, Anne E. West, Wen G. Chen, Gabriel Corfas, Michael E. Greenberg (2002). "A calcium-responsive transcription factor, CaRF, that regulates neuronal activity-dependent expression of BDNF". Neuron. 33: 383–95. doi:10.1016/S0896-6273(01)00561-X. PMID 11832226. Retrieved 2 September 2020.
  101. Hideharu Hashimoto, Dongxue Wang, John R. Horton, Xing Zhang, Victor G. Corces and Xiaodong Cheng (1 June 2017). "Structural Basis for the Versatile and Methylation-Dependent Binding of CTCF to DNA". Molecular Cell. 66 (5): 711–720.e3. doi:10.1016/j.molcel.2017.05.004. PMID 28529057. Retrieved 28 August 2020.
  102. Yan-Hui Li and Gai-Gai Zhang (12 April 2016). "Towards understanding the lifespan extension by reduced insulin signaling: bioinformatics analysis of DAF-16/FOXO direct targets in Caenorhabditis elegans". Oncotarget. 7 (15): 19185–19192. doi:10.18632/oncotarget.8313. PMID 2702736. Retrieved 27 August 2020.
  103. Philipp Mracek, Cristina Santoriello, M. Laura Idda, Cristina Pagano, Zohar Ben-Moshe, Yoav Gothilf, Daniela Vallone, Nicholas S. Foulkes (6 December 2012). "Regulation of per and cry Genes Reveals a Central Role for the D-Box Enhancer in Light-Dependent Gene Expression". PLoS ONE. 7 (12): e51278. doi:10.1371/journal.pone.0051278. Retrieved 10 February 2019.
  104. Fumiko Hirose, Masamitsu Yamaguchi, Akio Matsukage (September 1999). "Targeted Expression of the DNA Binding Domain of DRE-Binding Factor, a Drosophila Transcription Factor, Attenuates DNA Replication of the Salivary Gland and Eye Imaginal Disc". Molecular and Cellular Biology. 19 (9): 6020–6028. doi:10.1128/MCB.19.9.6020. PMID 10454549. Retrieved 4 September 2020.

External links
