Complex locus A1BG and ZNF497: Difference between revisions

# NP_001303959.1 zinc finger protein 418 isoform d: "Transcript Variant: This variant (5) lacks three alternate exons compared to variant 1. The resulting isoform (d) is shorter at the N-terminus compared to isoform a."<ref name=HGNC147686/>
# NP_597717.1 zinc finger protein 418 isoform c: "Transcript Variant: This variant (3) lacks an alternate 5' exon and uses an alternate in-frame splice junction compared to variant 1. The resulting isoform (c) has a shorter and distinct N-terminus compared to isoform a. Variants 3 and 4 both encode the same isoform (c)."<ref name=HGNC147686/>
Gene ID: 147687 is ZNF417 zinc finger protein 417.<ref name=HGNC147687>{{ cite web
|author=HGNC
|title=ZNF417 zinc finger protein 417 [ Homo sapiens (human) ]
|publisher=National Center for Biotechnology Information, U.S. National Library of Medicine
|location=8600 Rockville Pike, Bethesda MD, 20894 USA
|date=13 March 2020
|url=https://www.ncbi.nlm.nih.gov/gene/147687
|accessdate=28 May 2020 }}</ref>
# NP_001284663.1  zinc finger protein 417 isoform 2: "Transcript Variant: This variant (2) uses an alternate in-frame splice site in the 5' coding region, compared to variant 1. It encodes isoform 2, which is shorter by one amino acid, compared to isoform 1."<ref name=HGNC147687/>
# NP_689688.2  zinc finger protein 417 isoform 1: "Transcript Variant: This variant (1) represents the longer transcript and encodes the longer isoform (1)."<ref name=HGNC147687/>
Gene ID: 147694 is ZNF548 zinc finger protein 548.<ref name=HGNC147694>{{ cite web
|author=HGNC
|title=ZNF548 zinc finger protein 548 [ Homo sapiens (human) ]
|publisher=National Center for Biotechnology Information, U.S. National Library of Medicine
|location=8600 Rockville Pike, Bethesda MD, 20894 USA
|date=13 March 2020
|url=https://www.ncbi.nlm.nih.gov/gene/147694
|accessdate=28 May 2020 }}</ref>
# NP_001166244.1  zinc finger protein 548 isoform 1: "Transcript Variant: This variant (1) represents the longer transcript and encodes the longer isoform (1)."<ref name=HGNC147694/>
# NP_690873.2  zinc finger protein 548 isoform 2: "Transcript Variant: This variant (2) lacks an alternate in-frame exon in the 5' coding region, compared to variant 1. The resulting isoform (2) lacks an internal segment, compared to isoform 1."<ref name=HGNC147694/>
Gene ID: 147947 is ZNF542P zinc finger protein 542, pseudogene.<ref name=HGNC147947>{{ cite web
|author=HGNC
|title=ZNF542P zinc finger protein 542, pseudogene [ Homo sapiens (human) ]
|publisher=National Center for Biotechnology Information, U.S. National Library of Medicine
|location=8600 Rockville Pike, Bethesda MD, 20894 USA
|date=13 March 2020
|url=https://www.ncbi.nlm.nih.gov/gene/147947
|accessdate=28 May 2020 }}</ref>
# NR_003127.2 RNA Sequence, non-coding RNA: "Transcript Variant: This variant (4) has an alternate first exon, includes an additional internal exon in the 5' region, and uses an alternate splice site in an internal exon in the 3' region, compared to variant 1."<ref name=HGNC147947/>
# NR_024055.2 RNA Sequence, non-coding RNA: "Transcript Variant: This variant (5) has an alternate first exon, includes an additional internal exon in the 5' region, and lacks an internal exon in the 3' region, compared to variant 1."<ref name=HGNC147947/>
# NR_024056.2 RNA Sequence, non-coding RNA: "Transcript Variant: This variant (2) uses an alternate splice site in an internal exon in the 3' region, compared to variant 1."<ref name=HGNC147947/>
# NR_024057.2 RNA Sequence, non-coding RNA: "Transcript Variant: This variant (3) lacks an alternate internal exon in the 3' region, compared to variant 1."<ref name=HGNC147947/>
# NR_033418.1 RNA Sequence, non-coding RNA: "Transcript Variant: This variant (1) represents the longest transcript."<ref name=HGNC147947/>


==A boxes==

Revision as of 01:35, 29 May 2020

Associate Editor(s)-in-Chief: Henry A. Hoff

Alpha-1-B glycoprotein is a 54.3 kDa protein in humans that is encoded by the A1BG gene.[1] The protein encoded by this gene is a plasma glycoprotein of unknown function. The protein shows sequence similarity to the variable regions of some immunoglobulin supergene family member proteins.

A1BG is located on the negative DNA strand of chromosome 19 from 58,858,172 - 58,864,865.[2] Additionally, A1BG is located directly adjacent to the ZSCAN22 gene (58,838,385 - 58,853,712) on the positive DNA strand, as well as the ZNF837 (58,878,990 - 58,892,389, complement) and ZNF497 (58,865,723 - 58,874,214, complement) genes on the negative strand.[2]
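The coordinates above allow a quick arithmetic check of how long each gene span is and how close the neighboring genes lie. The following is a minimal sketch, assuming the quoted positions are 1-based with inclusive ends (the text does not state the convention); the names `GENES`, `span_length`, and `gap_between` are illustrative only.

```python
# Chromosome 19 coordinates as quoted in the text.
# Assumption: 1-based positions with inclusive ends.
GENES = {
    "ZSCAN22": (58_838_385, 58_853_712),
    "A1BG": (58_858_172, 58_864_865),
    "ZNF497": (58_865_723, 58_874_214),
    "ZNF837": (58_878_990, 58_892_389),
}


def span_length(start, end):
    """Number of bases covered by an inclusive interval."""
    return end - start + 1


def gap_between(left, right):
    """Number of bases strictly between two non-overlapping intervals."""
    return right[0] - left[1] - 1


# Length of each gene span.
lengths = {name: span_length(*coords) for name, coords in GENES.items()}

# Gaps between neighbors, after ordering the genes by start position.
ordered = sorted(GENES.items(), key=lambda item: item[1][0])
gaps = {
    (a[0], b[0]): gap_between(a[1], b[1])
    for a, b in zip(ordered, ordered[1:])
}

for name, n in lengths.items():
    print(f"{name}: {n} bp")
for (left, right), n in gaps.items():
    print(f"{left} to {right}: {n} bp apart")
```

Under these assumptions the A1BG span is 6,694 bp, and ZNF497 begins 857 bp beyond the end of A1BG, which is consistent with the description of the two genes as directly adjacent.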

ZSCAN22

  1. Gene ID: 342945 is ZSCAN22 zinc finger and SCAN domain containing 22 on 19q13.43.[3] ZSCAN22 is transcribed in the negative direction from LOC100887072.[3]
  2. Gene ID: 102465484 is MIR6806 microRNA 6806 on 19q13.43: "microRNAs (miRNAs) are short (20-24 nt) non-coding RNAs that are involved in post-transcriptional regulation of gene expression in multicellular organisms by affecting both the stability and translation of mRNAs. miRNAs are transcribed by RNA polymerase II as part of capped and polyadenylated primary transcripts (pri-miRNAs) that can be either protein-coding or non-coding. The primary transcript is cleaved by the Drosha ribonuclease III enzyme to produce an approximately 70-nt stem-loop precursor miRNA (pre-miRNA), which is further cleaved by the cytoplasmic Dicer ribonuclease to generate the mature miRNA and antisense miRNA star (miRNA*) products. The mature miRNA is incorporated into a RNA-induced silencing complex (RISC), which recognizes target mRNAs through imperfect base pairing with the miRNA and most commonly results in translational inhibition or destabilization of the target mRNA. The RefSeq represents the predicted microRNA stem-loop."[4] MIR6806 is transcribed in the negative direction from LOC105372480.[4]

Alpha-1-B glycoprotein

Def. "a substance that induces an immune response, usually foreign"[5] is called an antigen.

Def. any "substance that elicits [an] immune response"[6] is called an immunogen.

An antigen "or immunogen is a molecule that sometimes stimulates an immune system response."[7] But, "the immune system does not consist of only antibodies",[7] instead it "encompasses all substances that can be recognized by the adaptive immune system."[7]

Def. "a protein produced by B-lymphocytes that binds to an [a specific][8] antigen"[9] is called an antibody.

Five different antibody isotypes are known in mammals, which perform different roles, and help direct the appropriate immune response for each different type of foreign object they encounter.[10]

Although the general structure of all antibodies is very similar, a small region, known as the hypervariable region, at the tip of the protein is extremely variable, allowing millions of antibodies with slightly different tip structures to exist, where each of these variants can bind to a different target, known as an antigen.[11]

Def. "any of the glycoproteins in blood serum that respond to invasion by foreign antigens and that protect the host by removing pathogens;"[12] "an antibody"[13] is called an immunoglobulin.

Gene ID: 1 is A1BG alpha-1-B glycoprotein on 19q13.43, a 54.3 kDa protein in humans that is encoded by the A1BG gene.[14] A1BG is transcribed in the positive direction from ZNF497.[14] "The protein encoded by this gene is a plasma glycoprotein of unknown function. The protein shows sequence similarity to the variable regions of some immunoglobulin supergene family member proteins."[14]

  1. NP_570602.2 alpha-1B-glycoprotein precursor, cd05751 Location: 401 → 493 Ig1_LILRB1_like; First immunoglobulin (Ig)-like domain found in Leukocyte Ig-like receptors (LILR)B1 (also known as LIR-1) and similar proteins, smart00410 Location: 218 → 280 IG_like; Immunoglobulin like, pfam13895 Location: 210 → 301 Ig_2; Immunoglobulin domain and cl11960 Location: 28 → 110 Ig; Immunoglobulin domain.[14]
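The domain locations listed for NP_570602.2 can be turned into lengths and checked for overlap. This is a small sketch, assuming each "start → end" pair is an inclusive residue range; `DOMAINS`, `domain_length`, and `contains` are hypothetical names used only for illustration.

```python
# Domain locations on NP_570602.2 as listed in the text.
# Assumption: each range is an inclusive pair of residue positions.
DOMAINS = {
    "cd05751": (401, 493),
    "smart00410": (218, 280),
    "pfam13895": (210, 301),
    "cl11960": (28, 110),
}


def domain_length(start, end):
    """Number of residues in an inclusive range."""
    return end - start + 1


def contains(outer, inner):
    """True when the inner range lies entirely within the outer range."""
    return outer[0] <= inner[0] and inner[1] <= outer[1]


domain_lengths = {name: domain_length(*rng) for name, rng in DOMAINS.items()}
nested = contains(DOMAINS["pfam13895"], DOMAINS["smart00410"])

for name, n in domain_lengths.items():
    print(f"{name}: {n} residues")
print("smart00410 lies within pfam13895:", nested)
```

On these numbers the smart00410 annotation (218 to 280) falls entirely inside the pfam13895 annotation (210 to 301), so the two describe overlapping parts of the same stretch of the protein rather than separate regions.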

Patients who have pancreatic ductal adenocarcinoma show an overexpression of A1BG in pancreatic juice.[15]

  1. Gene ID: 503538 is A1BG-AS1 A1BG antisense RNA 1.[16] A1BG-AS1 is transcribed in the negative direction from ZSCAN22.[16]

Immunoglobulin supergene family

"𝛂1B-glycoprotein(𝛂1B) [...] consists of a single polypeptide chain N-linked to four glucosamine oligosaccharides. The polypeptide has five intrachain disulfide bonds and contains 474 amino acid residues. [...] 𝛂1B exhibits internal duplication and consists of five repeating structural domains, each containing about 95 amino acids and one disulfide bond. [...] several domains of 𝛂1B, especially the third, show statistically significant homology to variable regions of certain immunoglobulin light and heavy chains. 𝛂1B [...] exhibits sequence similarity to other members of the immunoglobulin supergene family such as the receptor for transepithelial transport of IgA and IgM and the secretory component of human IgA."[17]

"Some of the domains of 𝛂1B show significant homology to variable (V) and constant (C) regions of certain immunoglobulins. Likewise, there is statistically significant homology between 𝛂1B and the secretory component (SC) of human IgA (15) and also with the extracellular portion of the rabbit receptor for transepithelial transport of polymeric immunoglobulins (IgA and IgM). Mostov et al. (16) have called the later protein the poly-Ig receptor or poly-IgR and have shown that it is the precursor of SC."[17]

The immunoglobulin supergene family is "the group of proteins that have immunoglobulin-like domains, including histocompatibility antigens, the T-cell antigen receptor, poly-IgR, and other proteins involved in the vertebrate immune response (17)."[17]

"The internal homology in primary structure [...] and the presence of an intrasegment disulfide bond suggest that 𝛂1B is composed of five structural domains that arose by duplication of a primordial gene coding for about 95 amino acid residues."[17]

"Unlike immunoglobulins (25), ceruloplasmin (6), and hemopexin (7), 𝛂1B is not subject to limited interdomain cleavage by proteolytic enzymes. At least, we were not able to produce such fragments by use of a variety of proteases. This stability of 𝛂1B is probably associated with the frequency of proline in the sequences linking the domains [...]."[17]

"A peptide identified in the late and early milk proteomes showed homology to eutherian alpha 1B glycoprotein (A1BG), a plasma protein with unknown function, as well as venom inhibitors characterised in the Southern opossum Didelphis marsupialis (DM43 and DM46), all members of the immunoglobulin superfamily. To characterise the relationship between the peptide sequence identified in koala, A1BG, DM43 and DM46, a phylogenetic tree was constructed [...] including all marsupial and monotreme homologs (identified by BLAST), three phylogenetically representative eutherian sequences, with human IGSF1 and TARM1, related members of the immunoglobulin super family, used as outgroups. This phylogeny indicates that A1BG-like proteins in marsupials and the Didelphis antitoxic proteins are homologs of eutherian A1BG, with excellent bootstrap support (98%). The marsupial A1BG-like sequences and the Didelphis antitoxic proteins formed a single clade with strong bootstrap support (97%)."[18]

"Human TARM1 and IGSF1, related members of the immunoglobulin superfamily are used as outgroups. The tree was constructed using the maximum likelihood approach and the JTT model with bootstrap support values from 500 bootstrap tests. Bootstrap values less than 50% are not displayed. Accession numbers: Tasmanian devil (Sarcophilus harrisii; XP_012402143), Wallaby (Macropus eugenii; FY619507), Possum (Trichosurus vulpecula; DY596639), Virginia opossum (Didelphis virginiana; AAA30970, AAN06914), Southern opossum (Didelphis marsupialis; AAL82794, P82957, AAN64698), Human (Homo sapiens; P04217, B6A8C7, Q8N6C5), Platypus (Ornithorhynchus anatinus; ENSOANP00000000762), Cow (Bos taurus; Q2KJF1), Alpaca (Vicugna pacos; XP_015107031)."[18]

"The sequences of 𝛂1B-glycoprotein (38) and chicken N-CAM (neural cell-adhesion molecule) (39) have been shown to be related to the immunoglobulin supergene family."[19]

A1BG contains the immunoglobulin domain: cl11960 and three immunoglobulin-like domains: pfam13895, cd05751 and smart00410.

"Immunoglobulin (Ig) domain [cl11960] found in the Ig superfamily. The Ig superfamily is a heterogenous group of proteins, built on a common fold comprised of a sandwich of two beta sheets. Members of this group are components of immunoglobulin, neuroglia, cell surface glycoproteins, such as, T-cell receptors, CD2, CD4, CD8, and membrane glycoproteins, such as, butyrophilin and chondroitin sulfate proteoglycan core protein. A predominant feature of most Ig domains is a disulfide bridge connecting the two beta-sheets with a tryptophan residue packed against the disulfide bond."[20]

"This domain [pfam13895] contains immunoglobulin-like domains."[21]

"Ig1_LILR_KIR_like: [cd05751] domain similar to the first immunoglobulin (Ig)-like domain found in Leukocyte Ig-like receptors (LILRs) and Natural killer inhibitory receptors (KIRs). This group includes LILRB1 (or LIR-1), LILRA5 (or LIR9), an activating natural cytotoxicity receptor NKp46, the immune-type receptor glycoprotein VI (GPVI), and the IgA-specific receptor Fc-alphaRI (or CD89). LILRs are a family of immunoreceptors expressed on T and B cells, on monocytes, dendritic cells, and subgroups of natural killer (NK) cells. The human LILR family contains nine proteins (LILRA1-3, and 5, and LILRB1-5). From functional assays, and as the cytoplasmic domains of various LILRs, for example LILRB1 (LIR-1), LILRB2 (LIR-2), and LILRB3 (LIR-3) contain immunoreceptor tyrosine-based inhibitory motifs (ITIMs) it is thought that LIR proteins are inhibitory receptors. Of the eight LIR family proteins, only LIR-1 (LILRB1), and LIR-2 (LILRB2), show detectable binding to class I MHC molecules; ligands for the other members have yet to be determined. The extracellular portions of the different LIR proteins contain different numbers of Ig-like domains for example, four in the case of LILRB1 (LIR-1), and LILRB2 (LIR-2), and two in the case of LILRB4 (LIR-5). The activating natural cytotoxicity receptor NKp46 is expressed in natural killer cells, and is organized as an extracellular portion having two Ig-like extracellular domains, a transmembrane domain, and a small cytoplasmic portion. GPVI, which also contains two Ig-like domains, participates in the processes of collagen-mediated platelet activation and arterial thrombus formation. Fc-alphaRI is expressed on monocytes, eosinophils, neutrophils and macrophages; it mediates IgA-induced immune effector responses such as phagocytosis, antibody-dependent cell-mediated cytotoxicity and respiratory burst."[22]

"IG domains [smart00410] that cannot be classified into one of IGv1, IGc1, IGc2, IG."[23]

A1BG protein species

Def. a "group of plants or animals having similar appearance"[24] or "the largest group of organisms in which [any][25] two individuals [of the appropriate sexes or mating types][25] can produce fertile offspring, typically by sexual reproduction"[26] is called a species.

The gene contains 20 distinct introns.[27] Transcription produces 15 different mRNAs: 10 alternatively spliced variants and 5 unspliced forms.[27] There are 4 probable alternative promoters, 4 non-overlapping alternative last exons and 7 validated alternative polyadenylation sites.[27] The mRNAs appear to differ by truncation of the 5' end, truncation of the 3' end, presence or absence of 4 cassette exons, overlapping exons with different boundaries, and splicing versus retention of 3 introns.[27]

Variants or isoforms

Def. a "different sequence of a gene (locus)"[28] is called a variant.

Def. any "of several different forms of the same protein, arising from either single nucleotide polymorphisms,[29] differential splicing of mRNA, or post-translational modifications (e.g. sulfation, glycosylation, etc.)"[30] is called an isoform.

Regarding additional isoforms, mention has been made of "new genetic variants of A1BG."[31]

"Proteomic analysis revealed that [a circulating] set of plasma proteins was α 1 B-glycoprotein (A1BG) and its post-translationally modified isoforms."[32]

Pharmacogenomic variants have been reported.[33]

Genotypes

Def. the "part (DNA sequence) of the genetic makeup of an organism which determines a specific characteristic (phenotype) of that organism"[34] or a "group of organisms having the same genetic constitution"[35] is called a genotype.

A1BG genotypes have been reported.[33]

The A1BG variant rs893184 is one component of a genetic risk score.[33]

"A genetic risk score, including rs16982743, rs893184, and rs4525 in F5, was significantly associated with treatment-related adverse cardiovascular outcomes in whites and Hispanics from the INVEST study and in the Nordic Diltiazem study (meta-analysis interaction P=2.39×10−5)."[33]

Polymorphs

Def. the "regular existence of two or more different genotypes within a given species or population; also, variability of amino acid sequences within a gene's protein"[36] is called polymorphism.

Def. "one of a number of alternative forms of the same gene occupying a given position, [or locus],[37] on a chromosome"[38] is called an allele.

"rs893184 causes a histidine (His) to arginine (Arg) [nonsynonymous single nucleotide polymorphism (nsSNP), A (minor) for G (major)] substitution at amino acid position 52 in A1BG."[33]
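The His-to-Arg substitution quoted above can be related to the standard genetic code. The sketch below enumerates which single-base changes turn a histidine codon into an arginine codon. The codon assignments are those of the standard genetic code; the source does not state which codon or strand is involved at position 52, so this only illustrates the general pattern, and `single_base_changes` is a hypothetical helper.

```python
# Codons from the standard genetic code (DNA alphabet, coding strand).
HIS_CODONS = ("CAT", "CAC")
ARG_CODONS = ("CGT", "CGC", "CGA", "CGG", "AGA", "AGG")


def single_base_changes(from_codons, to_codons):
    """List (source, target, position, ref, alt) for codon pairs differing at one base.

    Position is 1-based within the codon.
    """
    changes = []
    for src in from_codons:
        for dst in to_codons:
            diffs = [
                (i, a, b) for i, (a, b) in enumerate(zip(src, dst)) if a != b
            ]
            if len(diffs) == 1:
                pos, ref, alt = diffs[0]
                changes.append((src, dst, pos + 1, ref, alt))
    return changes


his_to_arg = single_base_changes(HIS_CODONS, ARG_CODONS)
for src, dst, pos, ref, alt in his_to_arg:
    print(f"{src} (His) -> {dst} (Arg): {ref}>{alt} at codon position {pos}")
```

Both single-base routes from histidine to arginine are an A-to-G change at the second codon position, which agrees with the A and G alleles named in the quotation.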

"Genetic polymorphism of human plasma (serum) alpha 1B-glycoprotein (alpha 1B) was observed using one-dimensional horizontal polyacrylamide gel electrophoresis (PAGE) pH 9.0 of plasma samples followed by Western blotting with specific antiserum to alpha 1B."[39]

A1B*5 is a "new allele [...] of human plasma 𝜶1B-glycoprotein [...]."[40]

"Genetic polymorphism of human plasma 𝜶1B-glycoprotein (𝜶1B) was reported first, in brief, by Altland et al. [1983; also given in Altland and Hacklar, 1984]. A detailed description of human 𝜶1B polymorphism was reported in subsequent studies [Gahne et al., 1987; Juneja et al., 1988, 1989]. Five different 𝜶1B alleles (A1B*1, A1B*2, A1B*3, A1B*4 and A1B*5) were reported. In Caucasian whites, the frequencies of A1B*1 and A1B*2 were about 0.95 and 0.05, respectively. A1B*4 was observed in 2 related Czech individuals. In American blacks, A1B*1 and A1B*2 occurred with a frequency of 0.73 and 0.21, respectively, while a new allele, viz, A1B*3 had a frequency of 0.06. A1B*5 was observed only in Swedish Lapps and in Finns with a frequency of 0.04 and 0.007, respectively."[41]

"The frequency of A1B*1 varied from 0.89 to 0.91 and that of A1B*2 from 0.08 to 0.10. The A1B*3 allele, reported previously only in American blacks, was observed with a frequency range of 0.003-0.01 in 3 of the Chinese populations, in Koreans and in Malays. A new 𝜶1B allele (A1B*6) was observed in 2 Chinese individuals."[41]

Phenotypes

Def. the "appearance of an organism based on a single trait [multifactorial combination of genetic traits and environmental factors][42], especially used in pedigrees"[43] or any "observable characteristic of an organism, such as its morphological, developmental, biochemical or physiological properties, or its behavior"[44] is called a phenotype.

"The three different phenotypes of α1B observed (designated 1-1, 1-2, and 2-2) were apparently identical to those reported by Altland et al. (1983), who used double one-dimensional electrophoresis. Family data supported the hypothesis that the three α1B phenotypes are determined by two codominant alleles at an autosomal locus, designated A1B. Allele frequencies in a Swedish population were: A1B *1, 0.937; A1B *2, 0.063; PIC, 0.111."[39]
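The PIC (polymorphism information content) value quoted for the Swedish population can be reproduced from the two allele frequencies. The sketch below assumes the commonly used definition of Botstein et al. (1980), PIC = 1 - Σ p_i² - Σ_{i<j} 2 p_i² p_j²; the quoted source does not say which formula was used, so the agreement is a consistency check rather than a statement of the authors' method. Function names are illustrative.

```python
def expected_heterozygosity(freqs):
    """Expected heterozygosity: 1 minus the sum of squared allele frequencies."""
    return 1.0 - sum(p * p for p in freqs)


def pic(freqs):
    """Polymorphism information content (Botstein et al. 1980 definition).

    PIC = 1 - sum(p_i^2) - sum over i<j of 2 * p_i^2 * p_j^2
    """
    correction = sum(
        2.0 * freqs[i] ** 2 * freqs[j] ** 2
        for i in range(len(freqs))
        for j in range(i + 1, len(freqs))
    )
    return expected_heterozygosity(freqs) - correction


# Allele frequencies quoted for the Swedish population: A1B*1 and A1B*2.
swedish = [0.937, 0.063]
het_value = expected_heterozygosity(swedish)
pic_value = pic(swedish)

print(f"expected heterozygosity = {het_value:.3f}")
print(f"PIC = {pic_value:.3f}")
```

With frequencies 0.937 and 0.063 this gives a PIC of about 0.111, matching the value in the quotation.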

Protein species

"Both protein species of [alpha 1-beta glycoprotein] A1B (A1Ba, p = 0.008; f.c.= +1.62, A1Bb, p = 0.003; f.c. = +1.82) [...] were apparently overexpressed in patients with PTCa [...]."[45]

A1BG is mainly produced in the liver and is secreted into plasma at levels of approximately 0.22 mg/mL.[17]

CRISPs

The human cysteine-rich secretory protein (CRISP3) "is present in exocrine secretions and in secretory granules of neutrophilic granulocytes and is believed to play a role in innate immunity."[46] CRISP3 has a relatively high content in human plasma.[46]

"The A1BG-CRISP-3 complex is noncovalent with a 1:1 stoichiometry and is held together by strong electrostatic forces."[46] "Similar [complex formation] between toxins from snake venom and A1BG-like plasma proteins ... inhibits the toxic effect of snake venom metalloproteinases or myotoxins and protects the animal from envenomation."[46]

Opossums have a remarkably robust immune system and show partial or total immunity to the venom of rattlesnakes, cottonmouths (Agkistrodon piscivorus), and other pit vipers (Crotalinae).[47][48]

"Crisp3 [is] mainly [expressed] in the salivary glands, pancreas, and prostate."[49] "CRISP3 is highly expressed in the human cauda epididymidis and ampulla of vas deferens (Udby et al. 2005)."[49]

ZNF497

  1. Gene ID: 503538 is A1BG-AS1 A1BG antisense RNA 1.[16] A1BG-AS1 is transcribed in the negative direction from ZSCAN22.[16]
  2. Gene ID: 162968 is ZNF497 zinc finger protein 497.[50] ZNF497 is transcribed in the positive direction from RNA5SP473.[50]
  3. Gene ID: 100419840 is LOC100419840 zinc finger protein 446 pseudogene.[51] LOC100419840 may be transcribed in the positive direction from LOC105372483.[51]
  4. Gene ID: 105372483 is LOC105372483 uncharacterized LOC105372483 ncRNA.[52] LOC105372483 is transcribed in the negative direction from LOC100419840.[52]
  5. Gene ID: 106479017 is RNA5SP473 RNA, 5S ribosomal pseudogene 473.[53] RNA5SP473 may be transcribed in the negative direction from ZNF497.[53]

19q13.43

Gene ID: 2282 is FKBP1AP1 FKBP prolyl isomerase 1A pseudogene 1.[54]

  1. NR_024162.1 RNA Sequence, ncRNA.[54]

Gene ID: 6795 is AURKC aurora kinase C: "This gene encodes a member of the Aurora subfamily of serine/threonine protein kinases. The encoded protein is a chromosomal passenger protein that forms complexes with Aurora-B and inner centromere proteins and may play a role in organizing microtubules in relation to centrosome/spindle function during mitosis. This gene is overexpressed in several cancer cell lines, suggesting an involvement in oncogenic signal transduction. Alternative splicing results in multiple transcript variants."[55]

  1. NP_001015878.1 aurora kinase C isoform 1: "Transcript Variant: This variant (1) encodes the longest isoform (1)."[55]
  2. NP_001015879.1 aurora kinase C isoform 2: "Transcript Variant: This variant (2), also known as Aurora C-SV, uses an alternate splice site at the 5' end of the first intron and an alternate upstream translation initiation site, compared to variant 1. The encoded protein (isoform 2) contains a shorter and distinct N-terminus, compared to isoform 1."[55]
  3. NP_003151.2 aurora kinase C isoform 3: "Transcript Variant: This variant (3) uses an alternate splice site at the 5' end of the first intron and an alternate downstream translation initiation site, compared to variant 1. The encoded protein (isoform 3) has a shorter N-terminus, compared to isoform 1."[55]

Gene ID: 7554 is ZNF8 zinc finger protein 8.[56]

Gene ID: 9040 is UBE2M ubiquitin conjugating enzyme E2 M: "The modification of proteins with ubiquitin is an important cellular mechanism for targeting abnormal or short-lived proteins for degradation. Ubiquitination involves at least three classes of enzymes: ubiquitin-activating enzymes, or E1s, ubiquitin-conjugating enzymes, or E2s, and ubiquitin-protein ligases, or E3s. This gene encodes a member of the E2 ubiquitin-conjugating enzyme family. The encoded protein is linked with a ubiquitin-like protein, NEDD8, which can be conjugated to cellular proteins, such as Cdc53/culin."[57]

Gene ID: 10172 is ZNF256 zinc finger protein 256.[58]

  1. NP_001362332.1 zinc finger protein 256 isoform 2 variant 2.[58]
  2. NP_005764.2 zinc finger protein 256 isoform 1 variant 1.[58]

Gene ID: 10998 is SLC27A5 solute carrier family 27 member 5: "The protein encoded by this gene is an isozyme of very long-chain acyl-CoA synthetase (VLCS). It is capable of activating very long-chain fatty-acids containing 24- and 26-carbons. It is expressed in liver and associated with endoplasmic reticulum but not with peroxisomes. Its primary role is in fatty acid elongation or complex lipid synthesis rather than in degradation. This gene has a mouse ortholog."[59]

  1. NP_001308125.1 bile acyl-CoA synthetase isoform 2 precursor variant 2.[59]
  2. NP_036386.1 bile acyl-CoA synthetase isoform 1 precursor variant 1.[59]

Gene ID: 25799 is ZNF324 zinc finger protein 324.[60]

Gene ID: 27300 is ZNF544 zinc finger protein 544.[61]

  1. NP_001307696.1 zinc finger protein 544 isoform 1: "Transcript Variant: This variant (2) differs in the 5' UTR compared to variant 1. Variants 1 - 3 encode the same protein (isoform 1)."[61]
  2. NP_001307698.1 zinc finger protein 544 isoform 1: "Transcript Variant: This variant (3) differs in the 5' UTR compared to variant 1. Variants 1 - 3 encode the same protein (isoform 1)."[61]
  3. NP_001307699.1 zinc finger protein 544 isoform 2: "Transcript Variant: This variant (4) differs in the 5' UTR and lacks an alternate in-frame exon in the coding region, compared to variant 1. It encodes isoform 2, which is shorter than isoform 1. Variants 4 - 6 encode the same protein (isoform 2)."[61]
  4. NP_001307700.1 zinc finger protein 544 isoform 2: "Transcript Variant: This variant (5) lacks an alternate in-frame exon in the coding region, compared to variant 1. It encodes isoform 2, which is shorter than isoform 1. Variants 4 - 6 encode the same protein (isoform 2)."[61]
  5. NP_001307702.1 zinc finger protein 544 isoform 2: "Transcript Variant: This variant (6) differs in the 5' UTR and lacks an alternate in-frame exon in the coding region, compared to variant 1. It encodes isoform 2, which is shorter than isoform 1. Variants 4 - 6 encode the same protein (isoform 2)."[61]
  6. NP_001307703.1 zinc finger protein 544 isoform 3: "Transcript Variant: This variant (7) differs in the 5' UTR and uses an alternate in-frame splice site in the 3' terminal exon, compared to variant 1. It encodes isoform 3, which is shorter than isoform 1."[61]
  7. NP_001307705.1 zinc finger protein 544 isoform 4: "Transcript Variant: This variant (8) has multiple differences compared to variant 1. These differences result in distinct 5' and 3' UTRs and cause translation initiation at an alternate start site, compared to variant 1. The encoded protein (isoform 4) is shorter and has distinct N- and C-termini, compared to isoform 1."[61]
  8. NP_001307706.1 zinc finger protein 544 isoform 5: "Transcript Variant: This variant (9) has multiple differences compared to variant 1. These differences result in distinct 5' and 3' UTRs and cause translation initiation at an alternate start site, compared to variant 1. The encoded protein (isoform 5) is shorter and has distinct N- and C-termini, compared to isoform 1."[61]
  9. NP_001307709.1 zinc finger protein 544 isoform 6: "Transcript Variant: This variant (10) contains an alternate 3' terminal exon, compared to variant 1. It encodes a shorter isoform (6) which has a distinct C-terminus, compared to isoform 1."[61]
  10. NP_001307710.1 zinc finger protein 544 isoform 7: "Transcript Variant: This variant (11) has multiple differences compared to variant 1. These differences result in distinct 5' and 3' UTRs and cause translation initiation at an alternate start site, compared to variant 1. The encoded protein (isoform 7) is shorter and has distinct N- and C-termini, compared to isoform 1."[61]
  11. NP_001307711.1 zinc finger protein 544 isoform 8: "Transcript Variant: This variant (12) contains an alternate 3' terminal exon, compared to variant 1. It encodes a shorter isoform (8) which has a distinct C-terminus, compared to isoform 1."[61]
  12. NP_001307712.1 zinc finger protein 544 isoform 9: "Transcript Variant: This variant (13) has multiple differences compared to variant 1. These differences result in distinct 5' and 3' UTRs and cause translation initiation at an alternate start site, compared to variant 1. The encoded protein (isoform 9) is shorter and has distinct N- and C-termini, compared to isoform 1."[61]
  13. NP_001307714.1 zinc finger protein 544 isoform 10: "Transcript Variant: This variant (14) uses alternate splice sites in the penultimate and 3' terminal exons, compared to variant 1. It encodes isoform 10 which is shorter and has a distinct C-terminus, compared to isoform 1. Variants 14 - 16 encode the same protein (isoform 10)."[61]
  14. NP_001307715.1 zinc finger protein 544 isoform 10: "Transcript Variant: This variant (15) differs in the 5' UTR and uses alternate splice sites in the penultimate and 3' terminal exons, compared to variant 1. It encodes isoform 10 which is shorter and has a distinct C-terminus, compared to isoform 1. Variants 14 - 16 encode the same protein (isoform 10)."[61]
  15. NP_001307716.1 zinc finger protein 544 isoform 10: "Transcript Variant: This variant (16) differs in the 5' UTR and uses alternate splice sites in the penultimate and 3' terminal exons, compared to variant 1. It encodes isoform 10 which is shorter and has a distinct C-terminus, compared to isoform 1. Variants 14 - 16 encode the same protein (isoform 10)."[61]
  16. NP_001307717.1 zinc finger protein 544 isoform 11: "Transcript Variant: This variant (17) uses an alternate splice site in the penultimate exon, compared to variant 1. It encodes isoform 11 which is shorter and has a distinct C-terminus, compared to isoform 1. Variants 17 - 20 encode the same protein (isoform 11)."[61]
  17. NP_001307718.1 zinc finger protein 544 isoform 11: "Transcript Variant: This variant (18) differs in the 5' UTR and uses an alternate splice site in the penultimate exon, compared to variant 1. It encodes isoform 11 which is shorter and has a distinct C-terminus, compared to isoform 1. Variants 17 - 20 encode the same protein (isoform 11)."[61]
  18. NP_001307720.1 zinc finger protein 544 isoform 11: "Transcript Variant: This variant (19) differs in the 5' UTR and uses an alternate splice site in the penultimate exon, compared to variant 1. It encodes isoform 11 which is shorter and has a distinct C-terminus, compared to isoform 1. Variants 17 - 20 encode the same protein (isoform 11)."[61]
  19. NP_001307721.1 zinc finger protein 544 isoform 11: "Transcript Variant: This variant (20) differs in the 5' UTR and uses an alternate splice site in the penultimate exon, compared to variant 1. It encodes isoform 11 which is shorter and has a distinct C-terminus, compared to isoform 1. Variants 17 - 20 encode the same protein (isoform 11)."[61]
  20. NP_055295.2 zinc finger protein 544 isoform 1: "Transcript Variant: This variant (1) encodes the longest isoform (1). Variants 1 - 3 encode the same protein (isoform 1)."[61]

Gene ID: 27338 is UBE2S ubiquitin conjugating enzyme E2 S: "This gene encodes a member of the ubiquitin-conjugating enzyme family. The encoded protein is able to form a thiol ester linkage with ubiquitin in a ubiquitin activating enzyme-dependent manner, a characteristic property of ubiquitin carrier proteins."[62]

Gene ID: 54807 is ZNF586 zinc finger protein 586.[63]

  1. NP_001070894.1 zinc finger protein 586 isoform b: "Transcript Variant: This variant (2) lacks an alternate exon that results in a frameshift in the 3' coding region, compared to variant 1. The encoded isoform (b) has a distinct C-terminus and is shorter than isoform a."[63]
  2. NP_001191743.1 zinc finger protein 586 isoform c: "Transcript Variant: This variant (3) differs in its 5' UTR and uses a downstream start codon, compared to variant 1. The encoded isoform (c) is shorter at the N-terminus, compared to isoform a."[63]
  3. NP_060122.2 zinc finger protein 586 isoform a: "Transcript Variant: This variant (1) encodes the longest isoform (a)."[63]

Gene ID: 55663 is ZNF446 zinc finger protein 446.[64]

  1. NP_001291382.1 zinc finger protein 446 isoform 2: "Transcript Variant: This variant (2) has a shorter 5' UTR and differs in the 3' exon structure, compared to variant 1. The encoded isoform (2) has a distinct, shorter C-terminus compared to isoform 1."[64]
  2. NP_060378.1 zinc finger protein 446 isoform 1: "Transcript Variant: This variant (1) represents the longer transcript and encodes the longer isoform (1)."[64]

Gene ID: 57573 is ZNF471 zinc finger protein 471.[64]

  1. NP_001308697.1 zinc finger protein 471 isoform 2, variant 2.[64]
  2. NP_065864.2 zinc finger protein 471 isoform 1, variant 1.[64]

Gene ID: 57663 is USP29 ubiquitin specific peptidase 29.[65]

Gene ID: 63934 is ZNF667 zinc finger protein 667.[66]

  1. NP_001308284.1 zinc finger protein 667 isoform 2, variant 3.[66]
  2. NP_001308285.1 zinc finger protein 667 isoform 1, variant 2.[66]
  3. NP_071386.3 zinc finger protein 667 isoform 1: "Transcript Variant: This variant (1) represents the shorter transcript and encodes the functional protein."[66]

Gene ID: 65982 is ZSCAN18 zinc finger and SCAN domain containing 18.[67]

  1. NP_001139014.1 zinc finger and SCAN domain-containing protein 18 isoform 1: "Transcript Variant: This variant (1) encodes the longest protein (isoform 1)."[67]
  2. NP_001139015.1 zinc finger and SCAN domain-containing protein 18 isoform 3: "Transcript Variant: This variant (4) uses a different segment for its 5' UTR and lacks a coding region segment, which results in the use of a downstream start codon, compared to variant 1. Variants 3 and 4 encode the same protein (isoform 3), which is shorter when it is compared to isoform 1."[67]
  3. NP_001139016.1 zinc finger and SCAN domain-containing protein 18 isoform 2: "Transcript Variant: This variant (2) uses a different segment for its 5' UTR and lacks a coding region segment, which results in the use of a downstream start codon, compared to variant 1. The resulting protein (isoform 2) has a shorter N-terminus when it is compared to isoform 1."[67]
  4. NP_076415.3 zinc finger and SCAN domain-containing protein 18 isoform 3: "Transcript Variant: This variant (3) uses a different segment for its 5' UTR and lacks a coding region segment, which results in the use of a downstream start codon, compared to variant 1. Variants 3 and 4 encode the same protein (isoform 3), which is shorter when it is compared to isoform 1."[67]

Gene ID: 65996 is CENPBD1P1 CENPB DNA-binding domains containing 1 pseudogene 1, non-coding RNA.[68]

Gene ID: 79149 is ZSCAN5A zinc finger and SCAN domain containing 5A.[69]

  1. NP_001308990.1 zinc finger and SCAN domain-containing protein 5A isoform a: "Transcript Variant: This variant (1) and variant 2 both encode isoform a."[69]
  2. NP_001308991.1 zinc finger and SCAN domain-containing protein 5A isoform a: "Transcript Variant: This variant (2) and variant 1 both encode isoform a."[69]
  3. NP_001308993.1 zinc finger and SCAN domain-containing protein 5A isoform b: "Transcript Variant: This variant (4), along with variants 3, 5-8, and 10-11, encodes isoform b."[69]
  4. NP_001308994.1 zinc finger and SCAN domain-containing protein 5A isoform b: "Transcript Variant: This variant (5), along with variants 3-4, 6-8, and 10-11, encodes isoform b."[69]
  5. NP_001308995.1 zinc finger and SCAN domain-containing protein 5A isoform b: "Transcript Variant: This variant (6), along with variants 3-5, 7-8, and 10-11, encodes isoform b."[69]
  6. NP_001308996.1 zinc finger and SCAN domain-containing protein 5A isoform b: "Transcript Variant: This variant (7), along with variants 3-6, 8, and 10-11, encodes isoform b."[69]
  7. NP_001308997.1 zinc finger and SCAN domain-containing protein 5A isoform b: "Transcript Variant: This variant (8), along with variants 3-7 and 10-11, encodes isoform b."[69]
  8. NP_001308999.1 zinc finger and SCAN domain-containing protein 5A isoform b: "Transcript Variant: This variant (10), along with variants 3-8 and 11, encodes isoform b."[69]
  9. NP_001309001.1 zinc finger and SCAN domain-containing protein 5A isoform b: "Transcript Variant: This variant (11), along with variants 3-8 and 10, encodes isoform b."[69]
  10. NP_001309002.1 zinc finger and SCAN domain-containing protein 5A isoform c: "Transcript Variant: This variant (12), along with variants 13-15, encodes isoform c."[69]
  11. NP_001309003.1 zinc finger and SCAN domain-containing protein 5A isoform c: "Transcript Variant: This variant (13), along with variants 12 and 14-15, encodes isoform c."[69]
  12. NP_001309004.1 zinc finger and SCAN domain-containing protein 5A isoform c: "Transcript Variant: This variant (14), along with variants 12-13 and 15, encodes isoform c."[69]
  13. NP_001309005.1 zinc finger and SCAN domain-containing protein 5A isoform c: "Transcript Variant: This variant (15), along with variants 12-14, encodes isoform c."[69]
  14. NP_001309006.1 zinc finger and SCAN domain-containing protein 5A isoform d.[69]
  15. NP_001309007.1 zinc finger and SCAN domain-containing protein 5A isoform e.[69]
  16. NP_077279.1 zinc finger and SCAN domain-containing protein 5A isoform b: "Transcript Variant: This variant (3), along with variants 4-8 and 10-11, encodes isoform b."[69]

Gene ID: 79673 is ZNF329 zinc finger protein 329.[70]

Gene ID: 79744 is ZNF419 zinc finger protein 419.[71]

  1. NP_001091961.1 zinc finger protein 419 isoform 1: "Transcript Variant: This variant (1) encodes the longest isoform (1)."[71]
  2. NP_001091962.1 zinc finger protein 419 isoform 3: "Transcript Variant: This variant (3) lacks an alternate in-frame exon in the 5' coding region, compared to variant 1, resulting in an isoform (3) that is shorter than isoform 1."[71]
  3. NP_001091963.1 zinc finger protein 419 isoform 4: "Transcript Variant: This variant (4) lacks an alternate in-frame exon and uses an alternate in-frame splice site in the 5' coding region, compared to variant 1, resulting in an isoform (4) that is shorter than isoform 1."[71]
  4. NP_001091964.1 zinc finger protein 419 isoform 5: "Transcript Variant: This variant (5) lacks an alternate in-frame exon in the 5' coding region, compared to variant 1, resulting in an isoform (5) that is shorter than isoform 1."[71]
  5. NP_001091965.1 zinc finger protein 419 isoform 6: "Transcript Variant: This variant (6) lacks two alternate in-frame exons in the 5' coding region, compared to variant 1, resulting in an isoform (6) that is shorter than isoform 1."[71]
  6. NP_001091966.1 zinc finger protein 419 isoform 7: "Transcript Variant: This variant (7) lacks two alternate in-frame exons and and uses an alternate in-frame splice site in the 5' coding region, compared to variant 1, resulting in an isoform (7) that is shorter than isoform 1."[71]
  7. NP_001278672.1 zinc finger protein 419 isoform 8: "Transcript Variant: This variant (8) uses an alternate in-frame splice site and lacks an alternate in-frame exon in the 5' coding region, compared to variant 1, resulting in an isoform (8) that is shorter than isoform 1."[71]
  8. NP_001278673.1 zinc finger protein 419 isoform 9 precursor: "Transcript Variant: This variant (9) uses an alternate splice site and it thus differs in its 5' UTR and initiates translation at a downstream in-frame start codon, and it also lacks an in-frame exon in the 5' coding region, compared to variant 1. The encoded isoform (9) is shorter at the N-terminus, compared to isoform 1."[71]
  9. NP_001278674.1 zinc finger protein 419 isoform 10: "Transcript Variant: This variant (10) lacks an alternate in-frame exon in the 5' coding region, and it uses an alternate splice site in its 3' terminal exon and thus differs in the 3' coding region, compared to variant 1. The encoded isoform (10) has a distinct C-terminus and is shorter than isoform 1."[71]
  10. NP_078967.3 zinc finger protein 419 isoform 2: "Transcript Variant: This variant (2) uses an alternate in-frame splice site in the 5' coding region, compared to variant 1, resulting in an isoform (2) that is shorter than isoform 1."[71]

Gene ID: 79818 is ZNF552 zinc finger protein 552.[72]

Gene ID: 79891 is ZNF671 zinc finger protein 671.[73]

  1. NP_001308304.1 zinc finger protein 671 isoform 2.[73]
  2. NP_001308305.1 zinc finger protein 671 isoform 3.[73]
  3. NP_079109.2 zinc finger protein 671 isoform 1.[73]

Gene ID: 84878 is ZBTB45 zinc finger and BTB domain containing 45.[74]

  1. NP_001303907.1 zinc finger and BTB domain-containing protein 45: "Transcript Variant: This variant (1) represents the longest transcript. All six variants encode the same protein."[74]
  2. NP_001303908.1 zinc finger and BTB domain-containing protein 45: "Transcript Variant: This variant (3) differs in the 5' UTR compared to variant 1. All six variants encode the same protein."[74]
  3. NP_001303909.1 zinc finger and BTB domain-containing protein 45: "Transcript Variant: This variant (4) differs in the 5' UTR compared to variant 1. All six variants encode the same protein."[74]
  4. NP_001303910.1 zinc finger and BTB domain-containing protein 45: "Transcript Variant: This variant (5) differs in the 5' UTR compared to variant 1. All six variants encode the same protein."[74]
  5. NP_001303911.1 zinc finger and BTB domain-containing protein 45: "Transcript Variant: This variant (6) differs in the 5' UTR compared to variant 1. All six variants encode the same protein."[74]
  6. NP_116181.1 zinc finger and BTB domain-containing protein 45: "Transcript Variant: This variant (2) differs in the 5' UTR compared to variant 1. All six variants encode the same protein."[74]

Gene ID: 84914 is ZNF587 zinc finger protein 587.[75]

  1. NP_001191746.1 zinc finger protein 587 isoform 2: "Transcript Variant: This variant (2) uses an alternate in-frame splice site in the 5' coding region, compared to variant 1, resulting in an isoform (2) that is 1 aa shorter than isoform 1."[75]
  2. NP_116217.1 zinc finger protein 587 isoform 1: "Transcript Variant: This variant (1) represents the longer transcript and encodes the longer isoform (1)."[75]

Gene ID: 90233 is ZNF551 zinc finger protein 551.[76]

  1. NP_001257867.1 zinc finger protein 551 isoform 2: "Transcript Variant: This variant (2) uses an alternate splice site at the 3' end of the first exon compared to variant 1. The resulting isoform (2) is shorter at the N-terminus compared to isoform 1."[76]
  2. NP_612356.2 zinc finger protein 551 isoform 1: "Transcript Variant: This variant (1) represents the longest transcript and encodes the longer isoform (1)."[76]

Gene ID: 90485 is ZNF835 zinc finger protein 835.[77]

Gene ID: 116412 is ZNF837 zinc finger protein 837.[78]

  1. NP_612475.1 zinc finger protein 837: "Transcript Variant: This variant (2) uses an alternate splice site, compared to variant 1, and is protein-coding."[78] cl26386 Location: 3 → 265 DNA_pol3_gamma3; DNA polymerase III subunits gamma and tau domain III.[78]

Gene ID: 125919 is ZNF543 zinc finger protein 543.[79]

Gene ID: 147670 is SMIM17 small integral membrane protein 17.[80]

Gene ID: 147685 is C19orf18 chromosome 19 open reading frame 18.[81]

  1. NP_689687.1 uncharacterized protein C19orf18 precursor.[81]

Gene ID: 147686 is ZNF418 zinc finger protein 418.[82]

  1. NP_001303956.1 zinc finger protein 418 isoform a: "Transcript Variant: This variant (1) represents the longest transcript and encodes the longest isoform (a)."[82]
  2. NP_001303957.1 zinc finger protein 418 isoform b: "Transcript Variant: This variant (2) differs in the 5' UTR and coding sequence compared to variant 1. The resulting isoform (b) has a shorter and distinct N-terminus compared to isoform a."[82]
  3. NP_001303958.1 zinc finger protein 418 isoform c: "Transcript Variant: This variant (4) differs in the 5' UTR and coding sequence compared to variant 1. The resulting isoform (c) has a shorter and distinct N-terminus compared to isoform a. Variants 3 and 4 both encode the same isoform (c)."[82]
  4. NP_001303959.1 zinc finger protein 418 isoform d: "Transcript Variant: This variant (5) lacks three alternate exons compared to variant 1. The resulting isoform (d) is shorter at the N-terminus compared to isoform a."[82]
  5. NP_597717.1 zinc finger protein 418 isoform c: "Transcript Variant: This variant (3) lacks an alternate 5' exon and uses an alternate in-frame splice junction compared to variant 1. The resulting isoform (c) has a shorter and distinct N-terminus compared to isoform a. Variants 3 and 4 both encode the same isoform (c)."[82]

Gene ID: 147687 is ZNF417 zinc finger protein 417.[83]

  1. NP_001284663.1 zinc finger protein 417 isoform 2: "Transcript Variant: This variant (2) uses an alternate in-frame splice site in the 5' coding region, compared to variant 1. It encodes isoform 2, which is shorter by one amino acid, compared to isoform 1."[83]
  2. NP_689688.2 zinc finger protein 417 isoform 1: "Transcript Variant: This variant (1) represents the longer transcript and encodes the longer isoform (1)."[83]

Gene ID: 147694 is ZNF548 zinc finger protein 548.[84]

  1. NP_001166244.1 zinc finger protein 548 isoform 1: "Transcript Variant: This variant (1) represents the longer transcript and encodes the longer isoform (1)."[84]
  2. NP_690873.2 zinc finger protein 548 isoform 2: "Transcript Variant: This variant (2) lacks an alternate in-frame exon in the 5' coding region, compared to variant 1. The resulting isoform (2) lacks an internal segment, compared to isoform 1."[84]

Gene ID: 147947 is ZNF542P zinc finger protein 542, pseudogene.[85]

  1. NR_003127.2 RNA Sequence, non-coding RNA: "Transcript Variant: This variant (4) has an alternate first exon, includes an additional internal exon in the 5' region, and uses an alternate splice site in an internal exon in the 3' region, compared to variant 1."[85]
  2. NR_024055.2 RNA Sequence, non-coding RNA: "Transcript Variant: This variant (5) has an alternate first exon, includes an additional internal exon in the 5' region, and lacks an internal exon in the 3' region, compared to variant 1."[85]
  3. NR_024056.2 RNA Sequence, non-coding RNA: "Transcript Variant: This variant (2) uses an alternate splice site in an internal exon in the 3' region, compared to variant 1."[85]
  4. NR_024057.2 RNA Sequence, non-coding RNA: "Transcript Variant: This variant (3) lacks an alternate internal exon in the 3' region, compared to variant 1."[85]
  5. NR_033418.1 RNA Sequence, non-coding RNA: "Transcript Variant: This variant (1) represents the longest transcript."[85]

A boxes

There is one A box on the positive strand in the negative direction (from ZSCAN22 to A1BG): 3'-TGACTCT-5' at 2788.

There is one A box complement on the negative strand in the negative direction: 3'-ACTGAGA-5' at 2788.

There is one A box inverse complement on the negative strand in the positive direction: 3'-AGAGTCA-5' at 2613.

There is one A box inverse on the positive strand in the positive direction: 3'-TCTCAGT-5' at 2613.
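
The four forms listed above (direct, complement, inverse, inverse complement) can be derived mechanically from the direct motif. The following is a minimal sketch, not the procedure used to produce these listings; the function name `motif_forms` is hypothetical, and the strings are kept in the same 3' to 5' reading order that the article uses.

```python
# Hypothetical helper; the article does not say which software produced its listings.
COMPLEMENT = str.maketrans("ACGT", "TGCA")

def motif_forms(motif):
    """Return the direct, complement, inverse and inverse complement forms of a motif."""
    complement = motif.translate(COMPLEMENT)  # swap A<->T and C<->G, same order
    return {
        "direct": motif,
        "complement": complement,
        "inverse": motif[::-1],                # same bases, reversed order
        "inverse complement": complement[::-1],
    }

forms = motif_forms("TGACTCT")  # the A box quoted above
for name, seq in forms.items():
    print(f"{name}: 3'-{seq}-5'")
```

For the A box this reproduces the four sequences quoted in this section: TGACTCT, ACTGAGA, TCTCAGT and AGAGTCA.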

ACGT-containing elements

  1. ACGT elements, negative strand, negative direction: 24, 3'-ACGT-5' at 150, 3'-ACGT-5' at 1030, 3'-ACGT-5' at 1321, 3'-ACGT-5' at 1337, 3'-ACGT-5' at 1345, 3'-ACGT-5' at 1470, 3'-ACGT-5' at 1494, 3'-ACGT-5' at 1535, 3'-ACGT-5' at 1717, 3'-ACGT-5' at 1974, 3'-ACGT-5' at 1998, 3'-ACGT-5' at 2081, 3'-ACGT-5' at 2400, 3'-ACGT-5' at 2424, 3'-ACGT-5' at 2735, 3'-ACGT-5' at 2759, 3'-ACGT-5' at 2863, 3'-ACGT-5' at 3287, 3'-ACGT-5' at 3429, 3'-ACGT-5' at 3771, 3'-ACGT-5' at 4245, 3'-ACGT-5' at 4315, 3'-ACGT-5' at 4330, 3'-ACGT-5' at 4338.
  2. ACGT elements, negative strand, positive direction: 2, 3'-ACGT-5' at 569, 3'-ACGT-5' at 3254.
  3. ACGT elements, positive strand, negative direction: 4, 3'-ACGT-5' at 342, 3'-ACGT-5' at 531, 3'-ACGT-5' at 1772, 3'-ACGT-5' at 4236.
  4. ACGT elements, positive strand, positive direction: 44, 3'-ACGT-5' at 192, 3'-ACGT-5' at 224, 3'-ACGT-5' at 436, 3'-ACGT-5' at 531, 3'-ACGT-5' at 546, 3'-ACGT-5' at 656, 3'-ACGT-5' at 783, 3'-ACGT-5' at 1119, 3'-ACGT-5' at 1218, 3'-ACGT-5' at 1370, 3'-ACGT-5' at 1470, 3'-ACGT-5' at 1505, 3'-ACGT-5' at 1613, 3'-ACGT-5' at 1786, 3'-ACGT-5' at 1820, 3'-ACGT-5' at 1935, 3'-ACGT-5' at 2063, 3'-ACGT-5' at 2204, 3'-ACGT-5' at 2326, 3'-ACGT-5' at 2334, 3'-ACGT-5' at 2350, 3'-ACGT-5' at 2681, 3'-ACGT-5' at 2690, 3'-ACGT-5' at 2719, 3'-ACGT-5' at 2743, 3'-ACGT-5' at 2800, 3'-ACGT-5' at 2857, 3'-ACGT-5' at 2960, 3'-ACGT-5' at 3061, 3'-ACGT-5' at 3070, 3'-ACGT-5' at 3142, 3'-ACGT-5' at 3230, 3'-ACGT-5' at 3268, 3'-ACGT-5' at 3279, 3'-ACGT-5' at 3320, 3'-ACGT-5' at 3341, 3'-ACGT-5' at 3400, 3'-ACGT-5' at 3459, 3'-ACGT-5' at 3464, 3'-ACGT-5' at 3829, 3'-ACGT-5' at 3883, 3'-ACGT-5' at 3960, 3'-ACGT-5' at 4315, 3'-ACGT-5' at 4341.

ACGT-containing elements include these metal responsive elements:

  1. complement, negative strand, negative direction: 6, 3'-ACGTGAG-5' at 1348, 3'-ACGTGAG-5' at 2001, 3'-ACGTGAG-5' at 2427, 3'-ACGTGGG-5' at 2762, 3'-ACGTGAG-5' at 3290, and 3'-ACGTGAG-5' at 4341.
  2. complement, positive strand, negative direction: 6, 3'-ACGTGTG-5' at 549, 3'-ACGTGTG-5' at 1221, 3'-ACGTGAG-5' at 1373, 3'-ACGTGAG-5' at 1473, 3'-ACGTGTG-5' at 2963, 3'-ACGTGGG-5' at 3323.
  3. inverse, negative strand, negative direction: 2, 3'-CTCACGT-5' at 1470, 3'-CACACGT-5' at 2863.
  4. inverse, positive strand, negative direction: 2, 3'-CACACGT-5' at 531, 3'-CTCACGT-5' at 1772.
  5. inverse, positive strand, positive direction: 6, 3'-CGCACGT-5' at 546, 3'-CGCACGT-5' at 1218, 3'-CTCACGT-5' at 1786, 3'-CTCACGT-5' at 2326, 3'-CCCACGT-5' at 2800, 3'-CCCACGT-5' at 3883.

ACGT-containing elements include these cAMP response elements (CRE):

  1. negative strand in the negative direction (from ZSCAN22 to A1BG): 1, 3'-TGACGTCA-5' at 4317.

AGC boxes

An inverse AGC box occurs on the negative strand in the negative direction, 3'-CCGCCGA-5' at 1754 nts from ZSCAN22 toward A1BG, in the distal promoter, with its complement on the positive strand in the negative direction.

Angiotensinogen core promoter elements

  1. AGCE, negative strand, negative direction, looking for 3'-A/C-T-C/T-G-T-G-5': 4, 3'-ATTGTG-5' at 340, 3'-ATCGTG-5' at 2096, 3'-CTTGTG-5' at 3669, 3'-CTCGTG-5' at 3914.
  2. AGCE, negative strand, positive direction, looking for 3'-A/C-T-C/T-G-T-G-5': 2, 3'-ATTGTG-5' at 2679, 3'-CTCGTG-5' at 4376.
  3. AGCE, positive strand, negative direction, looking for 3'-A/C-T-C/T-G-T-G-5': 0.
  4. AGCE, positive strand, positive direction, looking for 3'-A/C-T-C/T-G-T-G-5': 6, 3'-CTCGTG-5' at 855, 3'-CTCGTG-5' at 955, 3'-CTCGTG-5' at 1207, 3'-CTCGTG-5' at 1627, 3'-CTTGTG-5' at 3095, 3'-CTCGTG-5' at 3739.
  5. AGCEc, negative strand, negative direction, looking for 3'-G/T-A-A/G-C-A-C-5': 0.
  6. AGCEc, negative strand, positive direction, looking for 3'-G/T-A-A/G-C-A-C-5': 6, 3'-GAGCAC-5' at 855, 3'-GAGCAC-5' at 955, 3'-GAGCAC-5' at 1207, 3'-GAGCAC-5' at 1627, 3'-GAACAC-5' at 3095, 3'-GAGCAC-5' at 3739.
  7. AGCEc, positive strand, negative direction, looking for 3'-G/T-A-A/G-C-A-C-5': 4, 3'-TAACAC-5' at 340, 3'-TAGCAC-5' at 2096, 3'-GAACAC-5' at 3669, 3'-GAGCAC-5' at 3914.
  8. AGCEc, positive strand, positive direction, looking for 3'-G/T-A-A/G-C-A-C-5': 2, 3'-TAACAC-5' at 2679, 3'-GAGCAC-5' at 4376.
  9. AGCEci, negative strand, negative direction, looking for 3'-C-A-C-A/G-A-G/T-5': 2, 3'-CACGAT-5' at 336, 3'-CACGAG-5' at 4403.
  10. AGCEci, negative strand, positive direction, looking for 3'-C-A-C-A/G-A-G/T-5': 1, 3'-CACGAG-5' at 243.
  11. AGCEci, positive strand, negative direction, looking for 3'-C-A-C-A/G-A-G/T-5': 10, 3'-CACGAG-5' at 435, 3'-CACGAG-5' at 572, 3'-CACGAG-5' at 708, 3'-CACGAG-5' at 1182, 3'-CACAAT-5' at 1721, 3'-CACAAG-5' at 2244, 3'-CACGAG-5' at 3232, 3'-CACAAT-5' at 3515, 3'-CACAAG-5' at 3634, 3'-CACGAG-5' at 4472.
  12. AGCEci, positive strand, positive direction, looking for 3'-C-A-C-A/G-A-G/T-5': 3, 3'-CACAAG-5' at 107, 3'-CACGAG-5' at 2090, 3'-CACGAG-5' at 3152.
  13. AGCEi, negative strand, negative direction, looking for 3'-G-T-G-C/T-T-A/C-5': 10, 3'-GTGCTC-5' at 435, 3'-GTGCTC-5' at 572, 3'-GTGCTC-5' at 708, 3'-GTGCTC-5' at 1182, 3'-GTGTTA-5' at 1721, 3'-GTGTTC-5' at 2244, 3'-GTGCTC-5' at 3232, 3'-GTGTTA-5' at 3515, 3'-GTGTTC-5' at 3634, 3'-GTGCTC-5' at 4472.
  14. AGCEi, negative strand, positive direction, looking for 3'-G-T-G-C/T-T-A/C-5': 3, 3'-GTGTTC-5' at 107, 3'-GTGCTC-5' at 2090, 3'-GTGCTC-5' at 3152.
  15. AGCEi, positive strand, negative direction, looking for 3'-G-T-G-C/T-T-A/C-5': 2, 3'-GTGCTA-5' at 336, 3'-GTGCTC-5' at 4403.
  16. AGCEi, positive strand, positive direction, looking for 3'-G-T-G-C/T-T-A/C-5': 0.
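
The degenerate consensus notation used in this list (for example A/C-T-C/T-G-T-G, where a slash separates the alternatives allowed at one position) maps directly onto a regular expression with character classes. The sketch below is an illustration under stated assumptions: the function names and the toy sequence are hypothetical, positions are reported 1-based, and the consensus is passed without its 3'- and -5' end labels. A lookahead is used so that overlapping hits are not skipped.

```python
import re

def consensus_to_regex(consensus):
    """Turn 'A/C-T-C/T-G-T-G' into the pattern '[AC]T[CT]GTG'."""
    parts = []
    for position in consensus.split("-"):
        bases = position.split("/")
        # A single base stays literal; alternatives become a character class.
        parts.append(bases[0] if len(bases) == 1 else "[" + "".join(bases) + "]")
    return "".join(parts)

def find_hits(consensus, sequence):
    """Return (1-based position, matched text) for every hit, overlaps included."""
    pattern = re.compile("(?=(" + consensus_to_regex(consensus) + "))")
    return [(m.start() + 1, m.group(1)) for m in pattern.finditer(sequence)]

toy_sequence = "GGATTGTGCCCTCGTGAA"  # made-up sequence, not from the A1BG locus
hits = find_hits("A/C-T-C/T-G-T-G", toy_sequence)
print(hits)
```

The same two functions can be reused for the AGCEc, AGCEci and AGCEi consensus strings by passing the corresponding pattern.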

ATA boxes

Core promoters

There is the following inverse ATA box on the negative strand, negative direction: 1, 3'-AAATAA-5' at 4537 inside A1BG as the TSS is at 4460 nts from ZSCAN22.

Proximal promoters

There is the following inverse ATA box on the positive strand, negative direction: 3'-AAATAA-5' at 4221.

There is one inverse and inverse complement between 4050 and 4300 in the positive direction: 3'-AAATAA-5' at 4142, and 3'-TTTATT-5' at 4142.

Distal promoters

There is the following ATA box on the negative strand in the negative direction: 1, 3'-AATAAA-5' at 1726 nts from ZSCAN22.

There are the following ATA boxes on the positive strand in the negative direction: 3, 3'-AATAAA-5' at 3014, 3'-AATAAA-5' at 3335, and 3'-AATAAA-5' at 4072.

There are the following inverse ATA boxes on the positive strand, negative direction: 4, 3'-AAATAA-5' at 3013, 3'-AAATAA-5' at 3334, 3'-AAATAA-5' at 4071, 3'-AAATAA-5' at 4075.

There is the following ATA box on the negative strand in the positive direction: 1, 3'-AATAAA-5' at 3427. It has a complement on the positive strand in the positive direction: 1, 3'-TTATTT-5' at 3427.

There is another inverse complement ATA box on the negative strand in the positive direction in distal promoter: 3'-TTTATT-5' at 2347. It also has an inverse in the distal promoter: 3'-AAATAA-5' at 2347.

B boxes

There appear to be at least two B boxes. One is TGGGCA,[86] where the "mP2 EB fragment used for binding was the 118 nucleotide fragment extending from the Dde I site at position -140 to the Dde I site at position -23 [...]. This fragment contains the GC, E, B, CAAT, and TATA boxes."[86]

  1. negative strand in the negative direction, looking for 3'-TGGGCA-5', 0.
  2. negative strand in the positive direction, looking for 3'-TGGGCA-5', 4, 3'-TGGGCA-5' at 27, 3'-TGGGCA-5' at 1945, 3'-TGGGCA-5' at 2894, 3'-TGGGCA-5' at 4180.
  3. positive strand in the negative direction, looking for 3'-TGGGCA-5', 9, 3'-TGGGCA-5' at 462, 3'-TGGGCA-5' at 902, 3'-TGGGCA-5' at 1114, 3'-TGGGCA-5' at 1359, 3'-TGGGCA-5' at 2438, 3'-TGGGCA-5' at 2773, 3'-TGGGCA-5' at 3301, 3'-TGGGCA-5' at 4040, 3'-TGGGCA-5' at 4191.
  4. positive strand in the positive direction, looking for 3'-TGGGCA-5', 0.
  5. complement, negative strand, negative direction, looking for 3'-ACCCGT-5', 9, 3'-ACCCGT-5' at 462, 3'-ACCCGT-5' at 902, 3'-ACCCGT-5' at 1114, 3'-ACCCGT-5' at 1359, 3'-ACCCGT-5' at 2438, 3'-ACCCGT-5' at 2773, 3'-ACCCGT-5' at 3301, 3'-ACCCGT-5' at 4040, 3'-ACCCGT-5' at 4191.
  6. complement, negative strand, positive direction, looking for 3'-ACCCGT-5', 0.
  7. complement, positive strand, negative direction, looking for 3'-ACCCGT-5', 0.
  8. complement, positive strand, positive direction, looking for 3'-ACCCGT-5', 4, 3'-ACCCGT-5' at 27, 3'-ACCCGT-5' at 1945, 3'-ACCCGT-5' at 2894, 3'-ACCCGT-5' at 4180.
  9. inverse complement, negative strand, negative direction, looking for 3'-TGCCCA-5', 0.
  10. inverse complement, negative strand, positive direction, looking for 3'-TGCCCA-5', 2, 3'-TGCCCA-5' at 3237, 3'-TGCCCA-5' at 3377.
  11. inverse complement, positive strand, negative direction, looking for 3'-TGCCCA-5', 4, 3'-TGCCCA-5' at 1458, 3'-TGCCCA-5' at 3854, 3'-TGCCCA-5' at 3883, 3'-TGCCCA-5' at 4251.
  12. inverse complement, positive strand, positive direction, looking for 3'-TGCCCA-5', 1, 3'-TGCCCA-5' at 3750.
  13. inverse, negative strand, negative direction, looking for 3'-ACGGGT-5', 4, 3'-ACGGGT-5' at 1458, 3'-ACGGGT-5' at 3854, 3'-ACGGGT-5' at 3883, 3'-ACGGGT-5' at 4251.
  14. inverse, negative strand, positive direction, looking for 3'-ACGGGT-5', 1, 3'-ACGGGT-5' at 3750.
  15. inverse, positive strand, negative direction, looking for 3'-ACGGGT-5', 0.
  16. inverse, positive strand, positive direction, looking for 3'-ACGGGT-5', 2, 3'-ACGGGT-5' at 3237, 3'-ACGGGT-5' at 3377.

The other B box is associated with the human transforming growth factor b1 binding sequences.[87] It has the consensus sequence 3'-TGTCTCA-5'; let it be designated B1box.

  1. negative strand in the negative direction, looking for 3'-TGTCTCA-5', 2, 3'-TGTCTCA-5' at 1075, 3'-TGTCTCA-5' at 2445.
  2. negative strand in the positive direction, looking for 3'-TGTCTCA-5', 2, 3'-TGTCTCA-5' at 2174, 3'-TGTCTCA-5' at 2468.
  3. positive strand in the negative direction, looking for 3'-TGTCTCA-5', 5, 3'-TGTCTCA-5' at 923, 3'-TGTCTCA-5' at 1089, 3'-TGTCTCA-5' at 2033, 3'-TGTCTCA-5' at 3323, 3'-TGTCTCA-5' at 4373.
  4. positive strand in the positive direction, looking for 3'-TGTCTCA-5', 0.
  5. complement, negative strand, negative direction, looking for 3'-ACAGAGT-5', 5, 3'-ACAGAGT-5' at 923, 3'-ACAGAGT-5' at 1089, 3'-ACAGAGT-5' at 2033, 3'-ACAGAGT-5' at 3323, 3'-ACAGAGT-5' at 4373.
  6. complement, negative strand, positive direction, looking for 3'-ACAGAGT-5', 0.
  7. complement, positive strand, negative direction, looking for 3'-ACAGAGT-5', 2, 3'-ACAGAGT-5' at 1075, 3'-ACAGAGT-5' at 2445.
  8. complement, positive strand, positive direction, looking for 3'-ACAGAGT-5', 2, 3'-ACAGAGT-5' at 2174, 3'-ACAGAGT-5' at 2468.
  9. inverse complement, negative strand, negative direction, looking for 3'-TGAGACA-5', 3, 3'-TGAGACA-5' at 919, 3'-TGAGACA-5' at 1085, 3'-TGAGACA-5' at 2029.
  10. inverse complement, negative strand, positive direction, looking for 3'-TGAGACA-5', 0.
  11. inverse complement, positive strand, negative direction, looking for 3'-TGAGACA-5', 0.
  12. inverse complement, positive strand, positive direction, looking for 3'-TGAGACA-5', 1, 3'-TGAGACA-5' at 2308.
  13. inverse, negative strand, negative direction, looking for 3'-ACTCTGT-5', 0.
  14. inverse, negative strand, positive direction, looking for 3'-ACTCTGT-5', 1, 3'-ACTCTGT-5' at 2308.
  15. inverse, positive strand, negative direction, looking for 3'-ACTCTGT-5', 3, 3'-ACTCTGT-5' at 919, 3'-ACTCTGT-5' at 1085, 3'-ACTCTGT-5' at 2029.
  16. inverse, positive strand, positive direction, looking for 3'-ACTCTGT-5', 0.
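
In the B box listings above, every hit for a motif on one strand is matched by a hit for its complement at the same position on the other strand (for example, TGGGCA on the positive strand and ACCCGT on the negative strand, both in the negative direction). The sketch below illustrates why this follows from base pairing. The toy strand, the helper name `positions` and the 1-based numbering are assumptions made for the illustration, not details taken from the article.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")

def positions(motif, strand):
    """1-based start positions of every occurrence, overlapping ones included."""
    hits = []
    start = strand.find(motif)
    while start != -1:
        hits.append(start + 1)
        start = strand.find(motif, start + 1)
    return hits

# Made-up strand containing one B1box (TGTCTCA) and one B box (TGGGCA).
one_strand = "CCTGTCTCAGGTGGGCATT"
# The paired strand, written base-by-base in the same reading order.
other_strand = one_strand.translate(COMPLEMENT)

b1_hits = positions("TGTCTCA", one_strand)
b1_complement_hits = positions("ACAGAGT", other_strand)
print(b1_hits, b1_complement_hits)
```

Because the two position lists are always identical, the complement rows can serve as a consistency check on the direct rows.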

B recognition elements

The factor II B recognition element is BREu.

Negative strand in the negative direction there are 3: 3'-CCACGCC-5' at 380, 3'-CCGCGCC-5' at 1762, and 3'-CCACGCC-5' at 2197 in the distal promoter.

Complement, negative strand, negative direction there is 1: 3'-CCTGCGG-5' at 1153.

Inverse complement, positive strand, negative direction there are 4: 3'-GGCGTGG-5' at 1244, 3'-GGCGCGG-5' at 1762, 3'-GGCGTGG-5' at 1897, and 3'-GGCGTGG-5' at 3047.

Negative strand in the positive direction there are 3: 3'-GCACGCC-5', 1302, 3'-GGACGCC-5', 1672, 3'-GGGCGCC-5', 1769.

Positive strand in the positive direction there are 3: 3'-CCACGCC-5', 489, 3'-CGACGCC-5', 1033, 3'-CCACGCC-5', 1764.

Inverse complement, negative strand, positive direction there is 1: 3'-GGCGCCC-5', 1770.

Inverse complement, positive strand, positive direction there are 4: 3'-GGCGCGC-5', 682, 3'-GGCGCCG-5', 1338, 3'-GGCGCCG-5', 1438, 3'-GGCGTGG-5', 2566.

CAAT boxes

There are no CAAT boxes in either promoter.

CAREs

A CARE occurs in the negative direction: 3'-CAACTC-5' at 86, possibly associated with ZSCAN22. Inverse CAREs also occur: 3'-CTCAAC-5' at 1406, 3'-CTCAAC-5' at 2592, 3'-CTCAAC-5' at 2704, 3'-CTCAAC-5' at 3115, and 3'-CTCAAC-5' at 4096.

A CARE occurs in the positive direction: 3'-CAACTC-5' at 3292. Inverse CAREs also occur: 3'-CTCAAC-5' at 1406, 3'-CTCAAC-5' at 1621, and 3'-CTCAAC-5' at 3290.

CArG boxes

There is a more general CArG box, 3'-CATTAAAAGG-5', at 3441 from ZSCAN22, or -1019 nts from the TSS of A1BG in the negative direction on the positive strand in the distal promoter.

A second more general CArG box, 3'-CAAAAAAAAG-5', at 1399 from ZSCAN22, or -3061 nts from the A1BG TSS may be a CArG box for ZSCAN22 in the negative direction on the positive strand in the distal promoter.
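
The TSS-relative coordinates quoted for the two CArG boxes follow from subtracting the A1BG TSS position (4460 nts from ZSCAN22, as stated in the ATA boxes section) from the position counted from ZSCAN22. A minimal sketch of that arithmetic, with a hypothetical function name:

```python
A1BG_TSS = 4460  # nts from ZSCAN22, as given in the ATA boxes section

def relative_to_tss(position, tss=A1BG_TSS):
    """Convert a position counted from ZSCAN22 into an offset from the TSS."""
    return position - tss

print(relative_to_tss(3441))  # first CArG box: -1019
print(relative_to_tss(1399))  # second CArG box: -3061
print(relative_to_tss(4537))  # inverse ATA box: +77, i.e. inside A1BG
```

Negative offsets lie upstream of the TSS, and positive offsets lie inside A1BG.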

C boxes

Proximal promoters

Inverse complement, negative strand, negative direction there is 1: 3'-ACATCA-5', 4124.

There is one C box 3'-ACATCA-5' at 4116 nts in the positive direction.

Distal promoters

There are four C boxes: 3'-AGTAGT-5' at 2888, 3'-AGTAGT-5' at 2944, 3'-AGTAGT-5' at 3418, and 3'-AGTAGT-5' at 3521 on the negative strand in the negative direction, with their complements on the positive strand.

Inverse complement, negative strand, negative direction there are 2: 3'-ACATCA-5', 2340, 3'-ACATCA-5', 2541.

There is one complement C box: 3'-TCATCA-5' at 3251 on the negative strand in the positive direction and its complement on the positive strand.

Inverse, negative strand, positive direction, there is 1: 3'-TGATGA-5', 2144.

Positive strand in the positive direction there is 1: 3'-AGTAGT-5', 3251.

CENP-B boxes

There are no CENP-B boxes in either promoter.

CGCG boxes

Negative strand in the negative direction there are 2: 3'-GCGCGT-5', 161, 3'-CCGCGC-5', 1761, in the distal promoter.

Positive strand in the negative direction there is 1: 3'-GCGCGG-5', 1762, in the distal promoter.

Negative strand in the positive direction there are 8: 3'-GCGCGT-5', 543, 3'-CCGCGC-5', 681, 3'-GCGCGC-5', 683, 3'-ACGCGG-5', 871, 3'-ACGCGG-5', 971, 3'-CCGCGG-5', 1337, 3'-CCGCGG-5', 1437, 3'-CCGCGC-5', 1650, in the distal promoter.

Positive strand in the positive direction there are 22: 3'-CCGCGC-5', 161, 3'-ACGCGG-5', 452, 3'-CCGCGC-5', 542, 3'-GCGCGC-5', 682, 3'-GCGCGT-5', 684, 3'-CCGCGT-5', 876, 3'-CCGCGT-5', 976, 3'-CCGCGT-5', 1046, 3'-ACGCGG-5', 1078, 3'-ACGCGG-5', 1162, 3'-CCGCGC-5', 1214, 3'-ACGCGG-5', 1246, 3'-CCGCGT-5', 1298, 3'-ACGCGT-5', 1314, 3'-ACGCGG-5', 1354, 3'-ACGCGG-5', 1398, 3'-ACGCGT-5', 1414, 3'-ACGCGG-5', 1454, 3'-ACGCGG-5', 1498, 3'-ACGCGT-5', 1523, 3'-CCGCGT-5', 1550, 3'-CCGCGG-5', 1769, in the distal promoter.

CRE boxes

Negative strand in the negative direction there is 1: 3'-TGACGTCA-5', 4317, and its complement in the proximal promoter.

D boxes

There is one D box in the distal promoter: 3'-AGTCTG-5' at 2947 on the negative strand in the negative direction and its complement on the positive strand.

Positive strand in the negative direction there is 1: 3'-AGTCTG-5', 1355.

Inverse complement, positive strand, negative direction there are 2: 3'-CAGACT-5', 15, 3'-CAGACT-5', 1616.

There is one D box in the distal promoter: 3'-AGTCTG-5' at 3923 on the negative strand in the positive direction and its complement on the positive strand.

Inverse complement, negative strand, positive direction there are 2: 3'-CAGACT-5', 1744, 3'-CAGACT-5', 2416.

Inverse complement, positive strand, positive direction there are 3: 3'-CAGACT-5', 2943, 3'-CAGACT-5', 3006, 3'-CAGACT-5', 3924.

Downstream B recognition elements

  1. negative strand in the negative direction, looking for 3'-A/G-T-A/G/T-G/T-G/T-G/T-G/T-5', 59, 3'-ATTTTGT-5' at 68, 3'-ATATGTT-5' at 113, 3'-GTTTTGT-5' at 166, 3'-ATATTTT-5' at 183, 3'-ATATTTT-5' at 222, 3'-GTTTTGG-5' at 259, 3'-ATGTTTT-5' at 485, 3'-GTTTTTT-5' at 487, 3'-ATTGGGG-5' at 616, 3'-ATGTTTT-5' at 637, 3'-GTTTTTT-5' at 639, 3'-ATGTTTT-5' at 771, 3'-GTTTTTT-5' at 773, 3'-GTGTGGT-5' at 883, 3'-GTTTTTT-5' at 928, 3'-GTTTTTT-5' at 1094, 3'-ATGTTTT-5' at 1228, 3'-GTTTTTT-5' at 1230, 3'-GTTTTTG-5' at 1386, 3'-GTTTGTT-5' at 1392, 3'-GTTTTTT-5' at 1396, 3'-GTTGGGT-5' at 1409, 3'-GTTGGGT-5' at 1516, 3'-GTTTGTG-5' at 1540, 3'-ATGTTTT-5' at 1880, 3'-GTTTTTT-5' at 1882, 3'-GTTTTTT-5' at 2038, 3'-ATGTTTT-5' at 2182, 3'-GTTTTTT-5' at 2184, 3'-ATGTTTT-5' at 2307, 3'-GTTTTTT-5' at 2309, 3'-GTGTGGT-5' at 2419, 3'-GTTTGTT-5' at 2484, 3'-GTTTGTT-5' at 2488, 3'-ATATGTT-5' at 2642, 3'-ATGTTTT-5' at 2644, 3'-GTGGGGT-5' at 2764, 3'-GTTGGGT-5' at 2846, 3'-ATATTTG-5' at 2875, 3'-GTAGTTT-5' at 2890, 3'-ATTTTTT-5' at 3026, 3'-GTGGGTT-5' at 3136, 3'-ATTTTTG-5' at 3165, 3'-GTATTTT-5' at 3171, 3'-GTTTTTG-5' at 3328, 3'-ATTTGTT-5' at 3338, 3'-ATTTGGT-5' at 3365, 3'-ATTTGGT-5' at 3484, 3'-GTAGTTG-5' at 3523, 3'-ATGGTGG-5' at 3740, 3'-GTGTTTT-5' at 3767, 3'-ATGTTTT-5' at 4066, 3'-GTTTTTT-5' at 4068, 3'-GTTGTGT-5' at 4196, 3'-ATGTTTT-5' at 4216, 3'-GTTTTTT-5' at 4218, 3'-GTTTTTT-5' at 4378, 3'-GTGGGGT-5' at 4446, 3'-GTAGGTG-5' at 4458 and their complements.
  2. negative strand in the positive direction, looking for 3'-A/G-T-A/G/T-G/T-G/T-G/T-G/T-5', 11, 3'-GTGGGGG-5' at 56, 3'-ATTTTTT-5' at 2451, 3'-GTGTTGG-5' at 2816, 3'-ATGTTTG-5' at 3339, 3'-GTGGTGG-5' at 3816, 3'-GTGTGGT-5' at 3967, 3'-GTGGTGT-5' at 3969, 3'-GTGGTTT-5' at 4108, 3'-ATTGTTG-5' at 4173, 3'-ATGGGGG-5' at 4225, 3'-GTGGGGT-5' at 4397 and their complements.
  3. positive strand in the negative direction, looking for 3'-A/G-T-A/G/T-G/T-G/T-G/T-G/T-5', 31, 3'-ATATGTT-5' at 43, 3'-ATATGGG-5' at 78, 3'-ATGGGGT-5' at 204, 3'-ATGTTTT-5' at 215, 3'-ATATGGT-5' at 606, 3'-ATGGTGT-5' at 608, 3'-ATGTGGT-5' at 788, 3'-GTGGTGG-5' at 790, 3'-GTGGTGT-5' at 793, 3'-ATTGGGT-5' at 1047, 3'-GTGGGTG-5' at 1163, 3'-GTGGTGG-5' at 1247, 3'-GTGGTGT-5' at 1477, 3'-GTGGTGG-5' at 1900, 3'-GTGGTGG-5' at 1903, 3'-GTGGGTG-5' at 2332, 3'-GTGTGGT-5' at 2659, 3'-GTGGTGG-5' at 2661, 3'-ATATTTT-5' at 2853, 3'-GTGGTGG-5' at 3050, 3'-GTGTGGT-5' at 3187, 3'-GTGGTGG-5' at 3189, 3'-GTGGTGG-5' at 3192, 3'-GTGGGTG-5' at 3195, 3'-ATTGGTT-5' at 3531, 3'-GTGGTTG-5' at 3605, 3'-ATGGGGT-5' at 3802, 3'-ATGTGGT-5' at 3811, 3'-GTGTTGG-5' at 3942, 3'-GTTGGTT-5' at 3944, 3'-ATGGTGG-5' at 4110 and their complements.
  4. positive strand in the positive direction, looking for 3'-A/G-T-A/G/T-G/T-G/T-G/T-G/T-5', 19, 3'-GTGGGTG-5' at 72, 3'-GTAGGTG-5' at 631, 3'-GTAGGTG-5' at 700, 3'-GTGGTGG-5' at 704, 3'-ATGGGGT-5' at 1891, 3'-GTTGGGT-5' at 2015, 3'-GTGGGGG-5' at 2020, 3'-GTTGGTG-5' at 2122, 3'-ATATGGT-5' at 2591, 3'-ATGGTGT-5' at 2600, 3'-GTGTGGT-5' at 2603, 3'-ATGGTGG-5' at 2759, 3'-GTGTGGG-5' at 2965, 3'-ATAGGGT-5' at 3386, 3'-GTAGGGT-5' at 3631, 3'-GTGTGGT-5' at 3825, 3'-GTTTGTG-5' at 4257, 3'-GTGGGGT-5' at 4286, 3'-GTGGGGT-5' at 4328 and their complements.
  5. inverse, negative strand, negative direction, is SuccessablesdBREi--.bas, looking for 3'-G/T-G/T-G/T-G/T-A/G/T-T-A/G-5', 44, 3'-TTTGTTA-5' at 230, 3'-TTTTGTA-5' at 361, 3'-TTTTTTA-5' at 488, 3'-TTTTATG-5' at 633, 3'-TTTTATG-5' at 767, 3'-TGTGGTA-5' at 884, 3'-GGTTGTA-5' at 1205, 3'-TTTTTTA-5' at 1231, 3'-GTTTTTG-5' at 1386, 3'-GTTTGTG-5' at 1540, 3'-TTTTATG-5' at 1564, 3'-TTGTTTG-5' at 1587, 3'-TTTTATA-5' at 1740, 3'-TGGGGTA-5' at 1861, 3'-TTTTATG-5' at 1876, 3'-TTTTTTA-5' at 2061, 3'-GGTTGTA-5' at 2150, 3'-TTTTTTA-5' at 2185, 3'-TGGGGTA-5' at 2288, 3'-TTTTATG-5' at 2303, 3'-TGTGGTG-5' at 2420, 3'-TTGTTTG-5' at 2486, 3'-TTGTTTG-5' at 2511, 3'-GGTTGTG-5' at 2549, 3'-GGTTGTA-5' at 2612, 3'-GTTTTTA-5' at 2646, 3'-TTTGTTG-5' at 2843, 3'-TTTTATA-5' at 2869, 3'-TTTTTTA-5' at 2930, 3'-TTTGGTG-5' at 2972, 3'-TTTTTTG-5' at 3027, 3'-TGGGTTG-5' at 3137, 3'-TGGGGTA-5' at 3152, 3'-TTTTGTA-5' at 3167, 3'-GTTTTTG-5' at 3328, 3'-TTTGGTG-5' at 3366, 3'-TTTTGTG-5' at 3512, 3'-GTTGATA-5' at 3526, 3'-TGTTTTA-5' at 3768, 3'-GGGTATG-5' at 3857, 3'-GGTTGTG-5' at 3981, 3'-TTTTTTA-5' at 4069, 3'-TTTTTTA-5' at 4219, 3'-TTGGGTA-5' at 4454 and their complements.
  6. inverse, negative strand, positive direction, is SuccessablesdBREi-+.bas, looking for 3'-G/T-G/T-G/T-G/T-A/G/T-T-A/G-5', 16, 3'-GGGGATG-5' at 59, 3'-TGTTTTA-5' at 148, 3'-TTGGGTG-5' at 1802, 3'-TTTTTTG-5' at 2282, 3'-TGGGATG-5' at 2409, 3'-TTTTTTG-5' at 2452, 3'-GGGGATA-5' at 2659, 3'-GGTTTTG-5' at 2688, 3'-GTGGATG-5' at 2714, 3'-GGTGTTG-5' at 2815, 3'-GGTTATG-5' at 3026, 3'-TGTGGTG-5' at 3644, 3'-TTTGGTG-5' at 3949, 3'-TGTGGTG-5' at 3968, 3'-GGTTTTA-5' at 4110, 3'-TGGGGTG-5' at 4398 and their complements.
  7. inverse, positive strand, negative direction, is SuccessablesdBREi+-.bas, looking for 3'-G/T-G/T-G/T-G/T-A/G/T-T-A/G-5', 16, 3'-GTTTTTA-5' at 217, 3'-TGGTGTG-5' at 609, 3'-TGTGGTG-5' at 789, 3'-TTGGGTG-5' at 1048, 3'-GTGGGTG-5' at 1163, 3'-TTTTTTG-5' at 1433, 3'-TGGTGTG-5' at 1478, 3'-GGTGGTG-5' at 1902, 3'-GTGGGTG-5' at 2332, 3'-TGTGGTG-5' at 2660, 3'-GGGTGTG-5' at 3185, 3'-GGTTTTA-5' at 3350, 3'-TTGGTTG-5' at 3532, 3'-GTGGTTG-5' at 3605, 3'-GGTGATG-5' at 3798, 3'-TTGGTTG-5' at 3945 and their complements.
  8. inverse, positive strand, positive direction, is SuccessablesdBREi++.bas, looking for 3'-G/T-G/T-G/T-G/T-A/G/T-T-A/G-5', 14, 3'-GTGGGTG-5' at 72, 3'-GGTGGTG-5' at 703, 3'-TTGGATG-5' at 1283, 3'-TTGGGTG-5' at 2016, 3'-GTTGGTG-5' at 2122, 3'-TGGTGTG-5' at 2601, 3'-TTTGGTG-5' at 2633, 3'-TTGTGTG-5' at 3097, 3'-TGGTTTG-5' at 3176, 3'-TGTGGTA-5' at 3826, 3'-TGGGGTG-5' at 3941, 3'-TGGGGTA-5' at 4220, 3'-GTTTGTG-5' at 4257, 3'-TGGGGTG-5' at 4287 and their complements.
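
The searches above look for a degenerate consensus in which each slot allows one or more bases. The following is a minimal Python sketch of such a search, not the SuccessablesdBRE programs themselves: the function names, the toy sequence, and the convention of reporting the 1-based position of the first nucleotide of each hit are all assumptions made for illustration.

```python
import re

def consensus_to_regex(consensus: str) -> re.Pattern:
    """Convert a slot notation such as 'A/G-T-A/G/T' into a regex.

    A lookahead with a capture group is used so that overlapping
    hits are all reported.
    """
    classes = []
    for slot in consensus.split("-"):
        bases = slot.split("/")
        classes.append(bases[0] if len(bases) == 1 else "[" + "".join(bases) + "]")
    return re.compile("(?=(" + "".join(classes) + "))")

def find_hits(sequence: str, consensus: str) -> list[tuple[int, str]]:
    """Return (1-based start position, matched motif) for every hit."""
    pattern = consensus_to_regex(consensus)
    return [(m.start() + 1, m.group(1)) for m in pattern.finditer(sequence)]

# Consensus as written in the list above (read in the same orientation).
DBRE = "A/G-T-A/G/T-G/T-G/T-G/T-G/T"

# Hypothetical toy sequence, not taken from the A1BG promoters.
toy = "CCATGTTTTTTCC"
hits = find_hits(toy, DBRE)
print(hits)
```

On the toy sequence the two hits overlap and start two nucleotides apart, the same spacing seen for pairs such as 3'-ATGTTTT-5' at 485 and 3'-GTTTTTT-5' at 487 in the first list.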

Downstream core elements

In the negative direction on the negative strand, the A1BG transcription start site (TSS) is 4460 nucleotides from the last nucleotide of the gene ZSCAN22. In the positive direction on the negative strand, the A1BG TSS is 4300 nucleotides from a point well within the gene ZNF497. Downstream core elements are expected downstream of these TSSs. Occurrences before the TSSs can be found on Downstream core element gene transcriptions.

  1. negative strand, negative direction, looking for DCE SI: 3'-CTTC-5', 0.
  2. negative strand, positive direction, looking for DCE SI: 3'-CTTC-5', 0.
  3. positive strand, negative direction, looking for DCE SI: 3'-CTTC-5' at 4528.
  4. positive strand, positive direction, looking for DCE SI: 3'-CTTC-5', 0.
  1. negative strand, negative direction, looking for DCE SII: 3'-CTGT-5', 2, 3'-CTGT-5' at 4468, 3'-CTGT-5' at 4507.
  2. negative strand, positive direction, looking for DCE SII: 3'-CTGT-5', 1, 3'-CTGT-5' at 4392.
  3. positive strand, negative direction, looking for DCE SII: 3'-CTGT-5', 0.
  4. positive strand, positive direction, looking for DCE SII: 3'-CTGT-5', 1, 3'-CTGT-5' at 4332.
  1. negative strand, negative direction, looking for DCE SIII: 3'-AGC-5', 0.
  2. negative strand, positive direction, looking for DCE SIII: 3'-AGC-5', 1, 3'-AGC-5' at 4352.
  3. positive strand, negative direction, looking for DCE SIII: 3'-AGC-5', 3, 3'-AGC-5' at 4480, 3'-AGC-5' at 4489, 3'-AGC-5' at 4520.
  4. positive strand, positive direction, looking for DCE SIII: 3'-AGC-5', 1, 3'-AGC-5' at 4374.
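
Because only occurrences downstream of the TSS are listed here, each hit position can be checked against the TSS for its direction. The short Python sketch below assumes that positions increase toward and past the TSS, so that "downstream" means a position greater than 4460 (negative direction) or 4300 (positive direction); the names are hypothetical, and the position 1200 is an invented upstream example that does not come from the data.

```python
# TSS positions as stated in the text for each direction.
TSS = {"negative": 4460, "positive": 4300}

def downstream_of_tss(positions: list[int], direction: str) -> list[int]:
    """Keep only hit positions that lie past the TSS for this direction."""
    return [p for p in positions if p > TSS[direction]]

# DCE SII hits listed above for the negative strand, negative direction.
sii_negative = downstream_of_tss([4468, 4507], "negative")

# 1200 is a hypothetical upstream position used only to show the filter.
mixed_positive = downstream_of_tss([1200, 4392], "positive")

print(sii_negative, mixed_positive)
```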

Complements

  1. negative strand, negative direction, looking for DCE SIc: 3'-GAAG-5', 1, 3'-GAAG-5' at 4528.
  2. negative strand, positive direction, looking for DCE SIc: 3'-GAAG-5', 0.
  3. positive strand, negative direction, looking for DCE SIc: 3'-GAAG-5', 0.
  4. positive strand, positive direction, looking for DCE SIc: 3'-GAAG-5', 0.
  1. negative strand, negative direction, looking for DCE SIIc: 3'-GACA-5', 0.
  2. negative strand, positive direction, looking for DCE SIIc: 3'-GACA-5', 1, 3'-GACA-5' at 4332.
  3. positive strand, negative direction, looking for DCE SIIc: 3'-GACA-5', 2, 3'-GACA-5' at 4468, 3'-GACA-5' at 4507.
  4. positive strand, positive direction, looking for DCE SIIc: 3'-GACA-5', 1, 3'-GACA-5' at 4392.
  1. negative strand, negative direction, looking for DCE SIIIc: 3'-TCG-5', 3, 3'-TCG-5' at 4480, 3'-TCG-5' at 4489, 3'-TCG-5' at 4520.
  2. negative strand, positive direction, looking for DCE SIIIc: 3'-TCG-5', 1, 3'-TCG-5' at 4374.
  3. positive strand, negative direction, looking for DCE SIIIc: 3'-TCG-5', 0.
  4. positive strand, positive direction, looking for DCE SIIIc: 3'-TCG-5', 1, 3'-TCG-5' at 4352.

Inverse complements

  1. looking for DCE SIci: 3'-GAAG-5', same as the complements.
  1. negative strand, negative direction, looking for DCE SIIci: 3'-ACAG-5', 0.
  2. negative strand, positive direction, looking for DCE SIIci: 3'-ACAG-5', 0.
  3. positive strand, negative direction, looking for DCE SIIci: 3'-ACAG-5', 1, 3'-ACAG-5' at 4517.
  4. positive strand, positive direction, looking for DCE SIIci: 3'-ACAG-5', 1, 3'-ACAG-5' at 4366.
  1. negative strand, negative direction, looking for DCE SIIIci: 3'-GCT-5', 1, 3'-GCT-5' at 4471.
  2. negative strand, positive direction, looking for DCE SIIIci: 3'-GCT-5', 4, 3'-GCT-5' at 4312, 3'-GCT-5' at 4321, 3'-GCT-5' at 4372, 3'-GCT-5' at 4390.
  3. positive strand, negative direction, looking for DCE SIIIci: 3'-GCT-5', 0.
  4. positive strand, positive direction, looking for DCE SIIIci: 3'-GCT-5', 1, 3'-GCT-5' at 4356.

Inverses

  1. looking for DCE SIi: 3'-CTTC-5', same as the direct transcript.
  1. negative strand, negative direction, looking for DCE SIIi: 3'-TGTC-5', 1, 3'-TGTC-5' at 4517.
  2. negative strand, positive direction, looking for DCE SIIi: 3'-TGTC-5', 1, 3'-TGTC-5' at 4366.
  3. positive strand, negative direction, looking for DCE SIIi: 3'-TGTC-5', 0.
  4. positive strand, positive direction, looking for DCE SIIi: 3'-TGTC-5', 0.
  1. negative strand, negative direction, looking for DCE SIIIi: 3'-CGA-5', 0.
  2. negative strand, positive direction, looking for DCE SIIIi: 3'-CGA-5', 1, 3'-CGA-5' at 4356.
  3. positive strand, negative direction, looking for DCE SIIIi: 3'-CGA-5', 1, 3'-CGA-5' at 4471.
  4. positive strand, positive direction, looking for DCE SIIIi: 3'-CGA-5', 4, 3'-CGA-5' at 4312, 3'-CGA-5' at 4321, 3'-CGA-5' at 4372, 3'-CGA-5' at 4390.
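
The complement, inverse, and inverse complement forms searched above can be derived mechanically from the three direct motifs. The Python sketch below is illustrative only (the helper names are assumptions) and reads every motif in the same orientation as it is written in the lists.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")

def complement(motif: str) -> str:
    """Swap each base for its Watson-Crick partner, keeping the order."""
    return motif.translate(COMPLEMENT)

def inverse(motif: str) -> str:
    """Reverse the order of the bases."""
    return motif[::-1]

def inverse_complement(motif: str) -> str:
    """Complement the motif, then reverse it."""
    return complement(motif)[::-1]

DCE = {"SI": "CTTC", "SII": "CTGT", "SIII": "AGC"}

variants = {
    name: {
        "c": complement(motif),
        "i": inverse(motif),
        "ci": inverse_complement(motif),
    }
    for name, motif in DCE.items()
}
print(variants)
```

SI (CTTC) reads the same in both directions, which is why its inverse equals the direct motif and its inverse complement equals its complement (GAAG), as noted in the lists.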

Downstream promoter elements

  1. negative strand in the negative direction (from ZSCAN22 to A1BG) is SuccessablesDPE--.bas, looking for 3'-A/G-G-A/T-C/T-A/C/G-5', 163, 3'-GGTCG-5', 35, 3'-AGATA-5', 234, 3'-GGTCC-5', 262, 3'-GGACA-5', 394, 3'-GGTCG-5', 403, 3'-GGTTC-5', 419, 3'-AGTCC-5', 441, 3'-GGACC-5', 459, 3'-AGATG-5', 481, 3'-GGTCG-5', 504, 3'-GGACC-5', 508, 3'-GGTCG-5', 540, 3'-GGTTC-5', 556, 3'-AGTCC-5', 578, 3'-GGACC-5', 596, 3'-AGATG-5', 624, 3'-GGTCC-5', 648, 3'-GGACA-5', 667, 3'-GGTCG-5', 676, 3'-GGTTC-5', 692, 3'-AGTCC-5', 714, 3'-GGTCG-5', 728, 3'-GGTCG-5', 737, 3'-AGATG-5', 758, 3'-GGACA-5', 801, 3'-GGTCG-5', 810, 3'-GGTCC-5', 850, 3'-GGTTC-5', 874, 3'-GGTCG-5', 895, 3'-GGACC-5', 899, 3'-AGACA-5', 919, 3'-GGTCC-5', 948, 3'-GGACA-5', 967, 3'-GGTCG-5', 976, 3'-AGTCC-5', 984, 3'-GGACC-5', 1015, 3'-GGTCG-5', 1061, 3'-AGACA-5', 1085, 3'-GGACA-5', 1131, 3'-GGTCG-5', 1140, 3'-GGTCG-5', 1194, 3'-GGACC-5', 1198, 3'-GGTTG-5', 1203, 3'-AGATG-5', 1224, 3'-GGACA-5', 1258, 3'-GGTCG-5', 1267, 3'-AGTCC-5', 1275, 3'-GGATC-5', 1306, 3'-GGTCA-5', 1352, 3'-AGACC-5', 1356, 3'-AGTTG-5', 1406, 3'-AGACA-5', 1452, 3'-GGTCC-5', 1460, 3'-AGTCG-5', 1486, 3'-AGTTG-5', 1513, 3'-AGATA-5', 1525, 3'-GGTCA-5', 1532, 3'-GGTCG-5', 1611, 3'-AGACA-5', 1776, 3'-GGTCG-5', 1785, 3'-GGTTC-5', 1817, 3'-GGACC-5', 1841, 3'-AGATG-5', 1867, 3'-GGACA-5', 1911, 3'-GGTCG-5', 1920, 3'-GGACC-5', 1959, 3'-GGTCG-5', 2005, 3'-GGACC-5', 2009, 3'-AGACA-5', 2029, 3'-GGTCC-5', 2077, 3'-GGATC-5', 2093, 3'-AGTCC-5', 2134, 3'-GGTTG-5', 2148, 3'-AGATG-5', 2169, 3'-GGTCA-5', 2211, 3'-AGTCC-5', 2250, 3'-GGTCG-5', 2264, 3'-GGACC-5', 2268, 3'-AGATG-5', 2294, 3'-GGACA-5', 2337, 3'-GGTCG-5', 2346, 3'-GGACC-5', 2385, 3'-GGTCG-5', 2431, 3'-GGACC-5', 2435, 3'-AGTTA-5', 2496, 3'-GGTCC-5', 2519, 3'-GGACA-5', 2538, 3'-GGTTG-5', 2547, 3'-AGTCC-5', 2587, 3'-GGTCA-5', 2601, 3'-GGTTG-5', 2610, 3'-AGTCG-5', 2650, 3'-GGTCA-5', 2654, 3'-GGACA-5', 2672, 3'-GGTCG-5', 2681, 3'-GGACC-5', 2720, 3'-GGTCG-5', 2766, 3'-GGACC-5', 2770, 3'-GGTTA-5', 2848, 3'-AGATG-5', 
2988, 3'-GGATA-5', 2996, 3'-GGACA-5', 3061, 3'-GGTCG-5', 3070, 3'-AGTCC-5', 3110, 3'-GGTCG-5', 3124, 3'-GGACC-5', 3128, 3'-GGTTG-5', 3137, 3'-AGATG-5', 3158, 3'-GGACA-5', 3200, 3'-AGTCG-5', 3204, 3'-GGTCG-5', 3209, 3'-AGTCC-5', 3217, 3'-GGTCC-5', 3249, 3'-GGTTC-5', 3273, 3'-GGTCG-5', 3294, 3'-GGACC-5', 3298, 3'-AGACA-5', 3319, 3'-AGTCC-5', 3396, 3'-AGTTG-5', 3523, 3'-AGACA-5', 3556, 3'-GGTCC-5', 3564, 3'-GGACG-5', 3579, 3'-GGTCC-5', 3585, 3'-GGTCG-5', 3682, 3'-GGTCG-5', 3701, 3'-AGACG-5', 3706, 3'-GGTCG-5', 3731, 3'-GGACC-5', 3744, 3'-AGACC-5', 3835, 3'-AGTTC-5', 3844, 3'-GGACG-5', 3861, 3'-GGTCC-5', 3871, 3'-GGTCC-5', 3885, 3'-GGACC-5', 3906, 3'-GGTCC-5', 3951, 3'-GGACA-5', 3970, 3'-GGTTG-5', 3979, 3'-GGTTC-5', 4019, 3'-AGTTC-5', 4027, 3'-GGTCG-5', 4033, 3'-GGACC-5', 4037, 3'-AGATG-5', 4062, 3'-GGTCC-5', 4102, 3'-GGACA-5', 4121, 3'-GGTCG-5', 4130, 3'-AGTCC-5', 4138, 3'-GGTCC-5', 4170, 3'-AGTTC-5', 4178, 3'-GGACA-5', 4208, 3'-AGATG-5', 4212, 3'-GGTCC-5', 4253, 3'-GGTCG-5', 4261, 3'-GGACC-5', 4300, 3'-GGTCG-5', 4345, 3'-GGACC-5', 4349, 3'-GGACA-5', 4369, 3'-GGTCA-5', 4415, 3'-AGATG-5', 4430, 3'-AGTCC-5', 4436, 3'-GGTCG-5', 4480, 3'-AGTCG-5', 4489, 3'-GGACC-5', 4494, 3'-GGACC-5', 4546, and their complements.
  2. negative strand in the positive direction (from ZNF497 to A1BG) is SuccessablesDPE-+.bas, looking for 3'-A/G-G-A/T-C/T-A/C/G-5', 73, 3'-GGACC-5' at 37, 3'-GGATG-5' at 59, 3'-GGTCA-5' at 153, 3'-AGATG-5' at 166, 3'-AGTCC-5' at 172, 3'-GGACC-5' at 187, 3'-GGTCC-5' at 218, 3'-GGTTC-5' at 305, 3'-GGACG-5' at 323, 3'-GGACG-5' at 359, 3'-AGACG-5' at 398, 3'-GGACG-5' at 410, 3'-AGACC-5' at 440, 3'-AGACA-5' at 712, 3'-AGTCC-5' at 757, 3'-AGATC-5' at 864, 3'-AGATC-5' at 964, 3'-AGTCG-5' at 1528, 3'-GGACG-5' at 1670, 3'-GGTCG-5' at 1687, 3'-GGACA-5' at 1693, 3'-AGTCC-5' at 1826, 3'-AGTCC-5' at 1841, 3'-GGACA-5' at 1869, 3'-GGATG-5' at 1878, 3'-GGTTC-5' at 1926, 3'-AGTTC-5' at 1987, 3'-AGTCC-5' at 2026, 3'-GGTCA-5' at 2035, 3'-AGTCA-5' at 2100, 3'-AGTTA-5' at 2134, 3'-GGTCA-5' at 2220, 3'-AGATC-5' at 2230, 3'-GGATG-5' at 2409, 3'-GGACA-5' at 2460, 3'-AGTCA-5' at 2607, 3'-AGTCA-5' at 2613, 3'-AGTCA-5' at 2618, 3'-GGATA-5' at 2659, 3'-AGTTA-5' at 2666, 3'-GGATG-5' at 2714, 3'-GGATA-5' at 2737, 3'-AGACC-5' at 2861, 3'-GGTTC-5' at 2922, 3'-AGTTC-5' at 2954, 3'-AGTCC-5' at 2998, 3'-GGTTA-5' at 3024, 3'-GGTTG-5' at 3050, 3'-AGTCC-5' at 3084, 3'-GGACA-5' at 3131, 3'-GGACC-5' at 3172, 3'-AGTCG-5' at 3283, 3'-AGTTA-5' at 3381, 3'-AGATG-5' at 3418, 3'-GGATG-5' at 3457, 3'-AGATG-5' at 3475, 3'-GGTTG-5' at 3490, 3'-GGACA-5' at 3530, 3'-GGACC-5' at 3545, 3'-AGACC-5' at 3550, 3'-GGATG-5' at 3574, 3'-GGTCA-5' at 3820, 3'-AGTCC-5' at 3863, 3'-AGACA-5' at 3893, 3'-GGTTC-5' at 4073, 3'-GGATC-5' at 4080, 3'-GGATG-5' at 4099, 3'-AGTTC-5' at 4200, 3'-GGACA-5' at 4252, 3'-GGTCA-5' at 4269, 3'-AGACG-5' at 4319, 3'-AGACA-5' at 4332, 3'-GGTCC-5' at 4420, and their complements.
  3. positive strand in the negative direction is SuccessablesDPE+-.bas, looking for 3'-A/G-G-A/T-C/T-A/C/G-5', 101, 3'-GGACC-5', 32, 3'-AGATA-5', 57, 3'-GGATA-5', 74, 3'-AGTTG-5', 84, 3'-GGATA-5', 98, 3'-GGATA-5', 108, 3'-AGTCG-5', 157, 3'-AGACA-5', 170, 3'-GGTCA-5', 206, 3'-AGATG-5', 244, 3'-AGTTC-5', 253, 3'-AGACA-5', 422, 3'-GGATC-5', 430, 3'-GGTCA-5', 439, 3'-GGATC-5', 525, 3'-AGACA-5', 559, 3'-GGTCA-5', 568, 3'-GGTCA-5', 576, 3'-AGATC-5', 589, 3'-GGATC-5', 703, 3'-GGTCA-5', 712, 3'-AGTTC-5', 719, 3'-AGACC-5', 725, 3'-GGATG-5', 784, 3'-GGTTG-5', 862, 3'-AGATC-5', 877, 3'-AGATC-5', 972, 3'-GGTTG-5', 1028, 3'-GGACG-5', 1151, 3'-GGATC-5', 1167, 3'-AGTTC-5', 1177, 3'-GGTTG-5', 1319, 3'-AGATG-5', 1438, 3'-AGACA-5', 1569, 3'-AGATA-5', 1595, 3'-GGATC-5', 1812, 3'-AGATG-5', 1828, 3'-AGACC-5', 1834, 3'-AGATC-5', 1987, 3'-GGACA-5', 2117, 3'-AGACC-5', 2121, 3'-AGACC-5', 2145, 3'-AGATA-5', 2177, 3'-GGTTG-5', 2234, 3'-GGATC-5', 2239, 3'-GGTCA-5', 2248, 3'-AGACC-5', 2261, 3'-GGACA-5', 2271, 3'-GGTTG-5', 2398, 3'-AGATC-5', 2413, 3'-AGTCC-5', 2543, 3'-GGATC-5', 2574, 3'-GGTCA-5', 2585, 3'-AGTTG-5', 2592, 3'-AGACC-5', 2598, 3'-AGTTG-5', 2704, 3'-AGTTG-5', 2733, 3'-AGACA-5', 2880, 3'-AGATG-5', 2894, 3'-AGATG-5', 2905, 3'-AGACA-5', 2948, 3'-AGATA-5', 2981, 3'-GGATC-5', 3097, 3'-AGTTG-5', 3115, 3'-AGACC-5', 3121, 3'-GGTTG-5', 3261, 3'-AGATC-5', 3276, 3'-GGACA-5', 3389, 3'-AGACA-5', 3433, 3'-AGATA-5', 3465, 3'-AGATC-5', 3488, 3'-GGTTG-5', 3532, 3'-GGTTG-5', 3605, 3'-AGATG-5', 3620, 3'-AGATG-5', 3627, 3'-GGATA-5', 3655, 3'-GGACA-5', 3756, 3'-AGACC-5', 3761, 3'-GGTTG-5', 3804, 3'-GGTCG-5', 3813, 3'-GGACC-5', 3868, 3'-AGATG-5', 3919, 3'-GGTTG-5', 3945, 3'-GGATC-5', 4006, 3'-AGTTC-5', 4024, 3'-AGACC-5', 4030, 3'-AGTTG-5', 4096, 3'-AGTCC-5', 4126, 3'-GGATC-5', 4157, 3'-AGTTC-5', 4175, 3'-AGACA-5', 4181, 3'-AGACC-5', 4204, 3'-AGACG-5', 4235, 3'-GGATC-5', 4288, 3'-GGTCA-5', 4307, 3'-AGACC-5', 4365, 3'-AGTTC-5', 4417, 3'-GGACA-5', 4468, 3'-AGATC-5', 4475, 3'-AGTCC-5', 4500, 3'-AGACA-5', 
4507, and their complements.
  4. positive strand in the positive direction is SuccessablesDPE++.bas, looking for 3'-A/G-G-A/T-C/T-A/C/G-5', 159, 3'-GGTCC-5' at 8, 3'-GGTCC-5' at 33, 3'-GGACC-5' at 40, 3'-AGTCC-5' at 90, 3'-AGACA-5' at 98, 3'-AGACC-5' at 102, 3'-GGACA-5' at 144, 3'-GGTTC-5' at 177, 3'-GGACG-5' at 191, 3'-GGTCC-5' at 215, 3'-AGACG-5' at 223, 3'-AGACC-5' at 270, 3'-GGACC-5' at 286, 3'-GGTCG-5' at 329, 3'-GGTCC-5' at 424, 3'-GGACG-5' at 435, 3'-AGTCG-5' at 511, 3'-GGTCC-5' at 515, 3'-GGACC-5' at 598, 3'-GGTTG-5' at 607, 3'-AGTCG-5' at 613, 3'-GGTCG-5' at 617, 3'-GGTCG-5' at 623, 3'-GGATG-5' at 649, 3'-GGTCC-5' at 707, 3'-GGACG-5' at 807, 3'-AGTCG-5' at 831, 3'-GGTTG-5' at 843, 3'-GGACC-5' at 847, 3'-GGACA-5' at 891, 3'-GGACG-5' at 907, 3'-AGTCG-5' at 931, 3'-GGTTG-5' at 943, 3'-GGACC-5' at 947, 3'-GGACA-5' at 991, 3'-GGACG-5' at 1075, 3'-GGACG-5' at 1118, 3'-GGTCG-5' at 1127, 3'-GGTCC-5' at 1175, 3'-GGATG-5' at 1195, 3'-GGACC-5' at 1199, 3'-GGTCA-5' at 1250, 3'-AGTCG-5' at 1267, 3'-GGTCG-5' at 1271, 3'-GGTTG-5' at 1279, 3'-GGATG-5' at 1283, 3'-GGACG-5' at 1311, 3'-GGTCG-5' at 1357, 3'-GGTCG-5' at 1363, 3'-GGACG-5' at 1369, 3'-AGACC-5' at 1376, 3'-AGACG-5' at 1395, 3'-GGACG-5' at 1411, 3'-GGTCG-5' at 1457, 3'-GGTCG-5' at 1463, 3'-GGACG-5' at 1469, 3'-AGACC-5' at 1476, 3'-AGACG-5' at 1495, 3'-GGATG-5' at 1573, 3'-AGTCG-5' at 1603, 3'-AGTTG-5' at 1621, 3'-AGACG-5' at 1733, 3'-GGACG-5' at 1776, 3'-GGACC-5' at 1815, 3'-GGTCC-5' at 1855, 3'-GGACA-5' at 1860, 3'-AGACC-5' at 1864, 3'-GGTCC-5' at 1893, 3'-AGACC-5' at 1992, 3'-GGTTG-5' at 2012, 3'-GGTCA-5' at 2024, 3'-GGTCG-5' at 2052, 3'-AGTCA-5' at 2060, 3'-AGTCA-5' at 2098, 3'-AGTCG-5' at 2102, 3'-AGTCC-5' at 2115, 3'-AGATC-5' at 2167, 3'-AGACA-5' at 2182, 3'-AGTCG-5' at 2198, 3'-AGTTA-5' at 2233, 3'-GGACA-5' at 2250, 3'-AGACA-5' at 2260, 3'-AGACA-5' at 2308, 3'-GGTCC-5' at 2316, 3'-AGTCC-5' at 2372, 3'-AGTCG-5' at 2390, 3'-GGTTC-5' at 2398, 3'-GGACC-5' at 2433, 3'-GGATC-5' at 2481, 3'-GGACC-5' at 2501, 3'-AGTTC-5' at 2508, 3'-GGACG-5' 
at 2520, 3'-AGTCG-5' at 2526, 3'-GGACC-5' at 2569, 3'-GGTCC-5' at 2574, 3'-GGTTC-5' at 2593, 3'-GGTCA-5' at 2605, 3'-AGTTC-5' at 2615, 3'-AGTCC-5' at 2620, 3'-GGTCC-5' at 2780, 3'-AGACG-5' at 2856, 3'-GGTCC-5' at 2876, 3'-AGACC-5' at 2883, 3'-GGACC-5' at 2891, 3'-GGTTA-5' at 2908, 3'-AGACA-5' at 2925, 3'-AGTCA-5' at 2936, 3'-AGACA-5' at 2957, 3'-AGACG-5' at 2975, 3'-AGACC-5' at 2983, 3'-GGACC-5' at 2988, 3'-GGTCA-5' at 2996, 3'-GGTCC-5' at 3016, 3'-AGACC-5' at 3021, 3'-AGTCC-5' at 3034, 3'-AGTCG-5' at 3041, 3'-GGACC-5' at 3047, 3'-AGACG-5' at 3060, 3'-GGTCA-5' at 3082, 3'-GGTCC-5' at 3111, 3'-AGTCG-5' at 3155, 3'-GGTCG-5' at 3239, 3'-AGATA-5' at 3258, 3'-AGACG-5' at 3267, 3'-AGACG-5' at 3278, 3'-AGTTG-5' at 3290, 3'-GGACC-5' at 3296, 3'-AGACG-5' at 3306, 3'-AGACG-5' at 3358, 3'-GGACC-5' at 3362, 3'-GGTCA-5' at 3379, 3'-AGACC-5' at 3405, 3'-AGTTA-5' at 3424, 3'-GGACA-5' at 3434, 3'-GGACC-5' at 3496, 3'-GGTCC-5' at 3536, 3'-GGACA-5' at 3617, 3'-GGACA-5' at 3622, 3'-GGTTG-5' at 3633, 3'-GGACC-5' at 3679, 3'-GGTCC-5' at 3687, 3'-GGTCG-5' at 3720, 3'-AGTCC-5' at 3728, 3'-GGACC-5' at 3758, 3'-AGTCG-5' at 3775, 3'-GGACC-5' at 3787, 3'-GGTCA-5' at 3841, 3'-AGTCC-5' at 3868, 3'-AGTCG-5' at 3997, 3'-AGTCG-5' at 4023, 3'-GGTCC-5' at 4032, 3'-AGTCG-5' at 4052, 3'-AGATC-5' at 4064, 3'-AGATC-5' at 4076, 3'-GGACG-5' at 4231, 3'-AGTCA-5' at 4271, 3'-GGACC-5' at 4409, 3'-AGACC-5' at 4416, 3'-GGACC-5' at 4424, and their complements.
  5. inverse, negative strand, negative direction, is SuccessablesDPEi--.bas, looking for 3'-A/C/G-C/T-A/T-G-A/G-5', 58, 3'-CCTGG-5', 32, 3'-ACAGA-5', 479, 3'-GTAGG-5', 593, 3'-ATTGG-5', 614, 3'-ACTGG-5', 734, 3'-GCAGA-5', 754, 3'-CTTGG-5', 846, 3'-ACAGA-5', 921, 3'-CTAGG-5', 973, 3'-CTTGG-5', 1012, 3'-ACTGA-5', 1051, 3'-ACAGA-5', 1087, 3'-GCTGG-5', 1191, 3'-ACAGA-5', 1222, 3'-CTTGG-5', 1303, 3'-GTTGG-5', 1407, 3'-CTAGA-5', 1482, 3'-GTTGG-5', 1514, 3'-ATAGG-5', 1529, 3'-GTAGG-5', 1572, 3'-CTTGA-5', 1685, 3'-ATAGA-5', 1731, 3'-GCAGA-5', 1774, 3'-CTAGG-5', 1813, 3'-GTAGG-5', 1838, 3'-GTAGA-5', 1863, 3'-CTTGG-5', 1956, 3'-ACAGA-5', 2031, 3'-ACAGA-5', 2165, 3'-ACTGG-5', 2189, 3'-GTAGA-5', 2290, 3'-CTTGG-5', 2382, 3'-CTTGG-5', 2717, 3'-ACTGA-5', 2786, 3'-GTTGG-5', 2844, 3'-GTTGA-5', 2911, 3'-ACAGA-5', 2986, 3'-GTAGA-5', 3154, 3'-CTTGG-5', 3245, 3'-ACAGA-5', 3321, 3'-CTTGA-5', 3460, 3'-GTTGA-5', 3524, 3'-GTAGA-5', 3551, 3'-CCTGA-5', 3640, 3'-GCAGG-5', 3698, 3'-CCTGA-5', 3747, 3'-CTTGG-5', 3784, 3'-ACAGA-5', 3833, 3'-GTTGA-5', 3849, 3'-CCTGG-5', 3868, 3'-GTAGG-5', 3903, 3'-GTAGA-5', 4058, 3'-ACAGA-5', 4210, 3'-CCTGA-5', 4327, 3'-ACAGA-5', 4371, 3'-CTTGG-5', 4451, 3'-GTAGG-5', 4456, 3'-CTAGG-5', 4476,
  6. inverse, negative strand, positive direction, is SuccessablesDPEi-+.bas, looking for 3'-A/C/G-C/T-A/T-G-A/G-5', 152 , 3'-CCAGG-5' at 8 , 3'-CCAGA-5' at 15 , 3'-ATTGG-5' at 24 , 3'-CCAGG-5' at 33 , 3'-CCTGG-5' at 40 , 3'-ACAGG-5' at 157 , 3'-GCAGG-5' at 194 , 3'-CCAGA-5' at 204 , 3'-CCAGG-5' at 215 , 3'-GCTGG-5' at 277 , 3'-CCTGG-5' at 286 , 3'-GCAGG-5' at 318 , 3'-ACTGG-5' at 347 , 3'-ACAGG-5' at 365 , 3'-GCAGG-5' at 379 , 3'-GCTGG-5' at 386 , 3'-GCAGA-5' at 396 , 3'-GCTGG-5' at 417 , 3'-CCAGG-5' at 424 , 3'-GCAGA-5' at 438 , 3'-CCAGA-5' at 468 , 3'-CCAGG-5' at 515 , 3'-ACAGG-5' at 552 , 3'-CCTGG-5' at 598 , 3'-GCAGG-5' at 658 , 3'-CCAGG-5' at 707 , 3'-CCTGA-5' at 725 , 3'-GCTGG-5' at 779 , 3'-CCAGA-5' at 835 , 3'-CCTGG-5' at 847 , 3'-CCTGA-5' at 859 , 3'-CCAGA-5' at 935 , 3'-CCTGG-5' at 947 , 3'-CCTGA-5' at 959 , 3'-ACTGG-5' at 1140 , 3'-CCAGG-5' at 1175 , 3'-CCTGG-5' at 1199 , 3'-ACTGA-5' at 1286 , 3'-GCAGA-5' at 1316 , 3'-GCAGA-5' at 1416 , 3'-CCAGA-5' at 1631 , 3'-CCTGA-5' at 1660 , 3'-CCTGA-5' at 1676 , 3'-GCTGG-5' at 1736 , 3'-CCAGA-5' at 1742 , 3'-GCTGG-5' at 1779 , 3'-GCAGG-5' at 1788 , 3'-CTTGG-5' at 1799 , 3'-CCTGG-5' at 1815 , 3'-CCAGG-5' at 1855 , 3'-GTAGG-5' at 1875 , 3'-CCAGG-5' at 1893 , 3'-GCAGG-5' at 1905 , 3'-GCAGA-5' at 1937 , 3'-ACTGG-5' at 1953 , 3'-ACAGG-5' at 1966 , 3'-ACAGG-5' at 2125 , 3'-GTTGG-5' at 2185 , 3'-CCTGA-5' at 2211 , 3'-CCAGA-5' at 2228 , 3'-GTAGG-5' at 2255 , 3'-GCAGG-5' at 2296 , 3'-CCAGG-5' at 2316 , 3'-GCTGG-5' at 2320 , 3'-GCTGG-5' at 2405 , 3'-ACAGA-5' at 2414 , 3'-CCTGG-5' at 2433 , 3'-CTAGG-5' at 2482 , 3'-CCTGG-5' at 2501 , 3'-GTTGG-5' at 2541 , 3'-ATAGG-5' at 2550 , 3'-CCTGG-5' at 2569 , 3'-CCAGG-5' at 2574 , 3'-ATAGA-5' at 2627 , 3'-CTAGG-5' at 2639 , 3'-ACAGA-5' at 2652 , 3'-ACTGA-5' at 2674 , 3'-GCAGG-5' at 2683 , 3'-GCAGA-5' at 2721 , 3'-GCTGG-5' at 2734 , 3'-GCAGG-5' at 2745 , 3'-GCTGG-5' at 2770 , 3'-CCAGG-5' at 2780 , 3'-GCTGG-5' at 2810 , 3'-GTTGG-5' at 2816 , 3'-ACAGA-5' at 2837 , 3'-GCAGA-5' at 2859 , 
3'-CCAGG-5' at 2876 , 3'-CCTGG-5' at 2891 , 3'-GCTGA-5' at 2915 , 3'-CCTGA-5' at 2968 , 3'-CCTGG-5' at 2988 , 3'-CCAGG-5' at 3016 , 3'-CCTGG-5' at 3047 , 3'-CCAGA-5' at 3091 , 3'-CCAGG-5' at 3111 , 3'-ACTGG-5' at 3117 , 3'-GCAGG-5' at 3128 , 3'-ACAGA-5' at 3133 , 3'-GCAGG-5' at 3147 , 3'-ACAGA-5' at 3179 , 3'-GCAGA-5' at 3214 , 3'-CCAGA-5' at 3221 , 3'-GCTGG-5' at 3242 , 3'-CCTGG-5' at 3296 , 3'-ACTGG-5' at 3345 , 3'-CCTGG-5' at 3362 , 3'-GTAGA-5' at 3416 , 3'-GCAGG-5' at 3466 , 3'-GCAGA-5' at 3473 , 3'-CTAGG-5' at 3484 , 3'-CCTGG-5' at 3496 , 3'-CTAGG-5' at 3522 , 3'-GCTGG-5' at 3526 , 3'-CCAGG-5' at 3536 , 3'-CCAGA-5' at 3548 , 3'-ACAGG-5' at 3571 , 3'-GCTGA-5' at 3588 , 3'-ACAGG-5' at 3636 , 3'-GCAGG-5' at 3662 , 3'-CCTGG-5' at 3679 , 3'-CCAGG-5' at 3687 , 3'-GCAGG-5' at 3694 , 3'-ACTGA-5' at 3735 , 3'-GTAGG-5' at 3753 , 3'-CCTGG-5' at 3758 , 3'-GCAGG-5' at 3768 , 3'-GCTGA-5' at 3778 , 3'-CCTGG-5' at 3787 , 3'-CCAGA-5' at 3806 , 3'-GCAGA-5' at 3831 , 3'-CCAGA-5' at 3891 , 3'-GCAGA-5' at 3916 , 3'-ACAGG-5' at 3975 , 3'-GCTGG-5' at 3989 , 3'-CTTGA-5' at 4016 , 3'-CCAGG-5' at 4032 , 3'-GTAGA-5' at 4036 , 3'-CTAGA-5' at 4065 , 3'-ACAGG-5' at 4070 , 3'-CTAGG-5' at 4077 , 3'-ACTGA-5' at 4089 , 3'-CTTGA-5' at 4131 , 3'-ATTGA-5' at 4161 , 3'-GCTGG-5' at 4177 , 3'-CCTGA-5' at 4186 , 3'-CCTGA-5' at 4214 , 3'-CTTGG-5' at 4300 , 3'-GCAGA-5' at 4317 , 3'-CCAGA-5' at 4330 , 3'-CCTGG-5' at 4409 , 3'-CCTGG-5' at 4424.
  7. inverse, positive strand, negative direction, is SuccessablesDPEi+-.bas, looking for 3'-A/C/G-C/T-A/T-G-A/G-5', 174, 3'-ACAGA-5', 13, 3'-ACTGA-5', 17, 3'-GTTGA-5', 85, 3'-ATAGA-5', 100, 3'-GTAGG-5', 119, 3'-ACTGA-5', 130, 3'-GCTGA-5', 140, 3'-ACAGA-5', 168, 3'-CCAGG-5', 262, 3'-GTAGA-5', 284, 3'-ACAGA-5', 289, 3'-ACTGA-5', 307, 3'-CTTGG-5', 328, 3'-ATAGA-5', 355, 3'-ACAGG-5', 424, 3'-CCTGG-5', 459, 3'-CCTGG-5', 508, 3'-ACAGG-5', 561, 3'-GCAGG-5', 565, 3'-ATTGA-5', 585, 3'-CCTGG-5', 596, 3'-ATTGG-5', 643, 3'-CCAGG-5', 648, 3'-GCAGG-5', 697, 3'-CCTGA-5', 732, 3'-GCTGG-5', 781, 3'-GCTGA-5', 825, 3'-GCAGG-5', 831, 3'-CTTGA-5', 843, 3'-CCAGG-5', 850, 3'-CCTGG-5', 899, 3'-ACAGA-5', 907, 3'-CCAGG-5', 948, 3'-GTAGA-5', 970, 3'-GCTGA-5', 991, 3'-GCAGG-5', 997, 3'-CTTGA-5', 1009, 3'-CCTGG-5', 1015, 3'-GCAGA-5', 1023, 3'-ATTGG-5', 1045, 3'-ACAGA-5', 1073, 3'-GCTGG-5', 1111, 3'-CCTGA-5', 1173, 3'-CCTGG-5', 1198, 3'-GCTGA-5', 1282, 3'-GCAGG-5', 1288, 3'-CTTGA-5', 1300, 3'-CTAGG-5', 1307, 3'-GCAGA-5', 1314, 3'-CCAGA-5', 1411, 3'-CCAGG-5', 1460, 3'-GCTGG-5', 1464, 3'-CCAGA-5', 1518, 3'-ACAGA-5', 1567, 3'-GCAGA-5', 1614, 3'-CCTGA-5', 1623, 3'-CTTGG-5', 1649, 3'-GTAGA-5', 1653, 3'-CCAGA-5', 1670, 3'-ATAGA-5', 1710, 3'-GCTGG-5', 1746, 3'-GCTGG-5', 1756, 3'-GCTGA-5', 1800, 3'-GCAGG-5', 1823, 3'-CCTGG-5', 1841, 3'-GTTGA-5', 1853, 3'-GCTGG-5', 1891, 3'-CTTGG-5', 1927, 3'-ACTGA-5', 1935, 3'-GCAGG-5', 1941, 3'-CCTGG-5', 1959, 3'-GCAGA-5', 1967, 3'-CCTGG-5', 2009, 3'-ACAGA-5', 2017, 3'-GCTGG-5', 2069, 3'-CCAGG-5', 2077, 3'-GCTGA-5', 2109, 3'-ACAGA-5', 2119, 3'-CTTGA-5', 2127, 3'-GCTGA-5', 2226, 3'-GTTGG-5', 2235, 3'-CCTGG-5', 2268, 3'-GCTGG-5', 2326, 3'-GCTGA-5', 2361, 3'-GCAGG-5', 2367, 3'-CTTGA-5', 2379, 3'-CCTGG-5', 2385, 3'-GCAGG-5', 2389, 3'-CCTGG-5', 2435, 3'-ACAGA-5', 2443, 3'-ACAGG-5', 2514, 3'-CCAGG-5', 2519, 3'-GCTGA-5', 2562, 3'-GCAGG-5', 2568, 3'-CTTGA-5', 2580, 3'-GTTGA-5', 2593, 3'-ACAGG-5', 2689, 3'-GCTGA-5', 2696, 3'-GTTGA-5', 2705, 3'-CTTGA-5', 2714, 3'-CCTGG-5', 
2720, 3'-GCTGA-5', 2744, 3'-CCTGG-5', 2770, 3'-ACAGA-5', 2778, 3'-ACAGA-5', 2878, 3'-ATAGA-5', 2903, 3'-CTTGG-5', 2921, 3'-GCTGG-5', 3035, 3'-GCTGG-5', 3041, 3'-GCTGA-5', 3085, 3'-CTTGA-5', 3103, 3'-GTTGG-5', 3116, 3'-CCTGG-5', 3128, 3'-GCTGG-5', 3180, 3'-GCTGA-5', 3224, 3'-CTTGA-5', 3242, 3'-CCAGG-5', 3249, 3'-GTAGA-5', 3256, 3'-CCTGG-5', 3298, 3'-ATTGA-5', 3358, 3'-CTTGA-5', 3401, 3'-ATAGA-5', 3422, 3'-GCAGA-5', 3431, 3'-ATAGG-5', 3447, 3'-CTAGA-5', 3463, 3'-CCAGA-5', 3486, 3'-GTTGA-5', 3505, 3'-ATTGG-5', 3529, 3'-GTTGA-5', 3533, 3'-ACTGA-5', 3542, 3'-CCAGG-5', 3564, 3'-CTTGA-5', 3571, 3'-CCAGG-5', 3585, 3'-GCAGA-5', 3589, 3'-GTTGG-5', 3606, 3'-GCTGA-5', 3649, 3'-ACAGA-5', 3672, 3'-GCTGG-5', 3719, 3'-CCTGG-5', 3744, 3'-ACTGG-5', 3749, 3'-CCTGA-5', 3781, 3'-CTTGG-5', 3793, 3'-GTTGA-5', 3805, 3'-GTAGA-5', 3820, 3'-GCTGG-5', 3864, 3'-CCAGG-5', 3871, 3'-CCAGG-5', 3885, 3'-CCTGG-5', 3906, 3'-ACAGA-5', 3917, 3'-CCTGA-5', 3932, 3'-GTTGG-5', 3942, 3'-GTTGG-5', 3946, 3'-CCAGG-5', 3951, 3'-GCTGA-5', 3994, 3'-CTTGA-5', 4012, 3'-CCTGG-5', 4037, 3'-ATAGA-5', 4079, 3'-GTTGG-5', 4097, 3'-CCAGG-5', 4102, 3'-GCTGA-5', 4145, 3'-CCAGG-5', 4170, 3'-CTTGG-5', 4188, 3'-CCAGA-5', 4233, 3'-CCAGG-5', 4253, 3'-CTTGG-5', 4268, 3'-GCTGA-5', 4276, 3'-GCAGG-5', 4282, 3'-CTTGA-5', 4294, 3'-CCTGG-5', 4300, 3'-CCTGG-5', 4349, 3'-CCAGA-5', 4448, 3'-CCTGG-5', 4494, 3'-ACAGA-5', 4518, 3'-CCTGG-5', 4546,
  8. inverse, positive strand, positive direction, is SuccessablesDPEi++.bas, looking for 3'-A/C/G-C/T-A/T-G-A/G-5', 95, 3'-GTAGG-5' at 30, 3'-CCTGG-5' at 37, 3'-ACAGG-5' at 82, 3'-ACAGA-5' at 100, 3'-CCTGG-5' at 187, 3'-CCAGG-5' at 218, 3'-ACAGA-5' at 268, 3'-GTTGG-5' at 608, 3'-GTAGG-5' at 629, 3'-GTAGG-5' at 698, 3'-CCTGA-5' at 746, 3'-CCTGA-5' at 814, 3'-GTTGG-5' at 844, 3'-CTAGG-5' at 865, 3'-ACAGG-5' at 893, 3'-GCTGA-5' at 898, 3'-CCTGA-5' at 914, 3'-GTTGG-5' at 944, 3'-CTAGG-5' at 965, 3'-ACAGG-5' at 993, 3'-GCTGA-5' at 998, 3'-GTTGG-5' at 1280, 3'-GCAGA-5' at 1393, 3'-GCAGA-5' at 1493, 3'-GTTGG-5' at 1616, 3'-ACTGG-5' at 1662, 3'-CCAGA-5' at 1711, 3'-ACAGA-5' at 1731, 3'-CTTGG-5' at 1811, 3'-ACAGA-5' at 1862, 3'-GCAGG-5' at 1930, 3'-CTTGA-5' at 1951, 3'-CCAGA-5' at 1958, 3'-GTTGG-5' at 2013, 3'-ACAGA-5' at 2078, 3'-GTAGA-5' at 2111, 3'-GTTGG-5' at 2120, 3'-ACAGA-5' at 2172, 3'-ACTGG-5' at 2213, 3'-CTTGG-5' at 2225, 3'-CCAGA-5' at 2258, 3'-CCTGA-5' at 2271, 3'-GCTGA-5' at 2359, 3'-CTAGG-5' at 2378, 3'-ACAGA-5' at 2466, 3'-CCAGA-5' at 2489, 3'-CTAGG-5' at 2514, 3'-CTTGG-5' at 2579, 3'-CCTGA-5' at 2672, 3'-CTTGG-5' at 2776, 3'-CCTGA-5' at 2820, 3'-GTAGA-5' at 2852, 3'-ACTGG-5' at 2873, 3'-CCAGA-5' at 2941, 3'-ACTGA-5' at 2945, 3'-ACAGA-5' at 3004, 3'-CCAGA-5' at 3019, 3'-ACTGA-5' at 3029, 3'-ACAGA-5' at 3053, 3'-GTAGG-5' at 3108, 3'-CCTGG-5' at 3172, 3'-GCAGG-5' at 3203, 3'-CCAGA-5' at 3245, 3'-GCAGA-5' at 3256, 3'-GTTGA-5' at 3291, 3'-CCAGA-5' at 3299, 3'-GTAGA-5' at 3329, 3'-ATAGG-5' at 3384, 3'-ACAGA-5' at 3392, 3'-GTAGA-5' at 3403, 3'-CCTGG-5' at 3545, 3'-ACAGG-5' at 3577, 3'-CCAGA-5' at 3608, 3'-ACAGG-5' at 3619, 3'-GTAGG-5' at 3629, 3'-ACTGG-5' at 3714, 3'-ATTGA-5' at 3733, 3'-CCAGA-5' at 3771, 3'-ACTGG-5' at 3784, 3'-GCTGA-5' at 3801, 3'-CTTGG-5' at 3838, 3'-CTTGG-5' at 3856, 3'-GTTGG-5' at 3911, 3'-CTTGG-5' at 3937, 3'-ACTGG-5' at 4018, 3'-CTTGA-5' at 4048, 3'-GCAGA-5' at 4056, 3'-CTAGG-5' at 4081, 3'-GTAGG-5' at 4183, 3'-ACTGG-5' at 4216, 3'-GCTGG-5' 
at 4358, 3'-ACAGG-5' at 4367, 3'-CCAGA-5' at 4380, 3'-CCAGA-5' at 4414, 3'-CCAGG-5' at 4420.
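
With lists this long, it can help to confirm that each reported motif actually fits the consensus it was found with. The Python sketch below is one possible check, not part of the SuccessablesDPE programs; the function names are hypothetical, and the inverse consensus is obtained simply by reversing the order of the slots.

```python
def matches(motif: str, consensus: str) -> bool:
    """True if every base of the motif is allowed in the matching slot."""
    slots = [set(slot.split("/")) for slot in consensus.split("-")]
    return len(motif) == len(slots) and all(
        base in allowed for base, allowed in zip(motif, slots)
    )

def inverse_consensus(consensus: str) -> str:
    """Reverse the slot order of a consensus."""
    return "-".join(reversed(consensus.split("-")))

DPE = "A/G-G-A/T-C/T-A/C/G"
DPE_INVERSE = inverse_consensus(DPE)

# A few motifs taken from the lists above.
direct_ok = all(matches(m, DPE) for m in ["GGTCG", "AGATA", "GGACC"])
inverse_ok = all(matches(m, DPE_INVERSE) for m in ["CCTGG", "ACAGA", "GTAGG"])
print(DPE_INVERSE, direct_ok, inverse_ok)
```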

DREB boxes

There are no dehydration-responsive element-binding (DREB) boxes in either promoter.

E2 boxes

Negative strand in the negative direction there are 5: 3'-ACAGATGT-5', 482, 3'-ACAGATGT-5', 1225, 3'-GCAGTTGG-5', 1514, 3'-ACAGATGT-5', 2989, 3'-ACAGATGT-5', 4213, in the distal promoter.

Positive strand in the negative direction there are 2: 3'-GCAGGTGG-5', 2571, 3'-ACAGATGA-5', 3920.

Inverse complement, negative strand, negative direction there is 1: 3'-CCACCTGT-5', 2117.

Inverse complement, positive strand, negative direction there are 4: 3'-CCACCTGT-5', 394, 3'-ACACCTGT-5', 1131, 3'-GCAACTGC-5', 3851, 3'-ACACCTGT-5', 3970.

Negative strand in the positive direction there is 1: 3'-GCAGATGA-5', 37.

EIF4E basal elements

There are no EIF4E (also eIF4E) basal elements (4EBE) in either promoter.

Enhancer boxes

Core promoters

Negative strand in the positive direction there are 2: 3'-CACATG-5', 4364, 3'-CACATG-5', 4370.

Proximal promoters

Positive strand, negative direction there is 1: 3'-CACATG-5' at 4247.

Negative strand, positive direction there are 2: 3'-CACATG-5', 4153, 3'-CACATG-5', 4221.

Distal promoters

Negative strand in the negative direction there are 4: 3'-CACATG-5' at 324, 3'-CACATG-5' at 797, 3'-CACATG-5' at 2213, and 3'-CACATG-5' at 2342.

Positive strand in the negative direction there are 17: 3'-CACATG-5' at 123, 3'-CACATG-5' at 200, 3'-CACATG-5' at 952, 3'-CACATG-5' at 1206, 3'-CACATG-5' at 1849, 3'-CACATG-5' at 1952, 3'-CACATG-5' at 2151, 3'-CACATG-5' at 2276, 3'-CACATG-5' at 2322, 3'-CACATG-5' at 2533, 3'-CACATG-5' at 2613, 3'-CACATG-5' at 2667, 3'-CACATG-5' at 2751, 3'-CACATG-5' at 2783, 3'-CACATG-5' at 4106, 3'-CACATG-5' at 4116.

Negative strand in the positive direction there are 17: 3'-CACATG-5', 1186, 3'-CACATG-5', 1238, 3'-CACATG-5', 1871, 3'-CACATG-5', 1933, 3'-CACATG-5', 2031, 3'-CACATG-5', 2140, 3'-CACATG-5', 2153, 3'-CACATG-5', 2266, 3'-CACATG-5', 2473, 3'-CACATG-5', 3140, 3'-CACATG-5', 3335, 3'-CACATG-5', 3580, 3'-CACATG-5', 3707, 3'-CACATG-5', 3742, 3'-CACATG-5', 3827, 3'-CACATG-5', 3900, 3'-CACATG-5', 3956.

Positive strand in the positive direction there are 4: 3'-CACATG-5', 126, 3'-CACATG-5', 565, 3'-CACATG-5', 2596, 3'-CACATG-5', 3114.

F boxes

GAAC elements

  1. negative strand in the negative direction, looking for 3'-GAACT-5', 13, 3'-GAACT-5', 843, 3'-GAACT-5', 1009, 3'-GAACT-5', 1300, 3'-GAACT-5', 2127, 3'-GAACT-5', 2379, 3'-GAACT-5', 2580, 3'-GAACT-5', 2714, 3'-GAACT-5', 3103, 3'-GAACT-5', 3242, 3'-GAACT-5', 3401, 3'-GAACT-5', 3571, 3'-GAACT-5', 4012, 3'-GAACT-5', 4294,
  2. negative strand in the positive direction, looking for 3'-GAACT-5', 1, 3'-GAACT-5', 609,
  3. positive strand in the negative direction, looking for 3'-GAACT-5', 2, 3'-GAACT-5', 1685, 3'-GAACT-5', 3460,
  4. positive strand in the positive direction, looking for 3'-GAACT-5', 2, 3'-GAACT-5', 577, 3'-GAACT-5', 692,
  5. complement, negative strand, negative direction, looking for 3'-CTTGA-5', 2, 3'-CTTGA-5', 1685, 3'-CTTGA-5', 3460,
  6. complement, negative strand, positive direction, looking for 3'-CTTGA-5', 2, 3'-CTTGA-5', 577, 3'-CTTGA-5', 692,
  7. complement, positive strand, negative direction, looking for 3'-CTTGA-5', 13, 3'-CTTGA-5', 843, 3'-CTTGA-5', 1009, 3'-CTTGA-5', 1300, 3'-CTTGA-5', 2127, 3'-CTTGA-5', 2379, 3'-CTTGA-5', 2580, 3'-CTTGA-5', 2714, 3'-CTTGA-5', 3103, 3'-CTTGA-5', 3242, 3'-CTTGA-5', 3401, 3'-CTTGA-5', 3571, 3'-CTTGA-5', 4012, 3'-CTTGA-5', 4294,
  8. complement, positive strand, positive direction, looking for 3'-CTTGA-5', 1, 3'-CTTGA-5', 609,
  9. inverse complement, negative strand, negative direction, looking for 3'-AGTTC-5', 3, 3'-AGTTC-5', 3844, 3'-AGTTC-5', 4027, 3'-AGTTC-5', 4178,
  10. inverse complement, negative strand, positive direction, looking for 3'-AGTTC-5', 1, 3'-AGTTC-5', 761,
  11. inverse complement, positive strand, negative direction, looking for 3'-AGTTC-5', 6, 3'-AGTTC-5', 253, 3'-AGTTC-5', 719, 3'-AGTTC-5', 1177, 3'-AGTTC-5', 4024, 3'-AGTTC-5', 4175, 3'-AGTTC-5', 4417,
  12. inverse complement, positive strand, positive direction, looking for 3'-AGTTC-5', 0,
  13. inverse, negative strand, negative direction, looking for 3'-TCAAG-5', 6, 3'-TCAAG-5', 253, 3'-TCAAG-5', 719, 3'-TCAAG-5', 1177, 3'-TCAAG-5', 4024, 3'-TCAAG-5', 4175, 3'-TCAAG-5', 4417,
  14. inverse, negative strand, positive direction, looking for 3'-TCAAG-5', 0,
  15. inverse, positive strand, negative direction, looking for 3'-TCAAG-5', 3, 3'-TCAAG-5', 3844, 3'-TCAAG-5', 4027, 3'-TCAAG-5', 4178,
  16. inverse, positive strand, positive direction, looking for 3'-TCAAG-5', 1, 3'-TCAAG-5', 761.
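
The sixteen searches above are the four orientations of one motif (direct, complement, inverse complement, inverse) applied to each strand and direction. The sketch below derives the four orientations from 3'-GAACT-5'. The helper name `orientations` is hypothetical, and only the four unambiguous bases are assumed.

```python
# Translation table mapping each base to its Watson-Crick partner.
COMPLEMENT = str.maketrans("ACGT", "TGCA")

def orientations(motif: str) -> dict:
    """Return the four orientations of a motif as used in the listings."""
    motif = motif.upper()
    complement = motif.translate(COMPLEMENT)
    return {
        "direct": motif,
        "complement": complement,
        "inverse complement": complement[::-1],  # complement read backwards
        "inverse": motif[::-1],                  # motif read backwards
    }

if __name__ == "__main__":
    for name, seq in orientations("GAACT").items():
        print(f"{name}: 3'-{seq}-5'")
```

For GAACT this gives CTTGA, AGTTC and TCAAG, the three other sequences searched above. Because the two strands are complementary, a direct hit on one strand appears as a complement hit at the same position on the other, which is why items 3 and 5 both list 2, 1685 and 3460.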

GA responsive elements

Only one GARE (an inverse) occurs: between ZSCAN22 and A1BG 3'-AAACAAT-5' at 230 nts and its complement.

GATA boxes

GTGA-box has the consensus sequence GATA.[88]

Proximal promoters

Inverse complement, negative strand, positive direction there is 1: 3'-TTTATCAC-5', 4125.

Distal promoters

Positive strand in the negative direction there are 2: 3'-GGGATAGA-5', 100, 3'-ATGATAGA-5', 355.

Inverse complement, negative strand, negative direction there is 1: 3'-GTTATCAT-5', 2500.

Inverse complement, positive strand, negative direction there is 1: 3'-TTTATCTT-5', 1732.

Inverse complement, negative strand, positive direction there is 1: 3'-GTTATCCC-5', 3385.

Inverse complement, positive strand, positive direction there are 2: 3'-GCTATCAG-5', 1840, 3'-TTTATCTT-5', 2628.

G boxes

There are no G boxes in either promoter.

GC boxes

Positive strand in the negative direction there are 2: 3'-TGGGCGTGGT-5', 1898, 3'-TGGGCGTGGT-5', 3048, in the distal promoter.

Inverse complement, negative strand, negative direction there is 1: 3'-ACTCCGCCCA-5', 3092.

Inverse complement, positive strand, negative direction there is 1: 3'-GCTCCGCCTC-5', 1505.

Negative strand in the positive direction there is 1: 3'-TGGGCGGGAC-5', 409.

Inverse complement, positive strand, positive direction there is 1: 3'-GCCACGCCCC-5', 491.

GCC boxes

The GCC box is the same as the AGC box.

GLM boxes

There are no GCN4-like motif (GLM) boxes in either promoter.

Grainy head transcription factor binding sites

H boxes

Core promoters

Between ZSCAN22 and A1BG: There is one inverse and its complement 3'-AGGAGA-5' at 4428 nts.

Between ZNF497 and A1BG: There is an inverse and its complement 3'-AGGACA-5' at 4252. There are five after the TSS: 3'-AGAGAA-5' at 4387, 3'-AGTACA-5' at 4365, 3'-ACCAGA-5' at 4380, 3'-AAGAGA-5' at 4386, 3'-ACGACA-5' at 4392 and their complements.

Proximal promoters

Between ZSCAN22 and A1BG: There is one H box (3'-ANANNA-5'): negative direction, negative strand, 3'-ACACGA-5' at 4402. On the positive strand in the negative direction there are 16: 3'-ACAAAA-5' at 4216, 3'-AAAAAA-5' at 4218, 3'-AAAATA-5' at 4220, 3'-AAATAA-5' at 4221, 3'-ATAATA-5' at 4223, 3'-AAAAAA-5' at 4378, 3'-AAAAGA-5' at 4380, 3'-AAAGAA-5' at 4381, 3'-AGAAAA-5' at 4383, 3'-AAAAAA-5' at 4385, 3'-AAAAGA-5' at 4387, 3'-AAAGAA-5' at 4388, 3'-AGAAAA-5' at 4390, 3'-AAAAGA-5' at 4392, 3'-AAAGAA-5' at 4393, and 3'-AGAAAA-5' at 4395, with their complements on the negative strand, negative direction.

Between ZNF497 and A1BG: There is one H box (3'-ANANNA-5'): 3'-AGAGAA-5' at 4387 in the proximal promoter, negative strand, positive direction. There are four: 3'-TCATGT-5' at 4365, 3'-TGGTCT-5' at 4380, 3'-TTCTCT-5' at 4386, and 3'-TGCTGT-5' at 4392 and their complements in the positive direction.

Distal promoters

Between ZSCAN22 and A1BG, negative strand, negative direction: 3'-AGAGGA-5' at 3387, 3'-AGAGGA-5' at 3638, and 3'-AGAGGA-5' at 3675. One inverse and its complement 3'-AGGAGA-5' at 3790. There are 14 H boxes: 3'-ACACCA-5' at 788, 3'-ACATCA-5' at 2541, 3'-ACACCA-5' at 2659, 3'-ACATTA-5' at 2675, 3'-ATAAAA-5' at 2853, 3'-AAAGTA-5' at 2886, 3'-ACATTA-5' at 3064, 3'-AGATGA-5' at 3159, 3'-ACACCA-5' at 3187, 3'-AGAAGA-5' at 3554, 3'-AGACGA-5' at 3707, 3'-ACACCA-5' at 3811, 3'-ACATTA-5' at 3973, and 3'-ACATCA-5' at 4124.

On the positive strand, negative direction, there are 127 H boxes: 3'-ACCACA-5' at 608, 3'-ACCACA-5' at 793, 3'-ACACCA-5' at 883, 3'-ACCACA-5', 1477, 3'-ACACCA-5' at 2419, 3'-AAAAAA-5' at 2461, 3'-AAAAAA-5' at 2462, 3'-AAAAAA-5' at 2463, 3'-AAAAAA-5' at 2464, 3'-AAAAAA-5' at 2465, 3'-AAAAAA-5' at 2466, 3'-AAAAAA-5' at 2467, 3'-AAAAAA-5' at 2468, 3'-AAAAAA-5' at 2469, 3'-AAAAAA-5' at 2470, 3'-AAAGCA-5' at 2473, 3'-AAAGCA-5' at 2479, 3'-AAACAA-5' at 2484, 3'-AAACAA-5' at 2488, 3'-ACAAAA-5' at 2490, 3'-ATAGTA-5' at 2500, 3'-AGAAAA-5' at 2506, 3'-AAAACA-5' at 2508, 3'-AAACAA-5' at 2509, 3'-AGACCA-5' at 2599, 3'-ATACAA-5' at 2642, 3'-ACAAAA-5' at 2644, 3'-AAATCA-5' at 2648, 3'-ACAGGA-5' at 2690, 3'-AAATCA-5' at 2749, 3'-AGAGCA-5' at 2781, 3'-AAAAGA-5' at 2798, 3'-AAAGAA-5' at 2799, 3'-AAAGAA-5' at 2803, 3'-AGAAAA-5' at 2805, 3'-AAAAGA-5' at 2807, 3'-AGAGAA-5' at 2810, 3'-AGAAGA-5' at 2812, 3'-AGAAAA-5' at 2815, 3'-AAAAAA-5' at 2817, 3'-AAAAGA-5' at 2819, 3'-AAAGAA-5' at 2820, 3'-AGAAAA-5' at 2822, 3'-AAAAGA-5' at 2824, 3'-AGAGAA-5' at 2827, 3'-AGAAGA-5' at 2829, 3'-AGAAAA-5' at 2832, 3'-AAAAAA-5' at 2834, 3'-AAAAGA-5' at 2836, 3'-AAAGAA-5' at 2837, 3'-AGAAAA-5' at 2839, 3'-AAAACA-5' at 2841, 3'-AAACAA-5' at 2842, 3'-AAAATA-5' at 2868, 3'-ATATAA-5' at 2873, 3'-AAAAAA-5' at 2929, 3'-ACATCA-5' at 2941, 3'-ACATTA-5' at 2951, 3'-AAACCA-5' at 2971, 3'-AAAATA-5' at 3012, 3'-AAATAA-5' at 3013, 3'-AAAAAA-5' at 3026, 3'-AAACTA-5' at 3029, 3'-AGACCA-5' at 3122, 3'-AAAACA-5' at 3166, 3'-ACATAA-5' at 3169, 3'-ATAAAA-5' at 3171, 3'-AAATTA-5' at 3175, 3'-AGATCA-5' at 3277, 3'-ACAAGA-5' at 3307, 3'-AGAGCA-5' at 3310, 3'-AAAACA-5' at 3329, 3'-AAACAA-5' at 3330, 3'-AAATAA-5' at 3334, 3'-AAACAA-5' at 3338, 3'-ACAAGA-5' at 3340, 3'-AGAAAA-5' at 3343, 3'-AAACCA-5' at 3365, 3'-AGAGGA-5' at 3387, 3'-ACATCA-5' at 3394, 3'-AGAGAA-5' at 3406, 3'-ACATCA-5' at 3415, 3'-ACATTA-5' at 3436, 3'-ATATTA-5' at 3454, 3'-ATATTA-5' at 3468, 3'-AAACCA-5' at 3484, 3'-AGATCA-5' at 3489, 3'-AAAACA-5' at 3511, 
3'-ACACAA-5' at 3514, 3'-ATAATA-5' at 3538, 3'-ACAAGA-5' at 3635, 3'-AGAGGA-5' at 3638, 3'-AAAGAA-5' at 3666, 3'-AGAACA-5' at 3668, 3'-AGAGGA-5' at 3675, 3'-ACAAGA-5' at 3759, 3'-AGACCA-5' at 3762, 3'-ACCACA-5' at 3764, 3'-ACAAAA-5' at 3767, 3'-AGAGCA-5' at 3913, 3'-AGATGA-5' at 3920, 3'-AGACCA-5' at 4031, 3'-ACAAAA-5' at 4066, 3'-AAAAAA-5' at 4068, 3'-AAAATA-5' at 4070, 3'-AAATAA-5' at 4071, 3'-AAATAA-5' at 4075, 3'-ATAATA-5' at 4077, 3'-ATAGAA-5' at 4080, 3'-AAAGAA-5' at 4084, 3'-AGAAAA-5' at 4086, 3'-AGACAA-5' at 4182, 3'-ACAAAA-5' at 4216, 3'-AAAAAA-5' at 4218, 3'-AAAATA-5' at 4220, 3'-AAATAA-5' at 4221, 3'-ATAATA-5' at 4223, 3'-AAAAAA-5' at 4378, 3'-AAAAGA-5' at 4380, 3'-AAAGAA-5' at 4381, 3'-AGAAAA-5' at 4383, 3'-AAAAAA-5' at 4385, 3'-AAAAGA-5' at 4387, 3'-AAAGAA-5' at 4388, 3'-AGAAAA-5' at 4390, 3'-AAAAGA-5' at 4392, 3'-AAAGAA-5' at 4393, and 3'-AGAAAA-5' at 4395.

Between ZNF497 and A1BG: On the negative strand in the positive direction there are six H boxes, two of them after nucleotide number 2300: 3'-ACCACA-5' at 420, 3'-ACACCA-5' at 386, 3'-TGGTGT-5' at 511, 3'-TGGTGT-5' at 530, 3'-ACACCA-5' at 2603 and 3'-ACACCA-5' at 3825.

On the positive strand in the positive direction there are four H boxes, two of them after nucleotide number 2300: 3'-ACACCA-5' at 204, 3'-ACACCA-5' at 528, 3'-ACACCA-5' at 3643 and 3'-ACACCA-5' at 3967.

Regarding 3'-ANANNA-5', on the negative strand, positive direction, there are 25 H boxes: 3'-ATACCA-5' at 2591, 3'-ACACCA-5' at 2603, 3'-ATAGAA-5' at 2628, 3'-AAACCA-5' at 2632, 3'-ACACTA-5' at 2637, 3'-ATATAA-5' at 2662, 3'-AGAGCA-5' at 2704, 3'-AGAGGA-5' at 2793, 3'-AAAGGA-5' at 2829, 3'-ACAGAA-5' at 2838, 3'-AAAGAA-5' at 3066, 3'-AGAACA-5' at 3094, 3'-AGAGCA-5' at 3138, 3'-ACAGCA-5' at 3212, 3'-ACAGTA-5' at 3414, 3'-AGATGA-5' at 3476, 3'-ACAGGA-5' at 3572, 3'-AAAGCA-5' at 3599, 3'-ACATGA-5' at 3708, 3'-ACACCA-5' at 3825, 3'-AAAAGA-5' at 3929, 3'-AGAACA-5' at 4068, 3'-AAATGA-5' at 4094, 3'-ACATCA-5' at 4116, and 3'-ACATGA-5' at 4154.

On the positive strand, positive direction there are 20 H boxes: 3'-AAATAA-5' at 2347, 3'-AAAAAA-5' at 2451, 3'-AAAACA-5' at 2453, 3'-AGACGA-5' at 2976, 3'-AGACCA-5' at 3022, 3'-AGAGAA-5' at 3056, 3'-AGAAGA-5' at 3058, 3'-AGAGGA-5' at 3302, 3'-AGACGA-5' at 3307, 3'-ACAGAA-5' at 3393, 3'-AGAAGA-5' at 3395, 3'-ACAGGA-5' at 3620, 3'-ACACCA-5' at 3643, 3'-AAACCA-5' at 3948, 3'-ACACCA-5' at 3967, 3'-AGAGGA-5' at 4059, 3'-AAAATA-5' at 4122, 3'-AAATCA-5' at 4137, 3'-AAATAA-5' at 4142, and 3'-ATATTA-5' at 4168.

There are inverses of 31 H boxes on the negative strand in the positive direction: 3'-ATGACA-5' at 2412, 3'-ACTACA-5' at 2428, 3'-AGGACA-5' at 2460, 3'-ATTATA-5' at 2548, 3'-ACCACA-5' at 2600, 3'-AGGAAA-5' at 2623, 3'-AATAGA-5' at 2627, 3'-ACCACA-5' at 2634, 3'-AACAGA-5' at 2652, 3'-AGCAAA-5' at 2706, 3'-AGGAAA-5' at 2831, 3'-AACACA-5' at 2835, 3'-ATGACA-5' at 2843, 3'-AGAACA-5' at 3094, 3'-AACACA-5' at 3096, 3'-AGGACA-5' at 3131, 3'-ACCAAA-5' at 3175, 3'-AACAGA-5' at 3179, 3'-AGCAGA-5' at 3214, 3'-AGTAGA-5' at 3416, 3'-AATAAA-5' at 3427, 3'-ACCAGA-5' at 3548, 3'-ATGACA-5' at 3569, 3'-AGGAGA-5' at 3650, 3'-AGCACA-5' at 3740, 3'-ACCACA-5' at 3859, 3'-AAAAGA-5' at 3929, 3'-AGAACA-5' at 4068, 3'-ATCATA-5' at 4149, and 3'-ATTATA-5' at 4166.
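
Because 3'-ANANNA-5' fixes only three of its six positions, matches can overlap, which is why a run of adenines produces 3'-AAAAAA-5' hits at consecutive positions such as 2461 through 2470 in the distal promoter list. The sketch below shows one way to count such overlapping hits. The function name `find_overlapping`, the reduced code table and the toy sequence are hypothetical.

```python
import re

# Minimal code table: only the symbols needed for ANANNA.
IUPAC = {"A": "A", "C": "C", "G": "G", "T": "T", "N": "[ACGT]"}

def find_overlapping(sequence: str, consensus: str):
    """Return (1-based start, match) for every hit, including overlapping ones."""
    body = "".join(IUPAC[symbol] for symbol in consensus)
    # A zero-width lookahead lets the scan restart one base after each hit.
    pattern = re.compile(f"(?=({body}))")
    return [(m.start() + 1, m.group(1)) for m in pattern.finditer(sequence)]

if __name__ == "__main__":
    toy = "GAAAAAAAAC"  # hypothetical run of eight adenines
    print(find_overlapping(toy, "ANANNA"))
    # A plain, non-overlapping search reports only one hit in the same run.
    print(len(re.findall("A[ACGT]A[ACGT][ACGT]A", toy)))
```

On the toy string the overlapping search reports hits at positions 2, 3 and 4, while the plain search reports a single hit. Whether the original tallies were produced with overlapping matching is an inference from the consecutive positions listed, not something stated in the text.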

HMG boxes

HNF6s

Core promoters

Inverse complement, positive strand, negative direction there is 1: 3'-TTATTAATTC-5', 4542.

Proximal promoters

Negative strand in the negative direction there is 1: 3'-TTATTAATCG-5', 4229.

Negative strand in the positive direction there are 2: 3'-TTATTAATCA-5', 4147, 3'-TTATTGATTA-5', 4164.

Inverse complement, positive strand, positive direction there is 1: 3'-ATATTAACAA-5', 4172.

Distal promoters

Negative strand in the negative direction there are 2: 3'-GTGTTAATAA-5', 1725, 3'-TAGTTGATAA-5', 3527.

Positive strand in the negative direction there is 1: 3'-AAATTGATAA-5', 3361.

Inverse complement, negative strand, negative direction there are 2: 3'-ACATGGACAT-5', 802, 3'-TAATGAACTT-5', 1301.

Inverse complement, positive strand, negative direction there are 2: 3'-AAATTGATAA-5', 3361, 3'-TCATCAACTA-5', 3525.

Negative strand in the positive direction there is 1: 3'-ATGTCCATGG-5', 3581.

Positive strand in the positive direction there is 1: 3'-GAGTCCATTG-5', 3732.

Inverse complement, positive strand, positive direction there is 1: 3'-CCATTGACTC-5', 3736.

Homeoboxes

"Transcription factors Pax-4 and Pax-6 are known to be key regulators of pancreatic cell differentiation and development. [...] The gene-targeting experiments revealed that Pax-4 and Pax-6 cannot substitute for each other in tissue with overlapping expression of both genes. [The] DNA-binding specificities of Pax-4 and Pax-6 are similar. The Pax-4 homeodomain [HD] was shown to preferentially dimerize on DNA sequences consisting of an inverted TAAT motif, separated by 4-nucleotide spacing."[89]
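
A search for the Pax-4 homeodomain site described in the quotation could be sketched as follows. This assumes that "inverted TAAT motif" means TAAT followed by its reverse complement ATTA on the same strand, with exactly four arbitrary nucleotides between them. That reading, the name `find_inverted_taat` and the toy sequence are assumptions, not details given in the cited source.

```python
import re

# TAAT, four arbitrary bases, then ATTA (the reverse complement of TAAT).
# The lookahead allows overlapping candidate sites to be reported.
INVERTED_TAAT = re.compile(r"(?=(TAAT[ACGT]{4}ATTA))")

def find_inverted_taat(sequence: str):
    """Return (1-based start, matched site) for each candidate site."""
    return [(m.start() + 1, m.group(1)) for m in INVERTED_TAAT.finditer(sequence.upper())]

if __name__ == "__main__":
    toy = "GGTAATCCGGATTAGG"  # hypothetical sequence containing one such site
    print(find_inverted_taat(toy))
```

On the toy string this reports one candidate, TAATCCGGATTA, starting at position 3.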

The "crucial difference between the binding sites of Antennapedia class and TTF-1 HDs is in the motifs 5'-TAAT-3', recognized by Antennapedia [a Hox gene, a subset of homeobox genes, first discovered in Drosophila which controls the formation of legs during development], and 5'-CAAG-3', preferentially bound by TTF-1. [The] binding of wild type and mutants TTF-1 HD to oligonucleotides containing either 5'-TAAT-3' or 5'-CAAG-3' indicate that only in the presence of the latter motif the Gln50 in TTF-1 HD is utilized for DNA recognition."[90]

HY boxes

Core promoters

Positive strand in the negative direction there is 1: 3'-TGAGGG-5' at 4558.

Inverse complement, negative strand, negative direction there is 1: 3'-CCCTCA-5', 4498.

Negative strand in the positive direction there is 1: 3'-TGTGGG-5', 4395.

Distal promoters

Negative strand in the negative direction there is 1: 3'-TGTGGG-5' at 749.

Positive strand in the negative direction there are 4: 3'-TGAGGG-5' at 88, 3'-TGAGGG-5' at 2699, 3'-TGAGGG-5' at 3652, 3'-TGTGGG-5' at 3712.

Inverse complement, negative strand, negative direction there are 3: 3'-CCCTCA-5', 2702, 3'-CCCACA-5', 3184, 3'-CCCTCA-5', 3889.

Positive strand in the positive direction there are 2: 3'-TGTGGG-5', 2965, 3'-TGTGGG-5', 3533.

Negative strand in the positive direction there are 3: 3'-TGAGGG-5', 258, 3'-TGAGGG-5', 3479, 3'-TGAGGG-5', 3879.

Inverse complement, negative strand, positive direction there are 3: 3'-CCCTCA-5', 88, 3'-CCCTCA-5', 3207, 3'-CCCTCA-5', 3503.

Inverse complement, positive strand, positive direction there are 5: 3'-CCCTCA-5', 494, 3'-CCCTCA-5', 662, 3'-CCCTCA-5', 1783, 3'-CCCACA-5', 1803, 3'-CCCTCA-5', 3185.

I boxes

Initiator elements (YYANWYY)

Core promoters

There is the following Inr in the core promoter, negative strand, negative direction: 3'-TTACTCC-5' at 4557.

There are four Inrs in the core promoter, positive strand, negative direction: 3'-CCACTCC-5' at 4425, 3'-CCACTTT-5' at 4461, 3'-TCACATT-5' at 4533, and 3'-TTAATTC-5' at 4542.

There is the following Inr in the core promoter, negative strand, positive direction: 3'-CTGCACC-5' at 4343.

There are two Inrs in the core promoter, positive strand, positive direction: 3'-CCACTCC-5' at 4401 and 3'-CCAGACC-5' at 4416.

Proximal promoters

There are eight Inrs on the negative strand in the negative direction: 3'-TCACTCT-5' at 4202, 3'-TCGGTCT-5' at 4233, 3'-CTGCACC-5' at 4238, 3'-TCGGACC-5' at 4300, 3'-CCAGTTT-5' at 4309, 3'-TCGGACC-5' at 4349, 3'-TCACACT-5' at 4361, and 3'-TTACTCC-5' at 4557.

There are seven Inrs on the positive strand in the negative direction: 3'-CCGGACT-5' at 4327, 3'-CTGCACT-5' at 4340, 3'-CCAGTTC-5' at 4417, 3'-CCACTCC-5' at 4425, 3'-CCACTTT-5' at 4461, 3'-TCACATT-5' at 4533, and 3'-TTAATTC-5' at 4542.

There is one Inr on the negative strand in the positive direction: 3'-CTGCACC-5' at 4343.

There are two Inrs on the positive strand in the positive direction: 3'-CCACTCC-5' at 4401 and 3'-CCAGACC-5' at 4416.

Distal promoters

Negative strand in the negative direction there are 87: 3'-TTGTTCC-5', 71, 3'-CTATACC-5', 77, 3'-CCGTTTC-5', 93, 3'-CCGTACT-5', 124, 3'-CCATATT-5', 181, 3'-CTACATT-5', 247, 3'-TTGGTCC-5', 262, 3'-TTATACT-5', 274, 3'-TCACTCT-5', 301, 3'-CTGCTTT-5', 312, 3'-CCGGTTC-5', 419, 3'-CCAGTCC-5', 441, 3'-TCGGACC-5', 459, 3'-TTGTATC-5', 468, 3'-TCACTTT-5', 473, 3'-TCGGACC-5', 508, 3'-CCGGTTC-5', 556, 3'-CCAGTCC-5', 578, 3'-TTATACC-5', 605, 3'-CCGGTCC-5', 648, 3'-CCGGTTC-5', 692, 3'-CCAGTCC-5', 714, 3'-TCGGACT-5', 732, 3'-TCGCACC-5', 741, 3'-CTACACC-5', 787, 3'-TCGGTTC-5', 874, 3'-TCGGACC-5', 899, 3'-TCGCTCT-5', 913, 3'-TCGGTCC-5', 948, 3'-CCGTACC-5', 953, 3'-TTAGTCC-5', 984, 3'-TTGGACC-5', 1015, 3'-TCACTCT-5', 1079, 3'-TCGGACC-5', 1198, 3'-TTGTACC-5', 1207, 3'-CCACTTT-5', 1212, 3'-CCGCACC-5', 1244, 3'-TTGGATC-5', 1306, 3'-TCAGACC-5', 1356, 3'-TTATTCT-5', 1365, 3'-TCGTTTT-5', 1371, 3'-TTGTTTT-5', 1394, 3'-CCACACT-5', 1479, 3'-TTGCTTC-5', 1555, 3'-CCGTTTT-5', 1561, 3'-TTACTTT-5', 1582, 3'-TTGGATT-5', 1591, 3'-TTAATTT-5', 1697, 3'-TTATACC-5', 1742, 3'-CCGCACC-5', 1897, 3'-CCGTACT-5', 1953, 3'-TTGGACC-5', 1959, 3'-TCGGACC-5', 2009, 3'-TCGTTCT-5', 2023, 3'-TTACACC-5', 2065, 3'-CCGGTCC-5', 2077, 3'-TCACATT-5', 2087, 3'-TCAAACT-5', 2141, 3'-TTGTACC-5', 2152, 3'-CCGCTTT-5', 2157, 3'-CCAGTCC-5', 2250, 3'-TCAAACT-5', 2257, 3'-TCGGACC-5', 2268, 3'-TCGTACC-5', 2277, 3'-CCACTTT-5', 2282, 3'-TTGGACC-5', 2385, 3'-TCGGACC-5', 2435, 3'-TCACTCT-5', 2449, 3'-TCGTTTT-5', 2476, 3'-TTGTTTT-5', 2490, 3'-TCATTCT-5', 2503, 3'-CCGGTCC-5', 2519, 3'-CCAGTCC-5', 2587, 3'-TCACACC-5', 2605, 3'-TTGTACC-5', 2614, 3'-CCACTTT-5', 2619, 3'-TCACACC-5', 2658, 3'-TTGGACC-5', 2720, 3'-TCGGACC-5', 2770, 3'-TCGTACT-5', 2784, 3'-TTGATTC-5', 2914, 3'-CCGATTT-5', 3009, 3'-TTGATTC-5', 3031, 3'-CCGCACC-5', 3047, 3'-TCGGACC-5', 3128, 3'-TTGTTCC-5', 3141, 3'-CCACTTT-5', 3146, 3'-TTGTATT-5', 3169, 3'-CCACACC-5', 3186, 3'-TCGGTTC-5', 3273, 3'-TCGGACC-5', 3298, 3'-TTGTTCT-5', 3307, 3'-TCGTTTT-5', 3313, 3'-TTGTTCT-5', 3340, 
3'-TCGTTCT-5', 3374, 3'-CCGAACT-5', 3401, 3'-CCGTATC-5', 3446, 3'-TTGATCT-5', 3463, 3'-TTGGTCT-5', 3486, 3'-CTGTTCT-5', 3759, 3'-CTACACC-5', 3810, 3'-CTGGTCC-5', 3871, 3'-TCATTCT-5', 3893, 3'-CTACTTT-5', 3922, 3'-CCGGTCC-5', 3951, 3'-TCGGACC-5', 4037, 3'-TTGTATC-5', 4046, 3'-TCACTCT-5', 4051, 3'-TTACACT-5', 4092, 3'-CCGGTCC-5', 4102, 3'-CCGTACC-5', 4107, 3'-CCGGTCC-5', 4170, 3'-TCGAACC-5', 4188.

Positive strand in the negative direction there are 40: 3'-CTGAATT-5', 20, 3'-TTGGACC-5', 32, 3'-CTGCATT-5', 152, 3'-TTGAACC-5', 846, 3'-TCACACC-5', 882, 3'-TTGAACC-5', 1012, 3'-TCACTCC-5', 1058, 3'-TCACACC-5', 1128, 3'-TTGAACC-5', 1303, 3'-TTGCACC-5', 1339, 3'-TTGCACT-5', 1347, 3'-CCAGTCT-5', 1354, 3'-CCATTTC-5', 1380, 3'-TCGCTCT-5', 1450, 3'-CTATATC-5', 1528, 3'-TTATTTT-5', 1727, 3'-CTGCACT-5', 2000, 3'-CTACTCC-5', 2352, 3'-TTGAACC-5', 2382, 3'-TCACACC-5', 2418, 3'-CTGCACT-5', 2426, 3'-TTGAATC-5', 2708, 3'-TTGAACC-5', 2717, 3'-CTGCACC-5', 2761, 3'-TTGAACC-5', 3245, 3'-TTGCACT-5', 3289, 3'-CCAGATC-5', 3488, 3'-CTGCTCC-5', 3582, 3'-CCATTTC-5', 3688, 3'-CTGGACT-5', 3747, 3'-CTGAACC-5', 3784, 3'-CCATACC-5', 3858, 3'-TCACACC-5', 3967.

Inverse complement, negative strand, negative direction there are 32: 3'-GATACAA-5', 213, 3'-GGACCGA-5', 598, 3'-AGTGCGG-5', 664, 3'-GGACTGG-5', 734, 3'-AGTGTGG-5', 882, 3'-GAAGTGA-5', 1056, 3'-AGTGTGG-5', 1128, 3'-GGACCGG-5', 1200, 3'-AGAGCGA-5', 1448, 3'-GGTCCGA-5', 1462, 3'-GATATAG-5', 1528, 3'-AGAACGG-5', 1608, 3'-AAAATAG-5', 1730, 3'-AGTGCAG-5', 1773, 3'-GGACCGA-5', 1843, 3'-AGTGCGG-5', 1992, 3'-AGTGCGG-5', 2208, 3'-AGTGTGG-5', 2418, 3'-AGTACGG-5', 2535, 3'-AGTACGG-5', 2753, 3'-AAAGTAG-5', 2887, 3'-GATTCGA-5', 3033, 3'-GGACCGG-5', 3130, 3'-AGTGCGG-5', 3281, 3'-AGTCCGA-5', 3398, 3'-GGTCTAG-5', 3488, 3'-GGTATGG-5', 3858, 3'-GGTCCGG-5', 3873, 3'-AGTGTGG-5', 3967.

Negative strand in the positive direction there are 45: 3'-TTGTATT-5', 115, 3'-CTGTTTT-5', 147, 3'-CCACACT-5', 345, 3'-CCGGACT-5', 746, 3'-CTGCACT-5', 1372, 3'-CTGCACT-5', 1472, 3'-CCAGACT-5', 1744, 3'-CCACTTC-5', 1914, 3'-CTATTTC-5', 1978, 3'-CCAGTCC-5', 2026, 3'-TCGCTTC-5', 2095, 3'-TCATATT-5', 2178, 3'-CTGCATT-5', 2206, 3'-CCAGATC-5', 2230, 3'-TCAATCT-5', 2235, 3'-CTGTTTC-5', 2263, 3'-TCACTCT-5', 2306, 3'-CTACACC-5', 2430, 3'-CTAATTT-5', 2440, 3'-CCGCACC-5', 2566, 3'-TTATACC-5', 2590, 3'-CCACACC-5', 2602, 3'-CCACACT-5', 2636, 3'-TCAGATT-5', 2868, 3'-CTGCTCC-5', 2978, 3'-CCAGTCC-5', 2998, 3'-CCAGTCC-5', 3084, 3'-CTGGTCT-5', 3245, 3'-TCGCTCT-5', 3276, 3'-CTGGTCT-5', 3299, 3'-CTGCTCC-5', 3309, 3'-CTGCACC-5', 3322, 3'-CCGCATC-5', 3328, 3'-TTGCACT-5', 3343, 3'-CTGTTCC-5', 3352, 3'-TTGCATC-5', 3402, 3'-TCACACT-5', 3507, 3'-CCAGACC-5', 3550, 3'-CTGTTCC-5', 3625, 3'-TCACACC-5', 3824, 3'-TCATTTT-5', 4120, 3'-TCACTCT-5', 4128, 3'-TTGATTT-5', 4134, 3'-TTAGTTT-5', 4139.

Positive strand in the positive direction there are 75: 3'-CTGGACC-5', 40, 3'-CCGGTCC-5', 215, 3'-TTACACT-5', 230, 3'-CCGGACC-5', 286, 3'-CCGTTCC-5', 503, 3'-TCGGTCC-5', 515, 3'-CCGCTCT-5', 557, 3'-CCGTTCC-5', 587, 3'-CCGCTCT-5', 641, 3'-CCGTTCC-5', 671, 3'-CCGGACT-5', 725, 3'-CCGTTCC-5', 823, 3'-TCGGTCT-5', 835, 3'-TTGGACC-5', 847, 3'-CCGTTCC-5', 923, 3'-TCGGTCT-5', 935, 3'-TTGGACC-5', 947, 3'-CCGTTCC-5', 1007, 3'-TCGCTCT-5', 1061, 3'-CCGGTCC-5', 1175, 3'-CCGCTCT-5', 1229, 3'-CCGTTCC-5', 1259, 3'-CCGTTCC-5', 1327, 3'-CCGCTCT-5', 1381, 3'-CCGTTCC-5', 1427, 3'-CCGCTCT-5', 1481, 3'-TCGTTCC-5', 1511, 3'-CCGCTCT-5', 1565, 3'-CCGCACT-5', 1720, 3'-CCACACC-5', 1805, 3'-CCGCTCT-5', 1921, 3'-CCGTTCT-5', 1948, 3'-CCACACC-5', 1971, 3'-TCAATTT-5', 2136, 3'-TTGTACT-5', 2141, 3'-CTACTTT-5', 2146, 3'-CCGTTCT-5', 2190, 3'-CCAGTCT-5', 2222, 3'-TTGGTCT-5', 2228, 3'-CCGCACT-5', 2555, 3'-CCGGTCC-5', 2574, 3'-TCAGTCT-5', 2609, 3'-TCAGTTC-5', 2615, 3'-TCAGTCC-5', 2620, 3'-CTATATT-5', 2662, 3'-TCAATCC-5', 2668, 3'-TCGTTTT-5', 2707, 3'-TCGATTC-5', 2789, 3'-TTGCTCC-5', 2806, 3'-CTAAACT-5', 2871, 3'-CTGGTCC-5', 2876, 3'-CCAGACT-5', 2943, 3'-CCGGACC-5', 2988, 3'-CCAGACC-5', 3021, 3'-TTATACC-5', 3162, 3'-CTGGTTT-5', 3175, 3'-TCGGTCT-5', 3221, 3'-CTACTCC-5', 3478, 3'-CCGATCC-5', 3484, 3'-TCGATCC-5', 3522, 3'-CTGGTCT-5', 3548, 3'-TCACACT-5', 3594, 3'-CCACTCC-5', 3647, 3'-CCGGACC-5', 3679, 3'-CCGGACC-5', 3758, 3'-CTGGACC-5', 3787, 3'-TCACTCC-5', 3878, 3'-TCAGACT-5', 3924, 3'-TCACACC-5', 3966, 3'-CCACACT-5', 3971, 3'-TTACTCC-5', 4096, 3'-CTACTCC-5', 4102, 3'-CTAAATC-5', 4136, 3'-CCACTCC-5'.

Inverse complement, negative strand, positive direction there are 61: 3'-AGAGTGG-5', 53, 3'-AATGTGA-5', 230, 3'-GGAGCGA-5', 429, 3'-AGACCGG-5', 442, 3'-GGTGCGG-5', 489, 3'-AGTGCGG-5', 498, 3'-AGTGCGG-5', 582, 3'-AGTGCGG-5', 666, 3'-GGTGCAG-5', 784, 3'-AGTGCGG-5', 1086, 3'-AGTGCGG-5', 1170, 3'-AGTGCGG-5', 1254, 3'-AATGCGG-5', 1322, 3'-AATGCGG-5', 1422, 3'-AGTGCGG-5', 1590, 3'-GAAGCGG-5', 1636, 3'-GGTGCGG-5', 1764, 3'-AGTGCAG-5', 1787, 3'-GGTGTGG-5', 1805, 3'-GAACTGG-5', 1953, 3'-GGTGTGG-5', 1971, 3'-AAAGCAG-5', 2007, 3'-AGTGCAG-5', 2064, 3'-GAACCAG-5', 2227, 3'-AGATCAA-5', 2232, 3'-AGTGCAG-5', 2327, 3'-GGTGCAA-5', 2335, 3'-GAAATAG-5', 2626, 3'-GATATAA-5', 2662, 3'-GGACTGA-5', 2674, 3'-AGAGCAA-5', 2705, 3'-AAAGTGG-5', 2711, 3'-GGTGCAA-5', 2801, 3'-AGAATGA-5', 2841, 3'-GATTTGA-5', 2871, 3'-GGTCTGA-5', 2943, 3'-GGTCTGG-5', 3021, 3'-AATATGG-5', 3162, 3'-GAAATGG-5', 3168, 3'-GGACCAA-5', 3174, 3'-GGAATGA-5', 3441, 3'-GATGCAG-5', 3460, 3'-AGTGCAG-5', 3465, 3'-GGACCAG-5', 3547, 3'-GGAATGA-5', 3567, 3'-AGTGTGA-5', 3594, 3'-GAAGCGG-5', 3670, 3'-AATCCGA-5', 3799, 3'-AGAATGA-5', 3835, 3'-GAACCAG-5', 3840, 3'-AGAGTGA-5', 3876, 3'-AGTCTGA-5', 3924, 3'-AGTGTGG-5', 3966, 3'-GGTGTGA-5', 3971, 3'-AGAGTGG-5', 4040, 3'-AGAACAG-5', 4069, 3'-GAAATGA-5', 4094, 3'-GATTTAG-5', 4136.

Inverse complement, positive strand, negative direction there are 100: 3'-AGACTGA-5', 17, 3'-GGACCAG-5', 34, 3'-AAAACAA-5', 69, 3'-GATATGG-5', 77, 3'-AAACTGA-5', 130, 3'-AAAACAG-5', 167, 3'-GGTATAA-5', 181, 3'-GAAACAA-5', 229, 3'-GATGTAA-5', 247, 3'-AGTTCAA-5', 255, 3'-AAACCAG-5', 261, 3'-AATATGA-5', 274, 3'-AGAACAG-5', 288, 3'-AAACTGA-5', 307, 3'-GGTGCGG-5', 380, 3'-AGTGCGA-5', 448, 3'-AATACGA-5', 492, 3'-AAATTAG-5', 499, 3'-AGATTGA-5', 585, 3'-AATATGG-5', 605, 3'-AATACAA-5', 635, 3'-AAATTGG-5', 643, 3'-AGTTCGA-5', 721, 3'-AGACCAG-5', 727, 3'-AATACAA-5', 769, 3'-AAATTAG-5', 777, 3'-GATGTGG-5', 787, 3'-AGAGCGA-5', 911, 3'-GATCCAG-5', 975, 3'-AGATTGG-5', 1045, 3'-AGAGTGA-5', 1077, 3'-AAATTAG-5', 1234, 3'-AGTCTGG-5', 1356, 3'-AGAGCAA-5', 1369, 3'-AAAACAA-5', 1388, 3'-AGTGCAG-5', 1471, 3'-GGTGTGA-5', 1479, 3'-AGTGCAA-5', 1536, 3'-AGAACGA-5', 1553, 3'-AATACAG-5', 1566, 3'-GAAACAA-5', 1585, 3'-GAAATGA-5', 1663, 3'-AAAGCGG-5', 1680, 3'-GAATTAA-5', 1696, 3'-AATATGG-5', 1742, 3'-AATACAA-5', 1878, 3'-AAATTAG-5', 1887, 3'-AGACTGA-5', 1935, 3'-AGAATGG-5', 1948, 3'-AGAGCAA-5', 2021, 3'-AATGTGG-5', 2065, 3'-GGTGCAG-5', 2082, 3'-AGTGTAA-5', 2087, 3'-AGTTTGA-5', 2141, 3'-AGACCAA-5', 2147, 3'-GATACAA-5', 2180, 3'-AAAATGA-5', 2187, 3'-GGTGCGG-5', 2197, 3'-AGTTTGA-5', 2257, 3'-AGACCAG-5', 2263, 3'-AATACAA-5', 2305, 3'-AAACTAG-5', 2313, 3'-AGAGTGA-5', 2447, 3'-GATTCGG-5', 2454, 3'-AAAGCAA-5', 2474, 3'-AAAGCAA-5', 2480, 3'-AAAACAA-5', 2509, 3'-AGACCAG-5', 2600, 3'-AGTGTGG-5', 2605, 3'-AAATCAG-5', 2649, 3'-AGTGTGG-5', 2658, 3'-AAAACAA-5', 2842, 3'-AGAATGG-5', 3004, 3'-AAAATAA-5', 3013, 3'-AAACTAA-5', 3030, 3'-AGACCAG-5', 3123, 3'-AAATTAG-5', 3176, 3'-GGTGTGG-5', 3186, 3'-AGAGCAA-5', 3311, 3'-AAAACAA-5', 3330, 3'-AAATTGA-5', 3358, 3'-GAAGTGA-5', 3410, 3'-GAACTAG-5', 3462, 3'-AAACCAG-5', 3485, 3'-AATCCAG-5', 3681, 3'-GGAACAG-5', 3725, 3'-GGACTGG-5', 3749, 3'-AATGCAG-5', 3772, 3'-GATGTGG-5', 3810, 3'-GGACCAG-5', 3870, 3'-GGAGTAA-5', 3891, 3'-AGTTCAA-5', 4026, 3'-AGACCAG-5', 4032, 
3'-AAAATAA-5', 4071, 3'-AATGTGA-5', 4092, 3'-AGTTCAA-5', 4177.

Inverse complement, positive strand, positive direction there are 75: 3'-GGTCCGA-5', 10, 3'-AGTCCGG-5', 92, 3'-AATCCAG-5', 152, 3'-GGTCCAG-5', 217, 3'-GGTGTGA-5', 345, 3'-GAAGCGG-5', 459, 3'-AGAATGA-5', 524, 3'-GAAGCGG-5', 595, 3'-GATGCGA-5', 652, 3'-GGTGCGA-5', 777, 3'-GGACCGG-5', 849, 3'-GGACCGG-5', 949, 3'-GGTCCGA-5', 1177, 3'-AAAGCAG-5', 1183, 3'-GAAGCGG-5', 1308, 3'-GAAGCGG-5', 1408, 3'-AATTCGG-5', 1541, 3'-GATGCGA-5', 1576, 3'-GGACTGG-5', 1662, 3'-GGTCTGA-5', 1744, 3'-GGACCGA-5', 1817, 3'-GGTCCGG-5', 1857, 3'-AGAATGG-5', 1888, 3'-GAAGTAG-5', 2110, 3'-AGTATAA-5', 2178, 3'-GGACTGG-5', 2213, 3'-GGTCTAG-5', 2230, 3'-AGAGTGG-5', 2247, 3'-AAAGTGA-5', 2304, 3'-GGTCCGA-5', 2318, 3'-AATCCGA-5', 2368, 3'-GATGTGG-5', 2430, 3'-GGACCGA-5', 2435, 3'-AGAGTGG-5', 2470, 3'-GGTACAA-5', 2475, 3'-GGACCGG-5', 2571, 3'-AATATGG-5', 2590, 3'-GGTGTGG-5', 2602, 3'-AGTTCAG-5', 2617, 3'-GGTGTGA-5', 2636, 3'-AGTCTAA-5', 2868, 3'-AAACTGG-5', 2873, 3'-GGTCCGG-5', 2878, 3'-AGACCGA-5', 2885, 3'-GGAGTAA-5', 2902, 3'-AGACTGA-5', 2945, 3'-AGACCGG-5', 2985, 3'-GGACCGG-5', 2990, 3'-GGAACAG-5', 3003, 3'-GGTCCAG-5', 3018, 3'-AGACCAA-5', 3023, 3'-AGTCCGG-5', 3036, 3'-GGACCAA-5', 3049, 3'-GAAGTAG-5', 3250, 3'-AGTGCAG-5', 3255, 3'-GGACCAG-5', 3298, 3'-AGAGTGA-5', 3317, 3'-GGTACAA-5', 3337, 3'-GGAACGG-5', 3375, 3'-AGTGTGA-5', 3507, 3'-GATCCGA-5', 3524, 3'-GGTCTGG-5', 3550, 3'-AGAGTGG-5', 3612, 3'-GGACCGG-5', 3681, 3'-AGTGTGG-5', 3824, 3'-GAACTGG-5', 4018, 3'-AAAATAG-5', 4123, 3'-GAACTAA-5', 4133, 3'-AAATCAA-5', 4138.

Initiator elements (BBCABW)

Core promoters

There are five Inrs, positive strand, negative direction: 3'-TCCACT-5', 4423, 3'-CCCAGA-5', 4448, 3'-TCCACT-5', 4459, 3'-CCCACT-5', 4485, 3'-TTCACA-5', 4531.

There are five Inrs, negative strand, positive direction: 3'-GTCAGT-5', 4271, 3'-CTCATT-5', 4309, 3'-TGCAGA-5', 4317, 3'-CCCAGA-5', 4330, 3'-CTCACT-5', 4338.

There are four Inrs, positive strand, positive direction: 3'-TCCAGT-5', 4269, 3'-CTCACT-5', 4350, 3'-CCCACT-5', 4399, 3'-CCCAGA-5', 4414.

Proximal promoters

There are five Inrs on the negative strand in the negative direction: 3'-GTCACT-5', 4200, 3'-TCCAGT-5', 4307, 3'-GTCACT-5', 4319, 3'-CCCACT-5', 4353, 3'-GTCACA-5', 4359.

There are nine Inrs on the positive strand in the negative direction: 3'-GCCAGA-5', 4233, 3'-TGCAGT-5', 4317, 3'-TGCACT-5', 4340, 3'-GCCAGT-5', 4415, 3'-TCCACT-5', 4423, 3'-CCCAGA-5', 4448, 3'-TCCACT-5', 4459, 3'-CCCACT-5', 4485, 3'-TTCACA-5', 4531.

There are six Inrs on the negative strand in the positive direction: 3'-CTCAGA-5', 4195, 3'-GTCAGT-5', 4271, 3'-CTCATT-5', 4309, 3'-TGCAGA-5', 4317, 3'-CCCAGA-5', 4330, 3'-CTCACT-5', 4338.

There are four Inrs on the positive strand in the positive direction: 3'-TCCAGT-5', 4269, 3'-CTCACT-5', 4350, 3'-CCCACT-5', 4399, 3'-CCCAGA-5', 4414.
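
The BBCABW consensus uses IUPAC ambiguity codes, with B standing for C, G or T and W for A or T. As a rough sketch, the consensus can be expanded into every concrete hexamer it allows, so that any listed Inr can be checked for membership. The function name `expand` is hypothetical, and the table covers only a subset of the IUPAC codes.

```python
from itertools import product

# Subset of the IUPAC nucleotide codes needed for the Inr consensus sequences.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "B": "CGT", "W": "AT", "Y": "CT", "N": "ACGT",
}

def expand(consensus: str) -> set:
    """Enumerate every concrete sequence permitted by an IUPAC consensus."""
    choices = (IUPAC[symbol] for symbol in consensus)
    return {"".join(bases) for bases in product(*choices)}

if __name__ == "__main__":
    inr = expand("BBCABW")
    print(len(inr))  # 3 * 3 * 1 * 1 * 3 * 2 = 54 hexamers
    for site in ("TCCACT", "CCCAGA", "GTCAGT", "CTCATT"):
        print(site, site in inr)
```

The four hexamers checked are taken from the core and proximal promoter lists above, and all four are members of the expanded set.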

Distal promoters

Negative strand in the negative direction there are 44: 3'-TCCATA-5', 179, 3'-CCCAGT-5', 206, 3'-CTCAGA-5', 278, 3'-GTCACT-5', 299, 3'-TTCACA-5', 322, 3'-TCCAGT-5', 439, 3'-TGCATT-5', 533, 3'-TCCAGT-5', 568, 3'-TCCAGT-5', 576, 3'-TCCAGT-5', 712, 3'-GGCAGA-5', 754, 3'-GCCACT-5', 868, 3'-GTCACT-5', 1034, 3'-CCCACT-5', 1049, 3'-CTCACT-5', 1077, 3'-GGCACA-5', 1220, 3'-GTCACT-5', 1325, 3'-GTCAGA-5', 1354, 3'-CTCAGA-5', 1444, 3'-GGCAGT-5', 1511, 3'-TGCAGA-5', 1774, 3'-GTCACT-5', 1978, 3'-GTCACA-5', 2085, 3'-TCCAGT-5', 2248, 3'-GTCACT-5', 2404, 3'-CTCACT-5', 2447, 3'-TCCAGT-5', 2585, 3'-GTCACA-5', 2603, 3'-GTCACA-5', 2656, 3'-GTCACT-5', 2739, 3'-TTCACA-5', 2860, 3'-TCCACT-5', 3144, 3'-CCCACA-5', 3184, 3'-TTCACT-5', 3410, 3'-GTCATT-5', 3480, 3'-TCCACT-5', 3825, 3'-CTCATA-5', 3829, 3'-CTCATT-5', 3891, 3'-TTCACA-5', 3939.

Positive strand in the negative direction there are 59: 3'-GCCATA-5', 39, 3'-TGCATT-5', 152, 3'-GTCACT-5', 208, 3'-GGCACA-5', 266, 3'-GGCACA-5', 518, 3'-GGCACA-5', 960, 3'-GGCAGA-5', 1023, 3'-TGCAGT-5', 1032, 3'-TTCACT-5', 1056, 3'-GGCACA-5', 1116, 3'-CTCACA-5', 1126, 3'-GGCAGA-5', 1314, 3'-TGCAGT-5', 1323, 3'-TGCACT-5', 1347, 3'-TCCAGT-5', 1352, 3'-TCCATT-5', 1378, 3'-CCCAGA-5', 1411, 3'-TGCAGT-5', 1472, 3'-CTCACT-5', 1491, 3'-CCCAGA-5', 1518, 3'-TCCAGT-5', 1532, 3'-TGCACA-5', 1719, 3'-GGCAGA-5', 1967, 3'-TGCAGT-5', 1976, 3'-GCCACT-5', 1995, 3'-TGCACT-5', 2000, 3'-TGCAGT-5', 2083, 3'-GCCAGT-5', 2211, 3'-TGCAGT-5', 2402, 3'-TGCACT-5', 2426, 3'-TCCACT-5', 2632, 3'-GCCAGT-5', 2654, 3'-GGCACA-5', 2665, 3'-TGCAGT-5', 2737, 3'-GCCACT-5', 2756, 3'-GCCATT-5', 3284, 3'-TGCACT-5', 3289, 3'-TGCAGA-5', 3431, 3'-GGCATA-5', 3445, 3'-GGCATA-5', 3451, 3'-GGCAGT-5', 3478, 3'-GGCAGA-5', 3589, 3'-GGCAGT-5', 3600, 3'-GTCAGA-5', 3625, 3'-GGCACA-5', 3632, 3'-CTCAGA-5', 3644, 3'-GCCATT-5', 3686, 3'-TCCACA-5', 3692, 3'-CCCATA-5', 3856, 3'-CTCACA-5', 3965.

Inverse complement, negative strand, negative direction there are 46: 3'-TCTGAC-5', 16, 3'-TGTGGA-5', 62, 3'-TGTGCA-5', 342, 3'-TGTGCA-5', 531, 3'-AGTGCG-5', 663, 3'-TGTGGG-5', 749, 3'-TCTGAG-5', 916, 3'-TGTGCG-5', 963, 3'-ACTGAA-5', 1052, 3'-AGTGAG-5', 1057, 3'-TCTGAG-5', 1082, 3'-TGTGGA-5', 1129, 3'-AGTGGA-5', 1171, 3'-AATGAA-5', 1298, 3'-TCTGAG-5', 1403, 3'-AGTGAC-5', 1492, 3'-TGTGAA-5', 1544, 3'-TCTGAA-5', 1617, 3'-AGTGCA-5', 1772, 3'-TCTGAC-5', 1934, 3'-AGTGCG-5', 1991, 3'-TCTGAG-5', 2026, 3'-TATGAC-5', 2162, 3'-ACTGGC-5', 2190, 3'-AGTGCG-5', 2207, 3'-TGTGAA-5', 2551, 3'-AGTGAA-5', 2578, 3'-ACTGAG-5', 2787, 3'-TATGGA-5', 2994, 3'-AGTGGG-5', 3057, 3'-AGTGAA-5', 3101, 3'-AGTGAA-5', 3240, 3'-AGTGCG-5', 3280, 3'-TCTGAC-5', 3425, 3'-TATGAC-5', 3541, 3'-TATGCG-5', 3547, 3'-TATGGA-5', 3859, 3'-TGTGGA-5', 3968, 3'-TGTGAA-5', 3983.

Inverse complement, positive strand, negative direction there are 54: 3'-ACTGAA-5', 18, 3'-TATGGG-5', 78, 3'-ACTGAA-5', 131, 3'-TATGAG-5', 275, 3'-AGTGAG-5', 300, 3'-ACTGAC-5', 308, 3'-AGTGCG-5', 447, 3'-AGTGAA-5', 472, 3'-AGTGGA-5', 523, 3'-AGTGAG-5', 1035, 3'-AGTGAG-5', 1078, 3'-AGTGGC-5', 1121, 3'-AGTGAG-5', 1326, 3'-TCTGGG-5', 1357, 3'-AGTGCA-5', 1470, 3'-ACTGCA-5', 1494, 3'-AGTGCA-5', 1535, 3'-AATGAA-5', 1581, 3'-AATGCC-5', 1634, 3'-TATGGC-5', 1743, 3'-ACTGAG-5', 1936, 3'-AATGGC-5', 1949, 3'-AGTGAG-5', 1979, 3'-ACTGCA-5', 1998, 3'-TGTGGC-5', 2066, 3'-AATGAC-5', 2188, 3'-AGTGAG-5', 2405, 3'-ACTGCA-5', 2424, 3'-AGTGAG-5', 2448, 3'-TGTGGC-5', 2606, 3'-AGTGAG-5', 2740, 3'-ACTGCA-5', 2759, 3'-TGTGCA-5', 2863, 3'-AATGGC-5', 3005, 3'-TGTGAG-5', 3268, 3'-AGTGAC-5', 3411, 3'-TGTGCA-5', 3429, 3'-TGTGCC-5', 3561, 3'-AATGGG-5', 3660, 3'-TGTGGG-5', 3712, 3'-ACTGGG-5', 3750, 3'-AATGCA-5', 3771, 3'-TCTGGA-5', 3836, 3'-ACTGCC-5', 3852, 3'-TGTGGC-5', 3960, 3'-AGTGAG-5', 4050, 3'-TGTGAG-5', 4093.

Negative strand in the positive direction there are 87: 3'-TCCAGA-5', 15, 3'-GGCATT-5', 22, 3'-GTCACA-5', 155, 3'-CCCAGA-5', 204, 3'-GCCACA-5', 343, 3'-CGCAGA-5', 396, 3'-TGCAGA-5', 438, 3'-CCCAGA-5', 468, 3'-TGCACA-5', 548, 3'-TCCACA-5', 632, 3'-CGCACT-5', 686, 3'-CGCACA-5', 800, 3'-GCCAGA-5', 835, 3'-GCCACA-5', 884, 3'-GCCAGA-5', 935, 3'-GCCACA-5', 984, 3'-CGCACA-5', 1052, 3'-CGCACA-5', 1136, 3'-TGCACA-5', 1220, 3'-CCCAGT-5', 1250, 3'-CGCAGA-5', 1316, 3'-TGCACT-5', 1372, 3'-CGCAGA-5', 1416, 3'-TGCACT-5', 1472, 3'-CCCACT-5', 1502, 3'-CGCACA-5', 1556, 3'-GGCATT-5', 1702, 3'-CCCAGA-5', 1742, 3'-TGCACA-5', 1822, 3'-TCCACT-5', 1912, 3'-TGCAGA-5', 1937, 3'-GGCACT-5', 1996, 3'-CCCAGT-5', 2024, 3'-TCCACA-5', 2029, 3'-CTCAGT-5', 2060, 3'-TGCAGT-5', 2065, 3'-GCCACT-5', 2072, 3'-TTCAGT-5', 2098, 3'-CTCATA-5', 2176, 3'-TGCATT-5', 2206, 3'-GTCAGA-5', 2222, 3'-CTCAGA-5', 2239, 3'-TTCACT-5', 2304, 3'-TGCAGT-5', 2328, 3'-GTCACT-5', 2425, 3'-GTCAGA-5', 2609, 3'-CTCAGA-5', 2699, 3'-TGCAGA-5', 2721, 3'-CTCAGA-5', 2729, 3'-TGCAGA-5', 2859, 3'-CTCAGA-5', 2866, 3'-CTCATT-5', 2902, 3'-GTCACT-5', 2929, 3'-TTCAGT-5', 2936, 3'-TGCACA-5', 2962, 3'-TGCATT-5', 3072, 3'-CCCAGT-5', 3082, 3'-CCCAGA-5', 3091, 3'-TCCACA-5', 3192, 3'-CTCACA-5', 3209, 3'-GCCAGA-5', 3221, 3'-TGCAGT-5', 3232, 3'-TGCAGT-5', 3281, 3'-CTCACT-5', 3317, 3'-TGCACT-5', 3343, 3'-CCCAGT-5', 3379, 3'-CCCACT-5', 3388, 3'-GGCACA-5', 3409, 3'-TGCAGT-5', 3461, 3'-GGCAGA-5', 3473, 3'-CTCACA-5', 3505, 3'-GCCACA-5', 3705, 3'-TCCAGA-5', 3806, 3'-GTCACA-5', 3822, 3'-TGCAGA-5', 3831, 3'-TCCAGA-5', 3891, 3'-CGCAGA-5', 3916, 3'-GTCACA-5', 3954, 3'-TGCAGT-5', 3962, 3'-GGCACT-5', 4006, 3'-TCCACT-5', 4013.

Positive strand in the positive direction there are 40: 3'-TCCAGT-5', 153, 3'-CGCACA-5', 1020, 3'-CCCAGA-5', 1711, 3'-CGCACT-5', 1720, 3'-CCCACA-5', 1803, 3'-CCCAGA-5', 1958, 3'-TCCACA-5', 1969, 3'-GTCAGT-5', 2100, 3'-TCCACT-5', 2128, 3'-TCCAGT-5', 2220, 3'-TCCAGA-5', 2258, 3'-TCCACT-5', 2375, 3'-CGCAGT-5', 2423, 3'-GTCACA-5', 2464, 3'-CCCAGA-5', 2489, 3'-TTCACT-5', 2511, 3'-CGCACT-5', 2555, 3'-GTCAGT-5', 2607, 3'-CTCAGT-5', 2613, 3'-TTCAGT-5', 2618, 3'-TCCATA-5', 2642, 3'-TCCAGA-5', 3019, 3'-CTCAGA-5', 3187, 3'-TGCAGA-5', 3256, 3'-CTCACA-5', 3592, 3'-GCCAGA-5', 3608, 3'-CTCACT-5', 3712, 3'-TCCATT-5', 3731, 3'-TCCAGA-5', 3771, 3'-CCCAGT-5', 3820, 3'-GTCACT-5', 3843, 3'-CTCACT-5', 3876, 3'-TTCAGA-5', 3922, 3'-TCCACT-5', 3934, 3'-GTCACA-5', 3964, 3'-CGCAGA-5', 4056.

Inverse complement, negative strand, positive direction there are 94: 3'-AGTGGG-5', 54, 3'-TCTGCA-5', 224, 3'-TGTGAA-5', 231, 3'-ACTGCC-5', 238, 3'-TCTGAG-5', 256, 3'-TCTGGA-5', 271, 3'-ACTGGG-5', 348, 3'-AGTGCG-5', 497, 3'-AGTGCG-5', 581, 3'-AGTGCG-5', 665, 3'-ACTGCG-5', 749, 3'-TGTGGC-5', 819, 3'-ACTGCC-5', 901, 3'-TGTGGC-5', 919, 3'-ACTGCG-5', 1001, 3'-TGTGGC-5', 1023, 3'-AGTGCG-5', 1085, 3'-AGTGCG-5', 1160, 3'-AGTGCG-5', 1169, 3'-AGTGCG-5', 1253, 3'-ACTGAG-5', 1287, 3'-AATGCG-5', 1321, 3'-TCTGGC-5', 1377, 3'-TCTGCG-5', 1396, 3'-AATGCG-5', 1421, 3'-TCTGGC-5', 1477, 3'-TCTGCG-5', 1496, 3'-ACTGCA-5', 1505, 3'-AGTGCG-5', 1589, 3'-AGTGCG-5', 1725, 3'-AGTGCA-5', 1786, 3'-TGTGGA-5', 1806, 3'-TCTGGG-5', 1865, 3'-ACTGGG-5', 1954, 3'-TGTGGC-5', 1972, 3'-TCTGGC-5', 1993, 3'-AGTGCA-5', 2063, 3'-AGTGGC-5', 2068, 3'-TATGGC-5', 2160, 3'-ACTGCA-5', 2204, 3'-AGTGCA-5', 2326, 3'-TGTGCA-5', 2681, 3'-AGTGGA-5', 2712, 3'-ACTGCC-5', 2823, 3'-AATGAC-5', 2842, 3'-TCTGCA-5', 2857, 3'-TCTGGC-5', 2884, 3'-AATGGG-5', 2911, 3'-TCTGAC-5', 2944, 3'-TCTGAG-5', 2951, 3'-TGTGCA-5', 2960, 3'-TCTGGC-5', 2984, 3'-TCTGAG-5', 3007, 3'-AGTGCC-5', 3011, 3'-TATGAC-5', 3028, 3'-TCTGCA-5', 3061, 3'-AATGCA-5', 3070, 3'-ACTGGC-5', 3118, 3'-TCTGAG-5', 3124, 3'-TATGGA-5', 3163, 3'-AATGGG-5', 3169, 3'-AGTGCC-5', 3235, 3'-TATGAG-5', 3261, 3'-TCTGCA-5', 3268, 3'-TCTGCA-5', 3279, 3'-ACTGCA-5', 3320, 3'-ACTGGC-5', 3346, 3'-TCTGCC-5', 3359, 3'-TCTGGC-5', 3406, 3'-AATGCC-5', 3431, 3'-TGTGGA-5', 3437, 3'-AATGAA-5', 3442, 3'-AATGAG-5', 3446, 3'-AGTGGG-5', 3450, 3'-AGTGCA-5', 3464, 3'-AATGAC-5', 3568, 3'-TGTGAA-5', 3595, 3'-AGTGAC-5', 3713, 3'-ACTGAG-5', 3736, 3'-AATGAC-5', 3783, 3'-AATGAA-5', 3836, 3'-AGTGAG-5', 3877, 3'-TGTGAG-5', 3904, 3'-TCTGAA-5', 3925, 3'-TGTGCA-5', 3960, 3'-TGTGAC-5', 3972, 3'-AGTGGG-5', 4041, 3'-ACTGAA-5', 4090, 3'-AATGAG-5', 4095.

Inverse complement, positive strand, positive direction there are 47: 3'-TCTGAC-5', 236, 3'-TGTGAC-5', 346, 3'-TCTGCC-5', 399, 3'-TCTGGC-5', 441, 3'-AATGAA-5', 525, 3'-TGTGCA-5', 569, 3'-TGTGCG-5', 803, 3'-TGTGCG-5', 887, 3'-TGTGCG-5', 987, 3'-TGTGAC-5', 1139, 3'-TGTGCC-5', 1223, 3'-TGTGCC-5', 1559, 3'-ACTGGG-5', 1663, 3'-TGTGCC-5', 1698, 3'-TCTGAA-5', 1745, 3'-AATGGG-5', 1889, 3'-ACTGGC-5', 2214, 3'-AGTGGA-5', 2248, 3'-AGTGAG-5', 2305, 3'-AGTGGG-5', 2313, 3'-AGTGAC-5', 2341, 3'-TCTGAA-5', 2417, 3'-TGTGGA-5', 2431, 3'-TATGAA-5', 2740, 3'-TCTGGA-5', 2862, 3'-AGTGAC-5', 2930, 3'-ACTGAA-5', 2946, 3'-TGTGGG-5', 2965, 3'-ACTGAA-5', 3030, 3'-AGTGCA-5', 3254, 3'-AGTGAC-5', 3318, 3'-TGTGAG-5', 3508, 3'-TGTGGG-5', 3533, 3'-TCTGGA-5', 3551, 3'-AGTGGG-5', 3613, 3'-AGTGCC-5', 3748, 3'-ACTGGA-5', 3785, 3'-ACTGGA-5', 4019, 3'-AGTGAC-5', 4088, 3'-AGTGAG-5', 4127.

L boxes

The consensus sequence for the L1 box is TAAATGYA.[88] Y is (C/T).
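As a rough illustration (not taken from the cited source), a consensus written with ambiguity codes can be turned into a regular expression before searching a sequence. The sketch below assumes the standard IUPAC nucleotide codes, in which Y denotes a pyrimidine (C or T); the helper name `iupac_to_regex` is hypothetical.

```python
import re

# Standard IUPAC nucleotide ambiguity codes (a subset that covers common promoter motifs).
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "[AG]", "Y": "[CT]", "S": "[CG]", "W": "[AT]",
    "K": "[GT]", "M": "[AC]", "N": "[ACGT]",
}

def iupac_to_regex(consensus):
    """Translate an IUPAC consensus string into a regular expression (hypothetical helper)."""
    return "".join(IUPAC[base] for base in consensus.upper())

l1_box = iupac_to_regex("TAAATGYA")
print(l1_box)
print(bool(re.fullmatch(l1_box, "TAAATGCA")))
```

Under this assumption the L1 box consensus accepts TAAATGCA and TAAATGTA and rejects any other base at the seventh position.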

M35 boxes

On the negative strand in the negative direction (from ZSCAN22 to A1BG), the program SuccessablesM35--.bas, looking for 3'-TTGACA-5', finds 2: 3'-TTGACA-5', 477, 3'-TTGACA-5', 4399.

M boxes

Metal responsive elements

Proximal promoters

On the positive strand in the negative direction there is an MRE 3'-TGCACTC-5' at 4341.

Distal promoters

Positive strand in the negative direction there are 6: 3'-TGCGCTC-5', 891, 3'-TGCACTC-5', 1348, 3'-TGCACTC-5', 2001, 3'-TGCACTC-5', 2427, 3'-TGCACCC-5', 2762, 3'-TGCACTC-5', 3290.

Inverse complement, negative strand, negative direction there are 2: 3'-GTGTGCA-5', 531, 3'-GAGTGCA-5', 1772.

Inverse complement, positive strand, negative direction there are 2: 3'-GAGTGCA-5', 1470, 3'-GTGTGCA-5', 2863.

Negative strand in the positive direction there are 11: 3'-TGCGCCC-5', 453, 3'-TGCACAC-5', 549, 3'-TGCACAC-5', 1221, 3'-TGCGCCC-5', 1247, 3'-TGCACTC-5', 1373, 3'-TGCGCCC-5', 1399, 3'-TGCACTC-5', 1473, 3'-TGCGCCC-5', 1499, 3'-TGCGCCC-5', 1657, 3'-TGCACAC-5', 2963, 3'-TGCACCC-5', 3323.

Positive strand in the positive direction there are 2: 3'-TGCGCCC-5', 872, 3'-TGCGCCC-5', 972.

Inverse complement, negative strand, positive direction there are 10: 3'-GCGTGCA-5', 546, 3'-GCGCGCA-5', 684, 3'-GGGCGCA-5', 876, 3'-GGGCGCA-5', 976, 3'-GCGTGCA-5', 1218, 3'-GTGCGCA-5', 1523, 3'-GAGTGCA-5', 1786, 3'-GAGTGCA-5', 2326, 3'-GGGTGCA-5', 2800, 3'-GGGTGCA-5', 3883.

Motif ten elements

There are no MTEs in either promoter.

MYB recognition elements

P boxes

Pollen1 elements

"Electrophoretic mobility shift assays identified a pollen-specific cis-acting element POLLEN1 (AGAAA) mapped at AtACBP4 (−157/−153) which interacted with nuclear proteins from flower and this was substantiated by DNase I footprinting."[88]

Pribnow boxes

  1. negative strand in the negative direction, looking for 3'-TATAAT-5': 2, 3'-TATAAT-5', 3454, 3'-TATAAT-5', 3468,
  2. negative strand in the positive direction, looking for 3'-TATAAT-5': 1, 3'-TATAAT-5', 729,
  3. positive strand in the negative direction, looking for 3'-TATAAT-5': 0,
  4. positive strand in the positive direction, looking for 3'-TATAAT-5': 0,
  5. complement, negative strand, negative direction, looking for 3'-ATATTA-5': 0,
  6. complement, negative strand, positive direction, looking for 3'-ATATTA-5': 0,
  7. complement, positive strand, negative direction, looking for 3'-ATATTA-5': 2, 3'-ATATTA-5', 3454, 3'-ATATTA-5', 3468,
  8. complement, positive strand, positive direction, looking for 3'-ATATTA-5': 1, 3'-ATATTA-5', 729,
  9. inverse complement, negative strand, negative direction, looking for 3'-ATTATA-5': 2, 3'-ATTATA-5', 272, 3'-ATTATA-5', 603,
  10. inverse complement, negative strand, positive direction, looking for 3'-ATTATA-5': 1, 3'-ATTATA-5', 727,
  11. inverse complement, positive strand, negative direction, looking for 3'-ATTATA-5': 0,
  12. inverse complement, positive strand, positive direction, looking for 3'-ATTATA-5': 0,
  13. inverse, negative strand, negative direction, looking for 3'-TAATAT-5': 0,
  14. inverse, negative strand, positive direction, looking for 3'-TAATAT-5': 0,
  15. inverse, positive strand, negative direction, looking for 3'-TAATAT-5': 2, 3'-TAATAT-5', 272, 3'-TAATAT-5', 603,
  16. inverse, positive strand, positive direction, looking for 3'-TAATAT-5': 1, 3'-TAATAT-5', 727.
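
The four search strings used in the list above (direct, complement, inverse complement, and inverse) can be derived mechanically from the Pribnow box consensus. The following is a minimal sketch of that derivation, keeping the same 3'-to-5' writing convention as the list; the function name `motif_variants` is hypothetical and is not the program used to produce the counts.

```python
# Base-pairing table used to build the complement of a motif.
COMPLEMENT = str.maketrans("ACGT", "TGCA")

def motif_variants(motif):
    """Return the four orientations of a motif, written in the same order as given."""
    complement = motif.translate(COMPLEMENT)
    return {
        "direct": motif,
        "complement": complement,
        "inverse complement": complement[::-1],  # complement read backwards
        "inverse": motif[::-1],                  # original read backwards
    }

for name, seq in motif_variants("TATAAT").items():
    print(f"{name}: 3'-{seq}-5'")
```

For TATAAT this reproduces the search strings ATATTA, ATTATA, and TAATAT that appear in items 5 through 16.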

Prolamin boxes

  1. negative strand in the negative direction: 1, 3'-TGTAAAG-5', 2884,
  2. negative strand in the positive direction: 1, 3'-TGAAAAG-5', 489,
  3. positive strand in the negative direction: 1, 3'-TGAAAAG-5', 1627.

Pyrimidine boxes

Pyrimidine boxes and their complements occur in the negative direction: 3'-CCTTTT-5' at 2459, 3'-CCTTTT-5' at 2927, and 3'-CCTTTT-5' at 2968. Inverse pyrimidine boxes and their complements occur: 3'-AAAAGG-5' at 105, 3'-AAAAGG-5' at 1107, 3'-AAAAGG-5' at 3345, and 3'-AAAAGG-5' at 3441.

In the positive direction, the pyrimidine boxes 3'-CCTTTT-5' at 135 and 3'-CCTTTT-5' at 291, and their complements, are close to ZNF497.

Q elements

"The basal regulatory elements identified include a putative TATA-box (−30/−24) for RNA polymerase binding and a CAAT box (−64/−61; [...]). Several putative floral expression-related cis-elements identified included a putative 6-nucleotide Q element (−770/−665), three GTGA boxes (−372/−369, −209/−206 and −164/−161) and four putative highly-conserved POLLEN1 boxes (−737/−733, −711/−707, −150/−146 and −36/−32; [...])."[88]

The consensus sequence for a Q element is 3'-AGGTCA-5'.[88]

Retinoblastoma control elements

R response elements

STAT5s

Proximal promoters

Negative strand in the positive direction there is 1: 3'-TTCCGGGAA-5', 4247.

Distal promoters

Positive strand in the negative direction there are 2: 3'-TTCGTTGAA-5', 3506, 3'-TTCCCTGAA-5', 3782.

Positive strand in the positive direction there is 1: 3'-TTCCATGAA-5', 128.

Synaptic Activity-Responsive Elements

TACTAAC boxes

Tapetum boxes

The consensus sequence for the TAPETUM box is TCGTGT.[88]

TATA boxes

Negative strand in the negative direction there are 2: 3'-TATATATA-5' at 1600 (or -2860 nts upstream from the TSS) and 3'-TATATAAA-5' at 1602 (or -2858 nts).

Positive strand in the negative direction there are 3: 3'-TATAAAAG-5' at 184 (or -4276 nts), 3'-TATAAAAG-5' at 223 (or -4237 nts), and 3'-TATATAAA-5' at 2874 (or -1586 nts).
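
The upstream distances quoted above are all consistent with subtracting a fixed offset of 4460 from the listed position. That offset is inferred here from the five position and distance pairs and is not stated in the text, so the sketch below should be read as an assumption; `TSS_POSITION` and `relative_to_tss` are hypothetical names.

```python
TSS_POSITION = 4460  # assumption: inferred from the position/distance pairs quoted above

def relative_to_tss(position, tss=TSS_POSITION):
    """Convert a position in the negative-direction numbering into a signed offset from the TSS."""
    return position - tss

for pos in (1600, 1602, 184, 223, 2874):
    print(pos, relative_to_tss(pos))
```

A negative result means the element lies upstream of the transcription start site under this numbering.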

Inverse complement, negative strand, negative direction there are 2: 3'-TATATATA-5', 1600, 3'-TTTATATA-5', 2871.

Inverse complement, positive strand, negative direction there is 1: 3'-TTTTTATA-5', 219.

TAT boxes

Only an inverse and its complement occur between ZSCAN22 and A1BG: 3'-TACCTAT-5' at 2996 nts from ZSCAN22.

TATCCAC boxes

None occur.

T boxes

TCCACCATA elements

"Given that AtACBP4pro::GUS (−156/−67) could drive promoter activity for pollen expression, [electrophoretic mobility shift assays] EMSAs were carried out to investigate the role of the putative POLLEN1 cis-element, AGAAA (−150/−146), and its adjacent co-dependent regulatory element TCCACCATA (–141/–133)."[88]

"POLLEN1 and the TCCACCATA element are co-dependent regulatory elements responsible for pollen-specific activation of tomato LAT52 (Bate and Twell 1998)."[88]

Telomeric repeat DNA-binding factors

Copying the consensus sequence for the telomeric repeat DNA-binding factor (TRF), 3'-TTAGGG-5', and entering it in "⌘F" locates this sequence in the A1BG negative direction; the nucleotide positions can be found by the computer programs.

In the nucleotides between ZSCAN22 and A1BG there is at least one 3'-TTAGGG-5' beginning about 680 nucleotides from ZSCAN22 or ending at about 686 nts.
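
The manual "⌘F" search described above can be approximated programmatically. The sketch below reports the 1-based start and end of every occurrence of a motif in a sequence string, including overlapping ones. The function name `find_motif` and the toy sequence are hypothetical; the actual sequence between ZSCAN22 and A1BG is not reproduced here.

```python
def find_motif(sequence, motif):
    """Return (start, end) pairs, 1-based and inclusive, for every occurrence of motif."""
    hits = []
    start = sequence.find(motif)
    while start != -1:
        hits.append((start + 1, start + len(motif)))
        start = sequence.find(motif, start + 1)  # advance by one so overlaps are kept
    return hits

toy = "ACGTTAGGGTTAGGGAC"  # toy sequence for illustration only
print(find_motif(toy, "TTAGGG"))
```

Whether an end position is counted inclusively or exclusively changes it by one nucleotide, which may explain approximate wording such as "about 680" and "about 686".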

Homo sapiens genes containing these are found using Homo sapiens "TRF (TTAGGG repeat-binding factor)".

Tetradecanoylphorbol-13-acetate response elements

TGFβ control elements

TGF-β inhibitory elements

Upstream response elements

V boxes

W boxes

Proximal promoters

Inverse W boxes occur in the negative strand, negative direction of A1BG: 3'-GGTCAA-5' at 4416 and 3'-GGTCAA-5' at 4308.

W boxes occur in the positive direction, positive strand of A1BG: 3'-CTGACC-5' and its complement at 4216. Inverse W boxes also occur: 3'-GGTCAG-5' and its complement at 4270.

Distal promoters

On the negative strand in the negative direction, a W box 3'-CTGACC-5' occurs at 3749, whereas 3'-CTGACT-5' at 17, 3'-TTGACT-5' at 130, 3'-TTGACT-5' at 307, and 3'-CTGACC-5' at 734 occur close to ZSCAN22, along with their complements. The W box 3'-CTGACT-5' at 1935 could be associated with ZSCAN22 or with an unknown gene between it and A1BG.

Inverse complement, positive strand, negative direction there are 5: 3'-GGTCAG-5', 440, 3'-GGTCAG-5', 577, 3'-GGTCAG-5', 713, 3'-GGTCAG-5', 2249, 3'-GGTCAG-5', 2586.

A W box inverse, 3'-GGTCAG-5', occurs at 1353 in the negative direction.

W boxes 3'-AGTCAG-5' at 2101, 3'-GGTCAG-5' at 2221, 3'-AGTCAG-5' at 2608, 3'-AGTCAA-5' at 2614, and 3'-AGTCAG-5' at 2619 occur, along with their complements, in the positive direction.

W boxes in the positive direction occur 3'-CTGACC-5' at 1662, 3'-CTGACC-5' at 2213, 3'-TTGACC-5' at 2873, 3'-CTGACT-5' at 2945, and 3'-TTGACC-5' at 4018 that could be associated with A1BG, along with 3'-TTGACC-5' at 1953, 3'-CTGACT-5' at 2674, and 3'-TTGACT-5' at 3735.

Inverse complement, positive strand, positive direction there are 6: 3'-GGTCAG-5', 2025, 3'-AGTCAG-5', 2099, 3'-GGTCAG-5', 2606, 3'-GGTCAG-5', 2997, 3'-GGTCAG-5', 3083, 3'-GGTCAA-5', 3380.

X boxes

There are no X boxes in either promoter.

X core promoter elements

  1. negative strand in the negative direction, looking for 3'-G/A/T-G/C-G-T/C-G-G-G/A-A-G/C-A/C-5': 1, 3'-TGGTGGGACC-5', 3744,
  2. negative strand in the positive direction, looking for 3'-G/A/T-G/C-G-T/C-G-G-G/A-A-G/C-A/C-5': 0,
  3. positive strand in the negative direction, looking for 3'-G/A/T-G/C-G-T/C-G-G-G/A-A-G/C-A/C-5': 0,
  4. positive strand in the positive direction, looking for 3'-G/A/T-G/C-G-T/C-G-G-G/A-A-G/C-A/C-5': 0,
  5. complement, negative strand, negative direction, looking for 3'-C/A/T-G/C-C-A/G-C-C-C/T-T-G/C-G/T-5': 0,
  6. complement, negative strand, positive direction, looking for 3'-C/A/T-G/C-C-A/G-C-C-C/T-T-G/C-G/T-5': 0,
  7. complement, positive strand, negative direction, looking for 3'-C/A/T-G/C-C-A/G-C-C-C/T-T-G/C-G/T-5': 1, 3'-ACCACCCTGG-5', 3744,
  8. complement, positive strand, positive direction, looking for 3'-C/A/T-G/C-C-A/G-C-C-C/T-T-G/C-G/T-5': 0,
  9. inverse complement, negative strand, negative direction, looking for 3'-G/T-G/C-T-C/T-C-C-A/G-C-G/C-C/A/T-5': 0,
  10. inverse complement, negative strand, positive direction, looking for 3'-G/T-G/C-T-C/T-C-C-A/G-C-G/C-C/A/T-5': 0,
  11. inverse complement, positive strand, negative direction, looking for 3'-G/T-G/C-T-C/T-C-C-A/G-C-G/C-C/A/T-5': 1, 3'-GCTCCCACCT-5', 392,
  12. inverse complement, positive strand, positive direction, looking for 3'-G/T-G/C-T-C/T-C-C-A/G-C-G/C-C/A/T-5': 0,
  13. inverse, negative strand, negative direction, looking for 3'-A/C-G/C-A-G/A-G-G-T/C-G-G/C-G/A/T-5': 1, 3'-CGAGGGTGGA-5', 392,
  14. inverse, negative strand, positive direction, looking for 3'-A/C-G/C-A-G/A-G-G-T/C-G-G/C-G/A/T-5': 1, 3'-CCAGGGTGGG-5', 102,
  15. inverse, positive strand, negative direction, looking for 3'-A/C-G/C-A-G/A-G-G-T/C-G-G/C-G/A/T-5': 0,
  16. inverse, positive strand, positive direction, looking for 3'-A/C-G/C-A-G/A-G-G-T/C-G-G/C-G/A/T-5': 0.
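
The degenerate consensus in the list above is written position by position, with "/" separating alternatives and "-" separating positions. As a hedged sketch (not the program that produced the counts), this notation can be parsed into sets of allowed bases, from which the complement, inverse complement, and inverse patterns follow. All function names below are hypothetical.

```python
import re

COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}

def parse_consensus(text):
    """Split 'G/A/T-G/C-G' style notation into a list of sets of allowed bases."""
    return [set(position.split("/")) for position in text.split("-")]

def to_regex(positions):
    """Render a list of base sets as a regular expression, one character class per position."""
    return "".join("[" + "".join(sorted(p)) + "]" for p in positions)

def complement(positions):
    """Complement every allowed base at every position."""
    return [{COMPLEMENT[b] for b in p} for p in positions]

def inverse(positions):
    """Reverse the order of the positions."""
    return positions[::-1]

direct = parse_consensus("G/A/T-G/C-G-T/C-G-G-G/A-A-G/C-A/C")
patterns = {
    "direct": to_regex(direct),
    "complement": to_regex(complement(direct)),
    "inverse complement": to_regex(inverse(complement(direct))),
    "inverse": to_regex(inverse(direct)),
}
for name, pattern in patterns.items():
    print(name, pattern)
```

Each hit reported in the list matches the pattern for its orientation, for example TGGTGGGACC against the direct pattern and GCTCCCACCT against the inverse complement.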

Y boxes

There are no Y boxes in either promoter.

Z boxes

Hypotheses

  1. Downstream core promoters may work as transcription factors even as their complements or inverses.
  2. In addition to the DNA binding sequences listed above, the transcription factors that can open up and attach through the local epigenome need to be known and specified.
  3. Each DNA binding domain serving as a transcription factor for the promoter of any immunoglobulin supergene family member, also serves or is present in the promoters for A1BG.
  4. The function of A1BG is the same as other immunoglobulin genes possessing the immunoglobulin domain cl11960 and/or any of three immunoglobulin-like domains: pfam13895, cd05751 and smart00410 in the order and nucleotide sequence: cd05751 Location: 401 → 493, smart00410 Location: 218 → 280, pfam13895 Location: 210 → 301 and cl11960 Location: 28 → 110.

See also

References

  1. "Entrez Gene: Alpha-1-B glycoprotein". Retrieved 2012-11-09.
  2. 2.0 2.1 "A1BG alpha-1-B glycoprotein". Retrieved May 10, 2013.
  3. 3.0 3.1 HGNC (13 March 2020). "ZSCAN22 zinc finger and SCAN domain containing 22 [ Homo sapiens (human) ]". U.S. National Library of Medicine, 8600 Rockville Pike, Bethesda MD, 20894 USA: National Center for Biotechnology Information. Retrieved 2019-12-18.
  4. 4.0 4.1 RefSeq (10 September 2009). "MIR6806 microRNA 6806 [ Homo sapiens (human) ]". U.S. National Library of Medicine, 8600 Rockville Pike, Bethesda MD, 20894 USA: National Center for Biotechnology Information. Retrieved 2019-12-18.
  5. Jag123 (7 March 2005). "antigen". San Francisco, California: Wikimedia Foundation, Inc. Retrieved 7 March 2020.
  6. SemperBlotto (21 April 2008). "immunogen". San Francisco, California: Wikimedia Foundation, Inc. Retrieved 8 March 2020.
  7. 7.0 7.1 7.2 C. Michael Gibson (27 April 2008). "Antigen". Boston, Massachusetts: WikiDoc Foundation. Retrieved 8 March 2020.
  8. Williamsayers79 (26 February 2007). "antibody". San Francisco, California: Wikimedia Foundation, Inc. Retrieved 7 March 2020.
  9. Jag123 (7 March 2005). "antibody". San Francisco, California: Wikimedia Foundation, Inc. Retrieved 7 March 2020.
  10. Eleonora Market, F. Nina Papavasiliou (2003) V(D)J Recombination and the Evolution of the Adaptive Immune System PLoS Biology 1(1): e16.
  11. Charles A. Janeway, Jr; et al. (2001). Immunobiology (5th ed.). Garland Publishing. ISBN 0-8153-3642-X.
  12. SemperBlotto (25 February 2006). "immunoglobulin". San Francisco, California: Wikimedia Foundation, Inc. Retrieved 7 March 2020.
  13. SemperBlotto (28 April 2008). "immunoglobulin". San Francisco, California: Wikimedia Foundation, Inc. Retrieved 7 March 2020.
  14. 14.0 14.1 14.2 14.3 RefSeq (10 December 2019). "A1BG alpha-1-B glycoprotein [ Homo sapiens (human) ]". U.S. National Library of Medicine, 8600 Rockville Pike, Bethesda MD, 20894 USA: National Center for Biotechnology Information. Retrieved 2019-12-18.
  15. Tian M, Cui YZ, Song GH, Zong MJ, Zhou XY, Chen Y, Han JX (2008). "Proteomic analysis identifies MMP-9, DJ-1 and A1BG as overexpressed proteins in pancreatic juice from pancreatic ductal adenocarcinoma patients". BMC Cancer. 8: 241. doi:10.1186/1471-2407-8-241. PMC 2528014. PMID 18706098.
  16. 16.0 16.1 16.2 16.3 HGNC2019 (10 December 2019). "A1BG-AS1 A1BG antisense RNA 1 [ Homo sapiens (human) ]". U.S. National Library of Medicine, 8600 Rockville Pike, Bethesda MD, 20894 USA: National Center for Biotechnology Information. Retrieved 2019-12-18.
  17. 17.0 17.1 17.2 17.3 17.4 17.5 17.6 Noriaki Ishioka, Nobuhiro Takahashi, and Frank W. Putnam (April 1986). "Amino acid sequence of human plasma 𝛂1B-glycoprotein: Homology to the immunoglobulin supergene family" (PDF). Proceedings of the National Academy of Sciences USA. 83 (8): 2363–7. doi:10.1073/pnas.83.8.2363. PMID 3458201. Retrieved 9 March 2020.
  18. 18.0 18.1 Katrina M. Morris, Denis O’Meally, Thiri Zaw, Xiaomin Song, Amber Gillett, Mark P. Molloy, Adam Polkinghorne, and Katherine Belova (7 October 2016). "Characterisation of the immune compounds in koala milk using a combined transcriptomic and proteomic approach". Scientific Reports. 6: 35011. doi:10.1038/srep35011. PMID 27713568. Retrieved 14 March 2020.
  19. R J Paxton, G Mooser, H Pande, T D Lee, and J E Shively (1 February 1987). "Sequence analysis of carcinoembryonic antigen: identification of glycosylation sites and homology with the immunoglobulin supergene family" (PDF). Proceedings of the National Academy of Sciences USA. 84 (4): 920–924. doi:10.1073/pnas.84.4.920. PMID 3469650. Retrieved 26 March 2020.
  20. NCBI (2 February 2016). "Conserved Protein Domain Family cl11960: Ig Superfamily". 8600 Rockville Pike, Bethesda MD, 20894 USA: National Center for Biotechnology Information, U.S. National Library of Medicine. Retrieved 22 May 2020.
  21. NCBI (5 August 2015). "Conserved Protein Domain Family pfam13895: Ig_2". 8600 Rockville Pike, Bethesda MD, 20894 USA: National Center for Biotechnology Information, U.S. National Library of Medicine. Retrieved 24 May 2020.
  22. NCBI (16 August 2016). "Conserved Protein Domain Family cd05751: Ig1_LILR_KIR_like". 8600 Rockville Pike, Bethesda MD, 20894 USA: National Center for Biotechnology Information, U.S. National Library of Medicine. Retrieved 24 May 2020.
  23. NCBI (16 January 2013). "Conserved Protein Domain Family smart00410: IG_like". 8600 Rockville Pike, Bethesda MD, 20894 USA: National Center for Biotechnology Information, U.S. National Library of Medicine. Retrieved 24 May 2020.
  24. 24.98.118.180 (28 February 2007). "species". San Francisco, California: Wikimedia Foundation, Inc. Retrieved 25 March 2020.
  25. 25.0 25.1 Peter coxhead (22 August 2018). "Species". San Francisco, California: Wikimedia Foundation, Inc. Retrieved 25 March 2020.
  26. Chiswick Chap (1 December 2016). "Species". San Francisco, California: Wikimedia Foundation, Inc. Retrieved 25 March 2020.
  27. 27.0 27.1 27.2 27.3 "AceView: A1BG". Retrieved May 11, 2013.
  28. Pdeitiker (26 July 2008). "variant". San Francisco, California: Wikimedia Foundation, Inc. Retrieved 25 March 2020.
  29. SemperBlotto (6 January 2007). "isoform". San Francisco, California: Wikimedia Foundation, Inc. Retrieved 2 December 2018.
  30. 72.178.245.181 (30 November 2008). "isoform". San Francisco, California: Wikimedia Foundation, Inc. Retrieved 2 December 2018.
  31. H Eiberg, ML Bisgaard, J Mohr (1 December 1989). "Linkage between alpha 1B-glycoprotein (A1BG) and Lutheran (LU) red blood group system: assignment to chromosome 19: new genetic variants of A1BG". Clinical genetics. 36 (6): 415–8. PMID 2591067. Retrieved 2017-10-08.
  32. John R. Stehle Jr., Mark E. Weeks, Kai Lin, Mark C. Willingham, Amy M. Hicks, John F. Timms, Zheng Cui (January 2007). "Mass spectrometry identification of circulating alpha-1-B glycoprotein, increased in aged female C57BL/6 mice". Biochimica et Biophysica Acta (BBA) - General Subjects. 1770 (1): 79–86. Retrieved 2017-10-08.
  33. 33.0 33.1 33.2 33.3 33.4 Caitrin W. McDonough, Yan Gong, Sandosh Padmanabhan, Ben Burkley, Taimour Y. Langaee, Olle Melander, Carl J. Pepine, Anna F. Dominiczak, Rhonda M. Cooper-DeHoff, Julie A. Johnson (June 2013). "Pharmacogenomic Association of Nonsynonymous SNPs in SIGLEC12, A1BG, and the Selectin Region and Cardiovascular Outcomes" (PDF). Hypertension. 62 (1): 48–54. doi:10.1161/HYPERTENSIONAHA.111.00823. PMID 23690342. Retrieved 2017-10-08.
  34. DTLHS (10 January 2018). "genotype". San Francisco, California: Wikimedia Foundation, Inc. Retrieved 25 March 2020.
  35. SemperBlotto (22 October 2005). "genotype". San Francisco, California: Wikimedia Foundation, Inc. Retrieved 25 March 2020.
  36. Widsith (28 March 2012). "polymorphism". San Francisco, California: Wikimedia Foundation, Inc. Retrieved 25 March 2020.
  37. 217.105.66.98 (8 September 2016). "allele". San Francisco, California: Wikimedia Foundation, Inc. Retrieved 25 March 2020.
  38. 138.130.33.215 (7 April 2004). "allele". San Francisco, California: Wikimedia Foundation, Inc. Retrieved 25 March 2020.
  39. 39.0 39.1 B. Gahne, R. K. Juneja, and A. Stratil (June 1987). "Genetic polymorphism of human plasma alpha 1B-glycoprotein: phenotyping by immunoblotting or by a simple method of 2-D electrophoresis". Human Genetics. 76 (2): 111–5. doi:10.1007/bf00284904. PMID 3610142. Retrieved 25 March 2020.
  40. R.K. Juneja, G. Beckman, M. Lukka, B. Gahne, and C. Ehnholm (1989). "Plasma α1B-Glycoprotein Allele Frequencies in Finns and Swedish Lapps: Evidence for a New α1B Allele". Human Heredity. 39 (1): 32–36. doi:10.1159/000153828. Retrieved 25 March 2020.
  41. 41.0 41.1 R.K. Juneja, N. Saha, B. Gahne and J.S.H. Tay (1989). "Distribution of Plasma Alpha-1-B-Glycoprotein Phenotypes in Several Mongoloid Populations of East Asia". Human Heredity. 39: 218–222. doi:10.1159/000153863. Retrieved 25 March 2020.
  42. 24.235.196.118 (23 September 2007). "phenotype". San Francisco, California: Wikimedia Foundation, Inc. Retrieved 2016-10-04.
  43. SemperBlotto (14 February 2005). "phenotype". San Francisco, California: Wikimedia Foundation, Inc. Retrieved 2016-10-04.
  44. N2e (3 July 2008). "phenotype". San Francisco, California: Wikimedia Foundation, Inc. Retrieved 2016-10-04.
  45. Mardiaty Iryani Abdullah, Ching Chin Lee, Sarni Mat Junit, Khoon Leong Ng, and Onn Haji Hashim (13 September 2016). "Tissue and serum samples of patients with papillary thyroid cancer with and without benign background demonstrate different altered expression of proteins". Peer J. 4: e2450. doi:10.7717/peerj.2450. PMID 27672505. Retrieved 15 March 2020.
  46. 46.0 46.1 46.2 46.3 Udby L, Sørensen OE, Pass J, Johnsen AH, Behrendt N, Borregaard N, Kjeldsen L. (October 2004). "Cysteine-rich secretory protein 3 is a ligand of alpha1B-glycoprotein in human plasma". Biochemistry. 43 (40): 12877–86. doi:10.1021/bi048823e. PMID 15461460.
  47. "The Opossum: Our Marvelous Marsupial, The Social Loner". Wildlife Rescue League.
  48. Journal Of Venomous Animals And Toxins – Anti-Lethal Factor From Opossum Serum Is A Potent Antidote For Animal, Plant And Bacterial Toxins. Retrieved 2009-12-29.
  49. 49.0 49.1 B Haendler, J Krätzschmar, F Theuring and W D Schleuning (July 1993). "Transcripts for cysteine-rich secretory protein-1 (CRISP-1; DE/AEG) and the novel related CRISP-3 are expressed under androgen control in the mouse salivary gland". Endocrinology. 133 (1): 192–8. doi:10.1210/en.133.1.192. PMID 8319566. Retrieved 2012-02-20.
  50. 50.0 50.1 HGNC2019 (10 December 2019). "ZNF497 zinc finger protein 497 [ Homo sapiens (human) ]". U.S. National Library of Medicine, 8600 Rockville Pike, Bethesda MD, 20894 USA: National Center for Biotechnology Information. Retrieved 2019-12-18.
  51. 51.0 51.1 HGNC2019 (10 December 2019). "LOC100419840 zinc finger protein 446 pseudogene [ Homo sapiens (human) ]". U.S. National Library of Medicine, 8600 Rockville Pike, Bethesda MD, 20894 USA: National Center for Biotechnology Information. Retrieved 2019-12-18.
  52. 52.0 52.1 HGNC2019 (10 December 2019). "LOC105372483 uncharacterized LOC105372483 [ Homo sapiens (human) ]". U.S. National Library of Medicine, 8600 Rockville Pike, Bethesda MD, 20894 USA: National Center for Biotechnology Information. Retrieved 2019-12-18.
  53. 53.0 53.1 HGNC2019 (10 December 2019). "RNA5SP473 RNA, 5S ribosomal pseudogene 473 [ Homo sapiens (human) ]". U.S. National Library of Medicine, 8600 Rockville Pike, Bethesda MD, 20894 USA: National Center for Biotechnology Information. Retrieved 2019-12-18.
  54. 54.0 54.1 RefSeq (13 March 2020). "FKBP1AP1 FKBP prolyl isomerase 1A pseudogene 1 [ Homo sapiens (human) ]". 8600 Rockville Pike, Bethesda MD, 20894 USA: National Center for Biotechnology Information, U.S. National Library of Medicine. Retrieved 26 May 2020.
  55. 55.0 55.1 55.2 55.3 RefSeq (July 2008). "FKBP1AP1 FKBP prolyl isomerase 1A pseudogene 1 [ Homo sapiens (human) ]". 8600 Rockville Pike, Bethesda MD, 20894 USA: National Center for Biotechnology Information, U.S. National Library of Medicine. Retrieved 26 May 2020.
  56. HGNC (13 March 2020). "ZNF8 zinc finger protein 8 [ Homo sapiens (human) ]". 8600 Rockville Pike, Bethesda MD, 20894 USA: National Center for Biotechnology Information, U.S. National Library of Medicine. Retrieved 26 May 2020.
  57. RefSeq (July 2008). "UBE2M ubiquitin conjugating enzyme E2 M [ Homo sapiens (human) ]". 8600 Rockville Pike, Bethesda MD, 20894 USA: National Center for Biotechnology Information, U.S. National Library of Medicine. Retrieved 26 May 2020.
  58. 58.0 58.1 58.2 HGNC (13 March 2020). "ZNF256 zinc finger protein 256 [ Homo sapiens (human) ]". 8600 Rockville Pike, Bethesda MD, 20894 USA: National Center for Biotechnology Information, U.S. National Library of Medicine. Retrieved 26 May 2020.
  59. 59.0 59.1 59.2 RefSeq (July 2008). "SLC27A5 solute carrier family 27 member 5 [ Homo sapiens (human) ]". 8600 Rockville Pike, Bethesda MD, 20894 USA: National Center for Biotechnology Information, U.S. National Library of Medicine. Retrieved 26 May 2020.
  60. HGNC (29 March 2020). "ZNF324 zinc finger protein 324 [ Homo sapiens (human) ]". 8600 Rockville Pike, Bethesda MD, 20894 USA: National Center for Biotechnology Information, U.S. National Library of Medicine. Retrieved 26 May 2020.
  61. 61.00 61.01 61.02 61.03 61.04 61.05 61.06 61.07 61.08 61.09 61.10 61.11 61.12 61.13 61.14 61.15 61.16 61.17 61.18 61.19 61.20 HGNC (13 March 2020). "ZNF544 zinc finger protein 544 [ Homo sapiens (human) ]". 8600 Rockville Pike, Bethesda MD, 20894 USA: National Center for Biotechnology Information, U.S. National Library of Medicine. Retrieved 27 May 2020.
  62. RefSeq (July 2008). "UBE2S ubiquitin conjugating enzyme E2 S [ Homo sapiens (human) ]". 8600 Rockville Pike, Bethesda MD, 20894 USA: National Center for Biotechnology Information, U.S. National Library of Medicine. Retrieved 27 May 2020.
  63. 63.0 63.1 63.2 63.3 HGNC (13 March 2020). "ZNF586 zinc finger protein 586 [ Homo sapiens (human) ]". 8600 Rockville Pike, Bethesda MD, 20894 USA: National Center for Biotechnology Information, U.S. National Library of Medicine. Retrieved 27 May 2020.
  64. 64.0 64.1 64.2 64.3 64.4 64.5 HGNC (13 March 2020). "ZNF446 zinc finger protein 446 [ Homo sapiens (human) ]". 8600 Rockville Pike, Bethesda MD, 20894 USA: National Center for Biotechnology Information, U.S. National Library of Medicine. Retrieved 27 May 2020.
  65. HGNC (3 May 2020). "USP29 ubiquitin specific peptidase 29 [ Homo sapiens (human) ]". 8600 Rockville Pike, Bethesda MD, 20894 USA: National Center for Biotechnology Information, U.S. National Library of Medicine. Retrieved 27 May 2020.
  66. 66.0 66.1 66.2 66.3 HGNC (13 March 2020). "ZNF667 zinc finger protein 667 [ Homo sapiens (human) ]". 8600 Rockville Pike, Bethesda MD, 20894 USA: National Center for Biotechnology Information, U.S. National Library of Medicine. Retrieved 27 May 2020.
  67. 67.0 67.1 67.2 67.3 67.4 HGNC (5 April 2020). "ZSCAN18 zinc finger and SCAN domain containing 18 [ Homo sapiens (human) ]". 8600 Rockville Pike, Bethesda MD, 20894 USA: National Center for Biotechnology Information, U.S. National Library of Medicine. Retrieved 27 May 2020.
  68. HGNC (13 March 2020). "CENPBD1P1 CENPB DNA-binding domains containing 1 pseudogene 1 [ Homo sapiens (human) ]". 8600 Rockville Pike, Bethesda MD, 20894 USA: National Center for Biotechnology Information, U.S. National Library of Medicine. Retrieved 27 May 2020.
  69. HGNC (13 March 2020). "ZSCAN5A zinc finger and SCAN domain containing 5A [ Homo sapiens (human) ]". 8600 Rockville Pike, Bethesda MD, 20894 USA: National Center for Biotechnology Information, U.S. National Library of Medicine. Retrieved 27 May 2020.
  70. HGNC (3 May 2020). "ZNF329 zinc finger protein 329 [ Homo sapiens (human) ]". 8600 Rockville Pike, Bethesda MD, 20894 USA: National Center for Biotechnology Information, U.S. National Library of Medicine. Retrieved 27 May 2020.
  71. HGNC (13 March 2020). "ZNF419 zinc finger protein 419 [ Homo sapiens (human) ]". 8600 Rockville Pike, Bethesda MD, 20894 USA: National Center for Biotechnology Information, U.S. National Library of Medicine. Retrieved 27 May 2020.
  72. HGNC (13 March 2020). "ZNF552 zinc finger protein 552 [ Homo sapiens (human) ]". 8600 Rockville Pike, Bethesda MD, 20894 USA: National Center for Biotechnology Information, U.S. National Library of Medicine. Retrieved 27 May 2020.
  73. HGNC (13 March 2020). "ZNF671 zinc finger protein 671 [ Homo sapiens (human) ]". 8600 Rockville Pike, Bethesda MD, 20894 USA: National Center for Biotechnology Information, U.S. National Library of Medicine. Retrieved 27 May 2020.
  74. HGNC (3 May 2020). "ZBTB45 zinc finger and BTB domain containing 45 [ Homo sapiens (human) ]". 8600 Rockville Pike, Bethesda MD, 20894 USA: National Center for Biotechnology Information, U.S. National Library of Medicine. Retrieved 27 May 2020.
  75. HGNC (13 March 2020). "ZNF587 zinc finger protein 587 [ Homo sapiens (human) ]". 8600 Rockville Pike, Bethesda MD, 20894 USA: National Center for Biotechnology Information, U.S. National Library of Medicine. Retrieved 27 May 2020.
  76. HGNC (13 March 2020). "ZNF551 zinc finger protein 551 [ Homo sapiens (human) ]". 8600 Rockville Pike, Bethesda MD, 20894 USA: National Center for Biotechnology Information, U.S. National Library of Medicine. Retrieved 27 May 2020.
  77. HGNC (3 May 2020). "ZNF835 zinc finger protein 835 [ Homo sapiens (human) ]". 8600 Rockville Pike, Bethesda MD, 20894 USA: National Center for Biotechnology Information, U.S. National Library of Medicine. Retrieved 27 May 2020.
  78. HGNC (13 March 2020). "ZNF837 zinc finger protein 837 [ Homo sapiens (human) ]". 8600 Rockville Pike, Bethesda MD, 20894 USA: National Center for Biotechnology Information, U.S. National Library of Medicine. Retrieved 27 May 2020.
  79. HGNC (13 March 2020). "ZNF543 zinc finger protein 543 [ Homo sapiens (human) ]". 8600 Rockville Pike, Bethesda MD, 20894 USA: National Center for Biotechnology Information, U.S. National Library of Medicine. Retrieved 27 May 2020.
  80. HGNC (13 March 2020). "SMIM17 small integral membrane protein 17 [ Homo sapiens (human) ]". 8600 Rockville Pike, Bethesda MD, 20894 USA: National Center for Biotechnology Information, U.S. National Library of Medicine. Retrieved 27 May 2020.
  81. HGNC (13 March 2020). "C19orf18 chromosome 19 open reading frame 18 [ Homo sapiens (human) ]". 8600 Rockville Pike, Bethesda MD, 20894 USA: National Center for Biotechnology Information, U.S. National Library of Medicine. Retrieved 27 May 2020.
  82. HGNC (20 April 2020). "ZNF418 zinc finger protein 418 [ Homo sapiens (human) ]". 8600 Rockville Pike, Bethesda MD, 20894 USA: National Center for Biotechnology Information, U.S. National Library of Medicine. Retrieved 28 May 2020.
  83. HGNC (13 March 2020). "ZNF417 zinc finger protein 417 [ Homo sapiens (human) ]". 8600 Rockville Pike, Bethesda MD, 20894 USA: National Center for Biotechnology Information, U.S. National Library of Medicine. Retrieved 28 May 2020.
  84. HGNC (13 March 2020). "ZNF548 zinc finger protein 548 [ Homo sapiens (human) ]". 8600 Rockville Pike, Bethesda MD, 20894 USA: National Center for Biotechnology Information, U.S. National Library of Medicine. Retrieved 28 May 2020.
  85. HGNC (13 March 2020). "ZNF542P zinc finger protein 542, pseudogene [ Homo sapiens (human) ]". 8600 Rockville Pike, Bethesda MD, 20894 USA: National Center for Biotechnology Information, U.S. National Library of Medicine. Retrieved 28 May 2020.
  86. PA Johnson, D Bunick, NB Hecht (1991). "Protein Binding Regions in the Mouse and Rat Protamine-2 Genes" (PDF). Biology of Reproduction. 44 (1): 127–134. Retrieved 6 April 2019.
  87. Amber Paratore Sanchez and Kumar Sharma (July 2009). "Transcription factors in the pathogenesis of diabetic nephropathy". Expert Reviews in Molecular Medicine. 11: e13. doi:10.1017/S1462399409001057. Retrieved 1 October 2018.
  88. Zi-Wei Ye, Jie Xu, Jianxin Shi, Dabing Zhang and Mee-Len Chye (January 2017). "Kelch-motif containing acyl-CoA binding proteins AtACBP4 and AtACBP5 are differentially expressed and function in floral lipid metabolism" (PDF). Plant Molecular Biology. 93: 209–225. doi:10.1007/s11103-016-0557-5. Retrieved 7 May 2020.
  89. Anna Kalousová, Vladimír Beneš, Jan Pačes, Václav Pačes and Zbyněk Kozmik (June 1999). "DNA Binding and Transactivating Properties of the Paired and Homeobox Protein Pax4". Biochemical and Biophysical Research Communications. 259 (3): 510–518. Retrieved 6 May 2020.
  90. G. Damante, D. Fabbro, L. Pelizari, D. Civitareale, S. Guazzi, M. Polycarpou-Schwartz, S. Cauci, F. Quadrifoglio, S. Formisano and R. Di Lauro (20 June 1994). "Sequence-specific DNA recognition by the thyroid transcription factor-1 homeodomain" (PDF). Nucleic Acids Research. 22 (15): 3075–83. Retrieved 6 May 2020.

External links
