Initiator element gene transcriptions

Revision as of 06:55, 16 April 2021

Editor-In-Chief: Henry A. Hoff

In the biosynthesis of any human protein, the gene that contains the nucleotide sequence which is translated into that protein must be transcribed. For RNA polymerase II holoenzyme to transcribe the gene, the gene's promoter must be located. After the promoter is located, the transcription start site (TSS) is pinpointed by using nucleotide sequences that include the TSS. Within the promoter, most human genes lack a TATA box and have an initiator element (Inr) or downstream promoter element instead.

On the basis of the descriptions available, various Inrs are located to test whether any of them contains the known TSS.

Notations

Notation: let the symbol Inr denote an initiator element.

Notation: let the symbol +1 designate the nucleotide that is the transcription start site (TSS).

Genetics

Inr in humans was first explained and sequenced in 1989.[1]

The Inr element for core promoters was found to be more prevalent than the TATA box in eukaryotic promoter domains.[2] In a study of 1800+ distinct human promoter sequences it was found that 49% contain the Inr element while 21.8% contain the TATA box.[2]

Gene transcriptions

Two subunits of TFIID, TAF1 and TAF2, recognize the Inr sequence and bring the complex together.[3]

The interaction between TFIID and the Inr is believed to be the most important one in initiating transcription because the Inr sequence overlaps the start site.[4]

The Inr element is also believed to interact with the activator specificity protein 1 transcription factor (Sp1), which is then able to regulate the activation and initiation of transcription.[5]

Promoters with a functional Inr are more likely to lack a TATA box or to possess a degenerate TATA sequence because a gene with an active Inr is less dependent on a functional TATA box or additional promoters.[6] Although the Inr element varies between promoters, the sequence is highly conserved between humans and yeast.[6] An analysis of 7670 transcription start sites showed that roughly 40% had an exact match to the BBCA+1BW Inr sequence, while 16% contained only one mismatch.[7] TFIID and its subunits are very sensitive to the Inr sequence, and nucleotide changes have been shown to drastically change the binding affinity; the +1 and -3 positions have been identified as the most critical for transcription efficiency and Inr function.[6] A replacement of the adenosine (A) nucleotide at the +1 position with G or T changes transcription activity by 10%, and a replacement of thymine (T) at the +3 position changes transcription activity levels by 22%.[8]

Theoretical initiator elements

Here's a theoretical definition:

Def. a series of nucleotides including a transcription start site on one DNA strand whose presence in a gene promoter eventually leads to a chain reaction or polymerization such as transcription is called an initiator element.

The consensus sequence for an Inr-like/TCT element is 5'-TTCTCT-3'.[9]

RNA polymerase IIs

"RNA pol II itself recognizes features of the Inr which might assist the correct positioning of the polymerase on the promoter (Carcamo et al., 1991; Weis and Reinberg, 1997)."[10][11][12]

RNA polymerase II may form a stable complex on TATA-less promoters that contain Inr elements and possess a weak, intrinsic preference for Inr-like sequences.[11]

RNA polymerase II holoenzyme complexes

Gene ID: 672 is BRCA1 BRCA1, DNA repair associated. "This gene encodes a nuclear phosphoprotein that plays a role in maintaining genomic stability, and it also acts as a tumor suppressor. The encoded protein combines with other tumor suppressors, DNA damage sensors, and signal transducers to form a large multi-subunit protein complex known as the BRCA1-associated genome surveillance complex (BASC). This gene product associates with RNA polymerase II, and through the C-terminal domain, also interacts with histone deacetylase complexes. This protein thus plays a role in transcription, DNA repair of double-stranded breaks, and recombination. Mutations in this gene are responsible for approximately 40% of inherited breast cancers and more than 80% of inherited breast and ovarian cancers. Alternative splicing plays a role in modulating the subcellular localization and physiological function of this gene. Many alternatively spliced transcript variants, some of which are disease-associated mutations, have been described for this gene, but the full-length natures of only some of these variants has been described. A related pseudogene, which is also located on chromosome 17, has been identified."[13]

Gene ID: 1660 is DHX9 DExH-box helicase 9 (aka LKP; RHA; DDX9; NDH2; NDHII). "This gene encodes a member of the DEAH-containing family of RNA helicases. The encoded protein is an enzyme that catalyzes the ATP-dependent unwinding of double-stranded RNA and DNA-RNA complexes. This protein localizes to both the nucleus and the cytoplasm and functions as a transcriptional regulator. This protein may also be involved in the expression and nuclear export of retroviral RNAs. Alternate splicing results in multiple transcript variants. Pseudogenes of this gene are found on chromosomes 11 and 13."[14]

BRCA1 has been shown to interact with DHX9; i.e., overexpression of a protein fragment of RNA helicase A causes inhibition of endogenous BRCA1 function and defects in ploidy and cytokinesis in mammary epithelial cells[15] and the BRCA1 protein is linked to the RNA polymerase II holoenzyme complex via RNA helicase A.[16]

ATP-dependent RNA helicase A (RHA; also known as DHX9, LKP, and NDHI) is an enzyme that in humans is encoded by the DHX9 gene.[17][18][14]

RNA polymerase II subunit A C-terminal domain phosphatase is an enzyme that in humans is encoded by the CTDP1 gene.[19][20][21]

Gene ID: 9150 is CTDP1 CTD phosphatase subunit 1. "This gene encodes a protein which interacts with the carboxy-terminus of the RAP74 subunit of transcription initiation factor TFIIF, and functions as a phosphatase that processively dephosphorylates the C-terminus of POLR2A (a subunit of RNA polymerase II), making it available for initiation of gene expression. Mutations in this gene are associated with congenital cataracts, facial dysmorphism and neuropathy syndrome (CCFDN). Alternatively spliced transcript variants encoding different isoforms have been described for this gene."[22]

"This gene encodes a protein which interacts with the carboxy-terminus of transcription initiation factor TFIIF, a transcription factor which regulates elongation as well as initiation by RNA polymerase II. The protein may also represent a component of an RNA polymerase II holoenzyme complex. Alternative splicing of this gene results in two transcript variants encoding 2 different isoforms."[21]

CTDP1 has been shown to interact with WD repeat-containing protein 77,[23] GTF2F1[20] and POLR2A.[24]

Gene ID: 168400 is DDX53 DEAD-box helicase 53. "This intronless gene encodes a protein which contains several domains found in members of the DEAD-box helicase protein family. Other members of this protein family participate in ATP-dependent RNA unwinding."[25]

"DEAD/DEAH box helicases are proteins, and are putative RNA helicases. They are implicated in a number of cellular processes involving alteration of RNA secondary structure such as translation initiation, nuclear and mitochondrial splicing, and ribosome and spliceosome assembly. Based on their distribution patterns, some members of this family are believed to be involved in embryogenesis, spermatogenesis, and cellular growth and division. This gene encodes a DEAD box protein with RNA helicase activity. It may participate in melting of DNA:RNA hybrids, such as those that occur during transcription, and may play a role in X-linked gene expression. It contains 2 copies of a double-stranded RNA-binding domain, a DEXH core domain and an RGG box. The RNA-binding domains and RGG box influence and regulate RNA helicase activity."[25]

Consensus sequences

As in other metazoans, for genes lacking a TATA box, the Inr is functionally analogous, with a base pair (bp) consensus 5'-YYA+1NWYY-3', to direct transcription initiation.[26] Using the degenerate nucleotide code, the consensus sequence is 5'-C/T-C/T-A-A/C/G/T-A/T-C/T-C/T-3', or in the direction of transcription on the template strand: 3'-C/T-C/T-A-A/C/G/T-A/T-C/T-C/T-5'.
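The expansion of the degenerate code described above can be sketched by translating each IUPAC letter into a character class, which gives a searchable pattern. This is only an illustration, assuming the standard IUPAC codes. The function names, the toy sequence, and the choice to report 1-based start positions are assumptions and are not part of the cited work.

```python
import re

# Standard IUPAC codes needed for the Inr consensus.
IUPAC = {"A": "A", "C": "C", "G": "G", "T": "T",
         "Y": "CT", "R": "AG", "W": "AT", "N": "ACGT"}


def consensus_to_regex(consensus):
    """Turn a degenerate consensus such as YYANWYY into a regex string."""
    parts = []
    for code in consensus:
        bases = IUPAC[code]
        parts.append(bases if len(bases) == 1 else "[" + bases + "]")
    return "".join(parts)


def find_sites(sequence, consensus):
    """Return (1-based start, matched text) for every match, overlaps included."""
    # A zero-width lookahead lets overlapping matches be reported.
    pattern = re.compile("(?=(" + consensus_to_regex(consensus) + "))")
    return [(m.start() + 1, m.group(1))
            for m in pattern.finditer(sequence.upper())]


inr_regex = consensus_to_regex("YYANWYY")
toy_sequence = "GGTCAGTCCAA"          # hypothetical sequence, not real data
sites = find_sites(toy_sequence, "YYANWYY")
```

Here `inr_regex` spells out the same alternatives as the expanded consensus in the text, one character class per position.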

"TATA-less core promoters that lack AT-rich sequences in the -30 region and do not stably bind TBP are likely to assemble PICs via alternative pathways and to be regulated by distinct mechanisms (Smale and Kadonaga, 2003). However, the number of such bona fide TATA-less genes remains unclear in eukaryotic genomes."[27]

In Entamoeba histolytica, the consensus sequence is AAAAATTCA.[28]

The Inr has the consensus sequence YYANWYY.[29] Like the TATA box, the Inr element facilitates the binding of transcription factor II D (TFIID; TATA binding protein plus TAFs).[29]

Enhancers

An Inr for mammalian RNA polymerase II can be defined as a DNA sequence element that overlaps a TSS and is sufficient for

  1. determining the start site location in a promoter that lacks a TATA box and
  2. enhancing the strength of a promoter that contains a TATA box.[30]

TATA binding protein associated factors

"Although any isolated TAF may not exhibit sequence-specific interactions at the Inr element in the absence of a TATA-box, a combination of TAFs may bind sequence specifically to the Inr element regardless of the TATA-box and/or DPE (Chalkley and Verrijzer, 1999)."[31] Bold added.

TAF1 "binds to core promoter sequences encompassing the transcription start site. It also binds to activators and other transcriptional regulators, and these interactions affect the rate of transcription initiation."[32]

Prior to transcription, stable binding to an Inr occurs by a complex consisting of TAF1 and TAF2.[10]

TATA box-likes

The Inr is the only element in metazoan protein-encoding genes known to be a functional analog of the TATA box, in that it is sufficient for directing accurate transcription initiation in genes that lack TATA boxes.[33]

General transcription factor II As

General transcription factor II A is critical for the cooperative binding of TFIID to the Inr.[34]

General transcription factor II Ds

The general transcription factor II D (TFIID) is one of several general transcription factors that make up the RNA polymerase II preinitiation complex.[35] Before the start of transcription, the TFIID complex binds to the core promoter of the gene.[35]

TFIID is the first protein to bind to DNA during the formation of the pre-initiation transcription complex of RNA polymerase II (RNA Pol II).[35]

General transcription factor II Is

General transcription factor II I, or TFII-I, is a factor capable of binding the Inr element.[36][37]

Transcription start sites

Usually the Inr contains the TSS.

"[T]he initiator (INR) element located at, or immediately adjacent to, the TSS, ... is recognized by the TBP-associated factors TAF1 and TAF2 of the TFIID complex".[27]

"[T]ranscription does not need to begin at the +1 nucleotide for the Inr to function. RNA polymerase II has been redirected to alternative start sites by reducing ATP concentrations within a nuclear extract, by altering the spacing between the TATA and Inr in a promoter containing both elements, and by dinucleotide initiation strategies".[38]

Hypotheses

  1. A1BG is not transcribed by an initiator element.
  2. A1BG is not transcribed by a TATA box.

Samplings

YYRNWYY

The wider consensus sequence of 3'-YYRNWYY-5' allows a G at the TSS but allows at most two Gs in a row.[39]
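The statement about G can be checked by enumerating every sequence that the wider consensus permits. The sketch below assumes the standard IUPAC codes and takes the third position (R) as the one aligned with the TSS, by analogy with the A+1 of YYA+1NWYY. The variable and function names are hypothetical.

```python
from itertools import product
import re

# Standard IUPAC codes that appear in YYRNWYY.
IUPAC = {"Y": "CT", "R": "AG", "N": "ACGT", "W": "AT"}

consensus = "YYRNWYY"

# Every concrete 7-mer permitted by the consensus.
all_sites = ["".join(p) for p in product(*(IUPAC[c] for c in consensus))]


def longest_g_run(site):
    """Length of the longest run of consecutive G in `site`."""
    runs = re.findall(r"G+", site)
    return max((len(r) for r in runs), default=0)


n_sites = len(all_sites)
# Sites with G at the third position (assumed here to be the TSS).
g_at_tss = sum(site[2] == "G" for site in all_sites)
# Longest G run across all permitted sites.
max_g_run = max(longest_g_run(site) for site in all_sites)
```

Under these assumptions G can occur only at the R and N positions, which are adjacent, so no permitted site contains more than two consecutive Gs.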

The Basic programs (starting with SuccessablesInr.bas) were written to compare nucleotide sequences with the sequences on either the template strand (-) or the coding strand (+) of the DNA, in the negative direction (-) or the positive direction (+). The programs, the sequences they look for, and the occurrences found are:

  1. Negative strand, negative direction: 121, TTACTCC at 4557, TCACACT at 4361, TCGGACC at 4349, CCAGTTT at 4309, TCGGACC at 4300, CTGCACC at 4238, TCGGTCT at 4233, TCACTCT at 4202, TCGAACC at 4188, CCGGTCC at 4170, CCGTACC at 4107, CCGGTCC at 4102, TTACACT at 4092, TCACTCT at 4051, TTGTATC at 4046, TCGGACC at 4037, CCGGTCC at 3951, CTACTTT at 3922, TCATTCT at 3893, CTGGTCC at 3871, CTACACC at 3810, CTGTTCT at 3759, TTGGTCT at 3486, TTGATCT at 3463, CCGTATC at 3446, CCGAACT at 3401, TCGTTCT at 3374, TTGTTCT at 3340, TCGTTTT at 3313, TTGTTCT at 3307, TCGGACC at 3298, TCGGTTC at 3273, CCACACC at 3186, TTGTATT at 3169, CCACTTT at 3146, TTGTTCC at 3141, TCGGACC at 3128, CCGCACC at 3047, TTGATTC at 3031, CCGATTT at 3009, TTGATTC at 2914, TCGTACT at 2784, TCGGACC at 2770, TTGGACC at 2720, TCACACC at 2658, CCACTTT at 2619, TTGTACC at 2614, TCACACC at 2605, CCAGTCC at 2587, CCGGTCC at 2519, TCATTCT at 2503, TTGTTTT at 2490, TCGTTTT at 2476, TCACTCT at 2449, TCGGACC at 2435, TTGGACC at 2385, CCACTTT at 2282, TCGTACC at 2277, TCGGACC at 2268, TCAAACT at 2257, CCAGTCC at 2250, CCGCTTT at 2157, TTGTACC at 2152, TCAAACT at 2141, TCACATT at 2087, CCGGTCC at 2077, TTACACC at 2065, TCGTTCT at 2023, TCGGACC at 2009, TTGGACC at 1959, CCGTACT at 1953, CCGCACC at 1897, TTATACC at 1742, TTAATTT at 1697, TTGGATT at 1591, TTACTTT at 1582, CCGTTTT at 1561, TTGCTTC at 1555, CCACACT at 1479, TTGTTTT at 1394, TCGTTTT at 1371, TTATTCT at 1365, TCAGACC at 1356, TTGGATC at 1306, CCGCACC at 1244, CCACTTT at 1212, TTGTACC at 1207, TCGGACC at 1198, TCACTCT at 1079, TTGGACC at 1015, TTAGTCC at 984, CCGTACC at 953, TCGGTCC at 948, TCGCTCT at 913, TCGGACC at 899, TCGGTTC at 874, CTACACC at 787, TCGCACC at 741, TCGGACT at 732, CCAGTCC at 714, CCGGTTC at 692, CCGGTCC at 648, TTATACC at 605, CCAGTCC at 578, CCGGTTC at 556, TCGGACC at 508, TCACTTT at 473, TTGTATC at 468, TCGGACC at 459, CCAGTCC at 441, CCGGTTC at 419, CTGCTTT at 312, TCACTCT at 301, TTATACT at 274, TTGGTCC at 262, CTACATT at 247, 
CCATATT at 181, CCGTACT at 124, CCGTTTC at 93, CTATACC at 77, TTGTTCC at 71.
  2. Negative strand, positive direction: 45, CTGCACC at 4343, TTAGTTT at 4139, TTGATTT at 4134, TCACTCT at 4128, TCATTTT at 4120, TCACACC at 3824, CTGTTCC at 3625, CCAGACC at 3550, TCACACT at 3507, TTGCATC at 3402, CTGTTCC at 3352, TTGCACT at 3343, CCGCATC at 3328, CTGCACC at 3322, CTGCTCC at 3309, CTGGTCT at 3299, TCGCTCT at 3276, CTGGTCT at 3245, CCAGTCC at 3084, CCAGTCC at 2998, CTGCTCC at 2978, TCAGATT at 2868, CCACACT at 2636, CCACACC at 2602, TTATACC at 2590, CCGCACC at 2566, CTAATTT at 2440, CTACACC at 2430, TCACTCT at 2306, CTGTTTC at 2263, TCAATCT at 2235, CCAGATC at 2230, CTGCATT at 2206, TCATATT at 2178, TCGCTTC at 2095, CCAGTCC at 2026, CTATTTC at 1978, CCACTTC at 1914, CCAGACT at 1744, CTGCACT at 1472, CTGCACT at 1372, CCGGACT at 746, CCACACT at 345, CTGTTTT at 147, TTGTATT at 115.
  3. Positive strand, negative direction: 40, TTAATTC at 4542, TCACATT at 4533, CCACTTT at 4461, CCACTCC at 4425, CCAGTTC at 4417, CTGCACT at 4340, CCGGACT at 4327, TCACACC at 3967, CCATACC at 3858, CTGAACC at 3784, CTGGACT at 3747, CCATTTC at 3688, CTGCTCC at 3582, CCAGATC at 3488, TTGCACT at 3289, TTGAACC at 3245, CTGCACC at 2761, TTGAACC at 2717, TTGAATC at 2708, CTGCACT at 2426, TCACACC at 2418, TTGAACC at 2382, CTACTCC at 2352, CTGCACT at 2000, TTATTTT at 1727, CTATATC at 1528, TCGCTCT at 1450, CCATTTC at 1380, CCAGTCT at 1354, TTGCACT at 1347, TTGCACC at 1339, TTGAACC at 1303, TCACACC at 1128, TCACTCC at 1058, TTGAACC at 1012, TCACACC at 882, TTGAACC at 846, CTGCATT at 152, TTGGACC at 32, CTGAATT at 20.
  4. Positive strand, positive direction: 75, CCAGACC at 4416, CCACTCC at 4401, CTAAATC at 4136, CTACTCC at 4102, TTACTCC at 4096, CCACACT at 3971, TCACACC at 3966, TCAGACT at 3924, TCACTCC at 3878, CTGGACC at 3787, CCGGACC at 3758, CCGGACC at 3679, CCACTCC at 3647, TCACACT at 3594, CTGGTCT at 3548, TCGATCC at 3522, CCGATCC at 3484, CTACTCC at 3478, TCGGTCT at 3221, CTGGTTT at 3175, TTATACC at 3162, CCAGACC at 3021, CCGGACC at 2988, CCAGACT at 2943, CTGGTCC at 2876, CTAAACT at 2871, TTGCTCC at 2806, TCGATTC at 2789, TCGTTTT at 2707, TCAATCC at 2668, CTATATT at 2662, TCAGTCC at 2620, TCAGTTC at 2615, TCAGTCT at 2609, CCGGTCC at 2574, CCGCACT at 2555, TTGGTCT at 2228, CCAGTCT at 2222, CCGTTCT at 2190, CTACTTT at 2146, TTGTACT at 2141, TCAATTT at 2136, CCACACC at 1971, CCGTTCT at 1948, CCGCTCT at 1921, CCACACC at 1805, CCGCACT at 1720, CCGCTCT at 1565, TCGTTCC at 1511, CCGCTCT at 1481, CCGTTCC at 1427, CCGCTCT at 1381, CCGTTCC at 1327, CCGTTCC at 1259, CCGCTCT at 1229, CCGGTCC at 1175, TCGCTCT at 1061, CCGTTCC at 1007, TTGGACC at 947, TCGGTCT at 935, CCGTTCC at 923, TTGGACC at 847, TCGGTCT at 835, CCGTTCC at 823, CCGGACT at 725, CCGTTCC at 671, CCGCTCT at 641, CCGTTCC at 587, CCGCTCT at 557, TCGGTCC at 515, CCGTTCC at 503, CCGGACC at 286, TTACACT at 230, CCGGTCC at 215, CTGGACC at 40.
  5. complement, negative strand, negative direction is SuccessablesInrc--.bas, looking for 3'-A/G-A/G-C/T-A/C/G/T-A/T-A/G-A/G-5', 40, 3'-GACTTAA-5', 20, 3'-AACCTGG-5', 32, 3'-GACGTAA-5', 152, 3'-AACTTGG-5', 846, 3'-AGTGTGG-5', 882, 3'-AACTTGG-5', 1012, 3'-AGTGAGG-5', 1058, 3'-AGTGTGG-5', 1128, 3'-AACTTGG-5', 1303, 3'-AACGTGG-5', 1339, 3'-AACGTGA-5', 1347, 3'-GGTCAGA-5', 1354, 3'-GGTAAAG-5', 1380, 3'-AGCGAGA-5', 1450, 3'-GATATAG-5', 1528, 3'-AATAAAA-5', 1727, 3'-GACGTGA-5', 2000, 3'-GATGAGG-5', 2352, 3'-AACTTGG-5', 2382, 3'-AGTGTGG-5', 2418, 3'-GACGTGA-5', 2426, 3'-AACTTAG-5', 2708, 3'-AACTTGG-5', 2717, 3'-GACGTGG-5', 2761, 3'-AACTTGG-5', 3245, 3'-AACGTGA-5', 3289, 3'-GGTCTAG-5', 3488, 3'-GACGAGG-5', 3582, 3'-GGTAAAG-5', 3688, 3'-GACCTGA-5', 3747, 3'-GACTTGG-5', 3784, 3'-GGTATGG-5', 3858, 3'-AGTGTGG-5', 3967, 3'-GGCCTGA-5', 4327, 3'-GACGTGA-5', 4340, 3'-GGTCAAG-5', 4417, 3'-GGTGAGG-5', 4425, 3'-GGTGAAA-5', 4461, 3'-AGTGTAA-5', 4533, 3'-AATTAAG-5', 4542,
  6. complement, negative strand, positive direction is SuccessablesInrc-+.bas, looking for 3'-A/G-A/G-C/T-A/C/G/T-A/T-A/G-A/G-5', 75, 5'-GACCTGG-3' at 40, 5'-GGCCAGG-3' at 215, 5'-AATGTGA-3' at 230, 5'-GGCCTGG-3' at 286, 5'-GGCAAGG-3' at 503, 5'-AGCCAGG-3' at 515, 5'-GGCGAGA-3' at 557, 5'-GGCAAGG-3' at 587, 5'-GGCGAGA-3' at 641, 5'-GGCAAGG-3' at 671, 5'-GGCCTGA-3' at 725, 5'-GGCAAGG-3' at 823, 5'-AGCCAGA-3' at 835, 5'-AACCTGG-3' at 847, 5'-GGCAAGG-3' at 923, 5'-AGCCAGA-3' at 935, 5'-AACCTGG-3' at 947, 5'-GGCAAGG-3' at 1007, 5'-AGCGAGA-3' at 1061, 5'-GGCCAGG-3' at 1175, 5'-GGCGAGA-3' at 1229, 5'-GGCAAGG-3' at 1259, 5'-GGCAAGG-3' at 1327, 5'-GGCGAGA-3' at 1381, 5'-GGCAAGG-3' at 1427, 5'-GGCGAGA-3' at 1481, 5'-AGCAAGG-3' at 1511, 5'-GGCGAGA-3' at 1565, 5'-GGCGTGA-3' at 1720, 5'-GGTGTGG-3' at 1805, 5'-GGCGAGA-3' at 1921, 5'-GGCAAGA-3' at 1948, 5'-GGTGTGG-3' at 1971, 5'-AGTTAAA-3' at 2136, 5'-AACATGA-3' at 2141, 5'-GATGAAA-3' at 2146, 5'-GGCAAGA-3' at 2190, 5'-GGTCAGA-3' at 2222, 5'-AACCAGA-3' at 2228, 5'-GGCGTGA-3' at 2555, 5'-GGCCAGG-3' at 2574, 5'-AGTCAGA-3' at 2609, 5'-AGTCAAG-3' at 2615, 5'-AGTCAGG-3' at 2620, 5'-GATATAA-3' at 2662, 5'-AGTTAGG-3' at 2668, 5'-AGCAAAA-3' at 2707, 5'-AGCTAAG-3' at 2789, 5'-AACGAGG-3' at 2806, 5'-GATTTGA-3' at 2871, 5'-GACCAGG-3' at 2876, 5'-GGTCTGA-3' at 2943, 5'-GGCCTGG-3' at 2988, 5'-GGTCTGG-3' at 3021, 5'-AATATGG-3' at 3162, 5'-GACCAAA-3' at 3175, 5'-AGCCAGA-3' at 3221, 5'-GATGAGG-3' at 3478, 5'-GGCTAGG-3' at 3484, 5'-AGCTAGG-3' at 3522, 5'-GACCAGA-3' at 3548, 5'-AGTGTGA-3' at 3594, 5'-GGTGAGG-3' at 3647, 5'-GGCCTGG-3' at 3679, 5'-GGCCTGG-3' at 3758, 5'-GACCTGG-3' at 3787, 5'-AGTGAGG-3' at 3878, 5'-AGTCTGA-3' at 3924, 5'-AGTGTGG-3' at 3966, 5'-GGTGTGA-3' at 3971, 5'-AATGAGG-3' at 4096, 5'-GATGAGG-3' at 4102, 5'-GATTTAG-3' at 4136, 5'-GGTGAGG-3' at 4401, 5'-GGTCTGG-3' at 4416.
  7. complement, positive strand, negative direction is SuccessablesInrc+-.bas, looking for 3'-A/G-A/G-C/T-A/C/G/T-A/T-A/G-A/G-5', 121, 3'-AACAAGG-5', 71, 3'-GATATGG-5', 77, 3'-GGCAAAG-5', 93, 3'-GGCATGA-5', 124, 3'-GGTATAA-5', 181, 3'-GATGTAA-5', 247, 3'-AACCAGG-5', 262, 3'-AATATGA-5', 274, 3'-AGTGAGA-5', 301, 3'-GACGAAA-5', 312, 3'-GGCCAAG-5', 419, 3'-GGTCAGG-5', 441, 3'-AGCCTGG-5', 459, 3'-AACATAG-5', 468, 3'-AGTGAAA-5', 473, 3'-AGCCTGG-5', 508, 3'-GGCCAAG-5', 556, 3'-GGTCAGG-5', 578, 3'-AATATGG-5', 605, 3'-GGCCAGG-5', 648, 3'-GGCCAAG-5', 692, 3'-GGTCAGG-5', 714, 3'-AGCCTGA-5', 732, 3'-AGCGTGG-5', 741, 3'-GATGTGG-5', 787, 3'-AGCCAAG-5', 874, 3'-AGCCTGG-5', 899, 3'-AGCGAGA-5', 913, 3'-AGCCAGG-5', 948, 3'-GGCATGG-5', 953, 3'-AATCAGG-5', 984, 3'-AACCTGG-5', 1015, 3'-AGTGAGA-5', 1079, 3'-AGCCTGG-5', 1198, 3'-AACATGG-5', 1207, 3'-GGTGAAA-5', 1212, 3'-GGCGTGG-5', 1244, 3'-AACCTAG-5', 1306, 3'-AGTCTGG-5', 1356, 3'-AATAAGA-5', 1365, 3'-AGCAAAA-5', 1371, 3'-AACAAAA-5', 1394, 3'-GGTGTGA-5', 1479, 3'-AACGAAG-5', 1555, 3'-GGCAAAA-5', 1561, 3'-AATGAAA-5', 1582, 3'-AACCTAA-5', 1591, 3'-AATTAAA-5', 1697, 3'-AATATGG-5', 1742, 3'-GGCGTGG-5', 1897, 3'-GGCATGA-5', 1953, 3'-AACCTGG-5', 1959, 3'-AGCCTGG-5', 2009, 3'-AGCAAGA-5', 2023, 3'-AATGTGG-5', 2065, 3'-GGCCAGG-5', 2077, 3'-AGTGTAA-5', 2087, 3'-AGTTTGA-5', 2141, 3'-AACATGG-5', 2152, 3'-GGCGAAA-5', 2157, 3'-GGTCAGG-5', 2250, 3'-AGTTTGA-5', 2257, 3'-AGCCTGG-5', 2268, 3'-AGCATGG-5', 2277, 3'-GGTGAAA-5', 2282, 3'-AACCTGG-5', 2385, 3'-AGCCTGG-5', 2435, 3'-AGTGAGA-5', 2449, 3'-AGCAAAA-5', 2476, 3'-AACAAAA-5', 2490, 3'-AGTAAGA-5', 2503, 3'-GGCCAGG-5', 2519, 3'-GGTCAGG-5', 2587, 3'-AGTGTGG-5', 2605, 3'-AACATGG-5', 2614, 3'-GGTGAAA-5', 2619, 3'-AGTGTGG-5', 2658, 3'-AACCTGG-5', 2720, 3'-AGCCTGG-5', 2770, 3'-AGCATGA-5', 2784, 3'-AACTAAG-5', 2914, 3'-GGCTAAA-5', 3009, 3'-AACTAAG-5', 3031, 3'-GGCGTGG-5', 3047, 3'-AGCCTGG-5', 3128, 3'-AACAAGG-5', 3141, 3'-GGTGAAA-5', 3146, 3'-AACATAA-5', 3169, 3'-GGTGTGG-5', 3186, 3'-AGCCAAG-5', 3273, 
3'-AGCCTGG-5', 3298, 3'-AACAAGA-5', 3307, 3'-AGCAAAA-5', 3313, 3'-AACAAGA-5', 3340, 3'-AGCAAGA-5', 3374, 3'-GGCTTGA-5', 3401, 3'-GGCATAG-5', 3446, 3'-AACTAGA-5', 3463, 3'-AACCAGA-5', 3486, 3'-GACAAGA-5', 3759, 3'-GATGTGG-5', 3810, 3'-GACCAGG-5', 3871, 3'-AGTAAGA-5', 3893, 3'-GATGAAA-5', 3922, 3'-GGCCAGG-5', 3951, 3'-AGCCTGG-5', 4037, 3'-AACATAG-5', 4046, 3'-AGTGAGA-5', 4051, 3'-AATGTGA-5', 4092, 3'-GGCCAGG-5', 4102, 3'-GGCATGG-5', 4107, 3'-GGCCAGG-5', 4170, 3'-AGCTTGG-5', 4188, 3'-AGTGAGA-5', 4202, 3'-AGCCAGA-5', 4233, 3'-GACGTGG-5', 4238, 3'-AGCCTGG-5', 4300, 3'-GGTCAAA-5', 4309, 3'-AGCCTGG-5', 4349, 3'-AGTGTGA-5', 4361, 3'-AATGAGG-5', 4557,
  8. complement, positive strand, positive direction is SuccessablesInrc++.bas, looking for 3'-A/G-A/G-C/T-A/C/G/T-A/T-A/G-A/G-5', 45, 5'-AACATAA-3' at 115, 5'-GACAAAA-3' at 147, 5'-GGTGTGA-3' at 345, 5'-GGCCTGA-3' at 746, 5'-GACGTGA-3' at 1372, 5'-GACGTGA-3' at 1472, 5'-GGTCTGA-3' at 1744, 5'-GGTGAAG-3' at 1914, 5'-GATAAAG-3' at 1978, 5'-GGTCAGG-3' at 2026, 5'-AGCGAAG-3' at 2095, 5'-AGTATAA-3' at 2178, 5'-GACGTAA-3' at 2206, 5'-GGTCTAG-3' at 2230, 5'-AGTTAGA-3' at 2235, 5'-GACAAAG-3' at 2263, 5'-AGTGAGA-3' at 2306, 5'-GATGTGG-3' at 2430, 5'-GATTAAA-3' at 2440, 5'-GGCGTGG-3' at 2566, 5'-AATATGG-3' at 2590, 5'-GGTGTGG-3' at 2602, 5'-GGTGTGA-3' at 2636, 5'-AGTCTAA-3' at 2868, 5'-GACGAGG-3' at 2978, 5'-GGTCAGG-3' at 2998, 5'-GGTCAGG-3' at 3084, 5'-GACCAGA-3' at 3245, 5'-AGCGAGA-3' at 3276, 5'-GACCAGA-3' at 3299, 5'-GACGAGG-3' at 3309, 5'-GACGTGG-3' at 3322, 5'-GGCGTAG-3' at 3328, 5'-AACGTGA-3' at 3343, 5'-GACAAGG-3' at 3352, 5'-AACGTAG-3' at 3402, 5'-AGTGTGA-3' at 3507, 5'-GGTCTGG-3' at 3550, 5'-GACAAGG-3' at 3625, 5'-AGTGTGG-3' at 3824, 5'-AGTAAAA-3' at 4120, 5'-AGTGAGA-3' at 4128, 5'-AACTAAA-3' at 4134, 5'-AATCAAA-3' at 4139, 5'-GACGTGG-3' at 4343.
  9. inverse complement, negative strand, negative direction: 32, AGTGTAA at 4533, GGTCCGA at 4255, AGTACGG at 4118, AGTGTGG at 3967, GGTCCGG at 3873, GGTATGG at 3858, GGTCTAG at 3488, AGTCCGA at 3398, AGTGCGG at 3281, GGACCGG at 3130, GATTCGA at 3033, AAAGTAG at 2887, AGTACGG at 2753, AGTACGG at 2535, AGTGTGG at 2418, AGTGCGG at 2208, AGTGCGG at 1992, GGACCGA at 1843, AGTGCAG at 1773, AAAATAG at 1730, AGAACGG at 1608, GATATAG at 1528, GGTCCGA at 1462, AGAGCGA at 1448, GGACCGG at 1200, AGTGTGG at 1128, GAAGTGA at 1056, AGTGTGG at 882, GGACTGG at 734, AGTGCGG at 664, GGACCGA at 598, GATACAA at 213.
  10. inverse complement, negative strand, positive direction: 61, GGAACAG at 4445, GGTCTGG at 4416, GGAGTGA at 4350, GATTTAG at 4136, GAAATGA at 4094, AGAACAG at 4069, AGAGTGG at 4040, GGTGTGA at 3971, AGTGTGG at 3966, AGTCTGA at 3924, AGAGTGA at 3876, GAACCAG at 3840, AGAATGA at 3835, AATCCGA at 3799, GAAGCGG at 3670, AGTGTGA at 3594, GGAATGA at 3567, GGACCAG at 3547, AGTGCAG at 3465, GATGCAG at 3460, GGAATGA at 3441, GGACCAA at 3174, GAAATGG at 3168, AATATGG at 3162, GGTCTGG at 3021, GGTCTGA at 2943, GATTTGA at 2871, AGAATGA at 2841, GGTGCAA at 2801, AAAGTGG at 2711, AGAGCAA at 2705, GGACTGA at 2674, GATATAA at 2662, GAAATAG at 2626, GGTGCAA at 2335, AGTGCAG at 2327, AGATCAA at 2232, GAACCAG at 2227, AGTGCAG at 2064, AAAGCAG at 2007, GGTGTGG at 1971, GAACTGG at 1953, GGTGTGG at 1805, AGTGCAG at 1787, GGTGCGG at 1764, GAAGCGG at 1636, AGTGCGG at 1590, AATGCGG at 1422, AATGCGG at 1322, AGTGCGG at 1254, AGTGCGG at 1170, AGTGCGG at 1086, GGTGCAG at 784, AGTGCGG at 666, AGTGCGG at 582, AGTGCGG at 498, GGTGCGG at 489, AGACCGG at 442, GGAGCGA at 429, AATGTGA at 230, AGAGTGG at 53.
  11. inverse complement, positive strand, negative direction: 100, GGAATGA at 4555, AGTCCAA at 4502, AGTGTGA at 4361, AAAATAA at 4221, AGTTCAA at 4177, AATGTGA at 4092, AAAATAA at 4071, AGACCAG at 4032, AGTTCAA at 4026, GGAGTAA at 3891, GGACCAG at 3870, GATGTGG at 3810, AATGCAG at 3772, GGACTGG at 3749, GGAACAG at 3725, AATCCAG at 3681, AAACCAG at 3485, GAACTAG at 3462, GAAGTGA at 3410, AAATTGA at 3358, AAAACAA at 3330, AGAGCAA at 3311, GGTGTGG at 3186, AAATTAG at 3176, AGACCAG at 3123, AAACTAA at 3030, AAAATAA at 3013, AGAATGG at 3004, AAAACAA at 2842, AGTGTGG at 2658, AAATCAG at 2649, AGTGTGG at 2605, AGACCAG at 2600, AAAACAA at 2509, AAAGCAA at 2480, AAAGCAA at 2474, GATTCGG at 2454, AGAGTGA at 2447, AAACTAG at 2313, AATACAA at 2305, AGACCAG at 2263, AGTTTGA at 2257, GGTGCGG at 2197, AAAATGA at 2187, GATACAA at 2180, AGACCAA at 2147, AGTTTGA at 2141, AGTGTAA at 2087, GGTGCAG at 2082, AATGTGG at 2065, AGAGCAA at 2021, AGAATGG at 1948, AGACTGA at 1935, AAATTAG at 1887, AATACAA at 1878, AATATGG at 1742, GAATTAA at 1696, AAAGCGG at 1680, GAAATGA at 1663, GAAACAA at 1585, AATACAG at 1566, AGAACGA at 1553, AGTGCAA at 1536, GGTGTGA at 1479, AGTGCAG at 1471, AAAACAA at 1388, AGAGCAA at 1369, AGTCTGG at 1356, AAATTAG at 1234, AGAGTGA at 1077, AGATTGG at 1045, GATCCAG at 975, AGAGCGA at 911, GATGTGG at 787, AAATTAG at 777, AATACAA at 769, AGACCAG at 727, AGTTCGA at 721, AAATTGG at 643, AATACAA at 635, AATATGG at 605, AGATTGA at 585, AAATTAG at 499, AATACGA at 492, AGTGCGA at 448, GGTGCGG at 380, AAACTGA at 307, AGAACAG at 288, AATATGA at 274, AAACCAG at 261, AGTTCAA at 255, GATGTAA at 247, GAAACAA at 229, GGTATAA at 181, AAAACAG at 167, AAACTGA at 130, GATATGG at 77, AAAACAA at 69, GGACCAG at 34, AGACTGA at 17.
  12. inverse complement, positive strand, positive direction: 75, AGAACGA at 4390, GGTACGA at 4372, AGTACAG at 4366, GGAGTAA at 4309, GGACTGG at 4216, GAAACGG at 4210, AAATCAA at 4138, GAACTAA at 4133, AAAATAG at 4123, GAACTGG at 4018, AGTGTGG at 3824, GGACCGG at 3681, AGAGTGG at 3612, GGTCTGG at 3550, GATCCGA at 3524, AGTGTGA at 3507, GGAACGG at 3375, GGTACAA at 3337, AGAGTGA at 3317, GGACCAG at 3298, AGTGCAG at 3255, GAAGTAG at 3250, GGACCAA at 3049, AGTCCGG at 3036, AGACCAA at 3023, GGTCCAG at 3018, GGAACAG at 3003, GGACCGG at 2990, AGACCGG at 2985, AGACTGA at 2945, GGAGTAA at 2902, AGACCGA at 2885, GGTCCGG at 2878, AAACTGG at 2873, AGTCTAA at 2868, GGTGTGA at 2636, AGTTCAG at 2617, GGTGTGG at 2602, AATATGG at 2590, GGACCGG at 2571, GGTACAA at 2475, AGAGTGG at 2470, GGACCGA at 2435, GATGTGG at 2430, AATCCGA at 2368, GGTCCGA at 2318, AAAGTGA at 2304, AGAGTGG at 2247, GGTCTAG at 2230, GGACTGG at 2213, AGTATAA at 2178, GAAGTAG at 2110, AGAATGG at 1888, GGTCCGG at 1857, GGACCGA at 1817, GGTCTGA at 1744, GGACTGG at 1662, GATGCGA at 1576, AATTCGG at 1541, GAAGCGG at 1408, GAAGCGG at 1308, AAAGCAG at 1183, GGTCCGA at 1177, GGACCGG at 949, GGACCGG at 849, GGTGCGA at 777, GATGCGA at 652, GAAGCGG at 595, AGAATGA at 524, GAAGCGG at 459, GGTGTGA at 345, GGTCCAG at 217, AATCCAG at 152, AGTCCGG at 92, GGTCCGA at 10.
  13. inverse, negative strand, negative direction, is SuccessablesInri--.bas, looking for 3'-C/T-C/T-A/T-A/C/G/T-A/G-C/T-C/T-5', 100, 3'-TCTGACT-5', 17, 3'-CCTGGTC-5', 34, 3'-TTTTGTT-5', 69, 3'-CTATACC-5', 77, 3'-TTTGACT-5', 130, 3'-TTTTGTC-5', 167, 3'-CCATATT-5', 181, 3'-CTTTGTT-5', 229, 3'-CTACATT-5', 247, 3'-TCAAGTT-5', 255, 3'-TTTGGTC-5', 261, 3'-TTATACT-5', 274, 3'-TCTTGTC-5', 288, 3'-TTTGACT-5', 307, 3'-CCACGCC-5', 380, 3'-TCACGCT-5', 448, 3'-TTATGCT-5', 492, 3'-TTTAATC-5', 499, 3'-TCTAACT-5', 585, 3'-TTATACC-5', 605, 3'-TTATGTT-5', 635, 3'-TTTAACC-5', 643, 3'-TCAAGCT-5', 721, 3'-TCTGGTC-5', 727, 3'-TTATGTT-5', 769, 3'-TTTAATC-5', 777, 3'-CTACACC-5', 787, 3'-TCTCGCT-5', 911, 3'-CTAGGTC-5', 975, 3'-TCTAACC-5', 1045, 3'-TCTCACT-5', 1077, 3'-TTTAATC-5', 1234, 3'-TCAGACC-5', 1356, 3'-TCTCGTT-5', 1369, 3'-TTTTGTT-5', 1388, 3'-TCACGTC-5', 1471, 3'-CCACACT-5', 1479, 3'-TCACGTT-5', 1536, 3'-TCTTGCT-5', 1553, 3'-TTATGTC-5', 1566, 3'-CTTTGTT-5', 1585, 3'-CTTTACT-5', 1663, 3'-TTTCGCC-5', 1680, 3'-CTTAATT-5', 1696, 3'-TTATACC-5', 1742, 3'-TTATGTT-5', 1878, 3'-TTTAATC-5', 1887, 3'-TCTGACT-5', 1935, 3'-TCTTACC-5', 1948, 3'-TCTCGTT-5', 2021, 3'-TTACACC-5', 2065, 3'-CCACGTC-5', 2082, 3'-TCACATT-5', 2087, 3'-TCAAACT-5', 2141, 3'-TCTGGTT-5', 2147, 3'-CTATGTT-5', 2180, 3'-TTTTACT-5', 2187, 3'-CCACGCC-5', 2197, 3'-TCAAACT-5', 2257, 3'-TCTGGTC-5', 2263, 3'-TTATGTT-5', 2305, 3'-TTTGATC-5', 2313, 3'-TCTCACT-5', 2447, 3'-CTAAGCC-5', 2454, 3'-TTTCGTT-5', 2474, 3'-TTTCGTT-5', 2480, 3'-TTTTGTT-5', 2509, 3'-TCTGGTC-5', 2600, 3'-TCACACC-5', 2605, 3'-TTTAGTC-5', 2649, 3'-TCACACC-5', 2658, 3'-TTTTGTT-5', 2842, 3'-TCTTACC-5', 3004, 3'-TTTTATT-5', 3013, 3'-TTTGATT-5', 3030, 3'-TCTGGTC-5', 3123, 3'-TTTAATC-5', 3176, 3'-CCACACC-5', 3186, 3'-TCTCGTT-5', 3311, 3'-TTTTGTT-5', 3330, 3'-TTTAACT-5', 3358, 3'-CTTCACT-5', 3410, 3'-CTTGATC-5', 3462, 3'-TTTGGTC-5', 3485, 3'-TTAGGTC-5', 3681, 3'-CCTTGTC-5', 3725, 3'-CCTGACC-5', 3749, 3'-TTACGTC-5', 3772, 3'-CTACACC-5', 3810, 3'-CCTGGTC-5', 3870, 
3'-CCTCATT-5', 3891, 3'-TCAAGTT-5', 4026, 3'-TCTGGTC-5', 4032, 3'-TTTTATT-5', 4071, 3'-TTACACT-5', 4092, 3'-TCAAGTT-5', 4177, 3'-TTTTATT-5', 4221, 3'-TCACACT-5', 4361, 3'-TCAGGTT-5', 4502, 3'-CCTTACT-5', 4555.
  14. inverse, negative strand, positive direction, is SuccessablesInri-+.bas, looking for 3'-C/T-C/T-A/T-A/C/G/T-A/G-C/T-C/T-5', 75, 5'-CCAGGCT-3' at 10, 5'-TCAGGCC-3' at 92, 5'-TTAGGTC-3' at 152, 5'-CCAGGTC-3' at 217, 5'-CCACACT-3' at 345, 5'-CTTCGCC-3' at 459, 5'-TCTTACT-3' at 524, 5'-CTTCGCC-3' at 595, 5'-CTACGCT-3' at 652, 5'-CCACGCT-3' at 777, 5'-CCTGGCC-3' at 849, 5'-CCTGGCC-3' at 949, 5'-CCAGGCT-3' at 1177, 5'-TTTCGTC-3' at 1183, 5'-CTTCGCC-3' at 1308, 5'-CTTCGCC-3' at 1408, 5'-TTAAGCC-3' at 1541, 5'-CTACGCT-3' at 1576, 5'-CCTGACC-3' at 1662, 5'-CCAGACT-3' at 1744, 5'-CCTGGCT-3' at 1817, 5'-CCAGGCC-3' at 1857, 5'-TCTTACC-3' at 1888, 5'-CTTCATC-3' at 2110, 5'-TCATATT-3' at 2178, 5'-CCTGACC-3' at 2213, 5'-CCAGATC-3' at 2230, 5'-TCTCACC-3' at 2247, 5'-TTTCACT-3' at 2304, 5'-CCAGGCT-3' at 2318, 5'-TTAGGCT-3' at 2368, 5'-CTACACC-3' at 2430, 5'-CCTGGCT-3' at 2435, 5'-TCTCACC-3' at 2470, 5'-CCATGTT-3' at 2475, 5'-CCTGGCC-3' at 2571, 5'-TTATACC-3' at 2590, 5'-CCACACC-3' at 2602, 5'-TCAAGTC-3' at 2617, 5'-CCACACT-3' at 2636, 5'-TCAGATT-3' at 2868, 5'-TTTGACC-3' at 2873, 5'-CCAGGCC-3' at 2878, 5'-TCTGGCT-3' at 2885, 5'-CCTCATT-3' at 2902, 5'-TCTGACT-3' at 2945, 5'-TCTGGCC-3' at 2985, 5'-CCTGGCC-3' at 2990, 5'-CCTTGTC-3' at 3003, 5'-CCAGGTC-3' at 3018, 5'-TCTGGTT-3' at 3023, 5'-TCAGGCC-3' at 3036, 5'-CCTGGTT-3' at 3049, 5'-CTTCATC-3' at 3250, 5'-TCACGTC-3' at 3255, 5'-CCTGGTC-3' at 3298, 5'-TCTCACT-3' at 3317, 5'-CCATGTT-3' at 3337, 5'-CCTTGCC-3' at 3375, 5'-TCACACT-3' at 3507, 5'-CTAGGCT-3' at 3524, 5'-CCAGACC-3' at 3550, 5'-TCTCACC-3' at 3612, 5'-CCTGGCC-3' at 3681, 5'-TCACACC-3' at 3824, 5'-CTTGACC-3' at 4018, 5'-TTTTATC-3' at 4123, 5'-CTTGATT-3' at 4133, 5'-TTTAGTT-3' at 4138, 5'-CTTTGCC-3' at 4210, 5'-CCTGACC-3' at 4216, 5'-CCTCATT-3' at 4309, 5'-TCATGTC-3' at 4366, 5'-CCATGCT-3' at 4372, 5'-TCTTGCT-3' at 4390.
  15. inverse, positive strand, negative direction, is SuccessablesInri+-.bas, looking for 3'-C/T-C/T-A/T-A/C/G/T-A/G-C/T-C/T-5', 32, 3'-CTATGTT-5', 213, 3'-CCTGGCT-5', 598, 3'-TCACGCC-5', 664, 3'-CCTGACC-5', 734, 3'-TCACACC-5', 882, 3'-CTTCACT-5', 1056, 3'-TCACACC-5', 1128, 3'-CCTGGCC-5', 1200, 3'-TCTCGCT-5', 1448, 3'-CCAGGCT-5', 1462, 3'-CTATATC-5', 1528, 3'-TCTTGCC-5', 1608, 3'-TTTTATC-5', 1730, 3'-TCACGTC-5', 1773, 3'-CCTGGCT-5', 1843, 3'-TCACGCC-5', 1992, 3'-TCACGCC-5', 2208, 3'-TCACACC-5', 2418, 3'-TCATGCC-5', 2535, 3'-TCATGCC-5', 2753, 3'-TTTCATC-5', 2887, 3'-CTAAGCT-5', 3033, 3'-CCTGGCC-5', 3130, 3'-TCACGCC-5', 3281, 3'-TCAGGCT-5', 3398, 3'-CCAGATC-5', 3488, 3'-CCATACC-5', 3858, 3'-CCAGGCC-5', 3873, 3'-TCACACC-5', 3967, 3'-TCATGCC-5', 4118, 3'-CCAGGCT-5', 4255, 3'-TCACATT-5', 4533.
  16. inverse, positive strand, positive direction, is SuccessablesInri++.bas, looking for 3'-C/T-C/T-A/T-A/C/G/T-A/G-C/T-C/T-5', 61, 5'-TCTCACC-3' at 53, 5'-TTACACT-3' at 230, 5'-CCTCGCT-3' at 429, 5'-TCTGGCC-3' at 442, 5'-CCACGCC-3' at 489, 5'-TCACGCC-3' at 498, 5'-TCACGCC-3' at 582, 5'-TCACGCC-3' at 666, 5'-CCACGTC-3' at 784, 5'-TCACGCC-3' at 1086, 5'-TCACGCC-3' at 1170, 5'-TCACGCC-3' at 1254, 5'-TTACGCC-3' at 1322, 5'-TTACGCC-3' at 1422, 5'-TCACGCC-3' at 1590, 5'-CTTCGCC-3' at 1636, 5'-CCACGCC-3' at 1764, 5'-TCACGTC-3' at 1787, 5'-CCACACC-3' at 1805, 5'-CTTGACC-3' at 1953, 5'-CCACACC-3' at 1971, 5'-TTTCGTC-3' at 2007, 5'-TCACGTC-3' at 2064, 5'-CTTGGTC-3' at 2227, 5'-TCTAGTT-3' at 2232, 5'-TCACGTC-3' at 2327, 5'-CCACGTT-3' at 2335, 5'-CTTTATC-3' at 2626, 5'-CTATATT-3' at 2662, 5'-CCTGACT-3' at 2674, 5'-TCTCGTT-3' at 2705, 5'-TTTCACC-3' at 2711, 5'-CCACGTT-3' at 2801, 5'-TCTTACT-3' at 2841, 5'-CTAAACT-3' at 2871, 5'-CCAGACT-3' at 2943, 5'-CCAGACC-3' at 3021, 5'-TTATACC-3' at 3162, 5'-CTTTACC-3' at 3168, 5'-CCTGGTT-3' at 3174, 5'-CCTTACT-3' at 3441, 5'-CTACGTC-3' at 3460, 5'-TCACGTC-3' at 3465, 5'-CCTGGTC-3' at 3547, 5'-CCTTACT-3' at 3567, 5'-TCACACT-3' at 3594, 5'-CTTCGCC-3' at 3670, 5'-TTAGGCT-3' at 3799, 5'-TCTTACT-3' at 3835, 5'-CTTGGTC-3' at 3840, 5'-TCTCACT-3' at 3876, 5'-TCAGACT-3' at 3924, 5'-TCACACC-3' at 3966, 5'-CCACACT-3' at 3971, 5'-TCTCACC-3' at 4040, 5'-TCTTGTC-3' at 4069, 5'-CTTTACT-3' at 4094, 5'-CTAAATC-3' at 4136, 5'-CCTCACT-3' at 4350, 5'-CCAGACC-3' at 4416, 5'-CCTTGTT-3' at 4445.
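The searches listed above can be sketched in a few lines. The sketch below is in Python rather than Basic and is not the SuccessablesInri programs themselves. It expands the degenerate motif YYRNWYY (Y = C/T, R = A/G, N = A/C/G/T, W = A/T) into a regular expression and reports every hit, including overlapping ones. The names `motif_to_regex` and `find_hits`, the toy sequence, and the 1-based start-position convention are assumptions made for illustration only.

```python
import re

# IUPAC degenerate nucleotide codes needed for the YYRNWYY motif.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "W": "AT", "N": "ACGT",
}

def motif_to_regex(motif):
    """Translate an IUPAC motif such as 'YYRNWYY' into regex character classes."""
    return "".join("[" + IUPAC[code] + "]" for code in motif)

def find_hits(sequence, motif):
    """Return (matched_text, 1-based start) for every hit, overlaps included."""
    # The lookahead consumes no characters, so overlapping hits are all reported.
    pattern = re.compile("(?=(" + motif_to_regex(motif) + "))")
    return [(m.group(1), m.start() + 1) for m in pattern.finditer(sequence)]

# Hypothetical toy sequence, not taken from the gene analysed on this page.
toy = "GGTCACACTAATTACTCCGG"

hits = find_hits(toy, "YYRNWYY")
# The "inverse" searches use the motif read backwards (YYWNRYY).
inverse_hits = find_hits(toy, "YYRNWYY"[::-1])

for site, position in hits:
    print(f"{site} at {position}")
```

Reversing the motif string gives C/T-C/T-A/T-A/C/G/T-A/G-C/T-C/T, the pattern quoted for the inverse programs in entries 13 to 16.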

YYRNWYY UTRs

Negative strand, negative direction: TTACTCC at 4557, AGTGTAA at 4533, TCACACT at 4361, TCGGACC at 4349, CCAGTTT at 4309, TCGGACC at 4300, GGTCCGA at 4255, CTGCACC at 4238, TCGGTCT at 4233, AAAATAA at 4221, TCACTCT at 4202, TCGAACC at 4188, AGTTCAA at 4177, CCGGTCC at 4170, AGTACGG at 4118, CCGTACC at 4107, CCGGTCC at 4102, TTACACT at 4092, AAAATAA at 4071, TCACTCT at 4051, TTGTATC at 4046, TCGGACC at 4037, AGACCAG at 4032, AGTTCAA at 4026, AGTGTGG at 3967, CCGGTCC at 3951, CTACTTT at 3922, TCATTCT at 3893, GGTCCGG at 3873, CTGGTCC at 3871, GGTATGG at 3858, CTACACC at 3810, CTGTTCT at 3759, GGTCTAG at 3488, TTGGTCT at 3486, TTGATCT at 3463, CCGTATC at 3446, CCGAACT at 3401, AGTCCGA at 3398, TCGTTCT at 3374, TTGTTCT at 3340, TCGTTTT at 3313, TTGTTCT at 3307, TCGGACC at 3298, AGTGCGG at 3281, TCGGTTC at 3273, CCACACC at 3186, TTGTATT at 3169, CCACTTT at 3146, TTGTTCC at 3141, GGACCGG at 3130, TCGGACC at 3128, CCGCACC at 3047, GATTCGA at 3033, TTGATTC at 3031, CCGATTT at 3009, TTGATTC at 2914, AAAGTAG at 2887.

Positive strand, negative direction: GGAATGA at 4555, TTAATTC at 4542, TCACATT at 4533, AGTCCAA at 4502, CCACTTT at 4461, CCACTCC at 4425, CCAGTTC at 4417, AGTGTGA at 4361, CTGCACT at 4340, CCGGACT at 4327, TCACACC at 3967, GGAGTAA at 3891, GGACCAG at 3870, CCATACC at 3858, GATGTGG at 3810, CTGAACC at 3784, AATGCAG at 3772, GGACTGG at 3749, CTGGACT at 3747, GGAACAG at 3725, CCATTTC at 3688, AATCCAG at 3681, CTGCTCC at 3582, CCAGATC at 3488, AAACCAG at 3485, GAACTAG at 3462, GAAGTGA at 3410, AAATTGA at 3358, AAAACAA at 3330, AGAGCAA at 3311, TTGCACT at 3289, TTGAACC at 3245, GGTGTGG at 3186, AAATTAG at 3176, AGACCAG at 3123, AAACTAA at 3030, AAAATAA at 3013, AGAATGG at 3004.

YYRNWYY core promoters

Positive strand, negative direction: AAAACAA at 2842.

Negative strand, positive direction: GGAACAG at 4445, GGTCTGG at 4416, GGAGTGA at 4350, CTGCACC at 4343.

Positive strand, positive direction: CCAGACC at 4416, CCACTCC at 4401, AGAACGA at 4390, GGTACGA at 4372, AGTACAG at 4366, GGAGTAA at 4309.

YYRNWYY proximal promoters

Negative strand, negative direction: TCGTACT at 2784, TCGGACC at 2770, TTGGACC at 2720, TCACACC at 2658, CCACTTT at 2619, TTGTACC at 2614, TCACACC at 2605. inverse complement, negative strand, negative direction: AGTACGG at 2753.

Positive strand, negative direction: CTGCACC at 2761, TTGAACC at 2717, TTGAATC at 2708. inverse complement, positive strand, negative direction: AGTGTGG at 2658, AAATCAG at 2649, AGTGTGG at 2605, AGACCAG at 2600.

Negative strand, positive direction: TTAGTTT at 4139, TTGATTT at 4134, TCACTCT at 4128, TCATTTT at 4120. inverse complement, negative strand, positive direction: GATTTAG at 4136, GAAATGA at 4094, AGAACAG at 4069.

Positive strand, positive direction: CTAAATC at 4136, CTACTCC at 4102, TTACTCC at 4096. inverse complement, positive strand, positive direction: GGACTGG at 4216, GAAACGG at 4210, AAATCAA at 4138, GAACTAA at 4133, AAAATAG at 4123.

YYRNWYY distal promoters

Negative strand, negative direction: CCAGTCC at 2587, CCGGTCC at 2519, TCATTCT at 2503, TTGTTTT at 2490, TCGTTTT at 2476, TCACTCT at 2449, TCGGACC at 2435, TTGGACC at 2385, CCACTTT at 2282, TCGTACC at 2277, TCGGACC at 2268, TCAAACT at 2257, CCAGTCC at 2250, CCGCTTT at 2157, TTGTACC at 2152, TCAAACT at 2141, TCACATT at 2087, CCGGTCC at 2077, TTACACC at 2065, TCGTTCT at 2023, TCGGACC at 2009, TTGGACC at 1959, CCGTACT at 1953, CCGCACC at 1897, TTATACC at 1742, TTAATTT at 1697, TTGGATT at 1591, TTACTTT at 1582, CCGTTTT at 1561, TTGCTTC at 1555, CCACACT at 1479, TTGTTTT at 1394, TCGTTTT at 1371, TTATTCT at 1365, TCAGACC at 1356, TTGGATC at 1306, CCGCACC at 1244, CCACTTT at 1212, TTGTACC at 1207, TCGGACC at 1198, TCACTCT at 1079, TTGGACC at 1015, TTAGTCC at 984, CCGTACC at 953, TCGGTCC at 948, TCGCTCT at 913, TCGGACC at 899, TCGGTTC at 874, CTACACC at 787, TCGCACC at 741, TCGGACT at 732, CCAGTCC at 714, CCGGTTC at 692, CCGGTCC at 648, TTATACC at 605, CCAGTCC at 578, CCGGTTC at 556, TCGGACC at 508, TCACTTT at 473, TTGTATC at 468, TCGGACC at 459, CCAGTCC at 441, CCGGTTC at 419, CTGCTTT at 312, TCACTCT at 301, TTATACT at 274, TTGGTCC at 262, CTACATT at 247, CCATATT at 181, CCGTACT at 124, CCGTTTC at 93, CTATACC at 77, TTGTTCC at 71. inverse complement, negative strand, negative direction: AGTACGG at 2535, AGTGTGG at 2418, AGTGCGG at 2208, AGTGCGG at 1992, GGACCGA at 1843, AGTGCAG at 1773, AAAATAG at 1730, AGAACGG at 1608, GATATAG at 1528, GGTCCGA at 1462, AGAGCGA at 1448, GGACCGG at 1200, AGTGTGG at 1128, GAAGTGA at 1056, AGTGTGG at 882, GGACTGG at 734, AGTGCGG at 664, GGACCGA at 598, GATACAA at 213.

Positive strand, negative direction: CTGCACT at 2426, TCACACC at 2418, TTGAACC at 2382, CTACTCC at 2352, CTGCACT at 2000, TTATTTT at 1727, CTATATC at 1528, TCGCTCT at 1450, CCATTTC at 1380, CCAGTCT at 1354, TTGCACT at 1347, TTGCACC at 1339, TTGAACC at 1303, TCACACC at 1128, TCACTCC at 1058, TTGAACC at 1012, TCACACC at 882, TTGAACC at 846, CTGCATT at 152, TTGGACC at 32, CTGAATT at 20. inverse complement, positive strand, negative direction: AAAACAA at 2509, AAAGCAA at 2480, AAAGCAA at 2474, GATTCGG at 2454, AGAGTGA at 2447, AAACTAG at 2313, AATACAA at 2305, AGACCAG at 2263, AGTTTGA at 2257, GGTGCGG at 2197, AAAATGA at 2187, GATACAA at 2180, AGACCAA at 2147, AGTTTGA at 2141, AGTGTAA at 2087, GGTGCAG at 2082, AATGTGG at 2065, AGAGCAA at 2021, AGAATGG at 1948, AGACTGA at 1935, AAATTAG at 1887, AATACAA at 1878, AATATGG at 1742, GAATTAA at 1696, AAAGCGG at 1680, GAAATGA at 1663, GAAACAA at 1585, AATACAG at 1566, AGAACGA at 1553, AGTGCAA at 1536, GGTGTGA at 1479, AGTGCAG at 1471, AAAACAA at 1388, AGAGCAA at 1369, AGTCTGG at 1356, AAATTAG at 1234, AGAGTGA at 1077, AGATTGG at 1045, GATCCAG at 975, AGAGCGA at 911, GATGTGG at 787, AAATTAG at 777, AATACAA at 769, AGACCAG at 727, AGTTCGA at 721, AAATTGG at 643, AATACAA at 635, AATATGG at 605, AGATTGA at 585, AAATTAG at 499, AATACGA at 492, AGTGCGA at 448, GGTGCGG at 380, AAACTGA at 307, AGAACAG at 288, AATATGA at 274, AAACCAG at 261, AGTTCAA at 255, GATGTAA at 247, GAAACAA at 229, GGTATAA at 181, AAAACAG at 167, AAACTGA at 130, GATATGG at 77, AAAACAA at 69, GGACCAG at 34, AGACTGA at 17.

Negative strand, positive direction: TCACACC at 3824, CTGTTCC at 3625, CCAGACC at 3550, TCACACT at 3507, TTGCATC at 3402, CTGTTCC at 3352, TTGCACT at 3343, CCGCATC at 3328, CTGCACC at 3322, CTGCTCC at 3309, CTGGTCT at 3299, TCGCTCT at 3276, CTGGTCT at 3245, CCAGTCC at 3084, CCAGTCC at 2998, CTGCTCC at 2978, TCAGATT at 2868, CCACACT at 2636, CCACACC at 2602, TTATACC at 2590, CCGCACC at 2566, CTAATTT at 2440, CTACACC at 2430, TCACTCT at 2306, CTGTTTC at 2263, TCAATCT at 2235, CCAGATC at 2230, CTGCATT at 2206, TCATATT at 2178, TCGCTTC at 2095, CCAGTCC at 2026, CTATTTC at 1978, CCACTTC at 1914, CCAGACT at 1744, CTGCACT at 1472, CTGCACT at 1372, CCGGACT at 746, CCACACT at 345, CTGTTTT at 147, TTGTATT at 115. inverse complement, negative strand, positive direction: AGAGTGG at 4040, GGTGTGA at 3971, AGTGTGG at 3966, AGTCTGA at 3924, AGAGTGA at 3876, GAACCAG at 3840, AGAATGA at 3835, AATCCGA at 3799, GAAGCGG at 3670, AGTGTGA at 3594, GGAATGA at 3567, GGACCAG at 3547, AGTGCAG at 3465, GATGCAG at 3460, GGAATGA at 3441, GGACCAA at 3174, GAAATGG at 3168, AATATGG at 3162, GGTCTGG at 3021, GGTCTGA at 2943, GATTTGA at 2871, AGAATGA at 2841, GGTGCAA at 2801, AAAGTGG at 2711, AGAGCAA at 2705, GGACTGA at 2674, GATATAA at 2662, GAAATAG at 2626, GGTGCAA at 2335, AGTGCAG at 2327, AGATCAA at 2232, GAACCAG at 2227, AGTGCAG at 2064, AAAGCAG at 2007, GGTGTGG at 1971, GAACTGG at 1953, GGTGTGG at 1805, AGTGCAG at 1787, GGTGCGG at 1764, GAAGCGG at 1636, AGTGCGG at 1590, AATGCGG at 1422, AATGCGG at 1322, AGTGCGG at 1254, AGTGCGG at 1170, AGTGCGG at 1086, GGTGCAG at 784, AGTGCGG at 666, AGTGCGG at 582, AGTGCGG at 498, GGTGCGG at 489, AGACCGG at 442, GGAGCGA at 429, AATGTGA at 230, AGAGTGG at 53.

Positive strand, positive direction: CCACACT at 3971, TCACACC at 3966, TCAGACT at 3924, TCACTCC at 3878, CTGGACC at 3787, CCGGACC at 3758, CCGGACC at 3679, CCACTCC at 3647, TCACACT at 3594, CTGGTCT at 3548, TCGATCC at 3522, CCGATCC at 3484, CTACTCC at 3478, TCGGTCT at 3221, CTGGTTT at 3175, TTATACC at 3162, CCAGACC at 3021, CCGGACC at 2988, CCAGACT at 2943, CTGGTCC at 2876, CTAAACT at 2871, TTGCTCC at 2806, TCGATTC at 2789, TCGTTTT at 2707, TCAATCC at 2668, CTATATT at 2662, TCAGTCC at 2620, TCAGTTC at 2615, TCAGTCT at 2609, CCGGTCC at 2574, CCGCACT at 2555, TTGGTCT at 2228, CCAGTCT at 2222, CCGTTCT at 2190, CTACTTT at 2146, TTGTACT at 2141, TCAATTT at 2136, CCACACC at 1971, CCGTTCT at 1948, CCGCTCT at 1921, CCACACC at 1805, CCGCACT at 1720, CCGCTCT at 1565, TCGTTCC at 1511, CCGCTCT at 1481, CCGTTCC at 1427, CCGCTCT at 1381, CCGTTCC at 1327, CCGTTCC at 1259, CCGCTCT at 1229, CCGGTCC at 1175, TCGCTCT at 1061, CCGTTCC at 1007, TTGGACC at 947, TCGGTCT at 935, CCGTTCC at 923, TTGGACC at 847, TCGGTCT at 835, CCGTTCC at 823, CCGGACT at 725, CCGTTCC at 671, CCGCTCT at 641, CCGTTCC at 587, CCGCTCT at 557, TCGGTCC at 515, CCGTTCC at 503, CCGGACC at 286, TTACACT at 230, CCGGTCC at 215, CTGGACC at 40. 
inverse complement, positive strand, positive direction: GAACTGG at 4018, AGTGTGG at 3824, GGACCGG at 3681, AGAGTGG at 3612, GGTCTGG at 3550, GATCCGA at 3524, AGTGTGA at 3507, GGAACGG at 3375, GGTACAA at 3337, AGAGTGA at 3317, GGACCAG at 3298, AGTGCAG at 3255, GAAGTAG at 3250, GGACCAA at 3049, AGTCCGG at 3036, AGACCAA at 3023, GGTCCAG at 3018, GGAACAG at 3003, GGACCGG at 2990, AGACCGG at 2985, AGACTGA at 2945, GGAGTAA at 2902, AGACCGA at 2885, GGTCCGG at 2878, AAACTGG at 2873, AGTCTAA at 2868, GGTGTGA at 2636, AGTTCAG at 2617, GGTGTGG at 2602, AATATGG at 2590, GGACCGG at 2571, GGTACAA at 2475, AGAGTGG at 2470, GGACCGA at 2435, GATGTGG at 2430, AATCCGA at 2368, GGTCCGA at 2318, AAAGTGA at 2304, AGAGTGG at 2247, GGTCTAG at 2230, GGACTGG at 2213, AGTATAA at 2178, GAAGTAG at 2110, AGAATGG at 1888, GGTCCGG at 1857, GGACCGA at 1817, GGTCTGA at 1744, GGACTGG at 1662, GATGCGA at 1576, AATTCGG at 1541, GAAGCGG at 1408, GAAGCGG at 1308, AAAGCAG at 1183, GGTCCGA at 1177, GGACCGG at 949, GGACCGG at 849, GGTGCGA at 777, GATGCGA at 652, GAAGCGG at 595, AGAATGA at 524, GAAGCGG at 459, GGTGTGA at 345, GGTCCAG at 217, AATCCAG at 152, AGTCCGG at 92, GGTCCGA at 10.

BBCABW

The Basic programs (starting with SuccessablesInr2.bas) compare nucleotide sequences with the sequences on either the template strand (-) or the coding strand (+) of the DNA, in either the negative direction (-) or the positive direction (+). Each entry below gives the program (where named), the pattern it looks for, the number of hits, and the hits found:

  1. Negative strand, negative direction: 44, GTCACA at 4359, CCCACT at 4353, GTCACT at 4319, TCCAGT at 4307, GTCACT at 4200, TTCACA at 3939, CTCATT at 3891, CTCATA at 3829, TCCACT at 3825, GTCATT at 3480, TTCACT at 3410, CCCACA at 3184, TCCACT at 3144, TTCACA at 2860, GTCACT at 2739, GTCACA at 2656, GTCACA at 2603, TCCAGT at 2585, CTCACT at 2447, GTCACT at 2404, TCCAGT at 2248, GTCACA at 2085, GTCACT at 1978, TGCAGA at 1774, GGCAGT at 1511, CTCAGA at 1444, GTCAGA at 1354, GTCACT at 1325, GGCACA at 1220, CTCACT at 1077, CCCACT at 1049, GTCACT at 1034, GCCACT at 868, GGCAGA at 754, TCCAGT at 712, TCCAGT at 576, TCCAGT at 568, TGCATT at 533, TCCAGT at 439, TTCACA at 322, GTCACT at 299, CTCAGA at 278, CCCAGT at 206, TCCATA at 179.
  2. Negative strand, positive direction: 87, CTCACT at 4338, CCCAGA at 4330, TGCAGA at 4317, CTCATT at 4309, GTCAGT at 4271, CTCAGA at 4195, TCCACT at 4013, GGCACT at 4006, TGCAGT at 3962, GTCACA at 3954, CGCAGA at 3916, TCCAGA at 3891, TGCAGA at 3831, GTCACA at 3822, TCCAGA at 3806, GCCACA at 3705, CTCACA at 3505, GGCAGA at 3473, TGCAGT at 3461, GGCACA at 3409, CCCACT at 3388, CCCAGT at 3379, TGCACT at 3343, CTCACT at 3317, TGCAGT at 3281, TGCAGT at 3232, GCCAGA at 3221, CTCACA at 3209, TCCACA at 3192, CCCAGA at 3091, CCCAGT at 3082, TGCATT at 3072, TGCACA at 2962, TTCAGT at 2936, GTCACT at 2929, CTCATT at 2902, CTCAGA at 2866, TGCAGA at 2859, CTCAGA at 2729, TGCAGA at 2721, CTCAGA at 2699, GTCAGA at 2609, GTCACT at 2425, TGCAGT at 2328, TTCACT at 2304, CTCAGA at 2239, GTCAGA at 2222, TGCATT at 2206, CTCATA at 2176, TTCAGT at 2098, GCCACT at 2072, TGCAGT at 2065, CTCAGT at 2060, TCCACA at 2029, CCCAGT at 2024, GGCACT at 1996, TGCAGA at 1937, TCCACT at 1912, TGCACA at 1822, CCCAGA at 1742, GGCATT at 1702, CGCACA at 1556, CCCACT at 1502, TGCACT at 1472, CGCAGA at 1416, TGCACT at 1372, CGCAGA at 1316, CCCAGT at 1250, TGCACA at 1220, CGCACA at 1136, CGCACA at 1052, GCCACA at 984, GCCAGA at 935, GCCACA at 884, GCCAGA at 835, CGCACA at 800, CGCACT at 686, TCCACA at 632, TGCACA at 548, CCCAGA at 468, TGCAGA at 438, CGCAGA at 396, GCCACA at 343, CCCAGA at 204, GTCACA at 155, GGCATT at 22, TCCAGA at 15.
  3. Positive strand, negative direction: 59, TTCACA at 4531, CCCACT at 4485, TCCACT at 4459, CCCAGA at 4448, TCCACT at 4423, GCCAGT at 4415, TGCACT at 4340, TGCAGT at 4317, GCCAGA at 4233, CTCACA at 3965, CCCATA at 3856, TCCACA at 3692, GCCATT at 3686, CTCAGA at 3644, GGCACA at 3632, GTCAGA at 3625, GGCAGT at 3600, GGCAGA at 3589, GGCAGT at 3478, GGCATA at 3451, GGCATA at 3445, TGCAGA at 3431, TGCACT at 3289, GCCATT at 3284, GCCACT at 2756, TGCAGT at 2737, GGCACA at 2665, GCCAGT at 2654, TCCACT at 2632, TGCACT at 2426, TGCAGT at 2402, GCCAGT at 2211, TGCAGT at 2083, TGCACT at 2000, GCCACT at 1995, TGCAGT at 1976, GGCAGA at 1967, TGCACA at 1719, TCCAGT at 1532, CCCAGA at 1518, CTCACT at 1491, TGCAGT at 1472, CCCAGA at 1411, TCCATT at 1378, TCCAGT at 1352, TGCACT at 1347, TGCAGT at 1323, GGCAGA at 1314, CTCACA at 1126, GGCACA at 1116, TTCACT at 1056, TGCAGT at 1032, GGCAGA at 1023, GGCACA at 960, GGCACA at 518, GGCACA at 266, GTCACT at 208, TGCATT at 152, GCCATA at 39.
  4. Positive strand, positive direction: 40, CCCAGA at 4414, CCCACT at 4399, CTCACT at 4350, TCCAGT at 4269, CGCAGA at 4056, GTCACA at 3964, TCCACT at 3934, TTCAGA at 3922, CTCACT at 3876, GTCACT at 3843, CCCAGT at 3820, TCCAGA at 3771, TCCATT at 3731, CTCACT at 3712, GCCAGA at 3608, CTCACA at 3592, TGCAGA at 3256, CTCAGA at 3187, TCCAGA at 3019, TCCATA at 2642, TTCAGT at 2618, CTCAGT at 2613, GTCAGT at 2607, CGCACT at 2555, TTCACT at 2511, CCCAGA at 2489, GTCACA at 2464, CGCAGT at 2423, TCCACT at 2375, TCCAGA at 2258, TCCAGT at 2220, TCCACT at 2128, GTCAGT at 2100, TCCACA at 1969, CCCAGA at 1958, CCCACA at 1803, CGCACT at 1720, CCCAGA at 1711, CGCACA at 1020, TCCAGT at 153.
  5. complement, negative strand, negative direction is SuccessablesInr2c--.bas, looking for 5'-A/C/G-A/C/G-G-T-A/C/G-A/T-3', 59, 3'-CGGTAT-5', 39, 3'-ACGTAA-5', 152, 3'-CAGTGA-5', 208, 3'-CCGTGT-5', 266, 3'-CCGTGT-5', 518, 3'-CCGTGT-5', 960, 3'-CCGTCT-5', 1023, 3'-ACGTCA-5', 1032, 3'-AAGTGA-5', 1056, 3'-CCGTGT-5', 1116, 3'-GAGTGT-5', 1126, 3'-CCGTCT-5', 1314, 3'-ACGTCA-5', 1323, 3'-ACGTGA-5', 1347, 3'-AGGTCA-5', 1352, 3'-AGGTAA-5', 1378, 3'-GGGTCT-5', 1411, 3'-ACGTCA-5', 1472, 3'-GAGTGA-5', 1491, 3'-GGGTCT-5', 1518, 3'-AGGTCA-5', 1532, 3'-ACGTGT-5', 1719, 3'-CCGTCT-5', 1967, 3'-ACGTCA-5', 1976, 3'-CGGTGA-5', 1995, 3'-ACGTGA-5', 2000, 3'-ACGTCA-5', 2083, 3'-CGGTCA-5', 2211, 3'-ACGTCA-5', 2402, 3'-ACGTGA-5', 2426, 3'-AGGTGA-5', 2632, 3'-CGGTCA-5', 2654, 3'-CCGTGT-5', 2665, 3'-ACGTCA-5', 2737, 3'-CGGTGA-5', 2756, 3'-CGGTAA-5', 3284, 3'-ACGTGA-5', 3289, 3'-ACGTCT-5', 3431, 3'-CCGTAT-5', 3445, 3'-CCGTAT-5', 3451, 3'-CCGTCA-5', 3478, 3'-CCGTCT-5', 3589, 3'-CCGTCA-5', 3600, 3'-CAGTCT-5', 3625, 3'-CCGTGT-5', 3632, 3'-GAGTCT-5', 3644, 3'-CGGTAA-5', 3686, 3'-AGGTGT-5', 3692, 3'-GGGTAT-5', 3856, 3'-GAGTGT-5', 3965, 3'-CGGTCT-5', 4233, 3'-ACGTCA-5', 4317, 3'-ACGTGA-5', 4340, 3'-CGGTCA-5', 4415, 3'-AGGTGA-5', 4423, 3'-GGGTCT-5', 4448, 3'-AGGTGA-5', 4459, 3'-GGGTGA-5', 4485, 3'-AAGTGT-5', 4531.
  6. complement, negative strand, positive direction is SuccessablesInr2c-+.bas, looking for 5'-A/C/G-A/C/G-G-T-A/C/G-A/T-3', 40, 3'-AGGTCA-5', 153, 3'-GCGTGT-5', 1020, 3'-GGGTCT-5', 1711, 3'-GCGTGA-5', 1720, 3'-GGGTGT-5', 1803, 3'-GGGTCT-5', 1958, 3'-AGGTGT-5', 1969, 3'-CAGTCA-5', 2100, 3'-AGGTGA-5', 2128, 3'-AGGTCA-5', 2220, 3'-AGGTCT-5', 2258, 3'-AGGTGA-5', 2375, 3'-GCGTCA-5', 2423, 3'-CAGTGT-5', 2464, 3'-GGGTCT-5', 2489, 3'-AAGTGA-5', 2511, 3'-GCGTGA-5', 2555, 3'-CAGTCA-5', 2607, 3'-GAGTCA-5', 2613, 3'-AAGTCA-5', 2618, 3'-AGGTAT-5', 2642, 3'-AGGTCT-5', 3019, 3'-GAGTCT-5', 3187, 3'-ACGTCT-5', 3256, 3'-GAGTGT-5', 3592, 3'-CGGTCT-5', 3608, 3'-GAGTGA-5', 3712, 3'-AGGTAA-5', 3731, 3'-AGGTCT-5', 3771, 3'-GGGTCA-5', 3820, 3'-CAGTGA-5', 3843, 3'-GAGTGA-5', 3876, 3'-AAGTCT-5', 3922, 3'-AGGTGA-5', 3934, 3'-CAGTGT-5', 3964, 3'-GCGTCT-5', 4056, 3'-AGGTCA-5', 4269, 3'-GAGTGA-5', 4350, 3'-GGGTGA-5', 4399, 3'-GGGTCT-5', 4414.
  7. complement, positive strand, negative direction is SuccessablesInr2c+-.bas, looking for 5'-A/C/G-A/C/G-G-T-A/C/G-A/T-3', 44, 3'-AGGTAT-5', 179, 3'-GGGTCA-5', 206, 3'-GAGTCT-5', 278, 3'-CAGTGA-5', 299, 3'-AAGTGT-5', 322, 3'-AGGTCA-5', 439, 3'-ACGTAA-5', 533, 3'-AGGTCA-5', 568, 3'-AGGTCA-5', 576, 3'-AGGTCA-5', 712, 3'-CCGTCT-5', 754, 3'-CGGTGA-5', 868, 3'-CAGTGA-5', 1034, 3'-GGGTGA-5', 1049, 3'-GAGTGA-5', 1077, 3'-CCGTGT-5', 1220, 3'-CAGTGA-5', 1325, 3'-CAGTCT-5', 1354, 3'-GAGTCT-5', 1444, 3'-CCGTCA-5', 1511, 3'-ACGTCT-5', 1774, 3'-CAGTGA-5', 1978, 3'-CAGTGT-5', 2085, 3'-AGGTCA-5', 2248, 3'-CAGTGA-5', 2404, 3'-GAGTGA-5', 2447, 3'-AGGTCA-5', 2585, 3'-CAGTGT-5', 2603, 3'-CAGTGT-5', 2656, 3'-CAGTGA-5', 2739, 3'-AAGTGT-5', 2860, 3'-AGGTGA-5', 3144, 3'-GGGTGT-5', 3184, 3'-AAGTGA-5', 3410, 3'-CAGTAA-5', 3480, 3'-AGGTGA-5', 3825, 3'-GAGTAT-5', 3829, 3'-GAGTAA-5', 3891, 3'-AAGTGT-5', 3939, 3'-CAGTGA-5', 4200, 3'-AGGTCA-5', 4307, 3'-CAGTGA-5', 4319, 3'-GGGTGA-5', 4353, 3'-CAGTGT-5', 4359.
  8. complement, positive strand, positive direction is SuccessablesInr2c++.bas, looking for 5'-A/C/G-A/C/G-G-T-A/C/G-A/T-3', 87, 3'-AGGTCT-5', 15, 3'-CCGTAA-5', 22, 3'-CAGTGT-5', 155, 3'-GGGTCT-5', 204, 3'-CGGTGT-5', 343, 3'-GCGTCT-5', 396, 3'-ACGTCT-5', 438, 3'-GGGTCT-5', 468, 3'-ACGTGT-5', 548, 3'-AGGTGT-5', 632, 3'-GCGTGA-5', 686, 3'-GCGTGT-5', 800, 3'-CGGTCT-5', 835, 3'-CGGTGT-5', 884, 3'-CGGTCT-5', 935, 3'-CGGTGT-5', 984, 3'-GCGTGT-5', 1052, 3'-GCGTGT-5', 1136, 3'-ACGTGT-5', 1220, 3'-GGGTCA-5', 1250, 3'-GCGTCT-5', 1316, 3'-ACGTGA-5', 1372, 3'-GCGTCT-5', 1416, 3'-ACGTGA-5', 1472, 3'-GGGTGA-5', 1502, 3'-GCGTGT-5', 1556, 3'-CCGTAA-5', 1702, 3'-GGGTCT-5', 1742, 3'-ACGTGT-5', 1822, 3'-AGGTGA-5', 1912, 3'-ACGTCT-5', 1937, 3'-CCGTGA-5', 1996, 3'-GGGTCA-5', 2024, 3'-AGGTGT-5', 2029, 3'-GAGTCA-5', 2060, 3'-ACGTCA-5', 2065, 3'-CGGTGA-5', 2072, 3'-AAGTCA-5', 2098, 3'-GAGTAT-5', 2176, 3'-ACGTAA-5', 2206, 3'-CAGTCT-5', 2222, 3'-GAGTCT-5', 2239, 3'-AAGTGA-5', 2304, 3'-ACGTCA-5', 2328, 3'-CAGTGA-5', 2425, 3'-CAGTCT-5', 2609, 3'-GAGTCT-5', 2699, 3'-ACGTCT-5', 2721, 3'-GAGTCT-5', 2729, 3'-ACGTCT-5', 2859, 3'-GAGTCT-5', 2866, 3'-GAGTAA-5', 2902, 3'-CAGTGA-5', 2929, 3'-AAGTCA-5', 2936, 3'-ACGTGT-5', 2962, 3'-ACGTAA-5', 3072, 3'-GGGTCA-5', 3082, 3'-GGGTCT-5', 3091, 3'-AGGTGT-5', 3192, 3'-GAGTGT-5', 3209, 3'-CGGTCT-5', 3221, 3'-ACGTCA-5', 3232, 3'-ACGTCA-5', 3281, 3'-GAGTGA-5', 3317, 3'-ACGTGA-5', 3343, 3'-GGGTCA-5', 3379, 3'-GGGTGA-5', 3388, 3'-CCGTGT-5', 3409, 3'-ACGTCA-5', 3461, 3'-CCGTCT-5', 3473, 3'-GAGTGT-5', 3505, 3'-CGGTGT-5', 3705, 3'-AGGTCT-5', 3806, 3'-CAGTGT-5', 3822, 3'-ACGTCT-5', 3831, 3'-AGGTCT-5', 3891, 3'-GCGTCT-5', 3916, 3'-CAGTGT-5', 3954, 3'-ACGTCA-5', 3962, 3'-CCGTGA-5', 4006, 3'-AGGTGA-5', 4013, 3'-GAGTCT-5', 4195, 3'-CAGTCA-5', 4271, 3'-GAGTAA-5', 4309, 3'-ACGTCT-5', 4317, 3'-GGGTCT-5', 4330, 3'-GAGTGA-5', 4338.
  9. inverse complement, negative strand, negative direction: 46, TCTGGG at 4366, TGTGAC at 4336, TCTGCA at 4236, TCTGGG at 4205, AGTGAA at 4161, TCTGAG at 4054, AGTGAA at 4010, TGTGAA at 3983, TGTGGA at 3968, TATGGA at 3859, TATGCG at 3547, TATGAC at 3541, TCTGAC at 3425, AGTGCG at 3280, AGTGAA at 3240, AGTGAA at 3101, AGTGGG at 3057, TATGGA at 2994, ACTGAG at 2787, AGTGAA at 2578, TGTGAA at 2551, AGTGCG at 2207, ACTGGC at 2190, TATGAC at 2162, TCTGAG at 2026, AGTGCG at 1991, TCTGAC at 1934, AGTGCA at 1772, TCTGAA at 1617, TGTGAA at 1544, AGTGAC at 1492, TCTGAG at 1403, AATGAA at 1298, AGTGGA at 1171, TGTGGA at 1129, TCTGAG at 1082, AGTGAG at 1057, ACTGAA at 1052, TGTGCG at 963, TCTGAG at 916, TGTGGG at 749, AGTGCG at 663, TGTGCA at 531, TGTGCA at 342, TGTGGA at 62, TCTGAC at 16.
  10. inverse complement, negative strand, positive direction: 94, TCTGGG at 4417, TGTGGG at 4395, AGTGAG at 4351, ACTGCA at 4341, AGTGCC at 4274, AATGAG at 4095, ACTGAA at 4090, AGTGGG at 4041, TGTGAC at 3972, TGTGCA at 3960, TCTGAA at 3925, TGTGAG at 3904, AGTGAG at 3877, AATGAA at 3836, AATGAC at 3783, ACTGAG at 3736, AGTGAC at 3713, TGTGAA at 3595, AATGAC at 3568, AGTGCA at 3464, AGTGGG at 3450, AATGAG at 3446, AATGAA at 3442, TGTGGA at 3437, AATGCC at 3431, TCTGGC at 3406, TCTGCC at 3359, ACTGGC at 3346, ACTGCA at 3320, TCTGCA at 3279, TCTGCA at 3268, TATGAG at 3261, AGTGCC at 3235, AATGGG at 3169, TATGGA at 3163, TCTGAG at 3124, ACTGGC at 3118, AATGCA at 3070, TCTGCA at 3061, TATGAC at 3028, AGTGCC at 3011, TCTGAG at 3007, TCTGGC at 2984, TGTGCA at 2960, TCTGAG at 2951, TCTGAC at 2944, AATGGG at 2911, TCTGGC at 2884, TCTGCA at 2857, AATGAC at 2842, ACTGCC at 2823, AGTGGA at 2712, TGTGCA at 2681, AGTGCA at 2326, ACTGCA at 2204, TATGGC at 2160, AGTGGC at 2068, AGTGCA at 2063, TCTGGC at 1993, TGTGGC at 1972, ACTGGG at 1954, TCTGGG at 1865, TGTGGA at 1806, AGTGCA at 1786, AGTGCG at 1725, AGTGCG at 1589, ACTGCA at 1505, TCTGCG at 1496, TCTGGC at 1477, AATGCG at 1421, TCTGCG at 1396, TCTGGC at 1377, AATGCG at 1321, ACTGAG at 1287, AGTGCG at 1253, AGTGCG at 1169, AGTGCG at 1160, AGTGCG at 1085, TGTGGC at 1023, ACTGCG at 1001, TGTGGC at 919, ACTGCC at 901, TGTGGC at 819, ACTGCG at 749, AGTGCG at 665, AGTGCG at 581, AGTGCG at 497, ACTGGG at 348, TCTGGA at 271, TCTGAG at 256, ACTGCC at 238, TGTGAA at 231, TCTGCA at 224, AGTGGG at 54.
  11. inverse complement, positive strand, negative direction: 54, AATGAG at 4556, TGTGAG at 4362, ACTGCA at 4338, ACTGCA at 4330, AGTGAG at 4320, ACTGCA at 4315, AGTGAG at 4201, TGTGAG at 4093, AGTGAG at 4050, TGTGGC at 3960, ACTGCC at 3852, TCTGGA at 3836, AATGCA at 3771, ACTGGG at 3750, TGTGGG at 3712, AATGGG at 3660, TGTGCC at 3561, TGTGCA at 3429, AGTGAC at 3411, TGTGAG at 3268, AATGGC at 3005, TGTGCA at 2863, ACTGCA at 2759, AGTGAG at 2740, TGTGGC at 2606, AGTGAG at 2448, ACTGCA at 2424, AGTGAG at 2405, AATGAC at 2188, TGTGGC at 2066, ACTGCA at 1998, AGTGAG at 1979, AATGGC at 1949, ACTGAG at 1936, TATGGC at 1743, AATGCC at 1634, AATGAA at 1581, AGTGCA at 1535, ACTGCA at 1494, AGTGCA at 1470, TCTGGG at 1357, AGTGAG at 1326, AGTGGC at 1121, AGTGAG at 1078, AGTGAG at 1035, AGTGGA at 523, AGTGAA at 472, AGTGCG at 447, ACTGAC at 308, AGTGAG at 300, TATGAG at 275, ACTGAA at 131, TATGGG at 78, ACTGAA at 18.
  12. inverse complement, positive strand, positive direction: 47, AGTGAC at 4339, TGTGAG at 4335, AGTGGG at 4326, TCTGCG at 4320, TGTGCC at 4259, ACTGGG at 4217, AGTGGG at 4204, AGTGAG at 4127, AGTGAC at 4088, ACTGGA at 4019, ACTGGA at 3785, AGTGCC at 3748, AGTGGG at 3613, TCTGGA at 3551, TGTGGG at 3533, TGTGAG at 3508, AGTGAC at 3318, AGTGCA at 3254, ACTGAA at 3030, TGTGGG at 2965, ACTGAA at 2946, AGTGAC at 2930, TCTGGA at 2862, TATGAA at 2740, TGTGGA at 2431, TCTGAA at 2417, AGTGAC at 2341, AGTGGG at 2313, AGTGAG at 2305, AGTGGA at 2248, ACTGGC at 2214, AATGGG at 1889, TCTGAA at 1745, TGTGCC at 1698, ACTGGG at 1663, TGTGCC at 1559, TGTGCC at 1223, TGTGAC at 1139, TGTGCG at 987, TGTGCG at 887, TGTGCG at 803, TGTGCA at 569, AATGAA at 525, TCTGGC at 441, TCTGCC at 399, TGTGAC at 346, TCTGAC at 236.
  13. inverse, negative strand, negative direction, is SuccessablesInr2i--.bas, looking for 5'-A/T-C/G/T-A-C-C/G/T-C/G/T-3', 54, 3'-TGACTT-5', 18, 3'-ATACCC-5', 78, 3'-TGACTT-5', 131, 3'-ATACTC-5', 275, 3'-TCACTC-5', 300, 3'-TGACTG-5', 308, 3'-TCACGC-5', 447, 3'-TCACTT-5', 472, 3'-TCACCT-5', 523, 3'-TCACTC-5', 1035, 3'-TCACTC-5', 1078, 3'-TCACCG-5', 1121, 3'-TCACTC-5', 1326, 3'-AGACCC-5', 1357, 3'-TCACGT-5', 1470, 3'-TGACGT-5', 1494, 3'-TCACGT-5', 1535, 3'-TTACTT-5', 1581, 3'-TTACGG-5', 1634, 3'-ATACCG-5', 1743, 3'-TGACTC-5', 1936, 3'-TTACCG-5', 1949, 3'-TCACTC-5', 1979, 3'-TGACGT-5', 1998, 3'-ACACCG-5', 2066, 3'-TTACTG-5', 2188, 3'-TCACTC-5', 2405, 3'-TGACGT-5', 2424, 3'-TCACTC-5', 2448, 3'-ACACCG-5', 2606, 3'-TCACTC-5', 2740, 3'-TGACGT-5', 2759, 3'-ACACGT-5', 2863, 3'-TTACCG-5', 3005, 3'-ACACTC-5', 3268, 3'-TCACTG-5', 3411, 3'-ACACGT-5', 3429, 3'-ACACGG-5', 3561, 3'-TTACCC-5', 3660, 3'-ACACCC-5', 3712, 3'-TGACCC-5', 3750, 3'-TTACGT-5', 3771, 3'-AGACCT-5', 3836, 3'-TGACGG-5', 3852, 3'-ACACCG-5', 3960, 3'-TCACTC-5', 4050, 3'-ACACTC-5', 4093, 3'-TCACTC-5', 4201, 3'-TGACGT-5', 4315, 3'-TCACTC-5', 4320, 3'-TGACGT-5', 4330, 3'-TGACGT-5', 4338, 3'-ACACTC-5', 4362, 3'-TTACTC-5', 4556.
  14. inverse, negative strand, positive direction, is SuccessablesInr2i-+.bas, looking for 5'-A/T-C/G/T-A-C-C/G/T-C/G/T-3', 47, 3'-AGACTG-5', 236, 3'-ACACTG-5', 346, 3'-AGACGG-5', 399, 3'-AGACCG-5', 441, 3'-TTACTT-5', 525, 3'-ACACGT-5', 569, 3'-ACACGC-5', 803, 3'-ACACGC-5', 887, 3'-ACACGC-5', 987, 3'-ACACTG-5', 1139, 3'-ACACGG-5', 1223, 3'-ACACGG-5', 1559, 3'-TGACCC-5', 1663, 3'-ACACGG-5', 1698, 3'-AGACTT-5', 1745, 3'-TTACCC-5', 1889, 3'-TGACCG-5', 2214, 3'-TCACCT-5', 2248, 3'-TCACTC-5', 2305, 3'-TCACCC-5', 2313, 3'-TCACTG-5', 2341, 3'-AGACTT-5', 2417, 3'-ACACCT-5', 2431, 3'-ATACTT-5', 2740, 3'-AGACCT-5', 2862, 3'-TCACTG-5', 2930, 3'-TGACTT-5', 2946, 3'-ACACCC-5', 2965, 3'-TGACTT-5', 3030, 3'-TCACGT-5', 3254, 3'-TCACTG-5', 3318, 3'-ACACTC-5', 3508, 3'-ACACCC-5', 3533, 3'-AGACCT-5', 3551, 3'-TCACCC-5', 3613, 3'-TCACGG-5', 3748, 3'-TGACCT-5', 3785, 3'-TGACCT-5', 4019, 3'-TCACTG-5', 4088, 3'-TCACTC-5', 4127, 3'-TCACCC-5', 4204, 3'-TGACCC-5', 4217, 3'-ACACGG-5', 4259, 3'-AGACGC-5', 4320, 3'-TCACCC-5', 4326, 3'-ACACTC-5', 4335, 3'-TCACTG-5', 4339.
  15. inverse, positive strand, negative direction, is SuccessablesInr2i+-.bas, looking for 5'-A/T-C/G/T-A-C-C/G/T-C/G/T-3', 46, 3'-AGACTG-5', 16, 3'-ACACCT-5', 62, 3'-ACACGT-5', 342, 3'-ACACGT-5', 531, 3'-TCACGC-5', 663, 3'-ACACCC-5', 749, 3'-AGACTC-5', 916, 3'-ACACGC-5', 963, 3'-TGACTT-5', 1052, 3'-TCACTC-5', 1057, 3'-AGACTC-5', 1082, 3'-ACACCT-5', 1129, 3'-TCACCT-5', 1171, 3'-TTACTT-5', 1298, 3'-AGACTC-5', 1403, 3'-TCACTG-5', 1492, 3'-ACACTT-5', 1544, 3'-AGACTT-5', 1617, 3'-TCACGT-5', 1772, 3'-AGACTG-5', 1934, 3'-TCACGC-5', 1991, 3'-AGACTC-5', 2026, 3'-ATACTG-5', 2162, 3'-TGACCG-5', 2190, 3'-TCACGC-5', 2207, 3'-ACACTT-5', 2551, 3'-TCACTT-5', 2578, 3'-TGACTC-5', 2787, 3'-ATACCT-5', 2994, 3'-TCACCC-5', 3057, 3'-TCACTT-5', 3101, 3'-TCACTT-5', 3240, 3'-TCACGC-5', 3280, 3'-AGACTG-5', 3425, 3'-ATACTG-5', 3541, 3'-ATACGC-5', 3547, 3'-ATACCT-5', 3859, 3'-ACACCT-5', 3968, 3'-ACACTT-5', 3983, 3'-TCACTT-5', 4010, 3'-AGACTC-5', 4054, 3'-TCACTT-5', 4161, 3'-AGACCC-5', 4205, 3'-AGACGT-5', 4236, 3'-ACACTG-5', 4336, 3'-AGACCC-5', 4366.
  16. inverse, positive strand, positive direction, is SuccessablesInr2i++.bas, looking for 5'-A/T-C/G/T-A-C-C/G/T-C/G/T-3', 94, 3'-TCACCC-5', 54, 3'-AGACGT-5', 224, 3'-ACACTT-5', 231, 3'-TGACGG-5', 238, 3'-AGACTC-5', 256, 3'-AGACCT-5', 271, 3'-TGACCC-5', 348, 3'-TCACGC-5', 497, 3'-TCACGC-5', 581, 3'-TCACGC-5', 665, 3'-TGACGC-5', 749, 3'-ACACCG-5', 819, 3'-TGACGG-5', 901, 3'-ACACCG-5', 919, 3'-TGACGC-5', 1001, 3'-ACACCG-5', 1023, 3'-TCACGC-5', 1085, 3'-TCACGC-5', 1160, 3'-TCACGC-5', 1169, 3'-TCACGC-5', 1253, 3'-TGACTC-5', 1287, 3'-TTACGC-5', 1321, 3'-AGACCG-5', 1377, 3'-AGACGC-5', 1396, 3'-TTACGC-5', 1421, 3'-AGACCG-5', 1477, 3'-AGACGC-5', 1496, 3'-TGACGT-5', 1505, 3'-TCACGC-5', 1589, 3'-TCACGC-5', 1725, 3'-TCACGT-5', 1786, 3'-ACACCT-5', 1806, 3'-AGACCC-5', 1865, 3'-TGACCC-5', 1954, 3'-ACACCG-5', 1972, 3'-AGACCG-5', 1993, 3'-TCACGT-5', 2063, 3'-TCACCG-5', 2068, 3'-ATACCG-5', 2160, 3'-TGACGT-5', 2204, 3'-TCACGT-5', 2326, 3'-ACACGT-5', 2681, 3'-TCACCT-5', 2712, 3'-TGACGG-5', 2823, 3'-TTACTG-5', 2842, 3'-AGACGT-5', 2857, 3'-AGACCG-5', 2884, 3'-TTACCC-5', 2911, 3'-AGACTG-5', 2944, 3'-AGACTC-5', 2951, 3'-ACACGT-5', 2960, 3'-AGACCG-5', 2984, 3'-AGACTC-5', 3007, 3'-TCACGG-5', 3011, 3'-ATACTG-5', 3028, 3'-AGACGT-5', 3061, 3'-TTACGT-5', 3070, 3'-TGACCG-5', 3118, 3'-AGACTC-5', 3124, 3'-ATACCT-5', 3163, 3'-TTACCC-5', 3169, 3'-TCACGG-5', 3235, 3'-ATACTC-5', 3261, 3'-AGACGT-5', 3268, 3'-AGACGT-5', 3279, 3'-TGACGT-5', 3320, 3'-TGACCG-5', 3346, 3'-AGACGG-5', 3359, 3'-AGACCG-5', 3406, 3'-TTACGG-5', 3431, 3'-ACACCT-5', 3437, 3'-TTACTT-5', 3442, 3'-TTACTC-5', 3446, 3'-TCACCC-5', 3450, 3'-TCACGT-5', 3464, 3'-TTACTG-5', 3568, 3'-ACACTT-5', 3595, 3'-TCACTG-5', 3713, 3'-TGACTC-5', 3736, 3'-TTACTG-5', 3783, 3'-TTACTT-5', 3836, 3'-TCACTC-5', 3877, 3'-ACACTC-5', 3904, 3'-AGACTT-5', 3925, 3'-ACACGT-5', 3960, 3'-ACACTG-5', 3972, 3'-TCACCC-5', 4041, 3'-TGACTT-5', 4090, 3'-TTACTC-5', 4095, 3'-TCACGG-5', 4274, 3'-TGACGT-5', 4341, 3'-TCACTC-5', 4351, 3'-ACACCC-5', 4395, 3'-AGACCC-5', 4417.
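The complement and inverse patterns quoted in entries 5 to 16 follow from BBCABW by simple string operations. The sketch below (Python, not the SuccessablesInr2 programs themselves) derives them using IUPAC codes, where B = C/G/T, W = A/T and V = A/C/G. The function names and the `matches` helper are hypothetical, introduced only for illustration.

```python
# IUPAC complements: each code maps to the code for the complementary bases.
COMPLEMENT = {
    "A": "T", "T": "A", "C": "G", "G": "C",
    "W": "W",  # A/T pairs with T/A
    "B": "V",  # C/G/T pairs with G/C/A
    "V": "B",
}

# Bases allowed by each code, written the way the lists above spell them out.
BASES = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "W": "A/T", "B": "C/G/T", "V": "A/C/G",
}

def complement(motif):
    """Complement each position, keeping the reading order."""
    return "".join(COMPLEMENT[code] for code in motif)

def inverse(motif):
    """Read the motif backwards without complementing."""
    return motif[::-1]

def inverse_complement(motif):
    """Reverse the motif, then complement each position."""
    return complement(inverse(motif))

def spell_out(motif):
    """Write a motif as a dash-separated list of allowed bases."""
    return "-".join(BASES[code] for code in motif)

def matches(site, motif):
    """True if a concrete site fits the degenerate motif position by position."""
    return len(site) == len(motif) and all(
        base in BASES[code].split("/") for base, code in zip(site, motif)
    )

print(spell_out(complement("BBCABW")))          # A/C/G-A/C/G-G-T-A/C/G-A/T
print(spell_out(inverse("BBCABW")))             # A/T-C/G/T-A-C-C/G/T-C/G/T
print(spell_out(inverse_complement("BBCABW")))  # A/T-A/C/G-T-G-A/C/G-A/C/G
```

As a spot check, `matches("TCTGGG", inverse_complement("BBCABW"))` is true, in agreement with the first hit listed in entry 9.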

BBCABW UTRs

Negative strand, negative direction: TCTGGG at 4366, GTCACA at 4359, CCCACT at 4353, TGTGAC at 4336, GTCACT at 4319, TCCAGT at 4307, TCTGCA at 4236, TCTGGG at 4205, GTCACT at 4200, AGTGAA at 4161, TCTGAG at 4054, AGTGAA at 4010, TGTGAA at 3983, TGTGGA at 3968, TTCACA at 3939, CTCATT at 3891, TATGGA at 3859, CTCATA at 3829, TCCACT at 3825, TATGCG at 3547, TATGAC at 3541, GTCATT at 3480, TCTGAC at 3425, TTCACT at 3410, AGTGCG at 3280, AGTGAA at 3240, CCCACA at 3184, TCCACT at 3144, AGTGAA at 3101, AGTGGG at 3057, TATGGA at 2994, TTCACA at 2860.

Positive strand, negative direction: AATGAG at 4556, TTCACA at 4531, CCCACT at 4485, TCCACT at 4459, CCCAGA at 4448, TCCACT at 4423, ACTGCA at 4315, GCCAGT at 4415, TGTGAG at 4362, TGCACT at 4340, ACTGCA at 4338, ACTGCA at 4330, AGTGAG at 4320, TGCAGT at 4317, GCCAGA at 4233, AGTGAG at 4201, TGTGAG at 4093, AGTGAG at 4050, CTCACA at 3965, TGTGGC at 3960, CCCATA at 3856, ACTGCC at 3852, TCTGGA at 3836, AATGCA at 3771, ACTGGG at 3750, TGTGGG at 3712, TCCACA at 3692, GCCATT at 3686, AATGGG at 3660, CTCAGA at 3644, GGCACA at 3632, GTCAGA at 3625, GGCAGT at 3600, GGCAGA at 3589, GGCAGT at 3478, TGTGCC at 3561, GGCATA at 3451, GGCATA at 3445, TGCAGA at 3431, TGTGCA at 3429, AGTGAC at 3411, TGCACT at 3289, GCCATT at 3284, TGTGAG at 3268, AATGGC at 3005, TGTGCA at 2863.

BBCABW core promoters

Negative strand, positive direction: TCTGGG at 4417, TGTGGG at 4395, AGTGAG at 4351, ACTGCA at 4341, CTCACT at 4338, CCCAGA at 4330, TGCAGA at 4317, CTCATT at 4309, AGTGCC at 4274, GTCAGT at 4271.

Positive strand, positive direction: CCCAGA at 4414, CCCACT at 4399, CTCACT at 4350, AGTGAC at 4339, TGTGAG at 4335, AGTGGG at 4326, TCTGCG at 4320, TCCAGT at 4269.

BBCABW proximal promoters

Negative strand, negative direction: ACTGAG at 2787, GTCACT at 2739, GTCACA at 2656, GTCACA at 2603.

Positive strand, negative direction: ACTGCA at 2759, GCCACT at 2756, AGTGAG at 2740, TGCAGT at 2737, GGCACA at 2665, GCCAGT at 2654, TCCACT at 2632, TGTGGC at 2606.

Negative strand, positive direction: CTCAGA at 4195, AATGAG at 4095, ACTGAA at 4090.

Positive strand, positive direction: TGTGCC at 4259, ACTGGG at 4217, AGTGGG at 4204, AGTGAG at 4127, AGTGAC at 4088, CGCAGA at 4056.

BBCABW distal promoters

Negative strand, negative direction: TCCAGT at 2585, AGTGAA at 2578, TGTGAA at 2551, CTCACT at 2447, GTCACT at 2404, TCCAGT at 2248, AGTGCG at 2207, ACTGGC at 2190, TATGAC at 2162, GTCACA at 2085, TCTGAG at 2026, AGTGCG at 1991, GTCACT at 1978, TCTGAC at 1934, TGCAGA at 1774, AGTGCA at 1772, TCTGAA at 1617, TGTGAA at 1544, GGCAGT at 1511, AGTGAC at 1492, CTCAGA at 1444, TCTGAG at 1403, GTCAGA at 1354, GTCACT at 1325, AATGAA at 1298, GGCACA at 1220, AGTGGA at 1171, TGTGGA at 1129, TCTGAG at 1082, CTCACT at 1077, AGTGAG at 1057, ACTGAA at 1052, CCCACT at 1049, GTCACT at 1034, TGTGCG at 963, TCTGAG at 916, GCCACT at 868, GGCAGA at 754, TGTGGG at 749, TCCAGT at 712, AGTGCG at 663, TCCAGT at 576, TCCAGT at 568, TGCATT at 533, TGTGCA at 531, TCCAGT at 439, TGTGCA at 342, TTCACA at 322, GTCACT at 299, CTCAGA at 278, CCCAGT at 206, TCCATA at 179, TGTGGA at 62, TCTGAC at 16.

Positive strand, negative direction: AGTGAG at 2448, TGCACT at 2426, ACTGCA at 2424, AGTGAG at 2405, TGCAGT at 2402, GCCAGT at 2211, AATGAC at 2188, TGCAGT at 2083, TGTGGC at 2066, TGCACT at 2000, ACTGCA at 1998, GCCACT at 1995, AGTGAG at 1979, TGCAGT at 1976, GGCAGA at 1967, AATGGC at 1949, ACTGAG at 1936, TATGGC at 1743, TGCACA at 1719, AATGCC at 1634, AATGAA at 1581, AGTGCA at 1535, TCCAGT at 1532, CCCAGA at 1518, ACTGCA at 1494, CTCACT at 1491, TGCAGT at 1472, AGTGCA at 1470, CCCAGA at 1411, TCCATT at 1378, TCTGGG at 1357, TCCAGT at 1352, TGCACT at 1347, AGTGAG at 1326, TGCAGT at 1323, GGCAGA at 1314, CTCACA at 1126, AGTGGC at 1121, GGCACA at 1116, AGTGAG at 1078, TTCACT at 1056, AGTGAG at 1035, TGCAGT at 1032, GGCAGA at 1023, GGCACA at 960, AGTGGA at 523, GGCACA at 518, AGTGAA at 472, AGTGCG at 447, ACTGAC at 308, AGTGAG at 300, TATGAG at 275, GGCACA at 266, GTCACT at 208, TGCATT at 152, ACTGAA at 131, TATGGG at 78, GCCATA at 39, ACTGAA at 18.

Negative strand, positive direction: GTCACT at 2425, TGCAGT at 2328, AGTGCA at 2326, TTCACT at 2304, CTCAGA at 2239, GTCAGA at 2222, TGCATT at 2206, ACTGCA at 2204, CTCATA at 2176, TATGGC at 2160, TTCAGT at 2098, GCCACT at 2072, AGTGGC at 2068, TGCAGT at 2065, AGTGCA at 2063, CTCAGT at 2060, TCCACA at 2029, CCCAGT at 2024, GGCACT at 1996, TCTGGC at 1993, TGTGGC at 1972, ACTGGG at 1954, TGCAGA at 1937, TCCACT at 1912, TCTGGG at 1865, TGCACA at 1822, TGTGGA at 1806, AGTGCA at 1786, CCCAGA at 1742, AGTGCG at 1725, GGCATT at 1702, AGTGCG at 1589, CGCACA at 1556, ACTGCA at 1505, CCCACT at 1502, TCTGCG at 1496, TCTGGC at 1477, TGCACT at 1472, AATGCG at 1421, CGCAGA at 1416, TCTGCG at 1396, TCTGGC at 1377, TGCACT at 1372, CGCAGA at 1316, AATGCG at 1321, ACTGAG at 1287, AGTGCG at 1253, CCCAGT at 1250, TGCACA at 1220, AGTGCG at 1169, AGTGCG at 1160, CGCACA at 1136, AGTGCG at 1085, CGCACA at 1052, TGTGGC at 1023, ACTGCG at 1001, GCCACA at 984, GCCAGA at 935, TGTGGC at 919, ACTGCC at 901, GCCACA at 884, GCCAGA at 835, CGCACA at 800, TGTGGC at 819, ACTGCG at 749, CGCACT at 686, AGTGCG at 665, TCCACA at 632, AGTGCG at 581, TGCACA at 548, AGTGCG at 497, CCCAGA at 468, TGCAGA at 438, CGCAGA at 396, ACTGGG at 348, GCCACA at 343, TCTGGA at 271, TCTGAG at 256, ACTGCC at 238, TGTGAA at 231, TCTGCA at 224, CCCAGA at 204, GTCACA at 155, AGTGGG at 54, GGCATT at 22, TCCAGA at 15.

Positive strand, positive direction: ACTGGA at 4019, GTCACA at 3964, TCCACT at 3934, TTCAGA at 3922, CTCACT at 3876, GTCACT at 3843, CCCAGT at 3820, ACTGGA at 3785, TCCAGA at 3771, AGTGCC at 3748, TCCATT at 3731, CTCACT at 3712, AGTGGG at 3613, GCCAGA at 3608, CTCACA at 3592, TCTGGA at 3551, TGTGGG at 3533, TGTGAG at 3508, AGTGAC at 3318, TGCAGA at 3256, AGTGCA at 3254, CTCAGA at 3187, ACTGAA at 3030, TCCAGA at 3019, TGTGGG at 2965, ACTGAA at 2946, AGTGAC at 2930, TCTGGA at 2862, TATGAA at 2740, TCCATA at 2642, TTCAGT at 2618, CTCAGT at 2613, GTCAGT at 2607, CGCACT at 2555, TTCACT at 2511, CCCAGA at 2489, GTCACA at 2464, TGTGGA at 2431, CGCAGT at 2423, TCTGAA at 2417, TCCACT at 2375, AGTGAC at 2341, AGTGGG at 2313, AGTGAG at 2305, TCCAGA at 2258, AGTGGA at 2248, TCCAGT at 2220, ACTGGC at 2214, TCCACT at 2128, GTCAGT at 2100, TCCACA at 1969, CCCAGA at 1958, AATGGG at 1889, CCCACA at 1803, TCTGAA at 1745, CGCACT at 1720, CCCAGA at 1711, TGTGCC at 1698, ACTGGG at 1663, TGTGCC at 1559, TGTGCC at 1223, TGTGAC at 1139, CGCACA at 1020, TGTGCG at 987, TGTGCG at 887, TGTGCG at 803, TGTGCA at 569, AATGAA at 525, TCTGGC at 441, TCTGCC at 399, TGTGAC at 346, TCTGAC at 236, TCCAGT at 153.

Inr-like, TCTs samplings

Searching for TTCTCT with the browser's find function ("⌘F") yields three occurrences between ZSCAN22 and A1BG and three between ZNF497 and A1BG, which the computer programs can also find.

The Basic programs (starting with SuccessablesTCT.bas) test the consensus sequence TTCTCT by comparing it with the nucleotide sequences on either the template strand (-) or the coding strand (+) of the DNA, in the negative direction (-) or the positive direction (+). For each program, the list below gives the strand and direction searched, the sequence looked for, the number of occurrences found, and their positions:

  1. negative strand, negative direction, looking for TTCTCT, 3, TTCTCT at 3380, TTCTCT at 2826, TTCTCT at 2809.
  2. negative strand, positive direction, looking for TTCTCT, 4, TTCTCT at 4386, TTCTCT at 1990, TTCTCT at 139, TTCTCT at 119.
  3. positive strand, negative direction, looking for TTCTCT, 1, TTCTCT at 622.
  4. positive strand, positive direction, looking for TTCTCT, 0.
  5. complement, negative strand, negative direction, looking for AAGAGA, 1, AAGAGA at 622.
  6. complement, negative strand, positive direction, looking for AAGAGA, 0.
  7. complement, positive strand, negative direction, looking for AAGAGA, 3, AAGAGA at 3380, AAGAGA at 2826, AAGAGA at 2809.
  8. complement, positive strand, positive direction, looking for AAGAGA, 4, AAGAGA at 4386, AAGAGA at 1990, AAGAGA at 139, AAGAGA at 119.
  9. inverse complement, negative strand, negative direction, looking for AGAGAA, 1, AGAGAA at 4527.
  10. inverse complement, negative strand, positive direction, looking for AGAGAA, 0.
  11. inverse complement, positive strand, negative direction, looking for AGAGAA, 3, AGAGAA at 3406, AGAGAA at 2827, AGAGAA at 2810.
  12. inverse complement, positive strand, positive direction, looking for AGAGAA, 2, AGAGAA at 4387, AGAGAA at 3056.
  13. inverse, negative strand, negative direction, looking for TCTCTT, 3, TCTCTT at 3406, TCTCTT at 2827, TCTCTT at 2810.
  14. inverse, negative strand, positive direction, looking for TCTCTT, 2, TCTCTT at 4387, TCTCTT at 3056.
  15. inverse, positive strand, negative direction, looking for TCTCTT, 1, TCTCTT at 4527.
  16. inverse, positive strand, positive direction, looking for TCTCTT, 0.
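
The sixteen searches above use the consensus TTCTCT and three derived forms: the complement (AAGAGA), the inverse (TCTCTT) and the inverse complement (AGAGAA). The following minimal Python sketch, which is not the original Basic code, shows how the three derived forms follow from the consensus and how occurrences might be located. The function names, the demonstration sequence and the 1-based position convention are assumptions made for illustration.

```python
# Translation table for Watson-Crick base pairing.
COMPLEMENT = str.maketrans("ACGT", "TGCA")

def motif_variants(motif):
    """Return the four forms of a motif searched for by the programs."""
    motif = motif.upper()
    complement = motif.translate(COMPLEMENT)
    return {
        "direct": motif,
        "complement": complement,
        "inverse": motif[::-1],                # same bases, reversed order
        "inverse complement": complement[::-1],
    }

def find_all(sequence, motif):
    """Return 1-based start positions (an assumption) of every occurrence,
    including overlapping ones."""
    positions = []
    start = sequence.find(motif)
    while start != -1:
        positions.append(start + 1)
        start = sequence.find(motif, start + 1)
    return positions

variants = motif_variants("TTCTCT")

# Hypothetical demonstration sequence, not taken from the A1BG region.
demo = "CCTTCTCTTGGAGAGAACC"
found = {name: find_all(demo, seq) for name, seq in variants.items()}
for name, seq in variants.items():
    print(name, seq, found[name])
```

On the demonstration sequence this reports TTCTCT at 3, TCTCTT at 4 and AGAGAA at 12, with no AAGAGA. Running the same four searches over each strand and direction would give sixteen result sets laid out like the list above.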

TCT core promoters

Negative strand, negative direction: AGAGAA at 4527, and complement.

Negative strand, positive direction: TTCTCT at 4386, and complement.

Positive strand, positive direction: AGAGAA at 4387, and complement.

TCT distal promoters

Negative strand, negative direction: TTCTCT at 3380, TTCTCT at 2826, TTCTCT at 2809, and complements.

Positive strand, negative direction: AGAGAA at 3406, AGAGAA at 2827, AGAGAA at 2810, TTCTCT at 622, and complements.

Negative strand, positive direction: TTCTCT at 1990, TTCTCT at 139, TTCTCT at 119, and complements.

Positive strand, positive direction: AGAGAA at 3056, and complement.

Acknowledgements

The content on this page was first contributed by: Henry A. Hoff.

Initial content for this page in some instances came from Wikiversity.

References

  1. Smale, Stephen T.; Baltimore, David (1989-04-07). "The "initiator" as a transcription control element". Cell. 57 (1): 103–113. doi:10.1016/0092-8674(89)90176-1. ISSN 0092-8674. PMID 2467742.
  2. Gershenzon, Naum I.; Ioshikhes, Ilya P. (2005-04-15). "Synergy of human Pol II core promoter elements revealed by statistical sequence analysis". Bioinformatics. 21 (8): 1295–1300. doi:10.1093/bioinformatics/bti172. ISSN 1367-4803.
  3. Lim, Chin Yan; Santoso, Buyung; Boulay, Thomas; Dong, Emily; Ohler, Uwe; Kadonaga, James T. (2004-07-01). "The MTE, a new core promoter element for transcription by RNA polymerase II". Genes & Development. 18 (13): 1606–1617. doi:10.1101/gad.1193404. ISSN 0890-9369. PMC 443522. PMID 15231738.
  4. Kaufmann, J.; Smale, S. T. (1994-04-01). "Direct recognition of initiator elements by a component of the transcription factor IID complex". Genes & Development. 8 (7): 821–829. doi:10.1101/gad.8.7.821. ISSN 0890-9369. PMID 7926770.
  5. O'Shea-Greenfield, A.; Smale, S. T. (1992-01-15). "Roles of TATA and initiator elements in determining the start site location and direction of RNA polymerase II transcription". The Journal of Biological Chemistry. 267 (2): 1391–1402. ISSN 0021-9258. PMID 1730658.
  6. Yang, Chuhu; Bolotin, Eugene; Jiang, Tao; Sladek, Frances M.; Martinez, Ernest (2007-03-01). "Prevalence of the Initiator over the TATA box in human and yeast genes and identification of DNA motifs enriched in human TATA-less core promoters". Gene. 389 (1): 52–65. doi:10.1016/j.gene.2006.09.029. ISSN 0378-1119. PMC 1955227. PMID 17123746.
  7. Ngoc, Long Vo; Cassidy, California Jack; Huang, Cassidy Yunjing; Duttke, Sascha H. C.; Kadonaga, James T. (2017-01-20). "The human initiator is a distinct and abundant element that is precisely positioned in focused core promoters". Genes & Development. doi:10.1101/gad.293837.116. ISSN 0890-9369. PMC 5287114. PMID 28108474.
  8. Javahery, R; Khachi, A; Lo, K; Zenzie-Gregory, B; Smale, S T (1994-01-01). "DNA sequence requirements for transcriptional initiator activity in mammalian cells". Molecular and Cellular Biology. 14 (1): 116–127. doi:10.1128/mcb.14.1.116. ISSN 0270-7306. PMC 358362. PMID 8264580.
  9. Takuya Matsumoto, Saemi Kitajima, Chisato Yamamoto, Mitsuru Aoyagi, Yoshiharu Mitoma, Hiroyuki Harada and Yuji Nagashima (9 August 2020). "Cloning and tissue distribution of the ATP-binding cassette subfamily G member 2 gene in the marine pufferfish Takifugu rubripes" (PDF). Fisheries Science. 86: 873–887. doi:10.1007/s12562-020-01451-z. Retrieved 27 September 2020.
  10. Gillian E. Chalkley and C. Peter Verrijzer (September 1, 1999). "DNA binding site selection by RNA polymerase II TAFs: a TAFII250-TAFII150 complex recognizes the Initiator" (PDF). The EMBO Journal. 18 (17): 4835–45. PMID 10469661. Retrieved 2012-04-26.
  11. J. Carcamo, L. Buckbinder, D. Reinberg (1991). Proceedings of the National Academy of Sciences USA. 88: 8052–6.
  12. L. Weis and D. Reinberg (1997). "Accurate positioning of RNA polymerase II on a natural TATA-less promoter is independent of TATA-binding protein associated factors and initiator-binding proteins". Mol. Cell. Biol. 17: 2973–84.
  13. RefSeq (May 2009). "BRCA1 BRCA1, DNA repair associated [ Homo sapiens (human) ]". 8600 Rockville Pike, Bethesda MD, 20894 USA: National Center for Biotechnology Information, U.S. National Library of Medicine. Retrieved 22 December 2018.
  14. RefSeq (February 2010). "BRCA1 BRCA1, DNA repair associated [ Homo sapiens (human) ]". 8600 Rockville Pike, Bethesda MD, 20894 USA: National Center for Biotechnology Information, U.S. National Library of Medicine. Retrieved 22 December 2018.
  15. Schlegel BP, Starita LM, Parvin JD (February 2003). "Overexpression of a protein fragment of RNA helicase A causes inhibition of endogenous BRCA1 function and defects in ploidy and cytokinesis in mammary epithelial cells". Oncogene. 22 (7): 983–91. doi:10.1038/sj.onc.1206195. PMID 12592385.
  16. Anderson SF, Schlegel BP, Nakajima T, Wolpin ES, Parvin JD (July 1998). "BRCA1 protein is linked to the RNA polymerase II holoenzyme complex via RNA helicase A". Nat. Genet. 19 (3): 254–6. doi:10.1038/930. PMID 9662397.
  17. Lee CG, Hurwitz J (Aug 1993). "Human RNA helicase A is homologous to the maleless protein of Drosophila". The Journal of Biological Chemistry. 268 (22): 16822–30. PMID 8344961.
  18. Zhang S, Grosse F (April 1997). "Domain structure of human nuclear DNA helicase II (RNA helicase A)". The Journal of Biological Chemistry. 272 (17): 11487–94. doi:10.1074/jbc.272.17.11487. PMID 9111062.
  19. Archambault J, Chambers RS, Kobor MS, Ho Y, Cartier M, Bolotin D, Andrews B, Kane CM, Greenblatt J (February 1998). "An essential component of a C-terminal domain phosphatase that interacts with transcription factor IIF in Saccharomyces cerevisiae". Proc Natl Acad Sci U S A. 94 (26): 14300–5. Bibcode:1997PNAS...9414300A. doi:10.1073/pnas.94.26.14300. PMC 24951. PMID 9405607.
  20. Archambault J, Pan G, Dahmus GK, Cartier M, Marshall N, Zhang S, Dahmus ME, Greenblatt J (November 1998). "FCP1, the RAP74-interacting subunit of a human protein phosphatase that dephosphorylates the carboxyl-terminal domain of RNA polymerase IIO". J Biol Chem. 273 (42): 27593–601. doi:10.1074/jbc.273.42.27593. PMID 9765293.
  21. "Entrez Gene: CTDP1 CTD (carboxy-terminal domain, RNA polymerase II, polypeptide A) phosphatase, subunit 1".
  22. RefSeq (February 2011). "CTDP1 CTD phosphatase subunit 1 [ Homo sapiens (human) ]". 8600 Rockville Pike, Bethesda MD, 20894 USA: National Center for Biotechnology Information, U.S. National Library of Medicine. Retrieved 22 December 2018.
  23. Licciardo, Paolo; Amente Stefano; Ruggiero Luca; Monti Maria; Pucci Piero; Lania Luigi; Majello Barbara (Feb 2003). "The FCP1 phosphatase interacts with RNA polymerase II and with MEP50 a component of the methylosome complex involved in the assembly of snRNP". Nucleic Acids Res. England. 31 (3): 999–1005. doi:10.1093/nar/gkg197. PMC 149217. PMID 12560496.
  24. Scully, R; Anderson S F; Chao D M; Wei W; Ye L; Young R A; Livingston D M; Parvin J D (May 1997). "BRCA1 is a component of the RNA polymerase II holoenzyme". Proc. Natl. Acad. Sci. U.S.A. UNITED STATES. 94 (11): 5605–10. Bibcode:1997PNAS...94.5605S. doi:10.1073/pnas.94.11.5605. ISSN 0027-8424. PMC 20825. PMID 9159119.
  25. RefSeq (September 2011). "DDX53 DEAD-box helicase 53 [ Homo sapiens (human) ]". 8600 Rockville Pike, Bethesda MD, 20894 USA: National Center for Biotechnology Information, U.S. National Library of Medicine. Retrieved 22 December 2018.
  26. DR Liston, PJ Johnson (March 1999). "Analysis of a Ubiquitous Promoter Element in a Primitive Eukaryote: Early Evolution of the Initiator Element". Molecular and Cellular Biology. 19 (3): 2380–8. PMID 10022924.
  27. C Yang, E Bolotin, T Jiang, FM Sladek, E Martinez (March 2007). "Prevalence of the initiator over the TATA box in human and yeast genes and identification of DNA motifs enriched in human TATA-less core promoters". Gene. 389 (1): 52–65. doi:10.1016/j.gene.2006.09.029. PMC 1955227. PMID 17123746.
  28. JE Purdy, BJ Mann, LT Pho, WA Petri Jr (July 19, 1994). "Transient transfection of the enteric parasite Entamoeba histolytica and expression of firefly luciferase". Proceedings of the National Academy of Science USA. 91 (15): 7099–103. PMID 8041752. Retrieved 2012-06-10.
  29. Hualin Xi, Yong Yu, Yutao Fu, Jonathan Foley, Anason Halees, and Zhiping Weng (June 2007). "Analysis of overrepresented motifs in human core promoters reveals dual regulatory roles of YY1". Genome Research. 17 (6): 798–806. doi:10.1101/gr.5754707. PMC 1891339. PMID 17567998.
  30. R. Javahery, A. Khachi, K. Lo, B. Zenzie-Gregory, S. T. Smale (January 1994). "DNA Sequence Requirements for Transcriptional Initiator Activity in Mammalian Cells". Molecular and Cellular Biology. 14 (1): 116–27. PMID 8264580.
  31. Ananda L. Roy (August 2001). "Biochemistry and biology of the inducible multifunctional transcription factor TFII-I" (PDF). Gene. 274 (1–2): 1–13. doi:10.1016/S0378-1119(01)00625-4. Retrieved 2012-04-06.
  32. HGNC:11535 (March 24, 2012). "TAF1 RNA polymerase II, TATA box binding protein (TBP)-associated factor, 250kDa". Bethesda, Maryland: NCBI. Retrieved 2012-04-09.
  33. ST Smale (March 1997). "Transcription initiation from TATA-less promoters within eukaryotic protein-coding genes". Biochimica & Biophysica Acta. 1351 (1–2): 73–88. doi:10.1016/S0167-4781(96)00206-0. PMID 9116046.
  34. KH Emami, A Jain, ST Smale (1997). Genes Development. 11: 3007–19.
  35. Benjamin Lewin (2004). Genes VIII. Upper Saddle River, NJ: Pearson Prentice Hall. pp. 636–637. ISBN 0-13-144946-X.
  36. AL Roy, M Meisterernst, P. Pognonec, RG Roeder (1991). Nature. 354: 245–8.
  37. AL Roy, S. Malik, M. Meisterernst, RG Roeder (1993). 365: 355–9.
  38. Stephen T. Smale and James T. Kadonaga (July 2003). "The RNA Polymerase II Core Promoter" (PDF). Annual Review of Biochemistry. 72 (1): 449–79. doi:10.1146/annurev.biochem.72.121801.161520. PMID 12651739. Retrieved 2012-05-07.
  39. Tamar Juven-Gershon, Jer-Yuan Hsu, Joshua W. M. Theisen, and James T. Kadonaga (June 2008). "The RNA Polymerase II Core Promoter – the Gateway to Transcription". Current Opinion in Cell Biology. 20 (3): 253–9. doi:10.1016/j.ceb.2008.03.003. Retrieved 2013-02-13.
