Initiator element gene transcriptions: Difference between revisions

Line 401:
{{main|A1BG gene transcription core promoters}}


===5'-YYRNWYY-3'===
===YYRNWYY===


The wider consensus sequence of 3'-YYRNWYY-5' allows a G at the TSS but allows at most two Gs in a row.<ref name=Gershon2008>{{ cite journal
The wider consensus sequence of 3'-YYRNWYY-5' allows a G at the TSS but allows at most two Gs in a row.<ref name=Gershon2008>{{ cite journal
Line 419:


Basic programs (starting with SuccessablesInr.bas) were written to compare nucleotide sequences with the sequences on either the template strand (-) or the coding strand (+) of the DNA, in the negative direction (-) or the positive direction (+). The programs, the patterns they look for, and the matches found are:
# negative strand in the negative direction (from ZSCAN22 to A1BG) is SuccessablesInr--.bas, looking for 3'-C/T-C/T-A/G-A/C/G/T-A/T-C/T-C/T-5', 121, 3'-TTGTTCC-5', 71, 3'-CTATACC-5', 77, 3'-CCGTTTC-5', 93, 3'-CCGTACT-5', 124, 3'-CCATATT-5', 181, 3'-CTACATT-5', 247, 3'-TTGGTCC-5', 262, 3'-TTATACT-5', 274, 3'-TCACTCT-5', 301, 3'-CTGCTTT-5', 312, 3'-CCGGTTC-5', 419, 3'-CCAGTCC-5', 441, 3'-TCGGACC-5', 459, 3'-TTGTATC-5', 468, 3'-TCACTTT-5', 473, 3'-TCGGACC-5', 508, 3'-CCGGTTC-5', 556, 3'-CCAGTCC-5', 578, 3'-TTATACC-5', 605, 3'-CCGGTCC-5', 648, 3'-CCGGTTC-5', 692, 3'-CCAGTCC-5', 714, 3'-TCGGACT-5', 732, 3'-TCGCACC-5', 741, 3'-CTACACC-5', 787, 3'-TCGGTTC-5', 874, 3'-TCGGACC-5', 899, 3'-TCGCTCT-5', 913, 3'-TCGGTCC-5', 948, 3'-CCGTACC-5', 953, 3'-TTAGTCC-5', 984, 3'-TTGGACC-5', 1015, 3'-TCACTCT-5', 1079, 3'-TCGGACC-5', 1198, 3'-TTGTACC-5', 1207, 3'-CCACTTT-5', 1212, 3'-CCGCACC-5', 1244, 3'-TTGGATC-5', 1306, 3'-TCAGACC-5', 1356, 3'-TTATTCT-5', 1365, 3'-TCGTTTT-5', 1371, 3'-TTGTTTT-5', 1394, 3'-CCACACT-5', 1479, 3'-TTGCTTC-5', 1555, 3'-CCGTTTT-5', 1561, 3'-TTACTTT-5', 1582, 3'-TTGGATT-5', 1591, 3'-TTAATTT-5', 1697, 3'-TTATACC-5', 1742, 3'-CCGCACC-5', 1897, 3'-CCGTACT-5', 1953, 3'-TTGGACC-5', 1959, 3'-TCGGACC-5', 2009, 3'-TCGTTCT-5', 2023, 3'-TTACACC-5', 2065, 3'-CCGGTCC-5', 2077, 3'-TCACATT-5', 2087, 3'-TCAAACT-5', 2141, 3'-TTGTACC-5', 2152, 3'-CCGCTTT-5', 2157, 3'-CCAGTCC-5', 2250, 3'-TCAAACT-5', 2257, 3'-TCGGACC-5', 2268, 3'-TCGTACC-5', 2277, 3'-CCACTTT-5', 2282, 3'-TTGGACC-5', 2385, 3'-TCGGACC-5', 2435, 3'-TCACTCT-5', 2449, 3'-TCGTTTT-5', 2476, 3'-TTGTTTT-5', 2490, 3'-TCATTCT-5', 2503, 3'-CCGGTCC-5', 2519, 3'-CCAGTCC-5', 2587, 3'-TCACACC-5', 2605, 3'-TTGTACC-5', 2614, 3'-CCACTTT-5', 2619, 3'-TCACACC-5', 2658, 3'-TTGGACC-5', 2720, 3'-TCGGACC-5', 2770, 3'-TCGTACT-5', 2784, 3'-TTGATTC-5', 2914, 3'-CCGATTT-5', 3009, 3'-TTGATTC-5', 3031, 3'-CCGCACC-5', 3047, 3'-TCGGACC-5', 3128, 3'-TTGTTCC-5', 3141, 3'-CCACTTT-5', 3146, 3'-TTGTATT-5', 3169, 3'-CCACACC-5', 3186, 3'-TCGGTTC-5', 
3273, 3'-TCGGACC-5', 3298, 3'-TTGTTCT-5', 3307, 3'-TCGTTTT-5', 3313, 3'-TTGTTCT-5', 3340, 3'-TCGTTCT-5', 3374, 3'-CCGAACT-5', 3401, 3'-CCGTATC-5', 3446, 3'-TTGATCT-5', 3463, 3'-TTGGTCT-5', 3486, 3'-CTGTTCT-5', 3759, 3'-CTACACC-5', 3810, 3'-CTGGTCC-5', 3871, 3'-TCATTCT-5', 3893, 3'-CTACTTT-5', 3922, 3'-CCGGTCC-5', 3951, 3'-TCGGACC-5', 4037, 3'-TTGTATC-5', 4046, 3'-TCACTCT-5', 4051, 3'-TTACACT-5', 4092, 3'-CCGGTCC-5', 4102, 3'-CCGTACC-5', 4107, 3'-CCGGTCC-5', 4170, 3'-TCGAACC-5', 4188, 3'-TCACTCT-5', 4202, 3'-TCGGTCT-5', 4233, 3'-CTGCACC-5', 4238, 3'-TCGGACC-5', 4300, 3'-CCAGTTT-5', 4309, 3'-TCGGACC-5', 4349, 3'-TCACACT-5', 4361, 3'-TTACTCC-5', 4557,
# Negative strand, negative direction: 121, TTACTCC at 4557, TCACACT at 4361, TCGGACC at 4349, CCAGTTT at 4309, TCGGACC at 4300, CTGCACC at 4238, TCGGTCT at 4233, TCACTCT at 4202, TCGAACC at 4188, CCGGTCC at 4170, CCGTACC at 4107, CCGGTCC at 4102, TTACACT at 4092, TCACTCT at 4051, TTGTATC at 4046, TCGGACC at 4037, CCGGTCC at 3951, CTACTTT at 3922, TCATTCT at 3893, CTGGTCC at 3871, CTACACC at 3810, CTGTTCT at 3759, TTGGTCT at 3486, TTGATCT at 3463, CCGTATC at 3446, CCGAACT at 3401, TCGTTCT at 3374, TTGTTCT at 3340, TCGTTTT at 3313, TTGTTCT at 3307, TCGGACC at 3298, TCGGTTC at 3273, CCACACC at 3186, TTGTATT at 3169, CCACTTT at 3146, TTGTTCC at 3141, TCGGACC at 3128, CCGCACC at 3047, TTGATTC at 3031, CCGATTT at 3009, TTGATTC at 2914, TCGTACT at 2784, TCGGACC at 2770, TTGGACC at 2720, TCACACC at 2658, CCACTTT at 2619, TTGTACC at 2614, TCACACC at 2605, CCAGTCC at 2587, CCGGTCC at 2519, TCATTCT at 2503, TTGTTTT at 2490, TCGTTTT at 2476, TCACTCT at 2449, TCGGACC at 2435, TTGGACC at 2385, CCACTTT at 2282, TCGTACC at 2277, TCGGACC at 2268, TCAAACT at 2257, CCAGTCC at 2250, CCGCTTT at 2157, TTGTACC at 2152, TCAAACT at 2141, TCACATT at 2087, CCGGTCC at 2077, TTACACC at 2065, TCGTTCT at 2023, TCGGACC at 2009, TTGGACC at 1959, CCGTACT at 1953, CCGCACC at 1897, TTATACC at 1742, TTAATTT at 1697, TTGGATT at 1591, TTACTTT at 1582, CCGTTTT at 1561, TTGCTTC at 1555, CCACACT at 1479, TTGTTTT at 1394, TCGTTTT at 1371, TTATTCT at 1365, TCAGACC at 1356, TTGGATC at 1306, CCGCACC at 1244, CCACTTT at 1212, TTGTACC at 1207, TCGGACC at 1198, TCACTCT at 1079, TTGGACC at 1015, TTAGTCC at 984, CCGTACC at 953, TCGGTCC at 948, TCGCTCT at 913, TCGGACC at 899, TCGGTTC at 874, CTACACC at 787, TCGCACC at 741, TCGGACT at 732, CCAGTCC at 714, CCGGTTC at 692, CCGGTCC at 648, TTATACC at 605, CCAGTCC at 578, CCGGTTC at 556, TCGGACC at 508, TCACTTT at 473, TTGTATC at 468, TCGGACC at 459, CCAGTCC at 441, CCGGTTC at 419, CTGCTTT at 312, TCACTCT at 301, TTATACT at 274, TTGGTCC at 262, CTACATT at 247, CCATATT 
at 181, CCGTACT at 124, CCGTTTC at 93, CTATACC at 77, TTGTTCC at 71.
# negative strand in the positive direction (from ZNF497 to A1BG) is SuccessablesInr-+.bas, looking for 3'-C/T-C/T-A/G-A/C/G/T-A/T-C/T-C/T-5', 45, 5'-TTGTATT-3' at 115, 5'-CTGTTTT-3' at 147, 5'-CCACACT-3' at 345, 5'-CCGGACT-3' at 746, 5'-CTGCACT-3' at 1372, 5'-CTGCACT-3' at 1472, 5'-CCAGACT-3' at 1744, 5'-CCACTTC-3' at 1914, 5'-CTATTTC-3' at 1978, 5'-CCAGTCC-3' at 2026, 5'-TCGCTTC-3' at 2095, 5'-TCATATT-3' at 2178, 5'-CTGCATT-3' at 2206, 5'-CCAGATC-3' at 2230, 5'-TCAATCT-3' at 2235, 5'-CTGTTTC-3' at 2263, 5'-TCACTCT-3' at 2306, 5'-CTACACC-3' at 2430, 5'-CTAATTT-3' at 2440, 5'-CCGCACC-3' at 2566, 5'-TTATACC-3' at 2590, 5'-CCACACC-3' at 2602, 5'-CCACACT-3' at 2636, 5'-TCAGATT-3' at 2868, 5'-CTGCTCC-3' at 2978, 5'-CCAGTCC-3' at 2998, 5'-CCAGTCC-3' at 3084, 5'-CTGGTCT-3' at 3245, 5'-TCGCTCT-3' at 3276, 5'-CTGGTCT-3' at 3299, 5'-CTGCTCC-3' at 3309, 5'-CTGCACC-3' at 3322, 5'-CCGCATC-3' at 3328, 5'-TTGCACT-3' at 3343, 5'-CTGTTCC-3' at 3352, 5'-TTGCATC-3' at 3402, 5'-TCACACT-3' at 3507, 5'-CCAGACC-3' at 3550, 5'-CTGTTCC-3' at 3625, 5'-TCACACC-3' at 3824, 5'-TCATTTT-3' at 4120, 5'-TCACTCT-3' at 4128, 5'-TTGATTT-3' at 4134, 5'-TTAGTTT-3' at 4139, 5'-CTGCACC-3' at 4343.
# Negative strand, positive direction: 45, CTGCACC at 4343, TTAGTTT at 4139, TTGATTT at 4134, TCACTCT at 4128, TCATTTT at 4120, TCACACC at 3824, CTGTTCC at 3625, CCAGACC at 3550, TCACACT at 3507, TTGCATC at 3402, CTGTTCC at 3352, TTGCACT at 3343, CCGCATC at 3328, CTGCACC at 3322, CTGCTCC at 3309, CTGGTCT at 3299, TCGCTCT at 3276, CTGGTCT at 3245, CCAGTCC at 3084, CCAGTCC at 2998, CTGCTCC at 2978, TCAGATT at 2868, CCACACT at 2636, CCACACC at 2602, TTATACC at 2590, CCGCACC at 2566, CTAATTT at 2440, CTACACC at 2430, TCACTCT at 2306, CTGTTTC at 2263, TCAATCT at 2235, CCAGATC at 2230, CTGCATT at 2206, TCATATT at 2178, TCGCTTC at 2095, CCAGTCC at 2026, CTATTTC at 1978, CCACTTC at 1914, CCAGACT at 1744, CTGCACT at 1472, CTGCACT at 1372, CCGGACT at 746, CCACACT at 345, CTGTTTT at 147, TTGTATT at 115.
# positive strand in the negative direction is SuccessablesInr+-.bas, looking for 3'-C/T-C/T-A/G-A/C/G/T-A/T-C/T-C/T-5', 40, 3'-CTGAATT-5', 20, 3'-TTGGACC-5', 32, 3'-CTGCATT-5', 152, 3'-TTGAACC-5', 846, 3'-TCACACC-5', 882, 3'-TTGAACC-5', 1012, 3'-TCACTCC-5', 1058, 3'-TCACACC-5', 1128, 3'-TTGAACC-5', 1303, 3'-TTGCACC-5', 1339, 3'-TTGCACT-5', 1347, 3'-CCAGTCT-5', 1354, 3'-CCATTTC-5', 1380, 3'-TCGCTCT-5', 1450, 3'-CTATATC-5', 1528, 3'-TTATTTT-5', 1727, 3'-CTGCACT-5', 2000, 3'-CTACTCC-5', 2352, 3'-TTGAACC-5', 2382, 3'-TCACACC-5', 2418, 3'-CTGCACT-5', 2426, 3'-TTGAATC-5', 2708, 3'-TTGAACC-5', 2717, 3'-CTGCACC-5', 2761, 3'-TTGAACC-5', 3245, 3'-TTGCACT-5', 3289, 3'-CCAGATC-5', 3488, 3'-CTGCTCC-5', 3582, 3'-CCATTTC-5', 3688, 3'-CTGGACT-5', 3747, 3'-CTGAACC-5', 3784, 3'-CCATACC-5', 3858, 3'-TCACACC-5', 3967, 3'-CCGGACT-5', 4327, 3'-CTGCACT-5', 4340, 3'-CCAGTTC-5', 4417, 3'-CCACTCC-5', 4425, 3'-CCACTTT-5', 4461, 3'-TCACATT-5', 4533, 3'-TTAATTC-5', 4542,
# Positive strand, negative direction: 40, TTAATTC at 4542, TCACATT at 4533, CCACTTT at 4461, CCACTCC at 4425, CCAGTTC at 4417, CTGCACT at 4340, CCGGACT at 4327, TCACACC at 3967, CCATACC at 3858, CTGAACC at 3784, CTGGACT at 3747, CCATTTC at 3688, CTGCTCC at 3582, CCAGATC at 3488, TTGCACT at 3289, TTGAACC at 3245, CTGCACC at 2761, TTGAACC at 2717, TTGAATC at 2708, CTGCACT at 2426, TCACACC at 2418, TTGAACC at 2382, CTACTCC at 2352, CTGCACT at 2000, TTATTTT at 1727, CTATATC at 1528, TCGCTCT at 1450, CCATTTC at 1380, CCAGTCT at 1354, TTGCACT at 1347, TTGCACC at 1339, TTGAACC at 1303, TCACACC at 1128, TCACTCC at 1058, TTGAACC at 1012, TCACACC at 882, TTGAACC at 846, CTGCATT at 152, TTGGACC at 32, CTGAATT at 20.
# positive strand in the positive direction is SuccessablesInr++.bas, looking for 3'-C/T-C/T-A/G-A/C/G/T-A/T-C/T-C/T-5', 75, 5'-CTGGACC-3' at 40, 5'-CCGGTCC-3' at 215, 5'-TTACACT-3' at 230, 5'-CCGGACC-3' at 286, 5'-CCGTTCC-3' at 503, 5'-TCGGTCC-3' at 515, 5'-CCGCTCT-3' at 557, 5'-CCGTTCC-3' at 587, 5'-CCGCTCT-3' at 641, 5'-CCGTTCC-3' at 671, 5'-CCGGACT-3' at 725, 5'-CCGTTCC-3' at 823, 5'-TCGGTCT-3' at 835, 5'-TTGGACC-3' at 847, 5'-CCGTTCC-3' at 923, 5'-TCGGTCT-3' at 935, 5'-TTGGACC-3' at 947, 5'-CCGTTCC-3' at 1007, 5'-TCGCTCT-3' at 1061, 5'-CCGGTCC-3' at 1175, 5'-CCGCTCT-3' at 1229, 5'-CCGTTCC-3' at 1259, 5'-CCGTTCC-3' at 1327, 5'-CCGCTCT-3' at 1381, 5'-CCGTTCC-3' at 1427, 5'-CCGCTCT-3' at 1481, 5'-TCGTTCC-3' at 1511, 5'-CCGCTCT-3' at 1565, 5'-CCGCACT-3' at 1720, 5'-CCACACC-3' at 1805, 5'-CCGCTCT-3' at 1921, 5'-CCGTTCT-3' at 1948, 5'-CCACACC-3' at 1971, 5'-TCAATTT-3' at 2136, 5'-TTGTACT-3' at 2141, 5'-CTACTTT-3' at 2146, 5'-CCGTTCT-3' at 2190, 5'-CCAGTCT-3' at 2222, 5'-TTGGTCT-3' at 2228, 5'-CCGCACT-3' at 2555, 5'-CCGGTCC-3' at 2574, 5'-TCAGTCT-3' at 2609, 5'-TCAGTTC-3' at 2615, 5'-TCAGTCC-3' at 2620, 5'-CTATATT-3' at 2662, 5'-TCAATCC-3' at 2668, 5'-TCGTTTT-3' at 2707, 5'-TCGATTC-3' at 2789, 5'-TTGCTCC-3' at 2806, 5'-CTAAACT-3' at 2871, 5'-CTGGTCC-3' at 2876, 5'-CCAGACT-3' at 2943, 5'-CCGGACC-3' at 2988, 5'-CCAGACC-3' at 3021, 5'-TTATACC-3' at 3162, 5'-CTGGTTT-3' at 3175, 5'-TCGGTCT-3' at 3221, 5'-CTACTCC-3' at 3478, 5'-CCGATCC-3' at 3484, 5'-TCGATCC-3' at 3522, 5'-CTGGTCT-3' at 3548, 5'-TCACACT-3' at 3594, 5'-CCACTCC-3' at 3647, 5'-CCGGACC-3' at 3679, 5'-CCGGACC-3' at 3758, 5'-CTGGACC-3' at 3787, 5'-TCACTCC-3' at 3878, 5'-TCAGACT-3' at 3924, 5'-TCACACC-3' at 3966, 5'-CCACACT-3' at 3971, 5'-TTACTCC-3' at 4096, 5'-CTACTCC-3' at 4102, 5'-CTAAATC-3' at 4136, 5'-CCACTCC-3' at 4401, 5'-CCAGACC-3' at 4416.
# Positive strand, positive direction: 75, CCAGACC at 4416, CCACTCC at 4401, CTAAATC at 4136, CTACTCC at 4102, TTACTCC at 4096, CCACACT at 3971, TCACACC at 3966, TCAGACT at 3924, TCACTCC at 3878, CTGGACC at 3787, CCGGACC at 3758, CCGGACC at 3679, CCACTCC at 3647, TCACACT at 3594, CTGGTCT at 3548, TCGATCC at 3522, CCGATCC at 3484, CTACTCC at 3478, TCGGTCT at 3221, CTGGTTT at 3175, TTATACC at 3162, CCAGACC at 3021, CCGGACC at 2988, CCAGACT at 2943, CTGGTCC at 2876, CTAAACT at 2871, TTGCTCC at 2806, TCGATTC at 2789, TCGTTTT at 2707, TCAATCC at 2668, CTATATT at 2662, TCAGTCC at 2620, TCAGTTC at 2615, TCAGTCT at 2609, CCGGTCC at 2574, CCGCACT at 2555, TTGGTCT at 2228, CCAGTCT at 2222, CCGTTCT at 2190, CTACTTT at 2146, TTGTACT at 2141, TCAATTT at 2136, CCACACC at 1971, CCGTTCT at 1948, CCGCTCT at 1921, CCACACC at 1805, CCGCACT at 1720, CCGCTCT at 1565, TCGTTCC at 1511, CCGCTCT at 1481, CCGTTCC at 1427, CCGCTCT at 1381, CCGTTCC at 1327, CCGTTCC at 1259, CCGCTCT at 1229, CCGGTCC at 1175, TCGCTCT at 1061, CCGTTCC at 1007, TTGGACC at 947, TCGGTCT at 935, CCGTTCC at 923, TTGGACC at 847, TCGGTCT at 835, CCGTTCC at 823, CCGGACT at 725, CCGTTCC at 671, CCGCTCT at 641, CCGTTCC at 587, CCGCTCT at 557, TCGGTCC at 515, CCGTTCC at 503, CCGGACC at 286, TTACACT at 230, CCGGTCC at 215, CTGGACC at 40.
# complement, negative strand, negative direction is SuccessablesInrc--.bas, looking for 3'-A/G-A/G-C/T-A/C/G/T-A/T-A/G-A/G-5', 40, 3'-GACTTAA-5', 20, 3'-AACCTGG-5', 32, 3'-GACGTAA-5', 152, 3'-AACTTGG-5', 846, 3'-AGTGTGG-5', 882, 3'-AACTTGG-5', 1012, 3'-AGTGAGG-5', 1058, 3'-AGTGTGG-5', 1128, 3'-AACTTGG-5', 1303, 3'-AACGTGG-5', 1339, 3'-AACGTGA-5', 1347, 3'-GGTCAGA-5', 1354, 3'-GGTAAAG-5', 1380, 3'-AGCGAGA-5', 1450, 3'-GATATAG-5', 1528, 3'-AATAAAA-5', 1727, 3'-GACGTGA-5', 2000, 3'-GATGAGG-5', 2352, 3'-AACTTGG-5', 2382, 3'-AGTGTGG-5', 2418, 3'-GACGTGA-5', 2426, 3'-AACTTAG-5', 2708, 3'-AACTTGG-5', 2717, 3'-GACGTGG-5', 2761, 3'-AACTTGG-5', 3245, 3'-AACGTGA-5', 3289, 3'-GGTCTAG-5', 3488, 3'-GACGAGG-5', 3582, 3'-GGTAAAG-5', 3688, 3'-GACCTGA-5', 3747, 3'-GACTTGG-5', 3784, 3'-GGTATGG-5', 3858, 3'-AGTGTGG-5', 3967, 3'-GGCCTGA-5', 4327, 3'-GACGTGA-5', 4340, 3'-GGTCAAG-5', 4417, 3'-GGTGAGG-5', 4425, 3'-GGTGAAA-5', 4461, 3'-AGTGTAA-5', 4533, 3'-AATTAAG-5', 4542,
# complement, negative strand, positive direction is SuccessablesInrc-+.bas, looking for 3'-A/G-A/G-C/T-A/C/G/T-A/T-A/G-A/G-5', 75, 5'-GACCTGG-3' at 40, 5'-GGCCAGG-3' at 215, 5'-AATGTGA-3' at 230, 5'-GGCCTGG-3' at 286, 5'-GGCAAGG-3' at 503, 5'-AGCCAGG-3' at 515, 5'-GGCGAGA-3' at 557, 5'-GGCAAGG-3' at 587, 5'-GGCGAGA-3' at 641, 5'-GGCAAGG-3' at 671, 5'-GGCCTGA-3' at 725, 5'-GGCAAGG-3' at 823, 5'-AGCCAGA-3' at 835, 5'-AACCTGG-3' at 847, 5'-GGCAAGG-3' at 923, 5'-AGCCAGA-3' at 935, 5'-AACCTGG-3' at 947, 5'-GGCAAGG-3' at 1007, 5'-AGCGAGA-3' at 1061, 5'-GGCCAGG-3' at 1175, 5'-GGCGAGA-3' at 1229, 5'-GGCAAGG-3' at 1259, 5'-GGCAAGG-3' at 1327, 5'-GGCGAGA-3' at 1381, 5'-GGCAAGG-3' at 1427, 5'-GGCGAGA-3' at 1481, 5'-AGCAAGG-3' at 1511, 5'-GGCGAGA-3' at 1565, 5'-GGCGTGA-3' at 1720, 5'-GGTGTGG-3' at 1805, 5'-GGCGAGA-3' at 1921, 5'-GGCAAGA-3' at 1948, 5'-GGTGTGG-3' at 1971, 5'-AGTTAAA-3' at 2136, 5'-AACATGA-3' at 2141, 5'-GATGAAA-3' at 2146, 5'-GGCAAGA-3' at 2190, 5'-GGTCAGA-3' at 2222, 5'-AACCAGA-3' at 2228, 5'-GGCGTGA-3' at 2555, 5'-GGCCAGG-3' at 2574, 5'-AGTCAGA-3' at 2609, 5'-AGTCAAG-3' at 2615, 5'-AGTCAGG-3' at 2620, 5'-GATATAA-3' at 2662, 5'-AGTTAGG-3' at 2668, 5'-AGCAAAA-3' at 2707, 5'-AGCTAAG-3' at 2789, 5'-AACGAGG-3' at 2806, 5'-GATTTGA-3' at 2871, 5'-GACCAGG-3' at 2876, 5'-GGTCTGA-3' at 2943, 5'-GGCCTGG-3' at 2988, 5'-GGTCTGG-3' at 3021, 5'-AATATGG-3' at 3162, 5'-GACCAAA-3' at 3175, 5'-AGCCAGA-3' at 3221, 5'-GATGAGG-3' at 3478, 5'-GGCTAGG-3' at 3484, 5'-AGCTAGG-3' at 3522, 5'-GACCAGA-3' at 3548, 5'-AGTGTGA-3' at 3594, 5'-GGTGAGG-3' at 3647, 5'-GGCCTGG-3' at 3679, 5'-GGCCTGG-3' at 3758, 5'-GACCTGG-3' at 3787, 5'-AGTGAGG-3' at 3878, 5'-AGTCTGA-3' at 3924, 5'-AGTGTGG-3' at 3966, 5'-GGTGTGA-3' at 3971, 5'-AATGAGG-3' at 4096, 5'-GATGAGG-3' at 4102, 5'-GATTTAG-3' at 4136, 5'-GGTGAGG-3' at 4401, 5'-GGTCTGG-3' at 4416.
# complement, positive strand, negative direction is SuccessablesInrc+-.bas, looking for 3'-A/G-A/G-C/T-A/C/G/T-A/T-A/G-A/G-5', 121, 3'-AACAAGG-5', 71, 3'-GATATGG-5', 77, 3'-GGCAAAG-5', 93, 3'-GGCATGA-5', 124, 3'-GGTATAA-5', 181, 3'-GATGTAA-5', 247, 3'-AACCAGG-5', 262, 3'-AATATGA-5', 274, 3'-AGTGAGA-5', 301, 3'-GACGAAA-5', 312, 3'-GGCCAAG-5', 419, 3'-GGTCAGG-5', 441, 3'-AGCCTGG-5', 459, 3'-AACATAG-5', 468, 3'-AGTGAAA-5', 473, 3'-AGCCTGG-5', 508, 3'-GGCCAAG-5', 556, 3'-GGTCAGG-5', 578, 3'-AATATGG-5', 605, 3'-GGCCAGG-5', 648, 3'-GGCCAAG-5', 692, 3'-GGTCAGG-5', 714, 3'-AGCCTGA-5', 732, 3'-AGCGTGG-5', 741, 3'-GATGTGG-5', 787, 3'-AGCCAAG-5', 874, 3'-AGCCTGG-5', 899, 3'-AGCGAGA-5', 913, 3'-AGCCAGG-5', 948, 3'-GGCATGG-5', 953, 3'-AATCAGG-5', 984, 3'-AACCTGG-5', 1015, 3'-AGTGAGA-5', 1079, 3'-AGCCTGG-5', 1198, 3'-AACATGG-5', 1207, 3'-GGTGAAA-5', 1212, 3'-GGCGTGG-5', 1244, 3'-AACCTAG-5', 1306, 3'-AGTCTGG-5', 1356, 3'-AATAAGA-5', 1365, 3'-AGCAAAA-5', 1371, 3'-AACAAAA-5', 1394, 3'-GGTGTGA-5', 1479, 3'-AACGAAG-5', 1555, 3'-GGCAAAA-5', 1561, 3'-AATGAAA-5', 1582, 3'-AACCTAA-5', 1591, 3'-AATTAAA-5', 1697, 3'-AATATGG-5', 1742, 3'-GGCGTGG-5', 1897, 3'-GGCATGA-5', 1953, 3'-AACCTGG-5', 1959, 3'-AGCCTGG-5', 2009, 3'-AGCAAGA-5', 2023, 3'-AATGTGG-5', 2065, 3'-GGCCAGG-5', 2077, 3'-AGTGTAA-5', 2087, 3'-AGTTTGA-5', 2141, 3'-AACATGG-5', 2152, 3'-GGCGAAA-5', 2157, 3'-GGTCAGG-5', 2250, 3'-AGTTTGA-5', 2257, 3'-AGCCTGG-5', 2268, 3'-AGCATGG-5', 2277, 3'-GGTGAAA-5', 2282, 3'-AACCTGG-5', 2385, 3'-AGCCTGG-5', 2435, 3'-AGTGAGA-5', 2449, 3'-AGCAAAA-5', 2476, 3'-AACAAAA-5', 2490, 3'-AGTAAGA-5', 2503, 3'-GGCCAGG-5', 2519, 3'-GGTCAGG-5', 2587, 3'-AGTGTGG-5', 2605, 3'-AACATGG-5', 2614, 3'-GGTGAAA-5', 2619, 3'-AGTGTGG-5', 2658, 3'-AACCTGG-5', 2720, 3'-AGCCTGG-5', 2770, 3'-AGCATGA-5', 2784, 3'-AACTAAG-5', 2914, 3'-GGCTAAA-5', 3009, 3'-AACTAAG-5', 3031, 3'-GGCGTGG-5', 3047, 3'-AGCCTGG-5', 3128, 3'-AACAAGG-5', 3141, 3'-GGTGAAA-5', 3146, 3'-AACATAA-5', 3169, 3'-GGTGTGG-5', 3186, 3'-AGCCAAG-5', 3273, 
3'-AGCCTGG-5', 3298, 3'-AACAAGA-5', 3307, 3'-AGCAAAA-5', 3313, 3'-AACAAGA-5', 3340, 3'-AGCAAGA-5', 3374, 3'-GGCTTGA-5', 3401, 3'-GGCATAG-5', 3446, 3'-AACTAGA-5', 3463, 3'-AACCAGA-5', 3486, 3'-GACAAGA-5', 3759, 3'-GATGTGG-5', 3810, 3'-GACCAGG-5', 3871, 3'-AGTAAGA-5', 3893, 3'-GATGAAA-5', 3922, 3'-GGCCAGG-5', 3951, 3'-AGCCTGG-5', 4037, 3'-AACATAG-5', 4046, 3'-AGTGAGA-5', 4051, 3'-AATGTGA-5', 4092, 3'-GGCCAGG-5', 4102, 3'-GGCATGG-5', 4107, 3'-GGCCAGG-5', 4170, 3'-AGCTTGG-5', 4188, 3'-AGTGAGA-5', 4202, 3'-AGCCAGA-5', 4233, 3'-GACGTGG-5', 4238, 3'-AGCCTGG-5', 4300, 3'-GGTCAAA-5', 4309, 3'-AGCCTGG-5', 4349, 3'-AGTGTGA-5', 4361, 3'-AATGAGG-5', 4557,
# complement, positive strand, positive direction is SuccessablesInrc++.bas, looking for 3'-A/G-A/G-C/T-A/C/G/T-A/T-A/G-A/G-5', 45, 5'-AACATAA-3' at 115, 5'-GACAAAA-3' at 147, 5'-GGTGTGA-3' at 345, 5'-GGCCTGA-3' at 746, 5'-GACGTGA-3' at 1372, 5'-GACGTGA-3' at 1472, 5'-GGTCTGA-3' at 1744, 5'-GGTGAAG-3' at 1914, 5'-GATAAAG-3' at 1978, 5'-GGTCAGG-3' at 2026, 5'-AGCGAAG-3' at 2095, 5'-AGTATAA-3' at 2178, 5'-GACGTAA-3' at 2206, 5'-GGTCTAG-3' at 2230, 5'-AGTTAGA-3' at 2235, 5'-GACAAAG-3' at 2263, 5'-AGTGAGA-3' at 2306, 5'-GATGTGG-3' at 2430, 5'-GATTAAA-3' at 2440, 5'-GGCGTGG-3' at 2566, 5'-AATATGG-3' at 2590, 5'-GGTGTGG-3' at 2602, 5'-GGTGTGA-3' at 2636, 5'-AGTCTAA-3' at 2868, 5'-GACGAGG-3' at 2978, 5'-GGTCAGG-3' at 2998, 5'-GGTCAGG-3' at 3084, 5'-GACCAGA-3' at 3245, 5'-AGCGAGA-3' at 3276, 5'-GACCAGA-3' at 3299, 5'-GACGAGG-3' at 3309, 5'-GACGTGG-3' at 3322, 5'-GGCGTAG-3' at 3328, 5'-AACGTGA-3' at 3343, 5'-GACAAGG-3' at 3352, 5'-AACGTAG-3' at 3402, 5'-AGTGTGA-3' at 3507, 5'-GGTCTGG-3' at 3550, 5'-GACAAGG-3' at 3625, 5'-AGTGTGG-3' at 3824, 5'-AGTAAAA-3' at 4120, 5'-AGTGAGA-3' at 4128, 5'-AACTAAA-3' at 4134, 5'-AATCAAA-3' at 4139, 5'-GACGTGG-3' at 4343.
# inverse complement, negative strand, negative direction is SuccessablesInrci--.bas, looking for 3'-A/G-A/G-A/T-A/C/G/T-C/T-A/G-A/G-5', 32, 3'-GATACAA-5', 213, 3'-GGACCGA-5', 598, 3'-AGTGCGG-5', 664, 3'-GGACTGG-5', 734, 3'-AGTGTGG-5', 882, 3'-GAAGTGA-5', 1056, 3'-AGTGTGG-5', 1128, 3'-GGACCGG-5', 1200, 3'-AGAGCGA-5', 1448, 3'-GGTCCGA-5', 1462, 3'-GATATAG-5', 1528, 3'-AGAACGG-5', 1608, 3'-AAAATAG-5', 1730, 3'-AGTGCAG-5', 1773, 3'-GGACCGA-5', 1843, 3'-AGTGCGG-5', 1992, 3'-AGTGCGG-5', 2208, 3'-AGTGTGG-5', 2418, 3'-AGTACGG-5', 2535, 3'-AGTACGG-5', 2753, 3'-AAAGTAG-5', 2887, 3'-GATTCGA-5', 3033, 3'-GGACCGG-5', 3130, 3'-AGTGCGG-5', 3281, 3'-AGTCCGA-5', 3398, 3'-GGTCTAG-5', 3488, 3'-GGTATGG-5', 3858, 3'-GGTCCGG-5', 3873, 3'-AGTGTGG-5', 3967, 3'-AGTACGG-5', 4118, 3'-GGTCCGA-5', 4255, 3'-AGTGTAA-5', 4533,
# inverse complement, negative strand, negative direction: 32, AGTGTAA at 4533, GGTCCGA at 4255, AGTACGG at 4118, AGTGTGG at 3967, GGTCCGG at 3873, GGTATGG at 3858, GGTCTAG at 3488, AGTCCGA at 3398, AGTGCGG at 3281, GGACCGG at 3130, GATTCGA at 3033, AAAGTAG at 2887, AGTACGG at 2753, AGTACGG at 2535, AGTGTGG at 2418, AGTGCGG at 2208, AGTGCGG at 1992, GGACCGA at 1843, AGTGCAG at 1773, AAAATAG at 1730, AGAACGG at 1608, GATATAG at 1528, GGTCCGA at 1462, AGAGCGA at 1448, GGACCGG at 1200, AGTGTGG at 1128, GAAGTGA at 1056, AGTGTGG at 882, GGACTGG at 734, AGTGCGG at 664, GGACCGA at 598, GATACAA at 213.
# inverse complement, negative strand, positive direction is SuccessablesInrci-+.bas, looking for 3'-A/G-A/G-A/T-A/C/G/T-C/T-A/G-A/G-5', 61, 5'-AGAGTGG-3' at 53, 5'-AATGTGA-3' at 230, 5'-GGAGCGA-3' at 429, 5'-AGACCGG-3' at 442, 5'-GGTGCGG-3' at 489, 5'-AGTGCGG-3' at 498, 5'-AGTGCGG-3' at 582, 5'-AGTGCGG-3' at 666, 5'-GGTGCAG-3' at 784, 5'-AGTGCGG-3' at 1086, 5'-AGTGCGG-3' at 1170, 5'-AGTGCGG-3' at 1254, 5'-AATGCGG-3' at 1322, 5'-AATGCGG-3' at 1422, 5'-AGTGCGG-3' at 1590, 5'-GAAGCGG-3' at 1636, 5'-GGTGCGG-3' at 1764, 5'-AGTGCAG-3' at 1787, 5'-GGTGTGG-3' at 1805, 5'-GAACTGG-3' at 1953, 5'-GGTGTGG-3' at 1971, 5'-AAAGCAG-3' at 2007, 5'-AGTGCAG-3' at 2064, 5'-GAACCAG-3' at 2227, 5'-AGATCAA-3' at 2232, 5'-AGTGCAG-3' at 2327, 5'-GGTGCAA-3' at 2335, 5'-GAAATAG-3' at 2626, 5'-GATATAA-3' at 2662, 5'-GGACTGA-3' at 2674, 5'-AGAGCAA-3' at 2705, 5'-AAAGTGG-3' at 2711, 5'-GGTGCAA-3' at 2801, 5'-AGAATGA-3' at 2841, 5'-GATTTGA-3' at 2871, 5'-GGTCTGA-3' at 2943, 5'-GGTCTGG-3' at 3021, 5'-AATATGG-3' at 3162, 5'-GAAATGG-3' at 3168, 5'-GGACCAA-3' at 3174, 5'-GGAATGA-3' at 3441, 5'-GATGCAG-3' at 3460, 5'-AGTGCAG-3' at 3465, 5'-GGACCAG-3' at 3547, 5'-GGAATGA-3' at 3567, 5'-AGTGTGA-3' at 3594, 5'-GAAGCGG-3' at 3670, 5'-AATCCGA-3' at 3799, 5'-AGAATGA-3' at 3835, 5'-GAACCAG-3' at 3840, 5'-AGAGTGA-3' at 3876, 5'-AGTCTGA-3' at 3924, 5'-AGTGTGG-3' at 3966, 5'-GGTGTGA-3' at 3971, 5'-AGAGTGG-3' at 4040, 5'-AGAACAG-3' at 4069, 5'-GAAATGA-3' at 4094, 5'-GATTTAG-3' at 4136, 5'-GGAGTGA-3' at 4350, 5'-GGTCTGG-3' at 4416, 5'-GGAACAA-3' at 4445.
# inverse complement, negative strand, positive direction: 61, GGAACAG at 4445, GGTCTGG at 4416, GGAGTGA at 4350, GATTTAG at 4136, GAAATGA at 4094, AGAACAG at 4069, AGAGTGG at 4040, GGTGTGA at 3971, AGTGTGG at 3966, AGTCTGA at 3924, AGAGTGA at 3876, GAACCAG at 3840, AGAATGA at 3835, AATCCGA at 3799, GAAGCGG at 3670, AGTGTGA at 3594, GGAATGA at 3567, GGACCAG at 3547, AGTGCAG at 3465, GATGCAG at 3460, GGAATGA at 3441, GGACCAA at 3174, GAAATGG at 3168, AATATGG at 3162, GGTCTGG at 3021, GGTCTGA at 2943, GATTTGA at 2871, AGAATGA at 2841, GGTGCAA at 2801, AAAGTGG at 2711, AGAGCAA at 2705, GGACTGA at 2674, GATATAA at 2662, GAAATAG at 2626, GGTGCAA at 2335, AGTGCAG at 2327, AGATCAA at 2232, GAACCAG at 2227, AGTGCAG at 2064, AAAGCAG at 2007, GGTGTGG at 1971, GAACTGG at 1953, GGTGTGG at 1805, AGTGCAG at 1787, GGTGCGG at 1764, GAAGCGG at 1636, AGTGCGG at 1590, AATGCGG at 1422, AATGCGG at 1322, AGTGCGG at 1254, AGTGCGG at 1170, AGTGCGG at 1086, GGTGCAG at 784, AGTGCGG at 666, AGTGCGG at 582, AGTGCGG at 498, GGTGCGG at 489, AGACCGG at 442, GGAGCGA at 429, AATGTGA at 230, AGAGTGG at 53.
# inverse complement, positive strand, negative direction is SuccessablesInrci+-.bas, looking for 3'-A/G-A/G-A/T-A/C/G/T-C/T-A/G-A/G-5', 100, 3'-AGACTGA-5', 17, 3'-GGACCAG-5', 34, 3'-AAAACAA-5', 69, 3'-GATATGG-5', 77, 3'-AAACTGA-5', 130, 3'-AAAACAG-5', 167, 3'-GGTATAA-5', 181, 3'-GAAACAA-5', 229, 3'-GATGTAA-5', 247, 3'-AGTTCAA-5', 255, 3'-AAACCAG-5', 261, 3'-AATATGA-5', 274, 3'-AGAACAG-5', 288, 3'-AAACTGA-5', 307, 3'-GGTGCGG-5', 380, 3'-AGTGCGA-5', 448, 3'-AATACGA-5', 492, 3'-AAATTAG-5', 499, 3'-AGATTGA-5', 585, 3'-AATATGG-5', 605, 3'-AATACAA-5', 635, 3'-AAATTGG-5', 643, 3'-AGTTCGA-5', 721, 3'-AGACCAG-5', 727, 3'-AATACAA-5', 769, 3'-AAATTAG-5', 777, 3'-GATGTGG-5', 787, 3'-AGAGCGA-5', 911, 3'-GATCCAG-5', 975, 3'-AGATTGG-5', 1045, 3'-AGAGTGA-5', 1077, 3'-AAATTAG-5', 1234, 3'-AGTCTGG-5', 1356, 3'-AGAGCAA-5', 1369, 3'-AAAACAA-5', 1388, 3'-AGTGCAG-5', 1471, 3'-GGTGTGA-5', 1479, 3'-AGTGCAA-5', 1536, 3'-AGAACGA-5', 1553, 3'-AATACAG-5', 1566, 3'-GAAACAA-5', 1585, 3'-GAAATGA-5', 1663, 3'-AAAGCGG-5', 1680, 3'-GAATTAA-5', 1696, 3'-AATATGG-5', 1742, 3'-AATACAA-5', 1878, 3'-AAATTAG-5', 1887, 3'-AGACTGA-5', 1935, 3'-AGAATGG-5', 1948, 3'-AGAGCAA-5', 2021, 3'-AATGTGG-5', 2065, 3'-GGTGCAG-5', 2082, 3'-AGTGTAA-5', 2087, 3'-AGTTTGA-5', 2141, 3'-AGACCAA-5', 2147, 3'-GATACAA-5', 2180, 3'-AAAATGA-5', 2187, 3'-GGTGCGG-5', 2197, 3'-AGTTTGA-5', 2257, 3'-AGACCAG-5', 2263, 3'-AATACAA-5', 2305, 3'-AAACTAG-5', 2313, 3'-AGAGTGA-5', 2447, 3'-GATTCGG-5', 2454, 3'-AAAGCAA-5', 2474, 3'-AAAGCAA-5', 2480, 3'-AAAACAA-5', 2509, 3'-AGACCAG-5', 2600, 3'-AGTGTGG-5', 2605, 3'-AAATCAG-5', 2649, 3'-AGTGTGG-5', 2658, 3'-AAAACAA-5', 2842, 3'-AGAATGG-5', 3004, 3'-AAAATAA-5', 3013, 3'-AAACTAA-5', 3030, 3'-AGACCAG-5', 3123, 3'-AAATTAG-5', 3176, 3'-GGTGTGG-5', 3186, 3'-AGAGCAA-5', 3311, 3'-AAAACAA-5', 3330, 3'-AAATTGA-5', 3358, 3'-GAAGTGA-5', 3410, 3'-GAACTAG-5', 3462, 3'-AAACCAG-5', 3485, 3'-AATCCAG-5', 3681, 3'-GGAACAG-5', 3725, 3'-GGACTGG-5', 3749, 3'-AATGCAG-5', 3772, 3'-GATGTGG-5', 3810, 3'-GGACCAG-5', 3870, 
3'-GGAGTAA-5', 3891, 3'-AGTTCAA-5', 4026, 3'-AGACCAG-5', 4032, 3'-AAAATAA-5', 4071, 3'-AATGTGA-5', 4092, 3'-AGTTCAA-5', 4177, 3'-AAAATAA-5', 4221, 3'-AGTGTGA-5', 4361, 3'-AGTCCAA-5', 4502, 3'-GGAATGA-5', 4555,
# inverse complement, positive strand, negative direction: 100, GGAATGA at 4555, AGTCCAA at 4502, AGTGTGA at 4361, AAAATAA at 4221, AGTTCAA at 4177, AATGTGA at 4092, AAAATAA at 4071, AGACCAG at 4032, AGTTCAA at 4026, GGAGTAA at 3891, GGACCAG at 3870, GATGTGG at 3810, AATGCAG at 3772, GGACTGG at 3749, GGAACAG at 3725, AATCCAG at 3681, AAACCAG at 3485, GAACTAG at 3462, GAAGTGA at 3410, AAATTGA at 3358, AAAACAA at 3330, AGAGCAA at 3311, GGTGTGG at 3186, AAATTAG at 3176, AGACCAG at 3123, AAACTAA at 3030, AAAATAA at 3013, AGAATGG at 3004, AAAACAA at 2842, AGTGTGG at 2658, AAATCAG at 2649, AGTGTGG at 2605, AGACCAG at 2600, AAAACAA at 2509, AAAGCAA at 2480, AAAGCAA at 2474, GATTCGG at 2454, AGAGTGA at 2447, AAACTAG at 2313, AATACAA at 2305, AGACCAG at 2263, AGTTTGA at 2257, GGTGCGG at 2197, AAAATGA at 2187, GATACAA at 2180, AGACCAA at 2147, AGTTTGA at 2141, AGTGTAA at 2087, GGTGCAG at 2082, AATGTGG at 2065, AGAGCAA at 2021, AGAATGG at 1948, AGACTGA at 1935, AAATTAG at 1887, AATACAA at 1878, AATATGG at 1742, GAATTAA at 1696, AAAGCGG at 1680, GAAATGA at 1663, GAAACAA at 1585, AATACAG at 1566, AGAACGA at 1553, AGTGCAA at 1536, GGTGTGA at 1479, AGTGCAG at 1471, AAAACAA at 1388, AGAGCAA at 1369, AGTCTGG at 1356, AAATTAG at 1234, AGAGTGA at 1077, AGATTGG at 1045, GATCCAG at 975, AGAGCGA at 911, GATGTGG at 787, AAATTAG at 777, AATACAA at 769, AGACCAG at 727, AGTTCGA at 721, AAATTGG at 643, AATACAA at 635, AATATGG at 605, AGATTGA at 585, AAATTAG at 499, AATACGA at 492, AGTGCGA at 448, GGTGCGG at 380, AAACTGA at 307, AGAACAG at 288, AATATGA at 274, AAACCAG at 261, AGTTCAA at 255, GATGTAA at 247, GAAACAA at 229, GGTATAA at 181, AAAACAG at 167, AAACTGA at 130, GATATGG at 77, AAAACAA at 69, GGACCAG at 34, AGACTGA at 17.
# inverse complement, positive strand, positive direction is SuccessablesInrci++.bas, looking for 3'-A/G-A/G-A/T-A/C/G/T-C/T-A/G-A/G-5', 75, 5'-GGTCCGA-3' at 10, 5'-AGTCCGG-3' at 92, 5'-AATCCAG-3' at 152, 5'-GGTCCAG-3' at 217, 5'-GGTGTGA-3' at 345, 5'-GAAGCGG-3' at 459, 5'-AGAATGA-3' at 524, 5'-GAAGCGG-3' at 595, 5'-GATGCGA-3' at 652, 5'-GGTGCGA-3' at 777, 5'-GGACCGG-3' at 849, 5'-GGACCGG-3' at 949, 5'-GGTCCGA-3' at 1177, 5'-AAAGCAG-3' at 1183, 5'-GAAGCGG-3' at 1308, 5'-GAAGCGG-3' at 1408, 5'-AATTCGG-3' at 1541, 5'-GATGCGA-3' at 1576, 5'-GGACTGG-3' at 1662, 5'-GGTCTGA-3' at 1744, 5'-GGACCGA-3' at 1817, 5'-GGTCCGG-3' at 1857, 5'-AGAATGG-3' at 1888, 5'-GAAGTAG-3' at 2110, 5'-AGTATAA-3' at 2178, 5'-GGACTGG-3' at 2213, 5'-GGTCTAG-3' at 2230, 5'-AGAGTGG-3' at 2247, 5'-AAAGTGA-3' at 2304, 5'-GGTCCGA-3' at 2318, 5'-AATCCGA-3' at 2368, 5'-GATGTGG-3' at 2430, 5'-GGACCGA-3' at 2435, 5'-AGAGTGG-3' at 2470, 5'-GGTACAA-3' at 2475, 5'-GGACCGG-3' at 2571, 5'-AATATGG-3' at 2590, 5'-GGTGTGG-3' at 2602, 5'-AGTTCAG-3' at 2617, 5'-GGTGTGA-3' at 2636, 5'-AGTCTAA-3' at 2868, 5'-AAACTGG-3' at 2873, 5'-GGTCCGG-3' at 2878, 5'-AGACCGA-3' at 2885, 5'-GGAGTAA-3' at 2902, 5'-AGACTGA-3' at 2945, 5'-AGACCGG-3' at 2985, 5'-GGACCGG-3' at 2990, 5'-GGAACAG-3' at 3003, 5'-GGTCCAG-3' at 3018, 5'-AGACCAA-3' at 3023, 5'-AGTCCGG-3' at 3036, 5'-GGACCAA-3' at 3049, 5'-GAAGTAG-3' at 3250, 5'-AGTGCAG-3' at 3255, 5'-GGACCAG-3' at 3298, 5'-AGAGTGA-3' at 3317, 5'-GGTACAA-3' at 3337, 5'-GGAACGG-3' at 3375, 5'-AGTGTGA-3' at 3507, 5'-GATCCGA-3' at 3524, 5'-GGTCTGG-3' at 3550, 5'-AGAGTGG-3' at 3612, 5'-GGACCGG-3' at 3681, 5'-AGTGTGG-3' at 3824, 5'-GAACTGG-3' at 4018, 5'-AAAATAG-3' at 4123, 5'-GAACTAA-3' at 4133, 5'-AAATCAA-3' at 4138, 5'-GAAACGG-3' at 4210, 5'-GGACTGG-3' at 4216, 5'-GGAGTAA-3' at 4309, 5'-AGTACAG-3' at 4366, 5'-GGTACGA-3' at 4372, 5'-AGAACGA-3' at 4390.
# inverse complement, positive strand, positive direction: 75, AGAACGA at 4390, GGTACGA at 4372, AGTACAG at 4366, GGAGTAA at 4309, GGACTGG at 4216, GAAACGG at 4210, AAATCAA at 4138, GAACTAA at 4133, AAAATAG at 4123, GAACTGG at 4018, AGTGTGG at 3824, GGACCGG at 3681, AGAGTGG at 3612, GGTCTGG at 3550, GATCCGA at 3524, AGTGTGA at 3507, GGAACGG at 3375, GGTACAA at 3337, AGAGTGA at 3317, GGACCAG at 3298, AGTGCAG at 3255, GAAGTAG at 3250, GGACCAA at 3049, AGTCCGG at 3036, AGACCAA at 3023, GGTCCAG at 3018, GGAACAG at 3003, GGACCGG at 2990, AGACCGG at 2985, AGACTGA at 2945, GGAGTAA at 2902, AGACCGA at 2885, GGTCCGG at 2878, AAACTGG at 2873, AGTCTAA at 2868, GGTGTGA at 2636, AGTTCAG at 2617, GGTGTGG at 2602, AATATGG at 2590, GGACCGG at 2571, GGTACAA at 2475, AGAGTGG at 2470, GGACCGA at 2435, GATGTGG at 2430, AATCCGA at 2368, GGTCCGA at 2318, AAAGTGA at 2304, AGAGTGG at 2247, GGTCTAG at 2230, GGACTGG at 2213, AGTATAA at 2178, GAAGTAG at 2110, AGAATGG at 1888, GGTCCGG at 1857, GGACCGA at 1817, GGTCTGA at 1744, GGACTGG at 1662, GATGCGA at 1576, AATTCGG at 1541, GAAGCGG at 1408, GAAGCGG at 1308, AAAGCAG at 1183, GGTCCGA at 1177, GGACCGG at 949, GGACCGG at 849, GGTGCGA at 777, GATGCGA at 652, GAAGCGG at 595, AGAATGA at 524, GAAGCGG at 459, GGTGTGA at 345, GGTCCAG at 217, AATCCAG at 152, AGTCCGG at 92, GGTCCGA at 10.
# inverse, negative strand, negative direction, is SuccessablesInri--.bas, looking for 3'-C/T-C/T-A/T-A/C/G/T-A/G-C/T-C/T-5', 100, 3'-TCTGACT-5', 17, 3'-CCTGGTC-5', 34, 3'-TTTTGTT-5', 69, 3'-CTATACC-5', 77, 3'-TTTGACT-5', 130, 3'-TTTTGTC-5', 167, 3'-CCATATT-5', 181, 3'-CTTTGTT-5', 229, 3'-CTACATT-5', 247, 3'-TCAAGTT-5', 255, 3'-TTTGGTC-5', 261, 3'-TTATACT-5', 274, 3'-TCTTGTC-5', 288, 3'-TTTGACT-5', 307, 3'-CCACGCC-5', 380, 3'-TCACGCT-5', 448, 3'-TTATGCT-5', 492, 3'-TTTAATC-5', 499, 3'-TCTAACT-5', 585, 3'-TTATACC-5', 605, 3'-TTATGTT-5', 635, 3'-TTTAACC-5', 643, 3'-TCAAGCT-5', 721, 3'-TCTGGTC-5', 727, 3'-TTATGTT-5', 769, 3'-TTTAATC-5', 777, 3'-CTACACC-5', 787, 3'-TCTCGCT-5', 911, 3'-CTAGGTC-5', 975, 3'-TCTAACC-5', 1045, 3'-TCTCACT-5', 1077, 3'-TTTAATC-5', 1234, 3'-TCAGACC-5', 1356, 3'-TCTCGTT-5', 1369, 3'-TTTTGTT-5', 1388, 3'-TCACGTC-5', 1471, 3'-CCACACT-5', 1479, 3'-TCACGTT-5', 1536, 3'-TCTTGCT-5', 1553, 3'-TTATGTC-5', 1566, 3'-CTTTGTT-5', 1585, 3'-CTTTACT-5', 1663, 3'-TTTCGCC-5', 1680, 3'-CTTAATT-5', 1696, 3'-TTATACC-5', 1742, 3'-TTATGTT-5', 1878, 3'-TTTAATC-5', 1887, 3'-TCTGACT-5', 1935, 3'-TCTTACC-5', 1948, 3'-TCTCGTT-5', 2021, 3'-TTACACC-5', 2065, 3'-CCACGTC-5', 2082, 3'-TCACATT-5', 2087, 3'-TCAAACT-5', 2141, 3'-TCTGGTT-5', 2147, 3'-CTATGTT-5', 2180, 3'-TTTTACT-5', 2187, 3'-CCACGCC-5', 2197, 3'-TCAAACT-5', 2257, 3'-TCTGGTC-5', 2263, 3'-TTATGTT-5', 2305, 3'-TTTGATC-5', 2313, 3'-TCTCACT-5', 2447, 3'-CTAAGCC-5', 2454, 3'-TTTCGTT-5', 2474, 3'-TTTCGTT-5', 2480, 3'-TTTTGTT-5', 2509, 3'-TCTGGTC-5', 2600, 3'-TCACACC-5', 2605, 3'-TTTAGTC-5', 2649, 3'-TCACACC-5', 2658, 3'-TTTTGTT-5', 2842, 3'-TCTTACC-5', 3004, 3'-TTTTATT-5', 3013, 3'-TTTGATT-5', 3030, 3'-TCTGGTC-5', 3123, 3'-TTTAATC-5', 3176, 3'-CCACACC-5', 3186, 3'-TCTCGTT-5', 3311, 3'-TTTTGTT-5', 3330, 3'-TTTAACT-5', 3358, 3'-CTTCACT-5', 3410, 3'-CTTGATC-5', 3462, 3'-TTTGGTC-5', 3485, 3'-TTAGGTC-5', 3681, 3'-CCTTGTC-5', 3725, 3'-CCTGACC-5', 3749, 3'-TTACGTC-5', 3772, 3'-CTACACC-5', 3810, 3'-CCTGGTC-5', 3870, 
3'-CCTCATT-5', 3891, 3'-TCAAGTT-5', 4026, 3'-TCTGGTC-5', 4032, 3'-TTTTATT-5', 4071, 3'-TTACACT-5', 4092, 3'-TCAAGTT-5', 4177, 3'-TTTTATT-5', 4221, 3'-TCACACT-5', 4361, 3'-TCAGGTT-5', 4502, 3'-CCTTACT-5', 4555,
# inverse, negative strand, positive direction, is SuccessablesInri-+.bas, looking for 3'-C/T-C/T-A/T-A/C/G/T-A/G-C/T-C/T-5', 75, 5'-CCAGGCT-3' at 10, 5'-TCAGGCC-3' at 92, 5'-TTAGGTC-3' at 152, 5'-CCAGGTC-3' at 217, 5'-CCACACT-3' at 345, 5'-CTTCGCC-3' at 459, 5'-TCTTACT-3' at 524, 5'-CTTCGCC-3' at 595, 5'-CTACGCT-3' at 652, 5'-CCACGCT-3' at 777, 5'-CCTGGCC-3' at 849, 5'-CCTGGCC-3' at 949, 5'-CCAGGCT-3' at 1177, 5'-TTTCGTC-3' at 1183, 5'-CTTCGCC-3' at 1308, 5'-CTTCGCC-3' at 1408, 5'-TTAAGCC-3' at 1541, 5'-CTACGCT-3' at 1576, 5'-CCTGACC-3' at 1662, 5'-CCAGACT-3' at 1744, 5'-CCTGGCT-3' at 1817, 5'-CCAGGCC-3' at 1857, 5'-TCTTACC-3' at 1888, 5'-CTTCATC-3' at 2110, 5'-TCATATT-3' at 2178, 5'-CCTGACC-3' at 2213, 5'-CCAGATC-3' at 2230, 5'-TCTCACC-3' at 2247, 5'-TTTCACT-3' at 2304, 5'-CCAGGCT-3' at 2318, 5'-TTAGGCT-3' at 2368, 5'-CTACACC-3' at 2430, 5'-CCTGGCT-3' at 2435, 5'-TCTCACC-3' at 2470, 5'-CCATGTT-3' at 2475, 5'-CCTGGCC-3' at 2571, 5'-TTATACC-3' at 2590, 5'-CCACACC-3' at 2602, 5'-TCAAGTC-3' at 2617, 5'-CCACACT-3' at 2636, 5'-TCAGATT-3' at 2868, 5'-TTTGACC-3' at 2873, 5'-CCAGGCC-3' at 2878, 5'-TCTGGCT-3' at 2885, 5'-CCTCATT-3' at 2902, 5'-TCTGACT-3' at 2945, 5'-TCTGGCC-3' at 2985, 5'-CCTGGCC-3' at 2990, 5'-CCTTGTC-3' at 3003, 5'-CCAGGTC-3' at 3018, 5'-TCTGGTT-3' at 3023, 5'-TCAGGCC-3' at 3036, 5'-CCTGGTT-3' at 3049, 5'-CTTCATC-3' at 3250, 5'-TCACGTC-3' at 3255, 5'-CCTGGTC-3' at 3298, 5'-TCTCACT-3' at 3317, 5'-CCATGTT-3' at 3337, 5'-CCTTGCC-3' at 3375, 5'-TCACACT-3' at 3507, 5'-CTAGGCT-3' at 3524, 5'-CCAGACC-3' at 3550, 5'-TCTCACC-3' at 3612, 5'-CCTGGCC-3' at 3681, 5'-TCACACC-3' at 3824, 5'-CTTGACC-3' at 4018, 5'-TTTTATC-3' at 4123, 5'-CTTGATT-3' at 4133, 5'-TTTAGTT-3' at 4138, 5'-CTTTGCC-3' at 4210, 5'-CCTGACC-3' at 4216, 5'-CCTCATT-3' at 4309, 5'-TCATGTC-3' at 4366, 5'-CCATGCT-3' at 4372, 5'-TCTTGCT-3' at 4390.
# inverse, positive strand, positive direction, is SuccessablesInri++.bas, looking for 3'-C/T-C/T-A/T-A/C/G/T-A/G-C/T-C/T-5', 61, 5'-TCTCACC-3' at 53, 5'-TTACACT-3' at 230, 5'-CCTCGCT-3' at 429, 5'-TCTGGCC-3' at 442, 5'-CCACGCC-3' at 489, 5'-TCACGCC-3' at 498, 5'-TCACGCC-3' at 582, 5'-TCACGCC-3' at 666, 5'-CCACGTC-3' at 784, 5'-TCACGCC-3' at 1086, 5'-TCACGCC-3' at 1170, 5'-TCACGCC-3' at 1254, 5'-TTACGCC-3' at 1322, 5'-TTACGCC-3' at 1422, 5'-TCACGCC-3' at 1590, 5'-CTTCGCC-3' at 1636, 5'-CCACGCC-3' at 1764, 5'-TCACGTC-3' at 1787, 5'-CCACACC-3' at 1805, 5'-CTTGACC-3' at 1953, 5'-CCACACC-3' at 1971, 5'-TTTCGTC-3' at 2007, 5'-TCACGTC-3' at 2064, 5'-CTTGGTC-3' at 2227, 5'-TCTAGTT-3' at 2232, 5'-TCACGTC-3' at 2327, 5'-CCACGTT-3' at 2335, 5'-CTTTATC-3' at 2626, 5'-CTATATT-3' at 2662, 5'-CCTGACT-3' at 2674, 5'-TCTCGTT-3' at 2705, 5'-TTTCACC-3' at 2711, 5'-CCACGTT-3' at 2801, 5'-TCTTACT-3' at 2841, 5'-CTAAACT-3' at 2871, 5'-CCAGACT-3' at 2943, 5'-CCAGACC-3' at 3021, 5'-TTATACC-3' at 3162, 5'-CTTTACC-3' at 3168, 5'-CCTGGTT-3' at 3174, 5'-CCTTACT-3' at 3441, 5'-CTACGTC-3' at 3460, 5'-TCACGTC-3' at 3465, 5'-CCTGGTC-3' at 3547, 5'-CCTTACT-3' at 3567, 5'-TCACACT-3' at 3594, 5'-CTTCGCC-3' at 3670, 5'-TTAGGCT-3' at 3799, 5'-TCTTACT-3' at 3835, 5'-CTTGGTC-3' at 3840, 5'-TCTCACT-3' at 3876, 5'-TCAGACT-3' at 3924, 5'-TCACACC-3' at 3966, 5'-CCACACT-3' at 3971, 5'-TCTCACC-3' at 4040, 5'-TCTTGTC-3' at 4069, 5'-CTTTACT-3' at 4094, 5'-CTAAATC-3' at 4136, 5'-CCTCACT-3' at 4350, 5'-CCAGACC-3' at 4416, 5'-CCTTGTT-3' at 4445.
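
The searched patterns in the list above are the direct consensus YYRNWYY and its inverse, complement, and inverse complement. The following is a minimal Python sketch, not the Basic programs themselves (their source is not shown here), of how those variants can be derived from one degenerate pattern. The function names are hypothetical and chosen only for illustration.

```python
# Bases allowed by each IUPAC degenerate code used in this section.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "W": "AT",
    "B": "CGT", "V": "ACG", "N": "ACGT",
}

# Complement of each code: R (A/G) pairs with Y (C/T), B (C/G/T) with V (A/C/G).
COMPLEMENT = {
    "A": "T", "T": "A", "C": "G", "G": "C",
    "R": "Y", "Y": "R", "W": "W",
    "B": "V", "V": "B", "N": "N",
}


def complement(pattern):
    """Complement every position, keeping the written order."""
    return "".join(COMPLEMENT[code] for code in pattern)


def inverse(pattern):
    """Reverse the written order without complementing."""
    return pattern[::-1]


def inverse_complement(pattern):
    """Complement every position and reverse the written order."""
    return complement(pattern)[::-1]


def slash_notation(pattern):
    """Spell a degenerate pattern out as, e.g., C/T-C/T-A/G-..."""
    return "-".join("/".join(IUPAC[code]) for code in pattern)
```

Under these assumptions, `slash_notation(inverse_complement("YYRNWYY"))` gives `A/G-A/G-A/T-A/C/G/T-C/T-A/G-A/G`, the pattern listed for the inverse complement programs, and `slash_notation(inverse("YYRNWYY"))` gives `C/T-C/T-A/T-A/C/G/T-A/G-C/T-C/T`, the pattern listed for the inverse programs.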


===5'-BBCABW-3'===
===YYRNWYY UTRs===
{{main|UTR promoter gene transcriptions}}
 
===YYRNWYY core promoters===
{{main|Core promoter gene transcriptions}}
 
===YYRNWYY proximal promoters===
{{main|Proximal promoter gene transcriptions}}
 
===YYRNWYY distal promoters===
{{main|Distal promoter gene transcriptions}}
 
===BBCABW===


The Basic programs (starting with SuccessablesInr2.bas) compare nucleotide sequences with the sequences on either the template strand (-) or the coding strand (+) of the DNA, in the negative direction (-) or the positive direction (+). Each entry gives the program, the sequence it is looking for, the number of matches found, and each match with its position:
# inverse, positive strand, negative direction, is SuccessablesInr2i+-.bas, looking for 5'-A/T-C/G/T-A-C-C/G/T-C/G/T-3', 46, 3'-AGACTG-5', 16, 3'-ACACCT-5', 62, 3'-ACACGT-5', 342, 3'-ACACGT-5', 531, 3'-TCACGC-5', 663, 3'-ACACCC-5', 749, 3'-AGACTC-5', 916, 3'-ACACGC-5', 963, 3'-TGACTT-5', 1052, 3'-TCACTC-5', 1057, 3'-AGACTC-5', 1082, 3'-ACACCT-5', 1129, 3'-TCACCT-5', 1171, 3'-TTACTT-5', 1298, 3'-AGACTC-5', 1403, 3'-TCACTG-5', 1492, 3'-ACACTT-5', 1544, 3'-AGACTT-5', 1617, 3'-TCACGT-5', 1772, 3'-AGACTG-5', 1934, 3'-TCACGC-5', 1991, 3'-AGACTC-5', 2026, 3'-ATACTG-5', 2162, 3'-TGACCG-5', 2190, 3'-TCACGC-5', 2207, 3'-ACACTT-5', 2551, 3'-TCACTT-5', 2578, 3'-TGACTC-5', 2787, 3'-ATACCT-5', 2994, 3'-TCACCC-5', 3057, 3'-TCACTT-5', 3101, 3'-TCACTT-5', 3240, 3'-TCACGC-5', 3280, 3'-AGACTG-5', 3425, 3'-ATACTG-5', 3541, 3'-ATACGC-5', 3547, 3'-ATACCT-5', 3859, 3'-ACACCT-5', 3968, 3'-ACACTT-5', 3983, 3'-TCACTT-5', 4010, 3'-AGACTC-5', 4054, 3'-TCACTT-5', 4161, 3'-AGACCC-5', 4205, 3'-AGACGT-5', 4236, 3'-ACACTG-5', 4336, 3'-AGACCC-5', 4366.
# inverse, positive strand, positive direction, is SuccessablesInr2i++.bas, looking for 5'-A/T-C/G/T-A-C-C/G/T-C/G/T-3', 94, 3'-TCACCC-5', 54, 3'-AGACGT-5', 224, 3'-ACACTT-5', 231, 3'-TGACGG-5', 238, 3'-AGACTC-5', 256, 3'-AGACCT-5', 271, 3'-TGACCC-5', 348, 3'-TCACGC-5', 497, 3'-TCACGC-5', 581, 3'-TCACGC-5', 665, 3'-TGACGC-5', 749, 3'-ACACCG-5', 819, 3'-TGACGG-5', 901, 3'-ACACCG-5', 919, 3'-TGACGC-5', 1001, 3'-ACACCG-5', 1023, 3'-TCACGC-5', 1085, 3'-TCACGC-5', 1160, 3'-TCACGC-5', 1169, 3'-TCACGC-5', 1253, 3'-TGACTC-5', 1287, 3'-TTACGC-5', 1321, 3'-AGACCG-5', 1377, 3'-AGACGC-5', 1396, 3'-TTACGC-5', 1421, 3'-AGACCG-5', 1477, 3'-AGACGC-5', 1496, 3'-TGACGT-5', 1505, 3'-TCACGC-5', 1589, 3'-TCACGC-5', 1725, 3'-TCACGT-5', 1786, 3'-ACACCT-5', 1806, 3'-AGACCC-5', 1865, 3'-TGACCC-5', 1954, 3'-ACACCG-5', 1972, 3'-AGACCG-5', 1993, 3'-TCACGT-5', 2063, 3'-TCACCG-5', 2068, 3'-ATACCG-5', 2160, 3'-TGACGT-5', 2204, 3'-TCACGT-5', 2326, 3'-ACACGT-5', 2681, 3'-TCACCT-5', 2712, 3'-TGACGG-5', 2823, 3'-TTACTG-5', 2842, 3'-AGACGT-5', 2857, 3'-AGACCG-5', 2884, 3'-TTACCC-5', 2911, 3'-AGACTG-5', 2944, 3'-AGACTC-5', 2951, 3'-ACACGT-5', 2960, 3'-AGACCG-5', 2984, 3'-AGACTC-5', 3007, 3'-TCACGG-5', 3011, 3'-ATACTG-5', 3028, 3'-AGACGT-5', 3061, 3'-TTACGT-5', 3070, 3'-TGACCG-5', 3118, 3'-AGACTC-5', 3124, 3'-ATACCT-5', 3163, 3'-TTACCC-5', 3169, 3'-TCACGG-5', 3235, 3'-ATACTC-5', 3261, 3'-AGACGT-5', 3268, 3'-AGACGT-5', 3279, 3'-TGACGT-5', 3320, 3'-TGACCG-5', 3346, 3'-AGACGG-5', 3359, 3'-AGACCG-5', 3406, 3'-TTACGG-5', 3431, 3'-ACACCT-5', 3437, 3'-TTACTT-5', 3442, 3'-TTACTC-5', 3446, 3'-TCACCC-5', 3450, 3'-TCACGT-5', 3464, 3'-TTACTG-5', 3568, 3'-ACACTT-5', 3595, 3'-TCACTG-5', 3713, 3'-TGACTC-5', 3736, 3'-TTACTG-5', 3783, 3'-TTACTT-5', 3836, 3'-TCACTC-5', 3877, 3'-ACACTC-5', 3904, 3'-AGACTT-5', 3925, 3'-ACACGT-5', 3960, 3'-ACACTG-5', 3972, 3'-TCACCC-5', 4041, 3'-TGACTT-5', 4090, 3'-TTACTC-5', 4095, 3'-TCACGG-5', 4274, 3'-TGACGT-5', 4341, 3'-TCACTC-5', 4351, 3'-ACACCC-5', 4395, 3'-AGACCC-5', 4417.
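
The kind of search summarized in the list above can be sketched in Python as a scan for a degenerate pattern. This is an illustration under stated assumptions, not the Basic programs' own code: `find_motif` is a hypothetical name, positions are assumed to be 1-based and to refer to the first nucleotide of each match, and overlapping matches are reported.

```python
import re

# Regular-expression character class for each IUPAC code used in this section.
IUPAC_CLASS = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "[AG]", "Y": "[CT]", "W": "[AT]",
    "B": "[CGT]", "N": "[ACGT]",
}


def find_motif(sequence, pattern):
    """Return (position, matched text) for every hit of a degenerate pattern.

    Positions are 1-based and refer to the first nucleotide of the hit
    (an assumption; the listed results may use another convention).
    A lookahead is used so that overlapping hits are all reported.
    """
    body = "".join(IUPAC_CLASS[code] for code in pattern)
    regex = re.compile("(?=(" + body + "))")
    return [(m.start() + 1, m.group(1)) for m in regex.finditer(sequence.upper())]


# Made-up 14-nt test sequence, searched for BBCABW and its inverse WBACBB.
example = "GGTCACTTACACGT"
direct_hits = find_motif(example, "BBCABW")
inverse_hits = find_motif(example, "WBACBB")
```

For the made-up sequence, `direct_hits` holds the single match GTCACT at position 2 and `inverse_hits` holds TCACTT at position 3 and ACACGT at position 9.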
===BBCABW UTRs===
{{main|UTR promoter gene transcriptions}}
===BBCABW core promoters===
{{main|Core promoter gene transcriptions}}
===BBCABW proximal promoters===
{{main|Proximal promoter gene transcriptions}}
===BBCABW distal promoters===
{{main|Distal promoter gene transcriptions}}


===Inr-like, TCTs samplings===

Revision as of 06:06, 14 April 2021

Editor-In-Chief: Henry A. Hoff

In the biosynthesis of any human protein, the gene that contains the nucleotide sequence which is translated into that protein must be transcribed. For RNA polymerase II holoenzyme to transcribe the gene, the gene's promoter must be located. After the promoter is located, the transcription start site (TSS) is pinpointed by using nucleotide sequences that include the TSS. Within the promoter, most human genes lack a TATA box and have an initiator element (Inr) or downstream promoter element instead.

On the basis of descriptions available, various Inrs are located to test whether the known TSS is located.

Notations

Notation: let the symbol Inr denote an initiator element.

Notation: let the symbol +1 designate the nucleotide that is the transcription start site (TSS).

Genetics

Inr in humans was first explained and sequenced in 1989.[1]

The Inr element for core promoters was found to be more prevalent than the TATA box in eukaryotic promoter domains.[2] In a study of 1800+ distinct human promoter sequences it was found that 49% contain the Inr element while 21.8% contain the TATA box.[2]

Gene transcriptions

Two subunits, TAF1 and TAF2, of the TFIID recognize the Inr sequence and bring the complex together.[3]

The interaction between TFIID and the Inr is believed to be the most important for initiating transcription because the Inr sequence overlaps the start site.[4]

The Inr element is also believed to interact with the activator specificity protein 1 transcription factor (Sp1), which is then able to regulate the activation and initiation of transcription.[5]

Promoters with a functional Inr are more likely to lack a TATA box or to possess a degenerate TATA sequence because a gene with an active Inr is less dependent on a functional TATA box or additional promoters.[6] Although the Inr element varies between promoters, the sequence is highly conserved between humans and yeast.[6] An analysis of 7670 transcription start sites showed that roughly 40% had an exact match to the BBCA+1BW Inr sequence, while 16% contained only one mismatch.[7] TFIID and its subunits are very sensitive to the Inr sequence, and nucleotide changes have been shown to drastically change the binding affinity; the +1 and -3 positions have been identified as the most critical for transcription efficiency and Inr function.[6] A replacement of the adenosine (A) nucleotide at the +1 position with G or T changes transcription activity by 10%, and a replacement of thymine (T) at the +3 position changes transcription activity levels by 22%.[8]
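
The distinction between an exact match and a single mismatch to BBCA+1BW can be illustrated with a short Python sketch. It assumes a 6-nucleotide window whose fourth nucleotide is the +1 position, so the window covers -3, -2, -1, +1, +2 and +3 (there is no position 0). The function names are hypothetical, and the sketch is not the method used in the cited analysis.

```python
# Bases allowed by each code in the BBCABW consensus.
IUPAC = {"A": "A", "C": "C", "B": "CGT", "W": "AT"}

INR = "BBCABW"
# TSS-relative coordinate of each consensus position (no position 0).
POSITIONS = (-3, -2, -1, 1, 2, 3)


def inr_mismatches(window):
    """Return the TSS-relative positions where a 6-nt window departs from BBCABW."""
    if len(window) != len(INR):
        raise ValueError("window must be exactly 6 nucleotides long")
    return [
        pos
        for pos, code, base in zip(POSITIONS, INR, window.upper())
        if base not in IUPAC[code]
    ]


def classify(window):
    """Label a window as an exact match, a one-mismatch match, or neither."""
    count = len(inr_mismatches(window))
    if count == 0:
        return "exact"
    if count == 1:
        return "one mismatch"
    return "no match"
```

For example, `inr_mismatches("TCCATT")` returns an empty list, while `inr_mismatches("TCCGTT")` returns `[1]` because the A at +1 has been replaced by G.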

Theoretical initiator elements

Here's a theoretical definition:

Def. a series of nucleotides including a transcription start site on one DNA strand whose presence in a gene promoter eventually leads to a chain reaction or polymerization such as transcription is called an initiator element.

Consensus sequence for an Inr-like/TCT is 5'-TTCTCT-3'.[9]

RNA polymerase IIs

"RNA pol II itself recognizes features of the Inr which might assist the correct positioning of the polymerase on the promoter (Carcamo et al., 1991; Weis and Reinberg, 1997)."[10][11][12]

RNA polymerase II may form a stable complex on TATA-less promoters that contain Inr elements and possess a weak, intrinsic preference for Inr-like sequences.[11]

RNA polymerase II holoenzyme complexes

Gene ID: 672 is BRCA1 BRCA1, DNA repair associated. "This gene encodes a nuclear phosphoprotein that plays a role in maintaining genomic stability, and it also acts as a tumor suppressor. The encoded protein combines with other tumor suppressors, DNA damage sensors, and signal transducers to form a large multi-subunit protein complex known as the BRCA1-associated genome surveillance complex (BASC). This gene product associates with RNA polymerase II, and through the C-terminal domain, also interacts with histone deacetylase complexes. This protein thus plays a role in transcription, DNA repair of double-stranded breaks, and recombination. Mutations in this gene are responsible for approximately 40% of inherited breast cancers and more than 80% of inherited breast and ovarian cancers. Alternative splicing plays a role in modulating the subcellular localization and physiological function of this gene. Many alternatively spliced transcript variants, some of which are disease-associated mutations, have been described for this gene, but the full-length natures of only some of these variants has been described. A related pseudogene, which is also located on chromosome 17, has been identified."[13]

Gene ID: 1660 is DHX9 DExH-box helicase 9 (aka LKP; RHA; DDX9; NDH2; NDHII). "This gene encodes a member of the DEAH-containing family of RNA helicases. The encoded protein is an enzyme that catalyzes the ATP-dependent unwinding of double-stranded RNA and DNA-RNA complexes. This protein localizes to both the nucleus and the cytoplasm and functions as a transcriptional regulator. This protein may also be involved in the expression and nuclear export of retroviral RNAs. Alternate splicing results in multiple transcript variants. Pseudogenes of this gene are found on chromosomes 11 and 13."[14]

BRCA1 has been shown to interact with DHX9; i.e., overexpression of a protein fragment of RNA helicase A causes inhibition of endogenous BRCA1 function and defects in ploidy and cytokinesis in mammary epithelial cells[15] and the BRCA1 protein is linked to the RNA polymerase II holoenzyme complex via RNA helicase A.[16]

ATP-dependent RNA helicase A (RHA; also known as DHX9, LKP, and NDHI) is an enzyme that in humans is encoded by the DHX9 gene.[17][18][14]

RNA polymerase II subunit A C-terminal domain phosphatase is an enzyme that in humans is encoded by the CTDP1 gene.[19][20][21]

Gene ID: 9150 is CTDP1 CTD phosphatase subunit 1. "This gene encodes a protein which interacts with the carboxy-terminus of the RAP74 subunit of transcription initiation factor TFIIF, and functions as a phosphatase that processively dephosphorylates the C-terminus of POLR2A (a subunit of RNA polymerase II), making it available for initiation of gene expression. Mutations in this gene are associated with congenital cataracts, facial dysmorphism and neuropathy syndrome (CCFDN). Alternatively spliced transcript variants encoding different isoforms have been described for this gene."[22]

"This gene encodes a protein which interacts with the carboxy-terminus of transcription initiation factor TFIIF, a transcription factor which regulates elongation as well as initiation by RNA polymerase II. The protein may also represent a component of an RNA polymerase II holoenzyme complex. Alternative splicing of this gene results in two transcript variants encoding 2 different isoforms."[21]

CTDP1 has been shown to interact with WD repeat-containing protein 77,[23] GTF2F1[20] and POLR2A.[24]

Gene ID: 168400 is DDX53 DEAD-box helicase 53. "This intronless gene encodes a protein which contains several domains found in members of the DEAD-box helicase protein family. Other members of this protein family participate in ATP-dependent RNA unwinding."[25]

"DEAD/DEAH box helicases are proteins, and are putative RNA helicases. They are implicated in a number of cellular processes involving alteration of RNA secondary structure such as translation initiation, nuclear and mitochondrial splicing, and ribosome and spliceosome assembly. Based on their distribution patterns, some members of this family are believed to be involved in embryogenesis, spermatogenesis, and cellular growth and division. This gene encodes a DEAD box protein with RNA helicase activity. It may participate in melting of DNA:RNA hybrids, such as those that occur during transcription, and may play a role in X-linked gene expression. It contains 2 copies of a double-stranded RNA-binding domain, a DEXH core domain and an RGG box. The RNA-binding domains and RGG box influence and regulate RNA helicase activity."[25]

Consensus sequences

As in other metazoans, for genes lacking a TATA box, the Inr is functionally analogous, with a base pair (bp) consensus 5'-YYA+1NWYY-3', to direct transcription initiation.[26] Using the degenerate nucleotide code, the consensus sequence is 5'-C/T-C/T-A-A/C/G/T-A/T-C/T-C/T-3', or in the direction of transcription on the template strand: 3'-C/T-C/T-A-A/C/G/T-A/T-C/T-C/T-5'.
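
Because the A of 5'-YYA+1NWYY-3' is the +1 nucleotide, a known TSS can be tested by checking the window that runs from -2 to +5 around it. A minimal Python sketch follows, assuming the sequence is written 5' to 3' and the TSS is given as a 0-based index into it; `inr_at_tss` is a hypothetical helper, not part of any program described here.

```python
# Bases allowed by each code in the YYANWYY consensus.
IUPAC = {"A": "A", "N": "ACGT", "W": "AT", "Y": "CT"}

CONSENSUS = "YYANWYY"
A_OFFSET = 2  # index of the +1 A within the consensus


def inr_at_tss(sequence, tss_index):
    """Return True if the window from -2 to +5 around the TSS fits YYANWYY.

    sequence  : nucleotide string written 5' to 3'
    tss_index : 0-based index of the +1 nucleotide in sequence
    """
    start = tss_index - A_OFFSET
    if start < 0:
        return False  # not enough upstream sequence
    window = sequence[start:start + len(CONSENSUS)].upper()
    if len(window) != len(CONSENSUS):
        return False  # not enough downstream sequence
    return all(base in IUPAC[code] for code, base in zip(CONSENSUS, window))
```

With the made-up sequence `GGCTCAGTCCGG`, `inr_at_tss` is true for index 5 (window TCAGTCC) and false for index 6, where the window no longer begins with two pyrimidines.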

"TATA-less core promoters that lack AT-rich sequences in the -30 region and do not stably bind TBP are likely to assemble PICs via alternative pathways and to be regulated by distinct mechanisms (Smale and Kadonaga, 2003). However, the number of such bona fide TATA-less genes remains unclear in eukaryotic genomes."[27]

In Entamoeba histolytica, the consensus sequence is AAAAATTCA.[28]

The Inr has the consensus sequence YYANWYY.[29] Like the TATA box, the Inr element facilitates the binding of transcription factor II D (TFIID), the complex of TATA binding protein and its associated factors (TAFs).[29]

Enhancers

An Inr for mammalian RNA polymerase II can be defined as a DNA sequence element that overlaps a TSS and is sufficient for

  1. determining the start site location in a promoter that lacks a TATA box and
  2. enhancing the strength of a promoter that contains a TATA box.[30]

TATA binding protein associated factors

"Although any isolated TAF may not exhibit sequence-specific interactions at the Inr element in the absence of a TATA-box, a combination of TAFs may bind sequence specifically to the Inr element regardless of the TATA-box and/or DPE (Chalkley and Verrijzer, 1999)."[31] Bold added.

TAF1 "binds to core promoter sequences encompassing the transcription start site. It also binds to activators and other transcriptional regulators, and these interactions affect the rate of transcription initiation."[32]

Prior to transcription, stable binding to an Inr occurs by a complex consisting of TAF1 and TAF2.[10]

TATA box-likes

The Inr is the only element in metazoan protein-encoding genes known to be a functional analog of the TATA box, in that it is sufficient for directing accurate transcription initiation in genes that lack TATA boxes.[33]

General transcription factor II As

General transcription factor II A is critical for the cooperative binding of TFIID to the Inr.[34]

General transcription factor II Ds

The general transcription factor II D (TFIID) is one of several general transcription factors that make up the RNA polymerase II preinitiation complex.[35] Before the start of transcription, the TFIID complex binds to the core promoter of the gene.[35]

TFIID is the first protein to bind to DNA during the formation of the pre-initiation transcription complex of RNA polymerase II (RNA Pol II).[35]

General transcription factor II Is

General transcription factor II I, or TFII-I, is a factor capable of binding the Inr element.[36][37]

Transcription start sites

Usually the Inr contains the TSS.

"[T]he initiator (INR) element located at, or immediately adjacent to, the TSS, ... is recognized by the TBP-associated factors TAF1 and TAF2 of the TFIID complex".[27]

"[T]ranscription does not need to begin at the +1 nucleotide for the Inr to function. RNA polymerase II has been redirected to alternative start sites by reducing ATP concentrations within a nuclear extract, by altering the spacing between the TATA and Inr in a promoter containing both elements, and by dinucleotide initiation strategies".[38]

Hypotheses

  1. A1BG is not transcribed by an initiator element.
  2. A1BG is not transcribed by a TATA box.

Samplings

YYRNWYY

The wider consensus sequence 3'-YYRNWYY-5' allows a G at the TSS but permits at most two Gs in a row.[39]
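The limit of two consecutive Gs follows from the consensus itself: only the R and N positions can hold a G, and they are adjacent. A minimal sketch that checks this by listing every sequence the consensus allows (the helper name longest_g_run is hypothetical):

```python
from itertools import product

# Allowed bases at each position of YYRNWYY: Y Y R N W Y Y.
POSITIONS = ["CT", "CT", "AG", "ACGT", "AT", "CT", "CT"]

def longest_g_run(sequence):
    """Length of the longest run of consecutive G in a sequence."""
    best = run = 0
    for base in sequence:
        run = run + 1 if base == "G" else 0
        best = max(best, run)
    return best

# Every 7-mer permitted by the consensus.
matches = ["".join(bases) for bases in product(*POSITIONS)]

print(len(matches))                            # 256
print(max(longest_g_run(s) for s in matches))  # 2
```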

For the Basic programs (starting with SuccessablesInr.bas) written to compare nucleotide sequences with the sequences on either the template strand (-) or coding strand (+) of the DNA, in the negative direction (-) or the positive direction (+), the programs, what they are looking for, and what they found are:

  1. Negative strand, negative direction: 121, TTACTCC at 4557, TCACACT at 4361, TCGGACC at 4349, CCAGTTT at 4309, TCGGACC at 4300, CTGCACC at 4238, TCGGTCT at 4233, TCACTCT at 4202, TCGAACC at 4188, CCGGTCC at 4170, CCGTACC at 4107, CCGGTCC at 4102, TTACACT at 4092, TCACTCT at 4051, TTGTATC at 4046, TCGGACC at 4037, CCGGTCC at 3951, CTACTTT at 3922, TCATTCT at 3893, CTGGTCC at 3871, CTACACC at 3810, CTGTTCT at 3759, TTGGTCT at 3486, TTGATCT at 3463, CCGTATC at 3446, CCGAACT at 3401, TCGTTCT at 3374, TTGTTCT at 3340, TCGTTTT at 3313, TTGTTCT at 3307, TCGGACC at 3298, TCGGTTC at 3273, CCACACC at 3186, TTGTATT at 3169, CCACTTT at 3146, TTGTTCC at 3141, TCGGACC at 3128, CCGCACC at 3047, TTGATTC at 3031, CCGATTT at 3009, TTGATTC at 2914, TCGTACT at 2784, TCGGACC at 2770, TTGGACC at 2720, TCACACC at 2658, CCACTTT at 2619, TTGTACC at 2614, TCACACC at 2605, CCAGTCC at 2587, CCGGTCC at 2519, TCATTCT at 2503, TTGTTTT at 2490, TCGTTTT at 2476, TCACTCT at 2449, TCGGACC at 2435, TTGGACC at 2385, CCACTTT at 2282, TCGTACC at 2277, TCGGACC at 2268, TCAAACT at 2257, CCAGTCC at 2250, CCGCTTT at 2157, TTGTACC at 2152, TCAAACT at 2141, TCACATT at 2087, CCGGTCC at 2077, TTACACC at 2065, TCGTTCT at 2023, TCGGACC at 2009, TTGGACC at 1959, CCGTACT at 1953, CCGCACC at 1897, TTATACC at 1742, TTAATTT at 1697, TTGGATT at 1591, TTACTTT at 1582, CCGTTTT at 1561, TTGCTTC at 1555, CCACACT at 1479, TTGTTTT at 1394, TCGTTTT at 1371, TTATTCT at 1365, TCAGACC at 1356, TTGGATC at 1306, CCGCACC at 1244, CCACTTT at 1212, TTGTACC at 1207, TCGGACC at 1198, TCACTCT at 1079, TTGGACC at 1015, TTAGTCC at 984, CCGTACC at 953, TCGGTCC at 948, TCGCTCT at 913, TCGGACC at 899, TCGGTTC at 874, CTACACC at 787, TCGCACC at 741, TCGGACT at 732, CCAGTCC at 714, CCGGTTC at 692, CCGGTCC at 648, TTATACC at 605, CCAGTCC at 578, CCGGTTC at 556, TCGGACC at 508, TCACTTT at 473, TTGTATC at 468, TCGGACC at 459, CCAGTCC at 441, CCGGTTC at 419, CTGCTTT at 312, TCACTCT at 301, TTATACT at 274, TTGGTCC at 262, CTACATT at 247, 
CCATATT at 181, CCGTACT at 124, CCGTTTC at 93, CTATACC at 77, TTGTTCC at 71.
  2. Negative strand, positive direction: 45, CTGCACC at 4343, TTAGTTT at 4139, TTGATTT at 4134, TCACTCT at 4128, TCATTTT at 4120, TCACACC at 3824, CTGTTCC at 3625, CCAGACC at 3550, TCACACT at 3507, TTGCATC at 3402, CTGTTCC at 3352, TTGCACT at 3343, CCGCATC at 3328, CTGCACC at 3322, CTGCTCC at 3309, CTGGTCT at 3299, TCGCTCT at 3276, CTGGTCT at 3245, CCAGTCC at 3084, CCAGTCC at 2998, CTGCTCC at 2978, TCAGATT at 2868, CCACACT at 2636, CCACACC at 2602, TTATACC at 2590, CCGCACC at 2566, CTAATTT at 2440, CTACACC at 2430, TCACTCT at 2306, CTGTTTC at 2263, TCAATCT at 2235, CCAGATC at 2230, CTGCATT at 2206, TCATATT at 2178, TCGCTTC at 2095, CCAGTCC at 2026, CTATTTC at 1978, CCACTTC at 1914, CCAGACT at 1744, CTGCACT at 1472, CTGCACT at 1372, CCGGACT at 746, CCACACT at 345, CTGTTTT at 147, TTGTATT at 115.
  3. Positive strand, negative direction: 40, TTAATTC at 4542, TCACATT at 4533, CCACTTT at 4461, CCACTCC at 4425, CCAGTTC at 4417, CTGCACT at 4340, CCGGACT at 4327, TCACACC at 3967, CCATACC at 3858, CTGAACC at 3784, CTGGACT at 3747, CCATTTC at 3688, CTGCTCC at 3582, CCAGATC at 3488, TTGCACT at 3289, TTGAACC at 3245, CTGCACC at 2761, TTGAACC at 2717, TTGAATC at 2708, CTGCACT at 2426, TCACACC at 2418, TTGAACC at 2382, CTACTCC at 2352, CTGCACT at 2000, TTATTTT at 1727, CTATATC at 1528, TCGCTCT at 1450, CCATTTC at 1380, CCAGTCT at 1354, TTGCACT at 1347, TTGCACC at 1339, TTGAACC at 1303, TCACACC at 1128, TCACTCC at 1058, TTGAACC at 1012, TCACACC at 882, TTGAACC at 846, CTGCATT at 152, TTGGACC at 32, CTGAATT at 20.
  4. Positive strand, positive direction: 75, CCAGACC at 4416, CCACTCC at 4401, CTAAATC at 4136, CTACTCC at 4102, TTACTCC at 4096, CCACACT at 3971, TCACACC at 3966, TCAGACT at 3924, TCACTCC at 3878, CTGGACC at 3787, CCGGACC at 3758, CCGGACC at 3679, CCACTCC at 3647, TCACACT at 3594, CTGGTCT at 3548, TCGATCC at 3522, CCGATCC at 3484, CTACTCC at 3478, TCGGTCT at 3221, CTGGTTT at 3175, TTATACC at 3162, CCAGACC at 3021, CCGGACC at 2988, CCAGACT at 2943, CTGGTCC at 2876, CTAAACT at 2871, TTGCTCC at 2806, TCGATTC at 2789, TCGTTTT at 2707, TCAATCC at 2668, CTATATT at 2662, TCAGTCC at 2620, TCAGTTC at 2615, TCAGTCT at 2609, CCGGTCC at 2574, CCGCACT at 2555, TTGGTCT at 2228, CCAGTCT at 2222, CCGTTCT at 2190, CTACTTT at 2146, TTGTACT at 2141, TCAATTT at 2136, CCACACC at 1971, CCGTTCT at 1948, CCGCTCT at 1921, CCACACC at 1805, CCGCACT at 1720, CCGCTCT at 1565, TCGTTCC at 1511, CCGCTCT at 1481, CCGTTCC at 1427, CCGCTCT at 1381, CCGTTCC at 1327, CCGTTCC at 1259, CCGCTCT at 1229, CCGGTCC at 1175, TCGCTCT at 1061, CCGTTCC at 1007, TTGGACC at 947, TCGGTCT at 935, CCGTTCC at 923, TTGGACC at 847, TCGGTCT at 835, CCGTTCC at 823, CCGGACT at 725, CCGTTCC at 671, CCGCTCT at 641, CCGTTCC at 587, CCGCTCT at 557, TCGGTCC at 515, CCGTTCC at 503, CCGGACC at 286, TTACACT at 230, CCGGTCC at 215, CTGGACC at 40.
  5. complement, negative strand, negative direction is SuccessablesInrc--.bas, looking for 3'-A/G-A/G-C/T-A/C/G/T-A/T-A/G-A/G-5', 40, 3'-GACTTAA-5', 20, 3'-AACCTGG-5', 32, 3'-GACGTAA-5', 152, 3'-AACTTGG-5', 846, 3'-AGTGTGG-5', 882, 3'-AACTTGG-5', 1012, 3'-AGTGAGG-5', 1058, 3'-AGTGTGG-5', 1128, 3'-AACTTGG-5', 1303, 3'-AACGTGG-5', 1339, 3'-AACGTGA-5', 1347, 3'-GGTCAGA-5', 1354, 3'-GGTAAAG-5', 1380, 3'-AGCGAGA-5', 1450, 3'-GATATAG-5', 1528, 3'-AATAAAA-5', 1727, 3'-GACGTGA-5', 2000, 3'-GATGAGG-5', 2352, 3'-AACTTGG-5', 2382, 3'-AGTGTGG-5', 2418, 3'-GACGTGA-5', 2426, 3'-AACTTAG-5', 2708, 3'-AACTTGG-5', 2717, 3'-GACGTGG-5', 2761, 3'-AACTTGG-5', 3245, 3'-AACGTGA-5', 3289, 3'-GGTCTAG-5', 3488, 3'-GACGAGG-5', 3582, 3'-GGTAAAG-5', 3688, 3'-GACCTGA-5', 3747, 3'-GACTTGG-5', 3784, 3'-GGTATGG-5', 3858, 3'-AGTGTGG-5', 3967, 3'-GGCCTGA-5', 4327, 3'-GACGTGA-5', 4340, 3'-GGTCAAG-5', 4417, 3'-GGTGAGG-5', 4425, 3'-GGTGAAA-5', 4461, 3'-AGTGTAA-5', 4533, 3'-AATTAAG-5', 4542,
  6. complement, negative strand, positive direction is SuccessablesInrc-+.bas, looking for 3'-A/G-A/G-C/T-A/C/G/T-A/T-A/G-A/G-5', 75, 5'-GACCTGG-3' at 40, 5'-GGCCAGG-3' at 215, 5'-AATGTGA-3' at 230, 5'-GGCCTGG-3' at 286, 5'-GGCAAGG-3' at 503, 5'-AGCCAGG-3' at 515, 5'-GGCGAGA-3' at 557, 5'-GGCAAGG-3' at 587, 5'-GGCGAGA-3' at 641, 5'-GGCAAGG-3' at 671, 5'-GGCCTGA-3' at 725, 5'-GGCAAGG-3' at 823, 5'-AGCCAGA-3' at 835, 5'-AACCTGG-3' at 847, 5'-GGCAAGG-3' at 923, 5'-AGCCAGA-3' at 935, 5'-AACCTGG-3' at 947, 5'-GGCAAGG-3' at 1007, 5'-AGCGAGA-3' at 1061, 5'-GGCCAGG-3' at 1175, 5'-GGCGAGA-3' at 1229, 5'-GGCAAGG-3' at 1259, 5'-GGCAAGG-3' at 1327, 5'-GGCGAGA-3' at 1381, 5'-GGCAAGG-3' at 1427, 5'-GGCGAGA-3' at 1481, 5'-AGCAAGG-3' at 1511, 5'-GGCGAGA-3' at 1565, 5'-GGCGTGA-3' at 1720, 5'-GGTGTGG-3' at 1805, 5'-GGCGAGA-3' at 1921, 5'-GGCAAGA-3' at 1948, 5'-GGTGTGG-3' at 1971, 5'-AGTTAAA-3' at 2136, 5'-AACATGA-3' at 2141, 5'-GATGAAA-3' at 2146, 5'-GGCAAGA-3' at 2190, 5'-GGTCAGA-3' at 2222, 5'-AACCAGA-3' at 2228, 5'-GGCGTGA-3' at 2555, 5'-GGCCAGG-3' at 2574, 5'-AGTCAGA-3' at 2609, 5'-AGTCAAG-3' at 2615, 5'-AGTCAGG-3' at 2620, 5'-GATATAA-3' at 2662, 5'-AGTTAGG-3' at 2668, 5'-AGCAAAA-3' at 2707, 5'-AGCTAAG-3' at 2789, 5'-AACGAGG-3' at 2806, 5'-GATTTGA-3' at 2871, 5'-GACCAGG-3' at 2876, 5'-GGTCTGA-3' at 2943, 5'-GGCCTGG-3' at 2988, 5'-GGTCTGG-3' at 3021, 5'-AATATGG-3' at 3162, 5'-GACCAAA-3' at 3175, 5'-AGCCAGA-3' at 3221, 5'-GATGAGG-3' at 3478, 5'-GGCTAGG-3' at 3484, 5'-AGCTAGG-3' at 3522, 5'-GACCAGA-3' at 3548, 5'-AGTGTGA-3' at 3594, 5'-GGTGAGG-3' at 3647, 5'-GGCCTGG-3' at 3679, 5'-GGCCTGG-3' at 3758, 5'-GACCTGG-3' at 3787, 5'-AGTGAGG-3' at 3878, 5'-AGTCTGA-3' at 3924, 5'-AGTGTGG-3' at 3966, 5'-GGTGTGA-3' at 3971, 5'-AATGAGG-3' at 4096, 5'-GATGAGG-3' at 4102, 5'-GATTTAG-3' at 4136, 5'-GGTGAGG-3' at 4401, 5'-GGTCTGG-3' at 4416.
  7. complement, positive strand, negative direction is SuccessablesInrc+-.bas, looking for 3'-A/G-A/G-C/T-A/C/G/T-A/T-A/G-A/G-5', 121, 3'-AACAAGG-5', 71, 3'-GATATGG-5', 77, 3'-GGCAAAG-5', 93, 3'-GGCATGA-5', 124, 3'-GGTATAA-5', 181, 3'-GATGTAA-5', 247, 3'-AACCAGG-5', 262, 3'-AATATGA-5', 274, 3'-AGTGAGA-5', 301, 3'-GACGAAA-5', 312, 3'-GGCCAAG-5', 419, 3'-GGTCAGG-5', 441, 3'-AGCCTGG-5', 459, 3'-AACATAG-5', 468, 3'-AGTGAAA-5', 473, 3'-AGCCTGG-5', 508, 3'-GGCCAAG-5', 556, 3'-GGTCAGG-5', 578, 3'-AATATGG-5', 605, 3'-GGCCAGG-5', 648, 3'-GGCCAAG-5', 692, 3'-GGTCAGG-5', 714, 3'-AGCCTGA-5', 732, 3'-AGCGTGG-5', 741, 3'-GATGTGG-5', 787, 3'-AGCCAAG-5', 874, 3'-AGCCTGG-5', 899, 3'-AGCGAGA-5', 913, 3'-AGCCAGG-5', 948, 3'-GGCATGG-5', 953, 3'-AATCAGG-5', 984, 3'-AACCTGG-5', 1015, 3'-AGTGAGA-5', 1079, 3'-AGCCTGG-5', 1198, 3'-AACATGG-5', 1207, 3'-GGTGAAA-5', 1212, 3'-GGCGTGG-5', 1244, 3'-AACCTAG-5', 1306, 3'-AGTCTGG-5', 1356, 3'-AATAAGA-5', 1365, 3'-AGCAAAA-5', 1371, 3'-AACAAAA-5', 1394, 3'-GGTGTGA-5', 1479, 3'-AACGAAG-5', 1555, 3'-GGCAAAA-5', 1561, 3'-AATGAAA-5', 1582, 3'-AACCTAA-5', 1591, 3'-AATTAAA-5', 1697, 3'-AATATGG-5', 1742, 3'-GGCGTGG-5', 1897, 3'-GGCATGA-5', 1953, 3'-AACCTGG-5', 1959, 3'-AGCCTGG-5', 2009, 3'-AGCAAGA-5', 2023, 3'-AATGTGG-5', 2065, 3'-GGCCAGG-5', 2077, 3'-AGTGTAA-5', 2087, 3'-AGTTTGA-5', 2141, 3'-AACATGG-5', 2152, 3'-GGCGAAA-5', 2157, 3'-GGTCAGG-5', 2250, 3'-AGTTTGA-5', 2257, 3'-AGCCTGG-5', 2268, 3'-AGCATGG-5', 2277, 3'-GGTGAAA-5', 2282, 3'-AACCTGG-5', 2385, 3'-AGCCTGG-5', 2435, 3'-AGTGAGA-5', 2449, 3'-AGCAAAA-5', 2476, 3'-AACAAAA-5', 2490, 3'-AGTAAGA-5', 2503, 3'-GGCCAGG-5', 2519, 3'-GGTCAGG-5', 2587, 3'-AGTGTGG-5', 2605, 3'-AACATGG-5', 2614, 3'-GGTGAAA-5', 2619, 3'-AGTGTGG-5', 2658, 3'-AACCTGG-5', 2720, 3'-AGCCTGG-5', 2770, 3'-AGCATGA-5', 2784, 3'-AACTAAG-5', 2914, 3'-GGCTAAA-5', 3009, 3'-AACTAAG-5', 3031, 3'-GGCGTGG-5', 3047, 3'-AGCCTGG-5', 3128, 3'-AACAAGG-5', 3141, 3'-GGTGAAA-5', 3146, 3'-AACATAA-5', 3169, 3'-GGTGTGG-5', 3186, 3'-AGCCAAG-5', 3273, 
3'-AGCCTGG-5', 3298, 3'-AACAAGA-5', 3307, 3'-AGCAAAA-5', 3313, 3'-AACAAGA-5', 3340, 3'-AGCAAGA-5', 3374, 3'-GGCTTGA-5', 3401, 3'-GGCATAG-5', 3446, 3'-AACTAGA-5', 3463, 3'-AACCAGA-5', 3486, 3'-GACAAGA-5', 3759, 3'-GATGTGG-5', 3810, 3'-GACCAGG-5', 3871, 3'-AGTAAGA-5', 3893, 3'-GATGAAA-5', 3922, 3'-GGCCAGG-5', 3951, 3'-AGCCTGG-5', 4037, 3'-AACATAG-5', 4046, 3'-AGTGAGA-5', 4051, 3'-AATGTGA-5', 4092, 3'-GGCCAGG-5', 4102, 3'-GGCATGG-5', 4107, 3'-GGCCAGG-5', 4170, 3'-AGCTTGG-5', 4188, 3'-AGTGAGA-5', 4202, 3'-AGCCAGA-5', 4233, 3'-GACGTGG-5', 4238, 3'-AGCCTGG-5', 4300, 3'-GGTCAAA-5', 4309, 3'-AGCCTGG-5', 4349, 3'-AGTGTGA-5', 4361, 3'-AATGAGG-5', 4557,
  8. complement, positive strand, positive direction is SuccessablesInrc++.bas, looking for 3'-A/G-A/G-C/T-A/C/G/T-A/T-A/G-A/G-5', 45, 5'-AACATAA-3' at 115, 5'-GACAAAA-3' at 147, 5'-GGTGTGA-3' at 345, 5'-GGCCTGA-3' at 746, 5'-GACGTGA-3' at 1372, 5'-GACGTGA-3' at 1472, 5'-GGTCTGA-3' at 1744, 5'-GGTGAAG-3' at 1914, 5'-GATAAAG-3' at 1978, 5'-GGTCAGG-3' at 2026, 5'-AGCGAAG-3' at 2095, 5'-AGTATAA-3' at 2178, 5'-GACGTAA-3' at 2206, 5'-GGTCTAG-3' at 2230, 5'-AGTTAGA-3' at 2235, 5'-GACAAAG-3' at 2263, 5'-AGTGAGA-3' at 2306, 5'-GATGTGG-3' at 2430, 5'-GATTAAA-3' at 2440, 5'-GGCGTGG-3' at 2566, 5'-AATATGG-3' at 2590, 5'-GGTGTGG-3' at 2602, 5'-GGTGTGA-3' at 2636, 5'-AGTCTAA-3' at 2868, 5'-GACGAGG-3' at 2978, 5'-GGTCAGG-3' at 2998, 5'-GGTCAGG-3' at 3084, 5'-GACCAGA-3' at 3245, 5'-AGCGAGA-3' at 3276, 5'-GACCAGA-3' at 3299, 5'-GACGAGG-3' at 3309, 5'-GACGTGG-3' at 3322, 5'-GGCGTAG-3' at 3328, 5'-AACGTGA-3' at 3343, 5'-GACAAGG-3' at 3352, 5'-AACGTAG-3' at 3402, 5'-AGTGTGA-3' at 3507, 5'-GGTCTGG-3' at 3550, 5'-GACAAGG-3' at 3625, 5'-AGTGTGG-3' at 3824, 5'-AGTAAAA-3' at 4120, 5'-AGTGAGA-3' at 4128, 5'-AACTAAA-3' at 4134, 5'-AATCAAA-3' at 4139, 5'-GACGTGG-3' at 4343.
  9. inverse complement, negative strand, negative direction: 32, AGTGTAA at 4533, GGTCCGA at 4255, AGTACGG at 4118, AGTGTGG at 3967, GGTCCGG at 3873, GGTATGG at 3858, GGTCTAG at 3488, AGTCCGA at 3398, AGTGCGG at 3281, GGACCGG at 3130, GATTCGA at 3033, AAAGTAG at 2887, AGTACGG at 2753, AGTACGG at 2535, AGTGTGG at 2418, AGTGCGG at 2208, AGTGCGG at 1992, GGACCGA at 1843, AGTGCAG at 1773, AAAATAG at 1730, AGAACGG at 1608, GATATAG at 1528, GGTCCGA at 1462, AGAGCGA at 1448, GGACCGG at 1200, AGTGTGG at 1128, GAAGTGA at 1056, AGTGTGG at 882, GGACTGG at 734, AGTGCGG at 664, GGACCGA at 598, GATACAA at 213.
  10. inverse complement, negative strand, positive direction: 61, GGAACAG at 4445, GGTCTGG at 4416, GGAGTGA at 4350, GATTTAG at 4136, GAAATGA at 4094, AGAACAG at 4069, AGAGTGG at 4040, GGTGTGA at 3971, AGTGTGG at 3966, AGTCTGA at 3924, AGAGTGA at 3876, GAACCAG at 3840, AGAATGA at 3835, AATCCGA at 3799, GAAGCGG at 3670, AGTGTGA at 3594, GGAATGA at 3567, GGACCAG at 3547, AGTGCAG at 3465, GATGCAG at 3460, GGAATGA at 3441, GGACCAA at 3174, GAAATGG at 3168, AATATGG at 3162, GGTCTGG at 3021, GGTCTGA at 2943, GATTTGA at 2871, AGAATGA at 2841, GGTGCAA at 2801, AAAGTGG at 2711, AGAGCAA at 2705, GGACTGA at 2674, GATATAA at 2662, GAAATAG at 2626, GGTGCAA at 2335, AGTGCAG at 2327, AGATCAA at 2232, GAACCAG at 2227, AGTGCAG at 2064, AAAGCAG at 2007, GGTGTGG at 1971, GAACTGG at 1953, GGTGTGG at 1805, AGTGCAG at 1787, GGTGCGG at 1764, GAAGCGG at 1636, AGTGCGG at 1590, AATGCGG at 1422, AATGCGG at 1322, AGTGCGG at 1254, AGTGCGG at 1170, AGTGCGG at 1086, GGTGCAG at 784, AGTGCGG at 666, AGTGCGG at 582, AGTGCGG at 498, GGTGCGG at 489, AGACCGG at 442, GGAGCGA at 429, AATGTGA at 230, AGAGTGG at 53.
  11. inverse complement, positive strand, negative direction: 100, GGAATGA at 4555, AGTCCAA at 4502, AGTGTGA at 4361, AAAATAA at 4221, AGTTCAA at 4177, AATGTGA at 4092, AAAATAA at 4071, AGACCAG at 4032, AGTTCAA at 4026, GGAGTAA at 3891, GGACCAG at 3870, GATGTGG at 3810, AATGCAG at 3772, GGACTGG at 3749, GGAACAG at 3725, AATCCAG at 3681, AAACCAG at 3485, GAACTAG at 3462, GAAGTGA at 3410, AAATTGA at 3358, AAAACAA at 3330, AGAGCAA at 3311, GGTGTGG at 3186, AAATTAG at 3176, AGACCAG at 3123, AAACTAA at 3030, AAAATAA at 3013, AGAATGG at 3004, AAAACAA at 2842, AGTGTGG at 2658, AAATCAG at 2649, AGTGTGG at 2605, AGACCAG at 2600, AAAACAA at 2509, AAAGCAA at 2480, AAAGCAA at 2474, GATTCGG at 2454, AGAGTGA at 2447, AAACTAG at 2313, AATACAA at 2305, AGACCAG at 2263, AGTTTGA at 2257, GGTGCGG at 2197, AAAATGA at 2187, GATACAA at 2180, AGACCAA at 2147, AGTTTGA at 2141, AGTGTAA at 2087, GGTGCAG at 2082, AATGTGG at 2065, AGAGCAA at 2021, AGAATGG at 1948, AGACTGA at 1935, AAATTAG at 1887, AATACAA at 1878, AATATGG at 1742, GAATTAA at 1696, AAAGCGG at 1680, GAAATGA at 1663, GAAACAA at 1585, AATACAG at 1566, AGAACGA at 1553, AGTGCAA at 1536, GGTGTGA at 1479, AGTGCAG at 1471, AAAACAA at 1388, AGAGCAA at 1369, AGTCTGG at 1356, AAATTAG at 1234, AGAGTGA at 1077, AGATTGG at 1045, GATCCAG at 975, AGAGCGA at 911, GATGTGG at 787, AAATTAG at 777, AATACAA at 769, AGACCAG at 727, AGTTCGA at 721, AAATTGG at 643, AATACAA at 635, AATATGG at 605, AGATTGA at 585, AAATTAG at 499, AATACGA at 492, AGTGCGA at 448, GGTGCGG at 380, AAACTGA at 307, AGAACAG at 288, AATATGA at 274, AAACCAG at 261, AGTTCAA at 255, GATGTAA at 247, GAAACAA at 229, GGTATAA at 181, AAAACAG at 167, AAACTGA at 130, GATATGG at 77, AAAACAA at 69, GGACCAG at 34, AGACTGA at 17.
  12. inverse complement, positive strand, positive direction: 75, AGAACGA at 4390, GGTACGA at 4372, AGTACAG at 4366, GGAGTAA at 4309, GGACTGG at 4216, GAAACGG at 4210, AAATCAA at 4138, GAACTAA at 4133, AAAATAG at 4123, GAACTGG at 4018, AGTGTGG at 3824, GGACCGG at 3681, AGAGTGG at 3612, GGTCTGG at 3550, GATCCGA at 3524, AGTGTGA at 3507, GGAACGG at 3375, GGTACAA at 3337, AGAGTGA at 3317, GGACCAG at 3298, AGTGCAG at 3255, GAAGTAG at 3250, GGACCAA at 3049, AGTCCGG at 3036, AGACCAA at 3023, GGTCCAG at 3018, GGAACAG at 3003, GGACCGG at 2990, AGACCGG at 2985, AGACTGA at 2945, GGAGTAA at 2902, AGACCGA at 2885, GGTCCGG at 2878, AAACTGG at 2873, AGTCTAA at 2868, GGTGTGA at 2636, AGTTCAG at 2617, GGTGTGG at 2602, AATATGG at 2590, GGACCGG at 2571, GGTACAA at 2475, AGAGTGG at 2470, GGACCGA at 2435, GATGTGG at 2430, AATCCGA at 2368, GGTCCGA at 2318, AAAGTGA at 2304, AGAGTGG at 2247, GGTCTAG at 2230, GGACTGG at 2213, AGTATAA at 2178, GAAGTAG at 2110, AGAATGG at 1888, GGTCCGG at 1857, GGACCGA at 1817, GGTCTGA at 1744, GGACTGG at 1662, GATGCGA at 1576, AATTCGG at 1541, GAAGCGG at 1408, GAAGCGG at 1308, AAAGCAG at 1183, GGTCCGA at 1177, GGACCGG at 949, GGACCGG at 849, GGTGCGA at 777, GATGCGA at 652, GAAGCGG at 595, AGAATGA at 524, GAAGCGG at 459, GGTGTGA at 345, GGTCCAG at 217, AATCCAG at 152, AGTCCGG at 92, GGTCCGA at 10.
  13. inverse, negative strand, negative direction, is SuccessablesInri--.bas, looking for 3'-C/T-C/T-A/T-A/C/G/T-A/G-C/T-C/T-5', 100, 3'-TCTGACT-5', 17, 3'-CCTGGTC-5', 34, 3'-TTTTGTT-5', 69, 3'-CTATACC-5', 77, 3'-TTTGACT-5', 130, 3'-TTTTGTC-5', 167, 3'-CCATATT-5', 181, 3'-CTTTGTT-5', 229, 3'-CTACATT-5', 247, 3'-TCAAGTT-5', 255, 3'-TTTGGTC-5', 261, 3'-TTATACT-5', 274, 3'-TCTTGTC-5', 288, 3'-TTTGACT-5', 307, 3'-CCACGCC-5', 380, 3'-TCACGCT-5', 448, 3'-TTATGCT-5', 492, 3'-TTTAATC-5', 499, 3'-TCTAACT-5', 585, 3'-TTATACC-5', 605, 3'-TTATGTT-5', 635, 3'-TTTAACC-5', 643, 3'-TCAAGCT-5', 721, 3'-TCTGGTC-5', 727, 3'-TTATGTT-5', 769, 3'-TTTAATC-5', 777, 3'-CTACACC-5', 787, 3'-TCTCGCT-5', 911, 3'-CTAGGTC-5', 975, 3'-TCTAACC-5', 1045, 3'-TCTCACT-5', 1077, 3'-TTTAATC-5', 1234, 3'-TCAGACC-5', 1356, 3'-TCTCGTT-5', 1369, 3'-TTTTGTT-5', 1388, 3'-TCACGTC-5', 1471, 3'-CCACACT-5', 1479, 3'-TCACGTT-5', 1536, 3'-TCTTGCT-5', 1553, 3'-TTATGTC-5', 1566, 3'-CTTTGTT-5', 1585, 3'-CTTTACT-5', 1663, 3'-TTTCGCC-5', 1680, 3'-CTTAATT-5', 1696, 3'-TTATACC-5', 1742, 3'-TTATGTT-5', 1878, 3'-TTTAATC-5', 1887, 3'-TCTGACT-5', 1935, 3'-TCTTACC-5', 1948, 3'-TCTCGTT-5', 2021, 3'-TTACACC-5', 2065, 3'-CCACGTC-5', 2082, 3'-TCACATT-5', 2087, 3'-TCAAACT-5', 2141, 3'-TCTGGTT-5', 2147, 3'-CTATGTT-5', 2180, 3'-TTTTACT-5', 2187, 3'-CCACGCC-5', 2197, 3'-TCAAACT-5', 2257, 3'-TCTGGTC-5', 2263, 3'-TTATGTT-5', 2305, 3'-TTTGATC-5', 2313, 3'-TCTCACT-5', 2447, 3'-CTAAGCC-5', 2454, 3'-TTTCGTT-5', 2474, 3'-TTTCGTT-5', 2480, 3'-TTTTGTT-5', 2509, 3'-TCTGGTC-5', 2600, 3'-TCACACC-5', 2605, 3'-TTTAGTC-5', 2649, 3'-TCACACC-5', 2658, 3'-TTTTGTT-5', 2842, 3'-TCTTACC-5', 3004, 3'-TTTTATT-5', 3013, 3'-TTTGATT-5', 3030, 3'-TCTGGTC-5', 3123, 3'-TTTAATC-5', 3176, 3'-CCACACC-5', 3186, 3'-TCTCGTT-5', 3311, 3'-TTTTGTT-5', 3330, 3'-TTTAACT-5', 3358, 3'-CTTCACT-5', 3410, 3'-CTTGATC-5', 3462, 3'-TTTGGTC-5', 3485, 3'-TTAGGTC-5', 3681, 3'-CCTTGTC-5', 3725, 3'-CCTGACC-5', 3749, 3'-TTACGTC-5', 3772, 3'-CTACACC-5', 3810, 3'-CCTGGTC-5', 3870, 
3'-CCTCATT-5', 3891, 3'-TCAAGTT-5', 4026, 3'-TCTGGTC-5', 4032, 3'-TTTTATT-5', 4071, 3'-TTACACT-5', 4092, 3'-TCAAGTT-5', 4177, 3'-TTTTATT-5', 4221, 3'-TCACACT-5', 4361, 3'-TCAGGTT-5', 4502, 3'-CCTTACT-5', 4555,
  14. inverse, negative strand, positive direction, is SuccessablesInri-+.bas, looking for 3'-C/T-C/T-A/T-A/C/G/T-A/G-C/T-C/T-5', 75, 5'-CCAGGCT-3' at 10, 5'-TCAGGCC-3' at 92, 5'-TTAGGTC-3' at 152, 5'-CCAGGTC-3' at 217, 5'-CCACACT-3' at 345, 5'-CTTCGCC-3' at 459, 5'-TCTTACT-3' at 524, 5'-CTTCGCC-3' at 595, 5'-CTACGCT-3' at 652, 5'-CCACGCT-3' at 777, 5'-CCTGGCC-3' at 849, 5'-CCTGGCC-3' at 949, 5'-CCAGGCT-3' at 1177, 5'-TTTCGTC-3' at 1183, 5'-CTTCGCC-3' at 1308, 5'-CTTCGCC-3' at 1408, 5'-TTAAGCC-3' at 1541, 5'-CTACGCT-3' at 1576, 5'-CCTGACC-3' at 1662, 5'-CCAGACT-3' at 1744, 5'-CCTGGCT-3' at 1817, 5'-CCAGGCC-3' at 1857, 5'-TCTTACC-3' at 1888, 5'-CTTCATC-3' at 2110, 5'-TCATATT-3' at 2178, 5'-CCTGACC-3' at 2213, 5'-CCAGATC-3' at 2230, 5'-TCTCACC-3' at 2247, 5'-TTTCACT-3' at 2304, 5'-CCAGGCT-3' at 2318, 5'-TTAGGCT-3' at 2368, 5'-CTACACC-3' at 2430, 5'-CCTGGCT-3' at 2435, 5'-TCTCACC-3' at 2470, 5'-CCATGTT-3' at 2475, 5'-CCTGGCC-3' at 2571, 5'-TTATACC-3' at 2590, 5'-CCACACC-3' at 2602, 5'-TCAAGTC-3' at 2617, 5'-CCACACT-3' at 2636, 5'-TCAGATT-3' at 2868, 5'-TTTGACC-3' at 2873, 5'-CCAGGCC-3' at 2878, 5'-TCTGGCT-3' at 2885, 5'-CCTCATT-3' at 2902, 5'-TCTGACT-3' at 2945, 5'-TCTGGCC-3' at 2985, 5'-CCTGGCC-3' at 2990, 5'-CCTTGTC-3' at 3003, 5'-CCAGGTC-3' at 3018, 5'-TCTGGTT-3' at 3023, 5'-TCAGGCC-3' at 3036, 5'-CCTGGTT-3' at 3049, 5'-CTTCATC-3' at 3250, 5'-TCACGTC-3' at 3255, 5'-CCTGGTC-3' at 3298, 5'-TCTCACT-3' at 3317, 5'-CCATGTT-3' at 3337, 5'-CCTTGCC-3' at 3375, 5'-TCACACT-3' at 3507, 5'-CTAGGCT-3' at 3524, 5'-CCAGACC-3' at 3550, 5'-TCTCACC-3' at 3612, 5'-CCTGGCC-3' at 3681, 5'-TCACACC-3' at 3824, 5'-CTTGACC-3' at 4018, 5'-TTTTATC-3' at 4123, 5'-CTTGATT-3' at 4133, 5'-TTTAGTT-3' at 4138, 5'-CTTTGCC-3' at 4210, 5'-CCTGACC-3' at 4216, 5'-CCTCATT-3' at 4309, 5'-TCATGTC-3' at 4366, 5'-CCATGCT-3' at 4372, 5'-TCTTGCT-3' at 4390.
  15. inverse, positive strand, negative direction, is SuccessablesInri+-.bas, looking for 3'-C/T-C/T-A/T-A/C/G/T-A/G-C/T-C/T-5', 32, 3'-CTATGTT-5', 213, 3'-CCTGGCT-5', 598, 3'-TCACGCC-5', 664, 3'-CCTGACC-5', 734, 3'-TCACACC-5', 882, 3'-CTTCACT-5', 1056, 3'-TCACACC-5', 1128, 3'-CCTGGCC-5', 1200, 3'-TCTCGCT-5', 1448, 3'-CCAGGCT-5', 1462, 3'-CTATATC-5', 1528, 3'-TCTTGCC-5', 1608, 3'-TTTTATC-5', 1730, 3'-TCACGTC-5', 1773, 3'-CCTGGCT-5', 1843, 3'-TCACGCC-5', 1992, 3'-TCACGCC-5', 2208, 3'-TCACACC-5', 2418, 3'-TCATGCC-5', 2535, 3'-TCATGCC-5', 2753, 3'-TTTCATC-5', 2887, 3'-CTAAGCT-5', 3033, 3'-CCTGGCC-5', 3130, 3'-TCACGCC-5', 3281, 3'-TCAGGCT-5', 3398, 3'-CCAGATC-5', 3488, 3'-CCATACC-5', 3858, 3'-CCAGGCC-5', 3873, 3'-TCACACC-5', 3967, 3'-TCATGCC-5', 4118, 3'-CCAGGCT-5', 4255, 3'-TCACATT-5', 4533,
  16. inverse, positive strand, positive direction, is SuccessablesInri++.bas, looking for 3'-C/T-C/T-A/T-A/C/G/T-A/G-C/T-C/T-5', 61, 5'-TCTCACC-3' at 53, 5'-TTACACT-3' at 230, 5'-CCTCGCT-3' at 429, 5'-TCTGGCC-3' at 442, 5'-CCACGCC-3' at 489, 5'-TCACGCC-3' at 498, 5'-TCACGCC-3' at 582, 5'-TCACGCC-3' at 666, 5'-CCACGTC-3' at 784, 5'-TCACGCC-3' at 1086, 5'-TCACGCC-3' at 1170, 5'-TCACGCC-3' at 1254, 5'-TTACGCC-3' at 1322, 5'-TTACGCC-3' at 1422, 5'-TCACGCC-3' at 1590, 5'-CTTCGCC-3' at 1636, 5'-CCACGCC-3' at 1764, 5'-TCACGTC-3' at 1787, 5'-CCACACC-3' at 1805, 5'-CTTGACC-3' at 1953, 5'-CCACACC-3' at 1971, 5'-TTTCGTC-3' at 2007, 5'-TCACGTC-3' at 2064, 5'-CTTGGTC-3' at 2227, 5'-TCTAGTT-3' at 2232, 5'-TCACGTC-3' at 2327, 5'-CCACGTT-3' at 2335, 5'-CTTTATC-3' at 2626, 5'-CTATATT-3' at 2662, 5'-CCTGACT-3' at 2674, 5'-TCTCGTT-3' at 2705, 5'-TTTCACC-3' at 2711, 5'-CCACGTT-3' at 2801, 5'-TCTTACT-3' at 2841, 5'-CTAAACT-3' at 2871, 5'-CCAGACT-3' at 2943, 5'-CCAGACC-3' at 3021, 5'-TTATACC-3' at 3162, 5'-CTTTACC-3' at 3168, 5'-CCTGGTT-3' at 3174, 5'-CCTTACT-3' at 3441, 5'-CTACGTC-3' at 3460, 5'-TCACGTC-3' at 3465, 5'-CCTGGTC-3' at 3547, 5'-CCTTACT-3' at 3567, 5'-TCACACT-3' at 3594, 5'-CTTCGCC-3' at 3670, 5'-TTAGGCT-3' at 3799, 5'-TCTTACT-3' at 3835, 5'-CTTGGTC-3' at 3840, 5'-TCTCACT-3' at 3876, 5'-TCAGACT-3' at 3924, 5'-TCACACC-3' at 3966, 5'-CCACACT-3' at 3971, 5'-TCTCACC-3' at 4040, 5'-TCTTGTC-3' at 4069, 5'-CTTTACT-3' at 4094, 5'-CTAAATC-3' at 4136, 5'-CCTCACT-3' at 4350, 5'-CCAGACC-3' at 4416, 5'-CCTTGTT-3' at 4445.
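The four pattern variants in the list above (consensus, complement, inverse, inverse complement) and the search for them can be sketched in Python as follows. This is not the Basic source, which is not reproduced here. The function names, the toy sequence, and the convention of reporting the 1-based position of the first nucleotide of each occurrence are all assumptions made for illustration.

```python
import re

# Degenerate codes needed for YYRNWYY and its variants.
IUPAC = {"A": "A", "C": "C", "G": "G", "T": "T",
         "R": "AG", "Y": "CT", "W": "AT", "N": "ACGT"}

# Complement of each code: R (A/G) pairs with Y (C/T); W and N are unchanged.
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A",
              "R": "Y", "Y": "R", "W": "W", "N": "N"}

def complement(pattern):
    return "".join(COMPLEMENT[code] for code in pattern)

def inverse(pattern):
    return pattern[::-1]

def find_all(pattern, sequence):
    """Return (1-based position, matched text) for every occurrence.

    A lookahead is used so that overlapping occurrences are all reported.
    """
    body = "".join("[" + IUPAC[code] + "]" for code in pattern)
    regex = re.compile("(?=(" + body + "))")
    return [(m.start() + 1, m.group(1)) for m in regex.finditer(sequence)]

variants = {
    "consensus": "YYRNWYY",
    "complement": complement("YYRNWYY"),                   # RRYNWRR
    "inverse": inverse("YYRNWYY"),                         # YYWNRYY
    "inverse complement": complement(inverse("YYRNWYY")),  # RRWNYRR
}

toy = "TTGTTCCAGTGTAA"  # toy sequence, not the A1BG promoter
for name, pattern in variants.items():
    print(name, pattern, find_all(pattern, toy))
```

Because a complement pattern on one strand describes the same sites as the original pattern on the opposite strand, the counts pair up in the list above: for example, items 1 and 7 both report 121 occurrences, and items 2 and 8 both report 45.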

YYRNWYY UTRs

YYRNWYY core promoters

YYRNWYY proximal promoters

YYRNWYY distal promoters

BBCABW

For the Basic programs (starting with SuccessablesInr2.bas) written to compare nucleotide sequences with the sequences on either the template strand (-) or coding strand (+) of the DNA, in the negative direction (-) or the positive direction (+), the programs, what they are looking for, and what they found are:

  1. negative strand in the negative direction (from ZSCAN22 to A1BG) is SuccessablesInr2--.bas, looking for 5'-C/G/T-C/G/T-C-A-C/G/T-A/T-3', 44, 3'-TCCATA-5', 179, 3'-CCCAGT-5', 206, 3'-CTCAGA-5', 278, 3'-GTCACT-5', 299, 3'-TTCACA-5', 322, 3'-TCCAGT-5', 439, 3'-TGCATT-5', 533, 3'-TCCAGT-5', 568, 3'-TCCAGT-5', 576, 3'-TCCAGT-5', 712, 3'-GGCAGA-5', 754, 3'-GCCACT-5', 868, 3'-GTCACT-5', 1034, 3'-CCCACT-5', 1049, 3'-CTCACT-5', 1077, 3'-GGCACA-5', 1220, 3'-GTCACT-5', 1325, 3'-GTCAGA-5', 1354, 3'-CTCAGA-5', 1444, 3'-GGCAGT-5', 1511, 3'-TGCAGA-5', 1774, 3'-GTCACT-5', 1978, 3'-GTCACA-5', 2085, 3'-TCCAGT-5', 2248, 3'-GTCACT-5', 2404, 3'-CTCACT-5', 2447, 3'-TCCAGT-5', 2585, 3'-GTCACA-5', 2603, 3'-GTCACA-5', 2656, 3'-GTCACT-5', 2739, 3'-TTCACA-5', 2860, 3'-TCCACT-5', 3144, 3'-CCCACA-5', 3184, 3'-TTCACT-5', 3410, 3'-GTCATT-5', 3480, 3'-TCCACT-5', 3825, 3'-CTCATA-5', 3829, 3'-CTCATT-5', 3891, 3'-TTCACA-5', 3939, 3'-GTCACT-5', 4200, 3'-TCCAGT-5', 4307, 3'-GTCACT-5', 4319, 3'-CCCACT-5', 4353, 3'-GTCACA-5', 4359.
  2. negative strand in the positive direction (from ZNF497 to A1BG) is SuccessablesInr2-+.bas, looking for 5'-C/G/T-C/G/T-C-A-C/G/T-A/T-3', 87, 3'-TCCAGA-5', 15, 3'-GGCATT-5', 22, 3'-GTCACA-5', 155, 3'-CCCAGA-5', 204, 3'-GCCACA-5', 343, 3'-CGCAGA-5', 396, 3'-TGCAGA-5', 438, 3'-CCCAGA-5', 468, 3'-TGCACA-5', 548, 3'-TCCACA-5', 632, 3'-CGCACT-5', 686, 3'-CGCACA-5', 800, 3'-GCCAGA-5', 835, 3'-GCCACA-5', 884, 3'-GCCAGA-5', 935, 3'-GCCACA-5', 984, 3'-CGCACA-5', 1052, 3'-CGCACA-5', 1136, 3'-TGCACA-5', 1220, 3'-CCCAGT-5', 1250, 3'-CGCAGA-5', 1316, 3'-TGCACT-5', 1372, 3'-CGCAGA-5', 1416, 3'-TGCACT-5', 1472, 3'-CCCACT-5', 1502, 3'-CGCACA-5', 1556, 3'-GGCATT-5', 1702, 3'-CCCAGA-5', 1742, 3'-TGCACA-5', 1822, 3'-TCCACT-5', 1912, 3'-TGCAGA-5', 1937, 3'-GGCACT-5', 1996, 3'-CCCAGT-5', 2024, 3'-TCCACA-5', 2029, 3'-CTCAGT-5', 2060, 3'-TGCAGT-5', 2065, 3'-GCCACT-5', 2072, 3'-TTCAGT-5', 2098, 3'-CTCATA-5', 2176, 3'-TGCATT-5', 2206, 3'-GTCAGA-5', 2222, 3'-CTCAGA-5', 2239, 3'-TTCACT-5', 2304, 3'-TGCAGT-5', 2328, 3'-GTCACT-5', 2425, 3'-GTCAGA-5', 2609, 3'-CTCAGA-5', 2699, 3'-TGCAGA-5', 2721, 3'-CTCAGA-5', 2729, 3'-TGCAGA-5', 2859, 3'-CTCAGA-5', 2866, 3'-CTCATT-5', 2902, 3'-GTCACT-5', 2929, 3'-TTCAGT-5', 2936, 3'-TGCACA-5', 2962, 3'-TGCATT-5', 3072, 3'-CCCAGT-5', 3082, 3'-CCCAGA-5', 3091, 3'-TCCACA-5', 3192, 3'-CTCACA-5', 3209, 3'-GCCAGA-5', 3221, 3'-TGCAGT-5', 3232, 3'-TGCAGT-5', 3281, 3'-CTCACT-5', 3317, 3'-TGCACT-5', 3343, 3'-CCCAGT-5', 3379, 3'-CCCACT-5', 3388, 3'-GGCACA-5', 3409, 3'-TGCAGT-5', 3461, 3'-GGCAGA-5', 3473, 3'-CTCACA-5', 3505, 3'-GCCACA-5', 3705, 3'-TCCAGA-5', 3806, 3'-GTCACA-5', 3822, 3'-TGCAGA-5', 3831, 3'-TCCAGA-5', 3891, 3'-CGCAGA-5', 3916, 3'-GTCACA-5', 3954, 3'-TGCAGT-5', 3962, 3'-GGCACT-5', 4006, 3'-TCCACT-5', 4013, 3'-CTCAGA-5', 4195, 3'-GTCAGT-5', 4271, 3'-CTCATT-5', 4309, 3'-TGCAGA-5', 4317, 3'-CCCAGA-5', 4330, 3'-CTCACT-5', 4338.
  3. positive strand in the negative direction is SuccessablesInr2+-.bas, looking for 5'-C/G/T-C/G/T-C-A-C/G/T-A/T-3', 59, 3'-GCCATA-5', 39, 3'-TGCATT-5', 152, 3'-GTCACT-5', 208, 3'-GGCACA-5', 266, 3'-GGCACA-5', 518, 3'-GGCACA-5', 960, 3'-GGCAGA-5', 1023, 3'-TGCAGT-5', 1032, 3'-TTCACT-5', 1056, 3'-GGCACA-5', 1116, 3'-CTCACA-5', 1126, 3'-GGCAGA-5', 1314, 3'-TGCAGT-5', 1323, 3'-TGCACT-5', 1347, 3'-TCCAGT-5', 1352, 3'-TCCATT-5', 1378, 3'-CCCAGA-5', 1411, 3'-TGCAGT-5', 1472, 3'-CTCACT-5', 1491, 3'-CCCAGA-5', 1518, 3'-TCCAGT-5', 1532, 3'-TGCACA-5', 1719, 3'-GGCAGA-5', 1967, 3'-TGCAGT-5', 1976, 3'-GCCACT-5', 1995, 3'-TGCACT-5', 2000, 3'-TGCAGT-5', 2083, 3'-GCCAGT-5', 2211, 3'-TGCAGT-5', 2402, 3'-TGCACT-5', 2426, 3'-TCCACT-5', 2632, 3'-GCCAGT-5', 2654, 3'-GGCACA-5', 2665, 3'-TGCAGT-5', 2737, 3'-GCCACT-5', 2756, 3'-GCCATT-5', 3284, 3'-TGCACT-5', 3289, 3'-TGCAGA-5', 3431, 3'-GGCATA-5', 3445, 3'-GGCATA-5', 3451, 3'-GGCAGT-5', 3478, 3'-GGCAGA-5', 3589, 3'-GGCAGT-5', 3600, 3'-GTCAGA-5', 3625, 3'-GGCACA-5', 3632, 3'-CTCAGA-5', 3644, 3'-GCCATT-5', 3686, 3'-TCCACA-5', 3692, 3'-CCCATA-5', 3856, 3'-CTCACA-5', 3965, 3'-GCCAGA-5', 4233, 3'-TGCAGT-5', 4317, 3'-TGCACT-5', 4340, 3'-GCCAGT-5', 4415, 3'-TCCACT-5', 4423, 3'-CCCAGA-5', 4448, 3'-TCCACT-5', 4459, 3'-CCCACT-5', 4485, 3'-TTCACA-5', 4531.
  4. positive strand in the positive direction is SuccessablesInr2++.bas, looking for 5'-C/G/T-C/G/T-C-A-C/G/T-A/T-3', 40, 3'-TCCAGT-5', 153, 3'-CGCACA-5', 1020, 3'-CCCAGA-5', 1711, 3'-CGCACT-5', 1720, 3'-CCCACA-5', 1803, 3'-CCCAGA-5', 1958, 3'-TCCACA-5', 1969, 3'-GTCAGT-5', 2100, 3'-TCCACT-5', 2128, 3'-TCCAGT-5', 2220, 3'-TCCAGA-5', 2258, 3'-TCCACT-5', 2375, 3'-CGCAGT-5', 2423, 3'-GTCACA-5', 2464, 3'-CCCAGA-5', 2489, 3'-TTCACT-5', 2511, 3'-CGCACT-5', 2555, 3'-GTCAGT-5', 2607, 3'-CTCAGT-5', 2613, 3'-TTCAGT-5', 2618, 3'-TCCATA-5', 2642, 3'-TCCAGA-5', 3019, 3'-CTCAGA-5', 3187, 3'-TGCAGA-5', 3256, 3'-CTCACA-5', 3592, 3'-GCCAGA-5', 3608, 3'-CTCACT-5', 3712, 3'-TCCATT-5', 3731, 3'-TCCAGA-5', 3771, 3'-CCCAGT-5', 3820, 3'-GTCACT-5', 3843, 3'-CTCACT-5', 3876, 3'-TTCAGA-5', 3922, 3'-TCCACT-5', 3934, 3'-GTCACA-5', 3964, 3'-CGCAGA-5', 4056, 3'-TCCAGT-5', 4269, 3'-CTCACT-5', 4350, 3'-CCCACT-5', 4399, 3'-CCCAGA-5', 4414.
  5. complement, negative strand, negative direction is SuccessablesInr2c--.bas, looking for 5'-A/C/G-A/C/G-G-T-A/C/G-A/T-3', 59, 3'-CGGTAT-5', 39, 3'-ACGTAA-5', 152, 3'-CAGTGA-5', 208, 3'-CCGTGT-5', 266, 3'-CCGTGT-5', 518, 3'-CCGTGT-5', 960, 3'-CCGTCT-5', 1023, 3'-ACGTCA-5', 1032, 3'-AAGTGA-5', 1056, 3'-CCGTGT-5', 1116, 3'-GAGTGT-5', 1126, 3'-CCGTCT-5', 1314, 3'-ACGTCA-5', 1323, 3'-ACGTGA-5', 1347, 3'-AGGTCA-5', 1352, 3'-AGGTAA-5', 1378, 3'-GGGTCT-5', 1411, 3'-ACGTCA-5', 1472, 3'-GAGTGA-5', 1491, 3'-GGGTCT-5', 1518, 3'-AGGTCA-5', 1532, 3'-ACGTGT-5', 1719, 3'-CCGTCT-5', 1967, 3'-ACGTCA-5', 1976, 3'-CGGTGA-5', 1995, 3'-ACGTGA-5', 2000, 3'-ACGTCA-5', 2083, 3'-CGGTCA-5', 2211, 3'-ACGTCA-5', 2402, 3'-ACGTGA-5', 2426, 3'-AGGTGA-5', 2632, 3'-CGGTCA-5', 2654, 3'-CCGTGT-5', 2665, 3'-ACGTCA-5', 2737, 3'-CGGTGA-5', 2756, 3'-CGGTAA-5', 3284, 3'-ACGTGA-5', 3289, 3'-ACGTCT-5', 3431, 3'-CCGTAT-5', 3445, 3'-CCGTAT-5', 3451, 3'-CCGTCA-5', 3478, 3'-CCGTCT-5', 3589, 3'-CCGTCA-5', 3600, 3'-CAGTCT-5', 3625, 3'-CCGTGT-5', 3632, 3'-GAGTCT-5', 3644, 3'-CGGTAA-5', 3686, 3'-AGGTGT-5', 3692, 3'-GGGTAT-5', 3856, 3'-GAGTGT-5', 3965, 3'-CGGTCT-5', 4233, 3'-ACGTCA-5', 4317, 3'-ACGTGA-5', 4340, 3'-CGGTCA-5', 4415, 3'-AGGTGA-5', 4423, 3'-GGGTCT-5', 4448, 3'-AGGTGA-5', 4459, 3'-GGGTGA-5', 4485, 3'-AAGTGT-5', 4531.
  6. complement, negative strand, positive direction is SuccessablesInr2c-+.bas, looking for 5'-A/C/G-A/C/G-G-T-A/C/G-A/T-3', 40 , 3'-AGGTCA-5', 153 , 3'-GCGTGT-5', 1020 , 3'-GGGTCT-5', 1711 , 3'-GCGTGA-5', 1720 , 3'-GGGTGT-5', 1803 , 3'-GGGTCT-5', 1958 , 3'-AGGTGT-5', 1969 , 3'-CAGTCA-5', 2100 , 3'-AGGTGA-5', 2128 , 3'-AGGTCA-5', 2220 , 3'-AGGTCT-5', 2258 , 3'-AGGTGA-5', 2375 , 3'-GCGTCA-5', 2423 , 3'-CAGTGT-5', 2464 , 3'-GGGTCT-5', 2489 , 3'-AAGTGA-5', 2511 , 3'-GCGTGA-5', 2555 , 3'-CAGTCA-5', 2607 , 3'-GAGTCA-5', 2613 , 3'-AAGTCA-5', 2618 , 3'-AGGTAT-5', 2642 , 3'-AGGTCT-5', 3019 , 3'-GAGTCT-5', 3187 , 3'-ACGTCT-5', 3256 , 3'-GAGTGT-5', 3592 , 3'-CGGTCT-5', 3608 , 3'-GAGTGA-5', 3712 , 3'-AGGTAA-5', 3731 , 3'-AGGTCT-5', 3771 , 3'-GGGTCA-5', 3820 , 3'-CAGTGA-5', 3843 , 3'-GAGTGA-5', 3876 , 3'-AAGTCT-5', 3922 , 3'-AGGTGA-5', 3934 , 3'-CAGTGT-5', 3964 , 3'-GCGTCT-5', 4056 , 3'-AGGTCA-5', 4269 , 3'-GAGTGA-5', 4350 , 3'-GGGTGA-5', 4399 , 3'-GGGTCT-5', 4414.
  7. complement, positive strand, negative direction is SuccessablesInr2c+-.bas, looking for 5'-A/C/G-A/C/G-G-T-A/C/G-A/T-3', 44, 3'-AGGTAT-5', 179, 3'-GGGTCA-5', 206, 3'-GAGTCT-5', 278, 3'-CAGTGA-5', 299, 3'-AAGTGT-5', 322, 3'-AGGTCA-5', 439, 3'-ACGTAA-5', 533, 3'-AGGTCA-5', 568, 3'-AGGTCA-5', 576, 3'-AGGTCA-5', 712, 3'-CCGTCT-5', 754, 3'-CGGTGA-5', 868, 3'-CAGTGA-5', 1034, 3'-GGGTGA-5', 1049, 3'-GAGTGA-5', 1077, 3'-CCGTGT-5', 1220, 3'-CAGTGA-5', 1325, 3'-CAGTCT-5', 1354, 3'-GAGTCT-5', 1444, 3'-CCGTCA-5', 1511, 3'-ACGTCT-5', 1774, 3'-CAGTGA-5', 1978, 3'-CAGTGT-5', 2085, 3'-AGGTCA-5', 2248, 3'-CAGTGA-5', 2404, 3'-GAGTGA-5', 2447, 3'-AGGTCA-5', 2585, 3'-CAGTGT-5', 2603, 3'-CAGTGT-5', 2656, 3'-CAGTGA-5', 2739, 3'-AAGTGT-5', 2860, 3'-AGGTGA-5', 3144, 3'-GGGTGT-5', 3184, 3'-AAGTGA-5', 3410, 3'-CAGTAA-5', 3480, 3'-AGGTGA-5', 3825, 3'-GAGTAT-5', 3829, 3'-GAGTAA-5', 3891, 3'-AAGTGT-5', 3939, 3'-CAGTGA-5', 4200, 3'-AGGTCA-5', 4307, 3'-CAGTGA-5', 4319, 3'-GGGTGA-5', 4353, 3'-CAGTGT-5', 4359.
  8. complement, positive strand, positive direction is SuccessablesInr2c++.bas, looking for 5'-A/C/G-A/C/G-G-T-A/C/G-A/T-3', 87, 3'-AGGTCT-5', 15, 3'-CCGTAA-5', 22, 3'-CAGTGT-5', 155, 3'-GGGTCT-5', 204, 3'-CGGTGT-5', 343, 3'-GCGTCT-5', 396, 3'-ACGTCT-5', 438, 3'-GGGTCT-5', 468, 3'-ACGTGT-5', 548, 3'-AGGTGT-5', 632, 3'-GCGTGA-5', 686, 3'-GCGTGT-5', 800, 3'-CGGTCT-5', 835, 3'-CGGTGT-5', 884, 3'-CGGTCT-5', 935, 3'-CGGTGT-5', 984, 3'-GCGTGT-5', 1052, 3'-GCGTGT-5', 1136, 3'-ACGTGT-5', 1220, 3'-GGGTCA-5', 1250, 3'-GCGTCT-5', 1316, 3'-ACGTGA-5', 1372, 3'-GCGTCT-5', 1416, 3'-ACGTGA-5', 1472, 3'-GGGTGA-5', 1502, 3'-GCGTGT-5', 1556, 3'-CCGTAA-5', 1702, 3'-GGGTCT-5', 1742, 3'-ACGTGT-5', 1822, 3'-AGGTGA-5', 1912, 3'-ACGTCT-5', 1937, 3'-CCGTGA-5', 1996, 3'-GGGTCA-5', 2024, 3'-AGGTGT-5', 2029, 3'-GAGTCA-5', 2060, 3'-ACGTCA-5', 2065, 3'-CGGTGA-5', 2072, 3'-AAGTCA-5', 2098, 3'-GAGTAT-5', 2176, 3'-ACGTAA-5', 2206, 3'-CAGTCT-5', 2222, 3'-GAGTCT-5', 2239, 3'-AAGTGA-5', 2304, 3'-ACGTCA-5', 2328, 3'-CAGTGA-5', 2425, 3'-CAGTCT-5', 2609, 3'-GAGTCT-5', 2699, 3'-ACGTCT-5', 2721, 3'-GAGTCT-5', 2729, 3'-ACGTCT-5', 2859, 3'-GAGTCT-5', 2866, 3'-GAGTAA-5', 2902, 3'-CAGTGA-5', 2929, 3'-AAGTCA-5', 2936, 3'-ACGTGT-5', 2962, 3'-ACGTAA-5', 3072, 3'-GGGTCA-5', 3082, 3'-GGGTCT-5', 3091, 3'-AGGTGT-5', 3192, 3'-GAGTGT-5', 3209, 3'-CGGTCT-5', 3221, 3'-ACGTCA-5', 3232, 3'-ACGTCA-5', 3281, 3'-GAGTGA-5', 3317, 3'-ACGTGA-5', 3343, 3'-GGGTCA-5', 3379, 3'-GGGTGA-5', 3388, 3'-CCGTGT-5', 3409, 3'-ACGTCA-5', 3461, 3'-CCGTCT-5', 3473, 3'-GAGTGT-5', 3505, 3'-CGGTGT-5', 3705, 3'-AGGTCT-5', 3806, 3'-CAGTGT-5', 3822, 3'-ACGTCT-5', 3831, 3'-AGGTCT-5', 3891, 3'-GCGTCT-5', 3916, 3'-CAGTGT-5', 3954, 3'-ACGTCA-5', 3962, 3'-CCGTGA-5', 4006, 3'-AGGTGA-5', 4013, 3'-GAGTCT-5', 4195, 3'-CAGTCA-5', 4271, 3'-GAGTAA-5', 4309, 3'-ACGTCT-5', 4317, 3'-GGGTCT-5', 4330, 3'-GAGTGA-5', 4338.
  9. inverse complement, negative strand, negative direction is SuccessablesInr2ci--.bas, looking for 5'-A/T-A/C/G-T-G-A/C/G-A/C/G-3', 46, 3'-TCTGAC-5', 16, 3'-TGTGGA-5', 62, 3'-TGTGCA-5', 342, 3'-TGTGCA-5', 531, 3'-AGTGCG-5', 663, 3'-TGTGGG-5', 749, 3'-TCTGAG-5', 916, 3'-TGTGCG-5', 963, 3'-ACTGAA-5', 1052, 3'-AGTGAG-5', 1057, 3'-TCTGAG-5', 1082, 3'-TGTGGA-5', 1129, 3'-AGTGGA-5', 1171, 3'-AATGAA-5', 1298, 3'-TCTGAG-5', 1403, 3'-AGTGAC-5', 1492, 3'-TGTGAA-5', 1544, 3'-TCTGAA-5', 1617, 3'-AGTGCA-5', 1772, 3'-TCTGAC-5', 1934, 3'-AGTGCG-5', 1991, 3'-TCTGAG-5', 2026, 3'-TATGAC-5', 2162, 3'-ACTGGC-5', 2190, 3'-AGTGCG-5', 2207, 3'-TGTGAA-5', 2551, 3'-AGTGAA-5', 2578, 3'-ACTGAG-5', 2787, 3'-TATGGA-5', 2994, 3'-AGTGGG-5', 3057, 3'-AGTGAA-5', 3101, 3'-AGTGAA-5', 3240, 3'-AGTGCG-5', 3280, 3'-TCTGAC-5', 3425, 3'-TATGAC-5', 3541, 3'-TATGCG-5', 3547, 3'-TATGGA-5', 3859, 3'-TGTGGA-5', 3968, 3'-TGTGAA-5', 3983, 3'-AGTGAA-5', 4010, 3'-TCTGAG-5', 4054, 3'-AGTGAA-5', 4161, 3'-TCTGGG-5', 4205, 3'-TCTGCA-5', 4236, 3'-TGTGAC-5', 4336, 3'-TCTGGG-5', 4366.
  10. inverse complement, negative strand, positive direction is SuccessablesInr2ci-+.bas, looking for 5'-A/T-A/C/G-T-G-A/C/G-A/C/G-3', 94, 3'-AGTGGG-5', 54, 3'-TCTGCA-5', 224, 3'-TGTGAA-5', 231, 3'-ACTGCC-5', 238, 3'-TCTGAG-5', 256, 3'-TCTGGA-5', 271, 3'-ACTGGG-5', 348, 3'-AGTGCG-5', 497, 3'-AGTGCG-5', 581, 3'-AGTGCG-5', 665, 3'-ACTGCG-5', 749, 3'-TGTGGC-5', 819, 3'-ACTGCC-5', 901, 3'-TGTGGC-5', 919, 3'-ACTGCG-5', 1001, 3'-TGTGGC-5', 1023, 3'-AGTGCG-5', 1085, 3'-AGTGCG-5', 1160, 3'-AGTGCG-5', 1169, 3'-AGTGCG-5', 1253, 3'-ACTGAG-5', 1287, 3'-AATGCG-5', 1321, 3'-TCTGGC-5', 1377, 3'-TCTGCG-5', 1396, 3'-AATGCG-5', 1421, 3'-TCTGGC-5', 1477, 3'-TCTGCG-5', 1496, 3'-ACTGCA-5', 1505, 3'-AGTGCG-5', 1589, 3'-AGTGCG-5', 1725, 3'-AGTGCA-5', 1786, 3'-TGTGGA-5', 1806, 3'-TCTGGG-5', 1865, 3'-ACTGGG-5', 1954, 3'-TGTGGC-5', 1972, 3'-TCTGGC-5', 1993, 3'-AGTGCA-5', 2063, 3'-AGTGGC-5', 2068, 3'-TATGGC-5', 2160, 3'-ACTGCA-5', 2204, 3'-AGTGCA-5', 2326, 3'-TGTGCA-5', 2681, 3'-AGTGGA-5', 2712, 3'-ACTGCC-5', 2823, 3'-AATGAC-5', 2842, 3'-TCTGCA-5', 2857, 3'-TCTGGC-5', 2884, 3'-AATGGG-5', 2911, 3'-TCTGAC-5', 2944, 3'-TCTGAG-5', 2951, 3'-TGTGCA-5', 2960, 3'-TCTGGC-5', 2984, 3'-TCTGAG-5', 3007, 3'-AGTGCC-5', 3011, 3'-TATGAC-5', 3028, 3'-TCTGCA-5', 3061, 3'-AATGCA-5', 3070, 3'-ACTGGC-5', 3118, 3'-TCTGAG-5', 3124, 3'-TATGGA-5', 3163, 3'-AATGGG-5', 3169, 3'-AGTGCC-5', 3235, 3'-TATGAG-5', 3261, 3'-TCTGCA-5', 3268, 3'-TCTGCA-5', 3279, 3'-ACTGCA-5', 3320, 3'-ACTGGC-5', 3346, 3'-TCTGCC-5', 3359, 3'-TCTGGC-5', 3406, 3'-AATGCC-5', 3431, 3'-TGTGGA-5', 3437, 3'-AATGAA-5', 3442, 3'-AATGAG-5', 3446, 3'-AGTGGG-5', 3450, 3'-AGTGCA-5', 3464, 3'-AATGAC-5', 3568, 3'-TGTGAA-5', 3595, 3'-AGTGAC-5', 3713, 3'-ACTGAG-5', 3736, 3'-AATGAC-5', 3783, 3'-AATGAA-5', 3836, 3'-AGTGAG-5', 3877, 3'-TGTGAG-5', 3904, 3'-TCTGAA-5', 3925, 3'-TGTGCA-5', 3960, 3'-TGTGAC-5', 3972, 3'-AGTGGG-5', 4041, 3'-ACTGAA-5', 4090, 3'-AATGAG-5', 4095, 3'-AGTGCC-5', 4274, 3'-ACTGCA-5', 4341, 3'-AGTGAG-5', 4351, 3'-TGTGGG-5', 4395, 3'-TCTGGG-5', 4417.
  11. inverse complement, positive strand, negative direction is SuccessablesInr2ci+-.bas, looking for 5'-A/T-A/C/G-T-G-A/C/G-A/C/G-3', 54, 3'-ACTGAA-5', 18, 3'-TATGGG-5', 78, 3'-ACTGAA-5', 131, 3'-TATGAG-5', 275, 3'-AGTGAG-5', 300, 3'-ACTGAC-5', 308, 3'-AGTGCG-5', 447, 3'-AGTGAA-5', 472, 3'-AGTGGA-5', 523, 3'-AGTGAG-5', 1035, 3'-AGTGAG-5', 1078, 3'-AGTGGC-5', 1121, 3'-AGTGAG-5', 1326, 3'-TCTGGG-5', 1357, 3'-AGTGCA-5', 1470, 3'-ACTGCA-5', 1494, 3'-AGTGCA-5', 1535, 3'-AATGAA-5', 1581, 3'-AATGCC-5', 1634, 3'-TATGGC-5', 1743, 3'-ACTGAG-5', 1936, 3'-AATGGC-5', 1949, 3'-AGTGAG-5', 1979, 3'-ACTGCA-5', 1998, 3'-TGTGGC-5', 2066, 3'-AATGAC-5', 2188, 3'-AGTGAG-5', 2405, 3'-ACTGCA-5', 2424, 3'-AGTGAG-5', 2448, 3'-TGTGGC-5', 2606, 3'-AGTGAG-5', 2740, 3'-ACTGCA-5', 2759, 3'-TGTGCA-5', 2863, 3'-AATGGC-5', 3005, 3'-TGTGAG-5', 3268, 3'-AGTGAC-5', 3411, 3'-TGTGCA-5', 3429, 3'-TGTGCC-5', 3561, 3'-AATGGG-5', 3660, 3'-TGTGGG-5', 3712, 3'-ACTGGG-5', 3750, 3'-AATGCA-5', 3771, 3'-TCTGGA-5', 3836, 3'-ACTGCC-5', 3852, 3'-TGTGGC-5', 3960, 3'-AGTGAG-5', 4050, 3'-TGTGAG-5', 4093, 3'-AGTGAG-5', 4201, 3'-ACTGCA-5', 4315, 3'-AGTGAG-5', 4320, 3'-ACTGCA-5', 4330, 3'-ACTGCA-5', 4338, 3'-TGTGAG-5', 4362, 3'-AATGAG-5', 4556.
  12. inverse complement, positive strand, positive direction is SuccessablesInr2ci++.bas, looking for 5'-A/T-A/C/G-T-G-A/C/G-A/C/G-3', 47, 3'-TCTGAC-5', 236, 3'-TGTGAC-5', 346, 3'-TCTGCC-5', 399, 3'-TCTGGC-5', 441, 3'-AATGAA-5', 525, 3'-TGTGCA-5', 569, 3'-TGTGCG-5', 803, 3'-TGTGCG-5', 887, 3'-TGTGCG-5', 987, 3'-TGTGAC-5', 1139, 3'-TGTGCC-5', 1223, 3'-TGTGCC-5', 1559, 3'-ACTGGG-5', 1663, 3'-TGTGCC-5', 1698, 3'-TCTGAA-5', 1745, 3'-AATGGG-5', 1889, 3'-ACTGGC-5', 2214, 3'-AGTGGA-5', 2248, 3'-AGTGAG-5', 2305, 3'-AGTGGG-5', 2313, 3'-AGTGAC-5', 2341, 3'-TCTGAA-5', 2417, 3'-TGTGGA-5', 2431, 3'-TATGAA-5', 2740, 3'-TCTGGA-5', 2862, 3'-AGTGAC-5', 2930, 3'-ACTGAA-5', 2946, 3'-TGTGGG-5', 2965, 3'-ACTGAA-5', 3030, 3'-AGTGCA-5', 3254, 3'-AGTGAC-5', 3318, 3'-TGTGAG-5', 3508, 3'-TGTGGG-5', 3533, 3'-TCTGGA-5', 3551, 3'-AGTGGG-5', 3613, 3'-AGTGCC-5', 3748, 3'-ACTGGA-5', 3785, 3'-ACTGGA-5', 4019, 3'-AGTGAC-5', 4088, 3'-AGTGAG-5', 4127, 3'-AGTGGG-5', 4204, 3'-ACTGGG-5', 4217, 3'-TGTGCC-5', 4259, 3'-TCTGCG-5', 4320, 3'-AGTGGG-5', 4326, 3'-TGTGAG-5', 4335, 3'-AGTGAC-5', 4339.
  13. inverse, negative strand, negative direction, is SuccessablesInr2i--.bas, looking for 5'-A/T-C/G/T-A-C-C/G/T-C/G/T-3', 54, 3'-TGACTT-5', 18, 3'-ATACCC-5', 78, 3'-TGACTT-5', 131, 3'-ATACTC-5', 275, 3'-TCACTC-5', 300, 3'-TGACTG-5', 308, 3'-TCACGC-5', 447, 3'-TCACTT-5', 472, 3'-TCACCT-5', 523, 3'-TCACTC-5', 1035, 3'-TCACTC-5', 1078, 3'-TCACCG-5', 1121, 3'-TCACTC-5', 1326, 3'-AGACCC-5', 1357, 3'-TCACGT-5', 1470, 3'-TGACGT-5', 1494, 3'-TCACGT-5', 1535, 3'-TTACTT-5', 1581, 3'-TTACGG-5', 1634, 3'-ATACCG-5', 1743, 3'-TGACTC-5', 1936, 3'-TTACCG-5', 1949, 3'-TCACTC-5', 1979, 3'-TGACGT-5', 1998, 3'-ACACCG-5', 2066, 3'-TTACTG-5', 2188, 3'-TCACTC-5', 2405, 3'-TGACGT-5', 2424, 3'-TCACTC-5', 2448, 3'-ACACCG-5', 2606, 3'-TCACTC-5', 2740, 3'-TGACGT-5', 2759, 3'-ACACGT-5', 2863, 3'-TTACCG-5', 3005, 3'-ACACTC-5', 3268, 3'-TCACTG-5', 3411, 3'-ACACGT-5', 3429, 3'-ACACGG-5', 3561, 3'-TTACCC-5', 3660, 3'-ACACCC-5', 3712, 3'-TGACCC-5', 3750, 3'-TTACGT-5', 3771, 3'-AGACCT-5', 3836, 3'-TGACGG-5', 3852, 3'-ACACCG-5', 3960, 3'-TCACTC-5', 4050, 3'-ACACTC-5', 4093, 3'-TCACTC-5', 4201, 3'-TGACGT-5', 4315, 3'-TCACTC-5', 4320, 3'-TGACGT-5', 4330, 3'-TGACGT-5', 4338, 3'-ACACTC-5', 4362, 3'-TTACTC-5', 4556.
  14. inverse, negative strand, positive direction, is SuccessablesInr2i-+.bas, looking for 5'-A/T-C/G/T-A-C-C/G/T-C/G/T-3', 47, 3'-AGACTG-5', 236, 3'-ACACTG-5', 346, 3'-AGACGG-5', 399, 3'-AGACCG-5', 441, 3'-TTACTT-5', 525, 3'-ACACGT-5', 569, 3'-ACACGC-5', 803, 3'-ACACGC-5', 887, 3'-ACACGC-5', 987, 3'-ACACTG-5', 1139, 3'-ACACGG-5', 1223, 3'-ACACGG-5', 1559, 3'-TGACCC-5', 1663, 3'-ACACGG-5', 1698, 3'-AGACTT-5', 1745, 3'-TTACCC-5', 1889, 3'-TGACCG-5', 2214, 3'-TCACCT-5', 2248, 3'-TCACTC-5', 2305, 3'-TCACCC-5', 2313, 3'-TCACTG-5', 2341, 3'-AGACTT-5', 2417, 3'-ACACCT-5', 2431, 3'-ATACTT-5', 2740, 3'-AGACCT-5', 2862, 3'-TCACTG-5', 2930, 3'-TGACTT-5', 2946, 3'-ACACCC-5', 2965, 3'-TGACTT-5', 3030, 3'-TCACGT-5', 3254, 3'-TCACTG-5', 3318, 3'-ACACTC-5', 3508, 3'-ACACCC-5', 3533, 3'-AGACCT-5', 3551, 3'-TCACCC-5', 3613, 3'-TCACGG-5', 3748, 3'-TGACCT-5', 3785, 3'-TGACCT-5', 4019, 3'-TCACTG-5', 4088, 3'-TCACTC-5', 4127, 3'-TCACCC-5', 4204, 3'-TGACCC-5', 4217, 3'-ACACGG-5', 4259, 3'-AGACGC-5', 4320, 3'-TCACCC-5', 4326, 3'-ACACTC-5', 4335, 3'-TCACTG-5', 4339.
  15. inverse, positive strand, negative direction, is SuccessablesInr2i+-.bas, looking for 5'-A/T-C/G/T-A-C-C/G/T-C/G/T-3', 46, 3'-AGACTG-5', 16, 3'-ACACCT-5', 62, 3'-ACACGT-5', 342, 3'-ACACGT-5', 531, 3'-TCACGC-5', 663, 3'-ACACCC-5', 749, 3'-AGACTC-5', 916, 3'-ACACGC-5', 963, 3'-TGACTT-5', 1052, 3'-TCACTC-5', 1057, 3'-AGACTC-5', 1082, 3'-ACACCT-5', 1129, 3'-TCACCT-5', 1171, 3'-TTACTT-5', 1298, 3'-AGACTC-5', 1403, 3'-TCACTG-5', 1492, 3'-ACACTT-5', 1544, 3'-AGACTT-5', 1617, 3'-TCACGT-5', 1772, 3'-AGACTG-5', 1934, 3'-TCACGC-5', 1991, 3'-AGACTC-5', 2026, 3'-ATACTG-5', 2162, 3'-TGACCG-5', 2190, 3'-TCACGC-5', 2207, 3'-ACACTT-5', 2551, 3'-TCACTT-5', 2578, 3'-TGACTC-5', 2787, 3'-ATACCT-5', 2994, 3'-TCACCC-5', 3057, 3'-TCACTT-5', 3101, 3'-TCACTT-5', 3240, 3'-TCACGC-5', 3280, 3'-AGACTG-5', 3425, 3'-ATACTG-5', 3541, 3'-ATACGC-5', 3547, 3'-ATACCT-5', 3859, 3'-ACACCT-5', 3968, 3'-ACACTT-5', 3983, 3'-TCACTT-5', 4010, 3'-AGACTC-5', 4054, 3'-TCACTT-5', 4161, 3'-AGACCC-5', 4205, 3'-AGACGT-5', 4236, 3'-ACACTG-5', 4336, 3'-AGACCC-5', 4366.
  16. inverse, positive strand, positive direction, is SuccessablesInr2i++.bas, looking for 5'-A/T-C/G/T-A-C-C/G/T-C/G/T-3', 94, 3'-TCACCC-5', 54, 3'-AGACGT-5', 224, 3'-ACACTT-5', 231, 3'-TGACGG-5', 238, 3'-AGACTC-5', 256, 3'-AGACCT-5', 271, 3'-TGACCC-5', 348, 3'-TCACGC-5', 497, 3'-TCACGC-5', 581, 3'-TCACGC-5', 665, 3'-TGACGC-5', 749, 3'-ACACCG-5', 819, 3'-TGACGG-5', 901, 3'-ACACCG-5', 919, 3'-TGACGC-5', 1001, 3'-ACACCG-5', 1023, 3'-TCACGC-5', 1085, 3'-TCACGC-5', 1160, 3'-TCACGC-5', 1169, 3'-TCACGC-5', 1253, 3'-TGACTC-5', 1287, 3'-TTACGC-5', 1321, 3'-AGACCG-5', 1377, 3'-AGACGC-5', 1396, 3'-TTACGC-5', 1421, 3'-AGACCG-5', 1477, 3'-AGACGC-5', 1496, 3'-TGACGT-5', 1505, 3'-TCACGC-5', 1589, 3'-TCACGC-5', 1725, 3'-TCACGT-5', 1786, 3'-ACACCT-5', 1806, 3'-AGACCC-5', 1865, 3'-TGACCC-5', 1954, 3'-ACACCG-5', 1972, 3'-AGACCG-5', 1993, 3'-TCACGT-5', 2063, 3'-TCACCG-5', 2068, 3'-ATACCG-5', 2160, 3'-TGACGT-5', 2204, 3'-TCACGT-5', 2326, 3'-ACACGT-5', 2681, 3'-TCACCT-5', 2712, 3'-TGACGG-5', 2823, 3'-TTACTG-5', 2842, 3'-AGACGT-5', 2857, 3'-AGACCG-5', 2884, 3'-TTACCC-5', 2911, 3'-AGACTG-5', 2944, 3'-AGACTC-5', 2951, 3'-ACACGT-5', 2960, 3'-AGACCG-5', 2984, 3'-AGACTC-5', 3007, 3'-TCACGG-5', 3011, 3'-ATACTG-5', 3028, 3'-AGACGT-5', 3061, 3'-TTACGT-5', 3070, 3'-TGACCG-5', 3118, 3'-AGACTC-5', 3124, 3'-ATACCT-5', 3163, 3'-TTACCC-5', 3169, 3'-TCACGG-5', 3235, 3'-ATACTC-5', 3261, 3'-AGACGT-5', 3268, 3'-AGACGT-5', 3279, 3'-TGACGT-5', 3320, 3'-TGACCG-5', 3346, 3'-AGACGG-5', 3359, 3'-AGACCG-5', 3406, 3'-TTACGG-5', 3431, 3'-ACACCT-5', 3437, 3'-TTACTT-5', 3442, 3'-TTACTC-5', 3446, 3'-TCACCC-5', 3450, 3'-TCACGT-5', 3464, 3'-TTACTG-5', 3568, 3'-ACACTT-5', 3595, 3'-TCACTG-5', 3713, 3'-TGACTC-5', 3736, 3'-TTACTG-5', 3783, 3'-TTACTT-5', 3836, 3'-TCACTC-5', 3877, 3'-ACACTC-5', 3904, 3'-AGACTT-5', 3925, 3'-ACACGT-5', 3960, 3'-ACACTG-5', 3972, 3'-TCACCC-5', 4041, 3'-TGACTT-5', 4090, 3'-TTACTC-5', 4095, 3'-TCACGG-5', 4274, 3'-TGACGT-5', 4341, 3'-TCACTC-5', 4351, 3'-ACACCC-5', 4395, 3'-AGACCC-5', 4417.
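
The Basic programs themselves are not reproduced here. As a rough illustration only (not the author's code), the following Python sketch shows one way the degenerate motif 5'-C/G/T-C/G/T-C-A-C/G/T-A/T-3' (BBCABW in IUPAC notation) and its complement, inverse, and inverse complement could be derived and searched for. The function names, the use of regular expressions, and the 1-based position convention are assumptions made for the example; how the Basic programs count positions is not stated above.

```python
import re

# IUPAC nucleotide codes needed for BBCABW and its transforms.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "W": "AT", "B": "CGT", "V": "ACG", "N": "ACGT",
}

# Complement of each code: W (A/T) pairs with W, B (C/G/T) with V (A/C/G).
COMPLEMENT = {
    "A": "T", "T": "A", "C": "G", "G": "C",
    "W": "W", "B": "V", "V": "B", "N": "N",
}


def complement(motif):
    """Complement each position without changing the order."""
    return "".join(COMPLEMENT[code] for code in motif)


def inverse(motif):
    """Reverse the order of the positions."""
    return motif[::-1]


def to_regex(motif):
    """Turn an IUPAC motif into a regular-expression character-class string."""
    return "".join("[" + IUPAC[code] + "]" for code in motif)


def find_motif(sequence, motif):
    """Return (1-based position, matched text) for every hit, overlaps included."""
    pattern = re.compile("(?=(" + to_regex(motif) + "))")
    return [(m.start() + 1, m.group(1)) for m in pattern.finditer(sequence)]


if __name__ == "__main__":
    motif = "BBCABW"
    print(complement(motif))           # VVGTVW
    print(inverse(motif))              # WBACBB
    print(complement(inverse(motif)))  # WVTGVV
    print(find_motif("GGTCCAGTAA", motif))
```

The three derived patterns correspond to the "complement", "inverse", and "inverse complement" search strings listed above; the lookahead in `find_motif` is used so that overlapping hits are not skipped.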

BBCABW UTRs

BBCABW core promoters

BBCABW proximal promoters

BBCABW distal promoters

Inr-like, TCTs samplings

Searching for TTCTCT with the browser's find function ("⌘F") yields three occurrences between ZSCAN22 and A1BG and three between ZNF497 and A1BG, which the computer programs can also locate.

For the Basic programs testing the consensus sequence TTCTCT (starting with SuccessablesTCT.bas), written to compare nucleotide sequences with the sequences on either the template strand (-) or the coding strand (+) of the DNA, in the negative direction (-) or the positive direction (+), the searches, the sequences sought, and the occurrences found are:

  1. negative strand, negative direction, looking for TTCTCT, 3, TTCTCT at 3380, TTCTCT at 2826, TTCTCT at 2809.
  2. negative strand, positive direction, looking for TTCTCT, 4, TTCTCT at 4386, TTCTCT at 1990, TTCTCT at 139, TTCTCT at 119.
  3. positive strand, negative direction, looking for TTCTCT, 1, TTCTCT at 622.
  4. positive strand, positive direction, looking for TTCTCT, 0.
  5. complement, negative strand, negative direction, looking for AAGAGA, 1, AAGAGA at 622.
  6. complement, negative strand, positive direction, looking for AAGAGA, 0.
  7. complement, positive strand, negative direction, looking for AAGAGA, 3, AAGAGA at 3380, AAGAGA at 2826, AAGAGA at 2809.
  8. complement, positive strand, positive direction, looking for AAGAGA, 4, AAGAGA at 4386, AAGAGA at 1990, AAGAGA at 139, AAGAGA at 119.
  9. inverse complement, negative strand, negative direction, looking for AGAGAA, 1, AGAGAA at 4527.
  10. inverse complement, negative strand, positive direction, looking for AGAGAA, 0.
  11. inverse complement, positive strand, negative direction, looking for AGAGAA, 3, AGAGAA at 3406, AGAGAA at 2827, AGAGAA at 2810.
  12. inverse complement, positive strand, positive direction, looking for AGAGAA, 2, AGAGAA at 4387, AGAGAA at 3056.
  13. inverse, negative strand, negative direction, looking for TCTCTT, 3, TCTCTT at 3406, TCTCTT at 2827, TCTCTT at 2810.
  14. inverse, negative strand, positive direction, looking for TCTCTT, 2, TCTCTT at 4387, TCTCTT at 3056.
  15. inverse, positive strand, negative direction, looking for TCTCTT, 1, TCTCTT at 4527.
  16. inverse, positive strand, positive direction, looking for TCTCTT, 0.
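
The four search strings in this list (TTCTCT, AAGAGA, AGAGAA, TCTCTT) follow from the consensus by complementing and/or reversing it. A minimal Python sketch of that derivation and of an exact-match search is given below. It is not the SuccessablesTCT.bas code; the function names, the toy sequence, and the 1-based positions are assumptions for illustration.

```python
def variants(motif):
    """Return the direct, complement, inverse and inverse-complement forms."""
    comp = motif.translate(str.maketrans("ACGT", "TGCA"))
    return {
        "direct": motif,
        "complement": comp,
        "inverse": motif[::-1],
        "inverse complement": comp[::-1],
    }


def positions(sequence, word):
    """Return the 1-based start of every exact occurrence, overlaps included."""
    hits = []
    start = sequence.find(word)
    while start != -1:
        hits.append(start + 1)
        start = sequence.find(word, start + 1)
    return hits


if __name__ == "__main__":
    toy = "AATTCTCTTCTCTT"  # hypothetical sequence, not taken from A1BG
    for label, word in variants("TTCTCT").items():
        print(label, word, positions(toy, word))
```

Run on each strand and in each direction, a search of this kind would produce one line per combination, in the same form as the sixteen entries above.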

TCT core promoters

Negative strand, negative direction: AGAGAA at 4527, and complement.

Negative strand, positive direction: TTCTCT at 4386, and complement.

Positive strand, positive direction: AGAGAA at 4387, and complement.

TCT distal promoters

Negative strand, negative direction: TTCTCT at 3380, TTCTCT at 2826, TTCTCT at 2809, and complements.

Positive strand, negative direction: AGAGAA at 3406, AGAGAA at 2827, AGAGAA at 2810, TTCTCT at 622, and complements.

Negative strand, positive direction: TTCTCT at 1990, TTCTCT at 139, TTCTCT at 119, and complements.

Positive strand, positive direction: AGAGAA at 3056, and complement.

Acknowledgements

The content on this page was first contributed by: Henry A. Hoff.

Initial content for this page in some instances came from Wikiversity.

See also

References

  1. Smale, Stephen T.; Baltimore, David (1989-04-07). "The "initiator" as a transcription control element". Cell. 57 (1): 103–113. doi:10.1016/0092-8674(89)90176-1. ISSN 0092-8674. PMID 2467742.
  2. Gershenzon, Naum I.; Ioshikhes, Ilya P. (2005-04-15). "Synergy of human Pol II core promoter elements revealed by statistical sequence analysis". Bioinformatics. 21 (8): 1295–1300. doi:10.1093/bioinformatics/bti172. ISSN 1367-4803.
  3. Lim, Chin Yan; Santoso, Buyung; Boulay, Thomas; Dong, Emily; Ohler, Uwe; Kadonaga, James T. (2004-07-01). "The MTE, a new core promoter element for transcription by RNA polymerase II". Genes & Development. 18 (13): 1606–1617. doi:10.1101/gad.1193404. ISSN 0890-9369. PMC 443522. PMID 15231738.
  4. Kaufmann, J.; Smale, S. T. (1994-04-01). "Direct recognition of initiator elements by a component of the transcription factor IID complex". Genes & Development. 8 (7): 821–829. doi:10.1101/gad.8.7.821. ISSN 0890-9369. PMID 7926770.
  5. O'Shea-Greenfield, A.; Smale, S. T. (1992-01-15). "Roles of TATA and initiator elements in determining the start site location and direction of RNA polymerase II transcription". The Journal of Biological Chemistry. 267 (2): 1391–1402. ISSN 0021-9258. PMID 1730658.
  6. Yang, Chuhu; Bolotin, Eugene; Jiang, Tao; Sladek, Frances M.; Martinez, Ernest (2007-03-01). "Prevalence of the Initiator over the TATA box in human and yeast genes and identification of DNA motifs enriched in human TATA-less core promoters". Gene. 389 (1): 52–65. doi:10.1016/j.gene.2006.09.029. ISSN 0378-1119. PMC 1955227. PMID 17123746.
  7. Ngoc, Long Vo; Cassidy, California Jack; Huang, Cassidy Yunjing; Duttke, Sascha H. C.; Kadonaga, James T. (2017-01-20). "The human initiator is a distinct and abundant element that is precisely positioned in focused core promoters". Genes & Development. doi:10.1101/gad.293837.116. ISSN 0890-9369. PMC 5287114. PMID 28108474.
  8. Javahery, R; Khachi, A; Lo, K; Zenzie-Gregory, B; Smale, S T (1994-01-01). "DNA sequence requirements for transcriptional initiator activity in mammalian cells". Molecular and Cellular Biology. 14 (1): 116–127. doi:10.1128/mcb.14.1.116. ISSN 0270-7306. PMC 358362. PMID 8264580.
  9. Takuya Matsumoto, Saemi Kitajima, Chisato Yamamoto, Mitsuru Aoyagi, Yoshiharu Mitoma, Hiroyuki Harada and Yuji Nagashima (9 August 2020). "Cloning and tissue distribution of the ATP-binding cassette subfamily G member 2 gene in the marine pufferfish Takifugu rubripes" (PDF). Fisheries Science. 86: 873–887. doi:10.1007/s12562-020-01451-z. Retrieved 27 September 2020.
  10. Gillian E. Chalkley and C. Peter Verrijzer (September 1, 1999). "DNA binding site selection by RNA polymerase II TAFs: a TAFII250-TAFII150 complex recognizes the Initiator" (PDF). The EMBO Journal. 18 (17): 4835–45. PMID 10469661. Retrieved 2012-04-26.
  11. J. Carcamo, L. Buckbinder, D. Reinberg (1991). Proceedings of the National Academy of Sciences USA. 88: 8052–6.
  12. L. Weis and D. Reinberg (1997). "Accurate positioning of RNA polymerase II on a natural TATA-less promoter is independent of TATA-binding protein associated factors and initiator-binding proteins". Mol. Cell. Biol. 17: 2973–84.
  13. RefSeq (May 2009). "BRCA1 BRCA1, DNA repair associated [ Homo sapiens (human) ]". 8600 Rockville Pike, Bethesda MD, 20894 USA: National Center for Biotechnology Information, U.S. National Library of Medicine. Retrieved 22 December 2018.
  14. RefSeq (February 2010). "BRCA1 BRCA1, DNA repair associated [ Homo sapiens (human) ]". 8600 Rockville Pike, Bethesda MD, 20894 USA: National Center for Biotechnology Information, U.S. National Library of Medicine. Retrieved 22 December 2018.
  15. Schlegel BP, Starita LM, Parvin JD (February 2003). "Overexpression of a protein fragment of RNA helicase A causes inhibition of endogenous BRCA1 function and defects in ploidy and cytokinesis in mammary epithelial cells". Oncogene. 22 (7): 983–91. doi:10.1038/sj.onc.1206195. PMID 12592385.
  16. Anderson SF, Schlegel BP, Nakajima T, Wolpin ES, Parvin JD (July 1998). "BRCA1 protein is linked to the RNA polymerase II holoenzyme complex via RNA helicase A". Nat. Genet. 19 (3): 254–6. doi:10.1038/930. PMID 9662397.
  17. Lee CG, Hurwitz J (Aug 1993). "Human RNA helicase A is homologous to the maleless protein of Drosophila". The Journal of Biological Chemistry. 268 (22): 16822–30. PMID 8344961.
  18. Zhang S, Grosse F (April 1997). "Domain structure of human nuclear DNA helicase II (RNA helicase A)". The Journal of Biological Chemistry. 272 (17): 11487–94. doi:10.1074/jbc.272.17.11487. PMID 9111062.
  19. Archambault J, Chambers RS, Kobor MS, Ho Y, Cartier M, Bolotin D, Andrews B, Kane CM, Greenblatt J (February 1998). "An essential component of a C-terminal domain phosphatase that interacts with transcription factor IIF in Saccharomyces cerevisiae". Proc Natl Acad Sci U S A. 94 (26): 14300–5. Bibcode:1997PNAS...9414300A. doi:10.1073/pnas.94.26.14300. PMC 24951. PMID 9405607.
  20. Archambault J, Pan G, Dahmus GK, Cartier M, Marshall N, Zhang S, Dahmus ME, Greenblatt J (November 1998). "FCP1, the RAP74-interacting subunit of a human protein phosphatase that dephosphorylates the carboxyl-terminal domain of RNA polymerase IIO". J Biol Chem. 273 (42): 27593–601. doi:10.1074/jbc.273.42.27593. PMID 9765293.
  21. "Entrez Gene: CTDP1 CTD (carboxy-terminal domain, RNA polymerase II, polypeptide A) phosphatase, subunit 1".
  22. RefSeq (February 2011). "CTDP1 CTD phosphatase subunit 1 [ Homo sapiens (human) ]". 8600 Rockville Pike, Bethesda MD, 20894 USA: National Center for Biotechnology Information, U.S. National Library of Medicine. Retrieved 22 December 2018.
  23. Licciardo, Paolo; Amente Stefano; Ruggiero Luca; Monti Maria; Pucci Piero; Lania Luigi; Majello Barbara (Feb 2003). "The FCP1 phosphatase interacts with RNA polymerase II and with MEP50 a component of the methylosome complex involved in the assembly of snRNP". Nucleic Acids Res. England. 31 (3): 999–1005. doi:10.1093/nar/gkg197. PMC 149217. PMID 12560496.
  24. Scully, R; Anderson S F; Chao D M; Wei W; Ye L; Young R A; Livingston D M; Parvin J D (May 1997). "BRCA1 is a component of the RNA polymerase II holoenzyme". Proc. Natl. Acad. Sci. U.S.A. UNITED STATES. 94 (11): 5605–10. Bibcode:1997PNAS...94.5605S. doi:10.1073/pnas.94.11.5605. ISSN 0027-8424. PMC 20825. PMID 9159119.
  25. RefSeq (September 2011). "DDX53 DEAD-box helicase 53 [ Homo sapiens (human) ]". 8600 Rockville Pike, Bethesda MD, 20894 USA: National Center for Biotechnology Information, U.S. National Library of Medicine. Retrieved 22 December 2018.
  26. DR Liston, PJ Johnson (March 1999). "Analysis of a Ubiquitous Promoter Element in a Primitive Eukaryote: Early Evolution of the Initiator Element". Molecular and Cellular Biology. 19 (3): 2380–8. PMID 10022924.
  27. C Yang, E Bolotin, T Jiang, FM Sladek, E Martinez (March 2007). "Prevalence of the initiator over the TATA box in human and yeast genes and identification of DNA motifs enriched in human TATA-less core promoters". Gene. 389 (1): 52–65. doi:10.1016/j.gene.2006.09.029. PMC 1955227. PMID 17123746.
  28. JE Purdy, BJ Mann, LT Pho, WA Petri Jr (July 19, 1994). "Transient transfection of the enteric parasite Entamoeba histolytica and expression of firefly luciferase". Proceedings of the National Academy of Sciences USA. 91 (15): 7099–103. PMID 8041752. Retrieved 2012-06-10.
  29. Hualin Xi, Yong Yu, Yutao Fu, Jonathan Foley, Anason Halees, and Zhiping Weng (June 2007). "Analysis of overrepresented motifs in human core promoters reveals dual regulatory roles of YY1". Genome Research. 17 (6): 798–806. doi:10.1101/gr.5754707. PMC 1891339. PMID 17567998.
  30. R. Javahery, A. Khachi, K. Lo, B. Zenzie-Gregory, S. T. Smale (January 1994). "DNA Sequence Requirements for Transcriptional Initiator Activity in Mammalian Cells". Molecular and Cellular Biology. 14 (1): 116–27. PMID 8264580.
  31. Ananda L. Roy (August 2001). "Biochemistry and biology of the inducible multifunctional transcription factor TFII-I" (PDF). Gene. 274 (1–2): 1–13. doi:10.1016/S0378-1119(01)00625-4. Retrieved 2012-04-06.
  32. HGNC:11535 (March 24, 2012). "TAF1 RNA polymerase II, TATA box binding protein (TBP)-associated factor, 250kDa". Bethesda, Maryland: NCBI. Retrieved 2012-04-09.
  33. ST Smale (March 1997). "Transcription initiation from TATA-less promoters within eukaryotic protein-coding genes". Biochimica & Biophysica Acta. 1351 (1–2): 73–88. doi:10.1016/S0167-4781(96)00206-0. PMID 9116046.
  34. KH Emami, A Jain, ST Smale (1997). Genes Development. 11: 3007–19.
  35. Benjamin Lewin (2004). Genes VIII. Upper Saddle River, NJ: Pearson Prentice Hall. pp. 636–637. ISBN 0-13-144946-X.
  36. AL Roy, M Meisterernst, P. Pognonec, RG Roeder (1991). Nature. 354: 245–8.
  37. AL Roy, S. Malik, M. Meisterernst, RG Roeder (1993). 365: 355–9.
  38. Stephen T. Smale and James T. Kadonaga (July 2003). "The RNA Polymerase II Core Promoter" (PDF). Annual Review of Biochemistry. 72 (1): 449–79. doi:10.1146/annurev.biochem.72.121801.161520. PMID 12651739. Retrieved 2012-05-07.
  39. Tamar Juven-Gershon, Jer-Yuan Hsu, Joshua W. M. Theisen, and James T. Kadonaga (June 2008). "The RNA Polymerase II Core Promoter – the Gateway to Transcription". Current Opinion in Cell Biology. 20 (3): 253–9. doi:10.1016/j.ceb.2008.03.003. Retrieved 2013-02-13.

Further reading

External links
