Complex locus A1BG and ZNF497

Revision as of 00:42, 18 January 2020

ZSCAN22

  1. Gene ID: 342945 is ZSCAN22 zinc finger and SCAN domain containing 22.[1] ZSCAN22 is transcribed in the negative direction from LOC100887072.[1]
  2. Gene ID: 102465484 is MIR6806.[2] MIR6806 is transcribed in the negative direction from LOC105372480.[2]

Alpha-1-B glycoprotein

  1. Gene ID: 1 is Alpha-1-B glycoprotein, a 54.3 kDa protein in humans that is encoded by the A1BG gene.[3] A1BG is transcribed in the positive direction from ZNF497.[3] The protein encoded by this gene is a plasma glycoprotein of unknown function. The protein shows sequence similarity to the variable regions of some immunoglobulin supergene family member proteins. Patients who have pancreatic ductal adenocarcinoma show an overexpression of A1BG in pancreatic juice.[4] The gene contains 20 distinct introns.[5] Transcription produces 15 different mRNAs: 10 alternatively spliced variants and 5 unspliced forms.[5] There are 4 probable alternative promoters, 4 non-overlapping alternative last exons and 7 validated alternative polyadenylation sites.[5] The mRNAs appear to differ by truncation of the 5' end, truncation of the 3' end, presence or absence of 4 cassette exons, overlapping exons with different boundaries, and splicing versus retention of 3 introns.[5]
  2. Gene ID: 503538 is A1BG-AS1 A1BG antisense RNA 1.[6] A1BG-AS1 is transcribed in the negative direction from ZSCAN22.[6]

ZNF497

  1. Gene ID: 503538 is A1BG-AS1 A1BG antisense RNA 1.[6] A1BG-AS1 is transcribed in the negative direction from ZSCAN22.[6]
  2. Gene ID: 162968 is ZNF497 zinc finger protein 497.[7] ZNF497 is transcribed in the positive direction from RNA5SP473.[7]
  3. Gene ID: 100419840 is LOC100419840 zinc finger protein 446 pseudogene.[8] LOC100419840 may be transcribed in the positive direction from LOC105372483.[8]
  4. Gene ID: 105372483 is LOC105372483 uncharacterized LOC105372483 ncRNA.[9] LOC105372483 is transcribed in the negative direction from LOC100419840.[9]
  5. Gene ID: 106479017 is RNA5SP473 RNA, 5S ribosomal pseudogene 473.[10] RNA5SP473 may be transcribed in the negative direction from ZNF497.[10]

AGC boxes

An inverse AGC box occurs on the negative strand, negative direction: 3'-CCGCCGA-5' at 1754 nts from ZSCAN22 toward A1BG in the distal promoter, with its complement on the positive strand, negative direction.

ATA boxes

Core promoters

There is the following inverse ATA box on the negative strand, negative direction: 1, 3'-AAATAA-5' at 4537, which lies inside A1BG because the TSS is at 4460 nts from ZSCAN22.

Proximal promoters

There is the following inverse ATA box on the positive strand, negative direction: 3'-AAATAA-5' at 4221.

There is one inverse and inverse complement between 4050 and 4300 in the positive direction: 3'-AAATAA-5' at 4142, and 3'-TTTATT-5' at 4142.

Distal promoters

There is the following ATA box on the negative strand in the negative direction: 1, 3'-AATAAA-5' at 1726 nts from ZSCAN22.

There are the following ATA boxes on the positive strand in the negative direction: 3, 3'-AATAAA-5' at 3014, 3'-AATAAA-5' at 3335, and 3'-AATAAA-5' at 4072.

There are the following inverse ATA boxes on the positive strand, negative direction: 4, 3'-AAATAA-5' at 3013, 3'-AAATAA-5' at 3334, 3'-AAATAA-5' at 4071, 3'-AAATAA-5' at 4075.

There is the following ATA box on the negative strand in the positive direction: 1, 3'-AATAAA-5' at 3427. It has a complement on the positive strand in the positive direction: 1, 3'-TTATTT-5' at 3427.

There is another inverse complement ATA box on the negative strand in the positive direction in distal promoter: 3'-TTTATT-5' at 2347. It also has an inverse in the distal promoter: 3'-AAATAA-5' at 2347.
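The terms inverse, complement, and inverse complement used in these listings can be illustrated with a minimal Python sketch. The helper name `motif_variants` is hypothetical and not part of the source; it assumes "inverse" means the same bases read in reverse order and "complement" means the base-paired partner in the same order, which matches the ATA box sequences listed above.

```python
# Hypothetical helper (an assumption, not from the source): derive the
# orientations of a motif as the terms appear to be used in this section.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def motif_variants(motif):
    """Return the motif with its inverse, complement, and inverse complement."""
    inverse = motif[::-1]                       # same bases, reversed order
    complement = motif.translate(COMPLEMENT)    # paired bases, same order
    inverse_complement = inverse.translate(COMPLEMENT)
    return {
        "motif": motif,
        "inverse": inverse,
        "complement": complement,
        "inverse complement": inverse_complement,
    }


variants = motif_variants("AATAAA")  # ATA box sequence from the listings
for name, sequence in variants.items():
    print(f"{name}: 3'-{sequence}-5'")
```

For the ATA box 3'-AATAAA-5' this reproduces the other three sequences reported in this section: 3'-AAATAA-5', 3'-TTATTT-5', and 3'-TTTATT-5'.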

C boxes

Proximal promoters

There is one C box 3'-ACATCA-5' at 4116 nts in the positive direction.

Distal promoters

There are four C boxes: 3'-AGTAGT-5' at 2888, 3'-AGTAGT-5' at 2944, 3'-AGTAGT-5' at 3418, and 3'-AGTAGT-5' at 3521 on the negative strand in the negative direction and its complement on the positive strand.

There is one C box: 3'-TCATCA-5' at 3251 on the negative strand in the positive direction and its complement on the positive strand.

D boxes

There is one D box in the distal promoter: 3'-AGTCTG-5' at 2947 on the negative strand in the negative direction and its complement on the positive strand.

There is one D box in the distal promoter: 3'-AGTCTG-5' at 3923 on the negative strand in the positive direction and its complement on the positive strand.

CAREs

A CARE occurs in the negative direction: 3'-CAACTC-5' at 86 possibly associated with ZSCAN22. But inverse CAREs occur 3'-CTCAAC-5' at 1406, 3'-CTCAAC-5' at 2592, 3'-CTCAAC-5' at 2704, 3'-CTCAAC-5' at 3115, and 3'-CTCAAC-5' at 4096.

A CARE occurs in the positive direction: 3'-CAACTC-5' at 3292. But inverse CAREs occur 3'-CTCAAC-5' at 1406, 3'-CTCAAC-5' at 1621, and 3'-CTCAAC-5' at 3290.

CArG boxes

There is a more general CArG box, 3'-CATTAAAAGG-5', at 3441 from ZSCAN22, or -1019 nts from the TSS of A1BG in the negative direction on the positive strand in the distal promoter.

A second more general CArG box, 3'-CAAAAAAAAG-5', at 1399 from ZSCAN22, or -3061 nts from the A1BG TSS may be a CArG box for ZSCAN22 in the negative direction on the positive strand in the distal promoter.
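The two coordinate systems used for the CArG boxes (nts from ZSCAN22 versus nts from the A1BG TSS) are related by a simple subtraction. The sketch below assumes the TSS position of 4460 nts from ZSCAN22 given in the ATA boxes section; the function name `relative_to_tss` is hypothetical.

```python
# TSS position in nts from ZSCAN22, as stated in the ATA boxes section.
A1BG_TSS = 4460


def relative_to_tss(position, tss=A1BG_TSS):
    """Convert a position counted from ZSCAN22 into an offset from the TSS.

    Negative values lie upstream of the TSS, positive values downstream.
    """
    return position - tss


for position in (3441, 1399):  # the two CArG box positions from the text
    print(position, "->", relative_to_tss(position))
```

This reproduces the offsets quoted above: 3441 gives -1019 and 1399 gives -3061.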

CGCG boxes

Negative strand in the negative direction there are 2: 3'-GCGCGT-5', 161, 3'-CCGCGC-5', 1761, in the distal promoter.

Positive strand in the negative direction there is 1: 3'-GCGCGG-5', 1762, in the distal promoter.

Negative strand in the positive direction there are 8: 3'-GCGCGT-5', 543, 3'-CCGCGC-5', 681, 3'-GCGCGC-5', 683, 3'-ACGCGG-5', 871, 3'-ACGCGG-5', 971, 3'-CCGCGG-5', 1337, 3'-CCGCGG-5', 1437, 3'-CCGCGC-5', 1650, in the distal promoter.

Positive strand in the positive direction there are 22: 3'-CCGCGC-5', 161, 3'-ACGCGG-5', 452, 3'-CCGCGC-5', 542, 3'-GCGCGC-5', 682, 3'-GCGCGT-5', 684, 3'-CCGCGT-5', 876, 3'-CCGCGT-5', 976, 3'-CCGCGT-5', 1046, 3'-ACGCGG-5', 1078, 3'-ACGCGG-5', 1162, 3'-CCGCGC-5', 1214, 3'-ACGCGG-5', 1246, 3'-CCGCGT-5', 1298, 3'-ACGCGT-5', 1314, 3'-ACGCGG-5', 1354, 3'-ACGCGG-5', 1398, 3'-ACGCGT-5', 1414, 3'-ACGCGG-5', 1454, 3'-ACGCGG-5', 1498, 3'-ACGCGT-5', 1523, 3'-CCGCGT-5', 1550, 3'-CCGCGG-5', 1769, in the distal promoter.

CRE boxes

Negative strand in the negative direction there is 1: 3'-TGACGTCA-5', 4317, and its complement in the proximal promoter.

E2 boxes

Negative strand in the negative direction there are 5: 3'-ACAGATGT-5', 482, 3'-ACAGATGT-5', 1225, 3'-GCAGTTGG-5', 1514, 3'-ACAGATGT-5', 2989, 3'-ACAGATGT-5', 4213, in the distal promoter.

Positive strand in the negative direction there are 2: 3'-GCAGGTGG-5', 2571, 3'-ACAGATGA-5', 3920.

Inverse complement, negative strand, negative direction there is 1: 3'-CCACCTGT-5', 2117.

Inverse complement, positive strand, negative direction there are 4: 3'-CCACCTGT-5', 394, 3'-ACACCTGT-5', 1131, 3'-GCAACTGC-5', 3851, 3'-ACACCTGT-5', 3970.

Negative strand in the positive direction there is 1: 3'-GCAGATGA-5', 37.

Enhancer boxes

Core promoters

Negative strand in the positive direction there are 2: 3'-CACATG-5', 4364, 3'-CACATG-5', 4370.

Proximal promoters

Positive strand, negative direction there is 1: 3'-CACATG-5' at 4247.

Negative strand, positive direction there are 2: 3'-CACATG-5', 4153, 3'-CACATG-5', 4221.

Distal promoters

Negative strand in the negative direction there are 4: 3'-CACATG-5' at 324, 3'-CACATG-5' at 797, 3'-CACATG-5' at 2213, and 3'-CACATG-5' at 2342.

Positive strand in the negative direction there are 17, 3'-CACATG-5' at 123, 3'-CACATG-5' at 200, 3'-CACATG-5' at 952, 3'-CACATG-5' at 1206, 3'-CACATG-5' at 1849, 3'-CACATG-5' at 1952, 3'-CACATG-5' at 2151, 3'-CACATG-5' at 2276, 3'-CACATG-5' at 2322, 3'-CACATG-5' at 2533, 3'-CACATG-5' at 2613, 3'-CACATG-5' at 2667, 3'-CACATG-5' at 2751, 3'-CACATG-5' at 2783, 3'-CACATG-5' at 4106, 3'-CACATG-5' at 4116.

Negative strand in the positive direction there are 17: 3'-CACATG-5', 1186, 3'-CACATG-5', 1238, 3'-CACATG-5', 1871, 3'-CACATG-5', 1933, 3'-CACATG-5', 2031, 3'-CACATG-5', 2140, 3'-CACATG-5', 2153, 3'-CACATG-5', 2266, 3'-CACATG-5', 2473, 3'-CACATG-5', 3140, 3'-CACATG-5', 3335, 3'-CACATG-5', 3580, 3'-CACATG-5', 3707, 3'-CACATG-5', 3742, 3'-CACATG-5', 3827, 3'-CACATG-5', 3900, 3'-CACATG-5', 3956.

Positive strand in the positive direction there are 4: 3'-CACATG-5', 126, 3'-CACATG-5', 565, 3'-CACATG-5', 2596, 3'-CACATG-5', 3114.

B recognition elements

The factor II B recognition element is BREu.

Negative strand in the negative direction there are 3: 3'-CCACGCC-5' at 380, 3'-CCGCGCC-5' at 1762, and 3'-CCACGCC-5' at 2197 in the distal promoter.

Complement, negative strand, negative direction there is 1: 3'-CCTGCGG-5' at 1153.

Inverse complement, positive strand, negative direction there are 4: 3'-GGCGTGG-5' at 1244, 3'-GGCGCGG-5' at 1762, 3'-GGCGTGG-5' at 1897, and 3'-GGCGTGG-5' at 3047.

Negative strand in the positive direction there are 3: 3'-GCACGCC-5', 1302, 3'-GGACGCC-5', 1672, 3'-GGGCGCC-5', 1769.

Positive strand in the positive direction there are 3: 3'-CCACGCC-5', 489, 3'-CGACGCC-5', 1033, 3'-CCACGCC-5', 1764.

Inverse complement, negative strand, positive direction there is 1: 3'-GGCGCCC-5', 1770.

Inverse complement, positive strand, positive direction there are 4: 3'-GGCGCGC-5', 682, 3'-GGCGCCG-5', 1338, 3'-GGCGCCG-5', 1438, 3'-GGCGTGG-5', 2566.

GA responsive elements

Only one GARE (an inverse) occurs: between ZSCAN22 and A1BG 3'-AAACAAT-5' at 230 nts and its complement.

GATA boxes

Proximal promoters

Inverse complement, negative strand, positive direction there is 1: 3'-TTTATCAC-5', 4125.

Distal promoters

Positive strand in the negative direction there are 2: 3'-GGGATAGA-5', 100, 3'-ATGATAGA-5', 355.

Inverse complement, negative strand, negative direction there is 1: 3'-GTTATCAT-5', 2500.

Inverse complement, positive strand, negative direction there is 1: 3'-TTTATCTT-5', 1732.

Inverse complement, negative strand, positive direction there is 1: 3'-GTTATCCC-5', 3385.

Inverse complement, positive strand, positive direction there are 2: 3'-GCTATCAG-5', 1840, 3'-TTTATCTT-5', 2628.

GC boxes

Positive strand in the negative direction there are 2: 3'-TGGGCGTGGT-5', 1898, 3'-TGGGCGTGGT-5', 3048, in the distal promoter.

Inverse complement, negative strand, negative direction there is 1: 3'-ACTCCGCCCA-5', 3092.

Inverse complement, positive strand, negative direction there is 1: 3'-GCTCCGCCTC-5', 1505.

Negative strand in the positive direction there is 1: 3'-TGGGCGGGAC-5', 409.

Inverse complement, positive strand, positive direction there is 1: 3'-GCCACGCCCC-5', 491.

H boxes

Core promoters

Between ZSCAN22 and A1BG: There is one inverse and its complement 3'-AGGAGA-5' at 4428 nts.

Between ZNF497 and A1BG: There is an inverse and its complement 3'-AGGACA-5' at 4252. There are five after the TSS: 3'-AGAGAA-5' at 4387, 3'-AGTACA-5' at 4365, 3'-ACCAGA-5' at 4380, 3'-AAGAGA-5' at 4386, and 3'-ACGACA-5' at 4392, and their complements.

Proximal promoters

Between ZSCAN22 and A1BG: There is one H box (3'-ANANNA-5'): negative direction, negative strand, 3'-ACACGA-5' at 4402. On the positive strand in the negative direction there are 16: 3'-ACAAAA-5' at 4216, 3'-AAAAAA-5' at 4218, 3'-AAAATA-5' at 4220, 3'-AAATAA-5' at 4221, 3'-ATAATA-5' at 4223, 3'-AAAAAA-5' at 4378, 3'-AAAAGA-5' at 4380, 3'-AAAGAA-5' at 4381, 3'-AGAAAA-5' at 4383, 3'-AAAAAA-5' at 4385, 3'-AAAAGA-5' at 4387, 3'-AAAGAA-5' at 4388, 3'-AGAAAA-5' at 4390, 3'-AAAAGA-5' at 4392, 3'-AAAGAA-5' at 4393, and 3'-AGAAAA-5' at 4395, with their complements on the negative strand, negative direction.

Between ZNF497 and A1BG: There is one H box (3'-ANANNA-5'): 3'-AGAGAA-5' at 4387 in the proximal promoter, negative strand, positive direction. There are four: 3'-TCATGT-5' at 4365, 3'-TGGTCT-5' at 4380, 3'-TTCTCT-5' at 4386, and 3'-TGCTGT-5' at 4392 and their complements in the positive direction.

Distal promoters

Between ZSCAN22 and A1BG: 3'-AGAGGA-5' at 3387, 3'-AGAGGA-5' at 3638, and 3'-AGAGGA-5' at 3675. One inverse and its complement 3'-AGGAGA-5' at 3790. There are 13 H boxes: 3'-ACATCA-5' at 2541, 3'-ACACCA-5' at 2659, 3'-ACATTA-5' at 2675, 3'-ATAAAA-5' at 2853, 3'-AAAGTA-5' at 2886, 3'-ACATTA-5' at 3064, 3'-AGATGA-5' at 3159, 3'-ACACCA-5' at 3187, 3'-AGAAGA-5' at 3554, 3'-AGACGA-5' at 3707, 3'-ACACCA-5' at 3811, 3'-ACATTA-5' at 3973, and 3'-ACATCA-5' at 4124.

On the positive strand, negative direction, there are 122 H boxes: 3'-AAAAAA-5' at 2461, 3'-AAAAAA-5' at 2462, 3'-AAAAAA-5' at 2463, 3'-AAAAAA-5' at 2464, 3'-AAAAAA-5' at 2465, 3'-AAAAAA-5' at 2466, 3'-AAAAAA-5' at 2467, 3'-AAAAAA-5' at 2468, 3'-AAAAAA-5' at 2469, 3'-AAAAAA-5' at 2470, 3'-AAAGCA-5' at 2473, 3'-AAAGCA-5' at 2479, 3'-AAACAA-5' at 2484, 3'-AAACAA-5' at 2488, 3'-ACAAAA-5' at 2490, 3'-ATAGTA-5' at 2500, 3'-AGAAAA-5' at 2506, 3'-AAAACA-5' at 2508, 3'-AAACAA-5' at 2509, 3'-AGACCA-5' at 2599, 3'-ATACAA-5' at 2642, 3'-ACAAAA-5' at 2644, 3'-AAATCA-5' at 2648, 3'-ACAGGA-5' at 2690, 3'-AAATCA-5' at 2749, 3'-AGAGCA-5' at 2781, 3'-AAAAGA-5' at 2798, 3'-AAAGAA-5' at 2799, 3'-AAAGAA-5' at 2803, 3'-AGAAAA-5' at 2805, 3'-AAAAGA-5' at 2807, 3'-AGAGAA-5' at 2810, 3'-AGAAGA-5' at 2812, 3'-AGAAAA-5' at 2815, 3'-AAAAAA-5' at 2817, 3'-AAAAGA-5' at 2819, 3'-AAAGAA-5' at 2820, 3'-AGAAAA-5' at 2822, 3'-AAAAGA-5' at 2824, 3'-AGAGAA-5' at 2827, 3'-AGAAGA-5' at 2829, 3'-AGAAAA-5' at 2832, 3'-AAAAAA-5' at 2834, 3'-AAAAGA-5' at 2836, 3'-AAAGAA-5' at 2837, 3'-AGAAAA-5' at 2839, 3'-AAAACA-5' at 2841, 3'-AAACAA-5' at 2842, 3'-AAAATA-5' at 2868, 3'-ATATAA-5' at 2873, 3'-AAAAAA-5' at 2929, 3'-ACATCA-5' at 2941, 3'-ACATTA-5' at 2951, 3'-AAACCA-5' at 2971, 3'-AAAATA-5' at 3012, 3'-AAATAA-5' at 3013, 3'-AAAAAA-5' at 3026, 3'-AAACTA-5' at 3029, 3'-AGACCA-5' at 3122, 3'-AAAACA-5' at 3166, 3'-ACATAA-5' at 3169, 3'-ATAAAA-5' at 3171, 3'-AAATTA-5' at 3175, 3'-AGATCA-5' at 3277, 3'-ACAAGA-5' at 3307, 3'-AGAGCA-5' at 3310, 3'-AAAACA-5' at 3329, 3'-AAACAA-5' at 3330, 3'-AAATAA-5' at 3334, 3'-AAACAA-5' at 3338, 3'-ACAAGA-5' at 3340, 3'-AGAAAA-5' at 3343, 3'-AAACCA-5' at 3365, 3'-AGAGGA-5' at 3387, 3'-ACATCA-5' at 3394, 3'-AGAGAA-5' at 3406, 3'-ACATCA-5' at 3415, 3'-ACATTA-5' at 3436, 3'-ATATTA-5' at 3454, 3'-ATATTA-5' at 3468, 3'-AAACCA-5' at 3484, 3'-AGATCA-5' at 3489, 3'-AAAACA-5' at 3511, 3'-ACACAA-5' at 3514, 3'-ATAATA-5' at 3538, 3'-ACAAGA-5' at 3635, 3'-AGAGGA-5' at 3638, 3'-AAAGAA-5' at 3666, 3'-AGAACA-5' at 3668, 3'-AGAGGA-5' at 3675, 3'-ACAAGA-5' at 3759, 3'-AGACCA-5' at 3762, 3'-ACAAAA-5' at 3767, 3'-AGAGCA-5' at 3913, 3'-AGATGA-5' at 3920, 3'-AGACCA-5' at 4031, 3'-ACAAAA-5' at 4066, 3'-AAAAAA-5' at 4068, 3'-AAAATA-5' at 4070, 3'-AAATAA-5' at 4071, 3'-AAATAA-5' at 4075, 3'-ATAATA-5' at 4077, 3'-ATAGAA-5' at 4080, 3'-AAAGAA-5' at 4084, 3'-AGAAAA-5' at 4086, 3'-AGACAA-5' at 4182, 3'-ACAAAA-5' at 4216, 3'-AAAAAA-5' at 4218, 3'-AAAATA-5' at 4220, 3'-AAATAA-5' at 4221, 3'-ATAATA-5' at 4223, 3'-AAAAAA-5' at 4378, 3'-AAAAGA-5' at 4380, 3'-AAAGAA-5' at 4381, 3'-AGAAAA-5' at 4383, 3'-AAAAAA-5' at 4385, 3'-AAAAGA-5' at 4387, 3'-AAAGAA-5' at 4388, 3'-AGAAAA-5' at 4390, 3'-AAAAGA-5' at 4392, 3'-AAAGAA-5' at 4393, and 3'-AGAAAA-5' at 4395.

Between ZNF497 and A1BG: There are two H boxes after nucleotide number 2300 in the negative strand and positive direction: 3'-ACACCA-5' at 2603 and 3'-ACACCA-5' at 3825.

There are two H boxes after nucleotide number 2300 in the positive strand and positive direction: 3'-ACACCA-5' at 3643 and 3'-ACACCA-5' at 3967.

Regarding 3'-ANANNA-5', on the negative strand, positive direction, there are 25 H boxes: 3'-ATACCA-5' at 2591, 3'-ACACCA-5' at 2603, 3'-ATAGAA-5' at 2628, 3'-AAACCA-5' at 2632, 3'-ACACTA-5' at 2637, 3'-ATATAA-5' at 2662, 3'-AGAGCA-5' at 2704, 3'-AGAGGA-5' at 2793, 3'-AAAGGA-5' at 2829, 3'-ACAGAA-5' at 2838, 3'-AAAGAA-5' at 3066, 3'-AGAACA-5' at 3094, 3'-AGAGCA-5' at 3138, 3'-ACAGCA-5' at 3212, 3'-ACAGTA-5' at 3414, 3'-AGATGA-5' at 3476, 3'-ACAGGA-5' at 3572, 3'-AAAGCA-5' at 3599, 3'-ACATGA-5' at 3708, 3'-ACACCA-5' at 3825, 3'-AAAAGA-5' at 3929, 3'-AGAACA-5' at 4068, 3'-AAATGA-5' at 4094, 3'-ACATCA-5' at 4116, and 3'-ACATGA-5' at 4154.

On the positive strand, positive direction there are 20 H boxes: 3'-AAATAA-5' at 2347, 3'-AAAAAA-5' at 2451, 3'-AAAACA-5' at 2453, 3'-AGACGA-5' at 2976, 3'-AGACCA-5' at 3022, 3'-AGAGAA-5' at 3056, 3'-AGAAGA-5' at 3058, 3'-AGAGGA-5' at 3302, 3'-AGACGA-5' at 3307, 3'-ACAGAA-5' at 3393, 3'-AGAAGA-5' at 3395, 3'-ACAGGA-5' at 3620, 3'-ACACCA-5' at 3643, 3'-AAACCA-5' at 3948, 3'-ACACCA-5' at 3967, 3'-AGAGGA-5' at 4059, 3'-AAAATA-5' at 4122, 3'-AAATCA-5' at 4137, 3'-AAATAA-5' at 4142, and 3'-ATATTA-5' at 4168.

There are 31 inverse H boxes on the negative strand in the positive direction: 3'-ATGACA-5' at 2412, 3'-ACTACA-5' at 2428, 3'-AGGACA-5' at 2460, 3'-ATTATA-5' at 2548, 3'-ACCACA-5' at 2600, 3'-AGGAAA-5' at 2623, 3'-AATAGA-5' at 2627, 3'-ACCACA-5' at 2634, 3'-AACAGA-5' at 2652, 3'-AGCAAA-5' at 2706, 3'-AGGAAA-5' at 2831, 3'-AACACA-5' at 2835, 3'-ATGACA-5' at 2843, 3'-AGAACA-5' at 3094, 3'-AACACA-5' at 3096, 3'-AGGACA-5' at 3131, 3'-ACCAAA-5' at 3175, 3'-AACAGA-5' at 3179, 3'-AGCAGA-5' at 3214, 3'-AGTAGA-5' at 3416, 3'-AATAAA-5' at 3427, 3'-ACCAGA-5' at 3548, 3'-ATGACA-5' at 3569, 3'-AGGAGA-5' at 3650, 3'-AGCACA-5' at 3740, 3'-ACCACA-5' at 3859, 3'-AAAAGA-5' at 3929, 3'-AGAACA-5' at 4068, 3'-ATCATA-5' at 4149, and 3'-ATTATA-5' at 4166.
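The degenerate H box pattern 3'-ANANNA-5' yields overlapping hits in runs of A, as in the consecutive 3'-AAAAAA-5' positions listed above. The following Python sketch shows one way such overlapping matches could be enumerated. It is not the program used to produce these listings: the function name `find_overlapping`, the toy sequence, and the convention of reporting the 1-based position of the first nucleotide are all assumptions.

```python
import re

# Map each consensus symbol to a regular-expression fragment.
# Only the symbols needed here are included; N matches any base.
SYMBOLS = {"A": "A", "C": "C", "G": "G", "T": "T", "N": "[ACGT]"}


def find_overlapping(sequence, consensus):
    """Return (1-based start, matched text) for every hit, overlaps included."""
    body = "".join(SYMBOLS[symbol] for symbol in consensus)
    # A zero-width lookahead lets the scan advance one nucleotide at a time,
    # so matches that overlap each other are all reported.
    pattern = re.compile("(?=(" + body + "))")
    return [(m.start() + 1, m.group(1)) for m in pattern.finditer(sequence)]


toy = "GG" + "A" * 8 + "CC"  # toy sequence, not taken from the A1BG locus
hits = find_overlapping(toy, "ANANNA")
for start, text in hits:
    print(start, text)
```

A plain `re.findall` without the lookahead would report only non-overlapping matches and would miss most of the hits in a run of A.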

HNF6s

Core promoters

Inverse complement, positive strand, negative direction there is 1: 3'-TTATTAATTC-5', 4542.

Proximal promoters

Negative strand in the negative direction there is 1: 3'-TTATTAATCG-5', 4229.

Negative strand in the positive direction there are 2: 3'-TTATTAATCA-5', 4147, 3'-TTATTGATTA-5', 4164.

Inverse complement, positive strand, positive direction there is 1: 3'-ATATTAACAA-5', 4172.

Distal promoters

Negative strand in the negative direction there are 2: 3'-GTGTTAATAA-5', 1725, 3'-TAGTTGATAA-5', 3527.

Positive strand in the negative direction there is 1: 3'-AAATTGATAA-5', 3361.

Inverse complement, negative strand, negative direction there are 2: 3'-ACATGGACAT-5', 802, 3'-TAATGAACTT-5', 1301.

Inverse complement, positive strand, negative direction there are 2: 3'-AAATTGATAA-5', 3361, 3'-TCATCAACTA-5', 3525.

Negative strand in the positive direction there is 1: 3'-ATGTCCATGG-5', 3581.

Positive strand in the positive direction there is 1: 3'-GAGTCCATTG-5', 3732.

Inverse complement, positive strand, positive direction there is 1: 3'-CCATTGACTC-5', 3736.

HY boxes

Core promoters

Positive strand in the negative direction there is 1: 3'-TGAGGG-5' at 4558.

Inverse complement, negative strand, negative direction there is 1: 3'-CCCTCA-5', 4498.

Negative strand in the positive direction there is 1: 3'-TGTGGG-5', 4395.

Distal promoters

Negative strand in the negative direction there is 1: 3'-TGTGGG-5' at 749.

Positive strand in the negative direction there are 4: 3'-TGAGGG-5' at 88, 3'-TGAGGG-5' at 2699, 3'-TGAGGG-5' at 3652, 3'-TGTGGG-5' at 3712.

Inverse complement, negative strand, negative direction there are 3: 3'-CCCTCA-5', 2702, 3'-CCCACA-5', 3184, 3'-CCCTCA-5', 3889.

Positive strand in the positive direction there are 2: 3'-TGTGGG-5', 2965, 3'-TGTGGG-5', 3533.

Negative strand in the positive direction there are 3: 3'-TGAGGG-5', 258, 3'-TGAGGG-5', 3479, 3'-TGAGGG-5', 3879.

Inverse complement, negative strand, positive direction there are 3: 3'-CCCTCA-5', 88, 3'-CCCTCA-5', 3207, 3'-CCCTCA-5', 3503.

Inverse complement, positive strand, positive direction there are 5: 3'-CCCTCA-5', 494, 3'-CCCTCA-5', 662, 3'-CCCTCA-5', 1783, 3'-CCCACA-5', 1803, 3'-CCCTCA-5', 3185.

Initiator elements

Metal responsive elements

Pyrimidine boxes

Pyrimidine boxes and their complements in the negative direction: 3'-CCTTTT-5' at 2459, 3'-CCTTTT-5' at 2927, and 3'-CCTTTT-5' at 2968 occur. Inverse pyrimidine boxes and their complements occur 3'-AAAAGG-5' at 105, 3'-AAAAGG-5' at 1107, 3'-AAAAGG-5' at 3345, and 3'-AAAAGG-5' at 3441.

Pyrimidine boxes in the positive direction: 3'-CCTTTT-5' at 135 and 3'-CCTTTT-5' at 291 and their complements are close to ZNF497.

STAT5s

TATA boxes

TATCCAC boxes

None occur.

TAT boxes

Only an inverse and its complement occurs between ZSCAN22 and A1BG: 3'-TACCTAT-5' at 2996 nts from ZSCAN22.

Telomeric repeat DNA-binding factors

Copying the consensus telomeric repeat DNA-binding factor (TRF) sequence, 3'-TTAGGG-5', and entering it in the browser's find function ("⌘F") locates this sequence in the A1BG negative direction; the nucleotide positions can be found by the computer programs.

In the nucleotides between ZSCAN22 and A1BG there is at least one 3'-TTAGGG-5' beginning about 680 nucleotides from ZSCAN22 or ending at about 686 nts.
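The find-in-page lookup described above can also be scripted so that the positions are reported directly. The sketch below is a minimal illustration and not the computer programs mentioned in the text: the function name `find_all`, the toy sequence, and the 1-based inclusive start and end convention are assumptions.

```python
def find_all(sequence, motif):
    """Return (start, end) pairs, 1-based and inclusive, for every occurrence."""
    hits = []
    start = sequence.find(motif)
    while start != -1:
        hits.append((start + 1, start + len(motif)))
        # Resume one nucleotide later so overlapping occurrences are kept.
        start = sequence.find(motif, start + 1)
    return hits


toy = "ACGTTAGGGTTAGGGAC"  # toy sequence, not taken from the A1BG locus
for start, end in find_all(toy, "TTAGGG"):
    print(f"3'-TTAGGG-5' from {start} to {end}")
```

Applied to the actual nucleotides between ZSCAN22 and A1BG, the same call would give the beginning and ending positions of each 3'-TTAGGG-5' rather than approximate values.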

Homo sapiens genes containing these are found using Homo sapiens "TRF (TTAGGG repeat-binding factor)".

W boxes

Proximal promoters

Inverse W boxes occur in the negative direction of A1BG: 3'-GGTCAA-5' at 4416 and 3'-GGTCAA-5' at 4308.

W boxes occur in the positive direction of A1BG: 3'-CTGACC-5' and its complement at 4216 and inverse W boxes occur 3'-GGTCAG-5' and its complement at 4270.

Distal promoters

A W box occurs 3'-CTGACC-5' at 3749, whereas 3'-CTGACT-5' at 17, 3'-TTGACT-5' at 130, 3'-TTGACT-5' at 307, and 3'-CTGACC-5' at 734 occur close to ZSCAN22, but 3'-CTGACT-5' at 1935 could be associated with ZSCAN22 or an unknown gene between it and A1BG, along with their complements.

W box inverses occur 3'-GGTCAG-5' at 1353 and 3'-AGTCAG-5' at 2101, 3'-GGTCAG-5' at 2221, 3'-AGTCAG-5' at 2608, 3'-AGTCAA-5' at 2614, and 3'-AGTCAG-5' at 2619 along with their complements.

W boxes in the positive direction occur 3'-CTGACC-5' at 1662, 3'-CTGACC-5' at 2213, 3'-TTGACC-5' at 2873, 3'-CTGACT-5' at 2945, and 3'-TTGACC-5' at 4018 that could be associated with A1BG, along with 3'-TTGACC-5' at 1953, 3'-CTGACT-5' at 2674, and 3'-TTGACT-5' at 3735.

See also

References

  1. HGNC2019 (10 December 2019). "ZSCAN22 zinc finger and SCAN domain containing 22 [ Homo sapiens (human) ]". U.S. National Library of Medicine, 8600 Rockville Pike, Bethesda MD, 20894 USA: National Center for Biotechnology Information. Retrieved 2019-12-18.
  2. HGNC2019 (10 December 2019). "MIR6806 microRNA 6806 [ Homo sapiens (human) ]". U.S. National Library of Medicine, 8600 Rockville Pike, Bethesda MD, 20894 USA: National Center for Biotechnology Information. Retrieved 2019-12-18.
  3. RefSeqJuly2008 (10 December 2019). "A1BG alpha-1-B glycoprotein [ Homo sapiens (human) ]". U.S. National Library of Medicine, 8600 Rockville Pike, Bethesda MD, 20894 USA: National Center for Biotechnology Information. Retrieved 2019-12-18.
  4. Tian M, Cui YZ, Song GH, Zong MJ, Zhou XY, Chen Y, Han JX (2008). "Proteomic analysis identifies MMP-9, DJ-1 and A1BG as overexpressed proteins in pancreatic juice from pancreatic ductal adenocarcinoma patients". BMC Cancer. 8: 241. doi:10.1186/1471-2407-8-241. PMC 2528014. PMID 18706098.
  5. "AceView: A1BG". Retrieved May 11, 2013.
  6. HGNC2019 (10 December 2019). "A1BG-AS1 A1BG antisense RNA 1 [ Homo sapiens (human) ]". U.S. National Library of Medicine, 8600 Rockville Pike, Bethesda MD, 20894 USA: National Center for Biotechnology Information. Retrieved 2019-12-18.
  7. HGNC2019 (10 December 2019). "ZNF497 zinc finger protein 497 [ Homo sapiens (human) ]". U.S. National Library of Medicine, 8600 Rockville Pike, Bethesda MD, 20894 USA: National Center for Biotechnology Information. Retrieved 2019-12-18.
  8. HGNC2019 (10 December 2019). "LOC100419840 zinc finger protein 446 pseudogene [ Homo sapiens (human) ]". U.S. National Library of Medicine, 8600 Rockville Pike, Bethesda MD, 20894 USA: National Center for Biotechnology Information. Retrieved 2019-12-18.
  9. HGNC2019 (10 December 2019). "LOC105372483 uncharacterized LOC105372483 [ Homo sapiens (human) ]". U.S. National Library of Medicine, 8600 Rockville Pike, Bethesda MD, 20894 USA: National Center for Biotechnology Information. Retrieved 2019-12-18.
  10. HGNC2019 (10 December 2019). "RNA5SP473 RNA, 5S ribosomal pseudogene 473 [ Homo sapiens (human) ]". U.S. National Library of Medicine, 8600 Rockville Pike, Bethesda MD, 20894 USA: National Center for Biotechnology Information. Retrieved 2019-12-18.
