Complex locus A1BG and ZNF497


ZSCAN22

  1. Gene ID: 342945 is ZSCAN22 zinc finger and SCAN domain containing 22.[1] ZSCAN22 is transcribed in the negative direction from LOC100887072.[1]
  2. Gene ID: 102465484 is MIR6806.[2] MIR6806 is transcribed in the negative direction from LOC105372480.[2]

Alpha-1-B glycoprotein

  1. Gene ID: 1 is Alpha-1-B glycoprotein, a 54.3 kDa protein in humans that is encoded by the A1BG gene.[3] A1BG is transcribed in the positive direction from ZNF497.[3] The protein encoded by this gene is a plasma glycoprotein of unknown function. It shows sequence similarity to the variable regions of some immunoglobulin supergene family member proteins. Patients with pancreatic ductal adenocarcinoma show overexpression of A1BG in pancreatic juice.[4] The gene contains 20 distinct introns.[5] Transcription produces 15 different mRNAs: 10 alternatively spliced variants and 5 unspliced forms.[5] There are 4 probable alternative promoters, 4 non-overlapping alternative last exons and 7 validated alternative polyadenylation sites.[5] The mRNAs appear to differ by truncation of the 5' end, truncation of the 3' end, presence or absence of 4 cassette exons, overlapping exons with different boundaries, and splicing versus retention of 3 introns.[5]
  2. Gene ID: 503538 is A1BG-AS1 A1BG antisense RNA 1.[6] A1BG-AS1 is transcribed in the negative direction from ZSCAN22.[6]

ZNF497

  1. Gene ID: 503538 is A1BG-AS1 A1BG antisense RNA 1.[6] A1BG-AS1 is transcribed in the negative direction from ZSCAN22.[6]
  2. Gene ID: 162968 is ZNF497 zinc finger protein 497.[7] ZNF497 is transcribed in the positive direction from RNA5SP473.[7]
  3. Gene ID: 100419840 is LOC100419840 zinc finger protein 446 pseudogene.[8] LOC100419840 may be transcribed in the positive direction from LOC105372483.[8]
  4. Gene ID: 105372483 is LOC105372483 uncharacterized LOC105372483 ncRNA.[9] LOC105372483 is transcribed in the negative direction from LOC100419840.[9]
  5. Gene ID: 106479017 is RNA5SP473 RNA, 5S ribosomal pseudogene 473.[10] RNA5SP473 may be transcribed in the negative direction from ZNF497.[10]

AGC boxes

An inverse AGC box occurs on the negative strand, negative direction: 3'-CCGCCGA-5' at 1754 nts from ZSCAN22 toward A1BG, in the distal promoter, with its complement on the positive strand, negative direction.

ATA boxes

Core promoters

There is one inverse ATA box on the negative strand, negative direction: 3'-AAATAA-5' at 4537, which lies inside A1BG because the TSS is at 4460 nts from ZSCAN22.

Proximal promoters

There is the following inverse ATA box on the positive strand, negative direction: 3'-AAATAA-5' at 4221.

There is one inverse and one inverse complement between 4050 and 4300 in the positive direction: 3'-AAATAA-5' at 4142 and 3'-TTTATT-5' at 4142.

Distal promoters

There is one ATA box on the negative strand in the negative direction: 3'-AATAAA-5' at 1726 nts from ZSCAN22.

There are three ATA boxes on the positive strand in the negative direction: 3'-AATAAA-5' at 3014, 3'-AATAAA-5' at 3335, and 3'-AATAAA-5' at 4072.

There are four inverse ATA boxes on the positive strand, negative direction: 3'-AAATAA-5' at 3013, 3'-AAATAA-5' at 3334, 3'-AAATAA-5' at 4071, and 3'-AAATAA-5' at 4075.

There is one ATA box on the negative strand in the positive direction: 3'-AATAAA-5' at 3427. It has a complement on the positive strand in the positive direction: 3'-TTATTT-5' at 3427.

There is another inverse complement ATA box on the negative strand in the positive direction in the distal promoter: 3'-TTTATT-5' at 2347. It also has an inverse in the distal promoter: 3'-AAATAA-5' at 2347.
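
The terms complement, inverse, and inverse complement used above can be illustrated with a minimal Python sketch. The helper name motif_variants is hypothetical and not part of the source; the sketch treats each motif as a plain string written left to right, exactly as it appears in the text, and makes no claim about how the original searches were run.

```python
# Hypothetical helper for illustration only; not taken from the source text.
COMPLEMENT = str.maketrans("ACGT", "TGCA")

def motif_variants(motif: str) -> dict[str, str]:
    """Return the four orientations of a motif as written left to right."""
    motif = motif.upper()
    inverse = motif[::-1]  # same letters, read in the opposite order
    return {
        "motif": motif,
        "complement": motif.translate(COMPLEMENT),
        "inverse": inverse,
        "inverse complement": inverse.translate(COMPLEMENT),
    }

# ATA box sequence as listed in this section.
ata = motif_variants("AATAAA")
for name, seq in ata.items():
    print(f"{name}: 3'-{seq}-5'")
```

For the ATA box this gives the four strings that appear in this section: AATAAA, TTATTT, AAATAA, and TTTATT.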

C boxes

D boxes

CAREs

A CARE occurs in the negative direction: 3'-CAACTC-5' at 86, possibly associated with ZSCAN22. Inverse CAREs also occur: 3'-CTCAAC-5' at 1406, 3'-CTCAAC-5' at 2592, 3'-CTCAAC-5' at 2704, 3'-CTCAAC-5' at 3115, and 3'-CTCAAC-5' at 4096.

A CARE occurs in the positive direction: 3'-CAACTC-5' at 3292. Inverse CAREs also occur: 3'-CTCAAC-5' at 1406, 3'-CTCAAC-5' at 1621, and 3'-CTCAAC-5' at 3290.

CArG boxes

CGCG boxes

CRE boxes

E2 boxes

Enhancer boxes

B recognition elements

GA responsive elements

Only one GARE (an inverse) occurs between ZSCAN22 and A1BG: 3'-AAACAAT-5' at 230 nts, with its complement.

GATA boxes

GC boxes

H boxes

Core promoters

Between ZSCAN22 and A1BG: There is one inverse and its complement 3'-AGGAGA-5' at 4428 nts.

Between ZNF497 and A1BG: There is an inverse and its complement 3'-AGGACA-5' at 4252. There are five after the TSS: 3'-AGAGAA-5' at 4387, 3'-AGTACA-5' at 4365, 3'-ACCAGA-5' at 4380, 3'-AAGAGA-5' at 4386, and 3'-ACGACA-5' at 4392, with their complements.

Proximal promoters

Between ZSCAN22 and A1BG: There is one H box (3'-ANANNA-5'): negative direction, negative strand, 3'-ACACGA-5' at 4402. On the positive strand in the negative direction there are 16: 3'-ACAAAA-5' at 4216, 3'-AAAAAA-5' at 4218, 3'-AAAATA-5' at 4220, 3'-AAATAA-5' at 4221, 3'-ATAATA-5' at 4223, 3'-AAAAAA-5' at 4378, 3'-AAAAGA-5' at 4380, 3'-AAAGAA-5' at 4381, 3'-AGAAAA-5' at 4383, 3'-AAAAAA-5' at 4385, 3'-AAAAGA-5' at 4387, 3'-AAAGAA-5' at 4388, 3'-AGAAAA-5' at 4390, 3'-AAAAGA-5' at 4392, 3'-AAAGAA-5' at 4393, and 3'-AGAAAA-5' at 4395, with their complements on the negative strand, negative direction.

Between ZNF497 and A1BG: There is one H box (3'-ANANNA-5'): 3'-AGAGAA-5' at 4387 in the proximal promoter, negative strand, positive direction. There are four: 3'-TCATGT-5' at 4365, 3'-TGGTCT-5' at 4380, 3'-TTCTCT-5' at 4386, and 3'-TGCTGT-5' at 4392 and their complements in the positive direction.

Distal promoters

Between ZSCAN22 and A1BG: 3'-AGAGGA-5' at 3387, 3'-AGAGGA-5' at 3638, and 3'-AGAGGA-5' at 3675. One inverse and its complement 3'-AGGAGA-5' at 3790. There are 13 H boxes: 3'-ACATCA-5' at 2541, 3'-ACACCA-5' at 2659, 3'-ACATTA-5' at 2675, 3'-ATAAAA-5' at 2853, 3'-AAAGTA-5' at 2886, 3'-ACATTA-5' at 3064, 3'-AGATGA-5' at 3159, 3'-ACACCA-5' at 3187, 3'-AGAAGA-5' at 3554, 3'-AGACGA-5' at 3707, 3'-ACACCA-5' at 3811, 3'-ACATTA-5' at 3973, and 3'-ACATCA-5' at 4124.

On the positive strand, negative direction, there are 122 H boxes: 3'-AAAAAA-5' at 2461, 3'-AAAAAA-5' at 2462, 3'-AAAAAA-5' at 2463, 3'-AAAAAA-5' at 2464, 3'-AAAAAA-5' at 2465, 3'-AAAAAA-5' at 2466, 3'-AAAAAA-5' at 2467, 3'-AAAAAA-5' at 2468, 3'-AAAAAA-5' at 2469, 3'-AAAAAA-5' at 2470, 3'-AAAGCA-5' at 2473, 3'-AAAGCA-5' at 2479, 3'-AAACAA-5' at 2484, 3'-AAACAA-5' at 2488, 3'-ACAAAA-5' at 2490, 3'-ATAGTA-5' at 2500, 3'-AGAAAA-5' at 2506, 3'-AAAACA-5' at 2508, 3'-AAACAA-5' at 2509, 3'-AGACCA-5' at 2599, 3'-ATACAA-5' at 2642, 3'-ACAAAA-5' at 2644, 3'-AAATCA-5' at 2648, 3'-ACAGGA-5' at 2690, 3'-AAATCA-5' at 2749, 3'-AGAGCA-5' at 2781, 3'-AAAAGA-5' at 2798, 3'-AAAGAA-5' at 2799, 3'-AAAGAA-5' at 2803, 3'-AGAAAA-5' at 2805, 3'-AAAAGA-5' at 2807, 3'-AGAGAA-5' at 2810, 3'-AGAAGA-5' at 2812, 3'-AGAAAA-5' at 2815, 3'-AAAAAA-5' at 2817, 3'-AAAAGA-5' at 2819, 3'-AAAGAA-5' at 2820, 3'-AGAAAA-5' at 2822, 3'-AAAAGA-5' at 2824, 3'-AGAGAA-5' at 2827, 3'-AGAAGA-5' at 2829, 3'-AGAAAA-5' at 2832, 3'-AAAAAA-5' at 2834, 3'-AAAAGA-5' at 2836, 3'-AAAGAA-5' at 2837, 3'-AGAAAA-5' at 2839, 3'-AAAACA-5' at 2841, 3'-AAACAA-5' at 2842, 3'-AAAATA-5' at 2868, 3'-ATATAA-5' at 2873, 3'-AAAAAA-5' at 2929, 3'-ACATCA-5' at 2941, 3'-ACATTA-5' at 2951, 3'-AAACCA-5' at 2971, 3'-AAAATA-5' at 3012, 3'-AAATAA-5' at 3013, 3'-AAAAAA-5' at 3026, 3'-AAACTA-5' at 3029, 3'-AGACCA-5' at 3122, 3'-AAAACA-5' at 3166, 3'-ACATAA-5' at 3169, 3'-ATAAAA-5' at 3171, 3'-AAATTA-5' at 3175, 3'-AGATCA-5' at 3277, 3'-ACAAGA-5' at 3307, 3'-AGAGCA-5' at 3310, 3'-AAAACA-5' at 3329, 3'-AAACAA-5' at 3330, 3'-AAATAA-5' at 3334, 3'-AAACAA-5' at 3338, 3'-ACAAGA-5' at 3340, 3'-AGAAAA-5' at 3343, 3'-AAACCA-5' at 3365, 3'-AGAGGA-5' at 3387, 3'-ACATCA-5' at 3394, 3'-AGAGAA-5' at 3406, 3'-ACATCA-5' at 3415, 3'-ACATTA-5' at 3436, 3'-ATATTA-5' at 3454, 3'-ATATTA-5' at 3468, 3'-AAACCA-5' at 3484, 3'-AGATCA-5' at 3489, 3'-AAAACA-5' at 3511, 3'-ACACAA-5' at 3514, 3'-ATAATA-5' at 3538, 3'-ACAAGA-5' at 3635, 3'-AGAGGA-5' at 3638, 3'-AAAGAA-5' at 3666, 3'-AGAACA-5' at 3668, 3'-AGAGGA-5' at 3675, 3'-ACAAGA-5' at 3759, 3'-AGACCA-5' at 3762, 3'-ACAAAA-5' at 3767, 3'-AGAGCA-5' at 3913, 3'-AGATGA-5' at 3920, 3'-AGACCA-5' at 4031, 3'-ACAAAA-5' at 4066, 3'-AAAAAA-5' at 4068, 3'-AAAATA-5' at 4070, 3'-AAATAA-5' at 4071, 3'-AAATAA-5' at 4075, 3'-ATAATA-5' at 4077, 3'-ATAGAA-5' at 4080, 3'-AAAGAA-5' at 4084, 3'-AGAAAA-5' at 4086, 3'-AGACAA-5' at 4182, 3'-ACAAAA-5' at 4216, 3'-AAAAAA-5' at 4218, 3'-AAAATA-5' at 4220, 3'-AAATAA-5' at 4221, 3'-ATAATA-5' at 4223, 3'-AAAAAA-5' at 4378, 3'-AAAAGA-5' at 4380, 3'-AAAGAA-5' at 4381, 3'-AGAAAA-5' at 4383, 3'-AAAAAA-5' at 4385, 3'-AAAAGA-5' at 4387, 3'-AAAGAA-5' at 4388, 3'-AGAAAA-5' at 4390, 3'-AAAAGA-5' at 4392, 3'-AAAGAA-5' at 4393, and 3'-AGAAAA-5' at 4395.

Between ZNF497 and A1BG: There are two H boxes after nucleotide number 2300 in the negative strand and positive direction: 3'-ACACCA-5' at 2603 and 3'-ACACCA-5' at 3825.

There are two H boxes after nucleotide number 2300 in the positive strand and positive direction: 3'-ACACCA-5' at 3643 and 3'-ACACCA-5' at 3967.

Regarding 3'-ANANNA-5', on the negative strand, positive direction, there are 25 H boxes: 3'-ATACCA-5' at 2591, 3'-ACACCA-5' at 2603, 3'-ATAGAA-5' at 2628, 3'-AAACCA-5' at 2632, 3'-ACACTA-5' at 2637, 3'-ATATAA-5' at 2662, 3'-AGAGCA-5' at 2704, 3'-AGAGGA-5' at 2793, 3'-AAAGGA-5' at 2829, 3'-ACAGAA-5' at 2838, 3'-AAAGAA-5' at 3066, 3'-AGAACA-5' at 3094, 3'-AGAGCA-5' at 3138, 3'-ACAGCA-5' at 3212, 3'-ACAGTA-5' at 3414, 3'-AGATGA-5' at 3476, 3'-ACAGGA-5' at 3572, 3'-AAAGCA-5' at 3599, 3'-ACATGA-5' at 3708, 3'-ACACCA-5' at 3825, 3'-AAAAGA-5' at 3929, 3'-AGAACA-5' at 4068, 3'-AAATGA-5' at 4094, 3'-ACATCA-5' at 4116, and 3'-ACATGA-5' at 4154.

On the positive strand, positive direction there are 20 H boxes: 3'-AAATAA-5' at 2347, 3'-AAAAAA-5' at 2451, 3'-AAAACA-5' at 2453, 3'-AGACGA-5' at 2976, 3'-AGACCA-5' at 3022, 3'-AGAGAA-5' at 3056, 3'-AGAAGA-5' at 3058, 3'-AGAGGA-5' at 3302, 3'-AGACGA-5' at 3307, 3'-ACAGAA-5' at 3393, 3'-AGAAGA-5' at 3395, 3'-ACAGGA-5' at 3620, 3'-ACACCA-5' at 3643, 3'-AAACCA-5' at 3948, 3'-ACACCA-5' at 3967, 3'-AGAGGA-5' at 4059, 3'-AAAATA-5' at 4122, 3'-AAATCA-5' at 4137, 3'-AAATAA-5' at 4142, and 3'-ATATTA-5' at 4168.

There are inverses on the negative strand in the positive direction of 31 H boxes: 3'-ATGACA-5' at 2412, 3'-ACTACA-5' at 2428, 3'-AGGACA-5' at 2460, 3'-ATTATA-5' at 2548, 3'-ACCACA-5' at 2600, 3'-AGGAAA-5' at 2623, 3'-AATAGA-5' at 2627, 3'-ACCACA-5' at 2634, 3'-AACAGA-5' at 2652, 3'-AGCAAA-5' at 2706, 3'-AGGAAA-5' at 2831, 3'-AACACA-5' at 2835, 3'-ATGACA-5' at 2843, 3'-AGAACA-5' at 3094, 3'-AACACA-5' at 3096, 3'-AGGACA-5' at 3131, 3'-ACCAAA-5' at 3175, 3'-AACAGA-5' at 3179, 3'-AGCAGA-5' at 3214, 3'-AGTAGA-5' at 3416, 3'-AATAAA-5' at 3427, 3'-ACCAGA-5' at 3548, 3'-ATGACA-5' at 3569, 3'-AGGAGA-5' at 3650, 3'-AGCACA-5' at 3740, 3'-ACCACA-5' at 3859, 3'-AAAAGA-5' at 3929, 3'-AGAACA-5' at 4068, 3'-ATCATA-5' at 4149, and 3'-ATTATA-5' at 4166.
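
The degenerate pattern 3'-ANANNA-5' used throughout this section matches many overlapping hexamers, which is why a run of A's can produce hits at consecutive positions (for example 2461 through 2470 in the list above). A minimal Python sketch of such an overlapping search follows. The function name find_h_boxes and the toy sequence are assumptions made for illustration; the sketch scans a string left to right and does not model strand or direction.

```python
import re

# ANANNA with N = any nucleotide. The lookahead makes each match zero-width,
# so hits that overlap one another are all reported.
H_BOX = re.compile(r"(?=(A[ACGT]A[ACGT][ACGT]A))")

def find_h_boxes(sequence: str) -> list[tuple[int, str]]:
    """Return (1-based start, matched hexamer) for every overlapping hit."""
    return [(m.start() + 1, m.group(1)) for m in H_BOX.finditer(sequence.upper())]

toy = "GGAAAAAAAAGG"  # made-up sequence, not the A1BG locus
hits = find_h_boxes(toy)
print(hits)
```

A plain, non-lookahead pattern would consume each match and skip the overlapping ones, so it would report fewer hits in a homopolymer run.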

HNF6s

HY boxes

Initiator elements

Metal responsive elements

Pyrimidine boxes

Pyrimidine boxes and their complements occur in the negative direction: 3'-CCTTTT-5' at 2459, 3'-CCTTTT-5' at 2927, and 3'-CCTTTT-5' at 2968. Inverse pyrimidine boxes and their complements also occur: 3'-AAAAGG-5' at 105, 3'-AAAAGG-5' at 1107, 3'-AAAAGG-5' at 3345, and 3'-AAAAGG-5' at 3441.

Pyrimidine boxes in the positive direction: 3'-CCTTTT-5' at 135 and 3'-CCTTTT-5' at 291 and their complements are close to ZNF497.

STAT5s

TATA boxes

TATCCAC boxes

None occur.

TAT boxes

Only an inverse and its complement occur between ZSCAN22 and A1BG: 3'-TACCTAT-5' at 2996 nts from ZSCAN22.

Telomeric repeat DNA-binding factors

Copying the consensus telomeric repeat DNA-binding factor (TRF) sequence, 3'-TTAGGG-5', into the browser's find function ("⌘F") locates this sequence in the A1BG negative direction; the nucleotide positions can be found with the computer programs.

In the nucleotides between ZSCAN22 and A1BG there is at least one 3'-TTAGGG-5' beginning about 680 nucleotides from ZSCAN22 or ending at about 686 nts.
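
As a hedged alternative to a manual text search, an exact motif such as TTAGGG can be located with a few lines of Python. The function find_motif and the toy sequence below are hypothetical; they are not the programs referred to above, and the sketch reports 1-based positions with an inclusive end.

```python
def find_motif(sequence: str, motif: str) -> list[tuple[int, int]]:
    """Return 1-based (start, end) pairs, end inclusive, for every occurrence."""
    sequence, motif = sequence.upper(), motif.upper()
    hits = []
    pos = sequence.find(motif)
    while pos != -1:
        hits.append((pos + 1, pos + len(motif)))
        pos = sequence.find(motif, pos + 1)  # step by one so overlaps are kept
    return hits

toy = "ACGTTAGGGTTAGGGAC"  # made-up sequence, not the A1BG locus
trf_hits = find_motif(toy, "TTAGGG")
print(trf_hits)
```

Applied to the real sequence between ZSCAN22 and A1BG, the same function would return each start and end position instead of relying on a visual search.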

Homo sapiens genes containing these are found using Homo sapiens "TRF (TTAGGG repeat-binding factor)".

W boxes

Proximal promoters

Inverse W boxes occur in the negative direction of A1BG: 3'-GGTCAA-5' at 4416 and 3'-GGTCAA-5' at 4308.

A W box occurs in the positive direction of A1BG: 3'-CTGACC-5' and its complement at 4216. An inverse W box, 3'-GGTCAG-5', and its complement occur at 4270.

Distal promoters

A W box occurs 3'-CTGACC-5' at 3749, whereas 3'-CTGACT-5' at 17, 3'-TTGACT-5' at 130, 3'-TTGACT-5' at 307, and 3'-CTGACC-5' at 734 occur close to ZSCAN22, but 3'-CTGACT-5' at 1935 could be associated with ZSCAN22 or an unknown gene between it and A1BG, along with their complements.

W box inverses occur 3'-GGTCAG-5' at 1353 and 3'-AGTCAG-5' at 2101, 3'-GGTCAG-5' at 2221, 3'-AGTCAG-5' at 2608, 3'-AGTCAA-5' at 2614, and 3'-AGTCAG-5' at 2619 along with their complements.

W boxes in the positive direction occur 3'-CTGACC-5' at 1662, 3'-CTGACC-5' at 2213, 3'-TTGACC-5' at 2873, 3'-CTGACT-5' at 2945, and 3'-TTGACC-5' at 4018 that could be associated with A1BG, along with 3'-TTGACC-5' at 1953, 3'-CTGACT-5' at 2674, and 3'-TTGACT-5' at 3735.
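
The W box sequences listed in this section (CTGACC, CTGACT, TTGACC, and TTGACT) differ only at the first and last nucleotide. Assuming they can be summarized by the degenerate string YTGACY, where Y is the standard IUPAC code for C or T, the following Python sketch expands that string into its concrete hexamers. The consensus is inferred here from the listed hits and is not stated in the source; the function name expand is hypothetical.

```python
from itertools import product

# Standard IUPAC codes; only a small subset is included for this sketch.
IUPAC = {"A": "A", "C": "C", "G": "G", "T": "T", "Y": "CT", "R": "AG", "N": "ACGT"}

def expand(consensus: str) -> list[str]:
    """List every concrete sequence matched by a degenerate consensus."""
    choices = (IUPAC[base] for base in consensus.upper())
    return ["".join(combo) for combo in product(*choices)]

# Assumed consensus, inferred from the W boxes listed in this section.
w_boxes = expand("YTGACY")
print(w_boxes)
```

Each expanded hexamer can then be searched for individually or combined into a single pattern.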

See also

References

  1. HGNC2019 (10 December 2019). "ZSCAN22 zinc finger and SCAN domain containing 22 [ Homo sapiens (human) ]". U.S. National Library of Medicine, 8600 Rockville Pike, Bethesda MD, 20894 USA: National Center for Biotechnology Information. Retrieved 2019-12-18.
  2. HGNC2019 (10 December 2019). "MIR6806 microRNA 6806 [ Homo sapiens (human) ]". U.S. National Library of Medicine, 8600 Rockville Pike, Bethesda MD, 20894 USA: National Center for Biotechnology Information. Retrieved 2019-12-18.
  3. RefSeqJuly2008 (10 December 2019). "A1BG alpha-1-B glycoprotein [ Homo sapiens (human) ]". U.S. National Library of Medicine, 8600 Rockville Pike, Bethesda MD, 20894 USA: National Center for Biotechnology Information. Retrieved 2019-12-18.
  4. Tian M, Cui YZ, Song GH, Zong MJ, Zhou XY, Chen Y, Han JX (2008). "Proteomic analysis identifies MMP-9, DJ-1 and A1BG as overexpressed proteins in pancreatic juice from pancreatic ductal adenocarcinoma patients". BMC Cancer. 8: 241. doi:10.1186/1471-2407-8-241. PMC 2528014. PMID 18706098.
  5. "AceView: A1BG". Retrieved 2013-05-11.
  6. HGNC2019 (10 December 2019). "A1BG-AS1 A1BG antisense RNA 1 [ Homo sapiens (human) ]". U.S. National Library of Medicine, 8600 Rockville Pike, Bethesda MD, 20894 USA: National Center for Biotechnology Information. Retrieved 2019-12-18.
  7. HGNC2019 (10 December 2019). "ZNF497 zinc finger protein 497 [ Homo sapiens (human) ]". U.S. National Library of Medicine, 8600 Rockville Pike, Bethesda MD, 20894 USA: National Center for Biotechnology Information. Retrieved 2019-12-18.
  8. HGNC2019 (10 December 2019). "LOC100419840 zinc finger protein 446 pseudogene [ Homo sapiens (human) ]". U.S. National Library of Medicine, 8600 Rockville Pike, Bethesda MD, 20894 USA: National Center for Biotechnology Information. Retrieved 2019-12-18.
  9. HGNC2019 (10 December 2019). "LOC105372483 uncharacterized LOC105372483 [ Homo sapiens (human) ]". U.S. National Library of Medicine, 8600 Rockville Pike, Bethesda MD, 20894 USA: National Center for Biotechnology Information. Retrieved 2019-12-18.
  10. HGNC2019 (10 December 2019). "RNA5SP473 RNA, 5S ribosomal pseudogene 473 [ Homo sapiens (human) ]". U.S. National Library of Medicine, 8600 Rockville Pike, Bethesda MD, 20894 USA: National Center for Biotechnology Information. Retrieved 2019-12-18.
