
Editor-In-Chief: Henry A. Hoff

A gene is a distinct sequence of nucleotides forming part of a chromosome, the order of which determines the order of monomers in a polypeptide or nucleic acid molecule which a cell (or virus) may synthesize.

Theoretical genes

Def. a "theoretical unit of heredity of living organisms; a gene may take several values and in principle predetermines a precise trait of an organism's form (phenotype), such as hair color"[1] or a "segment of DNA or RNA from a cell's or an organism's genome, that may take several forms and thus parameterizes a phenomenon, in general the structure of a protein; locus"[1] is called a gene.

Here's a theoretical definition:

Def. a specific nucleotide sequence within a gene locus with its own transcription start site(s), introns, exons, and UTRs, that transcribes a specific RNA product is called an isoform, or gene isoform.

Def. any "of several different forms of the same protein, arising from either single nucleotide polymorphisms,[2] differential splicing of mRNA, or post-translational modifications (e.g. sulfation, glycosylation, etc.)"[3] is called an isoform.

Def. a "region of a transcribed gene present in the final functional RNA molecule"[4] is called an exon.

Def. a "portion of a split gene that is included in pre-RNA transcripts but is removed during RNA processing and rapidly degraded"[5] is called an intron.
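The exon and intron definitions above can be illustrated with a minimal sketch. The sequence and exon coordinates are made up for illustration, and the function names `splice` and `introns_between` are hypothetical. Real splicing depends on splice-site signals and the spliceosome, which this does not model.

```python
def splice(pre_rna, exons):
    """Join the exon segments of a pre-RNA, dropping everything between them.

    exons: list of (start, end) half-open, 0-based intervals, assumed to be
    sorted and non-overlapping (an assumption of this sketch).
    """
    return "".join(pre_rna[start:end] for start, end in exons)


def introns_between(exons):
    """Return the intervals lying between consecutive exons."""
    return [(a_end, b_start) for (_, a_end), (b_start, _) in zip(exons, exons[1:])]


# Hypothetical transcript: exons in upper case, the intron in lower case.
pre_rna = "AUGGCC" + "guaaguuuucag" + "GAUUAA"
exons = [(0, 6), (18, 24)]

mature = splice(pre_rna, exons)      # exons only, as in the final RNA
removed = introns_between(exons)     # the span removed during processing
```

Here `mature` holds only the exon bases, and `removed` lists the single intron interval that is cut out of the pre-RNA.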

Gene clusters

The NCBI Gene description for GeneID: 348, APOE (apolipoprotein E), contains this: "This gene maps to chromosome 19 in a cluster with the related apolipoprotein C1 and C2 genes."

Gene expressions

Gene expressions are suites of genes, and their isoforms, that appear to be biochemically involved in the appearance of a trait.

Although it is harder to regulate the transcription of genes with multiple transcription start sites, "variations in the expression of a constitutive gene would be minimized by the use of multiple start sites."[6]

Earlier "studies led to the design of a super core promoter (SCP) that contains a TATA, Inr, MTE, and DPE in a single promoter (Juven-Gershon et al., 2006b). The SCP is the strongest core promoter observed in vitro and in cultured cells and yields high levels of transcription in conjunction with transcriptional enhancers. These findings indicate that gene expression levels can be modulated via the core promoter."[6]

Gene regulations

Each gene, or each of its isoforms, is likely to have transcription factors that upregulate it and others that downregulate it. As each gene is investigated, these enhancers and inhibitors are noted as they are discovered.

For example, submitting "gene regulation" APOE human to the NCBI Gene database returns 28 genes and 21 mouse analogs. The first on the list is GeneID: 2099, ESR1 (estrogen receptor 1), whose description reads: "This gene encodes an estrogen receptor, a ligand-activated transcription factor composed of several domains important for hormone binding, DNA binding, and activation of transcription. [...] Estrogen and its receptors are essential for sexual development and reproductive function, but also play a role in other tissues such as bone. Estrogen receptors are also involved in pathological processes including breast cancer, endometrial cancer, and osteoporosis." The same description notes: "Alternative promoter usage and alternative splicing result in dozens of transcript variants, but the full-length nature of many of these variants has not been determined. [provided by RefSeq, Mar 2014]" The database also maintains the DNA sequence upstream of, downstream of, and through the entire gene locus, so that analysis of these variants can be attempted. The site lists gene interactions, six variants for three isoforms (1, 2, and 3), and ten experimental transcriptions.
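A search like the one above can also be expressed as a request to NCBI's E-utilities ESearch endpoint. The following is a minimal sketch that only builds the URL and does not send it. The helper name `gene_search_url` and the `retmax` value are assumptions for illustration. It is also an assumption that ESearch interprets the term the same way as the web search box, and the number of genes returned today may differ from the counts quoted above.

```python
from urllib.parse import urlencode

# Base address of the E-utilities ESearch endpoint.
ESEARCH = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"


def gene_search_url(term, retmax=50):
    """Build an ESearch URL against the Gene database without sending it."""
    params = {"db": "gene", "term": term, "retmax": retmax, "retmode": "json"}
    return ESEARCH + "?" + urlencode(params)


# The search term used in the text, quotes included.
url = gene_search_url('"gene regulation" APOE human')
```

Fetching `url` (for example with `urllib.request.urlopen`) would return a list of GeneIDs to look up one at a time.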

Gene similarities

There are genes on other chromosomes that are similar to each gene being considered. For example, GeneID: 338, Apolipoprotein B, is on chromosome 2.

Eukaryote genes

This diagram of a eukaryote cell shows that the DNA is located in the nucleus. Credit: Sponk.

Def. any "of the single-celled or multicellular organisms, of the taxonomic domain Eukaryota, whose cells contain at least one distinct nucleus"[7] is called a eukaryote.

Those specific genes that cause cells to contain at least one distinct nucleus are eukaryote genes.


The last common ancestor of monkeys and apes lived about 25 million years ago. Credit: Smithsonian Institution.

There are "more than 4 million sites where proteins bind to DNA to regulate genetic function, sort of like a switch."[8]

"Humans belong to the biological group known as Primates, and are classified with the great apes, one of the major groups of the primate evolutionary tree. Besides similarities in anatomy and behavior, our close biological kinship with other primate species is indicated by DNA evidence. It confirms that our closest living biological relatives are chimpanzees and bonobos, with whom we share many traits. But we did not evolve directly from any primates living today."[9]

"DNA also shows that our species and chimpanzees diverged from a common ancestor species that lived between 8 and 6 million years ago. The last common ancestor of monkeys and apes lived about 25 million years ago."[9]

Human DNA

This diagram of the structure of DNA shows the four bases; adenine, cytosine, guanine and thymine, and the location of the major and minor groove. Credit: Zephyris.

"[H]uman DNA has millions of on-off switches and complex networks that control the genes' activities. ... [A]t least 80% of the human genome is active, which opposed the previously held idea that most of the DNA are useless."[10]

"DNA contains genes, which hold the instructions for [life. But, these] take up only about 2 percent of the genome ... The human genome is made up of about 3 billion “letters” along strands that make up the familiar double helix structure of DNA. Particular sequences of these letters form genes, which tell cells how to make proteins. People have about 20,000 genes, but the vast majority of DNA lies outside of genes. ... [A]t least three-quarters of the genome is involved in making RNA [...] it appears to help regulate gene activity."[8]

Human genes

"Nine elements were tested, representing a sampling of elements present in the two gene deserts and DACH introns, spread over a 1530-kb region surrounding the human DACH's TATA box."[11]

Gene ID: 1602 is the human gene DACH1 (dachshund homolog 1), also known as DACH.[12] DACH1 has three isoforms: a, b, and c.

"[T]he human ... prostaglandin-endoperoxide-synthase-2 [gene contains] a canonical TATA box (nucleotide residues at positions -31 to -25 for the human gene)."[13] This is Gene ID: 5743.

The Drosophila hsp70 gene has a TATA box-containing promoter.[14] This suggests that GeneID: 3308, HSPA4 (heat shock 70kDa protein 4) [Homo sapiens], also known as hsp70,[15] may also have a TATA box in its core promoter.
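Locating a TATA box relative to a transcription start site, as in the positions -31 to -25 quoted above, can be sketched as a simple pattern search. The pattern `TATA[AT]A[AT]` is a simplified consensus chosen for this sketch, the upstream sequence is made up and is not the PTGS2 or HSPA4 promoter, and `find_tata` is a hypothetical helper name.

```python
import re

# Simplified TATA box pattern; an assumption of this sketch, since real
# promoters tolerate more variation than this.
TATA = re.compile(r"TATA[AT]A[AT]")


def find_tata(upstream):
    """Return (start, end, motif) hits relative to the transcription start site.

    upstream: the bases immediately 5' of the start site, so that its last
    base is position -1. Reported positions are negative and inclusive.
    """
    n = len(upstream)
    hits = []
    for m in TATA.finditer(upstream.upper()):
        hits.append((m.start() - n, m.end() - 1 - n, m.group()))
    return hits


# Made-up 40-base upstream region: 9 bases, a 7-base motif, then 24 bases.
upstream = "GCG" * 3 + "TATAAAA" + "GC" * 12
hits = find_tata(upstream)
```

With this made-up sequence, the single hit spans positions -31 to -25, the same coordinates reported for the human PTGS2 gene.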


The genetic information in a genome is held within genes, and the complete set of this information in an organism is called its genotype. A gene is a unit of heredity and is a region of DNA that influences a particular characteristic in an organism. Genes contain an open reading frame that can be transcribed, as well as regulatory sequences such as promoters and enhancers, which control the transcription of the open reading frame.

Only about 1.5% of the human genome consists of protein-coding exons.


Pseudogenes, which are copies of genes that have been disabled by mutation, are an abundant form of noncoding DNA in humans.[16] These sequences are usually just molecular fossils, although they can occasionally serve as raw genetic material for the creation of new genes through the process of gene duplication and divergence.[17]

About 2700 formerly active genes are now pseudogenes.


The deficiency of CpG dinucleotides in vertebrate genomes is attributed to the increased tendency of methylcytosines to spontaneously deaminate to thymine in genomes with CpG cytosine methylation.[18]


Cytosines in CpG dinucleotides can be methylated to form 5-methylcytosine. In mammals, methylating the cytosine within a gene can turn the gene off, a mechanism that is part of a larger field of science studying gene regulation that is called epigenetics. Enzymes that add a methyl group are called DNA methyltransferases.

In mammals, 70% to 80% of CpG cytosines are methylated.[19]

CpG dinucleotides have long been observed to occur with a much lower frequency in the sequence of vertebrate genomes than would be expected due to random chance. For example, the human genome has a 42% GC content, so cytosine and guanine each make up about 21% of its bases, and a cytosine followed by a guanine would be expected to occur 0.21 × 0.21 = 4.41% of the time. The frequency of CpG dinucleotides in human genomes is 1%, less than one-quarter of the expected frequency.
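The arithmetic above can be checked with a short sketch. It assumes, as the text does, that C and G each account for half of the GC content and that neighbouring bases are independent. The function names and the short test sequence are hypothetical.

```python
def expected_cpg_freq(gc_content):
    """Expected CpG frequency if neighbouring bases were independent.

    Assumes C and G each make up half of the GC content.
    """
    c = g = gc_content / 2
    return c * g


def observed_cpg_freq(seq):
    """Fraction of adjacent base pairs in seq that read C followed by G."""
    seq = seq.upper()
    return seq.count("CG") / (len(seq) - 1)


expected = expected_cpg_freq(0.42)   # 0.21 * 0.21
ratio = 0.01 / expected              # observed 1% over the expected value

# Made-up sequence, only to show the counting step.
toy = observed_cpg_freq("ACGTCGAA")
```

The ratio comes out a little under 0.23, which matches the statement that the observed frequency is less than one-quarter of the expected one.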

Unmethylated CpG sites can be detected by Toll-Like Receptor 9[20] (TLR 9) on plasmacytoid dendritic cells and B cells in humans. This is used to detect intracellular viral, fungal, and bacterial pathogen DNA.

Methylation is central to imprinting, along with histone modifications.[21] Most of the methylation occurs a short distance from the CpG islands (at "CpG island shores") rather than in the islands themselves.[22]

Methylation of CpG sites within the promoters of genes can lead to their silencing, a feature found in a number of human cancers (for example the silencing of tumor suppressor genes). In contrast, the hypomethylation of CpG sites has been associated with the over-expression of oncogenes within cancer cells.[23]


Alu elements are a common source of mutation in humans, but such mutations are often confined to non-coding regions where they have little discernible impact on the bearer.[24]

The mutagenic effect of Alu[25] and retrotransposons in general[26] has played a major role in the recent evolution of the human genome.

The first report of Alu-mediated recombination causing a prevalent inherited predisposition to cancer was a 1995 report about hereditary nonpolyposis colorectal cancer.[27]

"The human diseases caused by Alu insertions include":[28]

The following diseases have been associated with single-nucleotide DNA variations in Alu elements impacting transcription levels:[29]

The ACE gene, encoding angiotensin-converting enzyme, has two common variants, one with an Alu insertion (ACE-I) and one with the Alu deleted (ACE-D). This variation has been linked to changes in sporting ability: the presence of the Alu element is associated with better performance in endurance-oriented events (e.g. triathlons), whereas its absence is associated with strength- and power-oriented performance.[30]

The opsin gene duplication that resulted in the re-gaining of trichromacy in Old World primates (including humans) is flanked by an Alu element,[31] implicating Alu in the evolution of three-colour vision.


  1. Each gene may be expressed by one or more isoforms, usually depending on cell type.

References


  1. gene. San Francisco, California: Wikimedia Foundation, Inc. 6 August 2015. Retrieved 2015-08-24.
  2. SemperBlotto (6 January 2007). isoform. San Francisco, California: Wikimedia Foundation, Inc. Retrieved 2 December 2018.
  3. (30 November 2008). isoform. San Francisco, California: Wikimedia Foundation, Inc. Retrieved 2 December 2018.
  4. TransControl~enwiktionary (22 February 2008). exon. San Francisco, California: Wikimedia Foundation, Inc. Retrieved 2 December 2018.
  5. SemperBlotto (9 March 2006). intron. San Francisco, California: Wikimedia Foundation, Inc. Retrieved 2 December 2018.
  6. Tamar Juven-Gershon and James T. Kadonaga (2010). "Regulation of gene expression via the core promoter and the basal transcriptional machinery". Developmental Biology. 339 (2): 225–9. doi:10.1016/j.ydbio.2009.08.009. Retrieved 2016-01-16.
  7. eukaryote. San Francisco, California: Wikimedia Foundation, Inc. August 28, 2012. Retrieved 2012-09-29.
  8. Malcolm Ritter (September 6, 2012). Far from being mostly junk, human DNA is ‘a jungle’ of complex activity, huge project shows. The Washington Post. Retrieved 2012-09-06.
  9. Homo sapiens (3 June 2015). Primate Family Tree. Washington, DC USA: Smithsonian Institution. Retrieved 2015-06-09.
  10. Bryan McBournie (September 6, 2012). Human genome study could unlock the biology of disease. Sigma Xi. Retrieved 2012-09-06.
  11. Marcelo A. Nobrega, Ivan Ovcharenko, Veena Afzal, and Edward M. Rubin (2003). "Scanning human gene deserts for long-range enhancers". Science. 302 (5644): 413. doi:10.1126/science.1088328. PMID 14563999. Retrieved 2012-12-26.
  12. HGNC (December 20, 2012). DACH1 dachshund homolog 1 (Drosophila) [ Homo sapiens ]. Bethesda, Maryland, USA: ncbi.nlm.nih. Retrieved 2012-12-26.
  13. Tetsuya Kosaka, Atsuro Miyata, Hayato Ihara, Shuntaro Hara, Tamiko Sugimoto, Osamu Takeda, Ei-ichi Takahashi, Tadashi Tanabe (1994). "Characterization of the human gene (PTGS2) encoding prostaglandin‐endoperoxide synthase 2". European Journal of Biochemistry. 221 (3): 889–97. doi:10.1111/j.1432-1033.1994.tb18804.x. Retrieved 2012-12-26.
  14. Thomas W. Burke and James T. Kadonaga (1997). "The downstream core promoter element, DPE, is conserved from Drosophila to humans and is recognized by TAFII60 of Drosophila". Genes & Development. 11 (22): 3020–31. doi:10.1101/gad.11.22.3020. PMC 316699. PMID 9367984.
  15. HGNC (February 3, 2013). HSPA4 heat shock 70kDa protein 4 [ Homo sapiens ]. 8600 Rockville Pike, Bethesda MD, 20894 USA: National Center for Biotechnology Information, U.S. National Library of Medicine. Retrieved 2013-02-07.
  16. Harrison P, Hegyi H, Balasubramanian S, Luscombe N, Bertone P, Echols N, Johnson T, Gerstein M (2002). "Molecular Fossils in the Human Genome: Identification and Analysis of the Pseudogenes in Chromosomes 21 and 22". Genome Res. 12 (2): 272–80. doi:10.1101/gr.207102. PMC 155275. PMID 11827946.
  17. Harrison P, Gerstein M (2002). "Studying genomes through the aeons: protein families, pseudogenes and proteome evolution". J Mol Biol. 318 (5): 1155–74. doi:10.1016/S0022-2836(02)00109-2. PMID 12083509.
  18. Scarano E, Iaccarino M, Grippo P, Parisi E (1967). "The heterogeneity of thymine methyl group origin in DNA pyrimidine isostichs of developing sea urchin embryos". Proc. Natl. Acad. Sci. USA. 57 (5): 1394–400. doi:10.1073/pnas.57.5.1394. PMC 224485. PMID 5231746.
  19. Jabbari K, Bernardi G (2004). "Cytosine methylation and CpG, TpG (CpA) and TpA frequencies". Gene. 333: 143–9. doi:10.1016/j.gene.2004.02.043. PMID 15177689.
  20. Ramirez-Ortiz ZG, Specht CA, Wang JP, Lee CK, Bartholomeu DC, Gazzinelli RT, Levitz SM (2008). "Toll-like receptor 9-dependent immune activation by unmethylated CpG motifs in Aspergillus fumigatus DNA". Infect Immun. 76 (5): 2123–9. doi:10.1128/IAI.00047-08. PMC 2346696. PMID 18332208.
  21. Feil R, Berger F (2007). "Convergent evolution of genomic imprinting in plants and mammals". Trends Genet. 23 (4): 192–9. doi:10.1016/j.tig.2007.02.004. PMID 17316885.
  22. Irizarry RA, Ladd-Acosta C, Wen B, Wu Z, Montano C, Onyango P, Cui H, Gabo K, Rongione M, Webster M, Ji H, Potash JB, Sabunciyan S, Feinberg AP (2009). "The human colon cancer methylome shows similar hypo- and hypermethylation at conserved tissue-specific CpG island shores". Nature Genetics. 41 (2): 178–86. PMID 19151715.
  23. Jones PA, Laird PW (1999). "Cancer epigenetics comes of age". Nat. Genet. 21 (2): 163–7. doi:10.1038/5947. PMID 9988266.
  24. International Human Genome Sequencing Consortium (2001). "Initial sequencing and analysis of the human genome". Nature. 409 (6822): 860–921. doi:10.1038/35057062. PMID 11237011.
  25. Shen S, Lin L, Cai JJ, Jiang P, Kenkel EJ, Stroik MR, Sato S, Davidson BL, Xing Y (2011). "Widespread establishment and regulatory impact of Alu exons in human genes". PNAS. 108 (7): 2837–42. doi:10.1073/pnas.1012834108.
  26. Cordaux R, Batzer MA (2009). "The impact of retrotransposons on human genome evolution" (PDF). Nature Reviews Genetics. 10: 691–703. doi:10.1038/nrg2640. PMC 2884099. PMID 19763152.
  27. Nyström-Lahti M, Kristo P, Nicolaides NC; et al. (1995). "Founding mutations and Alu-mediated recombination in hereditary colon cancer". Nat. Med. 1 (11): 1203–6. doi:10.1038/nm1195-1203. PMID 7584997.
  28. Batzer MA, Deininger PL (2002). "Alu repeats and human genomic diversity" (PDF). Nat. Rev. Genet. 3 (5): 370–9. doi:10.1038/nrg798. PMID 11988762.
  29. SNPedia: SNP in the promoter region of the myeloperoxidase MPO gene.
  30. Puthucheary Z, Skipworth J, Rawal J, Loosemore M, Van Someren K, Montgomery H (2011). "The ACE Gene and Human Performance: 12 Years On". Sports Medicine. 41: 433–448. doi:10.2165/11588720-000000000-00000. PMID 21615186.
  31. Dulai KS, Von Dornum M, Mollon JD, Hunt DM (1999). "The Evolution of Trichromatic Color Vision by Opsin Gene Duplication in New World and Old World Primates". Genome Research. 9 (7): 629–638. doi:10.1101/gr.9.7.629. PMID 10413401.
