Pseudogene

An illustration of the mutations that can cause pseudogenes. The human sequence is of a pseudogene in the olfactory gene family. The chimpanzee sequence is the functional ortholog. Key differences are highlighted.

Pseudogenes are dysfunctional relatives of

  • Gerstein M, Zheng D (August 2006). "The real life of pseudogenes". Sci. Am. 295 (2): 48–55.  
  • Torrents D, Suyama M, Zdobnov E, Bork P (December 2003). "A Genome-Wide Survey of Human Pseudogenes". Genome Res. 13 (12): 2559–67.  
  • Bischof JM, Chiang AP, Scheetz TE, et al. (June 2006). "Genome-wide identification of pseudogenes capable of disease-causing gene conversion". Hum. Mutat. 27 (6): 545–52.  

Further reading

  • pseudogene interaction database, miRNA-pseudogene and protein-pseudogene interaction maps database
  • Yale University pseudogene database
  • Hoppsigen database (homologous processed pseudogenes)

External links

  1. ^ Vanin EF (1985). "Processed pseudogenes: characteristics and evolution". Annu. Rev. Genet. 19: 253–72.  
  2. ^ Poliseno L (2010). "A coding-independent function of gene and pseudogene mRNAs regulates tumour biology". Nature 465: 1033–1038.  
  3. ^ Herron, Jon C.; Freeman, Scott (2007). Evolutionary analysis (4th ed.). Upper Saddle River, NJ: Pearson Prentice Hall.  
  4. ^ Jacq C, Miller JR, Brownlee GG (September 1977). "A pseudogene structure in 5S DNA of Xenopus laevis". Cell 12 (1): 109–20.  
  5. ^ a b c Zheng D, Frankish A, Baertsch R, Kapranov P, Reymond A, Choo SW, Lu Y, Denoeud F, Antonarakis SE, Snyder M, Ruan Y, Wei CL, Gingeras TR, Guigó R, Harrow J, Gerstein MB (June 2007). "Pseudogenes in the ENCODE regions: Consensus annotation, analysis of transcription, and evolution". Genome Res. 17 (6): 839–51.  
  6. ^ Mighell AJ, Smith NR, Robinson PA, Markham AF (February 2000). "Vertebrate pseudogenes". FEBS Lett. 468 (2–3): 109–14.  
  7. ^ Jurka J (December 2004). "Evolutionary impact of human Alu repetitive elements". Current Opinion in Genetics & Development 14 (6): 603–8.  
  8. ^ Dewannieux M, Heidmann T (2005). "LINEs, SINEs and processed pseudogenes: parasitic strategies for genome modeling". Cytogenet. Genome Res. 110 (1–4): 35–48.  
  9. ^ Dewannieux M, Esnault C, Heidmann T (September 2003). "LINE-mediated retrotransposition of marked Alu sequences". Nat. Genet. 35 (1): 41–8.  
  10. ^ Graur D, Shuali Y, Li WH (April 1989). "Deletions in processed pseudogenes accumulate faster in rodents than in humans". J. Mol. Evol. 28 (4): 279–85.  
  11. ^ a b Baertsch R, Diekhans M, Kent J, Haussler D, Brosius J (October 2008). "Retrocopy contributions to the evolution of the human genome". BMC Genomics 9: 446–54.  
  12. ^ Pavlícek A, Paces J, Zíka R, Hejnar J (October 2002). "Length distribution of long interspersed nucleotide elements (LINEs) and processed pseudogenes of human endogenous retroviruses: implications for retrotransposition and pseudogene detection". Gene 300 (1–2): 189–94.  
  13. ^ Max EE (2003-05-05). "Plagiarized Errors and Molecular Genetics".  
  14. ^ a b Lynch M, Conery JS (November 2000). "The evolutionary fate and consequences of duplicate genes". Science 290 (5494): 1151–5.  
  15. ^ Walsh JB (January 1995). "How often do duplicated genes evolve new functions?". Genetics 139 (1): 421–8.  
  16. ^ Lynch M, O'Hely M, Walsh B, Force A (December 2001). "The probability of preservation of a newly arisen gene duplicate". Genetics 159 (4): 1789–804.  
  17. ^ Harrison PM, Hegyi H, Balasubramanian S, Luscombe NM, Bertone P, Echols N, Johnson T, Gerstein M (February 2002). "Molecular fossils in the human genome: identification and analysis of the pseudogenes in chromosomes 21 and 22". Genome Res. 12 (2): 272–80.  
  18. ^ a b Zhang J (2003). "Evolution by gene duplication: an update". Trends in Ecology and Evolution 18 (6): 292–298.  
  19. ^ Nishikimi M, Kawai T, Yagi K (October 1992). "Guinea pigs possess a highly mutated gene for L-gulono-gamma-lactone oxidase, the key enzyme for L-ascorbic acid biosynthesis missing in this species". J. Biol. Chem. 267 (30): 21967–72.  
  20. ^ Nishikimi M, Fukuyama R, Minoshima S, Shimizu N, Yagi K (May 1994). "Cloning and chromosomal mapping of the human nonfunctional gene for L-gulono-gamma-lactone oxidase, the enzyme for L-ascorbic acid biosynthesis missing in man". J. Biol. Chem. 269 (18): 13685–8.  
  21. ^ Xue Y, Daly A, Yngvadottir B, Liu M, Coop G, Kim Y, Sabeti P, Chen Y, Stalker J, Huckle E, Burton J, Leonard S, Rogers J, Tyler-Smith C (April 2006). "Spread of an Inactive Form of Caspase-12 in Humans Is Due to Recent Positive Selection". American Journal of Human Genetics 78 (4): 659–70.  
  22. ^ van Baren MJ, Brent MR (May 2006). "Iterative gene prediction and pseudogene removal improves genome annotation". Genome Res. 16 (5): 678–85.  
  23. ^ Harrison PM, Milburn D, Zhang Z, Bertone P, Gerstein M (February 2003). "Identification of pseudogenes in the Drosophila melanogaster genome". Nucleic Acids Res. 31 (3): 1033–7.  
  24. ^ Long M, Langley CH (April 1993). "Natural selection and the origin of jingwei, a chimeric processed functional gene in Drosophila". Science 260 (5104): 91–5.  
  25. ^ Hirotsune S, Yoshida N, Chen A, Garrett L, Sugiyama F, Takahashi S, Yagami K, Wynshaw-Boris A, Yoshiki A (May 2003). "An expressed pseudogene regulates the messenger-RNA stability of its homologous coding gene". Nature 423 (6935): 91–6.  
  26. ^ Balakirev ES, Ayala FJ (2003). "Pseudogenes: are they "junk" or functional DNA?". Annu. Rev. Genet. 37: 123–51.  
  27. ^ Hirotsune S, Yoshida N, Chen A, Garrett L, Sugiyama F, Takahashi S, Yagami K, Wynshaw-Boris A, Yoshiki A (November 2003). "Addendum: An Expressed Pseudogene Regulates the messenger-RNA Stability of Its Homologous Coding Gene". Nature 426 (6962): 100.  
  28. ^ Gray TA, Wilson A, Fortin PJ, Nicholls RD (August 2006). "The putatively functional Mkrn1-p1 pseudogene is neither expressed nor imprinted, nor does it regulate its source gene in trans". Proc. Natl. Acad. Sci. U.S.A. 103 (32): 12039–44.  
  29. ^ Betrán E, Wang W, Jin L, Long M (May 2002). "Evolution of the phosphoglycerate mutase processed gene in human and chimpanzee revealing the origin of a new primate gene". Mol. Biol. Evol. 19 (5): 654–63.  
  30. ^ Tam OH, Aravin AA, Stein P, Girard A, Murchison EP, Cheloufi S, Hodges E, Anger M, Sachidanandam R, Schultz RM, Hannon GJ (May 2008). "Pseudogene-derived small interfering RNAs regulate gene expression in mouse oocytes". Nature 453 (7194): 534–8.  
  31. ^ a b Watanabe T, Totoki Y, Toyoda A, Kaneda M, Kuramochi-Miyagawa S, Obata Y, Chiba H, Kohara Y, Kono T, Nakano T, Surani MA, Sakaki Y, Sasaki H (May 2008). "Endogenous siRNAs from naturally formed dsRNAs regulate transcripts in mouse oocytes". Nature 453 (7194): 539–43.  
  32. ^ Poliseno L, Salmena L, Zhang J, Carver B, Haveman WJ, Pandolfi PP (June 2010). "A coding-independent function of gene and pseudogene mRNAs regulates tumour biology". Nature 465 (7301): 1033–8.  
  33. ^ Svensson O, Arvestad L, Lagergren J (May 2006). "Genome-Wide Survey for Biologically Functional Pseudogenes". PLoS Comput. Biol. 2 (5): e46.  
  34. ^ Williams DL, Slayden RA, Amin A, et al. (2009). "Implications of high level pseudogene transcription in Mycobacterium leprae". BMC Genomics 10: 397.  
  35. ^ Koch AL (October 1972). "Enzyme evolution. I. The importance of untranslatable intermediates". Genetics 72 (2): 297–316.  
  36. ^ Sassi SO, Braun EL, Benner SA (April 2007). "The evolution of seminal ribonuclease: pseudogene reactivation or multiple gene inactivation events?". Mol. Biol. Evol. 24 (4): 1012–24.  
  37. ^ Trabesinger-Ruef N, Jermann T, Zankel T, Durrant B, Frank G, Benner SA (March 1996). "Pseudogenes in ribonuclease evolution: a source of new biomacromolecular function?". FEBS Lett. 382 (3): 319–22.  
  38. ^ a b Sharon D, Glusman G, Pilpel Y, Khen M, Gruetzner F, Haaf T, Lancet D (October 1999). "Primate evolution of an olfactory receptor cluster: diversification by gene conversion and recent emergence of pseudogenes". Genomics 61 (1): 24–36.  
  39. ^ Pâques F, Haber JE (June 1999). "Multiple pathways of recombination induced by double-strand breaks in Saccharomyces cerevisiae". Microbiol. Mol. Biol. Rev. 63 (2): 349–404.  
  40. ^ Gerstein M, Zheng D (August 2006). "The real life of pseudogenes". Sci. Am. 295 (2): 48–55.  
  41. ^ Scarpulla RC (November 1984). "Processed pseudogenes for rat cytochrome c are preferentially derived from one of three alternate mRNAs". Mol. Cell. Biol. 4 (11): 2279–88.  
  42. ^ Dudov KP, Perry RP (June 1984). "The gene family encoding the mouse ribosomal protein L32 contains a uniquely expressed intron-containing gene and an unmutated processed gene". Cell 37 (2): 457–68.  
  43. ^ Fu LM, Shinnick TM (2007). "Genome-wide analysis of intergenic regions of Mycobacterium tuberculosis H37Rv using Affymetrix GeneChips". EURASIP J Bioinform Syst Biol 2007 (1): 23054.  
  44. ^ Rozowsky JS, Newburger D, Sayward F, Wu J, Jordan G, Korbel JO, Nagalakshmi U, Yang J, Zheng D, Guigó R, Gingeras TR, Weissman S, Miller P, Snyder M, Gerstein MB (June 2007). "The DART classification of unannotated transcription within the ENCODE regions: associating transcription with known and novel loci". Genome Res. 17 (6): 732–45.  
  45. ^ Tam OH, Aravin AA, Stein P, Girard A, Murchison EP, Cheloufi S, Hodges E, Anger M, Sachidanandam R, Schultz RM, Hannon GJ (May 2008). "Pseudogene-derived small interfering RNAs regulate gene expression in mouse oocytes". Nature 453 (7194): 534–8.  

References

See also

In addition, the seminal cytochrome c pseudogene[41] and the mouse L32 ribosomal protein pseudogene rpL32-4A have been suggested to be potentially functional.[42] Recent experiments found that a considerable fraction of the intergenic regions in a bacterial genome is actively transcribed.[43] In the ENCODE project, about 20% of the transcriptionally active regions (TARs) were found to come from previously unidentified "potential unborn genes", which suggests that functional pseudogenes may lie within these regions.[44] Studies of mouse oocytes are useful for establishing whether pseudogenes are transcribed into RNA and whether those transcripts are functional: small interfering RNAs (siRNAs) derived from pseudogenes were found to regulate gene expression.[45][31] That some pseudogenes are "dead" yet retain some function supports the view that they are not simply "junk DNA". As genome annotation improves, the evolutionary history of pseudogenes should become clearer.

The era of molecular paleontology is just beginning. The surface of the pseudogene strata has barely been studied, and further research is likely to identify many more pseudogenes. Large-scale identification of pseudogenes by data mining is a dynamic process. Recently generated pseudogenes are readily identified by current techniques, which rely heavily on sequence comparison to well-characterized genes, whereas ancient, decayed pseudogenes escape detection. Characterization of pseudogenes will likely improve as the sequence and annotation of the human genome are refined and updated. Recent evidence points to the possibility of pseudogene resurrection, in which a dead gene becomes a living one and again produces a functional protein.[40]

Several examples support such resurrection. The bovine seminal ribonuclease gene, which had lain dormant as a pseudogene for about 20 million years, appears to have been resurrected into a functional gene, and gene conversion is believed to be the cause.[37] The large group of olfactory receptor (OR) pseudogenes in metazoans, where 60% of the ORs in the human genome are pseudogenic, may likewise be resurrectable through gene conversion events. A cluster on chromosome 17 that contains 16 OR genes and 6 OR pseudogenes appears to have undergone many (20) gene conversion events over the course of primate evolution.[38] These gene conversion events in OR gene clusters may help to diversify binding capability at the odorant binding site.[38] Finally, pseudogene resurrection also underlies the diversity of immunoglobulin heavy chain variable-region gene segments in the chicken, which appears to be generated by gene conversion of a single functional gene with more than 80 pseudogenic gene segments.[39]

In rare cases, duplicated pseudogenic DNA can be resurrected to encode a functional protein, an occasional evolutionary event that may enable sampling of more sequence space for a protein or protein family.[18] Pseudogenes or parts of pseudogenes may be re-utilized after they have drifted randomly, free from selection pressure, for some period of evolution. Koch was the first to postulate such "untranslatable intermediates" in the evolution of proteins.[35] Occasionally, this mechanism may provide a shorter evolutionary route to another favorable evolutionary energetic minimum, although one would generally expect it to produce unviable or unfavorable leaps in sequence space. Pseudogene resurrection allows a longer time to search sequence space, but it is believed to produce proteins with new functions only rarely. Lesions could be repaired by the reinsertion of a deleted segment, the in-frame removal of an inserted segment, or other improbable events such as gene conversion. Conversion of a pseudogene with a functional gene as donor might improve the probability of pseudogene reactivation, provided that enough of the pseudogene sequence is preserved to maintain the benefit of the expanded sequence space explored after duplication.[36]

Gene resurrection

Quite a few pseudogenes can be transcribed, either from their own promoter if it is still intact or, in some cases, from the promoter of a nearby gene; this expression of pseudogenes may be tissue-specific.[5] In the bacterium Mycobacterium leprae, 43% of its 1,133 pseudogenes are transcribed (as opposed to 49% overall and 57% of its ORFs[34]). However, transcription alone does not make them "functional" in the sense that these genes or proteins have an activity that benefits the organism.

Transcription

A bioinformatics analysis has shown that processed pseudogenes can be inserted into introns of annotated genes and be incorporated into alternatively spliced transcripts.[11] This analysis showed strong evidence for transcription of 726 such retrogenes. However, their function was not studied experimentally.

Svensson et al. have published a genome-wide survey of functional pseudogenes.[33]

Surveys

  1. The Drosophila jingwei gene, a functional, chimeric gene which was once thought to be a processed pseudogene.[24]
  2. Makorin1 (MKRN1). In 2003, Hirotsune et al. identified a retrotransposed pseudogene whose transcript purportedly plays a trans-regulatory role in the expression of its homologous gene, Makorin1 (MKRN1) (see also RING finger domain and ubiquitin ligases), and suggested this as a general model under which pseudogenes may play an important biological role.[25] Hirotsune's report prompted two molecular biologists to carefully review scientific literature on the subject of pseudogenes. To the surprise of many, they found a number of examples in which pseudogenes play a role in gene regulation and expression,[26] forcing Hirotsune's group to rescind their claim that they were the first to identify pseudogene function.[27] Furthermore, the original findings of Hirotsune et al. concerning Makorin1 have recently been strongly contested;[28] thus, the possibility that some pseudogenes could have important biological functions was disputed.
  3. Phosphoglycerate mutase 3 (PGAM3P). A processed pseudogene called phosphoglycerate mutase 3 (PGAM3P) actually produces a functional protein.[29]
  4. siRNAs. Some endogenous siRNAs appear to be derived from pseudogenes, and thus some pseudogenes play a role in regulating protein-coding transcripts.[30][31]
  5. piRNAs. Some Piwi-interacting RNAs (piRNAs) are derived from pseudogenes located in piRNA clusters. Those pseudogenes regulate their founding source genes via the piRNA pathway in mammalian testes.
  6. PTENP1 and KRAS1P (KRASP1). In June 2010, Nature published an article showing that the mRNA levels of the tumour suppressor PTEN and oncogenic KRAS are affected by their pseudogenes PTENP1 and KRASP1. This discovery demonstrated an miRNA decoy function for pseudogenes and identified their transcripts as biologically active units in tumour biology. It thus attributed a novel biological role to expressed pseudogenes, which can regulate coding gene expression, and revealed a non-coding function for mRNAs in disease progression.[32]

By definition, pseudogenes lack a functioning gene product. However, the classification of pseudogenes generally relies on computational analysis of genomic sequences using complex algorithms.[23] This has led to the incorrect identification of pseudogenes. Examples include

Potential function

It has also been shown that the parent sequences that give rise to processed pseudogenes lose their coding potential faster than those giving rise to non-processed pseudogenes.[5]

Processed pseudogenes often pose a problem for gene prediction programs, which can misidentify them as real genes or exons. It has been proposed that identification of processed pseudogenes can help improve the accuracy of gene prediction methods.[22]

Pseudogenes can complicate molecular genetic studies. For example, a researcher who wants to amplify a gene by PCR may simultaneously amplify a pseudogene that shares similar sequences. This is known as PCR bias or amplification bias. Similarly, pseudogenes are sometimes annotated as genes in genome sequences.

Disabled genes, or unitary pseudogenes. Various mutations can stop a gene from being successfully transcribed or translated, and a gene may become nonfunctional or deactivated if such a mutation becomes fixed in the population. This is the same mechanism by which non-processed pseudogenes become deactivated; the difference in this case is that the gene was not duplicated before becoming disabled. Normally, such gene deactivation would be unlikely to become fixed in a population, but various population effects, such as genetic drift, a population bottleneck, or in some cases natural selection, can lead to fixation. The classic example of a unitary pseudogene is the gene that presumably coded for the enzyme L-gulono-γ-lactone oxidase (GULO) in primates. In all mammals studied besides primates (except guinea pigs), GULO aids in the biosynthesis of ascorbic acid (vitamin C), but it exists as a disabled gene (GULOP) in humans and other primates.[19][20] Another, more recent example of a disabled gene links the deactivation of the caspase 12 gene (through a nonsense mutation) to positive selection in humans.[21]

Disabled

Non-processed (or duplicated) pseudogenes. The loss of function of one duplicated copy usually has little effect on an organism's fitness, since an intact functional copy still exists. According to some evolutionary models, shared duplicated pseudogenes indicate the evolutionary relatedness of humans and the other primates.[13] If pseudogenization is due to gene duplication, it usually occurs in the first few million years after the duplication, provided the gene has not been subjected to any selection pressure.[14] Gene duplication generates functional redundancy, and it is not normally advantageous to carry two identical genes. Mutations that disrupt the structure or function of either of the two genes are not deleterious and will not be removed by selection. As a result, the mutated gene gradually becomes a pseudogene and will be either unexpressed or functionless. This evolutionary fate is shown by population genetic modeling[15][16] and also by genome analysis.[14][17] Depending on the evolutionary context, these pseudogenes will either be deleted or become so distinct from the parental genes that they are no longer identifiable. Relatively young pseudogenes can be recognized by their sequence similarity.[18]

Non-processed

Processed (or retrotransposed) pseudogenes. In higher eukaryotes, particularly mammals, retrotransposition is a fairly common event that has had a huge impact on the composition of the genome. For example, between 30% and 44% of the human genome consists of repetitive elements such as SINEs and LINEs (see retrotransposons).[7][8] In the process of retrotransposition, a portion of the mRNA transcript of a gene is spontaneously reverse transcribed back into DNA and inserted into chromosomal DNA. Although retrotransposons usually create copies of themselves, it has been shown in an in vitro system that they can create retrotransposed copies of random genes, too.[9] Once these pseudogenes are inserted back into the genome, they usually contain a poly-A tail and have usually had their introns spliced out; both are hallmark features of cDNAs. Because they are derived from a mature mRNA product, processed pseudogenes also lack the upstream promoters of normal genes; thus, they are considered "dead on arrival", becoming non-functional pseudogenes immediately upon the retrotransposition event.[10] Nevertheless, these insertions occasionally contribute exons to existing genes, usually via alternatively spliced transcripts.[11] A further characteristic of processed pseudogenes is common truncation of the 5' end relative to the parent sequence, which results from the relatively non-processive retrotransposition mechanism that creates them.[12]
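One of the hallmarks described above, the poly-A tail, can be checked with a few lines of Python. This is only a minimal sketch: the function names and the `min_tail` threshold of 10 bases are hypothetical choices made for illustration, and a real classifier would also weigh the absence of introns and 5' truncation, and would tolerate a tail that has partly decayed by mutation.

```python
def trailing_poly_a_length(seq: str) -> int:
    """Length of the uninterrupted run of A bases at the 3' end of seq."""
    s = seq.upper()
    return len(s) - len(s.rstrip("A"))


def has_poly_a_tail(seq: str, min_tail: int = 10) -> bool:
    """True if the 3' end carries at least min_tail consecutive A bases.

    min_tail is an illustrative assumption, not a published cutoff.
    """
    return trailing_poly_a_length(seq) >= min_tail


candidate = "ATGCCC" + "A" * 12  # toy sequence with a 12-base poly-A run
print(trailing_poly_a_length(candidate))  # 12
print(has_poly_a_tail(candidate))         # True
```

A sequence that ends in any other base, such as "ATGCCCAAAG", yields a tail length of 0 under this strict definition.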

Processed

There are three main types of pseudogenes, all with distinct mechanisms of origin and characteristic features. The classifications of pseudogenes are as follows:

Types and origin

Pseudogenes for RNA genes are usually more difficult to discover as they do not need to be translated and thus do not have "reading frames".

  1. Homology is implied by sequence identity between the DNA sequences of the pseudogene and parent gene. After aligning the two sequences, the percentage of identical base pairs is computed. A high sequence identity (usually between 40% and 100%) means that it is highly likely that these two sequences diverged from a common ancestral sequence (are homologous), and highly unlikely that these two sequences have evolved independently (see Convergent evolution).
  2. Nonfunctionality can manifest itself in many ways. Normally, a gene must go through several steps to yield a fully functional protein: transcription, pre-mRNA processing, translation, and protein folding are all required parts of this process. If any of these steps fails, then the sequence may be considered nonfunctional. In high-throughput pseudogene identification, the most commonly identified disablements are premature stop codons and frameshifts, which almost universally prevent the translation of a functional protein product.
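The two criteria above can be illustrated with a short Python sketch that works on a pair of already-aligned sequences. It is a simplified illustration, not the method of any particular annotation pipeline: the function names are hypothetical, gaps are assumed to be written as "-", gap columns are assumed to be excluded from the identity count, and an indel is treated as a candidate frameshift whenever its length is not a multiple of three.

```python
import re


def percent_identity(aligned_gene: str, aligned_pseudo: str) -> float:
    """Percentage of identical bases over aligned columns without gaps."""
    if len(aligned_gene) != len(aligned_pseudo):
        raise ValueError("aligned sequences must have the same length")
    compared = 0
    identical = 0
    for g, p in zip(aligned_gene.upper(), aligned_pseudo.upper()):
        if g == "-" or p == "-":
            continue  # assumption: gap columns are excluded from the count
        compared += 1
        if g == p:
            identical += 1
    return 100.0 * identical / compared if compared else 0.0


def frameshift_indels(aligned_gene: str, aligned_pseudo: str) -> list:
    """List (kind, column, length) for gap runs whose length is not a multiple of 3."""
    found = []
    # A gap in the pseudogene row is a deletion relative to the parent gene;
    # a gap in the gene row is an insertion in the pseudogene.
    for kind, row in (("deletion", aligned_pseudo), ("insertion", aligned_gene)):
        for run in re.finditer(r"-+", row):
            length = run.end() - run.start()
            if length % 3 != 0:
                found.append((kind, run.start(), length))
    return sorted(found, key=lambda item: item[1])


gene = "ATGGCCAAATTTGGG"
pseudo = "ATGGC-AAATTTGGG"  # toy pseudogene with a one-base deletion
print(percent_identity(gene, pseudo))   # 100.0
print(frameshift_indels(gene, pseudo))  # [('deletion', 5, 1)]
```

The toy pair shows why both criteria are needed: the sequences are identical at every compared column, yet the single-base deletion shifts the reading frame downstream.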

Pseudogenes are characterized by a combination of homology to a known gene and nonfunctionality. That is, although every pseudogene has a DNA sequence that is similar to some functional gene, it is nonetheless unable to produce a functional final protein product.[6] Pseudogenes are sometimes difficult to identify and characterize in genomes, because the two requirements of homology and nonfunctionality are usually inferred from sequence alignments rather than biologically proven.

Properties

Contents

  • Properties
  • Types and origin
    • Processed
    • Non-processed
    • Disabled
  • Potential function
    • Surveys
    • Transcription
  • Gene resurrection
  • See also
  • References
  • External links
  • Further reading

Because pseudogenes are generally thought of as the last stop for genomic material that is to be removed from the genome,[5] they are often labeled as junk DNA. We can define a pseudogene operationally as a fragment of nucleotide sequence that resembles a known protein's domains but with stop codons or frameshifts mid-domain. Nonetheless, pseudogenes contain fascinating biological and evolutionary histories within their sequences. This is due to a pseudogene's shared ancestry with a functional gene: in the same way that Darwin thought of two species as possibly having a shared common ancestry followed by millions of years of evolutionary divergence (see speciation), a pseudogene and its associated functional gene also share a common ancestor and have diverged as separate genetic entities over millions of years.
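The stop-codon part of this operational definition can be sketched in Python. The example assumes the standard genetic code (stop codons TAA, TAG and TGA) and a sequence that is already in the reading frame of the parent gene; the function name is hypothetical, and frameshift detection would need an alignment to the parent gene and is not covered here.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}  # stop codons of the standard genetic code


def premature_stops(coding_seq: str) -> list[int]:
    """Return 0-based offsets of in-frame stop codons before the final codon."""
    seq = coding_seq.upper()
    # Start of the last complete codon; a stop there is the normal terminator.
    last_codon_start = len(seq) - len(seq) % 3 - 3
    return [
        i
        for i in range(0, last_codon_start, 3)
        if seq[i:i + 3] in STOP_CODONS
    ]


print(premature_stops("ATGAAACCCTGA"))     # [] : only the terminal stop
print(premature_stops("ATGAAATAGCCCTGA"))  # [6] : TAG interrupts the frame
```

An empty list does not show that a sequence is functional; it only means that this one kind of disablement was not found.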

Although some do not have introns or promoters (these pseudogenes are copied from mRNA and incorporated into the chromosome and are called processed pseudogenes),[3] most have some gene-like features such as promoters, CpG islands, and splice sites. They are different from normal genes due to a lack of protein-coding ability resulting from a variety of disabling mutations (e.g. premature stop codons or frameshifts), a lack of transcription, or their inability to encode RNA (such as with rRNA pseudogenes). The term was coined in 1977 by Jacq et al.[4]

similar to other kinds of non-coding DNA, which can have a regulatory role.[2]