TI - DISCUSSION . AB - The pattern of nucleotide substitution between homologous genes gives a good indication of the selective pressure on a coding region . In general , mitochondrial proteins for which function can be assigned have higher Ks/Ka ratios than proteins of unknown function , and proteins involved in oxidative phosphorylation have among the highest Ks/Ka ratios . Genes atp9 and nad5 are exceptions . Their Ks/Ka ratios indicate less selective pressure than expected for proteins with assigned function . The divergence at the C-terminus of atp9 may account for its low Ks/Ka ratio , but the reason for the low Ks/Ka ratio of nad5 is unclear . The proteins for which function cannot be readily assigned generally have lower Ks/Ka ratios , with some exceptions such as Ymf56 , Ymf65 and Ymf68 , which have Ks/Ka ratios >10 . The largest ORF , ymf77 , has a very low Ks/Ka ratio ( 161 ) , suggesting reduced selective pressure , which may contribute to the divergence of this protein and the difficulty in matching it with a protein of known function . The mitochondrial genome is very compact with very short intergenic regions ( 38% of the genome ) , only three of which are longer than 63 bp . These intergenic regions appear to be too small to accommodate promoters . This is consistent with the transcript mapping of the T.pyriformis mitochondrial genes , which suggested multigene transcripts ( 16 ) . A working hypothesis is that the entire genome is transcribed from a central bi-directional promoter , which would account for the transcription of all of the genes except the three-gene cluster in the 5' portion of the genome . These three genes could be transcribed from a unidirectional promoter in the 5'-3' direction ( Fig 1 ) . The three long intergenic regions in T.thermophila have sequence z-scores indicating that these regions are under strong selective pressure ( data not shown ) . 
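The Ks/Ka comparison used in this paragraph can be illustrated with a minimal sketch . Note that Ks/Ka is the inverse of the more commonly reported Ka/Ks , so a higher value here indicates stronger purifying selection . The function names and the ( Ks , Ka ) values below are hypothetical and invented for illustration ; they are not taken from this study , and the sketch does not estimate Ks or Ka from sequence data .

```python
import math


def ks_ka(ks: float, ka: float) -> float:
    """Return the Ks/Ka ratio for one gene pair.

    ks: synonymous substitutions per synonymous site
    ka: nonsynonymous substitutions per nonsynonymous site
    """
    if ks < 0 or ka < 0:
        raise ValueError("substitution rates must be non-negative")
    if ks == 0 and ka == 0:
        raise ValueError("ratio is undefined when no substitutions were observed")
    if ka == 0:
        # No amino acid replacements observed: the ratio is unbounded.
        return math.inf
    return ks / ka


def rank_by_constraint(rates):
    """Sort genes from highest to lowest Ks/Ka (strongest constraint first)."""
    ratios = {gene: ks_ka(ks, ka) for gene, (ks, ka) in rates.items()}
    return sorted(ratios.items(), key=lambda item: item[1], reverse=True)


# Hypothetical (Ks, Ka) values, invented for illustration only.
example_rates = {
    "geneA": (1.20, 0.06),
    "geneB": (0.90, 0.30),
    "geneC": (0.80, 0.50),
}
ranked = rank_by_constraint(example_rates)
```

Under these assumed values , geneA would be ranked as the most constrained and geneC as the least , mirroring the way assigned and unassigned proteins are contrasted above .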
The largest intergenic region ( 493 bp ) , between ymf77 and cob , corresponds to the site of the putative central bi-directional promoter set . This position also corresponds to the origin of DNA replication identified in electron micrographs of partially replicated mitochondrial molecules in T.pyriformis ( 43 ) . The sequence of the second largest intergenic region ( 261 bp ) is highly conserved relative to the T.pyriformis homolog . It is located upstream from and immediately adjacent to the three-gene cluster transcribed in the 5'-3' direction . This may be the site of a unidirectional promoter responsible for transcribing this three-gene cluster . The most notable difference between the T.thermophila and T.pyriformis mitochondrial genomes is the tandem duplication of the nad9 gene in T.thermophila . A similar duplication occurs in T.malaccensis , and the divergence between the genes from different species ( orthologous ) is much greater than the divergence among genes within a species ( paralogous ) , indicating concerted evolution ( Table 3a ) . Concerted evolution also appears to be operating on the terminally repeated regions ( Table 3b ) . The regions under concerted evolution include the two split portions of the LSU rRNA gene , the leucine tRNA gene interrupting the LSU rRNA sequence and the two short ( 3 and 9 bp ) intergenic regions flanking the tRNA gene . The telomeres at both ends of the mitochondrial DNA molecule ( 53 bp repeats ) are also identical , although they start at different positions in the telomere repeat sequence . Morin and Cech have proposed that unequal crossing over and terminal hybridization may be part of a process maintaining the telomeres ( 11,44 ) . Whether or not this is the mechanism by which telomeres are regenerated , it is not a plausible mechanism for the concerted evolution of the inverted terminal repeats , as it would be expected to homogenize the terminal tRNA genes . 
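The reasoning behind the inference of concerted evolution for the duplicated nad9 genes ( within-species divergence much lower than between-species divergence ) can be sketched as follows . This is a minimal illustration that uses an uncorrected p-distance on pre-aligned sequences ; the distance measure , the function names and the toy sequences are assumptions made for this sketch and are not the method or data behind Table 3a .

```python
def p_distance(seq_a: str, seq_b: str) -> float:
    """Proportion of differing sites between two aligned sequences.

    Columns with a gap ('-') in either sequence are skipped.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    differing = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a == "-" or b == "-":
            continue
        compared += 1
        if a != b:
            differing += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return differing / compared


def mean(values):
    values = list(values)
    return sum(values) / len(values)


def divergence_summary(copies_sp1, copies_sp2):
    """Return (mean paralogous, mean orthologous) p-distance.

    Each argument holds the two duplicated gene copies from one species.
    """
    within = [p_distance(c[0], c[1]) for c in (copies_sp1, copies_sp2)]
    between = [p_distance(a, b) for a in copies_sp1 for b in copies_sp2]
    return mean(within), mean(between)


# Toy aligned sequences, invented for illustration only.
species1 = ("ATGGCTAAAGGT", "ATGGCTAAAGGT")
species2 = ("ATGACTAAGGGC", "ATGACTAAGGGT")
paralogous, orthologous = divergence_summary(species1, species2)
```

If the duplication predates the species split and the copies had evolved independently , the paralogous distance would be expected to be at least as large as the orthologous one ; a much smaller paralogous distance , as in this toy case , is the pattern taken to indicate homogenization of the copies within each species .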
Concerted evolution of paralogs is relatively common ( 45-47 ) . The sequences for the C-terminus of the Atp9 protein and the intergenic region following the atp9 gene from T.thermophila and T.pyriformis have diverged significantly . The T.malaccensis and T.thermophila atp9 gene sequences are very similar at the C-termini and they are more similar to the Atp9 proteins from other organisms than is the T.pyriformis sequence . The most probable evolutionary history is that a DNA sequence was inserted into the C-terminus of the atp9 gene in the lineage leading to T.pyriformis , creating a slightly different C-terminal sequence and substantially lengthening the following intergenic region . One of the putative mitochondrial proteins in Tetrahymena , Ymf77 , presents a particular challenge . Its gene encodes 1321 amino acids ( almost 9% of the Tetrahymena mitochondrial genome ) and the protein apparently has 15 transmembrane ( TM ) segments . It is conserved in both T.thermophila and T.pyriformis , although it is not found in Paramecium aurelia ( 48 ) . In spite of its size and apparent function , BLASTP does not yield any reasonable protein candidates . Additional mitochondrial sequences from ciliates intermediate between Tetrahymena and Paramecium may offer additional versions of Ymf77 that give clues to its function . The proteins found in mitochondrial genomes are relatively limited in potential function , which simplifies the identification of mitochondrial ORFs ( 13 ) . In most mitochondrial genomes all of the genes are identified ; even R.americana , with 97 mitochondrial genes , has virtually all of its genes functionally identified ( 22 ) . In spite of this , almost half of the putative proteins in Tetrahymena remain unidentified . Clearly the mitochondrial genomes of the ciliates have diverged dramatically from other mitochondrial genomes ; however , the evolutionary mechanisms responsible for this divergence are not immediately obvious . 
Although sequence similarity remains the primary means of establishing homology from which function can be assigned , physico-chemical parameters provide a powerful additional indicator of potential homology . These parameters can be combined with nominal sequence similarity to establish homology . They are also valuable in clustering unidentified proteins into specific groups .
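As a minimal sketch of how physico-chemical parameters might be summarized and compared between proteins , the example below computes two simple descriptors per sequence and a distance between the resulting profiles . The choice of descriptors ( mean Kyte-Doolittle hydropathy and fraction of charged residues ) , the function names and the toy sequences are assumptions made for illustration ; the specific parameters used in this study are not restated here .

```python
import math

# Kyte-Doolittle hydropathy index for the 20 standard amino acids.
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def profile(sequence: str) -> dict:
    """Summarize a protein sequence with two simple descriptors.

    Non-standard residue symbols are ignored.
    """
    residues = [r for r in sequence.upper() if r in KYTE_DOOLITTLE]
    if not residues:
        raise ValueError("sequence contains no standard amino acids")
    n = len(residues)
    return {
        "length": n,
        # Mean hydropathy (often called GRAVY).
        "gravy": sum(KYTE_DOOLITTLE[r] for r in residues) / n,
        # Fraction of residues that are D, E, K or R.
        "charged_fraction": sum(r in "DEKR" for r in residues) / n,
    }


def profile_distance(p: dict, q: dict, keys=("gravy", "charged_fraction")) -> float:
    """Euclidean distance between two profiles over the chosen descriptors."""
    return math.sqrt(sum((p[k] - q[k]) ** 2 for k in keys))


# Toy sequences, invented for illustration only.
membrane_like = profile("LLIVAAGLLVFIAL")
soluble_like = profile("DEKRSSDEKRNQGT")
query = profile("LIVALLGAVFLLIA")

closer_to_membrane = (
    profile_distance(query, membrane_like) < profile_distance(query, soluble_like)
)
```

In practice the descriptors would normally be standardized before distances are computed so that no single parameter dominates , and such profiles would supplement rather than replace sequence similarity when proposing homology .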