Phylogenomic analyses of ancient relationships are usually performed using amino acid data, but it is unclear whether amino acids or nucleotides should be preferred. With the 2-fold aim of addressing this problem and clarifying pancrustacean relationships, we explored the signals in the 62 protein-coding genes carefully assembled by Regier et al. in 2010. With reference to the pancrustaceans, this data set infers a highly supported nucleotide tree that is substantially different to the corresponding, but poorly supported, amino acid one. We show that the discrepancy between the nucleotide-based and the amino acids-based trees is caused by substitutions within synonymous codon families (especially those of serine-TCN and AGY). We show that different arthropod lineages are differentially biased in their usage of serine, arginine, and leucine synonymous codons, and that the serine bias is correlated with the topology derived from the nucleotides, but not the amino acids. We suggest that a parallel, partially compositionally driven, synonymous codon-usage bias affects the nucleotide topology. As substitutions between serine codon families can proceed through threonine or cysteine intermediates, amino acid data sets might also be affected by the serine codon-usage bias. We suggest that a Dayhoff recoding strategy would partially ameliorate the effects of such bias. Although amino acids provide an alternative hypothesis of pancrustacean relationships, neither the nucleotides nor the amino acids version of this data set seems to bring enough genuine phylogenetic information to robustly resolve the relationships within group, which should still be considered unresolved
Rota Stabelli, O.; Lartillot, N.; Philippe, H.; Pisani, D. (2013). Serine codon-usage bias in deep phylogenomics: pancrustacean relationships as a case study. SYSTEMATIC BIOLOGY SYSTEMATIC BIOLOGY, 62 (1): 121-133. doi: 10.1093/sysbio/sys077 handle: http://hdl.handle.net/10449/22410
Serine codon-usage bias in deep phylogenomics: pancrustacean relationships as a case study
Rota Stabelli, Omar;
2013-01-01
Abstract
Phylogenomic analyses of ancient relationships are usually performed using amino acid data, but it is unclear whether amino acids or nucleotides should be preferred. With the 2-fold aim of addressing this problem and clarifying pancrustacean relationships, we explored the signals in the 62 protein-coding genes carefully assembled by Regier et al. in 2010. With reference to the pancrustaceans, this data set infers a highly supported nucleotide tree that is substantially different to the corresponding, but poorly supported, amino acid one. We show that the discrepancy between the nucleotide-based and the amino acids-based trees is caused by substitutions within synonymous codon families (especially those of serine-TCN and AGY). We show that different arthropod lineages are differentially biased in their usage of serine, arginine, and leucine synonymous codons, and that the serine bias is correlated with the topology derived from the nucleotides, but not the amino acids. We suggest that a parallel, partially compositionally driven, synonymous codon-usage bias affects the nucleotide topology. As substitutions between serine codon families can proceed through threonine or cysteine intermediates, amino acid data sets might also be affected by the serine codon-usage bias. We suggest that a Dayhoff recoding strategy would partially ameliorate the effects of such bias. Although amino acids provide an alternative hypothesis of pancrustacean relationships, neither the nucleotides nor the amino acids version of this data set seems to bring enough genuine phylogenetic information to robustly resolve the relationships within group, which should still be considered unresolvedFile | Dimensione | Formato | |
---|---|---|---|
2013 SB Rota et al.pdf
accesso aperto
Licenza:
Tutti i diritti riservati (All rights reserved)
Dimensione
1.13 MB
Formato
Adobe PDF
|
1.13 MB | Adobe PDF | Visualizza/Apri |
I documenti in IRIS sono protetti da copyright e tutti i diritti sono riservati, salvo diversa indicazione.