Additional file 2: Table S2 gives the different parsimonious models, and their estimated parameters, selected by the Akaike criterion (jMODELTEST version 0.1.1, written by Posada [51], available at http://darwin.uvigo.es/software/jmodeltest.html).

Tree comparisons

We compared the phylogenetic history of aes to the phylogenetic history of the strains, based on the concatenated nucleotide sequences of six housekeeping genes (trpA, trpB, pabB, putP, icd and polB) and individual gene sequences, as described elsewhere
[19]. Briefly, each phylogenetic tree T_i is first transformed into a tree-distance matrix D_i, the distance between two strains being the number of branches with positive length connecting them along the tree. The resulting tree-distance matrix D_i allows the initial tree structure T_i to be recovered, independently of branch length. Two tree-distance matrices (D_i and D_j), corresponding to two gene trees i and j, can be compared by calculating the Euclidean distance
between them (δ_ij) [52]. A low δ_ij value means that the similarity between the two tree-distance matrices D_i and D_j is high and, consequently, that their tree structures T_i and T_j are close. As several gene tree structures are compared through this Euclidean distance metric, a new distance matrix Δ can be built with the δ_ij elements. This Δ matrix can then be transformed into a “tree of gene trees” using a neighbour-joining algorithm [53]. To obtain a support value for each partition of this tree, we applied this same procedure
to 500 bootstrapped sets of data, obtaining 500 Δ matrices and, finally, a bootstrapped consensus “tree of gene trees”. A high bootstrap support value separating two sets of gene trees allows incongruent sets of gene trees to be identified; however, a low bootstrap value suggests that the two sets of trees are not incongruent or that there is insufficient phylogenetic information to reject the hypothesis of incongruence. The “TreeOfTree” package is available from the website http://bioinformatics.lif.univ-mrs.fr.

Protein structure modelling and analysis

Modelling of the Aes protein structure was based on comparison of the available models from MODBASE [54] with models previously obtained using the Tasser-Lite homology modelling server [55, 56]. Although some differences were observed between the models obtained by these two independent approaches, in particular in the N-terminal region, the best models proposed by Tasser-Lite and MODBASE were similar overall. Given that our aim was to determine only the approximate location of the Aes polymorphism within the protein structure, the MODBASE model was used for further analysis. Finally, the model was tested to ensure that it contained an active site consistent with esterase activity. This was carried out using the 3D MSS-Sites program http://bioserv.rpbs.jussieu.
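
As an illustration of the procedure described under Tree comparisons, the following minimal Python sketch builds a tree-distance matrix D_i for each gene tree, computes the Euclidean distance δ_ij between two such matrices, and assembles the Δ matrix. It is not the TreeOfTree implementation. The edge-list tree representation, the function names and the choice to sum over unordered strain pairs are assumptions made for this example; the neighbour-joining and bootstrap steps are not shown.

```python
from collections import defaultdict
from itertools import combinations
import math


def tree_distance_matrix(edges, strains):
    """Return D as a dict {(s, t): number of positive-length branches between s and t}.

    edges: iterable of (node_a, node_b, branch_length) describing one tree
           (hypothetical representation chosen for this sketch).
    """
    adjacency = defaultdict(list)
    for a, b, length in edges:
        step = 1 if length > 0 else 0  # zero-length branches do not count
        adjacency[a].append((b, step))
        adjacency[b].append((a, step))

    matrix = {}
    for source in strains:
        # Paths in a tree are unique, so a simple traversal gives the count.
        dist = {source: 0}
        stack = [source]
        while stack:
            node = stack.pop()
            for neighbour, step in adjacency[node]:
                if neighbour not in dist:
                    dist[neighbour] = dist[node] + step
                    stack.append(neighbour)
        for target in strains:
            matrix[(source, target)] = dist[target]
    return matrix


def delta(d_i, d_j, strains):
    """Euclidean distance between two tree-distance matrices.

    Assumption: the sum runs over unordered pairs of strains.
    """
    return math.sqrt(
        sum((d_i[pair] - d_j[pair]) ** 2 for pair in combinations(strains, 2))
    )


def delta_matrix(trees, strains):
    """Build the matrix of pairwise delta values for a list of gene trees."""
    matrices = [tree_distance_matrix(edges, strains) for edges in trees]
    n = len(matrices)
    return [[delta(matrices[i], matrices[j], strains) for j in range(n)]
            for i in range(n)]


# Toy example with four strains and three gene trees.
strains = ["A", "B", "C", "D"]
tree_1 = [("A", "x", 1.0), ("B", "x", 1.0), ("x", "y", 1.0),
          ("C", "y", 1.0), ("D", "y", 1.0)]   # ((A,B),(C,D))
tree_2 = [("A", "x", 1.0), ("C", "x", 1.0), ("x", "y", 1.0),
          ("B", "y", 1.0), ("D", "y", 1.0)]   # ((A,C),(B,D))
tree_3 = [("A", "x", 0.2), ("B", "x", 0.7), ("x", "y", 0.1),
          ("C", "y", 2.0), ("D", "y", 0.3)]   # same topology as tree_1

big_delta = delta_matrix([tree_1, tree_2, tree_3], strains)
```

In this toy case, tree_1 and tree_3 share a topology and differ only in branch lengths, so their δ value is 0, whereas tree_1 and tree_2 group the strains differently and are separated by a positive δ. The resulting matrix could then be passed to any neighbour-joining implementation to obtain a tree of gene trees.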