GEBA focuses on cultured isolates that have a formal species description (type strains). A frequent misconception is that the types used in taxonomy (type strains, type species, type genera, etc.) are taxonomic types that represent a taxon by its most typical member. If that were so, they would be bound to, and dependent on, certain taxonomic views such as species concepts or even the general notion that evolution is best represented by a hierarchical classification such as the currently dominant Linnaean taxonomy [3]. The critique of hierarchical classifications as unsuitable for microbiology, because lateral gene transfer yields a network rather than a hierarchy [4], would then also affect GEBA.
But types are nomenclatural constructs which, given a certain taxonomic view, define which names are to be used for a taxon [5]. In microbiology, the use of type strains for genome projects has the additional practical advantage that these strains are guaranteed, or nearly so, to be deposited in at least two distinct culture collections in two distinct countries [6,7]. This ensures that living material is available for follow-up studies that test genome-sequence-derived hypotheses. The availability of biological reference material or even genomic DNA (gDNA) [8] is a great step toward ensuring reproducibility of the results [2]. The target organisms of GEBA are selected using a phylogenetic tree based on sequences of the 16S rRNA gene (the gene on which the current bacterial and archaeal classification is largely based [6,9]), progressively filling in the genomic gaps [10].
Phylogeny-driven genome-sequencing projects are promising for improving microbial classification [4] and particularly for the binning of metagenomic sequences [10]. In the long term, the genomes of representatives of each branch of the tree of life, and of all type strains at the time of accession into public culture collections, will likely be sequenced. But GEBA targeted the organisms deemed genomically more interesting [10] first, and thus required a phylogeny-derived scoring system [11,12] covering all strains of potential interest. GEBA started with a pilot project (165 strains) that was subsequently extended to approximately 250 target strains and then followed by two phases of 1,000 target strains each. About 140 GEBA genomes have been published at the time of writing (October 2012).
For instance, target organisms of the GEBA pilot project included the type strains of Ktedonobacter racemifer, the bacterium with the largest genome sequence obtained to date [13], and Pyrolobus fumarii, the archaeon with the highest known optimal temperature [14]. Taxonomic conclusions (e.g., reclassifications) were drawn from some of the newly obtained genomic information [15,16].