The CloVR-Microbe protocol supports whole-genome shotgun (WGS) microbial sequencing projects that use Sanger, 454 or Illumina sequence data. CloVR-Microbe bundles the software needed for the full workflow of sequence assembly, gene finding and annotation, taking raw unassembled sequence reads to annotated sequence data. The annotated data can then be edited manually in common genome browsers or submitted directly to the NCBI sequence databases. The assembly and annotation components of the CloVR-Microbe pipeline can also be executed separately, for users who prefer to annotate assemblies through alternative pipelines or who want to annotate pre-assembled sequence data. To run the entire CloVR-Microbe pipeline on 454 or Illumina data, or to run only the assembly or annotation portion, refer to the corresponding walkthrough.
CloVR-Microbe includes tools for sequence assembly (Celera Assembler, SPAdes Assembler), for the prediction of protein-coding genes (Glimmer), tRNAs (tRNAscan-SE) and rRNAs (RNAmmer), and for the functional annotation of protein-coding genes (IGS Annotation Engine). Input files can be provided in SFF or FASTQ format, which are part of the standard output of Roche/454 and Illumina sequencers, respectively. CloVR-Microbe output includes assembled scaffold and contig sequences, gene sequences (protein-coding genes, tRNAs, rRNAs), protein sequences, and a summary report file. Annotated sequence files are provided in GenBank and XML format, as well as in a table format that can be used for sequence submission to GenBank through the Sequin tool.
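
Because the input format determines which walkthrough applies, it can help to check a read file before starting a run. The following is a minimal sketch and is not part of CloVR-Microbe: the function names `detect_read_format` and `platform_for` are hypothetical, and the check assumes uncompressed files, relying only on the SFF magic bytes (`.sff`) and on the `@` character that opens a FASTQ record.

```python
# Hypothetical helper, not part of CloVR-Microbe: guess the read format
# of an uncompressed input file from its leading bytes.

SFF_MAGIC = b".sff"  # first four bytes of a Standard Flowgram Format file

# Mapping taken from the text above: SFF comes from Roche/454 sequencers,
# FASTQ from Illumina sequencers.
PLATFORM_BY_FORMAT = {"sff": "Roche/454", "fastq": "Illumina"}


def detect_read_format(path):
    """Return 'sff', 'fastq', or 'unknown' based on the file's first bytes."""
    with open(path, "rb") as handle:
        head = handle.read(4)
    if head == SFF_MAGIC:
        return "sff"
    if head.startswith(b"@"):  # a FASTQ record begins with an '@' header line
        return "fastq"
    return "unknown"


def platform_for(path):
    """Return the sequencing platform implied by the file format, or None."""
    return PLATFORM_BY_FORMAT.get(detect_read_format(path))
```

A file for which `platform_for` returns `None` is neither SFF nor plain FASTQ under these assumptions (it may be compressed, for example) and should be inspected before it is used as pipeline input.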