The following mapping file formats are supported in CloVR-tracks:
CloVR-formatted mapping files
For datasets with one or more fasta files, the associated CloVR-mapping file is tab-delimited and describes the features of each file.
This format requires that:
1. All entries are tab-delimited.
2. All entries in every column are defined.
3. A varying number of columns may be defined, but three columns are mandatory: File, SampleName, and Description. Additional columns can be used for pairwise comparisons (see below).
4. The header line begins with “#File<tab>SampleName” and ends with “Description”.
5. There are no duplicate header fields or file names.
6. No header fields or corresponding entries contain invalid characters (only alphanumeric characters and underscores are allowed).
Below are two simple examples:
#File	SampleName	PH_p	Gender_p	Status	Description
A.fasta	sampleA	low	male	control	none
B.fasta	sampleB	low	female	control	none
C.fasta	sampleC	high	male	control	none
D.fasta	sampleD	high	female	treated	none

#File	SampleName	BodySite_p	Description
A.fasta	sampleA	oral	oral_visit1_subject0001
B.fasta	sampleB	airways	airways_visit1_subject0001
C.fasta	sampleC	oral	oral_visit2_subject0001
D.fasta	sampleD	airways	airways_visit2_subject0001
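The requirements listed above can be checked mechanically before a file is submitted. The following Python sketch is only an illustration and is not part of CloVR; the function name validate_clovr_mapping and its error messages are hypothetical. It also assumes that file names may contain periods (as in A.fasta), even though the stated rule mentions only alphanumeric characters and underscores.

```python
import re

# Assumption: header fields and entries allow only letters, digits and
# underscores; file names may additionally contain periods (e.g. A.fasta).
FIELD_RE = re.compile(r"[A-Za-z0-9_]+")
FILE_RE = re.compile(r"[A-Za-z0-9_.]+")


def validate_clovr_mapping(text):
    """Return a list of problems found in a CloVR-formatted mapping file."""
    errors = []
    lines = [ln for ln in text.splitlines() if ln.strip()]
    if not lines:
        return ["file is empty"]

    header = lines[0].split("\t")
    if header[:2] != ["#File", "SampleName"]:
        errors.append("header must begin with #File<tab>SampleName")
    if header[-1] != "Description":
        errors.append("header must end with Description")
    if len(set(header)) != len(header):
        errors.append("duplicate header fields")
    # The first field carries the leading '#', so it is checked above instead.
    for name in header[1:]:
        if not FIELD_RE.fullmatch(name):
            errors.append(f"invalid header field: {name!r}")

    seen_files = set()
    for line_no, line in enumerate(lines[1:], start=2):
        cells = line.split("\t")
        if len(cells) != len(header) or any(c == "" for c in cells):
            errors.append(f"line {line_no}: every column must be defined")
            continue
        if cells[0] in seen_files:
            errors.append(f"line {line_no}: duplicate file name {cells[0]!r}")
        seen_files.add(cells[0])
        if not FILE_RE.fullmatch(cells[0]):
            errors.append(f"line {line_no}: invalid file name {cells[0]!r}")
        for cell in cells[1:]:
            if not FIELD_RE.fullmatch(cell):
                errors.append(f"line {line_no}: invalid entry {cell!r}")
    return errors
```

An empty list means that none of the checks failed; it does not guarantee that CloVR itself will accept the file.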
Pairwise comparisons: To utilize the Metastats statistical methodology for differential abundance detection, the associated header field must end with “_p” (e.g. “Treatment_p” or “PH_p”). Otherwise, Metastats will skip pairwise analysis of the entire header field. Please note that only groups containing more than one sample can be compared.
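To see which columns and groups qualify under this convention, the mapping table can be scanned for header fields ending in “_p” and for groups with at least two samples. The Python sketch below only illustrates the naming rule; it does not reproduce how Metastats or CloVR process the file, and the function name comparable_groups and the sample_col parameter are hypothetical.

```python
from collections import defaultdict


def comparable_groups(header, rows, sample_col=1):
    """Map each "_p" column to its groups that contain more than one sample.

    header: list of header fields; rows: list of lists of entries.
    sample_col: index of the sample name column (1 for SampleName in the
    CloVR format; an assumption for any other layout).
    """
    result = {}
    for idx, name in enumerate(header):
        if not name.endswith("_p"):
            continue  # column is not flagged for pairwise comparison
        groups = defaultdict(list)
        for row in rows:
            groups[row[idx]].append(row[sample_col])
        # Only groups with more than one sample can be compared.
        result[name] = {g: s for g, s in groups.items() if len(s) > 1}
    return result
```

For the first example above, this would report the groups low/high for PH_p and male/female for Gender_p, and would ignore Status because its header field does not end in “_p”.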
Qiime-formatted mapping files
In some cases (typically 16S), sequence data may consist of a single fasta file that contains sequences from multiple samples, individually tagged by sample-specific barcodes as commonly used in the 454 amplicon sequencing protocol. The mapping file provides sample-associated information with the following Qiime-based formatting requirements:
#SampleID	BarcodeSequence	LinkerPrimerSequence	Treatment_p	Description
Sample1	AGCACGAGCCTA	TATGCTGCCTCCCGTAGGAGT	Control	male
Sample2	AGCACGAGCCTA	TATGCTGCCTCCCGTAGGAGT	Diabetic	female
Sample3	AACTCGTCGATG	TATGCTGCCTCCCGTAGGAGT	Control	female
Sample4	ACAGACCACTCA	TATGCTGCCTCCCGTAGGAGT	Diabetic	male
where:
1. All entries are tab-delimited.
2. All entries in every column are defined.
3. The header line begins with the following fields: “#SampleID<tab>BarcodeSequence<tab>LinkerPrimerSequence”.
4. The header line must end with the field “Description”.
5. The BarcodeSequence and LinkerPrimerSequence fields contain only valid IUPAC DNA characters.
6. There are no duplicate header fields.
7. No header fields or corresponding entries contain invalid characters (only alphanumeric characters and underscores are allowed).
8. There are no duplicates when the primer and barcodes are appended.
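Requirements 5 and 8 concern the sequence columns and can be checked with a few lines of Python. This is a minimal sketch and not the validation that Qiime or CloVR performs; the function name check_qiime_sequences is hypothetical, and the sketch assumes that the valid alphabet is the standard IUPAC nucleotide code set (A, C, G, T plus the ambiguity codes) and that lowercase letters are acceptable.

```python
# Standard IUPAC DNA codes: four bases plus ambiguity codes.
IUPAC_DNA = set("ACGTRYSWKMBDHVN")


def check_qiime_sequences(rows):
    """Check barcode and primer columns of a Qiime-formatted mapping file.

    rows: iterable of (sample_id, barcode, linker_primer) tuples.
    Returns a list of problems; an empty list means both checks passed.
    """
    errors = []
    seen = {}  # barcode + primer -> first sample ID that used it
    for sample_id, barcode, primer in rows:
        for label, seq in (("BarcodeSequence", barcode),
                           ("LinkerPrimerSequence", primer)):
            # Assumption: case is ignored when testing the alphabet.
            bad = set(seq.upper()) - IUPAC_DNA
            if bad:
                errors.append(
                    f"{sample_id}: {label} has non-IUPAC characters {sorted(bad)}"
                )
        combined = barcode + primer
        if combined in seen:
            errors.append(
                f"{sample_id}: barcode+primer duplicates {seen[combined]}"
            )
        else:
            seen[combined] = sample_id
    return errors
```

Two samples that share both the barcode and the primer are reported, because their reads could not be told apart once the two sequences are appended.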
Pairwise comparisons: To utilize the Metastats statistical methodology for differential abundance detection, the associated header field must end with “_p” (e.g. “Treatment_p” or “PH_p”). Otherwise, Metastats will skip pairwise analysis of the entire header field. Please note that only groups containing more than one sample can be compared.