
Introduction

The purpose of this guide is to give a consolidated and authoritative overview of the data from MyOSD 2016. After reading this guide, you should:

Overview of Datasets

Primers

The 16S alma-alma primer pair is described in Parada et al., 2016.

| ribosomal subunit | designation | direction | label | sequence | reference |
|---|---|---|---|---|---|
| 16S | alma | forward | 515F-Y | 5’-GTGYCAGCMGCCGCGGTAA-3’ | Parada et al., 2016 |
| 16S | alma | reverse | 926R | 5’-CCGYCAATTYMTTTRAGTTT-3’ | Parada et al., 2016 |
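
Both primers contain IUPAC ambiguity codes (Y, M, R), so each label stands for a small set of concrete oligonucleotides. The following is a minimal sketch, not part of the MyOSD workflow, of how the two sequences listed above could be expanded into regular expressions and how their degeneracy could be counted. The names `IUPAC`, `PRIMERS`, `primer_to_regex` and `degeneracy` are hypothetical helpers introduced only for illustration.

```python
import re

# Standard IUPAC nucleotide ambiguity codes and the bases each one stands for.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT",
    "K": "GT", "M": "AC", "B": "CGT", "D": "AGT",
    "H": "ACT", "V": "ACG", "N": "ACGT",
}

# Primer sequences copied from the primer table (5' to 3').
PRIMERS = {
    "515F-Y": "GTGYCAGCMGCCGCGGTAA",
    "926R": "CCGYCAATTYMTTTRAGTTT",
}


def primer_to_regex(primer):
    """Compile a degenerate primer into a regex matching every concrete variant."""
    parts = []
    for base in primer.upper():
        options = IUPAC[base]
        # Unambiguous bases stay literal; ambiguous ones become a character class.
        parts.append(options if len(options) == 1 else "[" + options + "]")
    return re.compile("".join(parts))


def degeneracy(primer):
    """Return the number of distinct sequences a degenerate primer represents."""
    count = 1
    for base in primer.upper():
        count *= len(IUPAC[base])
    return count


if __name__ == "__main__":
    for label, sequence in PRIMERS.items():
        print(label, degeneracy(sequence), primer_to_regex(sequence).pattern)
```

Note that the reverse primer anneals to the opposite strand, so searching a forward-oriented read for it would additionally require taking the reverse complement; that step is left out of this sketch.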

16S datasets

Technical samples

Blanks

Several blank samples were sequenced with the 16S alma-alma primer pair as controls. The DNA extractions were done using DNA extraction kits. For each kit, one extraction was done on sterile water to check that the kit was not contaminated. These blank samples are labelled as follows:

Others

| sample label | comment / description |
|---|---|
| MYOSD0_2016-06_1 | control in compliance with the MyOSD sampling protocol. |
| MYOSD0_2016-06_2 | first contamination step, by placing syringe and filter on the ground |
| MYOSD0_2016-06_3 | intense contamination step, by placing filter and syringe on the ground and by touching Sterivex and syringe openings without gloves |
| MYOSD0_2016-06_10 | Blu Tak experiment |
| MYOSD0_2016-06_11 | Blu Tak experiment |

Sample labeling

All samples described here are labelled using the same labeling scheme as the 2015 datasets. This scheme is independent of the sample metadata and therefore differs from the labeling used in the 2014 datasets.

The current labeling scheme is as follows: ${campaign_name}${site_id|kit_number}_${campaign_date}_${artificial_number}_${dataset_name}_${primer_pair_name}

Where:

NOTE: Minor deviations from the scheme are possible in some file names and SILVAngs analysis results. This means that additional information might be included in the sample labels, for example a ‘16S/18S’ notation or a ‘qc.filt’ suffix (denoting quality filtering and additional length filtering, see Sequence Data Pre Processing).
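
As an illustration of the labeling scheme, the sketch below splits a sample label into its fields with a regular expression. It rests on several assumptions that are not stated in this guide: the campaign name consists of letters only, the site ID or kit number is numeric, the campaign date has the form YYYY-MM (as in `MYOSD0_2016-06_1`), and the last two fields may be absent, as they are in the short labels listed under Technical samples. The names `LABEL_PATTERN` and `parse_label` are hypothetical.

```python
import re

# Assumed field formats (not confirmed by the guide):
#   campaign_name      letters only, e.g. "MYOSD"
#   site_or_kit        digits (site ID or kit number)
#   campaign_date      YYYY-MM
#   artificial_number  digits
#   dataset_name and primer_pair_name are optional and contain no underscore.
LABEL_PATTERN = re.compile(
    r"^(?P<campaign_name>[A-Za-z]+)"
    r"(?P<site_or_kit>\d+)"
    r"_(?P<campaign_date>\d{4}-\d{2})"
    r"_(?P<artificial_number>\d+)"
    r"(?:_(?P<dataset_name>[^_]+))?"
    r"(?:_(?P<primer_pair_name>[^_]+))?$"
)


def parse_label(label):
    """Return the label fields as a dict, or None if the label deviates from the scheme."""
    match = LABEL_PATTERN.match(label)
    return match.groupdict() if match else None


if __name__ == "__main__":
    print(parse_label("MYOSD0_2016-06_1"))
```

Labels that carry extra information, such as a ‘qc.filt’ suffix, do not match this pattern and yield `None`, so they would need to be handled separately.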

Sequence data access

All sequence datasets (both raw and workable) will be made available as soon as quality control is complete. All sequence data will be submitted for long-term archival to the European Nucleotide Archive (ENA) (see ENA umbrella project PRJEB5129 and/or OSD 2014 Data Guide) once the manual curation of the contextual data is finalized (see section Contextual Data for more detail).

Pre-processing

The pre-processing was done using the same workflow as in 2014, with minor modifications (see Sequence Data Pre Processing).

Contextual Data

The contextual data for OSD 2016 and MyOSD 2016 is currently being manually curated.

SILVAngs Analysis

The SILVAngs analysis is currently running, stay tuned for results.