Whole genome sequences for M.tuberculosis isolates from the TDR strain bank

Phelan, J, Coll, F, Mcnerney, R, Ascher, DB, Pires, DEV, Furnham, N, Coeck, N, Hill-Cawthorne, GA, Nair, MB, Mallard, K, Ramsay, A, Campino, S, Hibberd, M, Pain, A, Rigouts, L and Clark, T. 2015. Whole genome sequences for M.tuberculosis isolates from the TDR strain bank. [Online]. European Nucleotide Archive. Available from: http://www.ebi.ac.uk/ena/data/view/PRJEB11653

Phelan, J, Coll, F, Mcnerney, R, Ascher, DB, Pires, DEV, Furnham, N, Coeck, N, Hill-Cawthorne, GA, Nair, MB, Mallard, K, Ramsay, A, Campino, S, Hibberd, M, Pain, A, Rigouts, L and Clark, T. Whole genome sequences for M.tuberculosis isolates from the TDR strain bank. [Internet] LSHTM Data Compass. European Nucleotide Archive; 2015. Available from: http://www.ebi.ac.uk/ena/data/view/PRJEB11653

Phelan, J, Coll, F, Mcnerney, R, Ascher, DB, Pires, DEV, Furnham, N, Coeck, N, Hill-Cawthorne, GA, Nair, MB, Mallard, K, Ramsay, A, Campino, S, Hibberd, M, Pain, A, Rigouts, L and Clark, T (2015). Whole genome sequences for M.tuberculosis isolates from the TDR strain bank. [Data Collection]. European Nucleotide Archive. http://www.ebi.ac.uk/ena/data/view/PRJEB11653

Description

Description of data capture All DNA samples underwent Illumina sequencing on the HiSeq 2000 platform at the KAUST genomic facility, generating paired-end reads of 150 bp (Additional file 1: Table S1, pathogenseq.lshtm.ac.uk/tdr, Additional file 1: Table S2). All raw sequence data can be downloaded from the ENA short read archive (accession number PRJEB11653). Trimmomatic (v0.33) software [42] (parameters: LEADING:3 TRAILING:3 SLIDINGWINDOW:4:20 MINLEN:36) was used to remove or truncate low-quality reads in the raw sequence data. High-quality reads were then mapped to the H37Rv reference genome (GenBank accession: AL123456.3) using the BWA-MEM (v0.7.12) algorithm [43] (parameters: -c 100 -M -T 50). From the resulting alignments, SAMtools (v1.3) [44] and GATK (v3.5) [45] software (default parameter settings) were used to call SNPs and small indels, and only the intersection of the two call sets (variants reported by both methods) was retained. Mappability values were calculated along the reference genome using GEM-Mappability software with a k-mer length of 50 bp and a 0.04 % substitution threshold [46]. Non-unique SNP sites (mappability values greater than one) were removed. Sample genotypes were called using the majority allele (minimum frequency 75 %) in positions supported by at least 20-fold total genome coverage; all other positions were classified as missing. Isolates or SNPs with more than 10 % missing genotype calls were excluded. The final dataset included 144 isolates and 17,952 genome-wide SNPs.
Data capture method Experiment
Date (Published in a 3rd party system) 4 November 2015
Language(s) of written materials English
Data Creators Phelan, J, Coll, F, Mcnerney, R, Ascher, DB, Pires, DEV, Furnham, N, Coeck, N, Hill-Cawthorne, GA, Nair, MB, Mallard, K, Ramsay, A, Campino, S, Hibberd, M, Pain, A, Rigouts, L and Clark, T
LSHTM Faculty/Department Faculty of Epidemiology and Population Health > Dept of Infectious Disease Epidemiology
Faculty of Infectious and Tropical Diseases > Dept of Pathogen Molecular Biology
Participating Institutions Study consortium
Funders
Project | Funder | Grant Number | Funder URI
UNSPECIFIED | Biotechnology & Biological Sciences Research Council | UNSPECIFIED | UNSPECIFIED
UNSPECIFIED | National Health and Medical Research Council | UNSPECIFIED | UNSPECIFIED
UNSPECIFIED | René Rachou Research Center | UNSPECIFIED | UNSPECIFIED
UNSPECIFIED | Medical Research Council | UNSPECIFIED | UNSPECIFIED
UNSPECIFIED | Fundação de Amparo à Pesquisa do Estado de Minas Gerais | UNSPECIFIED | UNSPECIFIED
Depositor LSHTM Library & Archives Service
Date Deposited 11 Apr 2016 09:57
Last Modified 11 Apr 2016 09:58
Publisher European Nucleotide Archive
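
The genotype-calling and missingness rules given in the description of data capture above can be illustrated with a minimal Python sketch. It is not the authors' pipeline: the function names, the "N" missing code, the dictionary-based inputs, and the choice to filter isolates before SNPs are all assumptions made for illustration. Only the thresholds (20-fold coverage, 75 % majority-allele frequency, 10 % missing calls) come from the description.

```python
# Illustrative sketch only; names and data layout are hypothetical.
MIN_DEPTH = 20          # minimum total coverage at a position (from the description)
MIN_MAJOR_FREQ = 0.75   # minimum majority-allele frequency (from the description)
MAX_MISSING = 0.10      # calls missing in excess of this fraction trigger exclusion
MISSING = "N"           # assumed code for a missing genotype


def call_genotype(allele_counts):
    """Return the majority allele, or MISSING if the thresholds are not met.

    allele_counts maps allele -> number of supporting reads at one position.
    """
    depth = sum(allele_counts.values())
    if depth < MIN_DEPTH:
        return MISSING
    allele, count = max(allele_counts.items(), key=lambda kv: kv[1])
    if count / depth < MIN_MAJOR_FREQ:
        return MISSING
    return allele


def missing_fraction(calls):
    """Fraction of calls equal to MISSING (0.0 for an empty list)."""
    if not calls:
        return 0.0
    return sum(c == MISSING for c in calls) / len(calls)


def filter_by_missingness(matrix):
    """Drop isolates, then SNP columns, with more than MAX_MISSING missing calls.

    matrix maps isolate name -> list of genotype calls in a shared SNP order.
    The isolate-then-SNP order is an assumption; the description does not
    state which filter was applied first.
    Returns (filtered matrix, indices of the SNP columns that were kept).
    """
    kept = {s: calls for s, calls in matrix.items()
            if missing_fraction(calls) <= MAX_MISSING}
    if not kept:
        return {}, []
    n_snps = len(next(iter(kept.values())))
    kept_snps = [j for j in range(n_snps)
                 if missing_fraction([calls[j] for calls in kept.values()])
                 <= MAX_MISSING]
    filtered = {s: [calls[j] for j in kept_snps] for s, calls in kept.items()}
    return filtered, kept_snps
```

For example, `call_genotype({"A": 18, "G": 2})` yields the majority allele because 18 of 20 reads support it, whereas `call_genotype({"A": 14, "G": 6})` yields the missing code because the majority allele falls below 75 %.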

Data

Filename: AdditionalFile1-TableS1.docx

Description: The isolates according to geographic location and phenotypic drug resistance

Licence:

Content type: Dataset

File size: 57kB

Mime-Type: application/vnd.openxmlformats-officedocument.wordprocessingml.document
