Kinghorn Centre for Clinical Genomics Sample Sequencing Data
These data are from the Coriell Cell Repository NA12878 reference cell line, which has been extensively analysed by the Genome in a Bottle Consortium. The sequencing libraries were generated with Illumina’s TruSeq Nano V2.5 kit on the Hamilton Microlab STAR robotics platform, giving insert sizes above 400 bp. Each library was sequenced on a single lane of an Illumina HiSeq X patterned flow cell, yielding over 130 Gb with more than 83% of bases at or above Q30, in just 2.8 days. The four datasets are of similar quality and are provided so that you can assess the reproducibility of the technology. Each dataset substantially surpasses the minimum coverage and quality guaranteed by Illumina and indicates the potential of the Illumina HiSeq X Ten sequencing system.
Each of the four datasets consists of raw paired-end data (fastq.gz files) and the results of the GATK DNAseq best practices pipeline, run on each library independently with the recommended parameters for whole genome sequencing.
Download the data to your computer or server, using the links below.
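Once a FASTQ file has been downloaded, one quick integrity check is to count its records and compare the count with the "Raw Read Pairs (PF)" row of the table below, since each file of a pair (R1 or R2) should hold one record per read pair. The following is a minimal Python sketch, assuming standard four-line FASTQ records. The function name `count_fastq_records` is hypothetical and not part of the published dataset. If the reads for a library are split across several files, the counts would need to be summed.

```python
import gzip


def count_fastq_records(path):
    """Count records in a gzipped FASTQ file, assuming four lines per record."""
    n_lines = 0
    # Binary mode is enough here: lines are only counted, never decoded.
    with gzip.open(path, "rb") as handle:
        for _ in handle:
            n_lines += 1
    if n_lines % 4 != 0:
        # A line count that is not a multiple of 4 suggests a truncated download.
        raise ValueError(
            f"{path}: {n_lines} lines is not a multiple of 4; file may be truncated"
        )
    return n_lines // 4
```

For example, `count_fastq_records("library_R1.fastq.gz")` (a placeholder file name) returns the number of reads in that file. Counting a full lane this way reads the whole file and may take several minutes.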
|Read length||151bp PE||151bp PE||151bp PE||151bp PE|
|Raw Read Pairs (PF)||439,013,514||510,726,469||464,350,208||479,861,658|
|Raw Yield (Gb)||131.704||153.218||139.305||143.958|
|% bases >=Q30 (R1/R2)||92.39/81.23||93.18/73.37||89.89/77.44||93.00/78.75|
|% bases >=Q30 (mean)||86.81||83.28||83.67||85.88|
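The derived rows of the table can be reproduced from the others. The Python sketch below assumes that the raw yield counts 150 bases per read (2 × 150 per pair). This is inferred from the figures themselves rather than stated in the text, which reports a read length of 151 bp. It also assumes that the mean Q30 row is the unweighted average of the R1 and R2 values. The keys `lib1` to `lib4` are placeholder labels for the four columns, not actual sample names.

```python
# Values copied from the table; "lib1".."lib4" are placeholder column labels.
LIBRARIES = {
    "lib1": {"read_pairs": 439_013_514, "yield_gb": 131.704,
             "q30_r1": 92.39, "q30_r2": 81.23, "q30_mean": 86.81},
    "lib2": {"read_pairs": 510_726_469, "yield_gb": 153.218,
             "q30_r1": 93.18, "q30_r2": 73.37, "q30_mean": 83.28},
    "lib3": {"read_pairs": 464_350_208, "yield_gb": 139.305,
             "q30_r1": 89.89, "q30_r2": 77.44, "q30_mean": 83.67},
    "lib4": {"read_pairs": 479_861_658, "yield_gb": 143.958,
             "q30_r1": 93.00, "q30_r2": 78.75, "q30_mean": 85.88},
}

# Assumption inferred from the table values, not documented in the text.
BASES_PER_READ = 150


def expected_yield_gb(read_pairs, bases_per_read=BASES_PER_READ):
    """Yield in gigabases for paired-end data (two reads per pair)."""
    return read_pairs * 2 * bases_per_read / 1e9


def mean_q30(q30_r1, q30_r2):
    """Unweighted mean of the R1 and R2 percentages of bases >= Q30."""
    return (q30_r1 + q30_r2) / 2


for name, lib in LIBRARIES.items():
    computed_yield = expected_yield_gb(lib["read_pairs"])
    computed_q30 = mean_q30(lib["q30_r1"], lib["q30_r2"])
    print(
        f"{name}: yield {computed_yield:.3f} Gb (table {lib['yield_gb']}), "
        f"mean Q30 {computed_q30:.3f}% (table {lib['q30_mean']})"
    )
```

Using 151 bases per read instead would raise each computed yield by less than 1%, so the choice does not change the comparison between libraries.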