Due to the precious nature of challenging samples, such as liquid biopsies and FFPE tumor tissue, research aimed at identifying the best treatment regimen and molecular diagnoses of diseases through genetic analysis requires the preparation of amplified genomic material from small amounts of DNA or single cells. It is therefore critical for whole genome amplification (WGA) technologies to allow for accurate and reproducible detection of single-nucleotide variations (SNVs) and copy number variations (CNVs) in genomic material from limited samples with high fidelity and genome coverage. Additionally, these technologies should be flexible enough to be used on a variety of analytical platforms. To address these needs, we have developed the PicoPLEX Single Cell WGA Kit v3 (PicoPLEX WGA v3), a platform-agnostic whole genome amplification system. This kit uses optimized enzymes, primers, and protocols for exceptional sequencing coverage, uniformity, and accuracy in detecting SNVs, all while increasing the resolution for CNV detection relative to previous PicoPLEX WGA iterations. Importantly, the system maintains a simple workflow (Figure 1) and the unmatched cell-to-cell reproducibility that is a hallmark of our PicoPLEX technology.
In this technical note, we demonstrate CNV detection to 25.5-Mb resolution at a depth of 1 million read pairs in a single cell with validated copy number gains and losses. SNV detection and reproducibility are shown to be superior to competitive technologies.
Results
Accurate detection of targeted SNVs using the PicoPLEX WGA v3 system
SNV detection by PicoPLEX WGA v3 chemistry was compared to SNVs observed in a bulk gDNA control and to two other commonly used kits, DOPlify (PerkinElmer) and REPLI-g (QIAGEN). Whole-genome amplification products from single- or five-cell samples of the GM12878 cell line (Coriell Institute) were prepared in replicates, and the SNVs detected were reported as numbers and percentages. Although the REPLI-g system produced sufficient yield, only one of the five-cell samples contained enough amplicon material to sequence; therefore, no data are available for the other REPLI-g samples. An intersection of Genome in a Bottle (GIAB) variants with hg19 (human genome assembly GRCh37, Ensembl) indicates that a total of 78 variants are expected to be present for the GM12878 cell line. Due to the amplicon design, paired-read lengths of 75 bp were too short to capture 4 of the 78 SNVs; therefore, the total number of capturable SNVs was reduced to 74. VarDict was used to call SNVs from BAM files using the following criteria: depth at the SNV position ≥10 reads (10X coverage) and allele frequency ≥20%.
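The depth and allele-frequency criteria above can be illustrated with a minimal sketch. This is not VarDict's interface: the `VariantCall` record, the `passes_filter` and `summarize_calls` helpers, and the example coordinates are hypothetical names and values introduced only for illustration, and the calls are assumed to have already been parsed from the caller's output.

```python
from dataclasses import dataclass

MIN_DEPTH = 10          # depth at the SNV position >= 10 reads
MIN_ALLELE_FREQ = 0.20  # allele frequency >= 20%


@dataclass(frozen=True)
class VariantCall:
    """Hypothetical, already-parsed variant record (not a VarDict type)."""
    chrom: str
    pos: int
    ref: str
    alt: str
    depth: int
    allele_freq: float

    @property
    def key(self):
        return (self.chrom, self.pos, self.ref, self.alt)


def passes_filter(call):
    """Apply the two thresholds quoted in the text."""
    return call.depth >= MIN_DEPTH and call.allele_freq >= MIN_ALLELE_FREQ


def summarize_calls(calls, expected_sites):
    """Compare filtered calls against a set of expected variant keys."""
    passing = {c.key for c in calls if passes_filter(c)}
    return {
        "called": len(passing & expected_sites),
        "false_positives": len(passing - expected_sites),
        "missed": len(expected_sites - passing),
    }


# Toy data: three expected sites, four candidate calls.
expected = {("chr1", 100, "A", "G"), ("chr2", 200, "C", "T"), ("chr3", 300, "G", "A")}
calls = [
    VariantCall("chr1", 100, "A", "G", depth=35, allele_freq=0.48),  # passes
    VariantCall("chr2", 200, "C", "T", depth=8, allele_freq=0.50),   # depth too low
    VariantCall("chr3", 300, "G", "A", depth=40, allele_freq=0.10),  # frequency too low
    VariantCall("chr4", 400, "T", "C", depth=25, allele_freq=0.30),  # not expected
]
summary = summarize_calls(calls, expected)
print(summary)
```

In this toy example only one expected site survives both thresholds, so the other two count as missed, and the unexpected call on `chr4` counts as a false positive.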
PicoPLEX WGA v3 showed higher call rates and fewer allele dropouts than the other two kits tested.
The PicoPLEX WGA v3 system is more accurate at detecting SNVs when compared to DOPlify and REPLI-g technologies
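Filter criteria: depth of SNV position ≥10, allele frequency ≥20%.

Per-replicate results:

| Sample | Replicate | Number of SNVs called | Number of false positives | Call rate | Missed | Number of heterozygous SNVs called |
|---|---|---|---|---|---|---|
| Bulk | - | 74 | - | - | - | 45 |
| PicoPLEX WGA v3, 1 cell | Rep1 | 57 | 3 | 78% | 17 | 45 |
| PicoPLEX WGA v3, 1 cell | Rep2 | 67 | 1 | 92% | 7 | 38 |
| PicoPLEX WGA v3, 5 cells | Rep1 | 69 | 0 | 95% | 5 | 45 |
| PicoPLEX WGA v3, 5 cells | Rep2 | 67 | 1 | 92% | 7 | 45 |
| DOPlify, 1 cell | Rep1 | 34 | 5 | 47% | 40 | 36 |
| DOPlify, 1 cell | Rep2 | 57 | 1 | 78% | 17 | 31 |
| DOPlify, 5 cells | Rep1 | 62 | 0 | 85% | 12 | 38 |
| DOPlify, 5 cells | Rep2 | 67 | 7 | 92% | 7 | 41 |
| REPLI-g, 1 cell | Rep1 | Failed | Failed | Failed | Failed | Failed |
| REPLI-g, 1 cell | Rep2 | Failed | Failed | Failed | Failed | Failed |
| REPLI-g, 5 cells | Rep1 | 40 | 0 | 55% | 34 | 32 |
| REPLI-g, 5 cells | Rep2 | Failed | Failed | Failed | Failed | Failed |

Averages per sample type:

| Sample | Average false positives | Average call rate | Average locus dropouts | Average allele dropouts |
|---|---|---|---|---|
| PicoPLEX WGA v3, 1 cell | 0.02% | 85% | 16.2% | 7.8% |
| PicoPLEX WGA v3, 5 cells | 0.005% | 93% | 8.1% | 0.0% |
| DOPlify, 1 cell | 0.03% | 62% | 38.5% | 25.6% |
| DOPlify, 5 cells | 0.035% | 88% | 12.8% | 12.2% |
| REPLI-g, 1 cell | Failed | Failed | Failed | Failed |
| REPLI-g, 5 cells, Rep1 | 0 | 55% | 45.9% | 71.1% |
| REPLI-g, 5 cells, Rep2 | Failed | Failed | Failed | Failed |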
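The averaged locus and allele dropout values for the PicoPLEX and DOPlify samples can be reproduced from the per-replicate counts with a short calculation. The sketch below assumes that dropout is computed per replicate as (expected − called) / expected and then averaged, using the 74 capturable SNVs for locus dropout and the 45 heterozygous SNVs called in bulk for allele dropout. This formula is inferred from the tabulated numbers and is not stated in the note; the helper name `mean_dropout_pct` is hypothetical.

```python
EXPECTED_SNVS = 74  # capturable SNVs (78 expected minus 4 not covered by 75-bp reads)
BULK_HET_SNVS = 45  # heterozygous SNVs called in the bulk gDNA control

# Per-replicate counts (Rep1, Rep2) taken from the results table.
replicates = {
    "PicoPLEX WGA v3, 1 cell":  {"called": (57, 67), "het": (45, 38)},
    "PicoPLEX WGA v3, 5 cells": {"called": (69, 67), "het": (45, 45)},
    "DOPlify, 1 cell":          {"called": (34, 57), "het": (36, 31)},
    "DOPlify, 5 cells":         {"called": (62, 67), "het": (38, 41)},
}


def mean_dropout_pct(observed, expected):
    """Average of per-replicate (expected - observed) / expected, in percent."""
    rates = [(expected - n) / expected for n in observed]
    return round(100 * sum(rates) / len(rates), 1)


results = {
    name: {
        "locus_dropout_pct": mean_dropout_pct(counts["called"], EXPECTED_SNVS),
        "allele_dropout_pct": mean_dropout_pct(counts["het"], BULK_HET_SNVS),
    }
    for name, counts in replicates.items()
}

for name, values in results.items():
    print(name, values)
# PicoPLEX WGA v3, 1 cell  -> locus 16.2, allele 7.8
# PicoPLEX WGA v3, 5 cells -> locus 8.1,  allele 0.0
# DOPlify, 1 cell          -> locus 38.5, allele 25.6
# DOPlify, 5 cells         -> locus 12.8, allele 12.2
```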
Accurate detection of segmental aneuploidies with low-pass sequencing
The performance of PicoPLEX Single Cell WGA Kit v3 was examined by preparing samples with known segmental aneuploidies. These include cell line GM22601, known to contain a 25.5 Mb deletion in chromosome 4 that is implicated in Wolf-Hirschhorn Syndrome, and GM05067, known to contain an amplification of a 44.7 Mb region in chromosome 9. Single-cell preparations using PicoPLEX WGA v3, followed by library preparation and Illumina sequencing, showed detection of both aberrations, along with good global genome representation for GM12878, a standard euploid cell line. These data demonstrate excellent resolution in the detection of segmental aneuploidies, in addition to the reliable detection of chromosomal aneuploidies (data not shown).
Best-in-class reproducibility
While there are many options for whole-genome amplification, none are as reproducible as PicoPLEX technology. We compared the reproducibility of the GM12878 single-cell preparations (see table above) using PicoPLEX WGA v3 and two competitors (DOPlify and REPLI-g, Figure 3). As expected, PicoPLEX demonstrated significantly better reproducibility than the competitor systems. This reproducibility is critical when samples are limiting, and an experimenter does not have the luxury of performing multiple preparations to obtain a reliable result.
Conclusions
In summary, PicoPLEX WGA v3 enables preparation of amplified DNA in under three hours that is highly dependable and yields accurate measurement of single-nucleotide variants and copy number variations. When compared to the QIAGEN REPLI-g and PerkinElmer DOPlify systems, PicoPLEX WGA v3 shows superior SNV detection and reproducibility. In addition, detection of segmental aneuploidies at a resolution of 25.5 Mb is demonstrated. The improvements in PicoPLEX WGA v3 make it an excellent choice for a variety of single-cell and low-input applications, including characterizing the heterogeneity and tumor evolution of cancer tissues, and profiling circulating tumor and immune cells.
Methods
Sample preparation
GM12878 cells were sourced from the Coriell Institute, stained with CD81-FITC antibody, and flow sorted using a BD FACSJazz instrument. WGA products were prepared from single-cell samples of GM12878 in replicates, using a prototype of PicoPLEX WGA v3. One nanogram of amplified product was used as input for a Nextera XT kit, and the resulting libraries were sequenced on an Illumina MiSeq platform using a read length of 2 × 75 bp.
Bioinformatic analysis
FASTQ reads were trimmed to remove the primer sequence from the 5' end of the read. Trimmed reads were aligned using BWA (default parameters). Single-nucleotide variants were generated using GATK (according to its best-practices guidelines, found at https://software.broadinstitute.org/gatk/best-practices/) and filtered at a minimum depth of 10X, with a minimum quality score of 75. Allele drop-out rates were calculated as described in Leung et al. 2015. CNVs were generated using CNV-seq (Xie and Tammi, 2009). Normalized counts in 50 kb bins from H929 cells were compared to GM12878 cells (euploid reference) to detect CNVs.
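The bin-based comparison described above can be sketched as follows. This is a simplified illustration of comparing normalized read counts in 50 kb bins between a test sample and a euploid reference. It is not the CNV-seq implementation: it handles a single chromosome, and the function names and the pseudocount are assumptions made for the example.

```python
import math
from collections import Counter

BIN_SIZE = 50_000  # 50 kb bins, as in the Methods


def bin_counts(read_starts):
    """Count reads per fixed-width bin from 0-based start positions."""
    return Counter(pos // BIN_SIZE for pos in read_starts)


def log2_ratios(test_starts, ref_starts, pseudocount=0.5):
    """Per-bin log2 ratio of library-size-normalized counts (test vs. reference).

    The pseudocount (an assumption of this sketch) avoids division by zero
    in bins with no reads.
    """
    test, ref = bin_counts(test_starts), bin_counts(ref_starts)
    test_total, ref_total = sum(test.values()), sum(ref.values())
    if test_total == 0 or ref_total == 0:
        raise ValueError("both samples need at least one read")
    ratios = {}
    for b in sorted(set(test) | set(ref)):
        t = (test[b] + pseudocount) / test_total
        r = (ref[b] + pseudocount) / ref_total
        ratios[b] = math.log2(t / r)
    return ratios


# Toy data: bin 1 has half the reads in the test sample relative to the reference.
test_reads = [1_000 * i for i in range(10)] + [50_000 + 1_000 * i for i in range(5)]
ref_reads = [1_000 * i for i in range(10)] + [50_000 + 1_000 * i for i in range(10)]
ratios = log2_ratios(test_reads, ref_reads)
print(ratios)

# Number of 50 kb bins spanned by a 25.5 Mb segment.
bins_in_deletion = 25_500_000 // BIN_SIZE
print(bins_in_deletion)  # 510
```

A sustained run of negative ratios across adjacent bins would suggest a loss and a run of positive ratios a gain. At this bin size, a 25.5 Mb segment spans 510 bins, so a consistent shift can be visible even when each individual bin holds few reads.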
References
Leung, M. L. et al. SNES: single nucleus exome sequencing. Genome Biol. 16, 55 (2015).
Xie, C. & Tammi, M. T. CNV-seq, a new method to detect copy number variation using high-throughput sequencing. BMC Bioinformatics 10, 80 (2009).