Jul 2018

Detection of Heteroplasmic Variants in the Mitochondrial Genome through Massive Parallel Sequencing

Abstract

Detecting heteroplasmies in the mitochondrial DNA (mtDNA) has been a challenge for many years. In the past, Sanger sequencing was the main option for this analysis; however, this method could not detect low-frequency heteroplasmies. Massive Parallel Sequencing (MPS) provides the opportunity to study the mtDNA in depth, but a controlled pipeline is necessary to reliably retrieve and quantify low-frequency variants. It has been shown that differences in methods can significantly affect the number and frequency of the retrieved variants.

In this protocol, we present a method involving both wet lab and bioinformatics steps that allows the identification and quantification of single nucleotide variants in the full mtDNA sequence, down to a heteroplasmic load of 1.5%. For this, we set up a PCR-based amplification of the mtDNA, followed by MPS using Illumina chemistry, and variant calling with two different algorithms, mtDNA server and Mutect.

The PCR amplification is used to enrich the mitochondrial fraction, while the bioinformatic processing with two algorithms is used to discriminate the true heteroplasmies from background noise. The protocol described here allows for deep sequencing of the mitochondrial DNA in bulk DNA samples as well as single cells (both large cells such as human oocytes, and small-sized single cells such as human embryonic stem cells) with minor modifications to the protocol.

Keywords: mtDNA, Massive parallel sequencing, Heteroplasmy, Long-range PCR, Amplicon sequencing

Background

In the past, the methods used to study the mtDNA included PCR-RFLP (PCR-restriction fragment length polymorphism), Sanger sequencing and mitochondrial DNA re-sequencing using Affymetrix’s MitoChip. However, these methods cannot accurately quantify heteroplasmic loads under 10%. Massive Parallel Sequencing (MPS) represents, in all likelihood, the best solution for investigating variants in the mitochondrial genome. However, two key factors make mtDNA analysis less straightforward than analysis of the nuclear genome. First, the mtDNA contains regions with significant homology to regions dispersed within the nuclear genome, called nuclear mitochondrial DNA sequences (NuMTS). Second, multiple mtDNA copies are present within a cell, and variants can be present at frequencies ranging between 0 and 100%. For this reason, different strategies can be applied in MPS analysis of the mtDNA to accurately identify and quantify single nucleotide variants (SNVs).

The best approach to overcome the first problem is to enrich the sample for its mitochondrial genome only. This can be achieved by selectively amplifying the mtDNA by long-range PCR, by isolating the mtDNA using mtDNA enrichment kits, or by isolating the mitochondria themselves prior to DNA extraction (Just et al., 2015). The second issue is more challenging. Whilst MPS provides the ideal type of data to simultaneously identify SNVs and/or rearrangements and calculate their loads, the manner in which the data are generated and processed determines not only the type of variants that can be detected, but also their lower detection threshold. If an SNV is present at a very low frequency, its signal will be indistinguishable from the system’s sequencing errors (Bai and Wong, 2004; Rohlin et al., 2009; Zhang et al., 2012; Ye et al., 2014).

Recently, many pipelines have been released to identify and quantify variants. However, bioinformatic processing is not the only critical step in these analyses. In our hands, the initial amplification of the template is an extremely important step for the correct evaluation of SNV frequencies: a suboptimal PCR amplification leads to a gross overestimation of the retrieved frequencies (Zambelli et al., 2017). This is especially the case for PCR protocols with higher cycle numbers and primer sets that do not result in a linear amplification. The method we present here can be used to accurately detect and quantify single nucleotide variants at a low heteroplasmic load (as low as 1.5%) in both bulk DNA samples and single cells (Zambelli et al., 2017 and 2018). We describe here the amplification conditions and bioinformatic processing for both bulk DNA and single cells, with detailed information and screenshots for the bioinformatic steps.

Materials and Reagents

  1. 96-well plate
  2. 200 μl Eppendorf tubes
  3. LongAmp Buffer (LongAmp Taq DNA Polymerase kit) (New England Biolabs, catalog number: M0323L), stored at -20 °C
  4. Taq LongAmp (LongAmp Taq DNA Polymerase kit) (New England Biolabs, catalog number: M0323L), stored and kept at all times at -20 °C
  5. dNTPs (dNTP sets, Illustra™) (VWR, catalog number: 28-4065-57), diluted to 2 mM and stored at -20 °C
  6. Tricine (Sigma-Aldrich, catalog number: T9784), diluted to 200 mM and stored at 4 °C
  7. DTT (DL-Dithiothreitol) (Sigma, catalog number: D-0632), stored at 4 °C
  8. NaOH (Stock solution: 1 M)
  9. TBE buffer (1x) (Thermo Fisher Scientific, Invitrogen™, catalog number: 15581044)
  10. Agarose (Agarose DNA Pure Grade, VWR, catalog number: 443666A)
  11. PCR Purification Kit (AMPure XP for PCR purification, Beckman Coulter, catalog number: A63882)
  12. Primer set 1 (5042f-1424r) (Integrated DNA Technologies, diluted to 10 μmol, aliquoted and stored at -20 °C)
    Forward primer: 5′-AGC AGT TCT ACC GTA CAA CC-3′
    Reverse primer: 5′-ATC CAC CTT CGA CCC TTA AG-3′
  13. Primer set 2 (528f-5789r) (Integrated DNA Technologies, diluted to 10 μmol, aliquoted and stored at -20 °C)
    Forward primer: 5′-TGC TAA CCC CAT ACC CCG AAC C-3′
    Reverse primer: 5′-AAG AAG CAG CTT CAA ACC TGC C-3′

Note: These two primer sets (Items 12 and 13) were selected because they were able to amplify large fragments of the mtDNA and tested negative when amplifying Rho Zero cells, indicating that they did not amplify NuMTS in the nuclear genome. These primer sets were also evaluated for their performance in low frequency heteroplasmy calling by performing spike-in experiments and were shown to give the better estimation of the low frequency variants (Zambelli et al., 2017).
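As a quick sanity check on the two primer sets, the sketch below estimates the amplicon lengths and confirms that together they span the whole circular genome. It assumes that the numbers in the primer-set names (5042f-1424r and 528f-5789r) are positions on the rCRS reference NC_012920.1 (16,569 bp); this reading of the names is an assumption and is not stated explicitly in the protocol. The function name `amplicon_positions` is hypothetical.

```python
MTDNA_LENGTH = 16569  # length of the rCRS reference NC_012920.1, in bp


def amplicon_positions(forward, reverse, genome_length=MTDNA_LENGTH):
    """Return the 1-based positions spanned from `forward` to `reverse`
    on a circular genome, wrapping past the origin when reverse < forward."""
    if reverse >= forward:
        return set(range(forward, reverse + 1))
    return set(range(forward, genome_length + 1)) | set(range(1, reverse + 1))


# Coordinates read off the primer-set names (assumed, see the text above).
set1 = amplicon_positions(5042, 1424)  # wraps around the origin
set2 = amplicon_positions(528, 5789)

print(len(set1))                         # 12952
print(len(set2))                         # 5262
print(len(set1 | set2) == MTDNA_LENGTH)  # True: full coverage
print(len(set1 & set2))                  # 1645 bp covered by both amplicons
```

Under this assumption the longer amplicon is roughly 13 kb and the shorter one roughly 5.3 kb, which is in line with the longer extension time used for primer set 1 in the PCR program.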

  14. Qubit™ dsDNA Broad Range Assay Kit (Invitrogen™, catalog number: Q32853)
  15. Ethanol (80%)
  16. Tris-HCl solution (10 mM, pH 8.0)
  17. Library Preparation kit (KAPA HyperPlus kit, Roche, catalog number: 07962436001)
  18. Fragmentation Kit (HS NGS Fragment Kit, Agilent, catalog number: DNF-474)
  19. Tween 20 (Sigma-Aldrich, catalog number: P9416)
  20. MPS Reagent Kit (NovaSeq6000 S2 Reagent Kit [200 cycles]) (Illumina, catalog number: 20012861)
  21. Gel electrophoresis (see Recipes)
  22. Alkaline Lysis Buffer (see Recipes)
  23. EBT (Elution Buffer with Tris) buffer (see Recipes)

Equipment

  1. Agarose Gel electrophoresis apparatus
  2. Power Supply (Electrophoresis Power Supply–EPS 301) (GE Healthcare, catalog number: 18113001)
  3. Thermal Cycler (Veriti™ 96-Well, Applied Biosystems, catalog number: 4375786)
  4. Magnet
  5. Qubit Fluorometric Quantification (Invitrogen™, catalog number: Q33238)
  6. Fragment Analyzer (Agilent, formerly Advanced Analytical Technologies, https://www.aati-us.com/instruments/fragment-analyzer/)
  7. Sequencing system (e.g., NovaSeq6000, Illumina, catalog number: 20012850)

Software

  1. bcl2fastq conversion software v2.19 script (Illumina, http://emea.support.illumina.com/downloads/bcl2fastq-conversion-software-v2-19.html)
  2. Seqtk (https://bioconda.github.io/recipes/seqtk/README.html)
  3. BWA-MEM
  4. GATK v3.3 (Broad Institute, https://software.broadinstitute.org/gatk/)
  5. GATK v3.6 Mutect2 (Broad Institute, https://software.broadinstitute.org/gatk/documentation/tooldocs/3.7-0/org_broadinstitute_gatk_tools_walkers_cancer_m2_MuTect2.php)
  6. Mpileup (SAMtools, http://samtools.sourceforge.net/mpileup.shtml)
  7. mtDNA server (https://mtdna-server.uibk.ac.at/index.html)
  8. MitoWheel (http://mitowheel.org/mitowheel.html)
  9. MitImpact (http://mitimpact.css-mendel.it/)
  10. MutPred2 (http://mutpred.mutdb.org/)

Procedure

  1. Extraction of bulk DNA: extract DNA from the sample of interest. 
    1. The sample can be bulk tissue or cell lines. No special kit is required for bulk analysis; for instance, we routinely use either the DNeasy kit from Qiagen or a phenol/chloroform extraction followed by ethanol precipitation (Zambelli et al., 2018).
    2. For single cells, we manually collect them with the aid of a stereomicroscope under a horizontal flow in 2.5 μl of alkaline lysis buffer (200 mM NaOH and 50 mM DTT) in 200 μl Eppendorf tubes (Spits et al., 2006) and store them at -20 °C. Before amplification, the single cells are incubated for 10 min at 65 °C for lysis, then prepared for amplification as under Step 3. 
  2. Dilute the DNA sample to a working solution of 10 ng/μl (only for bulk DNA).
  3. Prepare the master mix as follows:

    *Tricine is used in the PCR reaction to buffer the alkaline lysis buffer used in the single cell collection.
  4. Aliquot 45 μl of the master mix to PCR tubes and add 5 μl of 10 ng/μl DNA for bulk.
    Note: For single cells, add 47.5 μl of the master mix to the Eppendorf tubes with the cells collected in the alkaline lysis buffer.
  5. Use the following PCR program:
    1. step 1 (Initiation): 30 s at 94 °C
    2. step 2 (8 cycles): 15 s at 94 °C
      30 s at 64 °C minus 0.4 °C per cycle (this is the touchdown step)
      11 min at 65 °C (for primer set 1) or 5 min at 65 °C (for primer set 2)
    3. step 3 (22 cycles for bulk; 27 cycles for large single cells (human oocytes), 37 cycles for other single cells):
      15 s at 94 °C
      30 s at 61 °C
      11 min at 65 °C (for primer set 1) or 5 min at 65 °C (for primer set 2)
    4. step 4 (Final): 11 min at 65 °C (for primer set 1) or 5 min at 65 °C (for primer set 2)
    5. step 5: cooling step at 4 °C until storage.
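The touchdown phase and the total cycle counts of the program above can be tabulated with a few lines of Python. This is a minimal sketch: it assumes that the first touchdown cycle anneals at 64.0 °C and that the 0.4 °C decrement applies from the second cycle onward, which is the usual thermocycler convention but is not spelled out in the protocol. The function and dictionary names are hypothetical.

```python
def touchdown_annealing_temps(start_c=64.0, decrement_c=0.4, cycles=8):
    """Annealing temperature of each touchdown cycle, rounded to 0.1 °C."""
    return [round(start_c - decrement_c * i, 1) for i in range(cycles)]


# Cycle numbers of step 3, taken from the PCR program above.
STEP3_CYCLES = {"bulk": 22, "oocyte": 27, "other_single_cell": 37}


def total_cycles(sample_type, touchdown_cycles=8):
    """Touchdown cycles (step 2) plus fixed-temperature cycles (step 3)."""
    return touchdown_cycles + STEP3_CYCLES[sample_type]


print(touchdown_annealing_temps())
# [64.0, 63.6, 63.2, 62.8, 62.4, 62.0, 61.6, 61.2]
print(total_cycles("bulk"))               # 30
print(total_cycles("oocyte"))             # 35
print(total_cycles("other_single_cell"))  # 45
```

Under this convention the last touchdown cycle anneals at 61.2 °C, so the hand-over to the fixed 61 °C annealing of step 3 is nearly seamless.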
  6. Keep the amplicons stored at 4 °C (for short-term storage) or -20 °C (long-term storage) until further processing.
  7. Confirm successful amplification by agarose gel electrophoresis (1% gel for bulk, 1.5% gel for single cells) (see Recipes).
  8. Load 9 μl per sample on the gel and run the gel at 80 V for 1 h. Successful amplification should be as shown in Figure 1.


    Figure 1. Successful amplification of the two primer sets compared to a DNA ladder shown by gel-electrophoresis

  9. Clean up the samples with AMPure beads as follows.
    Note: This is usually done in batches of 96 samples in a 96-well plate.
    1. Add 54 μl of AMPure beads to 30 μl of each amplicon (the ratio is 1.8x).
    2. Mix the samples by pipetting up and down and incubate the samples for 5 min at room temperature.
    3. Place the samples on the magnet for 5 min or until the solution is clear.
    4. Remove the supernatant without disturbing the beads.
    5. Add 200 μl ethanol (80%).
    6. Incubate the samples for 30 s on the magnet.
    7. Remove the supernatant without disturbing the beads.
    8. Repeat sub-steps 5-7 of Step 9 (the 80% ethanol wash).
    9. Let the beads dry for 5 min at room temperature.
    10. Remove the samples from the magnet.
    11. Add 30 μl of Tris-HCl (10 mM, pH 8.0) and incubate the samples for 2 min at room temperature.
    12. Place the samples on the magnet until the solution is clear.
    13. Transfer the supernatant into a new 96-well plate.
  10. Quantify the DNA concentration of the samples using the Qubit according to the manufacturer’s instructions.
  11. Pool the amplicons of primer sets 1 and 2 per sample. To maintain uniform coverage, mix the amplicons in a 0.35/0.65 ratio (35% of the shorter amplicon generated by primer set 2 and 65% of the longer amplicon generated by primer set 1). Use a total of 500 ng DNA in 17.5 μl Tris-HCl (10 mM, pH 8.0).
    Note: The procedure is also possible for each amplicon separately, again with a total of 500 ng.
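The pooling arithmetic can be sketched as follows. The example assumes that the 0.35/0.65 ratio is applied by mass (175 ng of the primer set 2 amplicon and 325 ng of the primer set 1 amplicon); the protocol does not say explicitly whether the ratio is by mass or by moles. The function name `pooling_volumes` and the Qubit concentrations in the example are hypothetical.

```python
def pooling_volumes(conc_set1, conc_set2, total_ng=500.0,
                    frac_set1=0.65, final_volume_ul=17.5):
    """Volumes (μl) of each purified amplicon and of Tris-HCl needed to
    reach `total_ng` of DNA in `final_volume_ul`.

    conc_set1, conc_set2: Qubit concentrations in ng/μl.
    """
    mass_set1 = total_ng * frac_set1   # longer amplicon, 65%
    mass_set2 = total_ng - mass_set1   # shorter amplicon, 35%
    vol_set1 = mass_set1 / conc_set1
    vol_set2 = mass_set2 / conc_set2
    vol_buffer = final_volume_ul - vol_set1 - vol_set2
    if vol_buffer < 0:
        raise ValueError("Amplicons too dilute to fit 500 ng in the final volume")
    return vol_set1, vol_set2, vol_buffer


# Hypothetical concentrations: 50 ng/μl (set 1) and 70 ng/μl (set 2).
v1, v2, vb = pooling_volumes(50.0, 70.0)
print(round(v1, 2), round(v2, 2), round(vb, 2))  # 6.5 2.5 8.5
```

The function raises an error when the amplicons are too dilute, since 500 ng in 17.5 μl requires a combined concentration of at least about 28.6 ng/μl.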
  12. Prepare the library with the KAPA HyperPlus Kit using half of the reagent volumes as specified by the supplier. This is the protocol in summary:
    1. Prepare the fragmentation mix (on ice):


    2. Vortex the fragmentation mix and spin down.
    3. Add 7.5 μl of the fragmentation mix to 17.5 μl of the purified DNA sample and mix by pipetting (work on ice).
    4. Incubate the samples in a thermocycler (lid heated at 50 °C).
      10 min at 4 °C
      15 min at 37 °C
      Keep the samples at 4 °C
    5. Proceed immediately to the End Repair and A-tailing.
    6. Prepare the End Repair and A-tailing mix as follows:


    7. Mix the End Repair and A-tailing mix by vortexing and spin down.
    8. Add 5 μl of the End Repair and A-tailing mix to the fragmented DNA.
    9. Incubate the samples in the thermocycler (set the heated lid to 85 °C).
      30 min at 65 °C
      Keep the samples at 4 °C.
    10. Proceed immediately to the Ligation step.
    11. Prepare the Ligation mix as follows:


    12. Mix the Ligation mix by pipetting up and down.
    13. Add 20 μl of the Ligation mix to 30 μl of the End Repaired and A-tailed DNA.
    14. Add a unique adapter per sample: 5 μl of 7.5 μM Illumina TruSeq Unique Dual (UD) indexed adapter.
      Note: These adapters may be custom produced by an oligo supplier (e.g., Integrated DNA Technologies) or may be purchased as a kit (e.g., IDT for Illumina–TruSeq DNA UD Indexes). Be aware that most suppliers deliver ready-made adapters at 15 μM while in this protocol we use 7.5 μM.
    15. Mix thoroughly by pipetting.
    16. Incubate the samples in a thermocycler (no heated lid) for 15 min at 20 °C.
    17. Proceed immediately with the bead sample clean up. 
  13. Clean up the libraries with AMPure beads as follows:
    1. Add 44 μl of AMPure beads to the 55 μl ligation reaction (30 μl of end-repaired and A-tailed DNA + 20 μl of ligation mix + 5 μl of adapter; the ratio is 0.8x).
    2. Mix the samples by pipetting up and down and incubate the samples for 5 min at room temperature.
    3. Place the samples on a magnet for 5 min or until the solution is clear.
    4. Remove the supernatant without disturbing the beads.
    5. Add 200 μl of ethanol (80%).
    6. Incubate the samples for 30 s on the magnet.
    7. Remove the supernatant without disturbing the beads.
    8. Repeat sub-steps 5-7 of Step 13 (the 80% ethanol wash).
    9. Let the beads dry for 5 min at room temperature.
    10. Remove the samples from the magnet.
    11. Add 27 μl of Tris-HCl (10 mM, pH 8.0).
    12. Incubate the samples for 2 min at room temperature.
    13. Place the samples on the magnet until the solution is clear. 
    14. Transfer 25 μl of the eluate to a new 96-well plate.
    Note: This can be a stopping point. Keep the samples in the fridge (4 °C) for 24 h, or for longer storage, keep them in the freezer (-20 °C).
  14. Size-select the libraries with AMPure beads as follows:
    1. Add 75 μl of nuclease-free water to the purified libraries.
    2. Add 50 μl of AMPure beads to 100 μl of the library (the ratio is 0.5x).
    3. Mix the samples by pipetting up and down and incubate the samples for 5 min at room temperature.
    4. Place the samples on the magnet for 5 min or until the solution is clear.
    5. Transfer 140 μl of the supernatant to a new 96-well plate.
    6. Add 20 μl of AMPure beads to the 140 μl of supernatant (this brings the cumulative bead ratio to 0.7x relative to the original 100 μl of library).
    7. Mix the samples by pipetting up and down and incubate the samples for 5 min at room temperature.
    8. Place the samples on the magnet for 5 min or until the solution is clear.
    9. Remove the supernatant without disturbing the beads.
    10. Add 200 μl of ethanol (80%).
    11. Incubate the samples for 30 s on the magnet.
    12. Remove supernatant without disturbing the beads.
    13. Repeat sub-steps 10-12 of Step 14 (the 80% ethanol wash).
    14. Let the beads dry for 5 min at room temperature.
    15. Remove the samples from the magnet.
    16. Add 15 μl of Tris-HCl (10 mM; pH 8.0).
    17. Incubate the samples for 2 min at room temperature.
    18. Place the samples on the magnet until the solution is clear.
    19. Transfer 13 μl of the eluate to a new 96-well plate.
    Note: This can be a stopping point. Keep the samples in the fridge (4 °C) for 24 h or, for longer storage, keep them in the freezer (-20 °C).
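The bead volumes used in the clean-up (Step 9) and in the double-sided size selection (Step 14) follow from the bead-to-sample ratios. The sketch below reproduces them under one assumption: in the size selection, the second ratio is treated as cumulative and relative to the original 100 μl of library, ignoring the roughly 10 μl of supernatant left behind at the transfer. The function names are hypothetical.

```python
def bead_volume(sample_ul, ratio):
    """Bead volume (μl) for a single-sided clean-up at the given ratio."""
    return round(sample_ul * ratio, 1)


def double_sided_selection(library_ul, first_ratio, final_ratio):
    """Bead volumes (μl) for the first and second bead additions.

    The first addition removes large fragments (they bind the beads, the
    supernatant is kept); the second raises the cumulative ratio to
    `final_ratio` so that the wanted fragments bind.
    """
    first = round(library_ul * first_ratio, 1)
    second = round(library_ul * (final_ratio - first_ratio), 1)
    return first, second


print(bead_volume(30, 1.8))                   # 54.0 (Step 9)
print(double_sided_selection(100, 0.5, 0.7))  # (50.0, 20.0) (Step 14)
```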
  15. Check the quality of the final library (1/10 dilution of the prepared library) by electrophoresis on the AATI Fragment Analyzer using the HS NGS Fragment Kit. An example of successful libraries is shown in Figure 2.


    Figure 2. Successfully prepared libraries shown by electrophoresis on an AATI Fragment Analyzer

  16. Quantify all samples using the Qubit according to the supplier’s instructions.
  17. Use the average size (smear analysis on the AATI Fragment Analyzer between 200 and 1500 base pairs) and the Qubit concentration, to calculate the molarity of the obtained library. Use the formula below:
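The formula itself is not reproduced in the text. A conventional conversion for double-stranded DNA, assumed here rather than taken from the authors, is molarity (nM) = concentration (ng/μl) × 10^6 / (660 × average fragment size in bp), where 660 g/mol is the usual average mass of one base pair. The sketch below applies it and computes the dilution to 2 nM used in the next step; the function names and the example values are hypothetical.

```python
AVG_BP_MASS = 660.0  # g/mol per base pair of dsDNA (conventional average, assumed)


def library_molarity_nm(conc_ng_per_ul, avg_size_bp):
    """Library molarity in nM from the Qubit concentration (ng/μl) and the
    average fragment size (bp) from the Fragment Analyzer smear analysis."""
    return conc_ng_per_ul * 1e6 / (AVG_BP_MASS * avg_size_bp)


def dilution_to_target(molarity_nm, library_ul, target_nm=2.0):
    """Return (final volume, buffer volume) in μl needed to dilute
    `library_ul` of library to `target_nm`."""
    if molarity_nm < target_nm:
        raise ValueError("Library is already below the target molarity")
    final_ul = library_ul * molarity_nm / target_nm
    return final_ul, final_ul - library_ul


# Hypothetical library: 10 ng/μl with an average size of 500 bp.
molarity = library_molarity_nm(10.0, 500)
final_ul, buffer_ul = dilution_to_target(molarity, 2.0)
print(round(molarity, 1))   # 30.3 nM
print(round(final_ul, 1))   # 30.3 μl final volume
print(round(buffer_ul, 1))  # 28.3 μl of EBT buffer added to 2 μl of library
```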



  18. Dilute all libraries to 2 nM in EBT buffer.
  19. Pool all the samples by combining equal volumes of each 2 nM library.
  20. Load the samples on the Illumina platform of choice. For the Illumina NovaSeq6000, perform the library denaturation as mentioned in the NovaSeq6000 Sequencing System Guide (Document 1000000019358v11). For 96 samples, use 6% capacity (matching the denaturation of 9 μl of 2 nM library pool) of the NovaSeq6000 S2 Reagent Kit (200 cycles).

Data analysis

  1. Demultiplex the base call (.bcl) files to .fastq files with the Illumina bcl2fastq v2.19 conversion software.
  2. Randomly extract 1.5 million reads from the .fastq files with the ‘seqtk’ tool.
    Note: This is to reduce the number of reads because a high number can cause computational problems in Step 6.
  3. Align the .fastq file to the reference genome (NC_012920.1) and generate a .bam file using BWA-MEM.
  4. Perform realignment around insertions and deletions and base quality recalibration (from .bam to .bam) with GATK v3.3.
  5. For variant calling, use the GATK v3.6 Mutect2 tool (from .bam to .vcf).
    Note: The code for the previous steps 1-5 can be found in the Supplementary File 1. 
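For orientation only, the sketch below assembles command lines for the subsampling, alignment, indel realignment and variant calling steps. It is not the authors' code (that is in Supplementary File 1) and rests on several assumptions: paired-end reads, the file naming scheme, the seed, the read-group string and the GATK jar path are all hypothetical; base recalibration is left out because it needs a known-sites resource that is not named in the text; and the reference must already be indexed for BWA and GATK. The function only builds the command strings and does not run them.

```python
def build_commands(sample, reference="NC_012920.1.fasta",
                   n_reads=1_500_000, seed=42,
                   gatk_jar="GenomeAnalysisTK.jar"):
    """Return shell command strings for one sample (illustrative only)."""
    r1, r2 = f"{sample}_R1.fastq.gz", f"{sample}_R2.fastq.gz"
    sub1, sub2 = f"{sample}_R1.sub.fastq", f"{sample}_R2.sub.fastq"
    bam = f"{sample}.sorted.bam"
    intervals = f"{sample}.intervals"
    realigned = f"{sample}.realigned.bam"
    # GATK requires read groups, so one is attached at alignment time.
    read_group = f"@RG\\tID:{sample}\\tSM:{sample}\\tPL:ILLUMINA"
    gatk = f"java -jar {gatk_jar}"
    return [
        # The same seed for both mates keeps the read pairs in sync.
        f"seqtk sample -s{seed} {r1} {n_reads} > {sub1}",
        f"seqtk sample -s{seed} {r2} {n_reads} > {sub2}",
        f"bwa mem -R '{read_group}' {reference} {sub1} {sub2}"
        f" | samtools sort -o {bam} -",
        f"samtools index {bam}",
        f"{gatk} -T RealignerTargetCreator -R {reference} -I {bam}"
        f" -o {intervals}",
        f"{gatk} -T IndelRealigner -R {reference} -I {bam}"
        f" -targetIntervals {intervals} -o {realigned}",
        f"{gatk} -T MuTect2 -R {reference} -I:tumor {realigned}"
        f" -o {sample}.mutect2.vcf",
    ]


for command in build_commands("sample01"):
    print(command)
```

Using an identical seed for both seqtk calls is the important design point: it selects the same read pairs from the two files, so the subsampled mates remain matched for BWA-MEM.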
  6. Upload the .bam file to mtDNA server (https://mtdna-server.uibk.ac.at/index.html). 
  7. The mtDNA server report shows:
    1. Heteroplasmic variants (an example is shown in Figure 3).


      Figure 3. Report of the detected heteroplasmic variants by mtDNA server

    2. Homoplasmic variants (an example is shown in Figure 4).


      Figure 4. Report of the detected homoplasmic variants by mtDNA server

    3. Haplogroup and possible contamination with other haplogroups (an example is shown in Figure 5).


      Figure 5. Report of the haplogroup detected by mtDNA server

    4. Coverage plot (an example is shown in Figure 6).


      Figure 6. Example of a coverage plot provided by mtDNA server

  8. Open the Mutect file and sort the variants per frequency.
    1. Open the Mutect file and follow the process shown in Figure 7:


      Figure 7. Sequential steps to open a Mutect file in Microsoft Excel. Make sure that “Delimited” is selected in step 1 and follow the standard settings.

    2. When the file is opened, select the complete column below cell J39 as shown in Figure 8.


      Figure 8. Lay-out when a Mutect file is opened and the complete column below cell J39 is selected

    3. Convert the selected area to columns by selecting the tool “Convert text to columns” as shown in Figure 9.


      Figure 9. Conversion of text to columns in the complete column below cell J39

    4. Follow these settings to complete the conversion, as shown in Figure 10.


      Figure 10. Sequential steps to complete the conversion from text to columns. Make sure that “Delimited” is indicated in step 1 and that the delimiter “other” is indicated with a semicolon as the delimiter. Follow the standard settings in step 3.

    5. Select all rows from Row 39 downward and sort by Column L from “Largest to smallest”, as shown in Figure 11.


      Figure 11. Example of the selected rows from Row 39 and of the sorting from largest to smallest from all rows below Row 39 

    6. Variants with a frequency above 0.015 (1.5%) in Column L are selected for further analysis. Column K lists the forward and reverse read counts for the position, separated by a comma. Only variants with total reads (forward + reverse) above 1,000 are considered (Figure 12).


      Figure 12. The cells shown in yellow are the variants that are considered for further analysis
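The same selection can be expressed in code once the spreadsheet columns have been read in. The sketch below assumes each variant is available as a record with the comma-separated read counts from Column K and the frequency from Column L; the record layout, the function names and the example rows are hypothetical and are not real data.

```python
def total_reads(read_counts):
    """Sum a comma-separated pair of read counts, e.g. '2100,1950'."""
    return sum(int(count) for count in read_counts.split(","))


def select_variants(rows, min_frequency=0.015, min_reads=1000):
    """Keep variants above both thresholds, sorted largest to smallest."""
    kept = [
        row for row in rows
        if row["frequency"] > min_frequency
        and total_reads(row["reads"]) > min_reads
    ]
    return sorted(kept, key=lambda row: row["frequency"], reverse=True)


# Made-up example rows for illustration only.
rows = [
    {"position": 100, "reads": "2100,1950", "frequency": 0.021},
    {"position": 200, "reads": "400,350", "frequency": 0.300},    # too few reads
    {"position": 300, "reads": "5000,4800", "frequency": 0.009},  # below 1.5%
    {"position": 400, "reads": "3000,2900", "frequency": 0.450},
]

for row in select_variants(rows):
    print(row["position"], row["frequency"])
# 400 0.45
# 100 0.021
```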

  9. Crosscheck the frequency of the variants detected by MuTect with the variants of mtDNA server and report possible insertions/deletions.
  10. If there is a discrepancy, calculate the frequency given by the coverage file as follows:
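The formula is not reproduced in the text. A natural reading, assumed here and not confirmed by the protocol, is that the frequency is the number of reads carrying the variant base divided by the total number of reads at that position in the coverage file. The sketch below implements this definition; the function name and the example counts are hypothetical.

```python
def variant_frequency(base_counts, variant_base):
    """Fraction of reads at one position that carry `variant_base`.

    base_counts: mapping of base -> read count at the position, as could
    be derived from a per-position coverage (pileup) file.
    """
    total = sum(base_counts.values())
    if total == 0:
        raise ValueError("No coverage at this position")
    return base_counts.get(variant_base, 0) / total


# Made-up counts: 9,700 reference reads (A) and 300 variant reads (G).
print(variant_frequency({"A": 9700, "G": 300}, "G"))  # 0.03
```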



  11. Annotate the variant by looking up the region on http://mitowheel.org/mitowheel.html. An example is shown in Figure 13.


    Figure 13. Results of the annotation provided by mitoWheel for the position 7306

  12. If the variant is in a protein-coding region, identify the potential amino acid change on http://mitimpact.css-mendel.it/. An example is shown in Figure 14.


    Figure 14. Results for the variants that cause an amino acid change on the position 7306 provided by MitImpact2

  13. Check if the amino acid change is pathogenic by uploading the amino acid sequence and amino acid changes on http://mutpred.mutdb.org/. An example is shown in Figure 15.


    Figure 15. Results of the pathogenicity predictor MutPred2 for the amino acid changes M468T and M468K in the MT-CO1 gene

  14. Certain regions were excluded from the analysis. These excluded regions can be found in the supplementary data of Zambelli et al., 2018 (https://www.cell.com/stem-cell-reports/fulltext/S2213-6711(18)30224-8#secsectitle0020).

Recipes

  1. Gel electrophoresis (1% agarose gel)
    50 ml of 1x TBE buffer
    0.5 g agarose (use 0.75 g for the 1.5% gel used for single cells)
  2. Alkaline Lysis Buffer
    200 mM NaOH
    50 mM DTT

  3. EBT (Elution Buffer with Tris) buffer
    10 mM Tris-HCl pH 8.5
    1% Tween 20

Acknowledgments

This work was funded by the Wetenschappelijke Fonds Willy Gepts (UZ Brussel), the Fonds voor Wetenschappelijk Onderzoek (FWO, 1506616N) and the Methusalem grant of the Vrije Universiteit Brussel to Prof. Karen Sermon.

Competing interests

None.

Ethics

In this protocol, we have used human mtDNA for optimization only. All donors gave their informed consent and all studies related to this protocol received approval of appropriate local ethical committees.

References

  1. Bai, R. K. and Wong, L. J. (2004). Detection and quantification of heteroplasmic mutant mitochondrial DNA by real-time amplification refractory mutation system quantitative PCR analysis: a single-step approach. Clin Chem 50(6): 996-1001.
  2. Just, R. S., Irwin, J. A. and Parson, W. (2015). Mitochondrial DNA heteroplasmy in the emerging field of massively parallel sequencing. Forensic Sci Int Genet 18: 131-139.
  3. Rohlin, A., Wernersson, J., Engwall, Y., Wiklund, L., Bjork, J. and Nordling, M. (2009). Parallel sequencing used in detection of mosaic mutations: comparison with four diagnostic DNA screening techniques. Hum Mutat 30(6): 1012-1020.
  4. Spits, C., Le Caignec, C., De Rycke, M., Van Haute, L., Van Steirteghem, A., Liebaers, I. and Sermon, K. (2006). Whole-genome multiple displacement amplification from single cells. Nat Protoc 1(4): 1965-1970.
  5. Ye, F., Samuels, D. C., Clark, T. and Guo, Y. (2014). High-throughput sequencing in mitochondrial DNA research. Mitochondrion 17: 157-163.
  6. Zambelli, F., Mertens, J., Dziedzicka, D., Sterckx, J., Markouli, C., Keller, A., Tropel, P., Jung, L., Viville, S., Van de Velde, H., Geens, M., Seneca, S., Sermon, K. and Spits, C. (2018). Random mutagenesis, clonal events, and embryonic or somatic origin determine the mtDNA variant type and load in human pluripotent stem cells. Stem Cell Reports 11(1): 102-114.
  7. Zambelli, F., Vancampenhout, K., Daneels, D., Brown, D., Mertens, J., Van Dooren, S., Caljon, B., Gianaroli, L., Sermon, K., Voet, T., Seneca, S. and Spits, C. (2017). Accurate and comprehensive analysis of single nucleotide variants and large deletions of the human mitochondrial genome in DNA and single cells. Eur J Hum Genet 25(11): 1229-1236.
  8. Zhang, W., Cui, H. and Wong, L. J. (2012). Comprehensive one-step molecular analyses of mitochondrial genome by massively parallel sequencing. Clin Chem 58(9): 1322-1331.

简介

多年来,检测线粒体DNA(mtDNA)中的异质体一直是一个挑战。过去,Sanger测序是进行此分析的主要选择,但是,这种方法无法检测低频异质性。大规模并行测序(MPS)提供了深入研究mtDNA的机会,但是需要受控的管道来可靠地检索和量化低频变体。已经表明,方法的差异可以显着影响检索变体的数量和频率。

在该方案中,我们提出了一种涉及湿实验室和生物信息学的方法,该方法允许在完整mtDNA序列中鉴定和定量单核苷酸变体,直至1.5%的异质载量。为此,我们建立了基于PCR的mtDNA扩增,然后是使用Illumina化学的MPS,以及使用两种不同算法mtDNA服务器和Mutect的变体调用。

PCR扩增用于富集线粒体部分,而使用两种算法的生物信息学处理用于区分真正的异质体和背景噪声。这里描述的方案允许对大量DNA样品中的线粒体DNA以及单个细胞(包括人类卵母细胞的大细胞和诸如人胚胎干细胞的小尺寸单细胞)进行深度测序,对方案进行微小修改。
【背景】过去,用于研究mtDNA的方法包括PCR-RFLP(PCR-限制性片段长度多态性),使用Affymetrix的MitoChip进行Sanger测序和线粒体DNA重新测序。然而,这些方法不能准确地将异质载荷量化为10%以下。大规模并行测序(MPS)很可能是研究线粒体基因组中变体的最佳解决方案。然而,在分析mtDNA时,有两个关键因素使得mtDNA分析与核基因组相比不那么简单。首先,mtDNA含有与分散在核基因组内的称为核线粒体DNA序列(NuMTS)的区域具有显着同源性的区域。其次,多个mtDNA拷贝存在于细胞内,并且变体可以以0-100%的频率存在。因此,当进行mtDNA的MPS分析时,可以应用不同的策略来准确鉴定和定量单核苷酸变体(SNV)。克服第一个问题的最佳方法是仅为其线粒体基因组富集样品。这可以通过长程PCR选择性扩增mtDNA,通过使用mtDNA富集试剂盒分离mtDNA,或通过在DNA提取前分离线粒体来实现(Just et al 。,2015)。第二个问题更具挑战性。虽然MPS提供理想类型的数据以同时识别SNV和/或重新排列并计算其负载,但是生成和处理数据的方式不仅将确定可检测的变体的类型,还将确定其较低的阈值。如果SNV以非常低的频率存在,其信号将与系统测序错误无法区分(Bai和Wong,2004; Rohlin et al 。,2009; Zhang et al 。,2012; Ye et al 。,2014)。最近,已经发布了许多管道来识别和量化变体。然而,生物信息处理并不代表这些分析中唯一关键的步骤。在我们的手中,我们发现模板的初始扩增是正确评估SNV频率的极其重要的步骤,因此次优的PCR扩增导致对检索到的频率的高估(Zambelli 等。,2017)。对于具有较高循环数的PCR方案和不导致线性扩增的引物组尤其如此。我们在这里提出的方法可以用于准确检测和量化大量DNA样本和单个细胞中低异质负荷(低至1.5%)的单核苷酸变异(Zambelli et al 。,2017和2018年)。我们在这里描述了大量DNA和单细胞的扩增条件和生物信息学处理,并提供了有关生物信息学步骤的详细信息和截图。

关键字:线粒体DNA, 大规模平行测定, 异质粒, 长片段PCR, 扩增子测序

材料和试剂

  1. 96孔板
  2. 200μlEppendorf管
  3. LongAmp缓冲液(LongAmp Taq DNA聚合酶试剂盒)(New England Biolabs,目录号:M0323L),储存于-20°C
  4. Taq LongAmp(LongAmp Taq DNA聚合酶试剂盒)(New England Biolabs,目录号:M0323L),始终保存并保存在-20°C
  5. dNTP(dNTP组,Illustra TM )(VWR,目录号:28-4065-57),稀释至2 mM并储存于-20°C
  6. Tricine(Sigma-Aldrich,目录号:T9784),稀释至200mM并在4℃下储存
  7. DTT(DL-二硫苏糖醇)(Sigma,目录号:D-0632),储存在4℃
  8. NaOH(储备溶液:1M)
  9. TBE缓冲液(1x)(Thermo Fisher Scientific,Invitrogen TM ,目录号:15581044)
  10. 琼脂糖(Agarose DNA Pure Grade,VWR,目录号:443666A)
  11. PCR纯化试剂盒(用于PCR纯化的AMPure XP,Beckman Coulter,目录号:A63882)
  12. 引物组1(5042f-1424r)(Integrated DNA Technologies,稀释至10μmol,等分并储存于-20°C)
    正向引物:5'-AGC AGT TCT ACC GTA CAA CC-3'
    反向引物:5'-ATC CAC CTT CGA CCC TTA AG-3'
  13. 引物组2(528f-5789r)(Integrated DNA Technologies,稀释至10μmol,等分并储存于-20°C)
    正向引物:5'-TGC TAA CCC CAT ACC CCG AAC C-3'
    反向引物:5'-AAG AAG CAG CTT CAA ACC TGC C-3'

注意:选择这两个引物组(第12和13项)是因为它们能够扩增mtDNA的大片段并在扩增Rho Zero细胞时检测为阴性,表明它们没有扩增核基因组中的NuMTS。通过进行掺入实验,还评估了这些引物组在低频异质调用中的表现,并且显示出对低频变体的更好估计(Zambelli等,2017)。

  1. Qubit TM dsDNA宽范围检测试剂盒(Invitrogen TM ,目录号:Q32853) 
  2. 乙醇(80%)
  3. Tris-HCl溶液(10 mM,pH 8.0)
  4. 文库制备试剂盒(KAPA HyperPlus试剂盒,罗氏,目录号:07962436001)
  5. 破碎试剂盒(HS NGS Fragment Kit,Agilent,目录号:DNF-474)
  6. 吐温20(西格玛奥德里奇,目录号:P9416)
  7. MPS试剂盒(NovaSeq6000 S2试剂盒[200个循环])(Illumina,目录号:20012861)
  8. 凝胶电泳(见食谱)
  9. 碱性裂解缓冲液(见食谱)
  10. EBT(带Tris的洗脱缓冲液)缓冲液(见食谱)

设备

  1. 琼脂糖凝胶电泳仪
  2. 电源(Electrophoresis Power Supply-EPS 301)(GE Health Care,目录号:18113001) 
  3. 热循环仪(Veriti TM 96-Well,Applied Biosystems,目录号:4375786)
  4. 磁铁
  5. Qubit荧光定量(Invitrogen TM ,目录号:Q33238)
  6. 片段分析仪(安捷伦以前的高级分析技术,安捷伦, https://www.aati-us的.com /乐器/片段分析器/ ) 
  7. 测序系统(例如,NovaSeq6000,Illumina,目录号:20012850)

软件

  1. bcl2fastq转换软件v2.19脚本(Illumina, http:/ /emea.support.illumina.com/downloads/bcl2fastq-conversion-software-v2-19.html )
  2. Seqtk( https://bioconda.github.io/recipes/seqtk/README.html) 
  3. BWA-MEM 
  4. GATK v3.3(Broad Institute, https://software.broadinstitute.org/gatk/ )&nbsp ;
  5. GATK v3.6 Mutect2(Broad Institute, https:// software。 broadinstitute.org/gatk/documentation/tooldocs/3.7-0/org_broadinstitute_gatk_tools_walkers_cancer_m2_MuTect2.php )
  6. Mpileup(SAMtools, http://samtools.sourceforge.net/mpileup.shtml ) 
  7. mtDNA服务器( https://mtdna-server.uibk.ac.at/index.html < / A>)
  8. MitoWheel( http://mitowheel.org/mitowheel.html )
  9. MitImpact( http://mitimpact.css-mendel.it/ )
  10. MutPred2( http://mutpred.mutdb.org/ )

程序

  1. 提取大量DNA:从感兴趣的样本中提取DNA。&nbsp;
    1. 样品可以是大量组织或细胞系。批量分析不需要特殊的试剂盒,例如,我们经常使用Qiagen的DNeasy试剂盒或苯酚/氯仿提取物,然后进行乙醇沉淀(Zambelli et al。,2018)。&nbsp;
    2. 对于单个细胞,我们借助立体显微镜在2.5μl碱性裂解缓冲液(200 mM NaOH和50 mM DTT)中在200μlEppendorf管中进行手动收集它们(Spits et al。,2006)并将它们储存在-20°C。在扩增之前,将单细胞在65℃温育10分钟以进行裂解,然后如步骤3中那样制备用于扩增。&nbsp;
  2. 将DNA样品稀释至10 ng /μl的工作溶液(仅适用于大量DNA)。
  3. 准备主混合物如下:

    * Tricine用于PCR反应,以缓冲单细胞收集中使用的碱性裂解缓冲液。
  4. 将45μl主混合物等分到PCR管中,加入5μl10ng /μlDNA进行批量处理。
    注意:对于单细胞,将47.5μl预混液加入Eppendorf试管中,细胞收集在碱性裂解缓冲液中。
  5. 使用以下PCR程序:
    1. 步骤1(引发):在94℃下30秒
    2. 步骤2(8个循环):在94℃下15秒
      每个循环在64°C减去0.4°C 30秒(这是着陆步骤)
      在65°C下11分钟(对于引物组1)或在65°C下5分钟(对于引物组2)
    3. 步骤3(22个循环用于大量; 27个循环用于大单细胞(人卵母细胞),37个循环用于其他单个细胞):
      15°C在94°C
      在61°C下30秒
      在65°C下11分钟(对于引物组1)或在65°C下5分钟(对于引物组2)
    4. 步骤4(最终):在65℃下11分钟(对于引物组1)或在65℃下5分钟(对于引物组2)
    5. 步骤5:在4℃冷却步骤直至储存。
  6. 将扩增子保存在4°C(短期储存)或-20°C(长期储存),直至进一步处理。
  7. 通过运行凝胶电泳确认成功扩增(1%用于批量,1.5%用于单个细胞)(参见配方)。
  8. 在凝胶上每个样品加载9μl,并在80V下运行凝胶1小时。成功放大应如图1所示。


    图1.与凝胶电泳显示的DNA梯子相比,两个引物组的成功扩增

  9. 使用AMPure珠子清洁样品如下。
    注意:这通常是在96孔板中96批样品中进行的。
    1. 将54μlAmpure珠加入30μl每种扩增子中(比例为1.8x)。
    2. 通过上下吸移混合样品,并在室温下孵育样品5分钟。
    3. 将样品放在磁铁上5分钟或直到溶液澄清。
    4. 去除上清液而不打扰珠子。
    5. 加入200μl乙醇(80%)。
    6. 将样品在磁体上孵育30秒。
    7. 去除上清液而不打扰珠子。
    8. 重复步骤9e-9g。
    9. 让珠子在室温下干燥5分钟。
    10. 从磁铁上取下样品。
    11. 加入30μlTris-HCl(10mM,pH8.0)并在室温下孵育样品2分钟。
    12. 将样品放在磁铁上直至溶液澄清。
    13. 将上清液转移到新的96孔板中。
  10. 根据制造商的说明使用Qubit定量样品的DNA浓度。
  11. 每个样品将引物组1和2的扩增子汇集在一起,并且为了保持均匀的覆盖率,扩增子需要以0.35 / 0.65的比例混合(引物组2产生的较短扩增子的35%和较长的65%)引物组1)产生的扩增子。在17.5μlTris-HCl(10mM,pH 8.0)中使用总共500ng DNA。
    注意:每个扩增子也可以单独进行,总共500 ng。
  12. 使用KAPA HyperPlus Kit准备库,使用供应商指定的一半试剂量。这是总结的协议:
    1. 准备碎裂混合物(在冰上):


    2. 涡旋破碎混合并旋转下来。
    3. 将7.5μl片段化混合物加入17.5μl纯化的DNA样品中,并通过移液(在冰上工作)混合。
    4. 将样品在热循环仪(加热至50°C的盖子)中孵育。
      在4°C下10分钟
      在37°C下15分钟
      将样品保持在4°C
    5. 立即进行End Repair和A-tailing。
    6. 准备最终修复和A-tailing混合如下:


    7. 通过涡旋和旋转混合末端修复和A尾混合物。
    8. 向片段化DNA中加入5μlEndRepair和A-tailing mix。
    9. 将样品在热循环仪中孵育(将盖子置于85°C)。
      在65°C下30分钟
      将样品保持在4°C。
    10. 立即进入结扎步骤。
    11. 按如下方式准备结扎混合物:


    12. 通过上下移液混合结扎混合物。
    13. 将20μl连接混合物加入30μlEndRepaising和A-tailed DNA中。
    14. 每个样品添加一个独特的适配器:5μl7.5μMIlluminaTruSeq Unique Dual(UD)索引适配器。
      注意:这些适配器可以由寡聚物供应商(例如,Integrated DNA Technologies)定制,或者可以作为试剂盒购买(例如,Illumina-TruSeq DNA UD Indexes的IDT)。请注意,大多数供应商提供15μM的现成适配器,而在此协议中,我们使用7.5μM。
    15. 通过移液彻底混合。
    16. 将样品在20℃下在热循环仪(无加热盖)中孵育15分钟。
    17. 立即进行珠子样品清理。&nbsp;
  13. 使用AMPure珠子清理库,如下所示:
    1. 将44μlAmpure珠加入35μl文库中(比例为0.8x)。
    2. 通过上下吸移混合样品,并在室温下孵育样品5分钟。
    3. 将样品放在磁铁上5分钟或直到溶液澄清。
    4. 去除上清液而不打扰珠子。
    5. 加入200μl乙醇(80%)。
    6. 将样品在磁体上孵育30秒。
    7. 去除上清液而不打扰珠子。
    8. 重复步骤13e-13g。
    9. 让珠子在室温下干燥5分钟。
    10. 从磁铁上取下样品。
    11. 加入27μlTris-HCl(10mM,pH8.0)。
    12. 在室温下孵育样品2分钟。
    13. 将样品放在磁铁上,直至溶液澄清。&nbsp;
    14. 将25μl洗脱液转移到新的96孔板中。
    注意:这可能是一个停止点。将样品放入冰箱(4°C)中24小时,或长时间存放,保存在冰箱中(-20°C)。
  14. 大小 - 使用AMPure珠子选择库如下:
    1. 向纯化的文库中加入75μl不含核酸酶的水。
    2. 将50μlAmpure珠加入100μl文库中(比例为0.5x)。
    3. 通过上下吸移混合样品,并在室温下孵育样品5分钟。
    4. 将样品放在磁铁上5分钟或直到溶液澄清。
    5. 将140μl上清液转移到新的96孔板中。
    6. 将20μlAmpure珠加入140μl上清液中(比例为0.7x)。
    7. 通过上下吸移混合样品,并在室温下孵育样品5分钟。
    8. 将样品放在磁铁上5分钟或直到溶液澄清。
    9. 去除上清液而不打扰珠子。
    10. 加入200μl乙醇(80%)。
    11. 将样品在磁体上孵育30秒。
    12. 去除上清液而不打扰珠子。
    13. 重复步骤14j-14I。
    14. 让珠子在室温下干燥5分钟。
    15. 从磁铁上取下样品。
    16. 加入15μlTris-HCl(10mM; pH8.0)。
    17. 在室温下孵育样品2分钟。
    18. 将样品放在磁铁上直至溶液澄清。
    19. 将13μl洗脱液转移到新的96孔板中。
    注意:这可能是一个停止点。将样品放入冰箱(4°C)中24小时,或者长时间存放,将它们放入冰箱(-20°C)。
  15. 使用HS NGS片段试剂盒在AATI片段分析仪上通过电泳检查最终文库的质量(制备的文库的1/10稀释度)。成功库的一个例子如图2所示。


    图2.通过AATI片段分析仪上的电泳显示成功制备的文库

  16. 根据供应商的说明使用Qubit量化所有样品。
  17. 使用平均大小(AATI片段分析仪上的涂片分析在200和1500碱基对之间)和Qubit浓度,计算获得的文库的摩尔浓度。使用下面的公式:



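  18. Dilute all libraries to 2 nM in EBT buffer.
  19. Pool all samples by combining equal volumes of each 2 nM library.
  20. Load the samples on the Illumina platform of your choice. For the Illumina NovaSeq 6000, denature the libraries as described in the NovaSeq 6000 Sequencing System Guide (document 1000000019358 v11). For 96 samples, use 6% of the capacity of a NovaSeq 6000 S2 Reagent Kit (200 cycles), which corresponds to denaturing 9 μl of the 2 nM library pool.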
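Steps 17 and 18 come down to two small calculations: converting a mass concentration into a molarity, and working out a dilution. The sketch below assumes the widely used approximation of 660 g/mol per base pair of double-stranded DNA; it is not necessarily the exact formula the authors used. The function names and example values are hypothetical.

```python
DSDNA_G_PER_MOL_PER_BP = 660.0  # assumed average mass of one base pair of dsDNA


def library_molarity_nm(conc_ng_per_ul, mean_size_bp):
    """Convert a Qubit concentration (ng/ul) and mean fragment size (bp) to nM."""
    return conc_ng_per_ul * 1e6 / (DSDNA_G_PER_MOL_PER_BP * mean_size_bp)


def dilution_volumes(stock_nm, target_nm, final_ul):
    """Return (library_ul, buffer_ul) needed to reach target_nm in final_ul."""
    if stock_nm < target_nm:
        raise ValueError("library is already below the target molarity")
    library_ul = target_nm * final_ul / stock_nm   # C1 * V1 = C2 * V2
    return library_ul, final_ul - library_ul


if __name__ == "__main__":
    # Hypothetical library: 10 ng/ul with a mean fragment size of 500 bp.
    molarity = library_molarity_nm(10.0, 500)
    lib_ul, buf_ul = dilution_volumes(molarity, target_nm=2.0, final_ul=20.0)
    print(f"{molarity:.1f} nM: mix {lib_ul:.2f} ul library with {buf_ul:.2f} ul EBT")
```

A library that measures below 2 nM raises an error instead of returning a negative buffer volume, since such a library cannot be diluted to the target.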

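Data analysis

  1. Demultiplex the base call (.bcl) files into .fastq files using the Illumina bcl2fastq v2.19 script.
  2. Randomly extract 1.5 million reads from the .fastq files using the 'seqtk' tool.
    Note: This reduces the number of reads, because larger numbers may cause computational problems in step 6.
  3. Align the .fastq files to the reference genome (NC_012920.1) and generate .bam files using BWA-MEM.
  4. Perform realignment around insertions and deletions and base recalibration using GATK v3.3 (from .bam to .bam).
  5. For variant calling, use the GATK v3.6 Mutect2 tool (from .bam to .vcf).
    Note: The code for steps 1-5 can be found in Supplementary File 1.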
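As a rough illustration of steps 2 and 3, the sketch below assembles command lines for `seqtk sample` and `bwa mem` without running them. It is not the authors' code from Supplementary File 1. The file names and the seed are hypothetical, the reference FASTA is assumed to have been indexed beforehand with `bwa index`, and the conversion of the SAM output to a sorted .bam file is left out.

```python
import shlex


def seqtk_sample_cmd(fastq, n_reads=1_500_000, seed=100):
    """Command that subsamples n_reads reads from one FASTQ file.

    For paired-end data, use the same seed for both mates so that the
    pairing is preserved.
    """
    return ["seqtk", "sample", f"-s{seed}", fastq, str(n_reads)]


def bwa_mem_cmd(reference_fasta, fastq_r1, fastq_r2=None):
    """Command that aligns single-end or paired-end reads with BWA-MEM."""
    cmd = ["bwa", "mem", reference_fasta, fastq_r1]
    if fastq_r2 is not None:
        cmd.append(fastq_r2)
    return cmd


def as_shell(cmd, stdout_path):
    """Render a command as a shell line; both tools write to standard output."""
    return f"{shlex.join(cmd)} > {shlex.quote(stdout_path)}"


if __name__ == "__main__":
    # Hypothetical file names for one paired-end sample.
    for mate in ("R1", "R2"):
        print(as_shell(seqtk_sample_cmd(f"sample_{mate}.fastq"),
                       f"sample_{mate}.sub.fastq"))
    print(as_shell(bwa_mem_cmd("NC_012920.1.fasta",
                               "sample_R1.sub.fastq", "sample_R2.sub.fastq"),
                   "sample.sam"))
```

Building the commands as argument lists keeps the file names safely quoted and makes it easy to pass them to `subprocess.run` later.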
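  6. Upload the .bam files together to the mtDNA-Server (https://mtdna-server.uibk.ac.at/index.html).
  7. The mtDNA-Server report shows:
    1. Heteroplasmic variants (an example is shown in Figure 3).


      Figure 3. Report of the heteroplasmic variants detected by the mtDNA-Server

    2. Homoplasmic variants (an example is shown in Figure 4).


      Figure 4. Report of the homoplasmic variants detected by the mtDNA-Server

    3. The haplogroup and possible contamination by other haplogroups (an example is shown in Figure 5).


      Figure 5. Report of the haplogroup detected by the mtDNA-Server

    4. A coverage plot (an example is shown in Figure 6).


      Figure 6. Example of a coverage plot provided by the mtDNA-Server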

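  8. Open the Mutect file and sort the variants by frequency.
    1. Open the Mutect file following the procedure shown in Figure 7:


      Figure 7. Sequential steps to open the Mutect file in Microsoft Office Excel. Make sure to select "Delimited" in step 1 and follow the standard settings.

    2. Once the file is open, select the complete column below cell J39, as shown in Figure 8.


      Figure 8. Layout after opening the Mutect file and selecting the complete column below cell J39

    3. Convert the selected area into columns with the "Text to Columns" tool, as shown in Figure 9.


      Figure 9. Converting text to columns for the complete column below cell J39

    4. Complete the conversion with the settings shown in Figure 10.


      Figure 10. Sequential steps to complete the text-to-columns conversion. Make sure to select "Delimited" in step 1 and to choose "Other" as the delimiter, with a semicolon as the separator. Follow the standard settings in step 3.

    5. Select all rows below row 39 and sort them "Largest to Smallest" on column L, as shown in Figure 11.


      Figure 11. Example of the selected rows below row 39 and of sorting all rows below row 39 from largest to smallest

    6. Select the variants with a frequency of 0.015 (1.5%) or higher in column L for further analysis. Column K gives the forward and reverse reads at the position, separated by a comma. Only consider variants with a total read count (forward + reverse) above 1,000 (Figure 12).


      Figure 12. Cells highlighted in yellow are the variants considered for further analysis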
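The spreadsheet filter in step 8 can also be expressed in a few lines of code. The sketch below assumes each variant has already been read into a record whose `depth` field holds the two comma-separated read counts of column K and whose `freq` field holds the value of column L. The field names and example records are hypothetical, and treating the 0.015 cut-off as inclusive is an assumption.

```python
def total_reads(depth_field):
    """Sum the comma-separated read counts of a depth field such as '900,850'."""
    return sum(int(count) for count in depth_field.split(","))


def select_variants(rows, min_freq=0.015, min_reads=1000):
    """Keep variants at or above min_freq with more than min_reads total reads,
    sorted from the highest to the lowest frequency."""
    kept = [
        row for row in rows
        if row["freq"] >= min_freq and total_reads(row["depth"]) > min_reads
    ]
    return sorted(kept, key=lambda row: row["freq"], reverse=True)


if __name__ == "__main__":
    # Hypothetical records, not real data.
    rows = [
        {"pos": 3010, "depth": "2000,2100", "freq": 0.010},  # frequency too low
        {"pos": 7306, "depth": "900,850", "freq": 0.120},    # passes both filters
        {"pos": 152, "depth": "400,500", "freq": 0.300},     # too few reads
    ]
    for row in select_variants(rows):
        print(row["pos"], row["freq"], total_reads(row["depth"]))
```

Only the record at position 7306 survives both filters in this example, since the other two fail on frequency and on read count respectively.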

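  9. Cross-check the frequencies of the variants detected by MuTect against those reported by the mtDNA-Server, and report possible insertions/deletions.
  10. If there is a discrepancy, calculate the frequency from the coverage file as follows: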



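  11. Annotate the variants with mitoWheel at http://mitowheel.org/mitowheel.html. An example is shown in Figure 13.


    Figure 13. Annotation provided by mitoWheel for position 7306

  12. If the variant is located in a protein-coding region, look up the resulting amino acid change in MitImpact2 at http://mitimpact.css-mendel.it/. An example is shown in Figure 14.


    Figure 14. Result provided by MitImpact2 for the variant at position 7306 that causes an amino acid change

  13. Predict the pathogenicity of the amino acid change by uploading it to MutPred2 at http://mutpred.mutdb.org/. An example is shown in Figure 15.


    Figure 15. Result of the pathogenicity predictor MutPred2 for the amino acid changes M468T and M468K in the MT-CO1 gene

  14. Some regions are excluded from the analysis. These excluded regions are listed in the Supplemental Data of Zambelli et al., 2018 (https://www.cell.com/stem-cell-reports/fulltext/S2213-6711(18)30224-8#secsectitle0020).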
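The cross-check between the two callers in steps 9 and 10 can be sketched as a comparison of two variant tables. Everything in this example is an assumption made for illustration: variants are keyed by a hypothetical (position, alternative base) pair, the tolerance of 0.02 is an arbitrary placeholder, and the frequency is taken as the usual definition of heteroplasmic load, namely variant-supporting reads divided by total reads at the position.

```python
def frequency_from_counts(variant_reads, total_reads):
    """Heteroplasmic load as the fraction of reads carrying the variant."""
    if total_reads <= 0:
        raise ValueError("total_reads must be positive")
    return variant_reads / total_reads


def cross_check(mutect, server, tolerance=0.02):
    """Compare two {(position, alt): frequency} tables from the two callers."""
    shared = sorted(set(mutect) & set(server))
    report = {
        "only_mutect": sorted(set(mutect) - set(server)),
        "only_server": sorted(set(server) - set(mutect)),
        "concordant": [],
        "discrepant": [],
    }
    for key in shared:
        difference = abs(mutect[key] - server[key])
        bucket = "concordant" if difference <= tolerance else "discrepant"
        report[bucket].append(key)
    return report


if __name__ == "__main__":
    # Hypothetical frequencies, not real data.
    mutect = {(7306, "C"): 0.120, (152, "C"): 0.30, (3010, "A"): 0.02}
    server = {(7306, "C"): 0.125, (152, "C"): 0.40}
    for name, keys in cross_check(mutect, server).items():
        print(name, keys)
    print(frequency_from_counts(150, 3000))
```

Variants in the `discrepant` bucket are the ones whose frequency would be recalculated from the coverage file. Variants reported by only one caller are candidates for background noise.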

Recipes

  1. Gel electrophoresis
    50 ml of 1x TBE buffer
    0.5 g of agarose
  2. Alkaline lysis buffer

  3. EBT (elution buffer with Tris) buffer
    10 mM Tris-HCl, pH 8.5
    1% Tween 20

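Acknowledgments

This work was funded by the Wetenschappelijke Fonds Willy Gepts (UZ Brussel), the Fonds voor Wetenschappelijk Onderzoek (FWO, 1506616N) and the Methusalem grant of the Vrije Universiteit Brussel awarded to Prof. Karen Sermon.

Competing interests

None.

Ethics

In this protocol, we used only human mtDNA for the optimization. All donors gave informed consent, and all research related to this protocol was approved by the local ethics committee.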

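References

  1. Bai, R. K. and Wong, L. J. (2004). Detection and quantification of heteroplasmic mutant mitochondrial DNA by real-time amplification refractory mutation system quantitative PCR analysis: a single-step approach. Clin Chem 50(6): 996-1001.
  2. Just, R. S., Irwin, J. A. and Parson, W. (2015). Mitochondrial DNA heteroplasmy in the emerging field of massively parallel sequencing. Forensic Sci Int Genet 18: 131-139.
  3. Rohlin, A., Wernersson, J., Engwall, Y., Wiklund, L., Bjork, J. and Nordling, M. (2009). Parallel sequencing used in detection of mosaic mutations: comparison with four diagnostic DNA screening techniques. Hum Mutat 30(6): 1012-1020.
  4. Spits, C., Le Caignec, C., De Rycke, M., Van Haute, L., Van Steirteghem, A., Liebaers, I. and Sermon, K. (2006). Whole-genome multiple displacement amplification from single cells. Nat Protoc 1(4): 1965-1970.
  5. Ye, F., Samuels, D. C., Clark, T. and Guo, Y. (2014). High-throughput sequencing in mitochondrial DNA research. Mitochondrion 17: 157-163.
  6. Zambelli, F., Mertens, J., Dziedzicka, D., Sterckx, J., Markouli, C., Keller, A., Tropel, P., Jung, L., Viville, S., Van de Velde, H., Geens, M., Seneca, S., Sermon, K. and Spits, C. (2018). Random mutagenesis, clonal events, and embryonic or somatic origin determine the mtDNA variant type and load in human pluripotent stem cells. Stem Cell Reports 11(1): 102-114.
  7. Zambelli, F., Vancampenhout, K., Daneels, D., Brown, D., Mertens, J., Van Dooren, S., Caljon, B., Gianaroli, L., Sermon, K., Voet, T., Seneca, S. and Spits, C. (2017). Accurate and comprehensive analysis of single nucleotide variants and large deletions of the human mitochondrial genome in DNA and single cells. Eur J Hum Genet 25(11): 1229-1236.
  8. Zhang, W., Cui, H. and Wong, L. J. (2012). Comprehensive one-step molecular analyses of mitochondrial genome by massively parallel sequencing. Clin Chem 58(9): 1322-1331.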
Copyright: © 2019 The Authors; exclusive licensee Bio-protocol LLC.
Citation: Mertens, J., Zambelli, F., Daneels, D., Caljon, B., Sermon, K. and Spits, C. (2019). Detection of Heteroplasmic Variants in the Mitochondrial Genome through Massive Parallel Sequencing. Bio-protocol 9(13): e3283. DOI: 10.21769/BioProtoc.3283.