Sep 2020

TGIRT-seq Protocol for the Comprehensive Profiling of Coding and Non-coding RNA Biotypes in Cellular, Extracellular Vesicle, and Plasma RNAs


Abstract

High-throughput RNA sequencing (RNA-seq) has extraordinarily advanced our understanding of gene expression and disease etiology, and is a powerful tool for the identification of biomarkers in a wide range of organisms. However, most RNA-seq methods rely on retroviral reverse transcriptases (RTs), enzymes that have inherently low fidelity and processivity, to convert RNAs into cDNAs for sequencing. Here, we describe an RNA-seq protocol using Thermostable Group II Intron Reverse Transcriptases (TGIRTs), which have high fidelity, processivity, and strand-displacement activity, as well as a proficient template-switching activity that enables efficient and seamless RNA-seq adapter addition. By combining these activities, TGIRT-seq enables the simultaneous profiling of all RNA biotypes from small amounts of starting material, with superior RNA-seq metrics, and unprecedented ability to sequence structured RNAs. The TGIRT-seq protocol for Illumina sequencing consists of three steps: (i) addition of a 3' RNA-seq adapter, coupled to the initiation of cDNA synthesis at the 3' end of a target RNA, via template switching from a synthetic adapter RNA/DNA starter duplex; (ii) addition of a 5' RNA-seq adapter, by using thermostable 5' App DNA/RNA ligase to ligate an adapter oligonucleotide to the 3' end of the completed cDNA; (iii) minimal PCR amplification, to add capture sites and indices for Illumina sequencing. TGIRT-seq for the Illumina sequencing platform has been used for comprehensive profiling of coding and non-coding RNAs in ribodepleted, chemically fragmented cellular RNAs, and for the analysis of intact (non-chemically fragmented) cellular, extracellular vesicle (EV), and plasma RNAs, where it yields continuous full-length end-to-end sequences of structured small non-coding RNAs (sncRNAs), including tRNAs, snoRNAs, snRNAs, pre-miRNAs, and full-length excised linear intron (FLEXI) RNAs.


Graphic abstract:


Figure 1. Overview of the TGIRT-seq protocol for Illumina sequencing.

Major steps are: (1) Template switching from a synthetic R2 RNA/R2R DNA starter duplex with a 1-nt 3' DNA overhang (a mixture of A, C, G, and T residues, denoted N) that base pairs to the 3' nucleotide of a target RNA, and upon initiating reverse transcription by adding dNTPs, seamlessly links an R2R adapter to the 5' end of the resulting cDNA; (2) Ligation of an R1R adapter to the 3' end of the completed cDNA; and (3) Minimal PCR amplification with primers that add Illumina capture sites (P5 and P7) and barcode sequences (indices 5 and 7). The index 7 barcode is required, while the index 5 barcode is optional, to provide unique dual indices (UDIs).


Keywords: Group II intron reverse transcriptase, Non-LTR-retroelement, Reverse transcriptase, RNA-seq, Template switching, Transcriptomics, Illumina sequencing

Background

Most RNA-seq methods rely on a retroviral reverse transcriptase (RT) to convert target RNAs into cDNAs, which can then be sequenced on a high-throughput DNA sequencing platform (Stark et al., 2019). However, retroviral RTs evolved to have inherently low fidelity and processivity, to help retroviruses evade host defenses by introducing frequent mutational variations and rapidly propagating beneficial ones by RNA recombination, which involves dissociating from one template and reinitiating on another (Onafuwa-Nuga and Telesnitsky, 2009; Hu and Hughes, 2012). Although commercial retroviral RTs have been engineered to increase their processivity and thermostability, the ability to improve these enzymes is limited by the retroviral RT structural framework, and even highly engineered retroviral RTs have difficulty reverse transcribing structured RNAs (Martín-Alonso et al., 2021).


TGIRT-seq is a next-generation comprehensive RNA-seq method that utilizes the beneficial biochemical properties of group II intron-encoded RTs, which are evolutionary ancestors of retroviral RTs (Belfort and Lambowitz, 2019). Group II intron RTs are largely prokaryotic enzymes that are associated with bacterial retrotransposons called mobile group II introns, and they evolved to function in retrohoming, a retrotransposition mechanism that requires reverse transcription of a highly structured group II intron RNA with high fidelity and processivity (Lambowitz and Belfort, 2015). They belong to a large subgroup of RTs encoded by non-LTR-retroelements, which include other bacterial RTs, retroplasmid RTs, and RVT RTs, as well as human LINE-1 element, insect R2 element, and other eukaryotic non-LTR-retrotransposon RTs (Xiong and Eickbush, 1990; Kojima and Kanehisa, 2008; Simon and Zimmerly, 2008; Gladyshev and Arkhipova, 2011; Zimmerly and Wu, 2015). Non-LTR-retroelement RTs differ from retroviral RTs in having a distinctive N-terminal extension (NTE) and two distinctive insertions (RT2a and RT3a) in the fingers and palm, which were shown by the crystal structure of a group II intron RT to contribute multiple additional interactions with the RNA template, and more tightly constrain the RT active site in ways that could contribute to higher fidelity and processivity (Xiong and Eickbush, 1990; Blocker et al., 2005; Stamos et al., 2017). The NTE also plays a crucial role in a proficient end-to-end template-switching activity, which enables efficient and seamless RNA-seq adapter addition (Lentzsch et al., 2019 and 2021). TGIRTs from bacterial thermophiles combine these beneficial properties with the ability to function at high temperatures (≥60°C), which help melt out stable RNA secondary structures (Mohr et al., 2013). 
A key to using TGIRTs and other non-LTR-retroelement RTs for RNA-seq and other applications was their fusion to a large solubility tag (e.g., maltose-binding protein), which enables them to be produced in large quantities and remain soluble and stable during storage, when freed of endogenous tightly bound nucleic acids (Mohr et al., 2013; Upton et al., 2021). As thousands of group II intron and other non-LTR-retroelement RTs have been identified, it is likely that we have only scratched the surface in identifying optimal enzymes for RNA-seq applications.


Figure 1 outlines a TGIRT-seq protocol using TGIRT-III (InGex; a proprietary version of Geobacillus stearothermophilus GsI-IIC RT; Mohr et al., 2013) for the simultaneous profiling of all coding and non-coding RNA biotypes in cellular, extracellular vesicle (EV), and human plasma RNA preparations without size selection (also referred to as the TGIRT Total RNA-seq method). The protocol described here is an updated version used in Yao et al. (2020), based upon earlier versions described in Qin et al. (2016), Nottingham et al. (2016), and Xu et al. (2019).


In the first step used for the initiation of reverse transcription, the TGIRT enzyme template switches from a synthetic RNA template/DNA primer starter duplex that contains an RNA-seq adapter sequence directly to the 3' end of a target RNA, thereby linking the reverse complement of the adapter sequence to the 5' end of the cDNA (Mohr et al., 2013; Qin et al., 2016). The RNA/DNA starter duplex has a 1-nt 3' DNA overhang that directs template switching with high specificity by base pairing to the 3' nucleotide of a target RNA, resulting in seamless template-switching junctions (Lentzsch et al., 2019 and 2021). For Illumina sequencing, the starter duplex consists of a 35-nt RNA oligonucleotide that contains an Illumina Read 2 sequence (denoted R2 RNA) and has a 3' blocking group (C3 Spacer, denoted 3SpC3), annealed to a 36-nt DNA primer containing the reverse complement of the Read 2 sequence (denoted R2R DNA), leaving the 1-nt 3' DNA overhang. For comprehensive RNA-seq of a pool of RNAs, the 1-nt 3' DNA overhang is a mixture of A, C, G, and T residues (denoted as N), and the starter duplex is added in excess over the amount of RNA template. Reverse transcription is typically carried out for 15 min at 60°C, an optimal temperature for GsI-IIC RT (Mohr et al., 2013), but different times and lower or higher temperatures can be used for different applications (e.g., Zheng et al., 2015; Behrens et al., 2021). Reverse transcription is terminated by adding NaOH, which degrades the RNA template, followed by neutralization with HCl and a MinElute cDNA clean-up step to remove unused R2R DNA.


In the next step, a second RNA-seq adapter containing the reverse complement of an Illumina Read 1 sequence (denoted R1R DNA) is attached to the 3' end of the cDNA by a single-stranded DNA ligation using thermostable 5' App DNA/RNA ligase, and this is followed by minimal PCR amplification (no more than 12 cycles) with primers that add capture sites and indices for Illumina sequencing. Using this protocol, comprehensive TGIRT-seq libraries can be prepared from as little as 0.5 ng of human plasma RNA in approximately 5 h through completion of the PCR step.


TGIRT-seq can be done with rRNA-depleted, chemically fragmented or intact cellular RNAs, or with total RNAs from EVs or plasma. Compared to benchmark TruSeq v3 datasets of rRNA depleted, chemically fragmented Universal Human Reference RNA (UHRR), with External RNA Control Consortium (ERCC) spike-ins, TGIRT-seq: (i) better recapitulated the relative abundance of mRNAs and ERCC spike-ins; (ii) was more strand-specific; (iii) gave more uniform 5’ to 3’ gene body coverage and detected more splice junctions, particularly near the 5’ ends of genes; and (iv) eliminated sequence biases due to random hexamer priming, which are inherent in TruSeq (Nottingham et al., 2016). Subsequent improvements in the TGIRT-seq method have included the use of modified RNA-seq adapter sequences that strongly decrease adapter dimer formation (Xu et al., 2019), and the use of lower salt concentrations in the template-switching and reverse transcription step, to increase library yield (Yao et al., 2020). If desired, a unique molecular identifier can be incorporated into the R1R adapter oligonucleotide (Yao et al., 2020). Although used mainly for Illumina sequencing thus far, the RNA-seq adapter sequences could readily be reformatted for other high-throughput DNA sequencing platforms.


The TGIRT-template switching reaction for 3' RNA-seq adapter addition is the defining step of the TGIRT-seq method, while 5' RNA-seq adapter addition using thermostable 5' App DNA/RNA ligase could in principle be replaced by an alternative method. Despite relying on a single base pair between the 1-nt 3' DNA overhang of starter duplex and 3' nucleotide of the target RNA, RNA-seq adapter addition by TGIRT template switching is highly specific, yielding 97.5-99.7% precise junctions, depending upon the base pair (Lentzsch et al., 2019). This high specificity reflects the fact that the 3' end of the acceptor RNA binds in a pocket formed by the NTE and RT fingertips loop, which promotes annealing of the 3' nucleotide of the acceptor RNA to the complementary 1-nt 3' DNA overhang of the starter duplex, and positions the penultimate nucleotide of the acceptor as the templating RNA base at the RT active site (Lentzsch et al., 2021). A second base-pairing interaction between the templating base and a complementary incoming dNTP then results in a conformational change required for initiation of reverse transcription, an irreversible step that drives the reaction forward and ensures high specificity (Lentzsch et al., 2021). The same template-switching pocket does not exist in retroviral RTs, which lack the NTE, likely leading to weaker binding of the acceptor with a greater dependence upon base-pairing interactions and propensity for artifacts resulting from template switching to alternative sites containing complementary nucleotides (e.g., Mader et al., 2001; Cocquet et al., 2006).


After reverse transcription, TGIRT enzymes can add non-templated nucleotides to the 3' end of a completed cDNA, generating 3' DNA overhangs that enable template switching to a second RNA template with a complementary 3' nucleotide. Such secondary template switches, which result in fusion reads, are suppressed in TGIRT-seq by using a relatively high salt concentration in the reaction medium (200 mM NaCl in the updated protocol and 450 mM NaCl in earlier versions; Mohr et al., 2013; Lentzsch et al., 2019; Yao et al., 2020). A longer-term solution may be provided by a recently described mutation that specifically inhibits non-templated addition and thus selectively inhibits secondary template switches from the 5’ end of a completed cDNA, but not from a starter duplex with a pre-formed 1-nt DNA overhang (Lentzsch et al., 2021). Secondary template switches can also be used for 5' RNA-seq adapter addition (Zhu et al., 2001; Picelli et al., 2013), and the recent detailed biochemical and structural analyses of the TGIRT template-switching reaction could be used to perfect such a method for TGIRT-seq (Lentzsch et al., 2019 and 2021).


Sequence biases in TGIRT-seq using TGIRT-III, determined in reaction medium containing 450 mM NaCl at 60°C, are limited to the first three nucleotides from the 5' end of the RNA, corresponding to known sequence biases of the thermostable 5' App DNA/RNA ligase (Hafner et al., 2011), and the 3' nucleotide of the target RNA, reflecting a preference for, or the stability of, different base pairs with the 1-nt 3' DNA overhang of the starter duplex for TGIRT template switching (Xu et al., 2019). The 5'-adapter ligation step using thermostable 5' App DNA/RNA ligase at high temperature has no major "co-folding" bias due to base pairings between the adapter and acceptor nucleic acids, and its sequence bias is not mitigated by using an R1R adapter with randomized nucleotides near its 5' end, as in 4N protocols used for other ligases (Xu et al., 2019). The template-switching bias for the 3' nucleotide of the target RNA can be remediated by using unequal ratios of 3'-overhang nucleotides in the starter duplex, to compensate for the base-pairing biases, and overall bias can be corrected by using a bias correction algorithm, based on a random forest regression model of all factors contributing to bias (Mohr et al., 2013; Xu et al., 2019). Additionally, biochemical analysis showed that differences in the rates and amplitudes of TGIRT template-switching to acceptor RNAs with different 3' nucleotides are smaller at 200 mM than at 450 mM NaCl, suggesting that template-switching biases might be decreased by the lower salt concentration used in the updated protocol (Lentzsch et al., 2019). TGIRT-III is more efficient than retroviral RTs in reverse transcribing through expanded GC-rich repeat sequences, such as those characteristic of myotonic dystrophy and familial amyotrophic lateral sclerosis (Carrell et al., 2018), but more prone to indels at homopolymer runs, particularly if followed by a stable hairpin structure (Penno et al., 2017).
The latter may reflect that the TGIRT enzyme is less likely to dissociate at stable RNA secondary structures than are retroviral RTs, and continues to incorporate nucleotides via slippage until it can read through such impediments. TGIRT-III does not template switch efficiently to 3' poly(A) tails of eukaryotic mRNAs (Yao et al., 2021).


In ribodepleted, chemically fragmented RNA preparations, TGIRT-III gave reliable quantitation of RNA spike-ins ≥60 nt (Nottingham et al., 2016; Boivin et al., 2018). In heterogeneously sized RNA preparations, however, TGIRT-seq under-represents RNAs <60 nt, particularly very small RNAs, such as miRNAs and short tRNA fragments (tRFs; Boivin et al., 2018; Yao et al., 2020), requiring an orthogonal method, such as RT-qPCR, or a hybridization-based assay to determine their relative abundance. This size bias occurs at the initial template-switching step used for 3' RNA-seq adapter addition and likely reflects that 5' regions of longer RNAs that extend outside the template-switching pocket can bind to additional sites on the outer surface of the protein (Lentzsch et al., 2021). These additional sites likely correspond to basic surface residues that function in binding group II intron RNAs for RNA splicing and retrohoming, but are not required for reverse transcription, and thus relatively easy to remove without affecting performance in RNA-seq (Zhao et al., 2018; Stamos et al., 2019). Additionally, this size bias appears to be significantly lower for another TGIRT enzyme (TeI4c RT; Mohr et al., 2013), which is not yet sold commercially.


While TGIRT-seq of ribodepleted chemically fragmented RNAs is best for mRNA quantitation, TGIRT-seq of intact (i.e., non-chemically fragmented) RNA preparations has been useful for the analysis of tRNAs and other structured sncRNAs, for which TGIRT-seq yields full-length end-to-end sequences (Katibah et al., 2014; Shurtleff et al., 2017; Yao et al., 2020). An initial TGIRT-based protocol for tRNA-seq used gel purification and template switching to RNAs that have a 3'-terminal A residue in combination with an RNA demethylase to obtain full-length reads of mature tRNAs (Zheng et al., 2015). However, the simpler TGIRT-seq total RNA-seq protocol described here also gives largely full-length end-to-end sequences of mature tRNAs and tRNA fragments without demethylase treatment (Katibah et al., 2014; Qin et al., 2016; Shurtleff et al., 2017), as does a more recent TGIRT-based tRNA-seq method (mim-tRNAseq; Behrens et al., 2021). The TGIRT enzyme is highly processive and, given enough time, pauses at post-transcriptional modifications that affect base pairing until it can read through via distinctive patterns of misincorporation that can be used to identify the modification (Katibah et al., 2014; Shen et al., 2015; Zheng et al., 2015; Qin et al., 2016). This ability to read through via misincorporation has also been used to map chemical modifications for RNA structure mapping in procedures such as DMS-MaPseq (Zubradt et al., 2017; Wu and Bartel, 2017; Wang et al., 2018). The ability of TGIRT-III to give full-length tRNA sequences was key to demonstrating that mature tRNAs, rather than tRNA fragments, predominate in human plasma and EVs (Qin et al., 2016; Shurtleff et al., 2017; Yao et al., 2020), and in distinguishing mature from pre-tRNAs bound by the human interferon-induced protein IFIT5 (Katibah et al., 2014). TGIRT template-switching has also been used to measure levels of tRNA aminoacylation, which blocks 3' tRNA ends for template switching (Evans et al., 2017).


In addition to tRNAs, TGIRT-seq also gives full-length end-to-end sequences of other structured sncRNAs, enabling the identification of novel snoRNAs in cellular RNA preparations (Boivin et al., 2020), and distinguishing pre-miRNA hairpins from mature miRNAs in human plasma (Qin et al., 2016; Yao et al., 2020). In recent work, TGIRT-seq of human plasma and cellular RNAs revealed the presence of thousands of short full-length excised linear intron (FLEXI) RNAs, many of which have stable predicted RNA secondary structures, that would make them difficult to identify by other methods (Yao et al., 2020 and 2021). TGIRT enzymes and variations of the TGIRT-seq method have also been used for high-throughput sequencing of protein-bound RNAs or RNA fragments by RIP-seq and irCLIP (Katibah et al., 2014; Zarnegar et al., 2016), and for ssDNA-seq of human plasma DNA, providing information about nucleosome positioning and DNA methylation sites that can be used to identify tissues of origin (Wu and Lambowitz, 2017). Going forward, we anticipate that the current version of the TGIRT-seq method will be subject to continued improvement, by using the structural and biochemical information now available for GsI-IIC RT (TGIRT-III) to enhance methods for 5' and 3' RNA-seq adapter addition, as well as by using other natural and engineered versions of TGIRTs and other non-LTR-retroelement RTs.

Materials and Reagents

Use reagents and solutions that are RNA grade and nuclease free. Store solutions in frozen aliquots to avoid repeated freezing and thawing.

  1. 1.5 ml DNA LoBind microcentrifuge tubes (Eppendorf, catalog number: 022431021)

  2. 2 ml DNA LoBind microcentrifuge tubes (Eppendorf, catalog number: 022431048)

  3. Ep Dualfilter T.I.P.S LoRetention 0.1-10 μl (Eppendorf, catalog number: 0030078632)

  4. Ep Dualfilter T.I.P.S LoRetention 2-100 μl (Eppendorf, catalog number: 0030078659)

  5. Ep Dualfilter T.I.P.S LoRetention 20-300 μl (Eppendorf, catalog number: 0030078675)

  6. Ep Dualfilter T.I.P.S LoRetention 50-1,000 μl (Eppendorf, catalog number: 0030078683)

  7. 8 × 0.2 ml PCR reaction tube strip with attached flat cap (Simport, catalog number: T3202N)

  8. Dithiothreitol (DTT), 1 M (Thermo Fisher Scientific, catalog number: P2325)

  9. dNTPs mix, 25 mM each (Thermo Fisher Scientific, catalog number: R1122). Dilute to 20 mM each with RNase-free water before use

  10. TGIRT-III enzyme (InGex, catalog number: TGIRT50)

    Note: Store TGIRT-III enzyme received from a supplier at -80°C until ready for use. Store opened tubes at -20°C. TGIRT-III may lose activity after 3 months of storage at -20°C (Behrens et al., 2021).

  11. MinElute Reaction Cleanup kit (QIAGEN, catalog number: 28204 or 28206) or MinElute PCR Purification kit (QIAGEN, catalog number: 28004 or 28006)

  12. 5’ DNA adenylation kit (New England Biolabs, catalog number: E2610S/L)

  13. Oligo Clean & Concentrator kit (Zymo Research, catalog number: D4060/4061)

  14. Thermostable 5’ App DNA/RNA ligase (New England Biolabs, catalog number: M0319S/L)

  15. Phusion high-fidelity PCR master mix with HF buffer (Thermo Fisher Scientific, catalog number: F531S/L)

  16. AMPure XP (Beckman Coulter, catalog number: A63881)

  17. High sensitivity DNA kit (Agilent, catalog number: 5067-4626)

  18. (Optional) RNA 6000 Pico kit (Agilent, catalog number: 5067-1513)

  19. (Optional) Small RNA kit (Agilent, catalog number: 5067-1548)

  20. Oligonucleotides: All oligonucleotides should be HPLC-purified RNase-free grade

    1. R2 RNA

      5’-rArArG rArUrC rGrGrA rArGrA rGrCrA rCrArC rGrUrC rUrGrA rArCrU rCrCrA rGrUrC rArC/3SpC3/-3’

      Note: Other blockers, such as 3’ Amino Modifier C6 dT (3AmMC6T) from IDT, can also be used.

    2. R2R DNA

      5’-GTG ACT GGA GTT CAG ACG TGT GCT CTT CCG ATC TTN-3’ (N = equimolar A, T, G, C)

      Note: The R2R DNA used in the current TGIRT-seq protocol differs from that used in earlier versions. It has a single nucleotide change (insertion of a T residue at the -3 position from the 3' end, i.e., the first T of the 3'-terminal TTN) that strongly decreases formation of R1R-R2R adapter-dimers during the ligation step (Xu et al., 2019). A complementary A residue is inserted at the corresponding position near the 5' end of R2 RNA (see above).

    3. R1R DNA

      5’-/5Phos/GAT CGT CGG ACT GTA GAA CTC TGA ACG TGT AG/3SpC3/-3’

      Note: The Read 1 (R1) sequence corresponds to the small RNA sequencing primer site used in the NEBNext Small RNA Library Prep Set for Illumina sequencing.

    4. 6N unique molecular identifier (UMI) R1R DNA

      5’-/5Phos/NNN NNN GAT CGT CGG ACT GTA GAA CTC TGA ACG TGT AG/3SpC3/-3’

      Note: UMI nucleotides (machine-mixed equimolar A, C, G, and T residues, denoted N) are added at the 5’ end of the R1R sequence. The number of N nucleotides can be changed to suit the complexity of the samples being sequenced and the number of duplicates expected after PCR.

    5. Illumina barcode PCR primer (P5)

      5’-AAT GAT ACG GCG ACC ACC GAG AT BARCODE C TAC ACG TTC AGA GTT CTA CAG TCC GAC GAT C-3’

      Note: The barcode sequence in the primer is the sense strand (e.g., ATCACG in the primer for TruSeq Barcode 01 (TSBC01) on the Illumina website). This barcode is optional but recommended to provide unique dual indices (UDI) for libraries sequenced on a NovaSeq instrument.

    6. Illumina barcode PCR primer (P7)

      5’-CAA GCA GAA GAC GGC ATA CGA GAT BARCODE GTG ACT GGA GTT CAG ACG TGT GCT CTT CCG ATC T-3’

      Note: The barcode sequence in the primer should be the reverse complement of the barcode listed on the Illumina website (e.g., CGTGAT in the primer for TSBC01).

  21. UltraPure DNase/RNase-free distilled water (Thermo Fisher Scientific, catalog number: 10977015) or equivalent from other companies or in house sources

  22. Trizma hydrochloride solution (Tris-HCl), pH 7.5, 2 M (Sigma-Aldrich, catalog number: T2944)

  23. EDTA, 0.5 M, pH 8.0, RNase-free (Thermo Fisher Scientific, catalog number: AM9260G)

  24. Sodium chloride solution, 5 M, RNase-free (Thermo Fisher Scientific, catalog number: AM9760G)

  25. Magnesium chloride solution, BioUltra for molecular biology, 2 M (Sigma-Aldrich, catalog number: 68475-100ML-F)

  26. RNA Century™-Plus Markers (e.g., Thermo Fisher Scientific, catalog number: AM7145)

  27. AMPure XP beads (Beckman Coulter, catalog number: A63880)

  28. mirVana miRNA Isolation kit (Thermo Fisher Scientific, catalog number: AM1560)

  29. Total Exosome RNA and Protein Isolation kit (Thermo Fisher Scientific, catalog number: 4478545)

  30. TRIzol LS Reagent (Thermo Fisher Scientific, catalog number: 10296010)

  31. Turbo DNase (Thermo Fisher Scientific, catalog number: AM2238)

  32. DNase I (Zymo Research, catalog number: E1010)

  33. Exonuclease I (Lucigen, catalog number: X40520K)

  34. Illumina Ribo-Zero Plus rRNA Depletion kit (Illumina, catalog number: 20040526)

  35. NEBNext Magnesium RNA Fragmentation Module (New England Biolabs, catalog number: E6150S)

  36. RNA Clean & Concentrator-5 (Zymo Research, catalog number: R1013)
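The relationships among the oligonucleotides in item 20 (R2R DNA as the reverse complement of R2 RNA plus a 1-nt overhang, the reverse-complemented barcode in the P7 primer, and the number of distinct UMIs) can be checked before ordering. The following is a minimal Python sketch, not part of the published protocol. The helper names (`clean`, `revcomp`, `p7_barcode`, `umi_diversity`) are hypothetical, and the sequences are copied from the oligonucleotide list above without their 5' and 3' modifications.

```python
# Sanity checks on the TGIRT-seq oligonucleotides (illustrative helper names).
# Sequences are copied from the oligonucleotide list; modifications are omitted.

COMPLEMENT = str.maketrans("ACGTN", "TGCAN")


def clean(seq: str) -> str:
    """Strip readability spaces and write RNA (U) in DNA letters (T)."""
    return seq.replace(" ", "").upper().replace("U", "T")


def revcomp(seq: str) -> str:
    """Reverse complement of a DNA sequence; N stays N."""
    return clean(seq).translate(COMPLEMENT)[::-1]


R2_RNA = clean("AAG AUC GGA AGA GCA CAC GUC UGA ACU CCA GUC AC")
R2R_DNA = clean("GTG ACT GGA GTT CAG ACG TGT GCT CTT CCG ATC TTN")
R1R_DNA = clean("GAT CGT CGG ACT GTA GAA CTC TGA ACG TGT AG")
P5_3PRIME = clean("C TAC ACG TTC AGA GTT CTA CAG TCC GAC GAT C")
P7_3PRIME = clean("GTG ACT GGA GTT CAG ACG TGT GCT CTT CCG ATC T")

# R2R DNA should equal the reverse complement of R2 RNA plus the 1-nt overhang (N).
duplex_ok = R2R_DNA[:-1] == revcomp(R2_RNA) and R2R_DNA[-1] == "N"

# The 3' portion of the P5 primer should anneal to R1R (be its reverse complement);
# the 3' portion of the P7 primer should match the 5' portion of R2R.
p5_ok = P5_3PRIME == revcomp(R1R_DNA)
p7_ok = R2R_DNA.startswith(P7_3PRIME)


def p7_barcode(listed_barcode: str) -> str:
    """Barcode as written in the P7 primer: reverse complement of the listed index."""
    return revcomp(listed_barcode)


def umi_diversity(n_random_positions: int) -> int:
    """Number of distinct UMIs for n randomized positions (4 bases each)."""
    return 4 ** n_random_positions


print(duplex_ok, p5_ok, p7_ok, p7_barcode("ATCACG"), umi_diversity(6))
```

For the TSBC01 example in the notes, `p7_barcode("ATCACG")` gives the sequence to write into the P7 primer, while the P5 primer uses the listed (sense) barcode unchanged.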

Equipment

  1. -20°C freezer

  2. T100 96-well PCR thermal cycler (Bio-Rad, catalog number: 1861096), Veriti 9044 60-well thermal cycler (Applied Biosystems, catalog number: 4384638), or equivalent thermal cycler with a heated lid and stable temperature control

  3. Microcentrifuge (Eppendorf, catalog number: 2231000768)

  4. DynaMag-2 Magnet (Thermo Fisher Scientific, catalog number: 12321D)

  5. 2100 Bioanalyzer instrument (Agilent, catalog number: G2939BA)

  6. Chip priming station (Agilent, catalog number: 5065-4401)

Software

  1. 2100 Expert software/upgrade (Agilent, catalog number: G2946CA)

    Note: The Bioanalyzer software should be bundled with the instrument. If not, an upgraded version is purchasable from Agilent.

Procedure

  1. Preparation of 10× Starter Duplex (Table 1)


    Table 1. Components needed for 10× (1 μM) Starter Duplex

    | Components | Volume (final concentration) |
    | --- | --- |
    | 10 μM R2 RNA | 5 μl (1 μM) |
    | 10 μM R2R hand-mixed DNA | 5 μl (1 μM) |
    | 10× TE buffer (100 mM Tris-HCl, pH 7.5, 10 mM EDTA) | 5 μl (10 mM Tris-HCl, pH 7.5, 1 mM EDTA) |
    | Nuclease-free H2O | to 50 μl |


    1. Prepare 10 µM R2R DNA containing an equimolar ratio of A, C, G, and T residues at the 1-nt DNA overhang position by hand-mixing equal volumes of four separate R2R DNA oligonucleotides with 3' A, C, G, or T residues (10 µM each) in a 1.5-ml Eppendorf LoBind microcentrifuge tube or equivalent, and vortex. Aliquot the stock solution and store at -20°C until use. A volume of 50 μl of 10× Starter Duplex is sufficient for 25 reactions (2 μl per reaction).

      Note: R2R DNA with unequal ratios of 1-nt 3’ DNA overhang nucleotides can be used to mitigate template-switching biases in TGIRT-seq (Mohr et al., 2013; Xu et al., 2019).

    2. Set up the above reaction components in a sterile PCR tube for annealing of oligonucleotides to form the Starter Duplex.

    3. Incubate the mixed 10× Starter Duplex components at 82°C for 2 min in a pre-heated thermocycler.

    4. Cool to 25°C with a 10% ramp or at a rate of 0.1°C/s.

      Note: The annealed R2 RNA/R2R DNA Starter Duplex should be prepared freshly each time before the template-switching reaction. Discard any left over after use.
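The final concentrations in Table 1 follow from the usual dilution relation C1 × V1 = C2 × V2. The short Python sketch below reproduces them and estimates how long the cooling ramp takes; it is an illustration rather than part of the protocol, and the function and variable names are hypothetical.

```python
# Dilution arithmetic for the 10x Starter Duplex (Table 1); names are illustrative.

TOTAL_UL = 50.0          # final volume of the annealing mix
PER_REACTION_UL = 2.0    # 10x Starter Duplex used per template-switching reaction


def final_concentration(stock_conc: float, stock_volume_ul: float,
                        total_volume_ul: float = TOTAL_UL) -> float:
    """C2 = C1 * V1 / V2, in the same units as the stock concentration."""
    return stock_conc * stock_volume_ul / total_volume_ul


r2_rna_um = final_concentration(10.0, 5.0)    # 10 uM R2 RNA stock
r2r_dna_um = final_concentration(10.0, 5.0)   # 10 uM hand-mixed R2R DNA stock
tris_mm = final_concentration(100.0, 5.0)     # Tris-HCl from 10x TE
edta_mm = final_concentration(10.0, 5.0)      # EDTA from 10x TE

water_ul = TOTAL_UL - (5.0 + 5.0 + 5.0)       # water needed to reach 50 ul
reactions = int(TOTAL_UL // PER_REACTION_UL)  # reactions supported by one batch

# Approximate duration of the cooling step from 82 C to 25 C at 0.1 C/s.
cooling_seconds = (82.0 - 25.0) / 0.1

print(r2_rna_um, tris_mm, edta_mm, water_ul, reactions, round(cooling_seconds))
```

At 0.1°C/s the cooling step alone takes roughly 9.5 min, which is worth budgeting for because the duplex is prepared fresh for each experiment.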


  2. Template-switching reverse transcription reaction

    1. Set up the following reaction components in a sterile PCR tube adding the TGIRT-III enzyme last (Table 2).


      Table 2. Components needed for template-switching reverse transcription reaction

      | Components | Volume (final concentration) |
      | --- | --- |
      | RNA sample | 0.5-50 ng or <100 nM |
      | 5× Reaction Buffer (1 M NaCl, 25 mM MgCl2, 100 mM Tris-HCl, pH 7.5) | 4 μl (200 mM NaCl, 5 mM MgCl2, 20 mM Tris-HCl, pH 7.5) |
      | 10× DTT (100 mM; avoid excessive freezing and thawing) | 2 μl (10 mM) |
      | 10× Starter Duplex (after annealing) | 2 μl (100 nM final) |
      | TGIRT-III enzyme (10 μM) | 1 or 2 μl (500 nM or 1 μM final) |
      | Nuclease-free H2O | to 19 μl |


      Notes:

      1. Because TGIRT enzymes can also template switch to DNA, RNA samples should be DNase treated prior to TGIRT-seq. Minimum RNA inputs for DNase-treated RNA samples are: 2 ng rRNA-depleted, chemically fragmented cellular RNA; 20-50 ng rRNA-depleted, unfragmented cellular RNAs; 500 pg plasma RNA; 2 ng RNA extracted from highly purified extracellular vesicles. Measure low RNA concentrations with a Qubit or Bioanalyzer. See Notes 1-4 at the end of the text for further details regarding preparation of RNA samples for TGIRT-seq.

      2. We use 200 mM NaCl in this protocol instead of 450 mM NaCl used in earlier versions to increase the efficiency of template switching without prohibitively increasing multiple sequential template switches, which result in artifactual fusion reads. See also Note 5 below.

      3. Do a pilot experiment to determine the optimal enzyme concentration for your samples using an Agilent Bioanalyzer or equivalent instrument to monitor library quality and quantity.

      4. A template-switching reverse transcription reaction using the TGIRT-III enzyme to a commercial RNA ladder (e.g., Thermo Fisher Scientific, catalog number: AM7145) can be carried through the procedure as a positive control.

    2. Pre-incubate at room temperature for 30 min, then initiate template switching and reverse transcription by adding 1 μl of 20 mM dNTPs (an equimolar mixture of 20 mM each dATP, dCTP, dGTP, and dTTP).

    3. Incubate at 60°C for 15 min for whole cell, EV, or plasma RNAs, 5-10 min for mature miRNAs, or up to 60 min for long or heavily modified RNAs. The optimal incubation time should be determined experimentally for different RNA templates.

    4. Add 1 μl of 5 M NaOH and incubate at 95°C for 3 min.

    5. Cool to room temperature and neutralize with 1 μl of 5 M HCl.

      Note: Steps 4 and 5 of this section (B4 and B5) are needed because TGIRT-III binds very tightly to nucleic acids. Alkaline hydrolysis degrades the RNA template and enzyme but not the cDNA products.

    6. Add 78 μl of nuclease-free water to bring to a final volume of 100 μl.

    7. Clean up the cDNAs with a MinElute Reaction Cleanup kit (QIAGEN, catalog number: 28204) or a MinElute PCR Purification kit (QIAGEN, catalog number: 28004) and elute in 10 μl of QIAGEN elution buffer.

      Note: Incubate the column with the elution buffer at room temperature before centrifugation to maximize recovery.

      The procedure can be interrupted here, with the cDNAs stored at -20°C.

    8. Proceed with R1R adenylation, thermostable ligation, and Phusion PCR amplification.
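The volume bookkeeping for this section can be sketched in a few lines of Python. This is an illustration under stated assumptions, not part of the protocol: the helper names are hypothetical, and the final concentrations are computed for the 20 μl reaction that exists after the 1 μl of dNTPs is added, which is the volume that reproduces the values in Table 2.

```python
# Volume bookkeeping for the template-switching reaction (Table 2, steps 1-6).
# Helper names are illustrative; volumes are in microliters.

SETUP_UL = 19.0                      # volume before dNTP addition
DNTP_UL = 1.0                        # 20 mM dNTPs added to start the reaction
REACTION_UL = SETUP_UL + DNTP_UL     # 20 ul reaction


def water_volume_ul(rna_ul: float, enzyme_ul: float = 1.0) -> float:
    """Water needed to reach 19 ul given the RNA and enzyme volumes."""
    fixed_ul = 4.0 + 2.0 + 2.0       # 5x buffer + 10x DTT + 10x Starter Duplex
    water = SETUP_UL - (fixed_ul + enzyme_ul + rna_ul)
    if water < 0:
        raise ValueError("RNA volume too large for a 19 ul set-up; concentrate the sample")
    return water


def final_conc(stock_conc: float, volume_ul: float) -> float:
    """Final concentration in the 20 ul reaction, in the units of the stock."""
    return stock_conc * volume_ul / REACTION_UL


nacl_mm = final_conc(1000.0, 4.0)      # from 5x buffer (1 M NaCl)
duplex_nm = final_conc(1000.0, 2.0)    # from 10x Starter Duplex (1 uM = 1000 nM)
tgirt_um = final_conc(10.0, 1.0)       # 1 ul of 10 uM TGIRT-III
dntp_mm = final_conc(20.0, DNTP_UL)    # each dNTP

# After NaOH (1 ul), HCl (1 ul), and water (78 ul), the clean-up input volume is:
cleanup_input_ul = REACTION_UL + 1.0 + 1.0 + 78.0

print(water_volume_ul(5.0), nacl_mm, duplex_nm, tgirt_um, dntp_mm, cleanup_input_ul)
```

With 1 μl of enzyme, at most 10 μl of RNA fits into the set-up; with 2 μl of enzyme the limit is 9 μl, so dilute samples may need to be concentrated first.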


  3. R1R 5' DNA adenylation using the New England Biolabs kit (New England Biolabs, catalog number: E2610S/L)

    Note: If the adenylated oligonucleotide is purchased from a commercial supplier, proceed to the ligation step.

    1. Set up the following reaction components in a sterile PCR tube (Table 3):


      Table 3. Components needed for 5' DNA adenylation

      | Components | Volume |
      | --- | --- |
      | 10× reaction buffer (DNA Adenylation Buffer) | 2 μl |
      | 1 mM ATP | 2 μl |
      | 100 μM 5’phos/3’SpC3 R1R DNA | 1 μl |
      | Mth RNA Ligase | 2 μl |
      | Nuclease-free water | to 20 μl |

      Note: The New England Biolabs kit includes 10× DNA Adenylation buffer, 1 mM ATP, and Mth RNA ligase.


    2. Incubate at 65°C for 1 h.

    3. Incubate at 85°C for 5 min to inactivate the enzyme.

    4. Clean up with an Oligo Clean & Concentrator kit (Zymo Research, catalog number: D4060) and elute in 10 μl of nuclease-free water for a final concentration of 10 μM 5’-end adenylated R1R DNA.

      Note: If required for a large number of TGIRT-seq libraries, we recommend scaling up the adenylation reaction by doing multiple (e.g., 6× to 8×) 20-μl reactions with the same amounts of enzyme and oligonucleotide, and then combining products for clean-up, as higher elution volume helps with efficient recovery of adenylated oligonucleotides. After clean-up, recovered material should be 80-100 ng/μl for R1R and 100-120 ng/μl for 6N UMI R1R, measured with a Nanodrop spectrophotometer using the ssDNA setting. Adenylation can be monitored using an Agilent Bioanalyzer with a Small RNA kit. The adenylated oligonucleotide can be stored at -20°C for up to two weeks.
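
The expected Nanodrop readings can be related to the nominal 10 μM concentration with a rough conversion. The sketch below is not part of the published protocol: it assumes an average mass of about 330 g/mol per nucleotide of single-stranded DNA and ignores the mass of the 5' and 3' modifications, so the results are approximate. The oligonucleotide lengths (32 nt for R1R, 38 nt for 6N UMI R1R) are counted from the sequences listed under Materials and Reagents; all function names are hypothetical.

```python
# ASSUMPTION: ~330 g/mol per nucleotide is a rough average for ssDNA;
# the 5' adenylate/phosphate and 3' spacer are ignored.
AVG_NT_MASS_G_PER_MOL = 330.0

R1R_NT = 32      # length of the R1R sequence listed under Materials and Reagents
UMI_R1R_NT = 38  # R1R plus six UMI nucleotides


def expected_um(input_um: float, input_ul: float, elution_ul: float,
                recovery: float = 1.0) -> float:
    """Concentration (uM) after clean-up for a given fractional recovery."""
    return input_um * input_ul / elution_ul * recovery


def ng_per_ul_to_um(ng_per_ul: float, length_nt: int) -> float:
    """Convert a mass concentration to uM.

    ng/ul equals mg/l; mg/l divided by g/mol gives mmol/l, and x1000 gives umol/l.
    """
    molar_mass = length_nt * AVG_NT_MASS_G_PER_MOL
    return ng_per_ul / molar_mass * 1000.0


if __name__ == "__main__":
    # 1 ul of 100 uM oligonucleotide eluted in 10 ul, assuming full recovery
    print(f"Nominal: {expected_um(100, 1, 10):.1f} uM")
    for label, nt, lo, hi in (("R1R", R1R_NT, 80, 100),
                              ("6N UMI R1R", UMI_R1R_NT, 100, 120)):
        print(f"{label}: {ng_per_ul_to_um(lo, nt):.1f}-"
              f"{ng_per_ul_to_um(hi, nt):.1f} uM")
```

Under these assumptions, the recommended 80-100 ng/μl range for R1R corresponds to roughly 7.6-9.5 μM, that is, somewhat below the nominal 10 μM expected at full recovery.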


  4. Thermostable ligation of a 5’ adenylated adapter (New England Biolabs, catalog number: M0319S/L)

    1. Set up the following reaction components in a sterile PCR tube (Table 4):


      Table 4. Components needed for thermostable ligation of 5’ adapter

      Components                                 Volume
      10× reaction buffer (NEBuffer 1)           2 μl
      50 mM MnCl2                                2 μl
      cDNA from template-switching reaction      Up to 10 μl
      Thermostable 5’ AppDNA/RNA Ligase          2 μl
      10 μM 5’-end adenylated R1R DNA            4 μl
      Nuclease-free water                        To 20 μl if using less than 10 μl cDNA


    2. Incubate at 65°C for 1-2 h.

      Note: We recommend 2 h ligation. However, 1 h ligation can be used if the starting material is abundant.

    3. Add 80 μl of nuclease-free water for a final volume of 100 μl.

    4. Clean up the ligated cDNAs with a MinElute Reaction Cleanup kit (QIAGEN, catalog number: 28204) and elute in 23 μl of QIAGEN elution buffer (incubate the column with elution buffer at room temperature before centrifugation to maximize recovery).

      Note: The procedure can be interrupted here with the ligated cDNAs stored at -20°C.

    5. Proceed with Phusion PCR amplification.
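
When less than 10 μl of cDNA is used, the water volume in Table 4 has to be adjusted. A minimal sketch of that bookkeeping follows, using the volumes from Table 4; the function and dictionary names are hypothetical, and the final-concentration helper is simple dilution arithmetic rather than anything specified by the kit manufacturer.

```python
LIGATION_TOTAL_UL = 20.0
MAX_CDNA_UL = 10.0

# Fixed components and volumes (ul) from Table 4
FIXED_COMPONENTS_UL = {
    "10x reaction buffer (NEBuffer 1)": 2.0,
    "50 mM MnCl2": 2.0,
    "Thermostable 5' AppDNA/RNA Ligase": 2.0,
    "10 uM 5'-end adenylated R1R DNA": 4.0,
}


def ligation_mix(cdna_ul: float) -> dict:
    """Return all component volumes (ul) for one 20-ul ligation reaction."""
    if not 0 < cdna_ul <= MAX_CDNA_UL:
        raise ValueError(f"cDNA volume must be >0 and <= {MAX_CDNA_UL} ul")
    mix = dict(FIXED_COMPONENTS_UL)
    mix["cDNA from template-switching reaction"] = cdna_ul
    mix["Nuclease-free water"] = (
        LIGATION_TOTAL_UL - sum(FIXED_COMPONENTS_UL.values()) - cdna_ul
    )
    return mix


def final_concentration(stock: float, volume_ul: float,
                        total_ul: float = LIGATION_TOTAL_UL) -> float:
    """Final concentration in the same units as the stock."""
    return stock * volume_ul / total_ul


if __name__ == "__main__":
    for name, vol in ligation_mix(6.0).items():
        print(f"{name}: {vol:g} ul")
    print(f"MnCl2: {final_concentration(50, 2):g} mM")
    print(f"Adenylated R1R DNA: {final_concentration(10, 4):g} uM")
```

By this arithmetic the reaction contains 5 mM MnCl2 and 2 μM adenylated R1R DNA, and no water is added when the full 10 μl of cDNA is used.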


  5. PCR amplification (Thermo Fisher Scientific, catalog number: F531S/L)

    1. Set up the following reaction components in a sterile PCR tube (Table 5):


      Table 5. Components needed for PCR amplification

      Components                                               Volume (final concentration)
      2× Phusion High-Fidelity PCR Master Mix in HF buffer     25 μl
      10 μM Illumina P5 primer                                 1 μl (200 nM)
      10 μM Illumina P7 primer                                 1 μl (200 nM)
      cDNA from thermostable ligation                          Up to 23 μl
      Nuclease-free water                                      To 50 μl, if using less than 23 μl cDNA


    2. PCR cycles:

      1. Initial denaturation at 98°C for 5 s.

      2. Up to 12 cycles of 98°C for 5 s, 60°C for 10 s, and 72°C for 15-30 s/kb, followed by a hold at 4°C.

        Notes:

        1. Minimizing the number of cycles of PCR amplification decreases bias and duplicate reads. Twelve PCR cycles is generally satisfactory for TGIRT-seq with RNA inputs indicated in Section B. The number of cycles can be increased or decreased for different RNA inputs and library complexities.

        2. The procedure can be interrupted here and the PCR products stored at -20°C.

    3. Use AMPure XP beads (Beckman Coulter, catalog number: A63880) to deplete adapter-dimers and enrich for desired DNA sizes in the library. The default ratio is 1.4× v/v (70 μl of beads/50 μl of PCR reaction). The ratio of beads to sample volume can be adjusted depending on the size profile of RNAs being sequenced.

    4. Check library quality by analyzing 1 μl on a Bioanalyzer with a High Sensitivity DNA Analysis kit (Figure 2).



      Figure 2. Bioanalyzer traces of representative TGIRT-seq libraries obtained by using an Agilent Bioanalyzer with a High Sensitivity DNA kit.

      A. Ribo-depleted and chemically fragmented HeLa S3 cellular RNA (20 ng starting material after ribodepletion). Library products for cellular RNAs that were fragmented to 70-90 nt run at 190-210 bp. B. Plasma RNA (0.5 ng of starting material). Library products run at 140-240 bp, with the peak at ~200 bp corresponding to library products from tRNAs. Residual adapter dimers (AD), which are present at low concentrations in the final libraries, run at ~120 bp for R1R-R2R, and slightly larger with the UMI R1R adapter or Unique Dual Index (UDI) PCR primer. Peaks smaller than 100 bp correspond to unused PCR and adapter oligonucleotides (P+A). The lower (35 bp) and upper (10,380 bp) markers are internal markers provided with the High Sensitivity DNA kit. DNA lengths (bp) are determined by the software program based on the ladder provided with the kit.


      Notes:

      1. If needed, an additional round of the 1.4× AMPure XP beads clean-up can be performed to further deplete levels of adapter dimers. More extensive AMPure XP beads clean-up can result in loss of library products.

      2. The final TGIRT-seq library can be stored at -20°C until submitted for sequencing.
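
The PCR set-up and bead clean-up volumes above follow from simple ratios. The sketch below reproduces them from the values in Table 5 and the default 1.4× bead ratio; the function names are hypothetical, and the bead ratio should still be adjusted empirically as described above.

```python
PCR_TOTAL_UL = 50.0
MASTER_MIX_UL = 25.0       # 2x Phusion master mix
PRIMER_UL = 1.0            # each of the P5 and P7 primers
PRIMER_STOCK_UM = 10.0
MAX_LIGATED_CDNA_UL = 23.0


def pcr_water_ul(cdna_ul: float) -> float:
    """Water (ul) needed to bring the PCR to 50 ul."""
    if not 0 < cdna_ul <= MAX_LIGATED_CDNA_UL:
        raise ValueError(f"cDNA volume must be >0 and <= {MAX_LIGATED_CDNA_UL} ul")
    return PCR_TOTAL_UL - MASTER_MIX_UL - 2 * PRIMER_UL - cdna_ul


def primer_final_nm() -> float:
    """Final concentration (nM) of each primer in the 50-ul PCR."""
    return PRIMER_STOCK_UM * PRIMER_UL / PCR_TOTAL_UL * 1000.0


def ampure_bead_ul(sample_ul: float = PCR_TOTAL_UL, ratio: float = 1.4) -> float:
    """Bead volume (ul) for a given bead-to-sample ratio (v/v)."""
    return sample_ul * ratio


if __name__ == "__main__":
    print(f"Water for 15 ul cDNA: {pcr_water_ul(15):g} ul")
    print(f"Primer final concentration: {primer_final_nm():.0f} nM")
    print(f"Beads at 1.4x: {ampure_bead_ul():.0f} ul")
```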


  6. Sequencing and data analysis

    TGIRT-seq libraries can be sequenced on any Illumina sequencing instrument to obtain single-end (SE) or paired-end (PE) reads. Read length and read depth are dependent upon the needs of the experiment. We have used PE75 on a NextSeq 500 with 330 million reads output for analysis of cellular, EV, and plasma RNAs. We have also used PE150 on a NovaSeq with at least 700 million reads output for large projects or samples requiring high-depth analysis.

    Because TGIRT-seq simultaneously profiles coding and non-coding RNAs, read mapping pipelines for TGIRT-seq typically use a sequential mapping strategy with specialized databases. Reads are first mapped to rRNA and tRNA sequences, which are encoded at multiple loci, and to sncRNAs embedded within introns of protein-coding genes, before the remaining reads are mapped to a human genome reference sequence. In addition to end-to-end alignment, a local alignment step is typically included to more efficiently capture reads with non-templated nucleotides added to the 3' end of cDNAs by TGIRT enzymes. Detailed protocols for read processing and mapping, including UMI deconvolution of duplicate reads, and peak calling for the identification of protein-protected mRNA fragments, structured excised intron RNAs, and intron RNA fragments as potential biomarkers in human plasma can be found in Yao et al. (2020).

    Additional TGIRT-seq read mapping protocols used for different sample types and applications, including statistical tests and details of replicates for different applications, can be found in the following references: plasma RNA (Qin et al., 2016); total cellular RNA, including spike-ins (Nottingham et al., 2016); plasma DNA, including nucleosome positioning and mapping of DNA methylation sites (Wu and Lambowitz, 2017); sncRNAs (Boivin et al., 2018); total cellular RNAs, including detailed analysis and remediation of sequencing biases (Xu et al., 2019).
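
The logic of sequential mapping can be illustrated with a toy example. The sketch below is not the pipeline of Yao et al. (2020): the exact-substring "mapper" is a stand-in for a real aligner, and the stage names, reference sequences, and reads are invented for illustration. It only shows the control flow in which each read is assigned to the first reference set that captures it, so that multi-copy or intron-embedded RNAs are claimed before the genome stage.

```python
from typing import Callable, Dict, Iterable, List, Tuple

Read = Tuple[str, str]                 # (read_id, sequence)
Stage = Tuple[str, Callable[[str], bool]]


def substring_mapper(references: Iterable[str]) -> Callable[[str], bool]:
    """Toy stand-in for an aligner: a read maps if it is an exact substring."""
    refs = list(references)
    return lambda seq: any(seq in ref for ref in refs)


def sequential_map(reads: Iterable[Read], stages: List[Stage]) -> Dict[str, List[str]]:
    """Assign each read to the first stage that maps it; pass the rest on."""
    assigned: Dict[str, List[str]] = {name: [] for name, _ in stages}
    remaining = list(reads)
    for name, mapper in stages:
        still_unmapped = []
        for read_id, seq in remaining:
            if mapper(seq):
                assigned[name].append(read_id)
            else:
                still_unmapped.append((read_id, seq))
        remaining = still_unmapped
    assigned["unmapped"] = [read_id for read_id, _ in remaining]
    return assigned


# Invented toy references; the "genome" contains the sncRNA sequence too.
stages = [
    ("rRNA_tRNA", substring_mapper(["GGGGAAATTTCCCC"])),
    ("intronic_sncRNA", substring_mapper(["ACGTACGTTTGACA"])),
    ("genome", substring_mapper(["TTTTACGTACGTTTGACAGGGG"])),
]
reads = [("r1", "GAAATTTC"), ("r2", "ACGTTTGA"),
         ("r3", "TTTTACGT"), ("r4", "CCCCCCCC")]
result = sequential_map(reads, stages)

if __name__ == "__main__":
    for stage_name, read_ids in result.items():
        print(stage_name, read_ids)
```

In this toy data set, read r2 would also match the genome reference, but it is assigned to the sncRNA stage because that stage runs first.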

Notes

  1. We have used various commercial kits to isolate RNAs for TGIRT-seq. The commercial kits differ significantly in RNA yield and can bias for different RNA biotypes. We have used the mirVana miRNA Isolation kit (Thermo Fisher Scientific, catalog number: AM1560) with good results for TGIRT-seq of human cellular RNA. For EVs, we obtained good results by using the Total Exosome RNA and Protein Isolation kit (Thermo Fisher Scientific, catalog number: 4478545). For isolating human plasma RNAs, we have tried a number of different extraction reagents and kits, including TRIzol LS Reagent (Thermo Fisher Scientific, catalog number: 10296010) and the Direct-zol RNA Miniprep (Zymo Research, catalog number: R2051); QIAamp ccfDNA/RNA Extraction kit (QIAGEN, catalog number: 55184); and miRNeasy Serum/Plasma Advanced kit (QIAGEN, catalog number: 217204). Each kit has advantages and disadvantages for input amount, ease of use, and RNA yield. In our hands, the QIAamp ccfDNA/RNA kit was more efficient for extracting DNA than RNA (Yao et al., 2020).

  2. TGIRT enzymes can template switch from the R2 RNA/R2R DNA starter duplex to the 3' end of DNA fragments, and this activity has been used for sequencing single-stranded DNA for analysis of nucleosome positioning and mapping of DNA methylation sites (Wu and Lambowitz, 2017). For RNA-seq, it is important to remove as much DNA as possible to minimize DNA reads. We have used Turbo DNase (Thermo Fisher Scientific, catalog number: AM2238) and a combination of DNase I (Zymo Research, catalog number: E1010) and Exonuclease I (Lucigen, catalog number: X40520K) (Yao et al., 2020). We clean up DNase-treated RNAs using an RNA Clean & Concentrator-5 kit (Zymo Research, catalog number: R1013) with a modified 8× V/V (ethanol/input RNA) protocol.

  3. Cellular RNAs are rRNA-depleted to minimize reads mapping to rRNAs. As rRNA removal procedures are not 100% specific for rRNAs, different rRNA depletion kits can also affect reads corresponding to other RNA biotypes, including RNAs of interest. Thus, the same rRNA depletion kit should be used for all samples in a study. The Illumina Ribo-Zero Plus rRNA Depletion kit (Illumina, catalog number: 20040526) with the manufacturer's protocol gives satisfactory results for TGIRT-seq [5-10% rRNA reads in the final library, although in our hands it is not as effective at removing rRNA as the original (now discontinued) version of that kit (RiboZero Gold, <1-2% rRNA reads)].

  4. For chemical fragmentation of cellular RNAs, we use the NEBNext Magnesium RNA Fragmentation Module (New England Biolabs, catalog number: E6150S). We typically use 50 ng rRNA-depleted total cellular RNA as input and heat at 94°C for 5-7 min for fragmentation. The RNA input and fragmentation time should be optimized by the researcher according to the RNAs of interest and desired read length for Illumina sequencing. After fragmentation, we clean up the RNA by using an RNA Clean & Concentrator-5 kit (Zymo Research, catalog number: R1013) with a modified 8× V/V (ethanol/input RNA) protocol. Fragmented cellular RNAs generated with divalent cations should be treated with T4 polynucleotide kinase with 3’-phosphatase activity to remove 3' phosphates and 2’,3’-cyclic phosphates, which impede TGIRT template-switching (Mohr et al., 2013; Nottingham et al., 2016). This inhibition is less pronounced at 200 mM than at 450 mM NaCl (Lentzsch et al., 2019; Yao et al., 2020).

  5. The high salt concentration (450 mM NaCl) used in initial TGIRT-seq protocols to suppress multiple end-to-end template switches by the TGIRT enzyme also decreases the efficiency of the initial template-switching reaction from the R2 RNA/R2R DNA Starter Duplex (Lentzsch et al., 2019). Multiple template switches are particularly problematic for miRNA sequencing. The updated protocol using 200 mM NaCl to increase the efficiency of initial template-switching reaction from the R2 RNA/R2R DNA starter duplex gave comprehensive libraries of coding and non-coding RNAs in human plasma with acceptable levels of fusion reads (0.5-4%), which include multiple template switches (Yao et al., 2020).

  6. With low RNA inputs (e.g., for human plasma and EV RNAs), Bioanalyzer traces of TGIRT-seq libraries may show an adapter dimer peak but no software-called library product peaks. In those cases, 1 μl of the library can be amplified with another 12 cycles of PCR to detect library products. If library products are detected after this further amplification, the original library (without further amplification) is suitable for sequencing. After the additional amplification, adapter dimer levels will be even higher (as short products amplify preferentially), but products of the predicted sizes should be visible. If library products are still not detected, library preparation most likely failed because of a problem with the RNA input (quality or quantity).

  7. The absence of both a product peak and an adapter dimer peak (120-125 bp) in Bioanalyzer traces of the final TGIRT-seq library indicates that adapter ligation was unsuccessful, most likely because either the R1R adenylation step or the ligation step failed.
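
The troubleshooting logic in Notes 6 and 7 can be summarized as a small decision function. This is only a sketch of the reasoning stated in those notes; the function name, argument names, and returned messages are hypothetical, and the peak calls themselves still have to be made by inspecting the Bioanalyzer trace.

```python
from typing import Optional


def interpret_trace(product_peak: bool,
                    adapter_dimer_peak: bool,
                    product_after_reamp: Optional[bool] = None) -> str:
    """Suggest a next step from Bioanalyzer observations (Notes 6 and 7).

    product_after_reamp is None until 1 ul of the library has been
    re-amplified for 12 more cycles and re-analyzed.
    """
    if product_peak:
        return "library products detected; proceed to sequencing"
    if not adapter_dimer_peak:
        # Note 7: neither products nor adapter dimers
        return "ligation likely failed; check R1R adenylation and ligation"
    if product_after_reamp is None:
        # Note 6: adapter dimer only, typical of low RNA input
        return "re-amplify 1 ul of library for 12 more cycles and re-check"
    if product_after_reamp:
        return "sequence the original library without further amplification"
    return "library preparation likely failed; check RNA input quality and quantity"


if __name__ == "__main__":
    print(interpret_trace(False, True))
    print(interpret_trace(False, True, product_after_reamp=True))
    print(interpret_trace(False, False))
```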

Acknowledgments

The development of TGIRT-seq methods has been supported by NIH grant R35 GM136216 and Welch Foundation Grant F-1607. This protocol has been adapted from Yao et al. (2020).

Competing interests

Thermostable group II intron reverse transcriptase enzymes and methods for their use are the subject of patents and patent applications that have been licensed by the University of Texas and East Tennessee State University to InGex, LLC. A.M.L, some former and present members of the Lambowitz laboratory, and the University of Texas are minority equity holders in InGex, LLC, and receive royalty payments from the sale of TGIRT enzymes and kits employing TGIRT template-switching activity and from the sublicensing of intellectual property to other companies.

References

  1. Behrens, A., Rodschinka, G. and Nedialkova, D. D. (2021). High-resolution quantitative profiling of tRNA abundance and modification status in eukaryotes by mim-tRNAseq. Mol Cell 81(8): 1802-1815.e1807.
  2. Belfort, M. and Lambowitz, A. M. (2019). Group II Intron RNPs and Reverse Transcriptases: From Retroelements to Research Tools. Cold Spring Harb Perspect Biol 11(4): a032375.
  3. Blocker, F. J., Mohr, G., Conlan, L. H., Qi, L., Belfort, M. and Lambowitz, A. M. (2005). Domain structure and three-dimensional model of a group II intron-encoded reverse transcriptase. RNA 11(1): 14-28.
  4. Boivin, V., Deschamps-Francoeur, G., Couture, S., Nottingham, R. M., Bouchard-Bourelle, P., Lambowitz, A. M., Scott, M. S. and Abou-Elela, S. (2018). Simultaneous sequencing of coding and noncoding RNA reveals a human transcriptome dominated by a small number of highly expressed noncoding genes. RNA 24(7): 950-965.
  5. Boivin, V., Reulet, G., Boisvert, O., Couture, S., Elela, S. A. and Scott, M. S. (2020). Reducing the structure bias of RNA-Seq reveals a large number of non-annotated non-coding RNA. Nucleic Acids Res 48(5): 2271-2286.
  6. Carrell, S. T., Tang, Z., Mohr, S., Lambowitz, A. M. and Thornton, C. A. (2018). Detection of expanded RNA repeats using thermostable group II intron reverse transcriptase. Nucleic Acids Res 46(1): e1.
  7. Cocquet, J., Chong, A., Zhang, G. and Veitia, R. A. (2006). Reverse transcriptase template switching and false alternative transcripts. Genomics 88(1): 127-131.
  8. Evans, M. E., Clark, W. C., Zheng, G. and Pan, T. (2017). Determination of tRNA aminoacylation levels by high-throughput sequencing. Nucleic Acids Res 45(14): e133.
  9. Gladyshev, E. A. and Arkhipova, I. R. (2011). A widespread class of reverse transcriptase-related cellular genes. Proc Natl Acad Sci U S A 108(51): 20311-20316.
  10. Hafner, M., Renwick, N., Brown, M., Mihailović, A., Holoch, D., Lin, C., Pena, J. T., Nusbaum, J. D., Morozov, P., Ludwig, J., et al. (2011). RNA-ligase-dependent biases in miRNA representation in deep-sequenced small RNA cDNA libraries. RNA 17(9): 1697-1712.
  11. Hu, W. S. and Hughes, S. H. (2012). HIV-1 reverse transcription. Cold Spring Harb Perspect Med 2(10): a006882.
  12. Katibah, G. E., Qin, Y., Sidote, D. J., Yao, J., Lambowitz, A. M. and Collins, K. (2014). Broad and adaptable RNA structure recognition by the human interferon-induced tetratricopeptide repeat protein IFIT5. Proc Natl Acad Sci U S A 111(33): 12025-12030.
  13. Kojima, K. K. and Kanehisa, M. (2008). Systematic survey for novel types of prokaryotic retroelements based on gene neighborhood and protein architecture. Mol Biol Evol 25(7): 1395-1404.
  14. Lambowitz, A. M. and Belfort, M. (2015). Mobile Bacterial Group II Introns at the Crux of Eukaryotic Evolution. Microbiol Spectr 3(1): Mdna3-0050-2014.
  15. Lentzsch, A. M., Yao, J., Russell, R. and Lambowitz, A. M. (2019). Template-switching mechanism of a group II intron-encoded reverse transcriptase and its implications for biological function and RNA-Seq. J Biol Chem 294: 19764-19784.
  16. Lentzsch, A. M., Stamos, J. L., Yao, J., Russell, R. and Lambowitz, A. M. (2021). Structural basis for template switching by a group II intron-encoded non-LTR-retroelement reverse transcriptase. J Biol Chem 297(2): 100971.
  17. Mader, R. M., Schmidt, W. M., Sedivy, R., Rizovski, B., Braun, J., Kalipciyan, M., Exner, M., Steger, G. G. and Mueller, M. W. (2001). Reverse transcriptase template switching during reverse transcriptase-polymerase chain reaction: artificial generation of deletions in ribonucleotide reductase mRNA. J Lab Clin Med 137(6): 422-428.
  18. Martín-Alonso, S., Frutos-Beltrán, E. and Menéndez-Arias, L. (2021). Reverse Transcriptase: From Transcriptomics to Genome Editing. Trends Biotechnol 39(2): 194-210.
  19. Mohr, S., Ghanem, E., Smith, W., Sheeter, D., Qin, Y., King, O., Polioudakis, D., Iyer, V. R., Hunicke-Smith, S., Swamy, S., et al. (2013). Thermostable group II intron reverse transcriptase fusion proteins and their use in cDNA synthesis and next-generation RNA sequencing. RNA 19(7): 958-970.
  20. Nottingham, R. M., Wu, D. C., Qin, Y., Yao, J., Hunicke-Smith, S. and Lambowitz, A. M. (2016). RNA-seq of human reference RNA samples using a thermostable group II intron reverse transcriptase. RNA 22(4): 597-613.
  21. Onafuwa-Nuga, A. and Telesnitsky, A. (2009). The remarkable frequency of human immunodeficiency virus type 1 genetic recombination. Microbiol Mol Biol Rev 73(3): 451-480.
  22. Penno, C., Kumari, R., Baranov, P. V., van Sinderen, D. and Atkins, J. F. (2017). Stimulation of reverse transcriptase generated cDNAs with specific indels by template RNA structure: retrotransposon, dNTP balance, RT-reagent usage. Nucleic Acids Res 45(17): 10143-10155.
  23. Picelli, S., Björklund, Å. K., Faridani, O. R., Sagasser, S., Winberg, G. and Sandberg, R. (2013). Smart-seq2 for sensitive full-length transcriptome profiling in single cells. Nat Methods 10(11): 1096-1098.
  24. Qin, Y., Yao, J., Wu, D. C., Nottingham, R. M., Mohr, S., Hunicke-Smith, S. and Lambowitz, A. M. (2016). High-throughput sequencing of human plasma RNA by using thermostable group II intron reverse transcriptases. RNA 22(1): 111-128.
  25. Shen, P. S., Park, J., Qin, Y., Li, X., Parsawar, K., Larson, M. H., Cox, J., Cheng, Y., Lambowitz, A. M., Weissman, J. S., et al. (2015). Protein synthesis. Rqc2p and 60S ribosomal subunits mediate mRNA-independent elongation of nascent chains. Science 347(6217): 75-78.
  26. Shurtleff, M. J., Yao, J., Qin, Y., Nottingham, R. M., Temoche-Diaz, M. M., Schekman, R. and Lambowitz, A. M. (2017). Broad role for YBX1 in defining the small noncoding RNA composition of exosomes. Proc Natl Acad Sci U S A 114(43): E8987-E8995.
  27. Simon, D. M. and Zimmerly, S. (2008). A diversity of uncharacterized reverse transcriptases in bacteria. Nucleic Acids Res 36(22): 7219-7229.
  28. Stamos, J. L., Lentzsch, A. M. and Lambowitz, A. M. (2017). Structure of a Thermostable Group II Intron Reverse Transcriptase with Template-Primer and Its Functional and Evolutionary Implications. Mol Cell 68(5): 926-939.e924.
  29. Stamos, J. L., Lentzsch, A. M., Park, S. K., Mohr, G. and Lambowitz, A. M. (2019). Non-LTR retroelement reverse transcriptase and uses thereof. PCT/US2018/054147.
  30. Stark, R., Grzelak, M. and Hadfield, J. (2019). RNA sequencing: the teenage years. Nat Rev Genet 20(11): 631-656.
  31. Upton, H. E., Ferguson, L., Temoche-Diaz, M. M., Liu, X., Pimentel, S. C., Ingolia, N. T., Schekman, R. and Collins, K. (2021). Low-bias ncRNA libraries using ordered two-template relay: serial template jumping by a modified retroelement reverse transcriptase. BioRxiv doi: https://doi.org/10.1101/2021.04.30.442027.
  32. Wang, Z., Ma, Z., Castillo-González, C., Sun, D., Li, Y., Yu, B., Zhao, B., Li, P. and Zhang, X. (2018). SWI2/SNF2 ATPase CHR2 remodels pri-miRNAs via Serrate to impede miRNA production. Nature 557(7706): 516-521.
  33. Wu, D. C. and Lambowitz, A. M. (2017). Facile single-stranded DNA sequencing of human plasma DNA via thermostable group II intron reverse transcriptase template switching. Sci Rep 7(1): 8421.
  34. Wu, X. and Bartel, D. P. (2017). Widespread Influence of 3'-End Structures on Mammalian mRNA Processing and Stability. Cell 169(5): 905-917.e911.
  35. Xiong, Y. and Eickbush, T. H. (1990). Origin and evolution of retroelements based upon their reverse transcriptase sequences. Embo J 9(10): 3353-3362.
  36. Xu, H., Yao, J., Wu, D. C. and Lambowitz, A. M. (2019). Improved TGIRT-seq methods for comprehensive transcriptome profiling with decreased adapter dimer formation and bias correction. Sci Rep 9(1): 7953.
  37. Yao, J., Wu, D. C., Nottingham, R. M. and Lambowitz, A. M. (2020). Identification of protein-protected mRNA fragments and structured excised intron RNAs in human plasma by TGIRT-seq peak calling. Elife 9: e60743.
  38. Yao, J., Xu, H., Shelby, W., Wu, D. C., Ares, M. and Lambowitz, A. M. (2021). Human cells contain myriad excised linear introns with potential functions in gene regulation and as RNA biomarkers. BioRxiv doi: https://doi.org/10.1101/2020.09.07.285114.
  39. Zarnegar, B. J., Flynn, R. A., Shen, Y., Do, B. T., Chang, H. Y., and Khavari, P. A. (2016). irCLIP platform for efficient characterization of protein-RNA interactions. Nat Methods 13(6): 489-492.
  40. Zhao, C., Liu, F. and Pyle, A. M. (2018). An ultraprocessive, accurate reverse transcriptase encoded by a metazoan group II intron. RNA 24(2): 183-195.
  41. Zheng, G., Qin, Y., Clark, W. C., Dai, Q., Yi, C., He, C., Lambowitz, A. M. and Pan, T. (2015). Efficient and quantitative high-throughput tRNA sequencing. Nat Methods 12(9): 835-837.
  42. Zhu, Y. Y., Machleder, E. M., Chenchik, A., Li, R. and Siebert, P. D. (2001). Reverse transcriptase template switching: a SMART approach for full-length cDNA library construction. Biotechniques 30(4): 892-897.
  43. Zimmerly, S. and Wu, L. (2015). An Unexplored Diversity of Reverse Transcriptases in Bacteria. Microbiol Spectr 3(2): Mdna3-0058-2014.
  44. Zubradt, M., Gupta, P., Persad, S., Lambowitz, A. M., Weissman, J. S. and Rouskin, S. (2017). DMS-MaPseq for genome-wide or targeted RNA structure probing in vivo. Nat Methods 14(1): 75-82.


Introduction

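[Abstract] High-throughput RNA sequencing (RNA-seq) has extraordinarily advanced our understanding of gene expression and disease etiology, and is a powerful tool for the identification of biomarkers in a wide range of organisms. However, most RNA-seq methods rely on retroviral reverse transcriptases (RTs), enzymes that have inherently low fidelity and processivity, to convert RNAs into cDNAs for sequencing. Here, we describe an RNA-seq protocol using Thermostable Group II Intron Reverse Transcriptases (TGIRTs), which have high fidelity, processivity, and strand-displacement activity, as well as a proficient template-switching activity that enables efficient and seamless RNA-seq adapter addition. By combining these activities, TGIRT-seq enables the simultaneous profiling of all RNA biotypes from small amounts of starting material, with superior RNA-seq metrics, and unprecedented ability to sequence structured RNAs. The TGIRT-seq protocol for Illumina sequencing consists of three steps: (i) addition of a 3' RNA-seq adapter, coupled to the initiation of cDNA synthesis at the 3' end of a target RNA, via template switching from a synthetic adapter RNA/DNA starter duplex; (ii) addition of a 5' RNA-seq adapter, by using thermostable 5' App DNA/RNA ligase to ligate an adapter oligonucleotide to the 3' end of the completed cDNA; (iii) minimal PCR amplification, to add capture sites and indices for Illumina sequencing. TGIRT-seq for the Illumina sequencing platform has been used for comprehensive profiling of coding and non-coding RNAs in ribodepleted, chemically fragmented cellular RNAs, and for the analysis of intact (non-chemically fragmented) cellular, extracellular vesicle (EV), and plasma RNAs, where it yields continuous full-length end-to-end sequences of structured small non-coding RNAs (sncRNAs), including tRNAs, snoRNAs, snRNAs, pre-miRNAs, and full-length excised linear intron (FLEXI) RNAs.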

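Graphical abstract:

Figure 1. Overview of the TGIRT-seq protocol for Illumina sequencing.
The major steps are: (1) template switching from a synthetic R2 RNA/R2R DNA starter duplex with a 1-nt 3' DNA overhang (a mixture of A, C, G, and T residues, denoted N) that base pairs with the 3' nucleotide of a target RNA; upon initiation of reverse transcription by adding dNTPs, this seamlessly links the R2R adapter to the 5' end of the resulting cDNA; (2) ligation of an R1R adapter to the 3' end of the completed cDNA; and (3) minimal PCR amplification with primers that add Illumina capture sites (P5 and P7) and barcode sequences (indices 5 and 7). The index 7 barcode is required, whereas the index 5 barcode is optional and provides unique dual indices (UDIs).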

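[Background] Most RNA-seq methods rely on retroviral reverse transcriptases (RTs) to convert target RNAs into cDNAs, which can then be sequenced on high-throughput DNA sequencing platforms (Stark et al., 2019). However, retroviral RTs evolved to have inherently low fidelity and processivity, which help retroviruses evade host defenses by introducing frequent mutational variations and by rapidly propagating beneficial variations through RNA recombination, which involves dissociating from one template and reinitiating on another (Onafuwa-Nuga and Telesnitsky, 2009; Hu and Hughes, 2012). Although commercial retroviral RTs have been engineered to improve their processivity and thermostability, the ability to improve these enzymes is limited by the structural framework of retroviral RTs, and even highly engineered retroviral RTs have difficulty reverse transcribing structured RNAs (Martín-Alonso et al., 2021).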
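TGIRT-seq is a next-generation comprehensive RNA-seq method that exploits the beneficial biochemical properties of group II intron-encoded RTs, which are evolutionary ancestors of retroviral RTs (Belfort and Lambowitz, 2019). Group II intron RTs are primarily prokaryotic enzymes associated with bacterial retrotransposons called mobile group II introns. They evolved to function in retrohoming, a retrotransposition mechanism that requires reverse transcription of a highly structured group II intron RNA with high fidelity and processivity (Lambowitz and Belfort, 2015). They belong to a large family of RTs encoded by non-LTR-retroelements, which includes other bacterial RTs, retroplasmid RTs, and RVT RTs, as well as human LINE-1 element, insect R2 element, and other eukaryotic non-LTR-retrotransposon RTs (Xiong and Eickbush, 1990; Kojima and Kanehisa, 2008; Simon and Zimmerly, 2008; Gladyshev and Arkhipova, 2011; Zimmerly and Wu, 2015). Non-LTR-retroelement RTs differ from retroviral RTs in having a distinctive N-terminal extension (NTE) and two distinctive insertions (RT2a and RT3a) in the fingers and palm. Crystal structures of group II intron RTs show that these regions make multiple additional interactions with the RNA template and constrain the RT active site more tightly, in ways that could contribute to higher fidelity and processivity (Xiong and Eickbush, 1990; Blocker et al., 2005; Stamos et al., 2017). The NTE also plays a critical role in the proficient end-to-end template-switching activity that enables efficient and seamless RNA-seq adapter addition (Lentzsch et al., 2019 and 2021). TGIRTs from bacterial thermophiles combine these beneficial properties with the ability to function at high temperatures (≥60°C), which help melt stable RNA secondary structures (Mohr et al., 2013). Key to the use of TGIRTs and other non-LTR-retroelement RTs for RNA-seq and other applications was their fusion to a large solubility tag (e.g., maltose-binding protein), which enables them to be produced in large quantities and to remain soluble and stable during storage when freed of endogenous tightly bound nucleic acids (Mohr et al., 2013; Upton et al., 2021). As thousands of group II intron and other non-LTR-retroelement RTs have been identified, we have likely only scratched the surface in identifying optimal enzymes for RNA-seq applications.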
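Figure 1 outlines the TGIRT-seq protocol using TGIRT-III (InGex; a proprietary version of Geobacillus stearothermophilus GsI-IIC RT; Mohr et al., 2013) for the simultaneous profiling of all coding and non-coding RNA biotypes in cellular, extracellular vesicle (EV), and human plasma RNA preparations without size selection (also referred to as the TGIRT Total RNA-seq method). The protocol described here is an updated version used by Yao et al. (2020), based on earlier versions described in Qin et al. (2016), Nottingham et al. (2016), and Xu et al. (2019).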
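In the first step, which initiates reverse transcription, the TGIRT enzyme template switches from a synthetic RNA template/DNA primer starter duplex containing an RNA-seq adapter sequence directly to the 3' end of a target RNA, thereby linking the reverse complement of the adapter sequence to the 5' end of the cDNA (Mohr et al., 2013; Qin et al., 2016). The RNA/DNA starter duplex has a 1-nt 3' DNA overhang that directs template switching with high specificity by base pairing to the 3' nucleotide of the target RNA, resulting in a seamless template-switching junction (Lentzsch et al., 2019 and 2021). For Illumina sequencing, the starter duplex consists of a 35-nt RNA oligonucleotide that contains an Illumina Read 2 sequence (denoted R2 RNA) and has a 3'-blocking group (C3 spacer, denoted 3SpC3), annealed to a 36-nt DNA primer that contains the reverse complement of the Read 2 sequence (denoted R2R DNA), leaving a 1-nt 3' DNA overhang. For comprehensive RNA-seq of RNA pools, the 1-nt 3' DNA overhang is a mixture of A, C, G, and T residues (denoted N), and the starter duplex is added in excess of the amount of RNA template. Reverse transcription is typically done for 15 min at 60°C, the optimal temperature for GsI-IIC RT (Mohr et al., 2013), but different times and lower or higher temperatures can be used for different applications (e.g., Zheng et al., 2015; Behrens et al., 2021). Reverse transcription is terminated by adding NaOH, which degrades the RNA template, followed by neutralization with HCl and removal of unused R2R DNA by a MinElute cDNA clean-up step.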
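In the next step, a second RNA-seq adapter containing the reverse complement of an Illumina Read 1 sequence (denoted R1R DNA) is ligated to the 3' end of the cDNA by single-stranded DNA ligation using thermostable 5' App DNA/RNA ligase. This is followed by minimal PCR amplification (no more than 12 cycles) with primers that add capture sites and indices for Illumina sequencing. Using this protocol, comprehensive TGIRT-seq libraries can be prepared from as little as 0.5 ng of human plasma RNA in about 5 h through completion of the PCR step.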
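TGIRT-seq can be done with rRNA-depleted cellular RNAs that are chemically fragmented or intact, or with total RNA from EVs or plasma. Compared to benchmark TruSeq v3 datasets of rRNA-depleted, chemically fragmented Universal Human Reference RNA (UHRR) with External RNA Control Consortium (ERCC) spike-ins, TGIRT-seq: (i) better recapitulates the relative abundance of mRNAs and ERCC spike-ins; (ii) is more strand-specific; (iii) gives more uniform 5' to 3' gene body coverage and detects more splice junctions, particularly near the 5' ends of genes; and (iv) eliminates sequence biases due to random hexamer priming, which are inherent in TruSeq (Nottingham et al., 2016). Subsequent improvements of the TGIRT-seq method include the use of modified RNA-seq adapter sequences that strongly decrease adapter-dimer formation (Xu et al., 2019) and the use of a lower salt concentration in the template-switching and reverse transcription steps to increase library yield (Yao et al., 2020). If desired, unique molecular identifiers (UMIs) can be incorporated into the R1R adapter oligonucleotide (Yao et al., 2020). Although used thus far primarily for Illumina sequencing, the RNA-seq adapter sequences can be readily reformatted for other high-throughput DNA sequencing platforms.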
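The TGIRT template-switching reaction used for 3' RNA-seq adapter addition is the defining step of the TGIRT-seq method, whereas 5' RNA-seq adapter addition using thermostable 5' App DNA/RNA ligase could in principle be replaced by alternative methods. Despite relying on a single base pair between the 1-nt 3' DNA overhang of the starter duplex and the 3' nucleotide of the target RNA, RNA-seq adapter addition by TGIRT template switching is highly specific, yielding 97.5-99.7% precise junctions, depending upon the base pair (Lentzsch et al., 2019). This high specificity reflects the fact that the 3' end of the acceptor RNA binds in a pocket formed by the NTE and the RT fingertips loop, which promotes annealing of the 3' nucleotide of the acceptor RNA to the complementary 1-nt 3' DNA overhang of the starter duplex and positions the penultimate nucleotide of the acceptor as the templating RNA base at the RT active site (Lentzsch et al., 2021). A second base-pairing interaction, between the templating base and a complementary incoming dNTP, leads to a conformational change required for the initiation of reverse transcription, an irreversible step that drives the reaction forward and ensures high specificity (Lentzsch et al., 2021). The same template-switching pocket is not present in retroviral RTs, which lack an NTE. This likely results in weaker binding of the acceptor, a greater dependence on base-pairing interactions, and a propensity for artifacts due to template switching to alternative sites containing complementary nucleotides (e.g., Mader et al., 2001; Cocquet et al., 2006).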
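After reverse transcription, TGIRT enzymes can add non-templated nucleotides to the 3' end of the completed cDNA, yielding a 3' DNA overhang that enables template switching to a second RNA template with a complementary 3' nucleotide. Such secondary template switches are suppressed by using a relatively high salt concentration in the reaction medium (200 mM NaCl in the updated protocol and 450 mM NaCl in earlier versions; Mohr et al., 2013; Lentzsch et al., 2019; Yao et al., 2020). A longer-term solution may be provided by a recently described mutation that specifically inhibits non-templated nucleotide addition, thereby selectively suppressing secondary template switches from a completed cDNA, but not template switching from a starter duplex with a preformed 1-nt DNA overhang (Lentzsch et al., 2021). Secondary template switching can also be used for 5' RNA-seq adapter addition (Zhu et al., 2001; Picelli et al., 2013), and recent detailed biochemical and structural analyses of the TGIRT template-switching reaction could be used to refine such a method for TGIRT-seq (Lentzsch et al., 2019 and 2021).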
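Sequence biases in TGIRT-seq, determined using TGIRT-III in reaction medium containing 450 mM NaCl at 60°C, are confined to the first three nucleotides from the 5' end of the RNA, corresponding to known sequence biases of the thermostable 5' App DNA/RNA ligase (Hafner et al., 2011), and to the 3' nucleotide of the target RNA, reflecting preferences for, or differences in the stability of, different base pairs with the 1-nt 3' DNA overhang of the starter duplex used for TGIRT template switching (Xu et al., 2019). The 5'-adapter ligation step, which uses the thermostable 5' App DNA/RNA ligase at high temperature, shows no major "co-folding" bias due to base pairing between the adapter and acceptor nucleic acids, and is not improved by using adapters with randomized nucleotides near their 5' ends, as in the 4N protocols used for other ligases (Xu et al., 2019). The template-switching bias for the 3' nucleotide of the target RNA can be remediated by using unequal ratios of 3'-overhang nucleotides in the starter duplex to compensate for base-pairing biases, and overall biases can be corrected computationally by using a bias-correction algorithm based on a random forest regression model of all factors that contribute to bias (Mohr et al., 2013; Xu et al., 2019). Additionally, biochemical assays showed that differences in the rate and amplitude of TGIRT template switching to acceptor RNAs with different 3' nucleotides are smaller at 200 mM than at 450 mM NaCl, suggesting that template-switching biases may be decreased by the lower salt concentration used in the updated protocol (Lentzsch et al., 2019). TGIRT-III is more efficient than retroviral RTs at reverse transcribing through expanded GC-rich repeats, such as those characteristic of myotonic dystrophy and familial amyotrophic lateral sclerosis (Carrell et al., 2018), but is more prone to indels at homopolymer runs, particularly if followed by a stable hairpin structure (Penno et al., 2017). The latter may reflect that, compared to retroviral RTs, TGIRT enzymes are less likely to dissociate at stable RNA secondary structures and instead continue to incorporate nucleotides by slippage until they can read through these impediments. TGIRT-III does not template switch efficiently to the 3' poly(A) tail of eukaryotic mRNAs (Yao et al., 2021).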
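In ribodepleted, chemically fragmented RNA preparations, TGIRT-III gives reliable quantitation of RNA spike-ins ≥60 nt (Nottingham et al., 2016; Boivin et al., 2018). However, in RNA preparations of heterogeneous size, TGIRT-seq under-represents RNAs shorter than 60 nt, particularly very small RNAs such as miRNAs and short tRNA fragments (tRFs; Boivin et al., 2018; Yao et al., 2020), so that orthogonal methods, such as RT-qPCR or hybridization-based assays, are needed to determine their relative abundance. This size bias occurs in the initial template-switching step used for 3' RNA-seq adapter addition and may reflect that 5' regions of longer RNAs extending outside the template-switching pocket can bind to additional sites on the outer surface of the protein (Lentzsch et al., 2021). These additional sites likely correspond to basic surface residues that function in binding group II intron RNAs for RNA splicing and retrohoming but are not required for reverse transcription, and they should thus be relatively easy to remove without affecting RNA-seq performance (Zhao et al., 2018; Stamos et al., 2019). Additionally, this size bias appears to be much lower for another TGIRT enzyme (TeI4c RT; Mohr et al., 2013), which is not yet sold commercially.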
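Whereas TGIRT-seq of ribodepleted, chemically fragmented RNAs is best for quantitation of mRNAs, TGIRT-seq of intact (i.e., non-chemically fragmented) RNA preparations can be used for profiling tRNAs and other structured sncRNAs, for which TGIRT-seq yields full-length, end-to-end sequences (Katibah et al., 2014; Shurtleff et al., 2017; Yao et al., 2020). An initial TGIRT-based protocol for tRNA-seq used gel purification and template switching to RNAs with a 3'-terminal A residue, in combination with an RNA demethylase, to obtain full-length reads of mature tRNAs (Zheng et al., 2015). However, the simpler TGIRT Total RNA-seq protocol described here also gives mostly full-length, end-to-end sequences of mature tRNAs and tRNA fragments without demethylase treatment (Katibah et al., 2014; Qin et al., 2016; Shurtleff et al., 2017), as does a recent TGIRT-based tRNA-seq method (mim-tRNAseq; Behrens et al., 2021). TGIRT enzymes are highly processive and, given sufficient time, pause at post-transcriptional modifications that affect base pairing until they can read through, leaving distinctive misincorporation patterns that can be used to identify the modification (Katibah et al., 2014; Shen et al., 2015; Zheng et al., 2015; Qin et al., 2016). This ability to read through by misincorporation has also been used to map chemical modifications for RNA-structure mapping in procedures such as DMS-MaPseq (Zubradt et al., 2017; Wu and Bartel, 2017; Wang et al., 2018). The ability of TGIRT-III to give full-length tRNA sequences was key to demonstrating that mature tRNAs rather than tRNA fragments predominate in human plasma and EVs (Qin et al., 2016; Shurtleff et al., 2017; Yao et al., 2020), and to distinguishing mature and precursor tRNAs bound by the human interferon-induced protein IFIT5 (Katibah et al., 2014). TGIRT template switching has also been used to measure levels of tRNA aminoacylation, which blocks the 3' end of the tRNA for template switching (Evans et al., 2017).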
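In addition to tRNAs, TGIRT-seq gives full-length, end-to-end sequences of other structured sncRNAs, enabling the identification of novel snoRNAs in cellular RNA preparations (Boivin et al., 2020) and distinguishing pre-miRNA hairpins from mature miRNAs in human plasma (Qin et al., 2016; Yao et al., 2020). In recent work, TGIRT-seq of human plasma and cellular RNAs revealed the presence of thousands of short full-length excised linear intron (FLEXI) RNAs, many of which have stable predicted RNA secondary structures that would make them difficult to identify by other methods (Yao et al., 2020 and 2021). TGIRT enzymes and variations of the TGIRT-seq method have also been used for high-throughput sequencing of protein-bound RNAs or RNA fragments by RIP-seq and irCLIP (Katibah et al., 2014; Zarnegar et al., 2016), and for ssDNA-seq of human plasma DNA, which provides information about nucleosome positioning and DNA methylation sites that can be used to identify tissue of origin (Wu and Lambowitz, 2017). Going forward, we anticipate continued improvement of the current version of the TGIRT-seq method by using the structural and biochemical information now available for GsI-IIC RT (TGIRT-III) to enhance methods for 5' and 3' RNA-seq adapter addition, as well as by using other natural and engineered versions of TGIRTs and other non-LTR-retroelement RTs.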

Keywords: Group II intron reverse transcriptase, Non-LTR-retroelement, Reverse transcriptase, RNA-seq, Template switching, Transcriptomics, Illumina sequencing

Materials and Reagents

 

Use RNA-grade, nuclease-free reagents and solutions. Store solutions in frozen aliquots to avoid repeated freeze-thaw cycles.

  1. 1.5 ml DNA LoBind microcentrifuge tubes (Eppendorf, catalog number: 022431021)
  2. 2 ml DNA LoBind microcentrifuge tubes (Eppendorf, catalog number: 022431048)
  3. Ep Dualfilter TIPS LoRetention 0.1-10 μl (Eppendorf, catalog number: 0030078632)
  4. Ep Dualfilter TIPS LoRetention 2-100 μl (Eppendorf, catalog number: 0030078659)
  5. Ep Dualfilter TIPS LoRetention 20-300 μl (Eppendorf, catalog number: 0030078675)
  6. Ep Dualfilter TIPS LoRetention 50-1,000 μl (Eppendorf, catalog number: 0030078683)
  7. 0.2 ml PCR reaction tubes with attached flat caps (Simport, catalog number: T3202N)
  8. Dithiothreitol (DTT), 1 M (Thermo Fisher Scientific, catalog number: P2325)
  9. dNTP mix, 25 mM each (Thermo Fisher Scientific, catalog number: R1122). Dilute to 20 mM with RNase-free water before use
  10. TGIRT-III enzyme (InGex, catalog number: TGIRT50)

Note: Store TGIRT-III enzyme received from the vendor at -80°C until ready for use. Store opened tubes at -20°C. TGIRT-III may lose activity after storage at -20°C for 3 months (Behrens et al., 2021).

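  11. MinElute Reaction Cleanup kit (QIAGEN, catalog number: 28204 or 28206) or MinElute PCR Purification kit (QIAGEN, catalog number: 28004 or 28006)
  12. 5' DNA Adenylation kit (New England Biolabs, catalog number: E2610S/L)
  13. Oligo Clean & Concentrator kit (Zymo Research, catalog number: D4060/4061)
  14. Thermostable 5' AppDNA/RNA Ligase (New England Biolabs, catalog number: M0319S/L)
  15. Phusion High-Fidelity PCR Master Mix with HF Buffer (Thermo Fisher Scientific, catalog number: F531S/L)
  16. AMPure XP (Beckman Coulter, catalog number: A63881)
  17. High Sensitivity DNA kit (Agilent, catalog number: 5067-4626)
  18. (Optional) RNA 6000 Pico kit (Agilent, catalog number: 5067-1513)
  19. (Optional) Small RNA kit (Agilent, catalog number: 5067-1548)
  20. Oligonucleotides: all oligonucleotides should be HPLC-purified, RNase-free grade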
    1. R2 RNA

      5'-rArArG rArUrC rGrGrA rArGrA rGrCrA rCrArC rGrUrC rUrGrA rArCrU rCrCrA rGrUrC rArC/3SpC3/-3'

      Note: Other blocking groups can also be used, e.g., 3' Amino Modifier C6 dT (3AmMC6T) from IDT.

  1. R2R DNA

5'-GTG ACT GGA GTT CAG ACG TGT GCT CTT CCG ATC TTN-3' (N = equimolar A, T, G, C)

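Note: The R2R DNA used in the current TGIRT-seq protocol differs from that used in earlier versions. It has a single-nucleotide change (insertion of a T residue at position -3 from the 3' end) that strongly decreases the formation of R1R-R2R adapter dimers during the ligation step (Xu et al., 2019). A complementary A residue is inserted at the corresponding position of the R2 RNA (see above).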

  1. R1R DNA

5'-/5Phos/GAT CGT CGG ACT GTA GAA CTC TGA ACG TGT AG/3SpC3/-3'

Note: The Read 1 (R1) sequence corresponds to the small RNA sequencing primer site used in the NEBNext Small RNA Library Prep Set for Illumina sequencing.

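  1. 6N unique molecular identifier (UMI) R1R DNA

5'-/5Phos/NNN NNN GAT CGT CGG ACT GTA GAA CTC TGA ACG TGT AG/3SpC3/-3'

Note: UMI nucleotides (machine-mixed equimolar A, C, G, and T residues, denoted N) are added at the 5' end of the R1R sequence. The number of N nucleotides can be changed to suit the complexity of the sample being sequenced and the number of duplicates expected after PCR.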

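  1. Illumina barcode PCR primer (P5)

5'-AAT GAT ACG GCG ACC ACC GAG AT BARCODE C TAC ACG TTC AGA GTT CTA CAG TCC GAC GAT C-3'

Note: The barcode sequence in this primer is the sense strand (e.g., ATCACG for the TruSeq Barcode 01 (TSBC01) primer listed on the Illumina website). This barcode is optional, but it is recommended in order to provide unique dual indexes (UDIs) for libraries sequenced on NovaSeq instruments.

  1. Illumina barcode PCR primer (P7)

5'-CAA GCA GAA GAC GGC ATA CGA GAT BARCODE GTG ACT GGA GTT CAG ACG TGT GCT CTT CCG ATC T-3'

Note: The barcode sequence in this primer should be the reverse complement of the barcode listed on the Illumina website (e.g., CGTGAT for the TSBC01 primer).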

  1. UltraPure DNase/RNase-free distilled water (Thermo Fisher Scientific, catalog number: 10977015) or an equivalent from another company or an in-house source
  2. Trizma hydrochloride solution (Tris-HCl), pH 7.5, 2 M (Sigma-Aldrich, catalog number: T2944)
  3. EDTA, 0.5 M, pH 8.0, RNase-free (Thermo Fisher Scientific, catalog number: AM9260G)
  4. Sodium chloride solution, 5 M, RNase-free (Thermo Fisher Scientific, catalog number: AM9760G)
  5. Magnesium chloride solution, BioUltra for molecular biology, 2 M (Sigma-Aldrich, catalog number: 68475-100ML-F)
  6. RNA Century-Plus Markers (e.g., Thermo Fisher Scientific, catalog number: AM7145)
  7. AMPure XP beads (Beckman Coulter, catalog number: A63880)
  8. mirVana miRNA Isolation Kit (Thermo Fisher Scientific, catalog number: AM1560)
  9. Total Exosome RNA and Protein Isolation Kit (Thermo Fisher Scientific, catalog number: 4478545)
  10. TRIzol LS Reagent (Thermo Fisher Scientific, catalog number: 10296010)
  11. Turbo DNase (Thermo Fisher Scientific, catalog number: AM2238)
  12. DNase I (Zymo Research, catalog number: E1010)
  13. Exonuclease I (Lucigen, catalog number: X40520K)
  14. Illumina Ribo-Zero Plus rRNA Depletion Kit (Illumina, catalog number: 20040526)
  15. NEBNext Magnesium RNA Fragmentation Module (New England Biolabs, catalog number: E6150S)
  16. RNA Clean & Concentrator-5 (Zymo Research, catalog number: R1013)
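
The barcode-orientation and UMI notes in the oligonucleotide list can be checked with a short script before ordering primers. What follows is a minimal Python sketch, not part of the published protocol: the helper names (`reverse_complement`, `p5_primer`, `p7_primer`, `umi_diversity`) are hypothetical, the constant segments are copied from the sequences listed above with spaces removed, and the example barcode is the TSBC01 sequence quoted in the notes.

```python
# Illustrative helpers (assumed names, not from the protocol) for assembling
# the barcoded PCR primers and counting the UMIs available for a given length.

COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A", "N": "N"}

# Constant primer segments copied from the oligonucleotide list (spaces removed).
P5_PREFIX = "AATGATACGGCGACCACCGAGAT"
P5_SUFFIX = "CTACACGTTCAGAGTTCTACAGTCCGACGATC"
P7_PREFIX = "CAAGCAGAAGACGGCATACGAGAT"
P7_SUFFIX = "GTGACTGGAGTTCAGACGTGTGCTCTTCCGATCT"
R1R = "GATCGTCGGACTGTAGAACTCTGAACGTGTAG"


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (A, C, G, T, N only)."""
    return "".join(COMPLEMENT[base] for base in reversed(seq.upper()))


def p5_primer(barcode: str) -> str:
    """P5 primer: the barcode is inserted as listed (sense strand)."""
    return P5_PREFIX + barcode.upper() + P5_SUFFIX


def p7_primer(barcode: str) -> str:
    """P7 primer: the listed barcode is inserted as its reverse complement."""
    return P7_PREFIX + reverse_complement(barcode) + P7_SUFFIX


def umi_diversity(n_random: int) -> int:
    """Number of distinct UMIs for n_random equimolar N positions."""
    return 4 ** n_random


if __name__ == "__main__":
    print(p5_primer("ATCACG"))
    print(p7_primer("ATCACG"))
    print(umi_diversity(6))
```

The 3' segment of the P5 primer is the reverse complement of R1R, so the same helper can serve as a consistency check on the adapter and primer sequences.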

 

Equipment

 

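  1. -20 °C freezer
  2. T100 96-well PCR thermal cycler (Bio-Rad, catalog number: 1861096), Veriti 9044 60-well thermal cycler (Applied Biosystems, catalog number: 4384638), or an equivalent thermal cycler with a heated lid and stable temperature control
  3. Microcentrifuge (Eppendorf, catalog number: 2231000768)
  4. DynaMag-2 magnet (Thermo Fisher Scientific, catalog number: 12321D)
  5. 2100 Bioanalyzer (Agilent, catalog number: G2939BA)
  6. Chip priming station (Agilent, catalog number: 5065-4401)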

 

Software

 

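  1. 2100 Expert Software/upgrade (Agilent, catalog number: G2946CA)

Note: The Bioanalyzer software should come bundled with the instrument. If it does not, an upgrade can be purchased from Agilent.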

 

Procedure

 

  1. Preparation of 10× Starter Duplex (Table 1)

 

Table 1. Components required for 10× (1 μM) Starter Duplex

 

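  1. Prepare 10 µM R2R DNA containing equimolar proportions of A, C, G, and T residues at the 1-nt DNA overhang position by hand-mixing equal volumes of the four individual R2R DNA oligonucleotides with a 3' A, C, G, or T residue (10 µM each) in a 1.5 ml Eppendorf LoBind microcentrifuge tube or equivalent, then vortex. Aliquot the stock and store at -20 °C until use. 50 μl of 10× Starter Duplex is sufficient for 25 reactions (2 μl per reaction).

Note: R2R DNA with unequal proportions of the 1-nt 3' DNA overhang nucleotides can be used to mitigate template-switching bias in TGIRT-seq (Mohr et al., 2013; Xu et al., 2019).

  1. Set up the reaction components above in a sterile PCR tube for annealing of the oligonucleotides to form the starter duplex.
  2. Incubate the mixed 10× Starter Duplex components at 82 °C for 2 min in a pre-heated thermal cycler.
  3. Cool to 25 °C with a 10% ramp or at a rate of 0.1 °C/s.

Note: The annealed R2 RNA/R2R DNA Starter Duplex should be prepared fresh before each template-switching reaction. Discard any leftover after use.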
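
For planning purposes, the cooling step and the stock volume in this section reduce to simple arithmetic: cooling from 82 °C to 25 °C at 0.1 °C/s takes about 570 s (9.5 min), and 50 μl of 10× Starter Duplex at 2 μl per reaction covers 25 reactions. A minimal Python sketch of these two calculations follows; the function names are hypothetical and are not part of the protocol.

```python
# Hypothetical planning helpers for the starter-duplex annealing step.


def ramp_time_seconds(start_c: float, end_c: float, rate_c_per_s: float) -> float:
    """Time needed to move between two temperatures at a constant ramp rate."""
    if rate_c_per_s <= 0:
        raise ValueError("ramp rate must be positive")
    return abs(start_c - end_c) / rate_c_per_s


def reactions_supported(stock_ul: float, per_reaction_ul: float) -> int:
    """Whole number of reactions that a stock volume can supply."""
    if per_reaction_ul <= 0:
        raise ValueError("per-reaction volume must be positive")
    return int(stock_ul // per_reaction_ul)


if __name__ == "__main__":
    seconds = ramp_time_seconds(82, 25, 0.1)
    print(f"cooling time: {seconds:.0f} s ({seconds / 60:.1f} min)")
    print(f"reactions from 50 ul stock: {reactions_supported(50, 2)}")
```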

 

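  1. Template-switching reverse transcription reaction
  1. Set up the following reaction components in a sterile PCR tube, adding the TGIRT-III enzyme last (Table 2).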

 

Table 2. Components required for the template-switching reverse transcription reaction

 

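Notes:

  1. Because TGIRT enzymes can also template-switch to DNA, RNA samples should be DNase-treated before TGIRT-seq. The minimum RNA inputs for DNase-treated RNA samples are: 2 ng of rRNA-depleted, chemically fragmented cellular RNA; 20-50 ng of rRNA-depleted, unfragmented cellular RNA; 500 pg of plasma RNA; and 2 ng of RNA extracted from highly purified extracellular vesicles. Measure low RNA concentrations with a Qubit or Bioanalyzer. For more details on preparing RNA samples for TGIRT-seq, see Notes 1-4 at the end of the text.
  2. We use 200 mM NaCl in this protocol, instead of the 450 mM NaCl used in earlier versions, to increase the efficiency of template switching without unduly increasing multiple sequential template switches, which result in artifactual fusion reads. See also Note 5 below.
  3. Run a pilot experiment to determine the optimal enzyme concentration for your samples, using an Agilent Bioanalyzer or equivalent instrument to monitor library quality and quantity.
  4. A template-switching reverse transcription reaction of a commercial RNA ladder (e.g., Thermo Fisher Scientific, catalog number: AM7145) with TGIRT-III enzyme can be carried through the entire procedure as a positive control.
  1. Pre-incubate at room temperature for 30 min, then initiate template switching and reverse transcription by adding 1 μl of 20 mM dNTPs (an equimolar mix of dATP, dCTP, dGTP, and dTTP at 20 mM each).
  2. Incubate at 60 °C for 15 min for whole-cell, EV, or plasma RNAs, for 5-10 min for mature miRNAs, and for up to 60 min for long or heavily modified RNAs. The optimal incubation time for different RNA templates should be determined experimentally.
  3. Add 1 μl of 5 M NaOH and incubate at 95 °C for 3 min.
  4. Cool to room temperature and neutralize with 1 μl of 5 M HCl.

Note: Steps B4 and B5 are required because TGIRT-III binds very tightly to nucleic acids. Alkaline hydrolysis degrades the RNA template and the enzyme, but not the cDNA product.

  1. Add 78 μl of nuclease-free water to bring the final volume to 100 μl.
  2. Clean up the cDNA with a MinElute Reaction Cleanup Kit (QIAGEN, catalog number: 28204) or a MinElute PCR Purification Kit (QIAGEN, catalog number: 28004), and elute in 10 μl of QIAGEN elution buffer.

Note: Incubate the column with elution buffer at room temperature before centrifugation to maximize recovery.

The procedure can be interrupted here, with the cDNA stored at -20 °C.

  1. Proceed to R1R adenylation, thermostable ligation, and Phusion PCR amplification.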
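
The volumes in the hydrolysis and cleanup steps of this section can be tracked with a small ledger. The Python sketch below is illustrative only and rests on one stated assumption: a 20 μl reaction volume after dNTP addition, which is the value implied by adding 1 μl of NaOH, 1 μl of HCl, and 78 μl of water to reach 100 μl (Table 2 itself is not reproduced here). The function names are hypothetical.

```python
# Hypothetical volume ledger for the alkaline hydrolysis and dilution steps.
# ASSUMPTION: 20 ul reaction volume after dNTP addition (inferred from the
# 78 ul of water needed to reach 100 ul; not taken from Table 2).


def water_to_add_ul(reaction_ul: float = 20.0, naoh_ul: float = 1.0,
                    hcl_ul: float = 1.0, target_ul: float = 100.0) -> float:
    """Water needed to bring the neutralized reaction to the target volume."""
    water = target_ul - (reaction_ul + naoh_ul + hcl_ul)
    if water < 0:
        raise ValueError("reaction already exceeds the target volume")
    return water


def naoh_molarity(reaction_ul: float = 20.0, naoh_ul: float = 1.0,
                  stock_molar: float = 5.0) -> float:
    """Approximate NaOH concentration during the 95 C hydrolysis step."""
    return stock_molar * naoh_ul / (reaction_ul + naoh_ul)


if __name__ == "__main__":
    print(f"water to add: {water_to_add_ul():.0f} ul")
    print(f"NaOH during hydrolysis: {naoh_molarity():.2f} M")
```

If the reaction is scaled, the NaOH and HCl volumes should be scaled together so that the neutralization stays equimolar.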

 

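  1. R1R 5' DNA adenylation using the New England Biolabs kit (New England Biolabs, catalog number: E2610S/L)

Note: If the adenylated oligonucleotide is purchased from a commercial supplier, proceed to the ligation step.

  1. Set up the following reaction components in a sterile PCR tube (Table 3):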

 

Table 3. Components required for 5' DNA adenylation

 

Note: The New England Biolabs kit includes 10× DNA adenylation buffer, 1 mM ATP, and Mth RNA ligase.

 

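  1. Incubate at 65 °C for 1 h.
  2. Incubate at 85 °C for 5 min to inactivate the enzyme.
  3. Clean up with an Oligo Clean & Concentrator Kit (Zymo Research, catalog number: D4060) and elute in 10 μl of nuclease-free water, for a final concentration of 10 μM 5'-adenylated R1R DNA.

Note: If a large number of TGIRT-seq libraries is needed, we recommend scaling up the adenylation by running multiple (e.g., 6 to 8) 20-μl reactions with the same amounts of enzyme and oligonucleotide and then combining the products for cleanup, because the higher elution volume helps recover the adenylated oligonucleotide efficiently. After cleanup, the recovered material should be 80-100 ng/μl for R1R and 100-120 ng/μl for 6N UMI R1R, as measured with a Nanodrop spectrophotometer on the ssDNA setting. Adenylation can be monitored on an Agilent Bioanalyzer with a Small RNA Kit. Adenylated oligonucleotides can be stored at -20 °C for up to two weeks.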

 

 

 

  1. Thermostable ligation of the 5'-adenylated adapter (New England Biolabs, catalog number: M0319S/L)
  1. Set up the following reaction components in a sterile PCR tube (Table 4):

 

Table 4. Components required for thermostable ligation of the 5' adapter

 

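  1. Incubate at 65 °C for 1-2 h.

Note: We recommend a 2-h ligation. However, a 1-h ligation can be used if the starting material is abundant.

  1. Add 80 μl of nuclease-free water for a final volume of 100 μl.
  2. Clean up the ligated cDNA with a MinElute Reaction Cleanup Kit (QIAGEN, catalog number: 28204) and elute in 23 μl of QIAGEN elution buffer (incubate the column with elution buffer at room temperature before centrifugation to maximize recovery).

Note: The procedure can be interrupted here, with the ligated cDNA stored at -20 °C.

  1. Proceed to Phusion PCR amplification.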

 

  1. PCR amplification (Thermo Fisher Scientific, catalog number: F531S/L)
  1. Set up the following reaction components in a sterile PCR tube (Table 5):

 

Table 5. Components required for PCR amplification

 

  1. PCR cycling:
  1. Initial denaturation at 98 °C for 5 s.
  2. Up to 12 cycles of 98 °C for 5 s, 60 °C for 10 s, and 72 °C for 15-30 s/kb, followed by a hold at 4 °C.

 

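Notes:

  1. Minimizing the number of PCR cycles reduces bias and duplicate reads. For TGIRT-seq, 12 PCR cycles are generally satisfactory with the RNA inputs indicated in Section B. The number of cycles can be increased or decreased for different RNA inputs and library complexities.
  2. The procedure can be interrupted here, with the PCR products stored at -20 °C.
  1. Use AMPure XP beads (Beckman Coulter, catalog number: A63880) to remove adapter dimers and enrich for the desired DNA sizes in the library. The default ratio is 1.4× v/v (70 μl of beads per 50 μl PCR reaction). The ratio of bead to sample volume can be adjusted according to the size profile of the RNAs being sequenced.
  2. Check library quality by analyzing 1 μl on a Bioanalyzer with a High Sensitivity DNA analysis kit (Figure 2).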

 

 

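Figure 2. Bioanalyzer traces of representative TGIRT-seq libraries obtained on an Agilent Bioanalyzer with a High Sensitivity DNA kit.

A. Ribodepleted and chemically fragmented HeLa S3 cellular RNA (20 ng of starting material after ribodepletion). Library products from cellular RNAs fragmented to 70-90 nt run at 190-210 bp. B. Plasma RNA (0.5 ng of starting material). Library products run at 140-240 bp, with the peak at ~200 bp corresponding to library products derived from tRNAs. Residual adapter dimers (AD) are present at low concentrations in the final libraries; they run at ~120 bp for R1R-R2R and slightly larger with UMI R1R adapters or unique dual index (UDI) PCR primers. Peaks smaller than 100 bp correspond to unused PCR primers and adapter oligonucleotides (P+A). The lower (35 bp) and upper (10,380 bp) markers are internal markers provided with the High Sensitivity DNA kit. DNA lengths (bp) are determined by the software from the ladder provided with the kit.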

 

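Notes:

  1. If needed, an additional round of 1.4× AMPure XP bead cleanup can be performed to further deplete adapter dimers. More extensive AMPure XP bead cleanup leads to loss of library product.
  2. The final TGIRT-seq libraries can be stored at -20 °C until they are submitted for sequencing.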
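
The bead ratio and the per-kilobase extension time in this section scale linearly with sample volume and product length. The following Python sketch shows the arithmetic only; the function names are hypothetical, and the 200 bp product length in the usage example is taken from the approximate library sizes in Figure 2 rather than from a protocol setting.

```python
# Hypothetical helpers for the AMPure XP cleanup and PCR extension arithmetic.


def bead_volume_ul(sample_ul: float, ratio: float = 1.4) -> float:
    """Bead volume for a given sample volume at a v/v bead-to-sample ratio."""
    if sample_ul <= 0 or ratio <= 0:
        raise ValueError("sample volume and ratio must be positive")
    return sample_ul * ratio


def extension_seconds(product_bp: int, seconds_per_kb: float) -> float:
    """Extension time implied by a per-kilobase rate for a given product length."""
    if product_bp <= 0 or seconds_per_kb <= 0:
        raise ValueError("product length and rate must be positive")
    return product_bp * seconds_per_kb / 1000


if __name__ == "__main__":
    print(f"beads for a 50 ul PCR at 1.4x: {bead_volume_ul(50):.0f} ul")
    low = extension_seconds(200, 15)
    high = extension_seconds(200, 30)
    print(f"extension for a ~200 bp product: {low:.0f}-{high:.0f} s")
```

Because these products are short, the calculated extension time is only a few seconds; a thermal cycler may impose a longer practical minimum.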

 

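  1. Sequencing and data analysis

TGIRT-seq libraries can be sequenced on any Illumina sequencing instrument to obtain single-end (SE) or paired-end (PE) reads. Read length and read depth depend on the needs of the experiment. We have used PE75 on a NextSeq 500, with an output of 330 million reads, for the analysis of cellular, EV, and plasma RNAs. We have also used PE150 on a NovaSeq, with an output of at least 700 million reads, for large projects or samples that require high-depth analysis.

Because TGIRT-seq profiles coding and non-coding RNAs simultaneously, read-mapping pipelines for TGIRT-seq typically use a sequential mapping strategy with specialized databases. Reads are first mapped to rRNA and tRNA sequences, which are encoded at multiple loci, and to sncRNAs embedded in the introns of protein-coding genes, before the remaining reads are mapped to a human genome reference sequence. In addition to end-to-end alignment, a local alignment step is usually included to capture more efficiently those reads with non-templated nucleotides added to the 3' end of the cDNA by TGIRT enzymes. Detailed protocols for read processing and mapping, including UMI deconvolution of duplicate reads and peak calling to identify protein-protected mRNA fragments, structured excised intron RNAs, and intron RNA fragments as potential biomarkers in human plasma, can be found in Yao et al. (2020).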
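
The sequential mapping idea described above, in which each read is assigned to the first reference tier that matches and only unmatched reads fall through to the next tier, can be illustrated with a toy example. This Python sketch is an assumption-laden simplification and not the authors' pipeline: exact substring matching stands in for a real aligner, and the tier names and sequences are invented for illustration. The published pipeline is described in Yao et al. (2020).

```python
# Toy illustration of tiered (sequential) read assignment.
# ASSUMPTIONS: exact substring matching replaces a real aligner, and the
# tier names and reference sequences are made up for this example.
from typing import Dict, Optional, Sequence, Tuple

Tier = Tuple[str, Sequence[str]]


def sequential_map(reads: Sequence[str],
                   tiers: Sequence[Tier]) -> Dict[str, Optional[str]]:
    """Assign each read to the first tier containing it, else None."""
    assignments: Dict[str, Optional[str]] = {}
    for read in reads:
        assignments[read] = None
        for tier_name, references in tiers:
            if any(read in reference for reference in references):
                assignments[read] = tier_name
                break  # earlier tiers take priority over later ones
    return assignments


if __name__ == "__main__":
    example_tiers = [
        ("rRNA_tRNA_sncRNA", ["GGGTTTAAACCC"]),
        ("genome", ["ACGTACGTGGGTTTAAACCCTTT"]),
    ]
    example_reads = ["TTTAAA", "ACGTACG", "CCCCCCCC"]
    for read, tier in sequential_map(example_reads, example_tiers).items():
        print(read, "->", tier)
```

The first read matches both tiers but is assigned to the earlier one, which is the point of ordering multi-copy RNAs ahead of the genome.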

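Additional TGIRT-seq read-mapping protocols for different sample types and applications, including details of statistical tests and replicates, can be found in the following references: plasma RNA (Qin et al., 2016); total cellular RNA, including spike-ins (Nottingham et al., 2016); plasma DNA, including the analysis of nucleosome positioning and DNA methylation sites (Wu and Lambowitz, 2017); sncRNAs (Boivin et al., 2018); and total cellular RNA, including detailed analysis and remediation of sequencing biases (Xu et al., 2019).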

 

Notes

 

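  1. We have used a variety of commercial kits to isolate RNA for TGIRT-seq. Commercial kits differ significantly in RNA yield and may be biased toward different RNA biotypes. We have obtained good results for TGIRT-seq of human cellular RNAs with the mirVana miRNA Isolation Kit (Thermo Fisher Scientific, catalog number: AM1560). For EVs, we have obtained good results with the Total Exosome RNA and Protein Isolation Kit (Thermo Fisher Scientific, catalog number: 4478545). To isolate human plasma RNA, we have tried a number of extraction reagents and kits, including TRIzol LS Reagent (Thermo Fisher Scientific, catalog number: 10296010) with Direct-zol RNA Miniprep (Zymo Research, catalog number: R2051); the QIAamp ccfDNA/RNA Kit (QIAGEN, catalog number: 55184); and the miRNeasy Serum/Plasma Advanced Kit (QIAGEN, catalog number: 217204). Each kit has advantages and disadvantages in input volume, ease of use, and RNA yield. In our hands, the QIAamp ccfDNA/RNA Kit was more efficient at extracting DNA than RNA (Yao et al., 2020).
  2. TGIRT enzymes can template-switch from the R2 RNA/R2R DNA starter duplex to the 3' end of DNA fragments, an activity that has been used to sequence single-stranded DNA for the analysis of nucleosome positioning and DNA methylation sites (Wu and Lambowitz, 2017). For RNA-seq, it is important to remove as much DNA as possible to minimize DNA reads. We have used Turbo DNase (Thermo Fisher Scientific, catalog number: AM2238), as well as a combination of DNase I (Zymo Research, catalog number: E1010) and Exonuclease I (Lucigen, catalog number: X40520K) (Yao et al., 2020). We clean up DNase-treated RNA with the RNA Clean & Concentrator-5 kit (Zymo Research, catalog number: R1013) using a modified 8× v/v (ethanol/input RNA) protocol.
  3. Cellular RNAs are rRNA-depleted to minimize reads that map to rRNAs. Because rRNA depletion procedures are not 100% specific for rRNAs, different rRNA depletion kits can also affect reads corresponding to other RNA biotypes, including RNAs of interest. The same rRNA depletion kit should therefore be used for all samples in a study. The Illumina Ribo-Zero Plus rRNA Depletion Kit (Illumina, catalog number: 20040526), used according to the manufacturer's protocol, gives satisfactory results for TGIRT-seq (5-10% rRNA reads in the final libraries), although in our hands it is not as efficient at removing rRNAs as the original (now discontinued) version of the kit (RiboZero Gold, <1-2% rRNA reads).
  4. For chemical fragmentation of cellular RNAs, we use the NEBNext Magnesium RNA Fragmentation Module (New England Biolabs, catalog number: E6150S). We typically use 50 ng of rRNA-depleted total cellular RNA as input and heat at 94 °C for 5-7 min for fragmentation. RNA input and fragmentation time should be optimized by the researcher for the RNAs of interest and the read length required for Illumina sequencing. After fragmentation, we clean up the RNA with the RNA Clean & Concentrator-5 kit (Zymo Research, catalog number: R1013) using a modified 8× v/v (ethanol/input RNA) protocol. Fragmented cellular RNAs generated with divalent cations should be treated with T4 polynucleotide kinase, which has 3'-phosphatase activity, to remove 3' phosphates and 2',3'-cyclic phosphates that impede TGIRT template switching (Mohr et al., 2013; Nottingham et al., 2016). This inhibition is less pronounced at 200 mM than at 450 mM NaCl (Lentzsch et al., 2019; Yao et al., 2020).
  5. The high salt concentration (450 mM NaCl) used in the initial TGIRT-seq protocol to suppress multiple end-to-end template switches by TGIRT enzymes also decreased the efficiency of the initial template-switching reaction from the R2 RNA/R2R DNA Starter Duplex (Lentzsch et al., 2019). Multiple template switches are particularly problematic for miRNA sequencing. The updated protocol, which uses 200 mM NaCl to increase the efficiency of the initial template-switching reaction from the R2 RNA/R2R DNA starter duplex, gives comprehensive libraries of coding and non-coding RNAs in human plasma with acceptable levels (0.5-4%) of fusion reads, which include multiple template switches (Yao et al., 2020).
  6. For low RNA inputs (e.g., human plasma and EV RNAs), Bioanalyzer traces of TGIRT-seq libraries may show an adapter-dimer peak but no software-called library product peak. In these cases, 1 μl of the library can be amplified for an additional 12 PCR cycles to detect library products. If library products are detectable after this further amplification, the original library, without further amplification, should be usable for sequencing. After the additional amplification, the level of adapter dimers will be higher (because short products are preferentially amplified), but users should expect to observe the desired products at the predicted sizes. If library products are still not detected, library preparation may have failed because of a problem with the RNA input (quality or quantity).
  7. The absence of both product peaks and adapter-dimer peaks (120-125 bp) in Bioanalyzer traces of the final TGIRT-seq libraries indicates an unsuccessful ligation reaction, most likely due to failure of the R1R adenylation step or of the ligation itself.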
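
The troubleshooting logic in Notes 6 and 7 amounts to a small decision tree over what the Bioanalyzer trace shows. The Python sketch below encodes that tree as a reading aid; the function name, argument names, and returned wording are hypothetical, and peak detection itself is left to the instrument software.

```python
# Hypothetical decision helper that restates Notes 6 and 7 as code.
from typing import Optional


def interpret_trace(product_peak: bool, adapter_dimer_peak: bool,
                    product_after_reamplification: Optional[bool] = None) -> str:
    """Suggest a next step from the peaks seen in a Bioanalyzer trace."""
    if product_peak:
        return "library product detected; proceed to sequencing"
    if not adapter_dimer_peak:
        # Note 7: neither product nor adapter dimer is visible.
        return "suspect failed R1R adenylation or ligation"
    # Note 6: adapter dimer visible but no called product peak (low input).
    if product_after_reamplification is None:
        return "amplify 1 ul of library for 12 more PCR cycles and re-check"
    if product_after_reamplification:
        return "original library (not re-amplified) should be usable"
    return "library preparation likely failed; check RNA input"


if __name__ == "__main__":
    print(interpret_trace(False, True))
    print(interpret_trace(False, True, product_after_reamplification=True))
    print(interpret_trace(False, False))
```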

 

Acknowledgments

 

Development of the TGIRT-seq method was supported by NIH grant R35 GM136216 and Welch Foundation grant F-1607. This protocol was adapted from Yao et al. (2020).

 

Competing interests

 

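Thermostable group II intron reverse transcriptase enzymes and methods for their use are the subject of patents and patent applications that have been licensed by the University of Texas and East Tennessee State University to InGex, LLC. AML, some former and current members of the Lambowitz laboratory, and the University of Texas are minority equity holders in InGex, LLC, and receive royalty payments from the sale of TGIRT enzymes and kits that employ TGIRT template-switching activity, and from the sublicensing of intellectual property to other companies.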

 

References

 

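  1. Behrens, A., Rodschinka, G. and Nedialkova, D. D. (2021). High-resolution quantitative profiling of tRNA abundance and modification status in eukaryotes by mim-tRNAseq. Mol Cell 81(8): 1802-1815.e1807.
  2. Belfort, M. and Lambowitz, A. M. (2019). Group II intron RNPs and reverse transcriptases: from retroelements to research tools. Cold Spring Harb Perspect Biol 11(4): a032375.
  3. Blocker, F. J., Mohr, G., Conlan, L. H., Qi, L., Belfort, M. and Lambowitz, A. M. (2005). Domain structure and three-dimensional model of a group II intron-encoded reverse transcriptase. RNA 11(1): 14-28.
  4. Boivin, V., Deschamps-Francoeur, G., Couture, S., Nottingham, R. M., Bouchard-Bourelle, P., Lambowitz, A. M., Scott, M. S. and Abou-Elela, S. (2018). Simultaneous sequencing of coding and noncoding RNA reveals a human transcriptome dominated by a small number of highly expressed noncoding genes. RNA 24(7): 950-965.
  5. Boivin, V., Reulet, G., Boisvert, O., Couture, S., Elela, S. A. and Scott, M. S. (2020). Reducing the structure bias of RNA-Seq reveals a large number of non-annotated non-coding RNA. Nucleic Acids Res 48(5): 2271-2286.
  6. Carrell, S. T., Tang, Z., Mohr, S., Lambowitz, A. M. and Thornton, C. A. (2018). Detection of expanded RNA repeats using thermostable group II intron reverse transcriptase. Nucleic Acids Res 46(1): e1.
  7. Cocquet, J., Chong, A., Zhang, G. and Veitia, R. A. (2006). Reverse transcriptase template switching and false alternative transcripts. Genomics 88(1): 127-131.
  8. Evans, M. E., Clark, W. C., Zheng, G. and Pan, T. (2017). Determination of tRNA aminoacylation levels by high-throughput sequencing. Nucleic Acids Res 45(14): e133.
  9. Gladyshev, E. A. and Arkhipova, I. R. (2011). A widespread class of reverse transcriptase-related cellular genes. Proc Natl Acad Sci U S A 108(51): 20311-20316.
  10. Hafner, M., Renwick, N., Brown, M., Mihailović, A., Holoch, D., Lin, C., Pena, J. T., Nusbaum, J. D., Morozov, P., Ludwig, J., et al. (2011). RNA-ligase-dependent biases in miRNA representation in deep-sequenced small RNA cDNA libraries. RNA 17(9): 1697-1712.
  11. Hu, W. S. and Hughes, S. H. (2012). HIV-1 reverse transcription. Cold Spring Harb Perspect Med 2(10): a006882.
  12. Katibah, G. E., Qin, Y., Sidote, D. J., Yao, J., Lambowitz, A. M. and Collins, K. (2014). Broad and adaptable RNA structure recognition by the human interferon-induced tetratricopeptide repeat protein IFIT5. Proc Natl Acad Sci U S A 111(33): 12025-12030.
  13. Kojima, K. K. and Kanehisa, M. (2008). Systematic survey for novel types of prokaryotic retroelements based on gene neighborhood and protein architecture. Mol Biol Evol 25(7): 1395-1404.
  14. Lambowitz, A. M. and Belfort, M. (2015). Mobile bacterial group II introns at the crux of eukaryotic evolution. Microbiol Spectr 3(1): MDNA3-0050-2014.
  15. Lentzsch, A. M., Yao, J., Russell, R. and Lambowitz, A. M. (2019). Template-switching mechanism of a group II intron-encoded reverse transcriptase and its implications for biological function and RNA-Seq. J Biol Chem 294: 19764-19784.
  16. Lentzsch, A. M., Stamos, J. L., Yao, J., Russell, R. and Lambowitz, A. M. (2021). Structural basis for template switching by a group II intron-encoded non-LTR-retroelement reverse transcriptase. J Biol Chem 297(2): 100971.
  17. Mader, R. M., Schmidt, W. M., Sedivy, R., Rizovski, B., Braun, J., Kalipciyan, M., Exner, M., Steger, G. G. and Mueller, M. W. (2001). Reverse transcriptase template switching during reverse transcriptase-polymerase chain reaction: artificial generation of deletions in ribonucleotide reductase mRNA. J Lab Clin Med 137(6): 422-428.
  18. Martín-Alonso, S., Frutos-Beltrán, E. and Menéndez-Arias, L. (2021). Reverse transcriptase: from transcriptomics to genome editing. Trends Biotechnol 39(2): 194-210.
  19. Mohr, S., Ghanem, E., Smith, W., Sheeter, D., Qin, Y., King, O., Polioudakis, D., Iyer, V. R., Hunicke-Smith, S., Swamy, S., et al. (2013). Thermostable group II intron reverse transcriptase fusion proteins and their use in cDNA synthesis and next-generation RNA sequencing. RNA 19(7): 958-970.
  20. Nottingham, R. M., Wu, D. C., Qin, Y., Yao, J., Hunicke-Smith, S. and Lambowitz, A. M. (2016). RNA-seq of human reference RNA samples using a thermostable group II intron reverse transcriptase. RNA 22(4): 597-613.
  21. Onafuwa-Nuga, A. and Telesnitsky, A. (2009). The remarkable frequency of human immunodeficiency virus type 1 genetic recombination. Microbiol Mol Biol Rev 73(3): 451-480, Table of Contents.
  22. Penno, C., Kumari, R., Baranov, P. V., van Sinderen, D. and Atkins, J. F. (2017). Stimulation of reverse transcriptase generated cDNAs with specific indels by template RNA structure: retrotransposon, dNTP balance, RT-reagent usage. Nucleic Acids Res 45(17): 10143-10155.
  23. Picelli, S., Björklund, Å. K., Faridani, O. R., Sagasser, S., Winberg, G. and Sandberg, R. (2013). Smart-seq2 for sensitive full-length transcriptome profiling in single cells. Nat Methods 10(11): 1096-1098.
  24. Qin, Y., Yao, J., Wu, D. C., Nottingham, R. M., Mohr, S., Hunicke-Smith, S. and Lambowitz, A. M. (2016). High-throughput sequencing of human plasma RNA by using thermostable group II intron reverse transcriptases. RNA 22(1): 111-128.
  25. Shen, P. S., Park, J., Qin, Y., Li, X., Parsawar, K., Larson, M. H., Cox, J., Cheng, Y., Lambowitz, A. M., Weissman, J. S., et al. (2015). Protein synthesis. Rqc2p and 60S ribosomal subunits mediate mRNA-independent elongation of nascent chains. Science 347(6217): 75-78.
  26. Shurtleff, M. J., Yao, J., Qin, Y., Nottingham, R. M., Temoche-Diaz, M. M., Schekman, R. and Lambowitz, A. M. (2017). Broad role for YBX1 in defining the small noncoding RNA composition of exosomes. Proc Natl Acad Sci U S A 114(43): E8987-E8995.
  27. Simon, D. M. and Zimmerly, S. (2008). A diversity of uncharacterized reverse transcriptases in bacteria. Nucleic Acids Res 36(22): 7219-7229.
  28. Stamos, J. L., Lentzsch, A. M. and Lambowitz, A. M. (2017). Structure of a thermostable group II intron reverse transcriptase with template-primer and its functional and evolutionary implications. Mol Cell 68(5): 926-939.e924.
  29. Stamos, J. L., Lentzsch, A. M., Park, S. K., Mohr, G. and Lambowitz, A. M. (2019). Non-LTR-retroelement reverse transcriptase and uses thereof. PCT/US2018/054147.
  30. Stark, R., Grzelak, M. and Hadfield, J. (2019). RNA sequencing: the teenage years. Nat Rev Genet 20(11): 631-656.
  31. Upton, H. E., Ferguson, L., Temoche-Diaz, M. M., Liu, X., Pimentel, S. C., Ingolia, N. T., Schekman, R. and Collins, K. (2021). Low-bias ncRNA libraries using ordered two-template relay: Serial template jumping by a modified retroelement reverse transcriptase. BioRxiv doi: https://doi.org/10.1101/2021.04.30.442027
  32. Wang, Z., Ma, Z., Castillo-González, C., Sun, D., Li, Y., Yu, B., Zhao, B., Li, P. and Zhang, X. (2018). SWI2/SNF2 ATPase CHR2 remodels pri-miRNAs via Serrate to impede miRNA production. Nature 557(7706): 516-521.
  33. Wu, D. C. and Lambowitz, A. M. (2017). Facile single-stranded DNA sequencing of human plasma DNA via thermostable group II intron reverse transcriptase template switching. Sci Rep 7(1): 8421.
  34. Wu, X. and Bartel, D. P. (2017). Widespread influence of 3'-end structures on mammalian mRNA processing and stability. Cell 169(5): 905-917.e911.
  35. Xiong, Y. and Eickbush, T. H. (1990). Origin and evolution of retroelements based upon their reverse transcriptase sequences. EMBO J 9(10): 3353-3362.
  36. Xu, H., Yao, J., Wu, D. C. and Lambowitz, A. M. (2019). Improved TGIRT-seq methods for comprehensive transcriptome profiling with decreased adapter dimer formation and bias correction. Sci Rep 9(1): 7953.
  37. Yao, J., Wu, D. C., Nottingham, R. M. and Lambowitz, A. M. (2020). Identification of protein-protected mRNA fragments and structured excised intron RNAs in human plasma by TGIRT-seq peak calling. Elife 9: e60743.
  38. Yao, J., Xu, H., Shelby, W., Wu, D. C., Ares, M. and Lambowitz, A. M. (2021). Human cells contain myriad excised linear intron RNAs with potential functions in gene regulation and as RNA biomarkers. BioRxiv doi: https://doi.org/10.1101/2020.09.07.285114
  39. Zarnegar, B. J., Flynn, R. A., Shen, Y., Do, B. T., Chang, H. Y. and Khavari, P. A. (2016). irCLIP platform for efficient characterization of protein-RNA interactions. Nat Methods 13(6): 489-492.
  40. Zhao, C., Liu, F. and Pyle, A. M. (2018). An ultraprocessive, accurate reverse transcriptase encoded by a metazoan group II intron. RNA 24(2): 183-195.
  41. Zheng, G., Qin, Y., Clark, W. C., Dai, Q., Yi, C., He, C., Lambowitz, A. M. and Pan, T. (2015). Efficient and quantitative high-throughput tRNA sequencing. Nat Methods 12(9): 835-837.
  42. Zhu, Y. Y., Machleder, E. M., Chenchik, A., Li, R. and Siebert, P. D. (2001). Reverse transcriptase template switching: a SMART approach for full-length cDNA library construction. Biotechniques 30(4): 892-897.
  43. Zimmerly, S. and Wu, L. (2015). An unexplored diversity of reverse transcriptases in bacteria. Microbiol Spectr 3(2): MDNA3-0058-2014.
  44. Zubradt, M., Gupta, P., Persad, S., Lambowitz, A. M., Weissman, J. S. and Rouskin, S. (2017). DMS-MaPseq for genome-wide or targeted RNA structure probing in vivo. Nat Methods 14(1): 75-82.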
Copyright Xu et al. This article is distributed under the terms of the Creative Commons Attribution License (CC BY 4.0).
Citation: Readers should cite both the Bio-protocol article and the original research article where this protocol was used:
  1. Xu, H., Nottingham, R. M. and Lambowitz, A. M. (2021). TGIRT-seq Protocol for the Comprehensive Profiling of Coding and Non-coding RNA Biotypes in Cellular, Extracellular Vesicle, and Plasma RNAs. Bio-protocol 11(23): e4239. DOI: 10.21769/BioProtoc.4239.
  2. Yao, J., Wu, D. C., Nottingham, R. M. and Lambowitz, A. M. (2020). Identification of protein-protected mRNA fragments and structured excised intron RNAs in human plasma by TGIRT-seq peak calling. Elife 9: e60743.