
Construction of DNA/RNA Triplex Helices Based on GAA/TTC Trinucleotide Repeats


Abstract

Atypical DNA and RNA secondary structures play a crucial role in simple sequence repeat (SSR) diseases, which are associated with a class of neurological and neuromuscular disorders known as “anticipation diseases,” in which the age of disease onset decreases and disease severity increases as the SSR expands from one generation to the next. While the mechanisms underlying these diseases are complex and remain elusive, there is a consensus that stable, atypical non-B-DNA secondary structures play an important, if not causative, role. These structures include single-stranded DNA loops and hairpins, G-quartets, Z-DNA, triplex nucleic acid structures, and others. While all of these structures are of interest, structures based on nucleic acid triplexes have recently garnered increased attention, as they have been implicated in gene regulation, gene repair, and gene engineering. Our work here focuses on the construction of DNA triplexes and RNA/DNA hybrids formed from GAA/TTC trinucleotide repeats, which underlie Friedreich’s ataxia. Although software such as the Discovery Studio Visualizer can aid in the initial construction of DNA triple helices, it restricts the third strand to an antiparallel pyrimidine strand. In this protocol, we illustrate how to build more general DNA triplexes and mixed DNA/RNA hybrids. We use both the Discovery Studio Visualizer and the AMBER simulation package to construct the initial triplexes. Using the steps outlined here, one can, in principle, build any triple nucleic acid helix with a desired sequence for large-scale molecular dynamics simulation studies.

Keywords: DNA/RNA, Triplex helices, Molecular dynamics, Trinucleotide repeats

Background

Simple sequence repeats (SSRs), which represent about 3% of the entire human genome, typically consist of 1 to 6 nucleotides that repeat up to 30 times or more (Ellegren, 2004; Subramanian et al., 2003). Among the various possible SSRs, trinucleotide repeats (TRs) represent one of the most common types in the exome of all eukaryotic genomes (Tóth et al., 2000). Many TRs exhibit “dynamic mutations” that do not follow Mendelian inheritance, which states that mutations in a single gene may be stably transmitted between generations (Caburet et al., 2005). This can lead to genetic diseases where, in successive generations, the age of disease onset decreases and disease severity increases (Mirkin, 2006). These mutations, whose probability also increases with generations, are due to the intergenerational expansion of TRs. Once the repeat number of a TR exceeds a certain threshold, the probability of further TR expansion and the severity of the disease increase with the number of repeats. The dynamic mutations associated with TRs cause severe neurodegenerative and neuromuscular disorders known generically as Trinucleotide (or Triplet) Repeat Expansion Diseases (TREDs), which lead to cell toxicity and death (Wells et al., 1998; Orr and Zoghbi, 2007; Wells et al., 2005). To date, about 50 DNA expandable SSR diseases have been identified, and their number is expected to grow. The TR expansions are believed to be caused by DNA slippage during replication, repair, transcription, or recombination.


Although the mechanisms underlying TREDs may be quite complex, some simple trends are remarkably robust. In particular, there is a correlation between the repeat number beyond the repeat threshold and the probability of further expansion and increased pathology. Another important breakthrough has been the recognition that stable, non-B-DNA secondary structure in the expanded repeats is an important factor causing the expansion and disease (McMurray, 1999). Accordingly, expandable repeats are known to display atypical structural characteristics such as single-stranded hairpins, Z-DNA, G-quartets, triple helix structures, and slip-stranded duplexes. It is therefore believed that understanding the structural and dynamical characteristics of these atypical secondary structures is important for ultimately unraveling the puzzle of TREDs.


While our previous work on TRs and hexanucleotide repeats in the context of C9FTD/ALS diseases (Zhang et al., 2017a and 2017b) was centered on understanding single-stranded loops and hairpins (Pan et al., 2017, 2018a and 2018b; Xu et al., 2020), here we focus on the construction of triplexes associated with GAA/TTC TRs (Zhang et al., 2020). These repeats are associated with Friedreich’s ataxia (Grabczyk and Usdin, 2000), which is caused by the expansion of GAA in the first intron of the frataxin gene. Experimentally, these repeats have been observed to form either triplexes or R-loops. DNA triplexes (also called triple-stranded DNA or H-DNA) were first reported in 1957 (Felsenfeld et al., 1957). These non-canonical three-stranded helices consist of a Watson-Crick paired helical duplex and a third strand that binds to the duplex via Hoogsteen or reversed Hoogsteen hydrogen bonds. R-loops, on the other hand, are three-stranded nucleic acid structures consisting of a hybrid RNA:DNA duplex (formed by a template DNA strand and an RNA strand) together with the displaced, non-template single-stranded DNA. Both triplexes and R-loops can have cellular functions and are essential for gene therapy (Kaji and Leiden, 2001; Seidman and Glazer, 2003).


There is much to be learned about the microscopic details of DNA triplexes and R-loops; in particular, many of the atomistic aspects of these atypical nucleic acid structures remain elusive. Hence, we have recently examined the structure and stability of DNA triplexes and RNA/DNA hybrids associated with GAA/TTC TRs (Zhang et al., 2020). In pure RNA triplexes, the third strand can be inserted into either the major groove or the minor groove (Szewczak et al., 1998). However, since minor groove RNA triplexes are unstable (Devi et al., 2015), we only consider major groove RNA triplexes in this protocol. Our study was based on large-scale classical Molecular Dynamics (MD) simulations. The initial modeling of the triple helices was performed with the Discovery Studio Visualizer (Visualizer, 2005), and we used the AMBER simulation package as an optimization and sampling tool for exploring the structure and stability of DNA triplexes and selected R-loops (Case et al., 2020). In this bio-protocol paper, we provide details for the initial modeling and construction of the triple helices associated with GAA/TTC TRs. Specifically, we focus on the sequence GAA/TTC(UUC) as an example.


We primarily discuss how to build up DNA-based triplexes and mixed DNA/RNA structures, as shown below in Figure 1. With the models constructed in this bio-protocol, it is straightforward to investigate triple helices per se and their interactions with other biomolecules.



Figure 1. Structures (from left to right) of DNA triple helix, DNA·RNA:DNA hybrid triple helix, RNA·DNA:DNA hybrid triple helix, and RNA triple helix. DNA strands are colored in blue, and RNA strands are colored in red.

Software

  1. Discovery Studio Visualizer version 2019, or higher

    Discovery Studio Visualizer is free software developed by Dassault Systèmes BIOVIA. Access this software at: https://discover.3ds.com/discovery-studio-visualizer-download.

  2. Amber, version 16 or higher

    Amber is a suite of biomolecular simulation programs for large-scale MD studies. Access Amber at: http://ambermd.org/.

Procedure

  A. Initial Construction of a DNA Triple Helix

    1. Open Discovery Studio Visualizer.

    2. First, build up a homopurine/homopyrimidine B-DNA double helix with the desired sequence. The basic strategy of building a triple helix model is shown in Figure 2.



      Figure 2. Scheme illustrating the construction of a triple helix


      For example, suppose we want to build up the parallel sequence as shown in Figure 3(a).

    3. Change the display style to arrows and rings and turn off the atom display style.

    4. Select the desired strand (all purine or all pyrimidine) and then select copy and paste. If a shifted triplex sequence is desired, change the residue names on the third strand to the desired sequence. In our example, copy and paste the pyrimidine strand from the duplex, as shown in Figure 3(b), to get the sequence in Figure 3(c).

      As the desired triplex involves a shifted sequence, rename the residues to obtain Figure 3(d).

    5. Carefully place the pasted strand into the major groove of the DNA duplex with the desired orientation, as shown in Figure 4 and in Video 1. Our example sequence thus becomes Figure 3(e).



      Figure 3. Sample sequence for our procedure. (a) represents the triplex we want to build. (b) is the initial duplex from which we obtained the strand shown in (c). The shifted strand obtained from the previous step is shown in (d). This shifted strand is placed into the major groove of the duplex as shown in (e). Protonating the cytosines in the third strand finally leads to (a).



      Figure 4. Snapshot for the placement of the third strand into the major groove of the DNA duplex


      Video 1. A short illustration of how we build up the initial model of a triple helix


    6. Roughly adjust the position of each residue to avoid any overlap between the residues, as shown in Video 1. For example, when the oxygen atoms on the phosphate groups are too close, manually move one of the oxygen atoms slightly.

    7. Change the names of the third-strand bases to construct the desired sequence. At this stage, name a protonated DC simply DC, a protonated DA simply DA, etc. (the same applies to RNA). The protonated cases are handled later, in step 13.

    8. Change the display style back to atom/ball and stick.

    9. Precisely adjust each atom on the third strand to avoid high potential energy collisions during subsequent MD runs and to construct the hydrogen bond structure as illustrated in Figure 5 and Figure 6.



      Figure 5. Final conformation for the example given in Figure 3 and built through steps 1-14.



      Figure 6. Initial hydrogen bond constraints for a DNA triple helix


    10. Choose “Edit” and then “Select.” Change the property to element, then select hydrogen. After selecting all hydrogen atoms, press Del to delete them, as some of the hydrogen atom names differ from those in Amber and will cause errors.

    11. Save the structure as PDB.

    12. Edit the PDB file, deleting all the connectivity (CONECT) records at the end of the file.

    13. If there are protonated bases in the desired sequence, change the corresponding residue name DC to DCP and DA to DAP. If the protonated C is at the 5’ end, rename it as D5C. If the protonated C is at the 3’ end, rename it as D3C. If the protonated A is at the 5’ end, rename it as D5A. If the protonated A is at the 3’ end, rename it as D3A, etc. While this nomenclature seems somewhat arbitrary and inconsistent, it is mandated by the fact that tLeap (included in Amber) only recognizes residue names of at most three letters. Our example sequence is then that of Figure 3(a). The configuration is shown in Figure 5.

    14. Go to section E.
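
      Steps 10-12 above can also be reproduced or cross-checked outside the GUI. The following is a minimal Python sketch, not part of the original protocol: it assumes the saved file follows the standard fixed-column PDB layout (element symbol in columns 77-78, atom name in columns 13-16), and the function names `is_hydrogen` and `clean_pdb` are hypothetical helpers introduced here for illustration.

```python
def is_hydrogen(line: str) -> bool:
    """Return True if an ATOM/HETATM record describes a hydrogen atom."""
    # Columns 77-78 of a PDB record hold the element symbol.
    element = line[76:78].strip().upper()
    if element:
        return element == "H"
    # Heuristic fallback (an assumption): if the element column is blank,
    # treat atom names that start with H after any leading digits as hydrogens.
    atom_name = line[12:16].strip().upper()
    return atom_name.lstrip("0123456789").startswith("H")


def clean_pdb(pdb_text: str) -> str:
    """Drop hydrogen atoms and CONECT records from PDB-formatted text."""
    kept = []
    for line in pdb_text.splitlines():
        record = line[:6].strip()
        if record == "CONECT":
            continue  # step 12: remove connectivity records
        if record in ("ATOM", "HETATM") and is_hydrogen(line):
            continue  # step 10: remove hydrogens so tLeap can rebuild them
        kept.append(line)
    return "\n".join(kept) + "\n"
```

      A typical use would be to read the file saved in step 11, pass its text through `clean_pdb`, and write the result to a new file before the renaming in step 13. Atom serial numbers are left unchanged by this sketch.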


  B. Initial Construction of DNA·RNA:DNA Hybrid Triple Helix

    1. Open Discovery Studio Visualizer.

    2. Build up a homopurine/homopyrimidine B-DNA or A-DNA double helix with the desired sequence.

    3. Change the display style to arrows and rings and turn off the atom display style.

    4. Repeat steps 4-7 in section A.

    5. Select the desired strand and change it from DNA to RNA. This is done by selecting the desired part of the nucleic acid and then clicking “Ribose” in the “Modify Sugar” options menu.

    6. Repeat steps 8-14 in section A.


  C. Initial Construction of RNA·DNA:DNA Hybrid Triple Helix

    1. Open Discovery Studio Visualizer.

    2. Load the DNA triple helix with the desired sequence that has been stabilized by MD (in our case, the structure at the end of a 1 µs MD simulation [Zhang et al., 2020]); T’s will be replaced by U’s as described below. Possible initial hydrogen bond patterns for this structure are given in Figure 7.



      Figure 7. Initial hydrogen bond constraints for RNA·DNA:DNA and RNA triple helices


    3. Change the terminal residue, such as DG5, back to DG in order to avoid errors.

    4. If there are protonated bases, change them back to ordinary bases.

    5. Change the desired part from DNA to RNA.

    6. Repeat steps 10-12 in section A.

    7. If there are protonated bases in the desired sequence, change the corresponding residue name C to CP and A to AP. If the protonated C is at the 5’ end, rename it as C5P. If the protonated C is at the 3’ end, rename it as C3P. If the protonated A is at the 5’ end, rename it as A5P. If the protonated A is at the 3’ end, rename it as A3P.

    8. Go to section E.
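
      The renaming rules for protonated bases (step 13 of section A for DNA and step 7 of this section for RNA) can be collected into a small lookup table. The Python sketch below is illustrative only: the residue names are those listed in the protocol, while the helper names `protonated_name` and `rename_residue` and the position labels `"internal"`, `"5'"`, and `"3'"` are assumptions introduced here. It also assumes the standard PDB layout, in which the residue name occupies columns 18-20.

```python
# (plain residue name, position in strand) -> name used for the protonated base
PROTONATED_NAMES = {
    ("DC", "internal"): "DCP", ("DC", "5'"): "D5C", ("DC", "3'"): "D3C",
    ("DA", "internal"): "DAP", ("DA", "5'"): "D5A", ("DA", "3'"): "D3A",
    ("C", "internal"): "CP", ("C", "5'"): "C5P", ("C", "3'"): "C3P",
    ("A", "internal"): "AP", ("A", "5'"): "A5P", ("A", "3'"): "A3P",
}


def protonated_name(base: str, position: str = "internal") -> str:
    """Look up the residue name for a protonated C or A (DNA or RNA)."""
    try:
        return PROTONATED_NAMES[(base, position)]
    except KeyError:
        raise ValueError(f"no protonated name for {base!r} at {position!r}")


def rename_residue(line: str, new_name: str) -> str:
    """Replace the residue name (columns 18-20) of one ATOM/HETATM record."""
    if len(new_name) > 3:
        raise ValueError("tLeap residue names are limited to three letters")
    return line[:17] + new_name.rjust(3) + line[20:]
```

      For instance, `protonated_name("DA", "3'")` gives `D3A`, the name that appears for residue 27 in the example constraint file of section E.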


  D. Initial Construction of RNA Triple Helix

    Repeat all steps of section C.


  E. Details of Molecular Dynamics Simulations

    1. Use tLeap in the Amber package to generate the topology file, the initial coordinate file, and the complete PDB file. We use the BSC1 force field (Ivani et al., 2016), the BSC0 (Pérez et al., 2007) + OL3 (Zgarbová et al., 2011) force field, and the protonated Amber force field (Weiner et al., 1986).

    2. Use periodic boundary conditions.

    3. The unit cell must be large enough to prevent the nucleic acid from interacting with its nearest periodic images: the distance between images must be larger than the electrostatic and van der Waals cutoff (9 Å). In our example, each triplex contains about 840-870 atoms, and we need to create an octahedral box that respects this distance. For this, we set the minimum distance between the nucleic acid and the box boundary to 8 Å, which results in a box volume of ~90,000 Å³. After this, TIP3P (Jorgensen et al., 1983) water molecules are added randomly to the box (Figure 8).



      Figure 8. Illustration of an octahedral box of triplex


    4. We use Na+ ions (Joung and Cheatham, 2008) to neutralize the system. The number of Na+ ions equals the number of negative charges in the triplex.

    5. Add hydrogen bond constraints, as shown in Figure 6 (with some T replaced by U) or Figure 7, following the instructions at: https://ambermd.org/tutorials/advanced/tutorial4/index.htm. If an error is reported, edit ${AMBERHOME}/dat/map.DG-AMBER and add entries such as the following to the file:

      RESIDUE C3P

      MAPPING H3 = H3

      An example hydrogen bond constraint input file can be written as:

       1 DG5 H21  18 DC3 O2   1.9 1.9
       1 DG5 H1   18 DC3 N3   1.9 1.9
       1 DG5 O6   18 DC3 H41  1.9 1.9
       1 DG5 O6   18 DC3 N4   2.9 2.9
       1 DG5 N1   18 DC3 N3   2.9 2.9
       1 DG5 N2   18 DC3 O2   2.9 2.9
       9 DA3 N1   10 DT5 H3   1.9 1.9
       9 DA3 H61  10 DT5 O4   1.9 1.9
       9 DA3 N6   10 DT5 O4   2.9 2.9
       9 DA3 N1   10 DT5 N3   2.9 2.9
      27 D3A H1    1 DG5 O6   1.9 1.9
      27 D3A N1    1 DG5 O6   2.9 2.9
      27 D3A H61   1 DG5 N7   1.9 1.9
      27 D3A N6    1 DG5 N7   2.9 2.9
      19 DG5 H21  10 DT5 O4   1.9 1.9
      19 DG5 N2   10 DT5 O4   2.9 2.9
      19 DG5 H1   10 DT5 O4   1.9 1.9
      19 DG5 N1   10 DT5 O4   2.9 2.9
      19 DG5 O6    9 DA3 H62  1.9 1.9
      19 DG5 O6    9 DA3 N6   2.9 2.9

    6. Perform MD simulations to optimize the initial structures. Set both the electrostatic cutoff and the van der Waals cutoff to 9.0 Å. Use Langevin dynamics with a coupling parameter of 1.0 ps⁻¹ to control the temperature. Use the SHAKE algorithm to constrain bonds involving hydrogen atoms. MD simulation of nucleic acids is explained in detail in Šponer and Filip (2006).

    7. Minimize the energy for the initial conformations obtained by modeling: initially, keep the nucleic acid and ions fixed; then, slowly (in steps) lift the constraints, allowing them to move.

    8. Then gradually raise the temperature from 0 to 300 K over a 50 ps run with a 1 fs time step with the nucleic acid and ions constrained.

    9. Then use a 100 ps run at constant volume to gradually reduce the restraining harmonic constants for nucleic acids and ions.

    10. Then use a 1 µs run at constant pressure, with the Berendsen pressure coupling method, to gradually optimize the volume of the unit cell.

    11. The final structure of the MD simulation may be used as the initial structure for further research.
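
      Steps 1-4 of this section can be scripted. The sketch below, written in Python, only assembles a tLeap input file as text; it is an illustration under stated assumptions, not the authors' actual input. The `leaprc` file names are those shipped with recent Amber releases for the BSC1, OL3, and TIP3P parameter sets, and the file that defines the protonated residues is not named in the protocol, so it must be supplied through the hypothetical `extra_lines` argument. The function name `make_tleap_input` and the file names in the usage note are likewise placeholders.

```python
def make_tleap_input(pdb_in, prefix, buffer_angstrom=8.0, extra_lines=()):
    """Return the text of a tLeap script for one triplex.

    pdb_in          -- hydrogen-free PDB file produced in sections A-D
    prefix          -- base name for the output files
    buffer_angstrom -- minimum solute-to-boundary distance (8 A in the protocol)
    extra_lines     -- additional tLeap commands, e.g. loading a library for
                       the protonated residues (not specified in the protocol)
    """
    lines = [
        "source leaprc.DNA.bsc1",     # BSC1 DNA parameters
        "source leaprc.RNA.OL3",      # BSC0 + OL3 RNA parameters
        "source leaprc.water.tip3p",  # TIP3P water and monovalent ions
        *extra_lines,
        f"mol = loadPdb {pdb_in}",
        # Octahedral water box with the requested buffer around the solute
        f"solvateOct mol TIP3PBOX {buffer_angstrom:.1f}",
        # A count of 0 asks tLeap to add just enough Na+ to neutralize
        "addIons mol Na+ 0",
        f"saveAmberParm mol {prefix}.prmtop {prefix}.inpcrd",
        f"savePdb mol {prefix}.pdb",
        "quit",
    ]
    return "\n".join(lines) + "\n"
```

      Writing the returned text to a file such as `tleap.in` and running `tleap -f tleap.in` would then produce the topology, coordinate, and complete PDB files mentioned in step 1. With an 8 Å buffer on every side, periodic images of the solute are at least 16 Å apart, which exceeds the 9 Å cutoff required in step 3.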

Notes

  1. Models built from either B-DNA or A-DNA initial structures will converge to the same structure provided that the triplex is stable (unstable triplexes fall apart). The duplex part of a given triple helix is somewhere between that of B- and A-DNA structures.

  2. When changing a protonated DNA residue to a protonated RNA residue, check the position of the O2’ atom on the sugar ring.

  3. Models built up with different hydrogen bond constraints will converge to a similar structure after an unrestrained MD simulation provided that the sequence is stable. The initial hydrogen bond constraint is just a way to keep the third strand attached to the duplex part.

  4. While we use AMBER ver. 20 (Case et al., 2020) for our MD runs, these may also be performed using other simulation packages such as NAMD (Nelson et al., 1996).
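
  The example constraint file in Procedure E, step 5, follows a regular pattern: each hydrogen bond contributes one line for the hydrogen-to-acceptor distance (1.9 Å) and one line for the distance between the two heavy atoms (2.9 Å). The Python sketch below generates such line pairs. It is only an illustration: the helper names are hypothetical, and the two trailing numbers are assumed to be the lower and upper distance bounds, which are equal in the example file.

```python
def restraint_line(res_a, atom_a, res_b, atom_b, distance):
    """Format one line: residue number, residue name, atom (twice), bounds."""
    (num_a, name_a), (num_b, name_b) = res_a, res_b
    return (f"{num_a} {name_a} {atom_a} {num_b} {name_b} {atom_b} "
            f"{distance:.1f} {distance:.1f}")


def hbond_restraints(donor_res, hydrogen, donor_atom, acceptor_res, acceptor_atom,
                     h_dist=1.9, heavy_dist=2.9):
    """Return the two restraint lines that describe one hydrogen bond."""
    return [
        # hydrogen ... acceptor distance
        restraint_line(donor_res, hydrogen, acceptor_res, acceptor_atom, h_dist),
        # donor heavy atom ... acceptor distance
        restraint_line(donor_res, donor_atom, acceptor_res, acceptor_atom, heavy_dist),
    ]
```

  For example, `hbond_restraints((1, "DG5"), "H21", "N2", (18, "DC3"), "O2")` returns the lines `1 DG5 H21 18 DC3 O2 1.9 1.9` and `1 DG5 N2 18 DC3 O2 2.9 2.9`, both of which appear in the example file.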

Acknowledgments

Funding was provided by National Institutes of Health (NIH) grant R01GM118508. The original paper behind this work is Zhang et al. (2020).

Competing interests

There are no conflicts of interest or competing interests.

References

  1. Caburet, S., Cocquet, J., Vaiman, D. and Veitia, R. A. (2005). Coding repeats and evolutionary “agility”. Bioessays 27(6): 581-587.
  2. Case, D. A., Belfon, K., Ben-Shalom, I. Y., Brozell, S. R., Cerutti, D. S., Cheatham, T. E., et al. (2020). Amber20. University of California: San Francisco, CA, USA.
  3. Devi, G., Zhou, Y., Zhong, Z., Toh, D. F. K. and Chen, G. (2015). RNA triplexes: from structural principles to biological and biotech applications. Wiley Interdiscip Rev RNA 6(1): 111-128.
  4. Ellegren, H. (2004). Microsatellites: simple sequences with complex evolution. Nat Rev Genet 5(6): 435-445.
  5. Felsenfeld, G., Davies, D. R. and Rich, A. (1957). Formation of a three-stranded polynucleotide molecule. J Am Chem Soc 79(8): 2023-2024.
  6. Grabczyk, E. and Usdin, K. (2000). The GAA•TTC triplet repeat expanded in Friedreich’s ataxia impedes transcription elongation by T7 RNA polymerase in a length and supercoil dependent manner. Nucleic Acids Res 28(14): 2815-2822.
  7. Ivani, I., Dans, P. D., Noy, A., Pérez, A., Faustino, I., Hospital, A., et al. (2016). Parmbsc1: a refined force field for DNA simulations. Nat Methods 13(1): 55.
  8. Jorgensen, W. L., Chandrasekhar, J., Madura, J. D., Impey, R. W. and Klein, M. L. (1983). Comparison of simple potential functions for simulating liquid water. J Chem Phys 79(2): 926-935.
  9. Joung, I. S. and Cheatham III, T. E. (2008). Determination of alkali and halide monovalent ion parameters for use in explicitly solvated biomolecular simulations. J Phys Chem B 112(30): 9020-9041.
  10. Kaji, E. H. and Leiden, J. M. (2001). Gene and stem cell therapies. Jama 285(5): 545-550.
  11. McMurray, C. T. (1999). DNA secondary structure: a common and causative factor for expansion in human disease. Proc Natl Acad Sci U S A 96(5): 1823-1825.
  12. Mirkin, S. M. (2006). DNA structures, repeat expansions and human hereditary disorders. Curr Opin Struct Biol 16(3): 351-358.
  13. Mirkin, S. M. (2007). Expandable DNA repeats and human disease. Nature 447(7147): 932-940.
  14. Nelson, M. T., Humphrey, W., Gursoy, A., Dalke, A., Kalé, L. V., Skeel, R. D. and Schulten, K. (1996). NAMD: a parallel, object-oriented molecular dynamics program. Int J Supercomput Appl High Perform Comput 10(4): 251-268.
  15. Orr, H. T., and Zoghbi, H. Y. (2007). Trinucleotide repeat disorders. Annu Rev Neurosci 30: 575-621.
  16. Pan, F., Man, V. H., Roland, C., and Sagui, C. (2017). Structure and dynamics of DNA and RNA double helices of CAG and GAC trinucleotide repeats. Biophys J 113(1): 19-36.
  17. Pan, F., Zhang, Y., Man, V. H., Roland, C. and Sagui, C. (2018). E-motif formed by extrahelical cytosine bases in DNA homoduplexes of trinucleotide and hexanucleotide repeats. Nucleic Acids Res 46(2): 942-955.
  18. Pan, F., Man, V. H., Roland, C. and Sagui, C. (2018). Structure and dynamics of DNA and RNA double helices obtained from the CCG and GGC trinucleotide repeats. J Phys Chem B 122(16): 4491-4512.
  19. Pérez, A., Marchán, I., Svozil, D., Sponer, J., Cheatham III, T. E., Laughton, C. A. and Orozco, M. (2007). Refinement of the AMBER force field for nucleic acids: improving the description of α/γ conformers. Biophys J 92(11): 3817-3829.
  20. Seidman, M. M. and Glazer, P. M. (2003). The potential for gene repair via triple helix formation. J Clin Invest 112(4): 487-494.
  21. Šponer, J. and Filip, L. (2006). Computational Studies of RNA and DNA. Vol. 2. Springer Science & Business Media.
  22. Szewczak, A. A., Ortoleva-Donnelly, L., Ryder, S. P., Moncoeur, E. and Strobel, S. A. (1998). A minor groove RNA triple helix within the catalytic core of a group I intron. Nat Struct Biol 5(12): 1037-1042.
  23. Subramanian, S., Madgula, V. M., George, R., Mishra, R. K., Pandit, M. W., Kumar, C. S., and Singh, L. (2003). Triplet repeats in human genome: distribution and their association with genes and other genomic regions. Bioinformatics 19(5): 549-552.
  24. Tóth, G., Gáspári, Z., and Jurka, J. (2000). Microsatellites in different eukaryotic genomes: survey and analysis. Genome Res 10(7): 967-981.
  25. Visualizer, D. S. (2005). Discovery Studio Visualizer. 2. Accelrys Software Inc.
  26. Weiner, S. J., Kollman, P. A., Nguyen, D. T. and Case, D. A. (1986). An all atom force field for simulations of proteins and nucleic acids. J Comput Chem 7(2): 230-252.
  27. Wells, R. D., Dere, R., Hebert, M. L., Napierala, M. and Son, L. S. (2005). Advances in mechanisms of genetic instability related to hereditary neurological diseases. Nucleic Acids Res 33(12): 3785-3798.
  28. Wells, R. D. and Ashizawa, T. (Eds.). (2011). Genetic instabilities and neurological diseases. (Vol. 31). Elsevier.
  29. Xu, P., Pan, F., Roland, C., Sagui, C. and Weninger, K. (2020). Dynamics of strand slippage in DNA hairpins formed by CAG repeats: roles of sequence parity and trinucleotide interrupts. Nucleic Acids Res 48(5): 2232-2245.
  30. Zgarbová, M., Otyepka, M., Šponer, J., Mládek, A., Banáš, P., Cheatham III, T. E., and Jurecka, P. (2011). Refinement of the Cornell et al. nucleic acids force field based on reference quantum chemical calculations of glycosidic torsion profiles. J Chemical Theory Comput 7(9): 2886-2902.
  31. Zhang, J., Fakharzadeh, A., Pan, F., Roland, C., and Sagui, C. (2020). Atypical structures of GAA/TTC trinucleotide repeats underlying Friedreich’s ataxia: DNA triplexes and RNA/DNA hybrids. Nucleic Acids Res 48(17): 9899-9917.
  32. Zhang, Y., Roland, C. and Sagui, C. (2017a). Structure and dynamics of DNA and RNA double helices obtained from the GGGGCC and CCCCGG hexanucleotide repeats that are the hallmark of C9FTD/ALS diseases. ACS Chem Neurosci 8(3): 578-591.
  33. Zhang, Y., Roland, C. and Sagui, C. (2017b). Structural and dynamical characterization of DNA and RNA quadruplexes obtained from the GGGGCC and GGGCT hexanucleotide repeats associated with C9FTD/ALS and SCA36 diseases. ACS Chem Neurosci 9(5): 1104-1117.

Copyright: © 2021 The Authors; exclusive licensee Bio-protocol LLC.
Citation: Zhang, J., Fakharzadeh, A., Pan, F., Roland, C. and Sagui, C. (2021). Construction of DNA/RNA Triplex Helices Based on GAA/TTC Trinucleotide Repeats. Bio-protocol 11(18): e4155. DOI: 10.21769/BioProtoc.4155.