Genomics - Systems Biology - Bio-protocol

Computational Workflow for Genome-Wide DNA Methylation Profiling and Differential Methylation Analysis

Pei-Yu Lin, Guan-Jun Lin, Kuan-Lin Chen, Shiang-Chin Huang

0 Q&A 2219 Views Nov 5, 2025

DNA methylation is a crucial epigenetic modification that influences gene expression and plays a role in various biological processes. High-throughput sequencing techniques, such as bisulfite sequencing (BS-seq) and enzymatic methyl sequencing (EM-seq), enable genome-wide profiling of DNA methylation patterns at single-base resolution. In this protocol, we present a bioinformatics pipeline for analyzing genome-wide DNA methylation. We outline the essential analyses step by step: quality control of raw BS-seq and EM-seq reads using FastQC, read alignment with commonly used aligners such as Bowtie2 and BS-Seeker2, DNA methylation calling to generate CGmap files, identification of differentially methylated regions (DMRs) using tools including MethylC-analyzer and HOME, data visualization, and post-alignment analyses. Compared to existing workflows, this pipeline integrates multiple steps into a single protocol, lowering the technical barrier, improving reproducibility, and offering flexibility for both plant and animal methylome studies. To illustrate the application of BS-seq and EM-seq, we present a case study of an Arabidopsis thaliana mutant carrying a mutation in the met1 gene, which encodes a DNA methyltransferase; the mutation results in global CG hypomethylation and altered gene regulation. This example highlights the biological insights that can be gained through systematic methylome analysis. Our workflow is adaptable to any organism with a reference genome and provides a robust framework for uncovering methylation-associated regulatory mechanisms. All scripts and detailed instructions are provided in the GitHub repository: https://github.com/PaoyangLab/Methylation_Analysis.
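
As a rough illustration of what a methylation call file allows, the sketch below summarizes methylation per sequence context (CG, CHG, CHH) and compares two samples. It is not taken from the protocol's repository. It assumes the tab-separated CGmap column layout commonly produced by BS-Seeker2 (chromosome, nucleotide, position, context, dinucleotide, methylation level, methylated read count, total read count); the function name `context_methylation`, the `min_depth` cutoff, and the toy records are hypothetical.

```python
from collections import defaultdict


def context_methylation(cgmap_lines, min_depth=4):
    """Return {context: methylated reads / total reads} from CGmap-style lines.

    Assumed columns (tab-separated): chrom, nucleotide, position, context,
    dinucleotide, methylation level, methylated count, total count.
    Sites covered by fewer than `min_depth` reads are skipped.
    """
    methylated = defaultdict(int)
    total = defaultdict(int)
    for line in cgmap_lines:
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 8:
            continue  # skip malformed or empty lines
        context = fields[3]
        mc, depth = int(fields[6]), int(fields[7])
        if depth < min_depth:
            continue
        methylated[context] += mc
        total[context] += depth
    return {ctx: methylated[ctx] / total[ctx] for ctx in total}


# Toy records standing in for a wild-type and a mutant sample (made-up values).
wild_type = [
    "chr1\tC\t101\tCG\tCG\t0.90\t9\t10",
    "chr1\tG\t102\tCG\tCG\t0.80\t8\t10",
    "chr1\tC\t150\tCHH\tCA\t0.10\t1\t10",
    "chr1\tC\t160\tCHG\tCA\t0.00\t0\t2",  # dropped: depth below min_depth
]
mutant = [
    "chr1\tC\t101\tCG\tCG\t0.10\t1\t10",
    "chr1\tG\t102\tCG\tCG\t0.00\t0\t10",
    "chr1\tC\t150\tCHH\tCA\t0.10\t1\t10",
]

wt_levels = context_methylation(wild_type)
mut_levels = context_methylation(mutant)

# Difference in methylation level (mutant minus wild type) per shared context.
delta = {ctx: mut_levels[ctx] - wt_levels[ctx] for ctx in wt_levels if ctx in mut_levels}
print(delta)
```

Pooling read counts per context weights each site by its coverage. Dedicated DMR callers such as those named in the abstract add windowing and statistical testing on top of this kind of per-site summary.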

Simultaneous Capture of Chromatin-Associated RNA and Global RNA–RNA Interactions With Reduced Input Requirements

Cheng Ding, Guoting Chen, Shiping Luan, Yuanyuan Gong, Cuilin Gui, Chen Yang, Zihe Xiang, Junjie Du, Mohamed F. Foda, Jiapei Yan*, Xingwang Li*

0 Q&A 3600 Views Sep 5, 2025

Chromatin-associated RNAs (caRNAs) have been increasingly recognized as key regulators of gene expression and genome architecture. A few technologies, such as ChRD-PET and RedChIP, have emerged to assess protein-mediated RNA–chromatin interactions, but each has limitations. Here, we describe the TaDRIM-seq (targeted DNA-associated RNA and RNA–RNA interaction mapping by sequencing) technique, which combines Protein G (PG)-Tn5-targeted DNA tagmentation with in situ proximity ligation to simultaneously profile caRNAs across genomic regions and capture global RNA–RNA interactions within intact nuclei. This approach reduces the required cell input, shortens the experimental duration compared to existing protocols, and is applicable to both mammalian and plant systems.

Protocol for Generation of Single-Gene Knockout in Hard-to-Transfect THP1 Cell Lines Using CRISPR/Cas9

Kaveri Srivastava, Bhaswati Pandit*

0 Q&A 3564 Views Jul 5, 2025

This protocol provides a step-by-step approach for generating single-gene knockouts in hard-to-transfect suspension immune cell lines such as THP1, demonstrated here by knocking out the GSDMD gene. By employing the CRISPR-Cas9 system delivered via lentivirus, this protocol enables precise gene disruption through targeted single-guide RNAs (sgRNAs). Key steps include designing specific sgRNAs, cloning them into a CRISPR vector, viral packaging, and transducing the target cells, followed by selection and validation. This optimized protocol is particularly useful for functional studies in immune cells, allowing researchers to reliably explore gene function in complex cellular pathways.
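
The sgRNA design step can be pictured with a minimal sketch that scans a target sequence for candidate protospacers. It assumes a Cas9 that recognizes an NGG PAM immediately 3' of a 20-nt protospacer (as SpCas9 does); the abstract does not specify the enzyme or a design tool, so `find_protospacers` and the toy sequence are hypothetical, and real designs also need off-target and efficiency scoring.

```python
import re


def revcomp(seq):
    """Reverse complement of an uppercase DNA string."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def find_protospacers(seq, length=20):
    """List (strand, start, protospacer) for every site followed by an NGG PAM.

    `start` is the 0-based offset on the scanned strand; for "-" hits it is
    measured on the reverse complement of the input sequence.
    """
    seq = seq.upper()
    # Lookahead so that overlapping candidate sites are all reported.
    pattern = re.compile(r"(?=([ACGT]{%d})[ACGT]GG)" % length)
    hits = []
    for strand, strand_seq in (("+", seq), ("-", revcomp(seq))):
        for match in pattern.finditer(strand_seq):
            hits.append((strand, match.start(), match.group(1)))
    return hits


# Toy target (not a real GSDMD sequence): one 20-nt site followed by AGG.
toy_target = "GATTACAGATTACAGATTAC" + "AGG" + "TTTT"
candidates = find_protospacers(toy_target)
print(candidates)
```

Scanning both strands matters because a PAM on either strand can direct cleavage at the same locus.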

A Comprehensive Protocol for Bayesian Phylogenetic Analysis Using MrBayes: From Sequence Alignment to Model Selection and Phylogenetic Inference

Jinxing Wang, Fangmin Chen*, Xu Xiao, Xinyao Yang, Wanting Xia

0 Q&A 2130 Views Apr 20, 2025

Bayesian phylogenetic analysis is essential for elucidating evolutionary relationships among organisms. Traditional methods often rely on fixed models and manual parameter settings, which can limit accuracy and efficiency. This protocol presents an integrated workflow that leverages GUIDANCE2 for rigorous sequence alignment, ProtTest and MrModeltest for robust model selection, and MrBayes for phylogenetic tree estimation through Bayesian inference. By automating key steps and providing detailed command-line instructions, this protocol enhances the reliability and reproducibility of phylogenetic studies.
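
Model selection tools of this kind typically rank candidate substitution models with information criteria. The sketch below shows the arithmetic behind two common criteria, AIC = 2k - 2 ln L and BIC = k ln(n) - 2 ln L, where k is the number of free parameters and n the number of alignment sites. The log-likelihoods, parameter counts, and the helper `rank_models` are made-up illustrations, not output from ProtTest or MrModeltest.

```python
import math


def aic(log_likelihood, n_params):
    """Akaike information criterion: lower is better."""
    return 2 * n_params - 2 * log_likelihood


def bic(log_likelihood, n_params, n_sites):
    """Bayesian information criterion: penalizes parameters by ln(sample size)."""
    return n_params * math.log(n_sites) - 2 * log_likelihood


def rank_models(candidates, n_sites, criterion="bic"):
    """Return model names sorted from best (lowest score) to worst."""
    scored = []
    for name, (log_likelihood, n_params) in candidates.items():
        if criterion == "bic":
            score = bic(log_likelihood, n_params, n_sites)
        else:
            score = aic(log_likelihood, n_params)
        scored.append((score, name))
    return [name for _, name in sorted(scored)]


# Hypothetical (log-likelihood, free substitution parameters) per model.
# Branch lengths are left out of k because they are shared by all candidates.
candidates = {
    "JC": (-5200.0, 0),
    "HKY+G": (-5050.0, 5),
    "GTR+I+G": (-5044.0, 10),
}

best_by_aic = rank_models(candidates, n_sites=1000, criterion="aic")[0]
best_by_bic = rank_models(candidates, n_sites=1000, criterion="bic")[0]
print(best_by_aic, best_by_bic)
```

With these toy numbers the two criteria disagree: AIC accepts the extra parameters of the richer model, while BIC's stronger penalty favors the simpler one. This is why the criterion used should be reported alongside the chosen model.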

GWAS Procedures for Gene Mapping in Diverse Populations With Complex Structures

Zhen Zuo, Mingliang Li, Defu Liu, Qi Li, Bin Huang, Guanshi Ye, Jiabo Wang, You Tang*, Zhiwu Zhang*

0 Q&A 3378 Views Apr 20, 2025

As genotyping costs fall, genome-wide association studies (GWAS) are increasingly applied to diverse populations with complex structures, which makes mapping genes of interest more challenging. Complex structure demands sophisticated statistical models, and increased marker density and population size require efficient computing tools. Many statistical models and computing tools have been developed, and they differ in statistical power, computing efficiency, and ease of use. Some statistical models were developed with dedicated computing tools, such as efficient mixed model analysis (EMMA), multiple loci mixed model (MLMM), fixed and random model circulating probability unification (FarmCPU), and Bayesian-information and linkage-disequilibrium iteratively nested keyway (BLINK). Other computing tools (e.g., GAPIT) implement multiple statistical models behind a consistent user interface and continue to improve the interpretation of input data and results. In this study, we developed a protocol that uses a minimal set of software tools (BEAGLE, BLINK, and GAPIT) to perform a variety of analyses, including file format conversion, missing genotype imputation, GWAS, and interpretation of input data and results. We demonstrated the protocol by reanalyzing data from the Rice 3000 Genomes Project and highlighting advancements in GWAS model development.
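
Two routine calculations surround any GWAS run: summarizing marker quality before the scan and setting a multiple-testing threshold afterwards. The sketch below illustrates both with minor allele frequency (MAF) and a Bonferroni cutoff. It does not reproduce BEAGLE, BLINK, or GAPIT behavior; the genotype coding (0/1/2 alternate-allele counts, `None` for missing), the function names, and the p-values are assumptions made for illustration.

```python
import math


def minor_allele_frequency(genotypes):
    """MAF from alternate-allele counts (0, 1, 2); None marks a missing call."""
    called = [g for g in genotypes if g is not None]
    if not called:
        return 0.0
    alt_freq = sum(called) / (2 * len(called))
    return min(alt_freq, 1 - alt_freq)


def bonferroni_threshold(n_markers, alpha=0.05):
    """Per-marker p-value cutoff that keeps the family-wise error rate at alpha."""
    return alpha / n_markers


def significant_markers(p_values, alpha=0.05):
    """Return the markers whose p-value falls below the Bonferroni cutoff."""
    threshold = bonferroni_threshold(len(p_values), alpha)
    return {marker: p for marker, p in p_values.items() if p < threshold}


# Toy data: one marker's genotypes across five individuals, one call missing.
maf = minor_allele_frequency([0, 1, 2, None, 0])

# Hypothetical association p-values for four markers.
p_values = {"snp1": 0.001, "snp2": 0.02, "snp3": 0.5, "snp4": 0.0124}
hits = significant_markers(p_values)

# The same cutoff on the -log10 scale used for Manhattan plots.
cutoff_line = -math.log10(bonferroni_threshold(len(p_values)))
print(maf, sorted(hits), round(cutoff_line, 3))
```

Bonferroni correction is conservative when markers are in linkage disequilibrium, so it is best read as a strict upper bound on the cutoff, not as the only defensible choice.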

Improved Extraction Methods to Isolate High Molecular Weight DNA From Magnaporthaceae and Other Grass Root Fungi for Long-Read Whole Genome Sequencing

Michelle J. Grey*, Jackie Freeman, Jason Rudd, Naomi Irish, Gail Canning, Tania Chancellor, Javier Palma-Guerrero, Rowena Hill, Neil Hall, Kim E. Hammond-Kosack, Mark McMullan*

0 Q&A 3268 Views Mar 20, 2025

This manuscript details two modified protocols for the isolation of long-stranded or high molecular weight (HMW) DNA from Magnaporthaceae (Ascomycota) fungal mycelium intended for whole genome sequencing. The Cytiva Nucleon PhytoPure and the Macherey-Nagel NucleoBond HMW DNA kits were selected because the former requires lower amounts of starting material and the latter utilizes gentler methods to maximize DNA length, albeit at a higher requirement for input material. The Cytiva Nucleon PhytoPure kit successfully recovered HMW DNA for half of our fungal species by increasing the amount of RNase A treatment and adding in a proteinase K treatment. To reduce the impact of pigmentation development, which occurs toward later stages of culturing, extractions were run in quadruplicate to increase overall DNA concentration. We also adapted the Macherey-Nagel NucleoBond HMW DNA kit for high-quality HMW DNA by grinding the sample to a fine powder, overnight lysis, and splitting the sample before washing the precipitated DNA. For both kits, precipitated DNA was spooled out pre-washing, ensuring a higher percentage of high-integrity long strands. The Macherey-Nagel protocol offers advantages over the first through the utilization of gravity columns that provide gentler treatment, yielding >50% of high-purity DNA strands exceeding 40 kbp. The limitation of this method is the requirement for a large quantity of starting material (1 g). By triaging samples based on the rate of growth relative to the accumulation of secondary metabolites, our methodologies hold promise for yielding reliable and high-quality HMW DNA from a variety of fungal samples, improving sequencing outcomes.

PCR-Based Genotyping of Zebrafish Genetic Mutants

Swathy Babu*, Yuko Nishiwaki, Ichiro Masai*

0 Q&A 2298 Views Mar 20, 2025

Zebrafish genetic mutants have emerged as a valuable model system for studying various aspects of disease and developmental biology. Mutant zebrafish embryos are generally identified based on phenotypic defects at later developmental stages, making it difficult to investigate underlying molecular mechanisms at earlier stages. This protocol presents a PCR-based genotyping method that enables the identification of wild-type, heterozygous, and homozygous zebrafish genetic mutants at any developmental stage, even when they are phenotypically indistinguishable. The approach involves the amplification of specific genomic regions using carefully designed primers, followed by gel electrophoresis. This genotyping method facilitates the investigation of the molecular mechanisms driving phenotypic defects that are observed at later timepoints. This protocol allows researchers to perform analyses such as immunofluorescence, RT-PCR, RNA sequencing, and other molecular experiments on early developmental stages of mutants. The availability of this protocol expands the utility of zebrafish genetic mutants for elucidating the molecular underpinnings of various biological processes throughout development.

Annotated Bioinformatic Pipelines for Genome Assembly and Annotation of Mitochondrial Genomes

Jessica C. Winn*, Aletta E. Bester-van der Merwe, Simo N. Maduna*

0 Q&A 2737 Views Mar 5, 2025

Mitochondrial genomes (mitogenomes) display relatively rapid mutation rates, low sequence recombination, high copy numbers, and maternal inheritance patterns, rendering them valuable blueprints for mapping lineages, uncovering historical migration patterns, understanding intraspecific population dynamics, and investigating how environmental pressures shape traits underpinned by genetic variation. Here, we present the bioinformatic pipeline and code used to assemble and annotate the complete mitogenomes of five houndsharks (Chondrichthyes: Triakidae) and compare them to the mitogenomes of other closely related species. We demonstrate the value of a combined assembly approach for detecting deviations in mitogenome structure and describe how to select an assembly approach that best suits the sequencing data. The datasets required to run our analyses are available on the GitHub and Dryad repositories.

Annotated Bioinformatic Pipelines for Phylogenomic Placement of Mitochondrial Genomes

Jessica C. Winn*, Aletta E. Bester-van der Merwe, Simo N. Maduna*

0 Q&A 2382 Views Mar 5, 2025

The limited standards for the rigorous and objective use of mitochondrial genomes (mitogenomes) can lead to uncertainties regarding the phylogenetic relationships of taxa under varying evolutionary constraints. The mitogenome exhibits heterogeneity in base composition, and evolutionary rates may vary across different regions, which can cause empirical data to violate assumptions of the applied evolutionary models. Consequently, the unique evolutionary signatures of the dataset must be carefully evaluated before selecting an appropriate approach for phylogenomic inference. Here, we present the bioinformatic pipeline and code used to expand the mitogenome phylogeny of the order Carcharhiniformes (groundsharks), with a focus on houndsharks (Chondrichthyes: Triakidae). We present a rigorous approach for addressing difficult-to-resolve phylogenies, incorporating multispecies coalescent modelling (MSCM) to address gene/species tree discordance. The protocol describes carefully designed approaches for preparing alignments, partitioning datasets, assigning models of evolution, inferring phylogenies based on traditional site-homogeneous concatenation approaches as well as under multispecies coalescent and site-heterogeneous models, and generating statistical data for comparison of different topological outcomes. The datasets required to run our analyses are available on the GitHub and Dryad repositories.

Phylogenomics of Plant NLR Immune Receptors to Identify Functionally Conserved Sequence Motifs

Toshiyuki Sakai, AmirAli Toghani, Hiroaki Adachi*

0 Q&A 3128 Views Jul 5, 2024

In recent years, the increase in genome sequencing across diverse plant species has provided a significant advantage for phylogenomics studies, allowing the analysis of one of the most diverse gene families in plants: nucleotide-binding leucine-rich repeat receptors (NLRs). However, due to the sequence diversity of the NLR gene family, identifying key molecular features and functionally conserved sequence patterns is challenging through multiple sequence alignment. Here, we present a step-by-step protocol for a computational pipeline designed to identify evolutionarily conserved motifs in plant NLR proteins. In this protocol, we use a large-scale NLR dataset, including 1,862 NLR genes annotated from monocot and dicot species, to predict conserved sequence motifs, such as the MADA and EDVID motifs, within the coiled-coil (CC)-NLR subfamily. Our pipeline can be applied to identify molecular signatures that have remained conserved in the gene family over evolutionary time across plant species.
