Bioinformatics and Computational Biology


Category

Current Issue
0 Q&A 276 Views Aug 5, 2025

Thousands of RNAs are localized to specific subcellular locations, and these localization patterns are often required for optimal cell function. However, the sequences within RNAs that direct their transport are unknown for almost all localized transcripts. Similarly, the RNA content of most subcellular locations remains unknown. To facilitate the study of subcellular transcriptomes, we developed the RNA proximity labeling method OINC-seq. OINC-seq utilizes photoactivatable, spatially restricted RNA oxidation to specifically label RNA in proximity to a subcellularly localized bait protein. After labeling, these oxidative RNA marks are then read out via high-throughput sequencing due to their ability to induce predictable misincorporation events by reverse transcriptase. These induced mutations are then quantitatively assessed for each gene using our software package PIGPEN. The observed mutation rate for a given RNA species is therefore related to its proximity to the localized bait protein. This protocol describes procedures for assaying RNA localization via OINC-seq experiments as well as computational procedures for analyzing the resulting data using PIGPEN.
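The per-gene quantification described above can be illustrated with a minimal sketch. This is not PIGPEN's implementation or interface: the function names, the `(gene, mismatches, bases_covered)` record layout, and the labeled-versus-control comparison are all assumptions made for illustration only.

```python
from collections import defaultdict

def per_gene_mutation_rate(records):
    """Aggregate per-read counts into a per-gene mutation rate.

    `records` is a hypothetical iterable of (gene, mismatches, bases_covered)
    tuples, one per aligned read. The rate is total mismatches divided by
    total covered bases for each gene.
    """
    mismatches = defaultdict(int)
    covered = defaultdict(int)
    for gene, n_mismatch, n_bases in records:
        mismatches[gene] += n_mismatch
        covered[gene] += n_bases
    return {g: mismatches[g] / covered[g] for g in covered if covered[g] > 0}

def enrichment(labeled_rates, control_rates, pseudo=1e-6):
    """Ratio of labeled to control mutation rate for genes present in both.

    The pseudocount (an assumption of this sketch) avoids division by zero.
    """
    return {
        g: (labeled_rates[g] + pseudo) / (control_rates[g] + pseudo)
        for g in labeled_rates
        if g in control_rates
    }

# Toy input: two reads for geneA, one read for geneB.
labeled = per_gene_mutation_rate([
    ("geneA", 3, 100),
    ("geneA", 1, 100),
    ("geneB", 0, 200),
])
control = per_gene_mutation_rate([
    ("geneA", 1, 200),
    ("geneB", 0, 200),
])
ratios = enrichment(labeled, control)
```

In this toy input, a gene whose mutation rate is higher in the labeled sample than in the control would be read as lying closer to the bait protein, which is the relationship the abstract describes.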

Past Issues
0 Q&A 353 Views Jul 20, 2025

The root meristem navigates the highly variable soil environment, where limited water availability restricts water absorption, slowing or halting growth. Traditional studies use uniform high osmotic potentials, which poorly represent natural conditions where roots gradually encounter increasing osmotic potentials. Uniform high osmotic potentials reduce root growth by inhibiting cell division and shortening mature cell length. This protocol describes a simple and effective in vitro system using a gradient mixer that generates a vertical gradient in an agar gel based on the principle of communicating vessels, exploiting gravity to generate a continuous mannitol concentration gradient (from 0 to 400 mM mannitol) reaching osmotic potentials of -1.2 MPa. It enables long-term Arabidopsis root growth analysis under progressive water deficit, improving phenotyping and molecular studies in soil-like conditions.

0 Q&A 675 Views Jul 20, 2025

Transcriptional pausing dynamically regulates spatiotemporal gene expression during cellular differentiation, development, and environmental adaptation. Precise measurement of pausing duration, a critical parameter in transcriptional control, has been challenging due to limitations in resolution and confounding factors. We introduce Fast TV-PRO-seq, an optimized protocol built on time-variant precision run-on sequencing (TV-PRO-seq), which enables genome-wide, single-base resolution mapping of RNA polymerase II pausing times. Unlike standard PRO-seq, Fast TV-PRO-seq employs sarkosyl-free biotin-NTP run-on with time gradients and integrates on-bead enzymatic reactions to streamline workflows. Key improvements include (1) reducing experimental time from 4 to 2 days, (2) reducing cell input requirements, and (3) improving process efficiency and simplifying command-line operations through the use of bash scripts.

0 Q&A 325 Views Jul 5, 2025

Since the creation of the Global Polio Eradication Initiative (GPEI) in 1988, significant progress has been made toward attaining a poliovirus-free world. This has resulted in the eradication of wild poliovirus (WPV) serotypes two (WPV2) and three (WPV3) and limited transmission of serotype one (WPV1) in Pakistan and Afghanistan. However, the increased emergence of circulating vaccine-derived poliovirus (cVDPV) and the continued circulation of WPV1, although limited to two countries, pose a continuous threat of international spread of poliovirus. These challenges highlight the need to further strengthen surveillance and outbreak responses, particularly in the African Region (AFRO). Phylogeographic visualization tools may provide insights into changes in poliovirus epidemiology, which can in turn guide the implementation of more strategic and effective supplementary immunization activities and improved outbreak response and surveillance. We created a comprehensive protocol for the phylogeographic analysis of polioviruses using Nextstrain, a powerful open-source tool for real-time interactive visualization of virus sequencing data. It is expected that this protocol will support poliovirus elimination strategies in AFRO and contribute significantly to global eradication strategies. These tools have been utilized for other pathogens of public health importance, for example, SARS-CoV-2, human influenza, Ebola, and Mpox, among others, through real-time tracking of pathogen evolution (https://nextstrain.org), harnessing the scientific and public health potential of pathogen genome data.

0 Q&A 327 Views Jul 5, 2025

The complexity of the human transcriptome poses significant challenges for complete annotation. Traditional RNA-seq, often limited by sensitivity and short read lengths, is frequently inadequate for identifying low-abundance transcripts and resolving complex populations of transcript isoforms. Direct long-read sequencing, while offering full-length information, suffers from throughput limitations, hindering the capture of low-abundance transcripts. To address these challenges, we introduce a targeted RNA enrichment strategy, rapid amplification of cDNA ends coupled with Nanopore sequencing (RACE-Nano-Seq). This method unravels the deep complexity of transcripts containing anchor sequences, which are specific regions of interest that might be exons of annotated genes, in silico predicted exons, or other sequences. RACE-Nano-Seq is based on inverse PCR with primers targeting these anchor regions to enrich the corresponding transcripts in both 5' and 3' directions. This method can be scaled for high-throughput transcriptome profiling by using multiplexing strategies. Through targeted RNA enrichment and full-length sequencing, RACE-Nano-Seq enables accurate and comprehensive profiling of low-abundance transcripts, often revealing complex transcript profiles at the targeted loci, both annotated and unannotated.

0 Q&A 1146 Views Apr 20, 2025

As genotyping costs fall, genome-wide association studies (GWAS) are increasingly applied to diverse populations with complex structures, which makes mapping genes of interest more challenging. The complex structure demands sophisticated statistical models, and increased marker density and population size require efficient computing tools. Many statistical models and computing tools have been developed that vary in statistical power, computing efficiency, and user-friendliness. Some statistical models were developed with dedicated computing tools, such as efficient mixed model analysis (EMMA), multiple loci mixed model (MLMM), fixed and random model circulating probability unification (FarmCPU), and Bayesian-information and linkage-disequilibrium iteratively nested keyway (BLINK). However, there are computing tools (e.g., GAPIT) that implement multiple statistical models, retain a constant user interface, and continue to improve the handling of input data and the interpretation of results. In this study, we developed a protocol utilizing a minimal set of software tools (BEAGLE, BLINK, and GAPIT) to perform a variety of analyses including file format conversion, missing genotype imputation, GWAS, and interpretation of input data and outcome results. We demonstrated the protocol by reanalyzing data from the Rice 3000 Genomes Project and highlighting advancements in GWAS model development.

0 Q&A 544 Views Apr 20, 2025

Bayesian phylogenetic analysis is essential for elucidating evolutionary relationships among organisms. Traditional methods often rely on fixed models and manual parameter settings, which can limit accuracy and efficiency. This protocol presents an integrated workflow that leverages GUIDANCE2 for rigorous sequence alignment, ProtTest and MrModeltest for robust model selection, and MrBayes for phylogenetic tree estimation through Bayesian inference. By automating key steps and providing detailed command-line instructions, this protocol enhances the reliability and reproducibility of phylogenetic studies.

0 Q&A 597 Views Mar 5, 2025

The limited standards for the rigorous and objective use of mitochondrial genomes (mitogenomes) can lead to uncertainties regarding the phylogenetic relationships of taxa under varying evolutionary constraints. The mitogenome exhibits heterogeneity in base composition, and evolutionary rates may vary across different regions, which can cause empirical data to violate assumptions of the applied evolutionary models. Consequently, the unique evolutionary signatures of the dataset must be carefully evaluated before selecting an appropriate approach for phylogenomic inference. Here, we present the bioinformatic pipeline and code used to expand the mitogenome phylogeny of the order Carcharhiniformes (groundsharks), with a focus on houndsharks (Chondrichthyes: Triakidae). We present a rigorous approach for addressing difficult-to-resolve phylogenies, incorporating multispecies coalescent modelling (MSCM) to address gene/species tree discordance. The protocol describes carefully designed approaches for preparing alignments, partitioning datasets, assigning models of evolution, inferring phylogenies based on traditional site-homogeneous concatenation approaches as well as under multispecies coalescent and site-heterogeneous models, and generating statistical data for comparison of different topological outcomes. The datasets required to run our analyses are available on GitHub and Dryad repositories.

0 Q&A 657 Views Mar 5, 2025

Mitochondrial genomes (mitogenomes) display relatively rapid mutation rates, low sequence recombination, high copy numbers, and maternal inheritance patterns, rendering them valuable blueprints for mapping lineages, uncovering historical migration patterns, understanding intraspecific population dynamics, and investigating how environmental pressures shape traits underpinned by genetic variation. Here, we present the bioinformatic pipeline and code used to assemble and annotate the complete mitogenomes of five houndsharks (Chondrichthyes: Triakidae) and compare them to the mitogenomes of other closely related species. We demonstrate the value of a combined assembly approach for detecting deviations in mitogenome structure and describe how to select an assembly approach that best suits the sequencing data. The datasets required to run our analyses are available on the GitHub and Dryad repositories.

0 Q&A 486 Views Mar 5, 2025

Non-small cell lung cancer (NSCLC) is the most common type of lung cancer. According to 2020 reports, 2.2 million cases are reported globally every year, with as many as 1.8 million deaths. To study NSCLC, systems biology offers mathematical modeling as a tool to understand complex pathways and provide insights into the identification of biomarkers and potential therapeutic targets, which aids precision therapy. Mathematical modeling, specifically with ordinary differential equations (ODEs), is used to better understand the dynamics of cancer growth and immunological interactions in the tumor microenvironment. This study highlighted the dual role of the cyclic GMP-AMP synthase–stimulator of interferon genes (cGAS-STING) pathway: its classical signaling regulates type I interferon (IFN-I) and pro-inflammatory responses to promote tumor regression through senescence and apoptosis, whereas alternative signaling, induced by nuclear factor kappa B (NF-κB), mutated tumor protein p53 (p53), and programmed death-ligand 1 (PD-L1), leads to tumor growth. We identified key regulators in cancer progression by simulating the model and validating it with local sensitivity analysis, principal component analysis, rate of flow of metabolites, and model reduction. Integration of multiple signaling axes revealed that cGAS-STING, phosphoinositide 3-kinases (PI3K), and Ak strain transforming (AKT) may be potential targets that can be validated for cancer therapy.
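To make the idea of ODE simulation with local sensitivity analysis concrete, here is a toy sketch. It is not the authors' cGAS-STING model: the single-variable logistic growth equation with a linear kill term, the parameter names `r`, `K`, and `k`, and all numeric values are assumptions chosen purely for illustration.

```python
def simulate(r, K, k, T0=1.0, t_end=50.0, dt=0.01):
    """Integrate dT/dt = r*T*(1 - T/K) - k*T with forward Euler.

    T is a hypothetical tumor burden, r a growth rate, K a carrying
    capacity, and k an immune-mediated kill rate. Returns T at t_end.
    """
    T = T0
    steps = int(round(t_end / dt))
    for _ in range(steps):
        dT = r * T * (1.0 - T / K) - k * T
        T += dt * dT
    return T

def local_sensitivity(params, name, rel_step=1e-3):
    """Normalized local sensitivity of the final T to one parameter.

    Estimated by a forward finite difference: the relative change in the
    output divided by the relative change in the parameter.
    """
    base = simulate(**params)
    bumped = dict(params)
    bumped[name] = params[name] * (1.0 + rel_step)
    perturbed = simulate(**bumped)
    return ((perturbed - base) / base) / rel_step

# Illustrative parameter values only (not taken from the study).
params = {"r": 0.3, "K": 100.0, "k": 0.1}
final_T = simulate(**params)
sens = {name: local_sensitivity(params, name) for name in params}
```

A parameter with a large absolute sensitivity is one whose small perturbation changes the simulated outcome strongly, which is the general reasoning by which such an analysis can flag candidate regulators. A realistic pathway model would have many coupled state variables and would typically use an adaptive ODE solver rather than fixed-step Euler.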

0 Q&A 461 Views Feb 5, 2025

Cellular communication relies on the intricate interplay of signaling molecules, which come together to form the cell–cell interaction (CCI) network that orchestrates tissue behavior. Researchers have shown that shallow neural networks can effectively reconstruct the CCI from the abundant molecular data captured in spatial transcriptomics (ST). However, in scenarios characterized by sparse connections and excessive noise within the CCI, shallow networks are often susceptible to inaccuracies, leading to suboptimal reconstruction outcomes. To achieve a more comprehensive and precise CCI reconstruction, we propose a novel method called triple-enhancement-based graph neural network (TENET). The TENET framework has been implemented and evaluated on both real and synthetic ST datasets. This protocol primarily introduces our network architecture and its implementation.