Systems Biology - BIO-PROTOCOL

Phylogenomics of Plant NLR Immune Receptors to Identify Functionally Conserved Sequence Motifs

Toshiyuki Sakai, AmirAli Toghani, Hiroaki Adachi*

Jul 5, 2024

In recent years, the increase in genome sequencing across diverse plant species has provided a significant advantage for phylogenomics studies, allowing the analysis of one of the most diverse gene families in plants: nucleotide-binding leucine-rich repeat receptors (NLRs). However, due to the sequence diversity of the NLR gene family, identifying key molecular features and functionally conserved sequence patterns is challenging through multiple sequence alignment. Here, we present a step-by-step protocol for a computational pipeline designed to identify evolutionarily conserved motifs in plant NLR proteins. In this protocol, we use a large-scale NLR dataset, including 1,862 NLR genes annotated from monocot and dicot species, to predict conserved sequence motifs, such as the MADA and EDVID motifs, within the coiled-coil (CC)-NLR subfamily. Our pipeline can be applied to identify molecular signatures that have remained conserved in the gene family over evolutionary time across plant species.
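To illustrate why conserved motifs stand out once sequences are aligned, the sketch below scores each alignment column by Shannon entropy and reports the window with the lowest mean entropy. The toy sequences, function names, and window width are assumptions made for illustration only; this is not the pipeline described in the protocol.

```python
import math
from collections import Counter

def column_entropy(column):
    """Shannon entropy (bits) of one alignment column; a gap counts as a symbol."""
    counts = Counter(column)
    total = len(column)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())

def most_conserved_window(alignment, width):
    """Return (start, mean_entropy) of the lowest-entropy window of the given width."""
    length = len(alignment[0])
    if any(len(seq) != length for seq in alignment):
        raise ValueError("all aligned sequences must have the same length")
    entropies = [column_entropy([seq[i] for seq in alignment]) for i in range(length)]
    best_start, best_score = 0, float("inf")
    for start in range(length - width + 1):
        score = sum(entropies[start:start + width]) / width
        if score < best_score:  # strict '<' keeps the first window on ties
            best_start, best_score = start, score
    return best_start, best_score

# Toy alignment (hypothetical sequences, not real NLR data).
toy_alignment = [
    "MADAVKTLQE",
    "MADALRSLNE",
    "MADAIKGLTD",
    "MADAVRNLSE",
]
start, score = most_conserved_window(toy_alignment, width=4)
print(start, round(score, 3))
```

Low entropy only indicates sequence conservation; whether a conserved window is functionally important still has to be tested experimentally.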

Bilateral Common Carotid Artery Stenosis in Mice: A Model of Chronic Cerebral Hypoperfusion-Induced Vascular Cognitive Impairment

Masashi Kakae, Ayaka Kawashita, Haruya Onogi, Takayuki Nakagawa, Hisashi Shirakawa*

Jul 5, 2024

Vascular cognitive impairment (VCI) is a syndrome defined as cognitive decline caused by vascular disease and is associated with various types of dementia. Chronic cerebral hypoperfusion (CCH) is one of the major contributors to VCI. Among the various rodent models used to study CCH-induced VCI, we have found the mouse bilateral common carotid artery stenosis (BCAS) model to be highly suitable. Here, we introduce the BCAS model of C57BL/6J mice generated using microcoils with an internal diameter of 0.18 mm. To produce the mouse BCAS model, the bilateral common carotid arteries are isolated from the adhering tissues and vagus nerves and twined around the microcoils. This model shows cognitive impairment and white matter lesions preceding neuronal dysfunction around postoperative day 28, which is similar to the human clinical picture. Overall, the mouse BCAS model will continue to be useful in studying CCH-induced VCI.

Streamlining Protein Fractional Synthesis Rates Using SP3 Beads and Stable Isotope Mass Spectrometry: A Case Study on the Plant Ribosome

Dione Gentry-Torfer*, Ester Murillo, Chloe L. Barrington, Shuai Nie, Michael G. Leeming, Pipob Suwanchaikasem, Nicholas A. Williamson, Ute Roessner, Berin A. Boughton, Joachim Kopka, Federico Martinez-Seidel*

May 5, 2024

Ribosomes are an archetypal ribonucleoprotein assembly. Due to ribosomal evolution and function, r-proteins share specific physicochemical similarities, making the riboproteome particularly suited for tailored proteome profiling methods. Moreover, the structural proteome of ribonucleoprotein assemblies reflects context-dependent functional features. Thus, characterizing the state of riboproteomes provides insights to uncover the context-dependent functionality of r-protein rearrangements, as they relate to what has been termed the ribosomal code, a concept that parallels that of the histone code, in which chromatin rearrangements influence gene expression. Compared to high-resolution ribosomal structures, omics methods lag when it comes to offering customized solutions to close the knowledge gap between structure and function that currently exists in riboproteomes. Purifying the riboproteome and subsequent shot-gun proteomics typically involves protein denaturation and digestion with proteases. The results are relative abundances of r-proteins at the ribosome population level. We have previously shown that, to gain insight into the stoichiometry of individual proteins, it is necessary to measure by proteomics bound r-proteins and normalize their intensities by the sum of r-protein abundances per ribosomal complex, i.e., 40S or 60S subunits. These calculations ensure that individual r-protein stoichiometries represent the fraction of each family/paralog relative to the complex, effectively revealing which r-proteins become substoichiometric in specific physiological scenarios. Here, we present an optimized method to profile the riboproteome of any organism as well as the synthesis rates of r-proteins determined by stable isotope-assisted mass spectrometry. Our method purifies the r-proteins in a reversibly denatured state, which offers the possibility for combined top-down and bottom-up proteomics. 
Our method offers a milder native denaturation of the r-proteome via a chaotropic GuHCl solution as compared with previous studies that use irreversible denaturation under highly acidic conditions to dissociate rRNA and r-proteins. As such, our method is better suited to conserve post-translational modifications (PTMs). Subsequently, our method carefully considers the amino acid composition of r-proteins to select an appropriate protease for digestion. We avoid non-specific protease cleavage by increasing the pH of our standardized r-proteome dilutions that enter the digestion pipeline and by using a digestion buffer that ensures an optimal pH for a reliable protease digestion process. Finally, we provide the R package ProtSynthesis to study the fractional synthesis rates of r-proteins. The package uses physiological parameters as input to determine peptide or protein fractional synthesis rates. Once the physiological parameters are measured, our equations allow a fair comparison between treatments that alter the biological equilibrium state of the system under study. Our equations correct peptide enrichment using enrichments in soluble amino acids, growth rates, and total protein accumulation. As a means of validation, our pipeline fails to find “false” enrichments in non-labeled samples while also filtering out proteins with multiple unique peptides that have different enrichment values, which are rare in our datasets. These two aspects reflect the accuracy of our tool. Our method offers the possibility of elucidating individual r-protein family/paralog abundances, PTM status, fractional synthesis rates, and dynamic assembly into ribosomal complexes if top-down and bottom-up proteomic approaches are used concomitantly, taking one step further into mapping the native and dynamic status of the r-proteome onto high-resolution ribosome structures. 
In addition, our method can be used to study the proteomes of all macromolecular assemblies that can be purified, although purification is the limiting step, and the efficacy and accuracy of the proteases may be limited depending on the digestion requirements.
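The stoichiometry normalization described above (each r-protein intensity divided by the summed r-protein intensity of its subunit) can be sketched as follows. The protein names, intensity values, and function name are hypothetical, and this sketch is not part of the ProtSynthesis package.

```python
def normalize_by_complex(intensities, complex_of):
    """Divide each r-protein intensity by the summed intensity of its subunit."""
    totals = {}
    for protein, value in intensities.items():
        subunit = complex_of[protein]
        totals[subunit] = totals.get(subunit, 0.0) + value
    fractions = {}
    for protein, value in intensities.items():
        total = totals[complex_of[protein]]
        if total == 0:
            raise ValueError("subunit total is zero; cannot normalize " + protein)
        fractions[protein] = value / total
    return fractions

# Hypothetical intensities (arbitrary units) and subunit assignments.
intensities = {"uS2": 200.0, "eS6": 600.0, "uL3": 300.0, "uL4": 100.0}
complex_of = {"uS2": "40S", "eS6": "40S", "uL3": "60S", "uL4": "60S"}
fractions = normalize_by_complex(intensities, complex_of)
print(fractions)
```

Because each fraction is relative to its own subunit, a drop in one protein's fraction between conditions points to that protein becoming substoichiometric, not to a change in overall ribosome abundance.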

Classification of a Massive Number of Viral Genomes and Estimation of Time of Most Recent Common Ancestor (tMRCA) of SARS-CoV-2 Using Phylodynamic Analysis

Xiaowen Hu, Siqin Guan, Yiliang He, Guohui Yi, Lei Yao*, Jiaming Zhang*

Mar 20, 2024

Estimating the time of most recent common ancestor (tMRCA) is important to trace the origin of pathogenic viruses. This analysis is based on the genetic diversity accumulated in a certain time period. Thousands of mutation sites have arisen in SARS-CoV-2 genomes since the COVID-19 pandemic started; six highly linked mutation sites arose early, before the start of the pandemic, and can be used to classify the genomes into three main haplotypes. Tracing the origin of those three haplotypes may help to understand the origin of SARS-CoV-2. In this article, we present a complete protocol for classifying SARS-CoV-2 genomes and calculating tMRCA using a Bayesian phylodynamic method. This protocol may also be used in the analysis of other viral genomes.

Key features

• Filtering and alignment of a massive number of viral genomes using custom scripts and ViralMSA.

• Classification of genomes based on highly linked sites using custom scripts.

• Phylodynamic analysis of viral genomes using Bayesian evolutionary analysis sampling trees (BEAST).

• Visualization of posterior distribution of tMRCA using Tracer v1.7.2.

• Optimized for SARS-CoV-2.

Graphical overview

Graphical workflow of time of most recent common ancestor (tMRCA) estimation process
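The classification step (assigning each aligned genome to a haplotype from its alleles at a few linked sites) might look like the sketch below. The site positions, allele patterns, and haplotype labels are hypothetical placeholders, since this summary does not list the protocol's six sites, and the code is not the authors' custom script.

```python
# Hypothetical 1-based alignment positions and allele patterns; the actual
# linked sites used in the protocol are not given in this summary.
LINKED_SITES = [3, 7, 11]
HAPLOTYPES = {"CTA": "H1", "TTA": "H2", "TCG": "H3"}

def classify_genome(aligned_seq, sites=LINKED_SITES, table=HAPLOTYPES):
    """Return the haplotype label for one aligned genome, or 'unclassified'."""
    pattern = "".join(aligned_seq[pos - 1].upper() for pos in sites)
    return table.get(pattern, "unclassified")

def classify_all(records):
    """Classify every genome in a {name: aligned_sequence} mapping."""
    return {name: classify_genome(seq) for name, seq in records.items()}

# Toy aligned sequences (hypothetical, 12 columns each).
records = {
    "genome_a": "AACAAATAAAAA",
    "genome_b": "AATAAATAAAAA",
    "genome_c": "AATAAACAAAGA",
    "genome_d": "AANAAATAAAAA",  # ambiguous base at a linked site
}
labels = classify_all(records)
print(labels)
```

Genomes with ambiguous bases or unexpected allele combinations at the linked sites fall into a separate bin here; whether to discard or inspect them is a choice left to the analyst.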

Phylogenetic Inference of Homologous/Orthologous Genes among Distantly Related Plants

Zilong Xu, Wenyan Sun, Ziqiang Zhu, Bojian Zhong, Zhenhua Zhang*

Dec 5, 2023

The recent surge in plant genomic and transcriptomic data has laid a foundation for reconstructing evolutionary scenarios and inferring potential functions of key genes related to plants’ development and stress responses. The classical scheme for identifying homologous genes is sequence similarity–based searching, under the crucial assumption that homologous sequences are more similar to each other than they are to any non-homologous sequences. Advances in plant phylogenomics and computational algorithms have enabled us to systematically identify homologs/orthologs and reconstruct their evolutionary histories among distantly related lineages. Here, we present a comprehensive pipeline for homologous sequence identification, phylogenetic relationship inference, and potential functional profiling of genes in plants.

Key features

• Identification of orthologs using large-scale genomic and transcriptomic data.

• This protocol is generalized for analyzing the evolution of plant genes.
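One common heuristic built on similarity searching is the reciprocal best hit (RBH) test, sketched below. The summary does not say which ortholog criterion the protocol uses, so RBH is shown purely as an assumption; the gene names and scores are invented, and the input is taken to be (query, subject, score) tuples parsed from a similarity search.

```python
def best_hits(hits):
    """Map each query to its highest-scoring subject from (query, subject, score) tuples."""
    best = {}
    for query, subject, score in hits:
        if query not in best or score > best[query][1]:
            best[query] = (subject, score)
    return {query: subject for query, (subject, _) in best.items()}

def reciprocal_best_hits(a_vs_b, b_vs_a):
    """Return sorted (a, b) pairs where a's best hit is b and b's best hit is a."""
    forward = best_hits(a_vs_b)
    backward = best_hits(b_vs_a)
    return sorted((a, b) for a, b in forward.items() if backward.get(b) == a)

# Hypothetical search results between species A and species B.
a_vs_b = [("A1", "B1", 250), ("A1", "B2", 90), ("A2", "B2", 180), ("A3", "B1", 120)]
b_vs_a = [("B1", "A1", 245), ("B2", "A2", 175), ("B2", "A1", 88)]
pairs = reciprocal_best_hits(a_vs_b, b_vs_a)
print(pairs)
```

RBH tends to miss orthologs after lineage-specific duplications, which is one reason tree-based inference is preferred among distantly related lineages.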

Workflow for High-throughput Screening of Enzyme Mutant Libraries Using Matrix-assisted Laser Desorption/Ionization Mass Spectrometry Analysis of Escherichia coli Colonies

Kisurb Choe, Jonathan V. Sweedler*

Nov 5, 2023

High-throughput molecular screening of microbial colonies and DNA libraries are critical procedures that enable applications such as directed evolution, functional genomics, microbial identification, and creation of engineered microbial strains to produce high-value molecules. A promising chemical screening approach is the measurement of products directly from microbial colonies via optically guided matrix-assisted laser desorption/ionization mass spectrometry (MALDI-MS). Measuring the compounds from microbial colonies bypasses liquid culture with a screen that takes approximately 5 s per sample. We describe a protocol combining a dedicated informatics pipeline and sample preparation method that can prepare up to 3,000 colonies in under 3 h. The screening protocol starts from colonies grown on Petri dishes and then transferred onto MALDI plates via imprinting. The target plate with the colonies is imaged by a flatbed scanner and the colonies are located via custom software. The target plate is coated with MALDI matrix, MALDI-MS analyzes the colony locations, and data analysis enables the determination of colonies with the desired biochemical properties. This workflow screens thousands of colonies per day without requiring additional automation. The wide chemical coverage and the high sensitivity of MALDI-MS enable diverse screening projects such as modifying enzymes and functional genomics surveys of gene activation/inhibition libraries.

Key features

• Mass spectrometry analyzes a range of compounds from E. coli colonies as a proxy for liquid culture testing enzyme mutant libraries.

• Colonies are transferred to a MALDI target plate by a simple imprinting method.

• The screen compares the ratio among several products or searches for the qualitative presence of specific compounds.

• The protocol requires a MALDI mass spectrometer.

Graphical overview

Overview of the MALDI-MS analysis of microbial colonies for screening mutant libraries. Microbial cells containing a mutant library for enzymes/metabolic pathways are first grown on agar. The colonies are then imprinted onto a MALDI target plate using a filter paper intermediate. An optical image of the MALDI target plate is analyzed by custom software to find the locations of individual colonies and direct subsequent MALDI-MS analyses to the selected colonies. After applying MALDI matrix onto the target plate, MALDI-MS analysis of the colonies is performed. Colonies showing the desired product profiles are found by data analysis via the software, and the colonies are picked for downstream analysis.
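The final data-analysis step, comparing the ratio among products per colony and selecting colonies with the desired profile, can be sketched as below. The colony identifiers, peak names, intensities, and threshold are hypothetical, and this is not the authors' custom software.

```python
def product_ratio(peaks, target, reference):
    """Fraction of signal in the target peak relative to target + reference."""
    target_signal = peaks.get(target, 0.0)
    reference_signal = peaks.get(reference, 0.0)
    total = target_signal + reference_signal
    return target_signal / total if total > 0 else None

def select_hits(colonies, target, reference, min_ratio):
    """Return (colony_id, ratio) pairs at or above min_ratio, best first."""
    hits = []
    for colony_id, peaks in colonies.items():
        ratio = product_ratio(peaks, target, reference)
        if ratio is not None and ratio >= min_ratio:
            hits.append((colony_id, ratio))
    return sorted(hits, key=lambda hit: hit[1], reverse=True)

# Hypothetical per-colony peak intensities (arbitrary units).
colonies = {
    "c001": {"product_A": 800.0, "product_B": 200.0},
    "c002": {"product_A": 100.0, "product_B": 900.0},
    "c003": {},  # no signal detected at this location
    "c004": {"product_A": 600.0, "product_B": 400.0},
}
hits = select_hits(colonies, "product_A", "product_B", min_ratio=0.5)
print(hits)
```

Using a ratio instead of a raw intensity reduces the influence of colony size and matrix coverage, which vary from spot to spot.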

Testing for Allele-specific Expression from Human Brain Samples

Maria E. Diaz-Ortiz, Nimansha Jain, Michael D. Gallagher, Marijan Posavi, Travis L. Unger, Alice S. Chen-Plotkin*

Oct 5, 2023

Many single nucleotide polymorphisms (SNPs) identified by genome-wide association studies exert their effects on disease risk as expression quantitative trait loci (eQTL) via allele-specific expression (ASE). While databases for probing eQTLs in tissues from normal individuals exist, one may wish to ascertain eQTLs or ASE in specific tissues or disease states not characterized in these databases. Here, we present a protocol to assess ASE of two possible target genes (GPNMB and KLHL7) of a known genome-wide association study (GWAS) Parkinson’s disease (PD) risk locus in postmortem human brain tissue from PD and neurologically normal individuals. This was done using a sequence of RNA isolation, cDNA library generation, enrichment for transcripts of interest using customizable cDNA capture probes, paired-end RNA sequencing, and subsequent analysis. This method provides increased sensitivity relative to traditional bulk RNAseq-based approaches and provides a blueprint that can be extended to the study of other genes, tissues, and disease states.

Key features

• Analysis of GPNMB allele-specific expression (ASE) in brain lysates from cognitively normal controls (NC) and Parkinson’s disease (PD) individuals.

• Builds on the ASE protocol of Mayba et al. (2014) and extends application from cells to human tissue.

• Increased sensitivity by enrichment for desired transcript via RNA CaptureSeq (Mercer et al., 2014).

• Optimized for human brain lysates from cingulate gyrus, caudate nucleus, and cerebellum.
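A generic way to test for allelic imbalance at a heterozygous SNP is an exact two-sided binomial test of the reference and alternate read counts against a 50:50 expectation. The sketch below illustrates that idea only; it is not the statistical model of Mayba et al. (2014) or of this protocol, and the read counts are invented.

```python
from math import comb

def allelic_ratio(ref_reads, alt_reads):
    """Fraction of reads carrying the reference allele."""
    total = ref_reads + alt_reads
    if total == 0:
        raise ValueError("no reads cover this SNP")
    return ref_reads / total

def binomial_two_sided_p(ref_reads, alt_reads, p_null=0.5):
    """Exact two-sided binomial p-value: sum of outcomes no more likely than observed."""
    n = ref_reads + alt_reads
    if n == 0:
        raise ValueError("no reads cover this SNP")
    # Plain floats are fine for modest read depths; very deep coverage may underflow.
    pmf = [comb(n, k) * p_null ** k * (1 - p_null) ** (n - k) for k in range(n + 1)]
    observed = pmf[ref_reads]
    p_value = sum(p for p in pmf if p <= observed * (1 + 1e-9))
    return min(1.0, p_value)

# Hypothetical read counts at one heterozygous SNP in two samples.
balanced = binomial_two_sided_p(50, 50)
skewed = binomial_two_sided_p(10, 0)
print(allelic_ratio(10, 0), balanced, skewed)
```

In practice, reference-mapping bias can shift the null ratio away from 0.5, so `p_null` would need to be estimated from control data instead of assumed.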

Controlled Level of Contamination Coupled to Deep Sequencing (CoLoC-seq) Probes the Global Localisation Topology of Organelle Transcriptomes

Anna Smirnova, Damien Jeandard, Alexandre Smirnov*

Sep 20, 2023

Information on RNA localisation is essential for understanding physiological and pathological processes, such as gene expression, cell reprogramming, host–pathogen interactions, and signalling pathways involving RNA transactions at the level of membrane-less or membrane-bounded organelles and extracellular vesicles. In many cases, it is important to assess the topology of RNA localisation, i.e., to distinguish the transcripts encapsulated within an organelle of interest from those merely attached to its surface. This allows establishing which RNAs can, in principle, engage in local molecular interactions and which are prevented from interacting by membranes or other physical barriers. The most widely used techniques interrogating RNA localisation topology are based on the treatment of isolated organelles with RNases with subsequent identification of the surviving transcripts by northern blotting, qRT-PCR, or RNA-seq. However, this approach produces incoherent results and many false positives. Here, we describe Controlled Level of Contamination coupled to deep sequencing (CoLoC-seq), a more refined subcellular transcriptomics approach that overcomes these pitfalls. CoLoC-seq starts with the purification of organelles of interest. They are then either left intact or lysed and subjected to a gradient of RNase concentrations to produce unique RNA degradation dynamics profiles, which can be monitored by northern blotting or RNA-seq. Through straightforward mathematical modelling, CoLoC-seq distinguishes true membrane-enveloped transcripts from degradable and non-degradable contaminants of any abundance. The method has been implemented in the mitochondria of HEK293 cells, where it outperformed alternative subcellular transcriptomics approaches. It is applicable to other membrane-bounded organelles, e.g., plastids, single-membrane organelles of the vesicular system, extracellular vesicles, or viral particles.

Key features

• Tested on human mitochondria; potentially applicable to cell cultures, non-model organisms, extracellular vesicles, enveloped viruses, tissues; does not require genetic manipulations or highly pure organelles.

• In the case of human cells, the required amount of starting material is ~2,500 cm² of 80% confluent cells (or ~3 × 10⁸ HEK293 cells).

• CoLoC-seq implements a special RNA-seq strategy to selectively capture intact transcripts, which requires RNases generating 5′-hydroxyl and 2′/3′-phosphate termini (e.g., RNase A, RNase I).

• Relies on nonlinear regression software with customisable exponential functions.
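To give a feel for the kind of exponential fit involved, the sketch below fits a surviving-RNA fraction measured across an RNase gradient to a decay with a protected plateau, y(c) = plateau + (1 - plateau) * exp(-k * c). This functional form, the grid search, and all numbers are assumptions for illustration; the summary does not give the exact CoLoC-seq equations, and the published method compares intact and lysed samples instead of fitting a single curve.

```python
import math

def fit_protected_fraction(concentrations, surviving, k_grid):
    """Grid-search k; for each k solve the plateau by linear least squares.

    Returns (plateau, k, sum_of_squared_errors) for the best-fitting k.
    """
    best = None
    for k in k_grid:
        decay = [math.exp(-k * c) for c in concentrations]
        denom = sum((1 - e) ** 2 for e in decay)
        if denom == 0:
            continue
        # y = p + (1 - p) * e  rearranges to  (y - e) = p * (1 - e)
        plateau = sum((y - e) * (1 - e) for y, e in zip(surviving, decay)) / denom
        plateau = min(1.0, max(0.0, plateau))
        sse = sum((y - (plateau + (1 - plateau) * e)) ** 2
                  for y, e in zip(surviving, decay))
        if best is None or sse < best[2]:
            best = (plateau, k, sse)
    return best

# Synthetic, noise-free data (hypothetical): 60% of the transcript is protected.
concentrations = [0.0, 0.5, 1.0, 2.0, 4.0, 8.0]
true_plateau, true_k = 0.6, 1.0
surviving = [true_plateau + (1 - true_plateau) * math.exp(-true_k * c)
             for c in concentrations]
k_grid = [0.1 * i for i in range(1, 51)]
plateau, k, sse = fit_protected_fraction(concentrations, surviving, k_grid)
print(round(plateau, 3), round(k, 3))
```

Under this assumed model, a transcript whose plateau stays high in intact organelles but collapses after lysis would behave as membrane-enveloped, whereas a plateau that persists after lysis would suggest a non-degradable contaminant.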

Computational Analysis of Plasma Lipidomics from Mice Fed Standard Chow and Ketogenic Diet

Amy L. Seufert, James W. Hickman, Jaewoo Choi, Brooke A. Napier*

Sep 20, 2023

Dietary saturated fatty acids (SFAs) are upregulated in the blood circulation following digestion. A variety of circulating lipid species have been implicated in metabolic and inflammatory diseases; however, due to the extreme variability in serum or plasma lipid concentrations found in human studies, established reference ranges are still lacking, in addition to lipid specificity and diagnostic biomarkers. Mass spectrometry is widely used for identification of lipid species in the plasma, and there are many differences in sample extraction methods within the literature. We used ultra-high performance liquid chromatography (UPLC) coupled to high-resolution hybrid triple quadrupole-time-of-flight (QToF) mass spectrometry (MS) to compare relative peak abundance of specific lipid species within the following lipid classes: free fatty acids (FFAs), triglycerides (TAGs), phosphatidylcholines (PCs), and sphingolipids (SGs), in the plasma of mice fed a standard chow (SC; low in SFAs) or ketogenic diet (KD; high in SFAs) for two weeks. In this protocol, we used Principal Component Analysis (PCA) in R to visualize how individual mice clustered together according to their diet, and we found that KD-fed mice displayed unique blood profiles for many lipid species identified within each lipid class compared to SC-fed mice. We conclude that two weeks of KD feeding is sufficient to significantly alter circulating lipids, with PCs being the most altered lipid class, followed by SGs, TAGs, and FFAs, including palmitic acid (PA) and PA-saturated lipids. This protocol is needed to advance knowledge on the impact that SFA-enriched diets have on concentrations of specific lipids in the blood that are known to be associated with metabolic and inflammatory diseases.

Key features

• Analysis of relative plasma lipid concentrations from mice on different diets using R.

• Lipidomics data collected via ultra-high performance liquid chromatography (UPLC) coupled to high-resolution hybrid triple quadrupole-time-of-flight (QToF) mass spectrometry (MS).

• Allows for a comprehensive comparison of diet-dependent plasma lipid profiles, including a variety of specific lipid species within several different lipid classes.

• Accumulation of certain free fatty acids, phosphatidylcholines, triglycerides, and sphingolipids is associated with metabolic and inflammatory diseases, and plasma concentrations may be clinically useful.
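The PCA step that groups mice by diet can be illustrated with a small pure-Python sketch that finds the first principal component by power iteration. The protocol itself performs this analysis in R; the code below is only an illustration in Python, and the abundance values, group sizes, and function names are hypothetical.

```python
import math

def center_columns(rows):
    """Subtract each column mean so every lipid feature has mean zero."""
    n = len(rows)
    means = [sum(col) / n for col in zip(*rows)]
    return [[x - m for x, m in zip(row, means)] for row in rows]

def first_principal_component(rows, iterations=200):
    """Return (loadings, scores) of PC1 via power iteration on X^T X."""
    centered = center_columns(rows)
    n_features = len(centered[0])
    vector = [1.0 / math.sqrt(n_features)] * n_features
    for _ in range(iterations):
        scores = [sum(x * v for x, v in zip(row, vector)) for row in centered]
        updated = [sum(s * row[j] for s, row in zip(scores, centered))
                   for j in range(n_features)]
        norm = math.sqrt(sum(x * x for x in updated))
        if norm == 0:
            break
        vector = [x / norm for x in updated]
    scores = [sum(x * v for x, v in zip(row, vector)) for row in centered]
    return vector, scores

# Hypothetical relative abundances: rows are mice, columns are three lipid species.
standard_chow = [[1.0, 2.0, 1.1], [1.2, 2.1, 0.9], [0.9, 1.9, 1.0]]
ketogenic = [[3.0, 2.0, 2.9], [3.2, 2.2, 3.1], [2.9, 1.8, 3.0]]
loadings, scores = first_principal_component(standard_chow + ketogenic)
sc_scores, kd_scores = scores[:3], scores[3:]
print([round(s, 2) for s in scores])
```

In this toy example the two diet groups separate along PC1 because two of the three features differ strongly between groups; real lipidomics data would usually be log-transformed and scaled before PCA.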

Sample Preparation and Integrative Data Analysis of a Droplet-based Single-Cell ATAC-sequencing Using Murine Thymic Epithelial Cells

Tatsuya Ishikawa, Hiroto Ishii, Takahisa Miyao, Kenta Horie, Maki Miyauchi, Nobuko Akiyama*, Taishin Akiyama*

Jan 5, 2023

Accessible chromatin regions modulate gene expression by acting as cis-regulatory elements. Understanding the epigenetic landscape by mapping accessible regions of DNA is therefore imperative to decipher mechanisms of gene regulation under specific biological contexts of interest. The assay for transposase-accessible chromatin sequencing (ATAC-seq) has been widely used to detect accessible chromatin and the recent introduction of single-cell technology has increased resolution to the single-cell level. In a recent study, we used droplet-based, single-cell ATAC-seq technology (scATAC-seq) to reveal the epigenetic profile of the transit-amplifying subset of thymic epithelial cells (TECs), which was identified previously using single-cell RNA-sequencing technology (scRNA-seq). This protocol allows the preparation of nuclei from TECs in order to perform droplet-based scATAC-seq and its integrative analysis with scRNA-seq data obtained from the same cell population. Integrative analysis has the advantage of identifying cell types in scATAC-seq data based on cell cluster annotations in scRNA-seq analysis.