Protein and mRNA levels correlate only moderately. The availability of proteogenomics data sets with protein and transcript measurements from matching samples is providing new opportunities to assess the degree to which protein levels in a system can be predicted from mRNA information. Here we examined the contributions of input features in protein abundance prediction models. Using large proteogenomics data from 8 cancer types within the Clinical Proteomic Tumor Analysis Consortium (CPTAC) data set, we trained models to predict the abundance of over 13,000 proteins using matching transcriptome data from up to 958 tumor or normal adjacent tissue samples each, and compared predictive performances across algorithms, data set sizes, and input features. Over one-third of proteins (4,648) showed relatively poor predictability (elastic net r ≤ 0.3) from their cognate transcripts. Moreover, we found widespread occurrences where the abundance of a protein is considerably less well explained by its own cognate transcript level than by the levels of one or more trans-locus transcripts. The incorporation of additional trans-locus transcript abundance data as input features progressively improved the ability to predict sample protein abundance. Transcripts that contribute to non-cognate protein abundance primarily involve those encoding known or predicted interaction partners of the protein of interest, including not only large multi-protein complexes as previously shown, but also small stable complexes in the proteome with only one or a few stable interacting partners. Network analysis further shows a complex proteome-wide interdependency of protein abundance on the transcript levels of multiple interacting partners. The predictive model analysis here therefore supports the view that protein-protein interactions, including those in small protein complexes, exert post-transcriptional influence on proteome composition more broadly than previously recognized. Moreover, the results suggest that mRNA and protein co-expression analysis may have utility for finding gene interactions and predicting expression changes in biological systems.
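The comparison between cognate and trans-locus transcripts described in this abstract can be illustrated with a minimal sketch. It uses plain Pearson correlation across samples as a stand-in for the elastic net models named in the abstract, so it is a simplification and not the authors' pipeline. The function names, the partner labels, and the six-sample toy values are all hypothetical.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation between two equal-length numeric sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("inputs must have the same length (at least 2)")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant input")
    return sxy / sqrt(sxx * syy)


def trans_transcripts_beating_cognate(protein, cognate, trans):
    """Names of trans-locus transcripts that correlate with the protein
    more strongly than the protein's own (cognate) transcript does."""
    r_cognate = pearson_r(cognate, protein)
    return sorted(
        name for name, values in trans.items()
        if pearson_r(values, protein) > r_cognate
    )


# Hypothetical abundances across six samples (illustrative values only).
protein = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
cognate_mrna = [3.0, 1.0, 4.0, 1.0, 5.0, 2.0]
trans_mrna = {
    "PARTNER_A": [1.1, 1.9, 3.2, 3.8, 5.1, 6.0],  # e.g. an interaction partner
    "PARTNER_B": [4.0, 4.0, 3.0, 3.0, 4.0, 3.0],  # an unrelated transcript
}

better = trans_transcripts_beating_cognate(protein, cognate_mrna, trans_mrna)
```

In this toy case the cognate transcript falls under the r ≤ 0.3 cutoff quoted in the abstract, while one hypothetical partner transcript tracks the protein closely. A regularized multivariate model would be needed to combine many transcripts as input features at once.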
D. M. Eaton, R. M. Berretta, J. E. Lynch, J. G. Travers, R. D. Pfeiffer, M. L. Hulke, H. Zhao, A. R. H. Hobby, G. Schena, J. P. Johnson, M. Wallner, E. Lau, M. P. Y. Lam, K. C. Woulfe, N. R. Tucker, T. A. McKinsey, M. R. Wolfson, and S. R. Houser. Am J Physiol Heart Circ Physiol, 323(4), H797-H817, 2022
Changes in the abundance of individual proteins in the proteome can be elicited by modulation of protein synthesis (the rate of input of newly synthesized proteins into the protein pool) or degradation (the rate of removal of protein molecules from the pool). A full understanding of proteome changes therefore requires a definition of the roles of these two processes in proteostasis, collectively known as protein turnover. Because protein turnover occurs even in the absence of overt changes in pool abundance, turnover measurements necessitate monitoring the flux of stable isotope-labeled precursors through the protein pool such as labeled amino acids or metabolic precursors such as ammonium chloride or heavy water. In cells in culture, the ability to manipulate precursor pools by rapid medium changes is simple, but for more complex systems such as intact animals, the approach becomes more convoluted. Individual methods bring specific complications, and the suitability of different methods has not been comprehensively explored. In this study, we compare the turnover rates of proteins across four mouse tissues, obtained from the same inbred mouse strain maintained under identical husbandry conditions, measured using either [13C6]lysine or [2H2]O as the labeling precursor. We show that for long-lived proteins, the two approaches yield essentially identical measures of the first-order rate constant for degradation. For short-lived proteins, there is a need to compensate for the slower equilibration of lysine through the precursor pools. We evaluate different approaches to provide that compensation. We conclude that both labels are suitable, but careful determination of precursor enrichment kinetics in amino acid labeling is critical and has a considerable influence on the numerical values of the derived protein turnover rates.
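The effect of precursor equilibration on derived rate constants can be sketched with a simple kinetic model. This is an illustration under stated assumptions, not the compensation approach evaluated in the study. It assumes protein labeling follows first-order kinetics with rate constant k_deg and that precursor enrichment rises as a single exponential with rate constant k_precursor. The function names and all rate values are hypothetical.

```python
from math import exp, log


def fraction_labeled(t, k_deg, k_precursor=None):
    """Fraction of a protein pool that is labeled at time t.

    With k_precursor=None the precursor is assumed to be fully enriched
    from t = 0, giving f(t) = 1 - exp(-k_deg * t).
    Otherwise the precursor enrichment is assumed to rise as
    1 - exp(-k_precursor * t), and f(t) solves df/dt = k_deg * (p(t) - f(t)).
    """
    if k_precursor is None:
        return 1.0 - exp(-k_deg * t)
    if abs(k_precursor - k_deg) < 1e-12:
        # Limiting form when both rate constants are equal.
        return 1.0 - (1.0 + k_deg * t) * exp(-k_deg * t)
    unlabeled = (
        k_precursor * exp(-k_deg * t) - k_deg * exp(-k_precursor * t)
    ) / (k_precursor - k_deg)
    return 1.0 - unlabeled


def apparent_k(t, fraction):
    """Rate constant recovered if precursor equilibration is ignored."""
    return -log(1.0 - fraction) / t


# Hypothetical values (per day): a short-lived and a long-lived protein,
# both labeled through a slowly equilibrating precursor pool.
k_precursor = 0.3
short_lived = apparent_k(2.0, fraction_labeled(2.0, 0.5, k_precursor))
long_lived = apparent_k(30.0, fraction_labeled(30.0, 0.01, k_precursor))
```

Under these assumptions the uncorrected estimate falls below the true rate constant in both cases, and the relative error is much larger for the short-lived protein. That matches the abstract's point that precursor kinetics matter most for short-lived proteins.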
In recent years, an expanding collection of heart-secreted signaling proteins has been discovered that play cellular communication roles in diverse pathophysiological processes. This minireview briefly discusses current evidence for the roles of cardiokines in systemic regulation of aging and age-associated diseases. An analysis of human transcriptome and secretome data suggests the possibility that many other cardiokines remain to be discovered that may function in long-range physiological regulation. We discuss the ongoing challenges and emerging technologies for elucidating the identity and function of cardiokines in endocrine regulation.
T. T. Wei, M. Chandy, M. Nishiga, A. Zhang, K. K. Kumar, D. Thomas, A. Manhas, S. Rhee, J. M. Justesen, I. Y. Chen, H. T. Wo, S. Khanamiri, J. Y. Yang, F. J. Seidl, N. Z. Burns, C. Liu, N. Sayed, J. J. Shie, C. F. Yeh, K. C. Yang, E. Lau, K. L. Lynch, M. Rivas, B. K. Kobilka, and J. C. Wu. Cell, 185(10), 1676-1693, 2022
S. Veitch, M. S. Njock, M. Chandy, M. A. Siraj, L. Chi, H. Mak, K. Yu, K. Rathnakumar, C. A. Perez-Romero, Z. Chen, F. J. Alibhai, D. Gustafson, S. Raju, R. Wu, D. Z. Khat, Y. Wang, A. Caballero, P. Meagher, E. Lau, L. Pepic, H. S. Cheng, N. J. Galant, K. L. Howe, R. K. Li, K. A. Connelly, M. Husain, P. Delgado-Olguin, and J. E. Fish. Cardiovasc Diabetol, 21(1), 31, 2022
JCAST is an open-source Python software tool that allows users to easily create custom protein sequence databases for proteogenomic applications. JCAST takes in RNA sequencing data containing alternative splicing junctions as input, models the likely translatable protein isoform sequences within a particular sample, performs in silico translation using annotated open reading frames, and outputs sample-specific protein sequence databases in FASTA format to support downstream mass spectrometry data analysis of protein isoforms. This article describes the functionality and usage of the JCAST software and documents a stable code repository for user access.
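The in silico translation and FASTA output steps described above can be sketched in a few lines. This is not JCAST's code or API. It is a generic illustration using the standard genetic code, and the function names, the FASTA header, and the example sequence are hypothetical.

```python
from itertools import product

# Standard genetic code, with codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(nucleotides, frame=0):
    """Translate a nucleotide sequence in the given frame (0, 1, or 2),
    stopping at the first stop codon. Unrecognized codons become 'X'."""
    seq = nucleotides.upper().replace("U", "T")[frame:]
    residues = []
    for i in range(0, len(seq) - 2, 3):
        aa = CODON_TABLE.get(seq[i:i + 3], "X")
        if aa == "*":
            break
        residues.append(aa)
    return "".join(residues)


def to_fasta(header, protein, width=60):
    """Format one protein sequence as a FASTA record with wrapped lines."""
    lines = [protein[i:i + width] for i in range(0, len(protein), width)]
    return ">" + header + "\n" + "\n".join(lines) + "\n"


# Hypothetical junction-spanning sequence with a known reading frame.
record = to_fasta("toy|junction_1", translate("ATGGCCAAATGA"))
```

A real workflow would also need to choose the reading frame from annotation and handle junctions that introduce frameshifts or premature stop codons, which this sketch does not attempt.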
We performed total RNA sequencing and multi-omics analysis comparing skeletal muscle and cardiac muscle in young adult (4 months) vs. early aging (20 months) mice to examine the molecular mechanisms of striated muscle aging. We observed that aging cardiac and skeletal muscles both invoke transcriptomic changes in innate immune system and mitochondrial pathways but diverge in extracellular matrix processes. On an individual gene level, we identified 611 age-associated signatures in skeletal and cardiac muscles, including a number of myokine- and cardiokine-encoding genes. Because RNA and protein levels correlate only partially, we reason that differentially expressed transcripts that accurately reflect their protein counterparts will be more valuable proxies for proteomic changes and, by extension, physiological states. We applied a computational data analysis workflow to estimate which transcriptomic changes are more likely relevant to protein-level regulation using large proteogenomics data sets. We estimate that about 48% of the aging-associated transcripts predict protein levels well (r ≥ 0.5). In parallel, a comparison of the identified aging-regulated genes with public human transcriptomics data showed that only 35–45% of the identified genes show age-dependent expression in corresponding human tissues. Thus, integrating both RNA–protein correlation and human conservation across data sources, we nominate 134 prioritized aging striated muscle signatures that are predicted to correlate strongly with protein levels and that show age-dependent expression in humans. The results here reveal new details of how aging reshapes gene expression in striated muscles at the transcript and protein levels.
Cardiac-derived exosomes have received intense interest for their roles in paracrine communications and regenerative therapies. However, current understanding of how exosomes mediate cellular signaling is incomplete, in part because the contents of exosomes from different cardiac cell types are poorly defined. To learn what signals cardiac cells release, we examined the microRNA (miRNA) compositions secreted in exosomes from human induced pluripotent stem cells (iPSCs) and 3 major iPSC-derived cardiac cell types.
We describe the procedure to isolate genomic DNA, RNA, and protein directly from cryopreserved induced pluripotent stem cell (iPSC) vials using commercially available solid-phase extraction kits, and we report the relationship between macromolecule yields and experimental and storage factors. Sufficient quantities of DNA, RNA, and protein for whole-genome sequencing, RNA sequencing, and mass spectrometry experiments are recoverable from as few as 1 million cryopreserved cells across 728 distinct iPSC lines. Nucleic acids extracted from iPSC stocks cryopreserved for up to 4 years maintain sufficient quantity and integrity for downstream analysis, with minimal genomic DNA fragmentation. An expected positive correlation exists between cell count and DNA or RNA yield, with comparable yields recovered from cells across different cryostorage timespans. This article provides an effective way to simultaneously isolate iPSC biomolecules for multi-omics investigations.
J. Lee, V. Termglinchan, S. Diecke, I. Itzhaki, C. K. Lam, P. Garg, E. Lau, M. Greenhaw, T. Seeger, H. Wu, J. Z. Zhang, X. Chen, I. P. Gil, M. Ameen, K. Sallam, J.-W. Rhee, J. M. Churko, R. Chaudhary, T. Chour, P. J. Wang, M. P. Snyder, H. Y. Chang, I. Karakikes, and J. C. Wu. Nature, 572(7769), 335-340, 2019
Human induced pluripotent stem cells (iPSCs) provide a renewable supply of patient-specific and tissue-specific cells for cellular and molecular studies of disease mechanisms. Combined with advances in various omics technologies, iPSC models can be used to profile the expression of genes, transcripts, proteins, and metabolites in relevant tissues. In the past 2 years, large panels of iPSC lines have been derived from hundreds of genetically heterogeneous individuals, further enabling genome-wide mapping to identify coexpression networks and elucidate gene regulatory networks. Here, we review recent developments in omics profiling of various molecular phenotypes and the emergence of human iPSCs as a systems biology model of human diseases.
A. S. Lee, M. Inayathullah, M. A. Lijkwan, X. Zhao, W. Sun, S. Park, W. X. Hong, M. B. Parekh, A. V. Malkovskiy, E. Lau, X. Qin, V. R. Pothineni, V. Sanchez-Freire, W. Y. Zhang, N. G. Kooreman, A. D. Ebert, C. K. F. Chan, P. K. Nguyen, J. Rajadas, and J. C. Wu. Nature Biomedical Engineering, 2(2), 104-113, 2018
F. Olmeta-Schult, L. M. Segal, S. Tyner, T. A. Moon, R. D.-W. Chow, P. Chakrabarty, M. Pacesa, A. I. Podgornaia, J. Chen, B. Singh, B. Cao, R. R. S. Sidhu, B. W. Q. Tan, P. Sood, S. Parker, M. A. Scult, D. V. Haute, N. Konstantinides, B. A. Schwendimann, S. Srivastava, R. Fiorenza, K. Dutton-Regester, R. Hale, E. O. Polat, E. Lau, A. L. Mayer, and E. R. White. Science (New York, N.Y.), 359(6371), 26-28, 2018
Transcript abundance and protein abundance show modest correlation in many biological models, but how this impacts disease signature discovery in omics experiments is rarely explored. Here we report an integrated omics approach, incorporating measurements of transcript abundance, protein abundance, and protein turnover to map the landscape of proteome remodeling in a mouse model of pathological cardiac hypertrophy. Analyzing the hypertrophy signatures that are reproducibly discovered from each omics data type across six genetic strains of mice, we find that the integration of transcript abundance, protein abundance, and protein turnover data leads to a 75% gain in discovered disease gene candidates. Moreover, the inclusion of protein turnover measurements allows discovery of post-transcriptional regulation across diverse pathways, and implicates distinct disease proteins not found in steady-state transcript and protein abundance data. Our results suggest that multi-omics investigations of proteome dynamics provide important insights into disease pathogenesis in vivo.
Mitochondrial proteins carry out diverse cellular functions including ATP synthesis, ion homeostasis, cell death signaling, and fatty acid metabolism and biogenesis. Compromised mitochondrial quality control is implicated in various human disorders including cardiac diseases. Recently it has emerged that mitochondrial protein turnover can serve as an informative cellular parameter to characterize mitochondrial quality and uncover disease mechanisms. The turnover rate of a mitochondrial protein reflects its homeostasis and dynamics under the quality control systems acting on mitochondria at a particular cell state. This review article summarizes some recent advances and outstanding challenges for measuring the turnover rates of mitochondrial proteins in health and disease. This article is part of a Special Issue entitled "Mitochondria: From Basic Mitochondrial Biology to Cardiovascular Disease".