Current Proteomics - Volume 18, Issue 5, 2021
-
-
Random Walks on Biomedical Networks
Authors: Guiyang Zhang, Pan Wang, You Li and Guohua Huang
The biomedical network is becoming a fundamental tool for representing sophisticated biosystems, while random walk (RW) models on such networks are becoming a powerful means of addressing challenging problems such as gene function annotation, drug target identification, and disease biomarker recognition. Recently, numerous random walk models have been proposed and applied to biomedical networks. Owing to its good performance, the random walk is attracting increasing attention from multiple communities. In this survey, we first introduce various random walk models, with emphasis on PageRank and the random walk with restart. We then summarize applications of RW on biomedical networks from the graph learning point of view, mainly node classification, link prediction, cluster/community detection, and node representation learning. We also briefly discuss its limitations and open issues.
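As an illustration of the random walk with restart mentioned in this abstract, the following is a minimal sketch and not the survey's own implementation. It assumes an undirected, unweighted network stored as a dictionary of neighbour lists; the function name, the restart probability of 0.3, and the toy node names are hypothetical choices made for the example.

```python
def random_walk_with_restart(adj, seeds, restart=0.3, tol=1e-10, max_iter=1000):
    """Iterate p <- restart * p0 + (1 - restart) * W p until convergence.

    adj   : dict mapping each node to a list of its neighbours (assumed format)
    seeds : set of nodes the walker jumps back to on restart
    """
    nodes = sorted(adj)
    # Restart distribution: uniform over the seed nodes.
    p0 = {v: (1.0 / len(seeds) if v in seeds else 0.0) for v in nodes}
    p = dict(p0)
    for _ in range(max_iter):
        nxt = {v: restart * p0[v] for v in nodes}
        for u in nodes:
            if not adj[u]:
                # Isolated node: send its walking mass back to the seeds.
                for v in nodes:
                    nxt[v] += (1 - restart) * p[u] * p0[v]
                continue
            # Split the walking mass of u evenly among its neighbours.
            share = (1 - restart) * p[u] / len(adj[u])
            for v in adj[u]:
                nxt[v] += share
        diff = sum(abs(nxt[v] - p[v]) for v in nodes)
        p = nxt
        if diff < tol:
            break
    return p


# Toy path network A - B - C - D with a single seed node A.
toy_network = {"A": ["B"], "B": ["A", "C"], "C": ["B", "D"], "D": ["C"]}
scores = random_walk_with_restart(toy_network, seeds={"A"})
```

The resulting scores form a probability distribution over nodes and can be read as proximity to the seed set, which is how such scores are typically used to rank candidate genes or targets.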
-
-
-
Machine Learning for Mass Spectrometry Data Analysis in Proteomics
Authors: Juntao Li, Kanglei Zhou and Bingyu Mu
With the rapid development of high-throughput techniques, mass spectrometry has been widely used for large-scale protein analysis. Machine learning methods are applied in mass spectrometry data analysis to search for existing proteins, discover biomarkers, and diagnose and prognose diseases. This paper reviews the applications of five kinds of machine learning methods to mass spectrometry data analysis from an algorithmic point of view: support vector machine, decision tree, random forest, naive Bayesian classifier and deep learning.
-
-
-
Prediction of Protein Structural Classes: Features Extraction to Classification Algorithm
Authors: Xiaoqing Liu, Zhenyu Yang, Yaoxin Wang and Qi Dai
The fast growth of protein sequence and protein structure data has promoted the development of protein structural class prediction. Several prediction methods have been proposed to study protein folding rates and DNA-binding sites, as well as to reduce the conformational search space and enable the prediction of tertiary structure. This paper introduces the current approaches to protein structural class prediction and emphasizes their steps, from information extraction to classification algorithms.
-
-
-
Identifying Protein Subcellular Location with Embedding Features Learned from Networks
Authors: Hongwei Liu, Bin Hu, Lei Chen and Lin Lu
Background: Identification of protein subcellular location is an important problem because the subcellular location is highly related to protein function. Determining locations through biological experiments is fundamental. However, these experiments are costly and time-consuming. The alternative is to design effective computational methods. Objective: To date, several computational methods have been proposed in this regard. However, these methods mainly adopted features derived from the proteins themselves. On the other hand, with the development of network techniques, several embedding algorithms have been proposed that encode the nodes of a network into feature vectors. Such algorithms connect networks with traditional classification algorithms and thus provide a new way to construct models for the prediction of protein subcellular location. Methods: In this study, we analyzed features produced by three network embedding algorithms (DeepWalk, Node2vec and Mashup) applied to one or multiple protein networks. The obtained features were learned by a machine learning algorithm (support vector machine or random forest) to construct the model. Cross-validation was adopted to evaluate all constructed models. Results: Under cross-validation, the embedding features yielded by Mashup on multiple networks were quite informative for predicting protein subcellular location. The model based on these features was superior to some classic models. Conclusion: Embedding features yielded by a proper and powerful network embedding algorithm were effective for building a model to predict protein subcellular location, providing new pipelines for building more efficient models.
-
-
-
A Useful Tool for the Identification of DNA-Binding Proteins Using Graph Convolutional Network
Authors: Dasheng Chen and Leyi Wei
Background: DNA and protein are important components of living organisms. DNA binding protein is a helicase, which is a protein specifically responsible for binding to DNA single-stranded regions. It is a necessary component for DNA replication, recombination and repair, and plays a key role in the function of various biomolecules. Although some classification prediction methods already exist for this protein, the use of graph neural networks for this task is still limited. Objective: The classification of unknown protein sequences into the correct categories, subcategories and families is important for the biological sciences. In this article, we used graph neural networks to develop a novel predictor, GCN-DBP, for protein classification prediction. Methods: Each protein sequence is treated as a document and segmented into words according to the concept of k-mer. Document-word relationships and word co-occurrence are then used as a corpus to construct a text graph, and protein sequence information is learned by a two-layer graph convolutional network. Results: We tested GCN-DBP on the independent data set PDB2272, where its accuracy reached 64.17% and its MCC was 28.32%. In addition, we conducted further experiments to compare the proposed method with other existing methods. Conclusion: The results show that the proposed method is superior to the other four methods and will be a useful tool.
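The document-and-words idea in the Methods above can be sketched as follows. This is only an illustration under stated assumptions, not the GCN-DBP code: the abstract does not say whether k-mers overlap, which value of k is used, or how the co-occurrence window is defined, so overlapping 3-mers and a sliding window of two words are assumed here, and both function names are hypothetical.

```python
from collections import Counter
from itertools import combinations


def kmer_words(sequence, k=3):
    """Split a protein sequence into overlapping k-mer 'words' (assumed scheme)."""
    sequence = sequence.strip().upper()
    return [sequence[i:i + k] for i in range(len(sequence) - k + 1)]


def cooccurrence_counts(words, window=2):
    """Count unordered word pairs that appear together in a sliding window."""
    counts = Counter()
    for i in range(len(words) - window + 1):
        # Use distinct words only, sorted so each pair has one canonical key.
        distinct = sorted(set(words[i:i + window]))
        for pair in combinations(distinct, 2):
            counts[pair] += 1
    return counts


words = kmer_words("MKVLA", k=3)
pairs = cooccurrence_counts(words, window=2)
```

In a text-graph setting, counts of this kind would typically be turned into word-word edge weights, while word frequencies per sequence would give the document-word edges.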
-
-
-
Prediction of Nitration Sites Based on FCBF Method and Stacking Ensemble Model
Background: Nitration is an important Post-Translational Modification (PTM) occurring on the tyrosine residues of proteins. The occurrence of protein tyrosine nitration under disease conditions is inevitable and represents a shift from the signal transducing physiological actions of •NO to oxidative and potentially pathogenic pathways. Abnormal protein nitration can lead to serious human diseases, including neurodegenerative diseases, acute respiratory distress, organ transplant rejection and lung cancer. Objective: It is necessary and important to identify the nitration sites in protein sequences. Predicting which tyrosine residues in a protein sequence are nitrated and which are not is of great significance for the study of the nitration mechanism and related diseases. Methods: In this study, a prediction model of nitration sites based on an over- and under-sampling strategy and the FCBF method was proposed, using stacking ensemble learning and the fusion of multiple features. Firstly, each protein sequence sample was encoded by 2701-dimensional fusion features (PseAAC, PSSM, AAIndex, CKSAAP, Disorder). Secondly, the ranked feature set was generated by the FCBF method according to the symmetric uncertainty metric. Thirdly, during model training, the over- and under-sampling technique was used to tackle the imbalanced dataset. Finally, the Incremental Feature Selection (IFS) method was adopted to extract an optimal classifier based on 10-fold cross-validation. Results and Conclusion: The model shows significant performance advantages in indicators such as MCC, recall and F1-score, both when compared with other classifiers on the independent test set and when evaluated by cross-validation on the training set with single-type features or fusion features. By integrating the FCBF feature ranking method, the over- and under-sampling technique and a stacking model composed of multiple base classifiers, an effective prediction model for nitration PTM sites was built, which can achieve a better recall rate when the ratio of positive to negative samples is highly imbalanced.
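The symmetric uncertainty metric used for ranking in FCBF has a standard definition, SU(X, Y) = 2 * I(X; Y) / (H(X) + H(Y)), and the sketch below computes it for discrete variables. It is an illustrative example only: the abstract does not describe how the 2701 continuous features were discretized, so the inputs here are assumed to be already discrete, and the toy feature names and values are hypothetical.

```python
from collections import Counter
from math import log2


def entropy(values):
    """Shannon entropy (in bits) of a sequence of discrete values."""
    n = len(values)
    return -sum((c / n) * log2(c / n) for c in Counter(values).values())


def symmetric_uncertainty(x, y):
    """SU(X, Y) = 2 * I(X; Y) / (H(X) + H(Y)), in the range [0, 1]."""
    hx, hy = entropy(x), entropy(y)
    if hx + hy == 0:
        # Both variables are constant, so there is nothing to share.
        return 0.0
    mutual_info = hx + hy - entropy(list(zip(x, y)))
    return 2.0 * mutual_info / (hx + hy)


# Hypothetical toy data: rank two discrete features against a binary label.
labels = [0, 0, 1, 1]
features = {"informative": [0, 0, 1, 1], "uninformative": [0, 1, 0, 1]}
ranking = sorted(features,
                 key=lambda name: symmetric_uncertainty(features[name], labels),
                 reverse=True)
```

A ranking of this kind is what an incremental feature selection step would then consume, adding features in order and keeping the subset with the best cross-validated score.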
-
-
-
Comparative Proteomic Analysis of Hydrogen Peroxide-Induced Protein Expression in Streptococcus pneumoniae D39
Authors: Sungkyoung Lee, Myoung-Ro Lee, Songmee Bae and Min-Kyu Kwak
Background: Streptococcus pneumoniae is a leading cause of human respiratory tract infection. Despite the lack of activities of antioxidative enzymes, including cytochromes, hemoproteins, and peroxidases/catalases, traits conferring the aerotolerant-anaerobic growth of this bacterium are conserved, with the high efficacy of antioxidative actions, in an oxygen-rich environment. Objective: Through proteome analysis, this study aimed to evaluate differentially expressed proteins and/or gene products modeled in a highly virulent strain, S. pneumoniae D39, exogenously treated with millimolar concentrations of H2O2. Methods: For two-dimensional gel electrophoresis (2-DE) analysis, following one-dimensional isoelectric focusing with an immobilized pH gradient of pH 4-7, the most significantly mobilized proteins expressed were separated by SDS-PAGE in the second dimension. With a total of 431 protein spots detected, certain proteins were excised, in-gel trypsin digested, and analyzed by a combination of MALDI-TOF and LC-ESI-MS/MS for mass spectrometric peptide mapping and protein identification. Using mass spectrometry analysis of spots excised from 2-DE, the selected protein spots were identified with a variety of databases and MASCOT. Results: With the aid of comparisons to proteome reference maps, the 38 most differentially expressed proteins, those with an approximately 1.4-fold or greater increase and/or decrease or with multiple isoforms exhibiting variable pI values, were induced by treatment with exogenous 2 mM H2O2. The identified proteins were seen to be involved in pneumococcal pathogenesis and primary metabolism, amongst others. Conclusion: This is the first study to convincingly document proteomic information associated with pathophysiological adaptation under the given oxidative conditions, and corresponding potential antioxidative mechanisms, in S. pneumoniae.
-
-
-
Heteromerization as a Mechanism Modulating the Affinity of the ACE2 Receptor to the Receptor Binding Domain of SARS-CoV-2 Spike Protein
Authors: Diego Guidolin, Cinzia Tortorella, Deanna Anderlini, Manuela Marcoli, Guido Maura and Luigi F. Agnati
Background: Angiotensin-converting enzyme 2 (ACE2) is primarily involved in the maturation of angiotensin. It is also the main receptor for the Severe Acute Respiratory Syndrome coronavirus 2 (SARS-CoV-2), which caused the serious COVID-19 epidemic. Available evidence indicates that at the cell membrane, ACE2 can form heteromeric complexes with other membrane proteins, including the amino acid transporter B0AT1 and G protein-coupled receptors (GPCR). Objective: It is well known that during the formation of quaternary structures, the configuration of every single monomer is reshaped by its interaction pattern in the macromolecular complex. Therefore, it can be hypothesized that the affinity of ACE2 for the viral receptor-binding domain (RBD), when in a heteromeric complex, may depend on the associated partner. Methods: Using established docking and molecular dynamics procedures, the reshaping of the monomer was explored in silico to predict possible heterodimeric structures between ACE2 and GPCR, such as angiotensin and bradykinin receptors. The associated possible changes in the binding affinity between the viral RBD and ACE2 when in the heteromeric complexes were also estimated. Results and Conclusion: The results support the hypothesis that the heteromerization state of ACE2 may modulate its affinity for the viral RBD. If experimentally confirmed, ACE2 heteromerization may help explain the observed differences in susceptibility to virus infection among individuals and suggest new therapeutic opportunities.
-
-
-
Proteomic Study of the Mechanism of Talin-C as an Inhibitor of HIV Infection
Authors: Lin Yin, Yujiao Zhang, Huichun Shi, Yaru Xing, Hongzhou Lu and Lijun Zhang
Background: Talin-1 is involved in the invasion and synapse development of the Human Immunodeficiency Virus (HIV). We found that talin-1 was cleaved into a 38 kDa fragment (talin-C) in the Peripheral Blood Mononuclear Cells (PBMCs) of HIV patients; however, the underlying mechanisms remain unknown. Objective: This study aimed to determine the relationship between talin-C and HIV infection and to identify the mechanisms underlying the ability of this protein to influence HIV infection. Methods: PBMCs were derived from HIV-infected patients enrolled in this study. N- and C-terminal peptides matching the potential sequence of talin-C were detected in PBMCs by Multiple Reaction Monitoring (MRM) mass spectrometry. TZM-bl cells were infected with HIV-1 pseudotyped virus (HIVpp) for different durations to detect the talin-C product. Three stable cell lines overexpressing the talin head (TLN1-H) or TLN1-C or with TLN1 knockdown (shTLN1) were created and infected with HIVpp. The HIV marker protein (P24) was then detected by enzyme-linked immunosorbent assay. Finally, an isobaric tag for relative and absolute quantification (iTRAQ)-based proteomics study was performed to detect the TLN1-C-regulated proteins with or without HIVpp infection in TZM-bl cells. The identified proteins were analyzed with R version 4.0.2 and STRING software (Version: 11.0) (https://string-db.org). Results: N- and C-peptides of talin-C were detected at higher expression in patients with lower HIV load. Talin-C was produced during HIVpp infection. TLN1-C significantly inhibited HIVpp infection in the TZM-bl cells. Additionally, a proteomic study found that TLN1-C regulated the expression of 99 proteins in TZM-bl cells with and without HIVpp infection, respectively. According to Gene Ontology (GO) annotation, proteins with cellular metabolic processes and binding function were found to be enriched. Thirty-four proteins showed protein-protein interactions, of which 19 were down-regulated and 15 were up-regulated. Conclusion: Talin-C was produced following HIV infection and was inversely proportional to HIV load. A proteomic study indicated that TLN1-C might be involved in HIV infection through regulating metabolic processes.
-
-
-
Proteomic Analysis of Aqueous Humor Proteins Associated with Neovascular Glaucoma Secondary to Proliferative Diabetic Retinopathy
Authors: Ying Wang, Shaolin Xu, Junyi Li, Fujie Yuan, Yue Chen and Kelin Liu
Objective: Extensive retinal ischemia caused by proliferative diabetic retinopathy (PDR) may develop into neovascular glaucoma (NVG). We searched for proteins that might participate in neovascularization by analyzing the aqueous humor (AH) proteome of patients with NVG secondary to PDR, in order to improve understanding of the possible mechanism of neovascularization. Methods: We collected 12 samples (group A) of AH from patients with NVG secondary to PDR as the experimental group and 7 samples (group B) of AH from patients with primary acute angle-closure glaucoma (PAACG) and diabetes mellitus without diabetic retinopathy (NDR) as the control group. Differential quantitative proteome analysis of the aqueous humor samples was performed using the data-independent acquisition (DIA) method. The differentially expressed proteins were functionally annotated by Ingenuity Pathway Analysis (IPA). The important differentially expressed proteins were validated in another group (group A: 5 samples and group B: 5 samples) by the parallel reaction monitoring (PRM) approach. Results: A total of 636 AH proteins were identified, and 82 proteins were differentially expressed between the two groups. Functional annotation showed that the differentially expressed proteins were mainly associated with angiogenesis and cell migration. Signaling pathway analysis showed that the proteins up-regulated in group A were mainly related to Liver X receptor/Retinoid X receptor (LXR/RXR) activation and acute reaction. Conclusion: This pilot study of NVG secondary to PDR provides a better understanding of the mechanisms governing the pathophysiology of NVG.
-
-
-
A Bottom-Up Proteomic Approach in Bone Marrow Plasma Cells of Newly Diagnosed Multiple Myeloma Patients
Authors: Beycan Ayhan, Seçil K. Turan, N. Pınar Barkan, Klara Dalva, Meral Beksaç and Duygu Özel Demiralp
Background: Multiple myeloma (MM) is characterized by infiltration of bone marrow (BM) with clonal malignant plasma cells. The percentage of plasma cells in the BM is required for both diagnosis and prognosis. Objective: Intracellular protein screening and quantitative proteomic analysis were performed in myeloma plasma cells to compare expression between low (0-9%), intermediate (10-20%) and high (>20%) plasma cell infiltration groups. Methods: BM aspiration samples were collected from newly diagnosed untreated patients with MM. The samples were pooled into three groups according to the plasma cell content (PCC) in the BM: group 1 (0-9%), group 2 (10-20%) and group 3 (>20%). Protein profiles were obtained and proteins were identified by peptide mass fingerprinting analysis. Results: Differentially expressed proteins were detected between all groups. The identified proteins were Endoplasmin, Calreticulin, Protein Disulfide-isomerase, Marginal zone B and B1 cell specific protein/pERp1, Actin cytoplasmic 1, Myeloblastin, Thioredoxin domain-containing protein 5, Ig kappa chain C region, Apoptosis regulator B-cell lymphoma 2 and Peroxiredoxin-4. Conclusion: Proteins involved in cell proliferation, apoptosis, redox homeostasis and unfolded protein disposal through the endoplasmic reticulum-associated degradation machinery were found to be correlated with PCC. Our results confirm earlier reports regarding the potential effects of the identified proteins in the major signaling pathways that lead to cancer. Moreover, this study reveals a novel association between PCC levels and MM. It further highlights the roles of Marginal zone B and B1 cell specific proteins in MM, which could be used as candidate biomarkers in future studies.
-
-
-
A Combined Method of Protein Extraction from Unorthodox Plant Samples for Proteomics
Authors: Can Yilmaz and Mesude Iscan
Aims: This study aimed to generate an improved method of protein extraction and purification from plant tissues containing very high amounts of phenolic compounds and other interfering biomolecules. Background: Protein extraction in proteomic studies on some plant species, including conifers, is challenging, and the yield and quality are unpredictable. Objective: Two popular protocols were combined to construct a novel one with an enhanced ability to produce higher-purity samples compatible with high-precision molecular systems and analysis. Methods: The new method was compared with the other two for efficiency in classical SDS-PAGE, 2-DE and capillary chromatography applications. Results: All three methods were comparable in the SDS-PAGE procedure; however, only the new method produced acceptable gel images in 2-DE. Bioanalyzer results also demonstrated that the new method provided protein samples pure enough to be used in capillary chromatography, with twice as many peaks in electropherograms, lower noise, and higher total relative protein concentrations closest to the applied amount. Conclusion: The new combined method is a successful alternative for plant proteomicists, offering higher yield and quality of proteins from recalcitrant tissues. Other: The new method could be preferred especially for high-tech, sensitive proteomic analysis.
-
-
-
Machine Learning-Based Virtual Screening Strategy Reveals Some Natural Compounds as Potential PAK4 Inhibitors in Triple Negative Breast Cancer
Background: p21-activated kinase 4 (PAK4) is implicated in the poor prognosis of many cancers, especially in the progression of Triple Negative Breast Cancer (TNBC). The present study was aimed at designing some potential drug candidates as PAK4 inhibitors for breast cancer therapy. Objective: This study aimed to find novel inhibitors of PAK4 from natural compounds using a computational approach. Methods: An e-pharmacophore model was developed from the docked PAK4-co-ligand complex and used to screen over a thousand natural compounds downloaded from the BIOFACQUIM and NPASS databases to match a minimum of 5 sites for the selected (ADDDHRR) hypothesis. The robustness of the virtual screening method was assessed by well-established metrics including EF, ROC, BEDROC, AUAC, and RIE. Compounds with a fitness score greater than one were filtered by applying molecular docking (HTVS, SP, XP and Induced fit docking) and ADME prediction. Using a machine learning-based approach, a QSAR model was generated with Automated QSAR. The computed top model kpls_des_17 (R2 = 0.8028, RMSE = 0.4884 and Q2 = 0.7661) was used to predict the pIC50 of the lead compounds. Internal and external validations were assessed to determine the predictive quality of the model. Finally, the binding free energy was computed. Results: The robustness/predictive quality of the models was affirmed. The hits had better binding affinity than the reference drug and interacted with key amino acids for PAK4 inhibition. Overall, the present analysis yielded three potential inhibitors that are predicted to bind with PAK4 better than the reference drug tamoxifen. The three potent novel inhibitors, vitexin, emodin and ziganein, recorded IFD scores of -621.97 kcal/mol, -616.31 kcal/mol and -614.95 kcal/mol, respectively, while showing moderation for ADME properties and inhibition constant. Conclusion: It is expected that the findings reported in this study may provide insight for designing effective and less toxic PAK4 inhibitors for triple negative breast cancer.
