Current Protein and Peptide Science - Volume 15, Issue 6, 2014
Editorial (Thematic Issue: Protein Systems Biology: Method, Regulation, and Network)
Authors: Qingfeng Chen and Ming Chen
The advent of advanced experimental techniques for discovering conservation and interactions between molecules has prompted the systematic study of biological regulation networks. The growing body of data on organisms, genome-scale conservation and co-expression yields valuable sources for exploring the functional and regulatory roles of biological systems. A biological network can easily be represented as a graph by mapping its basic entities and interactions to nodes and edges, respectively. It is important to develop effective methods for comparing such networks from different organisms and discovering their common or frequent subgraphs. This can help reveal their functions and the exact features or schemes that carry out these functions. There have been increasing efforts to find characteristic interaction patterns conserved across several biological networks. The purpose of this special issue is to discuss the state of the art in techniques and methods for discovering diverse regulatory networks with respect to metabolism, PPI (protein-protein interaction), gene expression, and signaling pathways.
THE ADVENT OF BIOLOGY BIG DATA
There has been a biological data explosion owing to the application of advanced experimental technologies, such as next-generation sequencing or deep sequencing. For example, the European Bioinformatics Institute (EBI), “one of the world's largest biology-data repositories, currently contains 20 petabytes (1 petabyte is 10^15 bytes) of data and back-ups regarding genes, proteins and small molecules” [1]. This not only makes it possible to perform a comprehensive analysis of the genome and the transcriptome of a specific species, but also poses a major challenge in handling, processing and extracting information from massive data sets. These data have been widely used to produce various research solutions.
Since high-throughput instruments have greatly reduced costs, small biology labs can also generate big data. Users of big data may likewise come from small labs that lack such facilities but can access online data from public repositories. Biological data are usually more heterogeneous than conventional transaction data, since different experiments give rise to different kinds of information, such as protein-protein interactions (PPI), various sequences, RNA secondary structures and detections in the transcriptome. This generates a demand for biological data mining to access big data sets and to integrate, analyze, compare and interpret the complex data [2]. In many ongoing studies, data sharing has become a popular way to carry out large-scale genomic or proteomic comparisons. However, the traditional approach of downloading the data, storing them on one's own computer and analyzing them locally is time-consuming and costly. Moreover, not all users can have the required computational facilities, such as supercomputers and software. This creates a need for flexible, public computational platforms, including intelligent strategies for the storage, management and analysis of biological big data. A number of companies, institutes and labs have been established for different commercial and academic purposes, such as the National Center for Biotechnology Information (NCBI), EBI and the Beijing Genomics Institute (BGI). They provide open-access data sources, including data download and search, and usually allow one to obtain a data set from one location. Helping scientists share their data across countries and regions also requires shared computational resources, so that users can access hardware and software on demand. Cloud computing is a recently emerging technology for coping with biological big-data mining.
Cloud platforms not only offer virtual storage for data, software and results that a selected group of collaborators can share, but also prevent unauthorized users from accessing them [3]. Scientists can download and upload data and use software via cloud-based platforms. Many data sets and software programs are hosted in large offsite centers. For example, the IT Center for Science (http://www.csc.fi/english) is a high-performance computing centre run and funded by the government of Finland. Embassy Cloud is a cloud-computing component for ELIXIR by EBI, which provides safe computational environments and a data download service for comparison purposes. These cloud-based infrastructures address the continued growth of data and give scientists quick access to the information they need. Nevertheless, transferring big data between local and remote sites and sharing data between collaborators remain major challenges owing to unexpected interruptions of data transfer. Regulatory networks have become a prevalent way to store and manage large volumes of biological data by modeling molecular interactions. Studying biomolecular networks and their correlations in depth assists in understanding cellular behaviors and uncovering their functions in cellular systems. As a result, protein systems biology, one of the most important forms of network systems biology, will play a central role in life science. Life science is becoming data-driven. Big data science, including data management, sharing and analysis, is useful for constructing dynamic and interacting protein regulatory networks in an organism.
THE IMPORTANCE OF PROTEIN SYSTEMS BIOLOGY
Systems biology has been a hot research topic since 2000, spanning the construction of diverse biological systems, data visualization, and big-data management and analysis in molecular biology and biomedicine.
Molecules, cells, tissues and organs are not independent but perform their functions together in a systematic way. Systems biology has been widely applied in both biological and biomedical studies to explore complex interactions between components within biological systems by virtue of computational methods and mathematical models. Gene networks and protein networks are two typical networks of systems biology, in which the properties and patterns of protein-based regulation and gene-related components play an important role in understanding the functions and behaviors of biological systems [4]. A number of commercial and academic research institutes, centers and labs have been established for systems biology investigation. The FAS Center for Systems Biology is an interdepartmental initiative at Harvard University that aims to explain the structure, behavior and evolution of cells and organisms by combining quantitative and systematic measurements, including genomics and proteomics, with computational biology and mathematical models to extract and describe the dynamical behavior of groups of interacting components. Systems problems have become an important topic in computational biology research and medicine design. The New South Wales Systems Biology Initiative (http://www.systemsbiology.org.au/) was funded by the Australian Research Council and the NSW State Government. It is located at the University of New South Wales and aims to develop bioinformatics algorithms and tools for genomics and proteomics. The Systems Biology Institute (SBI) was established in 2000 to facilitate systems biology research in several important areas related to healthcare and global sustainability. It has been involved in a number of research programs, mostly supported by the Japanese government and private foundations.
“Pathways have been viewed as a convenient way of summarizing the results of a collection of experiments to describe the flow of signals or metabolites in a cell. A number of databases regarding metabolic and signaling pathways are developed to represent the relationships between molecules involved in various events, including reactions or as activation or inhibition” [3]. Despite many attempts to extract properties and details of interactions, such as phosphorylation sites, there are generally insufficient functional details to interpret the actual meaning of the link between two proteins. Knowledge of the relevant molecules, their identified binding sites and their interactions can greatly advance the understanding of protein systems biology.
THE MOTIVATION FOR NOVEL COMPUTATIONAL METHODS
A great many molecular interactions have been unveiled, but knowledge of the precise interactions is still far from complete. The difficulty of predicting the behavior of the genes and proteins involved arises mainly from the complexity of turning an abstract biological system into models that faithfully reflect the real system, and from the heterogeneity and size of biological big data drawn from multiple sources. The paradigm of systems biology thus generates a demand for computational methods, interaction prediction and network construction. High-throughput sequencing projects have identified collections of components that function in an organism. Many post-genomic studies aim to extract the relationships among them. Systems biology is thus motivated to make sense of these relationships by considering them together, and simulates the manner in which the participating molecules work together to achieve a designated outcome or perform targeted functions. As a result, traditional molecular biology, which focused on single molecules, has given way to systems biology, which explores pathways, complexes or even whole organisms.
To understand diverse pathways and/or networks of gene regulation, scientists must have a good knowledge of the correlations between protein and protein, protein and metabolite, and protein and nucleic acid. Structural information offers a comprehensive understanding of interactions between molecules by providing atomic details of binding. However, it takes time to obtain detailed structural information for large complexes or whole systems. This urges us to develop new computational methods to discover and model the relationships between interacting molecules.
CONTRIBUTIONS TO THIS ISSUE
The articles included in this special issue are classified into protein function prediction, protein-protein interaction, and protein regulation pattern. Methods for the construction and characterization of amino acid networks are reviewed by Jianhong Zhou et al. The authors summarize and discuss network properties applied to native structure selection, and provide a future perspective on the application of amino acid networks for detecting native folds among decoy sets. Wei Peng et al. propose an unbalanced Bi-random walk (UBiRW) algorithm to predict protein functions; it iteratively walks different numbers of steps in the two networks (the PPI network and the functional interrelationship network) to find protein-GO term associations according to some known associations. “The interface in a complex involves two structurally matched protein subunits, and the binding sites can be predicted by identifying structural matches at protein surfaces” [5]. Understanding the energetics and mechanisms of complexes remains one of the essential problems in binding site prediction. Fei Guo et al. developed a system, P-Binder, for identifying binding sites based on structural compatibility, side-chain conformations, amino acid types and contact energies. The system reports improvements in prediction correctness in terms of both accuracy and coverage.
Among the most important networks maintaining biological functions, protein-protein interactions span from local binary interactions to an entire cell. Understanding how interacting partners recognize and bind each other precisely remains a long-sought scientific goal. Compared with other existing methods, the least squares regression (LSR) approach proposed by De-Shuang Huang et al. is a powerful tool to characterize protein-protein correlations and to infer PPI, whilst keeping high performance in the prediction of PPI networks. The review article written by Chiranjib Chakraborty et al. enhances our knowledge of how PPI network architecture can be used to validate a drug target. In its conclusion, future directions of PPI in target discovery and drug design are suggested. After reviewing the key regulators in the hydroxylated triacylglycerol-ricinoleate biosynthesis pathway of castor bean, Yujie Chen et al. analyze several of them in terms of structure, function prediction and similarity of expression patterns, aiming to give insight into the biosynthesis of this energy-rich molecule and the performance of the key regulators in the pathways. Lili Liu et al. define the organelle-focused proteome and interactome of rice based on manual annotation, manual adjustment and cross validation of predictors. Furthermore, the cross-talk bias between different organelles and the functional organization of nine organelles are explored. Wei Lan et al. explore the overlooked positions of microRNAs (miRNAs) based on sequential and structural features, since miRNAs have been recognized as important regulators in a wide range of biological processes. These functions may be exploited for miRNA-mediated regulation of protein expression. Collision entropy is applied to measure the degree of importance of each miRNA position. In particular, two thresholds are used to prune unimportant positions.
The findings unveil important positions of miRNAs related to biogenesis and function. “Rapid advances in network biology indicate that cellular networks are governed by universal laws and offer a new conceptual framework that could potentially revolutionize our view of biology and disease pathologies in the twenty-first century” [6].
CONCLUSIONS
Systems biology has been successful in predicting the behavior of sets of molecules involved in biological systems and in understanding their interactions. As an important branch of systems biology, protein systems biology focuses on investigating the properties and patterns of protein-related interactions. Owing to the application of high-throughput sequencing techniques, biological big data has become a major challenge to both biologists and computer scientists. Traditional computational methods based on local data and computational facilities have shown their limitations in handling large volumes of biological data from multiple sources. Thus, it is crucial to develop public computational platforms, including equipment and software for the storage, management and analysis of biological big data. This requires the collaboration of scientists from biology, computer science and mathematics. The effort to understand the properties of components in biological systems and their relationships gives rise to the research topics described in this special issue.
ACKNOWLEDGEMENTS
We wish to thank all the authors who have contributed their work to foster the dissemination of scientific excellence in the protein network biology field, and all the reviewers for giving their time and expertise to evaluate the manuscripts submitted for this publication. The work reported in this paper was partially supported by National Natural Science Foundation of China projects 61363025 and 31371328, and two key projects of the Natural Science Foundation of Guangxi, 053006 and 019029.
-
-
-
Amino Acid Network for the Discrimination of Native Protein Structures from Decoys
Authors: Jianhong Zhou, Wenying Yan, Guang Hu and Bairong Shen
With the development of structural genomics projects, the discrimination of native proteins from decoys has become one of the major challenges in protein structure prediction. In comparison with energy-function-based techniques, the amino acid network provides a simple but efficient method for native structure selection. An amino acid network (AAN) is a graph representation of protein structure in which the amino acids of the protein are the nodes and their interactions or contacts are the edges. In this review, we first briefly summarize the methods for the construction and characterization of AANs. Then the four network properties applied to native structure selection, i.e. average degree, complexity, clustering coefficient of the largest cluster (CCoe) and the size of the top large communities (CComS), are discussed and summarized. We conclude with a discussion of the future perspective on the application of AANs for detecting native folds among decoy sets.
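The graph representation described in this abstract can be illustrated with a minimal sketch. It assumes that residues are linked when their C-alpha atoms lie within a distance cutoff; the 7.0 angstrom cutoff, the function names and the toy coordinates are all assumptions made for illustration, and the review itself may use other contact definitions. Only the average degree and a plain average clustering coefficient are computed here, not the CCoe or CComS measures named above.

```python
from itertools import combinations
from math import dist


def build_aan(coords, cutoff=7.0):
    """Build adjacency sets: residues i and j are linked when their
    C-alpha atoms are within `cutoff` angstroms (assumed threshold)."""
    adj = {i: set() for i in range(len(coords))}
    for i, j in combinations(range(len(coords)), 2):
        if dist(coords[i], coords[j]) <= cutoff:
            adj[i].add(j)
            adj[j].add(i)
    return adj


def average_degree(adj):
    """Mean number of contacts per residue."""
    return sum(len(nbrs) for nbrs in adj.values()) / len(adj)


def clustering_coefficient(adj, node):
    """Fraction of a node's neighbor pairs that are themselves linked."""
    nbrs = adj[node]
    k = len(nbrs)
    if k < 2:
        return 0.0
    links = sum(1 for a, b in combinations(nbrs, 2) if b in adj[a])
    return 2.0 * links / (k * (k - 1))


def average_clustering(adj):
    """Mean clustering coefficient over all residues."""
    return sum(clustering_coefficient(adj, n) for n in adj) / len(adj)


# Toy C-alpha coordinates (hypothetical, not taken from a real structure).
toy_coords = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0), (3.8, 3.8, 0.0)]
toy_adj = build_aan(toy_coords)
toy_avg_degree = average_degree(toy_adj)
toy_avg_clustering = average_clustering(toy_adj)
```

In a native-versus-decoy comparison, such properties would be computed for every candidate structure and compared across the set; how they are combined into a selection rule is not specified in this abstract.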
-
-
-
Predicting Protein Functions by Using Unbalanced Bi-Random Walk Algorithm on Protein-Protein Interaction Network and Functional Interrelationship Network
Authors: Wei Peng, Jianxin Wang, Lu Chen, Jiancheng Zhong, Zhen Zhang and Yi Pan
Accurate annotation of protein functions is still a big challenge for understanding life in the post-genomic era. Recently, some methods have been developed to solve the problem by incorporating the functional similarity of GO terms into the protein-protein interaction (PPI) network. They are based on the observations that a protein tends to share some common functions with the proteins that interact with it in the PPI network, and that two similar GO terms in the functional interrelationship network usually co-annotate some common proteins. However, these methods annotate protein functions by considering neighbors of proteins and neighbors of GO terms at the same level, and few attempts have been made to investigate their difference. Given the topological and structural differences between the PPI network and the functional interrelationship network, we first investigate at which level neighbors of proteins tend to have functional associations and at which level neighbors of GO terms usually co-annotate some common proteins. Then, an unbalanced Bi-random walk (UBiRW) algorithm, which iteratively walks different numbers of steps in the two networks, is adopted to find protein-GO term associations according to some known associations. Experiments are carried out on S. cerevisiae data. The results show that our method achieves better prediction performance than both methods that use only PPI network data and methods that consider neighbors of proteins and of GO terms at the same level.
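The idea of walking a different number of steps in each network can be sketched as follows. This is a generic bi-random walk with restart, written under stated assumptions, and not necessarily the authors' exact formulation: the row normalization, the restart weight `alpha`, the averaging of the two walks, and all function names are hypothetical choices for illustration.

```python
def matmul(a, b):
    """Multiply two dense matrices stored as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def row_normalize(m):
    """Scale each row to sum to 1 (all-zero rows stay zero)."""
    out = []
    for row in m:
        total = sum(row)
        out.append([x / total if total else 0.0 for x in row])
    return out


def blend(walked, seed, alpha):
    """Mix one propagation step with the known associations (restart)."""
    return [[alpha * w + (1 - alpha) * s for w, s in zip(w_row, s_row)]
            for w_row, s_row in zip(walked, seed)]


def unbalanced_birw(P, G, A, left_steps, right_steps, alpha=0.5):
    """P: protein-by-protein transition matrix, G: term-by-term transition
    matrix, A: protein-by-term known associations. The walk on P stops
    after `left_steps` iterations and the walk on G after `right_steps`."""
    R = [row[:] for row in A]
    for t in range(1, max(left_steps, right_steps) + 1):
        updates = []
        if t <= left_steps:    # step in the protein network
            updates.append(blend(matmul(P, R), A, alpha))
        if t <= right_steps:   # step in the GO term network
            updates.append(blend(matmul(R, G), A, alpha))
        # Average whichever walks are still active in this iteration.
        R = [[sum(u[i][j] for u in updates) / len(updates)
              for j in range(len(A[0]))] for i in range(len(A))]
    return R


# Toy example: two interacting proteins, two unrelated GO terms,
# and one known annotation (protein 0 with term 0).
P = row_normalize([[0, 1], [1, 0]])
G = row_normalize([[1, 0], [0, 1]])
A = [[1.0, 0.0], [0.0, 0.0]]
scores = unbalanced_birw(P, G, A, left_steps=2, right_steps=1)
```

In this toy case the unannotated protein 1 receives a positive score for term 0 through its interaction with protein 0, while term 1 stays at zero for both proteins.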
-
-
-
Identifying Protein-Protein Binding Sites with a Combined Energy Function
Authors: Fei Guo, Shuai C. Li, Ying Fan and Lusheng Wang
Determination of binding sites between proteins is widely applied in many fields, such as drug design and structural and functional analysis. Protein-protein binding sites can be formed by two subunits in a complex. Understanding the energetics and mechanisms of complexes remains one of the essential problems in binding site prediction. We develop a system, P-Binder, for identifying binding sites based on shape complementarity, side-chain conformations and interacting amino acid information. P-Binder utilizes an enumeration method to generate all possible configurations between two proteins, and uses a side-chain packing program to identify the bound states. The system reports the binding sites with the highest ranked configurations, evaluated through a linear combination of four statistical energy terms. The experiments show that our method performs better than other prediction methods. A comparison with some existing approaches shows that P-Binder improves the success rate by at least 12.3%. We test P-Binder on 176 protein-protein complexes in Benchmark v4.0. The overall values of accuracy and coverage are 63.8% and 68.8% for the bound state, and 51.0% and 60.9% for the unbound state.
-
-
-
Prediction of Protein-Protein Interactions Based on Protein-Protein Correlation Using Least Squares Regression
Authors: De-Shuang Huang, Lei Zhang, Kyungsook Han, Suping Deng, Kai Yang and Hongbo Zhang
Several methods have been proposed to transform protein sequences into feature vectors, such as auto covariance (AC), conjoint triad (CT), local descriptor (LD), Moran autocorrelation (MA) and normalized Moreau-Broto autocorrelation (NMB). In this paper, we adopt each of these transformation methods to encode the proteins, where AC, CT, LD, MA and NMB are all represented by ‘+’ in a unified manner. A new method, i.e. the combination of least squares regression with ‘+’ (abbreviated as LSR+), is introduced for encoding a protein-protein correlation-based feature representation and an interacting protein pair. There are thus five different combinations for LSR+, i.e. LSRAC, LSRCT, LSRLD, LSRMA and LSRNMB. We combine a support vector machine (SVM) approach with LSR+ to predict protein-protein interactions (PPI) and PPI networks. The proposed method has been applied to four datasets, i.e. Saccharomyces cerevisiae, Escherichia coli, Homo sapiens and Caenorhabditis elegans. The experimental results demonstrate that all LSR+ methods outperform many existing representative algorithms. Therefore, LSR+ is a powerful tool to characterize protein-protein correlations and to infer PPI, whilst keeping high performance in the prediction of PPI networks.
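As an illustration of one of the sequence encodings named above, the following sketch computes auto covariance (AC) features for a single physicochemical property. It uses the commonly cited definition of AC (the mean product of deviations at positions separated by a given lag) and the Kyte-Doolittle hydrophobicity scale; the choice of a single unnormalized property, the lag range and the function names are assumptions for illustration, and the authors' encoding and the subsequent least squares regression step may differ.

```python
# Kyte-Doolittle hydrophobicity values for the 20 standard amino acids.
HYDROPHOBICITY = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def auto_covariance(values, max_lag):
    """Return [AC(1), ..., AC(max_lag)] for a numeric profile, where
    AC(lag) averages (x_i - mean) * (x_{i+lag} - mean) over all valid i."""
    n = len(values)
    if max_lag >= n:
        raise ValueError("max_lag must be smaller than the profile length")
    mean = sum(values) / n
    dev = [v - mean for v in values]
    return [sum(dev[i] * dev[i + lag] for i in range(n - lag)) / (n - lag)
            for lag in range(1, max_lag + 1)]


def encode_sequence(sequence, max_lag):
    """Map a protein sequence to a fixed-length AC feature vector
    (one property only; real encodings typically use several)."""
    profile = [HYDROPHOBICITY[residue] for residue in sequence]
    return auto_covariance(profile, max_lag)


# Hypothetical short peptide, used only to show the call.
features = encode_sequence("MKVLAAGIV", max_lag=3)
```

The resulting vector has one entry per lag regardless of sequence length, which is what makes such encodings usable as fixed-size inputs to a classifier such as an SVM.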
-
-
-
Evaluating Protein-protein Interaction (PPI) Networks for Diseases Pathway, Target Discovery, and Drug-design Using ‘In silico Pharmacology’
Authors: Chiranjib Chakraborty, George Priya Doss. C, Luonan Chen and Hailong Zhu
In silico pharmacology is a promising field in the current state of drug discovery. This area exploits protein-protein interaction (PPI) network analysis for drug discovery using the drug “target class”. To document the current status, we discuss in this article how an integrated system of PPI networks contributes to understanding disease pathways, and present state-of-the-art drug target discovery and the drug discovery process. This review article enhances our knowledge of conventional drug discovery and current drug discovery using in silico techniques, the best “target class”, the universal architecture of PPI networks, the present scenario of disease pathways and protein-protein interaction networks, and methods to comprehend PPI networks. Taken together, a snapshot is given of how PPI network architecture can be used to validate a drug target. In conclusion, we illustrate the future directions of PPI in target discovery and drug design.
-
-
-
Crucial Enzymes in the Hydroxylated Triacylglycerol-ricinoleate Biosynthesis Pathway of Castor Bean
Authors: Yujie Chen, Lili Liu, Xun Tian, Jianjun Di, Yalatu Su, Fenglan Huang and Yongsheng Chen
Castor bean (Ricinus communis L.) is an important oilseed crop because it is rich in hydroxylated triacylglycerol (TAG)-ricinoleate, a raw material with wide industrial applications. Hydroxylated TAG synthesis occurs through complicated pathways spanning multiple subcellular organelles. Some crucial enzymes have been identified in previous studies. Analysis of the available castor tissue-specific transcriptome sequencing data and comparison with the classic pathways in other plants revealed a possible de novo biosynthesis pathway for hydroxylated TAG. In this study, further crucial enzymes were ascertained, and their expression levels were characterized and mapped onto the pathways in castor. Several key enzymes were analyzed in terms of structure, biofunction prediction and similarity of expression patterns, aiming to give insight into the molecular biology of this oil-rich plant and the performance of the crucial enzymes in the hydroxylated triacylglycerol-ricinoleate biosynthesis pathways.
-
-
-
The Organelle-focused Proteomes and Interactomes in Rice
Protein subcellular localization has been a long-standing key problem in investigating proteins’ function, which provides important clues for revealing their functions and aids in understanding their interactions with other biomolecules at the cellular level. Here, we systematically defined the organelle-focused proteomes and interactomes in Oryza sativa. A total of 83.42% of the whole rice proteome obtained their subcellular localizations based on manual annotation, manual adjustment and predictors’ cross validation. The final organelle-focused interactomes were located in nine organelles. Furthermore, we discussed the cross talk bias between different organelles and the function organization accounting for nine organelles. Motif analysis illustrated the protein interaction bias in different organelles to implement certain biology functions. We exemplified the connection between functions and the overrepresented motifs in the organelle-focused interactomes and exemplified how to infer the functions of unknown proteins via expanding the overrepresented motifs.
-
-
-
Identification of Important Positions within miRNAs by Integrating Sequential and Structural Features
Authors: Wei Lan, Qingfeng Chen, Taoshen Li, Changan Yuan, Scott Mann and Baoshan Chen
MicroRNA (miRNA) is a small, single-stranded non-coding RNA which plays an important regulatory role in gene expression. Additionally, miRNAs perform crucial functions in a wide range of biological processes. These functions may be exploited for miRNA-mediated regulation of protein-protein interaction and thus protein function. Many computational methods have been developed to predict miRNA targets and to explore the regulatory mechanism between miRNA and protein. However, the efforts to investigate important positions within miRNAs are not comprehensive. This paper presents a framework to identify important positions using collision entropy. The information contained in the sequence and secondary structure of miRNAs is considered. Further, the single-base collision entropy and the adjacent-base-related collision entropy are integrated to measure the importance of each miRNA position. Two thresholds are employed to select the positions with more biological meaning. A dataset of Drosophila melanogaster is used in the experiments. The results demonstrate that our approach can find interesting and important positions within miRNAs and may lead to a better understanding of miRNA biogenesis and function.
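Collision entropy is the Rényi entropy of order 2, H2 = -log2(sum of p_i^2), where p_i is the frequency of each symbol. The sketch below computes it per position, and per pair of adjacent positions, over a set of equal-length sequences. It is only an illustration under assumptions: the paper's integration of sequence and secondary-structure information and its two thresholds are not reproduced, and the function names and toy sequences are hypothetical.

```python
from collections import Counter
from math import log2


def collision_entropy(symbols):
    """Renyi entropy of order 2 of the empirical symbol distribution."""
    counts = Counter(symbols)
    total = sum(counts.values())
    return -log2(sum((c / total) ** 2 for c in counts.values()))


def single_base_entropy(sequences):
    """Collision entropy of the bases observed at each position."""
    length = min(len(s) for s in sequences)
    return [collision_entropy([s[i] for s in sequences])
            for i in range(length)]


def adjacent_base_entropy(sequences):
    """Collision entropy of the dinucleotides starting at each position."""
    length = min(len(s) for s in sequences)
    return [collision_entropy([s[i:i + 2] for s in sequences])
            for i in range(length - 1)]


# Toy sequences (hypothetical, not real miRNAs): the first two positions
# are fully conserved, the last one is maximally variable.
toy_sequences = ["UGAG", "UGAC", "UGCA", "UGUU"]
single = single_base_entropy(toy_sequences)
adjacent = adjacent_base_entropy(toy_sequences)
```

Low entropy marks a conserved position and high entropy a variable one; which end of the scale counts as "important", and where the thresholds sit, depends on the framework described in the paper.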
-
-
-
Selenium and Selenoproteins: An Overview on Different Biological Systems
Authors: Erika Mangiapane, Alessandro Pessione and Enrica Pessione
Selenium (Se) is an essential trace element for humans, plants and microorganisms. Inorganic selenium is present in nature in four oxidation states: selenate, selenite, elemental Se and selenide, in decreasing order of redox status. These forms are converted by all biological systems into more bioavailable organic forms, mainly the two seleno-amino acids selenocysteine and selenomethionine. Humans, plants and microorganisms are able to fix these amino acids into proteins, giving rise either to Se-containing proteins, by a simple replacement of methionine with selenomethionine, or to “true” selenoproteins, if the insertion of selenocysteine is genetically encoded by a specific UGA codon. Selenocysteine is usually present in the active site of enzymes, being essential for their catalytic activity. This review focuses on the strategies adopted by different biological systems for selenium incorporation into proteins and on the importance of this element for the physiological functions of living organisms. The best-known selenoproteins of humans and microorganisms are listed, highlighting the importance of this element and the problems connected with its deficiency.
-
-
-
Multi-Faceted Arginine: Mechanism of the Effects of Arginine on Protein
Authors: Tsutomu Arakawa and Yoshiko Kita
Arginine is widely used in such applications as protein refolding, solubilization of proteins and small molecules, protein and small molecule formulation, column chromatography and viral inactivation, as summarized in this review. Arginine's effectiveness in these applications is largely based on its ability to suppress protein-protein interactions and protein-surface interactions. The mechanism of these widespread effects of arginine on proteins can be explained at least in part by its unique interactions with the protein surface. Here we describe the modes of interaction of arginine with model compounds, proteins and water molecules, and then attempt to explain the mechanism of its effect on proteins by comparison with the interactions that occur between proteins and protein denaturants or stabilizers.
-
-
-
Thioredoxin Reductase and its Inhibitors
Thioredoxin plays a crucial role in a wide range of physiological processes, which span from the reduction of nucleotides to deoxyribonucleotides to the detoxification of xenobiotics, oxidants and radicals. The redox function of Thioredoxin is critically dependent on the enzyme Thioredoxin NADPH Reductase (TrxR). In view of its indirect involvement in the above-mentioned physio/pathological processes, inhibition of TrxR is an important clinical goal. As a general rule, the affinities and mechanisms of binding of TrxR inhibitors to the target enzyme are known with little precision, and conflicting results abound in the literature. A critical analysis of published results and experimental procedures is therefore needed, also in view of the high interest in TrxR inhibitors. We review the inhibitors of TrxR and related flavoreductases and the classical treatment of reversible competitive, non-competitive and uncompetitive inhibition with respect to TrxR, and in some cases we are able to reconcile contradictory results generated by oversimplified data analysis.
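The classical treatment of reversible inhibition mentioned in this abstract can be summarized by the textbook Michaelis-Menten rate law for mixed inhibition, v = Vmax·S / (Km·(1 + I/Kic) + S·(1 + I/Kiu)), of which the competitive, uncompetitive and pure non-competitive cases are special instances. The sketch below implements that general textbook formula; it is not taken from the review, and the parameter names and numeric values are hypothetical.

```python
from math import inf


def inhibited_rate(s, i, vmax, km, kic=inf, kiu=inf):
    """Steady-state rate under reversible inhibition.

    s, i : substrate and inhibitor concentrations
    kic  : inhibition constant for binding to the free enzyme
    kiu  : inhibition constant for binding to the enzyme-substrate complex
    An infinite constant switches the corresponding binding mode off.
    """
    return vmax * s / (km * (1 + i / kic) + s * (1 + i / kiu))


# Hypothetical parameters in arbitrary but consistent units.
VMAX, KM, S, I = 10.0, 2.0, 2.0, 3.0

uninhibited = inhibited_rate(S, 0.0, VMAX, KM)
competitive = inhibited_rate(S, I, VMAX, KM, kic=3.0)            # raises apparent Km
uncompetitive = inhibited_rate(S, I, VMAX, KM, kiu=3.0)          # lowers Vmax and Km
noncompetitive = inhibited_rate(S, I, VMAX, KM, kic=3.0, kiu=3.0)  # lowers apparent Vmax
```

At a single substrate concentration the competitive and uncompetitive cases can give the same rate, as they do with these toy numbers, which is one reason the mechanisms can only be distinguished by measuring rates across a range of substrate and inhibitor concentrations.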
