Volume 15, Issue 3
  • ISSN: 1570-1646
  • E-ISSN: 1875-6247

Abstract

Background: The Protein Data Bank is a worldwide repository that collects and provides macromolecular data on protein structures and other molecules for the life sciences community. Manipulating vast numbers of 3D protein structures and exploring their properties requires parsing thousands of flat files that describe these macromolecular structures every time calculations are performed. Objective: Expecting more protein structures to appear in open-access repositories such as the Protein Data Bank, and aiming to meet the expectations of the era of fast data analytics, we propose an in-memory management system for protein structures that predominantly uses the main memory of the host server to store, manage, and manipulate data. This eliminates the overhead of loading data from hard drives and storing them in a buffer cache. Method: In this paper, we present the in-memory protein structure management system (IMPSMS), which supports basic operations such as selecting, inserting, updating, and searching protein structures, as well as more sophisticated functions such as batch calculation of the root mean square deviation between proteins stored in the database, batch calculation of torsion angles, structure comparison, structural alignment, and superposition of a given molecule onto molecules stored in the in-memory database. Results: In the experimental part, we show that, with dedicated in-memory data structures, particular operations on proteins can be performed as much as a hundred times faster than analogous operations preceded by traditional loading and parsing of macromolecular data from standard PDB flat files. Conclusion: Our work demonstrates that designing dedicated data structures and management systems for frequent protein data manipulations brings significant time savings and increases the capability to run fast data analytics in bioinformatics.
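The idea of keeping parsed structures in main memory and running batch calculations over them can be illustrated with a minimal sketch. It is not the IMPSMS implementation: the class `InMemoryStructureStore`, its methods, and the `rmsd` helper are hypothetical names introduced here, structures are reduced to plain lists of atom coordinates, and the root mean square deviation is computed without prior superposition, assuming the atoms of both structures are already paired one to one.

```python
import math
from typing import Dict, List, Tuple

Coord = Tuple[float, float, float]


def rmsd(a: List[Coord], b: List[Coord]) -> float:
    """RMSD between two equally long coordinate lists (no superposition)."""
    if len(a) != len(b) or not a:
        raise ValueError("structures must be non-empty and of equal length")
    total = 0.0
    for (x1, y1, z1), (x2, y2, z2) in zip(a, b):
        # Squared Euclidean distance between paired atoms.
        total += (x1 - x2) ** 2 + (y1 - y2) ** 2 + (z1 - z2) ** 2
    return math.sqrt(total / len(a))


class InMemoryStructureStore:
    """Hypothetical store: structure id -> atom coordinates kept in memory."""

    def __init__(self) -> None:
        self._structures: Dict[str, List[Coord]] = {}

    def insert(self, structure_id: str, coords: List[Coord]) -> None:
        # Coordinates are stored once, so later queries need no file parsing.
        self._structures[structure_id] = list(coords)

    def select(self, structure_id: str) -> List[Coord]:
        return self._structures[structure_id]

    def batch_rmsd(self, query: List[Coord]) -> Dict[str, float]:
        # Compare the query with every stored structure of the same length;
        # structures of a different length are skipped in this sketch.
        return {
            sid: rmsd(query, coords)
            for sid, coords in self._structures.items()
            if len(coords) == len(query)
        }
```

In this sketch the cost of reading and parsing a flat file would be paid once, at `insert` time, and every later `batch_rmsd` call works directly on the in-memory coordinates. A real system would additionally need superposition before the RMSD calculation and a richer structure model than bare coordinate lists.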

/content/journals/cp/10.2174/1570164615666180320151452
2018-06-01
2025-10-24