Volume 20, Issue 8
  • ISSN: 1574-8936
  • E-ISSN: 2212-392X

Abstract

Background

Protein secondary structure prediction is an important task in bioinformatics and structural biology. A protein's structure is the basis for its function, but experimental methods for determining tertiary structure are both costly and time-consuming. Because tertiary structure is assembled from secondary structure elements, efficient computational prediction of secondary structure is an important step toward it. Both local and global interactions between amino acids affect the prediction results.

Objective

We aim to design a module that extracts deep features from sequence profile features, and to construct a lightweight network that extracts fused features.

Methods

To enhance the network’s ability to capture both local and global interactions, we propose InConTPSS, an efficient method that integrates convolution operations with different receptive fields and temporal convolutional networks within an inception architecture. InConTPSS also accounts for the imbalanced distribution of secondary structure states and improves predictive performance on the scarce categories.
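
The idea of combining convolutions with different receptive fields in parallel branches can be illustrated with a minimal sketch. This is not the InConTPSS implementation: the function names, the toy kernels, and the use of centered (non-causal) dilated convolution are assumptions made only for illustration. Each branch sees a different span of neighbouring residues, and the branch outputs are stacked per position, as in an inception-style block.

```python
from typing import List, Tuple


def conv1d_same(x: List[float], kernel: List[float], dilation: int = 1) -> List[float]:
    """Centered 1D convolution with zero padding; output length equals input length."""
    center = (len(kernel) - 1) // 2
    out = []
    for i in range(len(x)):
        total = 0.0
        for j, w in enumerate(kernel):
            idx = i + (j - center) * dilation  # dilation spreads the taps apart
            if 0 <= idx < len(x):              # positions outside the sequence count as zero
                total += w * x[idx]
        out.append(total)
    return out


def receptive_field(kernel_size: int, dilation: int) -> int:
    """Number of sequence positions spanned by one dilated kernel."""
    return (kernel_size - 1) * dilation + 1


def inception_like_block(
    x: List[float], branches: List[Tuple[List[float], int]]
) -> List[List[float]]:
    """Apply every (kernel, dilation) branch to the same input and stack features per position."""
    outputs = [conv1d_same(x, kernel, dilation) for kernel, dilation in branches]
    return [list(column) for column in zip(*outputs)]


# Toy one-channel feature track over 8 residues (hypothetical data).
signal = [0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0]
branches = [
    ([1.0], 1),            # kernel size 1: the residue itself
    ([1.0, 1.0, 1.0], 1),  # kernel size 3: immediate neighbours
    ([1.0, 1.0, 1.0], 2),  # kernel size 3, dilation 2: spans 5 residues
]
features = inception_like_block(signal, branches)
```

In this toy example the dilated branch responds at positions two residues away from the spike, which the undilated branches cannot reach with the same kernel size. Stacking dilated layers is how temporal convolutional networks enlarge the receptive field without a proportional increase in parameters.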

Results

Experimental results on six benchmark datasets (CASP12, CASP13, CASP14, CB513, TEST2016, and TEST2018) demonstrate that our method achieves state-of-the-art performance with a simpler model on both 3-state and 8-state secondary structure prediction.
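
For readers unfamiliar with the two evaluation settings, the sketch below shows one way per-residue accuracy could be computed for 8-state labels and for their 3-state reduction. The mapping used here (H, G, I to helix; E, B to strand; everything else to coil) is a widely used convention and an assumption on our part; the abstract does not state which reduction was applied. The function names and the example strings are hypothetical.

```python
# Assumed 8-state to 3-state reduction over DSSP-style labels.
EIGHT_TO_THREE = {
    "H": "H", "G": "H", "I": "H",            # helix-like states
    "E": "E", "B": "E",                      # strand-like states
    "T": "C", "S": "C", "C": "C", "-": "C",  # everything else as coil
}


def reduce_to_three_state(q8: str) -> str:
    """Map a string of 8-state labels to 3-state labels."""
    return "".join(EIGHT_TO_THREE[state] for state in q8)


def per_residue_accuracy(predicted: str, observed: str) -> float:
    """Fraction of residues whose predicted state matches the observed state."""
    if len(predicted) != len(observed):
        raise ValueError("predicted and observed must have the same length")
    if not observed:
        raise ValueError("sequences must not be empty")
    matches = sum(p == o for p, o in zip(predicted, observed))
    return matches / len(observed)


# Hypothetical 10-residue example.
observed_q8 = "CHHHGTEEBS"
predicted_q8 = "CHHHHTEEES"

q8_accuracy = per_residue_accuracy(predicted_q8, observed_q8)
q3_accuracy = per_residue_accuracy(
    reduce_to_three_state(predicted_q8), reduce_to_three_state(observed_q8)
)
```

Here the two 8-state errors (G predicted as H, B predicted as E) disappear after reduction, which shows why 8-state prediction is the harder task and why rare states such as G and B matter most in that setting.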

Conclusion

By combining convolutional layers with a temporal convolutional network, the inception network structure can effectively process the fused features and improve the prediction results. InConTPSS achieves state-of-the-art performance in protein secondary structure prediction, and the appropriate use of label-distribution-aware margin loss in our method effectively improves the prediction accuracy of scarce secondary structures.
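
The label-distribution-aware margin (LDAM) loss mentioned above can be sketched for a single residue as follows. It follows the general formulation of Cao et al., in which each class receives a margin proportional to its training count raised to the power -1/4, so rarer classes get larger margins. The function names, the default `max_margin` and `scale` values, and the class counts are illustrative assumptions, not the settings used in InConTPSS.

```python
import math
from typing import List


def ldam_margins(class_counts: List[int], max_margin: float = 0.5) -> List[float]:
    """Per-class margins proportional to n_j ** -0.25, rescaled so the rarest class gets max_margin."""
    raw = [count ** -0.25 for count in class_counts]
    factor = max_margin / max(raw)
    return [value * factor for value in raw]


def ldam_loss(logits: List[float], label: int, margins: List[float], scale: float = 1.0) -> float:
    """Cross-entropy after subtracting the true class's margin from its logit."""
    adjusted = list(logits)
    adjusted[label] -= margins[label]         # only the true-class logit is penalised
    adjusted = [scale * z for z in adjusted]  # optional logit scaling (assumed default 1.0)
    peak = max(adjusted)
    log_sum_exp = peak + math.log(sum(math.exp(z - peak) for z in adjusted))
    return log_sum_exp - adjusted[label]


# Hypothetical residue counts for a frequent, a moderate, and a rare state.
counts = [10000, 625, 16]
margins = ldam_margins(counts)

logits = [1.0, 0.5, 1.2]
loss_with_margin = ldam_loss(logits, label=2, margins=margins)
loss_plain = ldam_loss(logits, label=2, margins=[0.0, 0.0, 0.0])
```

With these counts the margins come out near 0.1, 0.2, and 0.5, so a correct prediction of the rare state must clear a larger gap before its loss becomes small. That is the mechanism by which such a loss can push a model to separate scarce classes more strongly.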

/content/journals/cbio/10.2174/0115748936330905241220203450
2025-01-17
2026-03-01