Volume 18, Issue 6
  • ISSN: 2666-2558
  • E-ISSN: 2666-2566

Abstract

Background

Prior research on abstractive text summarization has relied predominantly on the ROUGE evaluation metric, which, while effective, struggles to capture semantic meaning because it rewards only exact word or phrase matches. This limitation is especially pronounced in abstractive summarization, where the goal is to generate novel summaries by rephrasing and paraphrasing the source text, and it highlights the need for a more nuanced evaluation metric capable of capturing semantic similarity.

Methods

This study addresses the limitations of existing ROUGE metrics by proposing a novel variant called ROUGE-SS. Unlike traditional ROUGE metrics, ROUGE-SS extends beyond exact word matching to consider synonyms and semantic similarity. Leveraging resources such as the WordNet online dictionary, ROUGE-SS identifies matches between source text and summaries based on both exact word overlaps and semantic context. Experiments are conducted to compare ROUGE-SS with other ROUGE variants, particularly in assessing abstractive summarization models. An algorithm for computing the synonym-matching features of ROUGE-SS is also presented.
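The synonym-matching idea described above can be sketched in a few lines. This is a minimal illustration only, not the paper's algorithm: the function name `rouge_ss_f1` and the toy `SYNS` table are hypothetical, and `synonyms` stands for any lookup (for example, one built from NLTK's WordNet interface) that maps a token to a set of equivalent words.

```python
def rouge_ss_f1(reference, candidate, synonyms):
    """Unigram F1 where a match is an exact token overlap OR a synonym overlap.

    `synonyms` is a callable returning the set of words considered
    equivalent to a token (e.g. WordNet lemmas); plain ROUGE-1 falls out
    when it always returns the empty set.
    """
    ref_tokens = reference.lower().split()
    cand_tokens = candidate.lower().split()
    remaining = list(ref_tokens)  # each reference token may be matched at most once
    matched = 0
    for tok in cand_tokens:
        equivalents = {tok} | set(synonyms(tok))
        hit = next((r for r in remaining if r in equivalents), None)
        if hit is not None:
            matched += 1
            remaining.remove(hit)
    recall = matched / len(ref_tokens) if ref_tokens else 0.0
    precision = matched / len(cand_tokens) if cand_tokens else 0.0
    return 2 * precision * recall / (precision + recall) if matched else 0.0

# Toy synonym table standing in for a WordNet-backed lookup:
SYNS = {"automobile": {"car"}, "quick": {"fast", "rapid"}}
lookup = lambda w: SYNS.get(w, set())
```

With this sketch, a paraphrased summary such as "the quick automobile stopped" scores a perfect match against the reference "the fast car stopped", whereas an exact-match-only lookup credits just the two literal overlaps, illustrating why synonym awareness matters for abstractive output.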

Results

The experiments demonstrate the superior performance of ROUGE-SS in evaluating abstractive text summarization models compared to existing ROUGE variants. The proposed metric is evaluated on benchmark datasets including CNN/Daily Mail, DUC-2004, Gigaword, and Inshorts News, and it consistently yields higher F1 scores than the other ROUGE variants, with an average F1-score improvement of 8.8%. These findings underscore the effectiveness of ROUGE-SS in capturing semantic similarity and in providing a more comprehensive evaluation metric for abstractive summarization.

Conclusion

The introduction of ROUGE-SS represents a significant advancement in the evaluation of abstractive text summarization. By extending beyond exact word matching to incorporate synonyms and semantic context, ROUGE-SS offers researchers a more effective tool for assessing summarization quality. This study highlights the importance of considering semantic meaning in evaluation metrics and provides a promising direction for future research on abstractive text summarization.

DOI: 10.2174/0126662558304595240528111535
Article Type: Research Article
Keywords: artificial intelligence; evaluation metrics; NLP; ROUGE-SS; T2SAM; text summarization