Volume 17, Issue 5
  • ISSN: 2666-2558
  • E-ISSN: 2666-2566

Abstract

Introduction: Image caption generation has long been a fundamental challenge at the intersection of computer vision (CV) and natural language processing (NLP). In this research, we present an innovative approach that harnesses the power of Deep Convolutional Generative Adversarial Networks (DCGANs) and adversarial training to generate natural, contextually relevant image captions. Method: Our method significantly improves the fluency, coherence, and contextual relevance of generated captions and demonstrates the effectiveness of reinforcement learning (RL) reward-based fine-tuning. In a comprehensive evaluation on the COCO dataset, our model outperforms baseline and current state-of-the-art (SOTA) models across all metrics, achieving BLEU-4 (0.327), METEOR (0.249), ROUGE (0.525), and CIDEr (1.155) scores. Result: The integration of DCGAN and adversarial training opens new possibilities in image captioning, with applications spanning automated content generation to enhanced accessibility solutions. Conclusion: This research paves the way for more intelligent, context-aware image understanding systems, promising exciting prospects for future exploration and innovation.
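The abstract reports sentence-overlap metrics such as BLEU-4, which scores a generated caption by its clipped 1- to 4-gram precision against a reference caption, damped by a brevity penalty for short outputs. As a rough illustration of how that number is computed (a simplified sentence-level sketch, not the authors' evaluation pipeline, which would typically use corpus-level, smoothed BLEU as in the COCO caption toolkit; the example captions are invented):

```python
from collections import Counter
import math

def bleu4(reference, hypothesis):
    """Sentence-level BLEU-4 with uniform weights and brevity penalty.

    `reference` and `hypothesis` are token lists. This is an
    illustrative sketch of the metric; production evaluations use
    corpus-level BLEU with smoothing.
    """
    log_precisions = []
    for n in range(1, 5):
        hyp_ngrams = Counter(tuple(hypothesis[i:i + n])
                             for i in range(len(hypothesis) - n + 1))
        ref_ngrams = Counter(tuple(reference[i:i + n])
                             for i in range(len(reference) - n + 1))
        # Clipped matches: each hypothesis n-gram counts at most as
        # often as it occurs in the reference.
        matches = sum(min(c, ref_ngrams[g]) for g, c in hyp_ngrams.items())
        total = max(sum(hyp_ngrams.values()), 1)
        if matches == 0:
            return 0.0  # unsmoothed BLEU is zero if any precision is zero
        log_precisions.append(0.25 * math.log(matches / total))
    # Brevity penalty discourages overly short captions.
    c, r = len(hypothesis), len(reference)
    bp = 1.0 if c >= r else math.exp(1 - r / c)
    return bp * math.exp(sum(log_precisions))

ref = "a man riding a wave on top of a surfboard".split()
hyp = "a man riding a wave on a surfboard".split()
print(bleu4(ref, ref))  # identical captions score 1.0
print(bleu4(ref, hyp))  # partial n-gram overlap scores between 0 and 1
```

METEOR, ROUGE, and CIDEr follow the same pattern of comparing candidate captions to references, but weight recall, synonymy, or TF-IDF-scaled n-grams differently.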

/content/journals/rascs/10.2174/0126662558282389231229063607
2024-07-01
  • Article Type:
    Research Article
Keyword(s): CNN; DCGAN; decoder; discriminator; encoder; generator; RNN