Deep Learning-Powered Visual Augmentation for the Visually Impaired

Authors: Gandrapu Satya Sai Surya Subrahmanya Venkata Krishna Mohan¹, Mahammad Firose Shaik², G. Usandra Babu³, Manikandan Hariharan⁴, Kiran Kumar Patro⁵

¹ Department of Electronics and Communication Engineering, Aditya Institute of Technology and Management, Tekkali, India
² Department of Electronics and Instrumentation Engineering, Velagapudi Ramakrishna Siddhartha Engineering College, Deemed to be University, Vijayawada, India
³ Department of Electronics and Communication Engineering, Aditya University, Surampalem, Andhra Pradesh, India
⁴ CMR Institute of Technology, Bengaluru, India
⁵ Department of Electronics and Communication Engineering, Aditya Institute of Technology and Management, Tekkali, India
Source: Blockchain-Enabled Internet of Things Applications in Healthcare: Current Practices and Future Directions, pp. 218-233
Publication Date: January 2025
Language: English

The convergence of computer vision and object detection is pivotal for advancing intelligent image analysis. This research goes beyond conventional object recognition by pursuing a more nuanced understanding of images, akin to human visual comprehension. It explores deep learning and established object detection systems such as convolutional neural networks (CNNs), Region-based CNN (R-CNN), and You Only Look Once (YOLO). The proposed model performs real-time object recognition and outperforms its predecessors: previous systems typically detect only a limited number of objects in an image and are most effective at a distance of 5-6 meters. Uniquely, it employs Google Translate for the verbal identification of detected objects, an important accessibility feature for individuals with visual impairments.

The proposed method uses the Common Objects in Context (COCO) dataset for image comprehension and performs object detection and object tracking with a deep neural network (DNN). The system's output is converted into spoken words through a text-to-speech feature, enabling visually impaired individuals to understand their surroundings. The implementation relies on NumPy, OpenCV, pyttsx3, PyWin32, opencv-contrib-python, and winsound, which together form a complete system for computer vision and audio processing. Results demonstrate successful execution, with the camera consistently detecting and labeling 5-6 objects in real time.
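The detect-then-speak flow described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the abstract does not name the network files, so `weights_path`, `config_path`, the preprocessing values (typical for an SSD-style COCO model), and the helper names `summarize_detections`, `speak`, and `describe_one_frame` are all assumptions. Only the library calls themselves (OpenCV's `cv2.dnn_DetectionModel` and pyttsx3's `init`, `say`, `runAndWait`) come from the libraries listed above.

```python
def summarize_detections(class_ids, confidences, labels,
                         conf_threshold=0.5, max_objects=6):
    """Turn raw detector output into a short sentence suitable for speech.

    `labels` maps a class id to a COCO class name; whether ids start at 0
    or 1 depends on the model, so the mapping is supplied by the caller.
    """
    # Keep detections above the threshold, most confident first.
    kept = sorted(
        ((float(conf), labels[int(cid)])
         for cid, conf in zip(class_ids, confidences)
         if float(conf) >= conf_threshold),
        reverse=True,
    )
    # Drop repeated names while preserving order, then cap the list length.
    names = list(dict.fromkeys(name for _, name in kept))[:max_objects]
    if not names:
        return "No objects detected."
    return "Detected: " + ", ".join(names) + "."


def speak(text):
    """Read `text` aloud with pyttsx3 (imported lazily; needs an audio device)."""
    import pyttsx3
    engine = pyttsx3.init()
    engine.say(text)
    engine.runAndWait()


def describe_one_frame(weights_path, config_path, labels):
    """Grab one webcam frame, detect objects, and speak the result.

    The model files and preprocessing values are assumptions for an
    SSD-style COCO detector; adjust them to the model actually used.
    """
    import cv2
    import numpy as np

    model = cv2.dnn_DetectionModel(weights_path, config_path)
    model.setInputSize(320, 320)
    model.setInputScale(1.0 / 127.5)
    model.setInputMean((127.5, 127.5, 127.5))
    model.setInputSwapRB(True)

    capture = cv2.VideoCapture(0)
    try:
        ok, frame = capture.read()
        if not ok:
            return None
        class_ids, confidences, _boxes = model.detect(frame, confThreshold=0.5)
        sentence = summarize_detections(
            np.asarray(class_ids).reshape(-1),
            np.asarray(confidences).reshape(-1),
            labels,
        )
        speak(sentence)
        return sentence
    finally:
        capture.release()
```

Keeping the summarizing step separate from camera and audio access means the sentence-building logic can be checked without hardware. The `max_objects=6` default mirrors the 5-6 objects per frame reported in the results.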

Hardbound ISBN: 9789815305227

Ebook ISBN: 9789815305210

Book DOI: https://doi.org/10.2174/97898153052101250101
