Multilingual Image Caption Generator

Journal: GRENZE International Journal of Engineering and Technology
Authors: Milind Rane, Piyush Waghulde, Srushti Yadav, Pranamya Vemula, Om Wagh
Volume: 10 Issue: 1
Grenze ID: 01.GIJET.10.1.562_1 Pages: 1972-1977

Abstract

The rise of social media and global communication requires multilingual image captioning for diverse audiences, enabling cross-cultural understanding and engagement. By automatically generating captions in multiple languages, this technology supports seamless communication and enriches the user experience in an increasingly interconnected world. Existing image captioning models leverage alternative methodologies, such as transformer-based models, attention mechanisms, and pre-trained language models, to generate descriptive captions for images without relying on the CNN-LSTM architecture. These approaches provide effective solutions for image captioning, showcasing advancements in natural language processing and machine learning. However, integrating a CNN-LSTM architecture with the Flickr8k dataset and a machine translation step enables more accurate and contextually relevant multilingual image captions, enhancing versatility and adaptability for real-world applications. This approach combines deep learning techniques, diverse training data, and machine translation capabilities for superior multilingual image caption generation. The model's accuracy is assessed using accepted evaluation metrics, and the model generates accurate captions in multiple languages for the images provided.
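The pipeline the abstract describes (CNN encoder extracts image features, an LSTM decoder generates an English caption, and a translation step produces the multilingual output) can be sketched at the interface level as below. This is a toy illustration of the pipeline's shape only: every function here is a hypothetical stand-in (a real system would use a pretrained CNN such as one trained on ImageNet, a trained LSTM decoder, and a translation service), not the authors' actual model.

```python
# Toy sketch of the three-stage pipeline: CNN encoder -> LSTM decoder
# -> translation. All components are illustrative stand-ins.

def extract_features(image_pixels):
    """Stand-in for the CNN encoder: summarize the image as a feature
    vector. Here: the mean of each RGB channel over all pixels."""
    n = len(image_pixels)
    return [sum(px[c] for px in image_pixels) / n for c in range(3)]

def decode_caption(features):
    """Stand-in for the LSTM decoder: map features to an English caption.
    A real decoder generates tokens autoregressively from a vocabulary."""
    brightness = sum(features) / len(features)
    subject = "a bright scene" if brightness > 127 else "a dark scene"
    return f"a photo of {subject}"

# Toy phrase table standing in for a machine translation service.
TRANSLATIONS = {
    "es": {
        "a photo of a bright scene": "una foto de una escena luminosa",
        "a photo of a dark scene": "una foto de una escena oscura",
    },
}

def translate(caption, lang):
    """Stand-in for machine translation of the generated caption.
    Falls back to the English caption if no translation is known."""
    return TRANSLATIONS.get(lang, {}).get(caption, caption)

def caption_image(image_pixels, lang="en"):
    """Run the full pipeline: encode -> decode -> (optionally) translate."""
    features = extract_features(image_pixels)
    caption = decode_caption(features)
    return caption if lang == "en" else translate(caption, lang)
```

Keeping the three stages behind separate functions mirrors the modularity the abstract emphasizes: the decoder always produces one source-language caption, and multilingual support comes from swapping the translation target rather than retraining the captioning model.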

