ISSN:2582-5208

www.irjmets.com

Paper Key : IRJ************491
Authors: Abhishek Arvind Pawar, Tejas Pandurang Pawar, Kartik Sunil Pawar, Durga Ravindra Konde
Date Published: 06 Apr 2024
Abstract
This paper presents a caption generation system that produces descriptive natural-language captions for images. The task requires understanding both the content of the image and the accompanying text, so caption generation plays a crucial role in both natural language processing and image processing. Researchers have recently shown growing interest in building caption generation systems with deep learning, which offers the advantage of constructing an intermediate representation shared between the image processing and natural language processing tasks. The system comprises two main modules: an image processing module and a language model module. The two modules are trained simultaneously on a dataset curated specifically for this purpose. Our experimental results highlight the effectiveness of this joint deep learning approach and the improvements observed in caption generation.
Keywords: multimodal learning, deep learning, caption generation, natural language processing, image processing
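The two-module design described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the toy vocabulary, the dimensions, the class names (`ImageEncoder`, `CaptionDecoder`), and the greedy decoding loop are all assumptions made for illustration, and the weights are random rather than trained, so the generated words are arbitrary. The joint training step is omitted. The sketch only shows how an image module can produce a shared representation that a language module then consumes one word at a time.

```python
import math
import random

random.seed(0)  # fixed seed so the toy weights are reproducible

VOCAB = ["<start>", "<end>", "a", "dog", "runs"]  # hypothetical toy vocabulary
EMBED_DIM = 4  # assumed size of the shared intermediate representation


def rand_matrix(rows, cols):
    """Small random weight matrix standing in for trained parameters."""
    return [[random.uniform(-1.0, 1.0) for _ in range(cols)] for _ in range(rows)]


def matvec(matrix, vector):
    """Multiply a (rows x cols) matrix by a length-cols vector."""
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]


class ImageEncoder:
    """Image processing module: maps flattened pixels to the shared representation."""

    def __init__(self, num_pixels):
        self.weights = rand_matrix(EMBED_DIM, num_pixels)

    def encode(self, pixels):
        return [math.tanh(v) for v in matvec(self.weights, pixels)]


class CaptionDecoder:
    """Language model module: scores the next word from a recurrent state."""

    def __init__(self):
        self.word_embed = rand_matrix(len(VOCAB), EMBED_DIM)  # one row per word
        self.recurrent = rand_matrix(EMBED_DIM, EMBED_DIM)
        self.output = rand_matrix(len(VOCAB), EMBED_DIM)

    def step(self, state, word_index):
        carried = matvec(self.recurrent, state)
        mixed = [h + e for h, e in zip(carried, self.word_embed[word_index])]
        new_state = [math.tanh(v) for v in mixed]
        scores = matvec(self.output, new_state)  # one score per vocabulary word
        return new_state, scores


def generate_caption(encoder, decoder, pixels, max_len=5):
    """Greedy decoding: the image representation initialises the decoder state."""
    state = encoder.encode(pixels)
    word = VOCAB.index("<start>")
    caption = []
    for _ in range(max_len):
        state, scores = decoder.step(state, word)
        word = max(range(len(VOCAB)), key=lambda i: scores[i])
        if VOCAB[word] == "<end>":
            break
        caption.append(VOCAB[word])
    return caption


encoder = ImageEncoder(num_pixels=6)
decoder = CaptionDecoder()
caption = generate_caption(encoder, decoder, [0.1, 0.5, 0.9, 0.3, 0.7, 0.2])
print(caption)
```

In a trained system of this kind, both sets of weights would typically be updated together from a loss on the predicted words, which is one way to read the abstract's statement that the modules are trained simultaneously.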
DOI: 10.56726/IRJMETS51908 (https://www.doi.org/10.56726/IRJMETS51908)