Content Indexing with Automatic Image Captioning using Contextual Information Fusion

Authors

  • Karthik C. Kallur, Manish Yadav, Himanshu Gaupale, Harmanpreet Singh, Prof. Rushali Patil

Keywords

CNN, Bi-LSTM, image caption generation, semantic feature, content indexing

Abstract

Managing and retrieving large collections of photographs is a major challenge today, and effective content indexing is essential for organizing and searching images. This study presents a technique that combines automatic image captioning with information fusion to achieve content indexing. In the proposed approach, Convolutional Neural Networks (CNNs), drawing on state-of-the-art machine learning techniques, extract features from images. A Bidirectional Long Short-Term Memory (Bi-LSTM) network takes these features as input and generates captions for the images. These captions, fused with additional information such as user-generated tags and metadata, give a comprehensive account of an image's content. This information fusion substantially improves the accuracy and relevance of image indexing, making it better suited to users' needs and adaptable across applications. Experiments on image datasets demonstrate the effectiveness of the methodology, showing considerable improvements in precision and recall for content indexing. This research advances content management systems, multimedia retrieval methods, and user-centric approaches to organizing photos, leading to more effective search and exploration experiences.
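
The pipeline described in the abstract (a CNN feature extractor feeding a Bi-LSTM caption decoder) can be sketched as follows. This is only an illustrative reconstruction, not the authors' implementation: the backbone (InceptionV3), vocabulary size, caption length, and layer widths are assumptions, and the merge-style fusion of image and text features shown here is just one common way to condition the decoder on CNN features.

# Illustrative sketch of a CNN + Bi-LSTM captioning model (assumed settings).
import tensorflow as tf
from tensorflow.keras import layers, Model

VOCAB_SIZE = 5000   # assumed caption vocabulary size
MAX_LEN = 20        # assumed maximum caption length
EMBED_DIM = 256
UNITS = 256

# Image encoder: a pretrained CNN produces a global feature vector per image.
cnn = tf.keras.applications.InceptionV3(include_top=False, pooling="avg")
cnn.trainable = False

image_input = layers.Input(shape=(299, 299, 3), name="image")
img_features = layers.Dense(EMBED_DIM, activation="relu")(cnn(image_input))

# Caption decoder: a Bi-LSTM encodes the partial caption.
caption_input = layers.Input(shape=(MAX_LEN,), name="caption_tokens")
x = layers.Embedding(VOCAB_SIZE, EMBED_DIM, mask_zero=True)(caption_input)
x = layers.Bidirectional(layers.LSTM(UNITS))(x)

# Fuse visual and textual features, then predict the next caption word.
fused = layers.concatenate([img_features, x])
fused = layers.Dense(UNITS, activation="relu")(fused)
next_word = layers.Dense(VOCAB_SIZE, activation="softmax")(fused)

model = Model(inputs=[image_input, caption_input], outputs=next_word)
model.compile(optimizer="adam", loss="sparse_categorical_crossentropy")

The content-indexing step then fuses the generated caption with user-generated tags and metadata. A toy illustration (again an assumption, not the paper's method) is an inverted index built over caption words and tags:

# Toy fusion of caption text and user tags into an inverted index.
from collections import defaultdict

def index_image(index, image_id, caption, tags):
    """Add caption words and tags for one image to a simple inverted index."""
    terms = set(caption.lower().split()) | {t.lower() for t in tags}
    for term in terms:
        index[term].add(image_id)

inverted_index = defaultdict(set)
index_image(inverted_index, "img_001", "a dog running on the beach", ["dog", "seaside"])
print(inverted_index["dog"])   # {'img_001'}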

Published

2023-11-21

How to Cite

Karthik C. Kallur, Manish Yadav, Himanshu Gaupale, Harmanpreet Singh, Prof. Rushali Patil. (2023). Content Indexing with Automatic Image Captioning using Contextual Information Fusion. SJIS-P, 35(3), 658–665. Retrieved from http://sjis.scandinavian-iris.org/index.php/sjis/article/view/735

Issue

Vol. 35 No. 3 (2023)

Section

Articles