Turkish Journal of Electrical Engineering and Computer Sciences

Author ORCID Identifier

SERDAR YILDIZ: 0000-0002-0430-690X

ABBAS MEMİŞ: 0000-0003-2645-8071

SONGÜL VARLI: 0000-0002-1786-6869

Abstract

This paper introduces a novel, high-performance encoder-decoder deep model called TRCaptionNet++ for generic Turkish image captioning tasks. The proposed model is an improved and refined version of TRCaptionNet, which consists of a CLIP (contrastive language–image pretraining) image encoder, a feature projection layer, and a BERT (bidirectional encoder representations from transformers) text decoder. In this study, the regular TRCaptionNet model was trained and then fine-tuned on a large set of image data. First, approximately 2,000,000 random images representing the words in the MS COCO and Flickr caption sets were retrieved through web crawling. Then, nearly 8,000,000 caption texts were generated for these images using 4 different image captioning models. Finally, the text decoder module of the proposed model was improved using the image-caption features of the crawled images. TRCaptionNet++ was evaluated on two Turkish caption datasets (TasvirEt and Turkish MS COCO) and two machine-translated caption sets (MS COCO and Flickr30K) using common image captioning metrics: BLEU, METEOR, ROUGE-L, CIDEr, and SPICE. In these tests, the proposed model achieved remarkable captioning performance and outperformed all related works. Project details and demo links of TRCaptionNet++ will also be available on the project's page: https://serdaryildiz.com/TRCaptionNetpp
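
For readers unfamiliar with the metrics named in the abstract, the following is a minimal sketch of clipped n-gram precision, the building block of BLEU. It is an illustration only and not the authors' evaluation code: the function names `ngrams` and `clipped_precision` and the example Turkish captions are assumptions made for this sketch, and full BLEU additionally combines several n-gram orders and applies a brevity penalty.

```python
from collections import Counter


def ngrams(tokens, n):
    """Return all contiguous n-grams of a token list as tuples."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def clipped_precision(candidate, references, n=1):
    """Clipped (modified) n-gram precision of a candidate caption.

    Each candidate n-gram is credited at most as many times as it
    appears in the single reference where it is most frequent.
    """
    cand_counts = Counter(ngrams(candidate, n))
    if not cand_counts:
        return 0.0
    # Largest count of each n-gram over all reference captions.
    max_ref = Counter()
    for ref in references:
        for gram, count in Counter(ngrams(ref, n)).items():
            max_ref[gram] = max(max_ref[gram], count)
    clipped = sum(min(count, max_ref[gram])
                  for gram, count in cand_counts.items())
    return clipped / sum(cand_counts.values())


# Hypothetical captions, whitespace-tokenized (an assumption of this sketch).
candidate = "bir köpek çimlerde koşuyor".split()
references = ["bir köpek çimenlerde koşuyor".split()]

print(clipped_precision(candidate, references, n=1))
print(clipped_precision(candidate, references, n=2))
```

In this example three of the four candidate words appear in the reference, so the unigram precision is 3/4, while only one of the three candidate bigrams matches, giving 1/3. The single differing word form (çimlerde versus çimenlerde) shows how exact-match n-gram metrics can penalize morphological variation in Turkish.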

DOI

10.55730/1300-0632.4150

Keywords

Image captioning, Turkish image captioning, image encoders, text decoders, deep learning

First Page

669

Last Page

687

Publisher

The Scientific and Technological Research Council of Türkiye (TÜBİTAK)

Creative Commons License

This work is licensed under a Creative Commons Attribution 4.0 International License.
