Overview of ImageCLEFmedical 2024 – Caption Prediction and Concept Detection

The ImageCLEFmedical 2024 Caption task on caption prediction and concept detection follows similar challenges held from 2017 to 2023. The goal is to extract Unified Medical Language System (UMLS) concept annotations and/or generate captions from image data; predictions are compared to the original image captions. Images for both tasks are part of the Radiology Objects in COntext version 2 (ROCOv2) dataset. For concept detection, multi-label predictions are evaluated via the F1-score against UMLS terms extracted from the original captions, supplemented with additional manually curated concepts. For caption prediction, the semantic similarity of the predictions to the original captions is evaluated using the BERTScore. The task attracted strong participation: of 50 registered teams, 14 submitted 82 graded runs across the two subtasks. Participants mainly used multi-label classification systems for the concept detection subtask; the winning team, DBS-HHU, utilized an ensemble of four different Convolutional Neural Networks (CNNs). For the caption prediction subtask, most teams used encoder-decoder frameworks with various backbones, including transformer-based decoders and Long Short-Term Memory networks (LSTMs); the winning team, PCLmed, used medical vision-language foundation models (Med-VLFMs), combining general and specialist vision models.
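The exact evaluation protocol is not reproduced here, but concept detection scoring of this kind is typically an example-based F1: precision and recall are computed per image between the predicted and reference concept sets, and the resulting F1 values are averaged over all images. A minimal sketch, assuming each image is annotated with a set of UMLS concept identifiers (the CUIs below are illustrative placeholders), might look like:

```python
def example_f1(predicted, reference):
    """F1 between predicted and reference concept sets for one image."""
    pred, ref = set(predicted), set(reference)
    if not pred and not ref:
        # Both empty: perfect agreement by convention.
        return 1.0
    tp = len(pred & ref)  # concepts correctly predicted
    if tp == 0:
        return 0.0
    precision = tp / len(pred)
    recall = tp / len(ref)
    return 2 * precision * recall / (precision + recall)


def mean_f1(predicted_sets, reference_sets):
    """Average the per-image F1 scores over the whole test set."""
    scores = [example_f1(p, r) for p, r in zip(predicted_sets, reference_sets)]
    return sum(scores) / len(scores)


# Illustrative usage with placeholder CUI strings:
# one correct concept plus one spurious one gives precision 0.5, recall 1.0.
score = example_f1(["C0000001", "C0000002"], ["C0000001"])
```

Averaging per-image F1 (rather than pooling all labels globally) means every image contributes equally to the final score, regardless of how many concepts it carries.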
