by U Sahin, H Li, Q Khan, D Cremers and T Volker
Reference:
Enhancing Multimodal Compositional Reasoning of Visual Language Models with Generative Negative Mining (U Sahin, H Li, Q Khan, D Cremers and T Volker), In IEEE Winter Conference on Applications of Computer Vision (WACV), 2024. ([arXiv][project page][code])
Bibtex Entry:
@inproceedings{compreason2024,
title = {Enhancing Multimodal Compositional Reasoning of Visual Language Models with Generative Negative Mining},
booktitle = {IEEE Winter Conference on Applications of Computer Vision (WACV)},
author = {U Sahin and H Li and Q Khan and D Cremers and T Volker},
year = {2024},
keywords = {neural networks, deep learning, Large Language Models},
}