--- base_model: QWen/QWen2-VL-7B-Instruct library_name: transformers license: apache-2.0 pipeline_tag: image-text-to-text --- # Robust Adaptation of Large Multimodal Models for Retrieval Augmented Hateful Meme Detection This repository contains the RA-HMD model presented in the paper [Robust Adaptation of Large Multimodal Models for Retrieval Augmented Hateful Meme Detection](https://huggingface.co/papers/2502.13061). ## Model Details ### Model Description RA-HMD proposes a robust adaptation framework for hateful meme detection that enhances in-domain accuracy and cross-domain generalization while preserving the general vision-language capabilities of LMMs. It achieves improved robustness under adversarial attacks compared to SFT models and demonstrates state-of-the-art performance across various meme classification datasets. Additionally, RA-HMD generates higher-quality rationales for explaining hateful content, enhancing model interpretability. - **Developed by:** Jingbiao Mei, Jinghong Chen, Guangyu Yang, Weizhe Lin, Bill Byrne - **Model type:** Fine-tuned QWen2-VL-7B-Instruct using PEFT (LoRA) - **Language(s) (NLP):** English - **License:** Apache 2.0 - **Finetuned from model:** `QWen/QWen2-VL-7B-Instruct` ### Model Sources - **Repository:** https://github.com/JingbiaoMei/RGCL - **Paper:** https://huggingface.co/papers/2502.13061 - **Project page:** https://rgclmm.github.io/ ## Uses ### Direct Use The model is intended for robust hateful meme detection and generating explanatory rationales. ### Out-of-Scope Use This model is specifically trained for hateful meme detection. Using it for general image captioning or unrelated classification tasks may lead to suboptimal results. ## How to Get Started with the Model Refer to the [GitHub repository](https://github.com/JingbiaoMei/RGCL) for detailed installation and usage instructions. The RA-HMD Stage 1 code is released as a submodule in [LLaMA-Factory@a88f610](https://github.com/JingbiaoMei/LLaMA-Factory-LMM-RGCL/tree/a88f610e9fa46d1ef1669c5dbc39ee9008f95c21). ## Citation If our work helped your research, please kindly cite our paper: ```bibtex @article{RAHMD2025Mei, title={Robust Adaptation of Large Multimodal Models for Retrieval Augmented Hateful Meme Detection}, url={http://arxiv.org/abs/2502.13061}, DOI={10.48550/arXiv.2502.13061}, note={arXiv:2502.13061 [cs]}, number={arXiv.2502.13061}, publisher={arXiv}, author={Mei, Jingbiao and Chen, Jinghong and Yang, Guangyu and Lin, Weizhe and Byrne, Bill}, year={2025}, month=may } ```