Benchmarking and Improving Large Vision-Language Models for Fundamental Visual Graph Understanding and Reasoning
This repository contains the official implementation of our paper, Benchmarking and Improving Large Vision-Language Models for Fundamental Visual Graph Understanding and Reasoning.
The complete data of VGCure can be found on 🤗Huggingface and Google Drive.
For reference, this repository provides 10 visual graphs for each graph structure in VGCure and for each task in MCDGraph.
In addition, we provide graphs with different visual styles drawn with networkx and matplotlib here for reference.
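As a rough illustration of how such visual graphs can be rendered, the sketch below draws a small random graph with networkx and matplotlib; the node count, layout, and styling options are illustrative choices, not the exact settings used to generate the released images.

```python
import networkx as nx
import matplotlib.pyplot as plt

# Build a small random graph (node/edge counts are illustrative).
G = nx.gnm_random_graph(n=8, m=12, seed=0)

# Compute a layout and draw with one possible visual style.
pos = nx.spring_layout(G, seed=0)
nx.draw(
    G, pos,
    with_labels=True,
    node_color="lightblue",
    edge_color="gray",
    node_size=600,
    font_size=10,
)

# Save the rendered graph image.
plt.savefig("example_graph.png", dpi=200, bbox_inches="tight")
plt.close()
```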
The complete data of MCDGraph can be found on 🤗Huggingface and Google Drive.
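A minimal sketch for downloading either dataset from the Hugging Face Hub with `huggingface_hub`; the `repo_id` below is a placeholder, so substitute the actual dataset repository linked above.

```python
from huggingface_hub import snapshot_download

# Download a dataset repository from the Hugging Face Hub.
# "your-org/VGCure" is a placeholder repo_id; use the dataset links above.
local_path = snapshot_download(
    repo_id="your-org/VGCure",
    repo_type="dataset",
    local_dir="./VGCure",
)
print(f"Dataset downloaded to {local_path}")
```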
| Models | URL |
|---|---|
| Qwen2VL-MCDGraph | 🤗Huggingface |
| InternVL2-MCDGraph | 🤗Huggingface |
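These checkpoints can be loaded with the transformers library in the usual way for their base models. The sketch below shows Qwen2VL-MCDGraph with a placeholder repo id (the actual ids are behind the links in the table above) and assumes a recent transformers release that includes Qwen2-VL support.

```python
from transformers import AutoProcessor, Qwen2VLForConditionalGeneration

# Placeholder repo id; replace with the Qwen2VL-MCDGraph link from the table above.
model_id = "your-org/Qwen2VL-MCDGraph"

# Load the fine-tuned checkpoint and its processor.
model = Qwen2VLForConditionalGeneration.from_pretrained(
    model_id,
    torch_dtype="auto",
    device_map="auto",
)
processor = AutoProcessor.from_pretrained(model_id)
```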
Please cite our paper if you use the VGCURE or MCDGRAPH data provided here in your work:
@inproceedings{zhu-etal-2025-benchmarking,
  title = {Benchmarking and Improving Large Vision-Language Models for Fundamental Visual Graph Understanding and Reasoning},
  author = {Zhu, Yingjie and Bai, Xuefeng and Chen, Kehai and Xiang, Yang and Yu, Jun and Zhang, Min},
  booktitle = {Proceedings of the 63rd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)},
  year = {2025},
  publisher = {Association for Computational Linguistics},
}

