Computer Vision and Pattern Recognition

Authors and titles for recent submissions

  • Fri, 3 Oct 2025
  • Thu, 2 Oct 2025
  • Wed, 1 Oct 2025
  • Tue, 30 Sep 2025
  • Mon, 29 Sep 2025


Total of 1031 entries : 1-50 51-100 101-150 151-200 ... 1001-1031

Fri, 3 Oct 2025 (showing first 50 of 112 entries)

[1] arXiv:2510.02315 [pdf, html, other]
Title: Optimal Control Meets Flow Matching: A Principled Route to Multi-Subject Fidelity
Eric Tillmann Bill, Enis Simsar, Thomas Hofmann
Comments: Code: this https URL
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[2] arXiv:2510.02314 [pdf, html, other]
Title: StealthAttack: Robust 3D Gaussian Splatting Poisoning via Density-Guided Illusions
Bo-Hsu Ke, You-Zhe Xie, Yu-Lun Liu, Wei-Chen Chiu
Comments: ICCV 2025. Project page: this https URL
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[3] arXiv:2510.02313 [pdf, html, other]
Title: Clink! Chop! Thud! -- Learning Object Sounds from Real-World Interactions
Mengyu Yang, Yiming Chen, Haozheng Pei, Siddhant Agarwal, Arun Balajee Vasudevan, James Hays
Comments: ICCV 2025. Project page: this https URL
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[4] arXiv:2510.02311 [pdf, html, other]
Title: Inferring Dynamic Physical Properties from Video Foundation Models
Guanqi Zhan, Xianzheng Ma, Weidi Xie, Andrew Zisserman
Subjects: Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
[5] arXiv:2510.02307 [pdf, html, other]
Title: NoiseShift: Resolution-Aware Noise Recalibration for Better Low-Resolution Image Generation
Ruozhen He, Moayed Haji-Ali, Ziyan Yang, Vicente Ordonez
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[6] arXiv:2510.02295 [pdf, html, other]
Title: VideoNSA: Native Sparse Attention Scales Video Understanding
Enxin Song, Wenhao Chai, Shusheng Yang, Ethan Armand, Xiaojun Shan, Haiyang Xu, Jianwen Xie, Zhuowen Tu
Comments: Project Page: this https URL, Code: this https URL
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Machine Learning (cs.LG)
[7] arXiv:2510.02287 [pdf, html, other]
Title: MultiModal Action Conditioned Video Generation
Yichen Li, Antonio Torralba
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[8] arXiv:2510.02284 [pdf, html, other]
Title: Learning to Generate Object Interactions with Physics-Guided Video Diffusion
David Romero, Ariana Bermudez, Hao Li, Fabio Pizzati, Ivan Laptev
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Machine Learning (cs.LG)
[9] arXiv:2510.02283 [pdf, other]
Title: Self-Forcing++: Towards Minute-Scale High-Quality Video Generation
Justin Cui, Jie Wu, Ming Li, Tao Yang, Xiaojie Li, Rui Wang, Andrew Bai, Yuanhao Ban, Cho-Jui Hsieh
Comments: preprint
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[10] arXiv:2510.02282 [pdf, html, other]
Title: VidGuard-R1: AI-Generated Video Detection and Explanation via Reasoning MLLMs and RL
Kyoungjun Park, Yifan Yang, Juheon Yi, Shicheng Zheng, Yifei Shen, Dongqi Han, Caihua Shan, Muhammad Muaz, Lili Qiu
Subjects: Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
[11] arXiv:2510.02270 [pdf, html, other]
Title: microCLIP: Unsupervised CLIP Adaptation via Coarse-Fine Token Fusion for Fine-Grained Image Classification
Sathira Silva, Eman Ali, Chetan Arora, Muhammad Haris Khan
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[12] arXiv:2510.02266 [pdf, html, other]
Title: NeuroSwift: A Lightweight Cross-Subject Framework for fMRI Visual Reconstruction of Complex Scenes
Shiyi Zhang, Dong Liang, Yihang Zhou
Subjects: Computer Vision and Pattern Recognition (cs.CV); Human-Computer Interaction (cs.HC)
[13] arXiv:2510.02264 [pdf, other]
Title: Paving the Way Towards Kinematic Assessment Using Monocular Video: A Preclinical Benchmark of State-of-the-Art Deep-Learning-Based 3D Human Pose Estimators Against Inertial Sensors in Daily Living Activities
Mario Medrano-Paredes, Carmen Fernández-González, Francisco-Javier Díaz-Pernas, Hichem Saoudi, Javier González-Alonso, Mario Martínez-Zarzuela
Comments: All tables, graphs and figures generated can be obtained in the Zenodo repository complementary to this work: this https URL
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Machine Learning (cs.LG)
[14] arXiv:2510.02262 [pdf, html, other]
Title: From Frames to Clips: Efficient Key Clip Selection for Long-Form Video Understanding
Guangyu Sun, Archit Singhal, Burak Uzkent, Mubarak Shah, Chen Chen, Garin Kessler
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[15] arXiv:2510.02253 [pdf, html, other]
Title: DragFlow: Unleashing DiT Priors with Region Based Supervision for Drag Editing
Zihan Zhou, Shilin Lu, Shuli Leng, Shaocong Zhang, Zhuming Lian, Xinlei Yu, Adams Wai-Kin Kong
Comments: Preprint
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Machine Learning (cs.LG)
[16] arXiv:2510.02240 [pdf, html, other]
Title: RewardMap: Tackling Sparse Rewards in Fine-grained Visual Reasoning via Multi-Stage Reinforcement Learning
Sicheng Feng, Kaiwen Tuo, Song Wang, Lingdong Kong, Jianke Zhu, Huan Wang
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[17] arXiv:2510.02226 [pdf, html, other]
Title: TempoControl: Temporal Attention Guidance for Text-to-Video Models
Shira Schiber, Ofir Lindenbaum, Idan Schwartz
Comments: Under Review
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Machine Learning (cs.LG)
[18] arXiv:2510.02213 [pdf, html, other]
Title: MMDEW: Multipurpose Multiclass Density Estimation in the Wild
Villanelle O'Reilly, Jonathan Cox, Georgios Leontidis, Marc Hanheide, Petra Bosilj, James Brown
Comments: 8+1 pages, 4 figures, 5 tables
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[19] arXiv:2510.02197 [pdf, html, other]
Title: Cross-Breed Pig Identification Using Auricular Vein Pattern Recognition: A Machine Learning Approach for Small-Scale Farming Applications
Emmanuel Nsengiyumva, Leonard Niyitegeka, Eric Umuhoza
Comments: 20 pages
Subjects: Computer Vision and Pattern Recognition (cs.CV); Software Engineering (cs.SE)
[20] arXiv:2510.02186 [pdf, html, other]
Title: GeoPurify: A Data-Efficient Geometric Distillation Framework for Open-Vocabulary 3D Segmentation
Weijia Dou, Xu Zhang, Yi Bin, Jian Liu, Bo Peng, Guoqing Wang, Yang Yang, Heng Tao Shen
Subjects: Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
[21] arXiv:2510.02155 [pdf, html, other]
Title: Unlocking Vision-Language Models for Video Anomaly Detection via Fine-Grained Prompting
Shu Zou, Xinyu Tian, Lukas Wesemann, Fabian Waschkowski, Zhaoyuan Yang, Jing Zhang
Comments: 14 pages, video anomaly detection
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[22] arXiv:2510.02114 [pdf, html, other]
Title: FRIEREN: Federated Learning with Vision-Language Regularization for Segmentation
Ding-Ruei Shen
Comments: Master Thesis
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[23] arXiv:2510.02100 [pdf, html, other]
Title: When Tracking Fails: Analyzing Failure Modes of SAM2 for Point-Based Tracking in Surgical Videos
Woowon Jang, Jiwon Im, Juseung Choi, Niki Rashidian, Wesley De Neve, Utku Ozbulak
Comments: Accepted for publication in the 28th International Conference on Medical Image Computing and Computer Assisted Intervention (MICCAI) Workshop on Collaborative Intelligence and Autonomy in Image-guided Surgery (COLAS), 2025
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[24] arXiv:2510.02097 [pdf, other]
Title: Mapping Historic Urban Footprints in France: Balancing Quality, Scalability and AI Techniques
Walid Rabehi, Marion Le Texier, Rémi Lemoy
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[25] arXiv:2510.02086 [pdf, html, other]
Title: VGDM: Vision-Guided Diffusion Model for Brain Tumor Detection and Segmentation
Arman Behnam
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[26] arXiv:2510.02043 [pdf, html, other]
Title: Zero-shot Human Pose Estimation using Diffusion-based Inverse solvers
Sahil Bhandary Karnoor, Romit Roy Choudhury
Subjects: Computer Vision and Pattern Recognition (cs.CV); Human-Computer Interaction (cs.HC); Machine Learning (cs.LG)
[27] arXiv:2510.02034 [pdf, html, other]
Title: GaussianMorphing: Mesh-Guided 3D Gaussians for Semantic-Aware Object Morphing
Mengtian Li, Yunshu Bai, Yimin Chu, Yijun Shen, Zhongmei Li, Weifeng Ge, Zhifeng Xie, Chaofeng Chen
Comments: Project page: this https URL
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[28] arXiv:2510.02030 [pdf, html, other]
Title: kabr-tools: Automated Framework for Multi-Species Behavioral Monitoring
Jenna Kline, Maksim Kholiavchenko, Samuel Stevens, Nina van Tiel, Alison Zhong, Namrata Banerji, Alec Sheets, Sowbaranika Balasubramaniam, Isla Duporge, Matthew Thompson, Elizabeth Campolongo, Jackson Miliko, Neil Rosser, Tanya Berger-Wolf, Charles V. Stewart, Daniel I. Rubenstein
Comments: 31 pages
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[29] arXiv:2510.02028 [pdf, html, other]
Title: LiLa-Net: Lightweight Latent LiDAR Autoencoder for 3D Point Cloud Reconstruction
Mario Resino, Borja Pérez, Jaime Godoy, Abdulla Al-Kaff, Fernando García
Comments: 7 pages, 3 figures, 7 tables, Submitted to ICRA
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[30] arXiv:2510.02001 [pdf, other]
Title: Generating Findings for Jaw Cysts in Dental Panoramic Radiographs Using GPT-4o: Building a Two-Stage Self-Correction Loop with Structured Output (SLSO) Framework
Nanaka Hosokawa, Ryo Takahashi, Tomoya Kitano, Yukihiro Iida, Chisako Muramatsu, Tatsuro Hayashi, Yuta Seino, Xiangrong Zhou, Takeshi Hara, Akitoshi Katsumata, Hiroshi Fujita
Comments: Intended for submission to Scientific Reports
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[31] arXiv:2510.01997 [pdf, html, other]
Title: Pure-Pass: Fine-Grained, Adaptive Masking for Dynamic Token-Mixing Routing in Lightweight Image Super-Resolution
Junyu Wu, Jie Tang, Jie Liu, Gangshan Wu
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[32] arXiv:2510.01991 [pdf, html, other]
Title: 4DGS-Craft: Consistent and Interactive 4D Gaussian Splatting Editing
Lei Liu, Can Wang, Zhenghao Chen, Dong Xu
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[33] arXiv:2510.01990 [pdf, html, other]
Title: TriAlignXA: An Explainable Trilemma Alignment Framework for Trustworthy Agri-product Grading
Jianfei Xie, Ziyang Li
Subjects: Computer Vision and Pattern Recognition (cs.CV); Computers and Society (cs.CY)
[34] arXiv:2510.01954 [pdf, html, other]
Title: Patch-as-Decodable-Token: Towards Unified Multi-Modal Vision Tasks in MLLMs
Yongyi Su, Haojie Zhang, Shijie Li, Nanqing Liu, Jingyi Liao, Junyi Pan, Yuan Liu, Xiaofen Xing, Chong Sun, Chen Li, Nancy F. Chen, Shuicheng Yan, Xulei Yang, Xun Xu
Comments: 24 pages, 12 figures and 9 tables
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[35] arXiv:2510.01948 [pdf, html, other]
Title: ClustViT: Clustering-based Token Merging for Semantic Segmentation
Fabio Montello, Ronja Güldenring, Lazaros Nalpantidis
Comments: Submitted to IEEE
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[36] arXiv:2510.01934 [pdf, other]
Title: Foundation Visual Encoders Are Secretly Few-Shot Anomaly Detectors
Guangyao Zhai, Yue Zhou, Xinyan Deng, Lars Heckler, Nassir Navab, Benjamin Busam
Comments: 23 pages, 13 figures. Code is available at this https URL
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Machine Learning (cs.LG)
[37] arXiv:2510.01914 [pdf, html, other]
Title: Automated Defect Detection for Mass-Produced Electronic Components Based on YOLO Object Detection Models
Wei-Lung Mao, Chun-Chi Wang, Po-Heng Chou, Yen-Ting Liu
Comments: 12 pages, 16 figures, 7 tables, and published in IEEE Sensors Journal
Journal-ref: IEEE Sensors Journal, vol. 24, no. 16, Aug. 2024
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Machine Learning (cs.LG); Signal Processing (eess.SP)
[38] arXiv:2510.01912 [pdf, html, other]
Title: Flow-Matching Guided Deep Unfolding for Hyperspectral Image Reconstruction
Yi Ai, Yuanhao Cai, Yulun Zhang, Xiaokang Yang
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[39] arXiv:2510.01841 [pdf, other]
Title: Leveraging Prior Knowledge of Diffusion Model for Person Search
Giyeol Kim, Sooyoung Yang, Jihyong Oh, Myungjoo Kang, Chanho Eom
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[40] arXiv:2510.01829 [pdf, html, other]
Title: Calibrating the Full Predictive Class Distribution of 3D Object Detectors for Autonomous Driving
Cornelius Schröder, Marius-Raphael Schlüter, Markus Lienkamp
Journal-ref: 2025 IEEE Intelligent Vehicles Symposium (IV), Cluj-Napoca, Romania, 2025, pp. 187-194
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[41] arXiv:2510.01784 [pdf, html, other]
Title: Pack and Force Your Memory: Long-form and Consistent Video Generation
Xiaofei Wu, Guozhen Zhang, Zhiyong Xu, Yuan Zhou, Qinglin Lu, Xuming He
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[42] arXiv:2510.01767 [pdf, html, other]
Title: LOBE-GS: Load-Balanced and Efficient 3D Gaussian Splatting for Large-Scale Scene Reconstruction
Sheng-Hsiang Hung, Ting-Yu Yen, Wei-Fang Sun, Simon See, Shih-Hsuan Hung, Hung-Kuo Chu
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[43] arXiv:2510.01715 [pdf, html, other]
Title: PyramidStyler: Transformer-Based Neural Style Transfer with Pyramidal Positional Encoding and Reinforcement Learning
Raahul Krishna Durairaju (1), K. Saruladha (2) ((1) California State University, Fullerton, (2) Puducherry Technological University)
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[44] arXiv:2510.01704 [pdf, html, other]
Title: Holistic Order Prediction in Natural Scenes
Pierre Musacchio, Hyunmin Lee, Jaesik Park
Comments: 25 pages, 11 figures, 6 tables
Journal-ref: The Thirty-Ninth Annual Conference on Neural Information Processing Systems (2025)
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Machine Learning (cs.LG)
[45] arXiv:2510.01691 [pdf, html, other]
Title: MedQ-Bench: Evaluating and Exploring Medical Image Quality Assessment Abilities in MLLMs
Jiyao Liu, Jinjie Wei, Wanying Qu, Chenglong Ma, Junzhi Ning, Yunheng Li, Ying Chen, Xinzhe Luo, Pengcheng Chen, Xin Gao, Ming Hu, Huihui Xu, Xin Wang, Shujian Gao, Dingkang Yang, Zhongying Deng, Jin Ye, Lihao Liu, Junjun He, Ningsheng Xu
Comments: 26 pages, 13 figures
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[46] arXiv:2510.01686 [pdf, html, other]
Title: FreeViS: Training-free Video Stylization with Inconsistent References
Jiacong Xu, Yiqun Mei, Ke Zhang, Vishal M. Patel
Comments: Project Page: this https URL
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[47] arXiv:2510.01683 [pdf, html, other]
Title: Uncovering Overconfident Failures in CXR Models via Augmentation-Sensitivity Risk Scoring
Han-Jay Shu, Wei-Ning Chiu, Shun-Ting Chang, Meng-Ping Huang, Takeshi Tohyama, Ahram Han, Po-Chih Kuo
Comments: 5 pages, 1 figure
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[48] arXiv:2510.01681 [pdf, html, other]
Title: Look Less, Reason More: Rollout-Guided Adaptive Pixel-Space Reasoning
Xuchen Li, Xuzhao Li, Jiahui Gao, Renjie Pi, Shiyu Hu, Wentao Zhang
Comments: Preprint, Under review
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[49] arXiv:2510.01678 [pdf, html, other]
Title: An Efficient Deep Template Matching and In-Plane Pose Estimation Method via Template-Aware Dynamic Convolution
Ke Jia, Ji Zhou, Hanxin Li, Zhigan Zhou, Haojie Chu, Xiaojie Li
Comments: Published in Expert Systems with Applications
Journal-ref: Expert Systems with Applications, Volume 298, Part D, 1 March 2026, 129813
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[50] arXiv:2510.01669 [pdf, html, other]
Title: UniVerse: Unleashing the Scene Prior of Video Diffusion Models for Robust Radiance Field Reconstruction
Jin Cao, Hongrui Wu, Ziyong Feng, Hujun Bao, Xiaowei Zhou, Sida Peng
Comments: Page: this https URL, Code: this https URL
Subjects: Computer Vision and Pattern Recognition (cs.CV)