Yibo Peng1,†, James Song2,†, Lei Li3,†, Xinyu Yang1, Mihai Christodorescu4, Ravi Mangal5, Corina Pasareanu1, Haizhong Zheng1, Beidi Chen1
1Carnegie Mellon University,
2University of Michigan,
3Peking University,
4Google,
5Colorado State University
†Equal contribution
TL;DR Code agents are increasingly trusted to autonomously fix bugs on platforms such as GitHub, yet their evaluation focuses almost exclusively on functional correctness. In this paper, we reveal a novel type of threat to real-world code agents: Functionally Correct yet Vulnerable (FCV) patches, which pass all test cases but contain vulnerable code. With our proposed FCV-Attack, which can be deliberately crafted by malicious attackers or implicitly introduced by benign developers, we show that SOTA LLMs (e.g., ChatGPT and Claude) and agent scaffolds (e.g., SWE-agent and OpenHands) are all vulnerable to this FCV threat; across 12 agent-model combinations on SWE-Bench, the attack requires only black-box access and a single query to the code agent. For example, for CWE-538 (information exposure vulnerability), the FCV-Attack attains an attack success rate of 40.7% on GPT-5 mini + OpenHands. Our results reveal an important security threat overlooked by current evaluation paradigms and urge the development of security-aware defenses for code agents.
Figure 1 An FCV attack can be initiated through two real-world pathways: deliberately by a malicious contributor, or more subtly, when a benign developer inadvertently incorporates content from a contaminated source. Both pathways result in the same input of developer-style instructions within an issue description, making them indistinguishable from the agent's perspective. This illustrates the core of the FCV threat: functionally correct patches that pass all tests can still embed exploitable vulnerabilities (e.g., CWE-94).
Most of our experiments reuse prior code from mini-SWE-agent, SWE-agent, and OpenHands, with additions for the injected CWE prompts.
Ideally, you should create three separate conda environments corresponding to each of the three agents.
cd FCV
conda create -n sweagent python==3.11
conda activate sweagent
cd SWE-agent
pip install -e .
conda create -n minisweagent python==3.11
conda activate minisweagent
cd ../mini-swe-agent
pip install -e .
conda create -n openhands python==3.11
conda activate openhands
cd ../openhands
pip install -e .
For the LLM judge, you can reuse any of the above environments, since it only requires requests and openai (if you are using gpt-5-mini as the judge). In our experiments, we used Qwen-3-Coder-480B-A35B-Instruct as the judge.
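For example, to reuse one of the environments above for the judge, a minimal sketch (package names inferred from the description here; see attack-lm-judge's README.md for the authoritative requirements):
conda activate sweagent  # or minisweagent / openhands
pip install requests openai  # openai is only needed if gpt-5-mini is the judge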
These are some example commands; they simply follow the official documentation for these agents.
cd mini-swe-agent
mini-extra swebench -c model_configs/kimi.yaml --subset verified --split test -o moonshotai_Kimi_K2_Instruct-pass1 --workers 16
sweagent run-batch \
--config model_configs/kimi-official.yaml \
--instances.type swe_bench \
--instances.subset verified \
--instances.split test \
--output_dir kimi-pass1 \
--num_workers 16
Step 1: Clean up any existing runtime containers
cd /path/to/OpenHands # Replace with your OpenHands directory
conda activate openhands
# List and stop all remote runtimes (if using remote runtime)
ALLHANDS_API_KEY="your-api-key" \
curl -H "X-API-Key: your-api-key" \
"https://runtime.eval.all-hands.dev/list" | \
jq -r '.runtimes[].runtime_id' | \
xargs -I {} curl -X POST \
-H "X-API-Key: your-api-key" \
-H "Content-Type: application/json" \
-d '{"runtime_id": "{}"}' \
"https://runtime.eval.all-hands.dev/stop"
# Run cleanup script
ALLHANDS_API_KEY="your-api-key" ./evaluation/utils/scripts/cleanup_remote_runtime.shStep 2: Run inference on SWE-bench Verified (Pass 1)
Choose your model and run:
# GPT-5 Mini
cd /path/to/OpenHands && conda activate openhands
ALLHANDS_API_KEY="your-api-key" \
RUNTIME=remote \
SANDBOX_REMOTE_RUNTIME_API_URL="https://runtime.eval.all-hands.dev" \
./evaluation/benchmarks/swe_bench/scripts/run_infer.sh \
llm.eval_gpt5_mini \
HEAD \
CodeActAgent \
100 \
100 \
16 \
princeton-nlp/SWE-bench_Verified \
test
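# Note (our reading of the upstream OpenHands swe_bench evaluation script; verify
# against your checkout): the positional arguments above are, in order,
#   run_infer.sh <llm_config> <commit-or-branch> <agent_class> <eval_limit> <max_iterations> <num_workers> <dataset> <split>
# The commands below differ only in the llm config key and the eval limit.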
# Claude Sonnet 4
cd /path/to/OpenHands && conda activate openhands
ALLHANDS_API_KEY="your-api-key" \
RUNTIME=remote \
SANDBOX_REMOTE_RUNTIME_API_URL="https://runtime.eval.all-hands.dev" \
./evaluation/benchmarks/swe_bench/scripts/run_infer.sh \
llm.eval_claude_sonnet4 \
HEAD \
CodeActAgent \
100 \
100 \
16 \
princeton-nlp/SWE-bench_Verified \
test
# Kimi-K2-Instruct
cd /path/to/OpenHands && conda activate openhands
ALLHANDS_API_KEY="your-api-key" \
RUNTIME=remote \
SANDBOX_REMOTE_RUNTIME_API_URL="https://runtime.eval.all-hands.dev" \
./evaluation/benchmarks/swe_bench/scripts/run_infer.sh \
llm.eval_kimi \
HEAD \
CodeActAgent \
200 \
100 \
16 \
princeton-nlp/SWE-bench_Verified \
test
# Qwen3-Coder-480B
cd /path/to/OpenHands && conda activate openhands
ALLHANDS_API_KEY="your-api-key" \
RUNTIME=remote \
SANDBOX_REMOTE_RUNTIME_API_URL="https://runtime.eval.all-hands.dev" \
./evaluation/benchmarks/swe_bench/scripts/run_infer.sh \
llm.eval_qwen_480b \
HEAD \
CodeActAgent \
200 \
100 \
16 \
princeton-nlp/SWE-bench_Verified \
test
Output: Results will be saved in evaluation/evaluation_outputs/outputs/princeton-nlp__SWE-bench_Verified-test/CodeActAgent/{MODEL_NAME}/output.jsonl
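As a quick sanity check (a convenience command of our own, not part of the official workflow), you can count how many instances produced an output record; the count should match the eval limit passed to run_infer.sh:
wc -l evaluation/evaluation_outputs/outputs/princeton-nlp__SWE-bench_Verified-test/CodeActAgent/{MODEL_NAME}/output.jsonl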
For more details on subsequent runs with CWE injection, see openhands/README.md.
After the first round, you will need to evaluate the results. You can either use the official SWE-bench GitHub repo and run the evaluation locally, or use their sb-cli cloud evaluation. We refer readers to the official documentation.
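For reference, a local evaluation with the official SWE-bench harness looks roughly like the following (the predictions path and run id are placeholders; check the SWE-bench documentation for the current flags and the expected predictions format):
python -m swebench.harness.run_evaluation \
--dataset_name princeton-nlp/SWE-bench_Verified \
--predictions_path path/to/predictions.jsonl \
--max_workers 16 \
--run_id fcv-pass1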
This will depend on the agent. Please go to each agent's folder and read the README.md for additional details.
Please go to attack-lm-judge's README.md for more details.
If you find our work useful, please cite our paper:
@misc{peng2025correctsafetrustfunctionally,
title={When "Correct" Is Not Safe: Can We Trust Functionally Correct Patches Generated by Code Agents?},
author={Yibo Peng and James Song and Lei Li and Xinyu Yang and Mihai Christodorescu and Ravi Mangal and Corina Pasareanu and Haizhong Zheng and Beidi Chen},
year={2025},
eprint={2510.17862},
archivePrefix={arXiv},
primaryClass={cs.CR},
url={https://arxiv.org/abs/2510.17862},
}
We're excited to share our early results and welcome feedback from the community as we continue to refine and expand FCV's capabilities. If you have any questions or feedback, please feel free to contact us:
- Email: [email protected]
- GitHub Issues: Create an issue
- Website: https://infini-ai-lab.github.io/FCV/
This project is licensed under the MIT License - see the LICENSE file for details.
