GitHub - IPADS-SAI/MobiAgent: The Intelligent GUI Agent for Mobile Phones

MobiAgent: A Systematic Framework for Customizable Mobile Agents

About

MobiAgent is a customizable mobile agent system with three components:

An agent model family: MobiMind
An agent acceleration framework: AgentRR
An agent benchmark: MobiFlow

System Architecture:

News

[2025.11.03] ✅ Added a multi-task execution module and user preference support. For details about multi-task usage and configuration, see here.
[2025.11.03] 🧠 Introduced a user profile memory system: asynchronous preference extraction with an LLM, raw-text preference storage and retrieval, and optional GraphRAG via Neo4j. Preferences are retrieved as their original text and appended to experience prompts to personalize planning; see here.
[2025.10.31] 🔥 We've updated the MobiMind-Mixed model based on Qwen3-VL-4B-Instruct! Download it at MobiMind-Mixed-4B-1031, and add the --use_qwen3 flag when running the dataset creation and agent runner scripts.
[2025.9.30] 🚀 Added a local experience retrieval module that supports querying past experience by task description, improving the quality and efficiency of task planning.
[2025.9.29] We've open-sourced a mixed version of MobiMind, capable of handling both Decider and Grounder tasks! Feel free to download and try it at MobiMind-Mixed-7B.
[2025.8.30] We've open-sourced MobiAgent!

Evaluation Results

Demo

Mobile App Demo:

MobiAgent_Demo.mp4

AgentRR Demo (Left: first task; Right: subsequent task)

AgentRR.mp4

Multi Task Demo

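task: Help me find the recommended best-selling men's jeans on Xiaohongshu, then search for the same jeans on Taobao, and send their brand, name, and price on Taobao to Xiao Zhao via WeChat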

Multi_Task_Demo.mp4

Project Structure

agent_rr/ - Agent Record & Replay framework
collect/ - Data collection, annotation, processing and export tools
runner/ - Agent executor that connects to phone via ADB, executes tasks, and records execution traces
MobiFlow/ - Agent evaluation benchmark based on milestone DAG
app/ - MobiAgent Android app
deployment/ - Service deployment for MobiAgent mobile application

Quick Start

Use with MobiAgent APP

If you would like to try MobiAgent directly with our app, download it from the Download Link.

Use with Python Scripts

If you would like to try MobiAgent with Python scripts, which use Android Debug Bridge (ADB) to control your phone, follow these steps:

Environment Setup

Create a virtual environment, e.g., using conda:

conda create -n MobiMind python=3.10
conda activate MobiMind

Minimal environment setup (if you only want to run the agent runner and want to avoid installing heavy dependencies such as torch):

# Install minimal dependencies
pip install -r requirements_simple.txt

Full environment setup (if you want to run the full pipeline):

pip install -r requirements.txt

# Download OmniParser model weights
for f in icon_detect/{train_args.yaml,model.pt,model.yaml} ; do huggingface-cli download microsoft/OmniParser-v2.0 "$f" --local-dir weights; done

# Download embedding model utils
huggingface-cli download BAAI/bge-small-zh --local-dir ./utils/experience/BAAI/bge-small-zh

# Install OCR utils
sudo apt install tesseract-ocr tesseract-ocr-chi-sim  # the Chinese language pack (chi-sim) is optional

# If you need GPU acceleration for OCR, install paddlepaddle-gpu according to your CUDA version
# For details, refer to https://www.paddlepaddle.org.cn/install/quick, for example CUDA 11.8:
python -m pip install paddlepaddle-gpu==3.1.0 -i https://www.paddlepaddle.org.cn/packages/stable/cu118/

Mobile Device Setup

Download and install ADBKeyboard on your Android device
Enable Developer Options on your Android device and allow USB debugging
Connect your phone to the computer using a USB cable
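
Before moving on, it can help to confirm that the phone is visible to ADB and that USB debugging has been authorized. The following is a minimal sketch, not part of the repository: the helper names `parse_adb_devices` and `connected_devices` are hypothetical, and the code assumes the `adb` binary is on your PATH. It relies only on the standard `adb devices` output, where each ready device appears as a serial followed by the state `device`.

```python
import subprocess


def parse_adb_devices(output: str) -> list[str]:
    """Return the serials of devices whose state is 'device' (online and authorized)."""
    serials = []
    for line in output.splitlines():
        parts = line.split()
        # Device lines have exactly two fields: <serial> <state>.
        # The header line and blank lines do not match this shape.
        if len(parts) == 2 and parts[1] == "device":
            serials.append(parts[0])
    return serials


def connected_devices() -> list[str]:
    """Run `adb devices` and return the serials that are ready for use."""
    result = subprocess.run(
        ["adb", "devices"], capture_output=True, text=True, check=True
    )
    return parse_adb_devices(result.stdout)
```

Calling `connected_devices()` should return a non-empty list once the phone is connected. A device listed as `unauthorized` is filtered out, which usually means the USB debugging prompt on the phone has not been accepted yet.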

Model Deployment

After downloading the decider, grounder, and planner models, use vLLM to deploy model inference services:

vllm serve IPADS-SAI/MobiMind-Decider-7B --port <decider port>
vllm serve IPADS-SAI/MobiMind-Grounder-3B --port <grounder port>
vllm serve Qwen/Qwen3-4B-Instruct --port <planner port>
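
Once the three services are running, a quick check that each one responds can save debugging time later. The sketch below is an illustration rather than project code: the function names are hypothetical, and it assumes the services expose vLLM's OpenAI-compatible `/v1/models` endpoint over plain HTTP on the ports you chose (8000, 8001, and 8002 are the runner's documented defaults).

```python
import json
import urllib.error
import urllib.request
from typing import Optional


def models_url(host: str, port: int) -> str:
    """Build the URL of the OpenAI-compatible model listing endpoint."""
    return f"http://{host}:{port}/v1/models"


def parse_model_ids(payload: dict) -> list[str]:
    """Extract model ids from a /v1/models response body."""
    return [entry["id"] for entry in payload.get("data", [])]


def served_models(host: str, port: int, timeout: float = 5.0) -> Optional[list[str]]:
    """Return the model ids served at host:port, or None if the service is unreachable."""
    try:
        with urllib.request.urlopen(models_url(host, port), timeout=timeout) as resp:
            return parse_model_ids(json.load(resp))
    except (urllib.error.URLError, OSError, ValueError):
        # Connection refused, timeout, or a non-JSON response.
        return None
```

For example, `served_models("localhost", 8000)` should list the decider model if that service was started on port 8000, and return `None` if nothing is listening there.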

Launch Agent Runner

Write the list of tasks that you would like to test in runner/mobiagent/task.json, then launch the agent runner:

python -m runner.mobiagent.mobiagent --service_ip <Service IP> --decider_port <Decider Service Port> --grounder_port <Grounder Service Port> --planner_port <Planner Service Port>

Parameters:

--service_ip: Service IP (default: localhost)
--decider_port: Decider service port (default: 8000)
--grounder_port: Grounder service port (default: 8001)
--planner_port: Planner service port (default: 8002)

The runner automatically controls the device and invokes the agent models to complete the predefined tasks.
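
If you launch the runner from another script, the command line can be assembled programmatically. This is a small sketch under stated assumptions: `build_runner_command` is a hypothetical helper, not something the repository provides, and it uses only the module path, flags, and default values listed above.

```python
import sys


def build_runner_command(
    service_ip: str = "localhost",
    decider_port: int = 8000,
    grounder_port: int = 8001,
    planner_port: int = 8002,
) -> list[str]:
    """Assemble the argv list for the agent runner using the documented flags."""
    return [
        sys.executable, "-m", "runner.mobiagent.mobiagent",
        "--service_ip", service_ip,
        "--decider_port", str(decider_port),
        "--grounder_port", str(grounder_port),
        "--planner_port", str(planner_port),
    ]


# Example (run from the repository root so the `runner` package is importable):
#   import subprocess
#   subprocess.run(build_runner_command(service_ip="192.168.1.10"), check=True)
```

Passing the arguments as a list avoids shell quoting issues, and using `sys.executable` keeps the runner in the same Python environment as the calling script.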

Detailed Sub-module Usage

For detailed usage instructions, see the README.md files in each sub-module directory.

Citation

If you find MobiAgent useful in your research, please feel free to cite our paper:

@misc{zhang2025mobiagentsystematicframeworkcustomizable,
  title={MobiAgent: A Systematic Framework for Customizable Mobile Agents}, 
  author={Cheng Zhang and Erhu Feng and Xi Zhao and Yisheng Zhao and Wangbo Gong and Jiahui Sun and Dong Du and Zhichao Hua and Yubin Xia and Haibo Chen},
  year={2025},
  eprint={2509.00531},
  archivePrefix={arXiv},
  primaryClass={cs.MA},
  url={https://arxiv.org/abs/2509.00531}, 
}

Acknowledgements

We gratefully acknowledge open-source projects such as MobileAgent, UI-TARS, and Qwen-VL. We also thank the National Innovation Institute of High-end Smart Appliances for its support of this project.
