Stars
Project AirSim is Microsoft's evolution of AirSim, an advanced simulation platform for building, training, and testing autonomous systems in high-fidelity virtual environments
GLM-4.6V/4.5V/4.1V-Thinking: Towards Versatile Multimodal Reasoning with Scalable Reinforcement Learning
Official implementation of the paper "GSWorld: Closed-Loop Photo-Realistic Simulation Suite for Robotic Manipulation"
NVIDIA Cosmos Dataset Search (CDS) is a comprehensive platform for semantic search across video datasets using advanced AI models.
A novel Multimodal Large Language Model (MLLM) architecture, designed to structurally align visual and textual embeddings.
DeerFlow is a community-driven Deep Research framework, combining language models with tools like web search, crawling, and Python execution, while contributing back to the open-source community.
Use Python to manage, produce, and consume data with Aliyun Log Service.
Python SDK for the DingTalk Stream Mode API. Compared with webhook mode, it makes connecting to a DingTalk chatbot easier.
Playwright Model Context Protocol Server - Tool to automate Browsers and APIs in Claude Desktop, Cline, Cursor IDE and More 🔌
A high-throughput and memory-efficient inference and serving engine for LLMs
Agent framework and applications built upon Qwen>=3.0, featuring Function Calling, MCP, Code Interpreter, RAG, Chrome extension, etc.
🍒 Cherry Studio is a desktop client that supports multiple LLM providers.
Expose your FastAPI endpoints as Model Context Protocol (MCP) tools, with Auth!
Qwen3 is the large language model series developed by Qwen team, Alibaba Cloud.
A code executor for Dify that is compatible with the official sandbox API calls and dependency installation.
dify-on-dingtalk is a very lightweight DingTalk bot integration for Dify. With simple configuration, you can connect your Dify application to an enterprise internal bot, enabling intelligent Q&A in group chats and private chats, with support for DingTalk's AI card streaming typewriter output effect.
A mini, open-weights version of our Proxy assistant.
Backend service for xiaozhi-esp32 that helps you quickly build an ESP32 device control server.
Qwen2.5-Omni is an end-to-end multimodal model by the Qwen team at Alibaba Cloud, capable of understanding text, audio, images, and video, and of generating speech in real time.
No fortress, purely open ground. OpenManus is Coming.
The official Python SDK for Model Context Protocol servers and clients
A Flexible Framework for Experiencing Heterogeneous LLM Inference/Fine-tune Optimizations
The all-in-one Desktop & Docker AI application with built-in RAG, AI agents, No-code agent builder, MCP compatibility, and more.