
Enhance OAuth Provider Documentation and Transport Methods #22128


Merged
merged 23 commits on May 1, 2025
9647578: Hyperlint Automation Fix (hyperlint-ai[bot], May 1, 2025)
02a2919: Hyperlint Automation Fix (hyperlint-ai[bot], May 1, 2025)
b2f147c: Hyperlint Automation Fix (hyperlint-ai[bot], May 1, 2025)
2fa147c: Hyperlint Automation Fix (hyperlint-ai[bot], May 1, 2025)
d242b31: Hyperlint Automation Fix (hyperlint-ai[bot], May 1, 2025)
76b0c80: Hyperlint Automation Fix (hyperlint-ai[bot], May 1, 2025)
4f7bf43: Hyperlint Automation Fix (hyperlint-ai[bot], May 1, 2025)
6aa278f: Hyperlint Automation Fix (hyperlint-ai[bot], May 1, 2025)
fd455db: Hyperlint Automation Fix (hyperlint-ai[bot], May 1, 2025)
b041fe1: Hyperlint Automation Fix (hyperlint-ai[bot], May 1, 2025)
690ab6a: Hyperlint Automation Fix (hyperlint-ai[bot], May 1, 2025)
3796ca6: Hyperlint Automation Fix (hyperlint-ai[bot], May 1, 2025)
bd01cac: Hyperlint Automation Fix (hyperlint-ai[bot], May 1, 2025)
499d049: Hyperlint Automation Fix (hyperlint-ai[bot], May 1, 2025)
f570063: Hyperlint Automation Fix (hyperlint-ai[bot], May 1, 2025)
1c860bd: Hyperlint Automation Fix (hyperlint-ai[bot], May 1, 2025)
a2836c8: Hyperlint Automation Fix (hyperlint-ai[bot], May 1, 2025)
22a84bc: Hyperlint Automation Fix (hyperlint-ai[bot], May 1, 2025)
aa2d9c8: Hyperlint Automation Fix (hyperlint-ai[bot], May 1, 2025)
1cf5bbc: Hyperlint Automation Fix (hyperlint-ai[bot], May 1, 2025)
30fe94a: Update src/content/docs/agents/model-context-protocol/authorization.mdx (kodster28, May 1, 2025)
8c1ca4a: Update src/content/docs/agents/model-context-protocol/transport.mdx (kodster28, May 1, 2025)
1226da3: Update src/content/docs/agents/model-context-protocol/index.mdx (kodster28, May 1, 2025)
@@ -126,14 +126,14 @@ Read the docs for the [Workers oAuth Provider Library](https://github.com/cloudf

If your application already implements an OAuth provider itself, or you use [Stytch](https://stytch.com/), [Auth0](https://auth0.com/), [WorkOS](https://workos.com/), or another authorization-as-a-service provider, you can use it in the same way that you would use a third-party OAuth provider, described above in (2).

You can use the auth provider to:
- Allow users to authenticate to your MCP server through email, social logins, SSO (single sign-on), and MFA (multi-factor authentication).
- Define scopes and permissions that directly map to your MCP tools.
- Present users with a consent page corresponding with the requested permissions.
- Enforce the permissions so that agents can only invoke permitted tools.
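The scope-based enforcement described above can be sketched in a few lines (the tool and scope names here are illustrative, not part of any provider's API):

```javascript
// Sketch: map each MCP tool to the OAuth scope it requires.
// Tool and scope names are illustrative placeholders.
const toolScopes = {
  listDocuments: "documents:read",
  deleteDocument: "documents:write",
};

// Return true only when the granted scopes cover the tool's requirement;
// unmapped tools are denied by default.
function canInvoke(toolName, grantedScopes) {
  const required = toolScopes[toolName];
  return required !== undefined && grantedScopes.includes(required);
}
```

An agent granted only `documents:read` could then call `listDocuments` but not `deleteDocument`, matching the consent page it was shown.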

#### Stytch
- Get started with a [remote MCP server that uses Stytch](https://stytch.com/docs/guides/connected-apps/mcp-servers) to allow users to sign in with email, Google login or enterprise SSO and authorize their AI agent to view and manage their companys OKRs on their behalf. Stytch will handle restricting the scopes granted to the AI agent based on the users role and permissions within their organization. When authorizing the MCP Client, each user will see a consent page that outlines the permissions that the agent is requesting that they are able to grant based on their role.
+ Get started with a [remote MCP server that uses Stytch](https://stytch.com/docs/guides/connected-apps/mcp-servers) to allow users to sign in with email, Google login or enterprise SSO and authorize their AI agent to view and manage their company's OKRs on their behalf. Stytch will handle restricting the scopes granted to the AI agent based on the user's role and permissions within their organization. When authorizing the MCP Client, each user will see a consent page that outlines the permissions that the agent is requesting that they are able to grant based on their role.

[![Deploy to Cloudflare](https://deploy.workers.cloudflare.com/button)](https://deploy.workers.cloudflare.com/?url=https://github.com/cloudflare/ai/tree/main/demos/mcp-stytch-b2b-okr-manager)

@@ -144,11 +144,11 @@ For more consumer use cases, deploy a remote MCP server for a To Do app that use
#### Auth0
Get started with a remote MCP server that uses Auth0 to authenticate users through email, social logins, or enterprise SSO to interact with their todos and personal data through AI agents. The MCP server securely connects to API endpoints on behalf of users, showing exactly which resources the agent will be able to access once it gets consent from the user. In this implementation, access tokens are automatically refreshed during long running interactions.

To set it up, first deploy the protected API endpoint:

[![Deploy to Cloudflare](https://deploy.workers.cloudflare.com/button)](https://deploy.workers.cloudflare.com/?url=https://github.com/cloudflare/ai/tree/main/demos/remote-mcp-auth0/todos-api)

Then, deploy the MCP server that handles authentication through Auth0 and securely connects AI agents to your API endpoint.

[![Deploy to Cloudflare](https://deploy.workers.cloudflare.com/button)](https://deploy.workers.cloudflare.com/?url=https://github.com/cloudflare/ai/tree/main/demos/remote-mcp-auth0/mcp-auth0-oidc)

@@ -190,7 +190,6 @@ function requirePermission(permission, handler) {
return async (request, context) => {
// Check if user has the required permission
const userPermissions = context.props.permissions || [];

if (!userPermissions.includes(permission)) {
return {
content: [{ type: "text", text: `Permission denied: requires ${permission}` }],
4 changes: 2 additions & 2 deletions src/content/docs/agents/model-context-protocol/index.mdx
Original file line number Diff line number Diff line change
@@ -27,10 +27,10 @@ The MCP standard supports two modes of operation:
- **Local MCP connections**: MCP clients connect to MCP servers on the same machine, using [stdio](https://spec.modelcontextprotocol.io/specification/draft/basic/transports/#stdio) as a local transport method.

### Best Practices
- - **Tool design**: Don’t treat your MCP server as a wrapper around your full API schema. Instead, build tools that are optimized for specific user goals and reliable outcomes. Fewer, well-designed tools often outperform many granular ones, especially for agents with small context windows or tight latency budgets.
+ - **Tool design**: Do not treat your MCP server as a wrapper around your full API schema. Instead, build tools that are optimized for specific user goals and reliable outcomes. Fewer, well-designed tools often outperform many granular ones, especially for agents with small context windows or tight latency budgets.
- **Scoped permissions**: Deploying several focused MCP servers, each with narrowly scoped permissions, reduces the risk of over-privileged access and makes it easier to manage and audit what each server is allowed to do.
- **Tool descriptions**: Detailed parameter descriptions help agents understand how to use your tools correctly — including what values are expected, how they affect behavior, and any important constraints. This reduces errors and improves reliability.
- - **Evaluation tests**: Use evaluation tests (evals) to measure the agent’s ability to use your tools correctly. Run these after any updates to your server or tool descriptions to catch regressions early and track improvements over time.
+ - **Evaluation tests**: Use evaluation tests ('evals') to measure the agent’s ability to use your tools correctly. Run these after any updates to your server or tool descriptions to catch regressions early and track improvements over time.
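As a concrete illustration of the evals point, a minimal harness can replay prompts and score how often the agent selects the expected tool. This is a sketch only: `chooseTool` stands in for whatever invokes your agent and is not an SDK API.

```javascript
// Sketch of a minimal eval harness: replay prompts and count how often
// the agent picked the expected tool. `chooseTool` is a stand-in for
// whatever calls your agent; it is not part of any SDK.
function runEvals(cases, chooseTool) {
  let passed = 0;
  for (const { prompt, expectedTool } of cases) {
    if (chooseTool(prompt) === expectedTool) passed += 1;
  }
  return { passed, total: cases.length };
}
```

Running a harness like this after each change to your server or tool descriptions gives a simple regression signal over time.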

### Get Started

Original file line number Diff line number Diff line change
@@ -46,7 +46,7 @@ But if you want your MCP server to:
You can use the APIs below to do so.

#### Hibernation Support
- `McpAgent` instances automatically support [WebSockets Hibernation](https://developers.cloudflare.com/durable-objects/best-practices/websockets/#websocket-hibernation-api), allowing stateful MCP servers to sleep during inactive periods while preserving their state. This means your agents only consume compute resources when actively processing requests, optimizing costs while maintaining the full context and conversation history.
+ `McpAgent` instances automatically support [WebSockets Hibernation](/durable-objects/best-practices/websockets/#websocket-hibernation-api), allowing stateful MCP servers to sleep during inactive periods while preserving their state. This means your agents only consume compute resources when actively processing requests, optimizing costs while maintaining the full context and conversation history.

### State synchronization APIs

@@ -12,7 +12,7 @@ The Model Context Protocol (MCP) specification defines three standard [transport

1. **stdio, communication over standard in and standard out** — designed for local MCP connections.
2. **Server-Sent Events (SSE)** — Currently supported by most remote MCP clients, but is expected to be replaced by Streamable HTTP over time. It requires two endpoints: one for sending requests, another for receiving streamed responses.
- 3. **Streamable HTTP** — New transport method [introduced](https://modelcontextprotocol.io/specification/2025-03-26/basic/transports#streamable-http) in March 2025. It simplifies the communication by using a single HTTP endpoint for bidirectional messaging. It's currently gaining adoption among remote MCP clients, but is expected to become the standard transport in the future.
+ 3. **Streamable HTTP** — New transport method [introduced](https://modelcontextprotocol.io/specification/2025-03-26/basic/transports#streamable-http) in March 2025. It simplifies the communication by using a single HTTP endpoint for bidirectional messaging. It is currently gaining adoption among remote MCP clients, but it is expected to become the standard transport in the future.

MCP servers built with the [Agents SDK](/agents) can support both remote transport methods (SSE and Streamable HTTP), with the [`McpAgent` class](https://github.com/cloudflare/agents/blob/2f82f51784f4e27292249747b5fbeeef94305552/packages/agents/src/mcp.ts) automatically handling the transport configuration.
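A common way to expose both transports from a single Worker is to route by request path. The path check itself can be sketched as below; the `/sse` and `/mcp` paths are a common convention in the examples, not something the MCP specification mandates.

```javascript
// Sketch: decide which transport a request targets based on its path.
// The /sse and /mcp paths are conventional, not required by the spec.
function chooseTransport(pathname) {
  if (pathname === "/sse" || pathname.startsWith("/sse/")) return "sse";
  if (pathname === "/mcp") return "streamable-http";
  return null; // not an MCP endpoint
}
```

In a Worker, the result of a check like this would select which handler serves the request, so existing SSE clients and newer Streamable HTTP clients can share one deployment.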

@@ -78,7 +78,7 @@ export default app
```
</TabItem> </Tabs>

#### MCP Server with Authentication
If your MCP server implements authentication & authorization using the [Workers OAuth Provider](https://github.com/cloudflare/workers-oauth-provider) Library, then you can configure it to support both transport methods using the `apiHandlers` property.

```js
// (code block collapsed in the diff view)
```
@@ -105,6 +105,6 @@ To use apiHandlers, update to @cloudflare/workers-oauth-provider v0.0.4 or later
With these few changes, your MCP server will support both transport methods, making it compatible with both existing and new clients.

### Testing with MCP Clients
- While most MCP clients haven't yet adopted the new Streamable HTTP transport, you can start testing it today using [`mcp-remote`](https://www.npmjs.com/package/mcp-remote), an adapter that lets MCP clients that otherwise only support local connections work with remote MCP servers.
+ While most MCP clients have not yet adopted the new Streamable HTTP transport, you can start testing it today using [`mcp-remote`](https://www.npmjs.com/package/mcp-remote), an adapter that lets MCP clients that otherwise only support local connections work with remote MCP servers.

Follow [this guide](/agents/guides/test-remote-mcp-server/) for instructions on how to connect to your remote MCP server from Claude Desktop, Cursor, Windsurf, and other local MCP clients, using the [`mcp-remote` local proxy](https://www.npmjs.com/package/mcp-remote).
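For example, a stdio-only client can typically be pointed at a remote server with a configuration along these lines; the server name and URL below are placeholders, and the `mcp-remote` README documents the exact options.

```json
{
  "mcpServers": {
    "my-remote-server": {
      "command": "npx",
      "args": ["mcp-remote", "https://your-server.example.com/sse"]
    }
  }
}
```

With this in place, the client launches `mcp-remote` as a local stdio server, and the adapter proxies requests to the remote endpoint.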
2 changes: 1 addition & 1 deletion src/content/docs/agents/platform/limits.mdx
@@ -18,7 +18,7 @@ Many limits are inherited from those applied to Workers scripts and/or Durable O
| Max definitions per account | ~250,000+ [^2] |
| Max state stored per unique Agent | 1 GB |
| Max compute time per Agent | 30 seconds (refreshed per HTTP request / incoming WebSocket message) [^3] |
- | Duration (wall clock) per step [^3] | Unlimited (e.g. waiting on a database call or an LLM response) |
+ | Duration (wall clock) per step [^3] | Unlimited (for example, waiting on a database call or an LLM response) |

---

2 changes: 1 addition & 1 deletion src/content/docs/ai-gateway/configuration/caching.mdx
@@ -15,7 +15,7 @@ Enable and customize your gateway cache to serve requests directly from Cloudfla

Currently caching is supported only for text and image responses, and it applies only to identical requests.

- This is helpful for use cases when there are limited prompt options - for example, a support bot that asks "How can I help you?" and lets the user select an answer from a limited set of options works well with the current caching configuration.
+ This configuration benefits use cases with limited prompt options. For example, a support bot that asks "How can I help you?" and lets the user select an answer from a limited set of options works well with the current caching configuration.
We plan on adding semantic search for caching in the future to improve cache hit rates.
:::

@@ -11,7 +11,7 @@ The AI Gateway WebSockets API provides a persistent connection for AI interactio
- **Realtime APIs** - Designed for AI providers that offer low-latency, multimodal interactions over WebSockets.
- **Non-Realtime APIs** - Supports standard WebSocket communication for AI providers, including those that do not natively support WebSockets.

- ## When to use WebSockets?
+ ## When to use WebSockets

WebSockets are long-lived TCP connections that enable bi-directional, real-time and non-real-time communication between client and server. Unlike HTTP connections, which require repeated handshakes for each request, WebSockets maintain the connection, supporting continuous data exchange with reduced overhead. WebSockets are ideal for applications needing low-latency, real-time data, such as voice assistants.

2 changes: 1 addition & 1 deletion src/content/docs/ai-gateway/evaluations/index.mdx
@@ -10,7 +10,7 @@ sidebar:

Understanding your application's performance is essential for optimization. Developers often have different priorities, and finding the optimal solution involves balancing key factors such as cost, latency, and accuracy. Some prioritize low-latency responses, while others focus on accuracy or cost-efficiency.

- AI Gateway's Evaluations provide the data needed to make informed decisions on how to optimize your AI application. Whether it's adjusting the model, provider, or prompt, this feature delivers insights into key metrics around performance, speed, and cost. It empowers developers to better understand their application's behavior, ensuring improved accuracy, reliability, and customer satisfaction.
+ AI Gateway's Evaluations provide the data needed to make informed decisions on how to optimize your AI application. Whether it is adjusting the model, provider, or prompt, this feature delivers insights into key metrics around performance, speed, and cost. It empowers developers to better understand their application's behavior, ensuring improved accuracy, reliability, and customer satisfaction.

Evaluations use datasets which are collections of logs stored for analysis. You can create datasets by applying filters in the Logs tab, which help narrow down specific logs for evaluation.

2 changes: 1 addition & 1 deletion src/content/docs/ai-gateway/get-started.mdx
@@ -55,7 +55,7 @@ Now that your provider is connected to the AI Gateway, you can view analytics fo

:::note[Note]

- The cost metric is an estimation based on the number of tokens sent and received in requests. While this metric can help you monitor and predict cost trends, refer to your providers dashboard for the most accurate cost details.
+ The cost metric is an estimation based on the number of tokens sent and received in requests. While this metric can help you monitor and predict cost trends, refer to your provider's dashboard for the most accurate cost details.

:::
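A token-based estimate of this kind is simple arithmetic over token counts and per-million-token prices. A sketch, with placeholder prices rather than any provider's actual rates:

```javascript
// Sketch: estimate request cost from token counts and per-million-token
// prices. The prices are placeholders; use your provider's published rates.
function estimateCost(promptTokens, completionTokens, inputPricePerM, outputPricePerM) {
  return (
    (promptTokens / 1_000_000) * inputPricePerM +
    (completionTokens / 1_000_000) * outputPricePerM
  );
}
```

Because estimates like this depend on tokenizer behavior and current pricing, the provider's dashboard remains the source of truth for billing.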
