[BRAPI]Rest API guide #22056

Draft
wants to merge 1 commit into
base: production
6 changes: 6 additions & 0 deletions src/content/docs/browser-rendering/get-started.mdx
@@ -9,3 +9,9 @@ Browser rendering can be used in two ways:

- [Workers Binding API](/browser-rendering/workers-binding-api) for complex scripts.
- [REST API](/browser-rendering/rest-api/) for simple actions.

## Examples

- [Workers Binding API](/browser-rendering/how-to/ai/): Fetch [https://labs.apnic.net/](https://labs.apnic.net/) and apply a machine-learning model via Workers AI to extract the first post as JSON according to your schema.

- [REST API](/browser-rendering/how-to/markdown-extraction/): Render and extract the complete JSON output from the [`/markdown` endpoint](/browser-rendering/rest-api/markdown-endpoint) by processing the blog post [Introducing AutoRAG on Cloudflare](https://blog.cloudflare.com/introducing-autorag-on-cloudflare/).
178 changes: 178 additions & 0 deletions src/content/docs/browser-rendering/how-to/markdown-extraction.mdx
@@ -0,0 +1,178 @@
---
title: Extracting blog post content as markdown using the markdown endpoint
sidebar:
  order: 4
---

This guide shows you how to capture the complete JSON output from Cloudflare's [`/markdown` API endpoint](/browser-rendering/rest-api/markdown-endpoint/).

This guide extracts the content of a post from the Cloudflare Blog: [Introducing AutoRAG on Cloudflare](https://blog.cloudflare.com/introducing-autorag-on-cloudflare/).

## Prerequisites

1. Cloudflare account and API token.

   - [Create a token](/fundamentals/api/get-started/create-token/) with **Browser Rendering: Edit** permissions.
   - You can do this under **My Profile → API Tokens → Create Token** on your [Cloudflare dashboard](https://dash.cloudflare.com/).
   - Note your **Account ID** (from the dashboard homepage) and **API Token**.

2. Command-line tools installed.

   - cURL: a command-line tool for sending HTTP requests.
     - macOS/Linux: usually preinstalled.
     - Windows: available via WSL, Git Bash, or native Windows builds.
   - Python 3: used by the optional extraction script in step 5.

## 1: Configure your environment variables

Save your sensitive information into environment variables to avoid hardcoding credentials.

```bash
export CF_ACCOUNT_ID="your-cloudflare-account-id"
export CF_API_TOKEN="your-api-token-with-edit-permissions"
```
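
If you later script these requests instead of running them from the shell, the same two variables can be read from the environment. The following is a minimal Python sketch, assuming the variable names exported above; the helper name `load_credentials` is hypothetical and not part of any Cloudflare tooling.

```python
import os


def load_credentials(env=os.environ):
    """Return (account_id, api_token) read from environment variables.

    `load_credentials` is a hypothetical helper used for illustration only.
    """
    required = ("CF_ACCOUNT_ID", "CF_API_TOKEN")
    # Treat unset and empty variables the same way: both are unusable.
    missing = [name for name in required if not env.get(name)]
    if missing:
        raise RuntimeError(f"Missing environment variables: {', '.join(missing)}")
    return env["CF_ACCOUNT_ID"], env["CF_API_TOKEN"]
```

Calling `load_credentials()` with no arguments reads from the current process environment; passing a plain dictionary makes the helper easy to test without touching real credentials.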

## 2: Make the API request and save the raw JSON
Contributor

Question about this tutorial..... if we're already running a Python script, couldn't we just have steps 2-4 included in the Python script as well?

Seems like it would simplify things a lot.

Contributor Author

I am not sure how to do that. Maybe you could guide me in the right direction?

Contributor

> I am not sure how to do that. Maybe you could guide me in the right direction?

Python has a requests library, which could be used to make the cURL request.

Contributor

I suppose the broader question is also.... why Python here?

If we're using the REST API, all of the other examples are using the TypeScript SDK... so why would we change languages from what we provide in the rest of the docs?

Contributor

I am a little confused too..
if our end goal is to generate an md file for a blog post, we could perhaps do it in a worker?
And if that is the case, it would make more sense to use BR bindings (instead of the REST API). We could do all the BR work in the worker and finally return the md file as the worker response.

If it's not possible to do it in a worker, instead of python, we should do it as a node script with the typescript SDK or directly calling the REST API through fetch. Lemme know if you need any help figuring this out

Contributor Author

@omarmosid I specifically did not do it with Workers because we do not have any guide of using our REST API endpoints. The goal is to show how to use our REST API endpoints.
Please assist in doing it through the fetch API.

Contributor @omarmosid (May 1, 2025)

Something like this? (using node)

```js
// saveMarkdown.js
const fs = require('fs');

async function downloadMarkdown() {
  const accountId = '<accountId>';
  const apiToken = '<apiToken>';
  const blogUrl = 'https://example.com';
  const endpoint = `https://api.cloudflare.com/client/v4/accounts/${accountId}/browser-rendering/markdown`;

  try {
    const response = await fetch(endpoint, {
      method: 'POST',
      headers: {
        'Content-Type': 'application/json',
        'Authorization': `Bearer ${apiToken}`
      },
      body: JSON.stringify({ url: blogUrl })
    });

    if (!response.ok) {
      throw new Error(`Failed to fetch markdown: ${response.status} ${response.statusText}`);
    }

    const data = await response.json();
    const markdown = data.result;

    const filename = 'output.md';
    fs.writeFileSync(filename, markdown);
    console.log(`Markdown saved to ${filename}`);
  } catch (error) {
    console.error('Error:', error.message);
  }
}

downloadMarkdown();
```

Contributor @kodster28 (May 1, 2025)

> @omarmosid I specifically did not do it with Workers because we do not have any guide of using our REST API endpoints. The goal is to show how to use our REST API endpoints. Please assist in doing it through the fetch API.

If it's something that fundamentally doesn't make sense to do via the REST API, shouldn't we be looking for another use case?

We don't want to promote an inefficient approach to a problem, even if it's illustrative.


Run this command to fetch the markdown representation of the AutoRAG blog post and store it into a local JSON file:

```bash
curl -s -X POST \
"https://api.cloudflare.com/client/v4/accounts/${CF_ACCOUNT_ID}/browser-rendering/markdown" \
-H "Content-Type: application/json" \
-H "Authorization: Bearer ${CF_API_TOKEN}" \
-d '{
"url": "https://blog.cloudflare.com/introducing-autorag-on-cloudflare/"
}' \
> autorag-full-response.json
```

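The `>` operator is shell output redirection: it writes curl's output to `autorag-full-response.json`, overwriting the file if it already exists. The `-s` flag silences curl's progress meter.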
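
The review thread asks whether this step could live in a script rather than a shell command. As one possible answer, here is a minimal Python sketch that mirrors the curl call using only the standard library. The endpoint URL, headers, and body are taken from the command above; the function names `build_markdown_request` and `save_response` and the 60-second timeout are assumptions made for illustration.

```python
import json
import urllib.request

API_BASE = "https://api.cloudflare.com/client/v4"


def build_markdown_request(account_id, api_token, page_url):
    """Build (but do not send) the POST request for the /markdown endpoint."""
    endpoint = f"{API_BASE}/accounts/{account_id}/browser-rendering/markdown"
    body = json.dumps({"url": page_url}).encode("utf-8")
    return urllib.request.Request(
        endpoint,
        data=body,
        method="POST",
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {api_token}",
        },
    )


def save_response(request, output_path, timeout=60):
    """Send the request and write the raw JSON response body to output_path.

    The timeout value is an arbitrary choice for this sketch.
    """
    with urllib.request.urlopen(request, timeout=timeout) as response:
        raw = response.read()
    with open(output_path, "wb") as f:
        f.write(raw)
    return len(raw)
```

For example, `save_response(build_markdown_request(account_id, api_token, "https://blog.cloudflare.com/introducing-autorag-on-cloudflare/"), "autorag-full-response.json")` would produce the same file as the curl command. `urlopen` raises `urllib.error.HTTPError` for non-2xx responses, so a failed call stops before anything is written.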

## 3: Inspect the saved JSON

You can check the start of the saved JSON file to ensure it looks right:

```bash
head -n 20 autorag-full-response.json
```

```json output
{
  "success": true,
  "errors": [],
  "messages": [],
  "result": "# [Get Started Free](https://dash.cloudflare.com/sign-up)|[Contact Sales](https://www.cloudflare.com/plans/enterprise/contact/)\n\n[![The Cloudflare Blog](https://cf-assets.www.cloudflare ..."
}
```
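
If the response was saved as one long line, `head -n 20` prints the whole file, which makes a quick check harder to read. The following Python sketch summarizes the saved file instead. It assumes only the four top-level fields shown in the sample output above; the function names and the 200-character preview length are illustrative choices.

```python
import json


def summarize_response(data, preview_chars=200):
    """Return a short, human-readable summary of a parsed API response."""
    result = data.get("result")
    lines = [
        f"success: {data.get('success')}",
        f"errors: {len(data.get('errors') or [])}",
    ]
    if isinstance(result, str):
        lines.append(f"result length: {len(result)} characters")
        # repr() keeps newlines visible as \n so the preview stays on one line.
        lines.append(f"result preview: {result[:preview_chars]!r}")
    else:
        lines.append("result: missing or not a string")
    return "\n".join(lines)


def summarize_file(path):
    """Load a saved response file and summarize it."""
    with open(path, "r", encoding="utf-8") as f:
        return summarize_response(json.load(f))
```

Running `print(summarize_file("autorag-full-response.json"))` shows whether the call succeeded and how much Markdown came back, without dumping the entire document to the terminal.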

## 4: (Optional) Skip unwanted resources

To skip unnecessary assets such as CSS, JavaScript, or images when fetching the page, add the `rejectRequestPattern` parameter:

```bash
curl -s -X POST \
"https://api.cloudflare.com/client/v4/accounts/${CF_ACCOUNT_ID}/browser-rendering/markdown" \
-H "Content-Type: application/json" \
-H "Authorization: Bearer ${CF_API_TOKEN}" \
-d '{
"url": "https://blog.cloudflare.com/introducing-autorag-on-cloudflare/",
"rejectRequestPattern": [
"/^.*\\.(css|js|png|svg)$/"
]
}' \
> autorag-no-assets.json
```
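
To get a feel for which requests a pattern like this would cover, you can try it locally before sending it. The sketch below is only a local illustration in Python: it assumes the pattern is applied as a regular expression to the full request URL, and it is not a description of how the service performs the matching. The helper name `would_be_rejected` is hypothetical.

```python
import re

# The same expression as in the request body, without the surrounding slashes
# and with the JSON escaping (\\.) decoded to a single escaped dot (\.).
ASSET_PATTERN = re.compile(r"^.*\.(css|js|png|svg)$")


def would_be_rejected(url):
    """Return True if the URL matches the asset pattern."""
    return ASSET_PATTERN.search(url) is not None


for url in (
    "https://example.com/static/site.css",
    "https://example.com/static/app.js?v=3",
    "https://example.com/blog/post",
):
    print(url, would_be_rejected(url))
```

Because the expression is anchored with `$`, a URL that carries a query string after the extension, such as `app.js?v=3`, does not match. If the pages you fetch version their assets that way, a looser pattern may be needed.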

## 5: Extract and save the Markdown from the JSON file

After saving the full response, use the following script to extract just the Markdown.

The script does the following:

1. Reads the full JSON response from `autorag-full-response.json`
2. Extracts the Markdown string from the `"result"` field
3. Writes that Markdown to `autorag-blog.md`

```py
#!/usr/bin/env python3
"""
extract_markdown.py

Reads the full JSON response from Cloudflare's Markdown endpoint
and writes the 'result' field (the converted Markdown) to a .md file.
"""

import json
import sys
from pathlib import Path

# Input and output file paths
INPUT_JSON = Path("autorag-full-response.json")
OUTPUT_MD = Path("autorag-blog.md")


def main():
    # Check that the input file exists
    if not INPUT_JSON.is_file():
        print(f"Error: Input file '{INPUT_JSON}' not found.", file=sys.stderr)
        sys.exit(1)

    # Load the JSON response
    try:
        with INPUT_JSON.open("r", encoding="utf-8") as f:
            data = json.load(f)
    except json.JSONDecodeError as e:
        print(f"Error: Failed to parse JSON in '{INPUT_JSON}': {e}", file=sys.stderr)
        sys.exit(1)

    # Validate structure
    if not data.get("success", False):
        print("Error: API reported failure.", file=sys.stderr)
        errors = data.get("errors") or data.get("messages")
        if errors:
            print("Details:", errors, file=sys.stderr)
        sys.exit(1)

    if "result" not in data:
        print("Error: 'result' field not found in JSON.", file=sys.stderr)
        sys.exit(1)

    # Extract and write the Markdown
    markdown_content = data["result"]
    try:
        with OUTPUT_MD.open("w", encoding="utf-8") as md_file:
            md_file.write(markdown_content)
    except IOError as e:
        print(f"Error: Could not write to '{OUTPUT_MD}': {e}", file=sys.stderr)
        sys.exit(1)

    print(f"Success: Markdown content written to '{OUTPUT_MD}'.")


if __name__ == "__main__":
    main()
```

### Usage

1. Ensure you have run the `curl` command to produce `autorag-full-response.json`.

2. Place `extract_markdown.py` in the same directory.

3. Run:

```
python3 extract_markdown.py
```

After execution, `autorag-blog.md` will contain the extracted Markdown.

## Final folder structure

After following these steps, your working folder will look like:

```
.
├── autorag-full-response.json # Full API response
├── autorag-no-assets.json # Full API response without extra assets (optional)
├── autorag-blog.md # Extracted Markdown content
└── extract_markdown.py # Python extraction script (optional)
```