gemma-sea-lion-v4-27b-it
Text Generation • aisingapore

SEA-LION (Southeast Asian Languages In One Network) is a collection of Large Language Models (LLMs) pretrained and instruct-tuned for the Southeast Asia (SEA) region.
| Model Info | |
| --- | --- |
| Context Window | 128,000 tokens |
| Unit Pricing | $0.35 per M input tokens, $0.56 per M output tokens |
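At these per-million-token rates, request cost is simple arithmetic; a quick sketch (the token counts are illustrative, not from a real response):

```python
# Unit pricing for this model: $0.35 per million input tokens,
# $0.56 per million output tokens.
INPUT_PRICE_PER_M = 0.35
OUTPUT_PRICE_PER_M = 0.56

def estimate_cost(input_tokens: int, output_tokens: int) -> float:
    """Estimate the cost of one request in USD from its token counts."""
    return (input_tokens / 1_000_000) * INPUT_PRICE_PER_M + \
           (output_tokens / 1_000_000) * OUTPUT_PRICE_PER_M

# Example: a 10,000-token prompt producing a 2,000-token completion
print(round(estimate_cost(10_000, 2_000), 5))  # → 0.00462
```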
Playground
Try out this model with the Workers AI LLM Playground. It requires no setup or authentication and is an instant way to preview and test a model directly in the browser.

Launch the LLM Playground

Usage
Worker - Streaming
```ts
export interface Env {
  AI: Ai;
}

export default {
  async fetch(request, env): Promise<Response> {
    const messages = [
      { role: "system", content: "You are a friendly assistant" },
      {
        role: "user",
        content: "What is the origin of the phrase Hello, World",
      },
    ];

    const stream = await env.AI.run("@cf/aisingapore/gemma-sea-lion-v4-27b-it", {
      messages,
      stream: true,
    });

    return new Response(stream, {
      headers: { "content-type": "text/event-stream" },
    });
  },
} satisfies ExportedHandler<Env>;
```
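With stream: true the output arrives as Server-Sent Events, each line of the form `data: {...}` with the stream terminated by `data: [DONE]`. A minimal parsing sketch; the `response` field name inside each event is an assumption based on Workers AI's usual text-generation payload, so verify it against a live stream:

```python
import json

def collect_sse_text(lines):
    """Accumulate generated text from Workers AI SSE event lines.

    Each event line looks like 'data: {"response": "..."}'; the
    stream ends with a sentinel line 'data: [DONE]'.
    """
    parts = []
    for line in lines:
        line = line.strip()
        if not line.startswith("data: "):
            continue  # skip blank keep-alive lines and comments
        payload = line[len("data: "):]
        if payload == "[DONE]":
            break
        event = json.loads(payload)
        parts.append(event.get("response", ""))
    return "".join(parts)

# Canned event lines, as they might arrive over the wire
events = [
    'data: {"response": "Hello"}',
    'data: {"response": ", World"}',
    "data: [DONE]",
]
print(collect_sse_text(events))  # → Hello, World
```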
Worker
```ts
export interface Env {
  AI: Ai;
}

export default {
  async fetch(request, env): Promise<Response> {
    const messages = [
      { role: "system", content: "You are a friendly assistant" },
      {
        role: "user",
        content: "What is the origin of the phrase Hello, World",
      },
    ];
    const response = await env.AI.run("@cf/aisingapore/gemma-sea-lion-v4-27b-it", {
      messages,
    });

    return Response.json(response);
  },
} satisfies ExportedHandler<Env>;
```
Python
```python
import os

import requests

ACCOUNT_ID = "your-account-id"
AUTH_TOKEN = os.environ.get("CLOUDFLARE_AUTH_TOKEN")

prompt = "Tell me all about PEP-8"
response = requests.post(
    f"https://api.cloudflare.com/client/v4/accounts/{ACCOUNT_ID}/ai/run/@cf/aisingapore/gemma-sea-lion-v4-27b-it",
    headers={"Authorization": f"Bearer {AUTH_TOKEN}"},
    json={
        "messages": [
            {"role": "system", "content": "You are a friendly assistant"},
            {"role": "user", "content": prompt},
        ]
    },
)
result = response.json()
print(result)
```
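The REST endpoint wraps the model output in the standard Cloudflare API envelope, with the generation under `result` when `success` is true. A small helper sketch; the exact field name inside `result` ("response" for plain generations, "choices" for chat-completion-shaped ones) is an assumption to verify against a live response:

```python
def extract_text(api_json: dict) -> str:
    """Pull the generated text out of a Cloudflare API envelope."""
    if not api_json.get("success", False):
        raise RuntimeError(f"API error: {api_json.get('errors')}")
    result = api_json["result"]
    # Chat-completion-shaped results expose choices; simpler ones a "response" field.
    if "choices" in result:
        return result["choices"][0]["message"]["content"]
    return result.get("response", "")

# A canned envelope, shaped like a successful API response
sample = {
    "success": True,
    "errors": [],
    "result": {"response": "PEP-8 is Python's style guide."},
}
print(extract_text(sample))  # → PEP-8 is Python's style guide.
```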
curl
```sh
curl https://api.cloudflare.com/client/v4/accounts/$CLOUDFLARE_ACCOUNT_ID/ai/run/@cf/aisingapore/gemma-sea-lion-v4-27b-it \
  -X POST \
  -H "Authorization: Bearer $CLOUDFLARE_AUTH_TOKEN" \
  -d '{ "messages": [{ "role": "system", "content": "You are a friendly assistant" }, { "role": "user", "content": "Why is pizza so good" }]}'
```
Parameters
* indicates a required field
Input

One of three request shapes is accepted (the full definitions are in the JSON schema under API Schemas):

- Prompt: an object with a required `prompt` field, plus optional `lora`, `response_format`, and the common generation options (`raw`, `stream`, `max_tokens`, `temperature`, `top_p`, `top_k`, `seed`, `repetition_penalty`, `frequency_penalty`, `presence_penalty`).
- Messages: an object with a required `messages` array, plus optional `functions`, `tools`, `response_format`, and the same generation options.
- Async Batch: an object with a required `requests` array whose items are themselves Prompt or Messages objects.
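The accepted input shapes can be illustrated with plain dictionaries; a sketch of a Messages request, a Prompt request, and an Async Batch wrapping both (field names come from the input schema on this page; the values are illustrative):

```python
import json

# A Messages-style request: required "messages", optional generation options.
messages_request = {
    "messages": [
        {"role": "system", "content": "You are a friendly assistant"},
        {"role": "user", "content": "Translate 'good morning' to Malay"},
    ],
    "max_tokens": 256,
    "temperature": 0.6,
}

# A Prompt-style request: required "prompt".
prompt_request = {"prompt": "Tell me about SEA-LION", "max_tokens": 128}

# An Async Batch request: required "requests" array of the shapes above.
batch_request = {"requests": [messages_request, prompt_request]}

print(json.dumps(batch_request, indent=2))
```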
Output
One of the following response shapes is returned.

Chat Completion Response (object)

- id (string): Unique identifier for the completion
- object (string): Object type identifier ("chat.completion")
- created (number): Unix timestamp of when the completion was created
- model (string): Model used for the completion
- choices (array): List of completion choices; each item contains:
  - index (number): Index of the choice in the list
  - message (object): The message generated by the model:
    - role (string, required): Role of the message author
    - content (string, required): The content of the message
    - reasoning_content (string): Internal reasoning content (if available)
    - tool_calls (array): Tool calls made by the assistant; each item contains:
      - id (string, required): Unique identifier for the tool call
      - type (string, required): Type of tool call ("function")
      - function (object, required): name (string, required), the name of the function to call; arguments (string, required), a JSON string of arguments for the function
  - finish_reason (string): Reason why the model stopped generating
  - stop_reason (string): Stop reason (may be null)
  - logprobs (object): Log probabilities (if requested)
- usage: usage statistics (see the shared usage schema)
- prompt_logprobs (object): Log probabilities for the prompt (if requested)

Text Completion Response (object)

- id (string): Unique identifier for the completion
- object (string): Object type identifier ("text_completion")
- created (number): Unix timestamp of when the completion was created
- model (string): Model used for the completion
- choices (array): List of completion choices; each item contains:
  - index (number, required): Index of the choice in the list
  - text (string, required): The generated text completion
  - finish_reason (string, required): Reason why the model stopped generating
  - stop_reason (string): Stop reason (may be null)
  - logprobs (object): Log probabilities (if requested)
  - prompt_logprobs (object): Log probabilities for the prompt (if requested)
- usage: usage statistics (see the shared usage schema)

Streaming Response (string): Server-Sent Events delivered as text/event-stream

Async Response (object)

- request_id (string): The async request id that can be used to obtain the results.
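A chat-completion payload shaped like the schema above can be walked directly. A sketch with a canned response (all values are illustrative); note that `function.arguments` arrives as a JSON string, not a parsed object:

```python
import json

# A canned response matching the Chat Completion Response shape.
response = {
    "id": "cmpl-123",
    "object": "chat.completion",
    "created": 1735689600,
    "model": "@cf/aisingapore/gemma-sea-lion-v4-27b-it",
    "choices": [
        {
            "index": 0,
            "message": {
                "role": "assistant",
                "content": "",
                "tool_calls": [
                    {
                        "id": "call-1",
                        "type": "function",
                        "function": {
                            "name": "get_weather",
                            "arguments": "{\"city\": \"Singapore\"}",
                        },
                    }
                ],
            },
            "finish_reason": "tool_calls",
        }
    ],
}

choice = response["choices"][0]
if choice["message"].get("tool_calls"):
    call = choice["message"]["tool_calls"][0]
    # "arguments" is a JSON string per the schema, so parse it before use.
    args = json.loads(call["function"]["arguments"])
    print(call["function"]["name"], args["city"])  # → get_weather Singapore
else:
    print(choice["message"]["content"])
```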
API Schemas
The following schemas are based on JSON Schema.
```json
{
  "$id": "http://ai.cloudflare.com/schemas/textGenerationInput",
  "type": "object",
  "oneOf": [
    {
      "title": "Prompt",
      "properties": {
        "$merge": {
          "source": {
            "prompt": { "$ref": "textGenerationPrompts#/prompt" },
            "lora": { "$ref": "textGenerationFinetune#/lora" },
            "response_format": { "$ref": "jsonMode#/response_format" }
          },
          "with": {
            "raw": { "type": "boolean", "default": false, "description": "If true, a chat template is not applied and you must adhere to the specific model's expected formatting." },
            "stream": { "type": "boolean", "default": false, "description": "If true, the response will be streamed back incrementally using SSE, Server Sent Events." },
            "max_tokens": { "type": "integer", "default": 2000, "description": "The maximum number of tokens to generate in the response." },
            "temperature": { "type": "number", "default": 0.6, "minimum": 0, "maximum": 5, "description": "Controls the randomness of the output; higher values produce more random results." },
            "top_p": { "type": "number", "minimum": 0.001, "maximum": 1, "description": "Adjusts the creativity of the AI's responses by controlling how many possible words it considers. Lower values make outputs more predictable; higher values allow for more varied and creative responses." },
            "top_k": { "type": "integer", "minimum": 1, "maximum": 50, "description": "Limits the AI to choose from the top 'k' most probable words. Lower values make responses more focused; higher values introduce more variety and potential surprises." },
            "seed": { "type": "integer", "minimum": 1, "maximum": 9999999999, "description": "Random seed for reproducibility of the generation." },
            "repetition_penalty": { "type": "number", "minimum": 0, "maximum": 2, "description": "Penalty for repeated tokens; higher values discourage repetition." },
            "frequency_penalty": { "type": "number", "minimum": -2, "maximum": 2, "description": "Decreases the likelihood of the model repeating the same lines verbatim." },
            "presence_penalty": { "type": "number", "minimum": -2, "maximum": 2, "description": "Increases the likelihood of the model introducing new topics." }
          }
        }
      },
      "required": [ "prompt" ]
    },
    {
      "title": "Messages",
      "properties": {
        "$merge": {
          "source": {
            "messages": { "$ref": "textGenerationPrompts#/messages" },
            "functions": { "$ref": "textGenerationTools#/functions" },
            "tools": { "$ref": "textGenerationTools#/tools" },
            "response_format": { "$ref": "jsonMode#/response_format" }
          },
          "with": {
            "raw": { "type": "boolean", "default": false, "description": "If true, a chat template is not applied and you must adhere to the specific model's expected formatting." },
            "stream": { "type": "boolean", "default": false, "description": "If true, the response will be streamed back incrementally using SSE, Server Sent Events." },
            "max_tokens": { "type": "integer", "default": 2000, "description": "The maximum number of tokens to generate in the response." },
            "temperature": { "type": "number", "default": 0.6, "minimum": 0, "maximum": 5, "description": "Controls the randomness of the output; higher values produce more random results." },
            "top_p": { "type": "number", "minimum": 0.001, "maximum": 1, "description": "Adjusts the creativity of the AI's responses by controlling how many possible words it considers. Lower values make outputs more predictable; higher values allow for more varied and creative responses." },
            "top_k": { "type": "integer", "minimum": 1, "maximum": 50, "description": "Limits the AI to choose from the top 'k' most probable words. Lower values make responses more focused; higher values introduce more variety and potential surprises." },
            "seed": { "type": "integer", "minimum": 1, "maximum": 9999999999, "description": "Random seed for reproducibility of the generation." },
            "repetition_penalty": { "type": "number", "minimum": 0, "maximum": 2, "description": "Penalty for repeated tokens; higher values discourage repetition." },
            "frequency_penalty": { "type": "number", "minimum": -2, "maximum": 2, "description": "Decreases the likelihood of the model repeating the same lines verbatim." },
            "presence_penalty": { "type": "number", "minimum": -2, "maximum": 2, "description": "Increases the likelihood of the model introducing new topics." }
          }
        }
      },
      "required": [ "messages" ]
    },
    {
      "title": "Async Batch",
      "type": "object",
      "properties": {
        "requests": {
          "type": "array",
          "items": {
            "type": "object",
            "oneOf": [
              { "title": "Prompt", "properties": { "$merge": { "source": { "prompt": { "$ref": "textGenerationPrompts#/prompt" }, "lora": { "$ref": "textGenerationFinetune#/lora" }, "response_format": { "$ref": "jsonMode#/response_format" } }, "with": { "$ref": "textGenerationOptions#/common" } } }, "required": [ "prompt" ] },
              { "title": "Messages", "properties": { "$merge": { "source": { "messages": { "$ref": "textGenerationPrompts#/messages" }, "functions": { "$ref": "textGenerationTools#/functions" }, "tools": { "$ref": "textGenerationTools#/tools" }, "response_format": { "$ref": "jsonMode#/response_format" } }, "with": { "$ref": "textGenerationOptions#/common" } } }, "required": [ "messages" ] }
            ]
          }
        }
      },
      "required": [ "requests" ]
    }
  ]
}
```
```json
{
  "oneOf": [
    {
      "type": "object",
      "contentType": "application/json",
      "title": "Chat Completion Response",
      "properties": {
        "id": { "type": "string", "description": "Unique identifier for the completion" },
        "object": { "type": "string", "enum": [ "chat.completion" ], "description": "Object type identifier" },
        "created": { "type": "number", "description": "Unix timestamp of when the completion was created" },
        "model": { "type": "string", "description": "Model used for the completion" },
        "choices": {
          "type": "array",
          "description": "List of completion choices",
          "items": {
            "type": "object",
            "properties": {
              "index": { "type": "number", "description": "Index of the choice in the list" },
              "message": {
                "type": "object",
                "description": "The message generated by the model",
                "properties": {
                  "role": { "type": "string", "description": "Role of the message author" },
                  "content": { "type": "string", "description": "The content of the message" },
                  "reasoning_content": { "type": "string", "description": "Internal reasoning content (if available)" },
                  "tool_calls": {
                    "type": "array",
                    "description": "Tool calls made by the assistant",
                    "items": {
                      "type": "object",
                      "properties": {
                        "id": { "type": "string", "description": "Unique identifier for the tool call" },
                        "type": { "type": "string", "enum": [ "function" ], "description": "Type of tool call" },
                        "function": {
                          "type": "object",
                          "properties": {
                            "name": { "type": "string", "description": "Name of the function to call" },
                            "arguments": { "type": "string", "description": "JSON string of arguments for the function" }
                          },
                          "required": [ "name", "arguments" ]
                        }
                      },
                      "required": [ "id", "type", "function" ]
                    }
                  }
                },
                "required": [ "role", "content" ]
              },
              "finish_reason": { "type": "string", "description": "Reason why the model stopped generating" },
              "stop_reason": { "type": [ "string", "null" ], "description": "Stop reason (may be null)" },
              "logprobs": { "type": [ "object", "null" ], "description": "Log probabilities (if requested)" }
            }
          }
        },
        "usage": { "$ref": "usage#/usage" },
        "prompt_logprobs": { "type": [ "object", "null" ], "description": "Log probabilities for the prompt (if requested)" }
      }
    },
    {
      "type": "object",
      "contentType": "application/json",
      "title": "Text Completion Response",
      "properties": {
        "id": { "type": "string", "description": "Unique identifier for the completion" },
        "object": { "type": "string", "enum": [ "text_completion" ], "description": "Object type identifier" },
        "created": { "type": "number", "description": "Unix timestamp of when the completion was created" },
        "model": { "type": "string", "description": "Model used for the completion" },
        "choices": {
          "type": "array",
          "description": "List of completion choices",
          "items": {
            "type": "object",
            "properties": {
              "index": { "type": "number", "description": "Index of the choice in the list" },
              "text": { "type": "string", "description": "The generated text completion" },
              "finish_reason": { "type": "string", "description": "Reason why the model stopped generating" },
              "stop_reason": { "type": [ "string", "null" ], "description": "Stop reason (may be null)" },
              "logprobs": { "type": [ "object", "null" ], "description": "Log probabilities (if requested)" },
              "prompt_logprobs": { "type": [ "object", "null" ], "description": "Log probabilities for the prompt (if requested)" }
            },
            "required": [ "index", "text", "finish_reason" ]
          }
        },
        "usage": { "$ref": "usage#/usage" }
      }
    },
    { "type": "string", "contentType": "text/event-stream", "format": "binary" },
    {
      "type": "object",
      "contentType": "application/json",
      "title": "Async response",
      "properties": {
        "request_id": { "type": "string", "description": "The async request id that can be used to obtain the results." }
      }
    }
  ]
}
```