RegExp shortcut for structured outputs #91

domenic · 2025-03-28T06:11:01Z

@clarkduvall brought up that it would be useful to extend our JSON schema structured output support with a shortcut for constraining the string output to follow a regexp.

Consider the case where you want to have the language model output a valid email address. Since JSON schemas allow specifying regexps already, you could do this like so:

const validEmailRegExp = /^[a-zA-Z0-9.!#$%&'*+\/=?^_`{|}~-]+@[a-zA-Z0-9](?:[a-zA-Z0-9-]{0,61}[a-zA-Z0-9])?(?:\.[a-zA-Z0-9](?:[a-zA-Z0-9-]{0,61}[a-zA-Z0-9])?)*$/;

const responseJSONSchema = new LanguageModelResponseSchema({
  type: "string",
  pattern: validEmailRegExp.source
});

const result = await session.prompt(
  `Create a fictional email address for ${characterName}.`,
  { responseJSONSchema }
);
const emailAddress = JSON.parse(result); // don't forget this step!

Maybe we should allow something more direct, such as:

const emailAddress = await session.prompt(
  `Create a fictional email address for ${characterName}.`,
  { responseRegExp: validEmailRegExp }
);

Above I've provided one possible API. One slightly unsatisfying part is that the API shape allows specifying both responseJSONSchema and responseRegExp at the same time. (We would throw an error if the web developer does this.)
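
A minimal sketch of that mutual-exclusion check follows. The helper name `checkResponseOptions` is hypothetical, and using `TypeError` is an assumption, since nothing above says which exception type would be thrown.

```javascript
// Hypothetical helper: rejects option bags that set both constraint options.
// Using TypeError is an assumption; the proposal only says an error is thrown.
function checkResponseOptions(options = {}) {
  const hasSchema = options.responseJSONSchema !== undefined;
  const hasRegExp = options.responseRegExp !== undefined;
  if (hasSchema && hasRegExp) {
    throw new TypeError(
      "Specify only one of responseJSONSchema and responseRegExp."
    );
  }
  // Report which kind of constraint, if any, applies to this call.
  return hasSchema ? "json-schema" : hasRegExp ? "regexp" : "unconstrained";
}
```

A check like this would have to run on every call, which is part of why a single option looks cleaner.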

Another option is to rename the existing responseJSONSchema option into something like responseFormat or responseConstraint, which can accept either a regexp or a JSON schema. This seems slightly cleaner to me right now.

/cc @sushraja-msft for any thoughts, as the person who originated the structured outputs work!

clarkduvall · 2025-03-28T18:43:49Z

Another option is to not tie this to JSON at all. Not 100% sure, but requiring the model to output as a JSON string may reduce quality slightly (or need a bit more prompt engineering) since it will need to output " before/after the result. For a simple case where you just want e.g. "yes" or "no" I would think it would be more straightforward to have the model output this directly instead of having to JSON.parse it. WDYT?

domenic · 2025-04-01T02:26:50Z

> Another option is to not tie this to JSON at all. Not 100% sure, but requiring the model to output as a JSON string may reduce quality slightly (or need a bit more prompt engineering) since it will need to output " before/after the result. For a simple case where you just want e.g. "yes" or "no" I would think it would be more straightforward to have the model output this directly instead of having to JSON.parse it. WDYT?

This seems reasonable, although in the spec we could explain it either way. It's just an implementation strategy choice.

Not sure if it's worth opening an extra Issue for this

This seems unrelated to this issue, so I'll mark it off-topic.

clarkduvall · 2025-04-01T02:43:26Z

> This seems reasonable, although in the spec we could explain it either way. It's just an implementation strategy choice.

Do you mean internally we could implement as straight regex (not JSON schema) then artificially add the " to make valid JSON (and escape stuff etc)? I do worry that would be complicated because then the model's internal representation of the output would not match what is shown to the API. This could make it harder to reason about future prompts, token counts etc. Forcing the model to output JSON could either reduce quality or require additional prompt engineering in cases where a simpler response is desired.

domenic · 2025-04-01T03:16:43Z

No, sorry for not being more detailed. I meant that if we add this feature, we could explain it to developers and in the specification as a shorthand for the JSON schema-based version mentioned in the original post. Under the hood, the browser code might implement the feature with a completely separate regexp-specialized code path that has nothing to do with JSON. But that's an implementation detail that web developers don't need to know about, and one we could choose to leave out of the specification, since it's observably equivalent to just reusing the JSON-based infrastructure.
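
One way to picture the "observably equivalent" claim, as a sketch and not the actual implementation: if a regexp-specialized path hands back raw text, and the JSON-based path produces the same text as a JSON string literal that is parsed before being returned, the developer sees the same value. Both function names below are hypothetical, and the model output is simulated.

```javascript
// Hypothetical stand-in for a regexp-specialized path: returns the raw text.
function viaRegExpPath(modelText) {
  return modelText;
}

// Hypothetical stand-in for the JSON-based path: the model's output would be
// a JSON string literal (simulated here with JSON.stringify), which is parsed
// before the value is handed to the developer.
function viaJSONPath(modelText) {
  const jsonLiteral = JSON.stringify(modelText);
  return JSON.parse(jsonLiteral);
}

// A value containing characters that need escaping inside a JSON string.
const sample = 'a "quoted" value with a \\ backslash';
const sameResult = viaRegExpPath(sample) === viaJSONPath(sample);
```

This only covers what the developer observes in the returned value; it says nothing about what the model itself generated.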

clarkduvall · 2025-04-01T03:34:44Z

Ah I see. My main point was that if the spec requires us to return the result as a JSON string (quoted with " to make it a valid JSON-parseable string) rather than just raw text that matches the regex, that could cause some impl complications.

clarkduvall · 2025-04-01T05:16:45Z

To clarify a bit with an example, I think there's a semantic difference between a shorthand for a JSON schema regex field vs a regex option separate from JSON schema altogether. For example, if this is just JSON schema shorthand, you may get better results using the prompt:

Respond to the following question with Yes or No formatted as a JSON string: Do you like cheese?

Responds with:

"Yes"

Versus if this is just a regex constraint on the output, the prompt could be:

Respond to the following question with Yes or No: Do you like cheese?

Responds with:

Yes

In general, the prompt should attempt to get the model to produce output matching the constraints, so whether this is shorthand for a JSON schema regex field or just regex can make a difference to how to structure the prompt. Allowing just regex may get better quality with simpler prompts, along with simplifying output handling.
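
To make the output-handling difference concrete, here is a small sketch that uses only the standard `JSON.parse`. The two response strings are the ones from the example above; `readAnswer` is a hypothetical helper name.

```javascript
// Hypothetical helper: interprets a model response under either reading of
// the feature.
function readAnswer(rawResponse, isJSONShorthand) {
  // Under the JSON-shorthand reading the response is a JSON string literal
  // and must be parsed; under the plain-regex reading it is used as-is.
  return isJSONShorthand ? JSON.parse(rawResponse) : rawResponse;
}

const fromShorthand = readAnswer('"Yes"', true);
const fromPlainRegex = readAnswer("Yes", false);

// Parsing an unquoted answer as JSON fails, so the two readings are not
// interchangeable on the consuming side.
let parseFailed = false;
try {
  readAnswer("Yes", true);
} catch (e) {
  parseFailed = e instanceof SyntaxError;
}
```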

domenic · 2025-04-01T05:51:31Z

Right, I was implicitly assuming we'd include a step in the spec that does JSON.parse(output) for you in this "shortcut". (If we specced it as a shortcut.)

clarkduvall · 2025-04-01T06:01:29Z

Yeah I think speccing as a "shortcut" would introduce confusion as to how to prompt (e.g. whether to try to get the model to generate JSON string or not even if we parse for you behind the scenes), so may be clearer to spec as an alternative constraint type to JSON schema.

sushraja-msft · 2025-04-01T20:31:28Z

Yeah, this would work; llguidance supports regex. The preference for JSON Schema earlier was for two reasons:

The lack of a consensus standard for regex. It looks like ECMAScript has a specification for regular expressions, though, so this is no longer a concern for me. We can use llguidance and then check the response against the ECMAScript interpretation of the regex to make sure it matches before returning results to the user.
Both the OpenAI and Gemini APIs for service-based LLMs landed on JSON schema for specifying structured outputs: https://ai.google.dev/api/generate-content#v1beta.GenerationConfig, https://openai.com/index/introducing-structured-outputs-in-the-api/.
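
The post-generation check described above could look roughly like the following. This is a sketch under assumptions: `matchesConstraint` is a hypothetical name, and since nothing here specifies whether the match must cover the whole output, this version simply follows whatever anchors the pattern itself contains.

```javascript
// Hypothetical check: does the generated output match the developer's RegExp
// under ECMAScript semantics?
function matchesConstraint(output, regexp) {
  // Copy the pattern without the g and y flags so that RegExp.prototype.test
  // neither depends on nor mutates the caller's lastIndex.
  const flags = regexp.flags.replace(/[gy]/g, "");
  const stateless = new RegExp(regexp.source, flags);
  return stateless.test(output);
}

// Example constraint: exactly "Yes" or "No".
const yesNo = /^(?:Yes|No)$/;
const accepted = matchesConstraint("Yes", yesNo);
const rejected = matchesConstraint("Maybe", yesNo);
```

Dropping the `g` and `y` flags matters because a stateful RegExp would otherwise give different answers on repeated calls.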

clarkduvall · 2025-04-23T23:19:59Z

If we switch to responseFormat, then the logic would be something like:

if responseFormat type is string or regex:
  treat as regex
else if responseFormat type is object:
  treat as JSON schema
else:
  throw error for unsupported type

is that right? I think that seems cleaner rather than having two keys that are mutually exclusive.

clarkduvall · 2025-04-23T23:29:04Z

Also I may prefer responseConstraint over responseFormat, as "format" seems a bit like more generically saying I want JSON/XML/YAML rather than specifying a schema/constraint. Is there any precedent for naming this type of thing in other APIs?

domenic · 2025-04-24T03:49:05Z

> If we switch to responseFormat, then the logic would be something like:

I think we would throw an error for string, instead of treating strings as regexps. But otherwise, yeah, that would be the idea.
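
As a sketch of that dispatch with strings rejected, using runtime type testing: `classifyResponseConstraint` is a hypothetical name, and the choice of `TypeError` is an assumption.

```javascript
// Hypothetical dispatch: decides how a constraint value would be interpreted.
function classifyResponseConstraint(constraint) {
  // RegExp instances are also objects, so this test has to come first.
  if (constraint instanceof RegExp) {
    return "regexp";
  }
  // Any other non-null object is treated as a JSON schema.
  if (typeof constraint === "object" && constraint !== null) {
    return "json-schema";
  }
  // Strings, numbers, null, and so on are rejected; in particular a string
  // is not interpreted as regexp source.
  throw new TypeError("Unsupported response constraint type.");
}
```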

> Also I may prefer responseConstraint over responseFormat, as "format" seems a bit like more generically saying I want JSON/XML/YAML rather than specifying a schema/constraint. Is there any precedent for naming this type of thing in other APIs?

I agree with your instincts here.

The precedent seems to be mostly "format"...

OpenAI and copycats (Mistral, Cohere, DeepSeek): response_format: { "type":"json_object" | "json_schema", "schema": {…} }
Anthropic: format:"json"
Google: responseMimeType:"application/json"

But arguably this is an artifact of how, for JSON schema, both the constraint language and the output format are "json".

Apparently only one smaller API has RegExp support:

Perplexity/Sonar: response_format: { type: "regex", regex: {"regex": str} }

Perhaps the root of the issue is that all of these HTTP APIs (and the JS SDKs that ape their design) are working with pure JSON as inputs, instead of JavaScript. So they often need these "type" tag fields, which we do not since we can do runtime type testing.

So overall I think responseConstraint is the current best option, pending any further web developer feedback.

Closes #91.

victornpb · 2025-04-24T08:20:00Z

At first glance I liked the second proposal to have a single property, but then I was thinking:

"responseFormat" being populated with an Object or RegExp, but the actual response is going to be a string or an object if being implicitly parsed, but for regex this doesn't make much sense as you would get a string response. We are just validating the string against the responseFormat, so calling it "format" doesn't seem like the right fit, if that makes sense.

Since this proposal is about syntactic sugar for another underlying API, maybe it should just surface the same terms used by that API instead of introducing new ones (assuming those have already been bikeshedded and, for consistency, assuming we are not introducing new behaviour).

responseJSONSchema : new LanguageModelResponseSchema({
  type: "string",
  pattern: RegExp
})

so maybe the shorthand for the above could be:

await session.prompt(
  `Create a fictional email address for ${characterName}.`,
  { pattern: RegExp }
)

And defining both should throw imo.

domenic · 2025-04-24T08:51:14Z

"responseFormat" being populated with an Object or RegExp, but the actual response is going to be a string or an object if being implicitly parsed,

That's why we changed it to "constraint" instead of format, per the most recent two messages discussing it :)

domenic added the enhancement New feature or request label Mar 28, 2025

tomayac mentioned this issue Apr 2, 2025

Structured output with XML Schema #94

Closed

michaelwasserman mentioned this issue Apr 17, 2025

Clarify LanguageModelResponseSchema interface #102

Closed

domenic added a commit that referenced this issue Apr 24, 2025

Add RegExp constraints and change structured output API

b7db975

Closes #91.

domenic mentioned this issue Apr 24, 2025

Add RegExp constraints and change structured output API #108

Merged

domenic closed this as completed in 60b2ac8 Apr 24, 2025

domenic closed this as completed in #108 Apr 24, 2025

domenic mentioned this issue Apr 26, 2025

Does responseConstraint need a type? #109

Closed
