RegExp shortcut for structured outputs #91

Closed
domenic opened this issue Mar 28, 2025 · 14 comments · Fixed by #108
Labels
enhancement New feature or request

Comments

@domenic
Collaborator

domenic commented Mar 28, 2025

@clarkduvall brought up that it would be useful to extend our JSON schema structured output support with a shortcut for constraining the string output to follow a regexp.

Consider the case where you want to have the language model output a valid email address. Since JSON schemas allow specifying regexps already, you could do this like so:

const validEmailRegExp = /^[a-zA-Z0-9.!#$%&'*+\/=?^_`{|}~-]+@[a-zA-Z0-9](?:[a-zA-Z0-9-]{0,61}[a-zA-Z0-9])?(?:\.[a-zA-Z0-9](?:[a-zA-Z0-9-]{0,61}[a-zA-Z0-9])?)*$/;

const responseJSONSchema = new LanguageModelResponseSchema({
  type: "string",
  pattern: validEmailRegExp.source
});

const result = await session.prompt(
  `Create a fictional email address for ${characterName}.`,
  { responseJSONSchema }
);
const emailAddress = JSON.parse(result); // don't forget this step!

Maybe we should allow something more direct, such as:

const emailAddress = await session.prompt(
  `Create a fictional email address for ${characterName}.`,
  { responseRegExp: validEmailRegExp }
);

Above I've provided one possible API. One slightly unsatisfying part is that the API shape allows specifying both responseJSONSchema and responseRegExp at the same time. (We would throw an error if the web developer does this.)

Another option is to rename the existing responseJSONSchema option into something like responseFormat or responseConstraint, which can accept either a regexp or a JSON schema. This seems slightly cleaner to me right now.

/cc @sushraja-msft for any thoughts, as the person who originated the structured outputs work!
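A minimal sketch of the mutual-exclusion check mentioned above, assuming the two-option API shape. The helper name `assertSingleResponseConstraint` and the choice of `TypeError` are illustrative assumptions; only the `responseJSONSchema` and `responseRegExp` option names come from the proposal.

```javascript
// Hypothetical helper: rejects option bags that set both constraint options.
// Throwing TypeError is an assumption; the proposal only says "an error".
function assertSingleResponseConstraint(options = {}) {
  const hasSchema = options.responseJSONSchema !== undefined;
  const hasRegExp = options.responseRegExp !== undefined;
  if (hasSchema && hasRegExp) {
    throw new TypeError(
      "Specify either responseJSONSchema or responseRegExp, not both."
    );
  }
  return options;
}
```

A single option that accepts either type would make this check unnecessary, which is the appeal of the second approach.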

@domenic domenic added the enhancement New feature or request label Mar 28, 2025
@clarkduvall

Another option is to not tie this to JSON at all. Not 100% sure, but requiring the model to output as a JSON string may reduce quality slightly (or need a bit more prompt engineering) since it will need to output " before/after the result. For a simple case where you just want e.g. "yes" or "no" I would think it would be more straightforward to have the model output this directly instead of having to JSON.parse it. WDYT?

@domenic
Collaborator Author

domenic commented Apr 1, 2025

> Another option is to not tie this to JSON at all. Not 100% sure, but requiring the model to output as a JSON string may reduce quality slightly (or need a bit more prompt engineering) since it will need to output " before/after the result. For a simple case where you just want e.g. "yes" or "no" I would think it would be more straightforward to have the model output this directly instead of having to JSON.parse it. WDYT?

This seems reasonable, although in the spec we could explain it either way. It's just an implementation strategy choice.

> Not sure if it's worth opening an extra Issue for this

This seems unrelated to this issue, so I'll mark it off-topic.

@clarkduvall

> This seems reasonable, although in the spec we could explain it either way. It's just an implementation strategy choice.

Do you mean internally we could implement as straight regex (not JSON schema) then artificially add the " to make valid JSON (and escape stuff etc)? I do worry that would be complicated because then the model's internal representation of the output would not match what is shown to the API. This could make it harder to reason about future prompts, token counts etc. Forcing the model to output JSON could either reduce quality or require additional prompt engineering in cases where a simpler response is desired.

@domenic
Collaborator Author

domenic commented Apr 1, 2025

No, sorry for not being more detailed. I meant that if we add this feature, we could explain it to developers and in the specification as a shorthand for the JSON schema-based version mentioned in the original post. Under the hood, the browser code might implement the feature with a completely separate regexp-specialized code path, that has nothing to do with JSON. But that's an implementation detail, that web developers don't need to know about, and which we could choose to leave out of the specification since it's observably equivalent to just reusing the JSON-based infrastructure.

@clarkduvall

Ah I see. My main point was that if the spec requires us to return as a JSON string (quoted with " to make it a valid JSON parseable string) rather than just raw text that matches the regex, that could cause some impl complications

@clarkduvall

To clarify a bit with an example, I think there's a semantic difference between a shorthand for a JSON schema regex field vs a regex option separate from JSON schema altogether. For example, if this is just JSON schema shorthand, you may get better results using the prompt:

Respond to the following question with Yes or No formatted as a JSON string: Do you like cheese?

Responds with:

"Yes"

Versus if this is just a regex constraint on the output, the prompt could be:

Respond to the following question with Yes or No: Do you like cheese?

Responds with:

Yes

In general, the prompt should attempt to get the model to produce output matching the constraints, so whether this is shorthand for a JSON schema regex field or just regex can make a difference to how to structure the prompt. Allowing just regex may get better quality with simpler prompts, along with simplifying output handling.
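To make the difference concrete, here is a small sketch of how the same regexp would be checked against the model's raw output under each interpretation. The function names are hypothetical; only `JSON.parse` and `RegExp.prototype.test` are real APIs.

```javascript
const yesNo = /^(Yes|No)$/;

// Interpretation 1: plain regexp constraint. The raw output itself must match.
function matchesAsRawText(output, regexp) {
  return regexp.test(output);
}

// Interpretation 2: JSON-schema shorthand. The raw output must be a JSON
// string literal whose decoded value matches the regexp.
function matchesAsJSONString(output, regexp) {
  let value;
  try {
    value = JSON.parse(output);
  } catch {
    return false; // not valid JSON at all
  }
  return typeof value === "string" && regexp.test(value);
}
```

Under the first interpretation `Yes` is acceptable output and `"Yes"` is not; under the second it is the other way around, which is why the prompt wording matters.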

@domenic
Collaborator Author

domenic commented Apr 1, 2025

Right, I was implicitly assuming we'd include a step in the spec that does JSON.parse(output) for you in this "shortcut". (If we specced it as a shortcut.)
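A rough sketch of what that "shortcut" reading could look like if written in userland, assuming a `session.prompt(text, options)` method as in the original post. The names `regExpToJSONSchema` and `promptWithRegExp` are hypothetical, and passing a plain object instead of the `LanguageModelResponseSchema` wrapper is an assumption made to keep the sketch self-contained.

```javascript
// Hypothetical: express a RegExp as the equivalent JSON schema.
// Note that `source` drops any flags, and a JSON Schema `pattern` is only
// anchored if the regexp itself uses ^ and $.
function regExpToJSONSchema(regexp) {
  if (!(regexp instanceof RegExp)) {
    throw new TypeError("Expected a RegExp.");
  }
  return { type: "string", pattern: regexp.source };
}

// Hypothetical wrapper: prompts with the schema, then parses for the caller.
async function promptWithRegExp(session, text, regexp) {
  const raw = await session.prompt(text, {
    responseJSONSchema: regExpToJSONSchema(regexp),
  });
  return JSON.parse(raw); // the step the shortcut would perform for you
}
```

With this shape the caller receives `Yes` rather than `"Yes"`, but the model itself still has to emit the quoted JSON form.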

@clarkduvall

Yeah I think speccing as a "shortcut" would introduce confusion as to how to prompt (e.g. whether to try to get the model to generate JSON string or not even if we parse for you behind the scenes), so may be clearer to spec as an alternative constraint type to JSON schema.

@sushraja-msft
Contributor

Yeah, this would work, llguidance supports regex. The preference for JSONSchema earlier was because

@clarkduvall

If we switch to responseFormat, then the logic would be something like:

if responseFormat type is string or regex:
  treat as regex
else if responseFormat type is object:
  treat as JSON schema
else
  throw error for unsupported type

is that right? I think that seems cleaner rather than having two keys that are mutually exclusive.

@clarkduvall

Also I may prefer responseConstraint over responseFormat, as "format" seems a bit like more generically saying I want JSON/XML/YAML rather than specifying a schema/constraint. Is there any precedent for naming this type of thing in other APIs?

@domenic
Collaborator Author

domenic commented Apr 24, 2025

> If we switch to responseFormat, then the logic would be something like:

I think we would throw an error for string, instead of treating strings as regexps. But otherwise, yeah, that would be the idea.
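As a sketch, the dispatch with strings rejected might look like the following. The function name `classifyResponseConstraint`, the returned shape, and the use of `TypeError` are all assumptions for illustration, not anything the spec defines.

```javascript
// Hypothetical classifier for a single constraint option that accepts either
// a RegExp or a JSON schema object, using runtime type testing.
function classifyResponseConstraint(constraint) {
  if (constraint instanceof RegExp) {
    return { kind: "regexp", regexp: constraint };
  }
  if (typeof constraint === "object" && constraint !== null) {
    return { kind: "json-schema", schema: constraint };
  }
  // Strings are rejected rather than being treated as regexp source.
  throw new TypeError("Unsupported response constraint type.");
}
```

The `RegExp` check has to come first, because a regexp is also an object and would otherwise be treated as a schema.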

> Also I may prefer responseConstraint over responseFormat, as "format" seems a bit like more generically saying I want JSON/XML/YAML rather than specifying a schema/constraint. Is there any precedent for naming this type of thing in other APIs?

I agree with your instincts here.

The precedent seems to be mostly "format"...

  • OpenAI and copycats (Mistral, Cohere, DeepSeek): response_format: { "type":"json_object" | "json_schema", "schema": {…} }
  • Anthropic: format:"json"
  • Google: responseMimeType:"application/json"

But arguably this is an artifact of how, for JSON schema, both the constraint language and the output format are "json".

Apparently only one smaller API has RegExp support:

  • Perplexity/Sonar: response_format: { type: "regex", regex: {"regex": str} }

Perhaps the root of the issue is that all of these HTTP APIs (and the JS SDKs that ape their design) are working with pure JSON as inputs, instead of JavaScript. So they often need these "type" tag fields, which we do not since we can do runtime type testing.

So overall I think responseConstraint is the current best option, pending any further web developer feedback.

@victornpb

At first glance I liked the second proposal to have a single property, but then I was thinking

"responseFormat" being populated with an Object or RegExp, but the actual response is going to be a string or an object if being implicitly parsed, but for regex this doesn't make much sense as you would get a string response. We are just validating the string against the responseFormat, thus calling it "format" doesn't seem like the right fit if that makes sense.

Since this proposal is about syntactic sugar for another underlying API, maybe it should just surface the same terms used by that API instead of introducing new ones (assuming those have already been bikeshedded, and for consistency, assuming we are not introducing new behaviour).

responseJSONSchema : new LanguageModelResponseSchema({
  type: "string",
  pattern: RegExp
})

so maybe the shorthand for the above could be:

await session.prompt(
  `Create a fictional email address for ${characterName}.`,
  { pattern: RegExp }
)

And defining both should throw imo.

@domenic
Collaborator Author

domenic commented Apr 24, 2025

> "responseFormat" being populated with an Object or RegExp, but the actual response is going to be a string or an object if being implicitly parsed,

That's why we changed it to "constraint" instead of format, per the most recent two messages discussing it :)

4 participants