An output language support detection option #97

domenic · 2025-04-03T07:24:24Z

In Chrome, we only support the prompt API outputting certain human languages, which have been tested to give safe and sensible results.

Right now, there is no way for web developers to know what those supported languages are from the API. This could be problematic for sites which want to ensure that the language model output is of good quality. They could read each browser's documentation, but of course part of the point of the prompt API is to abstract over that.

It would be good to give developers who want to ensure they're only working with supported output languages some way of signaling the expected output language when using LanguageModel.create(), or checking LanguageModel.availability(). Then, the API could reject or return "unavailable" if the developer needs a guarantee from the browser that a given output language has been tested and is supported.

The tricky part of designing this API is that the option would have no impact on the actual output language. The actual output is just whatever the model decides to respond with, given its current prompt history. Unlike with the writing assistance APIs, we're not using high-level options to prompt-engineer or apply DPO or fine-tunings. The prompt API is about talking directly to the language model, with what you get back controlled by your prompts and your prompts alone.

So if we named this option something like outputLanguage, that would be misleading. Cases like

const session = await LanguageModel.create({ outputLanguage: "ja" });
console.log(await session.prompt("Write me a poem"));

would almost certainly give you a result back in English, not Japanese.

I think we could tackle this with just better naming. Maybe something as simple as expectedOutputLanguages? This kind of parallels the expectedInputLanguage of the writing assistance APIs, or the expectedInputLanguages of the language detector API. Both of those use the expected prefix to mean "I expect what follows will be from these languages, so please check for availability before creating this model object". But they don't influence the actual inputs, since those are web developer-provided and could be arbitrary. Our case is similar, except instead of web developer-provided inputs we have model-provided outputs.

Is expectedOutputLanguages clear enough that it won't actually influence the output? Or do we need something more explicit?


zolkis · 2025-04-28T07:35:10Z

Depending on the model, the prompt could also be "Write me a poem in Japanese" - i.e. instruct the model, which may respond "Sorry, I don't know Japanese."

But detecting language support, and setting external constraints by an explicit API could also be a use case.

christianliebel · 2025-04-29T10:24:59Z

I think the naming (expectedOutputLanguages) is precise enough. I think it serves the purpose of not even attempting a model download or initialization if there's certainly no chance it will work. Of course, it is unfortunate that it may still not work due to a problematic prompt, prompt engineering, or hallucination—but that's the non-deterministic nature of LLMs. It should also be future-proof. Should we ever have multilingual LLMs available on all devices, this would either become a no-op, or people would leave out expectedOutputLanguages entirely.

However, I'm questioning whether this parameter should be an array. Does this imply that all specified languages would be supported? If that’s the case, what specific scenario would require this functionality?

const session = await LanguageModel.create({ expectedOutputLanguages: ["de", "en", "fr"] });
console.log(await session.prompt("Write me a poem in English, French or German."));


domenic · 2025-05-09T01:43:48Z

I've put up a draft at #111. I noticed two things while writing it:

1. This can be used, not only to fail creation on unsupported languages, but also to trigger the download of necessary fine-tunings, safety models, etc. That is, it's conceivable a model might not support French output out of the box, but could with an additional download. This API allows signalling this. Nice!
2. For symmetry with expectedInputs, it's better to call this expectedOutputs, and have people express the languages with { type: "text", languages: anArray }. This gives us a natural place to, in the future, put a request for multimodal output capabilities. For now though, putting { type: "image" } or similar in the expectedOutputs array will just fail.
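
For illustration, the check-then-create flow described above might look roughly like the sketch below. It is not taken from the draft in #111: createTextSession is a hypothetical helper name, and the model object is passed in as a parameter (standing in for the browser's LanguageModel global) so the helper can be tried with a stub. Only the expectedOutputs shape and the "unavailable" result come from this thread.

```javascript
// Sketch only. `languageModel` stands in for the browser's LanguageModel
// global; `createTextSession` is a hypothetical helper, not part of the API.
async function createTextSession(languageModel, languages) {
  // Shape proposed in this thread: languages nested under a "text" entry.
  const options = { expectedOutputs: [{ type: "text", languages }] };

  // Per the discussion, availability() can report "unavailable" when the
  // browser cannot vouch for every requested output language.
  const availability = await languageModel.availability(options);
  if (availability === "unavailable") {
    return null;
  }

  // create() is given the same options, so it may also trigger any extra
  // downloads (fine-tunings, safety models) the languages require.
  return languageModel.create(options);
}
```

In a page this would presumably be called as createTextSession(LanguageModel, ["ja"]). Note that the option only gates creation; the prompt itself still has to ask for Japanese if Japanese output is wanted.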

> However, I'm questioning whether this parameter should be an array. Does this imply that all specified languages would be supported? If that’s the case, what specific scenario would require this functionality?

Yes, it'd be a request to support all such languages. A typical case for requiring multilingual support is something like the language-tutor example given in https://github.com/webmachinelearning/prompt-api?tab=readme-ov-file#multilingual-content-and-expected-languages .
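
To make the "all of them" semantics concrete, here is a minimal sketch of the check as a plain function. The name supportsAllLanguages and the exact-string comparison are assumptions made for illustration; a real implementation would likely match language tags more carefully, and nothing here reflects how a browser actually performs the check.

```javascript
// Hypothetical illustration of "all-of" semantics for a language list.
// Exact string matching is a simplification of real language-tag matching.
function supportsAllLanguages(supportedLanguages, requestedLanguages) {
  const supported = new Set(supportedLanguages);
  // Every requested language must be supported; a single miss fails the request.
  return requestedLanguages.every((lang) => supported.has(lang));
}
```

Under this reading, a request for ["de", "en", "fr"] succeeds only if all three are supported, which is what the language-tutor case needs.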


domenic added the enhancement New feature or request label Apr 23, 2025

domenic added a commit that referenced this issue May 9, 2025

Add expectedOutputs

61aca54

Closes #97.

domenic mentioned this issue May 9, 2025

Add expectedOutputs #111

Merged

domenic closed this as completed in #111 May 15, 2025

domenic added a commit that referenced this issue May 15, 2025

Add expectedOutputs

a73f82d

Closes #97.
