Skip to content

An output language support detection option #97

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
domenic opened this issue Apr 3, 2025 · 3 comments · Fixed by #111
Closed

An output language support detection option #97

domenic opened this issue Apr 3, 2025 · 3 comments · Fixed by #111
Labels
enhancement New feature or request

Comments

@domenic
Copy link
Collaborator

domenic commented Apr 3, 2025

In Chrome, we only support the prompt API outputting certain human languages, which have been tested to give safe and sensible results.

Right now, there is no way for web developers to know what those supported languages are from the API. This could be problematic for sites which want to ensure that the language model output is of good quality. They could read each browser's documentation, but of course part of the point of the prompt API is to abstract over that.

It would be good to give developers who want to ensure they're only working with supported output languages some way of signaling the expected output language when using LanguageModel.create(), or checking LanguageModel.availability(). Then, the API could reject or return "unavailable" if the developer needs a guarantee from the browser that a given output language has been tested and is supported.

The tricky part of designing this API is that the option would have no impact on the actual output language. The actual output just whatever the model decides to respond with, given its current prompt history. Unlike with the writing assistance APIs, we're not using high-level options to prompt-engineer or apply DPO or fine-tunings. The prompt API is about talking directly to the language model, with what you get back controlled by your prompts and your prompts alone.

So if we named this option something like outputLanguage, that would be misleading. Cases like

const session = await LanguageModel.create({ outputLanguage: "ja" });
console.log(await session.prompt("Write me a poem"));

would almost certainly give you a result back in English, not Japanese.

I think we could tackle this with just better naming. Maybe something as simple as expectedOutputLanguages? This kind of parallels the expectedInputLanguage of the writing assistance APIs, or the expectedInputLanguages of the language detector API. Both of those use the expected prefix to mean "I expect what follows will be from these languages, so please check for availability before creating this model object". But they don't influence the actual inputs, since those are web developer-provided and could be arbitrary. Our case is similar, except instead of web-developer provided inputs we have model-provided outputs.

Is expectedOutputLanguages clear enough that it won't actually influence the output? Or do we need something more explicit?

@domenic domenic added the enhancement New feature or request label Apr 23, 2025
@zolkis
Copy link

zolkis commented Apr 28, 2025

Depending on the model, the prompt could also be "Write me a poem in Japanese" - i.e. instruct the model, which may respond "Sorry, I don't know Japanese."

But detecting language support, and setting external constraints by an explicit API could also be a use case.

@christianliebel
Copy link

I think the naming (expectedOutputLanguages) is precise enough. I think it serves the purpose of not even attempting a model download or initialization if there's certainly no chance it will work. Of course, it is unfortunate that it may still not work due to a problematic prompt, prompt engineering, or hallucination—but that's the non-deterministic nature of LLMs. It should also be future-proof. Should we ever have multilingual LLMs available on all devices, this would either become a no-op, or people would leave out expectedOutputLanguages entirely.

However, I'm questioning whether this parameter should be an array. Does this imply that all specified languages would be supported? If that’s the case, what specific scenario would require this functionality?

const session = await LanguageModel.create({ expectedOutputLanguages: ["de", "en", "fr"] });
console.log(await session.prompt("Write me a poem in English, French or German."));

domenic added a commit that referenced this issue May 9, 2025
@domenic
Copy link
Collaborator Author

domenic commented May 9, 2025

I've put up a draft at #111. I noticed two things while writing it:

  • This can be used, not only to fail creation on unsupported languages, but also to trigger the download of necessary fine-tunings, safety models, etc. That is, it's conceivable a model might not support French output out of the box, but could with an additional download. This API allows signalling this. Nice!
  • For symmetry with expectedInputs, it's better to call this expectedOutputs, and have people express the languages with { type: "text", languages: anArray }. This gives us a natural place to, in the future, put a request for multimodal output capabilities. For now though, putting { type: "image" } or similar in the expectedOutputs array will just fail.

However, I'm questioning whether this parameter should be an array. Does this imply that all specified languages would be supported? If that’s the case, what specific scenario would require this functionality?

Yes, it'd be a request to support all such languages. A typical case for requiring multilingual support is something like the language-tutor example given in https://github.com/webmachinelearning/prompt-api?tab=readme-ov-file#multilingual-content-and-expected-languages .

domenic added a commit that referenced this issue May 15, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants