
Use cases for parallel requests #59


Closed
tomayac opened this issue Nov 12, 2024 · 6 comments

@tomayac (Contributor) commented Nov 12, 2024

The Prompt API currently doesn't support parallel requests, but there are use cases for this feature. For example, analyzing n (say, 10) post items in an RSS feed to see whether any of them covers a topic the user isn't interested in. While you could process the post items one by one, ideally you would process them in parallel, which raises the question of what an ideal maximum number m (say, 5) of concurrent requests would be. This issue is for collecting other use cases and examples.
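
For concreteness, here is a minimal sketch of such a bounded fan-out. The `ai.languageModel.create()` / `session.prompt()` shape is an assumption based on the explainer as it stood at the time (the entry points have since been renamed), `mapWithLimit` is a hand-rolled helper, and one session is created per in-flight request, on the assumption that parallelism comes from using multiple sessions (as the maintainer comment further down suggests).

```ts
// Sketch only: the global API shape is an assumption based on the explainer
// as of late 2024; the entry points have been renamed since.
declare const ai: {
  languageModel: {
    create(): Promise<{ prompt(input: string): Promise<string> }>;
  };
};

// Run task() over all items with at most `limit` tasks in flight.
async function mapWithLimit<T, R>(
  items: T[],
  limit: number,
  task: (item: T) => Promise<R>,
): Promise<R[]> {
  const results: R[] = new Array(items.length);
  let next = 0;
  const worker = async () => {
    while (next < items.length) {
      const i = next++; // JS is single-threaded, so this is race-free
      results[i] = await task(items[i]);
    }
  };
  await Promise.all(
    Array.from({ length: Math.min(limit, items.length) }, worker),
  );
  return results;
}

// Classify n feed items with at most m = 5 concurrent prompts,
// using one session per in-flight request.
async function flagUninteresting(posts: string[], topic: string) {
  const answers = await mapWithLimit(posts, 5, async (post) => {
    const session = await ai.languageModel.create();
    return session.prompt(
      `Answer "yes" or "no": is this post about ${topic}?\n\n${post}`,
    );
  });
  return answers.map((a) => /yes/i.test(a));
}
```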

@jacoblee93

Big +1 on making this work. Anything that requires responsiveness and more than a single LLM call (map-reduce style summarization, agents, RAG over multiple sources) benefits massively from parallelization.

At the very least, the burden shouldn't be on the user to run things serially; the model should implement its own robust queuing strategy.

@kowalczyk-krzysztof

When it comes to developing extensions for Chrome (which I think will be the main use case for the Prompt API), any real-time content analysis is not going to work without parallelization (or a context-window increase, but I guess that's not something that can be done).

To give you a specific example, I was looking into building an extension that would leverage an existing tool to get an a11y report, feed the HTML elements with detected a11y violations, along with the corresponding violation descriptions, into the Prompt API to get the elements fixed, and then replace them in the DOM. The bottleneck was the lack of parallelization: being able to have 10 API calls running in parallel would be more than enough to make this work in real time.
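
A sketch of that pipeline, reusing `mapWithLimit` and the assumed `ai.languageModel` shape from the earlier sketch; `getViolations()` is a hypothetical helper standing in for the existing audit tool (e.g. axe-core):

```ts
// Hypothetical helper wrapping an existing a11y audit tool: returns each
// offending element plus a description of its violation.
type Violation = { element: Element; description: string };
declare function getViolations(root: Document): Promise<Violation[]>;

async function fixViolations(doc: Document): Promise<void> {
  const violations = await getViolations(doc);
  // Ten prompts in flight at a time, per the comment above.
  await mapWithLimit(violations, 10, async ({ element, description }) => {
    const session = await ai.languageModel.create();
    const fixed = await session.prompt(
      `This HTML has an accessibility violation: ${description}\n` +
        `Return only the corrected HTML, nothing else.\n\n${element.outerHTML}`,
    );
    element.outerHTML = fixed; // replace the element in the DOM
  });
}
```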

@alecf commented Nov 12, 2024

This is very common in other LLM use (see the name "chain" in LangChain).

There are heterogeneous activities that benefit from separate requests, and you want to show users results as they come in. For instance, for a single paragraph you might want to do all of these in parallel:

  1. extract a list of entity names
  2. create a 1 sentence summary
  3. do sentiment classification

and then present the results to the user as they arrive. In addition, you might do follow-up work based on, say, the sentiment classification (e.g. get a list of complaints from a negative review), and you wouldn't want to block on or await the other activities before beginning that task.
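
A sketch of that heterogeneous fan-out, under the same assumed `ai.languageModel` shape as above, with a hypothetical `render()` callback that updates the UI as each result resolves:

```ts
// Hypothetical UI hook: paints one result panel as soon as it is ready.
declare function render(kind: string, text: string): void;

async function analyzeParagraph(paragraph: string): Promise<void> {
  const tasks: Array<[kind: string, prompt: string]> = [
    ["entities", `Extract a list of entity names from:\n\n${paragraph}`],
    ["summary", `Write a one-sentence summary of:\n\n${paragraph}`],
    ["sentiment", `Classify the sentiment (positive/negative/neutral) of:\n\n${paragraph}`],
  ];
  await Promise.all(
    tasks.map(async ([kind, prompt]) => {
      const session = await ai.languageModel.create();
      const result = await session.prompt(prompt);
      render(kind, result); // no waiting on sibling tasks
      // Follow-up work can chain here without blocking the others, e.g.
      // on a negative sentiment, prompt for a list of complaints.
    }),
  );
}
```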

@domenic (Collaborator) commented Nov 13, 2024

The API described here supports parallel requests (e.g. via multiple sessions), so I think this is more of a Chromium implementation issue. Please file it at https://crbug.new :)

domenic closed this as completed Nov 13, 2024
@nilinswap

> here

Sorry, I did not get it. API described where?

@domenic (Collaborator) commented Nov 14, 2024
