
Conversation

@cosmastech
Contributor

@cosmastech cosmastech commented Nov 29, 2025

Fixes

We introduce a LazyPromise in order to avoid adding all the requests to the curl multi-handler simultaneously. As originally written:

  • A PendingRequest set to async that calls send() adds a request to the CurlMultiHandler
  • When any promise tied to the same multi-handler is waited on, all of the requests fire off, because they've already been stacked on the handler
  • The concurrency parameter at that point only controls how many requests are resolved at a time, ignoring the fact that they've all already been dispatched

Now, once we have the requests for the pool, we batch them by $concurrency and only at that time do we add them to the CurlMultiHandler queue.

This aligns with my expectation, and with the Laravel docs, that Http::pool() will have at most $concurrency simultaneous requests in flight.
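As a rough sketch of the batching idea (illustrative only — `LazyPromise`/`buildPromise()` come from this PR's diff; everything else is simplified):

```php
use Illuminate\Support\Collection;

// Sketch: each pooled request is wrapped in a LazyPromise so nothing is
// registered with the CurlMultiHandler up front. Requests are then released
// in $concurrency-sized chunks.
$results = [];

foreach ((new Collection($requests))->chunk($concurrency) as $chunk) {
    // Building the promises is what adds this chunk to the multi-handler...
    $promises = $chunk->map(fn ($lazy) => $lazy->buildPromise());

    // ...and waiting settles the chunk before the next one is queued.
    foreach ($promises as $key => $promise) {
        $results[$key] = $promise->wait();
    }
}
```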

Testing

I added a simple Node.js server, since using Herd was confounding the results.

node-server.js
import http from 'http'

const PORT = process.env.PORT || 4000;

const server = http.createServer(async (req, res) => {
  const url = new URL(req.url, `http://${req.headers.host}`);

  if (url.pathname === '/test') {
    const run = url.searchParams.get('run') ?? 'unknown';
    const sleepTime = parseInt(url.searchParams.get('sleep') ?? '4', 10);
    const start = new Date().toISOString();
    console.log(`[${start}] received run=${run} sleeping for: ${sleepTime}`);

    // Simulate work
    await new Promise(resolve => setTimeout(resolve, sleepTime * 1000));

    const done = new Date().toISOString();
    console.log(`[${done}] finished run=${run}`);

    res.writeHead(200, { 'Content-Type': 'text/plain' });
    res.end(String(Math.floor(Math.random() * 900) + 100));
    return;
  }

  res.writeHead(404, { 'Content-Type': 'text/plain' });
  res.end('Not Found');
});

server.listen(PORT, () => {
  console.log(`Server listening on http://localhost:${PORT}`);
});

I then added a test locally like so:

Additional Test
    public function test_concurrency()
    {
        $r = Benchmark::measure(function () {
            Http::pool(function (Pool $pool) {
                $pool->as('first')
                    ->connectTimeout(99999999)
                    ->get('http://localhost:4000/test?run=1&sleep=9');
                $pool
                    ->as('second')
                    ->connectTimeout(99999999)
                    ->get('http://localhost:4000/test?run=2');
                $pool
                    ->as('third')
                    ->connectTimeout(99999999)
                    ->get('http://localhost:4000/test?run=3');
                $pool
                    ->as('fourth')
                    ->connectTimeout(99999999)
                    ->get('http://localhost:4000/test?run=4');
                $pool
                    ->as('fifth')
                    ->connectTimeout(99999999)
                    ->get('http://localhost:4000/test?run=5');
                $pool
                    ->as('sixth')
                    ->connectTimeout(99999999)
                    ->get('http://localhost:4000/test?run=6');
                $pool
                    ->as('seventh')
                    ->connectTimeout(99999999)
                    ->get('http://localhost:4000/test?run=7');
                $pool
                    ->as('eighth')
                    ->connectTimeout(99999999)
                    ->get('http://localhost:4000/test?run=8');
                $pool
                    ->as('ninth')
                    ->connectTimeout(99999999)
                    ->get('http://localhost:4000/test?run=9');
                $pool
                    ->as('tenth')
                    ->connectTimeout(99999999)
                    ->get('http://localhost:4000/test?run=10');
            }, 2);
        });

        dd($r);
    }

By changing the concurrency parameter passed to Http::pool(), I was able to see the times move as I would expect.

| Concurrency | Time (ms) |
| --- | --- |
| null | 9014.908542 |
| 0 | 9017.036375 |
| 1 | 45053.090083 |
| 2 | 24038.000042 |
| 5 | 12026.101916 |
| 10 | 9016.515708 |
| 25 | 9015.821375 |

cosmastech and others added 3 commits November 29, 2025 11:05
Updated the default value of the concurrency parameter in the pool method to 0 for concurrent execution.
@taylorotwell
Member

If this is really the case we should just make it 0 and consider it a bug fix.

@cosmastech
Contributor Author

cosmastech commented Nov 29, 2025

If this is really the case we should just make it 0 and consider it a bug fix.

@taylorotwell for the test you provided here #57972 (comment), does changing the concurrency to anything have an effect on the outcome? It doesn't for me... 😕

I'm thinking that the implementation in Pool is either incorrect or not what I expect.

The creation of EachPromise just determines how many requests to start simultaneously; it doesn't wait for each promise to resolve before carrying on to the next set.

I guess I had it in my head that pool() would only have $concurrent number of batches outstanding at a time.
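For reference, Guzzle's `EachPromise` only throttles effectively when the iterator yields promises lazily; if the promises already exist, they're already sitting on the shared multi-handler. A minimal sketch against plain Guzzle (the URL matches the local test server above; everything else is illustrative):

```php
use GuzzleHttp\Client;
use GuzzleHttp\Promise\EachPromise;

$client = new Client();

// Generator: each promise is only created (and thus only added to the
// curl multi-handler) when EachPromise pulls the next item.
$promises = (function () use ($client) {
    for ($i = 1; $i <= 10; $i++) {
        yield $i => $client->getAsync("http://localhost:4000/test?run={$i}");
    }
})();

(new EachPromise($promises, [
    'concurrency' => 2, // at most 2 unsettled promises pulled at a time
    'fulfilled' => function ($response, $key) {
        // handle each response as it settles
    },
]))->promise()->wait();
```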


if ($this->async) {
-    return $this->makePromise($method, $url, $options);
+    return $this->promise = new FluentPromise(fn () => $this->makePromise($method, $url, $options));
Contributor Author

@cosmastech cosmastech Nov 30, 2025


I fought long and hard with Codex about this. I mean, I was trying to get this damned LLM to admit it was wrong, but alas, I had to eat crow.

So, when we call Client@requestAsync() it is telling the underlying curl multi-handler about the request we're going to make. All of the requests always get fired off by the handler as soon as we run the first promise, making the $concurrent parameter all but meaningless inside of PendingRequest@pool(), since the curl multi-handler initiates everything in its queue as soon as we run the first request.

@cosmastech cosmastech marked this pull request as draft November 30, 2025 12:42
@cosmastech cosmastech changed the title [12.x] Indicate concurrency parameter in PendingRequest@pool() [12.x] Fix PendingRequest@pool() concurrency Nov 30, 2025
@cosmastech cosmastech marked this pull request as ready for review November 30, 2025 14:59
Comment on lines 900 to 914
if ($concurrency === null) {
    (new Collection($requests))->each(function ($item) {
        if ($item instanceof static) {
            $item = $item->getPromise();
        }

        if ($item instanceof LazyPromise) {
            $item->buildPromise();
        }
    });

    foreach ($requests as $key => $item) {
        $results[$key] = $item instanceof static ? $item->getPromise()->wait() : $item->wait();
    }

    return $results;
}
Contributor Author


Added this for backwards compatibility: null executes everything at once, as it always has, throwing an exception if the user called throws() or there's a connection timeout.

I would like to see this entirely removed in Laravel 13.

To me, it makes no sense in the context of request pooling, and it changes the underlying behavior: here we throw an exception on a connection timeout, whereas in every other case we return the ConnectionException as part of the array.

 * @return array<array-key, \Illuminate\Http\Client\Response|\Illuminate\Http\Client\ConnectionException|\Illuminate\Http\Client\RequestException>
 */
-public function pool(callable $callback, ?int $concurrency = null)
+public function pool(callable $callback, ?int $concurrency = 0)
Contributor Author


@taylorotwell the change to 0 for the default probably breaks existing code, and maybe we should just forgo it until Laravel 13.

$this->guzzlePromise = call_user_func($this->promiseBuilder);

foreach ($this->pending as $pendingCallback) {
    $pendingCallback($this->guzzlePromise);
}
Contributor Author


We could possibly merge the behavior of FluentPromise with this by doing something like:

$result = $pendingCallback($this->guzzlePromise);
if ($result instanceof PromiseInterface) {
    $this->guzzlePromise = $result;
}

@cosmastech
Contributor Author

@WendellAdriel do you mind taking a look at the changes to Batch? 🙇

@taylorotwell
Copy link
Member

I read concurrency meaning the maximum number of requests that are in flight at any given time, I don't think of it as having anything to do with "batches"... For example I wouldn't expect a slow request in the first batch of five requests to prevent more requests from firing if the first four are already done.

@cosmastech
Contributor Author

I read concurrency meaning the maximum number of requests that are in flight at any given time, I don't think of it as having anything to do with "batches"... For example I wouldn't expect a slow request in the first batch of five requests to prevent more requests from firing if the first four are already done.

That's a good point. But I don't think the original implementation works that way either. Not sure how to make this work so the queue only has N number of grouped requests. 🤔

For a use case I have in mind, I wouldn't want to exhaust API rate limits or get an IP blacklisted for firing off 20 requests at the same time.

@taylorotwell
Member

taylorotwell commented Nov 30, 2025

Do we know for certain how the current implementation works? 👀 It definitely doesn't seem to be totally serial but also doesn't seem to respect the concurrency value either. I only usually see 2-3 concurrent requests at once regardless of the concurrency value.

@WendellAdriel
Contributor

The definition of concurrency is what Taylor said. It's weird, because in some tests I ran during the PR it seemed to respect the concurrency. It's using the concurrency option from Guzzle under the hood, so I'd expect this to work.

@taylorotwell
Member

@WendellAdriel maybe you can dig into the current code and see what you see on your end? Maybe we are missing something?

@WendellAdriel
Contributor

@taylorotwell sure, let me run some tests on my end and try to figure out what's happening! I'll report back here ASAP

@cosmastech
Contributor Author

@taylorotwell sure, let me run some tests on my end and try to figure out what's happening! I'll report back here ASAP

For whatever it's worth, I added a log inside of the CurlMultiHandler@addRequest method. From what I can see, all of the Pooled or Batched requests get added to the handler at once in the current implementation of these functions.

Possibly a good test would be to pool 100 requests, but only await the first promise in Pool. See if 100 requests hit the local endpoint.

@WendellAdriel
Contributor

@taylorotwell @cosmastech I think I found the issue.
I'm going to test a little more to make sure.
If that's the issue, should I create a new PR for it or how do we handle the fix?

@cosmastech
Contributor Author

cosmastech commented Nov 30, 2025

Here is some evidence that this dispatches everything at once

Run the node.js script in the original PR description. Then apply this diff patch against the 12.x branch and run the new test.

multifailure.txt

Here's a screen recording as well

multi-failure.mov

Another possibility is that we set a new multi-handler for each group of requests. You'd have to pass the $concurrency when creating the Pool. That still wouldn't allow for what you suggested above, @taylorotwell, about not waiting for the handle to resolve before continuing (I don't think). Feels like we may have to drop down into the Guzzle Queue?

@cosmastech
Contributor Author

Do we know for certain how the current implementation works? 👀 It definitely doesn't seem to be totally serial but also doesn't seem to respect the concurrency value either. I only usually see 2-3 concurrent requests at once regardless of the concurrency value.

@taylorotwell are you running against Herd or possibly valet? It looks like the FPM config confounds the results.

@WendellAdriel
Contributor

@taylorotwell @cosmastech as I mentioned, I think I found the root cause of the issue and how to fix it and opened this PR: #57977 so you can take a look at how it can be fixed. I added there because it would be easier/cleaner to show the fix.

@cosmastech
Contributor Author

as I mentioned, I think I found the root cause of the issue and how to fix it and opened this PR: #57977 so you can take a look at how it can be fixed. I added there because it would be easier/cleaner to show the fix.

@WendellAdriel I think both of these implementations are functionally the same. I am not positive that using the generator makes a meaningful difference in how this executes.

That said, I think I have lost the plot a bit here. What is the desired behavior? Release $concurrency number of requests on every tick, regardless of how long they take to resolve? I think both this PR and @WendellAdriel's #57977 suffer from the same "wait for the concurrent group to finish problem".

The current implementation dispatches everything regardless of concurrency limit. The cause is that Guzzle's CurlMultiHandler is shared between all requests. As soon as an async request is built (via PendingRequest@makePromise()) the request is being added to that handler. As soon as one Promise is run, every single request that is on the handler dispatches.

The approach in this PR (and Wendell's too I think) is only adding new requests once the previous ones have been settled, so the multi-handler doesn't dispatch them simultaneously, which is not the desired outcome.

If neither one of these approaches is the desired behavior, then I am not positive what the solution is. The things I haven't tried yet:

  1. Convert Laravel's Pool to use Guzzle's underlying Pool, which accepts an array of RequestInterfaces. This would be a pretty large refactor of how PendingRequest works.
  2. Try to drip requests into the multi-handler via Guzzle's Promises Queue somewhere in the loop of PendingRequest@pool()
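For context on option 1, Guzzle's own `Pool` takes an iterable of requests plus a concurrency setting and handles the throttling itself. A rough sketch (the URL matches the local test server above; the callbacks are illustrative):

```php
use GuzzleHttp\Client;
use GuzzleHttp\Pool;
use GuzzleHttp\Psr7\Request;

$client = new Client();

// Pool accepts RequestInterface instances (or callables returning promises)
// and only sends $concurrency of them at a time.
$requests = (function () {
    for ($i = 1; $i <= 10; $i++) {
        yield new Request('GET', "http://localhost:4000/test?run={$i}");
    }
})();

(new Pool($client, $requests, [
    'concurrency' => 2,
    'fulfilled' => fn ($response, $index) => null, // request $index completed
    'rejected' => fn ($reason, $index) => null,    // request $index failed
]))->promise()->wait();
```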

@WendellAdriel
Contributor

@cosmastech

The current implementation dispatches everything regardless of concurrency limit. The cause is that Guzzle's CurlMultiHandler is shared between all requests. As soon as an async request is built (via PendingRequest@makePromise()) the request is being added to that handler. As soon as one Promise is run, every single request that is on the handler dispatches.

That's why in my PR I create a proxy object, DeferredRequest, and combine it with generators. As you mentioned, the PendingRequest already initializes the requests; with the proxy-and-generator approach, the HTTP requests are not initialized up front. We pass the generator function to Guzzle's EachPromise without the requests being initialized, which hands the concurrency handling over to EachPromise via its concurrency parameter.

The approach in this PR (and Wendell's too I think) is only adding new requests once the previous ones have been settled, so the multi-handler doesn't dispatch them simultaneously, which is not the desired outcome.

No, in my implementation I pass all requests at once to EachPromise, but since they're not initialized, Guzzle is able to properly handle the concurrency.

The root issue was that the PendingRequest was already initializing the request, so by the time it reached the EachPromise it was already in flight, and Guzzle was not able to properly apply the concurrency configuration.
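A stripped-down illustration of that proxy idea (only the `DeferredRequest` name comes from the other PR; the rest is hypothetical):

```php
use GuzzleHttp\Promise\PromiseInterface;

// Hypothetical sketch: capture how to build the request now, but don't
// touch the handler until the promise is actually pulled.
class DeferredRequest
{
    public function __construct(
        protected \Closure $builder // fn (): PromiseInterface
    ) {
    }

    // Called from the generator feeding EachPromise, so the request only
    // reaches the CurlMultiHandler once EachPromise asks for it.
    public function toPromise(): PromiseInterface
    {
        return ($this->builder)();
    }
}
```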

@cosmastech
Contributor Author

Ahhh, so I modified my test to put in variable sleep times, and this is working as you had described, Taylor:

Node Server
import http from 'http'

const PORT = process.env.PORT || 4000;

const server = http.createServer(async (req, res) => {
  const url = new URL(req.url, `http://${req.headers.host}`);

  if (url.pathname === '/test') {
    const run = url.searchParams.get('run') ?? 'unknown';
    const sleepTime = parseInt(url.searchParams.get('sleep') ?? '4', 10);
    const start = new Date().toISOString();
    console.log(`[${start}] received run=${run} sleeping for: ${sleepTime}`);

    // Simulate work
    await new Promise(resolve => setTimeout(resolve, sleepTime * 1000));

    const done = new Date().toISOString();
    console.log(`[${done}] finished run=${run}`);

    res.writeHead(200, { 'Content-Type': 'text/plain' });
    res.end(String(Math.floor(Math.random() * 900) + 100));
    return;
  }

  res.writeHead(404, { 'Content-Type': 'text/plain' });
  res.end('Not Found');
});

server.listen(PORT, () => {
  console.log(`Server listening on http://localhost:${PORT}`);
});

And the additional test case:

    public function test_concurrency()
    {
        $r = Benchmark::measure(function () {
            Http::pool(function (Pool $pool) {
                $pool->as('first')
                    ->connectTimeout(99999999)
                    ->get('http://localhost:4000/test?run=1&sleep=9');
                $pool
                    ->as('second')
                    ->connectTimeout(99999999)
                    ->get('http://localhost:4000/test?run=2');
                $pool
                    ->as('third')
                    ->connectTimeout(99999999)
                    ->get('http://localhost:4000/test?run=3');
                $pool
                    ->as('fourth')
                    ->connectTimeout(99999999)
                    ->get('http://localhost:4000/test?run=4');
                $pool
                    ->as('fifth')
                    ->connectTimeout(99999999)
                    ->get('http://localhost:4000/test?run=5');
            }, 2);
        });

        dd($r);
    }

Here's the output:

Server listening on http://localhost:4000

[2025-12-01T19:30:35.650Z] received run=1 sleeping for: 9
[2025-12-01T19:30:35.652Z] received run=2 sleeping for: 4
[2025-12-01T19:30:39.653Z] finished run=2
[2025-12-01T19:30:39.663Z] received run=3 sleeping for: 4
[2025-12-01T19:30:43.665Z] finished run=3
[2025-12-01T19:30:43.668Z] received run=4 sleeping for: 4
[2025-12-01T19:30:44.652Z] finished run=1
[2025-12-01T19:30:44.656Z] received run=5 sleeping for: 4
[2025-12-01T19:30:47.670Z] finished run=4
[2025-12-01T19:30:48.657Z] finished run=5

As you can see, another request dispatches as soon as the other request completes. (Run 1 is still pending, but because Run 2 has finished, Run 3 is started).

@cosmastech cosmastech marked this pull request as draft December 1, 2025 19:41
@cosmastech
Contributor Author

@cosmastech

The current implementation dispatches everything regardless of concurrency limit. The cause is that Guzzle's CurlMultiHandler is shared between all requests. As soon as an async request is built (via PendingRequest@makePromise()) the request is being added to that handler. As soon as one Promise is run, every single request that is on the handler dispatches.

That's why in my PR I create a proxy object, DeferredRequest, and combine it with generators. As you mentioned, the PendingRequest already initializes the requests; with the proxy-and-generator approach, the HTTP requests are not initialized up front. We pass the generator function to Guzzle's EachPromise without the requests being initialized, which hands the concurrency handling over to EachPromise via its concurrency parameter.

The approach in this PR (and Wendell's too I think) is only adding new requests once the previous ones have been settled, so the multi-handler doesn't dispatch them simultaneously, which is not the desired outcome.

No, in my implementation I pass all requests at once to EachPromise, but since they're not initialized, Guzzle is able to properly handle the concurrency.

The root issue was that the PendingRequest was already initializing the request, so by the time it reached the EachPromise it was already in flight, and Guzzle was not able to properly apply the concurrency configuration.

@WendellAdriel I think my test was just faulty. 😅 I wasn't introducing differing sleep timeouts, so it looked like it was waiting for the batch to complete, when really all the requests were just finishing at the same time.

I think both of our approaches are in alignment. I switched to using a generator here as you did in yours 🙇 In both, the closure that actually builds the request isn't invoked until we're inside the generator. Mine is wrapped in a Promise while yours is built on demand via the DeferredRequest object, but they're effectively the same.

I would mention that switching from returning a PendingRequest to a DeferredRequest might be considered a breaking change for some implementations; for instance, I think this would break:

function addDefaults(PendingRequest $pendingRequest): PendingRequest
{
    $pendingRequest->withUserAgent('laravel/forever')->connectTimeout(2);

    return $pendingRequest;
}

Http::pool(function (Pool $pool) {
    for($i = 1; $i < 10; $i++) {
        addDefaults($pool->as('request_' . $i))->get('https://fake.dev/api/users/' . $i);
    }
});

It also might not play as nicely with IDE auto-completion, but that could probably be solved with a @mixin annotation.

@cosmastech cosmastech marked this pull request as ready for review December 1, 2025 19:59
@WendellAdriel
Contributor

@cosmastech nice!
If both implementations are doing the same thing, then it's all good!

I've never seen pool and batch used the way you showed in the example, TBH, but I do see that in that case this could cause a breaking change (adding the DeferredRequest).

Let's wait on Taylor on it!
Thanks for the work and effort on this! 🫶

@cosmastech cosmastech changed the title [12.x] Fix PendingRequest@pool() concurrency [12.x] Fix PendingRequest@pool() && batch() concurrency Dec 1, 2025