
feat(gemini): Add support for Gemini 2.5 Thinking Budget #344


Merged

merged 6 commits into prism-php:main on May 12, 2025

Conversation

btumbleson
Contributor

@btumbleson btumbleson commented Apr 29, 2025

Description

In the preview Gemini 2.5 models, Google introduced "thinking", which is enabled by default on these models. Without this feature, using the new models will always include thinking, which is billed at a higher rate and is not always desirable. This PR lets you configure a thinkingBudget, which Gemini treats as the maximum number of tokens that may be used for thinking; it can also be set to 0 to disable thinking altogether.

This PR:

  • Introduces a new thoughtTokens property on the Usage class
  • Leverages the withProviderOptions() method to inject a thinkingBudget when passed as a key-value pair
  • Includes two tests that check for thoughtTokens in the Usage output
  • Updates the Gemini documentation to cover this functionality

Usage

$response = Prism::text()
    ->using(Provider::Gemini, 'gemini-2.5-flash-preview')
    ->withPrompt('Explain the concept of Occam\'s Razor and provide a simple, everyday example.')
    ->withProviderOptions(['thinkingBudget' => 500])
    ->asText();

You can set thinkingBudget to 0 to disable thinking altogether. Note this also works with structured output.
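Since thinking is on by default, disabling it is just a matter of passing a zero budget. A minimal sketch using the same fluent API as the example above (the prompt is illustrative):

```php
// Passing 0 disables thinking entirely, avoiding the higher thinking-token billing.
// The same withProviderOptions() call applies to structured requests.
$response = Prism::text()
    ->using(Provider::Gemini, 'gemini-2.5-flash-preview')
    ->withPrompt('Explain the concept of Occam\'s Razor in one sentence.')
    ->withProviderOptions(['thinkingBudget' => 0])
    ->asText();
```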

Potential Discussion

Gemini has made the choice that thinking is on by default, so this PR follows that convention: unless you explicitly set a thinkingBudget of 0, thinking is enabled. An alternative approach would be for Prism to default thinkingBudget to 0 whenever it is not provided. For that to work effectively you would need to (re)introduce a ThinkingModeResolver, at least in the near term, as I expect all models will eventually gravitate toward this thinking convention.

@ChrisB-TL
Contributor

Thank you!

Re: default, I think we should be consistent across providers if we can. For Anthropic we have it disabled by default, so I'd be tempted to do the same here. (IMO it's probably the more sensible default anyway?)

@@ -10,6 +10,7 @@ public function __construct(
         public int $promptTokens,
         public int $completionTokens,
         public ?int $cacheWriteInputTokens = null,
-        public ?int $cacheReadInputTokens = null
+        public ?int $cacheReadInputTokens = null,
+        public ?int $thoughtTokens = null,
Contributor

I'd be tempted to add the thoughtTokens to the Response additionalContent property, as it is unique to Gemini.

What do you think @sixlive?

Contributor Author

I could go both ways, but I think it's likely that all major models will converge on having "thinking" be a capability of their model, so my thought with this here was to future proof a bit.

Contributor

@ChrisB-TL ChrisB-TL commented Apr 30, 2025

I've had a look at the other providers that have reasoning models. As always with providers, it's a pretty 50:50 split on whether they report thinking tokens:

  • OpenAI, Gemini, and xAI do;
  • Anthropic and DeepSeek don't; and
  • Groq and Ollama: not sure.

I reckon there's enough that do to warrant adding.

@@ -440,3 +441,30 @@
expect($response->usage->cacheReadInputTokens)->toBe(88759);
});
});

describe('Thinking Mode for Gemini', function (): void {
Contributor

Please can you also add Http assertions that the correct payload is sent? I'm conscious we don't have enough of these in the codebase (it's on the todo list to add more), and we've been bitten by it a couple of times in refactors.

Contributor Author

Good call out. Actually found a bug while adding these tests!


kinsta bot commented Apr 29, 2025

Preview deployments for prism ⚡️

Status: ❌ Failed to deploy
Branch preview: N/A
Commit preview: N/A

Commit: e057c7097571c04b536aba8a7d9c609b63998926

Deployment ID: 8e2bd336-a4ed-4c3a-a917-6777636082e6

Static site name: prism-97nz9

Comment on lines 85 to 88
'thinkingBudget' => array_key_exists('thinkingBudget', $providerOptions)
? $providerOptions['thinkingBudget']
: null,
], fn($v) => $v !== null),
Contributor Author

This is messier than I wanted it to be, but a plain array_filter with no callback would throw away a 0 value, which is actually a valid thinking budget.
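The behavior can be demonstrated in isolation; this is a self-contained sketch (variable names are illustrative, not from the PR):

```php
<?php
// With no callback, array_filter() drops every falsy value, including int 0,
// so a 'thinkingBudget' of 0 would silently disappear from the payload.
$options = ['thinkingBudget' => 0, 'unsetOption' => null];

$naive  = array_filter($options);                          // drops 0 AND null
$strict = array_filter($options, fn ($v) => $v !== null);  // drops only null

var_dump(array_key_exists('thinkingBudget', $naive));  // bool(false)
var_dump(array_key_exists('thinkingBudget', $strict)); // bool(true)
```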

Http::assertSent(function (Request $request) {
$data = $request->data();

expect($data['generationConfig'])->not->toHaveKey('thinkingConfig');
Contributor Author

Note this is purposeful, in line with the default convention: we would expect to see thoughtTokens in the response even without specifying a thinkingConfig.

@btumbleson btumbleson requested a review from ChrisB-TL April 29, 2025 20:15
@ChrisB-TL
Contributor

Thank you!

Re: default, I think we should be consistent across providers if we can. For Anthropic we have it disabled by default, so I'd be tempted to do the same here. (IMO it's probably the more sensible default anyway?)

Sorry, you may have missed this comment as I didn't include it in the body of my review!

@btumbleson
Contributor Author

> Thank you!
>
> Re: default, I think we should be consistent across providers if we can. For Anthropic we have it disabled by default, so I'd be tempted to do the same here. (IMO it's probably the more sensible default anyway?)
>
> Sorry, you may have missed this comment as I didn't include it in the body of my review!

I can make this change, but have two hesitations:

  1. This functionality is enabled via withProviderOptions, which somewhat implies it's provider-specific. Deviating from how Google handles this (on by default) feels counter-intuitive. If the method were withThinkingBudget, then I'd agree with the need to standardize across all providers.
  2. Setting the thinking budget to 0 by default would require a ThinkingModeResolver, which as of today would only support a single model. That introduces incremental maintenance; not major, but I'm conscious of it. It's similar to earlier iterations of Prism, where a StructuredModeResolver existed in more providers but has since been removed.

Just let me know your thoughts on how we want to proceed.

@btumbleson
Contributor Author

@sixlive - anything you're looking to see here to merge?

@sixlive
Contributor

sixlive commented May 4, 2025

Just need to sit down and give it a thorough review. Probably later today or tomorrow.

@pushpak1300
Contributor

So @btumbleson, if I don't specifically set the thinking budget, the request will be sent without a thinking budget, right?

@btumbleson
Contributor Author

btumbleson commented May 4, 2025

> So @btumbleson, if I don't specifically set the thinking budget, the request will be sent without a thinking budget, right?

Other way around. If you don't specify a thinking budget on a model that supports it (e.g. Gemini 2.5 Flash Preview), then it will include thinking, just with an unspecified budget (you're effectively letting the model decide how much it needs to "think"). This matches Gemini's default behavior: https://ai.google.dev/gemini-api/docs/thinking#use-thinking-models

The "thoughts" themselves are not returned via the API, only a count of the tokens used. As thinking tokens are billed at a higher rate, you may not want thinking enabled at all, or you may wish to cap the number of tokens that can be used to "think"; you can accomplish both by setting thinkingBudget.

Contributor

@sixlive sixlive left a comment


SICK!!! Thank you so much!

@sixlive sixlive merged commit b028f84 into prism-php:main May 12, 2025
13 of 14 checks passed
4 participants