feat(gemini): Add support for Gemini 2.5 Thinking Budget #344
Conversation
Thank you! Re: default, I think we should be consistent across providers if we can. For Anthropic, we have it default disabled, so I'd be tempted to do the same here. (IMO it's probably the more sensible default anyway?)
src/ValueObjects/Usage.php (Outdated)

```diff
@@ -10,6 +10,7 @@ public function __construct(
     public int $promptTokens,
     public int $completionTokens,
     public ?int $cacheWriteInputTokens = null,
-    public ?int $cacheReadInputTokens = null
+    public ?int $cacheReadInputTokens = null,
+    public ?int $thoughtTokens = null,
```
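Based on the diff above, a `Usage` value object carrying the new field could be constructed like so (a sketch: the named arguments follow the constructor shown, but the namespace is an assumption, as it isn't visible in the hunk):

```php
use Prism\Prism\ValueObjects\Usage; // namespace assumed, not shown in the diff

$usage = new Usage(
    promptTokens: 100,
    completionTokens: 250,
    cacheWriteInputTokens: null,
    cacheReadInputTokens: null,
    thoughtTokens: 512, // new nullable parameter added by this PR
);
```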
I'd be tempted to add the `thoughtTokens` to the Response `additionalContent` property, as it is unique to Gemini.

What do you think @sixlive?
I could go both ways, but I think it's likely that all major models will converge on having "thinking" be a capability of their model, so my thought with this here was to future proof a bit.
I've had a look at the other providers that have reasoning models. As always with providers... it's a pretty 50:50 split whether they add thinking tokens:
- OpenAI, Gemini, and xAI do; and
- Anthropic and DeepSeek don't.
- Groq and Ollama: not sure.
I reckon there's enough that do to warrant adding.
```diff
@@ -440,3 +441,30 @@
     expect($response->usage->cacheReadInputTokens)->toBe(88759);
   });
 });
+
+describe('Thinking Mode for Gemini', function (): void {
```
Please can you add Http assertions that the correct payload is sent also? Conscious we don't have this enough in the code base (on the todo to add more), and we've been bitten by it a couple of times in refactors.
Good call out. Actually found a bug while adding these tests!
```php
'thinkingBudget' => array_key_exists('thinkingBudget', $providerOptions)
    ? $providerOptions['thinkingBudget']
    : null,
], fn ($v) => $v !== null),
```
This is generally messier than I wanted it to be, but a simple `array_filter` will throw away a `0` value, which is actually valid.
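The behaviour being worked around can be demonstrated with plain PHP (a standalone sketch, not code from this PR): `array_filter` with no callback discards every falsy value, including a legitimate budget of `0`, while the explicit null check keeps it:

```php
<?php

$config = [
    'thinkingBudget' => 0,     // valid: 0 disables thinking
    'maxOutputTokens' => null,  // unset option that should be dropped
];

// Default array_filter removes ALL falsy values, so the 0 budget is lost.
$naive = array_filter($config);
// $naive === []

// Filtering only on null keeps the 0 while still dropping unset options.
$correct = array_filter($config, fn ($v) => $v !== null);
// $correct === ['thinkingBudget' => 0]
```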
```php
Http::assertSent(function (Request $request) {
    $data = $request->data();

    expect($data['generationConfig'])->not->toHaveKey('thinkingConfig');
```
Note this is actually purposeful, in line with the default convention: we would expect to see `thoughtTokens` in the response without specifying a `thinkingConfig`.

Sorry, you may have missed this comment as I didn't include it in the body of my review!
I can make this change, but have two hesitations:

Just let me know your thoughts on how we want to proceed.
@sixlive - anything you're looking to see here to merge?

Just need to sit down and give it a thorough review. Probably later today or tomorrow.

So @btumbleson, if I don't specifically set the thinking budget, the request will be sent without a thinking budget, right?
Other way around. If you don't specify a thinking budget on a model that supports it (e.g. Gemini 2.5 Flash Preview), then it will include thinking, just with an unspecified budget (you're effectively letting the model decide how much it needs to "think"). This matches Gemini's default behavior: https://ai.google.dev/gemini-api/docs/thinking#use-thinking-models

The "thoughts" themselves are not returned via the API, only a count of the tokens used. As thinking tokens are billed at a higher rate, you may not want thinking enabled, or may wish to limit the number of tokens that can be used to "think". Both of which you can accomplish by setting a `thinkingBudget`.
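For reference, the keys visible in this PR (`generationConfig`, `thinkingConfig`, `thinkingBudget`) suggest a request body shaped roughly like this when a budget is set (a sketch inferred from the diff and the linked Gemini docs, not captured output; the `1024` value is purely illustrative):

```json
{
  "generationConfig": {
    "thinkingConfig": {
      "thinkingBudget": 1024
    }
  }
}
```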
SICK!!! Thank you so much!
Description

In the preview Gemini 2.5 models, Google introduced "thinking", which is enabled by default on these models. Without a way to configure it, using the new models will always include thinking, which is billed at a higher rate and not always desirable. This PR enables you to configure a `thinkingBudget`, which Gemini treats as a maximum number of tokens that may be used for thinking; it can be set to 0 to disable thinking altogether.

This PR:

- adds `thoughtTokens` to the `Usage` class
- extends the `withProviderOptions()` method to inject a `thinkingBudget` when passed as a key, value pair
- returns `thoughtTokens` in the `Usage` output

Usage
You can set `thinkingBudget` to `0` to disable thinking altogether. Note this also works with structured output.

Potential Discussion
Gemini has made the choice that thinking is on by default, so this PR follows that convention. That is, if you do not specify a `thinkingBudget` of `0`, then thinking will be enabled. An alternative approach could assume that, unless you include a `thinkingBudget`, Prism sets this value to 0. For this to work effectively you would need to (re)introduce a `ThinkingModeResolver`, at least in the near term, as I expect all models will eventually gravitate towards this thinking convention.