feat: Add a context-limit option for channels #1704
Conversation
Walkthrough

Adds a per-channel token limit: a new context key and Channel model field/method, middleware propagation into the context, a RelayInfo field, relay text validation that can return a new error code, and a web UI input to configure the limit.
Sequence Diagram(s)

```mermaid
sequenceDiagram
    autonumber
    participant UI as Web UI (Edit Channel)
    participant API as API Server
    participant MW as Middleware (Distributor)
    participant RC as RelayCommon (GenRelayInfo)
    participant RT as RelayText (TextHelper)
    UI->>API: Save Channel with token_limit
    Note right of API: token_limit persisted on Channel
    API->>MW: Incoming request (selected channel)
    MW->>MW: ctx[ContextKeyChannelTokenLimit] = channel.GetTokenLimit()
    MW->>RC: GenRelayInfo(ctx)
    RC->>RC: RelayInfo.ChannelTokenLimit = ctx value
    RC-->>RT: RelayInfo
    RT->>RT: Count promptTokens
    RT->>RT: checkPromptTokensInBotChannel(promptTokens, RelayInfo)
    alt tokens > ChannelTokenLimit
        RT-->>API: Error: prompt_tokens_too_large
        API-->>UI: 4xx with error code
    else OK
        RT->>RT: continue pricing/relay
        RT-->>API: normal response
        API-->>UI: success
    end
```
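The enforcement step at the heart of the diagram can be sketched in a few lines. This is a simplified stand-in, not the PR's actual code: `RelayInfo` here carries only the one field under discussion, and `checkPromptTokens` mirrors the described "limit > 0 and tokens over limit" rule.

```go
package main

import "fmt"

// RelayInfo is a stand-in for the PR's struct; only the field
// relevant to the token-limit check is kept.
type RelayInfo struct {
	ChannelTokenLimit int // 0 means unlimited
}

// checkPromptTokens mirrors the validation described above: the cap
// applies only when a positive limit is configured.
func checkPromptTokens(promptTokens int, info *RelayInfo) error {
	if info.ChannelTokenLimit > 0 && promptTokens > info.ChannelTokenLimit {
		return fmt.Errorf("prompt tokens (%d) exceed channel token limit (%d)",
			promptTokens, info.ChannelTokenLimit)
	}
	return nil
}

func main() {
	limited := &RelayInfo{ChannelTokenLimit: 1000}
	fmt.Println(checkPromptTokens(800, limited))  // <nil>
	fmt.Println(checkPromptTokens(1200, limited)) // error

	unlimited := &RelayInfo{ChannelTokenLimit: 0}
	fmt.Println(checkPromptTokens(1200, unlimited)) // <nil>: 0 disables the cap
}
```

Because the limit rides in on the request context, the check costs one integer comparison per request and nothing when the channel leaves the limit at its default of 0.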
Estimated code review effort: 🎯 3 (Moderate) | ⏱️ ~25 minutes
Actionable comments posted: 1
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
relay/relay-text.go (1)
Lines 116-128: Bug: relayInfo.PromptTokens isn't set when computed here (breaks fallback usage/logging).

If prompt_tokens isn't already in the Gin context, you compute it but never assign it to relayInfo.PromptTokens. Later, postConsumeQuota uses relayInfo.PromptTokens when usage is nil, resulting in 0.

```diff
 } else {
 	promptTokens, err = getPromptTokens(textRequest, relayInfo)
 	// count messages token error 计算promptTokens错误
 	if err != nil {
 		return types.NewError(err, types.ErrorCodeCountTokenFailed)
 	}
 	c.Set("prompt_tokens", promptTokens)
+	relayInfo.PromptTokens = promptTokens
 }
```
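The failure mode behind the fix above is a classic one: a value written only to a side store (the Gin context) while the struct field a later consumer reads stays at its zero value. A minimal sketch with simplified, hypothetical names:

```go
package main

import "fmt"

// RelayInfo mirrors the struct discussed above, reduced to the one
// field that matters for the bug.
type RelayInfo struct {
	PromptTokens int
}

// fallbackPromptTokens stands in for the postConsumeQuota fallback:
// when usage is nil, it reads the struct field, not the context.
func fallbackPromptTokens(info *RelayInfo) int {
	return info.PromptTokens
}

func main() {
	info := &RelayInfo{}
	ctx := map[string]int{} // stand-in for the Gin context store

	// Buggy path: the computed count goes only into the context.
	promptTokens := 123
	ctx["prompt_tokens"] = promptTokens
	fmt.Println(fallbackPromptTokens(info)) // 0: struct field never updated

	// Fixed path: also assign to the struct, as the diff suggests.
	info.PromptTokens = promptTokens
	fmt.Println(fallbackPromptTokens(info)) // 123
}
```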
🧹 Nitpick comments (8)
constant/context_key.go (1)
Line 34: Document the semantics (0 = unlimited) and unit for the new key.

Please add a brief comment clarifying that this value is a per-channel prompt token cap, measured in tokens, and that 0 disables the cap. This avoids ambiguity across middleware/relay layers.

Apply this diff:

```diff
-	ContextKeyChannelTokenLimit ContextKey = "channel_token_limit"
+	// per-channel prompt token limit (prompt/context tokens only); 0 means unlimited
+	ContextKeyChannelTokenLimit ContextKey = "channel_token_limit"
```

Can you confirm that the enforcement code treats values <= 0 as "unlimited"?
types/error.go (1)
Line 41: Set a client error status for the prompt-token limit.

In relay/relay-text.go, the checkPromptTokensInBotChannel error path (line 132) uses types.NewError(err, types.ErrorCodePromptTokensTooLarge), which defaults to HTTP 500. Replace it with types.NewErrorWithStatusCode(err, types.ErrorCodePromptTokensTooLarge, http.StatusBadRequest) (or 422) so token-limit violations return a 4xx status.

web/src/pages/Channel/EditChannel.js (1)
Lines 1375-1384: Clarify UX: show "0 = unlimited" and what is being limited.

Add brief help text and a clearer placeholder to reduce confusion between prompt vs. output tokens.

Apply this diff:

```diff
 <Form.InputNumber
 	field='token_limit'
-	label={t('最大上下文')}
-	placeholder={t('最大上下文')}
+	label={t('最大上下文')}
+	placeholder={t('最大上下文,0 表示不限制')}
 	min={0}
 	onNumberChange={(value) => handleInputChange('token_limit', value)}
 	style={{ width: '100%' }}
+	extraText={t('仅限制提示词 tokens(输入上下文),不限制输出 tokens')}
 />
```

relay/common/relay_info.go (2)
Line 65: Document the semantics of ChannelTokenLimit (0 = unlimited) and avoid negative surprises.

Add a short field comment to make the contract explicit. Consider treating negatives as 0.

```diff
-	ChannelTokenLimit int
+	ChannelTokenLimit int // 0 means unlimited; negative values are treated as 0
```
Lines 219-221: Clamp channelTokenLimit to zero if negative.

Add a guard after reading the value to prevent negative limits from blocking requests:

```diff
-	channelTokenLimit := common.GetContextKeyInt(c, constant.ContextKeyChannelTokenLimit)
+	channelTokenLimit := common.GetContextKeyInt(c, constant.ContextKeyChannelTokenLimit)
+	if channelTokenLimit < 0 {
+		channelTokenLimit = 0
+	}
```

relay/relay-text.go (3)
Lines 130-134: Return 400 Bad Request on limit violation (don't use 500).

Exceeding a per-channel prompt limit is a client error. Use NewErrorWithStatusCode to return 400.

```diff
-	err = checkPromptTokensInBotChannel(promptTokens, relayInfo)
-	if err != nil {
-		return types.NewError(err, types.ErrorCodePromptTokensTooLarge)
-	}
+	err = checkPromptTokensInBotChannel(promptTokens, relayInfo)
+	if err != nil {
+		return types.NewErrorWithStatusCode(err, types.ErrorCodePromptTokensTooLarge, http.StatusBadRequest)
+	}
```
Lines 268-274: Name/message polish and defensive semantics.

- The function name can be clearer; the current name mixes in "Bot" jargon. Suggest checkPromptTokensAgainstChannelLimit.
- Grammar: "tokens are greater…"
- Optional: treat negative limits as unlimited (defensive).

```diff
-func checkPromptTokensInBotChannel(promptTokens int, info *relaycommon.RelayInfo) error {
-	if info.ChannelTokenLimit > 0 && promptTokens > info.ChannelTokenLimit {
-		return fmt.Errorf("prompt tokens (%d) is greater than channel token limit (%d)", promptTokens, info.ChannelTokenLimit)
-	}
-	return nil
-}
+func checkPromptTokensAgainstChannelLimit(promptTokens int, info *relaycommon.RelayInfo) error {
+	limit := info.ChannelTokenLimit
+	if limit < 0 {
+		limit = 0
+	}
+	if limit > 0 && promptTokens > limit {
+		return fmt.Errorf("prompt tokens (%d) are greater than channel token limit (%d)", promptTokens, limit)
+	}
+	return nil
+}
```

And update the call site:

```diff
-err = checkPromptTokensInBotChannel(promptTokens, relayInfo)
+err = checkPromptTokensAgainstChannelLimit(promptTokens, relayInfo)
```
Lines 268-274: Optional: enforce "prompt + planned completion" against the limit.

If "最大上下文" (max context) is intended as total context tokens, consider also checking promptTokens + max(textRequest.MaxTokens, textRequest.MaxCompletionTokens) against the limit. This would require passing the request into the check function; keep it off by default to avoid false rejections on generous limits.

Do you want me to draft this change gated by a config flag (e.g., enforce_total_context_limit: bool) so it can be rolled out safely?

Also applies to: lines 130-134
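The gated total-context idea can be sketched as follows. Everything here is hypothetical: the `Config` flag, the reduced `TextRequest`, and the function names are illustration only; the PR as reviewed checks prompt tokens alone.

```go
package main

import "fmt"

// Config carries the hypothetical opt-in flag suggested above.
type Config struct {
	EnforceTotalContextLimit bool
}

// TextRequest keeps only the two fields relevant to the sketch.
type TextRequest struct {
	MaxTokens           int
	MaxCompletionTokens int
}

// plannedCompletion returns the larger of the two completion caps,
// i.e. the worst-case output budget the request asks for.
func plannedCompletion(req *TextRequest) int {
	if req.MaxCompletionTokens > req.MaxTokens {
		return req.MaxCompletionTokens
	}
	return req.MaxTokens
}

// checkTotalContext compares either prompt tokens alone (default) or
// prompt plus planned completion (opt-in) against the channel limit.
func checkTotalContext(cfg Config, promptTokens, limit int, req *TextRequest) error {
	if limit <= 0 {
		return nil // 0 (or negative) means unlimited
	}
	total := promptTokens
	if cfg.EnforceTotalContextLimit {
		total += plannedCompletion(req)
	}
	if total > limit {
		return fmt.Errorf("context tokens (%d) exceed channel limit (%d)", total, limit)
	}
	return nil
}

func main() {
	req := &TextRequest{MaxTokens: 512}
	// Off by default: only prompt tokens count.
	fmt.Println(checkTotalContext(Config{}, 900, 1000, req)) // <nil>
	// Opted in: prompt + planned completion is compared.
	fmt.Println(checkTotalContext(Config{EnforceTotalContextLimit: true}, 900, 1000, req)) // error
}
```

Defaulting the flag to off preserves the PR's current behavior, so existing channels see no change until an operator explicitly opts in.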
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
💡 Knowledge Base configuration:
- MCP integration is disabled by default for public repositories
- Jira integration is disabled by default for public repositories
- Linear integration is disabled by default for public repositories
You can enable these sources in your CodeRabbit configuration.
📒 Files selected for processing (7)

- constant/context_key.go (1 hunks)
- middleware/distributor.go (1 hunks)
- model/channel.go (2 hunks)
- relay/common/relay_info.go (3 hunks)
- relay/relay-text.go (3 hunks)
- types/error.go (1 hunks)
- web/src/pages/Channel/EditChannel.js (2 hunks)
🧰 Additional context used
🧠 Learnings (1)
📚 Learning: 2025-06-21T03:37:41.726Z
Learnt from: 9Ninety
PR: QuantumNous/new-api#1273
File: relay/channel/gemini/relay-gemini.go:97-116
Timestamp: 2025-06-21T03:37:41.726Z
Learning: In relay/channel/gemini/relay-gemini.go, the thinking budget calculation logic (including the MaxOutputTokens multiplication) was introduced in PR #1247. PR #1273 focused specifically on decoupling the thoughts summary feature from thinking budget settings and did not modify the existing thinking budget behavior.
Applied to files:
relay/relay-text.go
🧬 Code graph analysis (4)

middleware/distributor.go (2)
- common/gin.go (1): SetContextKey (49-51)
- constant/context_key.go (1): ContextKeyChannelTokenLimit (34-34)

relay/common/relay_info.go (2)
- common/gin.go (1): GetContextKeyInt (61-63)
- constant/context_key.go (1): ContextKeyChannelTokenLimit (34-34)

web/src/pages/Channel/EditChannel.js (1)
- web/src/pages/Detail/index.js (1): handleInputChange (624-630)

relay/relay-text.go (2)
- types/error.go (2): NewError (146-154), ErrorCodePromptTokensTooLarge (41-41)
- relay/common/relay_info.go (1): RelayInfo (62-113)
🔇 Additional comments (5)
model/channel.go (1)

Line 47: Pointer + default: verify migration and allow-null semantics.

- Confirm &model.Channel is included in the DB.AutoMigrate(...) call (model/main.go) so the token_limit column is created.
- Ensure the column is NULL-able and that GetTokenLimit() correctly treats nil as 0.
- The server side must still sanitize negative token_limit values from raw API clients, despite the UI's min=0.
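The nil-as-0 accessor asked about above is a common nullable-column pattern. A minimal sketch, assuming a pointer field and a clamping getter; the real model.Channel has many more fields and its tags may differ:

```go
package main

import "fmt"

// Channel is reduced to the one field under review; the gorm tag is
// an assumption about how the nullable column might be declared.
type Channel struct {
	TokenLimit *int `json:"token_limit" gorm:"bigint,default:0"`
}

// GetTokenLimit treats a NULL column (nil pointer) and negative
// values as 0, i.e. unlimited, so callers never see a bad limit.
func (c *Channel) GetTokenLimit() int {
	if c.TokenLimit == nil || *c.TokenLimit < 0 {
		return 0
	}
	return *c.TokenLimit
}

func main() {
	var unset Channel
	fmt.Println(unset.GetTokenLimit()) // 0: NULL means unlimited

	neg := -5
	fmt.Println((&Channel{TokenLimit: &neg}).GetTokenLimit()) // 0: sanitized

	lim := 4096
	fmt.Println((&Channel{TokenLimit: &lim}).GetTokenLimit()) // 4096
}
```

Centralizing the clamp in the getter means the middleware and relay layers can read the value without repeating the nil/negative checks.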
web/src/pages/Channel/EditChannel.js (1)
Line 105: LGTM: sensible default.

Defaulting token_limit to 0 (unlimited) matches server semantics.

middleware/distributor.go (1)
Line 270: LGTM: propagates channel token limit via context.

This surfaces the limit early for relay code to consume. With the accessor clamping negatives to 0, the context value will be safe to read.

Please confirm relay/common/relay_info.GenRelayInfo reads this via common.GetContextKeyInt and that enforcement checks use "> 0" before comparing.

relay/common/relay_info.go (1)
Lines 240-241: LGTM — value is plumbed into RelayInfo.

relay/relay-text.go (1)
Line 504: LGTM — simpler string concatenation over fmt.Sprintf.

Minor readability/micro-allocation improvement.
PR type

Does this PR contain breaking changes?

PR description

Example screenshots

Summary by CodeRabbit