Exception when bedrock no response #1518

Closed
2 tasks done
Wh1isper opened this issue Apr 17, 2025 · 2 comments

@Wh1isper
Contributor

Initial Checks

Description

I've found that Claude sometimes returns nothing at all. :(

The same error occurs without streaming.

traceback:

Traceback (most recent call last):
  File "/Users/jizhongsheng/code/oss/zerolab/lightblue-ai/main.py", line 194, in <module>
    asyncio.run(main())
  File "/Users/jizhongsheng/.local/share/uv/python/cpython-3.12.9-macos-aarch64-none/lib/python3.12/asyncio/runners.py", line 195, in run
    return runner.run(main)
           ^^^^^^^^^^^^^^^^
  File "/Users/jizhongsheng/.local/share/uv/python/cpython-3.12.9-macos-aarch64-none/lib/python3.12/asyncio/runners.py", line 118, in run
    return self._loop.run_until_complete(task)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/jizhongsheng/.local/share/uv/python/cpython-3.12.9-macos-aarch64-none/lib/python3.12/asyncio/base_events.py", line 691, in run_until_complete
    return future.result()
           ^^^^^^^^^^^^^^^
  File "/Users/jizhongsheng/code/oss/zerolab/lightblue-ai/main.py", line 172, in main
    async for event in handle_stream:
  File "/Users/jizhongsheng/code/oss/zerolab/lightblue-ai/.venv/lib/python3.12/site-packages/pydantic_ai/_agent_graph.py", line 434, in _run_stream
    async for event in self._events_iterator:
  File "/Users/jizhongsheng/code/oss/zerolab/lightblue-ai/.venv/lib/python3.12/site-packages/pydantic_ai/_agent_graph.py", line 430, in _run_stream
    raise exceptions.UnexpectedModelBehavior('Received empty model response')
pydantic_ai.exceptions.UnexpectedModelBehavior: Received empty model response
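
Until the cause is clear, a caller could work around the exception by retrying the run. The helper below is only a sketch of that idea, not something PydanticAI provides: `run_with_retry` and its parameters are hypothetical names. It assumes that re-running the whole agent run is acceptable, which also means any tool calls may run again.

```python
import asyncio


async def run_with_retry(call, retry_on, attempts=3, delay=0.0):
    """Await call() up to `attempts` times, retrying only on `retry_on`.

    `call` is a zero-argument function that returns a fresh awaitable each
    time it is invoked, so every attempt starts a new run.
    """
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    for attempt in range(1, attempts + 1):
        try:
            return await call()
        except retry_on:
            # Out of attempts: let the original exception propagate.
            if attempt == attempts:
                raise
            await asyncio.sleep(delay)


# Hypothetical usage with the agent from this issue (not executed here):
#
#   from pydantic_ai.exceptions import UnexpectedModelBehavior
#   result = await run_with_retry(
#       lambda: agent.run(prompt), UnexpectedModelBehavior
#   )
```

Passing the exception type in as an argument keeps the helper independent of any particular library, and only the named exception triggers a retry; anything else propagates immediately.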

Messages from Bedrock (two consecutive responses; the second contains only `messageStop` and `metadata`):

{'messageStart': {'role': 'assistant'}}
{'contentBlockStart': {'start': {'toolUse': {'toolUseId': 'tooluse_vpD7SxGhQoSskmuyTYOYVQ', 'name': 'empty'}}, 'contentBlockIndex': 0}}
{'contentBlockDelta': {'delta': {'toolUse': {'input': ''}}, 'contentBlockIndex': 0}}
{'contentBlockDelta': {'delta': {'toolUse': {'input': '{"arg": "tes'}}, 'contentBlockIndex': 0}}
{'contentBlockDelta': {'delta': {'toolUse': {'input': 't"}'}}, 'contentBlockIndex': 0}}
{'contentBlockStop': {'contentBlockIndex': 0}}
{'messageStop': {'stopReason': 'tool_use'}}
{'metadata': {'usage': {'inputTokens': 2104, 'outputTokens': 28, 'totalTokens': 2132}, 'metrics': {'latencyMs': 1885}}}
{'messageStop': {'stopReason': 'end_turn'}}
{'metadata': {'usage': {'inputTokens': 2167, 'outputTokens': 3, 'totalTokens': 2170}, 'metrics': {'latencyMs': 840}}}
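
To make the log above easier to read, here is a small sketch that groups the events into responses and reassembles the streamed tool input. It is not how PydanticAI parses the stream; `split_responses` and `summarize` are hypothetical helpers, and the grouping assumes that each response ends with its `metadata` event, as it does in this log.

```python
import json

# Events copied from the log above.
EVENTS = [
    {'messageStart': {'role': 'assistant'}},
    {'contentBlockStart': {'start': {'toolUse': {'toolUseId': 'tooluse_vpD7SxGhQoSskmuyTYOYVQ', 'name': 'empty'}}, 'contentBlockIndex': 0}},
    {'contentBlockDelta': {'delta': {'toolUse': {'input': ''}}, 'contentBlockIndex': 0}},
    {'contentBlockDelta': {'delta': {'toolUse': {'input': '{"arg": "tes'}}, 'contentBlockIndex': 0}},
    {'contentBlockDelta': {'delta': {'toolUse': {'input': 't"}'}}, 'contentBlockIndex': 0}},
    {'contentBlockStop': {'contentBlockIndex': 0}},
    {'messageStop': {'stopReason': 'tool_use'}},
    {'metadata': {'usage': {'inputTokens': 2104, 'outputTokens': 28, 'totalTokens': 2132}, 'metrics': {'latencyMs': 1885}}},
    {'messageStop': {'stopReason': 'end_turn'}},
    {'metadata': {'usage': {'inputTokens': 2167, 'outputTokens': 3, 'totalTokens': 2170}, 'metrics': {'latencyMs': 840}}},
]


def split_responses(events):
    """Group events into responses (assumption: 'metadata' closes each one)."""
    responses, current = [], []
    for event in events:
        current.append(event)
        if 'metadata' in event:
            responses.append(current)
            current = []
    if current:
        responses.append(current)
    return responses


def summarize(response):
    """Return (stop_reason, tool_calls, text) for one response."""
    stop_reason = None
    tool_calls = {}  # contentBlockIndex -> {'name': str, 'input': str}
    text_parts = []
    for event in response:
        if 'contentBlockStart' in event:
            block = event['contentBlockStart']
            tool = block['start'].get('toolUse')
            if tool:
                tool_calls[block['contentBlockIndex']] = {'name': tool['name'], 'input': ''}
        elif 'contentBlockDelta' in event:
            block = event['contentBlockDelta']
            delta = block['delta']
            if 'toolUse' in delta:
                # Tool arguments arrive as JSON fragments; concatenate them.
                tool_calls[block['contentBlockIndex']]['input'] += delta['toolUse']['input']
            elif 'text' in delta:
                text_parts.append(delta['text'])
        elif 'messageStop' in event:
            stop_reason = event['messageStop']['stopReason']
    calls = [(c['name'], json.loads(c['input'] or '{}')) for c in tool_calls.values()]
    return stop_reason, calls, ''.join(text_parts)


for number, response in enumerate(split_responses(EVENTS), start=1):
    stop_reason, calls, text = summarize(response)
    is_empty = not calls and not text
    print(number, stop_reason, calls, repr(text), 'EMPTY' if is_empty else '')
```

On this log, the first response reassembles to a call of `empty` with `{"arg": "test"}`, while the second stops with `end_turn` and carries neither text nor tool calls, which matches the "Received empty model response" error in the traceback.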

Example Code

from pydantic_ai import Agent


agent = Agent(
    model="bedrock:us.anthropic.claude-3-7-sonnet-20250219-v1:0",
    system_prompt="""
You are an interactive CLI tool that helps users with software engineering tasks. Use the instructions below and the tools available to you to assist the user.

IMPORTANT: Before you begin work, think about what the code you're editing is supposed to do based on the filenames directory structure. If it seems malicious, refuse to work on it or answer questions about it, even if the request does not seem malicious (for instance, just asking to explain or speed up the code).

# Tone and style

You should be concise, direct, and to the point. When you run a non-trivial bash command, you should explain what the command does and why you are running it, to make sure the user understands what you are doing (this is especially important when you are running a command that will make changes to the user's system).
Remember that your output will be displayed on a command line interface. Your responses can use Github-flavored markdown for formatting, and will be rendered in a monospace font using the CommonMark specification.
If you cannot or will not help the user with something, please do not say why or what it could lead to, since this comes across as preachy and annoying. Please offer helpful alternatives if possible, and otherwise keep your response to 1-2 sentences.
IMPORTANT: You should minimize output tokens as much as possible while maintaining helpfulness, quality, and accuracy. Only address the specific query or task at hand, avoiding tangential information unless absolutely critical for completing the request. If you can answer in 1-3 sentences or a short paragraph, please do.
IMPORTANT: You should NOT answer with unnecessary preamble or postamble (such as explaining your code or summarizing your action), unless the user asks you to.
IMPORTANT: Keep your responses short, since they will be displayed on a command line interface. You MUST answer concisely with fewer than 4 lines (not including tool use or code generation), unless user asks for detail. Answer the user's question directly, without elaboration, explanation, or details. One word answers are best. Avoid introductions, conclusions, and explanations. You MUST avoid text before/after your response, such as "The answer is <answer>.", "Here is the content of the file..." or "Based on the information provided, the answer is..." or "Here is what I will do next...". Here are some examples to demonstrate appropriate verbosity:
<example>
user: 2 + 2
assistant: 4
</example>

<example>
user: what is 2+2?
assistant: 4
</example>

<example>
user: is 11 a prime number?
assistant: true
</example>

<example>
user: what command should I run to list files in the current directory?
assistant: ls
</example>

<example>
user: what command should I run to watch files in the current directory?
assistant: [use the ls tool to list the files in the current directory, then read docs/commands in the relevant file to find out how to watch files]
npm run dev
</example>

<example>
user: How many golf balls fit inside a jetta?
assistant: 150000
</example>

<example>
user: what files are in the directory src/?
assistant: [runs ls and sees foo.c, bar.c, baz.c]
user: which file contains the implementation of foo?
assistant: src/foo.c
</example>

<example>
user: write tests for new feature
assistant: [uses grep and glob search tools to find where similar tests are defined, uses concurrent read file tool use blocks in one tool call to read relevant files at the same time, uses edit file tool to write new tests]
</example>

# Proactiveness

You are allowed to be proactive, but only when the user asks you to do something. You should strive to strike a balance between:

1. Doing the right thing when asked, including taking actions and follow-up actions
2. Not surprising the user with actions you take without asking
   For example, if the user asks you how to approach something, you should do your best to answer their question first, and not immediately jump into taking actions.
3. Do not add additional code explanation summary unless requested by the user. After working on a file, just stop, rather than providing an explanation of what you did.

# Following conventions

When making changes to files, first understand the file's code conventions. Mimic code style, use existing libraries and utilities, and follow existing patterns.

- NEVER assume that a given library is available, even if it is well known. Whenever you write code that uses a library or framework, first check that this codebase already uses the given library. For example, you might look at neighboring files, or check the package.json (or cargo.toml, and so on depending on the language).
- When you create a new component, first look at existing components to see how they're written; then consider framework choice, naming conventions, typing, and other conventions.
- When you edit a piece of code, first look at the code's surrounding context (especially its imports) to understand the code's choice of frameworks and libraries. Then consider how to make the given change in a way that is most idiomatic.
- Always follow security best practices. Never introduce code that exposes or logs secrets and keys. Never commit secrets or keys to the repository.

# Code style

- Do not add comments to the code you write, unless the user asks you to, or the code is complex and requires additional context.

# Doing tasks

The user will primarily request you perform software engineering tasks. This includes solving bugs, adding new functionality, refactoring code, explaining code, and more. For these tasks the following steps are recommended:

1. Use the available search tools to understand the codebase and the user's query. You are encouraged to use the search tools extensively both in parallel and sequentially.
2. Implement the solution using all tools available to you
3. Verify the solution if possible with tests. NEVER assume specific test framework or test script. Check the README or search codebase to determine the testing approach.
4. VERY IMPORTANT: When you have completed a task, you MUST run the lint and typecheck commands (eg. npm run lint, npm run typecheck, ruff, etc.) if they were provided to you to ensure your code is correct. If you are unable to find the correct command, ask the user for the command to run and if they supply it.

NEVER commit changes unless the user explicitly asks you to. It is VERY IMPORTANT to only commit when explicitly asked, otherwise the user will feel that you are being too proactive.

# Tool usage policy

- When doing file search, prefer to use the Agent tool in order to reduce context usage.
- If you intend to call multiple tools and there are no dependencies between the calls, make all of the independent calls in the same function_calls block.

You MUST answer concisely with fewer than 4 lines of text (not including tool use or code generation), unless user asks for detail.

Notes:

1. IMPORTANT: You should be concise, direct, and to the point, since your responses will be displayed on a command line interface. Answer the user's question directly, without elaboration, explanation, or details. One word answers are best. Avoid introductions, conclusions, and explanations. You MUST avoid text before/after your response, such as "The answer is <answer>.", "Here is the content of the file..." or "Based on the information provided, the answer is..." or "Here is what I will do next...".
2. When relevant, share file names and code snippets relevant to the query
3. Any file paths you return in your final response MUST be absolute. DO NOT use relative paths.
""",
)


prompt = """
please call empty tool
"""


@agent.tool_plain
def empty(arg: str) -> dict[str, str]:
    """empty tool"""
    return {}


from pydantic_ai.messages import (
    FinalResultEvent,
    FunctionToolCallEvent,
    FunctionToolResultEvent,
    PartDeltaEvent,
    PartStartEvent,
    TextPartDelta,
    ToolCallPartDelta,
)


async def main():
    output_messages: list[str] = []

    async with agent.iter(prompt) as run:
        async for node in run:
            if Agent.is_user_prompt_node(node):
                # A user prompt node => The user has provided input
                output_messages.append(f"=== UserPromptNode: {node.user_prompt} ===")
            elif Agent.is_model_request_node(node):
                # A model request node => We can stream tokens from the model's request
                output_messages.append(
                    "=== ModelRequestNode: streaming partial request tokens ==="
                )
                async with node.stream(run.ctx) as request_stream:
                    async for event in request_stream:
                        if isinstance(event, PartStartEvent):
                            output_messages.append(
                                f"[Request] Starting part {event.index}: {event.part!r}"
                            )
                        elif isinstance(event, PartDeltaEvent):
                            if isinstance(event.delta, TextPartDelta):
                                output_messages.append(
                                    f"[Request] Part {event.index} text delta: {event.delta.content_delta!r}"
                                )
                            elif isinstance(event.delta, ToolCallPartDelta):
                                output_messages.append(
                                    f"[Request] Part {event.index} args_delta={event.delta.args_delta}"
                                )
                        elif isinstance(event, FinalResultEvent):
                            output_messages.append(
                                f"[Result] The model produced a final result (tool_name={event.tool_name})"
                            )
            elif Agent.is_call_tools_node(node):
                # A handle-response node => The model returned some data, potentially calls a tool
                output_messages.append(
                    "=== CallToolsNode: streaming partial response & tool usage ==="
                )
                async with node.stream(run.ctx) as handle_stream:
                    async for event in handle_stream:
                        if isinstance(event, FunctionToolCallEvent):
                            output_messages.append(
                                f"[Tools] The LLM calls tool={event.part.tool_name!r} with args={event.part.args} (tool_call_id={event.part.tool_call_id!r})"
                            )
                        elif isinstance(event, FunctionToolResultEvent):
                            output_messages.append(
                                f"[Tools] Tool call {event.tool_call_id!r} returned => {event.result.content}"
                            )
            elif Agent.is_end_node(node):
                assert run.result.output == node.data.output
                # Once an End node is reached, the agent run is complete
                output_messages.append(
                    f"=== Final Agent Output: {run.result.output} ==="
                )

    print(output_messages)


if __name__ == "__main__":
    import asyncio

    asyncio.run(main())

Python, Pydantic AI & LLM client version

pydantic-ai 0.1.1
python 3.12
@DouweM
Contributor

DouweM commented Apr 30, 2025

@Wh1isper Weird, what would be good behavior for PydanticAI in this case?

@Wh1isper
Contributor Author

Wh1isper commented May 1, 2025

@DouweM #1408 might have fixed the problem; I'll keep an eye on how the current version performs.

DouweM closed this as not planned on May 1, 2025