
feat(instrumentation-aws-sdk): add gen ai conventions for converse stream span #2769


Open. anuraaga wants to merge 10 commits into main.

Conversation

anuraaga (Contributor):

Which problem is this PR solving?

  • Currently, only the non-streaming Converse operation populates GenAI semantic conventions.

Short description of the changes

  • Allows a service extension to override the response, which is needed to wrap streams.
  • Doesn't end the span in middleware when an extension indicates the response is a stream, since the extension needs to end it itself after populating attributes during stream consumption. Independent of semconv, this should also make the timings more accurate instead of covering only the initial request before the response is consumed (see the sketch after this list).
  • Handles ConverseStream in the Bedrock extension using the above.
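
Roughly, the middleware-side flow looks like the sketch below. This is illustrative only, not the PR's actual code; RequestMetadata, NormalizedResponse, and handleResponse are simplified stand-ins.

import type { Span } from '@opentelemetry/api';

// Simplified stand-ins for the instrumentation's internal types.
interface RequestMetadata {
  isStream?: boolean;
}
interface NormalizedResponse {
  data: any;
}

// Let the extension optionally replace the response (e.g. with a wrapped
// stream), then end the span here only when the response is not a stream.
// For streams, the extension ends the span after the stream is consumed.
function handleResponse(
  span: Span,
  normalizedResponse: NormalizedResponse,
  requestMetadata: RequestMetadata,
  override?: unknown
): void {
  if (override !== undefined) {
    normalizedResponse.data = override;
  }
  if (!requestMetadata.isStream) {
    span.end();
  }
}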

/cc @trentm @codefromthecrypt

// Apply the override returned by the service extension (e.g. a wrapped stream)
// to both the raw and normalized responses before invoking the user hook.
if (override) {
  response.output = override;
  normalizedResponse.data = override;
}
self._callUserResponseHook(span, normalizedResponse);
anuraaga (Contributor Author):

I considered whether this should be done for the user hook too, but didn't think there's enough of a use case for it. Currently the change is only internal since, AFAIK, users can't define service extensions.
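
For context, the user hook being referred to is the instrumentation's responseHook config option. A hedged sketch of using it, assuming its (span, responseInfo) signature; the attribute name is made up for illustration:

import { AwsInstrumentation } from '@opentelemetry/instrumentation-aws-sdk';

// The user hook can inspect the normalized response (per the change above it is
// called after any internal override has been applied), but it cannot itself
// override the response; only internal service extensions can do that.
const instrumentation = new AwsInstrumentation({
  responseHook: (span, responseInfo) => {
    // 'app.aws.has_response_data' is an arbitrary attribute name for illustration.
    span.setAttribute('app.aws.has_response_data', responseInfo.response.data != null);
  },
});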

codecov bot commented Mar 21, 2025

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 89.71%. Comparing base (64fcbf3) to head (c908fe9).
Report is 6 commits behind head on main.

Additional details and impacted files
@@            Coverage Diff             @@
##             main    #2769      +/-   ##
==========================================
+ Coverage   89.69%   89.71%   +0.02%     
==========================================
  Files         184      184              
  Lines        8966     8988      +22     
  Branches     1835     1839       +4     
==========================================
+ Hits         8042     8064      +22     
  Misses        924      924              
Files with missing lines | Coverage Δ
...entelemetry-instrumentation-aws-sdk/src/aws-sdk.ts | 92.94% <100.00%> (+0.17%) ⬆️
...ntation-aws-sdk/src/services/ServicesExtensions.ts | 96.87% <100.00%> (ø)
...umentation-aws-sdk/src/services/bedrock-runtime.ts | 98.99% <100.00%> (+0.09%) ⬆️

... and 1 file with indirect coverage changes


@codefromthecrypt left a comment:

Streaming is tricky but you handled it concisely. Thanks

@trentm (Contributor) left a comment:

Some initial review. I haven't looked carefully at the implementation yet.

// isStream - if true, the response is a stream, so the span should not be ended by the middleware.
// The ServiceExtension must end the span itself, generally by wrapping the stream and ending it
// after the stream is consumed.
isStream?: boolean;
Contributor:

Do you know if the GenAI SIG discussed/documented wanting this behaviour of ending the span after the full stream is consumed? I've seen opinions vary when discussing HTTP streaming. See the guidance for HTTP client span duration here: https://github.com/open-telemetry/semantic-conventions/blob/main/docs/http/http-spans.md#http-client-span-duration

Is there any equivalent in the Python GenAI-related instrumentations?

anuraaga (Contributor Author):

Thanks for the link, that's interesting indeed in terms of response streaming. There isn't any such guideline for GenAI spans, however I feel it's sort of implied by the conventions for token usage: there really isn't a way to populate them without keeping the span open until the end of the stream. While the duration could exclude the streaming, that would require overriding the span's end time with an earlier timestamp, which I don't think the JS SDK supports.
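
For illustration, keeping the span open over the stream looks roughly like this. A minimal sketch, assuming the stream is an async iterable whose final event carries token usage; the event shape is simplified, and the attribute names follow the GenAI semconv:

import type { Span } from '@opentelemetry/api';

interface StreamEvent {
  usage?: { inputTokens?: number; outputTokens?: number };
}

// Yield events through unchanged, record token usage when it appears, and end
// the span only once the consumer has finished (or abandoned) the stream.
async function* wrapStream(
  stream: AsyncIterable<StreamEvent>,
  span: Span
): AsyncGenerator<StreamEvent> {
  try {
    for await (const event of stream) {
      if (event.usage) {
        span.setAttribute('gen_ai.usage.input_tokens', event.usage.inputTokens ?? 0);
        span.setAttribute('gen_ai.usage.output_tokens', event.usage.outputTokens ?? 0);
      }
      yield event;
    }
  } finally {
    span.end();
  }
}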

Contributor:

While the duration could exclude the streaming, that would require overriding the span's end time with an earlier timestamp, which I don't think the JS SDK supports.

Correct, it does not support that. The (documented) behaviour then would be that the Span duration would be just up until the first response from the server, effectively TTFB.

anuraaga (Contributor Author):

Ah do you mean that we should use TTFB here? That would mean we couldn't record usage information though.

BTW, I realized that this might be closer to RPC than HTTP, and RPC has some specification for streaming:

https://opentelemetry.io/docs/specs/semconv/rpc/rpc-spans/#message-event

an event for each message sent/received on client and server spans SHOULD be created

I think this also implies the overall span is for the whole stream. FWIW python keeps the span for the entire stream too.
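
For what that RPC convention would look like in code, a rough sketch (not part of this PR): one "message" event per received chunk, recorded on the client span, using the RPC semconv attribute names.

import type { Span } from '@opentelemetry/api';

// Record an RPC-style "message" event for each received stream chunk.
function recordReceivedMessage(span: Span, messageId: number): void {
  span.addEvent('message', {
    'rpc.message.type': 'RECEIVED',
    'rpc.message.id': messageId,
  });
}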

Contributor:

Ah do you mean that we should use TTFB here?

No, I did not mean to imply what this "should" do. I don't have a strong opinion one way or the other.

The HTTP span guidance from https://github.com/open-telemetry/semantic-conventions/blob/main/docs/http/http-spans.md#http-client-span-duration says:

Because of the potential for confusion around this, HTTP client library instrumentations SHOULD document their behavior around ending HTTP client spans.

I only meant to say that if it is decided to handle this stream by ending the span on the initial response, then that behaviour should be documented.

FWIW python keeps the span for the entire stream too.

Sounds good to me to have the intention for the JS instrumentation to be the same.

import { AwsInstrumentation } from '../src';

export const instrumentation = new AwsInstrumentation();
export const metricReader = initMeterProvider(instrumentation);
anuraaga (Contributor Author):

I realized this pattern didn't seem to work across multiple tests. I did try passing DELTA temporality in the MetricReader constructor, thinking it would fix it, but it didn't. So I changed to a pattern inspired by what's used in Java:

https://github.com/open-telemetry/opentelemetry-java-instrumentation/blob/main/testing-common/src/main/java/io/opentelemetry/instrumentation/testing/LibraryTestRunner.java#L122
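
The Java LibraryTestRunner approach is a shared singleton set up once for the whole suite. A hedged TypeScript sketch of that idea follows; the module and its exports are hypothetical, not the PR's actual test code, and the real repo has its own test helpers:

// shared-test-setup.ts (hypothetical): build the instrumentation and metrics
// plumbing exactly once, and have every test file import these singletons
// instead of creating its own instances.
import { AwsInstrumentation } from '@opentelemetry/instrumentation-aws-sdk';
import {
  AggregationTemporality,
  InMemoryMetricExporter,
  MeterProvider,
  PeriodicExportingMetricReader,
} from '@opentelemetry/sdk-metrics';

export const instrumentation = new AwsInstrumentation();
export const metricExporter = new InMemoryMetricExporter(AggregationTemporality.DELTA);
export const metricReader = new PeriodicExportingMetricReader({ exporter: metricExporter });
export const meterProvider = new MeterProvider({ readers: [metricReader] });
instrumentation.setMeterProvider(meterProvider);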

@trentm (Contributor) left a comment:

LGTM, with a couple small nits.

@jj22ee, @blumamir, @trivikr It would be good to have one of your opinions on this as component owners. @jj22ee I think this PR predates your being added as a component owner for instrumentation-aws-sdk, so it seems likely you haven't seen this.

) {
  return {
    ...response.data,
    stream: this.wrapConverseStreamResponse(
Contributor:

Does stream: this.wrapConverseStreamResponse(...) overwrite the stream from ...response.data?
I think it would be nice to have a comment calling this out.

anuraaga (Contributor Author):

Yup - added a comment about it
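
The behavior the added comment calls out, as a minimal standalone example: in an object spread, a property written after the spread replaces the one coming from the spread, so the wrapped stream replaces the original one.

// Property order matters: `stream` written after the spread wins.
const original = { stream: 'original-stream', otherField: 1 };
const wrapped = { ...original, stream: 'wrapped-stream' };
// wrapped.stream === 'wrapped-stream'; wrapped.otherField === 1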

yuliia-fryshko referenced this pull request on May 21, 2025: "The instrumentation is done for claude, titan and nova models"
@trentm (Contributor) left a comment:

LGTM. Waiting for an approval from one of the code owners before merging.
