fix(instrumentation-undici): fix several header handling bugs #2781

benjaminjkraft · 2025-04-03T19:50:22Z

Which problem is this PR solving?

The handling for User-Agent had a bunch of bugs, most notably with the handling of multiple-valued headers, which are rare but legal.

Code similar to the added test "another header with multiple values" caused errors for us in production in the user-agent code, which is where I started. Reading the code, writing tests, and improving the types revealed several more bugs in the same code as well as in the span-to-attributes handling; the added tests "multiple user-agents" and "another header with value user-agent" also fail before this PR (the others just cover the ordinary cases).

Similarly, I also fixed an incorrect case in undici v5 where it would treat a user-agent-bogus header as a user agent, but I couldn't write a test case since the tests run with the newer version.

Short description of the changes

The relevant code is rewritten to handle multiple headers consistently, and refactored so we don't have to write it all twice.


linux-foundation-easycla · 2025-04-03T19:50:30Z

The committers listed above are authorized under a signed CLA.

✅ login: benjaminjkraft / name: Ben Kraft (8534336, f41050d)

benjaminjkraft · 2025-04-03T19:57:19Z

(Working on the CLA with my company, should have that fairly soon.)

JacksonWeber · 2025-04-07T02:07:02Z

plugins/node/instrumentation-undici/src/undici.ts

+          continue;
+        }
+        const key = line.substring(0, colonIndex);
+        const value = line.substring(0, colonIndex + 1);


Wouldn't this capture the key again since you're starting from the zeroth index of the line string again? Could be something like: const value = line.substring(colonIndex + 1).trim();

Eek, you're right!

Which makes me think -- do we have a way to actually unit test the patches against older library versions? I guess that would have to be a separate test package?


The test-all-versions script should perform these tests. You can inspect which versions are tested in the .tav.yml file at the root of this instrumentation.
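For reference, the substring fix suggested in this thread can be sketched as below. This is a minimal illustration and not the PR's actual code: `splitHeaderLine` and `isUserAgentKey` are hypothetical names, and the sketch assumes the raw-string header form discussed here, where each line looks like `key: value`.

```typescript
// Hypothetical helper (not from the PR): split one raw "key: value" line.
function splitHeaderLine(line: string): [string, string] | undefined {
  const colonIndex = line.indexOf(':');
  if (colonIndex === -1) {
    // No colon, so this is not a header line (for example the request line).
    return undefined;
  }
  const key = line.substring(0, colonIndex).trim();
  // Start AFTER the colon. substring(0, colonIndex + 1) would return the
  // key (plus the colon) a second time instead of the value.
  const value = line.substring(colonIndex + 1).trim();
  return [key, value];
}

// Hypothetical helper: exact, case-insensitive match on the header name,
// so that "user-agent-bogus" is not mistaken for "user-agent".
function isUserAgentKey(key: string): boolean {
  return key.toLowerCase() === 'user-agent';
}
```

Only the first colon is used as the separator, so a value that itself contains a colon (such as a host with a port) is kept whole.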

david-luna · 2025-04-07T09:16:03Z

Hi @benjaminjkraft,

thanks for contributing to OpenTelemetry and welcome :)

From your message I couldn't work out exactly which errors this PR fixes. I see you mention:

header values provided as an array of strings (multiple-valued headers)
headers prefixed by user-agent may wrongly be treated as the user agent.

I'll check your changes assuming both points above. Please let me know if there is something else this PR is fixing so it can be mentioned in the changelog.

Cheers

david-luna · 2025-04-07T11:13:23Z

plugins/node/instrumentation-undici/src/undici.ts

+      if (key.toLowerCase() === 'user-agent') {
+        // user-agent should only appear once per the spec, but the library doesn't
+        // prevent passing it multiple times, so we handle that to be safe.
+        const userAgent = Array.isArray(value) ? value[0] : value;


One may wonder whether the first occurrence of the UA header value is the best option, but I guess there is no best option in this scenario. If the request does not comply with the spec and sets multiple values for UA, it may be good to log this issue.

I could log it, although I dunno where that goes in practice? We would probably also be spec-compliant-ish to do value.join(", ").
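To make the two options in this thread concrete, here is a small sketch of resolving a possibly multi-valued user-agent. It is an illustration only: `resolveUserAgent`, the `strategy` parameter, and the `warn` callback are all hypothetical, with `warn` standing in for whatever diagnostic logger the instrumentation would actually use.

```typescript
type HeaderValue = string | string[];
type Strategy = 'first' | 'join';

// Hypothetical helper (not from the PR): pick the string to record as the
// user agent when the header value may be a single string or an array.
function resolveUserAgent(
  value: HeaderValue,
  strategy: Strategy = 'first',
  warn: (message: string) => void = () => {}
): string | undefined {
  if (!Array.isArray(value)) {
    return value; // the ordinary single-valued case
  }
  if (value.length === 0) {
    return undefined;
  }
  if (value.length > 1) {
    // Assumption: the caller wants to hear about the unusual request.
    warn(`user-agent header was set ${value.length} times`);
  }
  // 'first' mirrors value[0] from the diff; 'join' mirrors value.join(", ").
  return strategy === 'join' ? value.join(', ') : value[0];
}
```

Either strategy yields a single string for the attribute; the difference is only whether the later values are dropped or preserved.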

david-luna · 2025-04-07T12:26:28Z

plugins/node/instrumentation-undici/src/undici.ts

+        // prevent passing it multiple times, so we handle that to be safe.
+        const userAgent = Array.isArray(value) ? value[0] : value;
+        attributes[SemanticAttributes.USER_AGENT_ORIGINAL] = userAgent;
+        return true; // no need to keep iterating


nit: IMO a forEachXXX name implies we execute the callback for every item in the collection. Using the callback's return value to decide whether to continue the loop does not match the semantics of the name. Without the comment after the return statement, someone not familiar with the code will not know that the loop breaks.

Moving the format handling and header access into a separate function is a good idea, but it may serve better if it returns an iterable object. That way the consumer code can loop through it (and break out of the loop) with language constructs like for...of and break.

sure, I can do an iterable, I wasn't totally sure if that was kosher yet for the lib but I guess they're supported pretty much everywhere these days.
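A rough sketch of the iterable approach discussed in this thread follows. It is not the code that was pushed to the PR: `headerPairs` and `findUserAgent` are hypothetical names, and the sketch assumes headers arrive as a flat array of alternating keys and values, where a value may itself be an array.

```typescript
type HeaderValue = string | string[];

// Hypothetical generator: yield [key, value] pairs from a flat
// [key, value, key, value, ...] array, stepping two entries at a time.
function* headerPairs(flat: HeaderValue[]): Generator<[string, HeaderValue]> {
  for (let i = 0; i + 1 < flat.length; i += 2) {
    const key = flat[i];
    const value = flat[i + 1];
    if (typeof key === 'string' && value !== undefined) {
      yield [key, value];
    }
  }
}

// The consumer decides when to stop, using plain for...of and break.
function findUserAgent(flat: HeaderValue[]): string | undefined {
  let userAgent: string | undefined;
  for (const [key, value] of headerPairs(flat)) {
    if (key.toLowerCase() === 'user-agent') {
      userAgent = Array.isArray(value) ? value[0] : value;
      break; // the early exit is visible at the call site
    }
  }
  return userAgent;
}
```

Because only even positions are treated as keys, a header whose value happens to be the string "user-agent" is never mistaken for the header itself.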

benjaminjkraft

Sorry, for a more explicit list of the issues this fixes:

(undici v6+) the code crashed if any header (preceding user-agent) is multi-valued
(undici v6+) the code used the wrong value if a header value preceding the user-agent header was "user-agent" (it would log the next header-key as the user-agent)
(undici v5) the code would treat any header starting with user-agent, e.g. user-agent-bogus as the user-agent
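The three failure modes listed above can be reproduced with a small sketch. The two functions below are reconstructions written only from the descriptions in this comment, not the instrumentation's real code; `naiveFlatLookup` and `naivePrefixLookup` are hypothetical names, and the flat-array and raw-string header shapes are assumptions based on this discussion.

```typescript
type RawHeader = string | string[];

// Naive scan over a flat [key, value, key, value, ...] array that treats
// keys and values alike and assumes every entry is a string.
function naiveFlatLookup(headers: RawHeader[]): RawHeader | undefined {
  const idx = headers.findIndex(
    h => (h as string).toLowerCase() === 'user-agent'
  );
  return idx >= 0 ? headers[idx + 1] : undefined;
}

// Naive scan over a raw header string that matches on a prefix only.
function naivePrefixLookup(raw: string): string | undefined {
  const line = raw
    .split('\r\n')
    .find(l => l.toLowerCase().startsWith('user-agent'));
  return line === undefined
    ? undefined
    : line.substring(line.indexOf(':') + 1).trim();
}

// 1. A multi-valued header before user-agent: arrays have no toLowerCase,
//    so the scan throws a TypeError.
// naiveFlatLookup(['accept', ['a', 'b'], 'user-agent', 'ua/1']);

// 2. A header VALUE equal to "user-agent" matches first, so the entry
//    after it (the next header's key) is returned.
const wrongValue = naiveFlatLookup(['x-note', 'user-agent', 'user-agent', 'real/2']);

// 3. A header that merely starts with "user-agent" wins the prefix match.
const wrongHeader = naivePrefixLookup(
  'GET / HTTP/1.1\r\nuser-agent-bogus: nope\r\nuser-agent: real/2\r\n'
);
```

In this sketch `wrongValue` ends up as the string "user-agent" and `wrongHeader` as "nope", which is why an exact key match over key/value pairs is the safer shape.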

(CLA is in review, I'll get back to this later this week once it's approved.)

benjaminjkraft · 2025-04-09T01:26:32Z

plugins/node/instrumentation-undici/src/undici.ts

+      if (key.toLowerCase() === 'user-agent') {
+        // user-agent should only appear once per the spec, but the library doesn't
+        // prevent passing it multiple times, so we handle that to be safe.
+        const userAgent = Array.isArray(value) ? value[0] : value;


I could log it, although I dunno where that goes in practice? We would probably also be spec-compliant-ish to do value.join(", ").

benjaminjkraft · 2025-04-09T01:27:06Z

plugins/node/instrumentation-undici/src/undici.ts

+        // prevent passing it multiple times, so we handle that to be safe.
+        const userAgent = Array.isArray(value) ? value[0] : value;
+        attributes[SemanticAttributes.USER_AGENT_ORIGINAL] = userAgent;
+        return true; // no need to keep iterating


sure, I can do an iterable, I wasn't totally sure if that was kosher yet for the lib but I guess they're supported pretty much everywhere these days.

benjaminjkraft · 2025-04-09T01:28:01Z

plugins/node/instrumentation-undici/src/undici.ts

+          continue;
+        }
+        const key = line.substring(0, colonIndex);
+        const value = line.substring(0, colonIndex + 1);


Eek, you're right!

Which makes me think -- do we have a way to actually unit test the patches against older library versions? I guess that would have to be a separate test package?

benjaminjkraft · 2025-04-09T21:20:41Z

Thanks for the reviews -- pushed updates to fix the v5 typo and swap to generators (which indeed look cleaner). (Also, CLA is now signed.) Let me know if there's anything else you'd like to see!

david-luna · 2025-04-10T08:44:53Z

Which makes me think -- do we have a way to actually unit test the patches against older library versions? I guess that would have to be a separate test package?

You can use the script npm run test-all-versions, which will install and test with different versions of undici. You could even update the .tav.yml file, since Node.js v14 and v16 support was dropped in #2738.

You have a compile issue :(
https://github.com/open-telemetry/opentelemetry-js-contrib/actions/runs/14367208545/job/40308049357?pr=2781

david-luna · 2025-04-29T16:14:01Z

@benjaminjkraft did you have time to look at the compile issue? Do you need any kind of assistance?

david-luna · 2025-05-19T15:12:52Z

@benjaminjkraft are you able to finish this PR? Let me know if you need me to jump in :)

benjaminjkraft requested a review from a team as a code owner April 3, 2025 19:50

github-actions bot added the pkg:instrumentation-undici label Apr 3, 2025

github-actions bot assigned david-luna and trentm Apr 3, 2025

github-actions bot requested review from david-luna and trentm April 3, 2025 19:50


review comments

f41050d
