Skip to content

fix(instrumentation-dataloader): Patch batchLoadFn without creating an instance #2498

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 7 commits into
base: main
Choose a base branch
from

Conversation

onurtemizkan
Copy link
Member

Related: getsentry/sentry-javascript#13869

Which problem is this PR solving?

  • The current version of the instrumentation creates an instance (and returns it) in order to patch the DataLoader constructor for batchLoadFn. This breaks the usage where the DataLoader class is extended.

Short description of the changes

  • This implementation uses the first argument of the DataLoader constructor (which is batchLoadFn) to patch to avoid creating an instance.

@onurtemizkan onurtemizkan requested a review from a team as a code owner October 23, 2024 17:25
@onurtemizkan onurtemizkan changed the title fix(instrumentation-dataloder): Patch batchLoadFn without creating an instance fix(instrumentation-dataloader): Patch batchLoadFn without creating an instance Oct 23, 2024
@onurtemizkan onurtemizkan force-pushed the dataloader-custom-methods branch from 6110fa8 to acfeb34 Compare October 23, 2024 17:29
Copy link

codecov bot commented Oct 23, 2024

Codecov Report

Attention: Patch coverage is 92.85714% with 2 lines in your changes missing coverage. Please review.

Project coverage is 89.45%. Comparing base (d579630) to head (1e71d0d).

Files with missing lines Patch % Lines
.../instrumentation-dataloader/src/instrumentation.ts 92.85% 2 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main    #2498      +/-   ##
==========================================
- Coverage   89.46%   89.45%   -0.01%     
==========================================
  Files         174      174              
  Lines        8322     8324       +2     
  Branches     1592     1593       +1     
==========================================
+ Hits         7445     7446       +1     
- Misses        877      878       +1     
Files with missing lines Coverage Δ
.../instrumentation-dataloader/src/instrumentation.ts 98.07% <92.85%> (-0.95%) ⬇️
🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.
  • 📦 JS Bundle Analysis: Save yourself from yourself by tracking and limiting bundle sizes in JS merges.

@onurtemizkan onurtemizkan force-pushed the dataloader-custom-methods branch from acfeb34 to 60d8499 Compare October 29, 2024 11:52
Copy link
Contributor

@trentm trentm left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Mostly looks good, just a couple edge cases.

return this.load('test');
}
}

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: I think this class and the getMd5HashFromIdx function could move inside the test case to be closer to the only place they are used:

--- a/plugins/node/instrumentation-dataloader/test/dataloader.test.ts
+++ b/plugins/node/instrumentation-dataloader/test/dataloader.test.ts
@@ -33,23 +33,10 @@ import * as assert from 'assert';
 import * as Dataloader from 'dataloader';
 import * as crypto from 'crypto';

-const getMd5HashFromIdx = (idx: number) =>
-  crypto.createHash('md5').update(String(idx)).digest('hex');
-
 describe('DataloaderInstrumentation', () => {
   let dataloader: Dataloader<string, number>;
   let contextManager: AsyncHooksContextManager;

-  class CustomDataLoader extends Dataloader<string, string> {
-    constructor() {
-      super(async keys => keys.map((_, idx) => getMd5HashFromIdx(idx)));
-    }
-
-    public async customLoad() {
-      return this.load('test');
-    }
-  }
-
   const memoryExporter = new InMemorySpanExporter();
   const provider = new NodeTracerProvider();
   const tracer = provider.getTracer('default');
@@ -350,6 +337,19 @@ describe('DataloaderInstrumentation', () => {
   });

   it('should not prune custom methods', async () => {
+    const getMd5HashFromIdx = (idx: number) =>
+      crypto.createHash('md5').update(String(idx)).digest('hex');
+
+    class CustomDataLoader extends Dataloader<string, string> {
+      constructor() {
+        super(async keys => keys.map((_, idx) => getMd5HashFromIdx(idx)));
+      }
+
+      public async customLoad() {
+        return this.load('test');
+      }
+    }
+
     const customDataloader = new CustomDataLoader();
     await customDataloader.customLoad();

const instrumentation = this;
const prototype = constructor.prototype;

if (!instrumentation.isEnabled() || !instrumentation.shouldCreateSpans()) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
if (!instrumentation.isEnabled() || !instrumentation.shouldCreateSpans()) {
if (!instrumentation.isEnabled()) {

This if-block is at init-time before any spans are being created. This !instrumentation.shouldCreateSpans() is only true by fluke because config.requireParentSpan is undefined in the tests.


if (isWrapped(inst._batchLoadFn)) {
instrumentation._unwrap(inst, '_batchLoadFn');
// BatchLoadFn is the first constructor argument
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There should be a guard that args[0] is actually a function before wrapping it. Something like (untested):

if (typeof args[0] !== 'function') {
  return constructor.apply(this, args);
}

or perhaps only do the unwrap and wrap if it is a function.


The current behaviour breaks the guard in Dataloader itself:

foo.js

if (process.env.INSTRUMENT_IT) {
  const { DataloaderInstrumentation } = require('./');
  const { NodeSDK, tracing } = require('@opentelemetry/sdk-node');
  const instr = new DataloaderInstrumentation();
  const sdk = new NodeSDK({
    spanProcessor: new tracing.SimpleSpanProcessor(new tracing.ConsoleSpanExporter()),
    serviceName: 'foo',
    instrumentations: [instr]
  });
  sdk.start();
}

const Dataloader = require('dataloader');
const dl = new Dataloader(); // oops, no batchLoadFn passed in

console.log('the end');

Run that without and with instrumentation shows a changed behaviour:

% node foo.js
/Users/trentm/tm/opentelemetry-js-contrib10/node_modules/dataloader/index.js:32
      throw new TypeError('DataLoader must be constructed with a function which accepts ' + ("Array<key> and returns Promise<Array<value>>, but got: " + batchLoadFn + "."));
      ^

TypeError: DataLoader must be constructed with a function which accepts Array<key> and returns Promise<Array<value>>, but got: undefined.
    at new DataLoader (/Users/trentm/tm/opentelemetry-js-contrib10/node_modules/dataloader/index.js:32:13)
    at Object.<anonymous> (/Users/trentm/tm/opentelemetry-js-contrib10/plugins/node/instrumentation-dataloader/foo.js:14:12)
    at Module._compile (node:internal/modules/cjs/loader:1364:14)
    at Module._extensions..js (node:internal/modules/cjs/loader:1422:10)
    at Module.load (node:internal/modules/cjs/loader:1203:32)
    at Module._load (node:internal/modules/cjs/loader:1019:12)
    at Function.executeUserEntryPoint [as runMain] (node:internal/modules/run_main:128:12)
    at node:internal/main/run_main_module:28:49

Node.js v18.20.4

% INSTRUMENT_IT=true node foo.js
the end

Please add a test case for this.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I added a guard to check if batchLoadFn is a function. However, I could not reproduce the problem where Dataloader instance is created without a batchLoadFn (with or without the instrumentation).

It fails with:

TypeError: DataLoader must be constructed with a function which accepts Array<key> and returns Promise<Array<value>>, but got: undefined.
for both cases.

@onurtemizkan
Copy link
Member Author

Hey @trentm, can I get another round of reviews, please?

@onurtemizkan onurtemizkan requested a review from a team April 14, 2025 14:36
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants