-
Notifications
You must be signed in to change notification settings - Fork 654
perf(streams): add single-character fast path for DelimiterStream()
#3739
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
perf(streams): add single-character fast path for DelimiterStream()
#3739
Conversation
3b68470
to
e74e501
Compare
e74e501
to
4ce46fe
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for this! I think we can make this a little simpler.
streams/delimiter_stream.ts
Outdated
let transform: ( | ||
chunk: Uint8Array, | ||
controller: TransformStreamDefaultController<Uint8Array>, | ||
) => void; | ||
const flush = ( | ||
controller: TransformStreamDefaultController<Uint8Array>, | ||
) => { | ||
const bufs = this.#bufs; | ||
const length = bufs.length; | ||
if (length === 0) { | ||
controller.enqueue(new Uint8Array()); | ||
} else if (length === 1) { | ||
controller.enqueue(bufs[0]); | ||
} else { | ||
controller.enqueue(concat(...bufs)); | ||
} | ||
}; | ||
|
||
let lps: Uint8Array | null = null; | ||
|
||
if (delimiter.length === 1) { | ||
// Delimiter is a single char | ||
transform = (chunk, controller) => this.#handleChar(chunk, controller); | ||
} else { | ||
transform = (chunk, controller) => this.#handle(chunk, controller); | ||
lps = createLPS(delimiter); | ||
} | ||
super({ | ||
transform: (chunk, controller) => { | ||
this.#handle(chunk, controller); | ||
}, | ||
flush: (controller) => { | ||
controller.enqueue(concat(...this.#bufs)); | ||
}, | ||
transform, | ||
flush, | ||
}); | ||
|
||
this.#delimiter = delimiter; | ||
this.#delimLPS = createLPS(delimiter); | ||
this.#delimLPS = lps; | ||
this.#disp = options?.disposition ?? "discard"; | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We can simplify the constructor by doing something like:
constructor(
delimiter: Uint8Array,
options?: DelimiterStreamOptions,
) {
super({
transform: (chunk, controller) =>
delimiter.length === 1
? this.#handleChar(chunk, controller)
: this.#handle(chunk, controller),
flush: (controller) => this.#flush(controller),
});
this.#delimiter = delimiter;
this.#delimLPS = delimiter.length > 1 ? createLPS(delimiter) : null;
this.#disp = options?.disposition ?? "discard";
}
I think we should also factor out the flush()
logic into its own separate #flush()
method to match the style of the rest of the class. WDYT?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Gentle ping @aapoalas
DelimiterStream()
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've gone ahead and applied my suggestion by factoring the flush()
logic into it's own method and simplifying the constructor. Now, LGTM! Thanks for this, @aapoalas!
DelimiterStream is often used to delimit formats that use a single char as the separator such as most commonly used flavours of CSV, new-line delimited streamable JSON and others. Multi-character delimiters require extra handling that is entirely unnecessary for char delimiters. As such, it makes sense to special-case char delimiter streams for extra performance.
The performance benefit is at most a few percentage points.