chore(dataobj): Add columnar reading APIs to logs and streams sections #17976
+6,890
−196
This PR introduces columnar reading APIs to both the logs and streams sections. As part of this change, both sections now expose the columns stored in the section, which are then used to define predicates on the readers.
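As a rough illustration of the shape this takes, here is a minimal, stdlib-only sketch of a reader that filters on a column and returns batches capped at a caller-supplied size. Everything in it is hypothetical: `Batch` stands in for `arrow.Record`, and `Predicate`, `NewReader`, and `ReadAll` are illustrative names rather than the APIs added in this PR.

```go
package main

import (
	"errors"
	"fmt"
	"io"
)

// Batch is a hypothetical stand-in for arrow.Record: equal-length columns.
type Batch struct {
	Timestamps []int64
	Messages   []string
}

// NumRows reports how many rows the batch holds.
func (b Batch) NumRows() int { return len(b.Timestamps) }

// Predicate is a hypothetical filter over the timestamp column.
type Predicate func(ts int64) bool

// Reader is a hypothetical columnar reader over in-memory columns.
type Reader struct {
	timestamps []int64
	messages   []string
	pred       Predicate
	pos        int
}

// NewReader builds a reader; a nil predicate matches every row.
func NewReader(ts []int64, msgs []string, pred Predicate) *Reader {
	return &Reader{timestamps: ts, messages: msgs, pred: pred}
}

// Read returns the next batch with at most batchSize matching rows.
// It returns io.EOF once no matching rows remain.
func (r *Reader) Read(batchSize int) (Batch, error) {
	if batchSize <= 0 {
		return Batch{}, errors.New("batch size must be positive")
	}
	var out Batch
	for r.pos < len(r.timestamps) && out.NumRows() < batchSize {
		i := r.pos
		r.pos++
		if r.pred != nil && !r.pred(r.timestamps[i]) {
			continue // row filtered out by the predicate
		}
		out.Timestamps = append(out.Timestamps, r.timestamps[i])
		out.Messages = append(out.Messages, r.messages[i])
	}
	if out.NumRows() == 0 {
		return Batch{}, io.EOF
	}
	return out, nil
}

// ReadAll drains the reader, collecting every batch until io.EOF.
func ReadAll(r *Reader, batchSize int) ([]Batch, error) {
	var batches []Batch
	for {
		b, err := r.Read(batchSize)
		if errors.Is(err, io.EOF) {
			return batches, nil
		}
		if err != nil {
			return nil, err
		}
		batches = append(batches, b)
	}
}

func main() {
	r := NewReader(
		[]int64{10, 20, 30, 40, 50},
		[]string{"a", "b", "c", "d", "e"},
		func(ts int64) bool { return ts >= 20 },
	)
	batches, err := ReadAll(r, 2)
	if err != nil {
		panic(err)
	}
	for _, b := range batches {
		fmt.Println(b.NumRows(), b.Timestamps, b.Messages)
	}
}
```

The sketch applies the predicate before counting rows toward the batch size, so a batch is only short when the input runs out. Whether the real readers behave the same way is not something this description states.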
The columnar reading APIs emit a sequence of `arrow.Record`s, where each record holds at most as many rows as the batch size passed to the `Reader.Read` method.
Unlike the original row-based readers, the columnar readers:
To represent a column's value for use in predicates, we import `github.com/apache/arrow-go/v18/arrow/scalar`. Section Reader implementations map to and from the internal `dataset.Value`.
The implementation of `logs.Reader` is an almost identical copy of the `streams.Reader` implementation. I've opted to duplicate the implementation in the short term, since there's no obvious way to deduplicate it that keeps the package structure easy to understand. We can change this in the future if a clean and simple approach presents itself.
This PR does not yet update `DataObjScan` to make use of these new APIs; that is out of scope here and left for another PR.
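For reviewers who want a picture of the scalar mapping mentioned above, here is a minimal, stdlib-only sketch of converting between a public scalar type and an internal tagged value. All names are hypothetical stand-ins: `Scalar` and its variants play the role of the arrow `scalar` types, `Value` plays the role of the internal `dataset.Value`, and `ToValue` / `FromValue` are illustrative helpers, not functions from this PR.

```go
package main

import "fmt"

// Scalar is a hypothetical stand-in for a public predicate scalar.
type Scalar interface{ isScalar() }

type NullScalar struct{}
type Int64Scalar struct{ Value int64 }
type StringScalar struct{ Value string }

func (NullScalar) isScalar()   {}
func (Int64Scalar) isScalar()  {}
func (StringScalar) isScalar() {}

// Kind tags which field of Value is meaningful.
type Kind int

const (
	KindNull Kind = iota
	KindInt64
	KindBytes
)

// Value is a hypothetical stand-in for an internal value representation.
type Value struct {
	Kind  Kind
	Int64 int64
	Bytes []byte
}

// ToValue maps a public scalar to the internal representation.
func ToValue(s Scalar) (Value, error) {
	switch s := s.(type) {
	case NullScalar:
		return Value{Kind: KindNull}, nil
	case Int64Scalar:
		return Value{Kind: KindInt64, Int64: s.Value}, nil
	case StringScalar:
		return Value{Kind: KindBytes, Bytes: []byte(s.Value)}, nil
	default:
		return Value{}, fmt.Errorf("unsupported scalar type %T", s)
	}
}

// FromValue maps an internal value back to a public scalar.
func FromValue(v Value) (Scalar, error) {
	switch v.Kind {
	case KindNull:
		return NullScalar{}, nil
	case KindInt64:
		return Int64Scalar{Value: v.Int64}, nil
	case KindBytes:
		return StringScalar{Value: string(v.Bytes)}, nil
	default:
		return nil, fmt.Errorf("unsupported value kind %d", v.Kind)
	}
}

func main() {
	v, err := ToValue(StringScalar{Value: "app=loki"})
	if err != nil {
		panic(err)
	}
	s, err := FromValue(v)
	if err != nil {
		panic(err)
	}
	fmt.Printf("%#v\n", s)
}
```

Keeping the conversion at the section reader boundary, as in this sketch, means callers only see the public scalar type while the internal representation stays free to change.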