Skip UTF8 to UTF16 conversion during document indexing #126492

Merged on Jun 6, 2025 (57 commits)
Commits
2153322
Prototype avoid UTF8 to UTF16 conversion
jordan-powers Apr 7, 2025
eaa66bb
Rename to ESBytesRef
jordan-powers Apr 8, 2025
c5d71f7
Apply spotless
jordan-powers Apr 8, 2025
a277aea
Some cleanup and comments
jordan-powers Apr 8, 2025
74822c2
Remove unnecessary throws IOException
jordan-powers Apr 8, 2025
a9ee991
Fix missing bytesValue call
jordan-powers Apr 8, 2025
ae432e1
Fix subsequent calls to parser.getText() after a call to parser.getVa…
jordan-powers Apr 8, 2025
3d1bec4
Use cached stringEnd on subsequent calls to getValueAsByteRef()
jordan-powers Apr 8, 2025
702ada2
Spotless
jordan-powers Apr 8, 2025
e2ebb92
Add textRefOrNull to DotExpandingXContentParser
jordan-powers Apr 8, 2025
4a6fe60
Add textRefOrNull() override to MultiFieldParser
jordan-powers Apr 8, 2025
240fbdb
Avoid cloning ByteSourceJsonBootstrapper
jordan-powers Apr 15, 2025
c1affcf
Rename ESBytesRef to XBytesRef
jordan-powers Apr 15, 2025
ddf7495
Merge remote-tracking branch 'upstream/main' into prototype-skip-utf16
jordan-powers Apr 15, 2025
e285349
Merge remote-tracking branch 'upstream/main' into prototype-skip-utf16
jordan-powers Apr 16, 2025
b0f3336
Add tests for ESJsonFactory
jordan-powers Apr 16, 2025
f2f106f
Add tests for ESUTF8StreamJsonParser
jordan-powers Apr 16, 2025
b0f701c
Merge remote-tracking branch 'upstream/main' into prototype-skip-utf16
jordan-powers Apr 16, 2025
20616e6
Move RawString class into separate file and rename to EncodedString
jordan-powers Apr 17, 2025
1f66ff2
Merge remote-tracking branch 'upstream/main' into prototype-skip-utf16
jordan-powers Apr 17, 2025
fd4ec6c
Merge remote-tracking branch 'upstream/main' into prototype-skip-utf16
jordan-powers Apr 18, 2025
8913ca5
Combine XBytesRef and EncodedString into XContentString
jordan-powers Apr 22, 2025
082ffeb
Merge remote-tracking branch 'upstream/main' into prototype-skip-utf16
jordan-powers Apr 22, 2025
3526556
Add missing override for new xContentText()
jordan-powers Apr 22, 2025
0ca2b60
Fix override in DotExpandingXContentParser
jordan-powers Apr 23, 2025
c12f1b1
Merge remote-tracking branch 'upstream/main' into prototype-skip-utf16
jordan-powers Apr 23, 2025
9c88362
Fix GeoPointFieldMapper geohash
jordan-powers Apr 24, 2025
0d7ff66
Merge remote-tracking branch 'upstream/main' into prototype-skip-utf16
jordan-powers Apr 24, 2025
d413cd3
Add some more tests
jordan-powers Apr 28, 2025
deefb81
Split Text and BytesReference and move base api to libs/core
jordan-powers Apr 28, 2025
681ce38
Use new BaseText and BaseBytesReference types
jordan-powers Apr 28, 2025
b31c2e0
Merge remote-tracking branch 'upstream/main' into prototype-skip-utf16
jordan-powers Apr 28, 2025
603af45
Implement TODO UTF8 to UTF16 conversion
jordan-powers Apr 28, 2025
68d6c2d
Rename xContentText to optimizedText
jordan-powers Apr 30, 2025
1364683
Merge remote-tracking branch 'upstream/main' into prototype-skip-utf16
jordan-powers Apr 30, 2025
70202da
Rename xContentText in tests too
jordan-powers Apr 30, 2025
b3f4e04
Revert "Split Text and BytesReference and move base api to libs/core"
jordan-powers May 1, 2025
a40bee3
Move Text to :libs:x-content
jordan-powers May 1, 2025
84921aa
Use Text instead of XContentString
jordan-powers May 1, 2025
dbbdbb1
Merge remote-tracking branch 'upstream/main' into prototype-skip-utf16
jordan-powers May 1, 2025
9380a0b
Fix missed reference to XContentString
jordan-powers May 1, 2025
ca03f87
Fix CI
jordan-powers May 5, 2025
180078c
Merge remote-tracking branch 'upstream/main' into prototype-skip-utf16
jordan-powers May 5, 2025
b3a0bbf
Merge remote-tracking branch 'upstream/main' into prototype-skip-utf16
jordan-powers May 8, 2025
8e36b5a
Rename test in BaseXContentTestCase to match
jordan-powers May 8, 2025
fb2394f
Update optimizedText to return XContentString interface
jordan-powers May 8, 2025
b9dc1da
Fix renamed length to stringLength
jordan-powers May 8, 2025
c38ff8a
Add OptimizedTextBenchmark
jordan-powers May 9, 2025
6c6b11e
Merge remote-tracking branch 'upstream/main' into prototype-skip-utf16
jordan-powers May 9, 2025
de33cb6
Merge remote-tracking branch 'upstream/main' into prototype-skip-utf16
jordan-powers Jun 4, 2025
d75f180
Use new UTF8Bytes class
jordan-powers Jun 4, 2025
bcb195e
Use unsigned comparison in UTF8Bytes#compareTo
jordan-powers Jun 5, 2025
4c76525
Use encoded value when recording array offsets
jordan-powers Jun 5, 2025
986d1f6
Merge remote-tracking branch 'upstream/main' into prototype-skip-utf16
jordan-powers Jun 5, 2025
d3b9496
Merge remote-tracking branch 'upstream/main' into prototype-skip-utf16
jordan-powers Jun 5, 2025
9b53320
Merge remote-tracking branch 'upstream/main' into prototype-skip-utf16
jordan-powers Jun 6, 2025
91313b5
Merge remote-tracking branch 'upstream/main' into prototype-skip-utf16
jordan-powers Jun 6, 2025
Spotless
jordan-powers committed Apr 8, 2025
commit 702ada2c27e349b39334d89467e92ddba6a71ab1
@@ -46,8 +46,8 @@ public ESUTF8StreamJsonParser(
      */
     public ESBytesRef getValueAsByteRef() throws IOException {
         if (_currToken == JsonToken.VALUE_STRING && _tokenIncomplete) {
-            if(stringEnd > 0) {
-                return new ESBytesRef(_inputBuffer, _inputPtr, stringEnd-1);
+            if (stringEnd > 0) {
+                return new ESBytesRef(_inputBuffer, _inputPtr, stringEnd - 1);
             }
             return _finishAndReturnByteRef();
         }
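The hunk above only changes formatting, but the method it touches shows the idea behind the PR: return a reference to the string's UTF-8 bytes inside the parser's input buffer instead of decoding them into a UTF-16 `String`. The following is a minimal sketch of that idea, not the PR's implementation. `ByteSlice` and `Utf8SliceSketch.sliceString` are hypothetical names introduced for illustration. The sketch assumes the whole document already sits in one byte array, and it simply returns `null` when the string contains a backslash escape, where a real parser would presumably fall back to full decoding.

```java
import java.nio.charset.StandardCharsets;

/** Hypothetical stand-in for a bytes reference: a view over [start, end) of a shared buffer. */
final class ByteSlice {
    final byte[] buffer;
    final int start;
    final int end; // exclusive

    ByteSlice(byte[] buffer, int start, int end) {
        this.buffer = buffer;
        this.start = start;
        this.end = end;
    }

    /** Length in bytes, not in UTF-16 chars. */
    int length() {
        return end - start;
    }

    /** Decodes to a String only when a caller actually asks for one. */
    String decode() {
        return new String(buffer, start, end - start, StandardCharsets.UTF_8);
    }
}

final class Utf8SliceSketch {
    /**
     * Scans a JSON string value whose opening quote is at index openQuote and
     * returns a slice over its raw UTF-8 bytes without copying or decoding.
     * Returns null if the value contains an escape or is unterminated.
     */
    static ByteSlice sliceString(byte[] json, int openQuote) {
        for (int i = openQuote + 1; i < json.length; i++) {
            byte b = json[i];
            if (b == '"') {
                // Bytes of multi-byte UTF-8 sequences are all >= 0x80,
                // so they can never be mistaken for the closing quote.
                return new ByteSlice(json, openQuote + 1, i);
            }
            if (b == '\\') {
                return null; // escapes cannot be represented as a plain slice
            }
        }
        return null; // no closing quote found
    }
}
```

A caller that only needs the bytes (for example, to hand them to an indexer) never pays for the UTF-16 conversion; `decode()` is there for the callers that still want a `String`.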