Add parse_sql_with_offsets to preserve original source text #2089
+249
−2
Overview
This PR introduces a new API method, Parser::parse_sql_with_offsets(), that returns parsed statements along with byte offsets into the original source string.
Motivation
I'm using Parser::parse_sql to parse an arbitrary number of statements. Based on the type of statement, I need to handle execution differently. However, the canonical representation of the Statement includes uppercase type names (in most cases), which don't work as a query for ClickHouse since ClickHouse uses case-sensitive type names: for example, Nullable(Float64) vs Nullable(FLOAT64).
parse_sql_with_offsets returns Vec<(Statement, SourceOffset)>, where SourceOffset::start() and ::end() return the byte offsets of the statement from the original query, allowing me to recover the original source of the statement in question.
Alternatives
This seems like it would only be useful while work on #1548 is not yet complete, so it's totally reasonable if you'd prefer this PR not to be merged.
Implementation details
- SourceOffset type to track byte positions in source text
- Parser::parse_sql_with_offsets() public API method
- Parser::parse_statements_with_offsets() internal method
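To illustrate the recovery step described under Motivation, here is a minimal, dependency-free sketch of slicing the original query by byte offsets. It is not the code from this PR: the SourceOffset struct below is a hypothetical stand-in that only mirrors the start() and end() accessors named above, and split_with_offsets, original_text and recover_statements are illustrative helpers that split naively on `;` instead of calling the real parser (so semicolons inside string literals or comments are not handled).

```rust
/// Hypothetical stand-in for the PR's `SourceOffset`: a half-open byte
/// range `[start, end)` into the original SQL string.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
struct SourceOffset {
    start: usize,
    end: usize,
}

impl SourceOffset {
    fn start(&self) -> usize {
        self.start
    }

    fn end(&self) -> usize {
        self.end
    }
}

/// Naive stand-in for the parser (assumption, for illustration only):
/// splits on `;` and records the byte range of each non-empty statement
/// with surrounding whitespace trimmed. Quoting and comments are ignored.
fn split_with_offsets(sql: &str) -> Vec<SourceOffset> {
    let mut offsets = Vec::new();
    let mut seg_start = 0;
    // Append a sentinel `;` at the end so the last statement is flushed too.
    let sentinel = std::iter::once((sql.len(), ';'));
    for (i, ch) in sql.char_indices().chain(sentinel) {
        if ch != ';' {
            continue;
        }
        let seg = &sql[seg_start..i];
        let trimmed = seg.trim();
        if !trimmed.is_empty() {
            // Number of leading whitespace bytes skipped by trimming.
            let lead = seg.len() - seg.trim_start().len();
            let start = seg_start + lead;
            offsets.push(SourceOffset {
                start,
                end: start + trimmed.len(),
            });
        }
        // `;` is one byte in UTF-8, so `i + 1` is a valid char boundary.
        seg_start = i + 1;
    }
    offsets
}

/// Recovers the statement text exactly as written, preserving the
/// original casing (e.g. `Float64` rather than `FLOAT64`).
fn original_text(sql: &str, offset: SourceOffset) -> &str {
    &sql[offset.start()..offset.end()]
}

/// Convenience wrapper: the original text of every statement in `sql`.
fn recover_statements(sql: &str) -> Vec<&str> {
    split_with_offsets(sql)
        .into_iter()
        .map(|offset| original_text(sql, offset))
        .collect()
}
```

With the API proposed here, the offsets would presumably come from the returned Vec<(Statement, SourceOffset)> rather than from a splitter, while the slicing step would stay the same: the parsed Statement decides how to handle execution, and the sliced text is what gets sent to ClickHouse.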