Consolidate MapAccess, and Subscript into CompoundExpr to handle the complex field access chain #1551

Merged
merged 34 commits into from
Dec 22, 2024
Commits
192cab4
v1 tmp
goldmedal Nov 20, 2024
6be3c35
remove MapAccess
goldmedal Nov 25, 2024
0e916dd
fix fmt
goldmedal Nov 25, 2024
767b531
remove debug message
goldmedal Nov 25, 2024
22f4e67
Merge branch 'main' into feature/1533-dereference-expr-v3
goldmedal Nov 27, 2024
dee8b40
fix span test
goldmedal Nov 27, 2024
fc1cd59
introduce CompoundExpr
goldmedal Dec 4, 2024
8590896
Merge branch 'main' into feature/1533-dereference-expr-v3
goldmedal Dec 4, 2024
4ad37d8
fix merge conflict
goldmedal Dec 4, 2024
31a1e74
replace subscript with compound expr
goldmedal Dec 4, 2024
0355290
fix snowflake syntax
goldmedal Dec 4, 2024
1de9b21
limit the access chain supported dialect
goldmedal Dec 4, 2024
2a32b9f
fmt
goldmedal Dec 4, 2024
495d1b3
enhance doc and fix the name
goldmedal Dec 4, 2024
e7b55be
fix typo
goldmedal Dec 4, 2024
6652905
Merge branch 'main' into feature/1533-dereference-expr-v3
goldmedal Dec 9, 2024
47a5da1
update doc
goldmedal Dec 9, 2024
b58e50c
update doc and rename AccessExpr
goldmedal Dec 9, 2024
7cb2e00
remove unused crate
goldmedal Dec 9, 2024
397335a
update the out date doc
goldmedal Dec 9, 2024
ac25e5d
remove unused parsing
goldmedal Dec 9, 2024
a08e5c2
rename to `CompoundFieldAccess`
goldmedal Dec 9, 2024
09b39eb
rename chain and display AccessExpr by itself
goldmedal Dec 9, 2024
8968fcc
rename `parse_compound_expr`
goldmedal Dec 9, 2024
1328274
fmt and clippy
goldmedal Dec 9, 2024
d6743e9
fix doc
goldmedal Dec 9, 2024
7d030c1
remove unnecessary check
goldmedal Dec 16, 2024
90e03eb
improve the doc
goldmedal Dec 16, 2024
5c54d1b
remove the unused method `parse_map_access`
goldmedal Dec 16, 2024
57830e2
avoid the unnecessary cloning
goldmedal Dec 16, 2024
4b3818c
extract parse outer_join_expr
goldmedal Dec 16, 2024
67cd877
consume LBarcket by `parse_multi_dim_subscript`
goldmedal Dec 16, 2024
23aea03
Merge branch 'main' into feature/1533-dereference-expr-v3
goldmedal Dec 16, 2024
94847d7
fix compile
goldmedal Dec 16, 2024
fix snowflake syntax
goldmedal committed Dec 4, 2024
commit 035529019bfba7df7ad085b09ba8b0028b549885
4 changes: 4 additions & 0 deletions src/dialect/snowflake.rs
@@ -234,6 +234,10 @@ impl Dialect for SnowflakeDialect {
             RESERVED_FOR_IDENTIFIER.contains(&kw)
         }
     }
+
+    fn supports_partiql(&self) -> bool {
+        true
+    }
Comment on lines +238 to +240
Contributor

Ah, I don't think this is necessarily correct, since PartiQL is a Redshift feature. Was this required somewhere?

Contributor Author

I tried to consolidate the conditions at

    } else if dialect_of!(self is SnowflakeDialect) || self.dialect.supports_partiql() {
        self.prev_token();
        self.parse_json_access(expr)

That way, we only need to check `supports_partiql` in `parse_compound_expr`. 🤔

                    if self.consume_token(&Token::LBracket) {
                        if self.dialect.supports_partiql() {
                            ending_lbracket = true;
                            break;
                        } else {
                            self.parse_multi_dim_subscript(&mut chain)?
                        }
                    }

Indeed, the name is a little odd for Snowflake, but I think the two mean the same thing 🤔
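
For readers following this thread, the capability-flag idea can be sketched in isolation. This is a toy model and not the crate's code: the `Dialect` trait here has a single method, and `GenericLike`, `SnowflakeLike`, and `bracket_handler` are hypothetical names introduced only for illustration. It assumes the flag defaults to `false` and that a dialect opts in by overriding it.

```rust
/// Simplified stand-in for a dialect trait: the capability flag defaults to
/// `false`, and a dialect opts in by overriding the method.
trait Dialect {
    fn supports_partiql(&self) -> bool {
        false
    }
}

/// Hypothetical dialect that keeps the default (no JSON-path syntax).
struct GenericLike;

/// Hypothetical dialect that opts in, as the Snowflake change in this commit does.
struct SnowflakeLike;

impl Dialect for GenericLike {}

impl Dialect for SnowflakeLike {
    fn supports_partiql(&self) -> bool {
        true
    }
}

/// Hypothetical helper: names the path a parser would take when it sees `[`
/// after a field name. With one flag there is no need for a separate
/// per-dialect check at the call site.
fn bracket_handler(dialect: &dyn Dialect) -> &'static str {
    if dialect.supports_partiql() {
        "json_access"
    } else {
        "multi_dim_subscript"
    }
}
```

With this shape, the call site consults only the flag, which is the consolidation described in the reply above; whether the flag's name fits Snowflake is a separate naming question.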

}

/// Parse snowflake create table statement.
75 changes: 50 additions & 25 deletions src/parser/mod.rs
@@ -1148,17 +1148,29 @@ impl<'a> Parser<'a> {
 Token::Period => self.parse_compound_expr(Expr::Identifier(w.to_ident(w_span)), vec![]),
 Token::LParen => {
     let id_parts = vec![w.to_ident(w_span)];
-    let mut expr = self.parse_function(ObjectName(id_parts))?;
-    // consume all period if it's a method chain
-    if self.dialect.supports_methods() {
-        expr = self.try_parse_method(expr)?
-    }
-    let mut fields = vec![];
-    // if the function returns an array, it can be subscripted
-    if self.consume_token(&Token::LBracket) {
-        self.parse_multi_dim_subscript(&mut fields)?;
+    // parse_comma_outer_join is used to parse the following pattern:
+    if dialect_of!(self is SnowflakeDialect | MsSqlDialect)
+        && self.consume_tokens(&[Token::LParen, Token::Plus, Token::RParen])
+    {
+        Ok(Expr::OuterJoin(Box::new(
+            match <[Ident; 1]>::try_from(id_parts) {
+                Ok([ident]) => Expr::Identifier(ident),
+                Err(parts) => Expr::CompoundIdentifier(parts),
+            },
+        )))
+    } else {
+        let mut expr = self.parse_function(ObjectName(id_parts))?;
+        // consume all period if it's a method chain
+        if self.dialect.supports_methods() {
Contributor

`try_parse_method` already does the `if self.dialect.supports_methods()` check, so we should be able to skip the `if` condition here?

Contributor Author

No, if this condition were removed, the method-parsing tests would fail:

    parse_method_expr
    parse_method_select

Because the subsequent `parse_compound_expr` call will try to consume all the dots in an expression, we need to parse the method here to avoid consuming `(` in `parse_compound_expr`.

Contributor

Oh, to clarify, what I meant was that `try_parse_method` already does this:

if !self.dialect.supports_methods() {
	return Ok(expr);
}

So this can be simplified as follows (i.e. without the extra `if self.dialect.supports_methods()`):

expr = self.try_parse_method(expr)?;

Contributor Author

Indeed, that's simpler. Thanks!
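
The point settled in this thread is that an early-return guard inside the callee makes the caller's check redundant. A minimal sketch, assuming a toy parser: `ToyParser`, its fields, and the `.m()` suffix are hypothetical and only mimic the shape of `try_parse_method`.

```rust
/// Toy parser state; not the crate's `Parser`.
struct ToyParser {
    /// Stand-in for `self.dialect.supports_methods()`.
    supports_methods: bool,
    /// Number of pending `.m()` calls left in the toy input.
    pending_methods: usize,
}

impl ToyParser {
    /// Toy analogue of `try_parse_method`: returns the expression unchanged
    /// when the dialect has no method syntax.
    fn try_parse_method(&mut self, expr: String) -> Result<String, String> {
        if !self.supports_methods {
            return Ok(expr);
        }
        let mut out = expr;
        while self.pending_methods > 0 {
            self.pending_methods -= 1;
            out.push_str(".m()");
        }
        Ok(out)
    }

    /// Caller with the extra outer check, as in the diff under review.
    fn parse_guarded(&mut self, mut expr: String) -> Result<String, String> {
        if self.supports_methods {
            expr = self.try_parse_method(expr)?;
        }
        Ok(expr)
    }

    /// Caller without the outer check, as suggested by the reviewer.
    fn parse_unguarded(&mut self, expr: String) -> Result<String, String> {
        self.try_parse_method(expr)
    }
}
```

In this sketch both callers return the same result for either flag value, so the outer `if` adds nothing once the guard lives in the callee.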

+            expr = self.try_parse_method(expr)?
+        }
+        let mut fields = vec![];
+        // if the function returns an array, it can be subscripted
+        if self.consume_token(&Token::LBracket) {
+            self.parse_multi_dim_subscript(&mut fields)?;
+        }
+        self.parse_compound_expr(expr, fields)
     }
-    self.parse_compound_expr(expr, fields)
 }
 Token::LBracket => {
     let _ = self.consume_token(&Token::LBracket);
@@ -1420,15 +1432,21 @@ impl<'a> Parser<'a> {
     mut chain: Vec<AccessField>,
 ) -> Result<Expr, ParserError> {
     let mut ending_wildcard: Option<TokenWithSpan> = None;
+    let mut ending_lbracket = false;
     while self.consume_token(&Token::Period) {
         let next_token = self.next_token();
         match next_token.token {
             Token::Word(w) => {
                 let expr = Expr::Identifier(w.to_ident(next_token.span));
                 chain.push(AccessField::Expr(expr));
-                if self.consume_token(&Token::LBracket) && !self.dialect.supports_partiql() {
-                    self.parse_multi_dim_subscript(&mut chain)?
-                };
+                if self.consume_token(&Token::LBracket) {
+                    if self.dialect.supports_partiql() {
+                        ending_lbracket = true;
+                        break;
+                    } else {
+                        self.parse_multi_dim_subscript(&mut chain)?
+                    }
+                }
             }
             Token::Mul => {
                 // Postgres explicitly allows funcnm(tablenm.*) and the
@@ -1450,6 +1468,12 @@ impl<'a> Parser<'a> {
         }
     }

+    // if dialect supports partiql, we need to go back one Token::LBracket for the JsonAccess parsing
+    if self.dialect.supports_partiql() && ending_lbracket {
+        self.prev_token();
+    }
+
+
     if let Some(wildcard_token) = ending_wildcard {
         let Some(id_parts) = Self::exprs_to_idents(&root, &chain) else {
             return self.expected("an identifier or a '*' after '.'", self.peek_token());
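
The two hunks above consume a `[`, remember that they did so, and then step back one token so a later JSON-access parser sees the bracket again. A minimal sketch of that consume-then-rewind pattern, assuming a toy token cursor: `Tok`, `Cursor`, and `parse_fields` are hypothetical names, and the subscript handling is reduced to skipping a single `[...]`.

```rust
/// Toy token type; not the crate's `Token`.
#[derive(Debug, Clone, PartialEq)]
enum Tok {
    Word(&'static str),
    Period,
    LBracket,
    RBracket,
    Num(i64),
}

/// Toy cursor over a token list.
struct Cursor {
    tokens: Vec<Tok>,
    pos: usize,
}

impl Cursor {
    /// Advances past `expected` if it is the next token.
    fn consume(&mut self, expected: &Tok) -> bool {
        if self.tokens.get(self.pos) == Some(expected) {
            self.pos += 1;
            true
        } else {
            false
        }
    }

    /// Steps back one token.
    fn prev_token(&mut self) {
        self.pos = self.pos.saturating_sub(1);
    }

    /// Collects `.field` segments. When `json_path_dialect` is set, parsing
    /// stops at `[` and the bracket is un-consumed for a later parser.
    fn parse_fields(&mut self, json_path_dialect: bool) -> Vec<&'static str> {
        let mut fields = Vec::new();
        let mut ending_lbracket = false;
        while self.consume(&Tok::Period) {
            match self.tokens.get(self.pos).cloned() {
                Some(Tok::Word(w)) => {
                    self.pos += 1;
                    fields.push(w);
                    if self.consume(&Tok::LBracket) {
                        if json_path_dialect {
                            ending_lbracket = true;
                            break;
                        }
                        // Simplification: skip everything up to and including `]`.
                        while let Some(tok) = self.tokens.get(self.pos) {
                            let done = *tok == Tok::RBracket;
                            self.pos += 1;
                            if done {
                                break;
                            }
                        }
                    }
                }
                _ => break,
            }
        }
        if ending_lbracket {
            self.prev_token();
        }
        fields
    }
}
```

For the token stream of `.a[0].b`, the JSON-path branch in this sketch stops after `a` with the cursor back on `[`, while the other branch walks through the subscript and also collects `b`.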
@@ -3075,11 +3099,9 @@ impl<'a> Parser<'a> {
 {
     let mut chain = vec![];
     self.parse_multi_dim_subscript(&mut chain)?;
-    Ok(Expr::CompoundExpr {
-        root: Box::new(expr),
-        chain,
-    })
-} else if dialect_of!(self is SnowflakeDialect) || self.dialect.supports_partiql() {
+    self.parse_compound_expr(expr, chain)
+
+} else if self.dialect.supports_partiql() {
     self.prev_token();
     self.parse_json_access(expr)
 } else {
@@ -3266,14 +3288,17 @@ impl<'a> Parser<'a> {
 pub fn parse_map_access(&mut self, expr: Expr) -> Result<Expr, ParserError> {
     let key = self.parse_expr()?;
     let result = match key {
Contributor

Hmm, I was initially wondering about this match, since it looked incorrect to restrict the key expression type (e.g. this BigQuery test case). But then I realised that the test case passes because it now takes a different code path via `parse_compound_field_access`. So I wonder whether any scenarios still rely on this method; if there aren't any, it seems we might be able to remove it entirely?

Contributor Author

You're right. We can remove it entirely.

Contributor Author

fixed by 5c54d1b
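
As background for why a separate map-access path becomes unnecessary, the consolidation in this PR's title can be sketched as a single node holding a root expression plus an ordered access chain. This is a toy model under stated assumptions: `ToyExpr`, `Access`, and their variants are hypothetical names and do not reproduce the crate's `Expr` or `AccessField` definitions.

```rust
use std::fmt;

/// Toy expression type; not the crate's `Expr`.
#[derive(Debug, Clone, PartialEq)]
enum ToyExpr {
    Ident(String),
    Number(i64),
    Str(String),
    /// A root expression followed by an ordered chain of accesses.
    Compound { root: Box<ToyExpr>, chain: Vec<Access> },
}

/// One link in the chain: either `.field` or `[key]`.
#[derive(Debug, Clone, PartialEq)]
enum Access {
    Dot(ToyExpr),
    Subscript(ToyExpr),
}

impl fmt::Display for Access {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            Access::Dot(e) => write!(f, ".{}", e),
            Access::Subscript(e) => write!(f, "[{}]", e),
        }
    }
}

impl fmt::Display for ToyExpr {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            ToyExpr::Ident(name) => write!(f, "{}", name),
            ToyExpr::Number(n) => write!(f, "{}", n),
            ToyExpr::Str(s) => write!(f, "'{}'", s),
            ToyExpr::Compound { root, chain } => {
                write!(f, "{}", root)?;
                // Each link prints itself, so mixed chains need no special cases.
                for access in chain {
                    write!(f, "{}", access)?;
                }
                Ok(())
            }
        }
    }
}
```

Because dotted fields, array subscripts, and string-keyed lookups are all just links in one chain here, a mixed expression such as `a.b[1]['k'].c` needs only one node type.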

-        Expr::Identifier(ident) => Ok(Expr::CompositeAccess {
-            expr: Box::new(expr),
-            key: ident,
+        Expr::Identifier(_) => Ok(Expr::CompoundExpr {
+            root: Box::new(expr),
+            chain: vec![AccessField::Expr(key)],
         }),
-        Expr::Value(Value::SingleQuotedString(s))
-        | Expr::Value(Value::DoubleQuotedString(s)) => Ok(Expr::CompositeAccess {
-            expr: Box::new(expr),
-            key: Ident::new(s),
+        Expr::Value(Value::SingleQuotedString(_)) => Ok(Expr::CompoundExpr {
+            root: Box::new(expr),
+            chain: vec![AccessField::Expr(key)],
+        }),
+        Expr::Value(Value::DoubleQuotedString(s)) => Ok(Expr::CompoundExpr {
+            root: Box::new(expr),
+            chain: vec![AccessField::Expr(Expr::Identifier(Ident::new(s)))],
         }),
         _ => parser_err!("Expected identifier or string literal", self.peek_token()),
     };