-
Notifications
You must be signed in to change notification settings - Fork 4.8k
HIVE-28280: SemanticException when querying VIEW with DISTINCT clause #6103
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: master
Are you sure you want to change the base?
Conversation
51b5bc8
to
d399943
Compare
case HiveProject hiveProject -> hiveProject; | ||
case SingleRel singleRel when singleRel.getInput() instanceof HiveProject hiveProject -> hiveProject; | ||
default -> throw new SemanticException("View " + subqAlias + " is corresponding to " | ||
+ relNode + ", rather than a HiveProject or a SingleRel with HiveProject as its child."); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could you make the error message clearer? Maybe something like "Could not obtain a HiveProject from " + relNode
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good idea! I have changed the message in the new commit
+ relNode.toString() + ", rather than a HiveProject."); | ||
HiveProject project = switch (Objects.requireNonNull(relNode)) { | ||
case HiveProject hiveProject -> hiveProject; | ||
case SingleRel singleRel when singleRel.getInput() instanceof HiveProject hiveProject -> hiveProject; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Would it make sense to descend into SingleRel
s recursively, until a HiveProject can be found? Maybe the view definition could have, e.g., a Sort(Filter(Project(...))
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actually earlier I tried to produce a plan where the second node is not a Project
but couldn't. But just to be safe I have updated the code to recursively check for the first Project
through the SingleRel
s.
throw new SemanticException("View " + subqAlias + " is corresponding to " | ||
+ relNode.toString() + ", rather than a HiveProject."); | ||
HiveProject project = switch (Objects.requireNonNull(relNode)) { | ||
case HiveProject hiveProject -> hiveProject; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Minor: Sonar warns about wrong indentation for the case
s.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah I noticed that but there's no good solution to it. In the latest commit it shouldn't complain as the code is:
private Optional<HiveProject> extractFirstProject(RelNode rel) {
return switch (rel) {
case HiveProject hiveProject -> Optional.of(hiveProject);
case SingleRel sr -> extractFirstProject(sr.getInput());
case null, default -> Optional.empty();
};
}
However, I would have liked to see the cases indented to:
private Optional<HiveProject> extractFirstProject(RelNode rel) {
return switch (rel) {
case HiveProject hiveProject -> Optional.of(hiveProject);
case SingleRel sr -> extractFirstProject(sr.getInput());
case null, default -> Optional.empty();
};
}
Actually in checkstyle/checkstyle.xml
we have defined the indentation for case
s to be 0
:
<module name="Indentation">
<property name="basicOffset" value="2" />
<property name="caseIndent" value="0" />
<property name="throwsIndent" value="4" />
</module>
which is fine when you have the older switch-case
statement, as it doesn't look too bad:
switch(...) {
case a: ...
case b: ...
}
But when you have a switch expression
, like the example shown at the beginning of this comment (i.e., with a return
), it makes more sense to treat it as a basicOffset
and not a caseIndent
in my opinion. But we cannot simply change the value of caseIndent
as there are older style switch
statements in the repo.
Ideally I would like to have a value of 2
for case indents. Also by default the cases should be indented as seen here: https://checkstyle.sourceforge.io/checks/misc/indentation.html#caseIndent
Sorry for the long write-up, but we shouldn't see checkstyle warnings in the latest commit :)
|
} else { | ||
throw new SemanticException("View " + subqAlias + " is corresponding to " | ||
+ relNode.toString() + ", rather than a HiveProject."); | ||
HiveProject project = extractFirstProject(relNode) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why do we need a project and why the first project is important?
How do we use the project afterwards?
Does the code remain correct if we use a project that is not at the top of the plan?
Do we generally support views with LIMIT
, ORDER BY
, etc?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We use the projects for authorization in HiveRelFieldTrimmer.trimFields
here, and that's the only place where we use it. I believe that is why we need only the first project, that gives us the final list of fields.
I don't think we can use other projects in place of the top one, as the fields could be different.
I am not really sure if we officially support views with limits or order by, but right now it looks like we can support them.
What changes were proposed in this pull request?
Earlier we expected a view's logical plan to always have a
HiveProject
as its top node. But we can haveHiveSortLimit
too if the original view definition has a limit.We can add a case for
SingleRel
and check that itsgetInput()
returns aHiveProject
, and then the code can work as before.Why are the changes needed?
We get an error as described in https://issues.apache.org/jira/browse/HIVE-28280, https://issues.apache.org/jira/browse/HIVE-21163
Does this PR introduce any user-facing change?
No
How was this patch tested?