Conversation

KiranVelumuri
Contributor

What changes were proposed in this pull request?

Log a warning instead of an error with a stack trace when KafkaDagCredentialSupplier is not available on the classpath.

Why are the changes needed?

The stack trace of the ClassNotFoundException for KafkaDagCredentialSupplier is printed even in scenarios where it is not relevant (the job might not be Kafka-related or might not need a credential supplier).

Does this PR introduce any user-facing change?

No

How was this patch tested?

mvn test -Dtest=TestBeeLineWithArgs#testRowsAffected

github-actions bot commented Sep 17, 2025

@check-spelling-bot Report

🔴 Please review

See the files view or the action log for details.

Unrecognized words (6)

calcualtion
Chrono
getenv
ntz
OOM
unsign

Previously acknowledged words that are now absent

www

To accept these unrecognized words as correct (and remove the previously acknowledged and now absent words), run the following commands

... in a clone of the git@github.com:KiranVelumuri/hive.git repository
on the HIVE-28965 branch:

update_files() {
perl -e '
my @expect_files=qw('".github/actions/spelling/expect.txt"');
@ARGV=@expect_files;
my @stale=qw('"$patch_remove"');
my $re=join "|", @stale;
my $suffix=".".time();
my $previous="";
sub maybe_unlink { unlink($_[0]) if $_[0]; }
while (<>) {
if ($ARGV ne $old_argv) { maybe_unlink($previous); $previous="$ARGV$suffix"; rename($ARGV, $previous); open(ARGV_OUT, ">$ARGV"); select(ARGV_OUT); $old_argv = $ARGV; }
next if /^(?:$re)(?:(?:\r|\n)*$| .*)/; print;
}; maybe_unlink($previous);'
perl -e '
my $new_expect_file=".github/actions/spelling/expect.txt";
use File::Path qw(make_path);
use File::Basename qw(dirname);
make_path (dirname($new_expect_file));
open FILE, q{<}, $new_expect_file; chomp(my @words = <FILE>); close FILE;
my @add=qw('"$patch_add"');
my %items; @items{@words} = @words x (1); @items{@add} = @add x (1);
@words = sort {lc($a)."-".$a cmp lc($b)."-".$b} keys %items;
open FILE, q{>}, $new_expect_file; for my $word (@words) { print FILE "$word\n" if $word =~ /\w/; };
close FILE;
system("git", "add", $new_expect_file);
'
}

comment_json=$(mktemp)
curl -L -s -S \
-H "Content-Type: application/json" \
"https://api.github.com/repos/apache/hive/issues/comments/3302999994" > "$comment_json"
comment_body=$(mktemp)
jq -r ".body // empty" "$comment_json" > $comment_body
rm $comment_json

patch_remove=$(perl -ne 'next unless s{^</summary>(.*)</details>$}{$1}; print' < "$comment_body")

patch_add=$(perl -e '$/=undef; $_=<>; if (m{Unrecognized words[^<]*</summary>\n*```\n*([^<]*)```\n*</details>$}m) { print "$1" } elsif (m{Unrecognized words[^<]*\n\n((?:\w.*\n)+)\n}m) { print "$1" };' < "$comment_body")

update_files
rm $comment_body
git add -u
If the flagged items do not appear to be text

If items relate to a ...

  • well-formed pattern.

    If you can write a pattern that would match it,
    try adding it to the patterns.txt file.

    Patterns are Perl 5 Regular Expressions - you can test yours before committing to verify it will match your lines.

    Note that patterns can't match multiline strings.

  • binary file.

    Please add a file path to the excludes.txt file matching the containing file.

    File paths are Perl 5 Regular Expressions - you can test yours before committing to verify it will match your files.

    ^ refers to the file's path from the root of the repository, so ^README\.md$ would exclude README.md (on whichever branch you're using).

      dagSuppliers.add(c.getConstructor().newInstance());
    } catch (ReflectiveOperationException e) {
-     LOG.error("Failed to add credential supplier", e);
+     LOG.warn("Failed to add credential supplier: {}", s);
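
The change in this hunk can be illustrated in isolation. The following is a minimal sketch, not the Hive code: the class name `OptionalSupplierLoader` and the method `loadAvailable` are hypothetical, and `java.util.logging` stands in for the SLF4J logger used in the diff so that the example has no external dependencies. It shows an optional class being loaded reflectively, with a missing class reported as a one-line warning rather than an error with a stack trace.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.logging.Logger;

/** Hypothetical sketch; names are illustrative and not taken from Hive. */
final class OptionalSupplierLoader {
  private static final Logger LOG = Logger.getLogger(OptionalSupplierLoader.class.getName());

  /** Instantiates every class that is on the classpath and skips the rest. */
  static List<Object> loadAvailable(List<String> classNames) {
    List<Object> loaded = new ArrayList<>();
    for (String name : classNames) {
      try {
        Class<?> c = Class.forName(name);
        loaded.add(c.getConstructor().newInstance());
      } catch (ReflectiveOperationException e) {
        // ClassNotFoundException is a ReflectiveOperationException. A missing
        // optional class is expected here, so only the name is logged and the
        // exception (with its stack trace) is deliberately not passed along.
        LOG.warning("Failed to add credential supplier: " + name);
      }
    }
    return loaded;
  }

  public static void main(String[] args) {
    // java.lang.Object is always present; the second name is assumed to be absent.
    List<Object> loaded = loadAvailable(List.of("java.lang.Object", "com.example.NotOnClasspath"));
    System.out.println(loaded.size());
  }
}
```

Dropping the exception argument keeps the log quiet, at the cost of hiding the cause when the failure is something other than a missing class.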
Member

That's OK, but it would be great to initialize those CredentialSuppliers only when they are used.

Contributor Author

Sure, I will look into this.

@KiranVelumuri
Contributor Author

KiranVelumuri commented Oct 9, 2025

@deniskuzZ We needed a check in DagUtils so that the Kafka credential supplier is added only when needed. So I referred to obtainToken and added the same checks (isTokenRequired) here. Could you please tell me whether this approach is OK?

* @param props the properties from which to obtain the protocol.
* @return the security protocol if one is defined in the properties and null otherwise.
*/
static SecurityProtocol getSecurityProtocol(Properties props) {
Member

I think this functionality should be extracted to a util class like KafkaUtils.

Contributor Author

This was taken from KafkaUtils. I put it here to avoid a dependency on kafka-handler (since that would cause a cyclic dependency).

@deniskuzZ
Member

deniskuzZ commented Oct 9, 2025

In my opinion it's a bit of overkill. We shouldn't copy code from kafka-handler; instead, we could initialize the suppliers lazily, and only when UserGroupInformation.isSecurityEnabled() returns true.
cc @abstractdog

private void getCredentialsFromSuppliers(BaseWork work, Set<TableDesc> tables, DAG dag, JobConf conf) {
    if (!UserGroupInformation.isSecurityEnabled()) {
      return;
    }
    // ..
    for (DagCredentialSupplier supplier : credentialSuppliers.get()) {
      // ....
    }
}

private final Supplier<List<DagCredentialSupplier>> credentialSuppliers;

DagUtils(Supplier<List<DagCredentialSupplier>> credentialSuppliers) {
  this.credentialSuppliers = credentialSuppliers;
}

private static final DagUtils instance = new DagUtils(() -> defaultCredentialSuppliers());

Note: this is only a test classpath issue; in a production deployment the kafka-handler jar would always be present.
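
The lazy approach suggested above can be sketched outside Hive as follows. This is a minimal illustration under stated assumptions, not the proposed patch: the class `LazyCredentialSuppliers`, the `memoize` helper, and `collectCredentials` are hypothetical names; a `BooleanSupplier` stands in for `UserGroupInformation.isSecurityEnabled()`; and plain strings stand in for `DagCredentialSupplier` instances. The memoization is an addition of this sketch so that the loader runs at most once; the snippet above does not say whether `defaultCredentialSuppliers()` caches its result.

```java
import java.util.List;
import java.util.function.BooleanSupplier;
import java.util.function.Supplier;

/** Hypothetical sketch; names are illustrative and not taken from Hive. */
final class LazyCredentialSuppliers {

  /** Wraps a supplier so that the delegate runs at most once, on the first get(). */
  static <T> Supplier<T> memoize(Supplier<T> delegate) {
    return new Supplier<T>() {
      private T value;
      private boolean initialized;

      @Override
      public synchronized T get() {
        if (!initialized) {
          value = delegate.get();
          initialized = true;
        }
        return value;
      }
    };
  }

  // Stand-in for UserGroupInformation.isSecurityEnabled() (assumption of this sketch).
  private final BooleanSupplier securityEnabled;
  // Strings stand in for DagCredentialSupplier instances (assumption of this sketch).
  private final Supplier<List<String>> credentialSuppliers;

  LazyCredentialSuppliers(BooleanSupplier securityEnabled, Supplier<List<String>> loader) {
    this.securityEnabled = securityEnabled;
    this.credentialSuppliers = memoize(loader);
  }

  /** Returns how many suppliers were consulted; the loader is never run when security is off. */
  int collectCredentials() {
    if (!securityEnabled.getAsBoolean()) {
      return 0;
    }
    return credentialSuppliers.get().size();
  }

  public static void main(String[] args) {
    LazyCredentialSuppliers insecure =
        new LazyCredentialSuppliers(() -> false, () -> List.of("kafka"));
    System.out.println(insecure.collectCredentials());
  }
}
```

With security disabled, the loader lambda is never invoked, so a reflective class lookup placed inside it would not run and could not produce log noise in tests.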

@KiranVelumuri
Contributor Author

Yes, @deniskuzZ. This is what I felt too, and I wanted your opinion on it.

@abstractdog
Contributor

in my opinion it's a bit of overkill. We shouldn't copy code from kafka-handler, instead we might initialize suppliers lazily only when UserGroupInformation.isSecurityEnabled cc @abstractdog

private void getCredentialsFromSuppliers(BaseWork work, Set<TableDesc> tables, DAG dag, JobConf conf) {
    if (!UserGroupInformation.isSecurityEnabled()){
      return;
    }
    ..
    lazyInitCredentialSuppliers();
}

Note, this is only a test classpath issue, in production deployment kafka-handler jar would be always present.

The original goal was to completely remove the Kafka dependency from ql, as is the case for every other third party that Hive provides a storage handler for, so we should maintain that behavior.
I agree with trying the lazy approach, skipping initialization when security is disabled, as it would easily solve the test log noise.

@deniskuzZ
Member

i've updated the snippet, @KiranVelumuri please take a look #6081 (comment)

…FoundException: org.apache.hadoop.hive.kafka.KafkaDagCredentialSupplier
@KiranVelumuri
Contributor Author

i've updated the snippet, @KiranVelumuri please take a look #6081 (comment)

@deniskuzZ Thank you for the snippet. Could you please review? Sorry I was away travelling for the past 3 days.

4 participants