
Conversation

vinayvija

When using S3 as the Hadoop file system we encounter a "resource changed on src filesystem" error. The reason, I think, is that in the S3 implementation file timestamps change, which does not happen in HDFS. Timestamp doc
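For illustration: S3A implements rename as copy-plus-delete, so a renamed object's modification time becomes the time of the copy, whereas HDFS preserves it across a rename. A minimal sketch to observe the difference, using a hypothetical path passed on the command line (run it once against an hdfs:// path and once against an s3a:// path):

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class MtimeAfterRename {
  public static void main(String[] args) throws Exception {
    // Hypothetical path, e.g. s3a://some-bucket/tmp/file or hdfs:///tmp/file
    Path src = new Path(args[0]);
    Path dst = new Path(args[0] + ".moved");
    FileSystem fs = src.getFileSystem(new Configuration());
    long before = fs.getFileStatus(src).getModificationTime();
    fs.rename(src, dst);
    long after = fs.getFileStatus(dst).getModificationTime();
    // On HDFS, before == after; on S3A the rename (copy + delete) resets it.
    System.out.println("before=" + before + ", after=" + after);
  }
}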

Tested this by deploying this branch in QA NA. The same jobs that failed with this error passed after this change.

Example error log

Application application_1722669525346_0001 failed 5 times due to AM Container for appattempt_1722669525346_0001_000005 exited with exitCode: -1000
Failing this attempt.Diagnostics: [2024-08-05 14:41:43.796]Resource s3a://hubspot-hadoop-fs-backfill-s3-h3-eu1-qa/mapred/staging/backfill-s3-h3-qa/Wkd60q0QYAePuklPUXszbRFkLCjyB03P/.staging/job_1722669525346_0001/libjars changed on src filesystem - expected: "2024-08-05T14:39:39.188+0000", was: "2024-08-05T14:41:43.775+0000", current time: "2024-08-05T14:41:43.775+0000"
java.io.IOException: Resource s3a://hubspot-hadoop-fs-backfill-s3-h3-eu1-qa/mapred/staging/backfill-s3-h3-qa/Wkd60q0QYAePuklPUXszbRFkLCjyB03P/.staging/job_1722669525346_0001/libjars changed on src filesystem - expected: "2024-08-05T14:39:39.188+0000", was: "2024-08-05T14:41:43.775+0000", current time: "2024-08-05T14:41:43.775+0000"
at org.apache.hadoop.yarn.util.FSDownload.verifyAndCopy(FSDownload.java:282)
at org.apache.hadoop.yarn.util.FSDownload.access$000(FSDownload.java:72)
at org.apache.hadoop.yarn.util.FSDownload$2.run(FSDownload.java:425)
at org.apache.hadoop.yarn.util.FSDownload$2.run(FSDownload.java:422)

@vinayvija vinayvija self-assigned this Aug 5, 2024
// Context: the check in org.apache.hadoop.yarn.util.FSDownload#verifyAndCopy
FileSystem sourceFs = sCopy.getFileSystem(conf);
FileStatus sStat = sourceFs.getFileStatus(sCopy);
if (sStat.getModificationTime() != resource.getTimestamp()) {
  throw new IOException("Resource " + sCopy + " changed on src filesystem" +
      " - expected: \"" + Times.formatISO8601(resource.getTimestamp()) + "\"" +
      ", was: \"" + Times.formatISO8601(sStat.getModificationTime()) + "\"" +
      ", current time: \"" + Times.formatISO8601(Time.now()) + "\"");
}


We talked about this briefly, but outright disabling this check makes me uneasy. It's there for a reason, and we might start introducing unknown behavior by allowing this. S3 does work differently here, but we should make the logic reflect that fact or find a way to make modification time work as expected.


I think the reason it's there is that the timestamp is used as a mechanism for checking that the right jar is in place to run the job and that nothing else has overwritten it.

So if we got rid of the timestamp check and instead did a checksum comparison, it would work better. A sketch of that idea follows.
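A rough sketch, for discussion only: it assumes an expectedChecksum recorded at submit time, which the LocalResource API does not actually carry (it only records a timestamp and a size), and FileSystem.getFileChecksum can return null on stores like S3A unless etag checksums are enabled.

// Hypothetical: compare content checksums instead of modification times.
// `expectedChecksum` is an invented field; LocalResource today records only
// a timestamp and a size, so it would have to be captured at submit time.
FileSystem sourceFs = sCopy.getFileSystem(conf);
FileChecksum actual = sourceFs.getFileChecksum(sCopy);
if (actual == null) {
  // Stores like S3A may not expose a checksum; fall back to the recorded size.
  if (sourceFs.getFileStatus(sCopy).getLen() != resource.getSize()) {
    throw new IOException("Resource " + sCopy + " changed on src filesystem");
  }
} else if (!actual.equals(expectedChecksum)) {
  throw new IOException("Resource " + sCopy + " changed on src filesystem");
}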

@vinayvija vinayvija (Author) Aug 15, 2024


I added a check to skip this validation for S3 file systems. Looking at some Stack Overflow threads, this seems safe.
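The diff itself isn't inlined in this thread, but the kind of guard being described might look roughly like this (a sketch, assuming the check lives where the FSDownload snippet above does):

// Hypothetical sketch: skip the modification-time comparison for S3 schemes,
// where object timestamps are not stable the way HDFS mtimes are.
String scheme = sCopy.toUri().getScheme();
boolean isS3 = "s3".equals(scheme) || "s3a".equals(scheme) || "s3n".equals(scheme);
if (!isS3 && sStat.getModificationTime() != resource.getTimestamp()) {
  throw new IOException("Resource " + sCopy + " changed on src filesystem");
}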

@vinayvija vinayvija (Author)

Tested this in the NA QA backfill-s3 cluster; the job ran fine.

@ddelong ddelong changed the title from Removetscheck to Remove Timestamp Check for S3 Aug 16, 2024

@ddelong ddelong left a comment


This is safer and we can go with this.

@vinayvija vinayvija merged commit f3312a2 into hubspot-3.3.6 Aug 19, 2024