Skip to content

treat sync1.1 errors as warnings #989

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
wants to merge 6 commits into from
Closed

Conversation

brianolson
Copy link
Contributor

No description provided.

Copy link
Collaborator

@bnewbold bnewbold left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

overall looks good, and a couple good little cleanups along the way.

I flagged a couple places we probably don't need warn-versus-error: #sync events aren't emitted by pre-sync1.1 software at all, and DID syntax in #commit events can always be enforced strictly (not a sync1.1 change)


did, err := syntax.ParseDID(msg.Did)
if err != nil {
syncVerifyErrors.WithLabelValues(hostname, "did").Inc()
return nil, err
if !val.Sync11ErrorsAreWarnings {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think all these #sync event checks could be errors-not-warnings. The whole event is new in sync v1.1, so there shouldn't be any problem with old implementations outputting events which don't validate. It is only the #commit events we need to handle with warning-not-error.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this is where I thought the discussion went yesterday, that we have to allow mediocre implementation of sync1.1 for a while

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the main point is to allow PDS instances which don't implement Sync v1.1 at all, not ones which attempt to implement it but mess up.

aka, we want code which was running and working in December 2024 to keep running for a while longer. but code which was implemented last week and implemented sync1.1 wrong can error.

@brianolson brianolson force-pushed the bolson/sync11-error-as-warn branch from d604306 to e5f3163 Compare March 20, 2025 13:59
Comment on lines 409 to 419
commit, err := atrepo.LoadCARCommit(ctx, bytes.NewReader([]byte(msg.Blocks)))
if err != nil {
commitVerifyErrors.WithLabelValues(hostname, "car").Inc()
return nil, err
if !val.Sync11ErrorsAreWarnings {
return nil, err
} else {
logger.Warn("invalid car", "err", err)
}
}

if commit.Rev != rev.String() {
Copy link
Contributor

@devinivy devinivy Mar 20, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If there was an error in loading the CAR, I would expect commit to be nil. If that's the case, falling through on the "invalid car" warn would cause a nil pointer exception here checking the rev.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

fixed (and in commit handler)

Comment on lines 409 to 412
commit, err := atrepo.LoadCARCommit(ctx, bytes.NewReader([]byte(msg.Blocks)))
if err != nil {
commitVerifyErrors.WithLabelValues(hostname, "car").Inc()
if !val.Sync11ErrorsAreWarnings {
if !val.Sync11ErrorsAreWarnings || commit == nil {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would expect that if err != nil that the commit is going to be nil. Is that not the case? I think that means we always have to error if we fail to read the commit from the CAR, there's no case where we'll end-up warning here.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ok, yeah, content and err are mutually exclusive, cleaned up

Copy link
Collaborator

@bnewbold bnewbold left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

there are a few more instances where a check might error, resulting in nil / zero-like values.

especially for the MST inversion case, I think the easiest fix is to split code out to a sub-method which fails or succeeds all together, and then warn-or-error on that overall result.

}
if commit.DID != did.String() {
commitVerifyErrors.WithLabelValues(hostname, "did2").Inc()
return nil, fmt.Errorf("rev did not match commit")
if !val.Sync11ErrorsAreWarnings {
return nil, fmt.Errorf("rev did not match commit")
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

error message should be about DID, not about rev: "message DID and commit DID differed" or something like that

if !val.Sync11ErrorsAreWarnings {
return nil, fmt.Errorf("invalid repo path in ops list: %w", err)
} else {
logger.Warn("invalid repo path", "err", err)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

similar to some of the other branches: if there is a ParseRepoPath error, then nsid and rkey will be empty strings, and the code path below won't work.

I think this could just always be an error: there should be zero or very-near-zero commits floating around the network which would fail this check; not a "Sync v1.1" change.

if !val.Sync11ErrorsAreWarnings {
return nil, err
} else {
logger.Warn("invalid record cid", "err", err)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the warning message should be something like "record CID not found in CAR diff" (not "invalid record cid")

if !val.Sync11ErrorsAreWarnings {
return nil, err
} else {
logger.Warn("invalid commit ops", "err", err)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

as with some of the other code branches, if ParseCommitOps fails, I assume that ops will be nil/empty, and the checks below (like NormalizeOps, and inversion in general) won't make sense.

maybe one way to make this handling simpler would be a "VerifyCommitMessageOps" sub-method. the "Sync11ErrorsAreWarnings" check can be implemented based on the output of that overall function, instead of for each step which might fail.


did, err := syntax.ParseDID(msg.Did)
if err != nil {
syncVerifyErrors.WithLabelValues(hostname, "did").Inc()
return nil, err
if !val.Sync11ErrorsAreWarnings {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the main point is to allow PDS instances which don't implement Sync v1.1 at all, not ones which attempt to implement it but mess up.

aka, we want code which was running and working in December 2024 to keep running for a while longer. but code which was implemented last week and implemented sync1.1 wrong can error.

return nil, err
} else {
logger.Warn("invalid rev", "err", err)
}
}
if rev.Time().After(time.Now().Add(val.maxRevFuture)) {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

rev might be zero-like (empty string) if syntax check above failed

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

in HandleCommit I'd changed it to assume valid DID and rev TID are basic; propagated that here too

if val.Sync11ErrorsAreWarnings {
logger.Warn("invalid car", "err", err)
}
return nil, fmt.Errorf("invalid car, %w", err)
}

if commit.Rev != rev.String() {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

commit will be zero-like here if LoadCARCommit failed above

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it's okay because we do always error out on that error

return nil, fmt.Errorf("rev did not match commit")
} else {
logger.Warn("message rev != commit.rev")
}
}
if commit.DID != did.String() {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

commit will be zero-like here if LoadCARCommit failed above

mostly reorg a couple chains of actions into functions so bail out is easier
@brianolson
Copy link
Contributor Author

ok, I think I cleaned up all the places where we should bail on a sequence of steps if there's an error, bundled things up into two new inner functions to organize that.

@bnewbold bnewbold marked this pull request as draft April 23, 2025 20:27
@bnewbold bnewbold closed this Jul 2, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants