Skip to content

Conversation

prakhar1144
Copy link
Member

@prakhar1144 prakhar1144 force-pushed the encrypt-push-notification branch 3 times, most recently from 0b4f350 to 94cfa73 Compare April 14, 2025 16:41
**On success**, the bouncer replies to the server with `device_id`.
The server stores `device_id` in the `PushDevice`.

**On error**, the server keep retrying for a long time. Like maybe
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not on a liveness failure. (Those don't have a hope of fixing themselves on retry.)

In that case the server should probably then just forget about that push_account_id​, so that the client retries from scratch.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@gnprice gnprice changed the title ZAP: Proposal to add encryption for mobile push notifications. ZAP: Encryption for mobile push notifications Apr 15, 2025
@gnprice
Copy link
Member

gnprice commented Apr 15, 2025

I shortened the title because "proposal" is redundant in the name of a ZAP 🙂

@prakhar1144 prakhar1144 force-pushed the encrypt-push-notification branch from 94cfa73 to aa02161 Compare April 16, 2025 10:11
@prakhar1144 prakhar1144 force-pushed the encrypt-push-notification branch from aa02161 to 57251f2 Compare April 16, 2025 13:23
@prakhar1144
Copy link
Member Author

@prakhar1144 prakhar1144 force-pushed the encrypt-push-notification branch from 57251f2 to 7358a81 Compare April 17, 2025 12:42
@prakhar1144 prakhar1144 force-pushed the encrypt-push-notification branch from 7358a81 to bede1c5 Compare April 17, 2025 17:33
@gnprice
Copy link
Member

gnprice commented Apr 18, 2025

I have a workflow suggestion for this repo:

Instead of using the workflow we use in Zulip's software repos, I'd like to use a squash-merge workflow. That means:

  • The PR branch isn't a draft of the sequence of commits to be pushed to the repo. Instead it's a draft of the desired total diff.
  • When revising the PR, push the revisions as new commits on top, without force-pushing.
  • When it's ready to merge, we'll use GitHub's "Squash and merge" button to squash it to a single commit.

This way, the commits in the PR are a record of its history during review. We know the end result will be a single commit — so, unlike with Zulip's main repos, there's no need to use the commit structure within the PR branch to express the details of the desired commit structure to be merged.

And this way everyone has a lot more options for looking at the PR after some revisions have happened and seeing exactly what's changed since the last time one looked, or exactly what the PR looked like at the time of a given review discussion.

(Cf. previous discussion of the trade-offs of squash-merge vs. Zulip's normal rebase-merge workflow, in the context of our main repos: #git help > squash-merges @ 💬.)

@prakhar1144 prakhar1144 force-pushed the encrypt-push-notification branch from 5d0ad50 to 7fd37bf Compare April 21, 2025 15:22
Changes:
* Import/Export PushDevice carefully.
* Keep Zulip Cloud implementation conceptually similar.
* Explanation for how a user is notified of liveness error.
* No need to pass `user_id` when registering to bouncer.
* Cleanup stale entries of `PushDevice`.
@prakhar1144 prakhar1144 force-pushed the encrypt-push-notification branch from 7fd37bf to b6d61dc Compare April 21, 2025 16:25
@prakhar1144
Copy link
Member Author

New changes made are in a separate commit as discussed. See commit message for a summary.

Updates since last time:
* Add `token_kind` in `PushDevice` to make batched API call
  to the push bouncer while sending notifications possible.
* Addressed Zixuan comments.
@prakhar1144 prakhar1144 marked this pull request as ready for review April 23, 2025 12:23
Comment on lines 187 to 190
- This allows the client to determine if the `push_account_id` it has
locally for the account is missing on the `push_account_ids` map.
If missing, it knows it needs to reregister the push device.
It helps to show an error to the user in case of liveness error.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't understand the last sentence here. What's a "liveness error" in this context? Is this saying that something helps for showing an error, or that showing an error is helpful?

Copy link
Member Author

@prakhar1144 prakhar1144 Apr 25, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's related to the error bouncer raises during registration:

On error:

  • If it's a liveness error, don't retry and forget the push_account_id.
    Pass the error to the server admin. The user is notified about this
    error on the client using the push_account_ids map provided in the
    register-queue response. (Refer
    to the Miscellaneous section for more details.)

Updating it would be helpful:

  It helps to show an error to the user in case of liveness error
+ raised by the bouncer during registration.

Also, I think we can be more specific here. When we say "user is notified about this error", there would be some specific error message like "contact your admin" ? & not just a retry.

So, PushDevice record with push_account_id will be deleted in the case of expired_time as well as "liveness error". How we're going to distinguish to show the specific error message to the user in one case and not the other ?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm, I still don't understand what this is saying. Can you clarify this part?

Is this saying that something helps for showing an error, or that showing an error is helpful?

And if the former: what is the thing that helps for showing an error?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is addressing #2 (comment)

Is this saying that something helps for showing an error

Yes.

if the former: what is the thing that helps for showing an error?

push_account_ids map (which has push_account_id and status) will help the client.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great, so this sentence could say:

This data helps the client to show an error to the user in case of a liveness error.

I understand now what the sentence is saying. Next is the how and the why: how does the client use this data for that purpose, and why is that something we want to do? Why does the client care that there was specifically a liveness error?

For giving the user feedback that notifications aren't successfully set up, I think we probably want just a timeout: the client started trying to tell the server about its token more than, like, a few minutes ago, and the status is still that it hasn't succeeded yet.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

PUSH_REGISTRATION_LIVENESS_TIMEOUT is 24 hours.

So, are you suggesting:

  • Storing the timestamp in client when the registration request was made and server responded with 200.
  • Then, in future, when the user reopens the map and the push_account_id is missing in the map + 24 hours passed; we can show the error and retry.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No, I meant what I said.

If the user installed the Zulip app and logged into Zulip 5 minutes ago, and they're not getting notifications when people DM them, they're going to be wondering what's up because "the mobile app" clearly isn't working. So the app should acknowledge that.

At that point the server will still be periodically attempting to retry the registration with the bouncer (because PUSH_REGISTRATION_LIVENESS_TIMEOUT hasn't expired). In other words it will be still attempting to recover from the outage. But that doesn't mean there isn't an outage.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Okay, simply client side handling.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Updated in the 7th commit.

93750b6424 (HEAD -> web-empty-topic-name-support) inline_topic_edit: Show realm_empty_topic_display_name as placeholder.
@prakhar1144 prakhar1144 force-pushed the encrypt-push-notification branch from a2e0b1d to 34240d4 Compare April 28, 2025 18:02
@prakhar1144 prakhar1144 force-pushed the encrypt-push-notification branch from 34240d4 to c3aa215 Compare April 28, 2025 20:13
@prakhar1144
Copy link
Member Author

prakhar1144 commented Apr 28, 2025

Addressed the last batch of comments.

I'll post a design for #2 (comment) in CZO. -- Update: Discussed and documented here, ready to review.

Adds plan to support rotating asymmetric key generated by the bouncer.
@timabbott
Copy link
Member

Merged via #3. I verified all the comment threads above are resolved in the merged draft.

@timabbott timabbott closed this Jul 7, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants