@abc3 abc3 commented Jul 31, 2024

In this PR, Supavisor has been adapted for Erlang 27 and Elixir 1.17. The metrics backend has been changed from PromEx to Peep, and the most significant change is the introduction of ProxyHandler for handling transaction mode.

The ProxyHandler first authenticates the incoming connection through auth_query. If there is no connection pool, it creates one and also launches a Ranch on a free port, saving the port and host in the corresponding Syn record.
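
Roughly, the listener setup looks like the sketch below (module name, Syn scope, and metadata shape are illustrative here, not the exact implementation):

```elixir
# Sketch only: module name, Syn scope, and metadata shape are hypothetical.
defmodule ProxyListenerSketch do
  @doc """
  Starts a Ranch TCP listener on an OS-assigned free port (port 0) and
  registers the {host, port} pair in Syn so client processes on any node
  can find it.
  """
  def start_for_pool(tenant_id, protocol_handler) do
    # Port 0 asks the OS for any free port.
    {:ok, _listener} =
      :ranch.start_listener(
        {:proxy, tenant_id},
        :ranch_tcp,
        %{socket_opts: [port: 0]},
        protocol_handler,
        []
      )

    port = :ranch.get_port({:proxy, tenant_id})
    host = node() |> Atom.to_string() |> String.split("@") |> List.last()

    # Store the endpoint in the pool's Syn metadata so remote client
    # processes can reach it over plain TCP instead of erldist.
    :ok = :syn.register(:tenants, {:pool_proxy, tenant_id}, self(), %{host: host, port: port})

    {:ok, {host, port}}
  end
end
```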

If the pool is on the same node as the client connection, ownership of the TCP socket for the direct database connection is transferred locally to the client process for each transaction. After the transaction is completed, the client process returns ownership of the TCP socket to the DbHandler.
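
The per-transaction ownership handoff is essentially the standard `:gen_tcp.controlling_process/2` exchange; a minimal sketch with illustrative names:

```elixir
# Sketch only: process roles and function names are hypothetical.
defmodule SocketHandoffSketch do
  # In DbHandler: hand the DB socket to the client process for one transaction.
  def checkout(db_socket, client_pid) do
    # Make sure no active-mode messages arrive mid-transfer.
    :ok = :inet.setopts(db_socket, active: false)
    :ok = :gen_tcp.controlling_process(db_socket, client_pid)
    send(client_pid, {:db_socket, db_socket})
  end

  # In the client process: after the transaction, give the socket back.
  def checkin(db_socket, db_handler_pid) do
    :ok = :gen_tcp.controlling_process(db_socket, db_handler_pid)
    send(db_handler_pid, {:checkin, db_socket})
  end
end
```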

If the pool is on a different node, the client process connects to the Ranch of the corresponding pool on the other node, switches to session mode, and forwards everything (it works like a proxy). The local Ranch operates the same way as when the client connection is on the same node as the connection pool.

With this handler, erldist is used only for synchronizing Syn; forwarding query data between nodes is done over plain TCP sockets (Ranch).
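
The forwarding itself boils down to a plain byte relay between the client socket and a TCP connection to the remote Ranch port. A simplified sketch (not the actual ProxyHandler code):

```elixir
# Sketch only: a bare-bones bidirectional relay, not Supavisor's ProxyHandler.
defmodule TcpRelaySketch do
  @doc """
  Connects to the remote pool's Ranch port (looked up from Syn) and relays
  raw bytes between the client socket and the remote socket.
  """
  def start(client_socket, remote_host, remote_port) do
    {:ok, remote_socket} =
      :gen_tcp.connect(String.to_charlist(remote_host), remote_port,
        mode: :binary,
        active: true,
        nodelay: true
      )

    :ok = :inet.setopts(client_socket, active: true)
    loop(client_socket, remote_socket)
  end

  defp loop(client_socket, remote_socket) do
    receive do
      {:tcp, ^client_socket, data} ->
        :ok = :gen_tcp.send(remote_socket, data)
        loop(client_socket, remote_socket)

      {:tcp, ^remote_socket, data} ->
        :ok = :gen_tcp.send(client_socket, data)
        loop(client_socket, remote_socket)

      {:tcp_closed, _socket} ->
        :gen_tcp.close(client_socket)
        :gen_tcp.close(remote_socket)
        :ok
    end
  end
end
```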

@abc3 abc3 changed the title from Supavisor V2 to [WIP] Supavisor V2 Jul 31, 2024

abc3 commented Jul 31, 2024

Hey, @supabase/dashbit, we have prepared improvements for Supavisor and would love to hear your feedback. There are a lot of changes, so feel free to review whatever you like. We are particularly interested in your opinion on the ProxyHandler. The corresponding implementation is here:

@abc3 abc3 requested a review from a team July 31, 2024 19:36
@josevalim

Exciting work! Do you have any numbers or results on the performance of this new version?

> After the transaction is completed, the client process returns ownership of the TCP socket to the DbHandler.

I haven't reviewed the code yet, so I will try to verify this as well, but remember that, if the client crashes for some reason, you most likely can't assert for certain the state of the socket: for example, was something only partially written to it? So you may need to drop the connection altogether or do some sort of cleanup (if possible), like make sure to abort transactions and other things. I am not sure how much state can be carried around.

@josevalim

Two additional questions for now:

  1. I also remember we discussed improvements to poolboy or your fork of poolboy. I don't remember exactly what they were, though. I assume those may already be in place?

  2. Is it correct that this pull request still preserves the old code for talking to the database? Is the plan to remove it in the future or are there issues in unifying everything through the proxy handler?

@josevalim

Two more:

> The ProxyHandler first authenticates the incoming connection through auth_query. If there is no connection pool, it creates one and also launches a Ranch on a free port, saving the port and host in the corresponding Syn record.

  1. Is the maximum number of ports (65k) a potential concern here? If so, could we use a single Ranch port but pass some other information (such as a preamble when connecting via gen_tcp) to choose which pool to use?

> If the pool is on a different node, the client process connects to the Ranch of the corresponding pool on the other node, switches to session mode, and forwards everything (it works like a proxy).

  2. How much do you hand off to the other node? From looking at the code, if it is on a different node, you are doing both auth and ssl handshakes. Does it mean you are doing them twice? Once in the current node and then again once you have to proxy? I wonder if it is worth doing a "special handshake" between those nodes, so when you join this special Ranch port (which should not be exposed), you can assume auth and ssl have been handled on the first node and you don't have to redo them. Maybe that's already the case, but I wanted to double-check.


abc3 commented Aug 1, 2024

> Exciting work!

😊😊😊

> Do you have any numbers or results on the performance of this new version?

We are still measuring performance for different scenarios. When the pooler is on the same machine as the database, Supavisor outperforms or performs the same as PgBouncer. However, it is slightly slower when a Supavisor cluster of two nodes is in the same AWS zone as the database.

> I haven't reviewed the code yet, so I will try to verify this as well, but remember that, if the client crashes for some reason, you most likely can't assert for certain the state of the socket: for example, was something only partially written to it? So you may need to drop the connection altogether or do some sort of cleanup (if possible), like make sure to abort transactions and other things. I am not sure how much state can be carried around.

We keep a linked DbHandler which originally created the socket, and if the client goes down, DbHandler will be respawned with a new direct connection.

> I also remember we discussed improvements to poolboy or your fork of poolboy. I don't remember exactly what they were, though. I assume those may already be in place?

I spent some time finding bottlenecks, and the biggest ones were metrics collection (shout out to @hauleth, who fixed that) and the synchronous calls between ClientHandler and DbHandler used to avoid overwhelming the mailboxes. So far, we haven't added anything new to Poolboy, except our previous changes for idle timeout.

> Is it correct that this pull request still preserves the old code for talking to the database? Is the plan to remove it in the future or are there issues in unifying everything through the proxy handler?

DbHandler is still responsible for establishing direct connections in transaction mode and for reconnecting in the idle state, and it has been extended with a handle_call to change ownership of the TCP socket. However, the connection logic for talking to the database during transactions is located in ProxyDb. In the future, we will remove the old module.

> Is the maximum number of ports (65k) a potential concern here? If so, could we use a single Ranch port but pass some other information (such as a preamble when connecting via gen_tcp) to choose which pool to use?

Not at the moment, but it could potentially become a problem. The update is already quite complex, so I don't want to add further complications. We can always add this feature later.

> How much do you hand off to the other node? From looking at the code, if it is on a different node, you are doing both auth and ssl handshakes. Does it mean you are doing them twice? Once in the current node and then again once you have to proxy? I wonder if it is worth doing a "special handshake" between those nodes, so when you join this special Ranch port (which should not be exposed), you can assume auth and ssl have been handled on the first node and you don't have to redo them. Maybe that's already the case, but I wanted to double-check.

Hmm, it should not use SSL for proxying to another node. The main reason for authentication was to provide an additional security guard. It's not very resource-intensive but can prevent accidental connections if the client has some issues with authentication.

@josevalim

Fantastic. I have already reviewed the code and I will do another pass later. My only suggestion so far would be to have a custom TCP handshake for the proxy, so you have only a single special port instead of a pool of them and there is no need for additional auth. For security, you can put a 20-40 byte random token on Syn instead of the Ranch port, and the new handshake uses a key plus this token for connecting. But as you said, I don't think it is a necessary change at this moment.
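
For illustration only, a rough sketch of how that handshake could work (module, Syn scope, and frame layout are assumptions, not a concrete proposal):

```elixir
# Sketch only: hypothetical handshake for a single internal proxy port.
defmodule ProxyHandshakeSketch do
  @token_bytes 32

  # When registering the pool, publish a random token in Syn instead of a per-pool port.
  def register_pool(tenant_id, pool_pid) do
    token = :crypto.strong_rand_bytes(@token_bytes)
    :ok = :syn.register(:tenants, {:pool_proxy, tenant_id}, pool_pid, %{token: token})
    token
  end

  # On the single internal Ranch port: the first frame must carry the tenant id
  # and its token; otherwise the connection is dropped.
  # Assumes the whole frame arrives in one recv; real code needs proper framing.
  def accept(socket) do
    with {:ok, <<len::16, tenant_id::binary-size(len), token::binary-size(@token_bytes)>>} <-
           :gen_tcp.recv(socket, 0, 5_000),
         {_pid, %{token: ^token}} <- :syn.lookup(:tenants, {:pool_proxy, tenant_id}) do
      {:ok, tenant_id}
    else
      _ ->
        :gen_tcp.close(socket)
        {:error, :bad_handshake}
    end
  end
end
```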

> We are still measuring performance for different scenarios. When the pooler is on the same machine as the database, Supavisor outperforms or performs the same as PgBouncer. However, it is slightly slower when a Supavisor cluster of two nodes is in the same AWS zone as the database.

This is great! As far as I know, no other "bouncer" offers clustering anyway, and slightly slower would certainly be expected.


abc3 commented Aug 1, 2024

> My only suggestion so far would be to have a custom TCP handshake for the proxy, so you have only a single special port instead of a pool of them and there is no need for additional auth. For security, you can put a 20-40 byte random token on Syn instead of the Ranch port, and the new handshake uses a key plus this token for connecting.

A good call, thanks!


@josevalim josevalim left a comment


I did another review of the proxy modules and I have dropped only two additional comments. :)

@abc3 abc3 marked this pull request as ready for review August 30, 2024 13:43
@abc3 abc3 requested a review from a team as a code owner August 30, 2024 13:43
@abc3 abc3 changed the title from [WIP] Supavisor V2 to Supavisor V2 Aug 30, 2024
@hauleth hauleth merged commit 7494cf9 into main Jan 14, 2025
@hauleth hauleth deleted the v2 branch January 14, 2025 19:09

hauleth commented Jan 14, 2025

Moved v2 to main.
