Skip to content

Conversation

davidor
Copy link
Contributor

@davidor davidor commented Apr 8, 2020

Closes #28

This PR adds support for Redis Sentinels (https://redis.io/topics/sentinel-clients)

Notice that this PR is not complete. It does not have tests, does not explain how to use this in the README, and possibly does not handle all the error cases, but I'd rather work on those things after we agree on other aspects first.

In order to test this, you'll need one Redis master, a replica, and 3 sentinels. These are the commands I run to setup this on my local machine using docker, but you can adapt them to your env.

Setup

  • Run master on port 6379
docker run --rm -it --net=host redis --port 6379
  • Run slave on port 6380
docker run --rm -it --net=host redis --port 6380
  • Configure the slave to follow the master
redis-cli -p 6380 slaveof 127.0.0.1 6379
  • Start 3 sentinels

I used this example config file from the official docs. You need to create one for each sentinel. The 3 files should be the same except for the port. I used 9000, 9001 and 9002.

port 9000
sentinel monitor mymaster 127.0.0.1 6379 2
sentinel down-after-milliseconds mymaster 5000
sentinel failover-timeout mymaster 60000
sentinel parallel-syncs mymaster 1
docker run --rm -it --net=host -v /tmp/sentinel1.conf:/data/sentinel.conf redis /data/sentinel.conf --port 9000 --sentinel
docker run --rm -it --net=host -v /tmp/sentinel2.conf:/data/sentinel.conf redis /data/sentinel.conf --port 9001 --sentinel
docker run --rm -it --net=host -v /tmp/sentinel3.conf:/data/sentinel.conf redis /data/sentinel.conf --port 9002 --sentinel

Test

  • Instantiate the client:
sentinels = [{host: '127.0.0.1', port:9000},{host:'127.0.0.1', port:9001},{host: '127.0.0.1', port: 9002}]
client = Async::Redis::SentinelsClient.new('mymaster', sentinels)
  • Make a request:
client.call('get', 'some_key')

Notice that the request should go to the master, and not the slave. You can verify that using the monitor Redis command.

To test a failover you can use: redis-cli -p 6379 debug sleep 10. That will block the master during some seconds, and the sentinels will elect a new master.

You can use something like this to verify that the client automatically sends requests to the new master when that happens. Again, you can run monitor on both Redis instances to verify this:

Async do
  loop do
    client.call('get', 'a')
    sleep 2
  end
end

@davidor davidor requested a review from ioquatix April 8, 2020 12:21
Copy link
Member

@ioquatix ioquatix left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Overall, looks good to me. We should definitely try to scaffold some automated integration tests.


module Async
module Redis
class SentinelsClient < Client
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

While I'm fine with this name for now, please give me a day to think about it.

There are two models for naming things like this:

  • Client::Sentinels

or as you did

  • SentinelsClient

The decision is based on how many other kinds of "clients" we expect to have and their semantics.

Copy link
Contributor Author

@davidor davidor Apr 8, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not sure what other clients we might want to have, maybe one for Redis Cluster.

I don't have a strong preference about this. I'm fine with whatever you decide.

This is kind of related with something that I wanted to discuss but forgot to mention in the description of the PR. Do you think we should instantiate this client with Client::Sentinels.new(...) / SentinelsClient.new or instead, do you think it'd be better to instantiate it with Client.new(...)? With the second option, the current Client initializer would be responsible for checking the options received in the params, and if there are some sentinels there, instantiate internally a SentinelsClient and return it instead of the standard one.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What do you think makes the most sense in the context of the user?

I always try to consider user/interface ergonomics... but I also try to avoid making things inefficient/add layers of abstraction where it only serves ergonomics purposes.

If users would always know they have sentinels or not, it makes sense to have separate classes, for example.

We shouldn't do anything to make Client "inefficient". It's the fast path, everything needs to be explicit.

With Async::HTTP I realised it's cumbersome for many use cases, so I introduced Async::HTTP::Internet which solves those challenges for the user. Some kind of multiplexing layer which understands there might be sentinels, replicas, and other things. Under the hood you still end up going via a directly connected client, but the abstraction is resolved one layer above.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The other benefit to that style of implementation, is that if you want to change your policy, it's not inheritance but composition. I really prefer composition if possible, simple layers bring about complex behaviour rather than complex class hierarchies that are hard to reason about.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If users would always know they have sentinels or not, it makes sense to have separate classes, for example.

They always know, it cannot be transparent because the user needs to provide the list of sentinels.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@ioquatix I agree with the point you made about composition. The idea would be to instantiate an Async::Redis::Client in the new class and then delegate everything to that instance, right?
I think we can't do that without changes in the Async::Redis::Client class. We'd need a way to be able to set the block that we use in the connect method.

@davidor
Copy link
Contributor Author

davidor commented Apr 8, 2020

@ioquatix I addressed your comments and added a new commit that handles an error case that I didn't take into account.

@ioquatix
Copy link
Member

ioquatix commented Apr 9, 2020

The error handling case makes sense.

This kind of distributed system has tricky error semantics.

The current implementation looks okay.

Actually, I tried to read the redis documentation, and I didn't understand what the hell this feature is. But I read the code, and it's crystal clear. Haha.

def resolve_address
address = case @role
when :master
resolve_master
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this is simpler if you write return resolve_master and the final statement is raise RuntimeError, ...

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not sure I follow. Both resolve_master and resolve_slave can return nil. So we can't return without checking nil unless we move raise RuntimeError ... to the Async::Pool::Controller block above.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

def foo
  return nil
end

x, y = foo
x # nil
y # nil

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh, I see, the nil check... hmm, it just seems like something that could be a bit simpler. But if not that's okay.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess the question I would ask, is there any situation where returning nil from resolve_* is useful?

Or is it always triggering an exception?

User may do resolve_master || resolve_slave. Then, returning nil makes total sense.

Yes, it could make sense to throw the exception in resolve_master. At that point, you also have more information available to give a meaningful error message.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think the user should be able to do resolve_master || resolve_slave. I think the user needs to decide whether the requests should go to a master or slave when instantiating the client, but allowing both at the same time is a bit dangerous.

Some users might know, for example, that for their use case they only need to perform read operations and they don't mind about the master->slave replication lag. I that case, they might decide to perform those reads on the slaves to decrease the load of the master.

# to resolve the master/slave address.
def connect(**options)
Async::Pool::Controller.wrap(**options) do
address = resolve_address
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Rather than using a hash, which I know probably feels more explicit, but it's also more expensive. I'd suggest either:

  • Return an endpoint instance.
  • Use an array of [host, port] as in host, port = resolve_address.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've changed this to return endpoints.
I think the second option hurts readability a bit because while checking the code you need to remember what each index represents.

end

def select_slave(slaves_cmd_reply)
return nil unless slaves_cmd_reply
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If a user did decide to implement their own policy, they would need to parse slaves_cmd_reply themselves.

I think we should parse it in resolve_slave either directly, or using a helper, into a data structure which you can easily make sense of (e.g. the hash that you construct already).

Then, select_slave(available) would just be available.sample. Some other implementation might prefer more complex logic (like checking latency).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's a good point. I've separated the two methods in a new commit 3c6c5ae
I think it's better now.

@ioquatix
Copy link
Member

ioquatix commented Apr 9, 2020

I'm pretty happy with this. I may play around with the composition vs inheritance thing, but I'm happy to merge. I wonder if we should try to test it some how?

To differentiate the part that parses the response and returns the
available slaves from the code that chooses a slave from the available
ones.
@davidor
Copy link
Contributor Author

davidor commented Apr 9, 2020

@ioquatix I addressed your comments. Feel free to merge. As I mentioned above, changing to composition will require changes in the other class, so maybe it's better to address that separately? It should be fine as long as there isn't a release in the middle, because that would be a breaking change.

Regarding tests, testing with real Redis instances might be a bit too much for this, because we'd need 5 at least. On the other hand, we don't have any tooling to create "fake" async clients and I'm a bit concerned that using too much mocking will make the tests useless. Is there something in the async libraries that can help us with this?

@ioquatix
Copy link
Member

There are some options, we could try mocking up fake redis servers.

You've done a great job bringing the code to this point. I didn't understand sentinels at all but it's really clear from the code.

@ioquatix ioquatix merged commit df588b4 into socketry:master Apr 10, 2020
@davidor davidor deleted the sentinels branch April 10, 2020 09:56
@davidor
Copy link
Contributor Author

davidor commented Apr 14, 2020

@ioquatix about the composition vs inheritance topic. I implemented a possible solution in this WIP branch master...davidor:sentinels-change-to-composition but I'm not sure if it's a good idea to expose Client#pool. Is this what you had in mind?

@ioquatix
Copy link
Member

It's a good idea but I think we need a bit more elegant solution.

@ioquatix
Copy link
Member

FYI, I've added integration for Redis sentinels. Check the sentinel directory to see how it works.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Support for Redis Sentinels

3 participants