Skip to content
Merged
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
114 changes: 54 additions & 60 deletions nexus/src/app/external_endpoints.rs
Original file line number Diff line number Diff line change
Expand Up @@ -33,15 +33,17 @@ use anyhow::Context;
use nexus_db_model::AuthenticationMode;
use nexus_db_model::Certificate;
use nexus_db_model::DnsGroup;
use nexus_db_model::DnsZone;
use nexus_db_model::Silo;
use nexus_db_queries::context::OpContext;
use nexus_db_queries::db::datastore::Discoverability;
use nexus_db_queries::db::model::ServiceKind;
use nexus_db_queries::db::pagination::Paginator;
use nexus_db_queries::db::DataStore;
use nexus_types::identity::Resource;
use nexus_types::silo::silo_dns_name;
use nexus_types::silo::DEFAULT_SILO_ID;
use omicron_common::api::external::http_pagination::PaginatedBy;
use omicron_common::api::external::DataPageParams;
use omicron_common::api::external::Error;
use omicron_common::bail_unless;
use openssl::pkey::PKey;
Expand Down Expand Up @@ -488,69 +490,61 @@ pub(crate) async fn read_all_endpoints(
datastore: &DataStore,
opctx: &OpContext,
) -> Result<ExternalEndpoints, Error> {
// We will not look for more than this number of external DNS zones, Silos,
// or certificates. We do not expect very many of any of these objects.
const MAX: u32 = 200;
let pagparams_id = DataPageParams {
marker: None,
limit: NonZeroU32::new(MAX).unwrap(),
direction: dropshot::PaginationOrder::Ascending,
};
let pagbyid = PaginatedBy::Id(pagparams_id);
let pagparams_name = DataPageParams {
marker: None,
limit: NonZeroU32::new(MAX).unwrap(),
direction: dropshot::PaginationOrder::Ascending,
};

let silos =
datastore.silos_list(opctx, &pagbyid, Discoverability::All).await?;
let external_dns_zones = datastore
.dns_zones_list(opctx, DnsGroup::External, &pagparams_name)
.await?;
// The batch size here is pretty arbitrary. On the vast majority of
// systems, there will only ever be a handful of any of these objects. Some
// systems are known to have a few dozen silos and a few hundred TLS
// certificates. This code path is not particularly latency-sensitive. Our
// purpose in limiting the batch size is just to avoid unbounded-size
// database transactions.
//
// unwrap(): safe because 200 is non-zero.
let batch_size = NonZeroU32::new(200).unwrap();

// Fetch all silos.
let mut silos = Vec::new();
let mut paginator = Paginator::new(batch_size);
while let Some(p) = paginator.next() {
let batch = datastore
.silos_list(
opctx,
&PaginatedBy::Id(p.current_pagparams()),
Discoverability::All,
)
.await?;
paginator = p.found_batch(&batch, &|s: &Silo| s.id());
silos.extend(batch.into_iter());
}

// Fetch all external DNS zones. We should really only ever have one, but
// we may as well paginate this.
let mut external_dns_zones = Vec::new();
let mut paginator = Paginator::new(batch_size);
while let Some(p) = paginator.next() {
let batch = datastore
.dns_zones_list(opctx, DnsGroup::External, &p.current_pagparams())
.await?;
paginator = p.found_batch(&batch, &|z: &DnsZone| z.zone_name.clone());
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This all looks correct to me, but out of curiosity: why in the other two queries do we have an explicit PaginatedBy::Id(pageparams), and for this one we seem to be implicitly paginating by zone name? Is the use of z.zone_name here relying on implementation details of dns_zones_list()?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If I recall correctly: the sort of low-level parameter for pagination is DataPageParams, which can be parametrized by basically any marker type:

/// Parameters used to request a specific page of results when listing a
/// collection of objects
///
/// This is logically analogous to Dropshot's `PageSelector` (plus the limit from
/// Dropshot's `PaginationParams). However, this type is HTTP-agnostic. More
/// importantly, by the time this struct is generated, we know the type of the
/// sort field and we can specialize `DataPageParams` to that type. This makes
/// it considerably simpler to implement the backend for most of our paginated
/// APIs.
///
/// `NameType` is the type of the field used to sort the returned values and it's
/// usually `Name`.
#[derive(Clone, Debug)]
pub struct DataPageParams<'a, NameType> {
/// If present, this is the value of the sort field for the last object seen
pub marker: Option<&'a NameType>,
/// Whether the sort is in ascending order
pub direction: PaginationOrder,
/// This identifies how many results should be returned on this page.
/// Backend implementations must provide this many results unless we're at
/// the end of the scan. Dropshot assumes that if we provide fewer results
/// than this number, then we're done with the scan.
pub limit: NonZeroU32,
}

For collections that only support pagination by one thing, like dns_zones_list(), their datastore functions just accept DataPageParams:

pagparams: &DataPageParams<'_, String>,

For relatively few collections we support paginating by any one of a few different fields (well, it's always "id" or "name"). The caller gets to pick. The code is basically the same and these functions accept a PaginatedBy, which is just an enum with variants for by-id or by-name, and each variant just contains the appropriate DataPageParams type:

#[derive(Debug)]
pub enum PaginatedBy<'a> {
Id(DataPageParams<'a, Uuid>),
Name(DataPageParams<'a, Name>),
}

So I don't think this is relying on an implementation detail. It's just that dns_zones_list() only supports paginating by the zone name so it accepts DataPageParams directly, while the other functions support pagination by either and so need the wrapper that lets you pick which one you want (but ultimately you're still providing a DataPageParams and that's what the code is using).


There's a bit of an implicit detail here which is that the caller of a paginated function needs to know which field of the object is the marker. (In practice this is not super easy to get wrong because the marker type is usually kind of unique -- a uuid or a Name -- though you could pick a different field that's also a uuid or something.) This applies to all three queries though and every other use of Paginator, too (it's encoded in the closure provided to found_batch()).

external_dns_zones.extend(batch.into_iter());
}
bail_unless!(
!external_dns_zones.is_empty(),
"expected at least one external DNS zone"
);
let certs = datastore
.certificate_list_for(opctx, Some(ServiceKind::Nexus), &pagbyid, false)
.await?;

// If we found too many of any of these things, complain as loudly as we
// can. Our results will be wrong. But we still don't want to fail if we
// can avoid it because we want to be able to serve as many endpoints as we
// can.
// TODO-reliability we should prevent people from creating more than this
// maximum number of Silos and certificates.
let max = usize::try_from(MAX).unwrap();
if silos.len() >= max {
error!(
&opctx.log,
"reading endpoints: expected at most {} silos, but found at \
least {}. TLS may not work on some Silos' external endpoints.",
MAX,
silos.len(),
);
}
if external_dns_zones.len() >= max {
error!(
&opctx.log,
"reading endpoints: expected at most {} external DNS zones, but \
found at least {}. TLS may not work on some Silos' external \
endpoints.",
MAX,
external_dns_zones.len(),
);
}
if certs.len() >= max {
error!(
&opctx.log,
"reading endpoints: expected at most {} certificates, but \
found at least {}. TLS may not work on some Silos' external \
endpoints.",
MAX,
certs.len(),
);

// Fetch all TLS certificates.
let mut certs = Vec::new();
let mut paginator = Paginator::new(batch_size);
while let Some(p) = paginator.next() {
let batch = datastore
.certificate_list_for(
opctx,
Some(ServiceKind::Nexus),
&PaginatedBy::Id(p.current_pagparams()),
false,
)
.await?;
paginator = p.found_batch(&batch, &|s: &Certificate| s.id());
certs.extend(batch);
}

Ok(ExternalEndpoints::new(silos, certs, external_dns_zones))
Expand Down
Loading