sunlight: specifying synchronous merging #79
Comments
I agree with the value of synchronous merging, and it's one of the major motivations of the Sunlight log implementation. I am not sure we can effectively require it in this specification, though, or even at all in policy.
Unless resorting to audits or code review, policies are limited to specifying externally observable behavior.
"Any SCT must be eventually incorporated at the index in the extension" is already the requirement of the current document.
"Any SCT must be observably incorporated before being returned" is a way stronger requirement that doesn't leave space for designs that offer immediate durability but eventual consistency of reads (e.g. putting a caching CDN in front of a Sunlight log).
"A log MUST incorporate the certificate into the Merkle Tree before returning the SCT" is not specifying an externally observable behavior, and kind of gets into the durability properties of the log's internals (e.g. "incorporate" probably means running fsync? What about the RAID cache?).
IMHO, applying pressure towards safer designs (such as towards synchronous merging, and away from SCTs) makes perfect sense, and that's what the SCT extension does. Mandating designs, on the other hand, feels complex and brittle. We can keep going further (in future iterations of the specification): for example we could require returning the STH along with the SCT next.
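To make the "externally observable" distinction concrete, here is a minimal sketch of the only kind of check an outside monitor could run against the first requirement quoted above. It is not taken from the Sunlight spec or any real monitor: the `SCT` and `Checkpoint` types, their field names, and `incorporation_status` are all hypothetical, and the check only compares the leaf index against an observed tree size. A real monitor would also have to confirm that the entry at that index matches the certificate.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SCT:
    """Hypothetical view of an SCT: only the fields this check needs."""
    timestamp_ms: int  # SCT timestamp, milliseconds since epoch
    leaf_index: int    # index carried in the SCT extension


@dataclass(frozen=True)
class Checkpoint:
    """Hypothetical view of a signed tree head as seen by a monitor."""
    observed_ms: int   # when the monitor fetched it
    tree_size: int     # number of leaves covered by the tree head


def incorporation_status(sct: SCT, checkpoint: Checkpoint, mmd_ms: int) -> str:
    """Classify an SCT using only what an external observer can see."""
    if checkpoint.tree_size > sct.leaf_index:
        # The published tree covers the promised index.
        return "incorporated"
    if checkpoint.observed_ms - sct.timestamp_ms <= mmd_ms:
        # Not visible yet, but the log is still within its merge delay.
        return "pending"
    # Merge delay elapsed and the index is still not covered.
    return "violation"
```

Note that nothing in this check can tell whether the log merged before or after returning the SCT, or whether a CDN served a stale tree head. That is the gap between "eventually incorporated" (checkable from outside) and "incorporated before returning" (a property of the log's internals).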
+1 to Filippo's response above, this is where I landed too and for largely the same reasons.
+1
I agree with @FiloSottile. Also, it's unclear to me why a log which has trouble storing unmerged entries durably wouldn't also have trouble storing the sequenced tree durably. Indeed, analyzing past log failures involving data loss or corruption suggests that mandating synchronous merging wouldn't have helped in most cases:
I believe that Sunlight logs will be more robust than past CT logs not because of the protocol but because of the following aspects of the Sunlight implementation:
It would be beneficial to encourage all log implementations to make the above choices, but the Sunlight protocol spec doesn't seem like the right place for it.
Thanks for the insights. Many of the benefits of synchronous inclusion are indeed achieved by durably storing unmerged entries, and by having much shorter MMDs, so I'll close this issue out. Chrome will very likely try to address the second point by significantly reducing the allowable MMD of tiled logs accepted into Chrome's list. If anyone has opinions on what those restrictions should/should not be, feel free to reach out (directly to me, on ct-policy@, or wherever else is convenient.)
Early drafts of the Sunlight spec noted that the inclusion of the leaf index in the SCT "limit[ed] Sunlight logs to a null Merge Delay" but that language was softened after it was observed that it was possible to identify a future leaf index without actually yet including the certificate in the tree.
Synchronous merging (i.e. a null merge delay) is a highly-desirable property from Chrome's perspective, and we would like to see this property added to the spec more explicitly.
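As a rough illustration of the two behaviors contrasted above, the toy model below shows how a log can hand out a future leaf index without the certificate being in the tree yet, and what synchronous merging changes. This is only a sketch under simplifying assumptions: `ToyLog` and its methods are hypothetical, there is no Merkle tree, signing, or durable storage, and it does not describe how Sunlight is implemented.

```python
class ToyLog:
    """Toy model of index assignment versus merging (no hashing, no disk)."""

    def __init__(self) -> None:
        self.merged: list[str] = []   # entries already in the published tree
        self.pending: list[str] = []  # entries with an index but not yet merged

    def add_async(self, cert: str) -> int:
        """Assign the next leaf index and return immediately.

        The returned index is a promise: the tree does not contain the
        entry until merge() runs.
        """
        index = len(self.merged) + len(self.pending)
        self.pending.append(cert)
        return index

    def merge(self) -> None:
        """Move all pending entries into the tree, preserving their order."""
        self.merged.extend(self.pending)
        self.pending.clear()

    def add_sync(self, cert: str) -> int:
        """Synchronous merging: return the index only after the entry is merged."""
        index = self.add_async(cert)
        self.merge()
        return index


log = ToyLog()
promised = log.add_async("cert-a")   # index known, entry not yet in the tree
in_tree_before = promised < len(log.merged)
log.merge()
in_tree_after = log.merged[promised] == "cert-a"
```

In this model, losing `pending` between `add_async` and `merge` is the failure mode the issue describes: an index has been issued for an entry the tree never contains. With `add_sync` that window does not exist at the moment the index is returned.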
Experience with RFC6962 logs in the existing CT ecosystem has shown that one of the greatest risks to individual logs is the issuance of SCTs whose corresponding certificates are never included in the log's Merkle tree. Dropping certificates for which an SCT has been issued results in an unrecoverable loss of integrity, leading to the log's removal from the list of usable logs by CT-enforcing user agents.
Avoiding this risk is worth a lot to us. Logs commonly experience downtime, but as long as logs have durably included all certificates for which SCTs were issued, and resume correctly serving the required submission and read endpoints, these failures are typically fully recoverable. Downtime when RFC 6962 logs have not yet fully incorporated all pending certificates has led to several log failures due to either omitting entries entirely or rebuilding the tree in a way that resulted in a split view.
Logs that break their integrity guarantees not only pose risks to the directly-involved certificates, but also cause extended periods of reduced availability of CT logging for the entire web ecosystem. Replacement of a log is far from instantaneous -- it takes months to ensure that a new log is usable in all enforcing user agents. During that time, the WebPKI must rely on fewer remaining CT logs.
One wrinkle is that the current specification identifies an API, but largely does not dictate other log behavior. I'll provide a PR soon, but broadly, we'd like to propose the introduction of a "Log Behavior" section (mirroring a similar section in RFC6962) that specifies that: