Skip to content

Do not preface Surv() or strata() with survival:: in formulas #303

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 2 commits into from
Jan 9, 2025

Conversation

jdblischak
Copy link
Collaborator

@jdblischak jdblischak commented Dec 20, 2024

Updating our survival formulas to adapt to the latest changes in survival 3.7 and 3.8. The basic idea is that we should not preface Surv() and strata() with survival:: inside of formulas.

Below are the relevant bullet points from the NEWS file:

https://cran.r-project.org/web/packages/survival/news.html
https://github.com/therneau/survival/blob/9ea6560c18905ca5c83b2be18e505df6d135733f/inst/NEWS.Rd#L43

Changes in version 3.8-1

  • Changes needed to pass reverse dependency checks, largely tightening things up wrt to the formula fix. In summary, the survival package routines now ensure that anything in a formula that is a part of the package, is found in the package first. There is no need for things like coxph(survival::Surv(time, status) ~ ph.ecog + survival::strata(inst), data=lung), in fact you should not do this. Using survival::coxph rather than coxph is a completely different question, these updates have no bearing on that, only on functions within a formula. Note that prior to this update using survival::strata(inst) above would give the wrong answer; because :: mucks up formula processing the term would not be recognized as a stratum, and the resulting fit equivalent to using factor(inst).

Changes in version 3.8-0

  • Extend the fix to strata of 3.7-3. Now Surv, strata, cluster, and pspline calls internal to a coxph, survfit, survreg, survdiff, or concordance formula are forced to be internal to the survival package, and will not accidentally pick a local copy of the function.

Changes in version 3.7-3

  • Incorporate a copy of strata() local to coxph, something already done with tt(), so that using strata() in a formula does not search for the function. Also, preprocess any coxph formula to replace survival::strata() with strata() and likewise with cluster(). The reason for this is that strata is a special in coxph formulas (see the specials argument to the terms function) and will not be properly recognized when survival:: is prepended.

@jdblischak jdblischak self-assigned this Dec 20, 2024
@jdblischak
Copy link
Collaborator Author

Whether or not we require survival (>= 3.8) is debatable. I was able to run all the tests and examples with survival 3.7. It's my understanding is that the only way to have problems would be to locally define a function named Surv or strata. However, going forward, if someone contributed code that internally defined these functions, it would break the code when running with 3.7 but not 3.8 (so it would be hard to catch with CI, which runs with the latest versions). If we are worried about requiring the latest CRAN release of survival, I can remove the constraint.

@jdblischak
Copy link
Collaborator Author

Also, apparently prior to survival 3.8, using survival::strata() actually did the wrong thing. We should probably investigate this. Our test coverage of the survival formulas is spotty. I ran covr locally, and there is one formula that used survival::strata() that is not run by the tests.

x <- covr::package_coverage()
covr::report(x)

image

Specifically this code block is untested:

simtrial/R/sim_fixed_n.R

Lines 407 to 411 in d2f4127

if (n_stratum > 1) {
ln_hr <- survival::coxph(survival::Surv(tte, event) ~ (treatment == "experimental") + survival::strata(stratum), data = d)$coefficients
ln_hr <- as.numeric(ln_hr)
ans$event <- rep(event, nrow(rho_gamma))
ans$ln_hr <- rep(ln_hr, nrow(rho_gamma))

@nanxstats
Copy link
Collaborator

Requiring a too recent survival version might break people's shared basedline global library builds moving forward because the default survival version bundled with prior R releases can be lower than the required version. So I would suggest relaxing these requirements whenever possible.

My understanding of the problem: qualifying the namespace in coxph() formula components such as survival::strata() in survival < 3.8-0 is not fitting the stratified model you think it's fitting - it actually generates an unexpected, wrong model. If that's the case, then a potential fix is to detect the survival package version at runtime and branch off to use :: or not, so that you are sure you always get the correct results.

Pseudo code:

fit <- if (packageVersion("survival") < "3.8-0") {
  survival::coxph(Surv(time, status) ~ x + strata(group), test)
} else {
  survival::coxph(survival::Surv(time, status) ~ x + survival::strata(group), test)
}

Note that you still need to @importFrom survival Surv strata because the survival < 3.8-0 branch has no namespace qualifiers, it needs this to know where the functions came from.

I patched my packages this way before in irrelevant but similar situations: nanxstats/ggsci#30 nanxstats/protr#54

@jdblischak
Copy link
Collaborator Author

Requiring a too recent survival version might break people's shared basedline global library builds moving forward because the default survival version bundled with prior R releases can be lower than the required version. So I would suggest relaxing these requirements whenever possible.

I'm fine with removing this stringent requirement. Like I said above, I think we'd only ever have a problem with earlier versions of {survival} if we defined functions with the same names, which is unlikely to happen.

If that's the case, then a potential fix is to detect the survival package version at runtime and branch off to use :: or not, so that you are sure you always get the correct results.

I'm not sure we need/want to do this branching. It's my understanding that {simtrial} got flagged by the {survival} reverse dependency checks specifically because it was using simtrial:: in a formula, and we were asked to remove these.

Note that you still need to @importFrom survival Surv strata because the survival < 3.8-0 branch has no namespace qualifiers, it needs this to know where the functions came from.

That's a good point. I think importing Surv() and strata() will ensure our package is compliant pre- and post- survival 3.8.

@nanxstats
Copy link
Collaborator

Agreed! No survival version requirement + removing all :: + importing Surv() and strata() sounds like the best path forward to me.

Copy link
Collaborator

@LittleBeannie LittleBeannie left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you, @jdblischak ! I will postpone the merged in case Keaven are interested in reviewing it.

@jdblischak
Copy link
Collaborator Author

I rebased the PR onto main to incorporate the recently merged #304

@@ -405,12 +406,12 @@ doAnalysis <- function(d, rho_gamma, n_stratum) {

event <- sum(d$event)
if (n_stratum > 1) {
ln_hr <- survival::coxph(survival::Surv(tte, event) ~ (treatment == "experimental") + survival::strata(stratum), data = d)$coefficients
ln_hr <- survival::coxph(Surv(tte, event) ~ (treatment == "experimental") + strata(stratum), data = d)$coefficients
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My only lingering concern is that this line that uses survival::strata() is not covered by our tests/examples, and this usage is specifically noted as behaving incorrectly in previous versions of {survival}. From the release notes:

Note that prior to this update using survival::strata(inst) above would give the wrong answer; because :: mucks up formula processing the term would not be recognized as a stratum, and the resulting fit equivalent to using factor(inst).

My questions are:

  1. Can we add a test/example with more than one stratum to execute this code?
  2. Are we worried that this could give incorrect results when run with survival <3.8-1?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@LittleBeannie LittleBeannie merged commit ea88914 into Merck:main Jan 9, 2025
7 checks passed
@jdblischak jdblischak deleted the survival-formula branch January 9, 2025 19:56
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants