README.md: 2 additions & 2 deletions
@@ -119,12 +119,12 @@ Continuous variables are anything measured on a quantitative scale that could be
 would be something like weight measured in kg. [Ordinal data](http://en.wikipedia.org/wiki/Ordinal_data) are data that have a fixed, small (< 100) number of levels but are ordered.
 This could be for example survey responses where the choices are: poor, fair, good. [Categorical data](http://en.wikipedia.org/wiki/Categorical_variable) are data where there
 are multiple categories, but they aren't ordered. One example would be sex: male or female. [Missing data](http://en.wikipedia.org/wiki/Missing_data) are data
-that are missing and you don't know the mechanism. You should code missing values as `NA`. [Censored data](http://en.wikipedia.org/wiki/Censoring_(statistics\)) are data
+that are missing and you don't know the mechanism. You should code missing values as `NA`. [Censored data](http://en.wikipedia.org/wiki/Censoring_\(statistics\)) are data
 where you know the missingness mechanism on some level. Common examples are a measurement being below a detection limit
 or a patient being lost to follow-up. They should also be coded as `NA` when you don't have the data. But you should
 also add a new column to your tidy data called, "VariableNameCensored" which should have values of `TRUE` if censored
 and `FALSE` if not. In the code book you should explain why those values are missing. It is absolutely critical to report
-to the analyst if there is a reason you know about that some of the data are missing. You should also not [impute](http://en.wikipedia.org/wiki/Imputation_(statistics\))/make up/
+to the analyst if there is a reason you know about that some of the data are missing. You should also not [impute](http://en.wikipedia.org/wiki/Imputation_\(statistics\))/make up/
 throw away missing observations.

 In general, try to avoid coding categorical or ordinal variables as numbers. When you enter the value for sex in the tidy