I did want to add to your ‘Zeros replace missing values’; in my time as an ETL developer (before going back into academia) I have come across the following as common entries for missing values:
-1
99
...
-99999
...
-99
00000
Basically, stuff the field with as many 9s as it will hold or, if possible, make it negative, as that is ‘clearly’ a nonsensical value. Never bother documenting any of these decisions. I’ve actually never come across 0 for missing, but that might be down to the type of data I work with. Point being, I would suggest emphasising that distributions and summaries, as well as specific tests on each row for ‘reasonableness’ (e.g. ‘break if {value} < 0’), are a key part of most processing and analysis regardless of the quality of the data itself.
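As a rough illustration of the kind of check described above, here is a minimal Python sketch. It is not taken from any particular pipeline: the `SUSPECT_SENTINELS` list is simply the values named in this note, the "all 9s" rule is an assumed heuristic, and `looks_like_sentinel` and `audit_column` are hypothetical helper names. It reports rather than breaks, so the flagged rows can be inspected first.

```python
from collections import Counter

# Assumed sentinel list, copied from the values named in this note;
# a real dataset will need its own list.
SUSPECT_SENTINELS = {"-1", "99", "-99", "-99999", "00000"}


def looks_like_sentinel(raw):
    """Return True if a raw field value looks like a stand-in for 'missing'."""
    text = raw.strip()
    if text in SUSPECT_SENTINELS:
        return True
    digits = text.lstrip("-")
    # Heuristic: a field stuffed with 9s (e.g. 999, -99999), two or more digits.
    return len(digits) >= 2 and set(digits) == {"9"}


def audit_column(values):
    """Summarise a column of raw strings: value counts, suspects, negative rows."""
    counts = Counter(v.strip() for v in values)
    suspects = {v: n for v, n in counts.items() if looks_like_sentinel(v)}
    negative_rows = []
    for row, v in enumerate(values):
        try:
            if float(v) < 0:  # the 'reasonableness' test: value < 0
                negative_rows.append(row)
        except ValueError:
            pass  # non-numeric entries are left for a separate check
    return {"counts": counts, "suspects": suspects, "negative_rows": negative_rows}


report = audit_column(["12", "7", "-99", "99999", "3", "-1", "12"])
print(report["suspects"])
print(report["negative_rows"])
```

Whether a flagged value such as 99 is really a missing-value marker or a legitimate reading depends on the field, so the output is a list of things to look at, not a list of things to delete.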
onyxfish changed the title from 'More "bad nulls" (from email' to 'More "bad nulls" (from email)' on Dec 17, 2015
Copying over from emailed note