Commit 937d8cc

analysis/cleansing: initial structure created
1 parent 93a22f0 commit 937d8cc

1 file changed: +23 -0 lines changed

analysis/cleansing.md

Lines changed: 23 additions & 0 deletions
@@ -0,0 +1,23 @@
Data cleansing
==============

Often, the data you have will contain errors, and you'll need to clean it up.
Missing data, inconsistent labels, duplicates and misspellings are common
examples of errors. The [Wikipedia article on data
cleansing](http://en.wikipedia.org/wiki/Data_cleansing) talks more about this.
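
As a rough illustration of the four error types named above, here is a minimal Python sketch using only the standard library. The sample records, the field names `city` and `sales`, the helper function names and the `cutoff` value are all assumptions made for this example; none of them come from the original document.

```python
from collections import Counter
from difflib import get_close_matches

# Hypothetical sample records; the field names are illustrative only.
records = [
    {"city": "Chennai", "sales": "120"},
    {"city": "chennai ", "sales": "95"},   # inconsistent label (case, trailing space)
    {"city": "Chenai", "sales": "80"},     # misspelling
    {"city": "Mumbai", "sales": ""},       # missing value
    {"city": "Mumbai", "sales": ""},       # exact duplicate of the previous row
]

def find_missing(rows, field):
    """Return indexes of rows where `field` is absent or blank."""
    return [i for i, r in enumerate(rows) if not (r.get(field) or "").strip()]

def find_duplicates(rows):
    """Return indexes of rows that exactly repeat an earlier row."""
    seen, dupes = set(), []
    for i, r in enumerate(rows):
        key = tuple(sorted(r.items()))
        if key in seen:
            dupes.append(i)
        seen.add(key)
    return dupes

def find_label_variants(rows, field):
    """Group raw labels that differ only by case or surrounding whitespace."""
    groups = {}
    for r in rows:
        raw = r.get(field) or ""
        groups.setdefault(raw.strip().lower(), set()).add(raw)
    return {k: sorted(v) for k, v in groups.items() if len(v) > 1}

def find_possible_misspellings(rows, field, cutoff=0.8):
    """Pair each rarer label with a more frequent label that closely matches it."""
    counts = Counter((r.get(field) or "").strip().lower() for r in rows)
    suspects = {}
    for label, n in counts.items():
        commoner = [other for other, m in counts.items() if m > n]
        match = get_close_matches(label, commoner, n=1, cutoff=cutoff)
        if match:
            suspects[label] = match[0]
    return suspects

print(find_missing(records, "sales"))
print(find_duplicates(records))
print(find_label_variants(records, "city"))
print(find_possible_misspellings(records, "city"))
```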

The typical cycle of data cleansing is:

1. Find errors
2. Correct the errors
3. Repeat steps 1 and 2 as often as required
4. Automate the correction process for new data feeds
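
The cycle above might be sketched in Python roughly as follows. This is only one possible shape, under stated assumptions: the `city` field, the set of known labels, the corrections table and the function names `find_errors`, `correct` and `cleanse` are all hypothetical and are not taken from the original document. In practice, a person would usually extend the corrections table between passes.

```python
def find_errors(rows, known_labels):
    """Step 1: return (row index, problem) pairs. The single check is illustrative."""
    errors = []
    for i, row in enumerate(rows):
        if row["city"] not in known_labels:
            errors.append((i, "unknown label"))
    return errors

def correct(rows, corrections):
    """Step 2: strip whitespace, then apply a lookup table of corrections."""
    fixed = []
    for row in rows:
        city = row["city"].strip()
        fixed.append({**row, "city": corrections.get(city, city)})
    return fixed

def cleanse(rows, known_labels, corrections, max_passes=5):
    """Step 3: repeat find and correct until nothing is flagged or nothing changes."""
    for _ in range(max_passes):
        if not find_errors(rows, known_labels):
            break
        corrected = correct(rows, corrections)
        if corrected == rows:  # no rule applies any more; left for manual review
            break
        rows = corrected
    return rows, find_errors(rows, known_labels)

# Step 4: the same function can be run on every new feed, reusing the
# corrections table built up so far (all values here are made up).
known = {"Chennai", "Mumbai"}
corrections = {"Chenai": "Chennai", "Bombay": "Mumbai"}
feed = [{"city": "Chenai "}, {"city": "Mumbai"}, {"city": "Pune"}]

cleaned, remaining = cleanse(feed, known, corrections)
print(cleaned)
print(remaining)  # rows that still need a human decision
```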

Finding errors
--------------

Correcting errors
-----------------

Automating correction
---------------------
