Commit 87b62af

Codecov and Census dataset
1 parent 76e685e commit 87b62af

10 files changed: 382 additions and 8 deletions

.travis.yml

Lines changed: 3 additions & 3 deletions
@@ -20,9 +20,9 @@ install:
  - pip install -e .

  script:
- - if [ $TEST == 'unit' ]; then pytest --cov=./ --sanitize-with tests/sanitize-notebook.cfg tests/unit/; fi
- - if [ $TEST == 'issue' ]; then pytest --cov=./ --sanitize-with tests/sanitize-notebook.cfg tests/issues/; fi
- - if [ $TEST == 'examples' ]; then pytest --nbval --cov=./ --sanitize-with tests/sanitize-notebook.cfg examples/; fi
+ - if [ $TEST == 'unit' ]; then pytest --cov=$TEST --sanitize-with tests/sanitize-notebook.cfg tests/unit/; fi
+ - if [ $TEST == 'issue' ]; then pytest --cov=$TEST --sanitize-with tests/sanitize-notebook.cfg tests/issues/; fi
+ - if [ $TEST == 'examples' ]; then pytest --nbval --cov=$TEST --sanitize-with tests/sanitize-notebook.cfg examples/; fi
  # Our well-behaved Unix-style command-line tool exits with code 0 unless an internal error occurred
  - if [ $TEST == 'console' ]; then pandas_profiling -h; fi

README.md

Lines changed: 1 addition & 0 deletions
@@ -22,6 +22,7 @@ For each column the following statistics - if relevant for the column type - are

  The following examples can give you an impression of what the package can do:

+ * [Census Income](http://pandas-profiling.github.io/pandas-profiling/examples/census/census_report.html) (US Adult Census data relating income)
  * [NASA Meteorites](http://pandas-profiling.github.io/pandas-profiling/examples/meteorites/meteorites_report.html) (comprehensive set of meteorite landings)
  * [Titanic](http://pandas-profiling.github.io/pandas-profiling/examples/titanic/titanic_report.html) (the "Wonderwall" of datasets)
  * [NZA](http://pandas-profiling.github.io/pandas-profiling/examples/nza/nza_report.html) (open data from the Dutch Healthcare Authority)

docs/index.html

Lines changed: 1 addition & 0 deletions
@@ -42,6 +42,7 @@ <h1 id="pandas-profiling">Pandas Profiling</h1>
  <h2 id="examples">Examples</h2>
  <p>The following examples can give you an impression of what the package can do:</p>
  <ul>
+ <li><a href="http://pandas-profiling.github.io/pandas-profiling/examples/census/census_report.html">Census Income</a> (US Adult Census data relating income)</li>
  <li><a href="http://pandas-profiling.github.io/pandas-profiling/examples/meteorites/meteorites_report.html">NASA Meteorites</a> (comprehensive set of meteorite landings)</li>
  <li><a href="http://pandas-profiling.github.io/pandas-profiling/examples/titanic/titanic_report.html">Titanic</a> (the "Wonderwall" of datasets)</li>
  <li><a href="http://pandas-profiling.github.io/pandas-profiling/examples/nza/nza_report.html">NZA</a> (open data from the Dutch Healthcare Authority)</li>

examples/census/census.py

Lines changed: 44 additions & 0 deletions
@@ -0,0 +1,44 @@
+ from pathlib import Path
+
+ import pandas as pd
+ import numpy as np
+ import requests
+
+ import pandas_profiling
+
+ if __name__ == "__main__":
+     file_name = Path("census_train.csv")
+     if not file_name.exists():
+         data = requests.get(
+             "https://archive.ics.uci.edu/ml/machine-learning-databases/adult/adult.data"
+         )
+         file_name.write_bytes(data.content)
+
+     # Names based on https://archive.ics.uci.edu/ml/machine-learning-databases/adult/adult.names
+     df = pd.read_csv(
+         file_name,
+         header=None,
+         index_col=False,
+         names=[
+             "age",
+             "workclass",
+             "fnlwgt",
+             "education",
+             "education-num",
+             "marital-status",
+             "occupation",
+             "relationship",
+             "race",
+             "sex",
+             "capital-gain",
+             "capital-loss",
+             "hours-per-week",
+             "native-country",
+         ],
+     )
+
+     # Prepare missing values
+     df = df.replace("\\?", np.nan, regex=True)
+
+     profile = df.profile_report(title="Census Dataset")
+     profile.to_file(output_file=Path("./census_report.html"))
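
Review note on `examples/census/census.py`: the script marks missing values with a regex replace after loading. A possible alternative, sketched below and not part of this commit, is to let `pd.read_csv` handle them at parse time via `skipinitialspace=True` and `na_values="?"`. The sketch assumes the data file separates fields with a comma and a space, uses `?` for unknown values, and carries a trailing label field after `native-country`; the column name `income` and the two sample rows are made up for illustration.

```python
import io

import pandas as pd

# The 14 column names used in census.py, plus a hypothetical "income"
# name for the trailing label field (an assumption, not from the commit).
COLUMNS = [
    "age", "workclass", "fnlwgt", "education", "education-num",
    "marital-status", "occupation", "relationship", "race", "sex",
    "capital-gain", "capital-loss", "hours-per-week", "native-country",
    "income",
]

# Two invented rows in the assumed layout: ", " separators and "?" for
# unknown values. These are not rows from the real dataset.
SAMPLE = (
    "30, Private, 100000, Bachelors, 13, Never-married, Sales, "
    "Not-in-family, White, Female, 0, 0, 40, United-States, <=50K\n"
    "45, ?, 200000, HS-grad, 9, Divorced, ?, "
    "Unmarried, Black, Male, 0, 0, 38, ?, >50K\n"
)


def load_census(source):
    """Read adult-style CSV data, treating "?" as missing."""
    return pd.read_csv(
        source,
        header=None,
        names=COLUMNS,
        skipinitialspace=True,  # drop the space that follows each comma
        na_values="?",          # parse "?" as NaN while reading
    )


df = load_census(io.StringIO(SAMPLE))
print(df.isna().sum())
```

Passing a file path instead of the `StringIO` object should work the same way. Whether this is preferable to the regex replace depends on whether a literal `?` can occur inside a legitimate value, which this sketch does not check.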

examples/census/census_report.html

Lines changed: 328 additions & 0 deletions
Large diffs are not rendered by default.

examples/meteorites/meteorites_report.html

Lines changed: 1 addition & 1 deletion
Large diffs are not rendered by default.

examples/nza/nza_report.html

Lines changed: 1 addition & 1 deletion
Large diffs are not rendered by default.

examples/stata_auto/stata_auto_report.html

Lines changed: 1 addition & 1 deletion
Large diffs are not rendered by default.

examples/titanic/titanic_report.html

Lines changed: 1 addition & 1 deletion
Large diffs are not rendered by default.

examples/website_inaccessibility/website_inaccessibility_report.html

Lines changed: 1 addition & 1 deletion
Large diffs are not rendered by default.
