feat: add pd.get_dummies #149

Merged
merged 28 commits into from
Nov 1, 2023
7a30750
feat: add pd.get_dummies
Oct 26, 2023
1b3fae0
Merge branch 'main' of github.com:googleapis/python-bigquery-datafram…
Oct 26, 2023
0aca97c
remove unneeded prefix case
Oct 26, 2023
706fa35
Merge branch 'main' into b297352026-get-dummies
milkshakeiii Oct 26, 2023
cb2a8b1
Merge branch 'main' of github.com:googleapis/python-bigquery-datafram…
Oct 27, 2023
0781c08
param/documentation fixes
Oct 27, 2023
eec7822
be stricter about types in test
Oct 27, 2023
bdee75e
be stricter about types in series test
Oct 27, 2023
15b1aa3
remove unneeded comment
Oct 27, 2023
ef57790
Merge branch 'b297352026-get-dummies' of github.com:googleapis/python…
Oct 27, 2023
0f7c38e
adjust for type difference in pandas 1
Oct 27, 2023
b060475
add example code (tested)
Oct 30, 2023
ce5ea69
fix None columns and add test cases
Oct 30, 2023
758bd6d
variable names and _get_unique_values per-column
Oct 30, 2023
dab3eba
account for pandas 1 behavior difference
Oct 30, 2023
b2032e1
remove already_seen set
Oct 30, 2023
1899a58
avoid unnecessary join/projection
Oct 30, 2023
1a71217
fix column ordering edge case
Oct 30, 2023
257531a
adjust for picky examples checker
Oct 30, 2023
87b358e
example tweak
Oct 31, 2023
979eb39
make part of the example comments
Oct 31, 2023
5b3dc18
use ellipsis in doctest comment
Oct 31, 2023
aa7a0a3
add <BLANKLINES> to doctest string
Oct 31, 2023
9db8707
Merge branch 'main' into b297352026-get-dummies
milkshakeiii Oct 31, 2023
f178012
extract parameter standardization
Nov 1, 2023
cc4aa4c
extract submethods
Nov 1, 2023
6bc36a5
Merge branch 'main' of github.com:googleapis/python-bigquery-datafram…
Nov 1, 2023
3fcdd5f
Merge branch 'b297352026-get-dummies' of github.com:googleapis/python…
Nov 1, 2023
adjust for picky examples checker
Henry J Solberg committed Oct 30, 2023
commit 257531aca5b007b5640ff3f5d090c65a3e40a8b7
24 changes: 18 additions & 6 deletions third_party/bigframes_vendored/pandas/core/reshape/encoding.py
@@ -28,51 +28,63 @@ def get_dummies(
>>> import bigframes.pandas as pd
>>> s = pd.Series(list('abca'))
>>> pd.get_dummies(s)
       a      b      c
0   True  False  False
1  False   True  False
2  False  False   True
3   True  False  False

[4 rows x 3 columns]

>>> s1 = pd.Series(['a', 'b', None])

>>> pd.get_dummies(s1)
       a      b
0   True  False
1  False   True
2  False  False

[3 rows x 2 columns]

>>> pd.get_dummies(s1, dummy_na=True)
       a      b   <NA>
0   True  False  False
1  False   True  False
2  False  False   True

[3 rows x 3 columns]

>>> df = pd.DataFrame({'A': ['a', 'b', 'a'], 'B': ['b', 'a', 'c'],
...                    'C': [1, 2, 3]})

>>> pd.get_dummies(df, prefix=['col1', 'col2'])
   C  col1_a  col1_b  col2_a  col2_b  col2_c
0  1    True   False   False    True   False
1  2   False    True    True   False   False
2  3    True   False   False   False    True

[3 rows x 6 columns]

>>> pd.get_dummies(pd.Series(list('abcaa')))
       a      b      c
0   True  False  False
1  False   True  False
2  False  False   True
3   True  False  False
4   True  False  False

[5 rows x 3 columns]

>>> pd.get_dummies(pd.Series(list('abcaa')), drop_first=True)
       b      c
0  False  False
1   True  False
2  False   True
3  False  False
4  False  False

[5 rows x 2 columns]

Args:
data (Series or DataFrame):
Data of which to get dummy indicators.
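The docstring examples above cover three behaviors: plain one-hot encoding, the extra null-indicator column added by dummy_na=True, and the dropped first level with drop_first=True. The following is a minimal pure-Python sketch of those semantics. It is not the bigframes implementation; the helper name one_hot_rows is hypothetical, and None stands in for <NA>.

```python
from __future__ import annotations

from typing import Optional, Sequence


def one_hot_rows(
    values: Sequence[Optional[str]],
    dummy_na: bool = False,
    drop_first: bool = False,
) -> tuple[list, list[list[bool]]]:
    """Return (column_labels, rows) for a one-hot encoding of `values`.

    Hypothetical helper for illustration only; it is not part of
    bigframes or pandas.
    """
    # Distinct non-null levels in sorted order, matching the column
    # order shown in the docstring examples.
    levels: list = sorted({v for v in values if v is not None})
    if dummy_na:
        # None stands in for the <NA> column and is placed last.
        levels.append(None)
    if drop_first:
        # Drop the first level; rows holding that value become all False.
        levels = levels[1:]
    # One boolean per remaining level for every input value.
    rows = [[value == level for level in levels] for value in values]
    return levels, rows


labels, rows = one_hot_rows(["a", "b", None], dummy_na=True)
print(labels)  # ['a', 'b', None]
print(rows)    # [[True, False, False], [False, True, False], [False, False, True]]
```

Without dummy_na, a null input produces a row of all False values, which matches the third row of the s1 example.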