feat: add PolynomialFeatures to_gbq and pipeline support #805

Merged
merged 6 commits into from
Jun 28, 2024

Conversation

GarrettWu
Contributor

@GarrettWu GarrettWu commented Jun 24, 2024

BEGIN_COMMIT_OVERRIDE
feat: add PolynomialFeatures support to to_gbq and pipelines (#805)
END_COMMIT_OVERRIDE

Thank you for opening a Pull Request! Before submitting your PR, there are a few things you can do to make sure it goes smoothly:

  • Make sure to open an issue as a bug/issue before writing your code! That way we can discuss the change, evaluate designs, and agree on the general idea
  • Ensure the tests and linter pass
  • Code coverage does not decrease (if any source code was changed)
  • Appropriate docs were updated (if necessary)

Fixes #<issue_number_goes_here> 🦕

@GarrettWu GarrettWu self-assigned this Jun 24, 2024
@product-auto-label product-auto-label bot added size: l Pull request size is large. api: bigquery Issues related to the googleapis/python-bigquery-dataframes API. labels Jun 24, 2024
@GarrettWu GarrettWu requested review from junyazhang and shobsi June 25, 2024 17:41
@GarrettWu GarrettWu marked this pull request as ready for review June 25, 2024 17:41
@GarrettWu GarrettWu requested review from a team as code owners June 25, 2024 17:41
@GarrettWu GarrettWu added the kokoro:force-run Add this label to force Kokoro to re-run the tests. label Jun 25, 2024
@kokoro-team kokoro-team removed the kokoro:force-run Add this label to force Kokoro to re-run the tests. label Jun 25, 2024
@GarrettWu GarrettWu added the kokoro:force-run Add this label to force Kokoro to re-run the tests. label Jun 25, 2024
@kokoro-team kokoro-team removed the kokoro:force-run Add this label to force Kokoro to re-run the tests. label Jun 25, 2024
Tuple[
str,
Union[preprocessing.PreprocessingType, impute.SimpleImputer],
Union[str, List[str]],
]
-] = []
+] = set()
Contributor

what's the reason for going from list to set?

Contributor Author

This processor is different from the others. The previous transformers are 1 -> 1, whereas poly_features is n -> m. So for the other transformers we get one transformed_column each, but for poly_features we get m transformed_columns that all point back to the same transformer, and those need to be deduped. The order of the transformers doesn't matter, so a set works.
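To illustrate the dedup argument, here is a minimal sketch in plain Python. It does not use the bigframes API: the `ParsedEntry` shape, the `collect_transformers` helper, and the transformer and column names are all hypothetical, and the count of 5 output columns is arbitrary. It only shows why collecting per-output-column entries in a set collapses the m duplicates produced by an n -> m transformer.

```python
from typing import List, Set, Tuple

# Hypothetical stand-in for one parsed output column:
# (transformer name, input columns). The input columns are a tuple
# rather than a list so that the entry is hashable.
ParsedEntry = Tuple[str, Tuple[str, ...]]


def collect_transformers(parsed: List[ParsedEntry]) -> Set[ParsedEntry]:
    """Collapse per-output-column entries into unique transformers."""
    unique: Set[ParsedEntry] = set()
    for entry in parsed:
        # Repeated entries from an n -> m transformer collapse here.
        unique.add(entry)
    return unique


# A 1 -> 1 transformer contributes exactly one entry.
parsed: List[ParsedEntry] = [("standard_scaler", ("a",))]

# An n -> m transformer repeats the same entry once per output column
# (5 output columns here, chosen arbitrarily for the example).
parsed += [("poly_features", ("a", "b"))] * 5

transformers = collect_transformers(parsed)
```

With a list, the hypothetical `poly_features` entry would appear five times; with a set it appears once. The trade-off is that ordering is lost, which is acceptable here because the order of the transformers doesn't matter.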

@GarrettWu GarrettWu requested a review from shobsi June 28, 2024 17:20
@GarrettWu GarrettWu merged commit 57d98b9 into main Jun 28, 2024
22 of 23 checks passed
@GarrettWu GarrettWu deleted the garrettwu-poly2 branch June 28, 2024 23:10
@tswast tswast changed the title feat: add PolynomailFeatures to_gbq and pipeline support feat: add PolynomialFeatures to_gbq and pipeline support Jul 1, 2024
Labels
api: bigquery Issues related to the googleapis/python-bigquery-dataframes API. size: l Pull request size is large.
3 participants