ENH: Allow partial table schema in to_gbq() table_schema (#218) #257
This looks great! I think enabling schema 'overrides' is a good feature.
I left one question about the code, but the tests look great, so assuming they pass 👍
    def _update_bq_schema(schema_old, schema_new):
        from pandas_gbq import schema

        return schema.update_schema(schema_old, schema_new)
Do we need this function? Should we import from schema directly? Or is there a circular import?
I was just following the pattern used for the only other schema function, which is imported the same way in gbq._generate_bq_schema. As far as I'm concerned it's fine to get rid of it. Let me know and I'll make the change.
I'll defer to Chesterton's fence; we can clean up later if @tswast knows
_generate_bq_schema is only in gbq.py for backwards compatibility. We should use the update_schema method from the schema module directly.
I've sent #259 to clean this up (and also improve the docs for this feature).
Great! Any thoughts @tswast?
@JohnPaton Want to add a Whatsnew? Feel free to add your GH handle
@max-sixty Done! I didn't see any other handles in the Changelog so if that's not what you meant just let me know 😆
Many thanks @JohnPaton!
Thanks a bunch for the contribution!
This PR:
- Adds update_schema to pandas_gbq.schema, which takes an old schema and a new one, updating any existing fields and adding any new ones
- Uses schema.generate_bq_schema for every DataFrame passed to to_gbq, and if a table_schema is passed, this schema is used to update the generated schema. This way custom provided fields in table_schema are added, and missing fields fall back to the generated defaults.

This is the functionality requested in #218
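
For readers skimming the description, here is a minimal sketch of the merge behaviour it describes. It is not the code from this PR: the function name update_schema_sketch is hypothetical, and the schema layout (a dict with a "fields" list of dicts keyed by "name" and "type") is an assumption based on the usual BigQuery schema format.

```python
def update_schema_sketch(schema_old, schema_new):
    """Illustrative only: merge schema_new into schema_old by field name.

    Fields present in both are replaced by the schema_new version (keeping
    their original position); fields only in schema_new are appended.
    The input dicts are not modified.
    """
    output_fields = list(schema_old["fields"])
    index_by_name = {field["name"]: i for i, field in enumerate(output_fields)}

    for field in schema_new["fields"]:
        name = field["name"]
        if name in index_by_name:
            # Existing field: the user-provided definition wins.
            output_fields[index_by_name[name]] = field
        else:
            # New field: add it after the existing ones.
            index_by_name[name] = len(output_fields)
            output_fields.append(field)

    return {"fields": output_fields}


# Schema as it might be generated from a DataFrame (assumed example data).
generated = {
    "fields": [
        {"name": "id", "type": "INTEGER"},
        {"name": "created", "type": "STRING"},
    ]
}

# Partial table_schema: only the column the user wants to override.
partial = {"fields": [{"name": "created", "type": "TIMESTAMP"}]}

merged = update_schema_sketch(generated, partial)
```

Under these assumptions, "created" takes the user-supplied TIMESTAMP type while "id" falls back to the generated default, which is the partial-schema behaviour requested in #218.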