Welcome to pointblank Discussions! #244
Replies: 3 comments 2 replies
-
I'm just getting familiar with pointblank and I'm very excited about it! Google BigQuery is a core part of my data pipeline. I don't see any mention of adding BigQuery in the (very nice) list of upcoming work, so I assume support for it is far off, if it is planned at all. I'm curious whether there is any reason it would be especially difficult to support BigQuery, or if it's just a low priority for other reasons. Looking forward to using pointblank in our nascent data validation steps!
-
Ah, great to know that some functions might work as they are. I'll test
them out and check on the four you mentioned that may be an issue -- I
should be getting back to working on validation a few weeks from now. Will
report back!
Elaine McVey
On Mon, Jan 4, 2021 at 3:07 PM Richard Iannone wrote:
Hi Elaine, thanks for getting this discussion going! I'd love to get
BigQuery 'verified' as working. My main difficulty was/is that testing
against databases is hard, mainly because of access. My guess right now is
that running pointblank against BigQuery might be okay for a lot of the
validation functions. Some of the ones where it might not work so well are
col_vals_regex(), col_vals_increasing(), col_vals_decreasing(), and
rows_duplicated().
Would you be able to tentatively test out pointblank on a BigQuery table?
If you could that would be really great. I could provide a table and an R
script that exercises all of the validation functions.
If certain steps don't work, then col_vals_expr() could provide a
nice workaround. It takes a dplyr expression, translates it to SQL, and runs
that as the validation. In the future, I also want to include a
col_vals_sql() function where you send SQL for the validation step.
Knowing that you need BigQuery to work, I could prioritize this bit of
work (I'll create an Issue). I was pretty happy to find out that pointblank
works pretty well (as far as I can tell w/o testing) on Snowflake
(https://dev.solita.fi/2020/12/16/data-quality-with-r.html). So there's
hope for BigQuery!
-
Hi Rich, could a data entry package like {DataEditR} lean on {pointblank} as a dependency to do real-time data validation and reject invalid values at the point of data entry?
-
👋 Welcome!
We’re using Discussions as a place to connect with other members of our community. We hope that you:
build together 💪.