feat: support bigquery.vector_search() #736

Merged on Jun 7, 2024 (9 commits)

Conversation

@ashleyxuu (Contributor) commented May 30, 2024

Thank you for opening a Pull Request! Before submitting your PR, there are a few things you can do to make sure it goes smoothly:

  • Make sure to open an issue as a bug/issue before writing your code! That way we can discuss the change, evaluate designs, and agree on the general idea
  • Ensure the tests and linter pass
  • Code coverage does not decrease (if any source code was changed)
  • Appropriate docs were updated (if necessary)

Fixes internal #344019437 🦕

@ashleyxuu ashleyxuu requested a review from tswast May 30, 2024 20:41
@ashleyxuu ashleyxuu requested review from a team as code owners May 30, 2024 20:41
@product-auto-label product-auto-label bot added the size: l Pull request size is large. label May 30, 2024
@product-auto-label product-auto-label bot added the api: bigquery Issues related to the googleapis/python-bigquery-dataframes API. label May 30, 2024
@ashleyxuu ashleyxuu force-pushed the ashleyxu-vector-search branch from 02208aa to c36ba63 Compare May 31, 2024 18:14
@ashleyxuu ashleyxuu force-pushed the ashleyxu-vector-search branch from 49b65e8 to 72f411f Compare June 6, 2024 17:40
@ashleyxuu ashleyxuu force-pushed the ashleyxu-vector-search branch from 72f411f to 0cfa88a Compare June 6, 2024 19:09
>>> import bigframes.bigquery as bbq
>>> bpd.options.display.progress_bar = None

DataFrame embeddings for which to find nearest neighbors:

Review comment (Collaborator):

Let's clarify here that the ARRAY<FLOAT> column is used as the search query.

Reply (Contributor Author):

Done
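
For readers following the thread, here is a minimal pure-Python sketch of what "the embedding column is used as the search query" means: each query embedding is compared against every base embedding, and the closest `top_k` rows are returned. This is not the code from this PR and does not call BigQuery; the names `nearest_neighbors`, `base`, and `queries` and the sample vectors are hypothetical, chosen only for illustration.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def nearest_neighbors(base_rows, query_rows, top_k=2):
    """Brute-force nearest-neighbor search (illustrative only).

    base_rows and query_rows are lists of (row_id, embedding) pairs.
    Returns (query_id, base_id, distance) tuples, closest matches first
    within each query.
    """
    results = []
    for query_id, query_vec in query_rows:
        # Rank every base row by its distance to this query embedding.
        ranked = sorted(
            (euclidean(query_vec, base_vec), base_id)
            for base_id, base_vec in base_rows
        )
        for distance, base_id in ranked[:top_k]:
            results.append((query_id, base_id, distance))
    return results


# Hypothetical sample data: the query rows carry the embeddings to search with.
base = [("a", [1.0, 2.0]), ("b", [2.0, 4.0]), ("c", [1.5, 7.0])]
queries = [("q1", [1.0, 2.0]), ("q2", [2.0, 4.0])]
matches = nearest_neighbors(base, queries, top_k=2)
```

Each query row produces up to `top_k` result rows, which is why the output has more rows than the query input.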

<BLANKLINE>
[4 rows x 4 columns]

You can specify the name of the column in the query DataFrame embeddings and distance type:

Review comment (Collaborator):

Presumably query_column_to_search is required in this case, right? Let's tell the user that if so. Or if not, what happens if they don't specify?

Reply (Contributor Author):

done
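
The thread does not record what happens when `query_column_to_search` is omitted. As a sketch only, the snippet below assumes the fallback described for BigQuery's `VECTOR_SEARCH` SQL function, where the query column defaults to the same name as `column_to_search`. The helper `resolve_search_options`, the `DISTANCE_FUNCTIONS` table, and the column name `query_embedding` are hypothetical and are not part of the bigframes API.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def cosine_distance(a, b):
    """Cosine distance (1 - cosine similarity) between two non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / norm


# Hypothetical lookup table for the two distance types discussed in the docstring.
DISTANCE_FUNCTIONS = {"euclidean": euclidean, "cosine": cosine_distance}


def resolve_search_options(
    query_columns,
    column_to_search,
    query_column_to_search=None,
    distance_type="euclidean",
):
    """Pick the query column and distance function (illustrative only).

    Assumption: when query_column_to_search is not given, fall back to the
    name in column_to_search.
    """
    column = (
        query_column_to_search
        if query_column_to_search is not None
        else column_to_search
    )
    if column not in query_columns:
        raise ValueError(f"column {column!r} not found in the query DataFrame")
    key = distance_type.lower()
    if key not in DISTANCE_FUNCTIONS:
        raise ValueError(f"unsupported distance_type: {distance_type!r}")
    return column, DISTANCE_FUNCTIONS[key]


# Default: the query column name matches the base table's column.
default_choice = resolve_search_options(["id", "embedding"], "embedding")

# Explicit: a differently named query column and cosine distance.
explicit_choice = resolve_search_options(
    ["id", "query_embedding"],
    "embedding",
    query_column_to_search="query_embedding",
    distance_type="COSINE",
)
```

Under this assumption, a query DataFrame whose embedding column has a different name from the base table's column must pass `query_column_to_search` explicitly; otherwise the lookup fails.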

@ashleyxuu ashleyxuu force-pushed the ashleyxu-vector-search branch from 0cfa88a to f2859f3 Compare June 6, 2024 20:37
@GarrettWu GarrettWu merged commit dad66fd into main Jun 7, 2024
23 checks passed
@GarrettWu GarrettWu deleted the ashleyxu-vector-search branch June 7, 2024 05:24
3 participants