docs: add llm kmeans notebook as an included example #177

Merged
merged 27 commits into from
Nov 9, 2023
add connection and cleanup
Henry J Solberg committed Nov 8, 2023
commit 299800fe9847131a353fae26713438cf1931eb55
157 changes: 153 additions & 4 deletions notebooks/generative_ai/bq_dataframes_llm_kmeans.ipynb
@@ -140,7 +140,10 @@
"outputs": [],
"source": [
"# set your project ID below\n",
"PROJECT_ID = \"\" # @param {type:\"string\"}"
"PROJECT_ID = \"\" # @param {type:\"string\"}\n",
"\n",
"# Set the project id in gcloud\n",
"! gcloud config set project {PROJECT_ID}"
]
},
{
@@ -162,6 +165,146 @@
"REGION = \"US\" # @param {type: \"string\"}"
]
},
{
"attachments": {},
"cell_type": "markdown",
"metadata": {},
"source": [
"#### Authenticate your Google Cloud account\n",
"\n",
"Depending on your Jupyter environment, you might have to manually authenticate. Follow the relevant instructions below."
]
},
{
"attachments": {},
"cell_type": "markdown",
"metadata": {},
"source": [
"**Vertex AI Workbench**\n",
"\n",
"No action is needed; you are already authenticated."
]
},
{
"attachments": {},
"cell_type": "markdown",
"metadata": {},
"source": [
"**Local JupyterLab instance**\n",
"\n",
"Uncomment and run the following cell:"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# ! gcloud auth login"
]
},
{
"attachments": {},
"cell_type": "markdown",
"metadata": {},
"source": [
"**Colab**\n",
"\n",
"Uncomment and run the following cell:"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# from google.colab import auth\n",
"# auth.authenticate_user()"
]
},
{
"attachments": {},
"cell_type": "markdown",
"metadata": {},
"source": [
"If you want to reset the location of the created DataFrame or Series objects, reset the session by executing `bpd.close_session()`. After that, you can reuse `bpd.options.bigquery.location` to specify another location."
]
},
{
"attachments": {},
"cell_type": "markdown",
"metadata": {},
"source": [
"### Connect to Vertex AI\n",
"\n",
"To use `PaLM2TextGenerator`, we need to set up a [cloud resource connection](https://cloud.google.com/bigquery/docs/create-cloud-resource-connection)."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"from google.cloud import bigquery_connection_v1 as bq_connection\n",
"\n",
"CONN_NAME = \"bqdf-llm\"\n",
"\n",
"client = bq_connection.ConnectionServiceClient()\n",
"new_conn_parent = f\"projects/{PROJECT_ID}/locations/{REGION}\"\n",
"exists_conn_parent = f\"projects/{PROJECT_ID}/locations/{REGION}/connections/{CONN_NAME}\"\n",
"cloud_resource_properties = bq_connection.CloudResourceProperties({})\n",
"\n",
"try:\n",
"    request = client.get_connection(\n",
"        request=bq_connection.GetConnectionRequest(name=exists_conn_parent)\n",
"    )\n",
"    CONN_SERVICE_ACCOUNT = f\"serviceAccount:{request.cloud_resource.service_account_id}\"\n",
"except Exception:\n",
"    connection = bq_connection.types.Connection(\n",
"        {\"friendly_name\": CONN_NAME, \"cloud_resource\": cloud_resource_properties}\n",
"    )\n",
"    request = bq_connection.CreateConnectionRequest(\n",
"        {\n",
"            \"parent\": new_conn_parent,\n",
"            \"connection_id\": CONN_NAME,\n",
"            \"connection\": connection,\n",
"        }\n",
"    )\n",
"    response = client.create_connection(request)\n",
"    CONN_SERVICE_ACCOUNT = (\n",
"        f\"serviceAccount:{response.cloud_resource.service_account_id}\"\n",
"    )\n",
"print(CONN_SERVICE_ACCOUNT)"
]
},
{
"attachments": {},
"cell_type": "markdown",
"metadata": {},
"source": [
"### Set permissions for the service account\n",
"\n",
"The resource connection service account requires certain project-level permissions:\n",
" - `roles/aiplatform.user` and `roles/bigquery.connectionUser`: These roles are required for the connection to create a model definition using the LLM model in Vertex AI ([documentation](https://cloud.google.com/bigquery/docs/generate-text#give_the_service_account_access)).\n",
" - `roles/run.invoker`: This role is required for the connection to have read-only access to Cloud Run services that back custom/remote functions ([documentation](https://cloud.google.com/bigquery/docs/remote-functions#grant_permission_on_function)).\n",
"\n",
"Set these permissions by running the following `gcloud` commands:"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"!gcloud projects add-iam-policy-binding {PROJECT_ID} --condition=None --no-user-output-enabled --member={CONN_SERVICE_ACCOUNT} --role='roles/bigquery.connectionUser'\n",
"!gcloud projects add-iam-policy-binding {PROJECT_ID} --condition=None --no-user-output-enabled --member={CONN_SERVICE_ACCOUNT} --role='roles/aiplatform.user'\n",
"!gcloud projects add-iam-policy-binding {PROJECT_ID} --condition=None --no-user-output-enabled --member={CONN_SERVICE_ACCOUNT} --role='roles/run.invoker'"
]
},
{
"attachments": {},
"cell_type": "markdown",
@@ -191,7 +334,7 @@
"import bigframes.pandas as bpd\n",
"\n",
"bpd.options.bigquery.project = PROJECT_ID\n",
"bpd.options.bigquery.location = LOCATION"
"bpd.options.bigquery.location = REGION"
]
},
{
@@ -456,7 +599,8 @@
"source": [
"from bigframes.ml.llm import PaLM2TextGenerator\n",
"\n",
"q_a_model = PaLM2TextGenerator(connection_name=\"bigframes-dev.us.bigframes-ml\")"
"connection = f\"{PROJECT_ID}.{REGION}.{CONN_NAME}\"\n",
"q_a_model = PaLM2TextGenerator(connection_name=connection)"
]
},
{
@@ -512,7 +656,12 @@
"metadata": {},
"outputs": [],
"source": [
"# TODO"
"# # Delete the BigQuery Connection\n",
"# from google.cloud import bigquery_connection_v1 as bq_connection\n",
"# client = bq_connection.ConnectionServiceClient()\n",
"# CONNECTION_ID = f\"projects/{PROJECT_ID}/locations/{REGION}/connections/{CONN_NAME}\"\n",
"# client.delete_connection(name=CONNECTION_ID)\n",
"# print(f\"Deleted connection '{CONNECTION_ID}'.\")"
]
}
],
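Note on the identifiers in this commit: the notebook builds three different strings from `PROJECT_ID`, `REGION`, and `CONN_NAME`. These are the parent path passed to `CreateConnectionRequest`, the full resource name used for `get_connection` and `delete_connection`, and the dotted name passed to `PaLM2TextGenerator(connection_name=...)`. The sketch below only mirrors those f-strings in one place so the relationship is easier to see. The `ConnectionNames` class and the `my-project` value are hypothetical and are not part of the notebook or of any Google Cloud library.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConnectionNames:
    """Hypothetical helper mirroring the f-strings used in the notebook."""

    project_id: str
    region: str
    conn_name: str

    @property
    def parent(self) -> str:
        # Same format as `new_conn_parent` in the connection cell.
        return f"projects/{self.project_id}/locations/{self.region}"

    @property
    def resource_name(self) -> str:
        # Same format as `exists_conn_parent` and the cleanup cell's CONNECTION_ID.
        return f"{self.parent}/connections/{self.conn_name}"

    @property
    def bigframes_name(self) -> str:
        # Same format as the `connection` string given to PaLM2TextGenerator.
        return f"{self.project_id}.{self.region}.{self.conn_name}"


# "my-project" is a placeholder; "US" and "bqdf-llm" are the notebook defaults.
names = ConnectionNames("my-project", "US", "bqdf-llm")
print(names.parent)
print(names.resource_name)
print(names.bigframes_name)
```

Deriving all three from a single object would keep the creation cell, the model cell, and the commented-out cleanup cell from drifting apart if `CONN_NAME` or `REGION` changes, though the inline f-strings in the notebook work equally well.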