some fixes

googleapis · ashleyxuu · Nov 9, 2023 · Nov 7, 2023
commit 392488cdf059f355bf770034004b8fbd4d0391f6
@@ -61,7 +61,7 @@
         "\n",
         "1. Use PaLM2TextEmbeddingGenerator to [generate text embeddings](https://cloud.google.com/vertex-ai/docs/generative-ai/embeddings/get-text-embeddings) for each of 10000 complaints sent to an online bank. If you're not familiar with what a text embedding is, it's a list of numbers that are like coordinates in an imaginary \"meaning space\" for sentences. (It's like [word embeddings](https://en.wikipedia.org/wiki/Word_embedding), but for more general text.) The important point for our purposes is that similar sentences are close to each other in this imaginary space.\n",
         "2. Use KMeans clustering to group together complaints whose text embeddings are near each other. This will give us sets of similar complaints, but we don't yet know _why_ these complaints are similar.\n",
-        "3. Simply ask PaLM2TextGenerator in English what the difference is between the groups of complaints that we got. Thanks to the power of modern LLMs, the response might give us a very good idea of what these complaints are all about, but remember to [\"understand the limits of your dataset and model.\"](https://ai.google/responsibility/responsible-ai-practices/#:~:text=Understand%20the%20limitations%20of%20your%20dataset%20and%20model)\n",
+        "3. Prompt PaLM2TextGenerator in English, asking what distinguishes the groups of complaints from one another. Thanks to the power of modern LLMs, the response might give us a very good idea of what these complaints are all about, but remember to [\"understand the limitations of your dataset and model.\"](https://ai.google/responsibility/responsible-ai-practices/#:~:text=Understand%20the%20limitations%20of%20your%20dataset%20and%20model)\n",
         "\n",
         "We will tie these pieces together in Python using BigQuery DataFrames. [Click here](https://cloud.google.com/bigquery/docs/dataframes-quickstart) to learn more about BigQuery DataFrames!"
       ]
@@ -87,13 +87,51 @@
         "\n",
         "* BigQuery (compute)\n",
         "* BigQuery ML\n",
+        "* Generative AI support on Vertex AI\n",
         "\n",
-        "Learn about [BigQuery compute pricing](https://cloud.google.com/bigquery/pricing#analysis_pricing_models),\n",
+        "Learn about [BigQuery compute pricing](https://cloud.google.com/bigquery/pricing#analysis_pricing_models), [Generative AI support on Vertex AI pricing](https://cloud.google.com/vertex-ai/pricing#generative_ai_models),\n",
         "and [BigQuery ML pricing](https://cloud.google.com/bigquery/pricing#bqml),\n",
         "and use the [Pricing Calculator](https://cloud.google.com/products/calculator/)\n",
         "to generate a cost estimate based on your projected usage."
       ]
     },
+    {
+      "attachments": {},
+      "cell_type": "markdown",
+      "metadata": {},
+      "source": [
+        "## Before you begin\n",
+        "\n",
+        "Complete the tasks in this section to set up your environment."
+      ]
+    },
+    {
+      "attachments": {},
+      "cell_type": "markdown",
+      "metadata": {},
+      "source": [
+        "### Set up your Google Cloud project\n",
+        "\n",
+        "**The following steps are required, regardless of your notebook environment.**\n",
+        "\n",
+        "1. [Select or create a Google Cloud project](https://console.cloud.google.com/cloud-resource-manager). When you first create an account, you get a $300 credit towards your compute/storage costs.\n",
+        "\n",
+        "2. [Make sure that billing is enabled for your project](https://cloud.google.com/billing/docs/how-to/modify-project).\n",
+        "\n",
+        "3. [Click here](https://console.cloud.google.com/flows/enableapi?apiid=bigquery.googleapis.com,bigqueryconnection.googleapis.com,cloudfunctions.googleapis.com,run.googleapis.com,artifactregistry.googleapis.com,cloudbuild.googleapis.com,cloudresourcemanager.googleapis.com,aiplatform.googleapis.com) to enable the following APIs:\n",
+        "\n",
+        "  * BigQuery API\n",
+        "  * BigQuery Connection API\n",
+        "  * Cloud Functions API\n",
+        "  * Cloud Run API\n",
+        "  * Artifact Registry API\n",
+        "  * Cloud Build API\n",
+        "  * Cloud Resource Manager API\n",
+        "  * Vertex AI API\n",
+        "\n",
+        "4. If you are running this notebook locally, install the [Cloud SDK](https://cloud.google.com/sdk)."
+      ]
+    },
     {
       "attachments": {},
       "cell_type": "markdown",
@@ -120,10 +158,7 @@
       },
       "outputs": [],
       "source": [
-        "import bigframes.pandas as bpd\n",
-        "\n",
-        "bpd.options.bigquery.project = \"bigframes-dev\"\n",
-        "bpd.options.bigquery.location = \"us\""
+        "import bigframes.pandas as bpd"
       ]
     },
     {
@@ -219,6 +254,14 @@
         "combined_df = downsampled_issues_df.join(predicted_embeddings)"
       ]
     },
+    {
+      "attachments": {},
+      "cell_type": "markdown",
+      "metadata": {},
+      "source": [
+        "We now have the complaints and their text embeddings as two columns in `combined_df`. Recall that complaints with numerically similar text embeddings should have similar meanings. We will now group similar complaints together."
+      ]
+    },
     {
       "attachments": {},
       "cell_type": "markdown",