@@ -509,7 +509,7 @@
"\n",
"To achieve that, we will do following.\n",
"\n",
"1. **Generate embedings for each of document in the knowledge library with Huggingface all-MiniLM-L6-v2 embedding model.**\n",
"1. **Generate embeddings for each of document in the knowledge library with Huggingface all-MiniLM-L6-v2 embedding model.**\n",
"2. **Identify top K most relevant documents based on user query.**\n",
" - 2.1 **For a query of your interest, generate the embedding of the query using the same embedding model.**\n",
" - 2.2 **Search the indexes of top K most relevant documents in the embedding space using in-memory Faiss search.**\n",
@@ -689,7 +689,7 @@
"id": "cfe4a131-9b09-4141-96c5-6a13751e99ff",
"metadata": {},
"source": [
"We generate embedings for each of document in the knowledge library with Huggingface all-MiniLM-L6-v2 embedding model.documents"
"We generate embeddings for each of document in the knowledge library with Huggingface all-MiniLM-L6-v2 embedding model.documents"
]
},
{
@@ -253,7 +253,7 @@
"source": [
"## Step 2. Ask a question to LLM without providing the context\n",
"\n",
"To better illustrate why we need retrieval-augmented generation (RAG) based approach to solve the question and anwering problem. Let's directly ask the model a question and see how they respond."
"To better illustrate why we need retrieval-augmented generation (RAG) based approach to solve the question and answering problem. Let's directly ask the model a question and see how they respond."
]
},
{
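A minimal sketch of asking a deployed LLM endpoint directly, with no retrieved context. The endpoint name is hypothetical, and the `text_inputs`/`generated_texts` fields assume the JumpStart text2text payload convention, so adjust them for your model:

```python
import json

import boto3

smr = boto3.client("sagemaker-runtime")
question = "Which instances can I use with Managed Spot Training in SageMaker?"

response = smr.invoke_endpoint(
    EndpointName="jumpstart-example-llm-endpoint",  # hypothetical name
    ContentType="application/json",
    Body=json.dumps({"text_inputs": question, "max_length": 100}),
)
print(json.loads(response["Body"].read())["generated_texts"][0])
```

Without the FAQ context, the model can only answer from its pretraining data, which is the failure mode this step demonstrates.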
@@ -390,7 +390,7 @@
"\n",
"To achieve that, we will do following.\n",
"\n",
"1. **Generate embedings for each of document in the knowledge library with Cohere Multilingual embedding model.**\n",
"1. **Generate embeddings for each of document in the knowledge library with Cohere Multilingual embedding model.**\n",
"2. **Identify top K most relevant documents based on user query.**\n",
" - 2.1 **For a query of your interest, generate the embedding of the query using the same embedding model.**\n",
" - 2.2 **Search the indexes of top K most relevant documents in the embedding space using in-memory Faiss search.**\n",
@@ -741,7 +741,7 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"Firstly, we **generate embedings for each of document in the knowledge library with SageMaker GPT-J-6B embedding model.**"
"Firstly, we **generate embeddings for each of document in the knowledge library with SageMaker GPT-J-6B embedding model.**"
]
},
{
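A sketch of an embedding helper against a deployed GPT-J-6B endpoint; the endpoint name is hypothetical and the `text_inputs`/`embedding` payload keys assume the JumpStart convention for this model:

```python
import json

import boto3
import numpy as np

smr = boto3.client("sagemaker-runtime")

def embed(texts, endpoint_name="jumpstart-example-gpt-j-6b-embedding"):  # hypothetical name
    """Return an (N, P) float32 matrix of embeddings for a list of texts."""
    response = smr.invoke_endpoint(
        EndpointName=endpoint_name,
        ContentType="application/json",
        Body=json.dumps({"text_inputs": texts}),
    )
    return np.asarray(json.loads(response["Body"].read())["embedding"], dtype="float32")

# e.g. doc_embeddings = embed(df_knowledge["Answer"].tolist())
```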
@@ -788,7 +788,7 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"Print out the top 3 most relevant docuemnts as below."
"Print out the top 3 most relevant documents as below."
]
},
{
@@ -839,7 +839,7 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"Send the top 3 most relevant docuemnts and question into LLM to get a answer."
"Send the top 3 most relevant documents and question into LLM to get a answer."
]
},
{
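Sending the documents and question to the LLM amounts to stuffing the retrieved text into the prompt. A sketch, with an illustrative template and the same hypothetical endpoint and payload convention as above:

```python
import json

import boto3

smr = boto3.client("sagemaker-runtime")
question = "Which instances can I use with Managed Spot Training in SageMaker?"
top_docs = ["<retrieved doc 1>", "<retrieved doc 2>", "<retrieved doc 3>"]  # placeholders

context = "\n".join(top_docs)
prompt = f"Answer based on the context below.\n\nContext:\n{context}\n\nQuestion: {question}\nAnswer:"

response = smr.invoke_endpoint(
    EndpointName="jumpstart-example-llm-endpoint",  # hypothetical name
    ContentType="application/json",
    Body=json.dumps({"text_inputs": prompt, "max_length": 200}),
)
print(json.loads(response["Body"].read())["generated_texts"][0])
```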
@@ -195,7 +195,7 @@
"source": [
"## Step 2. Ask a question to LLM without providing the context\n",
"\n",
"To better illustrate why we need retrieval-augmented generation (RAG) based approach to solve the question and anwering problem. Let's directly ask the model a question and see how they respond."
"To better illustrate why we need retrieval-augmented generation (RAG) based approach to solve the question and answering problem. Let's directly ask the model a question and see how they respond."
]
},
{
@@ -302,7 +302,7 @@
"\n",
"To achieve that, we will do following.\n",
"\n",
"* **Generate embedings for each of document in the knowledge library with the GPT-J-6B embedding model.**\n",
"* **Generate embeddings for each of document in the knowledge library with the GPT-J-6B embedding model.**\n",
"* **Identify top K most relevant documents based on user query.**\n",
" * **For a query of your interest, generate the embedding of the query using the same embedding model.**\n",
" * **Search the indexes of top K most relevant documents in the embedding space using the SageMaker KNN algorithm.**\n",
@@ -405,7 +405,7 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"### 4.2. Generate embedings for each of document in the knowledge library with the GPT-J-6B embedding model.\n",
"### 4.2. Generate embeddings for each of document in the knowledge library with the GPT-J-6B embedding model.\n",
"\n",
"For the purpose of the demo we will use [Amazon SageMaker FAQs](https://aws.amazon.com/sagemaker/faqs/) as knowledge library. The data are formatted in a CSV file with two columns Question and Answer. We use **only** the Answer column as the documents of knowledge library, from which relevant documents are retrieved based on a query. \n",
"\n",
@@ -541,7 +541,7 @@
"1. Start a training job to index the embedding knowledge data. The underlying algorithm used to index the data is [Faiss](https://github.com/facebookresearch/faiss).\n",
"2. Start an endpoint to take the embedding of the query as input and return the top K nearest indexes of the documents.\n",
"\n",
"**Note.** For the KNN training job, the features are N by P matrix, where N is the number of documetns in the knowledge library, P is the embedding dimension, and each row corresponds to an embedding of a document. The labels are ordinal integers starting from 0. During inference, given an embedding of query, the labels of the top K nearest documents with respect to the query are used as indexes to retrieve the corresponded textual documents.\n",
"**Note.** For the KNN training job, the features are N by P matrix, where N is the number of documents in the knowledge library, P is the embedding dimension, and each row corresponds to an embedding of a document. The labels are ordinal integers starting from 0. During inference, given an embedding of query, the labels of the top K nearest documents with respect to the query are used as indexes to retrieve the corresponded textual documents.\n",
"\n",
"\n"
]
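A sketch of that indexing step with the SageMaker Python SDK's built-in KNN estimator, following the N-by-P features and ordinal-labels layout described in the note; the role, instance type, and `doc_embeddings` matrix are assumptions:

```python
import numpy as np
import sagemaker
from sagemaker import KNN

# doc_embeddings: an (N, P) float32 matrix of document embeddings (assumed computed earlier).
train_features = doc_embeddings.astype("float32")
train_labels = np.arange(len(train_features), dtype="float32")  # ordinal indexes 0..N-1

knn = KNN(
    role=sagemaker.get_execution_role(),
    instance_count=1,
    instance_type="ml.m5.xlarge",  # assumed instance type
    k=3,                           # top K documents to retrieve
    sample_size=len(train_features),
    predictor_type="classifier",
    index_type="faiss.Flat",       # Faiss index under the hood
)
knn.fit(knn.record_set(train_features, labels=train_labels))
```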
@@ -649,7 +649,7 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"Deploy the KNN endpoint for retrieving indexes of top K most relevant docuemnts."
"Deploy the KNN endpoint for retrieving indexes of top K most relevant documents."
]
},
{
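Deploying and querying the index could look like this sketch; the `verbose=true` accept header asks the KNN endpoint to return the labels of all k neighbors rather than a single prediction, and the exact response shape is an assumption to verify against the KNN inference documentation:

```python
import json

import boto3

predictor = knn.deploy(initial_instance_count=1, instance_type="ml.m5.xlarge")

# embed() is the helper sketched earlier; same embedding model as the documents.
query_emb = embed(["Which instances can I use with Managed Spot Training?"])

runtime = boto3.client("sagemaker-runtime")
response = runtime.invoke_endpoint(
    EndpointName=predictor.endpoint_name,
    ContentType="text/csv",
    Accept="application/json; verbose=true",  # include neighbor labels, not just the prediction
    Body=",".join(str(x) for x in query_emb[0]),
)
neighbor_idx = json.loads(response["Body"].read())["predictions"][0]["labels"]
top_docs = [df_knowledge["Answer"].iloc[int(i)] for i in neighbor_idx]
```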
@@ -740,7 +740,7 @@
"context_embed_retrieve = construct_context(context_predictions_arr, df_knowledge[\"Answer\"])\n",
"\n",
"print(\n",
" f\"{newline}{bold}Elastic time for computing the embedding of a query and retrieved the top K most relevant docuemnts: {time.time() - start} seconds.{unbold}{newline}\"\n",
" f\"{newline}{bold}Elastic time for computing the embedding of a query and retrieved the top K most relevant documents: {time.time() - start} seconds.{unbold}{newline}\"\n",
")"
]
},
@@ -330,7 +330,7 @@
"\n",
"To achieve that, we will do following.\n",
"\n",
"1. **Generate embedings for each of document in the knowledge library with SageMaker GPT-J-6B embedding model.**\n",
"1. **Generate embeddings for each of document in the knowledge library with SageMaker GPT-J-6B embedding model.**\n",
"2. **Identify top K most relevant documents based on user query.**\n",
" - 2.1 **For a query of your interest, generate the embedding of the query using the same embedding model.**\n",
" - 2.2 **Search the indexes of top K most relevant documents in the embedding space using in-memory Faiss search.**\n",
@@ -670,7 +670,7 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"Firstly, we **generate embedings for each of document in the knowledge library with SageMaker GPT-J-6B embedding model.**"
"Firstly, we **generate embeddings for each of document in the knowledge library with SageMaker GPT-J-6B embedding model.**"
]
},
{
@@ -711,7 +711,7 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"Print out the top 3 most relevant docuemnts as below."
"Print out the top 3 most relevant documents as below."
]
},
{
@@ -756,7 +756,7 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"Send the top 3 most relevant docuemnts and question into LLM to get a answer."
"Send the top 3 most relevant documents and question into LLM to get a answer."
]
},
{
@@ -128,7 +128,7 @@
"source": [
"## Step 2. Ask a question to LLM without providing the context\n",
"\n",
"To better illustrate why we need retrieval-augmented generation (RAG) based approach to solve the question and anwering problem. Let's directly ask the model a question and see how they respond."
"To better illustrate why we need retrieval-augmented generation (RAG) based approach to solve the question and answering problem. Let's directly ask the model a question and see how they respond."
]
},
{
@@ -335,7 +335,7 @@
"\n",
"To achieve that, we will do following.\n",
"\n",
"* Generate embedings for each of document in the knowledge library with the MiniLM embedding model.\n",
"* Generate embeddings for each of document in the knowledge library with the MiniLM embedding model.\n",
"* Identify top K most relevant documents based on user query.\n",
" * For a query of your interest, generate the embedding of the query using the same embedding model.\n",
" * Search the indexes of top K most relevant documents in the embedding space using the SageMaker KNN algorithm.\n",