FEAT: add text embedder #1694


Open · wants to merge 1 commit into main
Conversation

@rhajou (Contributor) commented on May 2, 2025

Related Issues

Proposed Changes:

Added the Google Text Embedder

How did you test it?

Unit tests

Notes for the reviewer

I didn't add the Document Embedder; is it needed? A rough usage sketch of the text embedder follows below.
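
For context, here is a rough sketch of how the new text embedder might be used. The import path, class name, and output key are assumptions based on the linked issue and the usual Haystack conventions, not necessarily what this PR exposes:

# Hypothetical usage; import path and class name are assumptions based on the
# "Add a GoogleAIGeminiDocumentEmbedder and GoogleAIGeminiTextEmbedder" issue.
from haystack_integrations.components.embedders.google_ai import GoogleAIGeminiTextEmbedder

embedder = GoogleAIGeminiTextEmbedder(model="models/embedding-001")
result = embedder.run(text="What is the capital of France?")
print(len(result["embedding"]))  # Haystack text embedders conventionally return a single vector under "embedding"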

Checklist

@rhajou requested a review from a team as a code owner on May 2, 2025 at 15:16
@rhajou requested a review from @mpangrazzi and removed the request for the team on May 2, 2025 at 15:16
The github-actions bot added the integration:google-ai and type:documentation labels on May 2, 2025
@mpangrazzi (Contributor) left a comment:

I've left some comments! I recommend reading the official docs first, then updating the implementation and the tests. I also recommend adding an example (and testing it out) or an integration test. LMK if something is not clear! 😉
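
For example, a minimal integration test could look roughly like this. The component name, the "embedding" output key, and the GOOGLE_API_KEY gating are assumptions; adapt them to the actual implementation:

import os

import pytest

@pytest.mark.skipif(not os.environ.get("GOOGLE_API_KEY"), reason="requires a real Google AI API key")
def test_embed_text_live():
    # Hypothetical component name; calling the real API once catches response-shape mistakes
    embedder = GoogleAIGeminiTextEmbedder(model="models/embedding-001")
    result = embedder.run(text="hello world")
    assert isinstance(result["embedding"], list)
    assert all(isinstance(value, float) for value in result["embedding"])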

@@ -23,7 +23,7 @@ classifiers = [
 "Programming Language :: Python :: Implementation :: CPython",
 "Programming Language :: Python :: Implementation :: PyPy",
 ]
-dependencies = ["haystack-ai>=2.9.0", "google-generativeai>=0.3.1"]
+dependencies = ["haystack-ai>=2.9.0", "google-generativeai>=0.3.1", "google-genai==1.13.0"]

Adding google-genai==1.13.0 with exact version pinning could cause conflicts. What about google-genai>=1.13.0?

:param model: The name of the Google AI embedding model to use.
Defaults to "models/embedding-001".
:param api_key: The Google AI API key. It can be explicitly provided or automatically read from the
`GOOGLE_API_KEY` environment variable.

Note: above you're initializing this from GEMINI_API_KEY, but here you are referring to GOOGLE_API_KEY.
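
One way to keep these consistent is to accept either variable. A minimal sketch, assuming Haystack's Secret utility (the fallback order shown is just an illustration):

from haystack.utils import Secret

# Check GOOGLE_API_KEY first, then fall back to GEMINI_API_KEY, so the docstring
# and the default stay in sync (the order here is an assumption).
api_key: Secret = Secret.from_env_var(["GOOGLE_API_KEY", "GEMINI_API_KEY"])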

configs.title = self.title
elif self.title and self.task_type != "retrieval_document":
warnings.warn(
UserWarning("Warning: Title 'Should Be Ignored' is ignored because task_type is 'retrieval_query'"),

Shouldn't this be f"Warning: title '{self.title}' is ignored..."?
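
Something along these lines, interpolating the actual title and task type (a sketch based on the snippet above):

import warnings

if self.title and self.task_type != "retrieval_document":
    warnings.warn(
        UserWarning(f"Title '{self.title}' is ignored because task_type is '{self.task_type}'"),
        stacklevel=2,
    )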

raise RuntimeError(msg) from e

# Extract embeddings - result.embedding should be the list of lists
embeddings = result.get("embedding") # Use .get for safety, returns None if key missing

According to the docs, result should be an object and not a dict, so you should do result.embeddings.

I see that in the tests you're mocking this response (so the tests pass), but have you tried it outside tests (e.g. in an integration test or an example)?
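
For reference, the object-style extraction could look roughly like this (the .embeddings / .values attribute names follow the google-genai docs; worth double-checking against the pinned version):

# `result` is the object returned by client.models.embed_content(...); it is not a dict.
if result.embeddings is None:
    msg = "Google AI returned no embeddings."
    raise RuntimeError(msg)

# Each entry is a ContentEmbedding object; the float vector lives under `.values`.
embeddings = [embedding.values for embedding in result.embeddings]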

texts = ["text 1", "text 2"]
expected_embeddings = [[0.1, 0.2], [0.3, 0.4]]
# Configure the mock embed_content method to return a successful response
mock_client_instance.models.embed_content.return_value = {"embedding": expected_embeddings}

This is the wrong mocking I was mentioning above.

According to docs, a correct mock should be something like:

mock_response = MagicMock()
mock_response.embeddings = None # or e.g. [[0.1, 0.2], [0.3, 0.4]]
mock_client_instance.models.embed_content.return_value = mock_response

Can you please update tests accordingly?
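
If it helps, here is a slightly fuller sketch whose shape matches the object-style response (assuming each embedding exposes its vector under .values, per the google-genai docs):

from types import SimpleNamespace
from unittest.mock import MagicMock

expected_embeddings = [[0.1, 0.2], [0.3, 0.4]]

# Mimic an EmbedContentResponse: `.embeddings` is a list of objects, each carrying `.values`.
mock_response = MagicMock()
mock_response.embeddings = [SimpleNamespace(values=vector) for vector in expected_embeddings]
mock_client_instance.models.embed_content.return_value = mock_response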

Labels
integration:google-ai, type:documentation

Development
Successfully merging this pull request may close these issues:
Add a GoogleAIGeminiDocumentEmbedder and GoogleAIGeminiTextEmbedder