@thomas-may thomas-may commented Nov 27, 2025

Description

With PGVector, searching for a given user's memories triggers full sequential scans of the table.
The mem0 collection has no index on payload->>'user_id', so on large datasets queries are slow and database resource consumption is high, up to saturation.

Open to discussion: indexes on agent_id and run_id may be useful to other users of mem0, and making these indexes configurable (enabled/disabled) would let users who do not need them avoid wasting disk space and memory.
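To make the discussion concrete, here is a minimal sketch of how configurable payload indexes could be generated. It is not the code in this PR: the function names and the `indexed_keys` parameter are hypothetical, and only the index naming pattern (`mem0_payload_user_id_idx`) is taken from the schema output shown in the testing section.

```python
import re

# Only plain identifiers are accepted, because the names are interpolated into SQL.
_IDENTIFIER = re.compile(r"^[A-Za-z_][A-Za-z0-9_]*$")


def payload_index_sql(collection_name: str, key: str) -> str:
    """Build a CREATE INDEX statement for one key of the JSONB payload column."""
    for name in (collection_name, key):
        if not _IDENTIFIER.match(name):
            raise ValueError(f"unsafe identifier: {name!r}")
    index_name = f"{collection_name}_payload_{key}_idx"
    # Expression index (btree by default) on the text value of payload->>key.
    return (
        f"CREATE INDEX IF NOT EXISTS {index_name} "
        f"ON {collection_name} ((payload ->> '{key}'))"
    )


def payload_index_statements(collection_name, indexed_keys=("user_id",)):
    """Hypothetical config hook: one statement per key the user chose to index."""
    return [payload_index_sql(collection_name, key) for key in indexed_keys]


if __name__ == "__main__":
    for stmt in payload_index_statements("mem0", ("user_id", "agent_id", "run_id")):
        print(stmt + ";")
```

With a default of only `user_id`, users who filter on agent_id or run_id could opt in, and nobody pays for indexes they do not query.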

Fixes # (issue)

Type of change

  • Bug fix (non-breaking change which fixes an issue)

How Has This Been Tested?

Testing index benefit

In a database loaded with ~50k memories split across ~200 user_id values, we ran load tests with and without the index on user_id.

The Postgres query plan and request duration were checked with and without the index.
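For anyone who wants to repeat the plan check, a rough sketch follows. The query is a simplified stand-in for a user-filtered lookup, not the exact statement mem0 issues, and both helper names are hypothetical. The plan lines are assumed to come from running the EXPLAIN statement through psql or a database driver.

```python
def explain_query(collection_name: str = "mem0") -> str:
    """Return an EXPLAIN statement for a simplified user_id filter.

    collection_name must be a trusted identifier; %s is a driver-style
    placeholder for the user_id value.
    """
    return (
        f"EXPLAIN SELECT id FROM {collection_name} "
        "WHERE payload ->> 'user_id' = %s"
    )


def plan_uses_seq_scan(plan_lines, table: str = "mem0") -> bool:
    """True if any line of the text-format plan is a sequential scan on the table.

    The substring also matches "Parallel Seq Scan on <table>".
    """
    needle = f"Seq Scan on {table}"
    return any(needle in line for line in plan_lines)
```

Without the index the plan is expected to contain a sequential scan on mem0; with the index it should reference mem0_payload_user_id_idx instead.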

Testing that the PR applies the correct index

Init local DB:

docker run --name pg_vector_store -e POSTGRES_DB=memory_db -e POSTGRES_USER=user -e POSTGRES_PASSWORD=password -p 5555:5432 ankane/pgvector

Check DB schema:

PGPASSWORD=password psql -h localhost -U user -p 5555 memory_db -c "\d mem0"

Run the following Python test script:

# file test_pg_vector_idx.py
from mem0.vector_stores.pgvector import PGVector

vector_store = PGVector(
    collection_name="mem0",
    dbname="memory_db",
    user="user",
    password="password",
    host="localhost",
    port="5555",
    embedding_model_dims=1536,
    diskann=False,
    hnsw=False,
)

Run with: hatch run dev_py_3_12:python test_pg_vector_idx.py

Check schema again:

PGPASSWORD=password psql -h localhost -U user -p 5555 memory_db -c "\d mem0"

# Should output
                   Table "public.mem0"
 Column  |     Type     | Collation | Nullable | Default 
---------+--------------+-----------+----------+---------
 id      | uuid         |           | not null | 
 vector  | vector(1536) |           |          | 
 payload | jsonb        |           |          | 
Indexes:
    "mem0_pkey" PRIMARY KEY, btree (id)
    "mem0_payload_user_id_idx" btree ((payload ->> 'user_id'::text))

Cleanup:

docker rm -f pg_vector_store
rm test_pg_vector_idx.py

Checklist:

  • My code follows the style guidelines of this project
  • I have performed a self-review of my own code
  • I have commented my code, particularly in hard-to-understand areas
  • I have made corresponding changes to the documentation
  • My changes generate no new warnings
  • I have added tests that prove my fix is effective or that my feature works
  • New and existing unit tests pass locally with my changes
  • Any dependent changes have been merged and published in downstream modules
  • I have checked my code and corrected any misspellings

Maintainer Checklist

  • closes #xxxx (Replace xxxx with the GitHub issue number)
  • Made sure Checks passed
