fix: make flow name retrieval efficient #414

Merged
merged 2 commits into from
Apr 30, 2025

lemorage
Contributor

This PR makes the cocoindex ls command much more efficient and fixes #412.

Context

In previous versions, we accessed each Flow's name property in a list comprehension:

current_flow_names = [fl.name for fl in flow.flows()]

The name property of a Flow is defined as:

@property
def name(self) -> str:
    return self._lazy_engine_flow().name()

Each call to self._lazy_engine_flow() triggers the creation of the underlying _engine.Flow if it hasn't been built yet. Listing names this way therefore did a lot of unnecessary work, since all we need here is the flow names.
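To make the cost concrete, here is a minimal sketch of the lazy-build pattern described above. `FakeEngineFlow`, `LazyFlow`, and the `builds` counter are hypothetical stand-ins for illustration, not the actual cocoindex classes; only the `name` property and `_lazy_engine_flow()` mirror the snippet in this PR.

```python
from typing import Optional


class FakeEngineFlow:
    """Hypothetical stand-in for the engine-side flow object."""

    def __init__(self, name: str) -> None:
        self._name = name

    def name(self) -> str:
        return self._name


class LazyFlow:
    """Toy flow whose engine object is built on first access (assumed design)."""

    builds = 0  # counts how many engine flows have been constructed

    def __init__(self, name: str) -> None:
        self._requested_name = name
        self._engine_flow: Optional[FakeEngineFlow] = None

    def _lazy_engine_flow(self) -> FakeEngineFlow:
        # Build the engine flow only once, on first use.
        if self._engine_flow is None:
            LazyFlow.builds += 1
            self._engine_flow = FakeEngineFlow(self._requested_name)
        return self._engine_flow

    @property
    def name(self) -> str:
        return self._lazy_engine_flow().name()


flows = [LazyFlow(n) for n in ("CodeEmbedding", "DocsToKG", "TextEmbedding")]
names = [fl.name for fl in flows]  # forces one engine build per flow
```

In this toy version the build is just a counter increment, but in the real engine each build is presumably far more expensive, which is why reading names through the property scales poorly.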

Solution

We now use a helper that returns the flow names as a list, reading them directly from the keys of the flow registry:

def flow_names() -> list[str]:
    """
    Get the names of all flows.
    """
    with _flows_lock:
        return list(_flows.keys())
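As a rough, self-contained sketch of how such a helper behaves, assume `_flows` is a plain dict keyed by flow name and `_flows_lock` is a `threading.Lock` (both inferred from the helper, not confirmed by this PR). `register_flow` is a hypothetical function added only so the example can run.

```python
import threading

# Assumed shape of the module-level registry: flow name -> flow object.
_flows_lock = threading.Lock()
_flows: dict[str, object] = {}


def register_flow(name: str, flow: object) -> None:
    """Hypothetical helper: add a flow to the registry under its name."""
    with _flows_lock:
        if name in _flows:
            raise KeyError(f"flow {name!r} is already registered")
        _flows[name] = flow


def flow_names() -> list[str]:
    """Get the names of all flows without touching the flow objects."""
    with _flows_lock:
        # list(...) copies the keys, so the result stays valid after
        # the lock is released and the registry changes.
        return list(_flows.keys())


register_flow("CodeEmbedding", object())
register_flow("DocsToKG", object())
snapshot = flow_names()
register_flow("TextEmbedding", object())
```

Because the helper returns a copy made under the lock, callers can iterate over the result while other threads register flows, and no engine flow is ever constructed just to read a name.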

Result

Compared to the result reported in #412, the command is now roughly 10 times faster.

$ time uv run main.py cocoindex ls
CodeEmbedding [+]
DocsToKG [+]
TextEmbedding [+]
GoogleDriveTextEmbedding [+]

Notes:
  [+]: Flows present in the current process, but missing setup.
uv run main.py cocoindex ls  2.01s user 0.47s system 68% cpu 3.647 total

@badmonster0
Member

Thanks for submitting the fix!

@badmonster0 badmonster0 merged commit 0e97d1e into cocoindex-io:main Apr 30, 2025
6 checks passed
Development

Successfully merging this pull request may close these issues.

[BUG] cocoindex ls is too slow to use
2 participants