Performance bottleneck in get_online_features due to repeated metadata resolution from registry #4710

@breno-costa

Description

Is your feature request related to a problem? Please describe.
I'm running some benchmarks with the Python SDK and profiling the code to understand more about its execution. Here's the profile report for my benchmark.

[image: profile report for the benchmark]

In the report, the _prepare_entities_to_read_from_online_store method and its sub-calls account for more than half of the execution time. On each call, it has to resolve metadata from the registry, and this takes a lot of time relative to the other parts of the code.

However, in an ML inference scenario, we usually create a feature service for each ML model application when it's deployed. The ML application calls get_online_features with the same feature service every time, i.e. all calls use the same metadata. The current SDK implementation creates unnecessary overhead because it resolves the same metadata on every call.

Describe the solution you'd like
I don't know whether it's possible to make metadata resolution itself more efficient. If not, a potential solution would be to cache the resolved metadata in the SDK. A configuration option could turn this caching on or off.
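To make the caching idea concrete, here is a minimal sketch of what I have in mind. It is not Feast code: MetadataCache, slow_resolve, and the enabled / ttl_seconds options are hypothetical names I made up for illustration, and slow_resolve only stands in for the registry lookup. The idea is to resolve metadata once per feature service and reuse it until an optional TTL expires.

```python
import time
from typing import Any, Callable, Dict, Hashable, Optional, Tuple


class MetadataCache:
    """Memoize resolved metadata per key (hypothetical helper, not part of Feast)."""

    def __init__(
        self,
        resolver: Callable[[Hashable], Any],
        enabled: bool = True,
        ttl_seconds: Optional[float] = None,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._resolver = resolver      # the expensive lookup being wrapped
        self._enabled = enabled        # config switch: caching on/off
        self._ttl = ttl_seconds        # None means entries never expire
        self._clock = clock            # injectable so expiry can be tested
        self._entries: Dict[Hashable, Tuple[float, Any]] = {}

    def get(self, key: Hashable) -> Any:
        if not self._enabled:
            # Caching disabled: always go to the resolver.
            return self._resolver(key)
        now = self._clock()
        entry = self._entries.get(key)
        if entry is not None:
            stored_at, value = entry
            if self._ttl is None or now - stored_at < self._ttl:
                return value
        # Cache miss or expired entry: resolve and store.
        value = self._resolver(key)
        self._entries[key] = (now, value)
        return value

    def invalidate(self, key: Optional[Hashable] = None) -> None:
        """Drop one entry, or everything when no key is given."""
        if key is None:
            self._entries.clear()
        else:
            self._entries.pop(key, None)


# Stand-in for the registry lookup; records how often it is called.
calls = []


def slow_resolve(feature_service_name: str) -> dict:
    calls.append(feature_service_name)
    return {"feature_service": feature_service_name, "feature_views": ["fv_a", "fv_b"]}


cache = MetadataCache(slow_resolve, enabled=True, ttl_seconds=60.0)
for _ in range(1000):
    metadata = cache.get("my_model_v1")

resolver_calls = len(calls)  # the resolver ran once for 1000 lookups
```

A TTL (or an explicit invalidate call) would be needed so that registry updates are eventually picked up. Whether that trade-off is acceptable probably depends on how often feature definitions change after deployment.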

Additional context
I see that many of the functions used to get online features have been moved to utils.py. This can make changes and optimizations more complex. These util functions now contain over 500 lines of code.

Metadata

Labels: kind/feature (New feature or request), wontfix (This will not be worked on)