-
Notifications
You must be signed in to change notification settings - Fork 1.2k
Closed
Labels
Description
Expected Behavior
When calling the materialize functionality to materialize data from a SnowflakeSource offline store to a local PostgreSQLOnlineStore, the table is not populated with the data.
Current Behavior
The feature table in the local Postgres instance is not populated, while no exception is raised, and from the logs it seems like the data should be pushed to Postgres.
Steps to reproduce
- Use this feature_store.yaml file:
project: my_project
provider: local
registry:
registry_type: sql
path: postgresql://postgres:[email protected]:5432/feature_store
cache_ttl_seconds: 60
online_store:
type: postgres
host: 0.0.0.0
port: 5432
database: feature_store
db_schema: public
user: postgres
password: test
offline_store:
<SNOWFLAKE_INFORMATION>
entity_key_serialization_version: 2
- Spin up this docker-compose file:
---
version: "3"
services:
db:
restart: always
image: postgres:15-alpine
container_name: feast_db
ports:
- "5432:5432"
volumes:
- ~/feast_postgres_data:/var/lib/postgresql/data
environment:
- POSTGRES_DB=feature_store
- POSTGRES_USER=postgres
- POSTGRES_PASSWORD=test
volumes:
feast_postgres_data: null
- Initialize the Entities, SnowflakeSource (or another source), FeatureView, and FeatureService, and apply these. All using the Python SDK.
from datetime import timedelta
from feast import (
Entity,
FeatureService,
FeatureView,
Field,
SnowflakeSource,
ValueType,
FeatureStore,
)
from feast.types import Float32
feature_store = FeatureStore()
entity = Entity(
name="entity",
join_keys=["entity_ID"],
value_type=ValueType.STRING,
)
source = SnowflakeSource(
name="snowflake_source_name",
timestamp_field="EVENT_TIMESTAMP",
schema="TEMP",
table="TABLE"
)
feature_view = FeatureView(
name="feature_view_name",
entities=[entity],
ttl=timedelta(days=0),
schema=[
Field(name="feature_1", dtype=Float32),
Field(name="feature_2", dtype=Float32),
],
online=True,
source=source,
tags={"team": "team"},
)
feature_service = FeatureService(
name="feature_service",
features=[feature_view],
)
feature_store.apply(
[
entity,
source,
feature_view,
feature_service,
]
)
- Run materialize commands using the Python SDK
feature_store = FeatureStore()
feature_store.materialize(
start_date=datetime.utcnow() - timedelta(weeks=52),
end_date=datetime.utcnow(),
feature_views=["feature_view_name"],
)
Specifications
- Version: 0.35.0
- Platform: Local MacBook M1
Possible Solution
It seems like a conn.commit() statement is missing in the online_write_batch method of the PostgreSQLOnlineStore. Specifically, on this line.
After adding this, the table is populated.
The PR implementing this proposed fix can be found here.
Additional notes
When replacing the the postgres online store with the following sqlite online store in the config file, everything works without any code changes
online_store:
type: sqlite
path: data/online_store.db
TomSteenbergen, picnic-sven and HaoXuAI