Commit 28a3d24

redhatHameed, dmartinol, and tmihalac authored
feat: Remote offline Store (#4262)
* feat: Added offline store remote deployment functionly using arrow flight server and client Signed-off-by: Abdul Hameed <[email protected]> * Initial functional commit for remote get_historical_features Signed-off-by: Abdul Hameed <[email protected]> * remote offline store example Signed-off-by: Abdul Hameed <[email protected]> * removing unneeded test code and fixinf impotrts Signed-off-by: Abdul Hameed <[email protected]> * call do_put only once, postpone the invocation of do_put and simplified _make_flight_info Signed-off-by: Abdul Hameed <[email protected]> * added primitive parameters to the command descriptor Signed-off-by: Abdul Hameed <[email protected]> * removed redundant param Signed-off-by: Abdul Hameed <[email protected]> * Initial skeleton of unit test for offline server Signed-off-by: Abdul Hameed <[email protected]> * added unit test for offline store remote client Signed-off-by: Abdul Hameed <[email protected]> * testing all offlinestore APIs Signed-off-by: Abdul Hameed <[email protected]> * integrated comments Signed-off-by: Abdul Hameed <[email protected]> * Updated remote offline server readme with the capability to init with an environment variable Signed-off-by: Theodor Mihalache <[email protected]> Signed-off-by: Abdul Hameed <[email protected]> * added RemoteOfflineStoreDataSourceCreator, use feature_view_names to transfer feature views and remove dummies Signed-off-by: Abdul Hameed <[email protected]> * added missing CI requirement Signed-off-by: Abdul Hameed <[email protected]> * fixed linter Signed-off-by: Abdul Hameed <[email protected]> * fixed multiprocess CI requirement Signed-off-by: Abdul Hameed <[email protected]> * feat: Added offline store remote deployment functionly using arrow flight server and client Signed-off-by: Abdul Hameed <[email protected]> * fix test errors Signed-off-by: Abdul Hameed <[email protected]> * managing feature view aliases and restored skipped tests Signed-off-by: Abdul Hameed <[email protected]> * 
fixced linter issue Signed-off-by: Abdul Hameed <[email protected]> * fixed broken test Signed-off-by: Abdul Hameed <[email protected]> * added supported deployment modes using helm chart for online (default), offline, ui and registry Signed-off-by: Abdul Hameed <[email protected]> * updated the document for offline remote server Signed-off-by: Abdul Hameed <[email protected]> * added the document for remote offline server Signed-off-by: Abdul Hameed <[email protected]> * rebase and fix conflicts Signed-off-by: Abdul Hameed <[email protected]> * feat: Added offline store remote deployment functionly using arrow flight server and client Signed-off-by: Abdul Hameed <[email protected]> * added unit test for offline store remote client Signed-off-by: Abdul Hameed <[email protected]> * added RemoteOfflineStoreDataSourceCreator, use feature_view_names to transfer feature views and remove dummies Signed-off-by: Abdul Hameed <[email protected]> * feat: Added offline store remote deployment functionly using arrow flight server and client Signed-off-by: Abdul Hameed <[email protected]> * Added missing remote offline store apis implementation Signed-off-by: Theodor Mihalache <[email protected]> Signed-off-by: Abdul Hameed <[email protected]> * Fixed tests Signed-off-by: Theodor Mihalache <[email protected]> Signed-off-by: Abdul Hameed <[email protected]> * Implemented PR change proposal Signed-off-by: Theodor Mihalache <[email protected]> Signed-off-by: Abdul Hameed <[email protected]> * Implemented PR change proposal Signed-off-by: Theodor Mihalache <[email protected]> Signed-off-by: Abdul Hameed <[email protected]> * updated example readme file Signed-off-by: Abdul Hameed <[email protected]> * Implemented PR change proposal Signed-off-by: Theodor Mihalache <[email protected]> Signed-off-by: Abdul Hameed <[email protected]> * fixing the integration tests Signed-off-by: Abdul Hameed <[email protected]> * Fixed OfflineServer teardown Signed-off-by: Theodor Mihalache <[email 
protected]> * updated the document for remote offline feature server and client Signed-off-by: Abdul Hameed <[email protected]> * Implemented PR change proposal Signed-off-by: Theodor Mihalache <[email protected]> --------- Signed-off-by: Abdul Hameed <[email protected]> Signed-off-by: Theodor Mihalache <[email protected]> Co-authored-by: Daniele Martinoli <[email protected]> Co-authored-by: Theodor Mihalache <[email protected]> Co-authored-by: Theodor Mihalache <[email protected]>
1 parent b755fc4 commit 28a3d24

36 files changed

+1636
-40
lines changed

docs/SUMMARY.md

Lines changed: 3 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -85,6 +85,7 @@
8585
* [PostgreSQL (contrib)](reference/offline-stores/postgres.md)
8686
* [Trino (contrib)](reference/offline-stores/trino.md)
8787
* [Azure Synapse + Azure SQL (contrib)](reference/offline-stores/mssql.md)
88+
* [Remote Offline](reference/offline-stores/remote-offline-store.md)
8889
* [Online stores](reference/online-stores/README.md)
8990
* [Overview](reference/online-stores/overview.md)
9091
* [SQLite](reference/online-stores/sqlite.md)
@@ -117,6 +118,8 @@
117118
* [Python feature server](reference/feature-servers/python-feature-server.md)
118119
* [\[Alpha\] Go feature server](reference/feature-servers/go-feature-server.md)
119120
* [\[Alpha\] AWS Lambda feature server](reference/feature-servers/alpha-aws-lambda-feature-server.md)
121+
* [Offline Feature Server](reference/feature-servers/offline-feature-server.md)
122+
120123
* [\[Beta\] Web UI](reference/alpha-web-ui.md)
121124
* [\[Alpha\] On demand feature view](reference/alpha-on-demand-feature-view.md)
122125
* [\[Alpha\] Data quality monitoring](reference/dqm.md)

docs/reference/feature-servers/README.md

Lines changed: 4 additions & 0 deletions
@@ -12,4 +12,8 @@ Feast users can choose to retrieve features from a feature server, as opposed to
1212

1313
{% content-ref url="alpha-aws-lambda-feature-server.md" %}
1414
[alpha-aws-lambda-feature-server.md](alpha-aws-lambda-feature-server.md)
15+
{% endcontent-ref %}
16+
17+
{% content-ref url="offline-feature-server.md" %}
18+
[offline-feature-server.md](offline-feature-server.md)
1519
{% endcontent-ref %}
Lines changed: 35 additions & 0 deletions
@@ -0,0 +1,35 @@
1+
# Offline feature server
2+
3+
## Description
4+
5+
The Offline feature server is an Apache Arrow Flight Server that uses the gRPC communication protocol to exchange data.
6+
This server wraps calls to existing offline store implementations and exposes interfaces as Arrow Flight endpoints.
7+
8+
## How to configure the server
9+
10+
## CLI
11+
12+
The `feast serve_offline` CLI command starts the Offline feature server. By default, the server uses port 8815; the port can be overridden with the `--port` flag.
13+
14+
## Deploying as a service on Kubernetes
15+
16+
The Offline feature server can be deployed on Kubernetes using this [helm chart](https://github.com/feast-dev/feast/blob/master/infra/charts/feast-feature-server).
17+
18+
Set `feast_mode=offline` when installing the Offline feature server, as shown in the helm command below:
19+
20+
```
helm install feast-offline-server feast-charts/feast-feature-server --set feast_mode=offline --set feature_store_yaml_base64=$(base64 < feature_store.yaml)
```
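
The `feature_store_yaml_base64` value is the base64 encoding of the server's `feature_store.yaml`. Where shell substitution is inconvenient, for example in a deployment script, the same value could be produced in Python. This is a minimal sketch; `encode_feature_store_yaml` is a hypothetical helper name and not part of Feast.

```python
import base64
from pathlib import Path


def encode_feature_store_yaml(path: str) -> str:
    """Return the file's contents as a single-line base64 string.

    Hypothetical helper for illustration; not a Feast API.
    """
    raw = Path(path).read_bytes()
    # b64encode does not insert line breaks, whereas GNU `base64`
    # wraps its output at 76 columns unless told otherwise.
    return base64.b64encode(raw).decode("ascii")
```

The returned string can then be passed to `--set feature_store_yaml_base64=...` in place of the shell substitution.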
23+
24+
## Server Example
25+
26+
The complete example can be found under [remote-offline-store-example](../../../examples/remote-offline-store).
27+
28+
## How to configure the client
29+
30+
For details on configuring the offline store client, see [remote-offline-store.md](../offline-stores/remote-offline-store.md).
31+
32+
## Functionality Matrix
33+
34+
The set of functionalities supported by remote offline stores is the same as those supported by offline stores with the SDK, which are described in detail [here](../offline-stores/overview.md#functionality).
35+
Lines changed: 28 additions & 0 deletions
@@ -0,0 +1,28 @@
1+
# Remote Offline Store
2+
3+
## Description
4+
5+
The Remote Offline Store is an Arrow Flight client for the offline store that implements the `RemoteOfflineStore` class using the existing `OfflineStore` interface.
6+
The client implements various methods, including `get_historical_features`, `pull_latest_from_table_or_query`, `write_logged_features`, and `offline_write_batch`.
7+
8+
## How to configure the client
9+
10+
Create a client-side `feature_store.yaml` file, set the `offline_store` type to `remote`, and provide the server connection configuration:
the host and the port (default is 8815) that the Arrow Flight client uses to connect to the Arrow Flight server.
12+
13+
{% code title="feature_store.yaml" %}
```yaml
offline_store:
  type: remote
  host: localhost
  port: 8815
```
{% endcode %}
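
Before pointing a client at the configured `host` and `port`, it can help to confirm that something is listening there. The sketch below uses only the Python standard library; `is_port_open` is a hypothetical helper, and it checks TCP reachability only, not that the listener is actually a healthy Arrow Flight server.

```python
import socket


def is_port_open(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout.

    Hypothetical helper for illustration; not a Feast API.
    """
    try:
        # create_connection resolves the host and performs the TCP handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name-resolution failures.
        return False
```

For the configuration above, the call would be `is_port_open("localhost", 8815)`.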
21+
22+
## Client Example
23+
24+
The complete example can be found under [remote-offline-store-example](../../../examples/remote-offline-store).
25+
26+
## How to configure the server
27+
28+
For details on configuring the offline feature server, see [offline-feature-server.md](../feature-servers/offline-feature-server.md).
Lines changed: 98 additions & 0 deletions
@@ -0,0 +1,98 @@
1+
# Feast Remote Offline Store Server
2+
3+
This example demonstrates how to use an [Arrow Flight](https://arrow.apache.org/blog/2019/10/13/introducing-arrow-flight/) server/client as the remote Feast offline store.
4+
5+
## Launch the offline server locally
6+
7+
1. **Create Feast Project**: Create a Feast project with the `feast init` command. For example, the [offline_server](./offline_server) folder contains a sample Feast repository.
8+
9+
2. **Start Remote Offline Server**: Use the `feast serve_offline` command to start serving remote offline requests. This command will:
10+
- Spin up an `Arrow Flight` server at the default port 8815.
11+
12+
3. **Initialize Offline Server**: The offline server can be initialized by providing the base64-encoded contents of the `feature_store.yaml` file in an environment variable named `FEATURE_STORE_YAML_BASE64`. A temporary directory is then created containing the provided YAML file, named `feature_store.yaml`.
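
The environment-variable initialization in step 3 can be pictured roughly as follows. This is an illustrative sketch and not Feast's actual implementation: `materialize_feature_store_yaml` and its `file_name` parameter are assumptions made for the example.

```python
import base64
import os
import tempfile
from pathlib import Path


def materialize_feature_store_yaml(
    env_var: str = "FEATURE_STORE_YAML_BASE64",
    file_name: str = "feature_store.yaml",  # assumed file name
) -> Path:
    """Decode the variable and write it into a fresh temporary directory.

    Hypothetical helper for illustration; not a Feast API.
    Returns the directory, which can then serve as the repo path.
    """
    encoded = os.environ[env_var]  # raises KeyError if the variable is unset
    repo_dir = Path(tempfile.mkdtemp())
    (repo_dir / file_name).write_bytes(base64.b64decode(encoded))
    return repo_dir
```

The directory created by `tempfile.mkdtemp()` is not removed automatically, so a long-running process would need to clean it up on shutdown.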
13+
14+
Example
15+
16+
```console
cd offline_server
feast -c feature_repo apply
```

```console
feast -c feature_repo serve_offline
```

Sample output:
```console
Serving on grpc+tcp://127.0.0.1:8815
```
29+
30+
## Launch a remote offline client
31+
32+
The [offline_client](./offline_client) folder includes a test Python script that uses an offline store of type `remote`, leveraging the remote server as the
33+
actual data provider.
34+
35+
36+
The test script is located under [offline_client](./offline_client/) and uses a remote configuration of the offline store to delegate the actual
37+
implementation to the offline store server:
38+
```yaml
offline_store:
  type: remote
  host: localhost
  port: 8815
```
44+
45+
The test code in [test.py](./offline_client/test.py) initializes the store from the local configuration and then fetches the historical features
46+
from the store like any other Feast client, but the actual implementation is delegated to the offline server:
47+
```py
store = FeatureStore(repo_path=".")
training_df = store.get_historical_features(entity_df, features).to_df()
```
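
The `features` argument is a list of `feature_view:feature_name` strings; `test.py` requests features from the `driver_hourly_stats` and `transformed_conv_rate` views. As a rough illustration of that naming scheme, the sketch below groups such references by feature view. `group_by_feature_view` is a hypothetical helper written for this example and is not part of the Feast API.

```python
from collections import defaultdict


def group_by_feature_view(feature_refs):
    """Group 'view:feature' strings into {view: [feature, ...]}.

    Hypothetical helper for illustration; not a Feast API.
    """
    grouped = defaultdict(list)
    for ref in feature_refs:
        # partition splits on the first ':' only.
        view, sep, feature = ref.partition(":")
        if not sep or not view or not feature:
            raise ValueError(f"expected 'feature_view:feature', got {ref!r}")
        grouped[view].append(feature)
    return dict(grouped)


features = [
    "driver_hourly_stats:conv_rate",
    "driver_hourly_stats:acc_rate",
    "driver_hourly_stats:avg_daily_trips",
    "transformed_conv_rate:conv_rate_plus_val1",
    "transformed_conv_rate:conv_rate_plus_val2",
]
by_view = group_by_feature_view(features)
```

Here `by_view` maps each of the two feature view names to the feature names requested from it.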
51+
52+
53+
Run the client:

`cd offline_client; python test.py`
56+
57+
Sample output:
58+
59+
```console
config.offline_store is <class 'feast.infra.offline_stores.remote.RemoteOfflineStoreConfig'>
----- Feature schema -----

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 3 entries, 0 to 2
Data columns (total 10 columns):
 #   Column                              Non-Null Count  Dtype
---  ------                              --------------  -----
 0   driver_id                           3 non-null      int64
 1   event_timestamp                     3 non-null      datetime64[ns, UTC]
 2   label_driver_reported_satisfaction  3 non-null      int64
 3   val_to_add                          3 non-null      int64
 4   val_to_add_2                        3 non-null      int64
 5   conv_rate                           3 non-null      float32
 6   acc_rate                            3 non-null      float32
 7   avg_daily_trips                     3 non-null      int32
 8   conv_rate_plus_val1                 3 non-null      float64
 9   conv_rate_plus_val2                 3 non-null      float64
dtypes: datetime64[ns, UTC](1), float32(2), float64(2), int32(1), int64(4)
memory usage: 332.0 bytes
None

----- Features -----

   driver_id           event_timestamp  label_driver_reported_satisfaction  ...  avg_daily_trips  conv_rate_plus_val1  conv_rate_plus_val2
0       1001 2021-04-12 10:59:42+00:00                                   1  ...              590             1.022378            10.022378
1       1002 2021-04-12 08:12:10+00:00                                   5  ...              974             2.762213            20.762213
2       1003 2021-04-12 16:40:26+00:00                                   3  ...              127             3.419828            30.419828

[3 rows x 10 columns]
------training_df----
   driver_id           event_timestamp  label_driver_reported_satisfaction  ...  avg_daily_trips  conv_rate_plus_val1  conv_rate_plus_val2
0       1001 2021-04-12 10:59:42+00:00                                   1  ...              590             1.022378            10.022378
1       1002 2021-04-12 08:12:10+00:00                                   5  ...              974             2.762213            20.762213
2       1003 2021-04-12 16:40:26+00:00                                   3  ...              127             3.419828            30.419828

[3 rows x 10 columns]
```
98+

examples/remote-offline-store/offline_client/__init__.py

Whitespace-only changes.
Lines changed: 10 additions & 0 deletions
@@ -0,0 +1,10 @@
project: offline_server
# By default, the registry is a file (but can be turned into a more scalable SQL-backed registry)
registry: ../offline_server/feature_repo/data/registry.db
# The provider primarily specifies default offline / online stores & storing the registry in a given cloud
provider: local
offline_store:
  type: remote
  host: localhost
  port: 8815
entity_key_serialization_version: 2
Lines changed: 40 additions & 0 deletions
@@ -0,0 +1,40 @@
from datetime import datetime
from feast import FeatureStore
import pandas as pd

entity_df = pd.DataFrame.from_dict(
    {
        "driver_id": [1001, 1002, 1003],
        "event_timestamp": [
            datetime(2021, 4, 12, 10, 59, 42),
            datetime(2021, 4, 12, 8, 12, 10),
            datetime(2021, 4, 12, 16, 40, 26),
        ],
        "label_driver_reported_satisfaction": [1, 5, 3],
        "val_to_add": [1, 2, 3],
        "val_to_add_2": [10, 20, 30],
    }
)

features = [
    "driver_hourly_stats:conv_rate",
    "driver_hourly_stats:acc_rate",
    "driver_hourly_stats:avg_daily_trips",
    "transformed_conv_rate:conv_rate_plus_val1",
    "transformed_conv_rate:conv_rate_plus_val2",
]

store = FeatureStore(repo_path=".")

training_df = store.get_historical_features(entity_df, features).to_df()

print("----- Feature schema -----\n")
print(training_df.info())

print()
print("----- Features -----\n")
print(training_df.head())

print("------training_df----")

print(training_df)

examples/remote-offline-store/offline_server/__init__.py

Whitespace-only changes.

examples/remote-offline-store/offline_server/feature_repo/__init__.py

Whitespace-only changes.
