Commit a965af9

docs: Add docs for batch materialization engine (feast-dev#2959)
Signed-off-by: Achal Shah <[email protected]>
1 parent 6d7b38a commit a965af9

12 files changed

+223
-5
lines changed

docs/SUMMARY.md

Lines changed: 2 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -25,6 +25,7 @@
2525
* [Offline store](getting-started/architecture-and-components/offline-store.md)
2626
* [Online store](getting-started/architecture-and-components/online-store.md)
2727
* [Provider](getting-started/architecture-and-components/provider.md)
28+
* [Batch Materialization Engine](getting-started/architecture-and-components/batch-materialization-engine.md)
2829
* [Learning by example](getting-started/feast-workshop.md)
2930
* [Third party integrations](getting-started/third-party-integrations.md)
3031
* [FAQ](getting-started/faq.md)
@@ -53,6 +54,7 @@
5354
* [Deploying a Java feature server on Kubernetes](how-to-guides/fetching-java-features-k8s.md)
5455
* [Upgrading from Feast 0.9](https://docs.google.com/document/u/1/d/1AOsr\_baczuARjCpmZgVd8mCqTF4AZ49OEyU4Cn-uTT0/edit)
5556
* [Adding a custom provider](how-to-guides/creating-a-custom-provider.md)
57+
* [Adding a custom batch materialization engine](how-to-guides/creating-a-custom-materialization-engine.md)
5658
* [Adding a new online store](how-to-guides/adding-support-for-a-new-online-store.md)
5759
* [Adding a new offline store](how-to-guides/adding-a-new-offline-store.md)
5860
* [Adding or reusing tests](how-to-guides/adding-or-reusing-tests.md)

docs/getting-started/architecture-and-components/README.md

Lines changed: 1 addition & 2 deletions
@@ -12,5 +12,4 @@
1212

1313
{% page-ref page="provider.md" %}
1414

15-
16-
15+
{% page-ref page="batch-materialization-engine.md" %}
Lines changed: 10 additions & 0 deletions
@@ -0,0 +1,10 @@
1+
# Batch Materialization Engine
2+
3+
A batch materialization engine is a component of Feast that's responsible for moving data from the offline store into the online store.
4+
5+
A materialization engine abstracts over the specific technologies or frameworks used to materialize data. It lets users either use a purely local, serialized approach (the default `LocalMaterializationEngine`) or delegate materialization to separate components (e.g. AWS Lambda, as implemented by the `LambdaMaterializationEngine`).
6+
7+
If the built-in engines are not sufficient, you can create your own custom materialization engine. Please see [this guide](../../how-to-guides/creating-a-custom-materialization-engine.md) for more details.
8+
9+
Please see [feature\_store.yaml](../../reference/feature-repository/feature-store-yaml.md#overview) for configuring engines.
10+

docs/getting-started/concepts/README.md

Lines changed: 1 addition & 1 deletion
@@ -18,4 +18,4 @@
1818

1919
{% page-ref page="point-in-time-joins.md" %}
2020

21-
{% page-ref page="registry.md" %}
21+
{% page-ref page="registry.md" %}
Lines changed: 125 additions & 0 deletions
@@ -0,0 +1,125 @@
1+
# Adding a custom materialization engine
2+
3+
### Overview
4+
5+
Feast batch materialization operations (`materialize` and `materialize-incremental`) execute through a `BatchMaterializationEngine`.
6+
7+
Custom batch materialization engines allow Feast users to extend Feast to customize the materialization process. Examples include:
8+
9+
* Setting up custom materialization-specific infrastructure during `feast apply` (e.g. setting up Spark clusters or Lambda Functions)
10+
* Launching custom batch ingestion \(materialization\) jobs \(Spark, Beam, AWS Lambda\)
11+
* Tearing down custom materialization-specific infrastructure during `feast teardown` (e.g. tearing down Spark clusters, or deleting Lambda Functions)
12+
13+
Feast comes with built-in materialization engines, e.g., `LocalMaterializationEngine`, and an experimental `LambdaMaterializationEngine`. However, users can develop their own materialization engines by creating a class that implements the contract in the [BatchMaterializationEngine class](https://github.com/feast-dev/feast/blob/6d7b38a39024b7301c499c20cf4e7aef6137c47c/sdk/python/feast/infra/materialization/batch_materialization_engine.py#L72).
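
To make the shape of that contract concrete, here is a minimal, self-contained sketch. It does not import Feast: `SketchEngine`, `SketchTask`, and `PrintingEngine` are hypothetical stand-ins, the method signatures are simplified, and the `teardown_infra` name is an assumption. Check the linked `BatchMaterializationEngine` source for the real abstract methods and their arguments.

```python
from abc import ABC, abstractmethod
from dataclasses import dataclass
from typing import List, Sequence


@dataclass
class SketchTask:
    """Hypothetical stand-in for Feast's MaterializationTask."""

    project: str
    feature_view: str


class SketchEngine(ABC):
    """Simplified stand-in for the engine contract (not the real Feast class)."""

    @abstractmethod
    def update(self, project: str, views_to_keep: Sequence[str]) -> None:
        """Set up engine-specific infrastructure (the `feast apply` phase)."""

    @abstractmethod
    def materialize(self, tasks: List[SketchTask]) -> List[str]:
        """Launch one job per task and return a handle for each job."""

    @abstractmethod
    def teardown_infra(self, project: str) -> None:
        """Remove engine-specific infrastructure (the `feast teardown` phase)."""


class PrintingEngine(SketchEngine):
    """Toy engine that only records what it was asked to do."""

    def __init__(self) -> None:
        self.log: List[str] = []

    def update(self, project: str, views_to_keep: Sequence[str]) -> None:
        self.log.append(f"setup:{project}")

    def materialize(self, tasks: List[SketchTask]) -> List[str]:
        return [f"job:{t.project}/{t.feature_view}" for t in tasks]

    def teardown_infra(self, project: str) -> None:
        self.log.append(f"teardown:{project}")


engine = PrintingEngine()
engine.update("repo", ["driver_hourly_stats"])
jobs = engine.materialize([SketchTask("repo", "driver_hourly_stats")])
engine.teardown_infra("repo")
```

The three methods correspond to the three customization points listed above: infrastructure setup, job launch, and infrastructure teardown.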
14+
15+
### Guide
16+
17+
The fastest way to add custom logic to Feast is to extend an existing materialization engine. The most generic engine is the `LocalMaterializationEngine`, which contains no cloud-specific logic. The guide that follows extends the `LocalMaterializationEngine` with operations that print text to the console. It is up to you as a developer to add your custom code to the engine methods, but the guide below provides the scaffolding to get you started.
18+
19+
#### Step 1: Define an Engine class
20+
21+
The first step is to define a custom materialization engine class. We've created the `MyCustomEngine` below.
22+
23+
```python
24+
from typing import Any, Callable, Dict, List, Optional, Sequence, Tuple, Union
25+
26+
from feast.entity import Entity
27+
from feast.feature_view import FeatureView
28+
from feast.batch_feature_view import BatchFeatureView
29+
from feast.stream_feature_view import StreamFeatureView
30+
from feast.infra.materialization import LocalMaterializationEngine, LocalMaterializationJob, MaterializationTask
31+
from feast.infra.offline_stores.offline_store import OfflineStore
32+
from feast.infra.online_stores.online_store import OnlineStore
33+
from feast.repo_config import RepoConfig
34+
35+
36+
class MyCustomEngine(LocalMaterializationEngine):
37+
    def __init__(
38+
        self,
39+
        *,
40+
        repo_config: RepoConfig,
41+
        offline_store: OfflineStore,
42+
        online_store: OnlineStore,
43+
        **kwargs,
44+
    ):
45+
        super().__init__(
46+
            repo_config=repo_config,
47+
            offline_store=offline_store,
48+
            online_store=online_store,
49+
            **kwargs,
50+
        )
51+
52+
    def update(
53+
        self,
54+
        project: str,
55+
        views_to_delete: Sequence[
56+
            Union[BatchFeatureView, StreamFeatureView, FeatureView]
57+
        ],
58+
        views_to_keep: Sequence[
59+
            Union[BatchFeatureView, StreamFeatureView, FeatureView]
60+
        ],
61+
        entities_to_delete: Sequence[Entity],
62+
        entities_to_keep: Sequence[Entity],
63+
    ):
64+
        print("Creating new infrastructure is easy here!")
65+
        pass
66+
67+
    def materialize(
68+
        self, registry, tasks: List[MaterializationTask]
69+
    ) -> List[LocalMaterializationJob]:
70+
        print("Launching custom batch jobs or multithreading things is pretty easy...")
71+
        return [
72+
            self._materialize_one(
73+
                registry,
74+
                task.feature_view,
75+
                task.start_time,
76+
                task.end_time,
77+
                task.project,
78+
                task.tqdm_builder,
79+
            )
80+
            for task in tasks
81+
        ]
82+
83+
```
84+
85+
Notice how in the above engine we have only overridden two of the methods on the `LocalMaterializationEngine`, namely `update` and `materialize`. These two methods are convenient to replace if you are planning to launch custom batch jobs.
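
The `materialize` override above runs its tasks one after another. As a rough illustration of the "multithreading" idea mentioned in its print statement, the sketch below fans per-task work out over a thread pool. It is self-contained and does not call Feast: `materialize_in_threads` and `fake_materialize_one` are hypothetical names, and whether the real per-view materialization is safe to run concurrently is an assumption you would need to verify for your stores.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, List, Sequence


def materialize_in_threads(
    tasks: Sequence[str],
    materialize_one: Callable[[str], str],
    max_workers: int = 4,
) -> List[str]:
    """Run materialize_one for every task on a thread pool, keeping task order."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # Executor.map yields results in the order the tasks were passed in.
        return list(pool.map(materialize_one, tasks))


def fake_materialize_one(view_name: str) -> str:
    # Hypothetical per-view job; a real engine would call into Feast here.
    return f"done:{view_name}"


results = materialize_in_threads(
    ["driver_hourly_stats", "customer_daily_profile"], fake_materialize_one
)
```

Because `pool.map` preserves input order, the returned list lines up with the task list, which keeps it easy to match jobs back to feature views.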
86+
87+
#### Step 2: Configuring Feast to use the engine
88+
89+
Configure your [feature\_store.yaml](../reference/feature-repository/feature-store-yaml.md) file to point to your new engine class:
90+
91+
```yaml
92+
project: repo
93+
registry: registry.db
94+
batch_engine: feast_custom_engine.MyCustomEngine
95+
online_store:
96+
    type: sqlite
97+
    path: online_store.db
98+
offline_store:
99+
    type: file
100+
```
101+
102+
Notice how the `batch_engine` field above points to the module and class where your engine can be found.
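
For intuition, a dotted `module.ClassName` string like the one above can be turned into a class with the standard library alone. The helper below is an illustrative sketch, not Feast's actual loader: `load_class` is a hypothetical name, and the example resolves a standard-library class so that it runs without Feast installed.

```python
import importlib
from typing import Any


def load_class(dotted_path: str) -> Any:
    """Resolve a 'package.module.ClassName' string to the class object."""
    # Split on the last dot: everything before it is the module path.
    module_name, _, class_name = dotted_path.rpartition(".")
    if not module_name:
        raise ValueError(f"expected 'module.ClassName', got {dotted_path!r}")
    module = importlib.import_module(module_name)
    return getattr(module, class_name)


# Stand-in for "feast_custom_engine.MyCustomEngine", using a stdlib class.
ordered_dict_cls = load_class("collections.OrderedDict")
```

This also shows why the module must be importable from the process that runs Feast: `importlib.import_module` searches the same module path as a regular `import` statement.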
103+
104+
#### Step 3: Using the engine
105+
106+
Now you should be able to use your engine by running a Feast command:
107+
108+
```bash
109+
feast apply
110+
```
111+
112+
```text
113+
Registered entity driver_id
114+
Registered feature view driver_hourly_stats
115+
Deploying infrastructure for driver_hourly_stats
116+
Creating new infrastructure is easy here!
117+
```
118+
119+
It may also be necessary to add the module root path to your `PYTHONPATH` as follows:
120+
121+
```bash
122+
PYTHONPATH=$PYTHONPATH:/home/my_user/my_custom_engine feast apply
123+
```
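
If Feast cannot find the engine class, it can help to check whether the module is importable at all before debugging anything else. The snippet below is a small sketch using only the standard library. `is_importable` is a hypothetical helper, and the module name and directory are taken from the examples in this guide.

```python
import importlib.util
import sys
from typing import Optional


def is_importable(module_name: str, extra_path: Optional[str] = None) -> bool:
    """Return True if module_name can be located, optionally after adding a path."""
    if extra_path and extra_path not in sys.path:
        # Roughly what extending PYTHONPATH does, but only for this process.
        sys.path.append(extra_path)
    try:
        return importlib.util.find_spec(module_name) is not None
    except (ImportError, ValueError):
        # Raised for dotted names with a missing parent, or for broken specs.
        return False


print(is_importable("json"))
print(is_importable("feast_custom_engine", "/home/my_user/my_custom_engine"))
```

If the second check fails, the directory passed as `extra_path` probably does not contain `feast_custom_engine.py` (or a package of that name).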
124+
125+
That's it. You should now have a fully functional custom engine!

docs/reference/feature-repository/feature-store-yaml.md

Lines changed: 1 addition & 0 deletions
@@ -24,5 +24,6 @@ The following top-level configuration options exist in the `feature_store.yaml`
2424
* **online_store** — Configures the online store.
2525
* **offline_store** — Configures the offline store.
2626
* **project** — Defines a namespace for the entire feature store. Can be used to isolate multiple deployments in a single installation of Feast. Should only contain letters, numbers, and underscores.
27+
* **batch_engine** - Configures the batch materialization engine.
2728

2829
Please see the [RepoConfig](https://rtd.feast.dev/en/latest/#feast.repo_config.RepoConfig) API reference for the full list of configuration options.

sdk/python/docs/index.rst

Lines changed: 23 additions & 0 deletions
@@ -250,18 +250,21 @@ Sqlite Online Store
250250

251251
.. automodule:: feast.infra.online_stores.sqlite
252252
:members:
253+
:noindex:
253254

254255
Datastore Online Store
255256
----------------------
256257

257258
.. automodule:: feast.infra.online_stores.datastore
258259
:members:
260+
:noindex:
259261

260262
DynamoDB Online Store
261263
---------------------
262264

263265
.. automodule:: feast.infra.online_stores.dynamodb
264266
:members:
267+
:noindex:
265268

266269
Redis Online Store
267270
------------------
@@ -283,3 +286,23 @@ HBase Online Store
283286
.. automodule:: feast.infra.online_stores.contrib.hbase_online_store.hbase
284287
:members:
285288
:noindex:
289+
290+
291+
Batch Materialization Engine
292+
============================
293+
294+
.. automodule:: feast.infra.materialization
295+
:members: BatchMaterializationEngine, MaterializationJob, MaterializationTask
296+
297+
Local Engine
298+
------------
299+
.. autoclass:: feast.infra.materialization.LocalMaterializationEngine
300+
:members:
301+
:noindex:
302+
303+
(Alpha) Lambda Based Engine
304+
---------------------------
305+
306+
.. automodule:: feast.infra.materialization.lambda.lambda_engine
307+
:members:
308+
:noindex:

sdk/python/docs/source/conf.py

Lines changed: 1 addition & 1 deletion
@@ -115,7 +115,7 @@
115115
# Add any paths that contain custom static files (such as style sheets) here,
116116
# relative to this directory. They are copied after the builtin static files,
117117
# so a file named "default.css" will overwrite the builtin "default.css".
118-
html_static_path = ["_static"]
118+
html_static_path = []
119119

120120

121121
# -- Options for HTMLHelp output ------------------------------------------
Lines changed: 29 additions & 0 deletions
@@ -0,0 +1,29 @@
1+
feast.infra.materialization.lambda package
2+
==========================================
3+
4+
Submodules
5+
----------
6+
7+
feast.infra.materialization.lambda.app module
8+
---------------------------------------------
9+
10+
.. automodule:: feast.infra.materialization.lambda.app
11+
:members:
12+
:undoc-members:
13+
:show-inheritance:
14+
15+
feast.infra.materialization.lambda.lambda\_engine module
16+
--------------------------------------------------------
17+
18+
.. automodule:: feast.infra.materialization.lambda.lambda_engine
19+
:members:
20+
:undoc-members:
21+
:show-inheritance:
22+
23+
Module contents
24+
---------------
25+
26+
.. automodule:: feast.infra.materialization.lambda
27+
:members:
28+
:undoc-members:
29+
:show-inheritance:

sdk/python/docs/source/feast.infra.materialization.rst

Lines changed: 8 additions & 0 deletions
@@ -1,6 +1,14 @@
11
feast.infra.materialization package
22
===================================
33

4+
Subpackages
5+
-----------
6+
7+
.. toctree::
8+
:maxdepth: 4
9+
10+
feast.infra.materialization.lambda
11+
412
Submodules
513
----------
614
