docs/README.md: 5 additions & 6 deletions
@@ -2,13 +2,13 @@

 ## What is Feast?

-Feast \(**Fea**ture **St**ore\) is an operational data system for managing and serving machine learning features to models in production. Feast is able to serve feature data to models from a low-latency online store \(for real-time prediction\) or from an offline store \(for scale-out batch scoring or model training\).
+Feast (**Fea**ture **St**ore) is an operational data system for managing and serving machine learning features to models in production. Feast is able to serve feature data to models from a low-latency online store (for real-time prediction) or from an offline store (for scale-out batch scoring or model training).
-**Models need consistent access to data:** Machine Learning \(ML\) systems built on traditional data infrastructure are often coupled to databases, object stores, streams, and files. A result of this coupling, however, is that any change in data infrastructure may break dependent ML systems. Another challenge is that dual implementations of data retrieval for training and serving can lead to inconsistencies in data, which in turn can lead to training-serving skew.
+**Models need consistent access to data:** Machine Learning (ML) systems built on traditional data infrastructure are often coupled to databases, object stores, streams, and files. A result of this coupling, however, is that any change in data infrastructure may break dependent ML systems. Another challenge is that dual implementations of data retrieval for training and serving can lead to inconsistencies in data, which in turn can lead to training-serving skew.

 Feast decouples your models from your data infrastructure by providing a single data access layer that abstracts feature storage from feature retrieval. Feast also provides a consistent means of referencing feature data for retrieval, and therefore ensures that models remain portable when moving from training to serving.
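The decoupling described in the paragraph above can be pictured with a small plain-Python sketch. It is not Feast's API: `FeatureReader`, `InMemoryOfflineStore`, `InMemoryOnlineStore`, `build_feature_vector`, and the `driver_stats:...` references are hypothetical names chosen for illustration. The point it tries to show is that model code depends only on feature references, so the same retrieval code can run against either store.

```python
from typing import Dict, List, Protocol, Tuple


class FeatureReader(Protocol):
    """Hypothetical interface: the only thing model code depends on."""

    def read(self, feature_ref: str, entity_id: int) -> float: ...


class InMemoryOfflineStore:
    """Stand-in for a warehouse-backed store used at training time."""

    def __init__(self, rows: Dict[str, Dict[int, float]]) -> None:
        self._rows = rows

    def read(self, feature_ref: str, entity_id: int) -> float:
        return self._rows[feature_ref][entity_id]


class InMemoryOnlineStore:
    """Stand-in for a low-latency key-value store used at serving time."""

    def __init__(self, kv: Dict[Tuple[str, int], float]) -> None:
        self._kv = kv

    def read(self, feature_ref: str, entity_id: int) -> float:
        return self._kv[(feature_ref, entity_id)]


def build_feature_vector(reader: FeatureReader, refs: List[str], entity_id: int) -> List[float]:
    # Same references and same code path, whichever store is plugged in.
    return [reader.read(ref, entity_id) for ref in refs]


REFS = ["driver_stats:conv_rate", "driver_stats:acc_rate"]
offline = InMemoryOfflineStore(
    {"driver_stats:conv_rate": {1001: 0.5}, "driver_stats:acc_rate": {1001: 0.9}}
)
online = InMemoryOnlineStore(
    {("driver_stats:conv_rate", 1001): 0.5, ("driver_stats:acc_rate", 1001): 0.9}
)

training_vector = build_feature_vector(offline, REFS, 1001)
serving_vector = build_feature_vector(online, REFS, 1001)
```

Because both stores satisfy the same small interface, swapping the storage backend does not touch `build_feature_vector`, which is the property the paragraph attributes to a single data access layer.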
@@ -34,9 +34,9 @@ Feast addresses this problem by introducing feature reuse through a centralized

 ## What Feast is not

-[**ETL**](https://en.wikipedia.org/wiki/Extract,_transform,_load) **or** [**ELT**](https://en.wikipedia.org/wiki/Extract,_load,_transform) **system:** Feast is not \(and does not plan to become\) a general purpose data transformation or pipelining system. Feast plans to include a light-weight feature engineering toolkit, but we encourage teams to integrate Feast with upstream ETL/ELT systems that are specialized in transformation.
+[**ETL**](https://en.wikipedia.org/wiki/Extract,\_transform,\_load) **or** [**ELT**](https://en.wikipedia.org/wiki/Extract,\_load,\_transform) **system:** Feast is not (and does not plan to become) a general purpose data transformation or pipelining system. Feast plans to include a light-weight feature engineering toolkit, but we encourage teams to integrate Feast with upstream ETL/ELT systems that are specialized in transformation.

-**Data warehouse:** Feast is not a replacement for your data warehouse or the source of truth for all transformed data in your organization. Rather, Feast is a light-weight downstream layer that can serve data from an existing data warehouse \(or other data sources\) to models in production.
+**Data warehouse:** Feast is not a replacement for your data warehouse or the source of truth for all transformed data in your organization. Rather, Feast is a light-weight downstream layer that can serve data from an existing data warehouse (or other data sources) to models in production.

 **Data catalog:** Feast is not a general purpose data catalog for your organization. Feast is purely focused on cataloging features for use in ML pipelines or systems, and only to the extent of facilitating the reuse of features.
@@ -55,4 +55,3 @@ Explore the following resources to get started with Feast:
 * [Running Feast with GCP/AWS](how-to-guides/feast-gcp-aws/) provides a more in-depth guide to using Feast.
 * [Reference](reference/feast-cli-commands.md) contains detailed API and design documents.
 * [Contributing](project/contributing.md) contains resources for anyone who wants to contribute to Feast.
docs/community.md: 7 additions & 8 deletions
@@ -11,33 +11,32 @@
 * Feast users should join [[email protected]](mailto:[email protected]) group by clicking [here](https://groups.google.com/g/feast-discuss).
 * Feast developers should join [[email protected]](mailto:[email protected]) group by clicking [here](https://groups.google.com/d/forum/feast-dev).
 * [Google Folder](https://drive.google.com/drive/u/0/folders/1jgMHOPDT2DvBlJeO9LCM79DP4lm4eOrR): This folder is used as a central repository for all Feast resources. For example:
-* Design proposals in the form of Request for Comments \(RFC\).
+* Design proposals in the form of Request for Comments (RFC).
 * User surveys and meeting minutes.
 * Slide decks of conferences our contributors have spoken at.
 * [Feast GitHub Repository](https://github.com/feast-dev/feast/): Find the complete Feast codebase on GitHub.
 * [Feast Linux Foundation Wiki](https://wiki.lfaidata.foundation/display/FEAST/Feast+Home): Our LFAI wiki page contains links to resources for contributors and maintainers.

 ## How can I get help?

-* **Slack:** Need to speak to a human? Come ask a question in our Slack channel \(link above\).
+* **Slack:** Need to speak to a human? Come ask a question in our Slack channel (link above).
 * **GitHub Issues:** Found a bug or need a feature? [Create an issue on GitHub](https://github.com/feast-dev/feast/issues/new).
 * **StackOverflow:** Need to ask a question on how to use Feast? We also monitor and respond to [StackOverflow](https://stackoverflow.com/questions/tagged/feast).

 ## Community Calls

-We have a user and contributor community call every two weeks \(Asia & US friendly\).
+We have a user and contributor community call every two weeks (Asia & US friendly).

 {% hint style="info" %}
 Please join the above Feast user groups in order to see calendar invites to the community calls
docs/getting-started/concepts/feature-service.md: 1 addition & 2 deletions
@@ -3,7 +3,7 @@
 A feature service is an object that represents a logical group of features from one or more [feature views](feature-view.md#feature-view). Feature Services allows features from within a feature view to be used as needed by an ML model. Users can expect to create one feature service per model, allowing for tracking of the features used by models.
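As a rough illustration of the grouping described above, the following plain-Python sketch models a feature service as a named list of feature views, one service per model. `FeatureViewSpec` and `FeatureServiceSpec` are hypothetical stand-ins, not Feast classes, and the feature names are made up for the example.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class FeatureViewSpec:
    """Hypothetical stand-in for a feature view: a named group of features."""

    name: str
    features: List[str]


@dataclass(frozen=True)
class FeatureServiceSpec:
    """Hypothetical stand-in for a feature service: one per model."""

    name: str
    views: List[FeatureViewSpec]

    def feature_refs(self) -> List[str]:
        # Flatten to "view:feature" references so usage can be tracked per model.
        return [f"{view.name}:{feature}" for view in self.views for feature in view.features]


driver_stats = FeatureViewSpec("driver_hourly_stats", ["conv_rate", "acc_rate"])
global_stats = FeatureViewSpec("global_stats", ["num_rides"])
ranking_model_v1 = FeatureServiceSpec("ranking_model_v1", [driver_stats, global_stats])

refs = ranking_model_v1.feature_refs()
```

Listing the references per service is what makes it possible to answer "which features does this model use?" from the definition alone.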
docs/getting-started/concepts/feature-view.md: 5 additions & 7 deletions
@@ -2,10 +2,10 @@

 ## Feature views

-A feature view is an object that represents a logical group of time-series feature data as it is found in a [data source](data-source.md). Feature views consist of zero or more [entities](entity.md), one or more [features](feature-view.md#feature), and a [data source](data-source.md). Feature views allow Feast to model your existing feature data in a consistent way in both an offline \(training\) and online \(serving\) environment. Feature views generally contain features that are properties of a specific object, in which case that object is defined as an entity and included in the feature view. If the features are not related to a specific object, the feature view might not have entities; see [feature views without entities](feature-view.md#feature-views-without-entities) below.
+A feature view is an object that represents a logical group of time-series feature data as it is found in a [data source](data-source.md). Feature views consist of zero or more [entities](entity.md), one or more [features](feature-view.md#feature), and a [data source](data-source.md). Feature views allow Feast to model your existing feature data in a consistent way in both an offline (training) and online (serving) environment. Feature views generally contain features that are properties of a specific object, in which case that object is defined as an entity and included in the feature view. If the features are not related to a specific object, the feature view might not have entities; see [feature views without entities](feature-view.md#feature-views-without-entities) below.

 {% tabs %}
-{% tab title="driver\_trips\_feature\_view.py" %}
+{% tab title="driver_trips_feature_view.py" %}
 ```python
 driver_stats_fv = FeatureView(
     name="driver_activity",
@@ -37,7 +37,7 @@ Feast does not generate feature values. It acts as the ingestion and serving sys
 If a feature view contains features that are not related to a specific entity, the feature view can be defined without entities.

 {% tabs %}
-{% tab title="global\_stats.py" %}
+{% tab title="global_stats.py" %}
 ```python
 global_stats_fv = FeatureView(
     name="global_stats",
@@ -70,9 +70,9 @@ Together with [data sources](data-source.md), they indicate to Feast where to fi

 Feature names must be unique within a [feature view](feature-view.md#feature-view).

-## \[Alpha\] On demand feature views
+## \[Alpha] On demand feature views

-On demand feature views allows users to use existing features and request time data \(features only available at request time\) to transform and create new features. Users define python transformation logic which is executed in both historical retrieval and online retrieval paths:
+On demand feature views allows users to use existing features and request time data (features only available at request time) to transform and create new features. Users define python transformation logic which is executed in both historical retrieval and online retrieval paths:
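The idea of one transformation shared by both retrieval paths can be sketched in plain Python as follows. This is not the Feast on demand feature view API: the function names and the `conv_rate` / `val_to_add` fields are assumptions made for the example.

```python
from typing import Dict, List


def conv_rate_plus_request_value(
    features: Dict[str, float], request: Dict[str, float]
) -> Dict[str, float]:
    """Hypothetical transformation combining a stored feature with request-time data."""
    return {"conv_rate_plus_val": features["conv_rate"] + request["val_to_add"]}


def historical_path(
    rows: List[Dict[str, float]], requests: List[Dict[str, float]]
) -> List[Dict[str, float]]:
    # Batch retrieval: apply the transformation row by row over historical data.
    return [conv_rate_plus_request_value(f, r) for f, r in zip(rows, requests)]


def online_path(features: Dict[str, float], request: Dict[str, float]) -> Dict[str, float]:
    # Online retrieval: apply the very same function to a single request.
    return conv_rate_plus_request_value(features, request)


historical = historical_path([{"conv_rate": 0.25}], [{"val_to_add": 1.0}])
online = online_path({"conv_rate": 0.25}, {"val_to_add": 1.0})
```

Routing both paths through a single function is what keeps the training-time and serving-time values of the derived feature identical.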

 ```python
 # Define a request data source which encodes features / information only
docs/getting-started/quickstart.md: 14 additions & 15 deletions
@@ -9,7 +9,7 @@ In this tutorial we will

 You can run this tutorial in Google Colab or run it on your localhost, following the guided steps below.

-[**Run in Google Colab**](https://colab.research.google.com/github/feast-dev/feast/blob/master/examples/quickstart/quickstart.ipynb)\*\*\*\*
+[**Run in Google Colab**](https://colab.research.google.com/github/feast-dev/feast/blob/master/examples/quickstart/quickstart.ipynb)****

 ## Overview
@@ -19,9 +19,9 @@ In this tutorial, we use feature stores to generate training data and power onli
 * Feast joins these tables with battle-tested logic that ensures _point-in-time_ correctness so future feature values do not leak to models.
 * _\*Upcoming_: Feast alerts users to offline / online skew with data quality monitoring.
 2. **Online feature availability:** At inference time, models often need access to features that aren't readily available and need to be precomputed from other datasources.
-* Feast manages deployment to a variety of online stores \(e.g. DynamoDB, Redis, Google Cloud Datastore\) and ensures necessary features are consistently _available_ and _freshly computed_ at inference time.
+* Feast manages deployment to a variety of online stores (e.g. DynamoDB, Redis, Google Cloud Datastore) and ensures necessary features are consistently _available_ and _freshly computed_ at inference time.
 3. **Feature reusability and model versioning:** Different teams within an organization are often unable to reuse features across projects, resulting in duplicate feature creation logic. Models have data dependencies that need to be versioned, for example when running A/B tests on model versions.
-* Feast enables discovery of and collaboration on previously used features and enables versioning of sets of features \(via _feature services_\).
+* Feast enables discovery of and collaboration on previously used features and enables versioning of sets of features (via _feature services_).
 * Feast enables feature transformation so users can re-use transformation logic across online / offline usecases and across models.
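The point-in-time correctness mentioned in the list above can be illustrated with a minimal standard-library sketch. It is not Feast's join logic; `FEATURE_HISTORY`, `point_in_time_value`, and the sample values are made up. For each labelled event, it picks the latest feature value observed at or before the event timestamp, so later values cannot leak into training data.

```python
from bisect import bisect_right
from datetime import datetime
from typing import Dict, List, Optional, Tuple

# Feature history per driver: (timestamp, conv_rate), sorted by timestamp.
FEATURE_HISTORY: Dict[int, List[Tuple[datetime, float]]] = {
    1001: [
        (datetime(2021, 4, 12, 8), 0.10),
        (datetime(2021, 4, 12, 10), 0.20),
        (datetime(2021, 4, 12, 12), 0.30),
    ],
}


def point_in_time_value(driver_id: int, event_time: datetime) -> Optional[float]:
    """Return the latest feature value observed at or before event_time."""
    history = FEATURE_HISTORY.get(driver_id, [])
    timestamps = [ts for ts, _ in history]
    # bisect_right counts the rows whose timestamp is <= event_time.
    idx = bisect_right(timestamps, event_time)
    return history[idx - 1][1] if idx else None


value_at_11 = point_in_time_value(1001, datetime(2021, 4, 12, 11))
value_before_history = point_in_time_value(1001, datetime(2021, 4, 12, 7))
```

An event at 11:00 sees the 10:00 value and never the 12:00 one; an event that predates all feature rows gets no value rather than a future one.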

 ## Step 1: Install Feast
@@ -53,7 +53,7 @@ cd feature_repo

 {% tabs %}
 {% tab title="Output" %}
-```text
+```
 Creating a new Feast repository in /home/Jovyan/feature_repo.
 ```
 {% endtab %}
@@ -66,7 +66,7 @@ Let's take a look at the resulting demo repo itself. It breaks down into
 * `feature_store.yaml` contains a demo setup configuring where data sources are
-The key line defining the overall architecture of the feature store is the **provider**. This defines where the raw data exists \(for generating training data & feature values for serving\), and where to materialize feature values to in the online store \(for serving\).
+The key line defining the overall architecture of the feature store is the **provider**. This defines where the raw data exists (for generating training data & feature values for serving), and where to materialize feature values to in the online store (for serving).

 Valid values for `provider` in `feature_store.yaml` are:

 * local: use file source / SQLite
 * gcp: use BigQuery / Google Cloud Datastore
 * aws: use Redshift / DynamoDB

-A custom setup \(e.g. using the built-in support for Redis\) can be made by following Creating a custom provider
+A custom setup (e.g. using the built-in support for Redis) can be made by following Creating a custom provider
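The provider list above amounts to a lookup from provider name to a default offline and online store pair. The sketch below only restates that list in Python; `ProviderDefaults`, `PROVIDER_DEFAULTS`, and `describe_provider` are hypothetical names, and Feast itself resolves providers differently.

```python
from typing import Dict, NamedTuple


class ProviderDefaults(NamedTuple):
    offline_store: str
    online_store: str


# Values taken from the list above; the dictionary itself is only an illustration.
PROVIDER_DEFAULTS: Dict[str, ProviderDefaults] = {
    "local": ProviderDefaults("file source", "SQLite"),
    "gcp": ProviderDefaults("BigQuery", "Google Cloud Datastore"),
    "aws": ProviderDefaults("Redshift", "DynamoDB"),
}


def describe_provider(provider: str) -> str:
    """Summarize where training data is read from and where features are served from."""
    try:
        defaults = PROVIDER_DEFAULTS[provider]
    except KeyError:
        raise ValueError(
            f"unknown provider {provider!r}; expected one of {sorted(PROVIDER_DEFAULTS)}"
        ) from None
    return f"{provider}: offline={defaults.offline_store}, online={defaults.online_store}"


aws_summary = describe_provider("aws")
```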

 ## Step 3: Register feature definitions and deploy your feature store

-The `apply` command scans python files in the current directory for feature view/entity definitions, registers the objects, and deploys infrastructure. In this example, it reads `example.py`\(shown again below for convenience\) and sets up SQLite online store tables. Note that we had specified SQLite as the default online store by using the `local` provider in `feature_store.yaml`.
+The `apply` command scans python files in the current directory for feature view/entity definitions, registers the objects, and deploys infrastructure. In this example, it reads `example.py` (shown again below for convenience) and sets up SQLite online store tables. Note that we had specified SQLite as the default online store by using the `local` provider in `feature_store.yaml`.
@@ -193,7 +193,7 @@ Deploying infrastructure for driver_hourly_stats

 ## Step 4: Generating training data

-To train a model, we need features and labels. Often, this label data is stored separately \(e.g. you have one table storing user survey results and another set of tables with feature values\).
+To train a model, we need features and labels. Often, this label data is stored separately (e.g. you have one table storing user survey results and another set of tables with feature values).

 The user can query that table of labels with timestamps and pass that into Feast as an _entity dataframe_ for training data generation. In many cases, Feast will also intelligently join relevant tables to create the relevant feature vectors.
@@ -275,7 +275,7 @@ None

 ## Step 5: Load features into your online store

-We now serialize the latest values of features since the beginning of time to prepare for serving \(note: `materialize-incremental` serializes all new features since the last `materialize` call\).
+We now serialize the latest values of features since the beginning of time to prepare for serving (note: `materialize-incremental` serializes all new features since the last `materialize` call).
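The difference between a full and an incremental load can be sketched with a toy in-memory store. This is a simplified model of the idea described above, not how Feast implements `materialize` or `materialize-incremental`; `ToyOnlineStore`, its methods, and the sample rows are assumptions for illustration.

```python
from datetime import datetime
from typing import Dict, List, Optional, Tuple

Row = Tuple[int, datetime, float]  # (driver_id, event_timestamp, conv_rate)


class ToyOnlineStore:
    """Hypothetical in-memory online store holding the latest value per driver."""

    def __init__(self) -> None:
        self.latest: Dict[int, Tuple[datetime, float]] = {}
        self.last_materialized: Optional[datetime] = None

    def materialize(self, rows: List[Row], start: datetime, end: datetime) -> None:
        # Load rows in [start, end], keeping only the newest value per driver.
        for driver_id, ts, value in rows:
            if start <= ts <= end:
                current = self.latest.get(driver_id)
                if current is None or ts >= current[0]:
                    self.latest[driver_id] = (ts, value)
        self.last_materialized = end

    def materialize_incremental(self, rows: List[Row], end: datetime) -> None:
        # Resume from where the previous call stopped, or from the beginning of time.
        start = self.last_materialized or datetime.min
        self.materialize(rows, start, end)


rows: List[Row] = [
    (1001, datetime(2021, 8, 22, 10), 0.1),
    (1001, datetime(2021, 8, 23, 10), 0.2),
    (1002, datetime(2021, 8, 22, 9), 0.5),
]
store = ToyOnlineStore()
store.materialize_incremental(rows, datetime(2021, 8, 22, 23))
first_pass_value = store.latest[1001][1]
store.materialize_incremental(rows, datetime(2021, 8, 23, 23))
second_pass_value = store.latest[1001][1]
```

The first call starts from the beginning of time because nothing has been loaded yet; the second call only considers rows newer than the previous end timestamp and overwrites the stored value for driver 1001.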

 {% tabs %}
 {% tab title="Bash" %}
@@ -300,7 +300,7 @@ driver_hourly_stats from 2021-08-22 16:25:47+00:00 to 2021-08-23 16:25:46+00:00:

 ## Step 6: Fetching feature vectors for inference

-At inference time, we need to quickly read the latest feature values for different drivers \(which otherwise might have existed only in batch sources\) from the online feature store using `get_online_features()`. These feature vectors can then be fed to the model.
+At inference time, we need to quickly read the latest feature values for different drivers (which otherwise might have existed only in batch sources) from the online feature store using `get_online_features()`. These feature vectors can then be fed to the model.

 {% tabs %}
 {% tab title="Python" %}
@@ -346,5 +346,4 @@ pprint(feature_vector)
 * Read the [Architecture](architecture-and-components/) page.
 * Check out our [Tutorials](../tutorials/tutorials-overview.md) section for more examples on how to use Feast.
 * Follow our [Running Feast with GCP/AWS](../how-to-guides/feast-gcp-aws/) guide for a more in-depth tutorial on using Feast.
-* Join other Feast users and contributors in [Slack](https://slack.feast.dev/) and become part of the community!
-
+* Join other Feast users and contributors in [Slack](https://slack.feast.dev) and become part of the community!