Commit cf82ed6

adchia authored and gitbook-bot committed
GitBook: [feast-dev#307] Update hero image in master
1 parent f5f5500 commit cf82ed6

30 files changed, +180 -510 lines changed

docs/README.md

Lines changed: 5 additions & 6 deletions
Original file line number, Diff line number, Diff line change
@@ -2,13 +2,13 @@
22

33
## What is Feast?
44

5-
Feast \(**Fea**ture **St**ore\) is an operational data system for managing and serving machine learning features to models in production. Feast is able to serve feature data to models from a low-latency online store \(for real-time prediction\) or from an offline store \(for scale-out batch scoring or model training\).
5+
Feast (**Fea**ture **St**ore) is an operational data system for managing and serving machine learning features to models in production. Feast is able to serve feature data to models from a low-latency online store (for real-time prediction) or from an offline store (for scale-out batch scoring or model training).
66

7-
![](.gitbook/assets/feast_hero_010.png)
7+
![](.gitbook/assets/feast-marchitecture-211014.png)
88

99
## Problems Feast Solves
1010

11-
**Models need consistent access to data:** Machine Learning \(ML\) systems built on traditional data infrastructure are often coupled to databases, object stores, streams, and files. A result of this coupling, however, is that any change in data infrastructure may break dependent ML systems. Another challenge is that dual implementations of data retrieval for training and serving can lead to inconsistencies in data, which in turn can lead to training-serving skew.
11+
**Models need consistent access to data:** Machine Learning (ML) systems built on traditional data infrastructure are often coupled to databases, object stores, streams, and files. A result of this coupling, however, is that any change in data infrastructure may break dependent ML systems. Another challenge is that dual implementations of data retrieval for training and serving can lead to inconsistencies in data, which in turn can lead to training-serving skew.
1212

1313
Feast decouples your models from your data infrastructure by providing a single data access layer that abstracts feature storage from feature retrieval. Feast also provides a consistent means of referencing feature data for retrieval, and therefore ensures that models remain portable when moving from training to serving.
1414

@@ -34,9 +34,9 @@ Feast addresses this problem by introducing feature reuse through a centralized
3434

3535
## What Feast is not
3636

37-
[**ETL**](https://en.wikipedia.org/wiki/Extract,_transform,_load) **or** [**ELT**](https://en.wikipedia.org/wiki/Extract,_load,_transform) **system:** Feast is not \(and does not plan to become\) a general purpose data transformation or pipelining system. Feast plans to include a light-weight feature engineering toolkit, but we encourage teams to integrate Feast with upstream ETL/ELT systems that are specialized in transformation.
37+
[**ETL**](https://en.wikipedia.org/wiki/Extract,\_transform,\_load) **or** [**ELT**](https://en.wikipedia.org/wiki/Extract,\_load,\_transform) **system:** Feast is not (and does not plan to become) a general purpose data transformation or pipelining system. Feast plans to include a light-weight feature engineering toolkit, but we encourage teams to integrate Feast with upstream ETL/ELT systems that are specialized in transformation.
3838

39-
**Data warehouse:** Feast is not a replacement for your data warehouse or the source of truth for all transformed data in your organization. Rather, Feast is a light-weight downstream layer that can serve data from an existing data warehouse \(or other data sources\) to models in production.
39+
**Data warehouse:** Feast is not a replacement for your data warehouse or the source of truth for all transformed data in your organization. Rather, Feast is a light-weight downstream layer that can serve data from an existing data warehouse (or other data sources) to models in production.
4040

4141
**Data catalog:** Feast is not a general purpose data catalog for your organization. Feast is purely focused on cataloging features for use in ML pipelines or systems, and only to the extent of facilitating the reuse of features.
4242

@@ -55,4 +55,3 @@ Explore the following resources to get started with Feast:
5555
* [Running Feast with GCP/AWS](how-to-guides/feast-gcp-aws/) provides a more in-depth guide to using Feast.
5656
* [Reference](reference/feast-cli-commands.md) contains detailed API and design documents.
5757
* [Contributing](project/contributing.md) contains resources for anyone who wants to contribute to Feast.
58-

docs/community.md

Lines changed: 7 additions & 8 deletions
@@ -11,33 +11,32 @@
1111
* Feast users should join the feast-discuss group by clicking [here](https://groups.google.com/g/feast-discuss).
1212
* Feast developers should join the feast-dev group by clicking [here](https://groups.google.com/d/forum/feast-dev).
1313
* [Google Folder](https://drive.google.com/drive/u/0/folders/1jgMHOPDT2DvBlJeO9LCM79DP4lm4eOrR): This folder is used as a central repository for all Feast resources. For example:
14-
* Design proposals in the form of Request for Comments \(RFC\).
14+
* Design proposals in the form of Request for Comments (RFC).
1515
* User surveys and meeting minutes.
1616
* Slide decks of conferences our contributors have spoken at.
1717
* [Feast GitHub Repository](https://github.com/feast-dev/feast/): Find the complete Feast codebase on GitHub.
1818
* [Feast Linux Foundation Wiki](https://wiki.lfaidata.foundation/display/FEAST/Feast+Home): Our LFAI wiki page contains links to resources for contributors and maintainers.
1919

2020
## How can I get help?
2121

22-
* **Slack:** Need to speak to a human? Come ask a question in our Slack channel \(link above\).
22+
* **Slack:** Need to speak to a human? Come ask a question in our Slack channel (link above).
2323
* **GitHub Issues:** Found a bug or need a feature? [Create an issue on GitHub](https://github.com/feast-dev/feast/issues/new).
2424
* **StackOverflow:** Need to ask a question on how to use Feast? We also monitor and respond to [StackOverflow](https://stackoverflow.com/questions/tagged/feast).
2525

2626
## Community Calls
2727

28-
We have a user and contributor community call every two weeks \(Asia & US friendly\).
28+
We have a user and contributor community call every two weeks (Asia & US friendly).
2929

3030
{% hint style="info" %}
3131
Please join the above Feast user groups in order to see calendar invites to the community calls
3232
{% endhint %}
3333

34-
### Frequency \(alternating times every 2 weeks\)
34+
### Frequency (alternating times every 2 weeks)
3535

36-
* Tuesday 18:00 pm to 18:30 pm \(US, Asia\)
37-
* Tuesday 10:00 am to 10:30 am \(US, Europe\)
36+
* Tuesday 18:00 pm to 18:30 pm (US, Asia)
37+
* Tuesday 10:00 am to 10:30 am (US, Europe)
3838

3939
### Links
4040

4141
* Zoom: [https://zoom.us/j/6325193230](https://zoom.us/j/6325193230)
42-
* Meeting notes: [https://bit.ly/feast-notes](https://bit.ly/feast-notes%20)
43-
42+
* Meeting notes: [https://bit.ly/feast-notes](https://bit.ly/feast-notes)

docs/getting-started/concepts/feature-service.md

Lines changed: 1 addition & 2 deletions
Original file line numberDiff line numberDiff line change
@@ -3,7 +3,7 @@
33
A feature service is an object that represents a logical group of features from one or more [feature views](feature-view.md#feature-view). Feature Services allows features from within a feature view to be used as needed by an ML model. Users can expect to create one feature service per model, allowing for tracking of the features used by models.
44

55
{% tabs %}
6-
{% tab title="driver\_trips\_feature\_service.py" %}
6+
{% tab title="driver_trips_feature_service.py" %}
77
```python
88
from driver_ratings_feature_view import driver_ratings_fv
99
from driver_trips_feature_view import driver_stats_fv
@@ -46,4 +46,3 @@ feature_store = FeatureStore('.') # Initialize the feature store
4646
feature_service = feature_store.get_feature_service("driver_activity")
4747
feature_store.get_historical_features(features=feature_service, entity_df=entity_df)
4848
```
49-

docs/getting-started/concepts/feature-view.md

Lines changed: 5 additions & 7 deletions
Original file line numberDiff line numberDiff line change
@@ -2,10 +2,10 @@
22

33
## Feature views
44

5-
A feature view is an object that represents a logical group of time-series feature data as it is found in a [data source](data-source.md). Feature views consist of zero or more [entities](entity.md), one or more [features](feature-view.md#feature), and a [data source](data-source.md). Feature views allow Feast to model your existing feature data in a consistent way in both an offline \(training\) and online \(serving\) environment. Feature views generally contain features that are properties of a specific object, in which case that object is defined as an entity and included in the feature view. If the features are not related to a specific object, the feature view might not have entities; see [feature views without entities](feature-view.md#feature-views-without-entities) below.
5+
A feature view is an object that represents a logical group of time-series feature data as it is found in a [data source](data-source.md). Feature views consist of zero or more [entities](entity.md), one or more [features](feature-view.md#feature), and a [data source](data-source.md). Feature views allow Feast to model your existing feature data in a consistent way in both an offline (training) and online (serving) environment. Feature views generally contain features that are properties of a specific object, in which case that object is defined as an entity and included in the feature view. If the features are not related to a specific object, the feature view might not have entities; see [feature views without entities](feature-view.md#feature-views-without-entities) below.
66

77
{% tabs %}
8-
{% tab title="driver\_trips\_feature\_view.py" %}
8+
{% tab title="driver_trips_feature_view.py" %}
99
```python
1010
driver_stats_fv = FeatureView(
1111
name="driver_activity",
@@ -37,7 +37,7 @@ Feast does not generate feature values. It acts as the ingestion and serving sys
3737
If a feature view contains features that are not related to a specific entity, the feature view can be defined without entities.
3838

3939
{% tabs %}
40-
{% tab title="global\_stats.py" %}
40+
{% tab title="global_stats.py" %}
4141
```python
4242
global_stats_fv = FeatureView(
4343
name="global_stats",
@@ -70,9 +70,9 @@ Together with [data sources](data-source.md), they indicate to Feast where to fi
7070

7171
Feature names must be unique within a [feature view](feature-view.md#feature-view).
7272

73-
## \[Alpha\] On demand feature views
73+
## \[Alpha] On demand feature views
7474

75-
On demand feature views allows users to use existing features and request time data \(features only available at request time\) to transform and create new features. Users define python transformation logic which is executed in both historical retrieval and online retrieval paths:
75+
On demand feature views allows users to use existing features and request time data (features only available at request time) to transform and create new features. Users define python transformation logic which is executed in both historical retrieval and online retrieval paths:
7676

7777
```python
7878
# Define a request data source which encodes features / information only
@@ -103,5 +103,3 @@ def transformed_conv_rate(features_df: pd.DataFrame) -> pd.DataFrame:
103103
return df
104104
```
105105

106-
107-

docs/getting-started/quickstart.md

Lines changed: 14 additions & 15 deletions
Original file line numberDiff line numberDiff line change
@@ -9,7 +9,7 @@ In this tutorial we will
99

1010
You can run this tutorial in Google Colab or run it on your localhost, following the guided steps below.
1111

12-
![](../.gitbook/assets/colab_logo_32px.png)[**Run in Google Colab**](https://colab.research.google.com/github/feast-dev/feast/blob/master/examples/quickstart/quickstart.ipynb)\*\*\*\*
12+
![](../.gitbook/assets/colab_logo\_32px.png)[**Run in Google Colab**](https://colab.research.google.com/github/feast-dev/feast/blob/master/examples/quickstart/quickstart.ipynb)****
1313

1414
## Overview
1515

@@ -19,9 +19,9 @@ In this tutorial, we use feature stores to generate training data and power onli
1919
* Feast joins these tables with battle-tested logic that ensures _point-in-time_ correctness so future feature values do not leak to models.
2020
* _\*Upcoming_: Feast alerts users to offline / online skew with data quality monitoring.
2121
2. **Online feature availability:** At inference time, models often need access to features that aren't readily available and need to be precomputed from other datasources.
22-
* Feast manages deployment to a variety of online stores \(e.g. DynamoDB, Redis, Google Cloud Datastore\) and ensures necessary features are consistently _available_ and _freshly computed_ at inference time.
22+
* Feast manages deployment to a variety of online stores (e.g. DynamoDB, Redis, Google Cloud Datastore) and ensures necessary features are consistently _available_ and _freshly computed_ at inference time.
2323
3. **Feature reusability and model versioning:** Different teams within an organization are often unable to reuse features across projects, resulting in duplicate feature creation logic. Models have data dependencies that need to be versioned, for example when running A/B tests on model versions.
24-
* Feast enables discovery of and collaboration on previously used features and enables versioning of sets of features \(via _feature services_\).
24+
* Feast enables discovery of and collaboration on previously used features and enables versioning of sets of features (via _feature services_).
2525
* Feast enables feature transformation so users can re-use transformation logic across online / offline usecases and across models.
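
The point-in-time join mentioned in the list above can be sketched in plain Python. This is a minimal illustration, not Feast's implementation: the function name `point_in_time_join`, the `event_timestamp` field, and the sample rows are assumptions made for this example, while `driver_id` and `conv_rate` are borrowed from the demo repo. For each labeled row it picks the newest feature value recorded at or before that row's timestamp, so later values never leak into training data.

```python
from datetime import datetime


def point_in_time_join(entity_rows, feature_rows):
    """Attach to each entity row the newest feature row at or before its timestamp.

    Hypothetical helper for illustration only; it is not part of Feast.
    """
    # Group feature rows by entity key and sort each group by time.
    by_driver = {}
    for row in feature_rows:
        by_driver.setdefault(row["driver_id"], []).append(row)
    for rows in by_driver.values():
        rows.sort(key=lambda r: r["event_timestamp"])

    joined = []
    for entity in entity_rows:
        match = None
        for row in by_driver.get(entity["driver_id"], []):
            if row["event_timestamp"] > entity["event_timestamp"]:
                break  # everything from here on is in the future for this label
            match = row
        joined.append({**entity, "conv_rate": match["conv_rate"] if match else None})
    return joined


# Sample data (made up for this sketch).
feature_rows = [
    {"driver_id": 1001, "event_timestamp": datetime(2021, 8, 20, 10), "conv_rate": 0.1},
    {"driver_id": 1001, "event_timestamp": datetime(2021, 8, 22, 10), "conv_rate": 0.9},
]
entity_rows = [
    {"driver_id": 1001, "event_timestamp": datetime(2021, 8, 21, 12)},
    {"driver_id": 1001, "event_timestamp": datetime(2021, 8, 23, 12)},
    {"driver_id": 1002, "event_timestamp": datetime(2021, 8, 21, 12)},
]

for row in point_in_time_join(entity_rows, feature_rows):
    print(row["driver_id"], row["event_timestamp"], row["conv_rate"])
```

The first label only sees the value recorded on 20 August, even though a newer value exists in the feature table, and the driver with no feature rows gets `None` rather than a borrowed value.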
2626

2727
## Step 1: Install Feast
@@ -53,7 +53,7 @@ cd feature_repo
5353

5454
{% tabs %}
5555
{% tab title="Output" %}
56-
```text
56+
```
5757
Creating a new Feast repository in /home/Jovyan/feature_repo.
5858
```
5959
{% endtab %}
@@ -66,7 +66,7 @@ Let's take a look at the resulting demo repo itself. It breaks down into
6666
* `feature_store.yaml` contains a demo setup configuring where data sources are
6767

6868
{% tabs %}
69-
{% tab title="feature\_store.yaml" %}
69+
{% tab title="feature_store.yaml" %}
7070
```yaml
7171
project: my_project
7272
registry: data/registry.db
@@ -117,21 +117,21 @@ driver_hourly_stats_view = FeatureView(
117117
{% endtab %}
118118
{% endtabs %}
119119

120-
![Demo parquet data: data/driver\_stats.parquet](../.gitbook/assets/screen-shot-2021-08-23-at-2.35.18-pm.png)
120+
![Demo parquet data: data/driver_stats.parquet](../.gitbook/assets/screen-shot-2021-08-23-at-2.35.18-pm.png)
121121

122-
The key line defining the overall architecture of the feature store is the **provider**. This defines where the raw data exists \(for generating training data & feature values for serving\), and where to materialize feature values to in the online store \(for serving\).
122+
The key line defining the overall architecture of the feature store is the **provider**. This defines where the raw data exists (for generating training data & feature values for serving), and where to materialize feature values to in the online store (for serving).
123123

124124
Valid values for `provider` in `feature_store.yaml` are:
125125

126126
* local: use file source / SQLite
127127
* gcp: use BigQuery / Google Cloud Datastore
128128
* aws: use Redshift / DynamoDB
129129

130-
A custom setup \(e.g. using the built-in support for Redis\) can be made by following Creating a custom provider
130+
A custom setup (e.g. using the built-in support for Redis) can be made by following Creating a custom provider
131131

132132
## Step 3: Register feature definitions and deploy your feature store
133133

134-
The `apply` command scans python files in the current directory for feature view/entity definitions, registers the objects, and deploys infrastructure. In this example, it reads `example.py` \(shown again below for convenience\) and sets up SQLite online store tables. Note that we had specified SQLite as the default online store by using the `local` provider in `feature_store.yaml`.
134+
The `apply` command scans python files in the current directory for feature view/entity definitions, registers the objects, and deploys infrastructure. In this example, it reads `example.py` (shown again below for convenience) and sets up SQLite online store tables. Note that we had specified SQLite as the default online store by using the `local` provider in `feature_store.yaml`.
135135

136136
{% tabs %}
137137
{% tab title="Bash" %}
@@ -183,7 +183,7 @@ driver_hourly_stats_view = FeatureView(
183183

184184
{% tabs %}
185185
{% tab title="Output" %}
186-
```text
186+
```
187187
Registered entity driver_id
188188
Registered feature view driver_hourly_stats
189189
Deploying infrastructure for driver_hourly_stats
@@ -193,7 +193,7 @@ Deploying infrastructure for driver_hourly_stats
193193

194194
## Step 4: Generating training data
195195

196-
To train a model, we need features and labels. Often, this label data is stored separately \(e.g. you have one table storing user survey results and another set of tables with feature values\).
196+
To train a model, we need features and labels. Often, this label data is stored separately (e.g. you have one table storing user survey results and another set of tables with feature values).
197197

198198
The user can query that table of labels with timestamps and pass that into Feast as an _entity dataframe_ for training data generation. In many cases, Feast will also intelligently join relevant tables to create the relevant feature vectors.
199199

@@ -275,7 +275,7 @@ None
275275

276276
## Step 5: Load features into your online store
277277

278-
We now serialize the latest values of features since the beginning of time to prepare for serving \(note: `materialize-incremental` serializes all new features since the last `materialize` call\).
278+
We now serialize the latest values of features since the beginning of time to prepare for serving (note: `materialize-incremental` serializes all new features since the last `materialize` call).
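
The difference between a first load and an incremental load can be sketched with a toy in-memory store. This is an illustration under stated assumptions, not Feast's code: `ToyOnlineStore`, its methods, the `event_timestamp` field, and the choice of an exclusive start and inclusive end for the window are all hypothetical. The sketch keeps only the latest value per driver and remembers where the previous run ended, so the next run only picks up newer rows.

```python
from datetime import datetime


class ToyOnlineStore:
    """In-memory stand-in for an online store (hypothetical, for illustration)."""

    def __init__(self):
        self.latest = {}  # driver_id -> most recent conv_rate
        self.last_materialized = datetime.min  # "beginning of time" before the first run

    def materialize_incremental(self, rows, end):
        """Write the newest row per driver with last_materialized < timestamp <= end."""
        start = self.last_materialized
        written = set()
        # Ascending order means a later row overwrites an earlier one for the same driver.
        for row in sorted(rows, key=lambda r: r["event_timestamp"]):
            if start < row["event_timestamp"] <= end:
                self.latest[row["driver_id"]] = row["conv_rate"]
                written.add(row["driver_id"])
        self.last_materialized = end
        return len(written)

    def read(self, driver_id):
        """Return the latest materialized value, or None if the driver is unknown."""
        return self.latest.get(driver_id)


# Sample data (made up for this sketch).
rows = [
    {"driver_id": 1001, "event_timestamp": datetime(2021, 8, 20), "conv_rate": 0.1},
    {"driver_id": 1001, "event_timestamp": datetime(2021, 8, 22), "conv_rate": 0.9},
    {"driver_id": 1002, "event_timestamp": datetime(2021, 8, 21), "conv_rate": 0.5},
]
store = ToyOnlineStore()
store.materialize_incremental(rows, end=datetime(2021, 8, 23))  # first run: all rows up to `end`

rows.append({"driver_id": 1001, "event_timestamp": datetime(2021, 8, 24), "conv_rate": 0.3})
store.materialize_incremental(rows, end=datetime(2021, 8, 25))  # second run: only the new row

print(store.read(1001), store.read(1002))
```

Tracking the end of the previous run is what makes the second call cheap: rows already covered by the first window are skipped instead of being rewritten.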
279279

280280
{% tabs %}
281281
{% tab title="Bash" %}
@@ -300,7 +300,7 @@ driver_hourly_stats from 2021-08-22 16:25:47+00:00 to 2021-08-23 16:25:46+00:00:
300300

301301
## Step 6: Fetching feature vectors for inference
302302

303-
At inference time, we need to quickly read the latest feature values for different drivers \(which otherwise might have existed only in batch sources\) from the online feature store using `get_online_features()`. These feature vectors can then be fed to the model.
303+
At inference time, we need to quickly read the latest feature values for different drivers (which otherwise might have existed only in batch sources) from the online feature store using `get_online_features()`. These feature vectors can then be fed to the model.
304304

305305
{% tabs %}
306306
{% tab title="Python" %}
@@ -346,5 +346,4 @@ pprint(feature_vector)
346346
* Read the [Architecture](architecture-and-components/) page.
347347
* Check out our [Tutorials](../tutorials/tutorials-overview.md) section for more examples on how to use Feast.
348348
* Follow our [Running Feast with GCP/AWS](../how-to-guides/feast-gcp-aws/) guide for a more in-depth tutorial on using Feast.
349-
* Join other Feast users and contributors in [Slack](https://slack.feast.dev/) and become part of the community!
350-
349+
* Join other Feast users and contributors in [Slack](https://slack.feast.dev) and become part of the community!
