
Commit a7da45e

woop authored and gitbook-bot committed
GitBook: [master] 80 pages modified
1 parent c50a36e commit a7da45e

29 files changed

+668 -150 lines changed

docs/SUMMARY.md

Lines changed: 23 additions & 5 deletions
Original file line number | Diff line number | Diff line change
@@ -15,15 +15,33 @@
1515

1616
## Concepts
1717

18-
* [Data model and concepts](concepts/data-model-and-concepts.md)
19-
* [Architecture and components](concepts/architecture-and-components.md)
18+
* [Overview](concepts/overview.md)
19+
* [Feature view](concepts/feature-view.md)
20+
* [Data model](concepts/data-model-and-concepts.md)
21+
* [Online Store](concepts/online-store.md)
22+
* [Offline Store](concepts/offline-store.md)
23+
* [Provider](concepts/provider.md)
24+
* [Architecture](concepts/architecture-and-components.md)
2025

2126
## Reference
2227

23-
* [Feature repository](reference/feature-repository.md)
24-
* [feature\_store.yaml](reference/feature-store-yaml.md)
28+
* [Data Sources](reference/data-sources/README.md)
29+
* [BigQuery](reference/data-sources/bigquery.md)
30+
* [File](reference/data-sources/file.md)
31+
* [Online stores](reference/online-stores/README.md)
32+
* [SQLite](reference/online-stores/sqlite.md)
33+
* [Redis](reference/online-stores/redis.md)
34+
* [Datastore](reference/online-stores/datastore.md)
35+
* [Offline stores](reference/offline-stores/README.md)
36+
* [File](reference/offline-stores/file.md)
37+
* [BigQuery](reference/offline-stores/untitled.md)
38+
* [Providers](reference/providers/README.md)
39+
* [Local](reference/providers/local.md)
40+
* [Google Cloud Platform](reference/providers/google-cloud-platform.md)
2541
* [Feast CLI reference](reference/feast-cli-commands.md)
26-
* [.feastignore](reference/feast-ignore.md)
42+
* [Feature repository](reference/feature-repository/README.md)
43+
* [feature\_store.yaml](reference/feature-repository/feature-store-yaml.md)
44+
* [.feastignore](reference/feature-repository/feast-ignore.md)
2745
* [Python API reference](http://rtd.feast.dev/)
2846
* [Telemetry](reference/telemetry.md)
2947

docs/concepts/architecture-and-components.md

Lines changed: 4 additions & 4 deletions
@@ -1,8 +1,8 @@
1-
# Architecture and components
1+
# Architecture
22

3-
![Feast 0.10 Architecture Diagram](../.gitbook/assets/image%20%284%29.png)
3+
![Feast Architecture Diagram](../.gitbook/assets/image%20%284%29.png)
44

5-
### Functionality
5+
#### Functionality
66

77
* **Create Batch Features:** ELT/ETL systems like Spark and SQL are used to transform data in the batch store.
88
* **Feast Apply:** The user \(or CI\) publishes version-controlled feature definitions using `feast apply`. This CLI command updates infrastructure and persists definitions in the object store registry.
@@ -13,7 +13,7 @@
1313
* **Prediction:** A backend system makes a request for a prediction from the model serving service.
1414
* **Get Online Features:** The model serving service makes a request to the Feast Online Serving service for online features using a Feast SDK.
1515

16-
### Components
16+
#### Components
1717

1818
A complete Feast deployment contains the following components:
1919

docs/concepts/data-model-and-concepts.md

Lines changed: 1 addition & 123 deletions
@@ -1,87 +1,4 @@
1-
# Data model and concepts
2-
3-
### Concepts
4-
5-
The top-level namespace within Feast is a [project](data-model-and-concepts.md#project). Users define one or more [feature views](data-model-and-concepts.md#feature-view) within a project. Each feature view contains one or more [features](data-model-and-concepts.md#feature) that relate to a specific [entity](data-model-and-concepts.md#entity). A feature view must always have a [data source](data-model-and-concepts.md#data-source). This source is used during the generation of training [datasets](data-model-and-concepts.md#dataset) and when materializing feature values into the online store.
6-
7-
![](../.gitbook/assets/image%20%287%29.png)
8-
9-
### Project
10-
11-
Projects provide complete isolation of feature stores at the infrastructure level. This is accomplished through resource namespacing, e.g., prefixing table names with the associated project. Each project should be considered a completely separate universe of entities and features. It is not possible to retrieve features from multiple projects in a single request. We recommend having a single feature store and a single project per environment \(`dev`, `staging`, `prod`\).
12-
13-
{% hint style="info" %}
14-
Projects are currently being supported for backward compatibility reasons. The concept and functionality provided by Projects may change in the future as we simplify the Feast API.
15-
{% endhint %}
16-
17-
### Data Source
18-
19-
Feast uses a time-series data model to represent data. This data model is used to interpret feature data in data sources in order to build training datasets or when materializing features into an online store.
20-
21-
Below is an example data source with a single entity \(`driver`\) and two features \(`trips_today`, and `rating`\).
22-
23-
![Ride-hailing data source](../.gitbook/assets/image%20%2816%29.png)
24-
25-
### Entity
26-
27-
An entity is a collection of semantically related features. Users define entities to map to the domain of their use case. For example, a ride-hailing service could have customers and drivers as their entities, which group related features that correspond to these customers and drivers.
28-
29-
```python
30-
driver = Entity(name='driver', value_type=ValueType.STRING, join_key='driver_id')
31-
```
32-
33-
Entities are defined as part of feature views. Entities are used to identify the primary key on which feature values should be stored and retrieved. These keys are used during the lookup of feature values from the online store and the join process in point-in-time joins. It is possible to define composite entities \(more than one entity object\) in a feature view.
34-
35-
Entities should be reused across feature views.
36-
37-
### Feature
38-
39-
A feature is an individual measurable property observed on an entity. For example, a feature of a `customer` entity could be the number of transactions they have made on an average month.
40-
41-
Features are defined as part of feature views. Since Feast does not transform data, a feature is essentially a schema that only contains a name and a type:
42-
43-
```python
44-
trips_today = Feature(
45-
name="trips_today",
46-
dtype=ValueType.FLOAT
47-
)
48-
```
49-
50-
Together with [data sources](data-model-and-concepts.md#data-source), they indicate to Feast where to find your feature values, e.g., in a specific parquet file or BigQuery table. Feature definitions are also used when reading features from the feature store, using [feature references](data-model-and-concepts.md#feature-references).
51-
52-
Feature names must be unique within a [feature view](data-model-and-concepts.md#feature-view).
53-
54-
### Feature View
55-
56-
A feature view is an object that represents a logical group of time-series feature data as it is found in a data source. Feature views consist of one or more entities, features, and a data source. Feature views allow Feast to model your existing feature data in a consistent way in both an offline \(training\) and online \(serving\) environment.
57-
58-
{% tabs %}
59-
{% tab title="driver\_trips\_feature\_view.py" %}
60-
```python
61-
driver_stats_fv = FeatureView(
62-
name="driver_activity",
63-
entities=["driver"],
64-
features=[
65-
Feature(name="trips_today", dtype=ValueType.INT64),
66-
Feature(name="rating", dtype=ValueType.FLOAT),
67-
],
68-
input=BigQuerySource(
69-
table_ref="feast-oss.demo_data.driver_activity"
70-
)
71-
)
72-
```
73-
{% endtab %}
74-
{% endtabs %}
75-
76-
Feature views are used during
77-
78-
* The generation of training datasets by querying the data source of feature views in order to find historical feature values. A single training dataset may consist of features from multiple feature views.
79-
* Loading of feature values into an online store. Feature views determine the storage schema in the online store.
80-
* Retrieval of features from the online store. Feature views provide the schema definition to Feast in order to look up features from the online store.
81-
82-
{% hint style="info" %}
83-
Feast does not generate feature values. It acts as the ingestion and serving system. The data sources described within feature views should reference feature values in their already computed form.
84-
{% endhint %}
1+
# Data model
852

863
### Dataset
874

@@ -147,42 +64,3 @@ Example of an entity dataframe with feature values joined to it:
14764

14865
![](../.gitbook/assets/image%20%2817%29.png)
14966

150-
### **Online Store**
151-
152-
The Feast online store is used for low-latency online feature value lookups. Feature values are loaded into the online store from data sources in feature views using the `materialize` command.
153-
154-
The storage schema of features within the online store mirrors that of the data source used to populate the online store. One key difference between the online store and data sources is that only the latest feature values are stored per entity key. No historical values are stored.
155-
156-
Example batch data source
157-
158-
![](../.gitbook/assets/image%20%286%29.png)
159-
160-
Once the above data source is materialized into Feast \(using `feast materialize`\), the feature values will be stored as follows:
161-
162-
![](../.gitbook/assets/image%20%285%29.png)
163-
164-
### Offline Store
165-
166-
An offline store is a storage and compute system where historic feature data can be stored or accessed for building training datasets or for sourcing data for materialization into the online store.
167-
168-
Offline stores are used primarily for two reasons
169-
170-
1. Building training datasets
171-
2. Querying data sources for feature data in order to load these features into your online store
172-
173-
Feast does not actively manage your offline store. Instead, you are asked to select an offline store \(like `BigQuery` or the `File` offline store\) and then to introduce batch sources from these stores using [data sources](data-model-and-concepts.md#data-source) inside feature views.
174-
175-
Feast will use your offline store to query these sources. It is not possible to query all data sources from all offline stores, and only a single offline store can be used at a time. For example, it is not possible to query a BigQuery table from a `File` offline store, nor is it possible for a `BigQuery` offline store to query files in your local file system.
176-
177-
Please see [feature\_store.yaml](../reference/feature-store-yaml.md#overview) for configuring your offline store.
178-
179-
### **Provider**
180-
181-
A provider is an implementation of a feature store using specific feature store components targeting a specific environment**.** More specifically, a provider is the target environment to which you have configured your feature store to deploy and run.
182-
183-
Providers are built to orchestrate various components \(offline store, online store, infrastructure, compute\) inside an environment. For example, the `gcp` provider may only support `BigQuery` as an offline store and `datastore` as the online store, but it ensures that these components can work together seamlessly.
184-
185-
Providers also come with default configurations which makes it easier for users to start a feature store in a specific environment.
186-
187-
Please see [feature\_store.yaml](../reference/feature-store-yaml.md#overview) for configuring a provider.
188-

docs/concepts/feature-view.md

Lines changed: 71 additions & 0 deletions
@@ -0,0 +1,71 @@
1+
# Feature view
2+
3+
### Feature View
4+
5+
A feature view is an object that represents a logical group of time-series feature data as it is found in a [data source](feature-view.md#data-source). Feature views consist of one or more [entities](feature-view.md#entity), [features](feature-view.md#feature), and a [data source](feature-view.md#data-source). Feature views allow Feast to model your existing feature data in a consistent way in both an offline \(training\) and online \(serving\) environment.
6+
7+
{% tabs %}
8+
{% tab title="driver\_trips\_feature\_view.py" %}
9+
```python
driver_stats_fv = FeatureView(
    name="driver_activity",
    entities=["driver"],
    features=[
        Feature(name="trips_today", dtype=ValueType.INT64),
        Feature(name="rating", dtype=ValueType.FLOAT),
    ],
    input=BigQuerySource(
        table_ref="feast-oss.demo_data.driver_activity"
    )
)
```
22+
{% endtab %}
23+
{% endtabs %}
24+
25+
Feature views are used during:
26+
27+
* The generation of training datasets by querying the data source of feature views in order to find historical feature values. A single training dataset may consist of features from multiple feature views.
28+
* Loading of feature values into an online store. Feature views determine the storage schema in the online store.
29+
* Retrieval of features from the online store. Feature views provide the schema definition to Feast in order to look up features from the online store.
30+
31+
{% hint style="info" %}
32+
Feast does not generate feature values. It acts as the ingestion and serving system. The data sources described within feature views should reference feature values in their already computed form.
33+
{% endhint %}
34+
35+
### Data Source
36+
37+
Feast uses a time-series data model to represent data. This data model is used to interpret feature data in data sources in order to build training datasets or when materializing features into an online store.
38+
39+
Below is an example data source with a single entity \(`driver`\) and two features \(`trips_today` and `rating`\).
40+
41+
![Ride-hailing data source](../.gitbook/assets/image%20%2816%29.png)
42+
43+
### Entity
44+
45+
An entity is a collection of semantically related features. Users define entities to map to the domain of their use case. For example, a ride-hailing service could have customers and drivers as their entities, which group related features that correspond to these customers and drivers.
46+
47+
```python
driver = Entity(name='driver', value_type=ValueType.STRING, join_key='driver_id')
```
50+
51+
Entities are defined as part of feature views. Entities are used to identify the primary key on which feature values should be stored and retrieved. These keys are used during the lookup of feature values from the online store and the join process in point-in-time joins. It is possible to define composite entities \(more than one entity object\) in a feature view.
52+
53+
Entities should be reused across feature views.
54+
55+
### Feature
56+
57+
A feature is an individual measurable property observed on an entity. For example, a feature of a `customer` entity could be the number of transactions they make in an average month.
58+
59+
Features are defined as part of feature views. Since Feast does not transform data, a feature is essentially a schema that only contains a name and a type:
60+
61+
```python
# dtype matches the INT64 declared for trips_today in the feature view example
trips_today = Feature(
    name="trips_today",
    dtype=ValueType.INT64
)
```
67+
68+
Together with [data sources](feature-view.md#data-source), feature definitions indicate to Feast where to find your feature values, e.g., in a specific parquet file or BigQuery table. Feature definitions are also used when reading features from the feature store, using [feature references](data-model-and-concepts.md#feature-references).
69+
70+
Feature names must be unique within a [feature view](feature-view.md#feature-view).
71+
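The role of entity keys in point-in-time joins, mentioned in the Entity section above, can be illustrated with a small standalone sketch. This is not Feast's implementation: the row layout, the `point_in_time_lookup` helper, and the sample values are all hypothetical, and details such as a feature view's time-to-live are ignored.

```python
from datetime import datetime

# Hypothetical feature rows as they might sit in a data source:
# (driver_id, event_timestamp, trips_today)
feature_rows = [
    ("d1", datetime(2021, 4, 1, 10), 3),
    ("d1", datetime(2021, 4, 1, 12), 5),
    ("d2", datetime(2021, 4, 1, 11), 7),
]


def point_in_time_lookup(rows, entity_key, as_of):
    """Return the latest value for entity_key observed at or before as_of."""
    candidates = [
        (ts, value)
        for key, ts, value in rows
        if key == entity_key and ts <= as_of
    ]
    if not candidates:
        return None  # no feature value was known yet at that time
    return max(candidates, key=lambda c: c[0])[1]


# Stand-in for an entity dataframe: (entity key, timestamp of the training event)
entity_rows = [
    ("d1", datetime(2021, 4, 1, 11)),
    ("d2", datetime(2021, 4, 1, 10)),
]

# Join each training event to the feature value that was valid at that moment.
training_rows = [
    (key, ts, point_in_time_lookup(feature_rows, key, ts))
    for key, ts in entity_rows
]
```

The lookup never reads a value recorded after the event timestamp, which is what keeps future information out of a training dataset.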

docs/concepts/offline-store.md

Lines changed: 15 additions & 0 deletions
@@ -0,0 +1,15 @@
1+
# Offline Store
2+
3+
An offline store is a storage and compute system where historic feature data can be stored or accessed for building training datasets or for sourcing data for materialization into the online store.
4+
5+
Offline stores are used primarily for two purposes:
6+
7+
1. Building training datasets
8+
2. Querying data sources for feature data in order to load these features into your online store
9+
10+
Feast does not actively manage your offline store. Instead, you select an offline store \(like `BigQuery` or the `File` offline store\) and then introduce batch sources from that store using [data sources](feature-view.md#data-source) inside feature views.
11+
12+
Feast will use your offline store to query these sources. It is not possible to query all data sources from all offline stores, and only a single offline store can be used at a time. For example, it is not possible to query a BigQuery table from a `File` offline store, nor is it possible for a `BigQuery` offline store to query files in your local file system.
13+
14+
Please see [feature\_store.yaml](../reference/feature-repository/feature-store-yaml.md#overview) for configuring your offline store.
15+
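The constraint described above, that one offline store is active at a time and can only query sources of its own kind, can be sketched as a simple validation step. This is an illustration only: `SUPPORTED_SOURCES` and `check_sources` are hypothetical names, and the mapping reflects just the two examples given in the text.

```python
# Hypothetical mapping from offline store type to the data source types it can query.
SUPPORTED_SOURCES = {
    "file": {"FileSource"},
    "bigquery": {"BigQuerySource"},
}


def check_sources(offline_store, source_types):
    """Raise ValueError if any data source cannot be queried by the offline store."""
    supported = SUPPORTED_SOURCES[offline_store]
    unsupported = sorted(set(source_types) - supported)
    if unsupported:
        raise ValueError(
            f"{offline_store!r} offline store cannot query: {', '.join(unsupported)}"
        )


# A File offline store with file-based sources passes silently.
check_sources("file", ["FileSource"])
```

Calling `check_sources("file", ["BigQuerySource"])` raises a `ValueError` in this sketch, mirroring the BigQuery-from-`File` example above.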

docs/concepts/online-store.md

Lines changed: 16 additions & 0 deletions
@@ -0,0 +1,16 @@
1+
# Online Store
2+
3+
The Feast online store is used for low-latency online feature value lookups. Feature values are loaded into the online store from data sources in feature views using the `materialize` command.
4+
5+
The storage schema of features within the online store mirrors that of the data source used to populate the online store. One key difference between the online store and data sources is that only the latest feature values are stored per entity key. No historical values are stored.
6+
7+
Example batch data source
8+
9+
![](../.gitbook/assets/image%20%286%29.png)
10+
11+
Once the above data source is materialized into Feast \(using `feast materialize`\), the feature values will be stored as follows:
12+
13+
![](../.gitbook/assets/image%20%285%29.png)
14+
15+
###
16+
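The "latest value per entity key" behavior described above can be sketched in plain Python. This is not what `feast materialize` runs internally: the `materialize_latest` helper, the row layout, and the sample values are assumptions made for illustration.

```python
from datetime import datetime


def materialize_latest(rows, key_field="driver_id", ts_field="event_timestamp"):
    """Collapse time-series rows into one row per entity key, keeping the newest."""
    online = {}
    for row in rows:
        current = online.get(row[key_field])
        if current is None or row[ts_field] > current[ts_field]:
            online[row[key_field]] = row
    return online


# Hypothetical batch data source with two observations for driver d1.
batch_rows = [
    {"driver_id": "d1", "event_timestamp": datetime(2021, 4, 1), "trips_today": 3},
    {"driver_id": "d1", "event_timestamp": datetime(2021, 4, 2), "trips_today": 8},
    {"driver_id": "d2", "event_timestamp": datetime(2021, 4, 1), "trips_today": 5},
]

online_store = materialize_latest(batch_rows)
```

After this runs, `online_store` holds a single row per driver and the older observation for `d1` is gone, which is the key difference from the batch source.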

docs/concepts/overview.md

Lines changed: 8 additions & 13 deletions
@@ -1,21 +1,16 @@
11
# Overview
22

3-
### Concepts
3+
The top-level namespace within Feast is a [project](overview.md#project). Users define one or more [feature views](feature-view.md#feature-view) within a project. Each feature view contains one or more [features](feature-view.md#feature) that relate to a specific [entity](feature-view.md#entity). A feature view must always have a [data source](feature-view.md#data-source), which in turn is used during the generation of training [datasets](data-model-and-concepts.md#dataset) and when materializing feature values into the online store.
44

5-
[Entities](entities.md) are objects in an organization like customers, transactions, and drivers, products, etc.
5+
![](../.gitbook/assets/image%20%287%29.png)
66

7-
[Sources](sources.md) are external sources of data where feature data can be found.
7+
### Project
88

9-
[Feature Tables](feature-tables.md) are objects that define logical groupings of features, data sources, and other related metadata.
9+
Projects provide complete isolation of feature stores at the infrastructure level. This is accomplished through resource namespacing, e.g., prefixing table names with the associated project. Each project should be considered a completely separate universe of entities and features. It is not possible to retrieve features from multiple projects in a single request. We recommend having a single feature store and a single project per environment \(`dev`, `staging`, `prod`\).
1010

11-
### Concept Hierarchy
11+
{% hint style="info" %}
12+
Projects are currently supported for backward compatibility. Projects may change in the future as we simplify the Feast API.
13+
{% endhint %}
1214

13-
![](../.gitbook/assets/image%20%284%29%20%282%29%20%282%29%20%282%29%20%282%29%20%282%29%20%282%29%20%282%29%20%281%29.png)
14-
15-
Feast contains the following core concepts:
16-
17-
* **Projects:** Serve as a top level namespace for all Feast resources. Each project is a completely independent environment in Feast. Users can only work in a single project at a time.
18-
* **Entities:** Entities are the objects in an organization on which features occur. They map to your business domain \(users, products, transactions, locations\).
19-
* **Feature Tables:** Defines a group of features that occur on a specific entity.
20-
* **Features:** Individual feature within a feature table.
15+
###
2116
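The resource namespacing described in the Project section above can be pictured with a short sketch. The naming scheme here is an assumption for illustration; the source only states that table names are prefixed with the project, not the exact format.

```python
def namespaced_table(project, feature_view):
    """Build a project-scoped table name (the underscore scheme is hypothetical)."""
    return f"{project}_{feature_view}"


# The same feature view name maps to a distinct table in each environment.
tables = {
    project: namespaced_table(project, "driver_activity")
    for project in ("dev", "staging", "prod")
}
```

Because every table name carries its project prefix, the `dev`, `staging`, and `prod` feature stores never share storage in this sketch.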

docs/concepts/provider.md

Lines changed: 10 additions & 0 deletions
@@ -0,0 +1,10 @@
1+
# Provider
2+
3+
A provider is an implementation of a feature store using specific feature store components targeting a specific environment. More specifically, a provider is the target environment to which you have configured your feature store to deploy and run.
4+
5+
Providers are built to orchestrate various components \(offline store, online store, infrastructure, compute\) inside an environment. For example, the `gcp` provider supports [BigQuery](https://cloud.google.com/bigquery) as an offline store and [Datastore](https://cloud.google.com/datastore) as an online store, ensuring that these components can work together seamlessly.
6+
7+
Providers also come with default configurations, which make it easier for users to start a feature store in a specific environment.
8+
9+
Please see [feature\_store.yaml](../reference/feature-repository/feature-store-yaml.md#overview) for configuring providers.
10+
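The idea of provider defaults described above can be sketched as a lookup that users may override. This is not Feast's configuration code: `PROVIDER_DEFAULTS` and `resolve_config` are hypothetical names, the `gcp` pairing follows the text above, and the `local` pairing is an assumption.

```python
# Hypothetical default components per provider.
PROVIDER_DEFAULTS = {
    "gcp": {"offline_store": "bigquery", "online_store": "datastore"},
    "local": {"offline_store": "file", "online_store": "sqlite"},  # assumed pairing
}


def resolve_config(provider, overrides=None):
    """Merge a provider's default components with any user-supplied overrides."""
    config = {"provider": provider, **PROVIDER_DEFAULTS[provider]}
    config.update(overrides or {})
    return config


gcp_config = resolve_config("gcp")
custom_config = resolve_config("local", {"online_store": "redis"})
```

Starting from defaults and applying overrides last means a user only has to name the provider to get a working combination of components.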
