Feast Python API Documentation

Client

class feast.client.Client(options: Optional[Dict[str, str]] = None, **kwargs)[source]

Feast Client: Used for creating, managing, and retrieving features.

apply(feature_sets: Union[List[feast.feature_set.FeatureSet], feast.feature_set.FeatureSet])[source]

Idempotently registers feature set(s) with Feast Core. Either a single feature set or a list can be provided.

Parameters

feature_sets – List of feature sets that will be registered

archive_project(project)[source]

Archives a project. Project will still continue to function for ingestion and retrieval, but will be in a read-only state. It will also not be visible from the Core API for management purposes.

Parameters

project – Name of project to archive

property core_secure

Retrieve Feast Core client-side SSL/TLS setting

Returns

Whether client-side SSL/TLS is enabled

property core_url

Retrieve Feast Core URL

Returns

Feast Core URL string

create_project(project: str)[source]

Creates a Feast project

Parameters

project – Name of project

get_batch_features(feature_refs: List[str], entity_rows: Union[pandas.core.frame.DataFrame, str], project: str = None) → feast.job.RetrievalJob[source]

Retrieves historical features from a Feast Serving deployment.

Parameters
  • feature_refs – List of feature references that will be returned for each entity. Each feature reference should have the following format: “feature_set:feature”, where “feature_set” and “feature” refer to the feature set and feature names respectively. Only the feature name is required.

  • entity_rows (Union[pd.DataFrame, str]) – Pandas dataframe containing entities and a ‘datetime’ column. Each entity in a feature set must be present as a column in this dataframe. The datetime column must contain timestamps in datetime64 format.

  • project – Specifies the project that contains the feature sets to which the requested features belong.

Returns

Returns a retrieval job object that can be used to monitor retrieval progress asynchronously and to materialize the results.

Return type

feast.job.RetrievalJob

Examples

>>> from datetime import datetime
>>>
>>> import pandas as pd
>>> from feast import Client
>>>
>>> feast_client = Client(core_url="localhost:6565", serving_url="localhost:6566")
>>> feature_refs = ["my_project/bookings_7d", "booking_14d"]
>>> # One row per entity, plus the required "datetime" column.
>>> entity_rows = pd.DataFrame(
...     {
...         "datetime": [datetime.now() for _ in range(3)],
...         "customer": [1001, 1002, 1003],
...     }
... )
>>> # The keyword is "project", matching the method signature.
>>> feature_retrieval_job = feast_client.get_batch_features(
...     feature_refs, entity_rows, project="my_project")
>>> # Blocks until the retrieval job is done.
>>> df = feature_retrieval_job.to_dataframe()
>>> print(df)

get_feature_set(name: str, project: str = None) → Optional[feast.feature_set.FeatureSet][source]

Retrieves a feature set.

Parameters
  • project – Feast project that this feature set belongs to

  • name – Name of feature set

Returns

Returns either the specified feature set, or raises an exception if none is found

get_online_features(feature_refs: List[str], entity_rows: List[feast.serving.ServingService_pb2.EntityRow], project: Optional[str] = None) → feast.serving.ServingService_pb2.GetOnlineFeaturesResponse[source]

Retrieves the latest online feature data from Feast Serving

Parameters
  • feature_refs – List of feature references that will be returned for each entity. Each feature reference should have the following format: “feature_set:feature”, where “feature_set” and “feature” refer to the feature set and feature names respectively. Only the feature name is required.

  • entity_rows – List of GetFeaturesRequest.EntityRow where each row contains entities. Timestamp should not be set for online retrieval. All entity types within a feature

  • project – Specifies the project that contains the feature sets to which the requested features belong.

Returns

Returns a list of maps where each item in the list contains the latest feature values for the provided entities

ingest(feature_set: Union[str, feast.feature_set.FeatureSet], source: Union[pandas.core.frame.DataFrame, str], chunk_size: int = 10000, max_workers: int = 7, disable_progress_bar: bool = False, timeout: int = 120) → str[source]

Loads feature data into Feast for a specific feature set.

Parameters
  • feature_set (typing.Union[str, feast.feature_set.FeatureSet]) – Feature set object or the string name of the feature set

  • source (typing.Union[pd.DataFrame, str]) –

    Either a file path or a Pandas DataFrame to ingest into Feast. File formats that are currently supported:

    • parquet

    • csv

    • json

  • chunk_size (int) – Number of rows to load and ingest at a time.

  • max_workers (int) – Number of worker processes to use to encode values.

  • disable_progress_bar (bool) – Disable printing of progress statistics.

  • timeout (int) – Timeout in seconds to wait for completion.

Returns

ingestion id for this dataset

Return type

str
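
The effect of chunk_size can be pictured with a small standalone sketch. This is not Feast's implementation; iter_chunks is a hypothetical helper that only shows how a set of rows would be split into batches of at most chunk_size rows before each batch is processed.

```python
from typing import Iterator, List, Sequence


def iter_chunks(rows: Sequence[dict], chunk_size: int = 10000) -> Iterator[List[dict]]:
    """Yield consecutive batches of at most chunk_size rows (hypothetical helper)."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    for start in range(0, len(rows), chunk_size):
        yield list(rows[start:start + chunk_size])


# 25 rows split into batches of 10: the last batch holds the remainder.
rows = [{"customer": 1000 + i} for i in range(25)]
sizes = [len(chunk) for chunk in iter_chunks(rows, chunk_size=10)]
print(sizes)  # [10, 10, 5]
```

A smaller chunk_size lowers peak memory per batch at the cost of more batches.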

list_entities() → Dict[str, feast.entity.Entity][source]

Returns a dictionary of entities across all feature sets

Returns

Dictionary of entities, indexed by name

list_feature_sets(project: str = None, name: str = None) → List[feast.feature_set.FeatureSet][source]

Retrieve a list of feature sets from Feast Core

Parameters
  • project – Filter feature sets based on project name

  • name – Filter feature sets based on feature set name

Returns

List of feature sets

list_ingest_jobs(job_id: str = None, feature_set_ref: feast.feature_set.FeatureSetRef = None, store_name: str = None)[source]

List the ingestion jobs currently registered in Feast, with optional filters. Provides detailed metadata about each ingestion job.

Parameters
  • job_id – Select specific ingestion job with the given job_id

  • feature_set_ref – Filter ingestion jobs by target feature set (via reference)

  • store_name – Filter ingestion jobs by target feast store’s name

Returns

List of IngestJobs matching the given filters

list_projects() → List[str][source]

List all active Feast projects

Returns

List of project names

property project

Retrieve currently active project

Returns

Project name

restart_ingest_job(job: feast.job.IngestJob)[source]

Restart an ingestion job currently registered in Feast. NOTE: Data might be lost during the restart for some job runners. Does not support restarting a job in a transitional state (i.e. pending, suspending, aborting), a terminal state (i.e. suspended or aborted), or an unknown status.

Parameters

job – IngestJob to restart

property serving_secure

Retrieve Feast Serving client-side SSL/TLS setting

Returns

Whether client-side SSL/TLS is enabled

property serving_url

Retrieve Feast Serving URL

Returns

Feast Serving URL string

set_project(project: Optional[str] = None)[source]

Set currently active Feast project

Parameters

project – Project to set as active. If unset, will reset to the default project.

stop_ingest_job(job: feast.job.IngestJob)[source]

Stop an ingestion job currently registered in Feast. Does nothing if the target job is already in a terminal state (i.e. suspended or aborted). Does not support stopping a job in a transitional state (i.e. pending, suspending, aborting) or in an unknown status.

Parameters

job – IngestJob to stop
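
The status rules described for stop_ingest_job can be summarized in a short sketch. The JobStatus enum below is a hypothetical stand-in built from the states named in the description; RUNNING is an added assumption for a state in which stopping is allowed, and the real IngestionJobStatus values may differ.

```python
from enum import Enum


class JobStatus(Enum):
    # Hypothetical stand-in for the real job status enum.
    UNKNOWN = 0
    PENDING = 1
    RUNNING = 2  # assumption: not named in the description above
    SUSPENDING = 3
    SUSPENDED = 4
    ABORTING = 5
    ABORTED = 6


TRANSITIONAL = {JobStatus.PENDING, JobStatus.SUSPENDING, JobStatus.ABORTING}
TERMINAL = {JobStatus.SUSPENDED, JobStatus.ABORTED}


def stop_action(status: JobStatus) -> str:
    """Return what a stop request would do for a job in the given status."""
    if status in TERMINAL:
        return "noop"  # already stopped, nothing to do
    if status in TRANSITIONAL or status is JobStatus.UNKNOWN:
        raise ValueError(f"Cannot stop a job in status {status.name}")
    return "stop"


print(stop_action(JobStatus.RUNNING))
print(stop_action(JobStatus.ABORTED))
```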

version()[source]

Returns version information from Feast Core and Feast Serving

Feature Set

class feast.feature_set.FeatureSet(name: str, project: str = None, features: List[feast.feature.Feature] = None, entities: List[feast.entity.Entity] = None, source: feast.source.Source = None, max_age: Optional[google.protobuf.duration_pb2.Duration] = None)[source]

Represents a collection of features and associated metadata.

add(resource)[source]

Adds a resource (Feature, Entity) to this Feature Set. Does not register the updated Feature Set with Feast Core

Parameters

resource – A resource can be either a Feature or an Entity object

property created_timestamp

Returns the created_timestamp of this feature set

drop(name: str)[source]

Removes a Feature or Entity from a Feature Set. This does not apply any changes to Feast Core until the apply() method is called.

Parameters

name – Name of Feature or Entity to be removed

property entities

Returns list of entities from this feature set

export_tfx_schema() → tensorflow_metadata.proto.v0.schema_pb2.Schema[source]

Create a Tensorflow metadata schema from a FeatureSet.

Returns

Tensorflow metadata schema.

property features

Returns a list of features from this feature set

property fields

Returns a dict of fields from this feature set

classmethod from_dict(fs_dict)[source]

Creates a feature set from a dict

Parameters

fs_dict – A dict representation of a feature set

Returns

Returns a FeatureSet object based on the feature set dict

classmethod from_proto(feature_set_proto: feast.core.FeatureSet_pb2.FeatureSet)[source]

Creates a feature set from a protobuf representation of a feature set

Parameters

feature_set_proto – A protobuf representation of a feature set

Returns

Returns a FeatureSet object based on the feature set protobuf

classmethod from_yaml(yml: str)[source]

Creates a feature set from a YAML string body or a file path

Parameters

yml – Either a file path containing a yaml file or a YAML string

Returns

Returns a FeatureSet object based on the YAML file

get_kafka_source_brokers() → str[source]

Get the broker list for the source in this feature set

get_kafka_source_topic() → str[source]

Get the topic that this feature set has been configured to use as source

import_tfx_schema(schema: tensorflow_metadata.proto.v0.schema_pb2.Schema)[source]

Updates presence_constraints, shape_type and domain_info for all fields (features and entities) in the FeatureSet from schema in the Tensorflow metadata.

Parameters

schema – Schema from Tensorflow metadata

Returns

None

infer_fields_from_df(df: pandas.core.frame.DataFrame, entities: Optional[List[feast.entity.Entity]] = None, features: Optional[List[feast.feature.Feature]] = None, replace_existing_features: bool = False, replace_existing_entities: bool = False, discard_unused_fields: bool = False, rows_to_sample: int = 100)[source]

Adds fields (Features or Entities) to a feature set based on the schema of a DataFrame. Only Pandas dataframes are supported. All columns are detected as features, so setting at least one entity manually is advised.

Parameters
  • df – Pandas dataframe to read schema from

  • entities – List of entities that will be set manually and not inferred. These will take precedence over any existing entities or entities found in the dataframe.

  • features – List of features that will be set manually and not inferred. These will take precedence over any existing feature or features found in the dataframe.

  • replace_existing_features – If true, will replace existing features in this feature set with features found in dataframe. If false, will skip conflicting features.

  • replace_existing_entities – If true, will replace existing entities in this feature set with entities found in dataframe. If false, will skip conflicting entities.

  • discard_unused_fields – Boolean flag. Setting this to True will discard any existing fields that are not found in the dataset or provided by the user

  • rows_to_sample – Number of rows to sample to infer types. All rows must have consistent types, even values within list types must be homogeneous
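
How the replace and discard flags interact may be easier to see in a standalone sketch. The function below is hypothetical and works on plain dicts of field name to type name; it is one reading of the parameter descriptions above, not Feast's implementation.

```python
from typing import Dict


def merge_fields(existing: Dict[str, str], inferred: Dict[str, str],
                 manual: Dict[str, str], replace_existing: bool = False,
                 discard_unused: bool = False) -> Dict[str, str]:
    """Combine existing, inferred and manually supplied fields (hypothetical)."""
    if discard_unused:
        # Keep only existing fields that the data or the user still mentions.
        merged = {n: t for n, t in existing.items() if n in inferred or n in manual}
    else:
        merged = dict(existing)
    for name, dtype in inferred.items():
        if name in merged and not replace_existing:
            continue  # conflicting field is skipped
        merged[name] = dtype
    # Manually supplied fields take precedence over everything else.
    merged.update(manual)
    return merged


existing = {"a": "INT64", "old": "STRING"}
inferred = {"a": "DOUBLE", "b": "STRING"}
manual = {"c": "INT64"}
print(merge_fields(existing, inferred, manual))
print(merge_fields(existing, inferred, manual, replace_existing=True))
print(merge_fields(existing, inferred, manual, discard_unused=True))
```

With the defaults, the conflicting field "a" keeps its existing type; replace_existing=True switches it to the inferred type, and discard_unused=True drops "old" because neither the data nor the user mentions it.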

infer_fields_from_pa(table: pyarrow.lib.Table, entities: Optional[List[feast.entity.Entity]] = None, features: Optional[List[feast.feature.Feature]] = None, replace_existing_features: bool = False, replace_existing_entities: bool = False, discard_unused_fields: bool = False) → None[source]

Adds fields (Features or Entities) to a feature set based on the schema of a PyArrow table. Only PyArrow tables are supported. All columns are detected as features, so setting at least one entity manually is advised.

Parameters
  • table (pyarrow.lib.Table) – PyArrow table to read schema from.

  • entities (Optional[List[Entity]]) – List of entities that will be set manually and not inferred. These will take precedence over any existing entities or entities found in the PyArrow table.

  • features (Optional[List[Feature]]) – List of features that will be set manually and not inferred. These will take precedence over any existing feature or features found in the PyArrow table.

  • replace_existing_features (bool) – Boolean flag. If true, will replace existing features in this feature set with features found in the PyArrow table. If false, will skip conflicting features.

  • replace_existing_entities (bool) – Boolean flag. If true, will replace existing entities in this feature set with entities found in the PyArrow table. If false, will skip conflicting entities.

  • discard_unused_fields (bool) – Boolean flag. Setting this to True will discard any existing fields that are not found in the dataset or provided by the user.

Returns

None

Return type

None

is_valid()[source]

Validates the state of a feature set locally. Raises an exception if feature set is invalid.

property max_age

Returns the maximum age of this feature set. This is the total maximum amount of staleness that will be allowed during feature retrieval for each specific feature row that is looked up.
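
As a rough illustration of the staleness rule, the sketch below checks whether a feature row is fresh enough for a given request time. It assumes staleness is measured between the row's event timestamp and the requested datetime, and it uses datetime.timedelta as a stand-in for the protobuf Duration; the function name is hypothetical.

```python
from datetime import datetime, timedelta


def is_within_max_age(event_time: datetime, request_time: datetime,
                      max_age: timedelta) -> bool:
    """Return True if a feature row is fresh enough for the request time."""
    staleness = request_time - event_time
    # Assumption: rows newer than the request time are not eligible.
    return timedelta(0) <= staleness <= max_age


request_time = datetime(2020, 1, 8, 12, 0)
max_age = timedelta(days=7)
print(is_within_max_age(datetime(2020, 1, 2, 12, 0), request_time, max_age))    # True
print(is_within_max_age(datetime(2019, 12, 31, 12, 0), request_time, max_age))  # False
```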

property name

Returns the name of this feature set

property project

Returns the project that this feature set belongs to

property source

Returns the source of this feature set

property status

Returns the status of this feature set

to_proto() → feast.core.FeatureSet_pb2.FeatureSet[source]

Converts a feature set object to its protobuf representation

Returns

FeatureSetProto protobuf

class feast.feature_set.FeatureSetRef(project: str = None, name: str = None)[source]

Represents a reference to a feature set

classmethod from_feature_set(feature_set: feast.feature_set.FeatureSet)[source]

Construct a feature set reference that refers to the given feature set.

Parameters

feature_set – Feature set to create reference from.

Returns

FeatureSetRef that refers to the given feature set

classmethod from_str(ref_str: str)[source]

Parse a feature set reference from its string representation (as defined by __repr__()).

Parameters

ref_str – string representation of the reference.

Returns

FeatureSetRef constructed from the string

property name

Get the name of feature set referenced by this reference

property project

Get the project of feature set referenced by this reference

to_proto() → feast.core.FeatureSetReference_pb2.FeatureSetReference[source]

Convert and return this feature set reference to protobuf.

Returns

Protobuf version of this feature set reference.

Feature

class feast.feature.Feature(name: str, dtype: feast.value_type.ValueType)[source]

Feature field type

property bool_domain

Getter for bool_domain of this field

property domain

Getter for domain of this field

property dtype

Getter for data type of this field

property float_domain

Getter for float_domain of this field

classmethod from_proto(feature_proto: feast.core.FeatureSet_pb2.FeatureSpec)[source]

Parameters

feature_proto – FeatureSpec protobuf object

Returns

Feature object

property group_presence

Getter for group_presence of this field

property image_domain

Getter for image_domain of this field

property int_domain

Getter for int_domain of this field

property mid_domain

Getter for mid_domain of this field

property name

Getter for name of this field

property natural_language_domain

Getter for natural_language_domain of this field

property presence

Getter for presence of this field

property shape

Getter for shape of this field

property string_domain

Getter for string_domain of this field

property struct_domain

Getter for struct_domain of this field

property time_domain

Getter for time_domain of this field

property time_of_day_domain

Getter for time_of_day_domain of this field

to_proto() → feast.core.FeatureSet_pb2.FeatureSpec[source]

Converts Feature object to its Protocol Buffer representation

update_domain_info(feature: Union[tensorflow_metadata.proto.v0.schema_pb2.Feature, feast.core.FeatureSet_pb2.EntitySpec, feast.core.FeatureSet_pb2.FeatureSpec]) → None

Update the domain info in this field from Tensorflow Feature, Feast EntitySpec or FeatureSpec

Parameters

feature – Tensorflow Feature, Feast EntitySpec or FeatureSpec

Returns: None

update_presence_constraints(feature: Union[tensorflow_metadata.proto.v0.schema_pb2.Feature, feast.core.FeatureSet_pb2.EntitySpec, feast.core.FeatureSet_pb2.FeatureSpec]) → None

Update the presence constraints in this field from Tensorflow Feature, Feast EntitySpec or FeatureSpec

Parameters

feature – Tensorflow Feature, Feast EntitySpec or FeatureSpec

Returns: None

update_shape_type(feature: Union[tensorflow_metadata.proto.v0.schema_pb2.Feature, feast.core.FeatureSet_pb2.EntitySpec, feast.core.FeatureSet_pb2.FeatureSpec]) → None

Update the shape type in this field from Tensorflow Feature, Feast EntitySpec or FeatureSpec

Parameters

feature – Tensorflow Feature, Feast EntitySpec or FeatureSpec

Returns: None

property url_domain

Getter for url_domain of this field

property value_count

Getter for value_count of this field

class feast.feature.FeatureRef(name: str, feature_set: str = None)[source]

Feature Reference represents a reference to a specific feature.

classmethod from_proto(proto: feast.serving.ServingService_pb2.FeatureReference)[source]

Construct a feature reference from the given FeatureReference proto

Parameters

proto – Protobuf FeatureReference to construct from

Returns

FeatureRef that refers to the given feature

classmethod from_str(feature_ref_str: str, ignore_project: bool = False)[source]

Parse the given string feature reference into a FeatureRef model. The string feature reference should be in the format feature_set:feature, where “feature_set” and “feature” are the feature set name and feature name respectively.

Parameters
  • feature_ref_str – String representation of the feature reference

  • ignore_project – Ignore projects in the given string feature reference instead of throwing an error

Returns

FeatureRef that refers to the given feature
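
The parsing rules above can be sketched without Feast installed. parse_feature_ref is a hypothetical stand-in for from_str; the "/" project separator is an assumption taken from the "my_project/bookings_7d" reference in the get_batch_features example, and the exact error raised by the real method may differ.

```python
from typing import NamedTuple, Optional


class ParsedRef(NamedTuple):
    feature_set: Optional[str]  # None when only the feature name was given
    name: str


def parse_feature_ref(ref: str, ignore_project: bool = False) -> ParsedRef:
    """Hypothetical stand-in for FeatureRef.from_str, for illustration only."""
    if "/" in ref:
        if not ignore_project:
            raise ValueError(f"Unexpected project in feature reference: {ref!r}")
        # Drop everything up to and including the project separator.
        ref = ref.split("/", 1)[1]
    if ":" in ref:
        feature_set, name = ref.split(":", 1)
        return ParsedRef(feature_set, name)
    # Only the feature name is required.
    return ParsedRef(None, ref)


print(parse_feature_ref("bookings:bookings_7d"))
print(parse_feature_ref("bookings_7d"))
print(parse_feature_ref("my_project/bookings_7d", ignore_project=True))
```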

to_proto() → feast.serving.ServingService_pb2.FeatureReference[source]

Convert and return this feature reference to protobuf.

Returns

Protobuf representation of this feature reference.

Entity

class feast.entity.Entity(name: str, dtype: feast.value_type.ValueType)[source]

Entity field type

property bool_domain

Getter for bool_domain of this field

property domain

Getter for domain of this field

property dtype

Getter for data type of this field

property float_domain

Getter for float_domain of this field

classmethod from_proto(entity_proto: feast.core.FeatureSet_pb2.EntitySpec)[source]

Creates a Feast Entity object from its Protocol Buffer representation

Parameters

entity_proto – EntitySpec protobuf object

Returns

Entity object

property group_presence

Getter for group_presence of this field

property image_domain

Getter for image_domain of this field

property int_domain

Getter for int_domain of this field

property mid_domain

Getter for mid_domain of this field

property name

Getter for name of this field

property natural_language_domain

Getter for natural_language_domain of this field

property presence

Getter for presence of this field

property shape

Getter for shape of this field

property string_domain

Getter for string_domain of this field

property struct_domain

Getter for struct_domain of this field

property time_domain

Getter for time_domain of this field

property time_of_day_domain

Getter for time_of_day_domain of this field

to_proto() → feast.core.FeatureSet_pb2.EntitySpec[source]

Converts Entity to its Protocol Buffer representation

Returns

Returns EntitySpec object

update_domain_info(feature: Union[tensorflow_metadata.proto.v0.schema_pb2.Feature, feast.core.FeatureSet_pb2.EntitySpec, feast.core.FeatureSet_pb2.FeatureSpec]) → None

Update the domain info in this field from Tensorflow Feature, Feast EntitySpec or FeatureSpec

Parameters

feature – Tensorflow Feature, Feast EntitySpec or FeatureSpec

Returns: None

update_presence_constraints(feature: Union[tensorflow_metadata.proto.v0.schema_pb2.Feature, feast.core.FeatureSet_pb2.EntitySpec, feast.core.FeatureSet_pb2.FeatureSpec]) → None

Update the presence constraints in this field from Tensorflow Feature, Feast EntitySpec or FeatureSpec

Parameters

feature – Tensorflow Feature, Feast EntitySpec or FeatureSpec

Returns: None

update_shape_type(feature: Union[tensorflow_metadata.proto.v0.schema_pb2.Feature, feast.core.FeatureSet_pb2.EntitySpec, feast.core.FeatureSet_pb2.FeatureSpec]) → None

Update the shape type in this field from Tensorflow Feature, Feast EntitySpec or FeatureSpec

Parameters

feature – Tensorflow Feature, Feast EntitySpec or FeatureSpec

Returns: None

property url_domain

Getter for url_domain of this field

property value_count

Getter for value_count of this field

Value

class feast.value_type.ValueType[source]

Feature value type. Used to define data types in Feature Sets.

Source

class feast.source.KafkaSource(brokers: str = '', topic: str = '')[source]

Kafka feature set source type.

property brokers

Returns the list of broker addresses for this Kafka source

property source_type

Returns the type of source. For a Kafka source this will always return “kafka”

to_proto() → feast.core.Source_pb2.Source[source]

Converts this Source into its protobuf representation

property topic

Returns the topic for this feature set

class feast.source.Source[source]

Source is the top level class that represents a data source for finding feature data. Source must be extended with specific implementations to be useful

classmethod from_proto(source_proto: feast.core.Source_pb2.Source)[source]

Creates a source from a protobuf representation. This will instantiate and return a specific source type, depending on the protobuf that is passed in.

Parameters

source_proto – SourceProto python object

Returns

Source object

property source_type

The type of source. If not implemented, this will return “None”

to_proto()[source]

Converts this source object to its protobuf representation.

Job

class feast.job.IngestJob(job_proto: feast.core.IngestionJob_pb2.IngestionJob, core_stub: feast.core.CoreService_pb2_grpc.CoreServiceStub)[source]

Defines a job for feature ingestion in feast.

property external_id

Getter for IngestJob’s external job id.

property feature_sets

Getter for the IngestJob’s feature sets

property id

Getter for IngestJob’s job id.

reload()[source]

Update this IngestJob with the latest info from Feast

property source

Getter for the IngestJob’s data source.

property status

Getter for IngestJob’s status

property store

Getter for the IngestJob’s target feast store.

wait(status: IngestionJobStatus, timeout_secs: float = 300)[source]

Wait for this IngestJob to transition to the given status. Raises TimeoutError if the wait operation times out.

Parameters
  • status – The IngestionJobStatus to wait for.

  • timeout_secs – Maximum seconds to wait before timing out.
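
The wait-with-timeout behaviour can be sketched as a simple polling loop. This is only an illustration of the pattern, not the method's actual implementation; wait_for_status, poll_secs and the string status names are hypothetical.

```python
import time
from typing import Callable


def wait_for_status(get_status: Callable[[], str], target: str,
                    timeout_secs: float = 300, poll_secs: float = 1.0) -> None:
    """Poll get_status until it returns target, or raise TimeoutError."""
    deadline = time.monotonic() + timeout_secs
    while get_status() != target:
        if time.monotonic() >= deadline:
            raise TimeoutError(f"Timed out waiting for status {target!r}")
        time.sleep(poll_secs)


# Simulated job that reports "PENDING" twice before "RUNNING".
statuses = iter(["PENDING", "PENDING", "RUNNING"])
wait_for_status(lambda: next(statuses), "RUNNING", timeout_secs=5, poll_secs=0.01)
print("job reached RUNNING")
```

Using a monotonic clock keeps the deadline unaffected by system clock adjustments.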

class feast.job.RetrievalJob(job_proto: feast.serving.ServingService_pb2.Job, serving_stub: feast.serving.ServingService_pb2_grpc.ServingServiceStub)[source]

A class representing a job for feature retrieval in Feast.

get_avro_files(timeout_sec: int = 21600)[source]

Wait until job is done to get the file uri to Avro result files on Google Cloud Storage.

Parameters

timeout_sec (int) – Max no of seconds to wait until job is done. If “timeout_sec” is exceeded, an exception will be raised.

Returns

Google Cloud Storage file uris of the returned Avro files.

Return type

str

property id

Getter for the Job Id

reload()[source]

Reload the latest job status

Returns: None

result(timeout_sec: int = 21600)[source]

Wait until the job is done to get an iterable of result rows. A row can only represent an Avro row in Feast 0.3.

Parameters

timeout_sec (int) – Max no of seconds to wait until job is done. If “timeout_sec” is exceeded, an exception will be raised.

Returns

Iterable of Avro rows.

property status

Getter for the Job status from Feast Core

to_chunked_dataframe(max_chunk_size: int = -1, timeout_sec: int = 21600) → pandas.core.frame.DataFrame[source]

Wait until the job is done to get an iterable of result rows. This method splits the response into chunked DataFrames of a specified size, which are yielded to the caller.

Parameters
  • max_chunk_size (int) – Maximum number of rows that the DataFrame should contain.

  • timeout_sec (int) – Max no of seconds to wait until job is done. If “timeout_sec” is exceeded, an exception will be raised.

Returns

Pandas DataFrame of the feature values.

Return type

pd.DataFrame

to_dataframe(timeout_sec: int = 21600) → pandas.core.frame.DataFrame[source]

Wait until the job is done, then return the complete result as a single Pandas DataFrame.

Parameters

timeout_sec (int) – Max no of seconds to wait until job is done. If “timeout_sec” is exceeded, an exception will be raised.

Returns

Pandas DataFrame of the feature values.

Return type

pd.DataFrame