feast.loaders package

Submodules

feast.loaders.abstract_producer module

class feast.loaders.abstract_producer.AbstractProducer(brokers: str, row_count: int, disable_progress_bar: bool)[source]

Bases: object

Abstract class for Kafka producers

flush(timeout: int)[source]
print_results() → None[source]

Print ingestion statistics.

Returns

None

Return type

None

produce(topic: str, data: bytes)[source]
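
The following is a minimal, hypothetical sketch of a concrete subclass that collects messages in memory instead of sending them to Kafka; the real implementations below wrap confluent-kafka and kafka-python.

    from feast.loaders.abstract_producer import AbstractProducer

    class InMemoryProducer(AbstractProducer):
        """Hypothetical producer that collects messages in a list."""

        def __init__(self, brokers: str, row_count: int, disable_progress_bar: bool):
            super().__init__(brokers, row_count, disable_progress_bar)
            self.messages = []

        def produce(self, topic: str, data: bytes) -> None:
            # A real producer would push the bytes to a Kafka topic.
            self.messages.append((topic, data))

        def flush(self, timeout: int) -> int:
            # Nothing is buffered locally, so report an empty queue.
            return 0
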
class feast.loaders.abstract_producer.ConfluentProducer(brokers: str, row_count: int, disable_progress_bar: bool)[source]

Bases: feast.loaders.abstract_producer.AbstractProducer

Concrete implementation of Confluent Kafka producer (confluent-kafka)

flush(timeout: Optional[int])[source]

Generic flush that implements confluent-kafka’s flush method.

Parameters

timeout (Optional[int]) – Timeout in seconds to wait for completion.

Returns

Number of messages still in queue.

Return type

int

produce(topic: str, value: bytes) → None[source]

Generic produce that implements confluent-kafka’s produce method to push a byte encoded object into a Kafka topic.

Parameters
  • topic (str) – Kafka topic.

  • value (bytes) – Byte encoded object.

Returns

None.

Return type

None
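
A minimal usage sketch, assuming confluent-kafka is installed and a broker is reachable; the broker address and topic name below are illustrative:

    from feast.loaders.abstract_producer import ConfluentProducer

    producer = ConfluentProducer(
        brokers="localhost:9092", row_count=2, disable_progress_bar=True
    )
    for row in (b"row-1", b"row-2"):
        producer.produce(topic="feature-rows", value=row)

    # flush() returns the number of messages still in the queue.
    remaining = producer.flush(timeout=30)
    producer.print_results()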

class feast.loaders.abstract_producer.KafkaPythonProducer(brokers: str, row_count: int, disable_progress_bar: bool)[source]

Bases: feast.loaders.abstract_producer.AbstractProducer

Concrete implementation of Python Kafka producer (kafka-python)

flush(timeout: Optional[int])[source]

Generic flush that implements kafka-python’s flush method.

Parameters

timeout (Optional[int]) – timeout in seconds to wait for completion.

Returns

None

Raises

KafkaTimeoutError – failure to flush buffered records within the provided timeout

produce(topic: str, value: bytes)[source]

Generic produce that implements kafka-python’s send method to push a byte encoded object into a Kafka topic.

Parameters
  • topic (str) – Kafka topic.

  • value (bytes) – Byte encoded object.

Returns

resolves to RecordMetadata

Return type

FutureRecordMetadata

Raises

KafkaTimeoutError – if unable to fetch topic metadata, or unable to obtain memory buffer prior to configured max_block_ms
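
A minimal sketch of the kafka-python backed producer, showing the FutureRecordMetadata returned by produce() and the KafkaTimeoutError that flush() may raise; the broker address and topic name are illustrative:

    from kafka.errors import KafkaTimeoutError

    from feast.loaders.abstract_producer import KafkaPythonProducer

    producer = KafkaPythonProducer(
        brokers="localhost:9092", row_count=1, disable_progress_bar=True
    )
    future = producer.produce(topic="feature-rows", value=b"row-1")
    try:
        producer.flush(timeout=10)
        metadata = future.get(timeout=10)  # resolves to RecordMetadata
    except KafkaTimeoutError:
        print("Could not flush buffered records within the timeout")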

feast.loaders.abstract_producer.get_producer(brokers: str, row_count: int, disable_progress_bar: bool) → Union[feast.loaders.abstract_producer.ConfluentProducer, feast.loaders.abstract_producer.KafkaPythonProducer][source]

Simple helper function that returns an AbstractProducer object when invoked.

This helper function tries to import confluent-kafka as the producer first, and falls back to kafka-python if confluent-kafka cannot be imported.

Parameters
  • brokers (str) – Kafka broker information with hostname and port.

  • row_count (int) – Number of rows in the table.

  • disable_progress_bar (bool) – Flag to indicate whether the progress bar should be disabled.

Returns

Concrete implementation of a Kafka producer. It can be one of:
  • confluent-kafka producer

  • kafka-python producer

Return type

Union[ConfluentProducer, KafkaPythonProducer]
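
A minimal sketch from the caller's perspective: whichever client library is available, the returned producer exposes the same produce/flush interface (the broker address and topic name are illustrative):

    from feast.loaders.abstract_producer import get_producer

    producer = get_producer(
        brokers="localhost:9092", row_count=100, disable_progress_bar=False
    )
    producer.produce(topic="feature-rows", value=b"serialized-feature-row")
    producer.flush(timeout=30)
    producer.print_results()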

feast.loaders.file module

feast.loaders.file.export_dataframe_to_local(df: pandas.core.frame.DataFrame, dir_path: Optional[str] = None) → Tuple[str, str, str][source]

Exports a pandas DataFrame to the local filesystem.

Parameters
  • df (pd.DataFrame) – Pandas DataFrame to save.

  • dir_path (Optional[str]) – Absolute directory path, e.g. ‘/data/project/subfolder/’.

Returns

Tuple of directory path, file name and destination path. The destination path can be obtained by concatenating the directory path and file name.

Return type

Tuple[str, str, str]
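
A minimal usage sketch; the directory path is illustrative and must be writable:

    import pandas as pd

    from feast.loaders.file import export_dataframe_to_local

    df = pd.DataFrame({"entity_id": [1, 2], "feature_value": [0.1, 0.2]})
    dir_path, file_name, dest_path = export_dataframe_to_local(
        df, dir_path="/data/project/subfolder/"
    )
    # dest_path is the concatenation of dir_path and file_name.
    print(dest_path)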

feast.loaders.file.export_source_to_staging_location(source: Union[pandas.core.frame.DataFrame, str], staging_location_uri: str) → List[str][source]

Uploads a DataFrame (as an Avro file) or an existing Avro file to a remote staging location.

The local staging location (file://) supported by this function is intended for E2E tests; please do not use it otherwise.

Parameters
  • source (Union[pd.DataFrame, str]) –

    Source of data to be staged. Can be a pandas DataFrame or a file path.

    Only the following types of source are allowed:
    • Pandas DataFrame

    • Local Avro file

    • GCS Avro file

    • S3 Avro file

    • Azure Blob storage Avro file

  • staging_location_uri (str) –

    Remote staging location where DataFrame should be written.

    Examples:

    • gs://bucket/path/

    • s3://bucket/path/

    • wasbs://bucket@account_name.blob.core.windows.net/path/

    • file:///data/subfolder/

Returns

Returns a list containing the full path to the file(s) in the remote staging location.

Return type

List[str]
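
A minimal usage sketch, staging a DataFrame to a GCS location; the bucket URI is illustrative and must be writable by the caller:

    import pandas as pd

    from feast.loaders.file import export_source_to_staging_location

    df = pd.DataFrame({"entity_id": [1, 2], "feature_value": [0.1, 0.2]})
    staged_files = export_source_to_staging_location(
        source=df, staging_location_uri="gs://bucket/path/"
    )
    # Full paths to the Avro file(s) written to the staging location.
    for uri in staged_files:
        print(uri)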

feast.loaders.ingest module

feast.loaders.yaml module

feast.loaders.yaml.yaml_loader(yml, load_single=False)[source]

Loads one or more Feast resources from a YAML path or string. Multiple resources can be separated by three hyphens (‘---’).

Parameters
  • yml – A path ending in .yaml or .yml, or a YAML string

  • load_single – Expect only a single YAML resource, fail otherwise

Returns

Either a single YAML dictionary or a list of YAML dictionaries
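
A minimal usage sketch; the resource contents and file name below are illustrative:

    from feast.loaders.yaml import yaml_loader

    # Two illustrative resources separated by three hyphens.
    yaml_string = (
        "kind: feature_set\n"
        "name: customer_features\n"
        "---\n"
        "kind: feature_set\n"
        "name: driver_features\n"
    )

    resources = yaml_loader(yaml_string)  # list of dicts, one per resource

    # Load a single resource from a file (assuming the file exists).
    single = yaml_loader("feature_set.yaml", load_single=True)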

Module contents