Commit 4f85e3e

fix: Use ParquetDataset for Schema Inference (feast-dev#2686)
Use ParquetDataset instead of ParquetFile for schema inference. This supports both single Parquet files and directories of partitioned Parquet datasets.

Signed-off-by: Dirk Van Bruggen <[email protected]>
1 parent 7c69f1c commit 4f85e3e

File tree

1 file changed (+3, -3)

sdk/python/feast/infra/offline_stores/file_source.py

Lines changed: 3 additions & 3 deletions
@@ -3,7 +3,7 @@
 
 from pyarrow._fs import FileSystem
 from pyarrow._s3fs import S3FileSystem
-from pyarrow.parquet import ParquetFile
+from pyarrow.parquet import ParquetDataset
 
 from feast import type_map
 from feast.data_format import FileFormat, ParquetFormat
@@ -179,9 +179,9 @@ def get_table_column_names_and_types(
         filesystem, path = FileSource.create_filesystem_and_path(
             self.path, self.file_options.s3_endpoint_override
         )
-        schema = ParquetFile(
+        schema = ParquetDataset(
             path if filesystem is None else filesystem.open_input_file(path)
-        ).schema_arrow
+        ).schema.to_arrow_schema()
         return zip(schema.names, map(str, schema.types))
 
     @staticmethod
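
For context, here is a minimal sketch (separate from the Feast code above, with made-up paths, column names, and sample data) of the behavior this change relies on: pyarrow's ParquetDataset can infer a schema from either a single Parquet file or a directory of partitioned Parquet files, which lets the same schema-inference code path serve both cases.

import tempfile

import pyarrow as pa
import pyarrow.parquet as pq

# Build a small partitioned dataset: one sub-directory per "day" value,
# each holding a Parquet file. Paths and data here are illustrative only.
root = tempfile.mkdtemp()
table = pa.table(
    {
        "driver_id": [1, 2, 3],
        "conv_rate": [0.1, 0.2, 0.3],
        "day": ["mon", "mon", "tue"],
    }
)
pq.write_to_dataset(table, root_path=root, partition_cols=["day"])

# ParquetFile can only open a single .parquet file, so pointing it at the
# directory would fail. ParquetDataset accepts a single file or a
# partitioned directory, which is what schema inference needs here.
dataset = pq.ParquetDataset(root)
schema = dataset.schema
# Recent pyarrow releases return a pyarrow.Schema directly; older releases
# returned a ParquetSchema, hence the .schema.to_arrow_schema() call in the
# diff above.

print(list(zip(schema.names, map(str, schema.types))))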

0 commit comments
