Commit a49f70c

niklasvm authored and adchia committed
fix: Fix Spark offline store type conversion to arrow (feast-dev#3071)
* Fix unit tests related to empty list types

Signed-off-by: niklasvm <[email protected]>

* formatting

Signed-off-by: niklasvm <[email protected]>

Signed-off-by: niklasvm <[email protected]>
1 parent a32d247 commit a49f70c

1 file changed: +7 -2 lines changed
sdk/python/feast/infra/offline_stores/contrib/spark_offline_store/spark.py

Lines changed: 7 additions & 2 deletions
@@ -1,3 +1,4 @@
+import tempfile
 import warnings
 from datetime import datetime
 from typing import Dict, List, Optional, Tuple, Union
@@ -6,6 +7,7 @@
 import pandas
 import pandas as pd
 import pyarrow
+import pyarrow.parquet as pq
 import pyspark
 from pydantic import StrictStr
 from pyspark import SparkConf
@@ -267,8 +269,11 @@ def _to_df_internal(self) -> pd.DataFrame:
 
     def _to_arrow_internal(self) -> pyarrow.Table:
         """Return dataset as pyarrow Table synchronously"""
-        df = self.to_df()
-        return pyarrow.Table.from_pandas(df)  # noqa
+
+        # write to temp parquet and then load it as pyarrow table from disk
+        with tempfile.TemporaryDirectory() as temp_dir:
+            self.to_spark_df().write.parquet(temp_dir, mode="overwrite")
+            return pq.read_table(temp_dir)
 
     def persist(self, storage: SavedDatasetStorage):
         """
