Changelog
2.17.0 (2025-08-22)
Features
2.16.0 (2025-08-20)
Features
Add bigframes.pandas.options.display.precision option (#1979) (15e6175)
Add where, coalesce, fillna, casewhen, invert local impl (#1976) (f7f686c)
Support callable bigframes function for dataframe where (#1990) (44c1ec4)
When using repr_mode = "anywidget", numeric values align right (15e6175)
Bug Fixes
Address the packages issue for bigframes function (#1991) (68f1d22)
Correct pypdf dependency specifier for remote PDF functions (#1980) (0bd5e1b)
Enable default retries in calls to BQ Storage Read API (#1985) (f25d7bd)
Fix the copyright year in dbt sample files (#1996) (fad5722)
Performance Improvements
Documentation
Add examples of running bigframes in kaggle (#2002) (7d89d76)
Remove preview warning from partial ordering mode sample notebook (#1986) (132e0ed)
2.15.0 (2025-08-11)
Features
Add st_buffer, st_centroid, and st_convexhull and their corresponding GeoSeries methods (#1963) (c4c7fa5)
Allow callable as a conditional or replacement input in DataFrame.where (#1971) (a8d57d2)
Bug Fixes
Add warnings for duplicated or conflicting type hints in bigfram… (#1956) (d38e42c)
Make remote_function more robust when there are create_function retries (#1973) (cd954ac)
Make ExecutionMetrics stats tracking more robust to missing stats (#1977) (feb3ff4)
Performance Improvements
Documentation
2.14.0 (2025-08-05)
Features
Dynamic table width for better display across devices (https://github.com/googleapis/python-bigquery-dataframes/issues/1948) (a6d30ae)
Bug Fixes
Performance Improvements
Documentation
Add code snippet for storing dataframes to a CSV file (#1943) (a511e09)
Add code snippet for storing dataframes to a CSV file (#1953) (a298a02)
2.13.0 (2025-07-25)
Features
Add CSS styling for TableWidget pagination interface (#1934) (5b232d7)
Add row numbering local pushdown in hybrid execution (#1932) (92a2377)
Bug Fixes
Dependencies
2.12.0 (2025-07-23)
Features
Add code samples for dbt bigframes integration (#1898) (7e03252)
Allow local arithmetic execution in hybrid engine (#1906) (ebdcd02)
Provide day_of_year and day_of_week for dt accessor (#1911) (40e7638)
Support params max_batching_rows, container_cpu, and container_memory for udf (#1897) (8baa912)
Support typed pyarrow.Scalar in assignment (#1930) (cd28e12)
Bug Fixes
Correct min field from max() to min() in remote function tests (#1917) (d5c54fc)
Resolve location reset issue in bigquery options (#1914) (c15cb8a)
Series.str.isdigit in unicode superscripts and fractions (#1924) (8d46c36)
Documentation
Add code snippets for session and IO public docs (#1919) (6e01cbe)
Add snippets for performance optimization doc (#1923) (4da309e)
2.11.0 (2025-07-15)
Features
Add __contains__ to Index, Series, DataFrame (#1899) (07222bf)
Add pagination buttons (prev/next) to anywidget mode for DataFrames (#1841) (8eca767)
Add total_rows property to pandas batches iterator (#1888) (e3f5e65)
Support bpd.Series(json_data, dtype="json") (#1882) (05cb7d0)
Bug Fixes
Show slot_millis_sum warning only when allow_large_results=False (#1892) (25efabc)
Used query row count metadata instead of table metadata (#1893) (e1ebc53)
2.10.0 (2025-07-08)
Features
df.to_pandas_batches() returns one empty DataFrame if df is empty (#1878) (e43d15d)
Add simple stats support to hybrid local pushdown (#1873) (8715105)
Bug Fixes
Documentation
2.9.0 (2025-06-30)
Features
Add bpd.read_arrow to convert an Arrow object into a bigframes DataFrame (#1855) (633bf98)
Create deploy_remote_function and deploy_udf functions to immediately deploy functions to BigQuery (#1832) (c706759)
Bug Fixes
Fix bug with DataFrame.agg for string values (#1870) (81e4d64)
Generate GoogleSQL instead of legacy SQL data types for dry_run=True from bpd._read_gbq_colab with local pandas DataFrame (#1867) (fab3c38)
Revert dict back to protobuf in the iam binding update (#1838) (9fb3cb4)
Documentation
2.8.0 (2025-06-23)
⚠ BREAKING CHANGES
- add required param ‘engine’ to multimodal functions (#1834)
Features
Add bpd.options.compute.maximum_result_rows option to limit client data download (#1829) (e22a3f6)
Add bpd.options.display.repr_mode = "anywidget" to create an interactive display of the results (#1820) (be0a3cf)
Add required param ‘engine’ to multimodal functions (#1834) (37666e4)
Performance Improvements
Documentation
2.7.0 (2025-06-16)
Features
Add bbq.json_query_array and warn bbq.json_extract_array deprecated (#1811) (dc9eb27)
Add bbq.json_value_array and deprecate bbq.json_extract_string_array (#1818) (019051e)
Support custom build service account in remote_function (#1796) (e586151)
Bug Fixes
Documentation
Document how to use ai.map() for information extraction (#1808) (b586746)
Rearrange README.rst to include a short code sample (#1812) (f6265db)
Use pandas API instead of pandas-like or pandas-compatible (#1825) (aa32369)
2.6.0 (2025-06-09)
Features
Bug Fixes
Address read_csv with both index_col and use_cols behavior inconsistency with pandas (#1785) (ba7c313)
Allow KMeans model init parameter as k-means++ alias (#1790) (0b59cf1)
Replace function now can handle bpd.NA value. (#1786) (7269512)
Documentation
Adjust strip method examples to match latest pandas (#1797) (817b0c0)
Fix docstrings to improve html rendering of code examples (#1788) (38d9b73)
2.5.0 (2025-05-30)
⚠ BREAKING CHANGES
- the updated ai.map() parameter list is not backward-compatible
Features
Add bpd.options.bigquery.requests_transport_adapters option (#1755) (bb45db8)
Add bbq.json_query and warn bbq.json_extract deprecated (#1756) (ec81dd2)
Add deprecation warning to Gemini-1.5-X, text-embedding-004, and remove legacy models in notebooks and docs (#1723) (80aad9a)
Add structured output for ai map, ai filter and ai join (#1746) (133ac6b)
Add support for df.loc[list, column(s)] (768a757)
Include bq schema and query string in dry run results (#1752) (bb51147)
Support inplace=True in rename and rename_axis (#1744) (734cc65)
Support astype conversions to and from JSON dtypes (#1716) (8ef4de1)
Support dtype parameter in read_csv for bigquery engine (#1749) (50dca4c)
Bug Fixes
Fix the default value for na_value for numpy conversions (#1766) (0629cac)
Include location in Session-based temporary storage manager DDL queries (#1780) (acba032)
Prevent creating unnecessary client objects in multithreaded environments (#1757) (1cf9f5e)
Reduce bigquery table modification via DML for to_gbq (#1737) (545cdca)
Stop ignoring arguments to MatrixFactorization.score(X, y) (#1726) (55c07e9)
Support JSON and STRUCT for bbq.sql_scalar (#1754) (190390b)
Performance Improvements
Faster local data comparison using identity (#1738) (2858b1e)
Use JOB_CREATION_OPTIONAL when allow_large_results=False (#1763) (15f3f2a)
Dependencies
Documentation
Add MatrixFactorization to the table of contents (#1725) (611e43b)
Fix typo for “population” in the GeminiTextGenerator.predict(..., output_schema={...}) sample notebook (#1748) (bd07e05)
Integrations notebook extracts token from bqclient._http.credentials instead of bqclient._credentials (#1784) (6e63eca)
Use partial ordering mode in the quickstart sample (#1734) (476b7dd)
2.4.0 (2025-05-12)
Features
Add .dt.days, .dt.seconds, dt.microseconds, and dt.total_seconds() for timedelta series. (#1713) (2b3a45f)
Improve error message in Series.apply for direct udfs (#1673) (1a658b2)
Publish bigframes blob(Multimodal) to preview (#1693) (e4c85ba)
Support forecast_limit_lower_bound and forecast_limit_upper_bound in ARIMA_PLUS (and ARIMA_PLUS_XREG) models (#1305) (b16740e)
Support to_strip parameter for str.strip, str.lstrip and str.rstrip (#1705) (a84ee75)
Bug Fixes
Performance Improvements
Dependencies
Documentation
Add snippets for Matrix Factorization tutorials (#1630) (24b37ae)
Deprecate bpd.options.bigquery.allow_large_results in favor of bpd.options.compute.allow_large_results (#1597) (18780b4)
Include import statement in the bigframes code snippet (#1699) (08d70b6)
Include the clean-up step in the udf code snippet (#1698) (48992e2)
Move multimodal notebook out of experimental folder (#1712) (68b6532)
2.3.0 (2025-05-06)
Features
Bug Fixes
Guarantee guid thread safety across threads (#1684) (cb0267d)
Support large lists of lists in bpd.Series() constructor (#1662) (0f4024c)
Use value equality to check types for unix epoch functions and timestamp diff (#1690) (81e8fb8)
Performance Improvements
Documentation
Add a visualization notebook to BigFrame samples (#1675) (ee062bf)
Update snippet for Create a k-means model tutorial (#1664) (761c364)
2.2.0 (2025-04-30)
Features
Add gemini-2.0-flash-001 and gemini-2.0-flash-lite-001 to fine tune score endpoints and multimodal endpoints (#1650) (4fb54df)
Add GeminiTextGenerator.predict structured output (#1653) (6199023)
DataFrames.__getitem__ support for slice input (#1668) (563f0cb)
Print right origin of PreviewWarning for the bpd.udf (#1629) (48d10d1)
Session.bytes_processed_sum will be updated when allow_large_re… (#1669) (ae312db)
Support names parameter in read_csv for bigquery engine (#1659) (3388191)
Support passing list of values to bigframes.core.sql.simple_literal (#1641) (102d363)
Bug Fixes
Prefer remote schema instead of throwing on materialize conflicts (#1644) (53fc25b)
Resolve issue where pre-release versions of google-auth are installed (#1491) (ebb7a5e)
Performance Improvements
Dependencies
Documentation
Fix bq_dataframes_template notebook to work if partial ordering mode is enabled (#1665) (f442e7a)
Note that udf is in preview and must be python 3.11 compatible (#1629) (48d10d1)
2.1.0 (2025-04-22)
Features
Add bigframes.bigquery.st_distance function (#1637) (bf1ae70)
Enhance read_csv index_col parameter support (#1631) (f4e5b26)
Bug Fixes
Add retry for test_clean_up_via_context_manager (#1627) (58e7cb0)
Improve robustness of managed udf code extraction (#1634) (8cc56d5)
Documentation
2.0.0 (2025-04-17)
⚠ BREAKING CHANGES
- make dataset and name params mandatory in udf (#1619)
- Locational endpoints support is not available in BigFrames 2.0.
- change default LLM model to gemini-2.0-flash-001, drop PaLM2TextGenerator and PaLM2TextEmbeddingGenerator (#1558)
- change default ingress setting for remote_function to internal-only (#1544)
- make remote_function params keyword only (#1537)
- make remote_function default service account explicit (#1537)
- set allow_large_results=False by default (#1541)
Features
Add on parameter in dataframe.rolling() and dataframe.groupby.rolling() (#1556) (45c9d9f)
Add support for creating a Matrix Factorization model (#1330) (b5297f9)
Allow input_types, output_type, and dataset to be used positionally in remote_function (#1560) (bcac8c6)
Allow pandas.cut ‘labels’ parameter to accept a list of string (#1549) (af842b1)
Change default ingress setting for remote_function to internal-only (#1544) (c848a80)
Detect duplicate column/index names in read_gbq before send query. (#1615) (40d6960)
Enable time range rolling for DataFrame, DataFrameGroupBy and SeriesGroupBy (#1605) (b4b7073)
Make remote_function default service account explicit (#1537) (9eb9089)
Support bigquery connection in managed function (#1554) (f6f697a)
Support inlining small list, struct, json data (#1589) (2ce891f)
Use session temp tables for all ephemeral storage (#1569) (9711b83)
Use validated local storage for data uploads (#1612) (aee4159)
Warn the deprecated max_download_size, random_state and sampling_method parameters in (DataFrame|Series).to_pandas() (#1573) (b9623da)
Bug Fixes
to_pandas_batches() respects page_size and max_results again (#1572) (27c5905)
Ensure page_size works correctly in to_pandas_batches when max_results is not set (#1588) (570cff3)
Include role and service account in IAM exception (#1564) (8c50755)
Make dataset and name params mandatory in udf (#1619) (637e860)
Pandas.cut returns labels index for numeric breaks when labels=False (#1548) (b2375de)
Prevent KeyError in bpd.concat with empty DF and struct/array types DF (#1568) (b4da1cf)
Read_csv supports for tilde local paths and includes index for bigquery_stream write engine (#1580) (352e8e4)
Use dictionaries to avoid problematic google.iam namespace (#1611) (b03e44f)
Performance Improvements
Dependencies
Documentation
Add details for bigquery_connection in @bpd.udf docstring (#1609) (ef63772)
Add explain forecast snippet to multiple time series tutorial (#1586) (40c55a0)
Add message to remove default model for version 3.0 (#1563) (910be2b)
Add samples for ArimaPlus time_series_id_col feature (#1577) (1e4cd9c)
Deprecate default model in TextEmbeddingGenerator, GeminiTextGenerator, and other bigframes.ml.llm classes (#1570) (89ab33e)
Include all licenses for vendored packages in the root LICENSE file (#1626) (8116ed0)
Remove gemini-1.5 deprecation warning for GeminiTextGenerator (#1562) (0cc6784)
Use restructured text to allow publishing to PyPI (#1565) (d1e9ec2)
Miscellaneous Chores
1.42.0 (2025-03-27)
Features
Add GeoSeries.difference() and bigframes.bigquery.st_difference() (#1471) (e9fe815)
Add GeoSeries.intersection() and bigframes.bigquery.st_intersection() (#1529) (8542bd4)
Allow iloc to support lists of negative indices (#1497) (a9cf215)
Bug Fixes
Add deprecation warning to TextEmbeddingGenerator model, especially gemini-1.0-X and gemini-1.5-X (#1534) (c93e720)
Change the default value for pdf extract/chunk (#1517) (a70a607)
Read_pandas inline returns None when exceeds limit (#1525) (578081e)
Temporary fix for StreamingDataFrame not working backend bug (#1533) (6ab4ffd)
Tolerate BQ connection service account propagation delay (#1505) (6681f1f)
Performance Improvements
Documentation
1.41.0 (2025-03-19)
Features
Add support for the ‘right’ parameter in ‘pandas.cut’ (#1496) (8aff128)
Support BQ managed functions through read_gbq_function (#1476) (802183d)
Warn when the BigFrames version is more than a year old (#1455) (00e0750)
Bug Fixes
Performance Improvements
Documentation
1.40.0 (2025-03-11)
⚠ BREAKING CHANGES
- reading JSON data as a custom arrow extension type (#1458)
Features
Bug Fixes
Fix list-like indexers in partial ordering mode (#1456) (fe72ada)
Use == instead of is for timedelta type equality checks (#1480) (0db248b)
Performance Improvements
1.39.0 (2025-03-05)
Features
(Preview) Support aggregations over timedeltas (#1418) (1251ded)
(Preview) Support arithmetics between dates and timedeltas (#1413) (962b152)
(Preview) Support automatic load of timedelta from BQ tables. (#1429) (b2917bb)
Add allow_large_results option to many I/O methods. Set to False to reduce latency (#1428) (dd2f488)
Support interface for BigQuery managed functions (#1373) (2bbf53f)
Warn if default ingress_settings is used in remote_functions (#1419) (dfd891a)
Bug Fixes
Do not compare schema description during schema validation (#1452) (03a3a56)
Remove warnings for null index and partial ordering mode in prep for GA (#1431) (6785aee)
Warn if default cloud_function_service_account is used in remote_function (#1424) (fe7463a)
Write chunked text instead of dummy text for pdf chunk (#1444) (96b0e8a)
Performance Improvements
Documentation
1.38.0 (2025-02-24)
Features
(Preview) Support diff aggregation for timestamp series. (#1405) (abe48d6)
Add GeoSeries.from_wkt() and GeoSeries.to_wkt() (#1401) (2993b28)
Support routines with ARRAY return type in read_gbq_function (#1412) (4b60049)
Bug Fixes
Calling to_timedelta() over timedeltas no longer changes their values (#1411) (650a190)
Replace empty dict with None to avoid mutable default arguments (#1416) (fa4e3ad)
Performance Improvements
Dependencies
Documentation
Add samples using SQL methods via the bigframes.bigquery module (#1358) (f54e768)
Add snippets for visualizing a time series and creating a time series model for the Limit forecasted values in time series model tutorial (#1310) (c6c9120)
1.37.0 (2025-02-19)
Features
(Preview) Support add, sub, mult, div, and more between timedeltas (#1396) (ffa63d4)
(Preview) Support comparison, ordering, and filtering for timedeltas (#1387) (34d01b2)
(Preview) Support subtraction in DATETIME/TIMESTAMP columns with timedelta columns (#1390) (50ad3a5)
JSON dtype support for read_pandas and Series constructor (#1391) (44f4137)
Bug Fixes
Performance Improvements
Documentation
1.36.0 (2025-02-11)
Features
(Preview) Support addition between a timestamp and a timedelta (#1369) (b598aa8)
(Preview) Support casting floats and list-likes to timedelta series (#1362) (65933b6)
Add bigframes.bigquery.st_area and suggest it from GeoSeries.area (#1318) (8b5ffa8)
Bug Fixes
Dtype parameter ineffective in Series/DataFrame construction (#1354) (b9bdca8)
Translate labels to col ids when copying dataframes (#1372) (0c55b07)
Performance Improvements
1.35.0 (2025-02-04)
Features
(Preview) Support timedeltas for read_pandas() (#1349) (866ba9e)
Allow case_when to change dtypes if case list contains the condition (True, some_default_value) (#1311) (5c2a2c6)
Bug Fixes
Exclude DataFrame and Series __call__ from unimplemented API metrics (#1351) (f2d5264)
Make DataFrame __getattr__ and __setattr__ more robust to subclassing (#1352) (417de3a)
Performance Improvements
Dependencies
Documentation
Add link to DataFrames intro to improve SEO (#1176) (aafb5be)
Add snippet to explain the univariate model’s forecast result in the Forecast a single time series with a univariate model tutorial (#1272) (c22126b)
1.34.0 (2025-01-27)
⚠ BREAKING CHANGES
- Enable reading JSON data with dbjson extension dtype (#1139)
Features
(df|s).hist(), (df|s).line(), (df|s).area(), (df|s).bar(), df.scatter() (#1320) (bd3f584)
(Preview) Define timedelta type and to_timedelta function (#1317) (3901951)
Enable reading JSON data with dbjson extension dtype (#1139) (f672262)
1.33.0 (2025-01-22)
Features
Add bigframes.bigquery.sql_scalar() to apply SQL syntax on Series objects (#1293) (aa2f73a)
Add unix_seconds, unix_millis and unix_micros for timestamp series. (#1297) (e4b0c8d)
Bug Fixes
Dataframe sort_values Series input keyerror. (#1285) (5a2731b)
Fix read_gbq_function issue in dataframe apply method (#1174) (0318764)
Series sort_index and sort_values now raises when axis!=0 (#1294) (94bc2f2)
Documentation
Add snippet to forecast future time series in the Forecast a single time series with a univariate model tutorial (#1271) (a687050)
1.32.0 (2025-01-13)
Features
Bug Fixes
Avoid global mutation in BigQueryOptions.client_endpoints_override (#1280) (788f6e9)
Fix erroneous window bounds removal during compilation (#1163) (f91756a)
Dependencies
Documentation
Add bq studio links that allow users to generate Jupyter notebooks in bq studio with github contents (#1266) (58f13cb)
Add snippet to evaluate ARIMA plus model in the Forecast a single time series with a univariate model tutorial (#1267) (3dcae2d)
Add snippet to see the ARIMA coefficients in the Forecast a single time series with a univariate model tutorial (#1268) (059a564)
Use 002 model for better scalability in text generation (#1270) (bb7a850)
1.31.0 (2025-01-05)
Features
Bug Fixes
Raise if trying to change ordering_mode after session has started (#1252) (8cfaae8)
Reduce the number of labels added to query jobs (#1245) (fdcdc18)
Documentation
1.30.0 (2024-12-30)
Features
Add LinearRegression.predict_explain() to generate ML.EXPLAIN_PREDICT columns (#1190) (e13eca2)
Add LogisticRegression.predict_explain() to generate ML.EXPLAIN_PREDICT columns (#1222) (bcbc732)
Add write_engine parameter to read_FORMATNAME methods to control how data is written to BigQuery (#371) (ed47ef1)
Add client side retry to GeminiTextGenerator (#1242) (8193abe)
Add Gemini-pro-1.5 to GeminiTextGenerator Tuning and Support score() method in Gemini-pro-1.5 (#1208) (298fc73)
Add support for LinearRegression.predict_explain and LogisticRegression.predict_explain parameter, top_k_features (#1228) (3068e19)
Bug Fixes
Throw an error message when setting is_row_processor=True to read a multi param function (#1160) (b2816a5)
Documentation
Add an “open in BQ Studio” link to all BigFrames sample notebooks (#1223) (e0a8288)
Add bq studio link for a new ipynb file called “bq_dataframes_template.ipynb” (#1239) (840aaff)
Add python snippet for “Create the time series model” section of the Forecast a single time series with a univariate model tutorial (#1227) (20f3190)
1.29.0 (2024-12-12)
Features
Documentation
1.28.0 (2024-12-11)
Features
bigframes.bigquery.vector_search supports use_brute_force and fraction_lists_to_search parameters (#1158) (131edc3)
Add ARIMAPlus.predict_explain() to generate forecasts with explanation columns (#1177) (05f8b4d)
Add client_endpoints_override to bq options (#1167) (be74b99)
Add support for temporal types in dataframe’s describe() method (#1189) (2d564a6)
Allow join-free alignment of analytic expressions (#1168) (daef4f0)
Bug Fixes
Performance Improvements
Dependencies
Documentation
Add a code sample using bpd.options.bigquery.ordering_mode = "partial" (#909) (f80d705)
Add snippet for creating boosted tree model (#1142) (a972668)
Add snippet for evaluating a boosted tree model (#1154) (9d8970a)
Add snippet for predicting classifications using a boosted tree model (#1156) (e7b83f1)
Add third party pandas.Index methods and docstrings (#1171) (a970294)
Fix Bigframes.Pandas.General_Function missing docs (#1164) (de923d0)
1.27.0 (2024-11-16)
Features
Bug Fixes
Documentation
1.26.0 (2024-11-12)
Features
Bug Fixes
Fix Series.to_frame generating string label instead of int where name is None (#1118) (14e32b5)
Update the API documentation with newly added rep (#1120) (72c228b)
Performance Improvements
Documentation
Add file for Classification with a Boosted Tree Model and snippet for preparing sample data (#1135) (7ac6639)
Add snippet for Linear Regression tutorial Predict Outcomes section (#1101) (108f4a9)
Update DataFrame docstrings to include the errors section (#1127) (a38d4c4)
Update Session docstrings to include exceptions (#1130) (a870421)
1.25.0 (2024-10-29)
Features
Add the ground_with_google_search option for GeminiTextGenerator predict (#1119) (ca02cd4)
Add warning when user tries to access struct series fields with __getitem__ (#1082) (20e5c58)
Allow fit to take additional eval data in linear and ensemble models (#1096) (254875c)
Support context manager for bigframes session (#1107) (5f7b8b1)
Performance Improvements
1.24.0 (2024-10-24)
Features
Documentation
1.23.0 (2024-10-23)
Features
Add bigframes.bigquery.create_vector_index to assist in creating vector index on ARRAY<FLOAT64> columns (#1024) (863d694)
Add gemini-1.5-pro-002 and gemini-1.5-flash-002 to known Gemini model list. (#1105) (7094c85)
Add support for pandas series & data frames as inputs for ml models. (#1088) (30c8883)
Cleanup temp resources with session deletion (#1068) (1d5373d)
Show possible correct key(s) in .__getitem__ KeyError message (#1097) (32fab96)
Bug Fixes
Performance Improvements
Speed up tree transforms during sql compile (#1071) (d73fe9d)
Utilize ORDER BY LIMIT over ROW_NUMBER where possible (#1077) (7003d1a)
Documentation
Show best practice of closing the session to cleanup resources in sample notebooks (#1095) (62a88e8)
Update docstrings of Session and related files (#1087) (bf93e80)
1.22.0 (2024-10-09)
Features
Support regional endpoints for more bigquery locations (#1061) (45b672a)
Update LLM generators to warn user about model name instead of raising error. (#1048) (650d80d)
Bug Fixes
Correct zero row count in DataFrame from table view (#1062) (b536070)
Fix generic error message when entering an incorrect column name (#1031) (5ac217d)
Make invalid location warning case-insensitive (#1044) (b6cd55a)
Show warning for unknown location set through .ctor (#1052) (02c2da7)
Performance Improvements
Documentation
1.21.0 (2024-10-02)
Features
Add deprecation warning to PaLM2TextGenerator model (#1035) (1183b0f)
Add DeprecationWarning for PaLM2TextEmbeddingGenerator (#1018) (4af5bbb)
Add ml.model_selection.cross_validate support (#1020) (1a38063)
Allow access of struct fields with dot operators on Series (#1019) (ef76f13)
Bug Fixes
Documentation
1.20.0 (2024-09-25)
Features
Add bigframes.ml.compose.SQLScalarColumnTransformer to create custom SQL-based transformations (#955) (1930b4e)
Allow multiple columns input for llm models (#998) (2fe5e48)
Bug Fixes
Documentation
Limit pypi notebook to 7 days and add more info about differences with partial ordering mode (#1013) (3c54399)
Move and edit existing linear-regression tutorial snippet (#991) (4cb62fd)
1.19.0 (2024-09-24)
Features
Support bool and bytes types in describe(include='all') (#994) (cc48f58)
Support ingress settings in remote_function (#1011) (8e9919b)
Bug Fixes
Performance Improvements
Dependencies
1.18.0 (2024-09-18)
Features
Add “include” param to describe for string types (#973) (deac6d2)
Add subset parameter to DataFrame.dropna to select which columns to consider (#981) (f7c03dc)
Bug Fixes
DataFrameGroupby.agg now works with unnamed tuples (#985) (0f047b4)
Fix a bug that raises exception when re-indexing columns with their original order (#988) (596b03b)
Make the Series.apply outcome assignable to the original dataframe in partial ordering mode (#874) (c94ead9)
Dependencies
1.17.0 (2024-09-11)
Features
Include the bigframes package version alongside the feedback link in error messages (#936) (7b59b6d)
Bug Fixes
Make read_gbq_function work for multi-param functions (#947) (c750be6)
Support read_gbq_function for axis=1 application (#950) (86e54b1)
Documentation
1.16.0 (2024-09-04)
Features
Add DataFrame.struct.explode to add struct subfields to a DataFrame (#916) (ad2f75e)
Implement bigframes.bigquery.json_extract_array (#910) (575a29e)
Bug Fixes
Fix issue with iterating on >10gb dataframes (#949) (2b0f0fa)
Unordered mode errors in ml train_test_split (#925) (85d7c21)
Performance Improvements
Dependencies
Documentation
Create sample notebook to manipulate struct and array data (#883) (3031903)
Use unstack() from BigQuery DataFrames instead of pandas in the PyPI sample notebook (#890) (d1883cc)
1.15.0 (2024-08-20)
Features
Documentation
Add columns for “requires ordering/index” to supported APIs summary (#892) (d2fc51a)
Remove duplicate description for kms_key_name (#898) (1053d56)
1.14.0 (2024-08-14)
Features
Bug Fixes
Performance Improvements
Documentation
1.13.0 (2024-08-05)
Features
df.apply(axis=1) to support remote function with multiple params (#851) (2158818)
Create a separate OrderingModePartialPreviewWarning for more fine-grained warning filters (#879) (8753bdd)
Bug Fixes
Documentation
1.12.0 (2024-07-31)
Features
Add config option to set partial ordering mode (#855) (823c0ce)
Add stratify param support to ml.model_selection.train_test_split method (#815) (27f8631)
Allow DataFrame.join for self-join on Null index (#860) (e950533)
Support remote function cleanup with session.close (#818) (ed06436)
Support to_csv/parquet/json to local files/objects (#858) (d0ab9cc)
Bug Fixes
Fewer relation joins from df self-operations (#823) (0d24f73)
Fix unordered mode using ordered path to print frame (#839) (93785cb)
Reduce redundant remote_function deployments (#856) (cbf2d42)
Documentation
Add partner attribution steps to integrations sample notebook (#835) (d7b333f)
Make get_global_session/close_session/reset_session appear in the docs (#847) (01d6bbb)
1.11.1 (2024-07-08)
Documentation
Remove session and connection in llm notebook (#821) (74170da)
Remove the experimental flask icon from the public docs (#820) (067ff17)
1.11.0 (2024-07-01)
Features
Add bigframes.streaming.to_pubsub method to create continuous query that writes to Pub/Sub (#801) (b47f32d)
Add DataFrame.to_arrow to create Arrow Table from DataFrame (#807) (1e3feda)
Add PolynomialFeatures support to to_gbq and pipelines (#805) (57d98b9)
Add Series.peek to preview data efficiently (#727) (580e1b9)
More informative error when query plan too complex (#811) (136dc24)
Bug Fixes
Documentation
1.10.0 (2024-06-21)
Features
Add ml.preprocessing.PolynomialFeatures class (#793) (b4fbb51)
Bigframes.streaming module for continuous queries (#703) (0433a1c)
Include index columns in DataFrame.sql if they are named (#788) (c8d16c0)
Bug Fixes
Allow __repr__ to work with uninitialized DataFrame/Series/Index (#778) (e14c7a9)
Df.loc with the 2nd input as bigframes boolean Series (#789) (a4ac82e)
Ensure numpy version matches in remote_function deployment (#798) (324d93c)
Fix temp table creation retries by now throwing if table already exists. (#787) (0e57d1f)
Self-join optimization doesn’t needlessly invalidate caching (#797) (1b96b80)
1.9.0 (2024-06-10)
Features
Bug Fixes
Improve to_pandas_batches for large results (#746) (61f18cb)
Resolve issue with unset thread-local options (#741) (d93dbaf)
Documentation
1.8.0 (2024-05-31)
Features
merge only generates a default index if both inputs already have an index (#733) (25d049c)
Add GroupBy.size() to get number of rows in each group (#479) (1fca588)
Add slot_millis and add stats to session object (#725) (72e9583)
Adds bigframes.bigquery.array_to_string to convert array elements to delimited strings (#731) (f12c906)
Allow functions decorated with bpd.remote_function() to execute locally (#704) (d850da6)
Ensure "bigframes-api" label is always set on jobs, even if the API is unknown (#722) (1832778)
Support type annotations to supply input and output types to bpd.remote_function() decorator (#717) (4a12e3c)
Support type annotations with bpd.remote_function() and axis=1 (a preview feature) (#730) (e5a2992)
Bug Fixes
Correct index labels in multiple aggregations for DataFrameGroupBy (#723) (6a78c89)
Set bpd.remote_function()'s input_types and output_types default to None to allow omitting them when type annotations are present (#729) (0e25a3b)
Warn and disable time travel for linked datasets (#712) (085fa9d)
Performance Improvements
Documentation
1.7.0 (2024-05-20)
Features
read_gbq_query supports filters (9386373)
read_gbq suggests a correct column name when one is not found (9386373)
Add DefaultIndexKind.NULL to use as index_col in read_gbq*, creating an indexless DataFrame/Series (#662) (29e4886)
Bigframes.bigquery.array_agg(SeriesGroupBy|DataFrameGroupby) (#663) (412f28b)
To_datetime supports utc=False for string inputs (#579) (adf9889)
Bug Fixes
read_gbq_table respects primary keys even when filters are set (#689) (9386373)
Improve escaping of literals and identifiers (#682) (da9b136)
Properly identify non-unique index in tables without primary keys (#699) (6e0f4d8)
Remove a usage of the resource package when not available, such as on Windows (#681) (96243f2)
Performance Improvements
Don’t run query immediately from read_gbq_table if filters is set (9386373)
Use a LIMIT clause when max_results is set (9386373)
Documentation
Add code snippets for imported onnx tutorials (#684) (cb36e46)
Add code snippets for imported tensorflow model (#679) (b02c401)
Use class_weight="balanced" in the logistic regression prediction tutorial (#678) (b951549)
1.6.0 (2024-05-13)
Features
Add strategy="quantile" in KBinsDiscretizer (#654) (c6c487f)
Suggest correct options in bpd.options.bigquery.location (#666) (57ccabc)
Support axis=1 in df.apply for scalar outputs (#629) (f6bdc4a)
Support gcf vpc connector in remote_function (#677) (9ca92d0)
Warn with a more specific DefaultLocationWarning category when no location can be detected (#648) (e084e54)
Bug Fixes
Dependencies
- Add jellyfish as a dependency for spelling correction (57ccabc)
Documentation
1.5.0 (2024-05-07)
Features
bigframes.options and bigframes.option_context now use thread-local variables to prevent context managers in separate threads from affecting each other (#652) (651fd7d)
Add ARIMAPlus.coef_ property exposing ML.ARIMA_COEFFICIENTS functionality (#585) (81d1262)
Add a unique session_id to Session and allow cleaning up sessions (#553) (c8d4e23)
Add the bigframes.bigquery sub-package with a bigframes.bigquery.array_length function (#630) (9963f85)
Always do a query dry run when option.repr_mode == "deferred" (#652) (651fd7d)
Warn with DefaultIndexWarning from read_gbq on clustered/partitioned tables with no index_col or filters set (#631, #658) (2715d2b, 73064dd)
Support index_col=False in read_csv and engine="bigquery" (73064dd)
Support gcf max instance count in remote_function (#657) (36578ab)
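The thread-local options behavior listed above can be illustrated with a minimal standard-library sketch. It is not the bigframes implementation: the _Options class, this option_context function, and the "head" default are assumptions made for the example.

```python
import contextlib
import threading


class _Options(threading.local):
    """Option holder: every thread gets its own independent attribute values."""

    def __init__(self):
        # Runs once per thread on first access, so each thread starts at the default.
        self.repr_mode = "head"  # placeholder default for this sketch


options = _Options()


@contextlib.contextmanager
def option_context(name, value):
    """Temporarily override one option for the current thread only."""
    previous = getattr(options, name)
    setattr(options, name, value)
    try:
        yield
    finally:
        setattr(options, name, previous)


def read_mode_in_other_thread():
    """Return the repr_mode value observed from a freshly started thread."""
    seen = []
    worker = threading.Thread(target=lambda: seen.append(options.repr_mode))
    worker.start()
    worker.join()
    return seen[0]


with option_context("repr_mode", "deferred"):
    inside_main = options.repr_mode
    inside_worker = read_mode_in_other_thread()
after = options.repr_mode

print(inside_main, inside_worker, after)
```

In this sketch the override is visible only in the thread that entered the context manager; the worker thread keeps the default, and the main thread returns to the default once the block exits.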
Bug Fixes
Don’t raise UnknownLocationWarning for US or EU multi-regions (#653) (8e4616b)
Fix bug with na in the column labels in stack (#659) (4a34293)
Documentation
Add python code sample for multiple forecasting time series (#531) (16866d2)
Fix the Palm2TextGenerator output token size (#649) (c67e501)
1.4.0 (2024-04-29)
Features
Add .cache() method to persist intermediate dataframe (#626) (a5c94ec)
Add transpose support for small homogeneously typed DataFrames. (#621) (054075d)
Series binary ops compatible with more types (#618) (518d315)
Support the score method for PaLM2TextGenerator (#634) (3ffc1d2)
Bug Fixes
Performance Improvements
Automatically condense internal expression representation (#516) (03c1b0d)
Cache transpose to allow performant retranspose (#635) (44b738d)
Documentation
Add the first sample for the Single time-series forecasting from Google Analytics data tutorial (#623) (2b84c4f)
1.3.0 (2024-04-22)
Features
Add fine tuning fit() for Palm2TextGenerator (#616) (9c106bd)
Expose max_batching_rows in remote_function (#622) (240a1ac)
Support primary key(s) in read_gbq by using them as the index_col by default (#625) (75bb240)
Warn if location is set to unknown location (#609) (3706b4f)
Bug Fixes
Documentation
Fix rendering of examples for multiple apis (#620) (9665e39)
Set index_cols in read_gbq as a best practice (#624) (70015b7)
1.2.0 (2024-04-15)
Features
Bug Fixes
Documentation
1.1.0 (2024-04-04)
Features
Add support for numpy expm1, log1p, floor, ceil, arctan2 ops (#505) (e8e66cf)
Allow DataFrame binary ops to align on either axis and with loc… (#544) (6d8f3af)
Expose DataFrame.bqclient to assist in integrations (#519) (0be8911)
read_pandas accepts pandas Series and Index objects (#573) (f8821fe)
Support ML.GENERATE_EMBEDDING in PaLM2TextEmbeddingGenerator (#539) (1156c1e)
Support max_columns in repr and make repr more efficient (#515) (54e49cf)
Bug Fixes
Don’t download 100gb onto local python machine in load test (#537) (082c58b)
Exclude list-like s parameter in plot.scatter (#568) (1caac27)
Fix case where df.peek would fail to execute even with force=True (#511) (8eca99a)
Plot.scatter s parameter cannot accept float-like column (#563) (8d39187)
Product operation produces float result for all input types (#501) (6873b30)
Rename PaLM2TextEmbeddingGenerator.predict output columns to be backward compatible (#561) (4995c00)
Respect hard stack size limit and swallow limit change exception. (#558) (4833908)
Use bytes limit on frame inlining rather than element count (#576) (659a161)
Performance Improvements
Dependencies
Documentation
bigframes.options.bigquery.project and location are optional in some circumstances (#548) (90bcec5)
Add “Supported pandas APIs” reference to the documentation (#542) (74c3915)
Add the code samples for metrics{auc, roc_auc_score, roc_curve} (#520) (5f37b09)
Address more comments from technical writers to meet legal purposes (#571) (9084df3)
Migrate the overview page to Bigframes official landing page (#536) (a0fb8bb)
1.0.0 (2024-03-25)
⚠ BREAKING CHANGES
rename model parameter min_rel_progress to tol
early_stop setting no longer supported, always uses True
rename model parameter n_parallell_trees to n_estimators
rename class_weights to class_weight
rename learn_rate to learning_rate
PCA n_components supports float value and None, default to None
rename various ml model parameters for consistency with sklearn (https://github.com/googleapis/python-bigquery-dataframes/pull/491)
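For code written against the pre-1.0.0 parameter names, the renames listed above can be applied mechanically. The helper below is a hypothetical migration aid, not something bigframes provides; migrate_kwargs and RENAMED_PARAMS are made-up names, and the old and new parameter names are copied from the list above.

```python
import warnings

# Old name -> new name, as listed in the 1.0.0 breaking changes.
RENAMED_PARAMS = {
    "min_rel_progress": "tol",
    "n_parallell_trees": "n_estimators",
    "class_weights": "class_weight",
    "learn_rate": "learning_rate",
}


def migrate_kwargs(kwargs: dict) -> dict:
    """Return a copy of `kwargs` with pre-1.0.0 parameter names updated."""
    migrated = {}
    for name, value in kwargs.items():
        if name == "early_stop":
            # The setting is no longer supported (it always behaves as True), so drop it.
            warnings.warn("early_stop is no longer supported and is ignored")
            continue
        new_name = RENAMED_PARAMS.get(name, name)
        if new_name in migrated:
            raise ValueError(f"{new_name!r} was given under both its old and new name")
        migrated[new_name] = value
    return migrated


old_style = {"learn_rate": 0.1, "min_rel_progress": 0.01, "max_iterations": 20}
print(migrate_kwargs(old_style))
```

Parameters that were not renamed pass through unchanged, so the helper can be applied to any keyword dictionary before constructing a model.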
Features
Allow assigning directly to Series.name property (#495) (ad0e99e)
Ensure Series.str.len() can get length of array columns (#497) (10c0446)
PCA n_components supports float value and None, default to None (65c6f47)
Rename class_weights to class_weight (65c6f47)
Rename learn_rate to learning_rate (65c6f47)
Rename model parameter min_rel_progress to tol (65c6f47)
Rename model parameter n_parallell_trees to n_estimators (65c6f47)
Rename various ml model parameters for consistency with sklearn (https://github.com/googleapis/python-bigquery-dataframes/pull/491) (65c6f47)
Support BQ regional endpoints for europe-west9, europe-west3, us-east4, and us-west1 (#504) (fbada4a)
Bug Fixes
early_stop setting no longer supported, always uses True (65c6f47)
Properly support format param for numerical input. (#486) (ae20c35)
Sampling plot cannot preserve ordering if index is not ordered (#475) (a5345fe)
Use actual BigQuery types rather than ibis types in to_pandas (#500) (82b4f91)
Dependencies
Documentation
Add code samples for metrics.{accuracy_score, confusion_matrix} (#478) (3e3329a)
Add code samples for metrics.{recall_score, precision_score, f1_score} (#502) (370fe90)
Update LLM + K-means notebook to handle partial failures (#496) (97afad9)
0.26.0 (2024-03-20)
⚠ BREAKING CHANGES
- exclude remote models for .register() (#465)
Features
read_gbq_table supports LIKE as an operator in filters (#454) (d2d425a)
Set force=True by default in DataFrame.peek() (#469) (4e8e97d)
Support datetime related casting in (Series|DataFrame|Index).astype (#442) (fde339b)
Bug Fixes
Any() on empty set now correctly returns False (#471) (f55680c)
Fix grouping series on multiple other series (#455) (3971bd2)
Groupby aggregates no longer check if grouping keys are numeric (#472) (4fbf938)
Raise ValueError when read_pandas() receives a bigframes DataFrame (#447) (b28f9fd)
Series.(to_csv|to_json) leverages bq export (#452) (718a00c)
Warn when read_gbq/read_gbq_table uses the snapshot time cache (#441) (e16a8c0)
Documentation
0.25.0 (2024-03-14)
Features
(Series|DataFrame).plot.(line|area|scatter) (#431) (0772510)
Support CMEK for remote_function cloud functions (#430) (2fd69f4)
0.24.0 (2024-03-12)
⚠ BREAKING CHANGES
read_parquet uses a “pandas” engine to parse files by default. Use engine="bigquery" for the previous behavior
Features
Bug Fixes
Move third_party.bigframes_vendored to bigframes_vendored (#424) (763edeb)
Only do row identity based joins when joining by index (#356) (76b252f)
Documentation
Add predict sample to samples/snippets/bqml_getting_started_test.py (#388) (6a3b0cc)
Fix the note rendering for DataFrames methods: nlargest, nsmallest (#417) (38bd2ba)
0.23.0 (2024-03-05)
Features
Bug Fixes
Dependencies
Documentation
0.22.0 (2024-02-27)
⚠ BREAKING CHANGES
rename cosine_similarity to paired_cosine_distances (#393)
move model optional args to kwargs (#381)
Features
Bug Fixes
Avoid ibis warning for “database” table() method argument (#390) (a0490a4)
Rename cosine_similarity to paired_cosine_distances (#393) (81ece46)
Performance Improvements
Dependencies
Documentation
Miscellaneous Chores
Code Refactoring
0.21.0 (2024-02-13)
Features
Add ml.metrics.pairwise.cosine_similarity function (#374) (126f566)
Support bigframes.pandas.to_datetime for scalars, iterables and series. (#372) (ffb0d15)
Bug Fixes
Documentation
0.20.1 (2024-02-06)
Performance Improvements
Documentation
0.20.0 (2024-01-30)
Features
Add DataFrame.peek() as an efficient alternative to head() results preview (#318) (9c34d83)
Add ARIMA_EVALUATE options in forecasting models (#336) (73e997b)
Add Index constructor, repr, copy, get_level_values, to_series (#334) (e5d054e)
Improve error message for drive based BQ table reads (#344) (0794788)
Update cut to work without labels = False and show intervals as dict (#335) (4ff53db)
Bug Fixes
Change default connection name in getting_started.ipynb (#347) (677f014)
Series iteration correctly returns values instead of index (#339) (2c6af9b)
Documentation
0.19.2 (2024-01-22)
Bug Fixes
Documentation
0.19.1 (2024-01-17)
Bug Fixes
Documentation
0.19.0 (2024-01-09)
Features
Allow manually set clustering_columns in dataframe.to_gbq (#302) (9c21323)
Support assigning to columns like a property (#304) (f645c56)
Support upcasting numeric columns in concat (#294) (e3a056a)
Bug Fixes
Documentation
0.18.0 (2024-01-02)
Features
Add IntervalIndex support to bigframes.pandas.cut (#254) (6c1969a)
Specific pyarrow mappings for decimal, bytes types (#283) (a1c0631)
Bug Fixes
Dataframes to_gbq now creates dataset if it doesn’t exist (#222) (bac62f7)
Exclude pandas 2.2.0rc0 to unblock prerelease tests (#292) (ac1a745)
Fix DataFrameGroupby.agg() issue with as_index=False (#273) (ab49350)
Make Series.str.replace work for simple strings (#285) (ad67465)
Update dataframe.to_gbq to dedup column names. (#286) (746115d)
Dependencies
Documentation
Add code snippets for explore query result page (#278) (7cbbb7d)
Code samples for astype common to DataFrame and Series (#280) (95b673a)
Code samples for DataFrame.copy and Series.copy (#290) (7cbc2b0)
Code samples for isna, isnull, dropna, isin (#289) (ad51035)
Code samples for reset_index and sort_values (#282) (acc0eb7)
Code samples for Series.{add, replace, unique, T, transpose} (#287) (0e1bbfc)
Code samples for Series.{map, to_list, count} (#290) (7cbc2b0)
Code samples for Series.groupby and Series.{sum,mean,min,max} (#280) (95b673a)
Code samples for DataFrame set_index, items (#295) (c2b1892)
0.17.0 (2023-12-14)
Features
Bug Fixes
Increase recursion limit, cache compilation tree hashes (#184) (b54791c)
Replaced raise NotImplementedError with return NotImplemented (#258) (a133822)
Documentation
0.16.0 (2023-12-12)
Features
Add DataFrame from_dict and from_records methods (#244) (8d81e24)
Add nunique method to Series/DataFrameGroupby (#256) (c8ec245)
Support dataframe.loc with conditional columns selection (#233) (3febea9)
Bug Fixes
Exclude pandas 2.1.4 from prerelease tests to unblock e2e tests (b02fc2c)
Fix value_counts column label for normalize=True (#245) (d3fa6f2)
Migrate e2e tests to bigframes-load-testing project (8766ac6)
Documentation
Add example for dataframe.melt, dataframe.pivot, dataframe.stac… (#252) (8c63697)
Add example to dataframe.nlargest, dataframe.nsmallest, datafra… (#234) (e735412)
Add examples for dataframe.cummin, dataframe.cummax, dataframe.cumsum, dataframe.cumprod (#243) (0523a31)
Add examples for dataframe.nunique, dataframe.diff, dataframe.a… (#251) (77074ec)
Correct the params rendering for ml.remote and ml.ensemble modules (#248) (c2829e3)
0.15.0 (2023-11-29)
⚠ BREAKING CHANGES
- model.predict returns all the columns (#204)
Features
Add info and memory_usage methods to dataframe (#219) (9d6613d)
Send warnings on LLM prediction partial failures (#216) (81125f9)
Bug Fixes
Avoid unnecessary row_number() on sort key for io (#211) (a18d40e)
Make to_pandas override enable_downsampling when sampling_method is manually set. (#200) (ae03756)
Update the llm+kmeans notebook with recent change (#236) (f8917ab)
Use anonymous dataset to create remote_function (#205) (69b016e)
Documentation
Add code samples for index and column properties (#212) (c88d38e)
Add code samples for df reshaping, function, merge, and join methods (#203) (010486c)
Add examples for dataframe.kurt, dataframe.std, dataframe.count (#232) (f9c6e72)
Add examples for dataframe.mean, dataframe.median, dataframe.va… (#228) (edd0522)
Add examples for dataframe.min, dataframe.max and dataframe.sum (#227) (3a375e8)
Code samples for Series.dot and DataFrame.dot (#226) (b62a07a)
Code samples for Series.where and Series.mask (#217) (52dfad2)
Code samples for dataframe.any, dataframe.all and dataframe.prod (#223) (d7957fa)
Make the code samples reflect default bq connection usage (#206) (71844b0)
Miscellaneous Chores
0.14.1 (2023-11-16)
Bug Fixes
Documentation
0.14.0 (2023-11-14)
Features
Add ‘index’, ‘pad’, ‘nearest’ interpolate methods (#162) (6a28403)
Add series.sample (identical to existing dataframe.sample) (#187) (37914a4)
Log most recent API calls as recent-bigframes-api-xx labels on BigQuery jobs (#145) (4ea33b7)
read_gbq creates order deterministically without table copy (#191) (8ab81de)
Support date_series.astype("string[pyarrow]") to cast DATE to STRING (#186) (aee0e8e)
Temporary resources no longer use BigQuery Sessions (#194) (4a02cac)
Bug Fixes
Default to 7 days expiration for read_csv, read_json, read_parquet (#193) (03606cd)
Deprecate the remote_service_type in llm model (#180) (a8a409a)
For reset_index on unnamed multiindex, always use level_[n] label (#182) (f95000d)
Match pandas behavior when assigning listlike to empty dfs (#172) (c1d1f42)
Use anonymous dataset instead of session dataset for temp tables (#181) (800d44e)
Use random table when loading data for read_csv, read_json, read_parquet (#175) (9d2e6dc)
Documentation
Add code samples for read_gbq_function using community UDFs (#188) (7506eab)
Add docstring code samples for Series.apply and DataFrame.map (#185) (c816d84)
Add llm kmeans notebook as an included example (#177) (d49ae42)
Use head() to get top n results, not to preview results (#190) (87f84c9)
0.13.0 (2023-11-07)
Features
to_gbq without a destination table writes to a temporary table (#158) (e1817c9)
Add DataFrame.__iter__, DataFrame.iterrows, DataFrame.itertuples, and DataFrame.keys methods (#164) (c065071)
Support 32k text-generation and multilingual embedding models (#161) (5f0ea37)
Bug Fixes
0.12.0 (2023-11-01)
Features
Add DataFrame.to_pandas_batches() to download large DataFrame objects (#136) (3afd4a3)
Add bigframes.options.compute.maximum_bytes_billed option that sets maximum bytes billed on query jobs (#133) (63c7919)
Bug Fixes
Fix bug with column names under repeated column assignment (#150) (29032d0)
Resolve plotly rendering issue by using ipython html for job pro… (#134) (39df43e)
Use indexee’s session for loc listlike cases (#152) (27c5725)
Documentation
Fix indentation on read_gbq_function code sample (#163) (0801d96)
Link to ML.EVALUATE BQML page for score() methods (#137) (45c617f)
0.11.0 (2023-10-26)
Features
Add back reset_session as an alias for close_session (#124) (694a85a)
Change query parameter to query_or_table in read_gbq (#127) (f9bb3c4)
Bug Fixes
Expose bigframes.pandas.reset_session as a public API (#128) (b17e1f4)
Use series’s own session in series.reindex listlike case (#135) (95bff3f)
Documentation
Add runnable code samples for DataFrames I/O methods and property (#129) (6fea8ef)
Add runnable code samples for reading methods (#125) (a669919)
0.10.0 (2023-10-19)
Features
0.9.0 (2023-10-18)
⚠ BREAKING CHANGES
- rename bigframes.pandas.reset_session to close_session (#101)
Features
Add bigframes.options.bigquery.application_name for partner attribution (#117) (52d64ff)
Rename bigframes.pandas.reset_session to close_session (#101) (36693bf)
Send BigQuery cancel request when canceling bigframes process (#103) (e325fbb)
Support external packages in remote_function (#98) (ec10c4a)
Use ArrowDtype for STRUCT columns in to_pandas (#85) (9238fad)
Bug Fixes
Performance Improvements
Documentation
0.8.0 (2023-10-12)
⚠ BREAKING CHANGES
- The default behavior of to_parquet is changing from no compression to 'snappy' compression.
Features
- Support compression in to_parquet (a8c286f)
Bug Fixes
0.7.0 (2023-10-11)
Features
Bug Fixes
Documentation
0.6.0 (2023-10-04)
Features
Bug Fixes
0.5.0 (2023-09-28)
Features
Add DataFrame.kurtosis/DF.kurt method (c1900c2)
Add DataFrame.rolling and DataFrame.expanding methods (c1900c2)
Add index dtype, astype, drop, fillna, aggregate attributes. (#38) (1a254a4)
Support calculate_p_values parameter in bigframes.ml.linear_model.LinearRegression (c1900c2)
Support class_weights="balanced" in LogisticRegression model (c1900c2)
Support df[column_name] = df_only_one_column (c1900c2)
Support early_stop parameter in bigframes.ml.linear_model.LinearRegression (c1900c2)
Support enable_global_explain parameter in bigframes.ml.linear_model.LinearRegression (c1900c2)
Support l2_reg parameter in bigframes.ml.linear_model.LinearRegression (c1900c2)
Support learn_rate_strategy parameter in bigframes.ml.linear_model.LinearRegression (c1900c2)
Support ls_init_learn_rate parameter in bigframes.ml.linear_model.LinearRegression (c1900c2)
Support max_iterations parameter in bigframes.ml.linear_model.LinearRegression (c1900c2)
Support min_rel_progress parameter in bigframes.ml.linear_model.LinearRegression (c1900c2)
Support optimize_strategy parameter in bigframes.ml.linear_model.LinearRegression (c1900c2)
Bug Fixes
Generate unique ids on join to avoid id collisions (#65) (7ab65e8)
Loosen filter items tests to accommodate shifting pandas impl (#41) (edabdbb)
Performance Improvements
Add ability to cache dataframe and series to session table (#51) (416d7cb)
Inline small Series and DataFrames in query text (#45) (5e199ec)
Reimplement unpivot to use cross join rather than union (#47) (f9a93ce)
Simplify join order to use multiple order keys instead of string. (#36) (5056da6)
Documentation
- Link to Remote Functions code samples from README and API reference (c1900c2)
0.4.0 (2023-09-16)
Features
Add axis parameter to droplevel and reorder_levels (7c6b0dd)
Add bfill and ffill to DataFrame and Series (7c6b0dd)
Add DataFrame.combine and DataFrame.combine_first (#27) (7c6b0dd)
Add DataFrame.nlargest, nsmallest (7c6b0dd)
Add DataFrame.pct_change and Series.pct_change (7c6b0dd)
Add DataFrame.skew and GroupBy.skew (7c6b0dd)
Add DataFrame.to_dict, to_excel, to_latex, to_records, to_string, to_markdown, to_pickle, to_orc (7c6b0dd)
Add diff method to DataFrame and GroupBy (7c6b0dd)
Add filter and reindex to Series and DataFrame (7c6b0dd)
Add reindex_like to DataFrame and Series (7c6b0dd)
Add swaplevel to DataFrame and Series (7c6b0dd)
Add partial support for Series.replace (7c6b0dd)
Support DataFrame.loc[bool_series, column] = scalar (7c6b0dd)
Support a persistent name in remote_function (7c6b0dd)
Bug Fixes
remote_function uses same credentials as other APIs (7c6b0dd)
Add type hints to models (7c6b0dd)
Raise error when ARIMAPlus is used with Pipeline (7c6b0dd)
Remove transforms parameter in model.fit (breaking change) (7c6b0dd)
Support column joins with “None indexer” (7c6b0dd)
Use Int64Dtype for literals in cut (7c6b0dd)
Use lowercase strings for parameter literals in bigframes.ml (breaking change) (7c6b0dd)
Performance Improvements
bigframes-api label to I/O query jobs (7c6b0dd)
Documentation
Document possible parameter values for PaLM2TextGenerator (7c6b0dd)
Document region logic in README (7c6b0dd)
Fix OneHotEncoder sample (7c6b0dd)
0.3.2 (2023-09-06)
Bug Fixes
0.3.1 (2023-09-05)
Bug Fixes
0.3.0 (2023-09-02)
Features
Add bigframes.get_global_session() and bigframes.reset_session() aliases (a32b747)
Add bigframes.pandas.read_pickle function (a32b747)
Add components_, explained_variance_, and explained_variance_ratio_ properties to bigframes.ml.decomposition.PCA (89b9503)
Add fit_transform to bigquery.ml transformers (a32b747)
Add Series.dropna and DataFrame.fillna (8fab755)
Add Series.str methods isalpha, isdigit, isdecimal, isalnum, isspace, islower, isupper, zfill, center (a32b747)
Support bigframes.pandas.merge() (8fab755)
Support DataFrame.isin with list and dict inputs (8fab755)
Support DataFrame.pivot (a32b747)
Support DataFrame.stack (89b9503)
Support DataFrame-DataFrame binary operations (8fab755)
Support df[my_column] = [a python list] (89b9503)
Support Index.is_monotonic (8fab755)
Support np.arcsin, np.arccos, np.arctan, np.sinh, np.cosh, np.tanh, np.arcsinh, np.arccosh, np.arctanh, np.exp with Series argument (89b9503)
Support np.sin, np.cos, np.tan, np.log, np.log10, np.sqrt, np.abs with Series argument (89b9503)
Support pow() and power operator in DataFrame and Series (8fab755)
Support read_json with engine=bigquery for newline-delimited JSON files (89b9503)
Support Series.corr (89b9503)
Support Series.map (8fab755)
Support for np.add, np.subtract, np.multiply, np.divide, np.power (8fab755)
Support MultiIndex for DataFrame columns (a32b747)
Use pandas.Index for column labels (a32b747)
Use default session and connection in ml.llm and ml.imported (8fab755)
Bug Fixes
Add error message to set_index (a32b747)
Align column names with pandas in DataFrame.agg results (89b9503)
Allow (but still not recommended) ORDER BY in read_gbq input when an index_col is defined (89b9503)
Check for IAM role on the BigQuery connection when initializing a remote_function (89b9503)
Check that types are specified in read_gbq_function (a32b747)
Don’t use query cache for Session construction (a32b747)
Include survey link in abstract NotImplementedError exception messages (89b9503)
Label temp table creation jobs with source=bigquery-dataframes-temp label (89b9503)
Make X_train argument names consistent across methods (8fab755)
Raise AttributeError for unimplemented pandas methods (89b9503)
Raise exception for invalid function in read_gbq_function (a32b747)
Support spaces in column names in DataFrame initializer (89b9503)
Performance Improvements
Add local cache for __repr_*__ methods (a32b747)
Lazily instantiate client library objects (89b9503)
Use row_number() filter for head/tail (8fab755)
Documentation
Add ML section under Overview (a32b747)
Add release status to table of contents (a32b747)
Add samples and best practices to read_gbq docs (a32b747)
Correct the return types of Dataframe and Series (a32b747)
Create subfolders for notebooks (a32b747)
Fix link to GitHub (89b9503)
Highlight bigframes is open-source (a32b747)
Sample ML Drug Name Generation notebook (a32b747)
Set options.bigquery.project in sample code (89b9503)
Transform remote function user guide into sample code (a32b747)
Update remote function notebook with read_gbq_function usage (8fab755)
0.2.0 (2023-08-17)
Features
Add KMeans.cluster_centers_.
Allow column labels to be any type handled by bq df, column labels can be integers now.
Add dataframegroupby.agg().
Add Series Property is_monotonic_increasing and is_monotonic_decreasing.
Add match, fullmatch, get, pad str methods.
Add series isin function.
Bug Fixes
Update ML package to use sessions for queries.
Optimize read_gbq with index_col set to cluster by index_col.
Raise ValueError if the location mismatched.
read_gbq no longer uses ‘time travel’ with query inputs.
Documentation
- Add docstring to _uniform_sampling to avoid user using it.
0.1.1 (2023-08-14)
Documentation
- Correct link to code repository in setup.py and use correct terminology for console.cloud.google.com links.
0.1.0 (2023-08-11)
Features
Add bigframes.pandas package with an API compatible with pandas. Supported data sources include: BigQuery SQL queries, BigQuery tables, CSV (local and GCS), Parquet (local and Cloud Storage), and more.
Add bigframes.ml package with an API inspired by scikit-learn. Train machine learning models and run batch prediction, powered by BigQuery ML.
0.0.0 (2023-02-22)
- Empty package to reserve package name.