Changelog
2.15.0 (2025-08-11)
Features
Add st_buffer, st_centroid, and st_convexhull and their corresponding GeoSeries methods (#1963) (c4c7fa5)
Allow callable as a conditional or replacement input in DataFrame.where (#1971) (a8d57d2)
Bug Fixes
Add warnings for duplicated or conflicting type hints in bigfram… (#1956) (d38e42c)
Make remote_function more robust when there are create_function retries (#1973) (cd954ac)
Make ExecutionMetrics stats tracking more robust to missing stats (#1977) (feb3ff4)
Performance Improvements
Documentation
2.14.0 (2025-08-05)
Features
Dynamic table width for better display across devices (https://github.com/googleapis/python-bigquery-dataframes/issues/1948) (a6d30ae)
Bug Fixes
Performance Improvements
Documentation
Add code snippet for storing dataframes to a CSV file (#1943) (a511e09)
Add code snippet for storing dataframes to a CSV file (#1953) (a298a02)
2.13.0 (2025-07-25)
Features
Add CSS styling for TableWidget pagination interface (#1934) (5b232d7)
Add row numbering local pushdown in hybrid execution (#1932) (92a2377)
Bug Fixes
Dependencies
2.12.0 (2025-07-23)
Features
Add code samples for dbt bigframes integration (#1898) (7e03252)
Allow local arithmetic execution in hybrid engine (#1906) (ebdcd02)
Provide day_of_year and day_of_week for dt accessor (#1911) (40e7638)
Support params max_batching_rows, container_cpu, and container_memory for udf (#1897) (8baa912)
Support typed pyarrow.Scalar in assignment (#1930) (cd28e12)
Bug Fixes
Correct min field from max() to min() in remote function tests (#1917) (d5c54fc)
Resolve location reset issue in bigquery options (#1914) (c15cb8a)
Series.str.isdigit in unicode superscripts and fractions (#1924) (8d46c36)
Documentation
Add code snippets for session and IO public docs (#1919) (6e01cbe)
Add snippets for performance optimization doc (#1923) (4da309e)
2.11.0 (2025-07-15)
Features
Add __contains__ to Index, Series, DataFrame (#1899) (07222bf)
Add pagination buttons (prev/next) to anywidget mode for DataFrames (#1841) (8eca767)
Add total_rows property to pandas batches iterator (#1888) (e3f5e65)
Support bpd.Series(json_data, dtype="json") (#1882) (05cb7d0)
Bug Fixes
Show slot_millis_sum warning only when allow_large_results=False (#1892) (25efabc)
Used query row count metadata instead of table metadata (#1893) (e1ebc53)
2.10.0 (2025-07-08)
Features
df.to_pandas_batches() returns one empty DataFrame if df is empty (#1878) (e43d15d)
Add simple stats support to hybrid local pushdown (#1873) (8715105)
Bug Fixes
Documentation
2.9.0 (2025-06-30)
Features
Add bpd.read_arrow to convert an Arrow object into a bigframes DataFrame (#1855) (633bf98)
Create deploy_remote_function and deploy_udf functions to immediately deploy functions to BigQuery (#1832) (c706759)
Bug Fixes
Fix bug with DataFrame.agg for string values (#1870) (81e4d64)
Generate GoogleSQL instead of legacy SQL data types for dry_run=True from bpd._read_gbq_colab with local pandas DataFrame (#1867) (fab3c38)
Revert dict back to protobuf in the iam binding update (#1838) (9fb3cb4)
Documentation
2.8.0 (2025-06-23)
⚠ BREAKING CHANGES
- add required param ‘engine’ to multimodal functions (#1834)
Features
Add bpd.options.compute.maximum_result_rows option to limit client data download (#1829) (e22a3f6)
Add bpd.options.display.repr_mode = "anywidget" to create an interactive display of the results (#1820) (be0a3cf)
Add required param ‘engine’ to multimodal functions (#1834) (37666e4)
Performance Improvements
Documentation
2.7.0 (2025-06-16)
Features
Add bbq.json_query_array and warn bbq.json_extract_array deprecated (#1811) (dc9eb27)
Add bbq.json_value_array and deprecate bbq.json_extract_string_array (#1818) (019051e)
Support custom build service account in remote_function (#1796) (e586151)
Bug Fixes
Documentation
Document how to use ai.map() for information extraction (#1808) (b586746)
Rearrange README.rst to include a short code sample (#1812) (f6265db)
Use pandas API instead of pandas-like or pandas-compatible (#1825) (aa32369)
2.6.0 (2025-06-09)
Features
Bug Fixes
Address read_csv with both index_col and use_cols behavior inconsistency with pandas (#1785) (ba7c313)
Allow KMeans model init parameter as k-means++ alias (#1790) (0b59cf1)
Replace function now can handle bpd.NA value. (#1786) (7269512)
Documentation
Adjust strip method examples to match latest pandas (#1797) (817b0c0)
Fix docstrings to improve html rendering of code examples (#1788) (38d9b73)
2.5.0 (2025-05-30)
⚠ BREAKING CHANGES
- the updated ai.map() parameter list is not backward-compatible
Features
Add bpd.options.bigquery.requests_transport_adapters option (#1755) (bb45db8)
Add bbq.json_query and warn bbq.json_extract deprecated (#1756) (ec81dd2)
Add deprecation warning to Gemini-1.5-X, text-embedding-004, and remove legacy models in notebooks and docs (#1723) (80aad9a)
Add structured output for ai map, ai filter and ai join (#1746) (133ac6b)
Add support for df.loc[list, column(s)] (768a757)
Include bq schema and query string in dry run results (#1752) (bb51147)
Support inplace=True in rename and rename_axis (#1744) (734cc65)
Support astype conversions to and from JSON dtypes (#1716) (8ef4de1)
Support dtype parameter in read_csv for bigquery engine (#1749) (50dca4c)
Bug Fixes
Fix the default value for na_value for numpy conversions (#1766) (0629cac)
Include location in Session-based temporary storage manager DDL queries (#1780) (acba032)
Prevent creating unnecessary client objects in multithreaded environments (#1757) (1cf9f5e)
Reduce bigquery table modification via DML for to_gbq (#1737) (545cdca)
Stop ignoring arguments to MatrixFactorization.score(X, y) (#1726) (55c07e9)
Support JSON and STRUCT for bbq.sql_scalar (#1754) (190390b)
Performance Improvements
Faster local data comparison using identity (#1738) (2858b1e)
Use JOB_CREATION_OPTIONAL when allow_large_results=False (#1763) (15f3f2a)
Dependencies
Documentation
Add MatrixFactorization to the table of contents (#1725) (611e43b)
Fix typo for “population” in the GeminiTextGenerator.predict(..., output_schema={...}) sample notebook (#1748) (bd07e05)
Integrations notebook extracts token from bqclient._http.credentials instead of bqclient._credentials (#1784) (6e63eca)
Use partial ordering mode in the quickstart sample (#1734) (476b7dd)
2.4.0 (2025-05-12)
Features
Add .dt.days, .dt.seconds, dt.microseconds, and dt.total_seconds() for timedelta series. (#1713) (2b3a45f)
Improve error message in Series.apply for direct udfs (#1673) (1a658b2)
Publish bigframes blob(Multimodal) to preview (#1693) (e4c85ba)
Support forecast_limit_lower_bound and forecast_limit_upper_bound in ARIMA_PLUS (and ARIMA_PLUS_XREG) models (#1305) (b16740e)
Support to_strip parameter for str.strip, str.lstrip and str.rstrip (#1705) (a84ee75)
Bug Fixes
Performance Improvements
Dependencies
Documentation
Add snippets for Matrix Factorization tutorials (#1630) (24b37ae)
Deprecate bpd.options.bigquery.allow_large_results in favor of bpd.options.compute.allow_large_results (#1597) (18780b4)
Include import statement in the bigframes code snippet (#1699) (08d70b6)
Include the clean-up step in the udf code snippet (#1698) (48992e2)
Move multimodal notebook out of experimental folder (#1712) (68b6532)
2.3.0 (2025-05-06)
Features
Bug Fixes
Guarantee guid thread safety across threads (#1684) (cb0267d)
Support large lists of lists in bpd.Series() constructor (#1662) (0f4024c)
Use value equality to check types for unix epoch functions and timestamp diff (#1690) (81e8fb8)
Performance Improvements
Documentation
Add a visualization notebook to BigFrame samples (#1675) (ee062bf)
Update snippet for Create a k-means model tutorial (#1664) (761c364)
2.2.0 (2025-04-30)
Features
Add gemini-2.0-flash-001 and gemini-2.0-flash-lite-001 to fine tune score endpoints and multimodal endpoints (#1650) (4fb54df)
Add GeminiTextGenerator.predict structured output (#1653) (6199023)
DataFrames.getitem support for slice input (#1668) (563f0cb)
Print right origin of PreviewWarning for the bpd.udf (#1629) (48d10d1)
Session.bytes_processed_sum will be updated when allow_large_re… (#1669) (ae312db)
Support names parameter in read_csv for bigquery engine (#1659) (3388191)
Support passing list of values to bigframes.core.sql.simple_literal (#1641) (102d363)
Bug Fixes
Prefer remote schema instead of throwing on materialize conflicts (#1644) (53fc25b)
Resolve issue where pre-release versions of google-auth are installed (#1491) (ebb7a5e)
Performance Improvements
Dependencies
Documentation
Fix bq_dataframes_template notebook to work if partial ordering mode is enabled (#1665) (f442e7a)
Note that udf is in preview and must be python 3.11 compatible (#1629) (48d10d1)
2.1.0 (2025-04-22)
Features
Add bigframes.bigquery.st_distance function (#1637) (bf1ae70)
Enhance read_csv index_col parameter support (#1631) (f4e5b26)
Bug Fixes
Add retry for test_clean_up_via_context_manager (#1627) (58e7cb0)
Improve robustness of managed udf code extraction (#1634) (8cc56d5)
Documentation
2.0.0 (2025-04-17)
⚠ BREAKING CHANGES
- make dataset and name params mandatory in udf (#1619)
- Locational endpoints support is not available in BigFrames 2.0.
- change default LLM model to gemini-2.0-flash-001, drop PaLM2TextGenerator and PaLM2TextEmbeddingGenerator (#1558)
- change default ingress setting for remote_function to internal-only (#1544)
- make remote_function params keyword only (#1537)
- make remote_function default service account explicit (#1537)
- set allow_large_results=False by default (#1541)
Features
Add on parameter in dataframe.rolling() and dataframe.groupby.rolling() (#1556) (45c9d9f)
Add support for creating a Matrix Factorization model (#1330) (b5297f9)
Allow input_types, output_type, and dataset to be used positionally in remote_function (#1560) (bcac8c6)
Allow pandas.cut ‘labels’ parameter to accept a list of string (#1549) (af842b1)
Change default ingress setting for remote_function to internal-only (#1544) (c848a80)
Detect duplicate column/index names in read_gbq before send query. (#1615) (40d6960)
Enable time range rolling for DataFrame, DataFrameGroupBy and SeriesGroupBy (#1605) (b4b7073)
Make remote_function default service account explicit (#1537) (9eb9089)
Support bigquery connection in managed function (#1554) (f6f697a)
Support inlining small list, struct, json data (#1589) (2ce891f)
Use session temp tables for all ephemeral storage (#1569) (9711b83)
Use validated local storage for data uploads (#1612) (aee4159)
Warn the deprecated max_download_size, random_state and sampling_method parameters in (DataFrame|Series).to_pandas() (#1573) (b9623da)
Bug Fixes
to_pandas_batches() respects page_size and max_results again (#1572) (27c5905)
Ensure page_size works correctly in to_pandas_batches when max_results is not set (#1588) (570cff3)
Include role and service account in IAM exception (#1564) (8c50755)
Make dataset and name params mandatory in udf (#1619) (637e860)
Pandas.cut returns labels index for numeric breaks when labels=False (#1548) (b2375de)
Prevent KeyError in bpd.concat with empty DF and struct/array types DF (#1568) (b4da1cf)
Read_csv supports for tilde local paths and includes index for bigquery_stream write engine (#1580) (352e8e4)
Use dictionaries to avoid problematic google.iam namespace (#1611) (b03e44f)
Performance Improvements
Dependencies
Documentation
Add details for bigquery_connection in @bpd.udf docstring (#1609) (ef63772)
Add explain forecast snippet to multiple time series tutorial (#1586) (40c55a0)
Add message to remove default model for version 3.0 (#1563) (910be2b)
Add samples for ArimaPlus time_series_id_col feature (#1577) (1e4cd9c)
Deprecate default model in TextEmbeddingGenerator, GeminiTextGenerator, and other bigframes.ml.llm classes (#1570) (89ab33e)
Include all licenses for vendored packages in the root LICENSE file (#1626) (8116ed0)
Remove gemini-1.5 deprecation warning for GeminiTextGenerator (#1562) (0cc6784)
Use restructured text to allow publishing to PyPI (#1565) (d1e9ec2)
Miscellaneous Chores
1.42.0 (2025-03-27)
Features
Add GeoSeries.difference() and bigframes.bigquery.st_difference() (#1471) (e9fe815)
Add GeoSeries.intersection() and bigframes.bigquery.st_intersection() (#1529) (8542bd4)
Allow iloc to support lists of negative indices (#1497) (a9cf215)
Bug Fixes
Add deprecation warning to TextEmbeddingGenerator model, especially gemini-1.0-X and gemini-1.5-X (#1534) (c93e720)
Change the default value for pdf extract/chunk (#1517) (a70a607)
Read_pandas inline returns None when exceeds limit (#1525) (578081e)
Temporary fix for StreamingDataFrame not working backend bug (#1533) (6ab4ffd)
Tolerate BQ connection service account propagation delay (#1505) (6681f1f)
Performance Improvements
Documentation
1.41.0 (2025-03-19)
Features
Add support for the ‘right’ parameter in ‘pandas.cut’ (#1496) (8aff128)
Support BQ managed functions through read_gbq_function (#1476) (802183d)
Warn when the BigFrames version is more than a year old (#1455) (00e0750)
Bug Fixes
Performance Improvements
Documentation
1.40.0 (2025-03-11)
⚠ BREAKING CHANGES
- reading JSON data as a custom arrow extension type (#1458)
Features
Bug Fixes
Fix list-like indexers in partial ordering mode (#1456) (fe72ada)
Use == instead of is for timedelta type equality checks (#1480) (0db248b)
Performance Improvements
1.39.0 (2025-03-05)
Features
(Preview) Support aggregations over timedeltas (#1418) (1251ded)
(Preview) Support arithmetics between dates and timedeltas (#1413) (962b152)
(Preview) Support automatic load of timedelta from BQ tables. (#1429) (b2917bb)
Add allow_large_results option to many I/O methods. Set to False to reduce latency (#1428) (dd2f488)
Support interface for BigQuery managed functions (#1373) (2bbf53f)
Warn if default ingress_settings is used in remote_functions (#1419) (dfd891a)
Bug Fixes
Do not compare schema description during schema validation (#1452) (03a3a56)
Remove warnings for null index and partial ordering mode in prep for GA (#1431) (6785aee)
Warn if default cloud_function_service_account is used in remote_function (#1424) (fe7463a)
Write chunked text instead of dummy text for pdf chunk (#1444) (96b0e8a)
Performance Improvements
Documentation
1.38.0 (2025-02-24)
Features
(Preview) Support diff aggregation for timestamp series. (#1405) (abe48d6)
Add GeoSeries.from_wkt() and GeoSeries.to_wkt() (#1401) (2993b28)
Support routines with ARRAY return type in read_gbq_function (#1412) (4b60049)
Bug Fixes
Calling to_timedelta() over timedeltas no longer changes their values (#1411) (650a190)
Replace empty dict with None to avoid mutable default arguments (#1416) (fa4e3ad)
Performance Improvements
Dependencies
Documentation
Add samples using SQL methods via the bigframes.bigquery module (#1358) (f54e768)
Add snippets for visualizing a time series and creating a time series model for the Limit forecasted values in time series model tutorial (#1310) (c6c9120)
1.37.0 (2025-02-19)
Features
(Preview) Support add, sub, mult, div, and more between timedeltas (#1396) (ffa63d4)
(Preview) Support comparison, ordering, and filtering for timedeltas (#1387) (34d01b2)
(Preview) Support subtraction in DATETIME/TIMESTAMP columns with timedelta columns (#1390) (50ad3a5)
JSON dtype support for read_pandas and Series constructor (#1391) (44f4137)
Bug Fixes
Performance Improvements
Documentation
1.36.0 (2025-02-11)
Features
(Preview) Support addition between a timestamp and a timedelta (#1369) (b598aa8)
(Preview) Support casting floats and list-likes to timedelta series (#1362) (65933b6)
Add bigframes.bigquery.st_area and suggest it from GeoSeries.area (#1318) (8b5ffa8)
Bug Fixes
Dtype parameter ineffective in Series/DataFrame construction (#1354) (b9bdca8)
Translate labels to col ids when copying dataframes (#1372) (0c55b07)
Performance Improvements
1.35.0 (2025-02-04)
Features
(Preview) Support timedeltas for read_pandas() (#1349) (866ba9e)
Allow case_when to change dtypes if case list contains the condition (True, some_default_value) (#1311) (5c2a2c6)
Bug Fixes
Exclude DataFrame and Series __call__ from unimplemented API metrics (#1351) (f2d5264)
Make DataFrame __getattr__ and __setattr__ more robust to subclassing (#1352) (417de3a)
Performance Improvements
Dependencies
Documentation
Add link to DataFrames intro to improve SEO (#1176) (aafb5be)
Add snippet to explain the univariate model’s forecast result in the Forecast a single time series with a univariate model tutorial (#1272) (c22126b)
1.34.0 (2025-01-27)
⚠ BREAKING CHANGES
- Enable reading JSON data with dbjson extension dtype (#1139)
Features
(df|s).hist(), (df|s).line(), (df|s).area(), (df|s).bar(), df.scatter() (#1320) (bd3f584)
(Preview) Define timedelta type and to_timedelta function (#1317) (3901951)
Enable reading JSON data with dbjson extension dtype (#1139) (f672262)
1.33.0 (2025-01-22)
Features
Add bigframes.bigquery.sql_scalar() to apply SQL syntax on Series objects (#1293) (aa2f73a)
Add unix_seconds, unix_millis and unix_micros for timestamp series. (#1297) (e4b0c8d)
Bug Fixes
Dataframe sort_values Series input keyerror. (#1285) (5a2731b)
Fix read_gbq_function issue in dataframe apply method (#1174) (0318764)
Series sort_index and sort_values now raise when axis!=0 (#1294) (94bc2f2)
Documentation
Add snippet to forecast future time series in the Forecast a single time series with a univariate model tutorial (#1271) (a687050)
1.32.0 (2025-01-13)
Features
Bug Fixes
Avoid global mutation in BigQueryOptions.client_endpoints_override (#1280) (788f6e9)
Fix erroneous window bounds removal during compilation (#1163) (f91756a)
Dependencies
Documentation
Add bq studio links that allow users to generate Jupyter notebooks in bq studio with github contents (#1266) (58f13cb)
Add snippet to evaluate ARIMA plus model in the Forecast a single time series with a univariate model tutorial (#1267) (3dcae2d)
Add snippet to see the ARIMA coefficients in the Forecast a single time series with a univariate model tutorial (#1268) (059a564)
Use 002 model for better scalability in text generation (#1270) (bb7a850)
1.31.0 (2025-01-05)
Features
Bug Fixes
Raise if trying to change ordering_mode after session has started (#1252) (8cfaae8)
Reduce the number of labels added to query jobs (#1245) (fdcdc18)
Documentation
1.30.0 (2024-12-30)
Features
Add LinearRegression.predict_explain() to generate ML.EXPLAIN_PREDICT columns (#1190) (e13eca2)
Add LogisticRegression.predict_explain() to generate ML.EXPLAIN_PREDICT columns (#1222) (bcbc732)
Add write_engine parameter to read_FORMATNAME methods to control how data is written to BigQuery (#371) (ed47ef1)
Add client side retry to GeminiTextGenerator (#1242) (8193abe)
Add Gemini-pro-1.5 to GeminiTextGenerator Tuning and Support score() method in Gemini-pro-1.5 (#1208) (298fc73)
Add support for LinearRegression.predict_explain and LogisticRegression.predict_explain parameter, top_k_features (#1228) (3068e19)
Bug Fixes
Throw an error message when setting is_row_processor=True to read a multi param function (#1160) (b2816a5)
Documentation
Add an “open in BQ Studio” link to all BigFrames sample notebooks (#1223) (e0a8288)
Add bq studio link for a new ipynb file called “bq_dataframes_template.ipynb” (#1239) (840aaff)
Add python snippet for “Create the time series model” section of the Forecast a single time series with a univariate model tutorial (#1227) (20f3190)
1.29.0 (2024-12-12)
Features
Documentation
1.28.0 (2024-12-11)
Features
bigframes.bigquery.vector_search supports use_brute_force and fraction_lists_to_search parameters (#1158) (131edc3)
Add ARIMAPlus.predict_explain() to generate forecasts with explanation columns (#1177) (05f8b4d)
Add client_endpoints_override to bq options (#1167) (be74b99)
Add support for temporal types in dataframe’s describe() method (#1189) (2d564a6)
Allow join-free alignment of analytic expressions (#1168) (daef4f0)
Bug Fixes
Performance Improvements
Dependencies
Documentation
Add a code sample using bpd.options.bigquery.ordering_mode = "partial" (#909) (f80d705)
Add snippet for creating boosted tree model (#1142) (a972668)
Add snippet for evaluating a boosted tree model (#1154) (9d8970a)
Add snippet for predicting classifications using a boosted tree model (#1156) (e7b83f1)
Add third party pandas.Index methods and docstrings (#1171) (a970294)
Fix Bigframes.Pandas.General_Function missing docs (#1164) (de923d0)
1.27.0 (2024-11-16)
Features
Bug Fixes
Documentation
1.26.0 (2024-11-12)
Features
Bug Fixes
Fix Series.to_frame generating string label instead of int where name is None (#1118) (14e32b5)
Update the API documentation with newly added rep (#1120) (72c228b)
Performance Improvements
Documentation
Add file for Classification with a Boosted Tree Model and snippet for preparing sample data (#1135) (7ac6639)
Add snippet for Linear Regression tutorial Predict Outcomes section (#1101) (108f4a9)
Update DataFrame docstrings to include the errors section (#1127) (a38d4c4)
Update Session docstrings to include exceptions (#1130) (a870421)
1.25.0 (2024-10-29)
Features
Add the ground_with_google_search option for GeminiTextGenerator predict (#1119) (ca02cd4)
Add warning when user tries to access struct series fields with __getitem__ (#1082) (20e5c58)
Allow fit to take additional eval data in linear and ensemble models (#1096) (254875c)
Support context manager for bigframes session (#1107) (5f7b8b1)
Performance Improvements
1.24.0 (2024-10-24)
Features
Documentation
1.23.0 (2024-10-23)
Features
Add bigframes.bigquery.create_vector_index to assist in creating vector index on ARRAY<FLOAT64> columns (#1024) (863d694)
Add gemini-1.5-pro-002 and gemini-1.5-flash-002 to known Gemini model list. (#1105) (7094c85)
Add support for pandas series & data frames as inputs for ml models. (#1088) (30c8883)
Cleanup temp resources with session deletion (#1068) (1d5373d)
Show possible correct key(s) in .__getitem__ KeyError message (#1097) (32fab96)
Bug Fixes
Performance Improvements
Speed up tree transforms during sql compile (#1071) (d73fe9d)
Utilize ORDER BY LIMIT over ROW_NUMBER where possible (#1077) (7003d1a)
Documentation
Show best practice of closing the session to cleanup resources in sample notebooks (#1095) (62a88e8)
Update docstrings of Session and related files (#1087) (bf93e80)
1.22.0 (2024-10-09)
Features
Support regional endpoints for more bigquery locations (#1061) (45b672a)
Update LLM generators to warn user about model name instead of raising error. (#1048) (650d80d)
Bug Fixes
Correct zero row count in DataFrame from table view (#1062) (b536070)
Fix generic error message when entering an incorrect column name (#1031) (5ac217d)
Make invalid location warning case-insensitive (#1044) (b6cd55a)
Show warning for unknown location set through .ctor (#1052) (02c2da7)
Performance Improvements
Documentation
1.21.0 (2024-10-02)
Features
Add deprecation warning to PaLM2TextGenerator model (#1035) (1183b0f)
Add DeprecationWarning for PaLM2TextEmbeddingGenerator (#1018) (4af5bbb)
Add ml.model_selection.cross_validate support (#1020) (1a38063)
Allow access of struct fields with dot operators on Series (#1019) (ef76f13)
Bug Fixes
Documentation
1.20.0 (2024-09-25)
Features
Add bigframes.ml.compose.SQLScalarColumnTransformer to create custom SQL-based transformations (#955) (1930b4e)
Allow multiple columns input for llm models (#998) (2fe5e48)
Bug Fixes
Documentation
Limit pypi notebook to 7 days and add more info about differences with partial ordering mode (#1013) (3c54399)
Move and edit existing linear-regression tutorial snippet (#991) (4cb62fd)
1.19.0 (2024-09-24)
Features
Support bool and bytes types in describe(include='all') (#994) (cc48f58)
Support ingress settings in remote_function (#1011) (8e9919b)
Bug Fixes
Performance Improvements
Dependencies
1.18.0 (2024-09-18)
Features
Add “include” param to describe for string types (#973) (deac6d2)
Add subset parameter to DataFrame.dropna to select which columns to consider (#981) (f7c03dc)
Bug Fixes
DataFrameGroupby.agg now works with unnamed tuples (#985) (0f047b4)
Fix a bug that raises exception when re-indexing columns with their original order (#988) (596b03b)
Make the Series.apply outcome assignable to the original dataframe in partial ordering mode (#874) (c94ead9)
Dependencies
1.17.0 (2024-09-11)
Features
Include the bigframes package version alongside the feedback link in error messages (#936) (7b59b6d)
Bug Fixes
Make read_gbq_function work for multi-param functions (#947) (c750be6)
Support read_gbq_function for axis=1 application (#950) (86e54b1)
Documentation
1.16.0 (2024-09-04)
Features
Add DataFrame.struct.explode to add struct subfields to a DataFrame (#916) (ad2f75e)
Implement bigframes.bigquery.json_extract_array (#910) (575a29e)
Bug Fixes
Fix issue with iterating on >10gb dataframes (#949) (2b0f0fa)
Unordered mode errors in ml train_test_split (#925) (85d7c21)
Performance Improvements
Dependencies
Documentation
Create sample notebook to manipulate struct and array data (#883) (3031903)
Use unstack() from BigQuery DataFrames instead of pandas in the PyPI sample notebook (#890) (d1883cc)
1.15.0 (2024-08-20)
Features
Documentation
Add columns for “requires ordering/index” to supported APIs summary (#892) (d2fc51a)
Remove duplicate description for kms_key_name (#898) (1053d56)
1.14.0 (2024-08-14)
Features
Bug Fixes
Performance Improvements
Documentation
1.13.0 (2024-08-05)
Features
df.apply(axis=1) to support remote function with multiple params (#851) (2158818)
Create a separate OrderingModePartialPreviewWarning for more fine-grained warning filters (#879) (8753bdd)
Bug Fixes
Documentation
1.12.0 (2024-07-31)
Features
Add config option to set partial ordering mode (#855) (823c0ce)
Add stratify param support to ml.model_selection.train_test_split method (#815) (27f8631)
Allow DataFrame.join for self-join on Null index (#860) (e950533)
Support remote function cleanup with session.close (#818) (ed06436)
Support to_csv/parquet/json to local files/objects (#858) (d0ab9cc)
Bug Fixes
Fewer relation joins from df self-operations (#823) (0d24f73)
Fix unordered mode using ordered path to print frame (#839) (93785cb)
Reduce redundant remote_function deployments (#856) (cbf2d42)
Documentation
Add partner attribution steps to integrations sample notebook (#835) (d7b333f)
Make get_global_session/close_session/reset_session appear in the docs (#847) (01d6bbb)
1.11.1 (2024-07-08)
Documentation
Remove session and connection in llm notebook (#821) (74170da)
Remove the experimental flask icon from the public docs (#820) (067ff17)
1.11.0 (2024-07-01)
Features
Add bigframes.streaming.to_pubsub method to create continuous query that writes to Pub/Sub (#801) (b47f32d)
Add DataFrame.to_arrow to create Arrow Table from DataFrame (#807) (1e3feda)
Add PolynomialFeatures support to to_gbq and pipelines (#805) (57d98b9)
Add Series.peek to preview data efficiently (#727) (580e1b9)
More informative error when query plan too complex (#811) (136dc24)
Bug Fixes
Documentation
1.10.0 (2024-06-21)
Features
Add ml.preprocessing.PolynomialFeatures class (#793) (b4fbb51)
Bigframes.streaming module for continuous queries (#703) (0433a1c)
Include index columns in DataFrame.sql if they are named (#788) (c8d16c0)
Bug Fixes
Allow __repr__ to work with uninitialized DataFrame/Series/Index (#778) (e14c7a9)
Df.loc with the 2nd input as bigframes boolean Series (#789) (a4ac82e)
Ensure numpy version matches in remote_function deployment (#798) (324d93c)
Fix temp table creation retries by now throwing if table already exists. (#787) (0e57d1f)
Self-join optimization doesn’t needlessly invalidate caching (#797) (1b96b80)
1.9.0 (2024-06-10)
Features
Bug Fixes
Improve to_pandas_batches for large results (#746) (61f18cb)
Resolve issue with unset thread-local options (#741) (d93dbaf)
Documentation
1.8.0 (2024-05-31)
Features
merge only generates a default index if both inputs already have an index (#733) (25d049c)
Add GroupBy.size() to get number of rows in each group (#479) (1fca588)
Add slot_millis and add stats to session object (#725) (72e9583)
Adds bigframes.bigquery.array_to_string to convert array elements to delimited strings (#731) (f12c906)
Allow functions decorated with bpd.remote_function() to execute locally (#704) (d850da6)
Ensure "bigframes-api" label is always set on jobs, even if the API is unknown (#722) (1832778)
Support type annotations to supply input and output types to bpd.remote_function() decorator (#717) (4a12e3c)
Support type annotations with bpd.remote_function() and axis=1 (a preview feature) (#730) (e5a2992)
Bug Fixes
Correct index labels in multiple aggregations for DataFrameGroupBy (#723) (6a78c89)
Set bpd.remote_function()'s input_types and output_types default to None to allow omitting them when type annotations are present (#729) (0e25a3b)
Warn and disable time travel for linked datasets (#712) (085fa9d)
Performance Improvements
Documentation
1.7.0 (2024-05-20)
Features
read_gbq_query supports filters (9386373)
read_gbq suggests a correct column name when one is not found (9386373)
Add DefaultIndexKind.NULL to use as index_col in read_gbq*, creating an indexless DataFrame/Series (#662) (29e4886)
Bigframes.bigquery.array_agg(SeriesGroupBy|DataFrameGroupby) (#663) (412f28b)
To_datetime supports utc=False for string inputs (#579) (adf9889)
Bug Fixes
read_gbq_table respects primary keys even when filters are set (#689) (9386373)
Improve escaping of literals and identifiers (#682) (da9b136)
Properly identify non-unique index in tables without primary keys (#699) (6e0f4d8)
Remove a usage of the resource package when not available, such as on Windows (#681) (96243f2)
Performance Improvements
Don’t run query immediately from read_gbq_table if filters is set (9386373)
Use a LIMIT clause when max_results is set (9386373)
Documentation
Add code snippets for imported onnx tutorials (#684) (cb36e46)
Add code snippets for imported tensorflow model (#679) (b02c401)
Use class_weight="balanced" in the logistic regression prediction tutorial (#678) (b951549)
1.6.0 (2024-05-13)
Features
Add strategy="quantile" in KBinsDiscretizer (#654) (c6c487f)
Suggest correct options in bpd.options.bigquery.location (#666) (57ccabc)
Support axis=1 in df.apply for scalar outputs (#629) (f6bdc4a)
Support gcf vpc connector in remote_function (#677) (9ca92d0)
Warn with a more specific DefaultLocationWarning category when no location can be detected (#648) (e084e54)
Bug Fixes
Dependencies
- Add jellyfish as a dependency for spelling correction (57ccabc)
Documentation
1.5.0 (2024-05-07)
Features
bigframes.options and bigframes.option_context now use thread-local variables to prevent context managers in separate threads from affecting each other (#652) (651fd7d)
Add ARIMAPlus.coef_ property exposing ML.ARIMA_COEFFICIENTS functionality (#585) (81d1262)
Add a unique session_id to Session and allow cleaning up sessions (#553) (c8d4e23)
Add the bigframes.bigquery sub-package with a bigframes.bigquery.array_length function (#630) (9963f85)
Always do a query dry run when option.repr_mode == "deferred" (#652) (651fd7d)
Warn with DefaultIndexWarning from read_gbq on clustered/partitioned tables with no index_col or filters set (#631, #658) (2715d2b, 73064dd)
Support index_col=False in read_csv and engine="bigquery" (73064dd)
Support gcf max instance count in remote_function (#657) (36578ab)
Bug Fixes
Don’t raise UnknownLocationWarning for US or EU multi-regions (#653) (8e4616b)
Fix bug with na in the column labels in stack (#659) (4a34293)
Documentation
Add python code sample for multiple forecasting time series (#531) (16866d2)
Fix the Palm2TextGenerator output token size (#649) (c67e501)
1.4.0 (2024-04-29)
Features
Add .cache() method to persist intermediate dataframe (#626) (a5c94ec)
Add transpose support for small homogeneously typed DataFrames. (#621) (054075d)
Series binary ops compatible with more types (#618) (518d315)
Support the `score` method for `PaLM2TextGenerator` (#634) (3ffc1d2)
Bug Fixes
Performance Improvements
Automatically condense internal expression representation (#516) (03c1b0d)
Cache transpose to allow performant retranspose (#635) (44b738d)
Documentation
Add the first sample for the Single time-series forecasting from Google Analytics data tutorial (#623) (2b84c4f)
1.3.0 (2024-04-22)
Features
Add fine tuning `fit()` for Palm2TextGenerator (#616) (9c106bd)
Expose `max_batching_rows` in `remote_function` (#622) (240a1ac)
Support primary key(s) in `read_gbq` by using them as the `index_col` by default (#625) (75bb240)
Warn if location is set to unknown location (#609) (3706b4f)
Bug Fixes
Documentation
Fix rendering of examples for multiple apis (#620) (9665e39)
Set `index_cols` in `read_gbq` as a best practice (#624) (70015b7)
1.2.0 (2024-04-15)
Features
Bug Fixes
Documentation
1.1.0 (2024-04-04)
Features
Add support for numpy expm1, log1p, floor, ceil, arctan2 ops (#505) (e8e66cf)
Allow DataFrame binary ops to align on either axis and with loc… (#544) (6d8f3af)
Expose `DataFrame.bqclient` to assist in integrations (#519) (0be8911)
Read_pandas accepts pandas Series and Index objects (#573) (f8821fe)
Support `ML.GENERATE_EMBEDDING` in `PaLM2TextEmbeddingGenerator` (#539) (1156c1e)
Support max_columns in repr and make repr more efficient (#515) (54e49cf)
Bug Fixes
Don’t download 100gb onto local python machine in load test (#537) (082c58b)
Exclude list-like s parameter in plot.scatter (#568) (1caac27)
Fix case where df.peek would fail to execute even with force=True (#511) (8eca99a)
Plot.scatter s parameter cannot accept float-like column (#563) (8d39187)
Product operation produces float result for all input types (#501) (6873b30)
Rename PaLM2TextEmbeddingGenerator.predict output columns to be backward compatible (#561) (4995c00)
Respect hard stack size limit and swallow limit change exception. (#558) (4833908)
Use bytes limit on frame inlining rather than element count (#576) (659a161)
Performance Improvements
Dependencies
Documentation
`bigframes.options.bigquery.project` and `location` are optional in some circumstances (#548) (90bcec5)
Add “Supported pandas APIs” reference to the documentation (#542) (74c3915)
Add the code samples for metrics{auc, roc_auc_score, roc_curve} (#520) (5f37b09)
Address more comments from technical writers to meet legal purposes (#571) (9084df3)
Migrate the overview page to Bigframes official landing page (#536) (a0fb8bb)
1.0.0 (2024-03-25)
⚠ BREAKING CHANGES
rename model parameter `min_rel_progress` to `tol`
`early_stop` setting no longer supported, always uses `True`
rename model parameter `n_parallell_trees` to `n_estimators`
rename `class_weights` to `class_weight`
rename `learn_rate` to `learning_rate`
PCA `n_components` supports float value and `None`, default to `None`
rename various ml model parameters for consistency with sklearn (https://github.com/googleapis/python-bigquery-dataframes/pull/491)
Features
Allow assigning directly to Series.name property (#495) (ad0e99e)
Ensure `Series.str.len()` can get length of array columns (#497) (10c0446)
PCA `n_components` supports float value and `None`, default to `None` (65c6f47)
Rename `class_weights` to `class_weight` (65c6f47)
Rename `learn_rate` to `learning_rate` (65c6f47)
Rename model parameter `min_rel_progress` to `tol` (65c6f47)
Rename model parameter `n_parallell_trees` to `n_estimators` (65c6f47)
Rename various ml model parameters for consistency with sklearn (https://github.com/googleapis/python-bigquery-dataframes/pull/491) (65c6f47)
Support BQ regional endpoints for europe-west9, europe-west3, us-east4, and us-west1 (#504) (fbada4a)
Bug Fixes
`early_stop` setting no longer supported, always uses `True` (65c6f47)
Properly support format param for numerical input. (#486) (ae20c35)
Sampling plot cannot preserve ordering if index is not ordered (#475) (a5345fe)
Use actual BigQuery types rather than ibis types in to_pandas (#500) (82b4f91)
Dependencies
Documentation
Add code samples for metrics.{accuracy_score, confusion_matrix} (#478) (3e3329a)
Add code samples for metrics.{recall_score, precision_score, f1_score} (#502) (370fe90)
Update LLM + K-means notebook to handle partial failures (#496) (97afad9)
0.26.0 (2024-03-20)
⚠ BREAKING CHANGES
- exclude remote models for .register() (#465)
Features
`read_gbq_table` supports `LIKE` as an operator in `filters` (#454) (d2d425a)
Set `force=True` by default in `DataFrame.peek()` (#469) (4e8e97d)
Support datetime related casting in (Series|DataFrame|Index).astype (#442) (fde339b)
Bug Fixes
Any() on empty set now correctly returns False (#471) (f55680c)
Fix grouping series on multiple other series (#455) (3971bd2)
Groupby aggregates no longer check if grouping keys are numeric (#472) (4fbf938)
Raise `ValueError` when `read_pandas()` receives a bigframes `DataFrame` (#447) (b28f9fd)
Series.(to_csv|to_json) leverages bq export (#452) (718a00c)
Warn when `read_gbq` / `read_gbq_table` uses the snapshot time cache (#441) (e16a8c0)
Documentation
0.25.0 (2024-03-14)
Features
(Series|DataFrame).plot.(line|area|scatter) (#431) (0772510)
Support CMEK for `remote_function` cloud functions (#430) (2fd69f4)
0.24.0 (2024-03-12)
⚠ BREAKING CHANGES
`read_parquet` uses a “pandas” engine to parse files by default. Use `engine="bigquery"` for the previous behavior
Features
Bug Fixes
Move `third_party.bigframes_vendored` to `bigframes_vendored` (#424) (763edeb)
Only do row identity based joins when joining by index (#356) (76b252f)
Documentation
Add predict sample to samples/snippets/bqml_getting_started_test.py (#388) (6a3b0cc)
Fix the note rendering for DataFrames methods: nlargest, nsmallest (#417) (38bd2ba)
0.23.0 (2024-03-05)
Features
Bug Fixes
Dependencies
Documentation
0.22.0 (2024-02-27)
⚠ BREAKING CHANGES
rename cosine_similarity to paired_cosine_distances (#393)
move model optional args to kwargs (#381)
Features
Bug Fixes
Avoid ibis warning for “database” table() method argument (#390) (a0490a4)
Rename cosine_similarity to paired_cosine_distances (#393) (81ece46)
Performance Improvements
Dependencies
Documentation
Miscellaneous Chores
Code Refactoring
0.21.0 (2024-02-13)
Features
Add ml.metrics.pairwise.cosine_similarity function (#374) (126f566)
Support bigframes.pandas.to_datetime for scalars, iterables and series. (#372) (ffb0d15)
Bug Fixes
Documentation
0.20.1 (2024-02-06)
Performance Improvements
Documentation
0.20.0 (2024-01-30)
Features
Add `DataFrame.peek()` as an efficient alternative to `head()` results preview (#318) (9c34d83)
Add ARIMA_EVALUATE options in forecasting models (#336) (73e997b)
Add Index constructor, repr, copy, get_level_values, to_series (#334) (e5d054e)
Improve error message for drive based BQ table reads (#344) (0794788)
Update cut to work without labels = False and show intervals as dict (#335) (4ff53db)
Bug Fixes
Change default connection name in getting_started.ipynb (#347) (677f014)
Series iteration correctly returns values instead of index (#339) (2c6af9b)
Documentation
0.19.2 (2024-01-22)
Bug Fixes
Documentation
0.19.1 (2024-01-17)
Bug Fixes
Documentation
0.19.0 (2024-01-09)
Features
Allow manually set clustering_columns in dataframe.to_gbq (#302) (9c21323)
Support assigning to columns like a property (#304) (f645c56)
Support upcasting numeric columns in concat (#294) (e3a056a)
Bug Fixes
Documentation
0.18.0 (2024-01-02)
Features
Add IntervalIndex support to bigframes.pandas.cut (#254) (6c1969a)
Specific pyarrow mappings for decimal, bytes types (#283) (a1c0631)
Bug Fixes
Dataframes to_gbq now creates dataset if it doesn’t exist (#222) (bac62f7)
Exclude pandas 2.2.0rc0 to unblock prerelease tests (#292) (ac1a745)
Fix DataFrameGroupby.agg() issue with as_index=False (#273) (ab49350)
Make `Series.str.replace` work for simple strings (#285) (ad67465)
Update dataframe.to_gbq to dedup column names. (#286) (746115d)
Dependencies
Documentation
Add code snippets for explore query result page (#278) (7cbbb7d)
Code samples for `astype` common to DataFrame and Series (#280) (95b673a)
Code samples for `DataFrame.copy` and `Series.copy` (#290) (7cbc2b0)
Code samples for `isna`, `isnull`, `dropna`, `isin` (#289) (ad51035)
Code samples for `reset_index` and `sort_values` (#282) (acc0eb7)
Code samples for `Series.{add, replace, unique, T, transpose}` (#287) (0e1bbfc)
Code samples for `Series.{map, to_list, count}` (#290) (7cbc2b0)
Code samples for `Series.groupby` and `Series.{sum,mean,min,max}` (#280) (95b673a)
Code samples for DataFrame `set_index`, `items` (#295) (c2b1892)
0.17.0 (2023-12-14)
Features
Bug Fixes
Increase recursion limit, cache compilation tree hashes (#184) (b54791c)
Replaced raise `NotImplementedError` with return `NotImplemented` (#258) (a133822)
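The distinction behind the last fix above is general Python behavior rather than anything specific to bigframes. The sketch below uses two hypothetical classes, `Meters` and `Feet`, to show that returning `NotImplemented` from a binary operator lets Python fall back to the other operand's reflected method, whereas raising `NotImplementedError` would abort the operation immediately.

```python
class Meters:
    """Hypothetical value type used only to illustrate operator dispatch."""

    def __init__(self, value):
        self.value = value

    def __add__(self, other):
        if isinstance(other, Meters):
            return Meters(self.value + other.value)
        # Returning NotImplemented (instead of raising NotImplementedError)
        # tells Python to try the right-hand operand's __radd__ next.
        return NotImplemented


class Feet:
    """Hypothetical second type that knows how to combine with Meters."""

    def __init__(self, value):
        self.value = value

    def __radd__(self, other):
        if isinstance(other, Meters):
            # Convert feet to meters before adding.
            return Meters(other.value + self.value * 0.3048)
        return NotImplemented


# Meters.__add__ declines, so Python dispatches to Feet.__radd__.
total = Meters(1.0) + Feet(10.0)
```

If neither operand handles the operation, Python raises a `TypeError` on its own, which is the conventional signal for unsupported operand types.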
Documentation
0.16.0 (2023-12-12)
Features
Add DataFrame from_dict and from_records methods (#244) (8d81e24)
Add nunique method to Series/DataFrameGroupby (#256) (c8ec245)
Support dataframe.loc with conditional columns selection (#233) (3febea9)
Bug Fixes
Exclude pandas 2.1.4 from prerelease tests to unblock e2e tests (b02fc2c)
Fix value_counts column label for normalize=True (#245) (d3fa6f2)
Migrate e2e tests to bigframes-load-testing project (8766ac6)
Documentation
Add example for dataframe.melt, dataframe.pivot, dataframe.stac… (#252) (8c63697)
Add example to dataframe.nlargest, dataframe.nsmallest, datafra… (#234) (e735412)
Add examples for dataframe.cummin, dataframe.cummax, dataframe.cumsum, dataframe.cumprod (#243) (0523a31)
Add examples for dataframe.nunique, dataframe.diff, dataframe.a… (#251) (77074ec)
Correct the params rendering for `ml.remote` and `ml.ensemble` modules (#248) (c2829e3)
0.15.0 (2023-11-29)
⚠ BREAKING CHANGES
- model.predict returns all the columns (#204)
Features
Add info and memory_usage methods to dataframe (#219) (9d6613d)
Send warnings on LLM prediction partial failures (#216) (81125f9)
Bug Fixes
Avoid unnecessary row_number() on sort key for io (#211) (a18d40e)
Make to_pandas override enable_downsampling when sampling_method is manually set. (#200) (ae03756)
Update the llm+kmeans notebook with recent change (#236) (f8917ab)
Use anonymous dataset to create `remote_function` (#205) (69b016e)
Documentation
Add code samples for `index` and `column` properties (#212) (c88d38e)
Add code samples for df reshaping, function, merge, and join methods (#203) (010486c)
Add examples for dataframe.kurt, dataframe.std, dataframe.count (#232) (f9c6e72)
Add examples for dataframe.mean, dataframe.median, dataframe.va… (#228) (edd0522)
Add examples for dataframe.min, dataframe.max and dataframe.sum (#227) (3a375e8)
Code samples for `Series.dot` and `DataFrame.dot` (#226) (b62a07a)
Code samples for `Series.where` and `Series.mask` (#217) (52dfad2)
Code samples for dataframe.any, dataframe.all and dataframe.prod (#223) (d7957fa)
Make the code samples reflect default bq connection usage (#206) (71844b0)
Miscellaneous Chores
0.14.1 (2023-11-16)
Bug Fixes
Documentation
0.14.0 (2023-11-14)
Features
Add ‘index’, ‘pad’, ‘nearest’ interpolate methods (#162) (6a28403)
Add series.sample (identical to existing dataframe.sample) (#187) (37914a4)
Log most recent API calls as `recent-bigframes-api-xx` labels on BigQuery jobs (#145) (4ea33b7)
Read_gbq creates order deterministically without table copy (#191) (8ab81de)
Support `date_series.astype("string[pyarrow]")` to cast DATE to STRING (#186) (aee0e8e)
Temporary resources no longer use BigQuery Sessions (#194) (4a02cac)
Bug Fixes
Default to 7 days expiration for `read_csv`, `read_json`, `read_parquet` (#193) (03606cd)
Deprecate the `remote_service_type` in llm model (#180) (a8a409a)
For reset_index on unnamed multiindex, always use level_[n] label (#182) (f95000d)
Match pandas behavior when assigning listlike to empty dfs (#172) (c1d1f42)
Use anonymous dataset instead of session dataset for temp tables (#181) (800d44e)
Use random table when loading data for `read_csv`, `read_json`, `read_parquet` (#175) (9d2e6dc)
Documentation
Add code samples for `read_gbq_function` using community UDFs (#188) (7506eab)
Add docstring code samples for `Series.apply` and `DataFrame.map` (#185) (c816d84)
Add llm kmeans notebook as an included example (#177) (d49ae42)
Use `head()` to get top `n` results, not to preview results (#190) (87f84c9)
0.13.0 (2023-11-07)
Features
`to_gbq` without a destination table writes to a temporary table (#158) (e1817c9)
Add `DataFrame.__iter__`, `DataFrame.iterrows`, `DataFrame.itertuples`, and `DataFrame.keys` methods (#164) (c065071)
Support 32k text-generation and multilingual embedding models (#161) (5f0ea37)
Bug Fixes
0.12.0 (2023-11-01)
Features
Add `DataFrame.to_pandas_batches()` to download large `DataFrame` objects (#136) (3afd4a3)
Add bigframes.options.compute.maximum_bytes_billed option that sets maximum bytes billed on query jobs (#133) (63c7919)
Bug Fixes
Fix bug with column names under repeated column assignment (#150) (29032d0)
Resolve plotly rendering issue by using ipython html for job pro… (#134) (39df43e)
Use indexee’s session for loc listlike cases (#152) (27c5725)
Documentation
Fix indentation on `read_gbq_function` code sample (#163) (0801d96)
Link to ML.EVALUATE BQML page for score() methods (#137) (45c617f)
0.11.0 (2023-10-26)
Features
Add back `reset_session` as an alias for `close_session` (#124) (694a85a)
Change `query` parameter to `query_or_table` in `read_gbq` (#127) (f9bb3c4)
Bug Fixes
Expose `bigframes.pandas.reset_session` as a public API (#128) (b17e1f4)
Use series’s own session in series.reindex listlike case (#135) (95bff3f)
Documentation
Add runnable code samples for DataFrames I/O methods and property (#129) (6fea8ef)
Add runnable code samples for reading methods (#125) (a669919)
0.10.0 (2023-10-19)
Features
0.9.0 (2023-10-18)
⚠ BREAKING CHANGES
- rename `bigframes.pandas.reset_session` to `close_session` (#101)
Features
Add `bigframes.options.bigquery.application_name` for partner attribution (#117) (52d64ff)
Rename `bigframes.pandas.reset_session` to `close_session` (#101) (36693bf)
Send BigQuery cancel request when canceling bigframes process (#103) (e325fbb)
Support external packages in `remote_function` (#98) (ec10c4a)
Use ArrowDtype for STRUCT columns in `to_pandas` (#85) (9238fad)
Bug Fixes
Performance Improvements
Documentation
0.8.0 (2023-10-12)
⚠ BREAKING CHANGES
- The default behavior of `to_parquet` is changing from no compression to `'snappy'` compression.
Features
- Support compression in `to_parquet` (a8c286f)
Bug Fixes
0.7.0 (2023-10-11)
Features
Bug Fixes
Documentation
0.6.0 (2023-10-04)
Features
Bug Fixes
0.5.0 (2023-09-28)
Features
Add `DataFrame.kurtosis` / `DF.kurt` method (c1900c2)
Add `DataFrame.rolling` and `DataFrame.expanding` methods (c1900c2)
Add index `dtype`, `astype`, `drop`, `fillna`, aggregate attributes. (#38) (1a254a4)
Support `calculate_p_values` parameter in `bigframes.ml.linear_model.LinearRegression` (c1900c2)
Support `class_weights="balanced"` in `LogisticRegression` model (c1900c2)
Support `df[column_name] = df_only_one_column` (c1900c2)
Support `early_stop` parameter in `bigframes.ml.linear_model.LinearRegression` (c1900c2)
Support `enable_global_explain` parameter in `bigframes.ml.linear_model.LinearRegression` (c1900c2)
Support `l2_reg` parameter in `bigframes.ml.linear_model.LinearRegression` (c1900c2)
Support `learn_rate_strategy` parameter in `bigframes.ml.linear_model.LinearRegression` (c1900c2)
Support `ls_init_learn_rate` parameter in `bigframes.ml.linear_model.LinearRegression` (c1900c2)
Support `max_iterations` parameter in `bigframes.ml.linear_model.LinearRegression` (c1900c2)
Support `min_rel_progress` parameter in `bigframes.ml.linear_model.LinearRegression` (c1900c2)
Support `optimize_strategy` parameter in `bigframes.ml.linear_model.LinearRegression` (c1900c2)
Bug Fixes
Generate unique ids on join to avoid id collisions (#65) (7ab65e8)
Loosen filter items tests to accommodate shifting pandas impl (#41) (edabdbb)
Performance Improvements
Add ability to cache dataframe and series to session table (#51) (416d7cb)
Inline small `Series` and `DataFrames` in query text (#45) (5e199ec)
Reimplement unpivot to use cross join rather than union (#47) (f9a93ce)
Simplify join order to use multiple order keys instead of string. (#36) (5056da6)
Documentation
- Link to Remote Functions code samples from README and API reference (c1900c2)
0.4.0 (2023-09-16)
Features
Add `axis` parameter to `droplevel` and `reorder_levels` (7c6b0dd)
Add `bfill` and `ffill` to `DataFrame` and `Series` (7c6b0dd)
Add `DataFrame.combine` and `DataFrame.combine_first` (#27) (7c6b0dd)
Add `DataFrame.nlargest`, `nsmallest` (7c6b0dd)
Add `DataFrame.pct_change` and `Series.pct_change` (7c6b0dd)
Add `DataFrame.skew` and `GroupBy.skew` (7c6b0dd)
Add `DataFrame.to_dict`, `to_excel`, `to_latex`, `to_records`, `to_string`, `to_markdown`, `to_pickle`, `to_orc` (7c6b0dd)
Add `diff` method to `DataFrame` and `GroupBy` (7c6b0dd)
Add `filter` and `reindex` to `Series` and `DataFrame` (7c6b0dd)
Add `reindex_like` to `DataFrame` and `Series` (7c6b0dd)
Add `swaplevel` to `DataFrame` and `Series` (7c6b0dd)
Add partial support for `Series.replace` (7c6b0dd)
Support `DataFrame.loc[bool_series, column] = scalar` (7c6b0dd)
Support a persistent `name` in `remote_function` (7c6b0dd)
Bug Fixes
`remote_function` uses same credentials as other APIs (7c6b0dd)
Add type hints to models (7c6b0dd)
Raise error when ARIMAPlus is used with Pipeline (7c6b0dd)
Remove `transforms` parameter in `model.fit` (breaking change) (7c6b0dd)
Support column joins with “None indexer” (7c6b0dd)
Use `Int64Dtype` for literals in `cut` (7c6b0dd)
Use lowercase strings for parameter literals in `bigframes.ml` (breaking change) (7c6b0dd)
Performance Improvements
`bigframes-api` label to I/O query jobs (7c6b0dd)
Documentation
Document possible parameter values for PaLM2TextGenerator (7c6b0dd)
Document region logic in README (7c6b0dd)
Fix OneHotEncoder sample (7c6b0dd)
0.3.2 (2023-09-06)
Bug Fixes
0.3.1 (2023-09-05)
Bug Fixes
0.3.0 (2023-09-02)
Features
Add `bigframes.get_global_session()` and `bigframes.reset_session()` aliases (a32b747)
Add `bigframes.pandas.read_pickle` function (a32b747)
Add `components_`, `explained_variance_`, and `explained_variance_ratio_` properties to `bigframes.ml.decomposition.PCA` (89b9503)
Add `fit_transform` to `bigquery.ml` transformers (a32b747)
Add `Series.dropna` and `DataFrame.fillna` (8fab755)
Add `Series.str` methods `isalpha`, `isdigit`, `isdecimal`, `isalnum`, `isspace`, `islower`, `isupper`, `zfill`, `center` (a32b747)
Support `bigframes.pandas.merge()` (8fab755)
Support `DataFrame.isin` with list and dict inputs (8fab755)
Support `DataFrame.pivot` (a32b747)
Support `DataFrame.stack` (89b9503)
Support `DataFrame`-`DataFrame` binary operations (8fab755)
Support `df[my_column] = [a python list]` (89b9503)
Support `Index.is_monotonic` (8fab755)
Support `np.arcsin`, `np.arccos`, `np.arctan`, `np.sinh`, `np.cosh`, `np.tanh`, `np.arcsinh`, `np.arccosh`, `np.arctanh`, `np.exp` with Series argument (89b9503)
Support `np.sin`, `np.cos`, `np.tan`, `np.log`, `np.log10`, `np.sqrt`, `np.abs` with Series argument (89b9503)
Support `pow()` and power operator in `DataFrame` and `Series` (8fab755)
Support `read_json` with `engine=bigquery` for newline-delimited JSON files (89b9503)
Support `Series.corr` (89b9503)
Support `Series.map` (8fab755)
Support for `np.add`, `np.subtract`, `np.multiply`, `np.divide`, `np.power` (8fab755)
Support MultiIndex for DataFrame columns (a32b747)
Use `pandas.Index` for column labels (a32b747)
Use default session and connection in `ml.llm` and `ml.imported` (8fab755)
Bug Fixes
Add error message to `set_index` (a32b747)
Align column names with pandas in `DataFrame.agg` results (89b9503)
Allow (but still not recommended) `ORDER BY` in `read_gbq` input when an `index_col` is defined (89b9503)
Check for IAM role on the BigQuery connection when initializing a `remote_function` (89b9503)
Check that types are specified in `read_gbq_function` (a32b747)
Don’t use query cache for Session construction (a32b747)
Include survey link in abstract `NotImplementedError` exception messages (89b9503)
Label temp table creation jobs with `source=bigquery-dataframes-temp` label (89b9503)
Make `X_train` argument names consistent across methods (8fab755)
Raise AttributeError for unimplemented pandas methods (89b9503)
Raise exception for invalid function in `read_gbq_function` (a32b747)
Support spaces in column names in `DataFrame` initializer (89b9503)
Performance Improvements
Add local cache for `__repr_*__` methods (a32b747)
Lazily instantiate client library objects (89b9503)
Use `row_number()` filter for `head` / `tail` (8fab755)
Documentation
Add ML section under Overview (a32b747)
Add release status to table of contents (a32b747)
Add samples and best practices to `read_gbq` docs (a32b747)
Correct the return types of Dataframe and Series (a32b747)
Create subfolders for notebooks (a32b747)
Fix link to GitHub (89b9503)
Highlight bigframes is open-source (a32b747)
Sample ML Drug Name Generation notebook (a32b747)
Set `options.bigquery.project` in sample code (89b9503)
Transform remote function user guide into sample code (a32b747)
Update remote function notebook with read_gbq_function usage (8fab755)
0.2.0 (2023-08-17)
Features
Add KMeans.cluster_centers_.
Allow column labels to be any type handled by bq df, column labels can be integers now.
Add dataframegroupby.agg().
Add Series Property is_monotonic_increasing and is_monotonic_decreasing.
Add match, fullmatch, get, pad str methods.
Add series isin function.
Bug Fixes
Update ML package to use sessions for queries.
Optimize `read_gbq` with `index_col` set to cluster by `index_col`.
Raise ValueError if the location mismatched.
`read_gbq` no longer uses ‘time travel’ with query inputs.
Documentation
- Add docstring to _uniform_sampling to avoid user using it.
0.1.1 (2023-08-14)
Documentation
- Correct link to code repository in `setup.py` and use correct terminology for `console.cloud.google.com` links.
0.1.0 (2023-08-11)
Features
Add `bigframes.pandas` package with an API compatible with pandas. Supported data sources include: BigQuery SQL queries, BigQuery tables, CSV (local and GCS), Parquet (local and Cloud Storage), and more.
Add `bigframes.ml` package with an API inspired by scikit-learn. Train machine learning models and run batch prediction, powered by BigQuery ML.
0.0.0 (2023-02-22)
- Empty package to reserve package name.