Changelog

0.5.0 / 2018-06-15

  • Project ID parameter is optional in read_gbq and to_gbq when it can be inferred from the environment. Note: you must still pass in a project ID when using user-based authentication. (GH#103)
  • Progress bar added for to_gbq, via the optional tqdm dependency. (GH#162)
  • Add location parameter to read_gbq and to_gbq so that pandas-gbq can work with datasets in the Tokyo region. (GH#177)
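For example, a minimal sketch of targeting the Tokyo region (the project, dataset, and query below are placeholders, and the call itself is commented out because it needs network access and credentials):

```python
# Sketch of the location parameter added in pandas-gbq 0.5.0.
# "my-project" and "tokyo_dataset" are placeholder names.
query = "SELECT name FROM `my-project.tokyo_dataset.users`"
kwargs = dict(
    project_id="my-project",
    location="asia-northeast1",  # BigQuery's Tokyo region
)

# With pandas-gbq installed and credentials configured:
# import pandas_gbq
# df = pandas_gbq.read_gbq(query, **kwargs)
```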

Internal changes

  • Tests now use nox to run in multiple Python environments. (GH#52)
  • Renamed internal modules. (GH#154)
  • Refactored auth to an internal auth module. (GH#176)
  • Add unit tests for get_credentials(). (GH#184)

0.4.1 / 2018-04-05

  • Only show verbose deprecation warning if Pandas version does not populate it. (GH#157)

0.4.0 / 2018-04-03

  • Fix bug in read_gbq when building a dataframe with integer columns on Windows. Explicitly use 64-bit integers when converting from BQ types. (GH#119)
  • Fix bug in read_gbq when querying for an array of floats (GH#123)
  • Fix bug in read_gbq with configuration argument. Updates read_gbq to account for breaking change in the way google-cloud-python version 0.32.0+ handles query configuration API representation. (GH#152)
  • Fix bug in to_gbq where seconds were discarded in timestamp columns. (GH#148)
  • Fix bug in to_gbq when supplying a user-defined schema (GH#150)
  • Deprecate the verbose parameter in read_gbq and to_gbq. Messages use the logging module instead of printing progress directly to standard output. (GH#12)

0.3.1 / 2018-02-13

  • Fix an issue where Unicode couldn’t be uploaded in Python 2 (GH#106)
  • Add support for a passed schema in to_gbq instead of inferring the schema from the passed DataFrame with DataFrame.dtypes. (GH#46)
  • Fix an issue where a dataframe containing both integer and floating point columns could not be uploaded with to_gbq (GH#116)
  • to_gbq now uses to_csv to avoid manually looping over rows in a dataframe (should result in faster table uploads) (GH#96)
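The user-defined schema support from GH#46 takes a list of field descriptors rather than relying on DataFrame.dtypes inference. A hedged sketch (the dataset, table, and project names are placeholders; the keyword argument is table_schema in the pandas-gbq API):

```python
# Describing BigQuery column types explicitly instead of letting
# pandas-gbq infer them from DataFrame.dtypes.  Field names and the
# destination table below are placeholders.
schema = [
    {"name": "name", "type": "STRING"},
    {"name": "score", "type": "FLOAT"},
    {"name": "joined", "type": "TIMESTAMP"},
]

# The actual upload needs pandas-gbq and valid credentials:
# import pandas_gbq
# pandas_gbq.to_gbq(df, "my_dataset.my_table",
#                   project_id="my-project", table_schema=schema)
```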

0.3.0 / 2018-01-03

  • Use the google-cloud-bigquery library for API calls. The google-cloud-bigquery package is a new dependency, and dependencies on google-api-python-client and httplib2 are removed. See the installation guide for more details. (GH#93)
  • Structs and arrays are now named properly (GH#23) and BigQuery functions like array_agg no longer run into errors during type conversion (GH#22).
  • to_gbq() now uses a load job instead of the streaming API. Remove StreamingInsertError class, as it is no longer used by to_gbq(). (GH#7, GH#75)

0.2.1 / 2017-11-27

  • read_gbq() now raises QueryTimeout if the request exceeds the query.timeoutMs value specified in the BigQuery configuration. (GH#76)
  • Environment variable PANDAS_GBQ_CREDENTIALS_FILE can now be used to override the default location where the BigQuery user account credentials are stored. (GH#86)
  • BigQuery user account credentials are now stored in an application-specific hidden user folder on the operating system. (GH#41)
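For instance (the path below is an arbitrary example, not the default location):

```shell
# Point pandas-gbq at a custom credentials cache instead of the
# application-specific hidden folder.  The path is an arbitrary example.
export PANDAS_GBQ_CREDENTIALS_FILE="$HOME/.cache/pandas_gbq/bigquery_credentials.dat"
```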

0.2.0 / 2017-07-24

  • Drop support for Python 3.4 (GH#40)
  • The dataframe passed to `.to_gbq(...., if_exists='append')` needs to contain only a subset of the fields in the BigQuery schema. (GH#24)
  • Use the google-auth library for authentication because oauth2client is deprecated. (GH#39)
  • read_gbq() now has an auth_local_webserver boolean argument for controlling whether to use the web server or console flow when getting user credentials. Replaces the --noauth_local_webserver command line argument. (GH#35)
  • read_gbq() now displays the BigQuery Job ID and standard price in verbose output. (GH#70 and GH#71)

0.1.6 / 2017-05-03

  • All gbq errors will simply be subclasses of ValueError and no longer inherit from the deprecated PandasError.

0.1.4 / 2017-03-17

  • InvalidIndexColumn will be raised instead of InvalidColumnOrder in read_gbq() when the index column specified does not exist in the BigQuery schema. (GH#6)

0.1.3 / 2017-03-04

  • Fix a bug when appending to a BigQuery table whose fields have modes (NULLABLE, REQUIRED, REPEATED) specified. These modes were compared against the remote schema, and writing a table via to_gbq() would previously raise an error. (GH#13)

0.1.2 / 2017-02-23

Initial release of code transferred from pandas.

Includes the following patches made since the pandas 0.19.2 release:

  • read_gbq() now allows query configuration preferences pandas-GH#14742
  • read_gbq() now stores INTEGER columns as dtype=object if they contain NULL values. Otherwise they are stored as int64. This prevents precision loss for integers greater than 2**53. Furthermore, FLOAT columns with values above 10**4 are no longer cast to int64, which also caused precision loss. pandas-GH#14064, pandas-GH#14305
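The object-dtype choice exists because float64, which was pandas' only way to represent a NULL in an integer column at the time, has a 53-bit significand. A quick stdlib-only illustration of the precision loss being avoided:

```python
# Why NULL-bearing INTEGER columns are kept as dtype=object: coercing
# through float64 silently rounds integers above 2**53 to the nearest
# representable value.
big = 2**53 + 1                  # 9007199254740993
assert float(big) != big         # not representable exactly as float64
assert int(float(big)) == 2**53  # the round-trip lands on 9007199254740992
```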