API Reference

Note

Only functions and classes which are members of the pandas_gbq module are considered public. Submodules and their members are considered private.

read_gbq(query[, project_id, index_col, …]) Load data from Google BigQuery using google-cloud-python.
to_gbq(dataframe, destination_table[, …]) Write a DataFrame to a Google BigQuery table.
context A pandas_gbq.Context object used to cache credentials.
Context() Storage for objects to be used throughout a session.
pandas_gbq.read_gbq(query, project_id=None, index_col=None, col_order=None, reauth=False, auth_local_webserver=False, dialect=None, location=None, configuration=None, credentials=None, verbose=None, private_key=None)

Load data from Google BigQuery using google-cloud-python.

The main method a user calls to execute a query in Google BigQuery and read the results into a pandas DataFrame.

This method uses the Google Cloud client library (google-cloud-bigquery) to make requests to Google BigQuery.

See the How to authenticate with Google BigQuery guide for authentication instructions.

Parameters:
query : str

SQL-like query to return data values.

project_id : str, optional

Google BigQuery Account project ID. Optional when available from the environment.

index_col : str, optional

Name of result column to use for index in results DataFrame.

col_order : list(str), optional

List of BigQuery column names in the desired order for results DataFrame.

reauth : boolean, default False

Force Google BigQuery to re-authenticate the user. This is useful if multiple accounts are used.

auth_local_webserver : boolean, default False

Use the local webserver flow instead of the console flow when getting user credentials.

New in version 0.2.0.

dialect : str, default 'legacy'

Note: The default value is changing to 'standard' in a future version.

SQL syntax dialect to use. Value can be one of:

'legacy'

Use BigQuery’s legacy SQL dialect. For more information see BigQuery Legacy SQL Reference.

'standard'

Use BigQuery’s standard SQL, which is compliant with the SQL 2011 standard. For more information see BigQuery Standard SQL Reference.

location : str, optional

Location where the query job should run. See the BigQuery locations documentation for a list of available locations. The location must match that of any datasets used in the query.

New in version 0.5.0.

configuration : dict, optional

Query config parameters for job processing. For example:

configuration = {'query': {'useQueryCache': False}}

For more information see BigQuery REST API Reference.

credentials : google.auth.credentials.Credentials, optional

Credentials for accessing Google APIs. Use this parameter to override default credentials, such as to use Compute Engine google.auth.compute_engine.Credentials or Service Account google.oauth2.service_account.Credentials directly.

New in version 0.8.0.

verbose : None, deprecated

Deprecated in Pandas-GBQ 0.4.0. Use the logging module to adjust verbosity instead.

private_key : str, deprecated

Deprecated in pandas-gbq version 0.8.0. Use the credentials parameter and google.oauth2.service_account.Credentials.from_service_account_info() or google.oauth2.service_account.Credentials.from_service_account_file() instead.

Service account private key in JSON format; can be a file path or the string contents of the key. This is useful for remote server authentication (e.g. a Jupyter/IPython notebook on a remote host).

Returns:
df : DataFrame

DataFrame representing results of query.
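As a sketch of how these parameters fit together (the project ID and column names below are hypothetical, and the call itself is commented out because running it requires credentials):

```python
# Build a standard-SQL query and a job configuration that disables the
# query cache; these values are illustrative, not from a real project.
sql = """
SELECT name, SUM(number) AS total
FROM `bigquery-public-data.usa_names.usa_1910_current`
GROUP BY name
"""

# Keys follow the BigQuery REST API's query job configuration.
configuration = {'query': {'useQueryCache': False}}

# Executing the query requires credentials, so the call is shown but not run:
# df = pandas_gbq.read_gbq(
#     sql,
#     project_id='my-project',   # optional when available from the environment
#     index_col='name',          # use the "name" column as the DataFrame index
#     dialect='standard',
#     configuration=configuration,
# )
```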

pandas_gbq.to_gbq(dataframe, destination_table, project_id=None, chunksize=None, reauth=False, if_exists='fail', auth_local_webserver=False, table_schema=None, location=None, progress_bar=True, credentials=None, verbose=None, private_key=None)

Write a DataFrame to a Google BigQuery table.

The main method a user calls to export pandas DataFrame contents to a Google BigQuery table.

This method uses the Google Cloud client library (google-cloud-bigquery) to make requests to Google BigQuery.

See the How to authenticate with Google BigQuery guide for authentication instructions.

Parameters:
dataframe : pandas.DataFrame

DataFrame to be written to a Google BigQuery table.

destination_table : str

Name of table to be written, in the form dataset.tablename.

project_id : str, optional

Google BigQuery Account project ID. Optional when available from the environment.

chunksize : int, optional

Number of rows to be inserted in each chunk from the dataframe. Set to None to load the whole dataframe at once.

reauth : bool, default False

Force Google BigQuery to re-authenticate the user. This is useful if multiple accounts are used.

if_exists : str, default 'fail'

Behavior when the destination table exists. Value can be one of:

'fail'

If table exists, raise pandas_gbq.gbq.TableCreationError.

'replace'

If table exists, drop it, recreate it, and insert data.

'append'

If table exists, insert data. If table does not exist, create it.

auth_local_webserver : bool, default False

Use the local webserver flow instead of the console flow when getting user credentials.

New in version 0.2.0.

table_schema : list of dicts, optional

List of BigQuery table fields to which the corresponding DataFrame columns conform, e.g. [{'name': 'col1', 'type': 'STRING'},...]. If a schema is not provided, it will be generated according to the dtypes of the DataFrame columns. If a schema is provided, it must contain all DataFrame columns. pandas_gbq.gbq._generate_bq_schema() may be used to create an initial schema, though it doesn’t preserve column order. See the BigQuery API documentation for the available names of a field.

New in version 0.3.1.

location : str, optional

Location where the load job should run. See the BigQuery locations documentation for a list of available locations. The location must match that of the target dataset.

New in version 0.5.0.

progress_bar : bool, default True

Use the tqdm library to show a progress bar for the upload, chunk by chunk.

New in version 0.5.0.

credentials : google.auth.credentials.Credentials, optional

Credentials for accessing Google APIs. Use this parameter to override default credentials, such as to use Compute Engine google.auth.compute_engine.Credentials or Service Account google.oauth2.service_account.Credentials directly.

New in version 0.8.0.

verbose : bool, deprecated

Deprecated in Pandas-GBQ 0.4.0. Use the logging module to adjust verbosity instead.

private_key : str, deprecated

Deprecated in pandas-gbq version 0.8.0. Use the credentials parameter and google.oauth2.service_account.Credentials.from_service_account_info() or google.oauth2.service_account.Credentials.from_service_account_file() instead.

Service account private key in JSON format; can be a file path or the string contents of the key. This is useful for remote server authentication (e.g. a Jupyter/IPython notebook on a remote host).
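As a sketch of a to_gbq call with an explicit schema (the dataset, table, and field names below are hypothetical, and the upload itself is commented out because running it requires credentials):

```python
# Hand-written table_schema for a DataFrame with a string column "name"
# and an integer column "total"; the field names here are illustrative.
table_schema = [
    {'name': 'name', 'type': 'STRING'},
    {'name': 'total', 'type': 'INTEGER'},
]

# A provided schema must contain every DataFrame column.
dataframe_columns = ['name', 'total']
assert {field['name'] for field in table_schema} == set(dataframe_columns)

# The upload requires credentials, so the call is shown but not run:
# pandas_gbq.to_gbq(
#     dataframe,
#     'my_dataset.my_table',   # destination in dataset.tablename form
#     project_id='my-project',
#     if_exists='append',      # insert rows, creating the table if needed
#     table_schema=table_schema,
# )
```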

pandas_gbq.context = <pandas_gbq.gbq.Context object>

A pandas_gbq.Context object used to cache credentials.

Credentials are automatically cached in memory by pandas_gbq.read_gbq() and pandas_gbq.to_gbq().

class pandas_gbq.Context

Storage for objects to be used throughout a session.

A Context object is initialized when the pandas_gbq module is imported, and can be found at pandas_gbq.context.

Attributes:
credentials

Credentials to use for Google APIs.

dialect

Default dialect to use in pandas_gbq.read_gbq().

project

Default project to use for calls to Google APIs.

credentials

Credentials to use for Google APIs.

These credentials are automatically cached in memory by calls to pandas_gbq.read_gbq() and pandas_gbq.to_gbq(). To manually set the credentials, construct a google.auth.credentials.Credentials object and set it as the context credentials as demonstrated in the example below. See the authentication docs for more information on obtaining credentials.

Returns:
google.auth.credentials.Credentials

Examples

Manually setting the context credentials:

>>> import pandas_gbq
>>> from google.oauth2 import service_account
>>> credentials = service_account.Credentials.from_service_account_file(
...     '/path/to/key.json',
... )
>>> pandas_gbq.context.credentials = credentials

dialect

Default dialect to use in pandas_gbq.read_gbq().

Allowed values for the BigQuery SQL syntax dialect:

'legacy'
Use BigQuery’s legacy SQL dialect. For more information see BigQuery Legacy SQL Reference.
'standard'
Use BigQuery’s standard SQL, which is compliant with the SQL 2011 standard. For more information see BigQuery Standard SQL Reference.
Returns:
str

Examples

Setting the default syntax to standard:

>>> import pandas_gbq
>>> pandas_gbq.context.dialect = 'standard'

project

Default project to use for calls to Google APIs.

Returns:
str

Examples

Manually setting the context project:

>>> import pandas_gbq
>>> pandas_gbq.context.project = 'my-project'