API Reference

Note

Only functions and classes which are members of the pandas_gbq module are considered public. Submodules and their members are considered private.

read_gbq(query[, project_id, index_col, …]) Load data from Google BigQuery using google-cloud-python
to_gbq(dataframe, destination_table[, …]) Write a DataFrame to a Google BigQuery table.
pandas_gbq.read_gbq(query, project_id=None, index_col=None, col_order=None, reauth=False, verbose=None, private_key=None, auth_local_webserver=False, dialect='legacy', location=None, configuration=None)

Load data from Google BigQuery using google-cloud-python

The main method a user calls to execute a query in Google BigQuery and read results into a pandas DataFrame.

This method uses the Google Cloud client library to make requests to Google BigQuery.

See the How to authenticate with Google BigQuery guide for authentication instructions.

Parameters:
query : str

SQL-like query to return data values.

project_id : str (optional when available in environment)

Google BigQuery Account project ID.

index_col : str (optional)

Name of result column to use for index in results DataFrame

col_order : list(str) (optional)

List of BigQuery column names in the desired order for results DataFrame

reauth : boolean (default False)

Force Google BigQuery to reauthenticate the user. This is useful if multiple accounts are used.

private_key : str (optional)

Service account private key in JSON format. Can be a file path or string contents. This is useful for remote server authentication (e.g. a Jupyter/IPython notebook on a remote host).

auth_local_webserver : boolean, default False

Use the local webserver flow instead of the console flow when getting user credentials. A file named bigquery_credentials.dat will be created in the current directory. You can also set the PANDAS_GBQ_CREDENTIALS_FILE environment variable to specify a path at which to store this credential (e.g. /etc/keys/bigquery.dat).

New in version 0.2.0.

dialect : {‘legacy’, ‘standard’}, default ‘legacy’

‘legacy’ : Use BigQuery’s legacy SQL dialect.
‘standard’ : Use BigQuery’s standard SQL (beta), which is compliant with the SQL 2011 standard. For more information see the BigQuery SQL Reference.

location : str (optional)

Location where the query job should run. See the BigQuery locations documentation for a list of available locations. The location must match that of any datasets used in the query.

New in version 0.5.0.

configuration : dict (optional)

Query config parameters for job processing. For example:

configuration = {'query': {'useQueryCache': False}}

For more information see BigQuery SQL Reference

verbose : None, deprecated

Deprecated in pandas-gbq version 0.4.0. Use the logging module to adjust verbosity instead.

Returns:

df : DataFrame

DataFrame representing results of query
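For example, a typical call might look like the following. This is an illustrative sketch only: the project ID and the public dataset queried here are placeholder choices, not part of this reference.

import pandas_gbq

# Placeholder query against a public dataset; substitute your own SQL.
sql = """
SELECT name, SUM(number) AS total
FROM `bigquery-public-data.usa_names.usa_1910_2013`
GROUP BY name
ORDER BY total DESC
LIMIT 10
"""

df = pandas_gbq.read_gbq(
    sql,
    project_id="my-project",  # placeholder project ID
    index_col="name",  # use the 'name' column as the DataFrame index
    dialect="standard",  # opt in to standard SQL instead of the legacy default
    configuration={"query": {"useQueryCache": False}},  # optional job configuration
)
print(df.head())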

pandas_gbq.to_gbq(dataframe, destination_table, project_id=None, chunksize=None, verbose=None, reauth=False, if_exists='fail', private_key=None, auth_local_webserver=False, table_schema=None, location=None, progress_bar=True)

Write a DataFrame to a Google BigQuery table.

The main method a user calls to export pandas DataFrame contents to a Google BigQuery table.

This method uses the Google Cloud client library to make requests to Google BigQuery.

See the How to authenticate with Google BigQuery guide for authentication instructions.

Parameters:
dataframe : pandas.DataFrame

DataFrame to be written

destination_table : str

Name of table to be written, in the form ‘dataset.tablename’

project_id : str (optional when available in environment)

Google BigQuery Account project ID.

chunksize : int (default None)

Number of rows to be inserted in each chunk from the dataframe. Use None to load the dataframe in a single chunk.

reauth : boolean (default False)

Force Google BigQuery to reauthenticate the user. This is useful if multiple accounts are used.

if_exists : {‘fail’, ‘replace’, ‘append’}, default ‘fail’

‘fail’ : If table exists, do nothing.
‘replace’ : If table exists, drop it, recreate it, and insert data.
‘append’ : If table exists and the dataframe schema is a subset of the destination table schema, insert data. Create the destination table if it does not exist.

private_key : str (optional)

Service account private key in JSON format. Can be a file path or string contents. This is useful for remote server authentication (e.g. a Jupyter/IPython notebook on a remote host).

auth_local_webserver : boolean, default False

Use the local webserver flow instead of the console flow when getting user credentials.

New in version 0.2.0.

table_schema : list of dicts

List of BigQuery table fields to which the DataFrame columns conform, e.g. [{'name': 'col1', 'type': 'STRING'}, …]. If a schema is not provided, it will be generated according to the dtypes of the DataFrame columns. See the BigQuery API documentation for the available field names.

New in version 0.3.1.

location : str (optional)

Location where the load job should run. See the BigQuery locations documentation for a list of available locations. The location must match that of the target dataset.

New in version 0.5.0.

progress_bar : boolean, default True

Use the tqdm library to show a progress bar for the upload, chunk by chunk.

New in version 0.5.0.

verbose : None, deprecated

Deprecated in pandas-gbq version 0.4.0. Use the logging module to adjust verbosity instead.
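For example, an upload might look like the following. This is an illustrative sketch only: the project ID and destination table are placeholder values.

import pandas as pd
import pandas_gbq

# Small example DataFrame to upload.
df = pd.DataFrame({"col1": ["a", "b"], "col2": [1, 2]})

pandas_gbq.to_gbq(
    df,
    destination_table="my_dataset.my_table",  # placeholder 'dataset.tablename'
    project_id="my-project",  # placeholder project ID
    if_exists="append",  # insert into the table if it already exists
    table_schema=[
        {"name": "col1", "type": "STRING"},
        {"name": "col2", "type": "INTEGER"},
    ],
)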