Writing DataFramesΒΆ

Assume we want to write a DataFrame df into a BigQuery table using to_gbq().

In [1]: import pandas as pd

In [2]: df = pd.DataFrame({'my_string': list('abc'),
   ...:                    'my_int64': list(range(1, 4)),
   ...:                    'my_float64': np.arange(4.0, 7.0),
   ...:                    'my_bool1': [True, False, True],
   ...:                    'my_bool2': [False, True, False],
   ...:                    'my_dates': pd.date_range('now', periods=3)})
   ...: 

In [3]: df
Out[3]: 
  my_string            ...                               my_dates
0         a            ...             2018-09-21 19:25:18.282837
1         b            ...             2018-09-22 19:25:18.282837
2         c            ...             2018-09-23 19:25:18.282837

[3 rows x 6 columns]

In [4]: df.dtypes
Out[4]: 
my_string             object
my_int64               int64
my_float64           float64
my_bool1                bool
my_bool2                bool
my_dates      datetime64[ns]
dtype: object
to_gbq(df, 'my_dataset.my_table', projectid)

Note

The destination table and destination dataset will automatically be created if they do not already exist.

The if_exists argument can be used to dictate whether to 'fail', 'replace' or 'append' if the destination table already exists. The default value is 'fail'.

For example, assume that if_exists is set to 'fail'. The following snippet will raise a TableCreationError if the destination table already exists.

to_gbq(df, 'my_dataset.my_table', projectid, if_exists='fail')

Note

If the if_exists argument is set to 'append', the destination dataframe will be written to the table using the defined table schema and column types. The dataframe must contain fields (matching name and type) currently in the destination table.

Note

If an error occurs while streaming data to BigQuery, see Troubleshooting BigQuery Errors.

Note

The BigQuery SQL query language has some oddities, see the BigQuery Query Reference Documentation.

Note

While BigQuery uses SQL-like syntax, it has some important differences from traditional databases both in functionality, API limitations (size and quantity of queries or uploads), and how Google charges for use of the service. You should refer to Google BigQuery documentation often as the service is always evolving. BiqQuery is best for analyzing large sets of data quickly, but it is not a direct replacement for a transactional database.