w3resource

Pandas Series: to_sql() function

Series-to_sql() function

The to_sql() function is used to write records stored in a DataFrame to a SQL database.

Databases supported by SQLAlchemy [1] are supported. Tables can be newly created, appended to, or overwritten.

Syntax:

Series.to_sql(self, name, con, schema=None, if_exists='fail', index=True, index_label=None, chunksize=None, dtype=None, method=None)
Pandas Series: str.to_sql() function

Parameters:

Name Description Type/Default Value Required / Optional
name Name of SQL table. string Required
con Using SQLAlchemy makes it possible to use any DB supported by that library. Legacy support is provided for sqlite3.Connection objects. sqlalchemy.engine.Engine or sqlite3.Connection Required
schema Specify the schema (if database flavor supports this). If None, use default schema. string Optional
if_exists How to behave if the table already exists.
  • fail: Raise a ValueError.
  • replace: Drop the table before inserting new values.
  • append: Insert new values to the existing table.
{'fail', 'replace', 'append'}, default 'fail' Required
index Write DataFrame index as a column. Uses index_label as the column name in the table. bool, default True Required
index_lable Column label for index column(s). If None is given (default) and index is True, then the index names are used. A sequence should be given if the DataFrame uses MultiIndex. string or sequence, default None Required
chunksize Rows will be written in batches of this size at a time. By default, all rows will be written at once. int Optional/td>
dtype Specifying the datatype for columns. The keys should be the column names and the values should be the SQLAlchemy types or strings for the sqlite3 legacy mode. dict Optional
method Controls the SQL insertion clause used:
  • None : Uses standard SQL INSERT clause (one per row).
  • 'multi': Pass multiple values in a single INSERT clause.
  • callable with signature (pd_table, conn, keys, data_iter).
Details and a sample callable implementation can be found in the section insert method.
{None, ‘multi’, callable}, default None Required
dropna If true, ALL nan rows will not be written to store. bool, default False Required
errors Specifies how encoding and decoding errors are to be handled. See the errors argument for open() for a full list of options. str, default ‘strict’ Required

Raises: ValueError
When the table already exists and if_exists is 'fail' (the default).

Notes Timezone aware datetime columns will be written as Timestamp with timezone type with SQLAlchemy if supported by the database. Otherwise, the datetimes will be stored as timezone unaware timestamps local to the original timezone.

Example - Create an in-memory SQLite database:

Create a table from scratch with 3 rows:

Python-Pandas Code:

import numpy as np
import pandas as pd
from sqlalchemy import create_engine
engine = create_engine('sqlite://', echo=False)
df = pd.DataFrame({'name' : ['User A', 'User B', 'User C']})
df

Output:

   name
0	User A
1	User B
2	User C

Python-Pandas Code:

import numpy as np
import pandas as pd
from sqlalchemy import create_engine
engine = create_engine('sqlite://', echo=False)
df = pd.DataFrame({'name' : ['User A', 'User B', 'User C']})
df.to_sql('users', con=engine)
engine.execute("SELECT * FROM users").fetchall()

Output:

[(0, 'User A'), (1, 'User B'), (2, 'User C')]

Python-Pandas Code:

import numpy as np
import pandas as pd
from sqlalchemy import create_engine
df1 = pd.DataFrame({'name' : ['User D', 'User E']})
df1.to_sql('users', con=engine, if_exists='append')
engine.execute("SELECT * FROM users").fetchall()

Output:

[(0, 'User A'), (1, 'User B'), (2, 'User C'), (0, 'User D'), (1, 'User E')]
Pandas Series: str.to_sql() function

Example - Overwrite the table with just df1:

Python-Pandas Code:

import numpy as np
import pandas as pd
from sqlalchemy import create_engine
df1 = pd.DataFrame({'name' : ['User D', 'User E']})
df1.to_sql('users', con=engine, if_exists='append')
df1.to_sql('users', con=engine, if_exists='replace',
           index_label='id')
engine.execute("SELECT * FROM users").fetchall()

Output:

[(0, 'User D'), (1, 'User E')]

Example - Specify the dtype (especially useful for integers with missing values). Notice that while pandas is forced to store the data as floating point, the database supports nullable integers. When fetching the data with Python, we get back integer scalars:

Python-Pandas Code:

import numpy as np
import pandas as pd
df = pd.DataFrame({"X": [2, None, 3]})
df

Output:

   X
0	2.0
1	NaN
2	3.0

Python-Pandas Code:

import numpy as np
import pandas as pd
from sqlalchemy import create_engine
from sqlalchemy.types import Integer
df.to_sql('integers', con=engine, index=False,
          dtype={"X": Integer()})
engine.execute("SELECT * FROM integers").fetchall()

Output:

[(2,), (None,), (3,)]

Previous: Series-to_hdf() function
Next: Series-to_json() function



Follow us on Facebook and Twitter for latest update.