Bulk-loading data into SQL Server with pyodbc is very slow. To address this, a Python package called bcpy was introduced.
What is it?
This package is a wrapper for Microsoft's SQL Server bcp utility. The database drivers currently available in Python are not fast enough for transferring millions of records. Using bcp is one of the fastest ways to load data from a CSV file into SQL Server with Python.
1. Install the Python package bcpy by running the command below.
pip install bcpy
2. Download and install a suitable ODBC driver from the Microsoft site.
https://www.microsoft.com/en-us/download/details.aspx?id=56567
3. Download and install the BCP utility from the Microsoft site.
https://docs.microsoft.com/en-us/sql/tools/bcp-utility?view=sql-server-2017#T
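Since bcpy shells out to the bcp utility installed in step 3, it may help to confirm that bcp is actually discoverable on the PATH before running any bcpy code. A minimal check (the helper name `bcp_available` is my own, using only the standard library) might look like:

```python
import shutil

def bcp_available() -> bool:
    # bcpy invokes the bcp command-line tool, so it must be on the PATH.
    return shutil.which("bcp") is not None

print("bcp found:", bcp_available())
```

If this prints False, revisit step 3 and make sure the utility's install directory is on your PATH.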
The following example creates a DataFrame with 100 rows and 4 columns populated with random data, then sends it to SQL Server.
import bcpy
import numpy as np
import pandas as pd

# Connection details for the target SQL Server instance
sql_conn = {
    'server': 'Server',
    'database': 'Database',
    'username': 'UserName',
    'password': 'Password'
}
table_name = 'SampleTable'

# Build a 100 x 4 DataFrame of random integers in [-100, 100)
df = pd.DataFrame(np.random.randint(-100, 100, size=(100, 4)),
                  columns=list('ABCD'))

# Wrap the DataFrame and bulk-load it into the table via bcp
bdf = bcpy.DataFrame(df)
sql_table = bcpy.SqlTable(sql_conn, table=table_name)
bdf.to_sql(sql_table)
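Because the typical use case is loading an existing CSV file rather than a generated DataFrame, the same pattern can be wrapped in a small helper. This is only a sketch: `load_csv_to_sql` is a hypothetical name of my own, it expects the same connection-dict shape as the example above, and the third-party imports are done lazily inside the function so they are only required when it is actually called.

```python
def load_csv_to_sql(csv_path, sql_config, table_name):
    # Hypothetical helper (name is my own): bulk-load a CSV into SQL Server
    # using bcpy, following the same DataFrame -> SqlTable pattern as above.
    import bcpy            # third-party; needed only when the helper runs
    import pandas as pd    # third-party; needed only when the helper runs

    df = pd.read_csv(csv_path)                       # read the CSV into memory
    bdf = bcpy.DataFrame(df)                         # wrap it for bcp transfer
    target = bcpy.SqlTable(sql_config, table=table_name)
    bdf.to_sql(target)                               # shell out to bcp
    return len(df)                                   # rows sent to the server
```

For very large files this still reads the whole CSV into a pandas DataFrame first, so memory is the practical limit; the speedup comes from bcp handling the actual network transfer instead of a row-by-row driver.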