How to use the DB2 LOAD utility using the python ibm_db driver
LOAD is a DB2 utility that I would like to use to insert data into a table from a CSV file.
How to use the DB2 LOAD utility using the python ibm_db driver
Question by Harvinder
LOAD is a DB2 utility that I would like to use to insert data into a table from a CSV file. How can I do this in Python using the ibm_db
driver? I don’t see anything in the docs here
CMD: LOAD FROM xyz OF del INSERT INTO FOOBAR
Running this as standard SQL fails as expected:
Transaction couldn't be completed: [IBM][CLI Driver][DB2/LINUXX8664] SQL0104N An unexpected token "LOAD FROM xyz OF del" was found following "BEGIN-OF-STATEMENT". Expected tokens may include: "<space>". SQLSTATE=42601 SQLCODE=-104
Using the db2 CLP directly (i.e. os.system('db2 -f /path/to/script.file')
) is not an option as DB2 sits on a different machine that I don’t have SSH access to.
EDIT
:
Using the ADMIN_CMD
utility also doesn’t work because the file being loaded cannot be put on the database server due to firewall. For now, I’ve switched to using INSERT
Answer by Milovan Tomašević
CSV to DB2 with Python
Briefly: One solution is to use an SQLAlchemy adapter and Db2’s External Tables.
SQLAlchemy:
The Engine is the starting point for any SQLAlchemy application. It’s “home base” for the actual database and its DBAPI, delivered to the SQLAlchemy application through a connection pool and a Dialect, which describes how to talk to a specific kind of database/DBAPI combination.
Where above, an Engine references both a Dialect and a Pool, which together interpret the DBAPI’s module functions as well as the behavior of the database.
Creating an engine is just a matter of issuing a single call, create_engine():
dialect+driver://username:password@host:port/database
Where dialect is a database name such as mysql, oracle, postgresql, etc., and driver the name of a DBAPI, such as psycopg2, pyodbc, cx_oracle, etc.
Load data by using transient external table:
Transient external tables (TETs) provide a way to define an external table that exists only for the duration of a single query.
TETs have the same capabilities and limitations as normal external tables. A special feature of a TET is that you do not need to define the table schema when you use the TET to load data into a table or when you create the TET as the target of a SELECT statement.
Following is the syntax for a TET:
INSERT INTO <table> SELECT <column_list | *>
FROM EXTERNAL 'filename' [(table_schema_definition)]
[USING (external_table_options)];
CREATE EXTERNAL TABLE 'filename' [USING (external_table_options)]
AS select_statement;
SELECT <column_list | *> FROM EXTERNAL 'filename' (table_schema_definition)
[USING (external_table_options)];
For information about the values that you can specify for the external_table_options
variable, see External table options.
General example
- Insert data from a transient external table into the database table on the Db2 server by issuing the following command:
INSERT INTO EMPLOYEE SELECT * FROM external '/tmp/employee.dat' USING (delimiter ',' MAXERRORS 10 SOCKETBUFSIZE 30000 REMOTESOURCE 'JDBC' LOGDIR '/logs' )
Requirements
pip install ibm-db
pip install SQLAlchemy
Pyton code
One example below shows how it works together.
from sqlalchemy import create_engine
usr = "enter_username"
pwd = "enter_password"
hst = "enter_host"
prt = "enter_port"
db = "enter_db_name"
#SQL Alchemy URL
conn_params = "db2+ibm_db://{0}:{1}@{2}:{3}/{4}".format(usr, pwd, hst, prt, db)
shema = "enter_name_restore_shema"
table = "enter_name_restore_table"
destination = "/path/to/csv/file_name.csv"
try:
print("Connecting to DB...")
engine = create_engine(conn_params)
engine.connect() # optional, output: DB2/linux...
print("Successfully Connected!")
except Exception as e:
print("Unable to connect to the server.")
print(str(e))
external = """INSERT INTO {0}.{1} SELECT * FROM EXTERNAL '{2}' USING (CCSID 1208 DELIMITER ',' REMOTESOURCE LZ4 NOLOG TRUE )""".format(
shema, table, destination
)
try:
print("Restoring data to the server...")
engine.execute(external)
print("Data restored successfully.")
except Exception as e:
print("Unable to restore.")
print(str(e))
Conclusion
- A great solution for restoredlarge files, specifically, 600m worked without any problems.
- It is also useful for copying data from one table/database to another table. So that the backup is done as an export of csv and then that csv into DB2 with the given example.
- SQLAlchemy-Engine can be combined with other databases such as: sqlite, mysql, postgresql, oracle, mssql, etc.
Share on: