![]() |
VOOZH | about |
Let's start with SQLAlchemy, a Python library that allows communication with databases(MySQL, PostgreSQL etc.) and Python. This library is used as an Object Relational Mapper tool that translates Python classes to tables in relational databases and automatically converts function calls to SQL statements that gives application developers the full power and flexibility of SQL in the programming language communication with databases. SQLAlchemy provides an option that allows developers to create flexible ( format of data transmission is irrelevant or not in a format of data transmission) code to communicate with a different type of database engine. As an ORM tool, SQLAlchemy differs from most others in that it is rooted in a complementarity-oriented approach. The entire processes are exposed within a series of composable, transparent tools rather than being hidden behind a wall of automation.
Using the library, developers can remain in control by automating redundant tasks
of how the database is structured and how SQL is constructed.
In this article, we will look at how to Bulk Insert A Pandas Data Frame Using SQLAlchemy and also a optimized approach for it as doing so directly with Pandas method is very slow.
So first we will create an engine for connection with the database and further we can use to_sql() command to push the data with its constraints. we have used vs-code for inserting data frame using SQLAlchemy, you can use any other editor as well after installing the library.
Note : "Use below sql command to see above results of sql"
We can perform this task by using a method βmultiβ which perform a batch insert by inserting multiple records at a time in a single INSERT statement. But effectiveness(time) for this method is very poor even compared from a default method.
So to overcome this problem we have another method βcallableβ with signature(pd_table, conn, keys, data_iter) which is very faster than βmultiβ method. That can be easily noticed in the below image:
Output: