The easiest way to UPSERT with SQLAlchemy
One command to both INSERT new data and UPDATE existing records in your database
In this short article we’ll find out how we can UPSERT
in SQLAlchemy: we INSERT
new data to our database and UPDATE
records that already exist with the newly provided values. We’ll start out with a use-case, prepare our database and then UPSERT! Easy enough! Let’s code!
When would I want to UPSERT?
Upserting is to INSERT
new records into a database table but UPDATE
records that already exist in that dataset. The database “knows which records already exist” by comparing the Primary Key column of the table.
It’s important to UPDATE
existing records so that our foreign key relationships remain intact. In the next part we’ll go through an example that’ll clearly explain our goal in practice.
Setting up an example
Imagine we run a web shop for a guitar store. Of course we have to keep track of which guitars are in stock, how many are in stock and what their prices are. This is what the inventory
table does:
sku name price stock
LPTD19HSNH1 Gibson Les Paul 2000 6
1-014-9763-300 Fender Jazzmaster 1799 3
IRONMK1KH80BK BC Rich Ironbird 1099 4
Each row is unique per sku
: this is the primary key
of the table. The primary key is pretty important when upserting.
Every day we receive a data set that could look like this:
sku name price stock
LPTD19HSNH1 Gibson Les Paul 2000 4
1-014-9763-300 Fender Jazzmaster 1700 3
DSV00ANCH1 Gibson Flying V 2499 1
Our goal is to UPSERT
the new data into the inventory
table:
INSERT
new records (like the Gibson Flying V)UPDATE
existing records (like the new price for the Fender Jazzmaster
Upserting with SQLAlchemy
We are going to use SQLAlchemy to UPSERT in 3 steps.
In this article I assume you’re familiar with SQLAlchemy so I won’t go too much in detail since it’s not in our current scope but I’ll provide links to article with a more in-depth explanation. Also check out this link for more articles on SQLAlchemy.
1. Setting up our database and connection
First we declare a model called ‘Guitar’. This corresponds to a single row in our inventory table. Notice that it has a column called sku
that is a primary key.
# SQLAlchemy needs a declarative_base when we use declarative models like Guitar
Base = declarative_base()
# Guitar will be one record in our inventory table
class Guitar(Base):
__tablename__ = 'inventory'
sku: typing.Optional[str] = sa.Column(type_=sa.String, primary_key=True)
name: str = sa.Column(type_=sa.String)
stock:int = sa.Column(type_=sa.Integer)
price: int = sa.Column(type_=sa.Integer)
def __repr__(self):
return f"<{self.name} (price=${self.price} - stock=#{self.stock}>"
# Create a database engine (sqlite in this case)
dbEngine = sa.create_engine(url="sqlite:///inventory_db.db", echo=False)
# Create all tables in the database engine
Base.metadata.create_all(bind=dbEngine)
Next we create a database engine for our sqlite database (check out this article for more information) and use it to create the inventory
table.
2. Insert data into the inventory
table
Next we’ll create a few instances of our Guitar
model and insert them into the database using SQLAlchemy’s Session
object:
with Session(dbEngine) as session:
session.add_all([
Guitar(sku="LPTD19HSNH1", name="Gibson Les Paul", price=2000, stock=6),
Guitar(sku="1-014-9763-300", name="Fender Jazzmaster", price=1799, stock=3),
Guitar(sku="IRONMK1KH80BK", name="BC Rich Ironbird", price=1099, stock=4),
])
session.commit()
When we SELECT * FROM inventory
we see the contents of our inventory
table:
sku name price stock
LPTD19HSNH1 Gibson Les Paul 2000 6
1-014-9763-300 Fender Jazzmaster 1799 3
IRONMK1KH80BK BC Rich Ironbird 1099 4
Next we’re going to receive a new dataset that contains the new daily stock. We’re going to have to insert new Guitar
s and update records in our current inventory
.
3. Upsert into the inventory table
Finally it’s time to UPSERT! Here we go:
# Convert the new stock to a list of Guitar objects
new_stock:[Guitar] = [
Guitar(sku="LPTD19HSNH1", name="Gibson Les Paul", price=2000, stock=4), # sold 2 les pauls
Guitar(sku="1-014-9763-300", name="Fender Jazzmaster", price=1700, stock=3), # lowered price on jazzmaster
Guitar(sku="DSV00ANCH1", name="Gibson Flying V", price=2499, stock=1), # new guitar in stock!
]
# Use the session and UPSERT using the "merge" method
with Session(dbEngine) as session:
for guitar in new_stock:
session.merge(guitar)
session.commit()
As you see we use session.merge
. This method analyzes our new_stock and checks which of the provided sku
values are already present in the inventory
table. If the sku
exists it performs an update, otherwise it just inserts the new guitar.
This is the resulting table:
sku name price stock
LPTD19HSNH1 Gibson Les Paul 2000 4 << stock decreased by 2
1-014-9763-300 Fender Jazzmaster 1700 3 << price lowered by $99
IRONMK1KH80BK BC Rich Ironbird 1099 4 << no chnages
DSV00ANCH1 Gibson Flying V 2499 1 << new in stock!
One thing to notice is that the merge
method differs from the MERGE in SQL since it only inserts and updates. If you want more fine-grain control check out this article that shows you how to delete/update/insert in one statement in SQL.
Conclusion
As we’ve seen UPSERTING is very easy when we use the SQLAlchemy ORM. Check out my other articles here to find out the many uses of Python, SQLAlchemy, Docker and databases in general!
I hope this article was as clear as I hope it to be but if this is not the case please let me know what I can do to clarify further. In the meantime, check out my other articles on all kinds of programming-related topics like these:
- Git for absolute beginners: understanding Git with the help of a video game
- Create and publish your own Python package
- Create a fast auto-documented, maintainable, and easy-to-use Python API in 5 lines of code with FastAPI
Happy coding!
— Mike
P.S: like what I’m doing? Follow me!