Safely test and apply changes to your database: getting started with Alembic
Version control your database with this simple Python tool
Safely Test and Apply Changes to Your Database: Getting Started with Alembic
If you are not working with migrations in your database you’re missing out. Database migrations establish the structure and history of changes of our database in code and provide us with the ability to safely apply and revert these changes to our database.
In this article we’ll take a look at a Alembic; a Python tool that makes working with migrations very easy. With it we’ll create migrations that are database-agnostic (meaning you can execute them on any database. These migrations are defined in code which means that we can apply Git-like version control; tracking changes, having a single source of the truth and the ability to collaborate with multiple developers. When you’ve read this article you’ll be able to:
- Define changes to the database structure in code
- version control the changes
- Apply and revert the migrations
- Apply the migrations to multiple databases (e.g. testing on a dev database and then migrating the definitive structure to a production database)
- Not only create your required tables but also add indices and associations
In order to show you all of the features you’ll need to impress your coworkers, I’ve defined 5steps:
- Installation and setting up
- Creating Migrations
- Executing and undoing migrations
- Applying changes to existing databases
- Applying new migrations to a different database
1. Installation and setup
We’re going to use a Python package called Alembic. This package makes it easy to create the database migrations.
Installation
Installation is simple. Open a terminal, navigate to your project directory and execute the code below. This will install and initialize Alembic. Notice that a folder is created in the root called /alembic
.
Unfamiliar with using a terminal? Check out the article below
pip install alembic
alembic init alembic
Next we’ll set up a database connection. Alembic has to know to which database to connect. Normally we’d specify this in /alembic.ini
by adjusting the sqlalchemy.url but it’s better to overwrite this value. Open /alembic/env.py
and find config = context.config
. Below this insert the following:
Keep your credentials safe: it is recommended to load the connection string from an environment variable. If you upload this project to a public repository anyone can see your credentials. Using environment variables you can keep your credentials safe by putting all of your confidential information like passwords and connections strings in a single file that we can .gitignore
. Check out how this works in the article below:
Let’s test our connection with by running alembic current
in the terminal. This shows in which state our database is at the moment. If you see something like the text below you’ve successfully connected to the database.INFO [alembic.runtime.migration] Context impl PostgresqlImpl.
INFO [alembic.runtime.migration] Will assume transactional DDL.
2. Creating migrations
Execute the following code to create your first migration: alembic revision -m "create test schema"
This creates a new file in /alembic/versions
. This file is our actual migration and it contains two functions:
1. upgrade
2. downgrade
In the first function we’ll put the code we want to execute, for example creating a schema. In the second function we’ll put the code that undoes our actions in the first function, for example dropping the schema:
In the upgrade we tell alembic to execute some SQL for creating a schema.
In the downgrade we place instructions to undo exactly this, dropping the schema.
Pretty clear right? Let’s create some tables! We’ll create three tables: students, classes and student_classes. Students and classes will be tables that, obviously, store information on students and classes. Student_classes will connect one or more student-id’s to one or more class-id’s. This is because a student can have multiple classes while a class can have multiple students:
Students table
As you can see we use the op.create_table
function to create a table. First we’ll specify the table name, then all columns. Lastly we specify the schema name. If you just want to use the default schema this can be omitted. Also notice that we add constraints like the primary key and ‘nullable’. Again the downgrade function undoes the upgrade.
The classes table
We’ll only look at the upgrade function here:
The student_classes table
In this table we need to do something fancy. Because this table just exists to bind students to classes and vice versa we can define some more functionalities:
You see that in the code above we define foreignKeys with an ondelete=”CASCADE” flag. This means that if a class or a student gets deleted from their tables (a student graduates or a class is discontinued), then these deletions cascade into the student_class table; deleting associated records. This keeps our data nice and clean.
3. Executing and undoing migrations
Now that we’ve defined all of our tables and relations it’s time to reflect them in the database. In your terminal simply execute:alembic upgrade head
This tells alembic to upgrade all scripts until the head like in Git.
Afterwards we can call alembic history
to show something like this:(venv) C:\rnd\med_alembic>alembic history
a3e1bb14f43b -> a46099cad4d5 (head), student_classes
058a1e623c60 -> a3e1bb14f43b, create classes table
73d31032477a -> 058a1e623c60, create students table
Revision ID: 058a1e623c60
Revises: 73d31032477a
Create Date: 2021–10–22 12:58:23.303590
<base> -> 73d31032477a, create test schema
This shows all of the migrations we’ve performed with a code in front of it. We can use this code to downgrade like:alembic downgrade 058a1e623c60
This will undo all migrations until the creation of the “create students table” migration.
To go back one migration: alembic downgrade -1
(the number, not an L)
To revert all migrations: alembic downgrade base
Reset the database
If you’re still testing and developing the database you can really easily reset. Image that you’ve inserted some data and found a bug in one of your migrations; maybe a wrong column name. Just adjust the errors in the upgrade functions and call:alembic downgrade base && alembic upgrade head
This will reset your entire database in just one statement! While this is very handy for your development database I wouldn’t recommend performing this command on your production database because all data will be lost.fenv
4. Applying changes to existing tables
Sometimes you need to make adjustments to existing tables; renaming a column, adjusting a datatype or adding indices or constraints. With Alembic you can design, test and validate these changes by first developing them on a development database. Once your satisfied that everything works you can just plug in a different .env
file and migrate all new changes to the production database! Below are some examples on how to this.
Adding constraints
Imagine that these tables are in production but there’s something missing. First we’d like to add a UNIQUE constraint to the name column in the classes table; we can’t have multiple classes with the same name. We create a new migration with alembic revision -m “create unique constraint classes.name”
Then add the following content:
Easy as that!
Adding indices
Next problem: our student table has grown quite a lot. Let’s add an index! alembic revision -m “add index on students.name
The code below will create the index.
Rename a column and changing data types
Renaming a column, changing data types and adjusting some constraints (nullable etc) can be changed with the alter_column function:
This changes the column name from “name” to “student_name” and the from a String() to a String with a length of max 100 characters.
Notice that the downgrade undoes all of our changes again.
Don’t forget to check out the difference before and after alembic upgrade head
!
5. Applying new migrations to a different database
No that we’ve tested all of our migrations on our Postgres dev database it’s time to migrate all changes to our production database. For this we only have to connect to the production database and perform our migrations. The best way to do this is using a file that securely holds all of our connection info.
Remember the /alembic/env.py
file from part 1? This is where we define our database connection string. All we have to do is to plug a different .env
file into Alembic. Then we can load the parts of the connection string from the environment. In this case I would use the python-dotenv
package to load the .env
file and create the connection string like below. More on how to use env file here.db_user = os.environ.get("DB_USER")
db_pass = os.environ.get("DB_PASS")
db_host = os.environ.get("DB_HOST")
db_name = os.environ.get("DB_NAME")
db_type = os.environ.get("DB_TYPE")
config.set_main_option('sqlalchemy.url', f'{db_type}://{db_user}:{db_pass}@{db_host}/{db_name}')
Conclusion
Database migrations provide us with the ability to safely test and apply changes to our database. Because we write out all of the changes in code we can include them into our version control which offers even more security. All in all using Alembic is easy, safe and a great idea for designing, testing and changing your database. I hope to have shed some light on how these migrations work and how to use Alembic.
If you have suggestions/clarifications please comment so I can improve this article. In the meantime, check out my other articles on all kinds of programming-related topics like these:
- Create a fast auto-documented, maintainable and easy-to-use Python API in 5 lines of code with FastAPI
- Python to SQL — UPSERT Safely, Easily and Fast
- Create and publish your own Python package
- Create Your Custom, private Python Package That You Can PIP Install From Your Git Repository
- Virtual environments for absolute beginners — what is it and how to create one (+ examples)
- Dramatically improve your database insert speed with a simple upgrade
Happy coding!
— Mike
P.S: like what I’m doing? Follow me!