Nowadays everybody is trying to run everything in containers, and I don't blame them; I do the same thing, because running applications, databases, or other tools in a Docker container is super nice, and we all know why (isolation, easy setup, security...). However, debugging, accessing, or generally interacting with containers can sometimes be pretty annoying. This includes accessing, modifying, or querying databases. As I have used PostgreSQL extensively and have been running it inside containers for a while now, I have, over time, put together a list of a few commands that can help immensely when doing simple, and also not so simple, operations with a database server.
Log Into PSQL
The most basic thing you will always need to do if you want to interact with your database server is to connect to the database itself (using PSQL):
docker exec -it <container_name> psql -U<user_name> -d <db_name>
So, for a Docker container called db, the default user postgres, and a database named blog, it would be:
docker exec -it db psql -Upostgres -d blog
Running Command Against Database
It's nice that you can log in and then execute whatever commands you need, but it's often more convenient to do it in one go, especially if you want to run just a single command or query:
docker exec -it <container_name> psql -U<user_name> -d <db_name> -c '<command/query>'
So, if we wanted to list all tables in the database using the same parameters as in the previous example:
docker exec -it db psql -Upostgres -d blog -c '\dt'
Here, \dt lists all tables in the current database. If you are not familiar with psql "backslash" commands, then I highly recommend this cheatsheet.
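For quick reference, here are the backslash meta-commands I reach for most often; these are standard psql meta-commands, and the annotations are my own:

```
\l          list all databases
\dt         list tables in the current database
\d <table>  describe a table's columns and indexes
\du         list roles/users
\q          quit psql
```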
Apart from psql commands, you can run any SQL query like so:
docker exec -it db psql -Upostgres -d blog -c 'SELECT * FROM posts;'
Backing up Your Data
From time to time I need to back up the data or the whole schema of a database, sometimes just as an "insurance policy" and sometimes so I can make changes recklessly and restore everything afterwards. Here is how to do it:
docker exec <container_name> pg_dump -U<user_name> --column-inserts --data-only <db_name> > backup.sql
In this example, we are making use of the pg_dump utility, which allows us to extract a PostgreSQL database to a script file. Note that the -t flag is left out here, since a TTY would mangle line endings in the redirected output. I'm using the --column-inserts and --data-only flags to get only the table rows as INSERT statements, but quite often all that is needed is the schema; for that you can use the -s flag.
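To restore such a backup later, you can feed the file back to psql over stdin; a sketch using the same db container and blog database as in the earlier examples (note -i without -t, since we are piping a file in rather than attaching a terminal):

```shell
docker exec -i db psql -Upostgres -d blog < backup.sql
```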
Execute whole SQL files
Sometimes you need to populate an existing database with enough data for testing (please don't do this with production databases), or it's just easier to use data from a file than to copy and paste it into the command above.
docker cp ./data.sql <container_name>:/data.sql
docker exec -it <container_name> psql -U<user_name> -a <db_name> -f /data.sql
Here we first need to copy the file itself into the running container and then execute it using the -f flag.
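For completeness, a minimal sketch of what data.sql might contain; the posts table and its columns are hypothetical, so adjust them to your own schema:

```sql
-- hypothetical table, adjust to your schema
INSERT INTO posts (title, body) VALUES
    ('First post',  'Hello world!'),
    ('Second post', 'More content here.');
```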
Prepopulating the Database on Start-up
The previous example is good enough if you only need to do this from time to time, but it can become annoying if you have to do it every time you start the database. So, in case you decide that it's better to just populate the database on start-up, here is a solution for that. It just requires a little more work:
We will need the following files:
Dockerfile - Dockerfile for your Postgres image
create_db.sh - Script that creates the database and schema, and populates it
schema.sql - SQL file containing your database schema
data.sql - SQL file containing the data used to populate your database
.env - File with environment variables, to make your life easier
First, the Dockerfile:
FROM postgres:11
# Custom initialization scripts
COPY ./create_db.sh /docker-entrypoint-initdb.d/20-create_db.sh
COPY schema.sql /schema.sql
COPY data.sql /data.sql
RUN chmod +x /docker-entrypoint-initdb.d/20-create_db.sh
This is a very simple Dockerfile; all we need to do here is copy our script and schema/data into the image so they can be run on start-up. You may be asking: there is no ENTRYPOINT or CMD, so how do we run it on start-up? The answer is that the base postgres image runs, on start, any scripts present in the docker-entrypoint-initdb.d directory, so all we need to do is copy our script to this directory and PostgreSQL takes care of the rest.
But what is in the script (create_db.sh)?
#!/bin/bash
set -e
POSTGRES="psql --username ${POSTGRES_USER}"
echo "Creating database: ${DB_NAME}"
$POSTGRES <<EOSQL
CREATE DATABASE ${DB_NAME} OWNER ${POSTGRES_USER};
EOSQL
echo "Creating schema..."
psql -d ${DB_NAME} -a -U${POSTGRES_USER} -f /schema.sql
echo "Populating database..."
psql -d ${DB_NAME} -a -U${POSTGRES_USER} -f /data.sql
The start-up script first connects to psql with the specified username (POSTGRES_USER), then it creates your database (DB_NAME). Next it creates the database schema using the file we copied into the image, and finally it populates the database with data. All the variables here come from the .env file mentioned before, which makes it very easy to change your database name or username at any time without modifying the script itself.
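The .env file itself is not shown above; as a sketch, it only needs the variables the script and image read. POSTGRES_USER and POSTGRES_PASSWORD are read by the official postgres image, while DB_NAME is this article's own variable, and the values are placeholders:

```
# .env - hypothetical values
POSTGRES_USER=postgres
POSTGRES_PASSWORD=secret
DB_NAME=blog
```

You can then build and run the image with docker build -t my-postgres . followed by docker run --env-file .env -d my-postgres.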
For a more complete example, please see my repository here
What About docker-compose?
In my experience, most of the time I end up running the database in conjunction with the application that uses it, and the simplest way to do that is docker-compose. I usually prefer to refer to the docker-compose service by service name rather than by container name, which may or may not be the same thing. In case it isn't, you can use the following command:
docker exec -it $(docker-compose ps -q <db_service_name>) psql -U<user_name> -d <db_name>
The only real difference here from the previous examples is the docker-compose ps -q part, which looks up the container of the specified service. The -q flag makes it display only container IDs, which is all we need.
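For context, a minimal docker-compose.yml for such a setup might look like this; the service name db matches the earlier examples, but the image tag and credentials are assumptions:

```yaml
version: "3"
services:
  db:
    image: postgres:11
    environment:
      POSTGRES_USER: postgres
      POSTGRES_PASSWORD: secret
    ports:
      - "5432:5432"
```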
Conclusion
I hope at least some of these little hacks will make your life easier when dealing with Docker and PostgreSQL. And if you were avoiding Docker just because it can be a little annoying when working with databases, then I hope you will give it a shot after reading this article. 🙂