The Essential Guide to Docker

A Python developer's guide to get started with Docker, Docker-Compose and Docker Desktop
Stane Aurelius

Stane Aurelius, Data Scientist

I wasn’t going to write yet another Docker tutorial; there are plenty of good resources for learning Docker and its related technologies (Docker Compose, Docker Desktop, multi-container apps, etc.). However, while learning Docker intensively, I noticed that my learning needs were best served by combining explanation with a hands-on guide, and these are the resulting learning notes for others in the same boat (heh, see what I did there 😉).

Quick primer on Docker

During any application development process, one of the most commonly encountered problems is the “but it works on my machine!” headache. Lots of developers have used this excuse while shipping their app to a colleague, forgetting that no two machines are set up identically. Faced with this issue, quite a few would also say “Oh, maybe they did the setup wrong, it will take me just a few minutes to install the dependencies and configure their machine, I can fix it real quick!” While they might be able to fix the issue that way, it does not change the fact that collaboration is impractical if you cannot simplify the process of setting up and running your code on every machine. Imagine having to set up the machine of every collaborator (testers included) just because the code only works on yours. Such a hassle.

Docker was developed as a solution to these problems. It enables developers to package their application, along with all of its dependencies and configurations, in an isolated environment called a container. While Docker isn’t the only tool for building and managing containers, it makes the process of building, managing and deploying containers much simpler. The containers can also be distributed to any platform that runs Docker without causing compatibility issues.

For this Docker tutorial, we will be working with a simple app that can be used to store and preview a list of customers from a database. The app will be built using Streamlit, a powerful framework for building data apps in Python with just a few lines of code. This is a quick preview of what you can expect by following this tutorial:

docker tutorial

The app is very simple: the only functionality it provides is taking input from the user and writing it to the database, plus displaying all data within the database in a table. The simplicity helps us focus on learning Docker rather than on building the app itself. Furthermore, this app alone is sufficient to demonstrate the key Docker concepts that most beginners find themselves learning and re-learning:

  • building container image
  • Docker volumes
  • modifying the app & bind mounts
  • multi-container apps
  • Docker Compose

Installation & Setup

Depending on your OS, you can find the detailed installation instructions on the Docker docs website. You will need to install Docker Desktop if you are running Windows or macOS on your machine. If you are running Linux, the old way of installing Docker was to manually install Docker Engine and Docker Compose. Fortunately, Docker Desktop was released for Linux not too long ago, so Linux users can just install it and share the same experience of using Docker with Mac and Windows users.

Docker Desktop provides all the tools you need: Docker Engine, Docker Compose, and the CLI client. It also provides a simple interface for managing your containers, applications, and images. Once you have installed it, launch Docker Desktop and verify the installation by checking the versions using the following commands:

$ docker --version
Docker version 20.10.17, build 100c701

$ docker compose version
Docker Compose version v2.6.1

Preparing the App

Moving on to the app, you just need to create a new folder anywhere you like. Open up that folder in your preferred code editor and create 2 files within it: app.py and requirements.txt, then paste the following code in app.py:

# packages
import streamlit as st
import pandas as pd
import sqlite3

# initiate DB
sqlite_db = "data/customer.db"
conn = sqlite3.connect(sqlite_db)
conn.execute(
    """CREATE TABLE IF NOT EXISTS profile
        (
            Name TEXT,
            Age INT
        );"""
)
conn.commit()

# title
st.title("Customer Book")
st.subheader("the best app for storing your customer data")

# sidebar for data input
st.sidebar.header("Data Input")
name = st.sidebar.text_input("Name")
age = st.sidebar.number_input("Age", min_value=0, max_value=100, step=1)
if st.sidebar.button("Save to database"):
    # pass values as query parameters so quotes in a name cannot break the SQL
    conn.execute("INSERT INTO profile (Name, Age) VALUES (?, ?);", (name, age))
    conn.commit()

# display existing data
df = pd.read_sql("SELECT * FROM profile", con=conn)
st.table(df)

This will be the source code of our app. I added a few comments to it so you can see at a glance what each section does. The next step is to add the dependencies to requirements.txt. Paste this into your requirements.txt:

streamlit==1.11.0
mysql-connector-python==8.0.29

Our app only has 2 direct dependencies: streamlit as the framework for building and running our app, and mysql-connector-python for connecting our app to a MySQL database later on (pandas, which app.py also imports, is pulled in as a dependency of streamlit). Finally, create a subfolder named data, which will be used to store a SQLite database (we will modify the app so that it connects to MySQL later on). At this point, you should have the following folder structure:

app
├── data
├── app.py
└── requirements.txt

It is not necessary to run the app on your local machine. However, you need to know how to properly set it up and run it, because later on we will write a script that tells Docker how to configure the machine and run our app. Basically, our app can be installed and run in a few simple steps:

  1. create a virtual environment
  2. install the dependencies from requirements.txt using pip install -r requirements.txt command
  3. run the app using streamlit run app.py command

Now we’ve finished preparing our app! We can already build and run the app in a container.
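As a side note on the database part of app.py, the storage logic can be tried on its own, without Streamlit or Docker. The following is a minimal sketch, assuming an in-memory SQLite database instead of data/customer.db; the helper names init_db, save_profile and load_profiles are made up for this illustration. It passes the values as query parameters (the ? placeholders), so a name that contains a quote character is stored correctly.

```python
import sqlite3


def init_db(conn):
    """Create the profile table if it does not exist yet."""
    conn.execute("CREATE TABLE IF NOT EXISTS profile (Name TEXT, Age INT);")
    conn.commit()


def save_profile(conn, name, age):
    """Insert one customer; values are bound as parameters, not formatted into the SQL."""
    conn.execute("INSERT INTO profile (Name, Age) VALUES (?, ?);", (name, age))
    conn.commit()


def load_profiles(conn):
    """Return all customers in insertion order."""
    return conn.execute("SELECT Name, Age FROM profile ORDER BY rowid").fetchall()


# ":memory:" is an assumption for this sketch; the app itself uses data/customer.db
conn = sqlite3.connect(":memory:")
init_db(conn)
save_profile(conn, "Jennifer", 20)
save_profile(conn, "O'Brien", 24)  # the apostrophe is handled by the placeholder
rows = load_profiles(conn)
print(rows)
```

Swapping ":memory:" for a file path gives the same behavior with data that survives between runs, which is what the app relies on.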

Building the Container Image

To build the container image, we need to create a Dockerfile. It is essentially a text file of instructions that tells Docker how to build and configure an image. Create a file named Dockerfile in your folder (make sure it has no extension) with the following contents. We’ll go over each instruction in a few minutes!

FROM python:3.9-slim-buster
WORKDIR /app
COPY . .
RUN pip install -r requirements.txt
CMD ["streamlit", "run", "app.py"]

Afterwards, open up your terminal and navigate to the app folder. You can build the container image using the following command in your terminal:

docker build -t customer-book .

I will explain everything, starting from the Dockerfile we just created. Every Dockerfile starts with a FROM instruction, which tells Docker what base image we want to use. In our case, we told Docker that we wanted to use python:3.9-slim-buster as our base image. You can see the list of Docker official images on Docker Hub.

docker official images

If you click on the Python image, for example, you will see that there are lots of available tags that you can use. Different tags mean different bases. In our case (python:3.9-slim-buster), we are using an image that is based on Debian 10 (“Buster”). The slim variant is preferable as it only contains the minimal packages needed to run Python, hence consuming less storage. A sensible way to choose the base image is to pick a base OS you are comfortable with. If you are unsure about which base image you need to use, just use the de facto image provided, e.g. python:.

Moving on to the next instruction, WORKDIR specifies the working directory of the container. Think of it as a combination of mkdir (make directory) and cd (change directory). Having defined the working directory, we then use the COPY instruction to copy all files from the app folder on our local machine to the working directory (/app) inside the container. Next, we RUN the pip install command to install the application’s dependencies within the image. Finally, the CMD instruction specifies the default command to run when a container is started from this image. The following table summarizes the differences between RUN and CMD:

Differences         RUN                                       CMD
Command execution   When building the Docker image            When launching the created image
Usage               Install app and dependencies              Set default command to run the app
Duplicate           Multiple RUN instructions are permitted   Only the last CMD instruction will take effect

The docker build command builds the image based on the instructions specified in the Dockerfile. The -t flag tags our image, that is, it gives a name to the final image. Finally, the . tells Docker to use the current directory as the build context, which is also where it looks for the Dockerfile by default. When the build has finished, you can check the list of images on your local machine using the following command:

$ docker image list
REPOSITORY      TAG       IMAGE ID       CREATED          SIZE
customer-book   latest    7285ffda08e9   52 minutes ago   746MB

Alternatively, you can open Docker Desktop and navigate to the Images tab in the sidebar.

Docker Desktop Image tab

 

Running an App Container

Now that the image has been built, we can use the following command to run our application:

docker run -dp 81:8501 customer-book

The -d flag tells Docker to run the container in detached mode (in the background), while -p 81:8501 creates a mapping between host port 81 and container port 8501 (Streamlit runs on port 8501 by default). Remember that containers are isolated from the host machine. When you run a container using docker run, it does not publish any of its ports to the outside. So, without the -p flag, we won’t be able to access our application. Now navigate to localhost:81 in your web browser and you should see our app! Try adding some data through the sidebar and see if it behaves as expected.
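If you would rather check the published port from a script than from the browser, a plain TCP connection attempt is enough. This is a minimal sketch using only the standard library; the helper name is_port_open is made up for this illustration, and it only tells you that something is listening on the port, not that Streamlit has finished starting.

```python
import socket


def is_port_open(host, port, timeout=1.0):
    """Return True if a TCP connection to host:port can be established."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # connection refused, timed out, or host unreachable
        return False


if __name__ == "__main__":
    # 81 is the host port chosen with -p 81:8501 above
    print(is_port_open("localhost", 81))
```

With the container running this should report that the port is reachable, and after removing the container it should not.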

Running an app in docker

You can see the list of running containers through Docker Desktop or by using the following command:

$ docker ps
CONTAINER ID   IMAGE           COMMAND                  CREATED          STATUS          PORTS                  NAMES
46f68e173bf2   customer-book   "streamlit run app.py"   10 minutes ago   Up 10 minutes   0.0.0.0:81->8501/tcp   thirsty_albattani

Notice that Docker assigns a random name to a running container if we don’t specify one using the --name flag when executing the docker run command. Now try using the following commands to stop and remove the currently running container (docker rm -f expects the container ID or name shown by docker ps), then re-launch the container from the previously created image:

$ docker rm -f 
$ docker run -dp 81:8501 customer-book

Notice that after you remove and re-launch the container, you will not see any of the data you previously added to the database. Most of the time, when we make a change through the running app, we want it to persist: when we remove the container and relaunch it, we expect our change to still be there. But when we start a container, its starting point is the image definition itself. We can create, modify, and delete files within a container, but those modifications are isolated to that container and are not reflected in the image that was used to run it. This is precisely why we need Volumes. We will explore the use of Volumes in the next section.

Docker Volumes

Previously, we encountered a problem regarding data persistence, and I mentioned that Volumes are the solution for it. Docker Volumes are essentially file systems in our local machine that are managed by Docker. Consider it as a bucket of data that we can mount to a directory in which the data is stored within a container. Volumes allow us to back up data and share file systems across different containers easily. We can use docker volume create command to create a volume:

docker volume create app-db

If you haven’t removed the previous container yet, remove it using the docker rm -f command followed by its container ID or name. Then we can run the container while mounting the created volume to the path where our app data is located. For our app, the data is stored in /app/data/customer.db, so we are mounting the volume to /app/data. The volume will then capture all files created at that path.

docker run -dp 81:8501 \
    -v app-db:/app/data \
    customer-book

Now try adding some new data through the app. Afterwards, try removing the running container and re-running it using the same command as above. You should still see the data in the table! Remove the container once you are done playing around with the volume and the app. If you want to see where Docker stores the data when you use named volumes, you can use the docker volume inspect command.

$ docker volume inspect app-db
[
    {
        "CreatedAt": "2022-07-20T06:26:35Z",
        "Driver": "local",
        "Labels": {},
        "Mountpoint": "/var/lib/docker/volumes/app-db/_data",
        "Name": "app-db",
        "Options": {},
        "Scope": "local"
    }
]

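The “Mountpoint” is where Docker stores the data on disk. If you run Docker Engine directly on Linux, you need root access to open this directory. With Docker Desktop, the path refers to the virtual machine that Docker runs in, so you will not find it directly on your local file system.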
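Because docker volume inspect prints JSON, the mountpoint can also be pulled out programmatically. The sketch below parses the sample output shown above with the standard json module; the function name mountpoints is made up for this illustration. In a real script the text could come from subprocess.run(["docker", "volume", "inspect", "app-db"], capture_output=True, text=True).stdout instead of a hard-coded string.

```python
import json

# Sample output copied from the `docker volume inspect app-db` call above
inspect_output = """
[
    {
        "CreatedAt": "2022-07-20T06:26:35Z",
        "Driver": "local",
        "Labels": {},
        "Mountpoint": "/var/lib/docker/volumes/app-db/_data",
        "Name": "app-db",
        "Options": {},
        "Scope": "local"
    }
]
"""


def mountpoints(raw_json):
    """Map each inspected volume name to its mountpoint."""
    return {vol["Name"]: vol["Mountpoint"] for vol in json.loads(raw_json)}


volumes = mountpoints(inspect_output)
print(volumes["app-db"])
```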

Modifying the App & Bind Mounts

During development, there is a good chance that you will need to modify some components of the app that is running inside a container. To do this, you need to modify the source code of your app. However, recall that any changes you make on your local machine will not be reflected in the app running inside the container, since the container is isolated from your local machine and already contains its own copy of the source code (we told Docker to copy the source code from our local machine using the COPY instruction in the Dockerfile when we were building the image).

A simple solution for this problem is to modify the source code on your local machine, rebuild the Docker image, and re-run the container to check whether you have made the appropriate changes. But this approach isn’t practical at all, since building the image might take a long time! This is a case where you want to use bind mounts.

To run a container that supports a development workflow, we need to use bind mounts, which allow us to mount our source code into the container. Think of this as running the exact same container as the previous one, except that the working directory (the app folder) is located on your local machine instead. Navigate to your app folder from the terminal and use the following command:

docker run -dp 81:8501 \
    -w /app \
    -v "$(pwd):/app" \
    --name customer_book \
    python:3.9-slim-buster \
    bash -c "pip install -r requirements.txt && streamlit run app.py"

The command basically tells Docker to run a container with the following flags and arguments:

  • -dp 81:8501: run the app in detached mode and create port mapping
  • -w /app: set the working directory to /app within the container
  • -v "$(pwd):/app": mount the current directory in my local machine to the /app directory within the container
  • --name customer_book: set customer_book as the name of our container
  • use python:3.9-slim-buster as the base image, then open bash and execute commands from the string "pip install -r requirements.txt && streamlit run app.py"
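The same command can also be assembled from Python, which sidesteps the $(pwd) shell substitution on shells that do not support it (for example the Windows Command Prompt). This is only a sketch: the function name build_dev_run_command and its parameters are made up for this illustration, and the code merely builds and prints the argument list rather than starting a container.

```python
import shlex
from pathlib import Path


def build_dev_run_command(project_dir, host_port=81, container_port=8501):
    """Build the `docker run` argument list for the bind-mount development setup."""
    project_dir = Path(project_dir).resolve()  # plays the role of $(pwd)
    return [
        "docker", "run", "-d",                  # -d and -p written separately instead of -dp
        "-p", f"{host_port}:{container_port}",
        "-w", "/app",
        "-v", f"{project_dir}:/app",            # bind mount: host folder -> /app
        "--name", "customer_book",
        "python:3.9-slim-buster",
        "bash", "-c", "pip install -r requirements.txt && streamlit run app.py",
    ]


command = build_dev_run_command(Path.cwd())
print(shlex.join(command))
# To actually start the container, pass the list to subprocess.run(command, check=True)
```

Keeping the command as a list avoids quoting problems when the project path contains spaces.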

You can see the logs using docker logs -f customer_book. Wait for a few seconds until you see the following logs:

You can now view your Streamlit app in your browser.

Network URL: http://172.17.0.2:8501
External URL: http://118.137.124.170:8501

Once you see those logs, navigate to localhost:81 in your web browser. Now open app.py on your local machine and change line 20. We are going to change the subheader of our app:

-   st.subheader('the best app for storing your customer data')
+   st.subheader('A simplified app for storing your customer data')

Save the changes and refresh the web page. The app subheader should already be updated. Notice that we can update the app running inside a container by modifying the source code on our local machine! Furthermore, we do not need to install anything on our local machine, because all setup and configuration is done within the container!

Updated docker app

Once you are done making all the changes you want, rebuild the image using the docker build -t customer-book . command. You can remove unused images using docker image prune.

Multi-Container Apps

To make this Docker tutorial a little more reflective of real-world usage patterns, let us change the app so that it connects to a MySQL database instead of a SQLite database file. This means that we want to have MySQL running in a container. This is where multi-container apps come into the picture. We will create a separate container that runs the MySQL database, for the following reasons:

  • each container should have only one task and do it well
  • you might not need to modify a database as often as you modify APIs and front-ends
  • when you have finished developing your app, you might not want to deploy your database along with it

Since each container is isolated from the others, we need to use networking to allow connections between containers. Simply put, containers can only talk to each other if they belong to the same network. You can create a network using the docker network create command.

docker network create my-app

Running the Database

Now that we have created a network, we can run our MySQL database and assign it to the my-app network. We’ll go over the command in a while!

docker run -d \
    --name mysql_db \
    --network my-app \
    --network-alias mysql \
    -v mysql-data:/var/lib/mysql \
    -e MYSQL_ROOT_PASSWORD=my_password \
    -e MYSQL_DATABASE=customer_db \
    mysql:8.0.29

In the command above, we used the mysql:8.0.29 base image to run our MySQL database container. We also specified some flags:

  • --network my-app: assign this container to the my-app network
  • --network-alias mysql: set mysql as the network alias of this container. We have 3 ways of accessing a container from another container within the same network: IP address, hostname, or network alias. So we defined a network alias to make it easier for our app to connect to this database
  • -v: create a named volume mysql-data and mount it to /var/lib/mysql. This is where MySQL stores its data by default. Notice that Docker automatically creates the named volume for us (we never created this named volume beforehand)
  • -e: set environment variables. You can use this to adjust the configuration of the MySQL instance

Now, let’s verify that we have the database up and running. Connect to the database in interactive mode using docker exec -it mysql_db mysql -p (we are going into the terminal inside the container and executing mysql -p command to open MySQL shell). Type in my_password when prompted, then use the SHOW DATABASES; command to list all the databases.

mysql> SHOW DATABASES;
+--------------------+
| Database           |
+--------------------+
| customer_db        |
| information_schema |
| mysql              |
| performance_schema |
| sys                |
+--------------------+
5 rows in set (0.00 sec)

Voila! The customer_db is listed and ready to use!

Running the App

As the source code of our app still stores the data in a SQLite database, we need to make a few modifications to it. Open up app.py in your preferred code editor and replace its contents with this updated code:

# packages
import streamlit as st
import pandas as pd
import mysql.connector

# connect to MySQL database
def init_connection():
    conn = mysql.connector.connect(**st.secrets["mysql"])

    with conn.cursor() as cursor:
        cursor.execute(
        """CREATE TABLE IF NOT EXISTS profile
            (
                Name TEXT,
                Age INT
            );"""
        )

    return conn

conn = init_connection()

# title
st.title('Customer Book')
st.subheader('A simplified app for storing your customer data')

# sidebar for data input
st.sidebar.header("Data Input")
name = st.sidebar.text_input("Name")
age = st.sidebar.number_input("Age", min_value=0, max_value=100, step=1)
if st.sidebar.button("Save to database"):
    with conn.cursor() as cursor:
        # pass values as query parameters (%s placeholders) instead of formatting them into the SQL
        cursor.execute("INSERT INTO profile (Name, Age) VALUES (%s, %s);", (name, age))
        conn.commit()

# display existing data
df = pd.read_sql("SELECT * FROM profile", con=conn)
st.table(df)

We also need to provide the database configuration to Streamlit. Create a new folder named .streamlit and a new file named secrets.toml within it. Then paste this database configuration into secrets.toml:

[mysql]
host = "mysql"
user = "root"
password = "my_password"
database = "customer_db"
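To see how this file lines up with the mysql.connector.connect(**st.secrets["mysql"]) call in app.py: the [mysql] table behaves like a dictionary, and the ** operator passes each of its keys as a keyword argument. The sketch below mimics that with a plain dict and a stand-in function; fake_connect is made up for this illustration and does not talk to any database.

```python
def fake_connect(host, user, password, database):
    """Stand-in for mysql.connector.connect; it only echoes the arguments it received."""
    return {"host": host, "user": user, "password": password, "database": database}


# Plain-dict equivalent of the [mysql] table in secrets.toml
secrets = {
    "mysql": {
        "host": "mysql",  # the network alias of the database container
        "user": "root",
        "password": "my_password",
        "database": "customer_db",
    }
}

# ** unpacks the dict into host=..., user=..., password=..., database=...
conn_args = fake_connect(**secrets["mysql"])
print(conn_args["host"], conn_args["database"])
```

This is why the key names in secrets.toml have to match the parameter names that the connect function accepts.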

You can delete the data folder we previously created, since we now use a MySQL database to store the app data. Your app folder structure should look like this:

app
├── .streamlit
│   └── secrets.toml
├── app.py
└── requirements.txt

Now run the app in a new container using the command from the Modifying the App & Bind Mounts section! Don’t forget to also attach the app to the my-app network so it can communicate with the MySQL database.

docker run -dp 81:8501 \
    -w /app \
    -v "$(pwd):/app" \
    --network my-app \
    --name customer_book \
    python:3.9-slim-buster \
    bash -c "pip install -r requirements.txt && streamlit run app.py"

After a few seconds, you should be able to open the app by navigating to localhost:81 in your web browser. Try to add some data through the sidebar and see if it behaves as expected.

Docker app final

 

If you have added some new data, let’s check whether it was actually written to our database. Use the following command and enter the password my_password when prompted.

docker exec -it mysql_db mysql -p customer_db

Then we can query the profile table to verify whether the data from our app is written to the database.

mysql> SELECT * FROM profile;
+----------+------+
| Name     | Age  |
+----------+------+
| Jennifer |   20 |
| Sheilla  |   24 |
+----------+------+
2 rows in set (0.01 sec)

We have verified that our containers are connected to each other! Now take a look at Docker Desktop. We can see that we have 2 containers running, but there isn’t any indication that they belong to the same network.

Desktop Multi Container

 

Docker Compose

At this point, you have already seen how to start up your application in Docker. In the last section, we wrote some commands with many flags to spin up our application stack in development mode, each flag providing different functionality such as mapping ports, setting environment variables, managing networks, etc. There were lots of things that needed to be done, which makes it hard to share the application stack when you want to collaborate.

Fortunately, Docker provides a tool to help build and share multi-container applications: Docker Compose. It allows us to define our application stack in a single YAML file. With a single command, we can then spin everything up or tear it all down, which is very practical for sharing your app with your team.

Defining the Services

Before creating the YAML file, recall that this was the command that we used to run our app inside a container.

docker run -dp 81:8501 \
    -w /app \
    -v "$(pwd):/app" \
    --network my-app \
    --name customer_book \
    python:3.9-slim-buster \
    bash -c "pip install -r requirements.txt && streamlit run app.py"

Remember that we are using bind mounts to run our app in development mode. A Dockerfile cannot set up bind mounts, as it only provides instructions for building Docker images. Without further ado, create a docker-compose.yml file in your local app folder. Then, we need to define the list of services (containers) we want to run for our application:

services:
  app:
  mysql:

You can use any name you want for the service. The name you used will automatically become a network alias, making it useful to access a container within another container in the same network. Let’s first define the app service in our YAML file. For our app service, we need 6 definitions:

  • container_name
  • image
  • working_dir
  • ports
  • command
  • volumes

services:
  app:
    container_name: customer_book
    image: python:3.9-slim-buster
    working_dir: /app
    volumes:
      - ./:/app
    command: bash -c "pip install -r requirements.txt && streamlit run app.py"
    ports:
      - 81:8501

Now we just need to define the database service. Recall that this was the command we used to run the database inside a container.

docker run -d \
    --name mysql_db \
    --network my-app \
    --network-alias mysql \
    -v mysql-data:/var/lib/mysql \
    -e MYSQL_ROOT_PASSWORD=my_password \
    -e MYSQL_DATABASE=customer_db \
    mysql:8.0.29

Defining the database service in our YAML file is very similar to defining the app service. In this case, our database service needs 4 definitions: container_name, image, volumes, and environment. Upon defining the database service, our docker-compose.yml should look like this:

services:
  app:
    container_name: customer_book
    image: python:3.9-slim-buster
    working_dir: /app
    volumes:
      - ./:/app
    command: bash -c "pip install -r requirements.txt && streamlit run app.py"
    ports:
      - 81:8501

  mysql:
    container_name: mysql_db
    image: mysql:8.0.29
    volumes:
      - mysql-data:/var/lib/mysql
    environment:
      - MYSQL_ROOT_PASSWORD=my_password
      - MYSQL_DATABASE=customer_db

When we were using docker run to spin up our database, we didn’t need to create a named volume (mysql-data) before using it. However, that doesn’t apply when you are running with Docker Compose. You need to define a top level volumes section and provide the named volumes there. Our final YAML file should look like this:

services:
  app:
    container_name: customer_book
    image: python:3.9-slim-buster
    working_dir: /app
    volumes:
      - ./:/app
    command: bash -c "pip install -r requirements.txt && streamlit run app.py"
    ports:
      - 81:8501

  mysql:
    container_name: mysql_db
    image: mysql:8.0.29
    volumes:
      - mysql-data:/var/lib/mysql
    environment:
      - MYSQL_ROOT_PASSWORD=my_password
      - MYSQL_DATABASE=customer_db

volumes:
  mysql-data:
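The rule about the top-level volumes section can be checked mechanically. The sketch below is a simplified illustration and not part of Docker Compose: it represents the relevant parts of the YAML file as a Python dict (written by hand here rather than parsed from the file) and reports named volumes that services use but that are not declared at the top level. The function names are made up, and it only understands the short source:target syntax with POSIX-style paths.

```python
def is_named_volume(source):
    """Treat sources starting with '.', '/' or '~' as bind mounts, anything else as a named volume."""
    return not source.startswith((".", "/", "~"))


def undeclared_named_volumes(compose):
    """Return named volumes referenced by services but missing from the top-level volumes section."""
    declared = set(compose.get("volumes") or {})
    used = set()
    for service in compose.get("services", {}).values():
        for mount in service.get("volumes", []):
            source = mount.split(":", 1)[0]
            if is_named_volume(source):
                used.add(source)
    return used - declared


# Hand-written dict mirroring the volume-related parts of the final docker-compose.yml
compose = {
    "services": {
        "app": {"volumes": ["./:/app"]},
        "mysql": {"volumes": ["mysql-data:/var/lib/mysql"]},
    },
    "volumes": {"mysql-data": None},
}
print(undeclared_named_volumes(compose))

# The same services without the top-level volumes section
without_top_level = {"services": compose["services"]}
print(undeclared_named_volumes(without_top_level))
```

The first call finds nothing missing, while the second one reports mysql-data, which corresponds to the version of the file shown before the volumes section was added.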

Running & Tearing Down the Application Stack

Make sure that there are no containers running by using docker ps. Since we already have the docker-compose.yml file, we just need to run the following command to start up our application stack in detached mode:

docker compose up -d

Once you see the following logs, you should be able to open the application in your web browser:

[+] Running 4/4
 ⠿ Container mysql_db       Started                                                                                                                                  0.8s
 ⠿ Container customer_book  Started   

We’ve started the application stack using just a single command! Now you can easily share your application stack with anyone! Try opening Docker Desktop and you should see a group named app. By default, Docker Compose names the group after the directory where docker-compose.yml is located.

Docker Desktop group

 

Finally, to tear everything down, you just need to hit the trash button in Docker Desktop, or simply use the docker compose down command. This will stop and remove the containers and the network while keeping the named volumes intact (use --volumes if you also want to remove named volumes).

And that rounds out our Docker guide for beginners, where we applied some hands-on practice while learning about each key Docker concept. I hope it is a good introduction to Docker, and that you find as much value in it as I did in writing it.
