How to Create Persistent Storage (e.g., Databases) in Docker


Docker containers were designed to be ephemeral: anything written inside a container is lost when the container is removed. This meant that, out of the box, Docker containers did not know how to deal with persistent data such as large databases. Two workarounds were initially used to make Docker containers work with databases. The Docker volume API was later introduced to deal with persistent data natively. This article contains a brief introduction to working with the Docker volume API.


First, let’s review the workarounds. The first workaround to the Docker/database problem is to run the database outside Docker, for example on a cloud platform or a virtual machine, and have the container reach it over a network port, much as a legacy application would consume an external service. The second workaround is to back up the data inside the container to a service such as Amazon S3, so that it can be retrieved if the container is lost. Given that databases are typically large files, this can be a cumbersome process.

Both of these workarounds have their drawbacks. For this reason, many developers and companies use Docker data volumes. A data volume is a directory, available in one or more containers, that bypasses the Union File System. It is initialized when the container is created, and its data is stored directly on the host. The same volume can therefore be shared and reused by several containers on that host. Changes are written directly to the volume, and images are updated independently of it, so the changes persist even if the containers are lost.

In a scenario where Network Attached Storage (NAS) is used, we have to know which host has access to which mount points or volumes on the storage, and map those mount points into the container.

Docker specifies a volume API explicitly created for this purpose.

docker volume create --name hello

docker run -d -v hello:/container/path/for/volume container_image my_command


The snippet above creates a new volume named hello and then starts a container from container_image with that volume mounted at /container/path/for/volume.
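
To see the persistence in action, the two commands above can be extended into a small round trip: one short-lived container writes a file into the volume, and a second container reads it back. This is only a sketch. The alpine image, the /data mount point, the file name, and the function name volume_roundtrip are illustrative assumptions, not part of the original commands.

```shell
#!/bin/sh
# Sketch: show that data written to a named volume outlives the container.
# Assumptions: a running Docker daemon and the public `alpine` image.

volume_roundtrip() {
    vol="$1"   # name of the volume, e.g. "hello"

    # Create the named volume.
    docker volume create "$vol" >/dev/null || return 1

    # First container: write a file into the volume, then exit and be removed.
    docker run --rm -v "$vol":/data alpine \
        sh -c 'echo "written by container one" > /data/note.txt' || return 1

    # Second container: the file is still there because it lives in the volume.
    docker run --rm -v "$vol":/data alpine cat /data/note.txt
}

# Usage (requires Docker):
#   volume_roundtrip hello
```

Both containers are started with --rm, so neither exists by the time the function returns; only the volume carries the file from one to the other.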

This approach is different from the data-only container pattern.

To see the difference, consider a named volume created with the following option:

-v volume_name:/container/fs/path

A volume created this way has the following properties:

  1. It is listed by docker volume ls.
  2. It can be examined by name with docker volume inspect volume_name.
  3. It can be backed up in the same way as a normal directory.
  4. It can also be backed up through a --volumes-from connection.

In a data-only container approach, one would have to execute each of the steps above individually.
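
As a sketch of how a named volume can be inspected and copied out like a normal directory, the function below prints the host directory behind a volume and archives its contents into the current directory. The alpine image, the /volume and /backup mount points, the default archive name, and the function name backup_volume are assumptions chosen for illustration.

```shell
#!/bin/sh
# Sketch: inspect a named volume and back it up to a tar archive.
# Assumptions: a running Docker daemon and the public `alpine` image.

backup_volume() {
    vol="$1"                      # volume to back up
    archive="${2:-$vol.tar.gz}"   # archive file name, created in $PWD

    # Print the host directory that backs the volume.
    docker volume inspect --format '{{ .Mountpoint }}' "$vol" || return 1

    # Mount the volume read-only next to a bind mount of the current
    # directory, and archive the volume's contents into that directory.
    docker run --rm \
        -v "$vol":/volume:ro \
        -v "$PWD":/backup \
        alpine tar czf "/backup/$archive" -C /volume .
}

# Usage (requires Docker):
#   backup_volume hello              # writes ./hello.tar.gz
#   backup_volume hello db-copy.tgz  # writes ./db-copy.tgz
```

Mounting the volume read-only is a precaution so that the backup container cannot modify the data it is copying.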

Dangling volumes are volumes that are no longer referenced by any container but still consume space on the storage. As such, they can be removed. Identifying them by hand can be difficult, so we can list them with:

docker volume ls -f dangling=true

Using the command below we can delete them:

docker volume rm <volume name>

If the number of dangling volumes is large, a single line of code can be used to delete them in a batch:

docker volume rm $(docker volume ls -f dangling=true -q)
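
One caveat with the one-liner is that docker volume rm is called with no arguments when nothing is dangling, which produces an error. The sketch below wraps the same two commands in a guard; the function name remove_dangling_volumes and the message it prints are illustrative assumptions.

```shell
#!/bin/sh
# Sketch: remove all dangling volumes, doing nothing when there are none.
# Assumption: a running Docker daemon.

remove_dangling_volumes() {
    # -q prints only the volume names, one per line.
    dangling=$(docker volume ls -f dangling=true -q) || return 1

    if [ -z "$dangling" ]; then
        echo "No dangling volumes found."
        return 0
    fi

    # Intentionally unquoted so that each name becomes a separate argument.
    # shellcheck disable=SC2086
    docker volume rm $dangling
}

# Usage (requires Docker):
#   remove_dangling_volumes
```

Recent Docker releases also provide docker volume prune for a similar purpose.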

Another way of achieving persistent storage with Docker is bind mounting. With this method, Docker does not create a new path to mount a volume. Instead, it uses a path on the host that the user specifies. The path can point to either a file or a directory.

The command for bind mounting a host path into a container is:

docker run -v /host/path:/some/path …

Docker will check whether the host path exists, and if it does not, Docker will create it as a directory.

However, bind mounts differ from normal volumes in some respects, since Docker avoids modifying anything on the host that it did not create itself.

Firstly, any data that the image already contains at the container path is not copied into the bind-mounted directory automatically, unlike with a regular volume, where Docker copies that data into a newly created volume.

Secondly, if the following command is used:

docker rm -v my_container

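the Docker container will be removed, but the bind-mounted host path and its contents will still exist. Because a bind mount is not a Docker-managed volume, it does not appear in docker volume ls and cannot be deleted with docker volume rm. If the data is no longer required, the host path has to be deleted manually.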
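
The behavior of bind mounts after a container is removed can be checked with a sketch like the following. The alpine image, the note.txt file, and the function name bind_mount_demo are assumptions for illustration; /some/path is the container path from the command shown earlier.

```shell
#!/bin/sh
# Sketch: data written through a bind mount stays on the host.
# Assumptions: a running Docker daemon and the public `alpine` image.

bind_mount_demo() {
    host_dir="$1"   # absolute path on the host

    # Create the host directory up front so it is owned by the current user.
    mkdir -p "$host_dir" || return 1

    # Write a file through the bind mount; --rm removes the container on exit.
    docker run --rm -v "$host_dir":/some/path alpine \
        sh -c 'echo "kept on the host" > /some/path/note.txt' || return 1

    # The container is gone, but the file remains in the host directory.
    ls -l "$host_dir"

    # Only Docker-managed volumes are listed here, not the bind mount.
    docker volume ls
}

# Usage (requires Docker):
#   bind_mount_demo /tmp/bind-demo
```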
