Docker Volumes for Persistent Storage

“Every man is a volume if you know how to read him.” -- William Ellery Channing

Docker containers and images are all well and good for things like isolation, consistency, and relatively straightforward scalability. But what if we need to remember things between container runs, such as the contents of a database or changes to a configuration file? That is where Docker volumes come in handy. Because a volume can be shared, it also makes team collaboration easier.

Code

Source code for this tutorial is available on my GitHub.

Prerequisites:

Since volumes are a moderately advanced topic in the Docker world, you need to understand at least the following topics.

Best practices but not necessarily required:

Scripts I Found Handy

Remove all containers that have exited or were created but never started.

docker rm $(docker ps -a -f status=exited -f status=created -q)

Remove all images. Use this with caution! -f means force removal!

docker rmi -f $(docker images -q)

Installing

I am running Ubuntu 24.04 as a VirtualBox guest on a Windows 11 host, and I edit with Visual Studio Code. Two ways of installing VirtualBox are either directly https://www.keypuncher.net/blog/virtualbox-windows-host-manjaro-guest or through Vagrant https://www.keypuncher.net/blog/vagrant-virtualbox-with-ubuntu-step-by-step.

After the installation process completes, if you don’t want to type sudo for every docker command, add your user to the docker group as shown at: Post-installation steps | Docker Docs. Log out and back in once you have added your user.

I am using Docker version 27.5.1. To see which Docker packages you have installed you can use Ubuntu’s apt list:

apt list --installed | grep docker

per apt - How to list all installed packages - Ask Ubuntu.

Your First C++ Volume and Dockerfile

Docker Hub (Docker Hub Container Image Library | App Containerization) rate-limits image pulls, and the limit is strictest for unauthenticated users. To get around this I suggest you log in as follows:

docker login -u <username>

Then create a directory for our C++ application:

mkdir my-cpp-docker

cd my-cpp-docker

Next open VS Code or your favorite integrated development environment (IDE) or editor and add the two files: Dockerfile and main.cpp.

The Dockerfile contains:

# Use the official GCC image with the specified version
FROM gcc:13.3.0

# Set the working directory in the container
WORKDIR /usr/src/app

# Copy the source code into the container
COPY main.cpp main.cpp

# Compile the C++ program
RUN g++ -o myapp main.cpp

And this Dockerfile does the following:

FROM gcc:13.3.0:

This line specifies the base image for the Docker container. Here, we're using the official GCC image with version 13.3.0, which includes the GNU Compiler Collection tools necessary to compile our C++ code.

WORKDIR /usr/src/app:

This sets the working directory inside the container to /usr/src/app. Any subsequent instructions that use relative paths will use this directory as the base.

COPY main.cpp main.cpp:

This copies the main.cpp file from your host machine to the working directory inside the container (/usr/src/app).

RUN g++ -o myapp main.cpp:

This runs the command g++ -o myapp main.cpp inside the container, which compiles the main.cpp file into an executable named myapp.

Now let’s write the main.cpp file:

#include <iostream>
#include <filesystem>
#include <fstream>
#include <string>
#include <regex>

namespace fs = std::filesystem;

int main() {
    std::string directory = "cpp-app-volume";
    int fileCount = 0;
    std::regex dataFilePattern("data\\d+\\.txt");

    // Ensure the directory exists
    if (!fs::exists(directory)) {
        fs::create_directory(directory);
    }

    // Iterate through the files in the directory using regular expressions
    for (const auto& entry : fs::directory_iterator(directory)) {
        std::string filename = entry.path().filename().string();
        if (std::regex_match(filename, dataFilePattern)) {
            fileCount++;
        }
    }

    // Create a new file for the current run
    std::string newFilename = directory + "/data" + std::to_string(fileCount + 1) + ".txt";
    std::ofstream outfile(newFilename);
    outfile << "This is run number " << (fileCount + 1) << std::endl;
    outfile.close();

    std::cout << "Created file: " << newFilename << std::endl;

    return 0;
}

The main.cpp program does the following: 

Includes and Namespace:

  • The program includes headers needed for filesystem operations, file I/O, strings, regular expressions, and input/output streams.

  • It defines a namespace alias fs for std::filesystem.

Main Function:

  • The main function starts by defining the directory name (cpp-app-volume) and initializing a counter (fileCount) for the number of files matching a specific pattern.

  • A regular expression pattern (dataFilePattern) is defined to match filenames like data1.txt, data2.txt, etc.

Directory Creation:

  • The program checks if the directory exists using fs::exists. If it doesn't, it creates the directory using fs::create_directory.

File Iteration and Counting:

  • The program iterates through the files in the specified directory using fs::directory_iterator.

  • For each file, it checks if the filename matches the regular expression pattern. If it does, it increments the fileCount.

File Creation:

  • A new filename is constructed using the pattern dataX.txt, where X is the next number in sequence.

  • The program creates and opens a new file with the constructed name and writes a message indicating the run number.

  • The file is then closed, and a message is printed to the console indicating the created file's name.


Then create the Docker volume so our data outlives any single container:

docker volume create cpp-app-data

This should result in this text at the terminal:

cpp-app-data

To inspect your docker volume:

docker volume inspect cpp-app-data

Which results in this terminal output:

[
    {
        "CreatedAt": "2025-02-07T09:24:05-08:00",
        "Driver": "local",
        "Labels": null,
        "Mountpoint": "/var/lib/docker/volumes/cpp-app-data/_data",
        "Name": "cpp-app-data",
        "Options": null,
        "Scope": "local"
    }
]

Next, build the image from the Dockerfile:

docker build -t cpp-app .

This builds the Docker image. The -t flag tags it with the name cpp-app, and the . at the end sets the build context to the current directory, which is also where Docker looks for the Dockerfile by default.

What happens when you run this command:

  • Docker looks for the Dockerfile in the current directory.

  • Docker executes the instructions in the Dockerfile:

    • It starts with the GCC base image (gcc:13.3.0).

    • It sets the working directory to /usr/src/app.

    • It copies the main.cpp file into the working directory.

    • It compiles the main.cpp file into an executable named myapp.

To verify the image was built successfully:

docker images

Which should result in the terminal spewing out:

REPOSITORY    TAG       IMAGE ID       CREATED         SIZE

cpp-app       latest    9561a544d588   9 seconds ago   1.39GB

Next run the container:

docker run -it --name cpp-container -v cpp-app-data:/usr/src/app/cpp-app-volume cpp-app /bin/bash

This command does the following:

-it:

  • This option makes the container interactive (-i) and allocates a pseudo-TTY (-t), which means you can interact with the container through the terminal.

--name cpp-container:

  • This names your container cpp-container for easy reference.

-v cpp-app-data:/usr/src/app/cpp-app-volume:

  • This mounts the cpp-app-data volume to /usr/src/app/cpp-app-volume inside the container, allowing the container to persistently store and access data in this volume.

cpp-app:

  • This specifies the image to use for the container, which in this case is the cpp-app image you built.

/bin/bash:

  • This overrides the default command for the container and starts a Bash shell, allowing you to interact with the container's file system and run commands within the container.

This should drop you into the docker container with a prompt. At that prompt type:

root@a52d0d0a39ba:/usr/src/app# ls cpp-app-volume

to verify that the first data file has not yet been created. The command should print nothing.

Next run the program a few times:

root@a52d0d0a39ba:/usr/src/app# ./myapp

Created file: cpp-app-volume/data1.txt

root@a52d0d0a39ba:/usr/src/app# ./myapp

Created file: cpp-app-volume/data2.txt

Next verify the files were successfully created:

root@a52d0d0a39ba:/usr/src/app# ls cpp-app-volume

Which should result in the terminal output of:

data1.txt  data2.txt

Next verify the data in our files:

root@a52d0d0a39ba:/usr/src/app# cat cpp-app-volume/data1.txt

This is run number 1

root@a52d0d0a39ba:/usr/src/app# cat cpp-app-volume/data2.txt

This is run number 2

The data is there. Yay!

Now close the container by issuing the exit command:

exit

Then list all containers, including stopped ones:

docker ps -a

CONTAINER ID   IMAGE     COMMAND       CREATED         STATUS                     PORTS     NAMES

a52d0d0a39ba   cpp-app   "/bin/bash"   4 minutes ago   Exited (0) 6 seconds ago             cpp-container

There’s the container we just exited with name cpp-container!

Now create another container and map it to the same volume.

docker run -it --name cpp-container-2 -v cpp-app-data:/usr/src/app/cpp-app-volume cpp-app /bin/bash

The only difference is the container name, which we changed from cpp-container to cpp-container-2. Let’s see if our files are still there:

root@9ef2f0422e7a:/usr/src/app# ls cpp-app-volume/

data1.txt  data2.txt

There they are. Yay!

Exit the container and issue the command docker ps -a again.

docker ps -a

You should see the following:

CONTAINER ID   IMAGE     COMMAND       CREATED         STATUS                     PORTS     NAMES

9ef2f0422e7a   cpp-app   "/bin/bash"   2 minutes ago   Exited (0) 3 seconds ago             cpp-container-2

a52d0d0a39ba   cpp-app   "/bin/bash"   9 minutes ago   Exited (0) 4 minutes ago             cpp-container

And that’s it, you have successfully created your first volume! Note that you can access the volume directly on the host, but only with sudo, because /var/lib/docker is owned by root:

sudo ls /var/lib/docker/volumes/cpp-app-data/_data

data1.txt  data2.txt

Diagram

[Figure: diagram of our image flowing to containers flowing to our little volume.]

Docker Volumes

See here: Mounting a Volume Inside Docker Container - GeeksforGeeks

  • Purpose: Docker named volumes are used to store data independently of the container's lifecycle and are the preferred mechanism for persisting data in Docker.

  • Management: Volumes are managed by Docker and can be created, listed, inspected, and removed using Docker commands.

  • Location: They are stored in a part of the host filesystem that is managed by Docker (/var/lib/docker/volumes on Linux). I don’t have extensive experience with Docker on Windows or Mac, but there Docker commonly runs inside a Linux virtual machine, so the volumes live inside that virtual machine.

Benefits:

  • Data Persistence: Data remains even if the container is removed.

  • Sharing Data: Volumes can be shared between multiple containers.

  • Backup and Restore: Easier to back up and restore data.

  • Decoupling Data from Containers: Separating the application runtime from its data provides more flexibility.

Docker Bind Mounts

  • Purpose: Bind mounts are used to map a directory or file on the host machine to a directory or file in the container.

  • Management: Bind mounts are managed by the host's filesystem and are not managed by Docker.

  • Location: They use any directory or file on the host machine.

Benefits:

  • Direct Access: Changes on the host are reflected in the container and vice versa.

  • Development: Useful for development environments where code changes on the host need to be readily available in the container.

Differences Between Volumes and Bind Mounts

Volumes:

  • Managed by Docker and stored in Docker's directory.

  • Better for data persistence, sharing, and backup.

Bind Mounts:

  • Managed by the host's filesystem.

  • Better for direct access.

Personally, I prefer volumes to bind mounts because they make things easier to share with my teammates. We will not go into much detail about bind mounts because they are pretty similar to volumes. If you want a link covering bind mounts in detail try: https://docs.docker.com/engine/storage/bind-mounts/ or https://stackoverflow.com/questions/47942016/add-bind-mount-to-dockerfile-just-like-volume

We do not go over anonymous Docker volumes because I don’t use them.

Other Topics Related to Volumes

Docker Compose: Docker Compose | Docker Docs

Docker’s layered architecture:
https://www.geeksforgeeks.org/what-is-docker-layered-file-system/
https://docs.docker.com/get-started/docker-concepts/building-images/understanding-image-layers/

.dockerignore file: https://www.geeksforgeeks.org/how-to-use-a-dockerignore-file/

Docker build context: https://docs.docker.com/build/concepts/context/

Docker security access tokens: https://docs.docker.com/security/for-developers/access-tokens/

Docker multi-stage builds: https://docs.docker.com/get-started/docker-concepts/building-images/multi-stage-builds/

Docker environment variables: https://docs.docker.com/compose/how-tos/environment-variables/

Docker pause vs. unpause vs. stop: https://stackoverflow.com/questions/51466148/pause-vs-stop-in-docker

Feedback

As always, do make a comment or write me an email if you have something to say about this post!

Credits

O’Reilly course: Docker and Kubernetes Masterclass: From Beginner to Advanced

https://github.com/lm-academy/docker-course

https://docs.docker.com/engine/install/ubuntu/#install-using-the-convenience-script
