Data Science Hub

What is Docker?

Docker is an open-source virtualization software that simplifies app development and deployment. It packages an application into what is called a container, containing everything it needs to run, including the code, libraries, dependencies, and the runtime and environment configuration. This all-in-one approach, having everything in a single Docker package, makes sharing and distributing applications much easier. It eliminates the "it works on my machine" problem. It supports a wide range of tools and integrates seamlessly with popular continuous integration and continuous deployment (CI/CD) frameworks.

Additionally, Docker Hub, a large registry of pre-built images, accelerates application development by letting developers pull ready-to-use components instead of building them from scratch. This ecosystem enables faster iteration, testing, and deployment cycles, which reduces time-to-market for software products. Docker has become a staple of DevOps practice worldwide.

Terminology: Docker Images vs Docker Containers

A Docker image is a read‑only snapshot that contains everything your application needs to run: the code or scripts, all required libraries and dependencies, configuration files, environment variables, and startup commands. A Docker container, on the other hand, is a live, running instance of that image. Here are the key differences:

| Aspect | Docker Image | Docker Container |
|---|---|---|
| Mutability | Immutable (read‑only) | Mutable (has its own writable layer) |
| Purpose | Packaging & distribution | Execution & runtime |
| Lifecycle | Built once, stored in a registry (e.g., Docker Hub) | Can be started, stopped, deleted, recreated repeatedly |
| Resource isolation | N/A | Namespaces isolate processes, network, and filesystem; cgroups limit CPU and memory |
| Storage | Stored as layered filesystems (efficient deltas) | Adds one extra “container layer” for writes |
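The relationship between read-only image layers and a container's writable layer can be sketched in a few lines of Python. This is only a toy model of the idea, not how Docker is implemented: the layer contents, the `IMAGE_LAYERS` tuple, and the `start_container` helper are all hypothetical names chosen for illustration.

```python
from collections import ChainMap
from types import MappingProxyType

# Hypothetical image: two read-only layers, lowest first (path -> content).
base_layer = MappingProxyType({"/bin/sh": "shell binary", "/etc/os-release": "debian"})
app_layer = MappingProxyType({"/app/main.py": "print('hello')"})
IMAGE_LAYERS = (base_layer, app_layer)


def start_container(image_layers):
    """Return a filesystem view: a fresh writable layer over the read-only image layers."""
    writable_layer = {}
    # ChainMap looks up keys left to right, so the writable layer shadows the
    # image, and later image layers shadow earlier ones. Writes always go to
    # the first mapping, i.e. the container's own layer.
    return ChainMap(writable_layer, *reversed(image_layers))


# Two containers started from the same image.
container_a = start_container(IMAGE_LAYERS)
container_b = start_container(IMAGE_LAYERS)

# Changes made in container A land only in A's writable layer.
container_a["/app/main.py"] = "print('patched')"
container_a["/tmp/cache"] = "scratch data"

print(container_a["/app/main.py"])  # container A sees its own change
print(container_b["/app/main.py"])  # container B still sees the image's file
print(app_layer["/app/main.py"])    # the image layer itself is untouched
```

In this model, deleting a container corresponds to throwing away its writable dictionary, while the shared image layers remain available for the next container.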

To understand why Docker is a big deal, we need to understand how applications were developed and deployed before Docker.

Development Process Before Containers

When a team of developers worked on an application, each of them had to install all the services the application needs directly on their operating system. For example, they would need:

  • PostgreSQL for database (e.g. v.17.4)

  • Redis for caching (e.g. v.6.2)

  • Mosquitto for messaging (e.g. v.2.0)

  • etc.

They need all these services locally, meaning every developer on the team needs to install and configure all of them, with the same versions, in their development environment so they can develop and test the application. The installation process also differs by operating system. For example, installing PostgreSQL on Windows is different from installing it on a Mac.

As you can tell, setting up a development environment can be tedious, particularly for complex applications with multiple services. If the application relies on 15 services, each must be installed individually, and the process may vary between team members depending on their operating systems.

Development Process with Containers

Let us now see how containers solve some of these problems. With containers, we do not have to install any of the services directly on our operating system, because Docker packages each service in its own isolated environment. For instance, a specific version of PostgreSQL is packaged together with its configuration inside an image. As developers, we don't have to look for binaries to download and install on our machine. Instead, we start the service as a Docker container with a single Docker command, which fetches the image from a registry on the internet and starts it on our computer. The command has the same form regardless of which operating system we are on and regardless of which service we are starting.

If our application depends on 10 services, we just have to run 10 Docker commands, one per container, and that is it.

docker run -d -e POSTGRES_PASSWORD=change-me postgres:17.4
docker run -d redis:6.2
etc.

As you can see, Docker standardizes the process of running any service in our development environment, so we can focus on development instead of installing and configuring services on our machine. This makes setting up a local development environment much faster and easier than without containers. Additionally, with Docker we can run different versions of the same application side by side in our local environment without conflicts.
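To make the "same command shape for every service" point concrete, here is a small Python sketch that only builds `docker run` command lines; it does not execute them. The `docker_run_command` helper, the container names, the `change-me` password, the port choices, and the `postgres:16` tag are assumptions made for illustration. The flags used (`-d`, `--name`, `-e`, `-p`) are standard `docker run` options.

```python
import shlex


def docker_run_command(image, tag, name, env=None, ports=None):
    """Build (but do not execute) a `docker run` argument list for a pinned image tag."""
    argv = ["docker", "run", "-d", "--name", name]
    for key, value in (env or {}).items():
        argv += ["-e", f"{key}={value}"]               # environment variable
    for host_port, container_port in (ports or {}).items():
        argv += ["-p", f"{host_port}:{container_port}"]  # host:container port mapping
    argv.append(f"{image}:{tag}")
    return argv


# Hypothetical service list for one project. Two PostgreSQL versions run side
# by side because each container gets its own name and its own host port.
SERVICES = [
    dict(image="postgres", tag="17.4", name="pg17",
         env={"POSTGRES_PASSWORD": "change-me"}, ports={5432: 5432}),
    dict(image="postgres", tag="16", name="pg16",
         env={"POSTGRES_PASSWORD": "change-me"}, ports={5433: 5432}),
    dict(image="redis", tag="6.2", name="cache", ports={6379: 6379}),
]

commands = [docker_run_command(**service) for service in SERVICES]
for argv in commands:
    print(shlex.join(argv))
```

Keeping such a list in the project repository is one way to give every team member the same pinned versions; the printed lines can be pasted into any terminal where Docker is installed.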

Deployment Process Before Containers

Before containers, a traditional deployment process looked like this:

  • The development team would produce an application artifact or package (for example, a .jar file for a Java application), together with a set of instructions on how to install and configure it on the server.

    • In addition, there would be a database or other services that the application needs, each with its own set of instructions on how to set it up on the server so that the application can connect to it and use it.

  • The operations team would handle installing and configuring the application and all the services it depends on, such as the database.

The first problem with this approach is that everything has to be installed and configured directly on the server's operating system, which can be tedious and error-prone, as explained above.

The second problem is miscommunication between the development and operations teams. Because everything is described in a textual guide, such as an instruction list or checklist for configuring and running the application, developers may forget to mention an important configuration step. When that step fails, the operations team has to go back to the developers, which can lead to back-and-forth communication until the application is successfully deployed on the server.

Deployment Process with Containers

With containers, this process is simplified because the developers create an application package that includes not only the code itself but also all the dependencies and the configuration for the application. Instead of writing these down in a textual document, they package all of it inside the application artifact. Since everything is already encapsulated in one environment, the operations team doesn't have to configure any of the needed parts directly on the server.

This makes the whole process much easier and leaves less room for errors. The only thing the operations team needs to do is run a Docker command that fetches the image the developers created and runs it on the server, the same way they run any other service the application needs.

The operations team does have to install and set up the Docker runtime on the server before they can run containers, but that is a one-time effort for a single technology. Once the Docker runtime is installed, they can simply run Docker containers on that server.

How Does Docker Work, and How Does It Differ from Virtual Machines?

Docker is a virtualization tool, just like a virtual machine (VM), and virtual machines have been around for a long time. So why did Docker become so widely adopted? What advantages does it have over virtual machines, and what is the difference between the two?

To answer these questions, we need to understand how Docker works on a technical level. As mentioned above, with Docker we don't need to install services directly on the operating system. In that case, how does Docker run its containers on an operating system?

How Does an Operating System Work?

An operating system has two main layers:

  • Operating System Kernel (OS Kernel)

  • Operating System Applications layer (OS Applications Layer)

The OS Kernel is the part that communicates with hardware components such as CPU, memory, and storage. When we install an operating system on a physical machine, the kernel is the one talking to the hardware to allocate resources like CPU, memory, and storage to the applications running on that operating system. Those applications are part of the Applications layer, and they run on top of the Kernel layer.

The kernel is a middleman between the applications we see when we interact with our computer and the underlying hardware.
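As a rough illustration of this middleman role, the Python sketch below never touches hardware directly. Every call in it goes through the `os` module, which asks the operating system to do the work. The temporary file and the `hello kernel` payload are arbitrary choices for the example.

```python
import os
import tempfile

# The application asks the operating system about itself and the machine.
pid = os.getpid()        # process management: which process am I?
cpus = os.cpu_count()    # CPU count as reported by the OS (may be None)

# File I/O goes through kernel-managed file descriptors; the program never
# talks to the disk itself.
fd, path = tempfile.mkstemp()
try:
    written = os.write(fd, b"hello kernel")  # request: write these bytes
    os.lseek(fd, 0, os.SEEK_SET)             # request: move back to the start
    data = os.read(fd, 100)                  # request: read the bytes back
finally:
    os.close(fd)
    os.remove(path)

print(f"process {pid} wrote {written} bytes and read back {data!r}")
```

A process inside a container makes the same kind of requests; the difference, discussed next, is which kernel answers them.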

Components of an Operating System

An operating system (OS) consists of several critical components that work together to manage computer resources and provide a user interface. Here are the main components:

  1. Kernel: The core of the OS, it manages memory, processes, and hardware devices, and facilitates communication between them.

  2. Process Management: Handles the creation, scheduling, and termination of processes, ensuring efficient CPU usage and multitasking capabilities.

  3. Memory Management: Oversees the allocation and deallocation of memory space, ensuring efficient use and reclaiming memory when processes exit.

  4. File System: Manages data storage, organization, retrieval, and access permissions for files and directories.

  5. Device Drivers: Specialized software modules that allow the OS to communicate with hardware components, like printers, graphics cards, and network cards.

  6. User Interface: Provides communication between the user and the hardware, typically through a graphical user interface (GUI) or command line interface (CLI).

  7. Security and Access Control: Protects the system from unauthorized access and ensures data privacy through user authentication and permission settings.

These components are essential for the smooth operation of an OS, providing a stable and user-friendly environment for executing applications.

Docker and virtual machines are both virtualization tools. The question is which part of the operating system they virtualize, and that is where the main difference between them lies:

  • Docker virtualizes the Applications layer:

    • When we run a Docker container, it contains the Applications layer of the operating system and some other applications installed on top of it, such as a Java runtime or Python. It uses the kernel of the host because it doesn't have its own kernel.

  • A virtual machine has the Applications layer and its own kernel, so it virtualizes the complete operating system:

    • This means that when we run a virtual machine image on our host, it doesn't use the host kernel; it boots its own.

Therefore,

  • Docker images are much smaller because they only have to include one layer of the operating system.

    • Docker images typically range from a few MB to a few hundred MB

    • VM images can be a couple of GBs

    • This means that working with Docker saves a lot of disk space.

  • Docker containers start much faster than VMs, because a virtual machine has to boot its own kernel every time it starts, while a Docker container reuses the host kernel and only starts the Applications layer on top of it.

    • VMs generally need from tens of seconds to a couple of minutes to start

    • Docker containers usually start in about a second or less

  • The third difference is compatibility.

    • We can run a VM image of any operating system on a host running any other operating system, e.g. a Linux VM on a Windows machine.

    • We can't do that with Docker, at least not directly. For instance, a Linux-based Docker image cannot use a Windows kernel, because a Linux Applications layer cannot run on a Windows kernel; it needs a Linux kernel.

A fun fact is that Docker was originally written and built for Linux. Later, Docker developed Docker Desktop for Windows and macOS, which made it possible to run Linux-based containers on Windows and Mac computers. Docker Desktop uses a hypervisor layer with a lightweight Linux distribution on top of it to provide the needed Linux kernel, which makes running Linux-based containers possible on Windows and macOS.
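One simple way to observe which kernel a program is running on is to ask the operating system for its kernel release. The sketch below uses Python's standard `platform` module; the `kernel_info` helper name is just an illustrative choice.

```python
import platform


def kernel_info():
    """Return the OS family and kernel release reported by the running kernel."""
    return {"system": platform.system(), "release": platform.release()}


info = kernel_info()
print(f"{info['system']} kernel {info['release']}")
```

On a Linux host, running this script directly and then inside a container is expected to print the same kernel release, since the container has no kernel of its own. Under Docker Desktop on Windows or macOS, a Linux container would instead report the kernel of the lightweight Linux environment described above, not the Windows or macOS kernel.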
