Data Science Hub
  • Data Science Hub
  • STATISTICS
    • Introduction
    • Fundamentals
      • Data Types
      • Central Tendency, Asymmetry, and Variability
      • Sampling
      • Confidence Interval
      • Hypothesis Testing
    • Distributions
      • Exponential Distribution
    • A/B Testing
      • Sample Size Calculation
      • Multiple Testing
  • Database
    • Database Fundamentals
    • Database Management Systems
    • Data Warehouse vs Data Lake
  • SQL
    • SQL Basics
      • Creating and Modifying Tables/Views
      • Data Types
      • Joins
    • SQL Rules
    • SQL Aggregate Functions
    • SQL Window Functions
    • SQL Data Manipulation
      • String Operations
      • Date/Time Operations
    • SQL Descriptive Stats
    • SQL Tips
    • SQL Performance Tuning
    • SQL Customization
    • SQL Practice
      • Designing Databases
        • Spotify Database Design
      • Most Commonly Asked
      • Mixed Queries
      • Popular Websites For SQL Practice
        • SQLZoo
          • World - BBC Tables
            • SUM and COUNT Tutorial
            • SELECT within SELECT Tutorial
            • SELECT from WORLD Tutorial
            • Select Quiz
            • BBC QUIZ
            • Nested SELECT Quiz
            • SUM and COUNT Quiz
          • Nobel Table
            • SELECT from Nobel Tutorial
            • Nobel Quiz
          • Soccer / Football Tables
            • JOIN Tutorial
            • JOIN Quiz
          • Movie / Actor / Casting Tables
            • More JOIN Operations Tutorial
            • JOIN Quiz 2
          • Teacher - Dept Tables
            • Using Null Quiz
          • Edinburgh Buses Table
            • Self join Quiz
        • HackerRank
          • SQL (Basic)
            • Select All
            • Select By ID
            • Japanese Cities' Attributes
            • Revising the Select Query I
            • Revising the Select Query II
            • Revising Aggregations - The Count Function
            • Revising Aggregations - The Sum Function
            • Revising Aggregations - Averages
            • Average Population
            • Japan Population
            • Population Density Difference
            • Population Census
            • African Cities
            • Average Population of Each Continent
            • Weather Observation Station 1
            • Weather Observation Station 2
            • Weather Observation Station 3
            • Weather Observation Station 4
            • Weather Observation Station 6
            • Weather Observation Station 7
            • Weather Observation Station 8
            • Weather Observation Station 9
            • Weather Observation Station 10
            • Weather Observation Station 11
            • Weather Observation Station 12
            • Weather Observation Station 13
            • Weather Observation Station 14
            • Weather Observation Station 15
            • Weather Observation Station 16
            • Weather Observation Station 17
            • Weather Observation Station 18
            • Weather Observation Station 19
            • Higher Than 75 Marks
            • Employee Names
            • Employee Salaries
            • The Blunder
            • Top Earners
            • Type of Triangle
            • The PADS
          • SQL (Intermediate)
            • Weather Observation Station 5
            • Weather Observation Station 20
            • New Companies
            • The Report
            • Top Competitors
            • Ollivander's Inventory
            • Challenges
            • Contest Leaderboard
            • SQL Project Planning
            • Placements
            • Symmetric Pairs
            • Binary Tree Nodes
            • Interviews
            • Occupations
          • SQL (Advanced)
            • Draw The Triangle 1
            • Draw The Triangle 2
            • Print Prime Numbers
            • 15 Days of Learning SQL
          • TABLES
            • City - Country
            • Station
            • Hackers - Submissions
            • Students
            • Employee - Employees
            • Occupations
            • Triangles
        • StrataScratch
          • Netflix
            • Oscar Nominees Table
            • Nominee Filmography Table
            • Nominee Information Table
          • Audible
            • Easy - Audible
          • Spotify
            • Worldwide Daily Song Ranking Table
            • Billboard Top 100 Year End Table
            • Daily Rankings 2017 US
          • Google
            • Easy - Google
            • Medium - Google
            • Hard - Google
        • LeetCode
          • Easy
  • Python
    • Basics
      • Variables and DataTypes
        • Lists
        • Dictionaries
      • Control Flow
      • Functions
    • Object Oriented Programming
      • Restaurant Modeler
    • Pythonic Resources
    • Projects
  • Machine Learning
    • Fundamentals
      • Supervised Learning
        • Classification Algorithms
          • k-Nearest Neighbors
            • kNN Parameters & Attributes
          • Logistic Regression
        • Classification Report
      • UnSupervised Learning
        • Clustering
          • Evaluation
      • Preprocessing
        • Scalers: Standard vs MinMax
        • Feature Selection vs Dimensionality Reduction
        • Encoding
    • Frameworks
    • Machine Learning in Advertising
    • Natural Language Processing
      • Stopwords
      • Name Entity Recognition (NER)
      • Sentiment Analysis
        • Agoda Reviews - Part I - Scraping Reviews, Detecting Languages, and Preprocessing
        • Agoda Reviews - Part II - Sentiment Analysis and WordClouds
    • Recommendation Systems
      • Spotify Recommender System - Artists
  • Geospatial Analysis
    • Geospatial Analysis Basics
    • GSA at Work
      • Web Scraping and Mapping
  • GIT
    • GIT Essentials
    • Connecting to GitHub
  • FAQ
    • Statistics
  • Cloud Computing
    • Introduction to Cloud Computing
    • Google Cloud Platform
  • Docker
    • What is Docker?
Powered by GitBook
On this page
  • a. Components
  • b. Tables
  • c. Relationships
  • d. Tables Structures
  • e. Data Model
  • f. Adding New Information to the Database

Was this helpful?

  1. SQL
  2. SQL Practice
  3. Designing Databases

Spotify Database Design

Last updated 1 year ago

Was this helpful?

a. Components

In this practice, we will be creating a database for my personal favorite! music app, Spotify. To do this, we will need to define the tables for each component and the relationships among them. Let's go over the main components we see when we run Spotify:

  • Users

  • Artists

  • Albums

  • Tracks

  • Playlists

  • Followers

  • Likes

b. Tables

Now that we've defined the major components it is time to create the tables and the data we will be storing in them. We can categorize the table in three groups: user-related, artist-related, and association:

  • User Related Tables

    1. users: Stores data regarding the users, including their names, emails, passwords, dates of birth, and profile images

    2. playlists: Stores data about user-created playlists, including image, playlist's name and its description, and user id that will serve as a foreign key to link it to the users table.

  • Artist Related Tables

    1. artists: Stores data about the artists, including their names, genre, and images.

    2. albums: Stores data regarding albums, including name, release year, image, and an artist id that will relate it it to the artists table.

    3. tracks: Stores data about the tracks like name, duration, and the file path, and the album id as a foreign key to link the tracks to the album table.

  • Association Tables

    1. playlists_and_tracks: shows us the connection between playlists and tracks.

    2. follows: shows us the relation between users and artists.

    3. likes: Stores relationship info between the users and the tracks they 'liked', including date and time, user id and track id.

Needless to say, each table has their own unique identifiers such as Primary Keys.

c. Relationships

Now that we have our tables, it is now time to establish the relationships among tables. There are three main relationships in a database:

  • one-to-one.

  • one-to-many.

  • many-to-many

In our design, we will have the last two relationships among our tables: one-to-many, or many-to-many:

  1. One-to-Many:

    1. Users and Playlists: One user may have none, one, or multiple playlists

    2. Artists and Albums: One artist may have none, one, or multiple albums

    3. Albums and Tracks: One album may have none, one, or multiple tracks

  2. Many-to-Many:

    1. Playlists and Tracks (playlists_and_tracks): One playlist can have any number of tracks, and one track can be part of any number of playlists.

    2. Users and Artists (follows): One user may be following any number of artists, and one artist may be followed by any number of users.

    3. Users and Tracks (likes): One user may like any number of tracks, and one track may be liked by any number of users.

Through these tables and relationships, we can readily access and manipulate data pertaining to users, artists, albums, tracks, playlists, followers, and likes.

d. Tables Structures

CREATE TABLE IF NOT EXISTS users (
	user_id SERIAL PRIMARY KEY,
	name TEXT NOT NULL,
	email VARCHAR(50) NOT NULL UNIQUE, 
	password VARCHAR(50) NOT NULL, 
	dob DATE,
	image BYTEA
);


CREATE TABLE IF NOT EXISTS playlists (
	playlist_id SERIAL PRIMARY KEY,
	user_id INT,
	name VARCHAR(50) NOT NULL,
	Image BYTEA,
	FOREIGN KEY (user_id) REFERENCES users(user_id)
);


CREATE TABLE IF NOT EXISTS artists (
	artist_id SERIAL PRIMARY KEY,
	name TEXT NOT NULL,
	genre TEXT NOT NULL,
	image BYTEA
);


CREATE TABLE IF NOT EXISTS albums (
	album_id SERIAL PRIMARY KEY,
	artist_id INT,
	name VARCHAR(50) NOT NULL,
	date_of_release DATE,
	image BYTEA,
	FOREIGN KEY (artist_id) REFERENCES artists(artist_id)
);


CREATE TABLE IF NOT EXISTS tracks (
	track_ID SERIAL PRIMARY KEY,
 	album_id INT,
  	name VARCHAR(50) NOT NULL,
  	duration INT NOT NULL,
  	path VARCHAR(255),
  	FOREIGN KEY (album_id) REFERENCES albums(album_id)
);

CREATE TABLE IF NOT EXISTS playlists (
	playlist_id SERIAL PRIMARY KEY,
	user_id INT,
	name VARCHAR(50) NOT NULL,
	descripton TEXT,
	Image BYTEA,
	FOREIGN KEY (user_id) REFERENCES users(user_id)
);


CREATE TABLE IF NOT EXISTS playlists_and_tracks (
	playlist_id INT,
 	track_id INT,
	"order" INT,
	PRIMARY KEY (playlist_id, track_id),
	FOREIGN KEY (playlist_id) REFERENCES playlists(playlist_id),
	FOREIGN KEY (track_id) REFERENCES tracks(track_id)
);


CREATE TABLE IF NOT EXISTS follows (
	user_ID INT,
	artist_ID INT,
	PRIMARY KEY (user_ID, artist_ID),
	FOREIGN KEY (user_ID) REFERENCES users(user_ID),
	FOREIGN KEY (artist_ID) REFERENCES Artists(artist_ID)
);


CREATE TABLE IF NOT EXISTS likes (
	user_id INT,
	track_id INT,
	liked_date_time TIMESTAMP,
	PRIMARY KEY (user_id, track_id),
	FOREIGN KEY (user_id) REFERENCES users(user_id),
	FOREIGN KEY (track_id) REFERENCES tracks(track_id)
);

e. Data Model

The following diagram shows the table structures and the relationships among them:

Let us see the characteristics of our entities in data model:

  • User: Represents a user, and includes information like user ID, name, email, password, date of birth, and image.

  • Artist: Represents an artist, and includes information like artist ID, name, genre, and image.

  • Album: Represents an album, and includes information like album ID, artist ID, name, release date, and image.

  • Track: Represents a track, and includes information such as track ID, album ID, name, duration, and path.

  • Playlist: Represents a user's playlist, and includes information like playlist ID, user ID, name, description and image.

  • Playlist_Track: Represents the relationship between a playlist and a track, includes information like playlist ID, track ID, and order.

  • Follow: Represents the relationship between a user and an artist, where a user can follow any number of artists and an artist can be followed by any number of users.

  • Like: Represents the relationship between a user and a track, where a user may like any number of tracks, and a track may be liked by any number of users.

Congratulations! By defining our data model that shows the relationships among the tables that we have described in the table structures and relationships sections, we have concluded our (basic) Spotify database design. By designing our database this way, we have normalized the data that we will be storing in the database, i.e. eliminated any prospect redundant data entry and enabled an improved data retrieval and manipulation.

f. Adding New Information to the Database

  • Freemium (Free with Ads)

  • Premium (Paid with No Ads)

The above database design was a simplified version of Spotify and didn't consist of information regarding to the users' subscription types. In this part, we will be including this information to our tables and updating our database design. Because the premium plans requires payment we will be adding tables regarding payments as well.

1. Adding Subscription Type Information to the Database

Adding subscription/user type to the users table is fairly simple. Let us start by giving every user a default subscription/user type of 'freemium':

ALTER TABLE users ADD COLUMN user_type VARCHAR(50) NOT NULL DEFAULT 'freemium';

It is now time to define what 'freemium' package or paid premium plans offer to listeners. We know that Spotify offers different features, some of which can be seen below, for different subscription types, and those features may be subjected to the changes or plans may have additional features in the feature.

Feature
Spotify free (Freemium)
Premium

Over 80 million songs

✓

✓

Podcasts and audiobooks

✓

✓

Travel abroad with your music

For up to 14 days.

✓

Pick and play any track on mobile

Available on select playlists.

✓

Play songs in any order

✓

Ad free music

✓

Download music

✓

Download podcasts

✓

✓

Highest music quality

✓

Therefore it is best to first create a separate table that includes only these features:

CREATE TABLE IF NOT EXISTS features (
	feat_id INT SERIAL PRIMARY KEY,
	name VARCHAR(50),
);   

Now, let us incorporate the freemium and premium plans and create an association table to connect features to their respective packages:

-- Table for Freemium and Premium packages
CREATE TABLE IF NOT EXISTS packages (
	pack_id SERIAL PRIMARY KEY,
	name VARCHAR(50) NOT NULL,
	price NUMERIC(4,2),
	number_of_accounts INT,
	description TEXT,
);

-- Association table between Features and Packages
CREATE TABLE IF NOT EXISTS package_features(
	pack_id INT,
	feat_id INT,
	PRIMARY KEY (pack_id, feat_id),
	FOREIGN KEY (pack_id) REFERENCES packages(pack_id),
	FOREIGN KEY (feat_id) REFERENCES features(feat_id)
);

Now that we have the packages along with their features, we can assign packages to the subscibers:

CREATE TABLE IF NOT EXISTS user_package (
	user_id INT,
	pack_id INT,
	pack_start_date TIMESTAMP,
	pack_end_date TIMESTAMP,
	PRIMARY KEY (user_id, pack_id),
	FOREIGN KEY (user_id) REFERENCES users(user_id),
	FOREIGN KEY (pack_id) REFERENCES packages(pack_id)
);

By adding the tables above, we have now defined two different subscription types, including freemium and premium and assigned different features to listeners. It is time for us to add the last feature which is the payments information for the premium users.

2. Adding Payments Information for Premium Users to the Database

We will create a table to store all payment information:

CREATE TABLE IF NOT EXISTS payments (
	pay_id SERIAL PRIMARY KEY,
	user_id INT NOT NULL,
	payment_method VARCHAR(50) NOT NULL,
	payment_date TIMESTAMP NOT NULL,
	amount DECIMAL(5, 2) NOT NULL,
	FOREIGN KEY (user_id) REFERENCES users(user_id)
);

By adding the payment information to our system we have concluded our Spotify Database Design, and our data model became as follows:

Once again, our design is by no means the ultimate system and can be improved further by adding different entities like recommendations, notifications, etc.

IBM's summary is a great resource to check

Let us display the tables' structures and create them simultaneously with the following queries in :

As we know Spotify offers two types of :

The also consists of based on users' needs and budget: Premium Individual, Premium Student, Premium Duo, Premium Family.

https://www.ibm.com/docs/en/mam/7.6.0?topic=structure-database-relationships
PostgreSQL
subscriptions
premium subscriptions/plan
four different packages
Page cover image
Yo-Yo Ma Spotify Profile
Data Model
Data Model expanded
Spotify Premium Packages