Database Fundamentals

What is a Database

A database is a container and designed to efficiently store, retrieve, and manipulate large amounts organized of data.

The term "database" is frequently misused to refer to the database software being used. However, this is an incorrect usage and often leads to confusion. The correct term for the software is the Database Management System (DBMS). The database itself is the container that is created and manipulated through the DBMS. The nature and structure of the database can vary between different databases.

Following information is for Relational Databases, which will be discussed in Database Management Systems section.

What is a Schema?

In SQL, the database schema defines the structure and characteristics of the database objects within a database system. It specifies the names of the objects, the columns or fields they contain, the data types of those columns, and any constraints or rules that govern the data. For instance, the 'year' feature of almost all tables is restricted to hold only Integer values, while a 'name' column may contain only string values. This predefined structure enables a database to maintain efficiency and consistency, even when storing millions or billions of rows.

What is a Table?

A table in a database represents a structured collection of related data. It is a two-dimensional arrangement of rows and columns, where each row represents a specific record or instance of data, and each column represents a particular attribute or field of that data, and has an associated datatype, i.e. type of data the column can contain, defined in the schema. Every table in a database has a unique name that identifies it.

Hint: Breaking Up Data

It is extremely important to break data into multiple columns correctly. For example, city, state, and ZIP code should always be separate columns. By breaking these out, it becomes possible to sort or filter data by specific columns (for example, to find all customers in a particular state or in a particular city). If city and state are combined into one column, it would be difficult to sort or filter by state.

Primary Key (PK)

In a table, it is essential for each row to have a column (or set of columns) that uniquely identifies it. This ensures that no two rows have identical identification values. For instance, a table storing customer information may utilize a customer ID column as the unique identifier, while an orders table may employ an order ID.

The column (or set of columns) responsible for uniquely identifying each row in a table is referred to as the primary key (PK). It serves as a means to reference and distinguish individual rows. When a primary key is absent, updating or deleting specific rows becomes challenging, as there is no reliable method to accurately identify the intended rows to be affected.

Having a primary key in a table guarantees the ability to precisely reference and manipulate rows, ensuring the integrity and reliability of the data within the table.

The primary key in a table can be chosen from any column, provided that it adheres to the following conditions:

  • No two rows can share the same primary key value.

  • Each row must have a PK value, meaning primary key columns cannot have NULL values.

  • Modifying or updating primary key values are not allowed

  • Primary key values should not be reused. If a row is deleted, its primary key should not be assigned to any new rows in the future.

While primary keys are commonly defined on a single column, it is also possible to use multiple columns as a composite primary key. In such cases, the aforementioned rules apply to all the columns, and the combination of values from these columns must be unique (individual columns do not need to have unique values).

Hint: Always Assign Primary Keys

It is crucial to establish a primary key in a table to ensure the uniqueness, integrity, and reliable identification of rows.

Foreign Key (FK)

A foreign key is a column in a table whose values must be listed in a primary key in another table.

What is View?

A view is a virtual table that is derived from the data in one or more underlying tables. It does not store any data itself but presents the data in a structured and organized format, allowing users to query and manipulate it as if it were a physical table.

Views offer several advantages:

  1. Simplify complex queries: Views can abstract away the complexity of joining multiple tables or applying filters by encapsulating them into a single virtual table.

  2. Data security and access control: Views can be used to provide restricted access to certain columns or rows, allowing users to see only the data they are authorized to access.

  3. Data abstraction and consistency: Views can present a subset or combination of columns from different tables, providing a unified and consistent view of the data.

  4. Query reusability: Views can encapsulate frequently used queries, enabling them to be reused by different users or applications without the need to rewrite the underlying query.

Last updated