What is Normalisation in a database?

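Normalization is the process of organizing a database into related tables so that each fact is stored only once, which reduces redundancy and update errors. Let's consider an example to illustrate it.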

Suppose we have a database that stores information about books and authors. Initially, we might design a single table to hold all the information:

Table: Books

Book ID	Title	Author	Year Published	Genre
1	Book A	Author X	2005	Fiction
2	Book B	Author Y	2010	Non-Fiction
3	Book C	Author X	2015	Fiction

In this scenario, we can observe some data redundancy. The author's name "Author X" is repeated for multiple books. If an author changes their name or other details, we would need to update all the relevant rows, which can be error-prone and inefficient.

To address this, we can normalize the database by creating separate tables and establishing relationships between them. Here's an example of a normalized design:

Table: Authors

Author ID	Author Name
1	Author X
2	Author Y

Table: Books

Book ID	Title	Year Published	Genre	Author ID
1	Book A	2005	Fiction	1
2	Book B	2010	Non-Fiction	2
3	Book C	2015	Fiction	1

In this normalized design, we have moved the authors' information into its own "Authors" table. Each author is assigned a unique "Author ID", which is then used as a foreign key in the "Books" table to link each book to its author.

By normalizing the database, we eliminate data redundancy. Now, if we need to update an author's information, we only need to modify a single row in the "Authors" table.
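The single-row update can be sketched with Python's built-in sqlite3 module. This is a minimal illustration, not a prescribed schema: the lowercase table and column names and the new author name are assumptions made for the example.

```python
import sqlite3

# In-memory database; table and column names are hypothetical,
# chosen to mirror the "Authors" and "Books" tables above.
conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
    CREATE TABLE authors (
        author_id   INTEGER PRIMARY KEY,
        author_name TEXT NOT NULL
    );
    CREATE TABLE books (
        book_id        INTEGER PRIMARY KEY,
        title          TEXT NOT NULL,
        year_published INTEGER,
        genre          TEXT,
        author_id      INTEGER NOT NULL REFERENCES authors(author_id)
    );
    INSERT INTO authors VALUES (1, 'Author X'), (2, 'Author Y');
    INSERT INTO books VALUES
        (1, 'Book A', 2005, 'Fiction', 1),
        (2, 'Book B', 2010, 'Non-Fiction', 2),
        (3, 'Book C', 2015, 'Fiction', 1);
""")

# Renaming an author touches one row in "authors";
# no row in "books" has to change. The new name is made up.
cur = conn.execute(
    "UPDATE authors SET author_name = ? WHERE author_id = ?",
    ("Author X (renamed)", 1),
)
updated_rows = cur.rowcount
conn.commit()

author_names = [
    row[0]
    for row in conn.execute("SELECT author_name FROM authors ORDER BY author_id")
]
book_count = conn.execute("SELECT COUNT(*) FROM books").fetchone()[0]
print(updated_rows, author_names, book_count)
```

Both of Author X's books pick up the new name through the foreign key, even though only one row was written.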

Why Normalisation?

Normalisation also provides other benefits. For example, if we want to query all books written by a specific author, we can easily join the "Books" and "Authors" tables using the "Author ID" column.
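As a rough sketch of such a query, the snippet below joins the two tables on the author ID using Python's sqlite3 module. The helper names `build_demo_db` and `books_by_author`, and the lowercase table and column names, are hypothetical and exist only for this example.

```python
import sqlite3


def build_demo_db() -> sqlite3.Connection:
    """Create an in-memory copy of the example tables (hypothetical names)."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE authors (
            author_id   INTEGER PRIMARY KEY,
            author_name TEXT NOT NULL
        );
        CREATE TABLE books (
            book_id   INTEGER PRIMARY KEY,
            title     TEXT NOT NULL,
            author_id INTEGER NOT NULL REFERENCES authors(author_id)
        );
        INSERT INTO authors VALUES (1, 'Author X'), (2, 'Author Y');
        INSERT INTO books VALUES
            (1, 'Book A', 1), (2, 'Book B', 2), (3, 'Book C', 1);
    """)
    return conn


def books_by_author(conn: sqlite3.Connection, author_name: str) -> list[str]:
    """Return the titles written by one author, found via a join."""
    rows = conn.execute(
        """
        SELECT b.title
        FROM books AS b
        JOIN authors AS a ON a.author_id = b.author_id
        WHERE a.author_name = ?
        ORDER BY b.book_id
        """,
        (author_name,),
    )
    return [title for (title,) in rows]


conn = build_demo_db()
print(books_by_author(conn, "Author X"))
```

Passing the author name as a bound parameter (`?`) rather than formatting it into the SQL string keeps the query safe from injection.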

By breaking down the data into separate tables and establishing relationships, normalization helps ensure data consistency, reduce data redundancy, and facilitate efficient querying and maintenance of the database.

What is Atomicity?

Atomicity means that a transaction is treated as a single, indivisible unit of work: either all of its operations take effect, or none of them do. It is enforced by the database management system (DBMS) through transaction management mechanisms, such as transaction logs and rollback operations. If any part of a transaction fails or encounters an error, the DBMS rolls back the entire transaction, undoing all the changes made during that transaction.

For example, consider a banking system where a transaction involves transferring funds from one account to another. Atomicity ensures that if the withdrawal from the sender's account is successful, the deposit into the recipient's account will also be completed. If any error occurs during either operation, the entire transaction is rolled back, and the accounts are left unchanged.
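The transfer scenario can be sketched with Python's sqlite3 module, where using the connection as a context manager commits on success and rolls back if an exception is raised. The `accounts` table, the `transfer` and `balances` helpers, and the amounts are all assumptions made for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE accounts ("
    " id INTEGER PRIMARY KEY,"
    " balance INTEGER NOT NULL CHECK (balance >= 0))"
)
conn.execute("INSERT INTO accounts VALUES (1, 100), (2, 50)")
conn.commit()


def transfer(conn, sender, recipient, amount):
    """Move funds between accounts as one all-or-nothing transaction."""
    # "with conn" commits if the block finishes and rolls back if it raises.
    with conn:
        conn.execute(
            "UPDATE accounts SET balance = balance - ? WHERE id = ?",
            (amount, sender),
        )
        cur = conn.execute(
            "UPDATE accounts SET balance = balance + ? WHERE id = ?",
            (amount, recipient),
        )
        if cur.rowcount != 1:
            # The withdrawal above already ran; raising here undoes it.
            raise ValueError("recipient account not found")


def balances(conn):
    return dict(conn.execute("SELECT id, balance FROM accounts ORDER BY id"))


transfer(conn, 1, 2, 30)
after_success = balances(conn)

try:
    transfer(conn, 1, 99, 30)  # account 99 does not exist
except ValueError:
    pass
after_failure = balances(conn)

print(after_success, after_failure)
```

In the failed transfer the withdrawal statement has already executed when the error is raised, yet the sender's balance is unchanged afterwards because the whole transaction is rolled back.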

By maintaining atomicity, databases can ensure data consistency and integrity, providing reliable and predictable behavior when executing transactions. It helps to prevent data corruption and maintain the accuracy and reliability of the database.

What are normal forms (1NF, 2NF, and 3NF)?

Let's dive deeper into each of the normal forms (1NF, 2NF, and 3NF) with examples.

First Normal Form (1NF):
- 1NF ensures that each column in a table contains only atomic values and there are no repeating groups or arrays within the table.
- Atomic values cannot be further divided. Each value in a column should be indivisible.
- Here's an example to illustrate achieving 1NF:

Table: Students

Student ID	Name	Courses
1	John Doe	Math, Science
2	Jane Smith	English, Math

In the initial design, the "Courses" column contains multiple values separated by commas. To achieve 1NF, we need to break down the courses into separate rows:

Table: Students

Student ID	Name	Course
1	John Doe	Math
1	John Doe	Science
2	Jane Smith	English
2	Jane Smith	Math

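By separating the courses into individual rows, we ensure that every column holds a single atomic value and eliminate the repeating groups within the table. Each row is now identified by the combination of "Student ID" and "Course".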
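The conversion can be sketched in a few lines of Python. The function name `to_first_normal_form` and the tuple layout are assumptions made for this example.

```python
# Rows as (student_id, name, comma-separated courses), as in the first table.
students = [
    (1, "John Doe", "Math, Science"),
    (2, "Jane Smith", "English, Math"),
]


def to_first_normal_form(rows):
    """Emit one (student_id, name, course) row per course."""
    result = []
    for student_id, name, courses in rows:
        for course in courses.split(","):
            # strip() drops the space that follows each comma
            result.append((student_id, name, course.strip()))
    return result


atomic_rows = to_first_normal_form(students)
for row in atomic_rows:
    print(row)
```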

Second Normal Form (2NF):
- 2NF builds upon 1NF and requires that every non-key attribute depends on the entire primary key.
- This matters when the primary key is composite: no non-key attribute may depend on just part of the key.
- Here's an example to illustrate achieving 2NF:

Table: Orders

Order ID	Customer ID	Product ID	Product Name	Quantity
1	101	1	Book A	2
2	101	2	Book B	1
3	102	1	Book A	3

In this example, the primary key is composed of both "Order ID" and "Product ID". However, the "Product Name" is dependent on the "Product ID" only, not the entire primary key. To achieve 2NF, we can split the table into two:

Table: Orders

Order ID	Customer ID	Product ID	Quantity
1	101	1	2
2	101	2	1
3	102	1	3

Table: Products

Product ID	Product Name
1	Book A
2	Book B

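By creating a separate "Products" table, we ensure that "Product Name" depends on the whole primary key of its own table ("Product ID"). Note that with the composite key described above, "Customer ID" depends only on "Order ID", so a strict 2NF design would also move it into its own table keyed by "Order ID".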
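A minimal Python sketch of this split is shown below. The function name `split_products` and the tuple layout are hypothetical; the consistency check is an extra safeguard added for the example, not something the tables above require.

```python
# Rows as (order_id, customer_id, product_id, product_name, quantity).
orders = [
    (1, 101, 1, "Book A", 2),
    (2, 101, 2, "Book B", 1),
    (3, 102, 1, "Book A", 3),
]


def split_products(rows):
    """Move product_name, which depends only on product_id, to its own table."""
    products = {}      # product_id -> product_name
    order_lines = []   # (order_id, customer_id, product_id, quantity)
    for order_id, customer_id, product_id, product_name, quantity in rows:
        known_name = products.setdefault(product_id, product_name)
        if known_name != product_name:
            # The redundant copies disagree: an update anomaly in the input.
            raise ValueError(f"conflicting names for product {product_id}")
        order_lines.append((order_id, customer_id, product_id, quantity))
    return order_lines, products


order_lines, products = split_products(orders)
print(order_lines)
print(products)
```

The check for conflicting names shows the kind of inconsistency that the unnormalized table allows and the split design rules out.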

Third Normal Form (3NF):
- 3NF extends 2NF and ensures that there are no transitive dependencies between non-key attributes.
- Non-key attributes should depend only on the primary key and not on other non-key attributes.
- Here's an example to illustrate achieving 3NF:

Table: Employees

Employee ID	Name	Department	Department Location
1	John Doe	HR	New York
2	Jane Smith	IT	San Francisco
3	Alex Johnson	Sales	London

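In this example, the "Department Location" attribute depends on the "Department" attribute, which is not the primary key, so it depends on "Employee ID" only transitively. To achieve 3NF, we can split the table into two, keeping "Department" in the "Employees" table as a foreign key:

Table: Employees

Employee ID	Name	Department
1	John Doe	HR
2	Jane Smith	IT
3	Alex Johnson	Sales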

Table: Departments

Department	Department Location
HR	New York
IT	San Francisco
Sales	London

By creating a separate "Departments" table, we remove the transitive dependency of "Department Location" on "Employee ID" through "Department", so that every non-key attribute depends only on the primary key of its own table.
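The effect of the split can be sketched with Python's sqlite3 module. This sketch assumes that the employees table carries a department column as a foreign key; the lowercase names, the `employee_locations` helper, and the new location 'Boston' are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE departments (
        department TEXT PRIMARY KEY,
        location   TEXT NOT NULL
    );
    CREATE TABLE employees (
        employee_id INTEGER PRIMARY KEY,
        name        TEXT NOT NULL,
        department  TEXT NOT NULL REFERENCES departments(department)
    );
    INSERT INTO departments VALUES
        ('HR', 'New York'), ('IT', 'San Francisco'), ('Sales', 'London');
    INSERT INTO employees VALUES
        (1, 'John Doe', 'HR'), (2, 'Jane Smith', 'IT'), (3, 'Alex Johnson', 'Sales');
""")


def employee_locations(conn):
    """Look up each employee's location through their department."""
    rows = conn.execute("""
        SELECT e.name, d.location
        FROM employees AS e
        JOIN departments AS d ON d.department = e.department
        ORDER BY e.employee_id
    """)
    return dict(rows)


# Relocating a department is a single-row update in "departments".
conn.execute("UPDATE departments SET location = 'Boston' WHERE department = 'HR'")
conn.commit()

locations = employee_locations(conn)
print(locations)
```

Because the location is stored once per department, every employee in that department sees the new value without any change to the employees table.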
