What Is Database Normalization?

Normalization is the process of organizing a relational database to reduce data redundancy and improve data integrity. The goal is to ensure that each piece of information is stored in exactly one place — so updates, inserts, and deletes don't create inconsistencies.

Normalization is achieved through a series of steps called normal forms, each building on the last. The three most commonly applied are First Normal Form (1NF), Second Normal Form (2NF), and Third Normal Form (3NF).

First Normal Form (1NF) — Eliminate Repeating Groups

A table is in 1NF when:

  • Every column contains atomic (indivisible) values — no lists or sets in a single cell.
  • Each column holds values of a single type.
  • Each row is unique, identifiable by a primary key.

Before 1NF (violation):

A students table where the courses column contains comma-separated values like "Math, Science, History" violates 1NF. Each course should be its own row or its own related table.

After 1NF:

Split the multi-valued column into a separate enrollments table with one row per student-course pair.

Second Normal Form (2NF) — Eliminate Partial Dependencies

A table is in 2NF when it is already in 1NF and every non-key column is fully dependent on the entire primary key — not just part of it. This only matters when you have a composite primary key.

Example violation:

An order_items table with a composite key of (order_id, product_id) that also stores product_name. The product name depends only on product_id, not the full key — that's a partial dependency.

Fix:

Move product_name to a separate products table where product_id is the sole primary key.

Third Normal Form (3NF) — Eliminate Transitive Dependencies

A table is in 3NF when it is in 2NF and no non-key column depends on another non-key column (a transitive dependency).

Example violation:

An employees table with columns: employee_id, department_id, department_name. Here, department_name depends on department_id, not directly on employee_id.

Fix:

Move department_name into a departments table. The employees table keeps department_id as a foreign key.

When to Stop Normalizing

Normalization improves data integrity but can increase query complexity — you'll need more JOINs. For most OLTP (transactional) systems, 3NF is the sweet spot. For reporting and analytical databases (OLAP), some denormalization is often intentional for performance.

Summary Table

Normal FormRequirement
1NFAtomic values, unique rows, single-type columns
2NF1NF + no partial dependencies on composite keys
3NF2NF + no transitive dependencies between non-key columns

Key Takeaways

  • Normalization is about eliminating redundancy step by step.
  • Each higher normal form builds on the previous one.
  • Most production relational databases aim for 3NF.
  • Strategic denormalization can be acceptable for performance-critical read-heavy workloads.