Relational Database Design: DBMS

Jan 05, 2025 pm 12:57 PM

Relational Database Design: Comprehensive Guide

Relational database design is the cornerstone of effective database systems, focusing on organizing data efficiently while reducing redundancy and preserving data integrity. This article provides a thorough exploration of decomposition, normalization, functional dependencies, and keys, ensuring you have a complete understanding of relational database design principles.


Decomposition in Relational Database Design

Decomposition is the process of breaking a large relation (table) into smaller, meaningful relations to eliminate redundancy and prevent update, insertion, and deletion anomalies. It is the basic operation behind normalization.

Types of Decomposition

  1. Lossy Decomposition:

    • A decomposition is lossy if the original table cannot be perfectly reconstructed by joining the decomposed relations.
    • The loss shows up as spurious tuples: the join returns rows that were never in the original table, so genuine facts can no longer be told apart from invented ones.
    • Example: Consider the table:
     EmployeeID | ProjectID | ProjectManager
     ---------------------------------------
     E1         | P1        | M1
     E2         | P1        | M1
    

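    If this is decomposed into:

    • Table 1: EmployeeID | ProjectID
    • Table 2: ProjectID | ProjectManager

    the outcome depends on the shared attribute ProjectID. With the rows shown, ProjectID → ProjectManager holds, so the join returns exactly the original table. If a project could have more than one manager, the join would pair every employee on that project with every one of its managers, producing spurious rows; the decomposition would then be lossy.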
  2. Lossless Decomposition:

    • A decomposition is lossless if the original table can be perfectly reconstructed by joining the decomposed relations without losing any data or introducing inconsistencies.
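    • For a decomposition into two relations R1 and R2, this is guaranteed when the attributes they share functionally determine all of R1 or all of R2, that is, when the common attributes form a superkey of at least one of the two relations. Preserving functional dependencies is a separate property and does not by itself make a decomposition lossless.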
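
The difference between the two kinds of decomposition can be checked mechanically. The following is a minimal sketch in Python, not a production routine: the helper names (project, natural_join, is_lossless) are introduced here for illustration, and the two_managers rows are a hypothetical variation of the Employee table in which project P1 has two managers.

```python
def project(rows, attrs):
    """Project a list of dict rows onto attrs, removing duplicates."""
    return {tuple((a, r[a]) for a in attrs) for r in rows}


def natural_join(left, right):
    """Natural join of two projections produced by project()."""
    joined = set()
    for l in left:
        for r in right:
            ld, rd = dict(l), dict(r)
            common = ld.keys() & rd.keys()
            if all(ld[a] == rd[a] for a in common):
                joined.add(tuple(sorted({**ld, **rd}.items())))
    return joined


def as_set(rows):
    """Normalize dict rows into the same representation natural_join returns."""
    return {tuple(sorted(r.items())) for r in rows}


def is_lossless(rows, attrs1, attrs2):
    """True if joining the two projections gives back exactly the original rows."""
    return natural_join(project(rows, attrs1), project(rows, attrs2)) == as_set(rows)


# Rows from the article: project P1 has a single manager.
one_manager = [
    {"EmployeeID": "E1", "ProjectID": "P1", "ProjectManager": "M1"},
    {"EmployeeID": "E2", "ProjectID": "P1", "ProjectManager": "M1"},
]
# Hypothetical variation: project P1 has two managers.
two_managers = [
    {"EmployeeID": "E1", "ProjectID": "P1", "ProjectManager": "M1"},
    {"EmployeeID": "E2", "ProjectID": "P1", "ProjectManager": "M2"},
]

split = (["EmployeeID", "ProjectID"], ["ProjectID", "ProjectManager"])
print(is_lossless(one_manager, *split))   # True
print(is_lossless(two_managers, *split))  # False: the join yields 4 rows, 2 of them spurious
```

A check like this only tests one particular set of rows. Whether a decomposition is lossless for every legal state of the table depends on the functional dependencies that hold by design.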

Functional Dependency

A functional dependency (FD) describes a relationship between two attributes in a relation where the value of one attribute (or set of attributes) determines the value of another attribute (or set of attributes). It is a fundamental concept in relational database design and normalization.

Definition:

Let X and Y be sets of attributes in a relation R. A functional dependency X → Y means that for any two tuples (rows) in R, if the tuples agree on the values of X, they must also agree on the values of Y.

  • X: Determinant (the attribute(s) on the left side).
  • Y: Dependent (the attribute(s) on the right side).

Example:

Consider a table storing student information:

StudentID | Name    | Major
----------------------------
S1        | Alice   | CS
S2        | Bob     | EE
S3        | Alice   | CS

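Here, StudentID → Name, Major because each StudentID determines exactly one Name and one Major. The reverse does not hold: Name → StudentID fails because two different students are named Alice.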
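
The definition translates directly into a check over sample rows. The sketch below uses a hypothetical helper, fd_holds, written for this article. Note that sample data can only refute a dependency; it cannot prove that the dependency is a rule of the schema.

```python
def fd_holds(rows, lhs, rhs):
    """Return True if no two rows agree on lhs but differ on rhs."""
    seen = {}
    for row in rows:
        x = tuple(row[a] for a in lhs)
        y = tuple(row[a] for a in rhs)
        # setdefault stores y the first time x is seen and returns the stored value.
        if seen.setdefault(x, y) != y:
            return False
    return True


students = [
    {"StudentID": "S1", "Name": "Alice", "Major": "CS"},
    {"StudentID": "S2", "Name": "Bob", "Major": "EE"},
    {"StudentID": "S3", "Name": "Alice", "Major": "CS"},
]

print(fd_holds(students, ["StudentID"], ["Name", "Major"]))  # True
print(fd_holds(students, ["Name"], ["StudentID"]))           # False: Alice maps to S1 and S3
```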

Properties of Functional Dependencies (Armstrong's Axioms):

  1. Reflexivity: If Y is a subset of X, then X → Y.
  2. Augmentation: If X → Y, then XZ → YZ (adding attributes to both sides preserves the dependency).
  3. Transitivity: If X → Y and Y → Z, then X → Z.
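
These rules are usually applied through the attribute closure: the set of all attributes determined by a given set. The following sketch shows one common way to compute it; the function name closure and the sample dependencies A → B and B → C are illustrative assumptions.

```python
def closure(attrs, fds):
    """Compute the closure of attrs under fds, a list of (lhs, rhs) pairs of sets."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            # If everything in lhs is already determined, rhs is determined too.
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result


fds = [({"A"}, {"B"}), ({"B"}, {"C"})]
print(sorted(closure({"A"}, fds)))  # ['A', 'B', 'C']
print(sorted(closure({"B"}, fds)))  # ['B', 'C']
```

An attribute set is a superkey exactly when its closure contains every attribute of the relation, so the same routine can be used to test keys and normal forms.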

Keys in Relational Databases

Keys are essential for identifying records uniquely in a table and enforcing data integrity.

Types of Keys:

  1. Superkey:

    • A set of one or more attributes that can uniquely identify a tuple in a relation.
    • Example: In a table with attributes EmployeeID and Name, {EmployeeID}, {EmployeeID, Name} are superkeys.
  2. Candidate Key:

    • A minimal superkey, meaning no proper subset of it is also a superkey.
    • Example: If {EmployeeID} can uniquely identify a tuple, it is a candidate key.
  3. Primary Key:

    • A candidate key chosen by the database designer to uniquely identify tuples.
    • Example: EmployeeID in an Employee table.
  4. Foreign Key:

    • An attribute (or set of attributes) in one table that references the primary key (or another candidate key) of another table, establishing a relationship between the tables.
    • Example: DepartmentID in an Employee table referencing DepartmentID in a Department table.
  5. Composite Key:

    • A key composed of two or more attributes.
    • Example: (StudentID, CourseID) in a table of student enrollments.
  6. Unique Key:

    • A key constraint ensuring that all values in a column (or combination of columns) are unique. A table can have several unique keys but only one primary key.
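
To see how these key types are enforced in practice, here is a small sketch using Python's built-in sqlite3 module with an in-memory database. The table layouts follow the examples above, but the Email column, the sample values, and the rejected helper are assumptions added for illustration. Syntax and behavior differ slightly between database systems.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when enabled
conn.executescript("""
CREATE TABLE Department (
    DepartmentID INTEGER PRIMARY KEY,            -- primary key
    Name         TEXT NOT NULL UNIQUE            -- unique key
);
CREATE TABLE Employee (
    EmployeeID   INTEGER PRIMARY KEY,
    Email        TEXT UNIQUE,                    -- hypothetical column, unique key
    DepartmentID INTEGER NOT NULL
                 REFERENCES Department(DepartmentID)  -- foreign key
);
CREATE TABLE Enrollment (
    StudentID TEXT NOT NULL,
    CourseID  TEXT NOT NULL,
    PRIMARY KEY (StudentID, CourseID)            -- composite key
);
""")

conn.execute("INSERT INTO Department VALUES (1, 'Engineering')")
conn.execute("INSERT INTO Employee VALUES (10, 'a@example.com', 1)")
conn.execute("INSERT INTO Enrollment VALUES ('S1', 'C1')")


def rejected(sql):
    """Return True if the statement violates a key constraint."""
    try:
        conn.execute(sql)
    except sqlite3.IntegrityError:
        return True
    return False


print(rejected("INSERT INTO Employee VALUES (10, 'b@example.com', 1)"))   # True: duplicate primary key
print(rejected("INSERT INTO Employee VALUES (11, 'a@example.com', 1)"))   # True: duplicate unique key
print(rejected("INSERT INTO Employee VALUES (12, 'c@example.com', 99)"))  # True: no such department
print(rejected("INSERT INTO Enrollment VALUES ('S1', 'C1')"))             # True: duplicate composite key
```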

Normalization and Normal Forms

Normalization is the process of organizing attributes and relations to reduce redundancy and remove undesirable dependencies, which protects data integrity. This is achieved by progressively meeting the criteria of successive normal forms.

Normal Forms (Comprehensive Overview)

First Normal Form (1NF)

Definition:

A relation is said to be in First Normal Form (1NF) if it satisfies the following criteria:

  1. Atomicity: All attributes (columns) must contain atomic values. This means that the values in each column are indivisible and cannot be further broken down.
  2. Single-Valued Entries: Each column in a table should contain values of a single data type, and no column should have sets, lists, or arrays.
  3. Uniqueness of Rows: Each row must be unique, meaning the table should have a primary key to distinguish between rows.
  4. No Repeating Groups: The table should not have multiple columns for the same attribute (like Item1, Item2, etc.), nor should it have multiple values stored in a single cell.

Explanation:

  • Atomic Values: Data in each cell must be in its simplest form. For example, instead of storing multiple items in one cell, each item should occupy its own row.
  • Repeating Groups: This is where multiple columns (such as Item1, Item2) or multiple values in one cell represent the same kind of data, making the table non-compliant with 1NF.
  • Primary Key: A primary key ensures that each row is uniquely identifiable, which is a fundamental requirement for relational databases.

Example:

Non-Compliant Table (Not in 1NF):

OrderID | Items
-------------------
1       | Pen, Notebook
2       | Pencil

  • The Items column violates atomicity because it contains multiple values (e.g., "Pen, Notebook").
  • There are repeating groups because items are stored in a single cell rather than separate rows.

Compliant Table (In 1NF):

OrderID | Item
---------------
1       | Pen
1       | Notebook
2       | Pencil

  • Here, the Items column is broken down into atomic values, with each item in a separate row.
  • No cell contains multiple values, ensuring atomicity.
  • The table has no repeating groups or arrays, making it compliant with 1NF.
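
The conversion to 1NF can be sketched in a few lines of Python. The to_1nf function is a hypothetical helper, and it assumes the multi-valued cell uses a comma as its separator, as in the "Pen, Notebook" example.

```python
orders = [
    {"OrderID": 1, "Items": "Pen, Notebook"},
    {"OrderID": 2, "Items": "Pencil"},
]


def to_1nf(rows):
    """Emit one (OrderID, Item) row per item in the comma-separated Items cell."""
    return [
        {"OrderID": row["OrderID"], "Item": item.strip()}
        for row in rows
        for item in row["Items"].split(",")
    ]


for row in to_1nf(orders):
    print(row)
# {'OrderID': 1, 'Item': 'Pen'}
# {'OrderID': 1, 'Item': 'Notebook'}
# {'OrderID': 2, 'Item': 'Pencil'}
```

In the resulting table, OrderID alone no longer identifies a row; the pair (OrderID, Item) does.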

Second Normal Form (2NF)

Definition:

A relation is in Second Normal Form (2NF) if:

  1. It is already in First Normal Form (1NF) (i.e., no multi-valued or repeating groups).
  2. Every non-prime attribute is fully functionally dependent on every candidate key.
  • Non-prime attribute: An attribute that is not part of any candidate key.
  • Fully functionally dependent: A non-prime attribute must depend on the whole of a composite key, not on a proper subset of it.

Explanation:

  • A partial dependency occurs when a non-prime attribute depends on only a part of a composite primary key, rather than the whole key.
  • 2NF eliminates partial dependencies by decomposing the relation into smaller relations, ensuring that non-prime attributes are dependent only on the entire primary key or another candidate key.

This step reduces redundancy caused by partial dependencies and organizes the data better.

Example:

Non-Compliant Table (Not in 2NF):

Consider a table storing student-course information:

StudentID | CourseID | Instructor | Department
----------------------------------------------
S1        | C1       | Dr. Smith  | CS
S2        | C2       | Dr. Jones  | EE
S1        | C2       | Dr. Jones  | EE

  • Composite Primary Key: (StudentID, CourseID).
  • Partial Dependency:
    • Instructor and Department depend only on CourseID and not on the entire primary key (StudentID, CourseID).

This violates 2NF because non-prime attributes (Instructor and Department) are partially dependent on the composite key.

Compliant Tables (In 2NF):

To remove the partial dependency, decompose the table into two relations:

  1. Student-Course Table:
StudentID | CourseID
--------------------
S1        | C1
S2        | C2
S1        | C2

  2. Course-Details Table:
CourseID | Instructor | Department
----------------------------------
C1       | Dr. Smith  | CS
C2       | Dr. Jones  | EE
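
The decomposition is a pair of projections, and joining them on CourseID should return the original rows. The sketch below works through this with the student-course data (S1/C1/Dr. Smith/CS and so on); the variable names are chosen here for illustration.

```python
# (StudentID, CourseID, Instructor, Department)
enrollment = [
    ("S1", "C1", "Dr. Smith", "CS"),
    ("S2", "C2", "Dr. Jones", "EE"),
    ("S1", "C2", "Dr. Jones", "EE"),
]

# Projection onto the composite key (StudentID, CourseID).
student_course = sorted({(s, c) for s, c, _, _ in enrollment})
# Projection onto CourseID plus the attributes that depend on it alone.
course_details = sorted({(c, i, d) for _, c, i, d in enrollment})

# Rejoin on CourseID to confirm that nothing was lost.
details_by_course = {c: (i, d) for c, i, d in course_details}
rejoined = sorted((s, c, *details_by_course[c]) for s, c in student_course)

print(student_course)   # [('S1', 'C1'), ('S1', 'C2'), ('S2', 'C2')]
print(course_details)   # [('C1', 'Dr. Smith', 'CS'), ('C2', 'Dr. Jones', 'EE')]
print(rejoined == sorted(enrollment))  # True
```

The instructor and department of course C2 are now stored once instead of once per enrolled student.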

Third Normal Form (3NF)

Definition:

A relation is in Third Normal Form (3NF) if:

  1. It is in Second Normal Form (2NF) (i.e., no partial dependencies).
  2. No transitive dependency exists, which means:
    • No non-prime attribute is dependent on another non-prime attribute.
    • A non-prime attribute should depend only on a candidate key, not through another non-prime attribute.
  • Non-prime attribute: An attribute that is not part of any candidate key.
  • Transitive dependency: A dependency where a non-prime attribute depends indirectly on a candidate key through another non-prime attribute.

Explanation:

In 3NF, we eliminate transitive dependencies to reduce redundancy and improve data consistency.

  • Transitive Dependency Example: If A → B and B → C, then A → C is a transitive dependency. This means C indirectly depends on A through B.
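  • Such dependencies introduce redundancy: the value of C is repeated in every row that shares the same B, so a single change to C must be applied in many places, and missing one creates an update anomaly.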

Example:

Non-Compliant Table (Not in 3NF):

StudentID | Department | HOD
----------------------------

Candidate Key: StudentID uniquely identifies each row.

  • Issue: The HOD attribute depends on Department, not directly on StudentID.
    • StudentID → Department (Direct dependency).
    • Department → HOD (Dependency between two non-prime attributes).
    • So, StudentID → HOD is a transitive dependency.

This structure leads to redundancy: if the HOD for the CS department changes, multiple rows need updating.

Compliant Tables (In 3NF):

To resolve the transitive dependency, decompose the table into two relations:

  1. Student-Department Table:
StudentID | Department
----------------------

  2. Department-HOD Table:
Department | HOD
----------------
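
The effect of removing the transitive dependency can be illustrated with a short sketch. All rows below are hypothetical, and the head-of-department names are invented for the example; only the column names StudentID, Department, and HOD come from the text.

```python
# Hypothetical rows of the table that is not in 3NF.
students = [
    {"StudentID": "S1", "Department": "CS", "HOD": "Dr. Rao"},
    {"StudentID": "S2", "Department": "CS", "HOD": "Dr. Rao"},
    {"StudentID": "S3", "Department": "EE", "HOD": "Dr. Lee"},
]

# 3NF decomposition: StudentID -> Department and Department -> HOD.
student_department = {r["StudentID"]: r["Department"] for r in students}
department_hod = {r["Department"]: r["HOD"] for r in students}

# In the original table, changing the CS head means touching every CS student row.
rows_touched_before = sum(1 for r in students if r["Department"] == "CS")

# After decomposition, the same change is a single update.
department_hod["CS"] = "Dr. Kim"


def hod_of(student_id):
    """Look up a student's HOD through the two decomposed tables."""
    return department_hod[student_department[student_id]]


print(rows_touched_before)         # 2
print(hod_of("S1"), hod_of("S2"))  # Dr. Kim Dr. Kim
```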

Boyce-Codd Normal Form (BCNF)

Definition:

A relation is in Boyce-Codd Normal Form (BCNF) if:

  1. It is in Third Normal Form (3NF) (i.e., no partial or transitive dependencies exist).
  2. For every non-trivial functional dependency X → Y, the determinant X is a superkey.
  • Determinant: An attribute (or a set of attributes) on which another attribute is functionally dependent.
  • Candidate Key: A minimal set of attributes that can uniquely identify each tuple in a relation.

Key Difference Between 3NF and BCNF:

  • 3NF tolerates a dependency X → A whose determinant X is not a superkey, provided A is a prime attribute (part of some candidate key). BCNF removes this exception: the determinant of every non-trivial dependency must be a superkey.

Explanation:

BCNF is stricter than 3NF and addresses situations where a relation may satisfy 3NF but still have redundancy caused by dependencies that violate BCNF.

When BCNF is Needed:

  • BCNF is necessary when a set of attributes that is not a superkey determines part of a candidate key. 3NF permits this, but it still leads to redundancy and anomalies.

Example:

Non-Compliant Table (Not in BCNF):

CourseID | Instructor | Room
----------------------------

Functional Dependencies:

  1. CourseID → Instructor
  2. Instructor → Room

Candidate Key: CourseID

Issue:

  • The determinant Instructor is not a candidate key but determines the Room.
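  • This violates BCNF, as not all determinants are candidate keys. Because Room is a non-prime attribute, Instructor → Room is also a transitive dependency, so this particular table fails 3NF as well.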

Compliant Tables (In BCNF):

To achieve BCNF, decompose the table into two relations:

  1. Course-Instructor Table:
CourseID | Instructor
---------------------

  2. Instructor-Room Table:
Instructor | Room
-----------------
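
The BCNF condition can be tested from the functional dependencies alone. The following sketch is one possible way to do it, using a closure-based superkey test; the function names closure and bcnf_violations are assumptions made for this example. Checking each decomposed table against only the dependency that falls inside it is adequate for this small case, although a general check would also consider implied dependencies.

```python
def closure(attrs, fds):
    """Closure of attrs under fds, a list of (lhs, rhs) pairs of sets."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result


def bcnf_violations(attributes, fds):
    """Return the non-trivial FDs whose determinant is not a superkey."""
    return [
        (lhs, rhs)
        for lhs, rhs in fds
        if not rhs <= lhs and closure(lhs, fds) != set(attributes)
    ]


attributes = {"CourseID", "Instructor", "Room"}
fds = [({"CourseID"}, {"Instructor"}), ({"Instructor"}, {"Room"})]

print(bcnf_violations(attributes, fds))                      # [({'Instructor'}, {'Room'})]
print(bcnf_violations({"CourseID", "Instructor"}, fds[:1]))  # []
print(bcnf_violations({"Instructor", "Room"}, fds[1:]))      # []
```

The first call reports Instructor → Room as the offending dependency; the two decomposed tables report none.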

Fourth Normal Form (4NF)

Definition:

A relation is in Fourth Normal Form (4NF) if:

  1. It is in Boyce-Codd Normal Form (BCNF) (i.e., no partial, transitive, or other anomalies).
  2. For every non-trivial multi-valued dependency X →→ Y, the determinant X is a superkey.
  • Multi-valued Dependency (MVD): X →→ Y holds when, for each value of X, the set of associated Y values is independent of the remaining attributes. A relation that stores two or more such independent multi-valued facts about the same X, without X being a superkey, violates 4NF.

Explanation:

In 4NF, the primary goal is to eliminate multi-valued dependencies, which occur when a record contains two or more independent attributes that are not directly related but appear together due to their dependence on the same key.

  • These types of dependencies lead to redundancy because multiple copies of the same information are repeated in rows.
  • By decomposing the relation to remove MVDs, we eliminate redundancy and improve consistency in the database.

Key Concept:

  • In 4NF, a relation should not store two or more independent multi-valued facts about the same entity. Each such multi-valued dependency must be moved into its own table.

Example:

Non-Compliant Table (Not in 4NF):

Consider a table that stores information about students, the courses they take, and the clubs they are involved in:

StudentID | Course | Club
-------------------------

Candidate Key: (StudentID, Course, Club). StudentID alone is not a key, because the same student appears in several rows.

Multi-Valued Dependencies:

  • A StudentID can determine both a set of Courses and a set of Clubs, but these sets are independent of each other.
    • StudentID →→ Course (Multi-valued dependency between StudentID and Courses)
    • StudentID →→ Club (Multi-valued dependency between StudentID and Clubs)

The table violates 4NF because StudentID determines both the courses and the clubs independently. This causes redundancy, as the same StudentID is repeated multiple times with different combinations of courses and clubs.

Compliant Tables (In 4NF):

To make the table comply with 4NF, we must eliminate the multi-valued dependencies by decomposing it into two tables:

  1. Student-Course Table:
StudentID | Course
------------------

  2. Student-Club Table:
StudentID | Club
----------------


Now, the two multi-valued dependencies are handled separately:

  • The Student-Course Table stores the relationship between students and the courses they take.
  • The Student-Club Table stores the relationship between students and the clubs they are involved in.
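
A short sketch makes the redundancy visible. The rows below are hypothetical (the course and club names are invented for illustration); they model a student S1 who takes two courses and belongs to two clubs, so every course must be paired with every club.

```python
from itertools import product

# Hypothetical (StudentID, Course, Club) rows of the table that is not in 4NF.
rows = {
    ("S1", "Math", "Chess"),
    ("S1", "Math", "Drama"),
    ("S1", "Physics", "Chess"),
    ("S1", "Physics", "Drama"),
    ("S2", "Math", "Chess"),
}

# 4NF decomposition: one table per independent multi-valued fact.
student_course = {(s, c) for s, c, _ in rows}
student_club = {(s, k) for s, _, k in rows}

# Natural join of the two projections on StudentID.
rejoined = {
    (s, c, k)
    for (s, c), (s2, k) in product(student_course, student_club)
    if s == s2
}
print(rejoined == rows)  # True: the decomposition is lossless

# Adding one club for S1 costs one row per course in the original table,
# but a single row in the Student-Club table.
rows_needed_original = len({c for s, c in student_course if s == "S1"})
print(rows_needed_original)  # 2
```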

Fifth Normal Form (5NF)

Definition:

A relation is in Fifth Normal Form (5NF), also known as Projection-Join Normal Form (PJNF), if:

  1. It is in Fourth Normal Form (4NF) (i.e., no multi-valued dependencies exist).
  2. Every non-trivial join dependency in the relation is implied by its candidate keys, so the relation cannot be split into smaller projections that remove redundancy and still join back to the original.
  • Join Dependency (JD): A relation satisfies a join dependency over the attribute sets R1, R2, ..., Rn when it is always equal to the natural join of its projections onto R1, R2, ..., Rn. In other words, the relation can be divided into those sub-relations and reconstructed from them without losing data or gaining spurious rows.

Explanation:

5NF deals with join dependencies, and it ensures that the data is decomposed in such a way that all information can be reconstructed from its decomposed parts without any loss of data. A relation in 5NF is designed in such a way that all of its non-trivial join dependencies are implied by its candidate keys.

  • Lossless Join Decomposition: When a relation is decomposed into smaller relations and then rejoined, the original relation can be fully reconstructed without any data loss. A relation is in 5NF if every lossless decomposition of it is already implied by its candidate keys.
  • Non-trivial Join Dependency: A join dependency is trivial if one of its components contains all attributes of the relation, because joining with the whole relation always returns the relation. Any other join dependency is non-trivial.

In simpler terms, 5NF is concerned with ensuring that there is no redundancy caused by improper decompositions. It guarantees that when a relation is decomposed and later joined back, all of the original data is still available without any loss or ambiguity.

Example:

Non-Compliant Table (Not in 5NF):

Consider a table that stores information about which suppliers supply which parts for different projects:

Supplier | Part | Project
-------------------------

Candidate Key: (Supplier, Part, Project)

Join Dependency:

Assume the business rules guarantee a join dependency on this relation, so that it can be decomposed into smaller relations without losing information. In that case, the table can be decomposed into three sub-relations:

  1. Supplier-Part Table:
Supplier | Part
---------------

  2. Supplier-Project Table:
Supplier | Project
------------------

  3. Part-Project Table:
Part | Project
--------------

By decomposing the table into these smaller relations, we can still recreate the original table by performing a natural join on these three smaller relations.

Because this join dependency is not implied by the only candidate key, (Supplier, Part, Project), the original table violates 5NF. The same supplier-part, supplier-project, and part-project facts are repeated across multiple rows, which is unnecessary and could lead to inconsistencies.

Compliant Tables (In 5NF):

To achieve 5NF, replace the original table with the three projections listed above:

  1. Supplier-Part Table
  2. Supplier-Project Table
  3. Part-Project Table

Each of these two-column relations is in 5NF because it cannot be decomposed further without losing data. Together they represent the same information as the original Supplier-Part-Project table, which can be recomputed with a natural join whenever it is needed, and each fact is now stored only once.
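
Whether a three-way decomposition is safe depends on the join dependency actually holding. The sketch below tests this on hypothetical data (the supplier, part, and project identifiers are invented for illustration); three_way_join and satisfies_jd are helper names introduced here.

```python
def three_way_join(sp, sj, pj):
    """Natural join of Supplier-Part, Supplier-Project, and Part-Project."""
    return {
        (s, p, j)
        for s, p in sp
        for s2, j in sj
        if s == s2 and (p, j) in pj
    }


def satisfies_jd(rows):
    """True if rows equals the join of its three two-column projections."""
    sp = {(s, p) for s, p, _ in rows}
    sj = {(s, j) for s, _, j in rows}
    pj = {(p, j) for _, p, j in rows}
    return three_way_join(sp, sj, pj) == rows


# Hypothetical (Supplier, Part, Project) rows that satisfy the join dependency.
with_jd = {
    ("S1", "P1", "J2"),
    ("S1", "P2", "J1"),
    ("S2", "P1", "J1"),
    ("S1", "P1", "J1"),
}
# Removing one row breaks it: the join would bring that row back as a spurious tuple.
without_jd = with_jd - {("S1", "P1", "J1")}

print(satisfies_jd(with_jd))     # True
print(satisfies_jd(without_jd))  # False
```

As with functional dependencies, a test on sample rows can only show that a join dependency fails. Treating it as a rule of the schema is a design decision based on the business rules.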


Key Concepts in Relational Design

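  • Multi-Valued Dependency: When one attribute determines a set of values for another attribute, independently of the remaining attributes.
  • Join Dependency: States that a relation can be rebuilt by joining specific projections of it without creating spurious tuples.
  • Dependency Preservation: Ensures all functional dependencies can still be enforced on the decomposed relations without joining them.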

This comprehensive guide equips you to master relational database design, ensuring efficient, consistent, and anomaly-free database systems.
