Last Updated on November 30, 2022 by Prepbytes
In the article, we will study what is normalization and what is the role of normalization in dbms. We will study different types of techniques to perform normalization in dbms with examples.
What is normalization in dbms?
Normalization is performed to remove anomaly and redundancy from a database. The need for normalization in dbms is to avoid problems such as:-
- Relations getting large and redundant
- Save the memory space that goes wasted.
- Avoid inconsistencies in the database.
Normalization in dbms reorganizes the data such that there is no repetitive data. Also, normalization in dbms is responsible for removing anomalies in respect to insertion, deletion and updation. Normalization compresses a larger table into a smaller one and inter-connects the table. Dependency Preserving is also among the various parts of Normalization in DBMS.
Having seen what exactly is normalization in dbms, let us have a look at normalization in dbms with examples of each among its many types.
As the sections proceed, we will be looking at each case of normalization in dbms with example. The four types of normalization in dbms are as follow:-
- 1NF
- 2NF
- 3NF
- Boyce-Codd Normal Form
1NF or First Normal Form
First Normal Form states that every attribute in a table must be single-valued and not multi-valued. Single-valued can also be termed as atomic.
Example:
id | articles |
---|---|
I1 | A1,A2 |
I2 | A3,A4 |
Above table has multiple values in the second column i.e. articles. In order to maintain atomicity in the table, we will decompose to preserve atomicity by taking the following steps:-
id | articles |
---|---|
I1 | A1 |
I1 | A2 |
I2 | A3 |
I2 | A4 |
From the above table we can notice that the disadvantage of First Normal Form is that it increases the redundancy and id with name I1 and I2 are repetitive. However, we are able to get rid of multiple values.
2NF or Second Normal Form
Second normal form is obeyed if and only if the relation is already in first normal form i.e. values are atomic and the relation has no partial dependency. Second normal form is a follow up to the first normal form as it avoids the redundancy created as part of normalizing in the first normal form.
As we solved normalization in dbms with examples in past instances, let us take an example and understand better in this case as well.
Example:
Let us assume we have a relation R with attributes A,B,C,D,E,F denoted as: R (A,B,C,D,E,F)
which is in its first normal form. The dependencies are given as :-
F = { AB -> C, C -> D, B -> E, B -> F}. We have to decompose if it is not in 2NF
Candidate Key = {AB} ; AB+ = {A,B,C,D,E,F}
Prime Attribute = {A, B}
B->E and B->F are the two partial dependencies here.
From above data let,
R1(A,B,C,D) and R2(B,E,F) are two relations
Functional Dependencies of R1 = {AB->C,C->D}
Functional Dependencies of R2 = {B->E,B->F}
Candidate Key of R1 = {AB}
Candidate Key of R2 = {B}
Thus, 2NF Normalization in DBMS with example mentioned, we get to to that there is no partial dependency and the relations are in second normal form.
3NF or Third Normal Form
To achieve third normal form of normalization in dbms, a relation must be already in second normal form or 2NF, and all the non-prime attributes must depend on the super key.
Alternatively said, None of the non-prime prime attributes depends on any of the non-prime attributes. Such dependency can be termed as Transitive Dependency.
What is Transitive Dependency
Let’s suppose we have information about students with their winning year, date of birth and sport they excelled at.
Game | Year | Student | Date of Birth |
---|---|---|---|
Athletics | 2009 | Alice | 21 July 1995 |
Basketball | 2012 | Bob | 23 August 2000 |
Chess | 2011 | Davis | 12 January 1998 |
Cycling | 2013 | Bob | 23 August 2000 |
Sport wise distribution of Students
Game | Year | Student |
---|---|---|
Athletics | 2009 | Alice |
Basketball | 2012 | Bob |
Chess | 2011 | Davis |
Cycling | 2013 | Bob |
Table of Date of Birth
Student | Date of Birth |
---|---|
Alice | 21 July 1995 |
Bob | 23 August 2000 |
Davis | 12 January 1998 |
Let G denotes Game, Y denotes Year, S denotes Students and D denotes Date of Birth here.
Relation be R(G,Y,S,D)
The set of functional dependencies is F = {GY -> SD, S -> D}
Candidate Key from above relation can be deduced as {GY}. There are no Partial Dependencies in the relation.
Prime Attributes are {T,Y} while Non-Prime Attributes are {W,D}. In the second dependency, a non-prime attribute is determining another non-prime attribute which is an example of Transitive Dependency.
BCNF or Boyce-Codd Normal Form
Similar to the past examples of normalization in dbms, to fulfill the requirements of BCNF, relation must already be in third normal form with a slight modification to its predecessor.
In each of the functional dependencies A -> B, these conditions must be fulfilled.
- B ⊆ A, It states that B must be a subset or equal to A trivially
- A is a Super Key.
Conclusion
Reaching the end of the article, we have an on what normalization in dbms is as we studied normalization in dbms with examples of each type with their theoretical insights. We hope you liked this article on Normalization in DBMS.
Frequently Asked Questions related to Normalization
1. Why is normalization in dbms performed?
Normalization is performed to remove anomaly and redundancy from a database.
2. What are the different types of normal forms?
1NF, 2NF, 3NF, BCNF
3. Are Normal Forms interconnected chronologically?
Yes, lower levels must be done successfully to carry out succeeding normal forms.