Application Development and Automation Discussions
Join the discussions or start your own on all things application development, including tools and APIs, programming models, and keeping your skills sharp.

what is normalisation?

Former Member
0 Likes
3,380

what is normalisation?

1 ACCEPTED SOLUTION
Former Member
0 Likes
2,059

hi

<b>Database normalization</b> is a technique for designing relational database tables to minimize duplication of information and, in so doing, to safeguard the database against certain types of logical or structural problems. For example, when multiple instances of a given piece of information occur in a table, the possibility exists that these instances will not be kept consistent when the data within the table is updated, leading to a loss of data integrity. A table that is sufficiently normalized is less vulnerable to problems of this kind, because its structure reflects the basic assumptions for when multiple instances of the same information should be represented by a single instance only.

Higher degrees of normalization typically involve more tables and create the need for a larger number of joins, which can reduce performance. Accordingly, more highly normalized tables are typically used in database applications involving many isolated transactions (e.g. an automatic teller system), while less normalized tables tend to be used in database applications that do not need to map complex relationships between data entities and data attributes (e.g. a reporting application, or a full-text search application).

Database theory describes a table's degree of normalization in terms of normal forms of successively higher degrees of strictness. A table in third normal form (3NF), for example, is necessarily in second normal form (2NF) as well, but the reverse is not always the case.

Although the normal forms are often defined informally in terms of the characteristics of tables, rigorous definitions of the normal forms are concerned with the characteristics of mathematical constructs known as relations. Whenever information is represented relationally, it is meaningful to consider the extent to which the representation is normalized.

<b>Normal forms</b>

The normal forms (abbrev. NF) of relational database theory provide criteria for determining a table's degree of vulnerability to logical inconsistencies and anomalies. The higher the normal form applicable to a table, the less vulnerable it is to such inconsistencies and anomalies. Each table has a "highest normal form" (HNF): by definition, a table always meets the requirements of its HNF and of all normal forms lower than its HNF; also by definition, a table fails to meet the requirements of any normal form higher than its HNF.

The normal forms are applicable to individual tables; to say that an entire database is in normal form n is to say that all of its tables are in normal form n.

Newcomers to database design sometimes suppose that normalization proceeds in an iterative fashion, i.e. a 1NF design is first normalized to 2NF, then to 3NF, and so on. This is not an accurate description of how normalization typically works. A sensibly designed table is likely to be in 3NF on the first attempt; furthermore, if it is 3NF, it is overwhelmingly likely to have an HNF of 5NF. Achieving the "higher" normal forms (above 3NF) does not usually require an extra expenditure of effort on the part of the designer, because 3NF tables usually need no modification to meet the requirements of these higher normal forms.

Edgar F. Codd originally defined the first three normal forms (1NF, 2NF, and 3NF). These normal forms have been summarized as requiring that all non-key attributes be dependent on "the key, the whole key and nothing but the key". The fourth and fifth normal forms (4NF and 5NF) deal specifically with the representation of many-to-many and one-to-many relationships among attributes. Sixth normal form (6NF) incorporates considerations relevant to temporal databases.

<b>First normal form</b>

A table is in first normal form (1NF) if and only if it faithfully represents a relation.[3] Because a relation contains no duplicate tuples and, under this strict definition, no nulls, the defining characteristic of a 1NF table is that it allows neither nulls nor duplicate rows. Simply put, a table with a unique key and without any nullable columns is in 1NF.

A common source of confusion is the assertion that a 1NF table may not have repeating groups.[4] The statement itself is uncontroversial, but experts disagree about what qualifies as a "repeating group", so the precise definition of 1NF is the subject of some controversy. This theoretical uncertainty applies to relations, however, not to tables: a table cannot contain a variable-length repeating group, because every row is structurally constrained to the same number of columns.
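
As a rough illustration of the "no nulls, no duplicate rows" reading of 1NF given above, here is a minimal Python sketch. The function name is_1nf and the sample rows are made up for this example, and the check only inspects the data it is given; it says nothing about what the table definition would allow.

```python
def is_1nf(rows):
    """Return True if no row contains None and no two rows are identical."""
    has_null = any(value is None for row in rows for value in row)
    has_duplicates = len(set(rows)) != len(rows)
    return not has_null and not has_duplicates


# Hypothetical sample data: (employee_id, department)
employees = [(519, "Engineering"), (520, "Sales")]
with_null = [(519, None), (520, "Sales")]
with_duplicate = [(519, "Engineering"), (519, "Engineering")]

print(is_1nf(employees))       # clean rows
print(is_1nf(with_null))       # a null value breaks the rule
print(is_1nf(with_duplicate))  # a duplicate row breaks the rule
```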

<b>Second normal form</b>

The criteria for second normal form (2NF) are:

The table must be in 1NF.

None of the non-prime attributes of the table are functionally dependent on a part (proper subset) of a candidate key; in other words, all functional dependencies of non-prime attributes on candidate keys are full functional dependencies.[5] For example, in an "Employees' Skills" table whose attributes are Employee ID, Employee Address, and Skill, the combination of Employee ID and Skill uniquely identifies records within the table. Given that Employee Address depends on only one of those attributes – namely, Employee ID – the table is not in 2NF.

Note that if none of a 1NF table's candidate keys are composite – i.e. every candidate key consists of just one attribute – then we can say immediately that the table is in 2NF.
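
The partial dependency in the "Employees' Skills" example can be spotted in sample data with a small helper. This is only a sketch: fd_holds, the column names and the addresses are invented for illustration, and a dependency that holds in a few sample rows is evidence, not proof, that it holds for the table in general.

```python
def fd_holds(rows, lhs, rhs):
    """Return True if the sample rows satisfy the functional dependency lhs -> rhs."""
    seen = {}
    for row in rows:
        key = tuple(row[a] for a in lhs)
        value = tuple(row[a] for a in rhs)
        # The first value seen for a key must match every later one.
        if seen.setdefault(key, value) != value:
            return False
    return True


# Hypothetical rows; the candidate key is (employee_id, skill).
employee_skills = [
    {"employee_id": 519, "address": "12 Oak St", "skill": "Typing"},
    {"employee_id": 519, "address": "12 Oak St", "skill": "Shorthand"},
    {"employee_id": 520, "address": "9 Elm Rd", "skill": "Typing"},
]

# address depends on employee_id alone, i.e. on only part of the key.
partial = fd_holds(employee_skills, ["employee_id"], ["address"])
print(partial)
```

If partial is true for a non-prime attribute such as address, the table is a candidate for splitting into an employees table and an employee-skills table.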

<b>Third normal form</b>

The criteria for third normal form (3NF) are:

The table must be in 2NF.

Every non-prime attribute of the table must be non-transitively dependent on every candidate key.[5] A violation of 3NF would mean that at least one non-prime attribute is only indirectly dependent (transitively dependent) on a candidate key. For example, consider a "Departments" table whose attributes are Department ID, Department Name, Manager ID, and Manager Hire Date; and suppose that each manager can manage one or more departments. {Department ID} is a candidate key. Although Manager Hire Date is functionally dependent on the candidate key {Department ID}, this is only because Manager Hire Date depends on Manager ID, which in turn depends on Department ID. This transitive dependency means the table is not in 3NF.
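
One way to remove the transitive dependency in the "Departments" example is to move the manager facts into their own table. The sketch below does this on a few invented rows; the names, IDs and dates are assumptions made for the example, not data from the original post.

```python
# Hypothetical rows: (department_id, department_name, manager_id, manager_hire_date)
departments_flat = [
    (1, "Sales", 77, "2001-04-01"),
    (2, "Marketing", 77, "2001-04-01"),
    (3, "Support", 81, "2003-09-15"),
]

# Departments keep only the attributes that depend directly on department_id.
departments = [
    (dept_id, name, mgr_id) for dept_id, name, mgr_id, _ in departments_flat
]

# Manager facts move to a separate table keyed by manager_id,
# so each hire date is stored exactly once.
managers = sorted(
    {(mgr_id, hire_date) for _, _, mgr_id, hire_date in departments_flat}
)

print(departments)
print(managers)
```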

<b>Boyce-Codd normal form</b>

A table is in Boyce-Codd normal form (BCNF) if and only if, for every one of its non-trivial functional dependencies X → Y, X is a superkey; that is, X is either a candidate key or a superset thereof.[6]
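
The superkey test in this definition can be sketched with the standard attribute-closure computation: X is a superkey when the closure of X under the functional dependencies covers every attribute. The helper names and the dependency list below are assumptions for illustration (they reuse the "Departments" example), and the check looks only at the dependencies that are listed explicitly, not at every dependency they imply.

```python
def closure(attrs, fds):
    """Compute the closure of attrs under the functional dependencies fds.

    fds is a list of (lhs, rhs) pairs, each a frozenset of attribute names.
    """
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result


def bcnf_violations(all_attrs, fds):
    """Return the listed non-trivial dependencies whose left side is not a superkey."""
    return [
        (lhs, rhs)
        for lhs, rhs in fds
        if not rhs <= lhs and closure(lhs, fds) != set(all_attrs)
    ]


attrs = {"department_id", "department_name", "manager_id", "manager_hire_date"}
fds = [
    (frozenset({"department_id"}), frozenset({"department_name", "manager_id"})),
    (frozenset({"manager_id"}), frozenset({"manager_hire_date"})),
]

violations = bcnf_violations(attrs, fds)
print(violations)
```

Here the only dependency reported is the one determined by manager_id, which is not a superkey of the table.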

<b>Fourth normal form</b>

A table is in fourth normal form (4NF) if and only if, for every one of its non-trivial multivalued dependencies X →→ Y, X is a superkey; that is, X is either a candidate key or a superset thereof.[7]
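
A multivalued dependency X →→ Y holds in a set of rows when, for each value of X, the Y values and the remaining values occur in every combination. The sketch below tests that on sample data; mvd_holds and the employee/skill/language rows are invented for illustration and are not part of the original answer.

```python
from itertools import product


def mvd_holds(rows, x, y):
    """Return True if the sample rows satisfy the multivalued dependency x ->> y."""
    attrs = list(rows[0].keys())
    z = [a for a in attrs if a not in x and a not in y]
    groups = {}
    for row in rows:
        key = tuple(row[a] for a in x)
        pair = (tuple(row[a] for a in y), tuple(row[a] for a in z))
        groups.setdefault(key, set()).add(pair)
    for pairs in groups.values():
        y_values = {p[0] for p in pairs}
        z_values = {p[1] for p in pairs}
        # Every Y value must appear with every Z value for this X value.
        if pairs != set(product(y_values, z_values)):
            return False
    return True


# Hypothetical rows: skills and languages are independent facts about an employee.
rows = [
    {"employee_id": 519, "skill": "Typing", "language": "English"},
    {"employee_id": 519, "skill": "Typing", "language": "French"},
    {"employee_id": 519, "skill": "Shorthand", "language": "English"},
    {"employee_id": 519, "skill": "Shorthand", "language": "French"},
]

print(mvd_holds(rows, ["employee_id"], ["skill"]))
```

In this sample employee_id →→ skill holds, yet employee_id alone is not a superkey (it repeats on every row), so a table shaped like this would not be in 4NF.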

<b>Fifth normal form</b>

The criteria for fifth normal form (5NF, also known as PJ/NF or project-join normal form) are:

The table must be in 4NF.

Every non-trivial join dependency in the table must be implied by the candidate keys; in other words, there must be no non-trivial join dependencies that do not follow from the key constraints.
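
A join dependency says that a table can be rebuilt exactly by joining certain of its projections. The sketch below checks that for sample rows; the helper names and the agent/company/product data are made up for this example, and a check on sample rows does not establish that the dependency is a constraint on the table.

```python
def project(rows, attrs):
    """Project dict rows onto attrs, dropping duplicate results."""
    unique = {tuple(row[a] for a in attrs) for row in rows}
    return [dict(zip(attrs, values)) for values in unique]


def natural_join(left, right):
    """Join two lists of dict rows on the attribute names they share."""
    joined = []
    for l_row in left:
        for r_row in right:
            shared = l_row.keys() & r_row.keys()
            if all(l_row[a] == r_row[a] for a in shared):
                joined.append({**l_row, **r_row})
    return joined


def as_set(rows):
    """Turn dict rows into a set of hashable, order-independent tuples."""
    return {tuple(sorted(row.items())) for row in rows}


def jd_holds(rows, components):
    """Return True if joining the projections onto components rebuilds rows exactly."""
    result = project(rows, components[0])
    for attrs in components[1:]:
        result = natural_join(result, project(rows, attrs))
    return as_set(result) == as_set(rows)


def make(agent, company, product_name):
    return {"agent": agent, "company": company, "product": product_name}


components = [["agent", "company"], ["company", "product"], ["agent", "product"]]

rebuildable = [
    make("Smith", "Ford", "car"), make("Smith", "Ford", "truck"),
    make("Smith", "GM", "car"), make("Smith", "GM", "truck"),
    make("Jones", "Ford", "car"),
]
not_rebuildable = [
    make("Smith", "Ford", "car"), make("Smith", "GM", "truck"),
    make("Jones", "Ford", "truck"),
]

print(jd_holds(rebuildable, components))
print(jd_holds(not_rebuildable, components))
```

In the second data set the three-way join produces a spurious row (Smith, Ford, truck), so the join dependency does not hold there.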

<b>Domain/key normal form</b>

Domain/key normal form (or DKNF) requires that a table not be subject to any constraints other than domain constraints and key constraints.

<b>Sixth normal form</b>

A table is in sixth normal form (6NF) if and only if it satisfies no non-trivial join dependencies at all.[8] This means that the fifth normal form is also satisfied. The sixth normal form was defined only when the relational model was extended to take the temporal dimension into account. Unfortunately, most SQL technologies as of 2005 do not take this work into account, and most temporal extensions to SQL are not relational. See work by Date, Darwen and Lorentzos[9] for a relational temporal extension, Zimányi[10] for further discussion of temporal aggregation in SQL, or TSQL2 for a non-relational approach.

REWARD IF USEFUL

4 REPLIES
Former Member
0 Likes
2,059

Check the links below:

Thanks

Seshu

Former Member
0 Likes
2,059

Normalization is a design technique for structuring relational database tables. Tables can be normalized to a greater or lesser degree. The most common normal forms, from least normalized to most normalized, are as follows:

<b>

First normal form (1NF)

Second normal form (2NF)

Third normal form (3NF)

Boyce-Codd normal form (BCNF)

Fourth normal form (4NF)

Fifth normal form (5NF)

Domain/key normal form (DKNF)

Sixth normal form (6NF)</b>

Database normalization is a technique for designing relational database tables to minimize duplication of information and, in so doing, to safeguard the database against certain types of logical or structural problems.

Higher degrees of normalization typically involve more tables and create the need for a larger number of joins, which can reduce performance. Accordingly, more highly normalized tables are typically used in database applications involving many isolated transactions (e.g. an automatic teller system), while less normalized tables tend to be used in database applications that do not need to map complex relationships between data entities and data attributes (e.g. a reporting application, or a full-text search application).

<b>Problems addressed by normalization</b>

[Figure: An update anomaly. Employee 519 is shown as having different addresses on different records.]

[Figure: An insertion anomaly. Until the new faculty member is assigned to teach at least one course, his details cannot be recorded.]

[Figure: A deletion anomaly. All information about Dr. Giddens is lost when he temporarily ceases to be assigned to any courses.]

A table that is not sufficiently normalized can suffer from logical inconsistencies of various types, and from anomalies involving data operations. In such a table:

The same information can be expressed on multiple records; therefore updates to the table may result in logical inconsistencies. For example, each record in an "Employees' Skills" table might contain an Employee ID, Employee Address, and Skill; thus a change of address for a particular employee will potentially need to be applied to multiple records (one for each of his skills). If the update is not carried through successfully—if, that is, the employee's address is updated on some records but not others—then the table is left in an inconsistent state. Specifically, the table provides conflicting answers to the question of what this particular employee's address is. This phenomenon is known as an update anomaly.

There are circumstances in which certain facts cannot be recorded at all. For example, each record in a "Faculty and Their Courses" table might contain a Faculty ID, Faculty Name, Faculty Hire Date, and Course Code—thus we can record the details of any faculty member who teaches at least one course, but we cannot record the details of a newly-hired faculty member who has not yet been assigned to teach any courses. This phenomenon is known as an insertion anomaly.

There are circumstances in which the deletion of data representing certain facts necessitates the deletion of data representing completely different facts. The "Faculty and Their Courses" table described in the previous example suffers from this type of anomaly, for if a faculty member temporarily ceases to be assigned to any courses, we must delete the last of the records on which that faculty member appears. This phenomenon is known as a deletion anomaly.
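
The update anomaly from the first item above is easy to reproduce on a couple of rows. This is only a sketch with invented addresses: an update that reaches one of the employee's records but not the other leaves the table giving two answers to the same question.

```python
# Hypothetical "Employees' Skills" rows: the address is repeated once per skill.
employee_skills = [
    {"employee_id": 519, "address": "12 Oak St", "skill": "Typing"},
    {"employee_id": 519, "address": "12 Oak St", "skill": "Shorthand"},
]

# An incomplete update: only the first matching record is changed.
employee_skills[0]["address"] = "3 Pine Ave"

# Ask the table for employee 519's address.
addresses = {
    row["address"] for row in employee_skills if row["employee_id"] == 519
}
print(addresses)  # two conflicting addresses for one employee
```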

Ideally, a relational database table should be designed in such a way as to exclude the possibility of update, insertion, and deletion anomalies. The normal forms of relational database theory provide guidelines for deciding whether a particular design will be vulnerable to such anomalies. It is possible to correct an unnormalized design so as to make it adhere to the demands of the normal forms: this is called normalization.

Normalization typically involves decomposing an unnormalized table into two or more tables that, were they to be combined (joined), would convey exactly the same information as the original table.
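
A minimal sketch of such a decomposition, using the "Employees' Skills" table with made-up rows: the table is split in two, and joining the parts on the employee ID gives back exactly the original rows.

```python
# Hypothetical rows: (employee_id, address, skill)
employee_skills = [
    (519, "12 Oak St", "Typing"),
    (519, "12 Oak St", "Shorthand"),
    (520, "9 Elm Rd", "Typing"),
]

# Decompose: one table for employee facts, one for employee-skill pairs.
employees = {(emp_id, address) for emp_id, address, _ in employee_skills}
skills = {(emp_id, skill) for emp_id, _, skill in employee_skills}

# Join the two tables back together on employee ID.
rejoined = {
    (emp_id, address, skill)
    for emp_id, address in employees
    for skill_emp_id, skill in skills
    if emp_id == skill_emp_id
}

print(rejoined == set(employee_skills))
print(len(employees))  # each address is now stored once per employee
```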

<b>First Normal Form (1NF)</b>

First normal form (1NF) sets the very basic rules for an organized database:

Eliminate duplicative columns from the same table.

Create separate tables for each group of related data and identify each row with a unique column or set of columns (the primary key).

<b>Second Normal Form (2NF)</b>

Second normal form (2NF) further addresses the concept of removing duplicative data:

Meet all the requirements of the first normal form.

Remove subsets of data that apply to multiple rows of a table and place them in separate tables.

Create relationships between these new tables and their predecessors through the use of foreign keys.
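
The steps above can be tried out with Python's built-in sqlite3 module. This is a sketch under assumptions: the table and column names are invented for the example, and it relies on an SQLite build with foreign key support, which has to be switched on per connection.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# SQLite enforces foreign keys only when this pragma is enabled.
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
    CREATE TABLE employees (
        employee_id INTEGER PRIMARY KEY,
        address     TEXT NOT NULL
    );
    CREATE TABLE employee_skills (
        employee_id INTEGER NOT NULL REFERENCES employees(employee_id),
        skill       TEXT NOT NULL,
        PRIMARY KEY (employee_id, skill)
    );
""")

# The address is stored once; the skills table refers to it by key.
conn.execute("INSERT INTO employees VALUES (519, '12 Oak St')")
conn.executemany(
    "INSERT INTO employee_skills VALUES (?, ?)",
    [(519, "Typing"), (519, "Shorthand")],
)

# A skill row for an unknown employee should be rejected by the foreign key.
try:
    conn.execute("INSERT INTO employee_skills VALUES (999, 'Typing')")
    orphan_rejected = False
except sqlite3.IntegrityError:
    orphan_rejected = True

skill_count = conn.execute(
    "SELECT COUNT(*) FROM employee_skills WHERE employee_id = 519"
).fetchone()[0]
print(orphan_rejected, skill_count)
```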

<b>Third Normal Form (3NF)</b>

Third normal form (3NF) goes one large step further:

Meet all the requirements of the second normal form.

Remove columns that are not directly dependent on the primary key, that is, columns whose values are determined by another non-key column.

<b>Fourth Normal Form (4NF)</b>

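Finally, fourth normal form (4NF) has one additional requirement:

Meet all the requirements of the third normal form (strictly speaking, of Boyce-Codd normal form).

A relation is in 4NF if it has no non-trivial multi-valued dependencies other than those whose determinant is a superkey.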

Remember, these normalization guidelines are cumulative. For a database to be in 2NF, it must first fulfill all the criteria of a 1NF database.

<b>

Hope this is helpful.

Do reward</b>

Message was edited by:

Runal Singh

Former Member
0 Likes
2,059

Refer to these links:

<a href="http://databases.about.com/od/specificproducts/a/normalization.htm">link1</a>

<a href="http://www.devshed.com/c/a/Administration/Database-Normalization/">link2</a>

<a href="http://www.rsolutions.net/RSweb/Normalization/">link3</a>

<a href="http://www.agiledata.org/essays/dataNormalization.html">link4</a>

regards,

srinivas