What are the Database Modeling Techniques
?
There are 3 levels of data Design
Techniques
1. Conceptual Data Design
2. Logical Data Design
3. Physical Data Design
Conceptual Data Design
A conceptual data model identifies the highest-level
relationships between the different entities in the system (
An entity is a thing that has a distinct, separate existence
for which data is collected and maintained).
A conceptual model may include a few significant attributes
to augment the definition and visualization of entities. No
effort need be made to inventory the full attribute population
of such a model. A conceptual model may have some identifying
concepts or candidate keys noted but it explicitly does not
include a complete scheme of identity, since identifiers are
logical choices made from a deeper context.
A conceptual database model is like a
blueprint.The graphical format of the model allows the end
user to represent the information needs of the application
without worrying about technicalities such as programming
languages, field sizes, and layout of tables.
What is an ER Model ?
The Entity-Relationship Model is a conceptual
data model that views the real world as consisting of entities
and relationships.
Entities
Entities are the principal data object about
which information is to be collected. Entities are usually
recognizable concepts, either concrete or abstract, such as
person, places, things, or events which have relevance to the
database. Some specific examples of entities are EMPLOYEES,
PROJECTS, INVOICES. An entity is analogous to a table in the
relational model.
Entities are classified as independent or
dependent (in some methodologies, the terms used are strong
and weak, respectively). An independent entity is one that
does not rely on another for identification. A dependent
entity is one that relies on another for identification.
An entity occurrence (also called an instance)
is an individual occurrence of an entity. An occurrence is
analogous to a row in the relational table.
Attributes
Attributes describe the entity of which they
are associated. A particular instance of an attribute is a
value. For example, "John Miller" is one value of the
attribute Name. Attributes can be classified as identifiers or
descriptors. Identifiers, more commonly called keys, uniquely
identify an instance of an entity. A descriptor describes a
non-unique characteristic of an entity instance
Relationships
A Relationship represents an association
between two or more entities. An example of a relationship
would be: employees are assigned to projects projects have
Tasks departments manage one or more projects.
Relationships are classified in terms of
degree, connectivity, cardinality, and
existence.
Classifying Relationships
Relationships are classified by their degree,
connectivity, cardinality, direction, type, and existence. Not
all modeling methodologies use all these
classifications.
The degree of a relationship is the number of
entities associated with the relationship.
The connectivity of a relationship describes
the mapping of associated entity instances in the
relationship. The values of connectivity are "one" or "many".
The cardinality of a relationship is the actual number of
related occurences for each of the two entities. The basic
types of connectivity for relations are: one-to-one,
one-to-many, and many-to-many.
A one-to-one (1:1) relationship is when at
most one instance of a entity A is associated with one
instance of entity B. For example, "employees in the company
are each assigned their own office. For each employee there
exists a unique office and for each office there exists a
unique employee.
A one-to-many (1:N) relationships is when for
one instance of entity A, there are zero, one, or many
instances of entity B, but for one instance of entity B, there
is only one instance of entity A. An example of a 1:N
relationships is -a department has many employees each
employee is assigned to one department
A many-to-many (M:N) relationship, sometimes
called non-specific, is when for one instance of entity A,
there are zero, one, or many instances of entity B and for one
instance of entity B there are zero, one, or many instances of
entity A. An example is: employees can be assigned to no more
than two projects at the same time
Direction:The direction of a relationship is
determined by its connectivity. In a one-to-one relationship
the direction is from the independent entity to a dependent
entity. If both entities are independent, the direction is
arbitrary. With one-to-many relationships, the entity
occurring once is the parent. The direction of many-to-many
relationships is arbitrary.
Type: An identifying relationship is one in
which one of the child entities is also a dependent entity. A
non-identifying relationship is one in which both entities are
independent.
Existence: Existence denotes whether the
existence of an entity instance is dependent upon the
existence of another, related, entity instance. The existence
of an entity in a relationship is defined as either mandatory
or optional. If an instance of an entity must always occur for
an entity to be included in a relationship, then it is
mandatory. An example of mandatory existence is the statement
"every project must be managed by a single department". If the
instance of the entity is not required, it is optional. An
example of optional existence is the statement, "employees may
be assigned to work on projects".
ER Notation
Each modeling methodology uses its own
notation.
All notational styles represent entities as
rectangular boxes and relationships as lines connecting boxes.
Each style uses a special set of symbols to represent the
cardinality of a connection.
The symbols used for the basic ER constructs
are: entities are represented by labeled rectangles. The label
is the name of the entity. Entity names should be singular
nouns. relationships are represented by a solid line
connecting two entities. The name of the relationship is
written above the line. Relationship names should be verbs.
attributes, when included, are listed inside the entity
rectangle. Attributes which are identifiers are underlined.
Attribute names should be singular nouns. cardinality of many
is represented by a line ending in a crow's foot. If the
crow's foot is omitted, the cardinality is one. existence is
represented by placing a circle or a perpendicular bar on the
line. Mandatory existence is shown by the bar next to the
entity for an instance is required. Optional existence is
shown by placing a circle next to theentity that is
optional
Logical Data Design
A logical data model describes the data in as
much detail as possible, without regard to how they will be
physical implemented in the database. Features of a logical
data model include:
1.Define all Entities and Relationships among
them.
2.Specify All attributes for Each entity.
3.Define The primary key for each entity .
4.Identify and Define keys representing the
relationship between different entities 5.Normalize the
Datbase
In a conceptual data
model, the relationships are stated, not specified, we just
know that two entities are related, but we do not specify what
attributes are used for this relationship.
What is Normalization?
Normalization is the process of efficiently
organizing data in a database.
Database normalization can essentially be
defined as the practice of optimizing table structures.
Optimization is accomplished as a result of a thorough
investigation of the various pieces of data that will be
stored within the database, in particular concentrating upon
how this data is interrelated. An analysis of this data and
its corresponding relationships is advantageous because it can
result both in a substantial improvement in the speed in which
the tables are queried, and in decreasing the chance that the
database integrity could be compromised due to tedious
maintenance procedures.
The Normal
Forms
First Normal Form (1NF).
First normal form (1NF) sets the very basic
rules for an organized database: Eliminate duplicative columns
from the same table. Create separate tables for each group of
related data and identify each row with a unique column or set
of columns (the primary key).
Second Normal Form (2NF)
Second normal form (2NF) further addresses the
concept of removing duplicative data: Meet all the
requirements of the first normal form. Remove subsets of data
that apply to multiple rows of a table and place them in
separate tables. Create relationships between these new tables
and their predecessors through the use of foreign keys
Third Normal Form (3NF)
Third normal form (3NF) goes one large step
further: Meet all the requirements of the second normal form.
Remove columns that are not dependent upon the primary key.
Fourth Normal Form (4NF)
Finally, fourth normal form (4NF) has one additional
requirement: Meet all the requirements of the third normal
form. A relation is in 4NF if it has no multi-valued
dependencies
Physical Data Design
Physical data model represents how the model
will be built in the database. A physical database model shows
all table structures, including column name, column data type,
column constraints, primary key, foreign key, and
relationships between tables.
Features of a physical data model include:
1.Specification all tables and columns.
2.Foreign keys are used to identify
relationships between tables. 3.Denormalization may occur
based on user requirements.
4.Physical considerations may cause the
physical data model to be quite different from the logical
data model.
5.Physical data model will be different for
different RDBMS. For example, data type for a column may be
different between Oracle and SQL Server