---[REF]---"style":"heading"---[REF]--- ---[SIDEBAR]--- "type":"objectives", "title":"Building Database Tables", "text":"After completing this lesson, the successful student will be able to:
  1. Describe the steps in designing database tables.
  2. Describe the business domain model of a proposed database.
  3. Understand how to create a database model.
  4. Understand basic syntax to build database tables and insert data into a table
" ---[SIDEBAR]--- ## Designing Database Tables In order to effectively interact with a database, you first need to know how the data in the database are organized. To understand the organization of data in a database, it is helpful to know the basic principles of database modeling and design. Because the focus on this book is not the nuances of database modeling and design, the design topics covered in this book provide just enough details on database design to help you interact with databases. This lesson is a reflection of that. It is not a complete treatment of database design. It focusses on the design of individual database tables. ---[REF]---"style":"link","pageId":"lesson-joining-tables"---[REF]--- describes how to model the relationships among database tables. ---[REF]---"style":"link","pageId":"lesson-normalization"---[REF]--- covers the rules to consider to optimally design a database to reduce or eliminate errors in the data stored in a database. This lesson will describe two important ways to model data: the business model and the database model. Additionally, it will describe how to implement a database model into database tables and rows of data in those tables. Understanding these concepts will help you to write SQL statements to effectively retrieve and manipulate the data in a database. ## The Business Model Before describing what is meant by the business model of a database, it is important to describe what is meant by business and model. Business is meant in the most generic of ways. For the purposes of this discussion, the business is the organization or individual that has a data need. Business is an appropriate term, because it implies that the data will be put into action to complete a user or organizational task. When creating a business model, you must first understand the need for the database and the tasks that will be performed with the data. ---[SIDEBAR]--- "type":"term", "title":"Business", "text":"An organization or individual that has a data need." ---[SIDEBAR]--- It is equally important to understand the second term, model. A model is a representation of something else, generally in a scaled or simplified form. In general, the more complex a model is, the better it approximates what it represents. Of course, increasing complexity comes at an increased cost. Because of this, care must be taken when creating a model that it be just complex enough to serve the purposes of the model. ---[SIDEBAR]--- "type":"term", "title":"Model", "text":"A representation of something else, generally in a scaled or simplified form." ---[SIDEBAR]--- Two elements of a model are important to determining how complex a model should be. First, you should consider the relevance of various aspects of a model. Something is relevant if it is needed or helpful for completing the intended use of the model. For example, if you were building a model of a person, you could include any number of characteristics of people (hair color, blood type, height, weight, IQ, and more). If you are using the model to determine which people to admit to an MBA program, only a few characteristics will actually be relevant (such as grade point average and GMAT scores). If the model is used to recruit players for a university soccer team, other characteristics will be relevant. In general, only relevant characteristics should be included in a model. ---[SIDEBAR]--- "type":"term", "title":"Relevance", "text":"A characteristic that is needed or helpful for completing the intended use of the model." ---[SIDEBAR]--- Second, you should consider the granularity of the model. Granularity refers to the level of detail of the model. As with relevance, granularity is best understood in the context of the task to be performed by the model. If you are using a model to track the monthly sales of a company, you probably only need to see the monthly sales totals for the organization. On the other hand, if you are trying to determine which times of the day are most profitable for the company, you would need to track hourly sales totals (or even individual transactions). Hourly sales totals are more detailed (or more granular) than monthly totals. ---[SIDEBAR]--- "type":"term", "title":"Granularity", "text":"The level of detail of the model." ---[SIDEBAR]--- When constructing a business model of a database, you are concerned with understanding the business needs for the database and the tasks that the database will support. In constructing the model, you must determine what characteristics of the business and tasks are relevant. You must also determine the level of detail or granularity of the data to be stored. This is best done by talking to the people who will use the database to support their work. ### Building a database diagram One important tool available to SQL programmers is a database diagram. This diagram depicts entities, attributes of entities, and the relationships among the entities. It is a graphical depiction of the data that are stored in the database and is helpful in writing queries of the database. Simply put, an entity is a thing. In the context of a business model of a database, an entity is something that you want to store information about. An attribute is a characteristic of an entity. Finally, a relationship describes how entities are associated. A business model is often the starting point for a database diagram, because the business model describes the business entities to be represented in the database, the attributes of those entities and the relationships among the business entities. An example database diagram is depicted as a diagram (see ---[database-diagram-format]---). ---[SIDEBAR]--- "type":"term", "title":"Entity", "text":"Something that you want to store information about." ---[SIDEBAR]--- ---[SIDEBAR]--- "type":"term", "title":"Attribute", "text":"A characteristic of an entity." ---[SIDEBAR]--- ---[SIDEBAR]--- "type":"term", "title":"Relationship", "text":"The associations among entities." ---[SIDEBAR]--- ---[FIGURE]--- "id":"database-diagram-format", "title":"Database Diagram Format", "url":"https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhSaQHsExG3hTy8vXdN87TCJ9GSIqNHz5VJkrL--ub82t7OPbFuCZtVEbdV4s1rdlbNrQepRknIq6unO9FWctMOqwcHhPwVnX_GqjajEG31yXyqrI_QpI5wAbxNfpKfFayNscbZU2i-WrpEv3QPl-HnxpteFacHlxqM1EiYyKLwCb6ng3pQRsH7l5OiJwEp/w640-h258/DatabaseDiagramFormat.png", "altText": "The format for a database diagram.", "imgHandling": "scale" ---[FIGURE]--- Entities are represented by rectangles in a database model diagram. Each entity is named. The name of each entity appears at the top of the rectangle representing that entity. The attributes are also named. They appear within the rectangle that represents that entity under the name of the entity. The relationships are depicted as lines connecting the entities that are related. ### The Database Diagram of the Museum Database The lessons in this resource will use a database that stores information about museums, the artwork exhibited in those museums, and the artists who created the works of art. A database diagram can be built to depict this database. This diagram would depict the aspects of museums, artists, and works of art that are relevant and at the correct level of granularity to support the tasks to be performed with the database. After meeting with the expected users of the database you have determined that the database needs to include: * The name of the artists * The name of the museums * The titles of the works of art * The preferred style of each artist * The style of the works of art * The location of the museums * The nationality of the artists * The phone number and website of the museums * The birth and death years of the artists The first step in creating the business model of this data is to identify the themes in what the database needs to include. In this case, it appears that the database needs to include information about artists, works of art, and museums. Each of these themes will be a different entity in the model. ---[museum-diagram-no-attributes]--- shows the database diagram with these entities. ---[FIGURE]--- "id":"museum-diagram-no-attributes", "title":"Simple Museum Database Diagram", "url":"https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEiyjmoaTKSOGiIHPJXOuACrD7CV7QXDNzG0QaaPmcLln5qcQBe4oKUW9EOQR3era1_NxMW7O9UyLGtoU6L3HzUJpkrhmuO60eN9gp9TarreA8QsvEusnVOhA0z3Q2w8cB2AJGA1M9dc9T3-gpT2lFtw605YTmf_cW2fKjofkWYSoXgcItjybGjcNB000klv/w640-h148/DatabaseDiagramNoAttributes.png", "altText": "Simple museum database diagram.", "imgHandling": "scale" ---[FIGURE]--- After determining the entities for the model, the next step involves determining the attributes for each entity. The list of data needs for the museum database can be used to determine the attributes of the artist, work, and museum entities. The attributes of each might include: Artist attributes: * Name (the name of the artists) * Style (the preferred style of each artist) * Nationality (the nationality of the artists) * Birth and Death (the birth and death years of the artists) Work attributes: * Name (the titles of the works of art) * Style (the style of the works of art) Museum attributes: * Name (the name of the museums) * Location (the location of the museums) * Contact Information (the phone number and website of the museums) For simplicity, you should create a descriptive name for each attribute, as in the previous lists, when you add it to the diagram. ---[museum-diagram-attributes]--- shows the database diagram with the attributes of each entity. ---[FIGURE]--- "id":"museum-diagram-attributes", "title":"Museum Database Diagram with Attributes", "url":"https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgzn_nXa6x35UA-9FiXAbuiZTnw0mT5Wjxh4I2TuTdhqYQlqR4iiWNdygpTsMZ6iUpClk4OA-UlyMo2NzAvgHD1cwJYp6eITqExpdR2AjnnAfxO7ko6f3uXnt68poqkzlqpjplNWl1IAmbzEeiBQP36-uCHJTP64cCYOkefRO0O1gKvw1v1RCzK-l_4FKas/w640-h154/DatabaseDiagramAttributes.png", "altText": "Simple museum database diagram with attributes.", "imgHandling": "scale" ---[FIGURE]--- The final step in the process involves modeling the relationships among the entities. Relationships are discussed in detail in ---[REF]---"style":"link","pageId":"lesson-joining-tables"---[REF]---. Notice that relationships are included in ---[museum-diagram-attributes]---. In this model, the artist and work entities are related. This makes sense. Artists are responsible for creating works. The work and museum entities are also related. Again this makes sense. Museums contain works of art. As you can see, the database diagram tells you a lot about the database. It shows the major themes (entities) in the data and the individual attributes of each entity. It also describes how the entities relate to each other. It can be developed further to contain enough technical detail to serve as the blueprint of a database. ## The Database Model At this point in the modeling process, the business model is complete. It can be used as the basis for creating the database model. The business model provides a conceptual blueprint for the data, but it is not sufficient to be used as a blueprint for the database. To develop the model further to the point that it can be used as the blueprint of a database, you must first decide the type of database you want to use. There are a number of types of databases. These include relational, hierarchical, network, and other types of databases. The relational database model is used in this book. A complete discussion of the types of databases not provided in these lessons. A relational database stores data in two-dimensional tables as described in ---[REF]---"style":"link","pageId":"lesson-understanding-sql"---[REF]---. A database diagram can be easily used as the basis for a relational database. Each of the entities in the business model will become the tables of the relational database and the attributes become the columns of these tables. The requirements of a relational database are not completely filled with a business model. In particular, you must ensure that the columns represented in the model are atomic, that a primary key has been designated, and the domain of the columns are defined. ### Atomic Columns First, the columns in a relational database should be atomic. This means that they need to be decomposed to their smallest useful form. For example, the column location in the museum table is not atomic. In includes the street address, city, state, country, and postal code of each museum. Including all of this information in one column is problematic. The data is not consistent across the museums (for example, not all countries have states). It would be also difficult to sort or filter the data by an element of the data, like country. To fix this problem, location should be split into multiple columns that are each atomic. In this example, location should be split into columns for address, city, state, country, and postal code. Many of the attributes in the business model diagram of the museum database are not atomic. The attributes for each entity are listed below, and rated whether they are atomic or not. If they are not atomic, suggestions are made to make them atomic. ---[SIDEBAR]--- "type":"term", "title":"Atomic", "text":"A value that is decomposed to its smallest useful form." ---[SIDEBAR]--- Artist attributes: * Name: not atomic (split to first_name, last_name, and middle_name columns) * Style: atomic * Nationality: atomic * Birth and Death: not atomic (split to birth and death columns) Work attributes: * Name: atomic (the title of a work is in the smallest useful form) * Style: atomic Museum attributes: * Name: atomic * Location: not atomic (split to address, city, state, country, and postal_code columns) * Contact Information: not atomic (split to phone, and website columns) The database diagram should be updated to reflect these changes. ---[museum-diagram-atomic-attributes]--- shows the updated diagram. ---[FIGURE]--- "id":"museum-diagram-atomic-attributes", "title":"Museum Database Diagram with Atomic Attributes", "url":"https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjBsc7u4eu5k3ULi-MOIJ3fbv91oncU8KDFHhWec8TTgxJLvgjYxGe9WLLIMYPMg9mt_h-OC36-j96yepUmgGKXKg8g8h9ooojs5fwanUhFiIEIXEcqY3bj7h9BRHpZfUj_v67iCQNKyeVqg8YgXrImTY76Wpg7tYkj9Yzyzj7skgZxMR5snKjlSQqc3mqz/w640-h240/DatabaseDiagramAtomicAttributes.png", "altText": "Museum database diagram with atomic attributes.", "imgHandling": "scale" ---[FIGURE]--- ### Primary Keys Each table in a relational database should also have a primary key. The primary key is a column that keeps each of the rows of the table unique. The values in the primary key column must be different for each row of the table, cannot be blank, and should not change. In some tables, one or more exist columns might meet these criteria and qualify as a candidate key. A candidate key is an existing column of a database table that could be the primary key (it is unique, does not contain blank values, and will not change). If a table has a candidate key, it could be designed as the primary key for the table. ---[SIDEBAR]--- "type":"term", "title":"Candidate Key", "text":"An existing column of a database table that could be the primary key." ---[SIDEBAR]--- Often there is not an exist column on a table that is a candidate key. Examine the columns of the artist table in the museum database. None of the existing columns of the table are candidate keys, because none of them are unique. Take the first name of the artists. More than one artist could have the same first name. The first name column is not a good candidate to be the primary key. It is equally unlikely that any of the columns of the artist table will be unique. ---[SIDEBAR]--- "type":"note", "title":"Choosing Primary Keys Carefully", "text":"This raises an important point. It may well be that all of the artists that are currently in the database have a different first name. This means that the first name could be used at the primary key now. However, additional artists might be added to the database in the future. If a new artist has the same first name as one of the existing artists, the values for first name would no longer be unique. Whenever possible it is best to design the database anticipating the implications of adding new data." ---[SIDEBAR]--- When a table does not have a candidate key, a surrogate key is created to be the primary key. A surrogate key is a column that is added to a table for the sole purpose of becoming the primary key. The values for a surrogate key don’t describe a meaningful attribute of the entity. They just identify instances or rows for the table. A surrogate key, artist_id could be added to the artist table to be the primary key. The values for artist_id would be numbers designed to each row of the table to identify each artist. ---[SIDEBAR]--- "type":"term", "title":"Surrogate Key", "text":"A column that is added to a table for the sole purpose of becoming the primary key. The values for a surrogate key don’t describe a meaningful attribute of the entity." ---[SIDEBAR]--- None of the three tables in the museum database have a candidate key. Identifiers (surrogate keys) should be added to each table to become the primary key. ---[museum-diagram-primary-keys]--- shows the database diagram with the primary keys added. ---[FIGURE]--- "id":"museum-diagram-primary-keys", "title":"Museum Database Diagram with Primary Keys", "url":"https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgd2t_ecxgQ0b9y5UmyHh1kF7XbK7ps_K1qzGua01qEeiU_4paleHoZzG0V57E6V8cDPOvRwXThs4hgY0tZy1K8EBc__v7LVukSKve1vggG_HvofsfOEIfXitLt3L5TfGuszVIdikDz1kRMp1441nfZBF8fnXYEhLWLE9SmHLXZyqqPjEXuHul0I5zQq2Tp/w640-h266/DatabaseDiagramPrimaryKeys.png", "altText": "Museum database diagram with primary keys.", "imgHandling": "scale" ---[FIGURE]--- At this point, you can consider the database diagram to be nearly complete, because you will not add any new tables, columns, or relationships to the database until you examine the relationships among the tables in the database. You will not examine these relationships in detail until ---[REF]---"style":"link","pageId":"lesson-joining-tables"---[REF]---. At that time, foreign keys will be added to some of the tables to make it easier to link data one different tables together. ### Defining Column Domains Although the database diagram is nearly complete, an important design element must still be completed before you are ready to create database tables and add data to those tables. The domain of each column in the tables in the database need must defined. The domain of a column is the set of acceptable values that can be stored in the database for that column. ---[SIDEBAR]--- "type":"term", "title":"Domain", "text":"The set of acceptable values that can be stored in a database column." ---[SIDEBAR]--- One important aspect of domain is the datatype of the column. Datatype refers to kind of data that can be stored in a column. Common data types are characters (text), numbers, as well as date and time. Depending on the type of database you use, there can be more than a dozen data types to choose from. When the datatype of a column is set, only values that fit that type can be stored in that column. This is helpful, because it prevents data that does not belong in a particular column from being stored in that column. Once the datatype is defined, additional constraints can be placed on the column to more completely define the data to be stored in the column. For example, the first_name column on the artist table would have a character datatype. You might further constrain the values in the column by adding a limit to the number of characters that can be stored in a single column. Columns that are defined with one of the number data types can be further constrained in a few ways. You can set it to store only whole numbers (integers), limit the number of decimal places in non-integer numbers, or prohibit the storage of certain ranges of values. All columns can be constrained to prohibit the storage of blank values. A column can also be set to allow only unique values. A more complete discussion of data types and the constraints that can be added to database columns is presented in ---[REF]---"style":"link","pageId":"lesson-creating-tables"---[REF]---. The domains for each column in the museum database are listed below: Artist attributes: * Artist_id: Integer * First_name: Text data limited to 20 characters. May not be blank * Last_name: Text data limited to 20 characters. * Middle_name: Text data limited to 20 characters. May not be blank * Style: Text data limited to 25 characters. * Nationality: Text data limited to 25 characters. May not be blank * Birth: Integer * Death: Integer Work attributes: * Work_id: Integer * Name: Text data limited to 50 characters. May not be blank * Style: Text data limited to 25 characters. Museum attributes: * Museum_id: Integer * Name: Text data limited to 50 characters. May not be blank * Address: Text data limited to 50 characters. May not be blank * City: Text data limited to 25 characters. May not be blank * State: Text data limited to 25 characters. * Country: Text data limited to 25 characters. * Postal_code: Text data limited to 10 characters. * Phone: Text data limited to 12 characters. * URL: Text data limited to 50 characters. ### Documenting the database design Adding the domain and constraint information for each column to the database diagram would make the diagram unnecessarily complex and more difficult to use for creating queries. A text-based modeling tool is used to document domain and constraint details (see below). This model begins with a list of the tables in the database. The columns for each the table are listed in parentheses with the datatype and additional constraints for each. Primary and foreign keys are designated. The model for the museum database is listed below: Artist(   Artist_id Integer Primary Key,   First_name Text,   Last_name Text,   Middle_name Text,   Style Text,   Nationality Text,   Birth Integer,   Death Integer ) Work(   Work_id Integer Primary Key,   Name Text,   Style ) Museum(   Museum_id Integer Primary Key,   Name Text,   Address Text,   City Text,   State Text,   Country Text,   Postal_code Text,   Phone Text,   URL Text ) As mentioned above, foreign keys should also be designated. Recall that foreign keys are used to connect tables together. ---[REF]---"style":"link","pageId":"lesson-table-relationships"---[REF]--- discusses the relationships between tables. The text-based design for the museum database will be completed as part of that discussion. ## Building Database Tables At this point, the model is detailed enough that it can be used to build the database tables and add some data to the tables. The format of the schema is very similar to the SQL statements used to create tables. The syntax for creating the Artist table is below. CREATE TABLE Artist(   Artist_id INT PRIMARY KEY,   First_name VARCHAR(20),   Last_name VARCHAR(20),   Middle_name VARCHAR(20),   Style VARCHAR(40),   Nationality VARCHAR(40),   Birth INT,   Death INT ); The syntax for creating a table uses the keywords CREATE and TABLE and then the name of the table. After the table name, each column in the table is listed with its datatype and any constraints added to the columns. Keywords are used to define the datatypes and column constraints. Aside from the keywords, the SQL statement used to create a table is the same as the model. A detailed discussion of this syntax is presented in ---[REF]---"style":"link","pageId":"lesson-creating-tables"---[REF]---. The artist table is now ready for data. Suppose you want to add a row to the database for the artist Pierre-Auguste Renoir. The values for each attribute of Renior is below:
Artist_id First_name Last_name Middle_name Style Nationality Birth Death
500 Pierre Renoir Auguste Impressionist French 1841 1919
The SQL statement used to add the row for Renior to the database is: INSERT INTO artist (Artist_id, First_name, Last_name, Middle_name, Style, Nationality, Birth, Death) VALUES (500, Pierre, Renoir, Auguste, Impressionist, French, 1841, 1919) Notice the keywords INSERT and INTO then the name of the table, artist. The columns you will add a value for are then listed in parentheses. Next the keyword VALUES is inserts. Finally, the values to be added to the table are listed in parentheses. Again, the SQL statements for inserting new rows into a database table will be discussed in ---[REF]---"style":"link","pageId":"lesson-creating-tables"---[REF]---. ## Summary This lesson described the process used to model and design a database table. First a model of the business domain is developed. This model serves as the starting point for a database diagram. Once the business domain is understood, it is time to select the database platform to be used. With the database platform in mind, the business domain model is further developed into a database model that ensures the atomicity of column values and the designation of primary and foreign keys. Before the database can be built, the domain of the columns must be defined. The lesson concludes with by presenting the SQL statements used to create a table and add a new row to the table.