# Databases and SQL ---[SIDEBAR]--- "type":"objectives", "title":"Databases and SQL", "text":"After completing this lesson, you can:
  1. Describe the building blocks of a database
  2. Explain what Structured Query Language (SQL) is and why it is used.
" ---[SIDEBAR]--- Without knowing it, you use databases all the time. Databases are a key component of almost every computer-based system. When you you check a bank account balance on your phone, you are using a database to display your account information. When you make a purchase online, your purchase information is recorded in a database. When you perform and Internet search to find a new restaurant to visit for dinner, you are accessing data stored in a database. In truth, most websites and mobile apps rely on databases. Although you frequently interact with databases, you are not alone if you don't really understand what a database is or how it works. Like most technologies, you don't need to completely understand how a database works to benefit from using one. However, the fact that you are using this resource means that you want to understand more about how databases work so that you can interact with them in more advanced ways. In this lesson, you will learn what a database is and how to interact with them. In particular, the lesson will describe the building blocks of a database. It will also describe the SQL programming language used to retrieve data from and manipulate data in a database. The remaining lessons will dive deeply into various aspects SQL to describe how to interact with databases in powerful ways. ## What is a database? There is a bit of confusion about just what a database is. Often a database is mistaken for the software used to create and manage it. The software used to create and manage a database is called a database management system (DBMS). Examples of popular DBMS software include Oracle DB, Microsoft SQL Server, MySQL, and SQLite. The relationships among a database, the DBMS, and the applications that use the data are depicted on ---[DBDBMSAPP]---. Data is stored in the database. The DBMS creates and manages the database. It also manages requests for data made by applications or users. Applications and users can access the data stored in the database, but not directly. Requests to access or manipulate the data stored in a database are made through the DBMS. ---[FIGURE]--- "id":"DBDBMSAPP", "title":"Database, DBMS, and Applications", "url":"https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhzpb8P1qg72OHxej0PZ3ieLcYy9A7xAEF3NViP9AFjXVTk0YUMXu718a2aZZKdaz8rt4-aDmuGeCVjCQT4WqqyQbyGbwu7c6zvKunV9-bKPbuwEySdfRV9QMslgQwgSPb0E0IWFbWet9npRvihk69YdKobN2mXCU-uXe-UMWsoQfcr_VILUm9otdY9KLVu/s800/DataDBMSApps.png", "altText": "The relationships among a database, DBMS, and applications." ---[FIGURE]--- A database is collection of data stored in an organized way. Care is taken to properly organize data in a database. A well organized database is easier to use and is more effective than a poorly organized database. While just how to design and organize a database is beyond the scope of this resource, it is important to understand the basic building blocks of a database to be able to effectively interact with the data stored in the database. ---[SIDEBAR]--- "type":"term", "title":"Database", "text":"A collection of data stored in an organized way." ---[SIDEBAR]--- ### The building blocks of a database The most common way to organize data in a database is the relational model. In a relational database, data are separated and stored in two-dimensional tables. Tables contain rows and columns. Each of these building blocks are described in the next few sections. #### Tables The information in a database is organized into tables. A table contains data about persons, places, things, or ideas. Each table will store information about a single theme. For example, a database may contain information about the artists and the museums that showcase their art. An artist and a museum are two very different things, so the data stored about each would also be very different. ---[SIDEBAR]--- "type":"term", "title":"Table", "text":"A two-dimensional way to store data in a database." ---[SIDEBAR]--- Information about each different theme will be stored on separate tables. At a minimum this artist and museum database would have two tables, one for the artist information and one for the museum information (though there would likely be other tables to further organize the data in the database). All of the artist data would be stored on the artist table. Likewise, all of the museum data would be stored on the museum table. Generally, a database table contains little or no data about a different theme. This means that you would not expect to see artist information on the museum table or the other way around. ---[artist_table]--- shows an example of an artist table. ---[FIGURE]--- "id":"artist_table", "title":"Example Artist Table", "url":"https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhEvUTYVTJVnMc2coAWFjAga2rAK82Gua1cVCONX9kxCVCueexmayShHHnu3q4X-U2lWiqpT05lj16Ftt8UjQt89I5lxXHgLEXftiYp0b9fZeZMBvHSMZGCdKadrmvYek5v7EHNgkJfcVAJekJLd277HAlTopbYZGvhx-9L_9gj5-L26M68K0nqt_FwjrSl/s1219/ArtistTableData.png", "altText": "Example data from the artist table." ---[FIGURE]--- Database tables are given a name. A table name should describe the data stored in the table. The name will also be unique for each table to identify them from the other tables in the database. No two tables in a database can have the same name. A database table is two-dimensional, much like a worksheet in a spreadsheet. It contains both columns and rows. The columns and rows of a database table are structured in a standard way. The columns and rows of a database table are structured to make it easy to add, edit, and retrieve data in the table. #### Rows The rows of a table contain information about the specific instances of the theme of the table. This means that the rows of the artist table would contain all of the data stored about each artist. Each row contains the data for only one instance. For example, the artist table might contain data on James Ensor (see ---[table_row]---). All information about Ensor would be stored on a single row of the artist table. Additionally, this row would not contain information about any other artist. Because of this, every row on a database table will be unique. ---[SIDEBAR]--- "type":"term", "title":"Row", "text":"Contains information about each instance of the items stored in a table." ---[SIDEBAR]--- ---[FIGURE]--- "id":"table_row", "title":"Artist Table Row for James Ensor", "url":"https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgnZ7-XaKLzMCq3YovlDQpokk_PI91eNyqhbUZBxRktoC4xeGCEYtpmdmLvcZyZAGKgqsgx5A1EVfbS1xQwswh-dzYlVAJU7BkgdKVyWIXOULCwSvTaUPPdoYPoRi5ADmAhjMaEb8Fan8qDMX1qUPA-3k5y6Er-1k6mcoTMYMHRhOhu2xDmDgQL7VaegV9r/s1219/ArtistTableRow.png", "altText": "All data about James Ensor on a single row of the table." ---[FIGURE]--- #### Columns The columns of the table store the characteristics of the items stored on the table. The columns of the artist table would store the characteristics of artists - such as their name, country, birth, and death information. ---[table_column]--- highlights the column that stores the last name of each artist. Care is taken to identify all of the relevant characteristics of the items stored on the table, so that the data is as useful possible. ---[SIDEBAR]--- "type":"term", "title":"Column", "text":"Contain the individual characteristics of the items stored in a table." ---[SIDEBAR]--- ---[FIGURE]--- "id":"table_column", "title":"Table Column for Artist's Last Names", "url":"https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhA-75ImP3mU2Kp46AYv9di_dtD7RsA_FfglgwAPlj_rEDbm7b4V2TxeImFNC2yvf0qo3OXtRCj3LHhug1SgJAWx4XB2aEoHvjePpciRI4P7Jg6MpFbhijkUh6SoicRVkIlgeSdya-dJiZajCGl5OC0eHKRI-gw5dylR1ecHHGAk3Ax7T1uQn3durpSWDoL/s1219/ArtistTableColumn.png", "altText": "The last_name column of the artist table." ---[FIGURE]--- Care is taken to divide the data on a table into multiple columns. When the data in a table is appropriately broken into columns, it is easier to sort and filter the data on the table. Database tables are also designed to reduce or eliminate errors. Errors can occur when data is added to a table, changed, or deleted from a table.The process of optimizing the design of a database tables will be discussed in subsequent lessons. ---[SIDEBAR]--- "type":"term", "title":"Datatype", "text":"Constrains the values in a column to be a specific type of data like numbers, text, or dates/times." ---[SIDEBAR]--- To help reduce errors in data, each column of a table has a specific datatype. The datatype determines the data that can be stored in that column. Common datatypes include numbers, text, and dates/times. Since a column has a specific datatype, it can only store that type of data. When a column is set to store numbers, for example, it cannot store text characters or other symbols. Constraining the datatype in a column helps to prevent incorrect values from being stored in a table. It also helps the DBMS to optimally store values in the table. Because of this, care is taken to appropriately define the datatype for each column in a table. ## "Key" Columns There are two special purpose columns that can be included on database tables. These are called primary and foreign keys. Each table should have a primary key. The role of the primary key is described below. Foreign keys are used connect data from different tables together. Foreign keys will be discussed in subsequent lessons. ---[SIDEBAR]--- "type":"term", "title":"Foreign Key", "text":"A special purpose column on a database table that connects the data in one table to another table." ---[SIDEBAR]--- As mentioned, each row of a database table must be unique. To make sure that this is the case, a column (or set of columns) is designated as a primary key. Technically, *any* column on a table could be made the primary key, but it must meet each of these conditions: * The values for that column must be unique (no two rows of the table can have the same value). * Every row of the table must have a value for that column (no missing values in the column). * The values in that column should not be changed after they are added to the table. * The values for that column must not be reused (even if you delete the row that was originally using that value). Usually, one column of a table will be set as the primary key. It is possible to use a combination of columns as the primary key. When this is the case, the combination of columns must still meet the conditions listed above. ---[SIDEBAR]--- "type":"term", "title":"Primary Key", "text":"A special purpose column on a database table the ensures that all rows of the table are unique." ---[SIDEBAR]--- Most columns in a typical table don't meet all of these conditions. Consider the artist table. The attributes of an artist might be things like their first and last names, their preferred artistic style, their nationality, and their birth and death years. Each of these are helpful in describing characteristics of the artists, but none of them would be a good primary key. You could not guarantee that the values for any of them would be unique across all artists. More than one artist might have the same first or last name, come from the same country, work in the same style, or share a birth or death year. To make sure that the primary key values are unique, it often easiest to create a column, such as artist_id (see ---[artist_primary_key]---). The values in this column would identify (numbers or set of characters) each row of artists on the table. The simplest way to do this would be to just assign a number of each artist. Any additional artist would be given the next number in sequence. ---[FIGURE]--- "id":"artist_primary_key", "title":"The Primary Key of the Artist Table", "url":"https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhckAGqAjU8Xzcal1iz6OA7Os1blsn711pN8twQfSIfgXZCKPi0r5JX8SzafQQJZ_nJxT8dTIwalI_UWbz7_aH1SUQ5zUclI-1ZAGIqwmKepySCneWOcUukaSsYUNFJpCLrDkcuC-KDeI_ptJJLSpVwt0NrjHbeMQgJUn_mWqxpdfey--1eqrh2ImfclCTk/s1219/ArtistTablePrimaryKey.png", "altText": "The artist_id column of the artist table is the primary key." ---[FIGURE]--- The main purpose of a primary key is to uniquely identify the rows on a table. Interestingly, the artist_id would not usefully describe a meaningful characteristic of an artist. The only purpose of the artist_id would be to uniquely identify the artists on the table. This is often the case with primary keys of most tables. ---[SIDEBAR]--- "type":"tip", "title":"Primary Keys", "text":"Although it is not required that a table has a primary key, it is a good idea to make sure that every table in your database has a defined primary key." ---[SIDEBAR]--- The primary key is useful because it allows you to identify a specific row. This is particularly helpful when you connect data from one table to another. It is absolutely necessary when you want to update or delete data in a table, since you don't want to change or delete the wrong data. ## What Is Structured Query Language (SQL)? Structured query language (SQL) is a computer language used to interact with a database. Specifically, SQL is the way that applications and users interact with the DBMS to make requests of the database. ---[DBDBMSAPPSQL]--- depicts this relationship. For simplicity, it's easier to think of SQL as a method for interacting with a database. ---[FIGURE]--- "id":"DBDBMSAPPSQL", "title":"Where SQL Fits", "url":"https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgsVhy-Rt81WQ1s9ed2O0eaMYuttzRx5MQagiEbnS5x_I9ufHKo6xcskfT3iZPWHP1_EgbTqbMOQ_HC6vGRR8Dq1h_xdqK6ijc8RhkchzawWWzEZEQlH42Dou75TF8AP0OSueFTAzsuJuIYxDZ-lJbf1HTyuhLZMASZE3dF4peG_7QhC3kRHgGvXaWjV0bx/s795/DataDBMSAppsSQL.png", "altText": "The relationships among a database, DBMS, applications, and SQL." ---[FIGURE]--- SQL statements are written as English phrases that are translated to a form that can be processed by a computer. To facilitate this translation, SQL uses a limited set of keywords and follows a standard protocol for using them. You will learn more about these keywords and the rule for using them throughout this book. SQL is an excellent tool for working with databases. Some of the advantages of SQL are: * SQL statements are English phrases which means they are relatively easy to learn and understand. * SQL is focussed as a tool for interacting with databases. This means that it has a small set of keywords. * SQL is powerful. You can use it to perform sophistocated database operations. * SQL is not proprietary. This means that all of the major DBMS support SQL. As with any language, the best way to learn SQL is to try it for yourself. To help you do this, We've integrated the ability to excecute SQL directly in this book. You'll see it in the next chapter and thoughout this set of lessons. ---[SIDEBAR]--- "type":"note", "title":"SQL Extensions", "text":"The ANSI standards committee (ANSI SQL) governs standard SQL. It is the same across DBMS. Most of the major DBMS vendors have extended the functionality of standard SQL to add features to their products. These extensions are proprietary to their product and are not supported by other DBMS. The lessons in this book describe standard SQL as much as possible." ---[SIDEBAR]--- ## Summary This lesson described the basics of databases. It covered the the building blocks of database tables. It also defined structured query language and why it is used.