---[REF]---"style":"heading"---[REF]--- ---[SIDEBAR]--- "type":"objectives", "title":"Learning Objectives", "text":"After completing this lesson, the successful student will be able to:
  1. Describe what what a join is and why they are used.
  2. Use joins to select columns from multiple tables in the same query.
" ---[SIDEBAR]--- ## Database Tables and the Relational Model In ---[REF]---"style":"link","pageId":"lesson-designing-tables"---[REF]--- you learned about designing database tables. The first step involves separating data into themes to be stored in different database tables. The key reason for this is that databases should be designed to eliminate duplicated data in the database. To understand why this is the case, it is best to consider an example. ---[museum-database-diagram]--- displays the database diagram for the museum database. Notice that the data are separated into different tables base on the different types of things stored in the database. ---[FIGURE]--- "id":"museum-database-diagram", "title":"Museum Database Diagram", "url":"https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEh7hX_BsoPaH4zbcUkf25Qf4QBazzQDsi4XqDY6Y9Tef5ODWacEiCNh2gTyXnk5egU7xQljugE1OxqtSZi3tGO8QixxI9zh-GY00O1uwPshuEYqw-r7pFrAEvmBDQu4OxklYro62CgO_RiDA_KCabOk4E0s34yC-_o_mUYKe_V-XqMqi1Et7P_nLHguBEU/s561/canvas-trans.png", "altText": "The museum database diagram.", "imgHandling": "scale" ---[FIGURE]--- Consider the artist and the work tables. The artist table contains all of the data stored about artists. Likewise, the work table contains all of the data stored about the works of art. The data in these tables is related. The cardinality of this relationship is best characterized as one to many (see ---[REF]---"style":"link","pageId":"lesson-table-relationships"---[REF]---). A work of art has a single artist and an artist can make more than one work of art. The fact that an artist can make more than one work of art particularly relevant to this discussion. If the data about artists and works of art were stored on a single table that table would contain duplicate information. Every artist with more than one work of art will have their data listed with each work of art. Duplicate data is problematic for a number of reasons. 1. Repeating artist data for each work of art wastes space in the database. 2. Adding the same artist data multiple times in the database increases the likelihood that the data will not be entered identically every time. The leads to inconsistencies and errors in the data. 3. If the artist data changes, you would need to update the data everywhere it was stored. 4. If an artist is removed from the database, you would need to delete it everywhere. Notice how these issues are eliminated when the data about artists and works of art are stored in separate tables. Since each artist is stored only once on the artist table, there is no wasted space and no possibility that data will not be entered inconsistently. If artist data changes, you would only need to update it in one place. Likewise, if an artist is removed from the database you would only need to remove it one place. Separating data into multiple tables according to theme does make it difficult to link connected data across tables. If there is no information about artists on the table storing data about works of art or information about works of art on the artist table, it would be impossible determine which works of art were created by an artist. Foreign keys solve this problem. Recall that a foreign key contains values from the primary key of another table. These values link data from one table to the other table. Notice that the primary key of the artist table is the artist_id. As the primary key, the artist_id is unique for each artist and serves to specifically identify each artist. Also, notice that the artist_id is included on the work table where it is a foreign key. ---[REF]---"style":"link","pageId":"lesson-table-relationships"---[REF]--- describes how this works. To determine which artist created a work of art, take the value for the artist_id on the work table and find the corresponding artist on the artist table. Primary and foreign keys allow you to keep all of the benefits of separating data into multiple tables and connect data in different tables at the same time. Using primary and foreign keys, you can write SELECT statements that join tables together so that you can include columns from multiple tables in the same query result. ## Understanding Joins The ability to link data in multiple tables is one of the most powerful features of SQL. In a single query, you can include columns from different tables, use filters based on columns in different tables, and even create calculations based on data from different tables. In short, you can use any of the features of SQL discussed in the other lessons in this resource on a single table or on more than one table at the same time. This is done by joining tables in the FROM clause. A join is performed using the keywords JOIN and ON. The syntax for a join is: FROM table1 JOIN table2 ON table1.primary_key = table2.foreign_key. The keyword JOIN is used to specify which tables will be combined in the query. The keyword ON is used to describe how the tables will be joined using primary and foreign keys. ---[SIDEBAR]--- "type":"term", "title":"Join", "text":"A keyword in the FROM clause used to specify tables to be combined in a query result." ---[SIDEBAR]--- ---[SIDEBAR]--- "type":"term", "title":"On", "text":"A keyword in the FROM clause used to describe how the tables will be joined using primary and foreign keys." ---[SIDEBAR]--- Let's consider an example. Suppose you wish to combine the artist and work tables together to display the names of the artists and the names of the works of art that they have created. Sort the results by full_name. ---[EVALUATE]--- "type":"plan", "task":"Display the names of the artists and the names of the works of art that they have created. Sort the results by full_name.", "columns":"artist name, work name", "tables":"artist, work", "rows":"all", "order":"alphabetically by full_name" ---[EVALUATE]---
You can use a chat with ^ai_name^ to create the SELECT statement that will do this: ---[AI]--- [ { "prompt": "Write a SQL statement to display the names of the artists and the names of the works that they have created. Sort the results by full_name.", "pretext": "To display the names of the artists and the names of the works they have created, sorted by the full_name of the artists, you can use the following SQL statement:", "response": "SELECT full_name, name \nFROM artist JOIN work ON artist.artist_id = work.artist_id \nORDER BY full_name;", "posttext": "This query performs a join between the artist and work tables using the artist_id column as the join condition. It selects the full_name column from the artist table (artist's name) and the name column from the work table (work's name). The result will display a list of artist names along with the names of the works they have created, sorted by full_name." } ] ---[AI]---
The syntax for a query that accomplishes this is: SELECT full_name, name FROM artist JOIN work ON artist.artist_id = work.artist_id ORDER BY full_name;
---[EVALUATE]--- "type":"evaluate", "query":"SELECT full_name, name \nFROM artist JOIN work ON artist.artist_id = work.artist_id \nORDER BY full_name;" ---[EVALUATE]--- This query returns the correct results. It retrieves the full_name column from the artist table and the name column from the work table. All rows will be included in the query results. The results are sorted alphabetically by the artists' full_name. **Try the SQL statement.** ---[SQL]--- connection:sql_book_connection schema:canvas diagram:canvas owner:atlas-query result:artist-name-work-name ---[SQL]--- SELECT full_name, name FROM artist JOIN work ON artist.artist_id = work.artist_id ORDER BY full_name; ---[SQL]--- The results of this query contain the full names of artists and the names of the artwork they have created. Each row of query is a joined result meaning that the artist name is linked with a work of art that is associated with that artist. This is done by matching the values for the artist_id on both tables. Each row of the query result represents when the artist_id on the work table matches an artist_id on the artist table. This is probably best seen by editing the prior example to show the artist_id from both tables in the same query. The syntax for this query would be: SELECT artist.artist_id as artist_artist_id, full_name, work.artist_id as work_artist_id, name FROM artist JOIN work ON artist.artist_id = work.artist_id ORDER BY full_name; ---[SIDEBAR]--- "type":"note", "title":"Columns from different tables with the same name", "text":"Notice that the work and artist tables both have a column named artist_id. When you want to show the artist_id in the query result, you need to specify the table for the column you wish to show. This is done with the syntax *tablename.columnname*. In this example, to show the artist_id on the artist table, you would use this syntax: SELECT artist.artist_id." ---[SIDEBAR]--- **Try the SQL statement.** ---[SQL]--- connection:sql_book_connection schema:canvas diagram:canvas owner:atlas-query result:artist-name-work-name-ids ---[SQL]--- SELECT artist.artist_id as artist_artist_id, full_name, work.artist_id as work_artist_id, name FROM artist JOIN work ON artist.artist_id = work.artist_id ORDER BY full_name; ---[SQL]--- In this query result the artist_id from the artist table and the artist_id from the work table are the same for each row of the query results. The artist and the work of art for a given row match. This means that the works of art were created by the artists that appear on the same row. Although it is not obvious, only artist and works that match are included in the query result. In other words, if there are artists that don't have a work of art on the work table, they will not appear in the query results. Additionally, any works of art not associated with an artist will also be excluded from the query result. Both of these are characteristics of inner joins. Inner joins are the most common way to combine tables in a single query. Details about inner joins and other types of joins will be discussed in ---[REF]---"style":"link","pageId":"lesson-advanced-joins"---[REF]---. All of the joins discussed in this lesson are inner joins. ---[SIDEBAR]--- "type":"caution", "title":"Join conditions", "text":"Join conditions are important. The join condition is the primary and foreign keys used to link the tables together with the keywork ON. If you don't include the join condition in a join, each row from the first table will be combined with each row of the second table. This is called a cartesian join." ---[SIDEBAR]--- ---[SIDEBAR]--- "type":"term", "title":"Cartesian Join", "text":"Created when the join condition is omited in a join. Each row from the first table will be combined with each row of the second table." ---[SIDEBAR]--- This example illustrates how a cartesian join works. Notice that there is no JOIN keyword in the FROM clause. Likewise, the primary and foreign key columns are not defined using the ON keyword. SELECT full_name, name FROM artist, work WHERE full_name = 'Abraham Mignon'; **Try the SQL statement.** ---[SQL]--- connection:sql_book_connection schema:canvas diagram:canvas owner:atlas-query result:artist-name-work-name-cartesian ---[SQL]--- SELECT full_name, name FROM artist, work WHERE full_name = 'Abraham Mignon'; ---[SQL]--- The WHERE clause is included to make the query results manageable. Only one artist (Abraham Mignon) is included in the query, but notice how Abraham Mignon is associated with all of the works of art (regardless of whether he created the art or not). If all artists were included, then every artist would be associated with every work of art. Cartesian joins are rarely used because the results often don't make sense. It wouldn't make sense to associate every artist with every work of art as was done in the last example. It is important to include the join condition for every join you create. ---[SIDEBAR]--- "type":"tip", "title":"Cross Joins", "text":"You may also hear the term cross join. Cross join is another way to refer to a cartesian join." ---[SIDEBAR]--- ## Joining Multiple Tables There is not a limit to the number of tables that can be combined in a SELECT statement. The syntax joining three tables is an extension of the syntax for joining two tables. The syntax for joining three tables is: FROM table1 JOIN table2 ON table1.primary_key = table2.foreign_key JOIN table3 ON table2.primary_key = table3.foreign_key. This basic syntax can be extended to join an arbitrary number of tables together. Consider another example. Suppose you want to display the artist name, work or art name, and name of the museum that displays the art in the same query. Sort the results by the artists' full_name. ---[EVALUATE]--- "type":"plan", "task":"Display the artist name, work or art name, and name of the museum that displays the art in the same query. Sort the results by the artists' full_name.", "columns":"artist name, work name, museum name", "tables":"artist, work, museum", "rows":"all", "order":"alphabetically by full_name" ---[EVALUATE]---
You can use a chat with ^ai_name^ to create the SELECT statement that will do this: ---[AI]--- [ { "prompt": "Write a SQL statement to display the artist name, work or art name, and name of the museum that displays the art in the same query. Sort the results by the artists' full_name.", "pretext": "To display the artist name, work of art name, and the name of the museum that displays the art in a single query, sorted by the artists' full_name, you will need to perform multiple joins between the artist, work, and museum tables. Here's the SQL statement to achieve this:", "response": "SELECT full_name, work.name AS work_name, museum.name AS museum_name \nFROM artist JOIN work ON artist.artist_id = work.artist_id \nJOIN museum ON work.museum_id = museum.museum_id \nORDER BY artist.full_name;", "posttext": "This query performs the following steps:
  1. Joins the artist and work tables based on the artist_id column.
  2. Joins the result of the previous join with the museum table based on the museum_id column.
  3. Selects the full_name from the artist table (artist's name), the name from the work table (work of art's name), and the name from the museum table (museum's name).
  4. Orders the results by the artists' full_name in ascending order.
The result will display the artist name, work of art name, and the name of the museum that displays the art, all sorted by the artists' full name." } ] ---[AI]---
The syntax for a query that accomplishes this is: SELECT full_name, work.name AS work_name, museum.name AS museum_name FROM artist JOIN work ON artist.artist_id = work.artist_id JOIN museum ON work.museum_id = museum.museum_id ORDER BY artist.full_name;
---[EVALUATE]--- "type":"evaluate", "query":"SELECT full_name, work.name AS work_name, museum.name AS museum_name \nFROM artist JOIN work ON artist.artist_id = work.artist_id \nJOIN museum ON work.museum_id = museum.museum_id \nORDER BY artist.full_name;" ---[EVALUATE]--- This query returns the correct results. It retrieves the full_name column from the artist table and the name column from both the work and museum tables. Since you want to display the 'name' column from both the work and museum tables, you must fully specify the table and column names in the SELECT clause for these columns. Note that aliases are also added to these columns to make the contents of each column clear. All rows will be included in the query results. The results are sorted alphabetically by the artists' full_name. **Try the SQL statement.** ---[SQL]--- connection:sql_book_connection schema:canvas diagram:canvas owner:atlas-query result:artist-name-work-name-museum-name ---[SQL]--- SELECT full_name, work.name AS work_name, museum.name AS museum_name FROM artist JOIN work ON artist.artist_id = work.artist_id JOIN museum ON work.museum_id = museum.museum_id ORDER BY artist.full_name; ---[SQL]--- The results of this query contain the full names of artists, the names of the artwork they have created, and the names of the museums that contain the artwork. Each row of query is a joined result meaning that the artist name is linked with a work of art that is associated with that artist and the museum that displays that work of art. This is done by matching the values for the artist_id on the artist and work tables tables *and* matching the museum_id on the the museum and work tables. ---[SIDEBAR]--- "type":"caution", "title":"Limiting the number of joins", "text":"While is it true that SQL does not impose a limit to the number of tables joined in a single query, joining tables can be taxing for the DBMS processing the request. Additionally, the DBMS may impose a limit to the number of tables that are joined. Because of this, it is best to limit the tables joined in a query to those that are necessary to complete the task of the query." ---[SIDEBAR]--- ## Using Other Query Features with Joined Tables As mentioned earlier, all of the query features and functions discussed in the prior lessons of this resource have been applied to queries that draw rows from a single table. The good news is that everything you have learned will also work on multi-table queries. For example, you may want to display the works of art and associated museums that were created by Vincent Van Gogh and are exhibited in museums in the USA. ---[EVALUATE]--- "type":"plan", "task":"Display the artist name, work or art name, and name of the museum that displays the art in the same query for all works created by Vincent Van Gogh and are exhibited in museums in the USA.", "columns":"artist name, work name, museum name", "tables":"artist, work, museum", "rows":"last_name is 'Van Gogh' and country is 'USA'", "order":"none specified" ---[EVALUATE]--- You could modify the SQL statement from the last example to create this query as follows: SELECT full_name, work.name AS work_name, museum.name AS museum_name, country FROM artist JOIN work ON artist.artist_id = work.artist_id JOIN museum ON work.museum_id = museum.museum_id WHERE last_name = 'Van Gogh' and country = 'USA'; ---[EVALUATE]--- "type":"evaluate", "query":"SELECT full_name, work.name AS work_name, museum.name AS museum_name, country \nFROM artist JOIN work ON artist.artist_id = work.artist_id \nJOIN museum ON work.museum_id = museum.museum_id \nWHERE last_name = 'Van Gogh' and country = 'USA';" ---[EVALUATE]--- This query returns the correct results. It retrieves the full_name column from the artist table and the name column from both the work and museum tables. Since you want to display the 'name' column from both the work and museum tables, you must fully specify the table and column names in the SELECT clause for these columns. Note that aliases are also added to these columns to make the contents of each column clear. The WHERE clause condition ensures that only those artists with a last name of Van Gogh and museums in the USA will be included in the query results. **Try the SQL statement.** ---[SQL]--- connection:sql_book_connection schema:canvas diagram:canvas owner:atlas-query result:artist-name-work-name-museum-name-vangogh-usa ---[SQL]--- SELECT full_name, work.name AS work_name, museum.name AS museum_name, country FROM artist JOIN work ON artist.artist_id = work.artist_id JOIN museum ON work.museum_id = museum.museum_id WHERE last_name = 'Van Gogh' and country = 'USA'; ---[SQL]--- The results of this query include all of the works of art created by Vincent Van Gogh. It also displays the museums that exhibit each work of art. Notice that the WHERE clause has two conditions. The first compares the last name of the artists to 'Van Gogh'. The second compares the countries of the museums to 'USA'. Since both the artist and museum tables are joined in the query, columns from both tables can be used as conditions in the WHERE clause. ---[SIDEBAR]--- "type":"note", "title":"Primary and Foreign Key Names", "text":"In all of the examples in this lesson, the primary and foreign key pairs used for joining tables have had the same name (artist_id is the same on the artist and the work tables). This is not required. Databases may use different naming conventions for primary and foreign keys." ---[SIDEBAR]--- ## Summary Joins are one of the most important and powerful features in SQL. Joining tables together allows you to combine multiple tables in the same query result. In this lesson, you learned why multiple tables are used to store data in a database. You also learned how primary and foreign keys can be used to link tables together. Additionally, you learned how to create queries that join tables together in the same query.