# 2. Retrieving Data ---[SIDEBAR]--- "type":"objectives", "title":"Learning Objectives", "text":"After completing this lesson you can:
  1. Use the SELECT statement to retrieve one or more columns of data from a table.
  2. Use the keyword DISTINCT to show only unique rows in a database query
  3. Use the keyword LIMIT to show a specified number of rows in a query result
" ---[SIDEBAR]--- ## The SELECT Statement Structured query language (SQL) is used to retrieve and manipulate data stored in database tables. SQL uses a set of reserved plain english words, called keywords, to form sentence-like commands. This lesson will describe how to use select statements to retrieve one of more columns of data from a table. ---[SIDEBAR]--- "type":"term", "title":"Keyword", "text":"A reserved word that is part of the SQL language." ---[SIDEBAR]--- Because SQL statements are sentence-like, they are meant to be easy to create and to understand. For example, the statement "SELECT name FROM museum" is a valid SQL statement. It is also fairly easy to understand. It would be used to retrieve all of the values from the "name" column stored on the museum table. The keywords in this statement are "SELECT" and "FROM". Because keyword are part of the SQL language, they are "reserved" meaning they are not used to name tables or columns in database. You would not see a column named "select" or a table named "from" in a database. ---[SIDEBAR]--- "type":"note", "title":"Capitalizing SQL Statements", "text":"SQL statements are not case sensitive. The words in a statement can be in uppercase, lowercase, or mixed case. \"SELECT\", \"select\", and \"SeLeCt\" will all be treated the same when they are excuted. When we refer to a keyword in this set of lessons, we will always render it in upper case to differentiate it as a keyword, even though they do not need to be capitalized for the SQL statement to execute correctly." ---[SIDEBAR]--- ---[SIDEBAR]--- "type":"caution", "title":"Table and Column Names and Case", "text":"Although the SQL language is not case-sensitive, the names of tables, columns, and values may be. For example, some database management systems are configured to treat the column name \"firstname\" as different than \"FirstName\"." ---[SIDEBAR]--- There are six main keywords in the SQL language used to retrieve data from a database. Working together, these keywords are used to precisely define the data you want to retrieve from the database. These keywords are: 1. SELECT - used to list the columns you would like to retrieve 2. FROM - used to specify the table or tables you want to get the data from 3. WHERE - used to create conditions or filters on the data that will be retrieved 4. GROUP BY - used to categorize the data that is retrieved from the database 5. HAVING - used to create more complex conditions or filters on the data that will be retrieved 6. ORDER BY - used to sort the data that is retrieved There are two important rules for using these keywords to create a select statement to retrieve data from a database. First, you must provide at least two pieces of information: the columns you would like to retrieve and the table or tables from which you want to retrieve the data. In other words, you must have both SELECT and FROM in every select statement. Second, the order of the keywords is important. All select statements must follow the order listed above. SELECT will always be first and FROM will always be second. The remaining keywords, if used, will appear same order every time. If a keyword is not used, it is just omitted from the statement. In this lesson, we will only use SELECT and FROM. The other keywords will be described in other lessons.
---[SIDEBAR]--- "type":"tip", "title":"Examples in the Lessons", "text":"The best way to learn a new technology skill like writing SQL statements is to practice writing them. You will find opportunities to practice throughout the lessons of this book. Each example will present a question that can be answered with data from a database. You will see a SQL statement that retrieves the needed data in a text box. You can execute the statement and retrieve the actual data. You can also edit the statement to see how changes to the statement will result in different data being returned." ---[SIDEBAR]---
---[SIDEBAR]--- "type":"tip", "title":"Examples in the Lessons", "text":"The best way to learn a new technology skill like writing SQL statements is to practice writing them. You will find opportunities to practice throughout the lessons of this book. Each example will present a question that can be answered with data from a database. You will then see the transcript of a chat with ^ai_name^ to help you generate a SQL statement that retrieves the needed data in a text box. You can execute the statement and retrieve the actual data. You can also edit the statement to see how changes to the statement will result in different data being returned." ---[SIDEBAR]---
This lesson features example SQL statements that interact with the museum database (see ---[museumDBDiagram]---). The museum database has five tables: 1. artist: contains information about the artists whose work is exhibited in the museums in the database. 2. work: contains information about the individual works of art exhibited in the museums in the database. 3. subject: contains information about the topic or focus of the works of art in the database 4. museum: contains information about the museums in the database 5. museum_hours: contains information about the days and times the museums are open ---[FIGURE]--- "id":"museumDBDiagram", "title":"The Museum Database Diagram", "url":"https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEiTUdqdAE1lgoTZdSeCFhvs7DjP1IpmkrmcgvG4cvbm6s6bj8S5HMHVZvqDcktQH8pfW0Y_YWPrNTMJYDgjRqMYn3Th50b_MXDvNjACrzqiqLTKecTaWj9fZpVjy1-lKG4ElONb6MU4Wxevq8KSeL4BedSCJld7EdmE1W4EIXibj4rkCYNCBbaG6FYL/s16000/canvas_transparent.png", "altText": "A diagram of the museum database. The museum database has the following tables: artist, subject, work, museum, and museum_hours." ---[FIGURE]--- Each of the examples in this lesson highlight how to write SQL statements to retrieve one or more columns from a table in the museum database. They also describe some basic ways to limit the number of rows that are returned in the query results. ## Retrieving Individual Columns Perhaps the most basic SELECT statement is used to return a single column from a database table. This statement will have both a SELECT clause and a FROM clause. The SELECT clause is used to define the columns you wish to display in the query result. The FROM clause is used to specify the table from which you want to return the data.
For example, let's say you want to see the names of the museums in the database. You can use ^ai_name^ to create the SQL statement you need to do that. ---[AI]--- [ { "prompt": "Write a SQL statement to list the names of museums", "pretext": "To list the names of museums from a database, you would need to know the specific table and column where the museum names are stored. Assuming there is a table called \"Museums\" with a column named \"Name,\" you can use the following SQL statement:", "response": "SELECT Name \nFROM Museums;", "posttext": "This query selects the \"Name\" column from the \"Museums\" table and retrieves all the museum names. Replace \"Museums\" with the actual table name in your database to execute the query successfully." } ] ---[AI]--- Because ^ai_name^ doesn't know anything about the database, it will need to guess the table name and column name. Notice that it does a good job of guessing the column name correctly. The name of each museum is stored in the "name" column. ^ai_name^ does not guess the table name correctly, though. Notice that the information about the museums is stored on the "Museum" table. You could fix this by either modifying the SQL to use the correct tablename or you could provide ^ai_name^ with additional information. You could message ^ai_name^: ---[AI]--- [ { "prompt": "Try again. The table is named museum.", "pretext": "Apologies for the mistake. Here's the revised SQL statement to list the names of museums from a table named \"Museum\":", "response": "SELECT Name \nFROM Museum;", "posttext": "This query selects the \"Name\" column from the \"Museum\" table and retrieves all the museum names. Replace \"Museum\" with the actual table name in your database to execute the query successfully." } ] ---[AI]--- Now the column and table names match the database diagram.
For example, let's say you want to see the names of the museums in the database. You can use this SQL statement to do that. *SELECT name FROM museum*
**Try the SQL statement.** ---[SQL]--- connection:sql_book_connection schema:canvas diagram:canvas|visible owner:atlas-query result:museum-name ---[SQL]--- SELECT name FROM museum; ---[SQL]--- The previous statement uses the SELECT statement to retrieve a single column called "name" from the "museum" table. The desired column name is specified right after the SELECT keyword, and the FROM keyword specifies the name of the table from which to retrieve the data. Click the "Play" button below the statement to see the results. A simple SELECT statement returns all the rows in a table. Data is not filtered to retrieve a subset of the results, nor is it sorted. We’ll discuss these topics in the next few lessons. ---[SIDEBAR]--- "type":"tip", "title":"Terminating Statements", "text":"You can execute more than one SQL statement at the same time. Multiple SQL statements need to be separated by semicolons. For most database management systems, a semicolon is not required at the end of a single SQL statement executed alone. If your particular DBMS complains, you might have to add one. There is no harm in adding a semicolon when it is not required." ---[SIDEBAR]--- All extra white space within a SQL statement is ignored when that statement is processed. SQL statements can be specified on one long line or broken up over many lines. So, these three statements are will all be treated the same: 1. SELECT name FROM    museum; 2. SELECT name FROM   museum; 3. SELECT      name FROM      museum; **Try the SQL statements.** ---[SQL]--- connection:sql_book_connection schema:canvas diagram:canvas owner:atlas-query result:museum-3 height:10 ---[SQL]--- SELECT name FROM museum; SELECT name FROM museum; SELECT name FROM museum; ---[SQL]--- Most people who write a lot of SQL statements find that breaking them up over multiple lines makes them easier to read and debug. ## Retrieving Multiple Columns The SELECT statement used to retrieve more than one column from a table is very similar to the statement used to retrieve a single column. The only difference is that each column must be specified right after the SELECT keyword. Each column is separated by a comma. ---[SIDEBAR]--- "type":"tip", "title":"Be Careful with Commas", "text":"When selecting multiple columns, be sure to specify a comma between each column name, but not after the last column name. Adding a comma that the end of the list of columns will result in an error." ---[SIDEBAR]---
Let's say you want see the name, city and country of each of the museums in the database. You can use ^ai_name^ to generate a SQL statement to retrieve this data. ---[AI]--- [ { "prompt": "Write a SQL statement to list the names, cities, and countries of museums", "pretext": "Certainly! Here's an example of a SQL statement to list the names, cities, and countries of museums:", "response": "SELECT name, city, country \nFROM museums;", "posttext": "In this example, it is assumed that there is a table named \"museums\" in the database, which contains columns named \"name\", \"city\", and \"country\" to store the respective information for each museum. You may need to adjust the table and column names based on your specific database schema." } ] ---[AI]--- Again, ^ai_name^ assumes that the table is named "museums". You can adjust the table name or prompt ^ai_name^ to provide the correct table name. ---[AI]--- [ { "prompt": "Try again. The table is named museum.", "pretext": "Apologies for the mistake. Here's the corrected SQL statement to list the names, cities, and countries of museums from a table named \"museum\":", "response": "SELECT name, city, country \nFROM museum;", "posttext": "This statement will retrieve the columns \"name\", \"city\", and \"country\" from the \"museum\" table and display the respective information for each museum." } ] ---[AI]---
Let's say you want see the name, city and country of each of the museums in the database. The following SELECT statement retrieves the three columns from the Museum table: *SELECT name, city, country FROM museum*
**Try the SQL statement.** ---[SQL]--- connection:sql_book_connection schema:canvas diagram:canvas owner:atlas-query result:museum-3 height:3 ---[SQL]--- SELECT name, city, country FROM museum; ---[SQL]--- Just as in the prior example, this statement uses the SELECT statement to retrieve data from the "museum" table. In this example, three column names are specified, each separated by a comma. ---[SIDEBAR]--- "type":"note", "title":"Ordering Columns", "text":"You probably noticed that the columns are displayed in the query result in the order that they are specified after the keyword SELECT. You can display columns in any order by changing the order in the SELECT statement." ---[SIDEBAR]--- ## Retrieving All Columns There are times that you will want to retrieve all of the columns from a database table. One way is to list all of the column names in the the SELECT statement; however, there is a nice shortcut that you can use to save time. The asterisk "*" is a wildcard character that can be used to retrieve all columns from a table.
Suppose you want to get a quick look at all of the data on the museum table. The following SQL statement will retrieve all of the columns from the museum table: *SELECT \* FROM museum*
**Try the SQL statement.** ---[SQL]--- connection:sql_book_connection schema:canvas diagram:canvas result:museum owner:atlas-query ---[SQL]--- SELECT * FROM museum; ---[SQL]--- When you use the wildcard (*), all columns in the table are returned. Generally, the columns will be returned in the order that they appear in the table. It is usually best to specify the columns that you want to retrieve in your query and not use the wildcard. Retrieving unneeded columns can slow the process of retrieving data from a database. ---[SIDEBAR]--- "type":"tip", "title":"Using Wildcards to Explore Data", "text":"It can be helpful to explore the data in a database to help you effectively write queries to retrieve the specific data you need. This is particularly true if you don't have a database diagram to show you the names of the columns in the database tables. Using the wildcard character to retrieve all of the columns in a database table give you a quick and easy way to explore the data stored on the table." ---[SIDEBAR]--- ## Removing Duplicate Rows All of the queries in this lesson have returned all of the rows from the source table. There are times that you won't want to retrieve all rows. Often the rows retrieved by a query repeat. For example, you may wish to list all of the country in the museum database with a museum. The SQL statement that retrieves the cities with museums is: *SELECT country FROM museum* **Try the SQL statement.** ---[SQL]--- connection:sql_book_connection schema:canvas diagram:canvas owner:atlas-query result:all-museum-countries ---[SQL]--- SELECT country FROM museum; ---[SQL]--- Notice that the SQL statement returns 57 rows even though there are only 16 different countries on the museum table. Many of the countries displayed appear in the results more than once. It would be easier to use the results if each country were list only once. You can edit this query to show only the unique values for country. This is done by adding the keyword DISTINCT directly after the keyword SELECT. The updated SQL statement would be: *SELECT DISTINCT country FROM museum* **Try the SQL statement.** ---[SQL]--- connection:sql_book_connection schema:canvas diagram:canvas owner:atlas-query result:distinct-museum-countries ---[SQL]--- SELECT DISTINCT country FROM museum; ---[SQL]--- Notice that how adding the keyword DISTINCT reduces the result set to 16 rows. Each of the rows in the query result contains a unique country.
Artificial intelligence tools like ^ai_name^ can successfully determine when the keyword DISTINCT is needed. Consider the chat below that asks ^ai_name^ to construct a query to retrieve the countries with a museum. ---[AI]--- [ { "prompt": "Write a SQL statement to list the countries on the Museum table.", "pretext": "To list the countries in the \"Museum\" table, assuming there is a column named \"country\" in the table that stores the country names, you can use the following SQL statement:", "response": "SELECT DISTINCT country \nFROM museum;", "posttext": "This query selects the distinct values from the \"country\" column of the \"Museum\" table. The DISTINCT keyword ensures that each country name is returned only once, even if there are multiple museums in the same country. The result will be a list of distinct country names present in the \"Museum\" table." } ] ---[AI]--- Notice that ^ai_name^ not only includes the keyword DISTINCT in the SQL statement, but also provides an explanation for why DISTINCT is appropriate.
When you use DISTINCT in a SQL statement, it applies to all of the columns of the query not just the first column. For example, retrieving both the country and state (as shown below) will return more than 16 rows. A country with more than one state will appear once for every state on the table. The *combination* of countries and states will all be unique. *SELECT DISINCT country, state FROM museum* **Try the SQL statement.** ---[SQL]--- connection:sql_book_connection schema:canvas diagram:canvas owner:atlas-query result:distinct-comparison ---[SQL]--- SELECT DISTINCT country, state FROM museum; SELECT DISTINCT country FROM museum; ---[SQL]--- Notice how both queries in the example return a unique set of values. In the second query, the unique list of countries is returned. In the first, the unique combinations of countries and states are returned. ## Limiting Results There are times that you want to retrieve only the first row or set of rows from a table. Unfortunately, the different database management systems have implemented this in different ways. A common way to retrieve only a few rows involves using the keyword LIMIT. For example, the SQL statement used to retrieve the first five rows of the museum table is: *SELECT \* FROM museum LIMIT 5* ---[SQL]--- connection:sql_book_connection schema:canvas diagram:canvas owner:atlas-query result:limit ---[SQL]--- SELECT * FROM museum LIMIT 5; ---[SQL]--- ## Using Comments SQL statements are instructions that are processed by the database management system to interact with the database. There are times that you want include notes in the the SQL statement that you don't want to be processed. These notes are called comments. ---[SIDEBAR]--- "type":"term", "title":"Comments", "text":"Notes placed inside a SQL statement that are not processed by the database management system." ---[SIDEBAR]--- Here are a few reasons you might want to leave some comment in your SQL statements: 1. The SQL statements you write can become complex. It is often helpful to leave descriptive comments in the statement to explain the logic of the statement. 2. You may want to leave a comment at the beginning of the SQL statement to describe what the statement does. 3. There are times you may want to troubleshoot a SQL statement that isn't working as you expect it to. It can be helpful to temporarily stop part of a statement from being executed, by putting it in a comment, to see the effect of removing that part of the statement. Most database management systems support at least two ways to insert a comment. The first is an inline comment. An inline comment is added using two hyphens (--). Any text on the same line after the hyphens will not be processed. For example, the SQL statement below will retrieve the list of museums in the database, ignoring the text, "This is a comment" when the statement is processed. *SELECT name --This is a comment FROM museum* ---[SQL]--- connection:sql_book_connection schema:canvas diagram:canvas owner:atlas-query result:inline-comment ---[SQL]--- SELECT name --This is a comment FROM museum; ---[SQL]--- You can also create comments that can span across the lines of a SQL statement. Multiline comments begin with the combination of the slash and astericks characters "/*" and end with the astericks and slash characters "*/". For example, the text "This is a multiline comment" will not be processed in this SQL statement: *SELECT name /* This is a multiline comment.*/ FROM museum ---[SQL]--- connection:sql_book_connection schema:canvas diagram:canvas owner:atlas-query result:block-comment ---[SQL]--- SELECT name /* This is a multiline comment*/ FROM museum ---[SQL]--- ## Summary This lesson discussed simple SQL statements that retrieve data from a database. It covered retrieving a single column and multiple columns. Additionally, it described how to use the wildcard character to retrieve all of the columns in a table. Finally, it covered how to retrieve only unique rows, a specified number of rows, and how to embed comments in a SQL statement. ## Practice It's a good idea to take a few minutes now to practice what you have learned in this lesson [Lesson Pracice](?select-where-assessment)
person
navigate_before Table of Contents expand_more navigate_next
Loading Page hourglass_empty