# Databases and SQL ---[SIDEBAR]--- "type":"objectives", "title":"Databases and SQL", "text":"After completing this lesson, you can:
  1. Describe the building blocks of a database
  2. Explain what SQL is
" ---[SIDEBAR]--- Without knowing it, you use databases all the time. Databases are a key component of almost every computer-based system. When you you check a bank account balance on your phone, you are using a database to display your account information. When you make a purchase online, your purchase information is recorded in a database. When you perform and Internet search to find a new restaurant to visit for dinner, you are accessing data stored in a database. In truth, most websites and mobile apps rely on databases. Although you frequently interact with databases, you are not alone if you don't really understand what a database is or how it works. Like most technologies, you don't need to completely understand how a database works to benefit from using one. However, the fact that you are using this resource means that you want to understand more about how databases work so that you can interact with them in more advanced ways. In this lesson, you will learn what a database is and how to interact with them. In particular, the lesson will describe the building blocks of a database. It will also describe the SQL programming language used to retrieve data from and manipulate data in a database. The remaining lessons will dive deeply into various aspects SQL to describe how to interact with databases in powerful ways. ## What is a database? There is a bit of confusion about just what a database is. Often a database is mistaken for the software used to create and manage it. The software used to create and manage a database is called a database management system (DBMS). Examples of popular DBMS software include Oracle DBMS, Microsoft SQL Server, MySQL, and PostgreSQL. A database is collection of data stored in an organized way. Care is taken to properly organize data in a database. A well organized database is easier to use and is more effective than a poorly organized database. While just how to design and organize a database is beyond the scope of this resource, it is important to understand the basic building blocks of a database to be able to effectively interact with the data stored in the database. > ***Key Term:* Database** A collection of data stored in an organized way. ### The building blocks of a database The most common way to organize data in a database is the relational model. In a relational database, data are separated and stored in two-dimensional tables. Tables contain rows and columns. Each of these building blocks are described in the next few sections. ---[FIGURE]--- "id":"gormly", "title":"A database diagram", "url":"https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEiDFqszIjX7XXFbrt2JZGnjXtnzikf2HdGgVXjH3gePUx7sPOHmIET7zCmsDGcYqdrXVu92PN74CjN_y7TEfFAD54bHbvjhclzyqOhF9Kssdr__gkxb3cpNpfElCAs0-jWyOIZR3oQcn5s9KubpH1ZW5bc6nYLmKeMJsGN__CcdLtXNI1gmTmfkMudz/s1600/hudson_oracle_transparent.png", "altText": "Text description of image when mouse hovers." ---[FIGURE]--- this is just a paragraph showing how refer to ---[gormly]---. #### Tables The information in a database is organized into tables by theme. Each theme in the data will be stored on a separate table. For example, a database may contain information about the sales transactions for a company. This information may contain information about the individual sales - such as the date, and total amount of each sale. It may also contain information about the products that were sold. Additionally, it may contain information about the customers who purchased the products. Sales, products, and customers are different themes in the data. > ***NEW TERM:* Table** A two-dimensional structure with information about a specific theme within a database. Information about each different theme will be stored on separate tables, so there will be a sale, product, and customer table in the database. Each table will contain data that is stored about that theme. In other words, the customer table will contain all of the data stored on customers. A database table does not contain information that does not relate to the theme of the table. The customer table will not contain information about the products sold. Database tables are given a name. A table name should describe the data stored in the table. The name will also be unique for each table to identify them from the other tables in the database. No two tables in a database can have the same name. A database table is two-dimensional, much like a worksheet in a spreadsheet. It contains both columns and rows. The columns and rows of a database table are structured in a standard way. The columns and rows of a database table are structured it make it easy to manupulate the data stored in the database. #### Rows The rows of a database table contain the information about each individual instance of the items stored on the table. Each row of a database table would contain infromation about only one instance of that table. Likewise it would contain all of the information stored about that instance. In the customer table, each row would contain information about different customers. Additionally, all of the data about a single customer would be stored on a single row. #### Columns The columns of the table store the characteristics of the items stored on the table. The columns of the customer table would store the characteristics of customers - such as customer name, address, or phone number. The rows of the table would contain all of the information known about the individual customers. > ***NEW TERM:* Column** A single field in a table. All tables are made up of one or more columns. The best way to understand this is to envision database tables as grids, somewhat like spreadsheets. Each column in the grid contains a particular piece of information. In a customer table, for example, one column contains the customer number, another contains the customer name, and the address, city, state, and ZIP code are all stored in their own columns. > ***TIP:* Breaking Up Data** It is extremely important to break data into multiple columns correctly. For example, city, state, and ZIP (or postal) code should always be separate col- umns. When you break these out, it becomes possible to sort or filter data by specific columns (for example, to find all customers in a particular state or in a particular city). If city and state are combined into one column, it would be extremely difficult to sort or filter by state.

When you break up data, the level of granularity is up to you and your specific requirements. For example, addresses are typically stored with the house num- ber and street name together. This is fine, unless you might one day need to sort data by street name, in which case splitting house number and street name would be preferable. Each column in a database has an associated datatype. A *datatype* defines what type of data the column can contain. For example, if the column were to contain a number (perhaps the number of items in an order), the datatype would be a numeric datatype. If the column were to contain dates, text, notes, currency amounts, and so on, the appropriate datatype would be used to specify this. > ***NEW TERM:* Datatype** A type of allowed data. Every table column has an associated datatype that restricts (or allows) specific data in that column. Datatypes restrict the type of data that can be stored in a column (for example, preventing the entry of alphabetical characters into a numeric field). Datatypes also help sort data correctly and play an important role in optimizing disk usage. As such, special attention must be given to picking the right datatype when tables are created. > ***CAUTION:* Datatype Compatibility** Datatypes and their names are one of the primary sources of SQL incompatibil- ity. While most basic datatypes are supported consistently, many more advanced datatypes are not. And worse, occasionally you’ll find that the same datatype is referred to by different names in different DBMSs. There is not much you can do about this, but it is important to keep in mind when you create table schemas. ## Rows Data in a table is stored in rows; each record saved is stored in its own row. Again, envisioning a table as a spreadsheet style grid, the vertical columns in the grid are the table columns, and the horizontal rows are the table rows. For example, a customers table might store one customer per row. The number of rows in the table is the number of records in it. > ***NEW TERM:* Row** A record in a table.

*Should I use the term "Record" or "Row"?* You may hear users refer to database records when referring to rows. For the most part the two terms are used interchangeably, but row is technically the correct term. ## Primary Keys Every row in a table should have some column (or set of columns) that uniquely identifies it. A table containing customers might use a customer number column for this purpose, whereas a table containing orders might use the order ID. An employee list table might use an employee ID. A table containing a list of books might use the ISBN for this purpose. > ***NEW TERM:* Primary key** A column (or set of columns) whose values uniquely identify every row in a table. This column (or set of columns) that uniquely identifies each row in a table is called a primary key. The primary key is used to refer to a specific row. Without a primary key, updating or deleting specific rows in a table becomes extremely difficult as there is no guaranteed safe way to refer to just the rows to be affected. > ***TIP:* Always Define Primary Keys** Although primary keys are not actually required, most database designers ensure that every table they create has a primary key so that future data manipulation is possible and manageable. Any column in a table can be defined as the primary key, as long as it meets the following conditions: * No two rows can have the same primary key value. * Every row must have a value in the primary key column(s). (So, no NULL values.) * Values in primary key columns should never be modified or updated. * Primary key values should never be reused. (If a row is deleted from the table, its primary key may not be assigned to any new rows in the future.) Primary keys are usually defined on a single column within a table. But this is not required, and multiple columns may be used together as a primary key. When multiple columns are used, the rules listed above must apply to all columns, and the values of all columns together must be unique (individual columns need not have unique values). There is another very important type of key called a foreign key, but I’ll get to that later on in Lesson 12, “Joining Tables.” ## What Is SQL? SQL (pronounced as the letters S-Q-L or as sequel) is an abbreviation for Structured Query Language. SQL is a language designed specifically for communicating with databases. Unlike other languages (spoken languages like English, or programming languages like Java, C, or Python), SQL is made up of very few words. This is deliberate. SQL is designed to do one thing and do it well—provide you with a simple and efficient way to read and write data from a database. What are the advantages of SQL? * SQL is not a proprietary language used by specific database vendors. Almost every major DBMS supports SQL, so learning this one language will enable you to interact with just about every database you’ll run into. * SQL is easy to learn. The statements are all made up of descriptive English words, and there aren’t that many of them. * Despite its apparent simplicity, SQL is a very powerful language, and by cleverly using and combining its language elements, you can perform very complex and sophisticated database operations. And with that, let’s learn SQL. > ***NOTE:* SQL Extensions** Many DBMS vendors have extended their support for SQL by adding statements or instructions to the language. The purpose of these extensions is to provide additional functionality or simplified ways to perform specific operations. And while often extremely useful, these extensions tend to be very DBMS specific, and they are rarely supported by more than a single vendor.

Standard SQL is governed by the ANSI standards committee, and is thus called ANSI SQL. All major DBMSs, even those with their own extensions, support ANSI SQL. Individual implementations have their own names (PL-SQL, used by Oracle; Transact-SQL, used by Microsoft SQL Server; and so on). For the most part, the SQL taught in this book is ANSI SQL. On the odd occasion where DBMS-specific SQL is used, it is so noted. ## Try It Yourself As with any language, the best way to learn SQL is to try it for yourself. To help you do this, We've integrated the ability to excecute SQL directly in this book. You'll see it in the next chapter ## Summary In this first lesson, you learned what SQL is and why it is useful. Because SQL is used to interact with databases, you also reviewed some basic database terminology.
person
navigate_before Table of Contents expand_more navigate_next
Loading Page hourglass_empty