# Databases and SQL
---[SIDEBAR]---
"type":"objectives",
"title":"Databases and SQL",
"text":"After completing this lesson, you can:
When you break up data, the level of granularity is up to you and your specific requirements. For example, addresses are typically stored with the house num- ber and street name together. This is fine, unless you might one day need to sort data by street name, in which case splitting house number and street name would be preferable. Each column in a database has an associated datatype. A *datatype* defines what type of data the column can contain. For example, if the column were to contain a number (perhaps the number of items in an order), the datatype would be a numeric datatype. If the column were to contain dates, text, notes, currency amounts, and so on, the appropriate datatype would be used to specify this. > ***NEW TERM:* Datatype** A type of allowed data. Every table column has an associated datatype that restricts (or allows) specific data in that column. Datatypes restrict the type of data that can be stored in a column (for example, preventing the entry of alphabetical characters into a numeric field). Datatypes also help sort data correctly and play an important role in optimizing disk usage. As such, special attention must be given to picking the right datatype when tables are created. > ***CAUTION:* Datatype Compatibility** Datatypes and their names are one of the primary sources of SQL incompatibil- ity. While most basic datatypes are supported consistently, many more advanced datatypes are not. And worse, occasionally you’ll find that the same datatype is referred to by different names in different DBMSs. There is not much you can do about this, but it is important to keep in mind when you create table schemas. ## Rows Data in a table is stored in rows; each record saved is stored in its own row. Again, envisioning a table as a spreadsheet style grid, the vertical columns in the grid are the table columns, and the horizontal rows are the table rows. For example, a customers table might store one customer per row. The number of rows in the table is the number of records in it. > ***NEW TERM:* Row** A record in a table.
*Should I use the term "Record" or "Row"?* You may hear users refer to database records when referring to rows. For the most part the two terms are used interchangeably, but row is technically the correct term. ## Primary Keys Every row in a table should have some column (or set of columns) that uniquely identifies it. A table containing customers might use a customer number column for this purpose, whereas a table containing orders might use the order ID. An employee list table might use an employee ID. A table containing a list of books might use the ISBN for this purpose. > ***NEW TERM:* Primary key** A column (or set of columns) whose values uniquely identify every row in a table. This column (or set of columns) that uniquely identifies each row in a table is called a primary key. The primary key is used to refer to a specific row. Without a primary key, updating or deleting specific rows in a table becomes extremely difficult as there is no guaranteed safe way to refer to just the rows to be affected. > ***TIP:* Always Define Primary Keys** Although primary keys are not actually required, most database designers ensure that every table they create has a primary key so that future data manipulation is possible and manageable. Any column in a table can be defined as the primary key, as long as it meets the following conditions: * No two rows can have the same primary key value. * Every row must have a value in the primary key column(s). (So, no NULL values.) * Values in primary key columns should never be modified or updated. * Primary key values should never be reused. (If a row is deleted from the table, its primary key may not be assigned to any new rows in the future.) Primary keys are usually defined on a single column within a table. But this is not required, and multiple columns may be used together as a primary key. When multiple columns are used, the rules listed above must apply to all columns, and the values of all columns together must be unique (individual columns need not have unique values). There is another very important type of key called a foreign key, but I’ll get to that later on in Lesson 12, “Joining Tables.” ## What Is SQL? SQL (pronounced as the letters S-Q-L or as sequel) is an abbreviation for Structured Query Language. SQL is a language designed specifically for communicating with databases. Unlike other languages (spoken languages like English, or programming languages like Java, C, or Python), SQL is made up of very few words. This is deliberate. SQL is designed to do one thing and do it well—provide you with a simple and efficient way to read and write data from a database. What are the advantages of SQL? * SQL is not a proprietary language used by specific database vendors. Almost every major DBMS supports SQL, so learning this one language will enable you to interact with just about every database you’ll run into. * SQL is easy to learn. The statements are all made up of descriptive English words, and there aren’t that many of them. * Despite its apparent simplicity, SQL is a very powerful language, and by cleverly using and combining its language elements, you can perform very complex and sophisticated database operations. And with that, let’s learn SQL. > ***NOTE:* SQL Extensions** Many DBMS vendors have extended their support for SQL by adding statements or instructions to the language. The purpose of these extensions is to provide additional functionality or simplified ways to perform specific operations. And while often extremely useful, these extensions tend to be very DBMS specific, and they are rarely supported by more than a single vendor.
Standard SQL is governed by the ANSI standards committee, and is thus called ANSI SQL. All major DBMSs, even those with their own extensions, support ANSI SQL. Individual implementations have their own names (PL-SQL, used by Oracle; Transact-SQL, used by Microsoft SQL Server; and so on). For the most part, the SQL taught in this book is ANSI SQL. On the odd occasion where DBMS-specific SQL is used, it is so noted. ## Try It Yourself As with any language, the best way to learn SQL is to try it for yourself. To help you do this, We've integrated the ability to excecute SQL directly in this book. You'll see it in the next chapter ## Summary In this first lesson, you learned what SQL is and why it is useful. Because SQL is used to interact with databases, you also reviewed some basic database terminology.
- Describe the building blocks of a database
- Explain what SQL is
When you break up data, the level of granularity is up to you and your specific requirements. For example, addresses are typically stored with the house num- ber and street name together. This is fine, unless you might one day need to sort data by street name, in which case splitting house number and street name would be preferable. Each column in a database has an associated datatype. A *datatype* defines what type of data the column can contain. For example, if the column were to contain a number (perhaps the number of items in an order), the datatype would be a numeric datatype. If the column were to contain dates, text, notes, currency amounts, and so on, the appropriate datatype would be used to specify this. > ***NEW TERM:* Datatype** A type of allowed data. Every table column has an associated datatype that restricts (or allows) specific data in that column. Datatypes restrict the type of data that can be stored in a column (for example, preventing the entry of alphabetical characters into a numeric field). Datatypes also help sort data correctly and play an important role in optimizing disk usage. As such, special attention must be given to picking the right datatype when tables are created. > ***CAUTION:* Datatype Compatibility** Datatypes and their names are one of the primary sources of SQL incompatibil- ity. While most basic datatypes are supported consistently, many more advanced datatypes are not. And worse, occasionally you’ll find that the same datatype is referred to by different names in different DBMSs. There is not much you can do about this, but it is important to keep in mind when you create table schemas. ## Rows Data in a table is stored in rows; each record saved is stored in its own row. Again, envisioning a table as a spreadsheet style grid, the vertical columns in the grid are the table columns, and the horizontal rows are the table rows. For example, a customers table might store one customer per row. The number of rows in the table is the number of records in it. > ***NEW TERM:* Row** A record in a table.
*Should I use the term "Record" or "Row"?* You may hear users refer to database records when referring to rows. For the most part the two terms are used interchangeably, but row is technically the correct term. ## Primary Keys Every row in a table should have some column (or set of columns) that uniquely identifies it. A table containing customers might use a customer number column for this purpose, whereas a table containing orders might use the order ID. An employee list table might use an employee ID. A table containing a list of books might use the ISBN for this purpose. > ***NEW TERM:* Primary key** A column (or set of columns) whose values uniquely identify every row in a table. This column (or set of columns) that uniquely identifies each row in a table is called a primary key. The primary key is used to refer to a specific row. Without a primary key, updating or deleting specific rows in a table becomes extremely difficult as there is no guaranteed safe way to refer to just the rows to be affected. > ***TIP:* Always Define Primary Keys** Although primary keys are not actually required, most database designers ensure that every table they create has a primary key so that future data manipulation is possible and manageable. Any column in a table can be defined as the primary key, as long as it meets the following conditions: * No two rows can have the same primary key value. * Every row must have a value in the primary key column(s). (So, no NULL values.) * Values in primary key columns should never be modified or updated. * Primary key values should never be reused. (If a row is deleted from the table, its primary key may not be assigned to any new rows in the future.) Primary keys are usually defined on a single column within a table. But this is not required, and multiple columns may be used together as a primary key. When multiple columns are used, the rules listed above must apply to all columns, and the values of all columns together must be unique (individual columns need not have unique values). There is another very important type of key called a foreign key, but I’ll get to that later on in Lesson 12, “Joining Tables.” ## What Is SQL? SQL (pronounced as the letters S-Q-L or as sequel) is an abbreviation for Structured Query Language. SQL is a language designed specifically for communicating with databases. Unlike other languages (spoken languages like English, or programming languages like Java, C, or Python), SQL is made up of very few words. This is deliberate. SQL is designed to do one thing and do it well—provide you with a simple and efficient way to read and write data from a database. What are the advantages of SQL? * SQL is not a proprietary language used by specific database vendors. Almost every major DBMS supports SQL, so learning this one language will enable you to interact with just about every database you’ll run into. * SQL is easy to learn. The statements are all made up of descriptive English words, and there aren’t that many of them. * Despite its apparent simplicity, SQL is a very powerful language, and by cleverly using and combining its language elements, you can perform very complex and sophisticated database operations. And with that, let’s learn SQL. > ***NOTE:* SQL Extensions** Many DBMS vendors have extended their support for SQL by adding statements or instructions to the language. The purpose of these extensions is to provide additional functionality or simplified ways to perform specific operations. And while often extremely useful, these extensions tend to be very DBMS specific, and they are rarely supported by more than a single vendor.
Standard SQL is governed by the ANSI standards committee, and is thus called ANSI SQL. All major DBMSs, even those with their own extensions, support ANSI SQL. Individual implementations have their own names (PL-SQL, used by Oracle; Transact-SQL, used by Microsoft SQL Server; and so on). For the most part, the SQL taught in this book is ANSI SQL. On the odd occasion where DBMS-specific SQL is used, it is so noted. ## Try It Yourself As with any language, the best way to learn SQL is to try it for yourself. To help you do this, We've integrated the ability to excecute SQL directly in this book. You'll see it in the next chapter ## Summary In this first lesson, you learned what SQL is and why it is useful. Because SQL is used to interact with databases, you also reviewed some basic database terminology.