The fact that you are reading this book indicates that you, somehow, need to interact with databases. And so before diving into MySQL and its implementation of the SQL language, it is important that you understand some basic concepts about databases and database technologies.
Whether you are aware of it or not, you use databases all the time. Each time you select a name from your email address book, you are using a database. If you conduct a search on an Internet search site, you are using a database. When you log into your network at work, you are validating your name and password against a database. Even when you use your ATM card at a cash machine, you are using databases for PIN verification and balance checking.
But even though we all use databases all the time, there remains much confusion over what exactly a database is. This is especially true because different people use the same database terms to mean different things. Therefore, a good place to start our study is with a list and explanation of the most important database terms.
Reviewing Basic Concepts What follows is a very brief overview of some basic database concepts. It is intended to either jolt your memory if you already have some database experience, or to provide you with the absolute basics, if you are new to databases. Understanding databases is an important part of mastering MySQL, and you might want to find a good book on database fundamentals to brush up on the subject if needed.
What Is a Database?
The term database is used in many different ways, but for our purposes a database is a collection of data stored in some organized fashion. The simplest way to think of it is to imagine a database as a filing cabinet. The filing cabinet is simply a physical location to store data, regardless of what that data is or how it is organized.
Database A container (usually a file or set of files) to store organized data.
Misuse Causes Confusion People often use the term database to refer to the database software they are running. This is incorrect, and it is a source of much confusion. Database software is actually called the Database Management System (or DBMS). The database is the container created and manipulated via the DBMS. A database might be a file stored on a hard drive, but it might not. And for the most part this is not even significant as you never access a database directly anyway; you always use the DBMS and it accesses the database for you.
In the database world, that file is called a table. A table is a structured file that can store data of a specific type. A table might contain a list of customers, a product catalog, or any other list of information.
Table A structured list of data of a specific type.
The key here is that the data stored in the table is one type of data or one list. You would never store a list of customers and a list of orders in the same database table. Doing so would make subsequent retrieval and access difficult. Rather, you'd create two tables, one for each list.
Every table in a database has a name that identifies it. That name is always uniquemeaning no other table in that database can have the same name.
Table Names What makes a table name unique is actually a combination of several things, including the database name and table name. This means that while you cannot use the same table name twice in the same database, you definitely can reuse table names in different databases.
Tables have characteristics and properties that define how data is stored in them. These include information about what data may be stored, how it is broken up, how individual pieces of information are named, and much more. This set of information that describes a table is known as a schema, and schema are used to describe specific tables within a database, as well as entire databases (and the relationship between tables in them, if any).
Schema or Database? Occasionally schema is used as a synonym for database (and schemata as a synonym for databases). While unfortunate, it is usually clear from the context which meaning of schema is intended. In this book, schema will refer to the definition given here.
Columns and Datatypes
Tables are made up of columns. A column contains a particular piece of information within a table.
The best way to understand this is to envision database tables as grids, somewhat like spreadsheets. Each column in the grid contains a particular piece of information. In a customer table, for example, one column contains the customer number, another contains the customer name, and the address, city, state, and ZIP Code are all stored in their own columns.
Breaking Up Data It is extremely important to break data into multiple columns correctly. For example, city, state, and ZIP Code should always be separate columns. By breaking these out, it becomes possible to sort or filter data by specific columns (for example, to find all customers in a particular state or in a particular city). If city and state are combined into one column, it would be extremely difficult to sort or filter by state.
Each column in a database has an associated datatype. A datatype defines what type of data the column can contain. For example, if the column is to contain a number (perhaps the number of items in an order), the datatype would be a numeric datatype. If the column were to contain dates, text, notes, currency amounts, and so on, the appropriate datatype would be used to specify this.
Datatypes restrict the type of data that can be stored in a column (for example, preventing the entry of alphabetical characters into a numeric field). Datatypes also help sort data correctly, and play an important role in optimizing disk usage. As such, special attention must be given to picking the right datatype when tables are created.
Data in a table is stored in rows; each record saved is stored in its own row. Again, envisioning a table as a spreadsheet style grid, the vertical columns in the grid are the table columns, and the horizontal rows are the table rows.
For example, a customers table might store one customer per row. The number of rows in the table is the number of records in it.
Row A record in a table.
Every row in a table should have some column (or set of columns) that uniquely identifies it. A table containing customers might use a customer number column for this purpose, whereas a table containing orders might use the order ID. An employee list table might use an employee ID or the employee Social Security number column.
Primary Key A column (or set of columns) whose values uniquely identify every row in a table.
This column (or set of columns) that uniquely identifies each row in a table is called a primary key. The primary key is used to refer to a specific row. Without a primary key, updating or deleting specific rows in a table becomes extremely difficult because there is no guaranteed safe way to refer to just the rows to be affected.
Always Define Primary Keys Although primary keys are not actually required, most database designers ensure that every table they create has a primary key so future data manipulation is possible and manageable.
Any column in a table can be established as the primary key, as long as it meets the following conditions:
Primary Key Rules The rules listed here are enforced by MySQL itself.
Primary keys are usually defined on a single column within a table. But this is not required, and multiple columns may be used together as a primary key. When multiple columns are used, the rules previously listed must apply to all columns that make up the primary key, and the values of all columns together must be unique (individual columns need not have unique values).
Primary Key Best Practices In addition to the rules that MySQL enforces, several universally accepted best practices that should be adhered to are
There is another very important type of key called a foreign key, but I'll get to that later on in Chapter 15, "Joining Tables."