How to Use the Database Functions in Excel
The Excel database functions are designed to help one retrieve relevant information from large sets of data stored in Excel. There are twelve database functions available in Excel namely: DAVERAGE, DCOUNT, DCOUNTA, DGET, DMAX, DMIN, DPRODUCT, DSTDEV, DSTDEVP, DSUM, DVAR, and DVARP.
All the database functions are structured in the same way in terms of input arguments:
For example, the DSUM function has the following arguments
=DSUM (database, field, criteria)
- The database is the entire range of cells that makes up the database.
- The field is the column used in the function.
- The criteria is the cells that specify the conditions one wants to apply.
In this tutorial, we are going to review how to use the DGET, DAVERAGE and DMAX database functions. So, let’s get started with a simple example.
There is a great deal of interest in tennis court surfaces, in competitive tennis leagues. The three major types of court surfaces are hard courts, grass courts and clay courts. The three have associated advantages and disadvantages. The resultant bounce of the tennis ball upon impact is very different for the different surfaces.
Traditionally tennis was played on grass courts. The grass utilized for the surface is grown in hard packed soil and is similar to the grass used for golf courts. The behavior of the ball upon impact with the grass surface is that it has a relatively low bounce and skids across the surface quickly. The behavior of the ball can be unpredictable when bouncing and strong serves are a challenge to return to grass courts. The type of play that occurs most often is a serve and volley type of play. Wimbledon, the most famous tennis tournament in the world is played on grass courts.
Clay courts are the softest of the three surfaces. Clay courts come in a variety of colors. Green clay is made up of crushed basalt while red clay is comprised of crushed brick and tile. The ball’s behavior on clay is slower than grass, with a higher more predictable bounce. So players that rely mostly on serves don’t tend to dominate as much on clay as they do on the other surfaces. Clay also tends to be mentally and physically demanding, but not as demanding as grass courts. Clay courts tend to need a lot of maintenance. The French Open uses clay courts.
Hard courts are the most popular globally and don’t require as much maintenance in comparison to grass courts and clay courts. Hard courts have a concrete or asphalt base and the top layer or coating is a mixture of rubber, latex, and other plastic particles. Hard courts in terms of ball speed and bounce are in the middle between grass and clay courts. The bounce is consistent and suited to players who hit the ball with strength. Both the US Open and the Australian Open are played on hard courts.
In our example, a hypothetical tennis coach running a tennis clinic franchise is evaluating the performance of the players, who come to the clinic franchises, on the different court surfaces. He is going to use different database functions in order to query the data according to certain criteria. A picture of the source data is shown below.
The DGET Function
1) First things first, make a copy of the original data since its always best to keep a version of the source data intact. We want to see the name of the player, whose PlayerID is #1077. So we can use the DGET function to retrieve this value.
2) On the new sheet, copy the field names of the database to Row 2. It is very important that they are exactly the same since this will form part of our criteria.
3) Enter the value #1077 in Cell A3, as shown.
4) In Cell H5 enter the following formula:
=DGET ($A$6: $E$106, “Player Name”, $A$2: $E$3)
The database, in this case, refers to the entire range of data from $A$6:$E$106, field refers to the column containing the value which one wants to retrieve data from, and criteria forms the basis of the retrieval. In this case, we want to retrieve the name of the player whose PlayerID is #1077, this follows a similar logic to a SQL query.
5) When we press CTRL-ENTER, Alyssa James is returned since she is the player whose PlayerID is #1077. This is also similar to the way one uses a VLOOKUP or HLOOKUP function to lookup a value in another column, based on an input value in a different column.
The DAVERAGE Function
1) Again, we make a copy of the original data since its always best to keep a version of the source data intact. We want to find out what the average age of the female players who attend the clinics, is.
2) So like in our example above, on a new sheet, copy the field names of the database to Row 2, in order to ensure that they are exactly the same.
3) Enter the value Female in Cell C3, as shown.
4) In Cell H6 enter the following formula:
=DAVERAGE ($A$6: $E$106, “Age”, $A$2: $E$3)
5) When we press CTRL-ENTER, 21.55 which evaluates to 21 years, is returned so that means that the average age of the female players attending the tennis clinics is 21 years old.
The DMAX Function
1) Once again, we make a copy of the original data. We now want to see the oldest male playing on hard courts, attending the clinics. The database functions can accept multiple criteria, so this is not an issue to compute.
2) So like in our examples above, on the new sheet, copy the field names of the database to Row 2, in order to ensure that they are exactly the same.
3) Enter the values Male in Cell C3, and Hard Court in E3 as shown.
4) In Cell H6 enter the following formula:
=DMAX ($A$6: $E$106, “Age”, $A$2: $E$3)
5) When we press CTRL-ENTER, 28 is returned which means that the oldest male player, of those who play best on hard courts, is 28 years old.
Download working file
The database functions are useful as lookup functions and can take multiple criteria. Their only drawback is that they require the data and the criteria fields to be setup in a certain manner.
Please feel free to comment on whether you use database functions in Excel or play tennis.