# Aggregate Functions in SQL

Functions in SQL enable we to perform feats such as determining the sum of a column or converting all the characters of a string to uppercase.

### On this page

By the end of the day, we will understand and be able to use all the following:

Date and time functions

Arithmetic functions

Character functions

Aggregate functions

Conversion functions

General Functions

Miscellaneous functions

These functions greatly increase our ability to manipulate the information we retrieved using the basic functions of SQL. The first five aggregate functions, COUNT, SUM, AVG, MAX, and MIN, are defined in the ANSI standard. Most implementations of SQL have extensions to these aggregate functions, some of which are covered today. Some implementations may use different names for these functions.

### Aggregate Functions

These functions are also referred to as group functions. They return a value based on the values in a column. (After all, we wouldn’t ask for the average of a single field.) The examples in this section use the table TEAMSTATS:

```
SQL> SELECT * FROM TEAMSTATS;
```

NAME | POS | AB | HITS | WALKS | SINGLES | DOUBLES | TRIPLES | HR | SO |
---|---|---|---|---|---|---|---|---|---|

JONES | 1B | 145 | 45 | 34 | 31 | 8 | 1 | 5 | 10 |

DONKNOW | 3B | 175 | 65 | 23 | 50 | 10 | 1 | 4 | 15 |

WORLEY | LF | 157 | 49 | 15 | 35 | 8 | 3 | 3 | 16 |

DAVID | OF | 187 | 70 | 24 | 48 | 4 | 0 | 17 | 42 |

HAMHOCKER | 3B | 50 | 12 | 10 | 10 | 2 | 0 | 0 | 13 |

CASEY | DH | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 1 |

6 rows selected.

#### COUNT

The function COUNT returns the number of rows that satisfy the condition in the WHERE clause. Say we wanted to know how many ball players were hitting under 350. We would type

```
SQL> SELECT COUNT(*) FROM TEAMSTATS WHERE HITS/AB < .35;
COUNT(*)
--------
4
```

To make the code more readable, try an alias:

```
SQL> SELECT COUNT(*) NUM_BELOW_350 FROM TEAMSTATS WHERE HITS/AB < .35;
NUM_BELOW_350
-------------
4
```

Would it make any difference if we tried a column name instead of the asterisk? (Notice the use of parentheses around the column names.) Try this:

```
SQL> SELECT COUNT(NAME) NUM_BELOW_350 FROM TEAMSTATS WHERE HITS/AB < .35;
NUM_BELOW_350
-------------
4
```

The answer is no. The NAME column that we selected was not involved in the WHERE statement. If we use COUNT without a WHERE clause, it returns the number of records in the table.

```
SQL> SELECT COUNT(*) FROM TEAMSTATS;
COUNT(*)
---------
6
```

#### SUM

SUM does just that. It returns the sum of all values in a column. To find out how many singles have been hit, type

```
SQL> SELECT SUM(SINGLES) TOTAL_SINGLES FROM TEAMSTATS;
TOTAL_SINGLES
-------------
174
```

To get several sums, use

```
SQL> SELECT SUM(SINGLES) TOTAL_SINGLES, SUM(DOUBLES) TOTAL_DOUBLES,
SUM(TRIPLES) TOTAL_TRIPLES, SUM(HR) TOTAL_HR FROM TEAMSTATS;
TOTAL_SINGLES TOTAL_DOUBLES TOTAL_TRIPLES TOTAL_HR
------------- ------------- ------------- --------
174 32 5 29
```

To collect similar information on all 300 or better players, type

```
SQL> SELECT SUM(SINGLES) TOTAL_SINGLES, SUM(DOUBLES) TOTAL_DOUBLES,
SUM(TRIPLES) TOTAL_TRIPLES, SUM(HR) TOTAL_HR FROM TEAMSTATS WHERE HITS/AB
>- .300;
TOTAL_SINGLES TOTAL_DOUBLES TOTAL_TRIPLES TOTAL_HR
------------- ------------- ------------- --------
164 30 5 29
```

To compute a team batting average, type

```
SQL> SELECT SUM(HITS)/SUM(AB) TEAM_AVERAGE FROM TEAMSTATS;
TEAM_AVERAGE
------------
.33706294
```

SUM works only with numbers. If we try it on a non-numerical field, we get

```
SQL> SELECT SUM(NAME)FROM TEAMSTATS;
***### ERROR:
ORA-01722: invalid number
no rows selected
```

This error message is logical because we cannot sum a group of names.

#### AVG

The AVG function computes the average of a column. To find the average number of strike outs, use this:

```
SQL> SELECT AVG(SO) AVE_STRIKE_OUTS FROM TEAMSTATS;
AVE_STRIKE_OUTS
---------------
16.166667
```

The following example illustrates the difference between SUM and AVG:

```
SQL> SELECT AVG(HITS/AB) TEAM_AVERAGE FROM TEAMSTATS;
TEAM_AVERAGE
------------
.26803448
```

Analysis:

The team was batting over 300 in the previous example! What happened? AVG computed the average of the combined column hits divided by at bats, whereas the example with SUM divided the total number of hits by the number of at bats. For example, player A gets 50 hits in 100 at bats for a .500 average. Player B gets 0 hits in 1 at bat for a 0.0 average. The average of 0.0 and 0.5 is .250. If we compute the combined average of 50 hits in 101 at bats, the answer is a respectable .495. The following statement returns the correct batting average:

```
SQL> SELECT AVG(HITS)/AVG(AB) TEAM_AVERAGE FROM TEAMSTATS;
TEAM_AVERAGE
------------
.33706294
```

Like the SUM function, AVG works only with numbers.

#### MAX

If we want to find the largest value in a column, use MAX. For example, what is the highest number of hits?

```
SQL> SELECT MAX(HITS)FROM TEAMSTATS;
MAX(HITS)
---------
70
```

Can we find out who has the most hits?

```
SQL> SELECT NAME FROM TEAMSTATS WHERE HITS = MAX(HITS);
***### ERROR at line 3:
ORA-00934: group function is not allowed here
```

Unfortunately, we can’t. The error message is a reminder that this group function ***### (remember that aggregate functions are also called group functions) doesnâ€™t work in WHERE clause. What happens if we try a non-numerical column?

```
SQL> SELECT MAX(NAME)FROM TEAMSTATS;
MAX(NAME)
---------------
WORLEY
```

Here’s something new. MAX returns the highest (closest to Z) string. Finally, a function works with both characters and numbers.

#### MIN

MIN returns the lowest member of a column. To find out the fewest at bats, type

```
SQL> SELECT MIN(AB)FROM TEAMSTATS;
MIN(AB)
---------
1
```

The following statement returns the name closest to the beginning of the alphabet:

```
SQL> SELECT MIN(NAME)FROM TEAMSTATS;
MIN(NAME)
---------------
CASEY
```

We can combine MIN with MAX to give a range of values. For example:

```
SQL> SELECT MIN(AB), MAX(AB) FROM TEAMSTATS;
MIN(AB) MAX(AB)
-------- --------
1 187
```

This sort of information can be useful when using statistical functions. NOTE: As we mentioned in the introduction, the first five aggregate functions are described in the ANSI standard. The remaining aggregate functions have become de facto standards, present in all important implementations of SQL. We use the Oracle7 names for these functions. Other implementations may use different names.

#### VARIANCE

VARIANCE produces square of standard deviation, a number vital to many statistical calculations. It works like this:

```
SQL> SELECT VARIANCE(HITS)FROM TEAMSTATS;
VARIANCE(HITS)
--------------
802.96667
```

If we try a string

```
SQL> SELECT VARIANCE(NAME)FROM TEAMSTATS;
ERROR:
ORA-01722: invalid number
no rows selected
```

We find that VARIANCE is another function that works exclusively with numbers.

⌖ sql rdbms