SQL GROUP BY Clause - Count, Sum, And More

When working with databases and doing general data investigation the GROUP BY clause is going to become a part of your regular vocabulary. It’s very useful when writing SQL – whether it’s for reporting or trouble-shooting data issues and It’s especially useful for finding duplicates.

Note: When writing a SQL statement with a GROUP BY clause, you cannot or should not select more columns than you are grouping by.

Some databases allow you to select more columns than selected in a GROUP BY such as the DBeaver SQLite sample database, but if you don’t know the data inside and out this is a recipe for disaster. Usually the SQL editor will display an error from your database as you’ve selected more columns that you are grouping by.

Let’s give an example of using a GROUP BY. Let’s say that we want to know how many employees we have at their respective title within our organization. We could write a query like this:

SELECT TITLE, COUNT(*)
FROM EMPLOYEE E
— WHERE TITLE = ‘IT Staff’ — Example to show that a WHERE clause must come before GROUP BY
GROUP BY TITLE;

Looking At Practical Examples Of GROUP BY

We can also rename the COUNT(*) to something more relevant:

Let’s say instead of getting an employee count, we want to get the invoice totals by state. We could use the GROUP BY with a SUM:

The number of aggregate functions can vary based on the database, but other aggregate functions that can be used with a SQL GROUP BY include:

AVG()
MAX()
MIN()

These sorts of functions can answer a wide variety of questions such as:

Which company is bringing us the most income?
Where are most of our companies located?
What is the average amount invoiced per customer?
How many employees live outside of the U.S.?

This list of questions is never-ending, but hopefully you can see how these sorts of questions can be very useful. These types of data questions can offer insights to a company. Similarly, insights to a business are also referred to as Business Intelligence.

Next Section: The HAVING Clause – Filtering Aggregates