Complementary to the GROUP BY clause is the HAVING clause and may only be used when grouping data. The HAVING clause will assist in filtering grouped data by aggregate very similarly as the WHERE clause is used to filter data. This clause is also commonly used when trying to find duplicate records in data when expecting a table to have only unique records. We can see an example of this shown below.
In the above example, we may expect the employee table to only show us unique records based on the employee first name and last name. The HAVING clause can filter out all records that are unique. One common and problematic inclination might be to not use the HAVING keyword, but use the WHERE clause instead. Unfortunately, the WHERE clause does not allow using aggregate functions as filters. We can see below if we try to write a query this way we get an error of misuse of the aggregate.
Many comparison operators are allowed such as equals (=), less than (<), greater than (>), less than equals (<=), and greater than equals (>=). Let’s say we want to see the average salary of our employees by their Title, but only where the average salary is greater than $30k. We could write a query for that.
SELECT Title, AVG(Salary) AS AverageSalary
FROM Employee
GROUP BY Title
HAVING AVG(Salary) > 30000;
The HAVING clause is very useful and commonly used in reports, views, data investigation, and more. We may find a lot of common use cases – some discussed already and many others that we’ll discover as we gather requirements and continue to do data investigation.
Some examples include:
- Finding duplicates in data (ex. HAVING COUNT(*) > 1)
- The opposite – finding unique records (ex. HAVING COUNT(*) = 1)
- Finding the number of employees for a region (ex. HAVING COUNT(EmployeeId) > 10)
- Customers with more than 100 orders (ex. HAVING COUNT(OrderId) > 100)