Two very essential keywords in the SQL vocabulary are COUNT and DISTINCT. Both keywords are essential in providing the count of records from objects and to determine uniqueness in a selected dataset.
Why are they so important?
- SELECT COUNT – returns the number of records from a SQL query.
- SELECT DISTINCT – is used to eliminate duplication in a result set and can also be used with COUNT to determine a number of distinct records in a table.
We can see our table of customers has 59 records. SELECT COUNT(*) is asking to give us the number of all records in the customer table, but what about DISTINCT records?
Still 59 records – that’s good! If the count was less that would mean we had duplicate records in the table. Usually duplication is something we want to avoid.
The DISTINCT keyword has an alternative use. For example, let’s say I want to know how many customers have unique first names. I could use SQL to try and answer the question like this:
Out of the 59 customers, 57 have a unique first name! That makes sense and had our dataset been larger we may have found it strange that all first names were potentially unique. What about last names?
In our customer table everyone has a unique last name. As we can see, using the DISTINCT and COUNT operators can be very useful when trying to find out certain information from a database. Instead of last names we could’ve looked at a unique list of countries for our customers. If we wanted to do an advertising campaign focused on a specific location this could be useful. Maybe a company only wants to hone in on the country with the largest customer base. Writing SQL can get us the information a company may need.