SQL COUNT() with distinct
COUNT() function with distinct clause
SQL COUNT() function with DISTINCT clause eliminates the repetitive appearance of the same data. The DISTINCT can come only once in a given select statement.
Syntax :
COUNT(DISTINCT expr,[expr...]) or SELECT COUNT(DISTINCT expression) FROM table_name WHERE condition;
Here, expression refers to the combination of columns or an expression that needs to be evaluated to obtain distinct values, and condition is an optional parameter.
Example :
Sample table: ordersORD_NUM ORD_AMOUNT ADVANCE_AMOUNT ORD_DATE CUST_CODE AGENT_CODE ORD_DESCRIPTION ---------- ---------- -------------- --------- --------------- --------------- ----------------- 200114 3500 2000 15-AUG-08 C00002 A008 200122 2500 400 16-SEP-08 C00003 A004 200118 500 100 20-JUL-08 C00023 A006 200119 4000 700 16-SEP-08 C00007 A010 200121 1500 600 23-SEP-08 C00008 A004 200130 2500 400 30-JUL-08 C00025 A011 200134 4200 1800 25-SEP-08 C00004 A005 200108 4000 600 15-FEB-08 C00008 A004 200103 1500 700 15-MAY-08 C00021 A005 200105 2500 500 18-JUL-08 C00025 A011 200109 3500 800 30-JUL-08 C00011 A010 200101 3000 1000 15-JUL-08 C00001 A008 200111 1000 300 10-JUL-08 C00020 A008 200104 1500 500 13-MAR-08 C00006 A004 200106 2500 700 20-APR-08 C00005 A002 200125 2000 600 10-OCT-08 C00018 A005 200117 800 200 20-OCT-08 C00014 A001 200123 500 100 16-SEP-08 C00022 A002 200120 500 100 20-JUL-08 C00009 A002 200116 500 100 13-JUL-08 C00010 A009 200124 500 100 20-JUN-08 C00017 A007 200126 500 100 24-JUN-08 C00022 A002 200129 2500 500 20-JUL-08 C00024 A006 200127 2500 400 20-JUL-08 C00015 A003 200128 3500 1500 20-JUL-08 C00009 A002 200135 2000 800 16-SEP-08 C00007 A010 200131 900 150 26-AUG-08 C00012 A012 200133 1200 400 29-JUN-08 C00009 A002 200100 1000 600 08-JAN-08 C00015 A003 200110 3000 500 15-APR-08 C00019 A010 200107 4500 900 30-AUG-08 C00007 A010 200112 2000 400 30-MAY-08 C00016 A007 200113 4000 600 10-JUN-08 C00022 A002 200102 2000 300 25-MAY-08 C00012 A012
To get unique number of rows from the 'orders' table with following conditions -
1. only unique cust_code will be counted,
2. result will appear with the heading "Number of employees",
the following SQL statement can be used :
-- Counting the number of distinct values in the 'cust_code' column of the 'orders' table
-- and aliasing the result as "Number of employees"
SELECT COUNT(DISTINCT cust_code) AS "Number of employees"
-- From the 'orders' table
FROM orders;
Explanation:
- SELECT COUNT(DISTINCT cust_code): This is the main part of the SQL query. It uses the COUNT() function with the DISTINCT keyword to count the number of distinct (unique) values in the 'cust_code' column of the 'orders' table. The DISTINCT keyword ensures that each unique value of 'cust_code' is counted only once.
- AS "Number of employees": This renames the column returned by the COUNT(DISTINCT cust_code) function as "Number of employees". The AS keyword is used to give an alias or a new name to the result column. In this case, the result will have a single column with the name "Number of employees".
- FROM orders: This specifies the source of the data for the query, which is the 'orders' table. The FROM keyword is used to indicate the table from which the data will be selected. In this case, it selects data from the 'orders' table.
Output :
Number of employees ------------------- 25
Visual Presentation:
Example :
Sample table : product_mast
id|prod|price|year|
--+----+-----+----+
1|Pro1| 300|2018|
2|Pro2| 400|2019|
3|Pro1| 350|2018|
4|Pro3| 500|2019|
5|Pro2| 450|2018|
To count the number of distinct products sold in the year 2018, we can use the following SQL query:
-- Counting the number of distinct values in the 'prod' column
SELECT COUNT(DISTINCT prod)
-- From the 'product_mast' table
FROM product_mast
-- Filtering the results to include only rows where the 'year' is equal to 2018
WHERE year = 2018;
Explanation:
- SELECT COUNT(DISTINCT prod): This is the main part of the SQL query. It uses the COUNT() function with the DISTINCT keyword to count the number of distinct (unique) values in the 'prod' column of the 'product_mast' table. The DISTINCT keyword ensures that each unique value of 'prod' is counted only once.
- FROM product_mast: This specifies the source of the data for the query, which is the 'product_mast' table. The FROM keyword is used to indicate the table from which the data will be selected. In this case, it selects data from the 'product_mast' table.
- WHERE year = 2018: This is a condition applied to the data being selected. It filters the rows from the 'product_mast' table where the value in the 'year' column is equal to 2018. This condition restricts the count to only include products from the year 2018.
Output :
count --------- 2
This will return the result 2, as there are two distinct products (Pro1 and Pro2) sold in the year 2018.
Example :
Sample table : product_mast
id|prod|price|year|
--+----+-----+----+
1|Pro1| 300|2018|
2|Pro2| 400|2019|
3|Pro1| 350|2018|
4|Pro3| 500|2019|
5|Pro2| 450|2018|
If we want to count the number of distinct product and year combinations, we can use the following SQL query:
-- Counting the number of distinct combinations of 'prod' and 'year' in the 'product_mast' table
SELECT COUNT(DISTINCT (prod, year))
-- From the 'product_mast' table
FROM product_mast;
Explanation:
- SELECT COUNT(DISTINCT CONCAT(prod, year)): This is the main part of the SQL query. It concatenates the values of the 'prod' and 'year' columns for each row using the CONCAT() function, and then counts the number of distinct combinations of these concatenated values using the COUNT() function with the DISTINCT keyword.
- FROM product_mast: This specifies the source of the data for the query, which is the 'product_mast' table. The FROM keyword is used to indicate the table from which the data will be selected. In this case, it selects data from the 'product_mast' table.
Output :
count --------- 4
This will return the result 4, as there are four distinct product and year combinations (Pro1-2018, Pro2-2019, Pro2-2018, and Pro3-2019) in the product_mast table.
Example :
Sample table : product_mast
id|prod|price|year|
--+----+-----+----+
1|Pro1| 300|2018|
2|Pro2| 400|2019|
3|Pro1| 350|2018|
4|Pro3| 500|2019|
5|Pro2| 450|2018|
To count the number of distinct products sold in the year 2018 with a price greater than 300, we can use the following SQL query:
-- Counting the number of distinct values in the 'prod' column
SELECT COUNT(DISTINCT prod)
-- From the 'product_mast' table
FROM product_mast
-- Filtering the results to include only rows where the 'year' is equal to 2018
-- and the 'price' is greater than 300
WHERE year = 2018 AND price > 300;
Explanation:
- SELECT COUNT(DISTINCT prod): This is the main part of the SQL query. It selects the count of distinct (unique) values in the 'prod' column of the 'product_mast' table using the COUNT() function with the DISTINCT keyword.
- FROM product_mast: This specifies the source of the data for the query, which is the 'product_mast' table. The FROM keyword is used to indicate the table from which the data will be selected.
- WHERE year = 2018 AND price > 300;: This is a condition applied to the data being selected. It filters the rows from the 'product_mast' table where the value in the 'year' column is equal to 2018 and the value in the 'price' column is greater than 300. This condition restricts the count to only include products from the year 2018 with a price greater than 300.
Output :
count --------- 2
This will return the result 2, as there are two distinct products (Pro1 and Pro2) sold in the year 2018 with a price greater than 300.
Example :
Sample table : product_mast
id|prod|price|year|
--+----+-----+----+
1|Pro1| 300|2018|
2|Pro2| 400|2019|
3|Pro1| 350|2018|
4|Pro3| 500|2019|
5|Pro2| 450|2018|
If we want to count the number of distinct product and year combinations with a price greater than 400, we can use the following SQL query:
-- Counting the number of distinct combinations of 'prod' and 'year'
SELECT COUNT(DISTINCT (prod, year))
-- From the 'product_mast' table
FROM product_mast
-- Filtering the results to include only rows where the 'price' is greater than 400
WHERE price > 400;
Explanation:
- SELECT COUNT(DISTINCT CONCAT(prod, year)): This is the main part of the SQL query. It concatenates the values of the 'prod' and 'year' columns for each row using the CONCAT() function, and then counts the number of distinct combinations of these concatenated values using the COUNT() function with the DISTINCT keyword.
- FROM product_mast: This specifies the source of the data for the query, which is the 'product_mast' table. The FROM keyword is used to indicate the table from which the data will be selected. In this case, it selects data from the 'product_mast' table.
- WHERE price > 400;: This is a condition applied to the data being selected. It filters the rows from the 'product_mast' table where the value in the 'price' column is greater than 400. This condition restricts the count to only include products with a price greater than 400.
Output :
count --------- 2
This will return the result 2, as there is two distinct products and year combination (Pro3-2022, Pro2-2018) in the product_mast table with a price greater than 400.
SQL COUNT( ) with All
In the following, we have discussed the usage of ALL clause with SQL COUNT() function to count only the non NULL value for the specified column within the argument. The difference between ‘*’(asterisk) and ALL are, '*' counts the NULL value also but ALL counts only NON NULL value.
Example:
To get data of number of valid 'grade' from the 'customer' table with the following condition -
1. every customer must be a valid grade,
the following SQL statement can be used :
-- Counting the total number of rows in the 'customer' table,
-- including duplicate values in the 'grade' column
SELECT COUNT(ALL grade)
-- From the 'customer' table
FROM customer;
Explanation:
- SELECT COUNT(ALL grade): This is the main part of the SQL query. It uses the COUNT() function to count the number of values in the 'grade' column of the 'customer' table. The ALL keyword is optional and doesn't change the behavior in this context; it's used to explicitly indicate that all values, including NULLs, should be counted. However, ALL is the default behavior for COUNT() if no distinct keyword is specified.
- FROM customer;: This specifies the source of the data for the query, which is the 'customer' table. The FROM keyword is used to indicate the table from which the data will be selected. In this case, it selects data from the 'customer' table.
+-----------+-------------+-------------+--------------+--------------+-------+-------------+-------------+-------------+---------------+--------------+------------+ |CUST_CODE | CUST_NAME | CUST_CITY | WORKING_AREA | CUST_COUNTRY | GRADE | OPENING_AMT | RECEIVE_AMT | PAYMENT_AMT |OUTSTANDING_AMT| PHONE_NO | AGENT_CODE | +-----------+-------------+-------------+--------------+--------------+-------+-------------+-------------+-------------+---------------+--------------+------------+ | C00013 | Holmes | London | London | UK | 2 | 6000.00 | 5000.00 | 7000.00 | 4000.00 | BBBBBBB | A003 | | C00001 | Micheal | New York | New York | USA | 2 | 3000.00 | 5000.00 | 2000.00 | 6000.00 | CCCCCCC | A008 | | C00020 | Albert | New York | New York | USA | 3 | 5000.00 | 7000.00 | 6000.00 | 6000.00 | BBBBSBB | A008 | | C00025 | Ravindran | Bangalore | Bangalore | India | 2 | 5000.00 | 7000.00 | 4000.00 | 8000.00 | AVAVAVA | A011 | | C00024 | Cook | London | London | UK | 2 | 4000.00 | 9000.00 | 7000.00 | 6000.00 | FSDDSDF | A006 | | C00015 | Stuart | London | London | UK | 1 | 6000.00 | 8000.00 | 3000.00 | 11000.00 | GFSGERS | A003 | | C00002 | Bolt | New York | New York | USA | 3 | 5000.00 | 7000.00 | 9000.00 | 3000.00 | DDNRDRH | A008 | | C00018 | Fleming | Brisban | Brisban | Australia | 2 | 7000.00 | 7000.00 | 9000.00 | 5000.00 | NHBGVFC | A005 | | C00021 | Jacks | Brisban | Brisban | Australia | 1 | 7000.00 | 7000.00 | 7000.00 | 7000.00 | WERTGDF | A005 | | C00019 | Yearannaidu | Chennai | Chennai | India | 1 | 8000.00 | 7000.00 | 7000.00 | 8000.00 | ZZZZBFV | A010 | | C00005 | Sasikant | Mumbai | Mumbai | India | 1 | 7000.00 | 11000.00 | 7000.00 | 11000.00 | 147-25896312 | A002 | | C00007 | Ramanathan | Chennai | Chennai | India | 1 | 7000.00 | 11000.00 | 9000.00 | 9000.00 | GHRDWSD | A010 | | C00022 | Avinash | Mumbai | Mumbai | India | 2 | 7000.00 | 11000.00 | 9000.00 | 9000.00 | 113-12345678 | A002 | | C00004 | Winston | Brisban | Brisban | Australia | 1 | 5000.00 | 8000.00 | 7000.00 | 6000.00 | AAAAAAA | A005 | | C00023 | Karl | London | London | UK | 0 | 4000.00 | 6000.00 | 7000.00 | 3000.00 | AAAABAA | A006 | | C00006 | Shilton | Torento | Torento | Canada | 1 | 10000.00 | 7000.00 | 6000.00 | 11000.00 | DDDDDDD | A004 | | C00010 | Charles | Hampshair | Hampshair | UK | 3 | 6000.00 | 4000.00 | 5000.00 | 5000.00 | MMMMMMM | A009 | | C00017 | Srinivas | Bangalore | Bangalore | India | 2 | 8000.00 | 4000.00 | 3000.00 | 9000.00 | AAAAAAB | A007 | | C00012 | Steven | San Jose | San Jose | USA | 1 | 5000.00 | 7000.00 | 9000.00 | 3000.00 | KRFYGJK | A012 | | C00008 | Karolina | Torento | Torento | Canada | 1 | 7000.00 | 7000.00 | 9000.00 | 5000.00 | HJKORED | A004 | | C00003 | Martin | Torento | Torento | Canada | 2 | 8000.00 | 7000.00 | 7000.00 | 8000.00 | MJYURFD | A004 | | C00009 | Ramesh | Mumbai | Mumbai | India | 3 | 8000.00 | 7000.00 | 3000.00 | 12000.00 | Phone No | A002 | | C00014 | Rangarappa | Bangalore | Bangalore | India | 2 | 8000.00 | 11000.00 | 7000.00 | 12000.00 | AAAATGF | A001 | | C00016 | Venkatpati | Bangalore | Bangalore | India | 2 | 8000.00 | 11000.00 | 7000.00 | 12000.00 | JRTVFDD | A007 | | C00011 | Sundariya | Chennai | Chennai | India | 3 | 7000.00 | 11000.00 | 7000.00 | 11000.00 | PPHGRTS | A010 | +-----------+-------------+-------------+--------------+--------------+-------+-------------+-------------+-------------+---------------+--------------+------------+
Output :
COUNT(ALLGRADE) --------------- 25
Note: Outputs of the said SQL statement shown here is taken by using Oracle Database 10g Express Edition.
Here is a slide presentation of all aggregate functions.
Previous: COUNT Function
Next: COUNT with Group by
- Weekly Trends and Language Statistics
- Weekly Trends and Language Statistics