w3resource

SQL DISTINCT

Select with distinct

Redundancy is the repetition of certain data in a table. With the use of DISTINCT clause data redundancy may be avoided. This clause will eliminate the repetitive appearance of same data. DISTINCT can come only once in a given select statement.

Syntax:

SELECT DISTINCT <column_name> 
FROM <table_name> 
WHERE <conditions>;

Parameters:

Name Description
column_name Name of the column.
table_name Name of the table.
conditions It may be a condition, a select query or an expression.

Visual presentation :

SQL Distinct

Some important questions related to the SQL DISTINCT keyword

What is the purpose of the DISTINCT keyword in SQL?

  • The DISTINCT keyword is used to retrieve unique values from a specific column or combination of columns in a result set.

  • Does using the DISTINCT keyword affect the order of the result set?

  • No, the DISTINCT keyword does not guarantee the order of the result set. If you need the result set to be sorted, you should use the ORDER BY clause in conjunction with DISTINCT.

  • Can you use the DISTINCT keyword with multiple columns?

  • Yes, the DISTINCT keyword can be used with multiple columns to retrieve unique combinations of values across those columns.

  • Can you use the DISTINCT keyword with aggregate functions?

  • Yes, the DISTINCT keyword can be used in conjunction with aggregate functions like COUNT(), SUM(), AVG(), etc., to perform calculations on unique values.

  • How does the DISTINCT keyword differ from the GROUP BY clause?

  • The DISTINCT keyword is used to retrieve unique values from one or more columns in a result set, while the GROUP BY clause is used to group rows that have the same values into summary rows and perform aggregate functions on them.

  • Can you use the DISTINCT keyword with subqueries?

  • Yes, the DISTINCT keyword can be used with subqueries to retrieve unique values from the result of the subquery.

  • What happens if you use the DISTINCT keyword with a column that contains NULL values?

  • The DISTINCT keyword treats NULL values as unique, so if a column contains NULL values, they will be considered as distinct values.

  • What is the performance impact of using the DISTINCT keyword?

  • Using the DISTINCT keyword may impact performance, especially on large datasets, as it requires sorting and removing duplicate values from the result set.

  • When should you use the DISTINCT keyword?

  • The DISTINCT keyword is used when you want to remove duplicate values from the result set and only display unique values.

  • Are there any alternatives to using the DISTINCT keyword?

  • Yes, you can achieve similar results using the GROUP BY clause, but it may have different implications depending on the specific requirements of the query. Additionally, some databases offer specific functions or operators for removing duplicates without sorting, such as the UNION operator.

  • Example-1: SQL DISTINCT

    Sample table: orders
       ORD_NUM ORD_AMOUNT ADVANCE_AMOUNT ORD_DATE  CUST_CODE       AGENT_CODE      ORD_DESCRIPTION
    ---------- ---------- -------------- --------- --------------- --------------- -----------------
        200114       3500           2000 15-AUG-08 C00002          A008
        200122       2500            400 16-SEP-08 C00003          A004
        200118        500            100 20-JUL-08 C00023          A006
        200119       4000            700 16-SEP-08 C00007          A010
        200121       1500            600 23-SEP-08 C00008          A004
        200130       2500            400 30-JUL-08 C00025          A011
        200134       4200           1800 25-SEP-08 C00004          A005
        200108       4000            600 15-FEB-08 C00008          A004
        200103       1500            700 15-MAY-08 C00021          A005
        200105       2500            500 18-JUL-08 C00025          A011
        200109       3500            800 30-JUL-08 C00011          A010
        200101       3000           1000 15-JUL-08 C00001          A008
        200111       1000            300 10-JUL-08 C00020          A008
        200104       1500            500 13-MAR-08 C00006          A004
        200106       2500            700 20-APR-08 C00005          A002
        200125       2000            600 10-OCT-08 C00018          A005
        200117        800            200 20-OCT-08 C00014          A001
        200123        500            100 16-SEP-08 C00022          A002
        200120        500            100 20-JUL-08 C00009          A002
        200116        500            100 13-JUL-08 C00010          A009
        200124        500            100 20-JUN-08 C00017          A007
        200126        500            100 24-JUN-08 C00022          A002
        200129       2500            500 20-JUL-08 C00024          A006
        200127       2500            400 20-JUL-08 C00015          A003
        200128       3500           1500 20-JUL-08 C00009          A002
        200135       2000            800 16-SEP-08 C00007          A010
        200131        900            150 26-AUG-08 C00012          A012
        200133       1200            400 29-JUN-08 C00009          A002
        200100       1000            600 08-JAN-08 C00015          A003
        200110       3000            500 15-APR-08 C00019          A010
        200107       4500            900 30-AUG-08 C00007          A010
        200112       2000            400 30-MAY-08 C00016          A007
        200113       4000            600 10-JUN-08 C00022          A002
        200102       2000            300 25-MAY-08 C00012          A012
    

    You can use the following SQL statement to retrieve all 'agent_code' values from the 'orders' table.

    SQL Code:

    
    SELECT agent_code FROM orders;
    -- Select the 'agent_code' column from the 'orders' table
    

    Explanation:

    • SELECT agent_code: This line specifies the column that you want to retrieve data from. It selects the 'agent_code' column from the 'orders' table.

    • FROM orders: This line specifies the table from which you want to retrieve data. In this case, it's the 'orders' table.

    Relational Algebra Expression:

    Relational Algebra Expression: SELECT with DISTINCT.

    Relational Algebra Tree:

    Relational Algebra Tree: SELECT with DISTINCT.

    Output:

    AGENT_CODE
    ----------
    A008
    A004
    A006
    A010
    A004
    A011
    A005
    A013
    A004
    A005
    A011
    
    ...
    ...
    

    The above picture shows the same 'agent_code' appears more than once.

    Example-2: SQL DISTINCT

    To retrieve each unique 'agent_code' from the 'orders' table, you can use the following SQL statement.:

    SQL Code:

    
    SELECT DISTINCT agent_code 
    -- Select distinct values of the 'agent_code' column
    FROM orders;
    -- From the table 'orders'
    

    Explanation:

    • SELECT DISTINCT agent_code: This line specifies that you want to retrieve unique/distinct values from the 'agent_code' column. The DISTINCT keyword ensures that only unique values are returned; any duplicate values will be eliminated.

    • FROM orders: This line specifies the table from which you want to retrieve data. In this case, it's the 'orders' table.

    Relational Algebra Expression:

    Relational Algebra Expression: SELECT with DISTINCT.

    Relational Algebra Tree:

    Relational Algebra Tree: SELECT with DISTINCT.

    Output:

    AGENT_CODE
    ----------
    A004
    A002
    A007
    A009
    A011
    A012
    A010
    A013
    A001
    A008
    A006
    A005
    A003
    

    Check out our 1000+ SQL Exercises with solution and explanation to improve your skills.

    

    Follow us on Facebook and Twitter for latest update.