w3resource

SQL Correlated Subqueries


Understanding Co-related Subqueries in SQL

Introduction

SQL Correlated Subqueries retrieve data from a table referenced in the outer query. They are termed "correlated" because the subquery's execution is influenced by the outer query's rows.

When using correlated subqueries, it's essential to employ a table alias (or correlation name) to clarify the table reference intended for use within the subquery. Unlike regular subqueries, co-related subqueries are executed repeatedly, once for each row processed by the outer query

How Co-related Subqueries Work

In an uncorrelated subquery, the subquery is executed independently first, and its result is then used as a value in the outer query. In contrast, a correlated subquery is linked to the outer query by referencing a column and is executed repeatedly, once for each row processed by the outer query.

This dependency allows co-related subqueries to dynamically adapt their results based on the outer query's current row.

Tips:

An alias is a shorthand or nickname assigned to a table by placing it immediately after the table name in the FROM clause. This practice is useful when retrieving data from multiple tables simultaneously.

Syntax:

SELECT column1, column2, ...
FROM table1 outer_alias
WHERE column1 operator (
    SELECT aggregate_function(column)
    FROM table2 inner_alias
    WHERE inner_alias.column = outer_alias.column
);

Example:

Consider two tables:

  1. Employees: Contains employee details.
  2. Departments: Contains department details.

Task: Find employees whose salary is higher than the average salary of their respective departments.

SQL Code:


SELECT e.employee_id, e.name, e.salary, e.department_id
FROM employees e
WHERE e.salary > (
    SELECT AVG(e2.salary)
    FROM employees e2
    WHERE e2.department_id = e.department_id
);
  • Outer Query: Retrieves employee details.
  • Inner Query: Calculates the average salary for the department of the current employee.
  • The WHERE clause ensures only employees earning above their department's average salary are selected.

Uses of Co-related Subqueries

  1. Comparative Analysis: Comparing row-specific data with aggregated results.

  2. Data Validation: Ensuring rows meet conditions based on related data.

  3. Hierarchical Queries: Retrieving parent-child relationships in data.

  4. Filtering: Dynamically filtering rows based on related criteria.

Advantages

  • Allows dynamic filtering based on related data.
  • Provides a powerful way to handle complex queries.

Disadvantages

  • Can be slower due to repeated execution for each outer query row.
  • Difficult to optimize compared to regular subqueries or joins.

Example: SQL Correlated Subqueries

The following correlated subqueries retrieve ord_num, ord_amount, cust_code, and agent_code from the table orders ('a' and 'b' are aliases for the orders and agents tables), based on the following conditions -

The agent_code in the orders table must match the agent_code in the agents table, and the agent_name in the agents table must be 'Alex',

the following SQL statement can be used:

Sample table: orders


   ORD_NUM ORD_AMOUNT ADVANCE_AMOUNT ORD_DATE  CUST_CODE       AGENT_CODE      ORD_DESCRIPTION
---------- ---------- -------------- --------- --------------- --------------- -----------------
    200114       3500           2000 15-AUG-08 C00002          A008
    200122       2500            400 16-SEP-08 C00003          A004
    200118        500            100 20-JUL-08 C00023          A006
    200119       4000            700 16-SEP-08 C00007          A010
    200121       1500            600 23-SEP-08 C00008          A004
    200130       2500            400 30-JUL-08 C00025          A011
    200134       4200           1800 25-SEP-08 C00004          A005
    200108       4000            600 15-FEB-08 C00008          A004
    200103       1500            700 15-MAY-08 C00021          A005
    200105       2500            500 18-JUL-08 C00025          A011
...........

View the table

Sample table: agents

+------------+----------------------+--------------------+------------+-----------------+---------+
| AGENT_CODE | AGENT_NAME           | WORKING_AREA       | COMMISSION | PHONE_NO        | COUNTRY |
+------------+----------------------+--------------------+------------+-----------------+---------+
| A007       | Ramasundar           | Bangalore          |       0.15 | 077-25814763    |         |
| A003       | Alex                 | London             |       0.13 | 075-12458969    |         |
| A008       | Alford               | New York           |       0.12 | 044-25874365    |         |
| A011       | Ravi Kumar           | Bangalore          |       0.15 | 077-45625874    |         |
| A010       | Santakumar           | Chennai            |       0.14 | 007-22388644    |         |
| A012       | Lucida               | San Jose           |       0.12 | 044-52981425    |         |
| A005       | Anderson             | Brisban            |       0.13 | 045-21447739    |         |
| A001       | Subbarao             | Bangalore          |       0.14 | 077-12346674    |         |
| A002       | Mukesh               | Mumbai             |       0.11 | 029-12358964    |         |
| A006       | McDen                | London             |       0.15 | 078-22255588    |         |
| A004       | Ivan                 | Torento            |       0.15 | 008-22544166    |         |
| A009       | Benjamin             | Hampshair          |       0.11 | 008-22536178    |         |
+------------+----------------------+--------------------+------------+-----------------+---------+

SQL Code:


-- Selecting specific columns from the orders table and aliasing it as 'a'
SELECT a.ord_num, a.ord_amount, a.cust_code, a.agent_code
-- Filtering orders based on agent_code
FROM orders a
-- Specifying the condition for filtering
WHERE a.agent_code = (
    -- Selecting agent_code from agents table and aliasing it as 'b'
    SELECT b.agent_code
    -- Filtering agents based on agent_name
    FROM agents b 
    -- Specifying the condition for filtering
    WHERE b.agent_name = 'Alex'
);

Explanation:

  • This SQL query retrieves specific columns from the "orders" table, aliasing it as 'a'.

  • It filters the orders based on the agent_code.

  • The subquery selects the agent_code from the "agents" table, aliasing it as 'b'.

  • It filters the agents based on the agent_name being 'Alex'.

  • The main query then selects orders where the agent_code matches the result of the subquery, meaning it selects orders handled by the agent with the name 'Alex'.

Output:

   ORD_NUM ORD_AMOUNT CUST_CODE  AGENT_CODE
---------- ---------- ---------- ----------
    200127       2500 C00015     A003
    200100       1000 C00015     A003

The inner of the above query returns the 'agent_code' A003.

The simplified form of above code is:

SQL Code:

SELECT a.ord_num,a.ord_amount,a.cust_code,a.agent_code
FROM orders a
WHERE a.agent_code='A003';

Visual Presentation:

SQL Correlated Subqueries


Using EXISTS with a Correlated Subquery

We have already used the EXISTS operator to check the existence of a result of a subquery. EXISTS operator can be used in correlated subqueries also. Using EXISTS the following query display the employee_id, manager_id, first_name and last_name of those employees who manage other employees.

Sample table: employees
employee_id  first_name  last_name   email       phone_number  hire_date   job_id      salary      commission_pct  manager_id  department_id
-----------  ----------  ----------  ----------  ------------  ----------  ----------  ----------  --------------  ----------  -------------
100          Steven      King        SKING       515.123.4567  6/17/1987   AD_PRES     24000                                   90
101          Neena       Kochhar     NKOCHHAR    515.123.4568  6/18/1987   AD_VP       17000                       100         90
102          Lex         De Haan     LDEHAAN     515.123.4569  6/19/1987   AD_VP       17000                       100         90
103          Alexander   Hunold      AHUNOLD     590.423.4567  6/20/1987   IT_PROG     9000                        102         60
104          Bruce       Ernst       BERNST      590.423.4568  6/21/1987   IT_PROG     6000                        103         60
105          David       Austin      DAUSTIN     590.423.4569  6/22/1987   IT_PROG     4800                        103         60
106          Valli       Pataballa   VPATABAL    590.423.4560  6/23/1987   IT_PROG     4800                        103         60
107          Diana       Lorentz     DLORENTZ    590.423.5567  6/24/1987   IT_PROG     4200                        103         60
108          Nancy       Greenberg   NGREENBE    515.124.4569  6/25/1987   FI_MGR      12000                       101         100
109          Daniel      Faviet      DFAVIET     515.124.4169  6/26/1987   FI_ACCOUNT  9000                        108         100
.................................

View the table

SQL Code:


-- Selecting specific columns from the employees table and aliasing it as 'a'
SELECT employee_id, manager_id, first_name, last_name
-- Filtering employees based on the EXISTS condition
FROM employees a
-- Specifying the condition for filtering using the EXISTS keyword
WHERE EXISTS (
    -- Selecting employee_id from employees table and aliasing it as 'b'
    SELECT employee_id
    -- Filtering employees based on manager_id
    FROM employees b
    -- Specifying the condition for filtering
    WHERE b.manager_id = a.employee_id
);

Explanation:

  • This SQL query retrieves specific columns from the "employees" table, aliasing it as 'a'.

  • It filters the employees based on the existence of records in a correlated subquery.

  • The subquery selects the employee_id from the "employees" table, aliasing it as 'b'.

  • It filters the employees based on whether there exists an employee with a manager_id that matches the employee_id from the outer query.

  • The main query then selects employees where the subquery returns at least one record, meaning it selects employees who manage other employees.

Visual Presentation:

SQL: Using EXISTS operator with a Correlated Subquery


Output:

EMPLOYEE_ID MANAGER_ID FIRST_NAME           LAST_NAME
----------- ---------- -------------------- ---------------
        100            Steven               King
        101        100 Neena                Kochhar
        102        100 Lex                  De Haan
        103        102 Alexander            Hunold
        108        101 Nancy                Greenberg
        114        100 Den                  Raphaely
        120        100 Matthew              Weiss
        121        100 Adam                 Fripp
        122        100 Payam                Kaufling
        123        100 Shanta               Vollman
        124        100 Kevin                Mourgos
        145        100 John                 Russell
        146        100 Karen                Partners
        147        100 Alberto              Errazuriz
        148        100 Gerald               Cambrault
        149        100 Eleni                Zlotkey
        201        100 Michael              Hartstein
        205        101 Shelley              Higgins

Using NOT EXISTS with a Correlated Subquery

NOT EXISTS is logically opposite of EXISTS operator. NOT EXISTS is used when we need to check if rows do not exist in the results returned by a subquery. Using NOT EXISTS the following query display the employee_id, manager_id, first_name and last_name of those employees who have no manager status. This query is opposite to the previous one.

Sample table: employees
employee_id  first_name  last_name   email       phone_number  hire_date   job_id      salary      commission_pct  manager_id  department_id
-----------  ----------  ----------  ----------  ------------  ----------  ----------  ----------  --------------  ----------  -------------
100          Steven      King        SKING       515.123.4567  6/17/1987   AD_PRES     24000                                   90
101          Neena       Kochhar     NKOCHHAR    515.123.4568  6/18/1987   AD_VP       17000                       100         90
102          Lex         De Haan     LDEHAAN     515.123.4569  6/19/1987   AD_VP       17000                       100         90
103          Alexander   Hunold      AHUNOLD     590.423.4567  6/20/1987   IT_PROG     9000                        102         60
104          Bruce       Ernst       BERNST      590.423.4568  6/21/1987   IT_PROG     6000                        103         60
105          David       Austin      DAUSTIN     590.423.4569  6/22/1987   IT_PROG     4800                        103         60
106          Valli       Pataballa   VPATABAL    590.423.4560  6/23/1987   IT_PROG     4800                        103         60
107          Diana       Lorentz     DLORENTZ    590.423.5567  6/24/1987   IT_PROG     4200                        103         60
108          Nancy       Greenberg   NGREENBE    515.124.4569  6/25/1987   FI_MGR      12000                       101         100
109          Daniel      Faviet      DFAVIET     515.124.4169  6/26/1987   FI_ACCOUNT  9000                        108         100
.................................

View the table

SQL Code:


-- Selecting specific columns from the employees table and aliasing it as 'a'
SELECT employee_id, manager_id, first_name, last_name
-- Filtering employees based on the NOT EXISTS condition
FROM employees a
-- Specifying the condition for filtering using the NOT EXISTS keyword
WHERE NOT EXISTS (
    -- Selecting employee_id from employees table and aliasing it as 'b'
    SELECT employee_id
    -- Filtering employees based on manager_id
    FROM employees b
    -- Specifying the condition for filtering
    WHERE b.manager_id = a.employee_id
);

Explanation:

  • This SQL query retrieves specific columns from the "employees" table, aliasing it as 'a'.

  • It filters the employees based on the non-existence of records in a correlated subquery.

  • The subquery selects the employee_id from the "employees" table, aliasing it as 'b'.

  • It filters the employees based on whether there does not exist an employee with a manager_id that matches the employee_id from the outer query.

  • The main query then selects employees where the subquery returns no records, meaning it selects employees who do not manage other employees.

Visual Presentation:

SQL: Using NOT EXISTS with a Correlated Subquery


Output:

EMPLOYEE_ID MANAGER_ID FIRST_NAME           LAST_NAME
----------- ---------- -------------------- --------------
        104        103 Bruce                Ernst
        105        103 David                Austin
        106        103 Valli                Pataballa
        107        103 Diana                Lorentz
        109        108 Daniel               Faviet
        110        108 John                 Chen
        111        108 Ismael               Sciarra
        112        108 Jose Manuel          Urman
        113        108 Luis                 Popp
        115        114 Alexander            Khoo
        116        114 Shelli               Baida
        117        114 Sigal                Tobias
        118        114 Guy                  Himuro
        119        114 Karen                Colmenares
        125        120 Julia                Nayer
        126        120 Irene                Mikkilineni
        127        120 James                Landry
        128        120 Steven               Markle
        129        121 Laura                Bissot
        130        121 Mozhe                Atkinson
        131        121 James                Marlow
........
.......

Example: Performance Consideration with Correlated Subqueries

Find orders where the order amount exceeds the average order amount handled by the same agent

SQL Code:


-- Selects order number, order amount, and agent code from the orders table aliased as 'o'.
-- Filters orders where the order amount is greater than the average order amount handled by the same agent.

SELECT ord_num, ord_amount, agent_code
FROM orders o
WHERE ord_amount > (
    -- Subquery: Calculates the average order amount for orders handled by the agent of the current row in the outer query.
    SELECT AVG(ord_amount)
    FROM orders
    WHERE agent_code = o.agent_code
);

Explanation:

  • This SQL query retrieves ord_num, ord_amount, and agent_code from the orders table, aliased as 'o'.

  • It filters the results to include only those orders where the ord_amount is greater than the average ord_amount handled by the same agent_code.

  • The subquery calculates the average ord_amount for orders associated with the agent_code of each row (o.agent_code) in the outer query.

  • This demonstrates the use of a correlated subquery to compare values within the same table based on a condition from the outer query.

Real-World Use Case of Correlated Subquery

List employees who have been with the company longer than any of their direct reports

SQL Code:


-- Selects employee_id, first_name, and last_name from the employees table aliased as 'e'.
-- Filters employees where the hire date is greater than all hire dates of employees managed by them.

SELECT employee_id, first_name, last_name
FROM employees e
WHERE hire_date > ALL (
    -- Subquery: Retrieves hire dates of employees who report to the manager of the current row in the outer query.
    SELECT hire_date
    FROM employees
    WHERE manager_id = e.employee_id
);

Explanation:

  • This SQL query selects employee_id, first_name, and last_name from the employees table, aliased as 'e'.

  • It filters the results to include only those employees whose hire_date is greater than the hire dates of all employees who report to them.

  • The subquery retrieves hire_date values from employees who have the same manager_id as the employee_id of the current row (e.employee_id) in the outer query.

  • This query demonstrates the use of the ALL operator with a correlated subquery to compare values across related rows in the same table based on a condition from the outer query.

Frequently Asked Questions (FAQ) - SQL Correlated Subqueries

1. What are SQL Correlated Subqueries?

SQL Correlated Subqueries are used to select data from a table referenced in the outer query. The subquery in a correlated subquery is related to the outer query, hence the term "correlated". Each execution of the subquery in the correlated subquery depends on the row processed by the outer query.

2. How do Correlated Subqueries differ from Uncorrelated Subqueries?

In SQL, an uncorrelated subquery executes independently and provides a single value or set of values to the outer query. In contrast, a correlated subquery references columns from the outer query and executes once for each row processed by the outer query.

3. What is a table alias (correlation name) used for in Correlated Subqueries?

A table alias, or correlation name, specifies which table reference is to be used in the context of a correlated subquery. It helps differentiate columns from the outer query and the subquery when both reference the same table.

4. When should I use SQL Correlated Subqueries?

Correlated subqueries are useful when you need to filter rows based on conditions evaluated from related rows in another table. They are typically used in scenarios where the filtering condition depends on values from the current row being processed by the outer query.

5. What are some keywords commonly used with Correlated Subqueries?

Commonly used keywords include EXISTS and NOT EXISTS, which help to check for the existence or non-existence of rows returned by a subquery in the context of a correlated subquery.


See our Model Database

Check out our 1000+ SQL Exercises with solution and explanation to improve your skills.

Previous: Multiple Row and Column Subqueries
Next: Nested subqueries



Follow us on Facebook and Twitter for latest update.