SQL Equi Join
What is Equi Join in SQL?
An Equi Join in SQL is a type of join that combines rows from two or more tables based on a common column or set of columns, using the equality operator = to compare column values.
It is one of the most frequently used types of joins because it matches columns from different tables that have equal values, allowing the combination of related data.
Why are Equi Joins Important?
Equi Joins are crucial for:
- Data Integrity: Ensuring only related data from multiple tables is combined.
- Query Performance: Optimizing data retrieval by leveraging indexes on join columns.
- Data Normalization: Supporting database normalization by joining tables that have been split into logically related entities.
Comparison with Other Joins
While Equi Joins use the equality operator, other joins might use different conditions:
- Inner Join: Similar to Equi Join but can use any comparison operator.
- Outer Join: Returns all rows from one table and the matched rows from the other.
- Cross Join: Combines all rows from both tables, resulting in a Cartesian product.
- Natural Join: Automatically joins tables based on columns with the same names and compatible data types.
Pictorial presentation of SQL Equi Join:
Syntax:
SELECT column_list FROM table1, table2.... WHERE table1.column_name = table2.column_name;
or
SELECT * FROM table1 JOIN table2 [ON (join_condition)]
Example:
Here is an example of Equi Join in SQL.
Sample table: agents
Sample table: customer
To get agent name column from agents table and cust name and cust city columns from customer table after joining said two tables with the following condition -
1. working area of agents and customer city of customer table must be same,
the following SQL statement can be used:
SQL Code:
-- Selecting columns agent_name, cust_name, and cust_city from tables agents and customer
SELECT agents.agent_name, customer.cust_name, customer.cust_city
-- Specifying the tables involved in the query
FROM agents, customer
-- Filtering the rows where the working_area of agents matches the cust_city of customers
WHERE agents.working_area = customer.cust_city;
Explanation:
- This SQL query retrieves data from the 'agents' and 'customer' tables based on a specific join condition.
- It selects the 'agent_name' column from the 'agents' table and the 'cust_name' and 'cust_city' columns from the 'customer' table.
- The query performs an equi join between the 'agents' and 'customer' tables, joining rows where the 'working_area' column in the 'agents' table matches the 'cust_city' column in the 'customer' table.
- The result set will include the names of agents and customers who are located in the same city, indicating which agent serves which customer in their respective cities.
Output:
AGENT_NAME CUST_NAME CUST_CITY ---------------------------------------- ---------------------------------------- ------------ Ravi Kumar Ravindran Bangalore Ramasundar Ravindran Bangalore Subbarao Ravindran Bangalore Ravi Kumar Srinivas Bangalore Ramasundar Srinivas Bangalore Subbarao Srinivas Bangalore Ravi Kumar Rangarappa Bangalore Ramasundar Rangarappa Bangalore Subbarao Rangarappa Bangalore Ravi Kumar Venkatpati Bangalore Ramasundar Venkatpati Bangalore Subbarao Venkatpati Bangalore Anderson Fleming Brisban Anderson Jacks Brisban Anderson Winston Brisban Santakumar Yearannaidu Chennai ........... ...........
Example with three tables
To get how much did customers who were served by salesmen based in Paris spend on their orders, and who were these customers and salesmen
Sample table: salesman
salesman_id|name |city |commission| -----------+----------+--------+----------+ 5001|James Hoog|New York| 0.15| 5002|Nail Knite|Paris | 0.13| 5005|Pit Alex |London | 0.11| 5006|Mc Lyon |Paris | 0.14| 5007|Paul Adam |Rome | 0.13| 5003|Lauson Hen|San Jose| 0.12|
Sample table: customer
customer_id|cust_name |city |grade|salesman_id| -----------+--------------+----------+-----+-----------+ 3002|Nick Rimando |New York | 100| 5001| 3007|Brad Davis |New York | 200| 5001| 3005|Graham Zusi |California| 200| 5002| 3008|Julian Green |London | 300| 5002| 3004|Fabian Johnson|Paris | 300| 5006| 3009|Geoff Cameron |Berlin | 100| 5003| 3003|Jozy Altidor |Moscow | 200| 5007| 3001|Brad Guzan |London | | 5005|
Sample table: orders
ord_no|purch_amt|ord_date |customer_id|salesman_id| ------+---------+----------+-----------+-----------+ 70009| 270.65|2012-09-10| 3001| 5005| 70002| 65.26|2012-10-05| 3002| 5001| 70004| 110.50|2012-08-17| 3009| 5003| 70005| 2400.60|2012-07-27| 3007| 5001| 70008| 5760.00|2012-09-10| 3002| 5001| 70010| 1983.43|2012-10-10| 3004| 5006| 70003| 2480.40|2012-10-10| 3009| 5003| 70011| 75.29|2012-08-17| 3003| 5007| 70013| 3045.60|2012-04-25| 3002| 5001| 70001| 150.50|2012-10-05| 3005| 5002| 70007| 948.50|2012-09-10| 3005| 5002| 70012| 250.45|2012-06-27| 3008| 5002|
SQL Code:
-- Selecting columns to be retrieved from the query
SELECT s.name, -- Select the name of the salesman from the salesman table
c.cust_name, -- Select the name of the customer from the customer table
o.purch_amt -- Select the purchase amount from the orders table
-- Joining the salesman table with the customer table on the matching salesman_id
FROM salesman s
JOIN customer c ON s.salesman_id = c.salesman_id
-- Further joining the result with the orders table on the matching customer_id
JOIN orders o ON c.customer_id = o.customer_id
-- Filtering the results to include only salesmen based in Paris
WHERE s.city = 'Paris';
- SELECT s.name, c.cust_name, o.purch_amt: This line specifies the columns to be included in the result set:
- s.name: The name of the salesman from the salesman table.
- c.cust_name: The name of the customer from the customer table.
- o.purch_amt: The purchase amount from the orders table.
- FROM salesman s: This line specifies the main table (salesman) to retrieve data from and assigns it an alias (s) for easier reference in the query.
- JOIN customer c ON s.salesman_id = c.salesman_id: This line joins the salesman table with the customer table based on a common column (salesman_id).
- The ON clause indicates that the join is performed where the salesman_id from the salesman table matches the salesman_id from the customer table.
- JOIN orders o ON c.customer_id = o.customer_id: This line further joins the result with the orders table based on a common column (customer_id).
- The ON clause specifies that the join is performed where the customer_id from the customer table matches the customer_id from the orders table.
- WHERE s.city = 'Paris': This line filters the joined data to include only those rows where the city of the salesman (from the salesman table) is 'Paris'.
Optimization Techniques
Creating indexes to speed up joins: Using indexes in join queries can significantly reduce query execution time by minimizing the data scan needed to find matching rows.
For Example:
CREATE INDEX idx_salesman_id ON salesman(salesman_id);
CREATE INDEX idx_customer_id ON customer(customer_id);
Handling Null Values
Equi Joins typically do not match NULL values. Use functions to handle NULLs:
SQL Code:
SELECT s.name, c.cust_name
FROM salesman s
JOIN customer c ON COALESCE(s.city, 'N/A') = COALESCE(c.city, 'N/A');
Output:
name |cust_name | ----------+--------------+ James Hoog|Nick Rimando | James Hoog|Brad Davis | Pit Alex |Julian Green | Mc Lyon |Fabian Johnson| Nail Knite|Fabian Johnson| Pit Alex |Brad Guzan |
What is the difference between Equi Join and Inner Join in SQL?
An equijoin is a join with a join condition containing an equality operator. An equijoin returns only the rows that have equivalent values for the specified columns.
An inner join is a join of two or more tables that returns only those rows (compared using a comparison operator) that satisfy the join condition.
Pictorial presentation : SQL Equi Join Vs. SQL Inner Join
Frequently Asked Questions (FAQ) - SQL Equi Join
1. What is an Equi Join in SQL?
An Equi Join is a type of SQL join that combines rows from two or more tables based on a condition that checks for equality between specified columns. This join uses the equality operator = to match column values across tables.
2. Why are SQL Equi Joins Important?
- Data Integrity: They ensure that only related data from multiple tables is combined, maintaining the logical relationships between datasets.
- Query Performance: By leveraging indexes on the join columns, Equi Joins can optimize data retrieval processes, making queries more efficient.
- Data Normalization: They support database normalization by allowing the combination of tables that have been split into logically related entities to avoid redundancy and maintain consistency.
3. How does an SQL Equi Join differ from other types of joins?
- Inner Join: Similar to an Equi Join, but it can use any comparison operator (e.g., <, >, <>). Equi Joins are essentially a specific case of Inner Joins that use the equality operator.
- Outer Join: Unlike Equi Joins, Outer Joins return all rows from one table and the matched rows from the other table(s). If there is no match, the result will include NULLs for the columns from the table without a match.
- Cross Join: This join combines all rows from both tables, producing a Cartesian product. It does not use any join condition and typically returns a large number of rows.
- Natural Join: Automatically joins tables based on columns with the same names and compatible data types. It simplifies the join condition by not requiring explicit column names in the join clause.
4. Can SQL Equi Joins be used with more than two tables?
Yes, Equi Joins can be used to join more than two tables. By chaining multiple join conditions, you can combine data from several tables. Each additional join condition must specify how the tables are related by matching columns.
5. Do SQL Equi Joins handle NULL values?
Equi Joins do not inherently match rows with NULL values in the join columns because NULL is not considered equal to any value, including another NULL. To include NULL values, you might need to use functions like COALESCE or modify the join conditions to handle these cases.
6. How can SQL Equi Joins be optimized?
- Create Indexes: Indexes on the join columns can significantly speed up the query by reducing the amount of data scanned.
- Minimize the Data Set: Use selective WHERE clauses to reduce the number of rows processed before applying the join.
- Use Database-Specific Optimizations: Depending on our database system, there may be specific features or settings to enhance join performance.
7. What are some common use cases for SQL Equi Joins?
- Combining Related Data: For example, joining customer and order tables to view all orders placed by each customer.
- Data Integration: Merging data from different sources into a unified view for reporting or analysis.
- Supporting Analytical Queries: Combining tables like sales, products, and customers to analyze sales performance, customer demographics, etc.
8. How are SQL Equi Joins implemented in SQL?
Equi Joins can be implemented using the JOIN keyword combined with the ON clause to specify the equality condition, or by using the WHERE clause to define the equality between columns. The choice of syntax can depend on readability and preference but both achieve the same result.
9. Are there any limitations to using SQL Equi Joins?
- We need to include all rows regardless of matches: In such cases, an Outer Join might be more appropriate.
- We are working with columns containing NULLs: Additional handling is required to include rows where the join columns might have NULL values.
- Performance concerns: With large datasets, Equi Joins can be resource-intensive, and optimizing with indexes or partitioning might be necessary.
Key points to remember
Click on the following to get the slides presentation -
Check out our 1000+ SQL Exercises with solution and explanation to improve your skills.
Previous: Introduction
Next: SQL NON EQUI JOIN
It will be nice if you may share this link in any developer community or anywhere else, from where other developers may find this content. Thanks.
https://www.w3resource.com/sql/joins/perform-an-equi-join.php
- Weekly Trends and Language Statistics
- Weekly Trends and Language Statistics