w3resource

Find duplicate data in MySQL

Objective

There are many occasions when you need to find duplicate values available in a column of a MySql table. Often, you may want to count the number of duplicate values in a MySQL table.

In this article, we have discussed a query where you can find duplicates, triplicates, quadruplicates (or more) data from a MySQL table.

We have discussed how to find duplicate values with INNER JOIN and subquery, INNER JOIN and DISTINCT, and also how to count duplicate values with GROUP BY and HAVING.

Table in question

We have used a table called 'item' to apply the query :
Table Name: item
Structure: item_code varchar(20), value int(11), quantity int(11) where item_code is the primary key.

item master

Using INNER JOIN and Subquery

Now we want to get the details of those records where quantity field have duplicate/triplicates values. In the image above, values marked with red rectangle exist more than once.

item masterduplicate

Here is the query:

Code:

SELECT item_code, value, item.quantity
FROM item
INNER JOIN(
SELECT quantity
FROM item
GROUP BY quantity
HAVING COUNT(item_code) >1
)temp ON item.quantity= temp.quantity;

Relational Algebra Expression:

Relational Algebra Expression: Using INNER JOIN and Subquery.

Relational Algebra Tree:

Relational Algebra Tree: Using INNER JOIN and Subquery.

Sample Output:

item master duplicate result

To get the above result we have used a query with an INNER JOIN (INNER JOIN selects all rows from both participating tables as long as there is a match between the columns.) statement. INNER JOIN uses the main table 'item' and a temporary table 'temp' whose data comes from a subquery. Here is the subquery and it's output:

Code:

SELECT quantity
FROM item
GROUP BY quantity
HAVING COUNT(item_code) >1

Relational Algebra Expression:

Relational Algebra Expression: Using INNER JOIN and Subquery.

Relational Algebra Tree:

Relational Algebra Tree: Using INNER JOIN and Subquery.

Sample Output:

item-duplicate-data-subquery

Now the following main query will execute on 'item' and 'temp' tables where the common field is quantity and the result will be as follows:

Code:

SELECT item_code, value, item.quantity
FROM item
INNER JOIN temp ON item.quantity= temp.quantity;

Relational Algebra Expression:

Relational Algebra Expression: Using INNER JOIN and Subquery.

Relational Algebra Tree:

Relational Algebra Tree: Using INNER JOIN and Subquery.

Using INNER JOIN and DISTINCT

You can use the following query to get the same result. Here we apply INNER JOIN the table with itself. As the same quantity value exists in more than two records, a DISTINCT clause is used.

Here is the code and the output :

Code:

SELECT distinct a.item_code, a.value, a.quantity
FROM item a
INNER JOIN item b ON a.quantity = b.quantity
WHERE a.item_code <> b.item_code

Relational Algebra Expression:

Relational Algebra Expression: Using INNER JOIN and DISTINCT.

Relational Algebra Tree:

Relational Algebra Tree: Using INNER JOIN and DISTINCT.

Sample Output:

item-master-duplicate-result2

Count duplicate data in MySQL

The following query count those records where quantity field holds duplicate/triplicates (or more) data.

Table data:

item master

Code:

SELECT item_code, COUNT( quantity ) x
FROM item
GROUP BY quantity
HAVING x >1

Sample Output:

item-master-duplicate-data-count

Count duplicate records in MySQL

To count the total duplicate (or more) 'quantity' of 'item' table you can use the following query:

Code:

SELECT count(*) AS Total_duplicate_count
FROM
(SELECT item_code FROM item
GROUP BY quantity HAVING COUNT(quantity) > 1
)AS x

Relational Algebra Expression:

Relational Algebra Expression: Count duplicate records in MySQL.

Relational Algebra Tree:

Relational Algebra Tree: Count duplicate records in MySQL.

Sample Output:

item-master-duplicate-data-count-product

Previous: MySQL Miscellaneous Functions
Next: Rows holding group wise maximum for a column



Follow us on Facebook and Twitter for latest update.