Assumptions: We are trying to find duplicates within a table which contains more than 1 row. So lets take an example of user table containing the columns username and email.
Hence the first approach will be to
Step 1: Find whether any duplicates really exists within the table.
E.g Table name user
Arjun | arjun@gmail.com |
Akash | akash@gmail.com |
Prity | prity@gmail.com |
Sudeshna | sudeshna@gmail.com |
Arjun | arjun@gmail.com |
Prity | prity@gmail.com |
Sudeshna | sudeshna@gmail.com |
Arjun | arjun@gmail.com |
Sudeshna | sudeshna@gmail.com |
Akash | akash@gmail.com |
Prity | prity@gmail.com |
Create the data by using the below query as shown below:
CREATE table user(
username varchar(50) not null,
email varchar(150) not null);
insert into user values
(‘Arjun’, ‘arjun@gmail.com’),
(‘Akash’,’akash@gmail.com’),
(‘Prity’,’prity@gmail.com’),
(‘Sudeshna’,’sudeshna@gmail.com’),
(‘Arjun’,’arjun@gmail.com’),
(‘Prity’,’prity@gmail.com’),
(‘Sudeshna’,’sudeshna@gmail.com’),
(‘Arjun’,’arjun@gmail.com’),
(‘Sudeshna’,’sudeshna@gmail.com’),
(‘Akash’,’akash@gmail.com’),
(‘Prity’,’prity@gmail.com’)
(‘sandeep’,’sandeep@gmail.com’);
In order to find duplicates I will use the below query
Select username, email, count(*)
From user
Group by username, email
Having count(*) >1;
Step 2: The step 2 approach will be to list all rows containing duplicates
Select a.*
From user a
Join (Select username, email, count(*)
From user
Group by username, email
Having count(*)>1)b
On a.username=b.username
And a.email=b.email
Order by a.email
The idea here is very simple. You are first having the select statement from the current table and then perfrom self join on the table and then perform groupby and count query from the rows with duplicated data set. Hope this helps you in getting the required necessary solutions. For further details please contact us on https://www.bitsenze.com/contact-us/