Teachnique
      CourseRoadmaps
      Login

      HomeIntroductionFeaturesVersionsVariablesInstallationAdministrationPHP SyntaxNode.js SyntaxJava SyntaxPython SyntaxConnectionWorkbench

      Create DatabaseDrop DatabaseSelect DatabaseShow DatabaseCopy DatabaseDatabase ExportDatabase ImportDatabase Info

      Create UsersDrop UsersShow UsersChange PasswordGrant PrivilegesShow PrivilegesRevoke PrivilegesLock User AccountUnlock User Account

      Create TablesShow TablesAlter TablesRename TablesClone TablesTruncate TablesTemporary TablesRepair TablesDescribe TablesAdd/Delete ColumnsShow ColumnsRename ColumnsTable LockingDrop TablesDerived Tables

      QueriesConstraintsInsert QuerySelect QueryUpdate QueryDelete QueryReplace QueryInsert IgnoreInsert on Duplicate Key UpdateInsert Into Select

      Create ViewsUpdate ViewsDrop ViewsRename Views

      IndexesCreate IndexDrop IndexShow IndexesUnique IndexClustered IndexNon-Clustered Index

      Where ClauseLimit ClauseDistinct ClauseOrder By ClauseGroup By ClauseHaving ClauseAND OperatorOR OperatorLike OperatorIN OperatorANY OperatorEXISTS OperatorNOT OperatorNOT EQUAL OperatorIS NULL OperatorIS NOT NULL OperatorBetween OperatorUNION OperatorUNION vs UNION ALLMINUS OperatorINTERSECT OperatorINTERVAL Operator

      Using JoinsInner JoinLeft JoinRight JoinCross JoinFull JoinSelf JoinDelete JoinUpdate JoinUnion vs Join

      Unique KeyPrimary KeyForeign KeyComposite KeyAlternate Key

      TriggersCreate TriggerShow TriggerDrop TriggerBefore Insert TriggerAfter Insert TriggerBefore Update TriggerAfter Update TriggerBefore Delete TriggerAfter Delete Trigger

      Data TypesVARCHARBOOLEANENUMDECIMALINTFLOATBITTINYINTBLOBSET

      Regular ExpressionsRLIKE OperatorNOT LIKE OperatorNOT REGEXP Operatorregexp_instr() Functionregexp_like() Functionregexp_replace() Functionregexp_substr() Function

      Fulltext SearchNatural Language Fulltext SearchBoolean Fulltext SearchQuery Expansion Fulltext Searchngram Fulltext Parser

      Date and Time FunctionsArithmetic OperatorsNumeric FunctionsString FunctionsAggregate Functions

      NULL ValuesTransactionsUsing SequencesHandling DuplicatesSQL InjectionSubQueryCommentsCheck ConstraintsStorage EnginesExport Table into CSV FileImport CSV File into DatabaseUUIDCommon Table ExpressionsOn Delete CascadeUpsertHorizontal PartitioningVertical PartitioningCursorStored FunctionsSignalResignalCharacter SetCollationWildcardsAliasROLLUPToday DateLiteralsStored ProcedureExplainJSONStandard DeviationFind Duplicate RecordsDelete Duplicate RecordsSelect Random RecordsShow ProcesslistChange Column TypeReset Auto-IncrementCoalesce() Function

      Useful FunctionsStatements ReferenceQuick GuideUseful ResourcesDiscussion

      Feedback

      Submit request if you have any questions.

      Course
      Find Duplicate Records

      MySQL Tutorial

      This SQL tutorial is structured for beginners to guide them from the foundational concepts to advanced data manipulation and querying techniques in SQL. By the end of this tutorial, you will have developed a robust understanding of SQL, equipping you with the knowledge to efficiently manage and analyze data across various database systems. This guide sets the stage for your growth into a skilled data professional, ready to tackle complex data challenges and contribute to the field of data analysis and database management.

      Find Duplicate Records

      Duplicate records in a table decrease the efficiency of a MySQL database (by increasing the execution time, using unnecessary space, etc.). Thus, locating duplicates becomes necessary to efficiently use the database.
      We can, however, also prevent users from entering duplicate values into a table, by adding constraints on the desired column(s), such as PRIMARY KEY and UNIQUE constraints.
      But, due to various reasons like, human error, an application bug or data extracted from external resources, if duplicates are still entered into the database, there are various ways to find the records. Using SQL GROUP BY and HAVING clauses is one of the common ways to filter records containing duplicates.

      Finding Duplicate Records

      Before finding the duplicate records in a table we need to define the criteria for which we need the duplicate records for. You can do this in two steps −
      • First of all, we need to group all the rows by the columns on which you want to check the duplicity on, using the GROUPBY clause.
      • Then Using the Having clause and the count function then, we need to verify whether any of the above formed groups have more than 1 entity.

      Example

      First of all, let us create a table with the name CUSTOMERS using the following query
      CREATE TABLE CUSTOMERS (
      ID INT NOT NULL,
      NAME VARCHAR (20) NOT NULL,
      AGE INT NOT NULL,
      ADDRESS CHAR (25),
      SALARY DECIMAL (18, 2),
      PRIMARY KEY (ID)
      );
      Now, let us insert some duplicate records into the above-created table using the INSERT IGNORE INTO statement as shown below
      INSERT INTO CUSTOMERS VALUES
      (1, 'Ramesh', 32, 'Ahmedabad', 2000.00),
      (2, 'Khilan', 25, 'Delhi', 1500.00),
      (3, 'Kaushik', 23, 'Kota', 2000.00),
      (4, 'Chaitali', 25, 'Mumbai', 6500.00),
      (5, 'Hardik', 27, 'Bhopal', 8500.00),
      (6, 'Komal', 22, 'Hyderabad', 4500.00),
      (7, 'Muffy', 24, 'Indore', 10000.00);
      The table is created as
      ID
      NAME
      AGE
      ADDRESS
      SALARY
      1
      Ramesh
      32
      Ahmedabad
      2000.00
      2
      Khilan
      25
      Delhi
      1500.00
      3
      Kaushik
      23
      Kota
      2000.00
      4
      Chaitali
      25
      Mumbai
      6500.00
      5
      Hardik
      27
      Bhopal
      8500.00
      6
      Komal
      22
      Hyderabad
      4500.00
      7
      Muffy
      24
      Indore
      10000.00
      On the following query, we are trying to return the count of duplicate records using the MySQL COUNT() function
      SELECT SALARY, COUNT(SALARY)
      AS "COUNT" FROM CUSTOMERS
      GROUP BY SALARY
      ORDER BY SALARY;

      Output

      The output for the query above is produced as given below
      SALARY
      COUNT
      1500.00
      1
      2000.00
      2
      4500.00
      1
      6500.00
      1
      8500.00
      1
      10000.00
      1

      With Having Clause

      The HAVING clause in MySQL can be used to filter conditions for a group of rows in a table. Here, we are going to use the HAVING clause with the COUNT() function to find the duplicate values in one or more columns of a table.

      Duplicates values in single column

      Following are the steps to find the duplicate values in a single column of a table:
      Step-1: Firstly, we need to use the GROUP BY clause to group all rows in the column that we want to check the duplicates.
      Step-2: Then , to find duplicate groups, use COUNT() function in the HAVING clause to check if any group has more than one element.

      Example

      Using the following query, we can find all rows that have duplicate DOG_NAMES in the PETS table
      SELECT SALARY, COUNT(SALARY)
      FROM CUSTOMERS
      GROUP BY SALARY
      HAVING COUNT(SALARY) > 1;

      Output

      The output is as follows
      SALARY
      COUNT
      2000.00
      2

      Duplicate Values in Multiple Columns

      We can use the AND operator in the HAVING clause to find the duplicate rows in multiple columns. The rows are considered duplicate only when the combination of columns are duplicate.

      Example

      In the following query, we are finding rows in the PETS table with duplicate records in DOG_NAME, AGE, OWNER_NAME columns
      SELECT SALARY, COUNT(SALARY),
      AGE, COUNT(AGE)
      FROM CUSTOMERS
      GROUP BY SALARY, AGE
      HAVING COUNT(SALARY) > 1
      AND COUNT(AGE) > 1;

      Output

      The output is as follows
      SALARY
      COUNT
      AGE
      COUNT
      2000.00
      2
      23
      2

      The ROW_NUMBER() function with PARTITION BY

      In MySQL, the ROW_NUMBER() function and PARTITION BY clause can be used to find duplicate records in a table. The partition clause divides the table based on a specific column or multiple columns, then the ROW_NUMBER() function assigns a unique row number to each row within each partition. Rows with the same partition and row number are considered duplicates rows.

      Example

      In the following query, we are assigning a
      SELECT *, ROW_NUMBER() OVER (
      PARTITION BY SALARY, AGE
      ORDER BY SALARY, AGE
      ) AS row_numbers
      FROM CUSTOMERS;

      Output

      The output for the query above as follows
      ID
      NAME
      AGE
      ADDRESS
      SALARY
      row_numbers
      2
      Khilan
      25
      Delhi
      1500.00
      1
      1
      Ramesh
      23
      Ahmedabad
      2000.00
      1
      3
      Kaushik
      23
      Kota
      2000.00
      2
      4
      Chaitali
      25
      Mumbai
      6500.00
      1
      5
      Hardik
      27
      Bhopal
      8500.00
      1
      6
      Komal
      22
      Hyderabad
      4500.00
      1
      7
      Muffy
      24
      Indore
      10000.00
      1

      Find Duplicate Records Using Client Program

      We can also find duplicates records using Client Program.

      Syntax

      PHPNodeJSJavaPython
      To find the duplicate records through a PHP program, we need to group all the rows by column using the GROUPBY clause and then use the COUNT function to count the duplicates. For this, we need to execute the SELECT statement using the mysqli function query() as follows
      $sql = "SELECT SALARY, COUNT(SALARY) AS "COUNT" FROM CUSTOMERS GROUP BY SALARY ORDER BY SALARY";
      $mysqli->query($sql);

      Example

      Following are the programs
      PHPNodeJSJavaPython
      $dbhost = 'localhost';
      $dbuser = 'root';
      $dbpass = 'password';
      $db = 'TUTORIALS';
      $mysqli = new mysqli($dbhost, $dbuser, $dbpass, $db);
      if ($mysqli->connect_errno) {
      printf("Connect failed: %s", $mysqli->connect_error);
      exit();
      }
      //printf('Connected successfully.');
      //let's create a table
      $sql = "CREATE TABLE Pets (ID int,DOG_NAME varchar(30) not null,AGE int not null,OWNER_NAME varchar(30) not null)";
      if($mysqli->query($sql)){
      printf("Pets table created successfully...!\n");
      }
      //now lets insert some duplicate records;
      $sql = "INSERT IGNORE INTO Pets(ID, DOG_NAME, AGE, OWNER_NAME) VALUES(1, 'Fluffy', 1, 'Micheal')";
      if($mysqli->query($sql)){
      printf("First record inserted successfully...!\n");
      }
      $sql = "INSERT IGNORE INTO Pets(ID, DOG_NAME, AGE, OWNER_NAME) VALUES(1, 'Fluffy', 1, 'Micheal')";
      if($mysqli->query($sql)){
      printf("Second record inserted successfully...!\n");
      }
      $sql = "INSERT IGNORE INTO Pets(ID, DOG_NAME, AGE, OWNER_NAME) VALUES(2, 'Harry', 2, 'Jack')";
      if($mysqli->query($sql)){
      printf("Third records inserted successfully...!\n");
      }
      $sql = "INSERT IGNORE INTO Pets(ID, DOG_NAME, AGE, OWNER_NAME) VALUES(3, 'Sheero', 1, 'Rose')";
      if($mysqli->query($sql)){
      printf("Fourth record inserted successfully...!\n");
      }
      $sql = "INSERT IGNORE INTO Pets(ID, DOG_NAME, AGE, OWNER_NAME) VALUES(4, 'Simba', 2, 'Rahul')";
      if($mysqli->query($sql)){
      printf("Fifth record inserted successfully...!\n");
      }
      //display the table records
      $sql = "SELECT * FROM PETS";
      if($result = $mysqli->query($sql)){
      printf("Table records: \n");
      while($row = mysqli_fetch_array($result)){
      printf("ID: %d, DOG_NAME %s, AGE: %d,OWNER_NAME: %s ",
      $row['ID'],
      $row['DOG_NAME'],
      $row['AGE'],
      $row['OWNER_NAME']);
      printf("\n");
      }
      }
      //now lets group the all rows to find duplicate records...
      $sql = "SELECT ID, DOG_NAME, AGE, OWNER_NAME, COUNT(*) AS 'Count' FROM PETS GROUP BY ID, DOG_NAME, OWNER_NAME ORDER BY ID";
      if($result = $mysqli->query($sql)){
      printf("Table duplicate records: \n");
      while($row = mysqli_fetch_array($result)){
      printf("ID: %d, DOG_NAME %s, AGE: %d, OWNER_NAME: %s ",
      $row['ID'],
      $row['DOG_NAME'],
      $row['AGE'],
      $row['OWNER_NAME'],
      $row['Count']);
      printf("\n");
      }
      }
      if($mysqli->error){
      printf("Error message: ", $mysqli->error);
      }
      $mysqli->close();

      Output

      The output obtained is as shown below
      Pets table created successfully...!
      First record inserted successfully...!
      Second record inserted successfully...!
      Third records inserted successfully...!
      Fourth record inserted successfully...!
      Fifth record inserted successfully...!
      Table records:
      ID: 1, DOG_NAME Fluffy, AGE: 1,OWNER_NAME: Micheal
      ID: 1, DOG_NAME Fluffy, AGE: 1,OWNER_NAME: Micheal
      ID: 2, DOG_NAME Harry, AGE: 2,OWNER_NAME: Jack
      ID: 3, DOG_NAME Sheero, AGE: 1,OWNER_NAME: Rose
      ID: 4, DOG_NAME Simba, AGE: 2,OWNER_NAME: Rahul
      Table duplicate records:
      ID: 1, DOG_NAME Fluffy, AGE: 1,OWNER_NAME: Micheal
      ID: 2, DOG_NAME Harry, AGE: 2,OWNER_NAME: Jack
      ID: 3, DOG_NAME Sheero, AGE: 1,OWNER_NAME: Rose
      ID: 4, DOG_NAME Simba, AGE: 2,OWNER_NAME: Rahul