      MySQL Tutorial


      Handling Duplicates

      Tables and result sets in a database can contain duplicate records. Duplicates are often harmless, but in some situations they must be prevented, or identified and removed after the fact.

      Importance of Handling MySQL Duplicates

      There are several reasons to handle duplicates in a database. Left unmanaged, redundant data has the following consequences:
      • Duplicate data occupies storage space, reducing the efficiency of database usage and increasing storage costs.
      • Dealing with duplicate records consumes additional resources, driving up the overall cost of maintaining the database.
      • Duplicates in a database can lead to logical errors in data, affecting the integrity and reliability of the information stored.

      Preventing Duplicate Entries

      You can prevent duplicate records from being inserted into a table by defining a PRIMARY KEY or a UNIQUE index on the appropriate columns.

      Example

      The following table has no such index or primary key, so it allows duplicate records for first_name and last_name:
      CREATE TABLE CUSTOMERS (
      first_name CHAR(20),
      last_name CHAR(20),
      sex CHAR(10)
      );
      To prevent multiple records with the same first and last name values from being created in this table, add a PRIMARY KEY to its definition. The indexed columns must also be declared NOT NULL, because a PRIMARY KEY does not allow NULL values:
      CREATE TABLE CUSTOMERS (
      first_name CHAR(20) NOT NULL,
      last_name CHAR(20) NOT NULL,
      sex CHAR(10),
      PRIMARY KEY (last_name, first_name)
      );
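
As a quick check of the definition above, the following sketch tries to insert the same name twice. The sample values are hypothetical. With the PRIMARY KEY in place, the second statement should be rejected with MySQL's duplicate-entry error (code 1062).

```sql
-- Assumes the CUSTOMERS table defined with PRIMARY KEY (last_name, first_name).
-- The sample row is hypothetical.
INSERT INTO CUSTOMERS (last_name, first_name, sex)
VALUES ('Doe', 'Jane', 'F');

-- Same (last_name, first_name) pair: rejected with a duplicate-entry
-- error (1062), even though the sex value differs.
INSERT INTO CUSTOMERS (last_name, first_name, sex)
VALUES ('Doe', 'Jane', 'M');
```

Only the key columns are compared, so a row that differs in a non-key column still counts as a duplicate.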
      Using INSERT IGNORE Query
      A unique index normally causes an error when you try to insert a record that duplicates an existing record in the indexed column(s).
      To handle this situation without generating an error, use the INSERT IGNORE statement. When a record is not a duplicate, MySQL inserts it as usual. If the record is a duplicate, the IGNORE keyword instructs MySQL to discard it and issue a warning instead of an error.
      The following example does not raise an error, and the duplicate record is not inserted:
      INSERT IGNORE INTO CUSTOMERS (LAST_NAME, FIRST_NAME)
      VALUES ('Jay', 'Thomas'), ('Jay', 'Thomas');
      We get the following output:
      Query OK, 1 row affected, 1 warning (0.01 sec)
      Records: 2 Duplicates: 1 Warnings: 1
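
The warning reported in that output can be inspected directly. The sketch below assumes it runs in the same session, immediately after the INSERT IGNORE statement, and that the table was empty beforehand.

```sql
-- Run right after the INSERT IGNORE statement, in the same session.
-- The result should contain one row of level "Warning" with code 1062
-- that describes the duplicate entry.
SHOW WARNINGS;

-- Only one row should exist for the pair that was inserted twice.
SELECT COUNT(*) AS matching_rows
FROM CUSTOMERS
WHERE last_name = 'Jay' AND first_name = 'Thomas';
```

IGNORE also downgrades some other errors to warnings, such as certain data conversion errors, so it can be worth checking the warnings after a bulk insert rather than assuming every one is a duplicate.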
      Using REPLACE Query
      Instead of INSERT, consider using the REPLACE statement. A new record is inserted just as with INSERT. If the record duplicates an existing one, the new record replaces the old one:
      REPLACE INTO CUSTOMERS (LAST_NAME, FIRST_NAME)
      VALUES ( 'Ajay', 'Kumar'), ( 'Ajay', 'Kumar');
      Following is the output of the above code:
      Query OK, 2 rows affected (0.01 sec)
      Records: 2 Duplicates: 0 Warnings: 0
      Choose between INSERT IGNORE and REPLACE according to the duplicate-handling behaviour you want. INSERT IGNORE keeps the first record of a set of duplicates and discards the rest. REPLACE keeps the last record of the set and removes any earlier ones.
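
The difference is easiest to see on a non-key column. The following sketch uses a hypothetical sample row and assumes the CUSTOMERS table has a PRIMARY KEY or UNIQUE index on (last_name, first_name).

```sql
-- Hypothetical sample row.
INSERT INTO CUSTOMERS (last_name, first_name, sex)
VALUES ('Rao', 'Asha', 'F');

-- INSERT IGNORE: the existing row wins, so sex stays 'F'.
INSERT IGNORE INTO CUSTOMERS (last_name, first_name, sex)
VALUES ('Rao', 'Asha', 'M');

SELECT sex FROM CUSTOMERS
WHERE last_name = 'Rao' AND first_name = 'Asha';

-- REPLACE: the new row wins. Columns that the statement omits fall back
-- to their defaults, so sex becomes NULL here.
REPLACE INTO CUSTOMERS (last_name, first_name)
VALUES ('Rao', 'Asha');

SELECT sex FROM CUSTOMERS
WHERE last_name = 'Rao' AND first_name = 'Asha';
```

Because REPLACE discards the old row's other column values, INSERT ... ON DUPLICATE KEY UPDATE may be a better fit when only some columns should change.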
      Using UNIQUE Constraint
      Another way to enforce uniqueness is to add a UNIQUE index rather than a PRIMARY KEY to a table:
      CREATE TABLE CUSTOMERS (
      first_name CHAR(20) NOT NULL,
      last_name CHAR(20) NOT NULL,
      sex CHAR(10),
      UNIQUE (last_name, first_name)
      );
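
If the table already exists without a key, a UNIQUE index can also be added afterward. This is a minimal sketch; the index name uq_customer_name is a hypothetical choice.

```sql
-- uq_customer_name is a hypothetical index name.
-- The statement fails with a duplicate-entry error if the table
-- already contains rows that share a (last_name, first_name) pair.
ALTER TABLE CUSTOMERS
  ADD UNIQUE INDEX uq_customer_name (last_name, first_name);

-- Confirm that the index exists.
SHOW INDEX FROM CUSTOMERS;
```

Unlike a PRIMARY KEY, a UNIQUE index permits NULL in its columns, and MySQL allows several rows to hold NULL in such a column. Declaring the columns NOT NULL, as above, avoids that gap.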

      Counting and Identifying Duplicates

      You can use the COUNT function and GROUP BY clause to count and identify duplicate records based on specific columns.

      Example

      The following query counts duplicate records by first_name and last_name:
      SELECT COUNT(*) as repetitions, last_name, first_name
      FROM CUSTOMERS
      GROUP BY last_name, first_name
      HAVING repetitions > 1;
      This query returns each duplicated (last_name, first_name) pair in the CUSTOMERS table along with its number of occurrences. In general, to identify sets of values that are duplicated, follow these steps:
      • Determine which columns may contain duplicated values.
      • Include those columns in the column selection list, along with COUNT(*).
      • List the columns in the GROUP BY clause as well.
      • Apply a HAVING clause to filter out unique values by requiring the group counts to be greater than one.
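
The grouped query reports one line per duplicated pair. To see the full rows behind each pair, one option is to join the table back to that grouped result, as in this sketch (the aliases c and d are arbitrary):

```sql
-- List every row whose (last_name, first_name) pair occurs more than once.
SELECT c.*
FROM CUSTOMERS AS c
JOIN (
    SELECT last_name, first_name
    FROM CUSTOMERS
    GROUP BY last_name, first_name
    HAVING COUNT(*) > 1
) AS d
  ON c.last_name = d.last_name
 AND c.first_name = d.first_name
ORDER BY c.last_name, c.first_name;
```

This is useful when the duplicated rows differ in other columns, such as sex, and you need to decide which copy to keep.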

      Eliminating Duplicates from a Query Result

      You can use the DISTINCT keyword with the SELECT statement to retrieve only the unique records in a table:
      SELECT DISTINCT last_name, first_name
      FROM CUSTOMERS
      ORDER BY last_name;
      An alternative to DISTINCT is to add a GROUP BY clause that names the columns you are selecting. This eliminates duplicates and returns only the unique combinations of values in those columns. List the columns without surrounding parentheses; MySQL rejects a parenthesized column list in GROUP BY:
      SELECT last_name, first_name
      FROM CUSTOMERS
      GROUP BY last_name, first_name;
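
To gauge how many duplicates a table holds without listing them, the total row count can be compared with the count of distinct name pairs. This sketch relies on MySQL accepting several expressions inside COUNT(DISTINCT ...); the column aliases are arbitrary.

```sql
-- total_rows minus unique_names gives the number of surplus rows.
-- COUNT(DISTINCT ...) skips rows where any listed column is NULL.
SELECT COUNT(*) AS total_rows,
       COUNT(DISTINCT last_name, first_name) AS unique_names
FROM CUSTOMERS;
```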

      Removing Duplicates Using Table Replacement

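      To remove all duplicate records from a table, copy the distinct rows into a new table, drop the original, and rename the copy. Note that CREATE TABLE ... AS SELECT does not copy the original table's indexes or keys, so re-create them afterward if needed: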
      CREATE TABLE tmp AS
      SELECT DISTINCT last_name, first_name, sex
      FROM CUSTOMERS;
      
      DROP TABLE CUSTOMERS;
      ALTER TABLE tmp RENAME TO CUSTOMERS;
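
A possible variant of this procedure, sketched below, creates the new table with CREATE TABLE ... LIKE so that column definitions and indexes carry over, and swaps the tables with a single RENAME TABLE statement. The table names customers_dedup and customers_old are hypothetical.

```sql
-- Create an empty copy with the same columns and indexes.
CREATE TABLE customers_dedup LIKE CUSTOMERS;

-- Copy only distinct rows. The column list is explicit because
-- INSERT ... SELECT matches columns by position, not by name.
INSERT INTO customers_dedup (last_name, first_name, sex)
SELECT DISTINCT last_name, first_name, sex
FROM CUSTOMERS;

-- Swap the tables in one statement, then drop the old data.
RENAME TABLE CUSTOMERS TO customers_old,
             customers_dedup TO CUSTOMERS;
DROP TABLE customers_old;
```

Keeping the old table until after the swap leaves a fallback if the copied data turns out to be wrong; drop it only once the result has been checked.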

      Handling Duplicates Using a Client Program

      We can also handle duplicates from a client program.

      Syntax

      To handle duplicate values from a PHP program, execute the INSERT IGNORE statement with the mysqli method query() as follows:
      $sql = "INSERT IGNORE INTO person_tbl (last_name, first_name) VALUES( 'Jay', 'Thomas')";
      $mysqli->query($sql);

      Example

      Following is the complete PHP program:
      <?php
      // Connection settings (example values).
      $dbhost = 'localhost';
      $dbuser = 'root';
      $dbpass = 'password';
      $db = 'TUTORIALS';
      $mysqli = new mysqli($dbhost, $dbuser, $dbpass, $db);
      if ($mysqli->connect_errno) {
          printf("Connect failed: %s", $mysqli->connect_error);
          exit();
      }

      // The composite primary key is what makes a repeated name a duplicate.
      $sql = "CREATE TABLE person_tbl (first_name CHAR(20) NOT NULL, last_name CHAR(20) NOT NULL, sex CHAR(10), PRIMARY KEY (last_name, first_name))";
      if ($mysqli->query($sql)) {
          printf("Table created successfully...!\n");
      }

      // Insert two distinct records.
      $sql = "INSERT INTO person_tbl (last_name, first_name) VALUES( 'Jay', 'Thomas')";
      if ($mysqli->query($sql)) {
          printf("First record inserted successfully...!\n");
      }
      $sql = "INSERT INTO person_tbl (last_name, first_name) VALUES( 'John', 'Smith')";
      if ($mysqli->query($sql)) {
          printf("Second record inserted successfully...!\n");
      }

      // Insert a duplicate with the IGNORE keyword. The query succeeds,
      // but MySQL silently skips the duplicate row.
      $sql = "INSERT IGNORE INTO person_tbl (last_name, first_name) VALUES( 'Jay', 'Thomas')";
      if ($mysqli->query($sql)) {
          printf("Duplicate record inserted successfully using IGNORE keyword...!\n");
      }

      // Show the table contents: still only two rows.
      $sql = "SELECT * from person_tbl";
      if ($result = $mysqli->query($sql)) {
          printf("Table records: \n");
          while ($row = $result->fetch_assoc()) {
              printf("First Name: %s, Last name: %s, Sex: %s",
                  $row['first_name'],
                  $row['last_name'],
                  $row['sex']);
              printf("\n");
          }
      }

      // Insert a duplicate without IGNORE. On PHP 8.1 and later, mysqli throws
      // mysqli_sql_exception by default, so this call raises an uncaught
      // exception. On older versions, query() returns false instead.
      $sql = "INSERT INTO person_tbl (last_name, first_name) VALUES( 'John', 'Smith')";
      if (!$mysqli->query($sql)) {
          printf("You can't insert any duplicate records...!\n");
      }
      if ($mysqli->error) {
          printf("Error message: %s\n", $mysqli->error);
      }
      $mysqli->close();

      Output

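      The output obtained is as shown below. On PHP 8.1 and later, mysqli throws exceptions by default, so the final plain INSERT of a duplicate ends the script with an uncaught mysqli_sql_exception: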
      Table created successfully...!
      First record inserted successfully...!
      Second record inserted successfully...!
      Duplicate record inserted successfully using IGNORE keyword...!
      Table records:
      First Name: Thomas, Last name: Jay, Sex:
      First Name: Smith, Last name: John, Sex:
      PHP Fatal error: Uncaught mysqli_sql_exception: Duplicate entry 'John-Smith' for key 'person_tbl.PRIMARY' in D:\test\handlingduplicates.php:48