MySQL Tutorial


Handling Duplicates
Tables and result sets in a database sometimes contain duplicate records. Duplicates are allowed by default, but there are situations where they must be prevented, or identified and removed after they have been stored.
Importance of Handling MySQL Duplicates
There are several reasons why handling duplicates in a database becomes necessary. Redundant data should be dealt with to prevent the following consequences:
Duplicate data occupies storage space, reducing the efficiency of database usage and increasing storage costs.
Dealing with duplicate records consumes additional resources, driving up the overall cost of maintaining the database.
Duplicates in a database can lead to logical errors in data, affecting the integrity and reliability of the information stored.
Preventing Duplicate Entries
You can define a PRIMARY KEY or a UNIQUE index on the appropriate columns of a table to prevent duplicate records from being inserted.
Example
The following table contains no such index or primary key, so it would allow duplicate records for first_name and last_name.
CREATE TABLE CUSTOMERS (
   first_name CHAR(20),
   last_name CHAR(20),
   sex CHAR(10)
);
To prevent multiple records with the same first and last name values from being created in this table, add a PRIMARY KEY to its definition. The indexed columns must also be declared NOT NULL, because a PRIMARY KEY does not allow NULL values:
CREATE TABLE CUSTOMERS (
   first_name CHAR(20) NOT NULL,
   last_name CHAR(20) NOT NULL,
   sex CHAR(10),
   PRIMARY KEY (last_name, first_name)
);
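As a quick check of the constraint, the following sketch inserts the same name twice into the table defined above. It assumes the table has just been created with the PRIMARY KEY; the sample values are illustrative only. The second INSERT should be rejected with MySQL's duplicate-key error (error code 1062), so only one row is stored.

```sql
-- Assumes the CUSTOMERS table with PRIMARY KEY (last_name, first_name) defined above.
-- The name and sex values are made-up sample data.
INSERT INTO CUSTOMERS (last_name, first_name, sex) VALUES ('Roy', 'Anita', 'F');

-- Same (last_name, first_name) pair: this statement fails with a
-- duplicate-key error (code 1062) and inserts nothing.
INSERT INTO CUSTOMERS (last_name, first_name, sex) VALUES ('Roy', 'Anita', 'F');

-- Confirms that a single row exists for this name.
SELECT COUNT(*) AS matching_rows
FROM CUSTOMERS
WHERE last_name = 'Roy' AND first_name = 'Anita';
```

When the statements are run as a batch script, the client may stop at the failing statement; in an interactive session you can simply continue with the SELECT.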
Using INSERT IGNORE Query 
When a table has a unique index, attempting to insert a record that duplicates an existing value in the indexed column(s) normally causes an error.
To handle this situation without generating an error, you can use the INSERT IGNORE statement. When a record is not a duplicate, MySQL inserts it as usual. If the record is a duplicate, the IGNORE keyword instructs MySQL to discard it and report a warning instead of an error.
The following example does not result in an error, and it ensures that the duplicate record is not inserted:
INSERT IGNORE INTO CUSTOMERS (LAST_NAME, FIRST_NAME) 
VALUES ('Jay', 'Thomas'), ('Jay', 'Thomas');
We get the following output:
Query OK, 1 row affected, 1 warning (0.01 sec)
Records: 2  Duplicates: 1  Warnings: 1
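The warning counted in this output can be inspected with SHOW WARNINGS. This is a minimal sketch that assumes it is run in the same session, immediately after the INSERT IGNORE statement, because the warning list is replaced by later statements.

```sql
-- Run directly after the INSERT IGNORE statement, in the same session.
-- Each ignored duplicate is listed as a warning carrying the
-- duplicate-key code (1062) instead of being raised as an error.
SHOW WARNINGS;
```

Checking the warnings is a simple way to find out which rows were skipped, since INSERT IGNORE itself does not stop on them.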
Using REPLACE Query 
Instead of the INSERT statement, you can use the REPLACE statement. A new record is inserted just as with INSERT. If the record duplicates an existing one on a PRIMARY KEY or UNIQUE index, MySQL deletes the old record and inserts the new one in its place.
REPLACE INTO CUSTOMERS (LAST_NAME, FIRST_NAME) 
VALUES ( 'Ajay', 'Kumar'), ( 'Ajay', 'Kumar');
Following is the output of the above code:
Query OK, 2 rows affected (0.01 sec)
Records: 2  Duplicates: 0  Warnings: 0
Your choice between INSERT IGNORE and REPLACE should depend on the duplicate-handling behaviour you want. INSERT IGNORE keeps the first record of a set of duplicates and discards the rest. REPLACE keeps the last record of the set and removes any earlier ones.
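The difference is easiest to see when the duplicate rows differ in a column outside the key. The sketch below assumes the CUSTOMERS table with PRIMARY KEY (last_name, first_name) and uses made-up sample values; the sex column shows which version of the row survives.

```sql
-- Assumes CUSTOMERS has PRIMARY KEY (last_name, first_name)
-- and contains no row for this sample name yet.
INSERT INTO CUSTOMERS (last_name, first_name, sex) VALUES ('Doe', 'Jane', 'F');

-- INSERT IGNORE: the duplicate is discarded, the original row is kept.
INSERT IGNORE INTO CUSTOMERS (last_name, first_name, sex) VALUES ('Doe', 'Jane', 'X');
SELECT sex FROM CUSTOMERS
WHERE last_name = 'Doe' AND first_name = 'Jane';   -- sex is still 'F'

-- REPLACE: the original row is deleted and the new row is stored.
REPLACE INTO CUSTOMERS (last_name, first_name, sex) VALUES ('Doe', 'Jane', 'X');
SELECT sex FROM CUSTOMERS
WHERE last_name = 'Doe' AND first_name = 'Jane';   -- sex is now 'X'
```

Because REPLACE deletes the old row, any column not listed in the statement falls back to its default value rather than keeping the old value.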
Using UNIQUE Constraint 
Another way to enforce uniqueness is to add a UNIQUE index rather than a PRIMARY KEY to a table:
CREATE TABLE CUSTOMERS (
   first_name CHAR(20) NOT NULL,
   last_name CHAR(20) NOT NULL,
   sex CHAR(10),
   UNIQUE (last_name, first_name)
);
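If the table already exists, the same constraint can be added afterwards instead of re-creating the table. The following is a sketch under the assumption that CUSTOMERS has no unique index yet; the constraint name uq_customers_name is a hypothetical name chosen for illustration.

```sql
-- Adds a named UNIQUE constraint to an existing table.
-- This statement fails with a duplicate-key error (code 1062) if the table
-- already contains duplicate (last_name, first_name) pairs, so existing
-- duplicates have to be removed first.
ALTER TABLE CUSTOMERS
   ADD CONSTRAINT uq_customers_name UNIQUE (last_name, first_name);

-- Lists the indexes on the table to confirm the new one exists.
SHOW INDEX FROM CUSTOMERS;
```

Unlike a PRIMARY KEY, a UNIQUE index permits NULL values in nullable columns, and rows containing NULL in an indexed column are not treated as duplicates of each other.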
Counting and Identifying Duplicates
You can use the COUNT function and GROUP BY clause to count and identify duplicate records based on specific columns.
Example
The following query counts duplicate records by first_name and last_name in a table:
SELECT COUNT(*) as repetitions, last_name, first_name
FROM CUSTOMERS
GROUP BY last_name, first_name
HAVING repetitions > 1;
This query returns every combination of last_name and first_name that occurs more than once in the CUSTOMERS table, along with the number of times it occurs. In general, to identify sets of values that are duplicated, follow the steps given below.
Determine which columns may contain duplicated values.
Include those columns in the column selection list, along with COUNT(*).
List the columns in the GROUP BY clause as well.
Apply a HAVING clause that filters out unique values by requiring the group counts to be greater than one.
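The grouped query reports only the duplicated values and their counts. To see the complete rows behind them, one option is to join the table back to that grouped result. This is a sketch assuming the CUSTOMERS table used throughout this chapter; the aliases c and d are arbitrary.

```sql
-- Lists every row whose (last_name, first_name) pair occurs more than once.
SELECT c.last_name, c.first_name, c.sex
FROM CUSTOMERS AS c
JOIN (
   -- The steps above: group by the candidate columns, keep groups with count > 1.
   SELECT last_name, first_name
   FROM CUSTOMERS
   GROUP BY last_name, first_name
   HAVING COUNT(*) > 1
) AS d
   ON c.last_name = d.last_name AND c.first_name = d.first_name
ORDER BY c.last_name, c.first_name;
```

Listing the full rows is useful when the duplicates differ in other columns, such as sex, and you need to decide which version to keep.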
Eliminating Duplicates from a Query Result
You can use the DISTINCT keyword in a SELECT statement to retrieve only the unique records in a table.
SELECT DISTINCT last_name, first_name
FROM CUSTOMERS
ORDER BY last_name;
An alternative to DISTINCT is to add a GROUP BY clause that names the columns you are selecting. This approach eliminates duplicates and retrieves only the unique combinations of values from the specified columns. List the columns without enclosing parentheses, because MySQL treats a parenthesized list as a single row value and rejects it in GROUP BY.
SELECT last_name, first_name
FROM CUSTOMERS
GROUP BY last_name, first_name;
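A related check is to compare the total number of rows with the number of distinct name combinations; the difference is the number of surplus rows. The sketch below relies on MySQL accepting several expressions inside COUNT(DISTINCT ...); the column aliases are illustrative.

```sql
-- total_rows     : all rows in the table
-- distinct_names : unique (last_name, first_name) combinations
-- surplus_rows   : rows that repeat an existing combination
SELECT COUNT(*) AS total_rows,
       COUNT(DISTINCT last_name, first_name) AS distinct_names,
       COUNT(*) - COUNT(DISTINCT last_name, first_name) AS surplus_rows
FROM CUSTOMERS;
```

COUNT(DISTINCT ...) skips combinations in which any of the listed columns is NULL, so the figures are only directly comparable when both name columns are declared NOT NULL.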
Removing Duplicates Using Table Replacement
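If a table contains duplicate records and you want to remove them, create a new table from the distinct rows, drop the original table, and rename the new table to the original name. Note that CREATE TABLE ... AS SELECT copies only the columns and data, so any indexes or keys must be added to the new table again: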
CREATE TABLE tmp AS 
SELECT DISTINCT last_name, first_name, sex
FROM CUSTOMERS;
DROP TABLE CUSTOMERS;
ALTER TABLE tmp RENAME TO CUSTOMERS;
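As an alternative sketch, the copy can be made into a table that already carries a unique index, so the copy itself discards duplicates and the original table is kept until the result has been checked. The table names customers_dedup and customers_old are hypothetical, and the sketch assumes CUSTOMERS has no unique key of its own yet.

```sql
-- Creates an empty copy with the same column definitions and indexes.
CREATE TABLE customers_dedup LIKE CUSTOMERS;

-- Enforces uniqueness on the name columns in the copy.
ALTER TABLE customers_dedup ADD UNIQUE (last_name, first_name);

-- Copies the rows; rows that repeat an existing name are skipped with a warning.
INSERT IGNORE INTO customers_dedup
SELECT * FROM CUSTOMERS;

-- Swaps the tables in one atomic statement and keeps the original as a backup.
RENAME TABLE CUSTOMERS TO customers_old,
             customers_dedup TO CUSTOMERS;

-- After verifying the result:
-- DROP TABLE customers_old;
```

This variant removes duplicates by name only, whereas SELECT DISTINCT over all three columns keeps rows that share a name but differ in sex. Which of several same-name rows survives depends on the order in which the rows are read, which is not guaranteed without an ORDER BY.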
Handling Duplicates Using a Client Program
We can also handle duplicates using a client program.
Syntax
To handle duplicate values through a PHP program, execute the INSERT IGNORE statement using the mysqli function query() as follows:
$sql = "INSERT IGNORE INTO person_tbl (last_name, first_name) VALUES( 'Jay', 'Thomas')";
$mysqli->query($sql);
Example
Following is the PHP program:
$dbhost = 'localhost';
$dbuser = 'root';
$dbpass = 'password';
$db = 'TUTORIALS';
$mysqli = new mysqli($dbhost, $dbuser, $dbpass, $db);
if ($mysqli->connect_errno) {
    printf("Connect failed: %s", $mysqli->connect_error);
    exit();
}
//printf('Connected successfully.');
$sql = "CREATE TABLE person_tbl (first_name CHAR(20) NOT NULL, last_name CHAR(20) NOT NULL, sex CHAR(10), PRIMARY KEY (last_name, first_name))";
if($mysqli->query($sql)){
    printf("Table created successfully...!\n");
}
//let's insert some records
$sql = "INSERT INTO person_tbl (last_name, first_name) VALUES( 'Jay', 'Thomas')";
if($mysqli->query($sql)){
    printf("First record inserted successfully...!\n");
}
$sql = "INSERT INTO person_tbl (last_name, first_name) VALUES( 'John', 'Smith')";
if($mysqli->query($sql)){
    printf("Second record inserted successfully...!\n");
}
//now lets insert duplicate record with IGNORE keyword
$sql = "INSERT IGNORE INTO person_tbl (last_name, first_name) VALUES( 'Jay', 'Thomas')";
if($mysqli->query($sql)){
    printf("Duplicate record inserted successfully using IGNORE keyword...!\n");
}
$sql = "SELECT * from person_tbl";
if($result = $mysqli->query($sql)){
    printf("Table records: \n");
    while($row = mysqli_fetch_array($result)){
        printf("First Name: %s, Last name: %s, Sex: %s",
        $row['first_name'],
        $row['last_name'], 
        $row['sex']);
        printf("\n");
    }
}
//lets insert a duplicate record
$sql = "INSERT INTO person_tbl (last_name, first_name) VALUES( 'John', 'Smith')";
if(!$mysqli->query($sql)){
    printf("You can't insert any duplicate records...!\n");
}
if($mysqli->error){
    // %s is required for the error text to be printed
    printf("Error message: %s\n", $mysqli->error);
}
$mysqli->close();
Output
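The output obtained is shown below. The final INSERT, which does not use IGNORE, fails on the duplicate key. Because mysqli reports errors as exceptions by default since PHP 8.1, the script ends with an uncaught mysqli_sql_exception before its own error messages are printed: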
Table created successfully...!
First record inserted successfully...!
Second record inserted successfully...!
Duplicate record inserted successfully using IGNORE keyword...!
Table records:
First Name: Thomas, Last name: Jay, Sex:
First Name: Smith, Last name: John, Sex:
PHP Fatal error:  Uncaught mysqli_sql_exception: Duplicate entry 'John-Smith' for key 'person_tbl.PRIMARY' in D:\test\handlingduplicates.php:48