• Ekta Aggarwal

JOINS in SQL

In SQL, the most unavoidable task is joining or merging two tables i.e. gathering information from multiple tables and creating a new dataset.


There are majorly 5 types of joins in SQL:

In this tutorial we shall be covering all 5 of them in detail.


Dataset:


For this tutorial we shall make use of 2 datasets: patient_info and hospital_record

CREATE TABLE patient_info(
patient_id int,
department varchar(20),
gender char(1),
age int)

INSERT INTO
patient_info(patient_id,department,gender,age)
VALUES
(1001,'Cardiology','M',24),
(1002,'Radiology','M',32),
(1003,'E&A','F',44),
(1004,'Radiology','M',32),
(1005,'Neurology','M',38);
CREATE TABLE hospital_record(
patient_id int,
department varchar(20),
length_of_stay int,
last_visit_date date)

INSERT INTO
hospital_record(patient_id,department,length_of_stay,last_visit_date)
VALUES
(1001,'Cardiology',20,'12Aug2020'),
(1002,'Radiology',0,'13Feb2019'),
(1004,'Radiology',2,'31Jan2020'),
(1005,'Neurology',4,'9Aug2018'),
(1005,'Cardiology',18,'9Aug2018'),
(1004,'Neurology',65,'19Jan2020');

Our data looks as follows:

SELECT * FROM patient_info;
SELECT * FROM hospital_record;

Basic Syntax for all 5 joins:

SELECT a.* , b.*

FROM table1 as a

JOIN <specify it yourself which join you want> table2 as b

ON a.primary_key = b.primary_key;


In SQL it is more convenient to provide aliases for the tables we want to join. Here we have taken aliases 'a' and 'b'.


SELECT: specify the columns which you want from both the tables.

FROM : Name of table1

JOIN: Specify the join name (INNER, LEFT, RIGHT, FULL, CROSS) and name of second table

ON: List of common column names to be matched.


1. Inner Join:


Inner Join takes common elements (specified in ON columns)in both the tables and join them.


Here we are selecting all the columns from patient_info and hospital_record and our common keys are patient_id and department in both the tables.

SELECT a.*, b.* 
FROM patient_info as a
INNER JOIN hospital_record as b
ON a.patient_id = b.patient_id AND a.department = b.department;

Note that no information about patient_id 1003 is provided since it is available only in one table. For inner join the record has to be present in both the tables.


Ques: Suppose in hospital_record table we have column name as dept instead of department then how do we modify the ON statement?Ans: ON a.patient_id = b.patient_id AND a.department = b.dept To select only specific columns in the resultant dataset: In the resultant dataset select all the columns from patient_info while only last_visit_date from hospital_record.

SELECT a.*, b.last_visit_date
FROM patient_info as a
INNER JOIN hospital_record as b
ON a.patient_id = b.patient_id AND a.department = b.department;

For this we have made the changes in the SELECT statement.

One to Many Mapping:

One of the most common questions asked in an interview is about one to many mapping. For instance, for patient_id 1004 and 1005 we have multiple department names in table hospital_record. Let us join the table by patient_id and see the output.

SELECT a.*, b.* 
FROM patient_info as a
INNER JOIN hospital_record as b
ON a.patient_id = b.patient_id;

Note the difference in columns for patient id 1004 and 1005.

2. Left join:

Left join takes all the records from left table (specified in FROM statement) irrespective of availability of record in right table (specified in JOIN statement).

SELECT a.*, b.last_visit_date
FROM patient_info as a
LEFT JOIN hospital_record as b
ON a.patient_id = b.patient_id AND a.department = b.department;

Notice patient id 1003! It was available in left table(patient_info) but not in right table (hospital_record). Thus we have last_visit_date as NULL for it.

One to Many Mapping:

Since patient id 1004 and 1005 had multiple deparments thus resultant table maps both the deparments of hospital_record to the left table

SELECT a.*, b.*
FROM patient_info as a
LEFT JOIN hospital_record as b
ON a.patient_id = b.patient_id;

3. Right join:

Right join takes all the records from right table (specified in JOIN statement) irrespective of availability of record in left table (specified in FROM statement).

SELECT b.*,a.*
FROM patient_info as a
RIGHT JOIN hospital_record as b
ON a.patient_id = b.patient_id AND a.department = b.department;

4. Full Join:


Full join takes all the records from both the tables (specified in FROM and JOIN statement).

SELECT a.*,b.*
FROM patient_info as a
FULL JOIN hospital_record as b
ON a.patient_id = b.patient_id AND a.department = b.department;

NULL values get generated if a record is available in only one table.

5. Cross Join:

patient_info had 5 unique patient ids and hospital_Record had 3 unique departments. A cross join refers to all 5X3 = 15 permutation combinations of these patient ids and departments.


SELECT a.*,b.*
FROM (select patient_id from patient_info) as a
CROSS JOIN (select distinct department from hospital_record) as b;

Too much of theory. Isn't it? Here is a meme which I found which can definitely light up your mood! :D

Tags: