Thursday, July 5, 2012

A Visual Explanation of SQL Joins

Assume we have the following two tables. Table A is on the left, and Table B is on the right. We'll populate them with four records each.


CREATE TABLE [TableA](
      [id] [int] NOT NULL,
      [Name] [varchar](50) NULL
) ON [PRIMARY]
GO
SET ANSI_PADDING OFF
GO
INSERT [TableA] ([id], [Name]) VALUES (1, N'Pirate')
INSERT [TableA] ([id], [Name]) VALUES (2, N'Monkey')
INSERT [TableA] ([id], [Name]) VALUES (3, N'Ninja')
INSERT [TableA] ([id], [Name]) VALUES (4, N'Spaghetti')

 GO


CREATE TABLE [TableB](
      [id] [int] NOT NULL,
      [Name] [varchar](50) NULL
) ON [PRIMARY]
GO
SET ANSI_PADDING OFF
GO
INSERT [TableB] ([id], [Name]) VALUES (1, N'Rutabaga')
INSERT [TableB] ([id], [Name]) VALUES (2, N'Pirate')
INSERT [TableB] ([id], [Name]) VALUES (3, N'Darth Vader')
INSERT [TableB] ([id], [Name]) VALUES (4, N'Ninja')



SELECT * FROM TableA

id    Name
1     Pirate
2     Monkey
3     Ninja
4     Spaghetti



SELECT * FROM TableB

id    Name
1     Rutabaga
2     Pirate
3     Darth Vader
4     Ninja

Inner Join

SELECT * FROM TableA
INNER JOIN TableB
ON TableA.name = TableB.name

id  name       id   name
--  ----       --   ----
1   Pirate     2    Pirate
3   Ninja      4    Ninja
Inner join produces only the set of records that match in both Table A and Table B.


Full Outer Join

SELECT * FROM TableA
FULL OUTER JOIN TableB
ON TableA.name = TableB.name

id    name       id    name
--    ----       --    ----
1     Pirate     2     Pirate
2     Monkey     null  null
3     Ninja      4     Ninja
4     Spaghetti  null  null
null  null       1     Rutabaga       
null  null       3     Darth Vader
 
Full outer join produces the set of all records in Table A and Table B, with matching records from both sides where available. If there is no match, the missing side will contain null.

Left Outer Join

SELECT * FROM TableA
LEFT OUTER JOIN TableB
ON TableA.name = TableB.name

id  name       id    name
--  ----       --    ----
1   Pirate     2     Pirate
2   Monkey     null  null
3   Ninja      4     Ninja
4   Spaghetti  null  null
Left outer join produces a complete set of records from Table A, with the matching records (where available) in Table B. If there is no match, the right side will contain null.


SELECT * FROM TableA
LEFT OUTER JOIN TableB
ON TableA.name = TableB.name
WHERE TableB.id IS null

id  name       id     name
--  ----       --     ----
2   Monkey     null   null
4   Spaghetti  null   null
To produce the set of records only in Table A, but not in Table B, we perform the same left outer join, then exclude the records we don't want from the right side via a where clause.



SELECT * FROM TableA
FULL OUTER JOIN TableB
ON TableA.name = TableB.name
WHERE TableA.id IS null 
OR TableB.id IS null

id    name       id    name
--    ----       --    ----
2     Monkey     null  null
4     Spaghetti  null  null
null  null       1     Rutabaga
null  null       3     Darth Vader
To produce the set of records unique to Table A and Table B, we perform the same full outer join, then exclude the records we don't want from both sides via a where clause.


There's also a cartesian product or cross join, which as far as I can tell, can't be expressed as a Venn diagram:

SELECT * FROM TableA
CROSS JOIN TableB
 
This joins "everything to everything", resulting in 4 x 4 = 16 rows, far more than we had in the original sets. If you do the math, you can see why this is a very dangerous join to run against large tables.

No comments:

Post a Comment