Working with Intersection in MySQL

02.04.2022

Intro

The INTERSECT operator combines the results of two queries and returns only the rows that appear in both. It is similar to UNION, except that UNION returns the rows from either query while INTERSECT returns only the overlap. In this article, we will learn how to get INTERSECT behavior in MySQL.

The Syntax

The basic syntax of INTERSECT is as follows:

SELECT [column_names] FROM [table]
INTERSECT
SELECT [column_names] FROM [table2];

However, MySQL versions before 8.0.31 do not support INTERSECT. On those versions, we can get the same result with a subquery and the IN operator:

SELECT DISTINCT [column_names]
FROM [table]
WHERE [column_names] IN 
  (SELECT [column_names] FROM [table2]);
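
When more than one column has to match, the IN rewrite above needs a row constructor, or it can be expressed with a correlated EXISTS subquery. The following is a minimal sketch rather than a definitive recipe; the tables demo_a and demo_b and their columns x and y are hypothetical names made up for this illustration, so run it in a scratch database.

```sql
-- Hypothetical scratch tables, used only for this illustration.
CREATE TABLE demo_a (x INT, y INT);
CREATE TABLE demo_b (x INT, y INT);

INSERT INTO demo_a VALUES (1, 1), (1, 1), (2, 2), (3, 3);
INSERT INTO demo_b VALUES (1, 1), (3, 3), (4, 4);

-- Row-constructor form: both columns are compared together.
SELECT DISTINCT x, y
FROM demo_a
WHERE (x, y) IN (SELECT x, y FROM demo_b);

-- Equivalent correlated EXISTS form.
SELECT DISTINCT a.x, a.y
FROM demo_a AS a
WHERE EXISTS (
    SELECT 1
    FROM demo_b AS b
    WHERE b.x = a.x
      AND b.y = a.y
);

-- Clean up the scratch tables.
DROP TABLE demo_a, demo_b;
```

Both queries return the rows (1, 1) and (3, 3). One difference from a true INTERSECT: rows with NULL in a compared column never match here, because NULL = NULL is not true, whereas INTERSECT treats NULLs as equal. If that matters, the EXISTS form can use MySQL's null-safe <=> operator instead of =.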

Getting Set Up

We will be using Docker in this article, but feel free to install MySQL locally instead. Once you have Docker installed, create a new file called docker-compose.yml and add the following.

version: '3'
 
services:
  db:
    image: mysql:latest
    container_name: db
    environment:
      MYSQL_ROOT_PASSWORD: root_pass
      MYSQL_DATABASE: app_db
      MYSQL_USER: db_user
      MYSQL_PASSWORD: db_user_pass
    ports:
      - "6033:3306"
    volumes:
      - dbdata:/var/lib/mysql
  phpmyadmin:
    image: phpmyadmin/phpmyadmin
    container_name: pma
    links:
      - db
    environment:
      PMA_HOST: db
      PMA_PORT: 3306
      PMA_ARBITRARY: 1
    restart: always
    ports:
      - 8081:80

volumes:
  dbdata:

Next, run docker-compose up.

Now, navigate to http://localhost:8081/ to access phpMyAdmin. Then log in with the username root and the password root_pass.

Click the SQL tab and you are ready to go.

Creating a DB

In this article, we will need some data to work with. We will use the employees sample database provided here: https://dev.mysql.com/doc/employee/en/. However, we will enter only the tables and rows we need rather than importing the whole database.

With the SQL tab open (or your own SQL client running), let's first create our database and select it.

CREATE DATABASE IF NOT EXISTS sakila;

USE sakila;
CREATE TABLE employees (
    emp_no      INT             NOT NULL,
    birth_date  DATE            NOT NULL,
    first_name  VARCHAR(14)     NOT NULL,
    last_name   VARCHAR(16)     NOT NULL,
    gender      ENUM ('M','F')  NOT NULL,    
    hire_date   DATE            NOT NULL,
    PRIMARY KEY (emp_no)
);
CREATE TABLE salaries (
    emp_no      INT             NOT NULL,
    salary      INT             NOT NULL,
    from_date   DATE            NOT NULL,
    to_date     DATE            NOT NULL,
    FOREIGN KEY (emp_no) REFERENCES employees (emp_no) ON DELETE CASCADE,
    PRIMARY KEY (emp_no, from_date)
);

Now, let's insert a few rows:

INSERT INTO `employees` VALUES (10001,'1953-09-02','Georgi','Facello','M','1986-06-26'),
(10002,'1964-06-02','Bezalel','Simmel','F','1985-11-21'),
(10003,'1959-12-03','Parto','Bamford','M','1986-08-28'),
(10004,'1954-05-01','Chirstian','Koblick','M','1986-12-01'),
(10005,'1955-01-21','Kyoichi','Maliniak','M','1989-09-12'),
(10006,'1953-04-20','Anneke','Preusig','F','1989-06-02'),
(10007,'1957-05-23','Tzvetan','Zielinski','F','1989-02-10'),
(10008,'1958-02-19','Saniya','Kalloufi','M','1994-09-15'),
(10009,'1952-04-19','Sumant','Peac','F','1985-02-18'),
(10010,'1963-06-01','Duangkaew','Piveteau','F','1989-08-24'),
(10011,'1953-11-07','Mary','Sluis','F','1990-01-22'),
(10012,'1960-10-04','Patricio','Bridgland','M','1992-12-18'),
(10013,'1963-06-07','Eberhardt','Terkki','M','1985-10-20'),
(10014,'1956-02-12','Berni','Genin','M','1987-03-11'),
(10015,'1959-08-19','Guoxiang','Nooteboom','M','1987-07-02'),
(10016,'1961-05-02','Kazuhito','Cappelletti','M','1995-01-27'),
(10017,'1958-07-06','Cristinel','Bouloucos','F','1993-08-03'),
(10018,'1954-06-19','Kazuhide','Peha','F','1987-04-03'),
(10019,'1953-01-23','Lillian','Haddadi','M','1999-04-30'),
(10020,'1952-12-24','Mayuko','Warwick','M','1991-01-26');
INSERT INTO `salaries` VALUES (10001,60117,'1986-06-26','1987-06-26'),
(10001,62102,'1987-06-26','1988-06-25'),
(10001,66074,'1988-06-25','1989-06-25'),
(10001,66596,'1989-06-25','1990-06-25'),
(10001,66961,'1990-06-25','1991-06-25'),
(10001,71046,'1991-06-25','1992-06-24'),
(10001,74333,'1992-06-24','1993-06-24'),
(10001,75286,'1993-06-24','1994-06-24'),
(10001,75994,'1994-06-24','1995-06-24'),
(10001,76884,'1995-06-24','1996-06-23'),
(10001,80013,'1996-06-23','1997-06-23'),
(10001,81025,'1997-06-23','1998-06-23'),
(10001,81097,'1998-06-23','1999-06-23');
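
Before moving on, it can help to confirm that the rows were actually loaded. This is only a quick sanity check, assuming the two INSERT statements above ran without errors in the currently selected database; the column aliases are arbitrary.

```sql
-- Count the rows in each table; the INSERT statements above
-- add 20 employees and 13 salary records.
SELECT
    (SELECT COUNT(*) FROM employees) AS employee_rows,
    (SELECT COUNT(*) FROM salaries)  AS salary_rows;
```

If either count is lower than expected, re-check the INSERT statements for errors before running the queries in the next section.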

An Example

Let's start with an example. Here we want to grab all employees who were born before 1962 and hired after January 1, 1989.

Since MySQL versions before 8.0.31 don't support INTERSECT, we use a subquery with the IN operator.

SELECT
    emp_no,
    first_name,
    last_name,
    birth_date,
    hire_date
FROM
    employees
WHERE
    birth_date < DATE('1962-01-01')
    AND emp_no IN (
        SELECT
            emp_no
        FROM
            employees AS e2
        WHERE
            e2.hire_date > DATE('1989-01-01')
    );

The query returns the following rows:

emp_no  first_name  last_name    birth_date  hire_date
10005   Kyoichi     Maliniak     1955-01-21  1989-09-12
10006   Anneke      Preusig      1953-04-20  1989-06-02
10007   Tzvetan     Zielinski    1957-05-23  1989-02-10
10008   Saniya      Kalloufi     1958-02-19  1994-09-15
10011   Mary        Sluis        1953-11-07  1990-01-22
10012   Patricio    Bridgland    1960-10-04  1992-12-18
10016   Kazuhito    Cappelletti  1961-05-02  1995-01-27
10017   Cristinel   Bouloucos    1958-07-06  1993-08-03
10019   Lillian     Haddadi      1953-01-23  1999-04-30
10020   Mayuko      Warwick      1952-12-24  1991-01-26
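
On MySQL 8.0.31 and later, INTERSECT is available natively, so the example above can also be written with the operator itself. The sketch below assumes such a server (check with SELECT VERSION()) and the employees and salaries tables created earlier. It also shows an intersection across two different tables, which is where the operator is more typically useful.

```sql
-- Check the server version first; INTERSECT needs MySQL 8.0.31 or later.
SELECT VERSION();

-- Same result as the IN-based query above:
-- born before 1962 and hired after 1989-01-01.
SELECT emp_no, first_name, last_name, birth_date, hire_date
FROM employees
WHERE birth_date < '1962-01-01'
INTERSECT
SELECT emp_no, first_name, last_name, birth_date, hire_date
FROM employees
WHERE hire_date > '1989-01-01';

-- Intersecting two different tables:
-- employee numbers that also have salary rows.
SELECT emp_no FROM employees
INTERSECT
SELECT emp_no FROM salaries;

-- The same cross-table intersection on older versions, using IN.
SELECT DISTINCT emp_no
FROM employees
WHERE emp_no IN (SELECT emp_no FROM salaries);
```

With the sample rows inserted earlier, the first INTERSECT query returns the same ten employees as before, and the last two queries return only emp_no 10001, the one employee with salary rows. Because both halves of the first query read the same table, that particular case could also be written as a single SELECT with the two conditions joined by AND; the set-based form matters more when the two sides come from different tables.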