Working with Delete Join in Postgres

03.02.2022

Intro

A DELETE JOIN lets us delete data based on a join across multiple tables. Often we will want to connect multiple tables, then delete the rows that match. In this article, we will learn how to get the effect of a DELETE JOIN in PostgreSQL.

The Syntax

In databases that support it, such as MySQL, the basic syntax of DELETE JOIN is as follows:

DELETE [table1], [table2]
FROM [table1]
JOIN [table2] ON key1 = key2
WHERE [condition]

This syntax will allow us to delete multiple rows based on the join and conditions we supply.

However, PostgreSQL does not support the DELETE JOIN syntax, so we emulate this query using the IN operator with a subquery.

DELETE FROM employees
	WHERE emp_no IN (
		SELECT emp_no FROM salaries WHERE salary < 70000
	);
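PostgreSQL's DELETE statement also accepts a USING clause, which lets other tables appear in the WHERE condition and reads much like a join. The following is a minimal sketch of the same delete written that way, assuming the employees and salaries tables created later in this article. The aliases e and s are chosen here only for illustration.

```sql
-- Delete every employee that has at least one salary row below 70000.
-- USING brings salaries into scope so the WHERE clause can act as the join condition.
DELETE FROM employees AS e
USING salaries AS s
WHERE e.emp_no = s.emp_no
  AND s.salary < 70000;
```

The rest of this article uses the IN form. USING is a PostgreSQL extension, so the IN form is the more portable of the two.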

Getting Set Up

We will be using Docker in this article, but feel free to install your database locally instead. Once you have Docker installed, create a new file called docker-compose.yml and add the following.

version: '3'
 
services:
  db:
    image: 'postgres:latest'
    ports:
      - 5432:5432
    environment:
      POSTGRES_USER: username
      POSTGRES_PASSWORD: password
      POSTGRES_DB: default_database
    volumes:
      - psqldata:/var/lib/postgresql

  adminer:
    image: adminer
    depends_on:
      - db
    restart: always
    ports:
      - 8081:8080

volumes:
  psqldata:

Next, run docker-compose up.

Now, navigate to http://localhost:8081/ to access Adminer, a web-based database client that supports PostgreSQL. Select PostgreSQL as the system, then log in with the server db, the username username, the password password, and the database default_database.

Click SQL command and you are ready to go.

Creating a DB

In this article, we will need some data to work with. If you don't understand these commands, don't worry; we will cover them in later articles.

We will be using data from the MySQL Employees sample db provided here: https://dev.mysql.com/doc/employee/en/. However, we will only enter what we need rather than import the whole db.

Next, let's create the employees and salaries tables. These are slightly simplified versions of the tables in the sample database.

CREATE TABLE employees (
    emp_no      INT             NOT NULL,
    birth_date  DATE            NOT NULL,
    first_name  VARCHAR(14)     NOT NULL,
    last_name   VARCHAR(16)     NOT NULL,
    gender      VARCHAR(1),
    hire_date   DATE            NOT NULL,
    PRIMARY KEY (emp_no)
);
CREATE TABLE salaries (
    emp_no      INT             NOT NULL,
    salary      INT             NOT NULL,
    from_date   DATE            NOT NULL,
    to_date     DATE            NOT NULL,
    FOREIGN KEY (emp_no) REFERENCES employees (emp_no) ON DELETE CASCADE,
    PRIMARY KEY (emp_no, from_date)
);

Now, let's enter a few rows.

INSERT INTO employees VALUES (10001,'1953-09-02','Georgi','Facello','M','1986-06-26'),
(10002,'1964-06-02','Bezalel','Simmel','F','1985-11-21'),
(10003,'1959-12-03','Parto','Bamford','M','1986-08-28'),
(10004,'1954-05-01','Chirstian','Koblick','M','1986-12-01'),
(10005,'1955-01-21','Kyoichi','Maliniak','M','1989-09-12'),
(10006,'1953-04-20','Anneke','Preusig','F','1989-06-02'),
(10007,'1957-05-23','Tzvetan','Zielinski','F','1989-02-10'),
(10008,'1958-02-19','Saniya','Kalloufi','M','1994-09-15'),
(10009,'1952-04-19','Sumant','Peac','F','1985-02-18'),
(10010,'1963-06-01','Duangkaew','Piveteau','F','1989-08-24'),
(10011,'1953-11-07','Mary','Sluis','F','1990-01-22'),
(10012,'1960-10-04','Patricio','Bridgland','M','1992-12-18'),
(10013,'1963-06-07','Eberhardt','Terkki','M','1985-10-20'),
(10014,'1956-02-12','Berni','Genin','M','1987-03-11'),
(10015,'1959-08-19','Guoxiang','Nooteboom','M','1987-07-02'),
(10016,'1961-05-02','Kazuhito','Cappelletti','M','1995-01-27'),
(10017,'1958-07-06','Cristinel','Bouloucos','F','1993-08-03'),
(10018,'1954-06-19','Kazuhide','Peha','F','1987-04-03'),
(10019,'1953-01-23','Lillian','Haddadi','M','1999-04-30'),
(10020,'1952-12-24','Mayuko','Warwick','M','1991-01-26');
INSERT INTO salaries VALUES (10001,60117,'1986-06-26','1987-06-26'),
(10002,62102,'1987-06-26','1988-06-25'),
(10003,66074,'1988-06-25','1989-06-25'),
(10004,66596,'1989-06-25','1990-06-25'),
(10005,66961,'1990-06-25','1991-06-25'),
(10006,71046,'1991-06-25','1992-06-24'),
(10007,74333,'1992-06-24','1993-06-24'),
(10008,75286,'1993-06-24','1994-06-24'),
(10009,75994,'1994-06-24','1995-06-24'),
(10010,76884,'1995-06-24','1996-06-23'),
(10011,80013,'1996-06-23','1997-06-23'),
(10012,81025,'1997-06-23','1998-06-23'),
(10013,81097,'1998-06-23','1999-06-23');
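Before deleting anything, it can help to confirm that the inserts landed. This is a small sanity check, assuming both INSERT statements above ran exactly once against empty tables. The column aliases are illustrative.

```sql
-- Count the rows in each table in a single result row.
SELECT
    (SELECT count(*) FROM employees) AS employee_count,  -- 20 with the rows above
    (SELECT count(*) FROM salaries)  AS salary_count;    -- 13 with the rows above
```

Note that employees 10014 through 10020 have no salary rows at all, which matters for the delete later on.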

Examples

Let’s do an example where we select multiple employees, joined with their salaries. We will filter on salary, then delete. This removes every employee row and salary row that matches the filter.

First, let’s select to see all the rows we will delete.

SELECT *
FROM employees AS e
	JOIN salaries AS s ON e.emp_no = s.emp_no
WHERE s.salary < 70000;

emp_no birth_date first_name last_name gender hire_date emp_no salary from_date to_date
10001 1953-09-02 Georgi Facello M 1986-06-26 10001 60117 1986-06-26 1987-06-26
10002 1964-06-02 Bezalel Simmel F 1985-11-21 10002 62102 1987-06-26 1988-06-25
10003 1959-12-03 Parto Bamford M 1986-08-28 10003 66074 1988-06-25 1989-06-25
10004 1954-05-01 Chirstian Koblick M 1986-12-01 10004 66596 1989-06-25 1990-06-25
10005 1955-01-21 Kyoichi Maliniak M 1989-09-12 10005 66961 1990-06-25 1991-06-25

Now, we can emulate a DELETE JOIN using IN with a subquery. Here we delete all employees returned by the subquery, that is, those with a salary below 70,000.

DELETE FROM employees
	WHERE emp_no IN (
		SELECT emp_no FROM salaries WHERE salary < 70000
	);

Since we specify employees after DELETE FROM, this deletes the employee rows matched by the subquery. One thing to notice is that we have ON DELETE CASCADE on the salaries table, so when we delete an employee, the linked salary rows are deleted as well.
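Because a delete like this is hard to undo, one cautious way to try it is inside a transaction, using PostgreSQL's RETURNING clause to list the rows that were removed. This is a sketch of that approach rather than part of the original walkthrough. It assumes the tables and data from this article.

```sql
BEGIN;

-- Same delete as above, but report which employees were removed.
-- RETURNING lists only the employees rows; cascaded salaries rows are not shown.
DELETE FROM employees
WHERE emp_no IN (
    SELECT emp_no FROM salaries WHERE salary < 70000
)
RETURNING emp_no, first_name, last_name;

-- Inspect the returned rows, then undo the delete.
-- Replace ROLLBACK with COMMIT to keep the change.
ROLLBACK;
```

After the ROLLBACK, both tables are back in their previous state, so the delete can be run again for real.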

SELECT * FROM salaries s;

emp_no salary from_date to_date
10006 71046 1991-06-25 1992-06-24
10007 74333 1992-06-24 1993-06-24
10008 75286 1993-06-24 1994-06-24
10009 75994 1994-06-24 1995-06-24
10010 76884 1995-06-24 1996-06-23
10011 80013 1996-06-23 1997-06-23
10012 81025 1997-06-23 1998-06-23
10013 81097 1998-06-23 1999-06-23
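The employees table can be checked the same way. The queries below are a sketch that assumes the sample rows were loaded as shown and the delete was committed once. They also show that the IN subquery leaves employees without any salary rows untouched.

```sql
-- 20 employees were inserted and 10001 to 10005 were deleted, so 15 should remain.
SELECT count(*) AS remaining_employees FROM employees;

-- Employees with no salary rows never appear in the subquery, so they survive the delete.
-- With the sample data this lists 10014 through 10020.
SELECT e.emp_no
FROM employees AS e
LEFT JOIN salaries AS s ON s.emp_no = e.emp_no
WHERE s.emp_no IS NULL
ORDER BY e.emp_no;
```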