Working with Aggregate Functions in Postgres

Intro

Postgresql aggregate functions allow us to summarize groups of rows. For example, we can count the number of sales, get the average revenue, and more. In this article we will learn how to use aggregate functions in Postgresql.

The Syntax

The basic syntax of a Upper is as follows:

SELECT upper(string|column);

Getting Setup

We will be using docker in this article, but feel free to install your database locally instead. Once you have docker installed, create a new file called docker-compose.yml and add the following.

version: '3'
 
services:
  db:
    image: 'postgres:latest'
    ports:
      - 5432:5432
    environment:
      POSTGRES_USER: username
      POSTGRES_PASSWORD: password
      POSTGRES_DB: default_database
    volumes:
      - psqldata:/var/lib/postgresql

  phpmyadmin:
    image: phpmyadmin/phpmyadmin
    links:
      - db
    environment:
      PMA_HOST: db
      PMA_PORT: 3306
      PMA_ARBITRARY: 1
    restart: always
    ports:
      - 8081:80

volumes:
  psqldata:

Next, run docker-compose up.

Now, navigate to http://localhost:8081/ to access phpMyAdmin. Then log in with the username root and pass root_pass.

Click the SQL tab and you are ready to go.

Creating a DB

In this article, we will need some data to work with. If you don't understand these commands, don't worry, we will cover them in later articles.

We will be using the sample db provided here: https://dev.mysql.com/doc/sakila/en/. However, we will only enter what we need rather than import the whole db.

Next, let's create an film table. This is a slightly simplified version of the sakila database.

CREATE TABLE employees (
    emp_no      INT             NOT NULL,
    birth_date  DATE            NOT NULL,
    first_name  VARCHAR(14)     NOT NULL,
    last_name   VARCHAR(16)     NOT NULL,
    gender      VARCHAR(1),
    hire_date   DATE            NOT NULL,
    PRIMARY KEY (emp_no)
);

CREATE TABLE salaries (
    emp_no      INT             NOT NULL,
    salary      INT             NOT NULL,
    from_date   DATE            NOT NULL,
    to_date     DATE            NOT NULL,
    FOREIGN KEY (emp_no) REFERENCES employees (emp_no) ON DELETE CASCADE,
    PRIMARY KEY (emp_no, from_date)
);

Now, let's enter a few rows

INSERT INTO employees VALUES (10001,'1953-09-02','Georgi','Facello','M','1986-06-26'),
(10002,'1964-06-02','Bezalel','Simmel','F','1985-11-21'),
(10003,'1959-12-03','Parto','Bamford','M','1986-08-28'),
(10004,'1954-05-01','Chirstian','Koblick','M','1986-12-01'),
(10005,'1955-01-21','Kyoichi','Maliniak','M','1989-09-12');

INSERT INTO salaries VALUES 
(10001,60117,'1986-06-26','1987-06-26'),
(10002,62102,'1987-06-26','1988-06-25'),
(10003,66074,'1988-06-25','1989-06-25'),
(10004,66596,'1989-06-25','1990-06-25'),
(10005,66961,'1990-06-25','1991-06-25');

The Basics

Here is a quick overview of the basic aggregate functions that are used often in postgresql.

AVG(): This function returns the average value.
COUNT(): This function returns the number of values.
MAX(): This function returns the maximum value.
MIN(): This function returns the minimum value.
SUM(): This function returns the sum of values.

Count

Our first example will count the number of rows in our employee table. We can use COUNT(*) in our select to count all rows.

select 
	count(*)
from employees e;

count
5

Now, let's say we want to see how many employees share the same birthday. We can use our count function with the birth_date in the select. Then, we can group our rows by birth date.

select 
	count(*), birth_date 
from employees e
group by birth_date;

We see in our data set, none of the employees share the same birthday as each row of our result table has a count of 1.

count	birth_date
1	1953-09-02
1	1964-06-02
1	1955-01-21
1	1959-12-03
1	1954-05-01

Average and Sum

Let's see more examples by joining the salary table. We start with the basic join, no aggregation.

select
	*
from employees e 
join salaries s on s.emp_no = e.emp_no;

emp_no	birth_date	first_name	last_name	gender	hire_date	emp_no	salary	from_date	to_date
10001	1953-09-02	Georgi	Facello	M	1986-06-26	10001	60117	1986-06-26	1987-06-26
10002	1964-06-02	Bezalel	Simmel	F	1985-11-21	10002	62102	1987-06-26	1988-06-25
10003	1959-12-03	Parto	Bamford	M	1986-08-28	10003	66074	1988-06-25	1989-06-25
10004	1954-05-01	Chirstian	Koblick	M	1986-12-01	10004	66596	1989-06-25	1990-06-25
10005	1955-01-21	Kyoichi	Maliniak	M	1989-09-12	10005	66961	1990-06-25	1991-06-25

Let's now take the average of al the employees salary. To do this, we will use the avg function and pass in the salary column.

select
	avg(salary)
from employees e 
join salaries s on s.emp_no = e.emp_no;

The result is that our average salary is $64,370.

avg
64370.000000000000

Let's also sum the salaries to see the total amount paid to our employees.

select
	sum(salary)
from employees e 
join salaries s on s.emp_no = e.emp_no;

sum
321850

Min and Max

In our final example, let's see that we don't have to use only one aggregate function per a select. We can combine them. Let's select the min and the max salary to see what we pay on the high and low end.

select
	min(salary), max(salary)
from employees e 
join salaries s on s.emp_no = e.emp_no;

min	max
60117	66961