Postgresql aggregate functions allow us to summarize groups of rows. For example, we can count the number of sales, get the average revenue, and more. In this article we will learn how to use aggregate functions in Postgresql.
The basic syntax of a Upper is as follows:
SELECT upper(string|column);
We will be using docker in this article, but feel free to install your database locally instead. Once you have docker installed, create a new file called docker-compose.yml
and add the following.
version: '3'
services:
db:
image: 'postgres:latest'
ports:
- 5432:5432
environment:
POSTGRES_USER: username
POSTGRES_PASSWORD: password
POSTGRES_DB: default_database
volumes:
- psqldata:/var/lib/postgresql
phpmyadmin:
image: phpmyadmin/phpmyadmin
links:
- db
environment:
PMA_HOST: db
PMA_PORT: 3306
PMA_ARBITRARY: 1
restart: always
ports:
- 8081:80
volumes:
psqldata:
Next, run docker-compose up
.
Now, navigate to http://localhost:8081/
to access phpMyAdmin. Then log in with the username root
and pass root_pass
.
Click the SQL tab and you are ready to go.
In this article, we will need some data to work with. If you don't understand these commands, don't worry, we will cover them in later articles.
We will be using the sample db provided here: https://dev.mysql.com/doc/sakila/en/. However, we will only enter what we need rather than import the whole db.
Next, let's create an film
table. This is a slightly simplified version of the sakila database.
CREATE TABLE employees (
emp_no INT NOT NULL,
birth_date DATE NOT NULL,
first_name VARCHAR(14) NOT NULL,
last_name VARCHAR(16) NOT NULL,
gender VARCHAR(1),
hire_date DATE NOT NULL,
PRIMARY KEY (emp_no)
);
CREATE TABLE salaries (
emp_no INT NOT NULL,
salary INT NOT NULL,
from_date DATE NOT NULL,
to_date DATE NOT NULL,
FOREIGN KEY (emp_no) REFERENCES employees (emp_no) ON DELETE CASCADE,
PRIMARY KEY (emp_no, from_date)
);
Now, let's enter a few rows
INSERT INTO employees VALUES (10001,'1953-09-02','Georgi','Facello','M','1986-06-26'),
(10002,'1964-06-02','Bezalel','Simmel','F','1985-11-21'),
(10003,'1959-12-03','Parto','Bamford','M','1986-08-28'),
(10004,'1954-05-01','Chirstian','Koblick','M','1986-12-01'),
(10005,'1955-01-21','Kyoichi','Maliniak','M','1989-09-12');
INSERT INTO salaries VALUES
(10001,60117,'1986-06-26','1987-06-26'),
(10002,62102,'1987-06-26','1988-06-25'),
(10003,66074,'1988-06-25','1989-06-25'),
(10004,66596,'1989-06-25','1990-06-25'),
(10005,66961,'1990-06-25','1991-06-25');
Here is a quick overview of the basic aggregate functions that are used often in postgresql.
Our first example will count the number of rows in our employee table. We can use COUNT(*)
in our select to count all rows.
select
count(*)
from employees e;
count |
---|
5 |
Now, let's say we want to see how many employees share the same birthday. We can use our count function with the birth_date in the select. Then, we can group our rows by birth date.
select
count(*), birth_date
from employees e
group by birth_date;
We see in our data set, none of the employees share the same birthday as each row of our result table has a count of 1.
count | birth_date |
---|---|
1 | 1953-09-02 |
1 | 1964-06-02 |
1 | 1955-01-21 |
1 | 1959-12-03 |
1 | 1954-05-01 |
Let's see more examples by joining the salary table. We start with the basic join, no aggregation.
select
*
from employees e
join salaries s on s.emp_no = e.emp_no;
emp_no | birth_date | first_name | last_name | gender | hire_date | emp_no | salary | from_date | to_date |
---|---|---|---|---|---|---|---|---|---|
10001 | 1953-09-02 | Georgi | Facello | M | 1986-06-26 | 10001 | 60117 | 1986-06-26 | 1987-06-26 |
10002 | 1964-06-02 | Bezalel | Simmel | F | 1985-11-21 | 10002 | 62102 | 1987-06-26 | 1988-06-25 |
10003 | 1959-12-03 | Parto | Bamford | M | 1986-08-28 | 10003 | 66074 | 1988-06-25 | 1989-06-25 |
10004 | 1954-05-01 | Chirstian | Koblick | M | 1986-12-01 | 10004 | 66596 | 1989-06-25 | 1990-06-25 |
10005 | 1955-01-21 | Kyoichi | Maliniak | M | 1989-09-12 | 10005 | 66961 | 1990-06-25 | 1991-06-25 |
Let's now take the average of al the employees salary. To do this, we will use the avg
function and pass in the salary column.
select
avg(salary)
from employees e
join salaries s on s.emp_no = e.emp_no;
The result is that our average salary is $64,370.
avg |
---|
64370.000000000000 |
Let's also sum the salaries to see the total amount paid to our employees.
select
sum(salary)
from employees e
join salaries s on s.emp_no = e.emp_no;
sum |
---|
321850 |
In our final example, let's see that we don't have to use only one aggregate function per a select. We can combine them. Let's select the min and the max salary to see what we pay on the high and low end.
select
min(salary), max(salary)
from employees e
join salaries s on s.emp_no = e.emp_no;
min | max |
---|---|
60117 | 66961 |