The GROUP BY clause lets us summarize multiple rows into fewer rows, or even a single row. For example, if we want to count the number of people with the same last name or sum the number of orders in a day, we can use the GROUP BY clause. In this article, we will learn how to use the GROUP BY clause in MySQL.
The basic syntax of a GROUP BY query is as follows:
SELECT [columns] FROM [table] WHERE [conditions] GROUP BY [columns];
We will be using Docker in this article, but feel free to install your database locally instead. Once you have Docker installed, create a new file called docker-compose.yml and add the following.
version: '3'
services:
  db:
    image: mysql:latest
    container_name: db
    environment:
      MYSQL_ROOT_PASSWORD: root_pass
      MYSQL_DATABASE: app_db
      MYSQL_USER: db_user
      MYSQL_PASSWORD: db_user_pass
    ports:
      - "6033:3306"
    volumes:
      - dbdata:/var/lib/mysql
  phpmyadmin:
    image: phpmyadmin/phpmyadmin
    container_name: pma
    links:
      - db
    environment:
      PMA_HOST: db
      PMA_PORT: 3306
      PMA_ARBITRARY: 1
    restart: always
    ports:
      - 8081:80
volumes:
  dbdata:
Now, start the containers by running docker-compose up -d in the same directory, then navigate to http://localhost:8081/ to access phpMyAdmin. Log in with the username root and the password root_pass. Click the SQL tab and you are ready to go.
In this article, we will need some data to work with. If you don't understand these commands, don't worry; we will cover them in later articles.
We will be using the sample database provided here: https://dev.mysql.com/doc/sakila/en/. However, we will only enter what we need rather than import the whole database.
With the SQL tab open (or your own SQL CLI running), let's first create our database and select it.
CREATE DATABASE IF NOT EXISTS sakila;
USE sakila;
Next, let's create an actor table.
CREATE TABLE actor (
  actor_id SMALLINT UNSIGNED NOT NULL AUTO_INCREMENT,
  first_name VARCHAR(45) NOT NULL,
  last_name VARCHAR(45) NOT NULL,
  last_update TIMESTAMP NOT NULL DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP,
  PRIMARY KEY (actor_id),
  KEY idx_actor_last_name (last_name)
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4;
And finally, let's enter a few rows.
INSERT INTO actor VALUES
  (1,'PENELOPE','GUINESS','2006-02-15 04:34:33'),
  (2,'NICK','WAHLBERG','2006-02-15 04:34:33'),
  (3,'ED','CHASE','2006-02-15 04:34:33'),
  (4,'JENNIFER','DAVIS','2006-02-15 04:34:33'),
  (5,'JOHNNY','LOLLOBRIGIDA','2006-02-15 04:34:33'),
  (6,'BETTE','NICHOLSON','2006-02-15 04:34:33'),
  (7,'GRACE','MOSTEL','2006-02-15 04:34:33'),
  (8,'MATTHEW','JOHANSSON','2006-02-15 04:34:33'),
  (9,'JOAN','JOHANSSON','2006-02-15 04:34:33');
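As a quick sanity check, and assuming the INSERT above ran without errors, a query along these lines should confirm that the sample rows are in place. It also shows an aggregate function used without GROUP BY, where the whole table is treated as a single group. The alias actor_count is only an illustrative name.

```sql
-- Count every row in the actor table.
-- With the nine sample rows inserted above, this returns a single row: 9.
SELECT COUNT(*) AS actor_count
FROM actor;
```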
In our first example, we will use GROUP BY to group all actors by their last name. We won't use any aggregate function, such as COUNT. This gives us a distinct list of last names, similar to using the DISTINCT keyword.
SELECT last_name AS LastName FROM actor GROUP BY LastName;
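For comparison, here is a sketch of the same list written with SELECT DISTINCT instead of GROUP BY. With the sample data, both queries should return eight last names, with JOHANSSON appearing only once, although the order of the rows is not guaranteed in either case.

```sql
-- One row per distinct last name, without any grouping clause.
SELECT DISTINCT last_name AS LastName
FROM actor;
```

DISTINCT is usually the clearer choice when no aggregate is needed; GROUP BY becomes necessary once we want a value computed per group.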
Next, we will run the same query, but add the COUNT(*) aggregate function to the column list. COUNT(*) counts the rows in each group, so this should return a list of actor last names and the number of actors sharing each one.
SELECT last_name AS LastName, COUNT(*) FROM actor GROUP BY LastName;
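Recent MySQL versions make no promise about the order in which grouped rows come back, so if the order matters it is safer to state it explicitly. The following is a minimal sketch that sorts by the count; the alias LastNameCount is simply a name chosen for this example.

```sql
-- Count actors per last name, most common first.
-- Ties on the count are broken alphabetically by last name.
SELECT last_name AS LastName, COUNT(*) AS LastNameCount
FROM actor
GROUP BY last_name
ORDER BY LastNameCount DESC, LastName ASC;
```

With the sample data, JOHANSSON (count 2) should come first, followed by the seven remaining last names with a count of 1.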
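Often, we will want to filter our groups. We cannot use the WHERE clause for this, because WHERE is applied to the individual rows before they are grouped and so cannot see aggregate values such as COUNT(*). Instead, we use the HAVING clause, which filters the groups after they are formed.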
SELECT last_name AS LastName, COUNT(*) AS LastNameCount FROM actor GROUP BY LastName HAVING LastNameCount > 1;
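WHERE and HAVING can also appear in the same query, which makes the difference in timing easier to see. In this sketch, the WHERE condition on first_name is an arbitrary filter picked for illustration.

```sql
-- WHERE runs first and drops the row for JOAN JOHANSSON before grouping.
-- HAVING then runs on the resulting groups. No last name has more than
-- one row left, so with the sample data this query returns no rows.
SELECT last_name AS LastName, COUNT(*) AS LastNameCount
FROM actor
WHERE first_name <> 'JOAN'
GROUP BY last_name
HAVING COUNT(*) > 1;
```

Removing the WHERE line brings the JOHANSSON group back, since both of its rows are then counted.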