Introduction
In this chapter, we will learn how to use the GROUP BY
clause in SQL. The GROUP BY
clause is used to arrange identical data into groups. This is particularly useful when combined with aggregate functions to summarize data. This chapter will cover the syntax, usage, and provide examples to help you understand how to use the GROUP BY
clause effectively.
What is the GROUP BY Clause?
The GROUP BY
clause groups rows that have the same values in specified columns into summary rows. It is often used with aggregate functions such as COUNT
, SUM
, AVG
, MAX
, and MIN
to perform calculations on each group.
Syntax for GROUP BY
Basic Syntax
SELECT column1, aggregate_function(column2)
FROM table_name
WHERE condition
GROUP BY column1, column2, ...;
column1, column2, ...
: The columns by which you want to group the result set.aggregate_function(column2)
: The aggregate function to apply to the grouped data.table_name
: The name of the table from which you want to retrieve data.condition
: The condition that must be met for a row to be included in the result (optional).
Example
Assume we have a table named employees
:
SELECT department_id, COUNT(*)
FROM employees
GROUP BY department_id;
This command groups the rows by department_id
and counts the number of employees in each department.
Step-by-Step Example
1. Create a Sample Table
First, we will create a sample tables - employees
and departments
.
CREATE TABLE departments (
department_id INT PRIMARY KEY,
department_name VARCHAR(50)
);
CREATE TABLE employees (
id INT PRIMARY KEY AUTO_INCREMENT,
first_name VARCHAR(50),
last_name VARCHAR(50),
email VARCHAR(100),
department_id INT,
salary DECIMAL(10, 2)
);
2. Insert Sample Data
INSERT INTO departments (department_id, department_name)
VALUES
(1, 'HR'),
(2, 'Finance'),
(3, 'IT');
INSERT INTO employees (first_name, last_name, email, department_id, salary)
VALUES
('Ramesh', 'Kumar', 'ramesh.kumar@example.com', 1, 50000),
('Sita', 'Patel', 'sita.patel@example.com', 2, 60000),
('Arjun', 'Singh', 'arjun.singh@example.com', 1, 55000),
('Priya', 'Sharma', 'priya.sharma@example.com', 2, 62000),
('Ramesh', 'Kumar', 'ramesh.kumar2@example.com', 3, 50000);
3. Group by a Single Column
To group employees by department_id
and count the number of employees in each department:
SELECT department_id, COUNT(*) AS num_employees
FROM employees
GROUP BY department_id;
Output
department_id | num_employees |
---|---|
1 | 2 |
2 | 2 |
3 | 1 |
4. Group by Multiple Columns
To group employees by department_id
and last_name
and count the number of employees in each group:
SELECT department_id, last_name, COUNT(*) AS num_employees
FROM employees
GROUP BY department_id, last_name;
Output
department_id | last_name | num_employees |
---|---|---|
1 | Kumar | 1 |
1 | Singh | 1 |
2 | Patel | 1 |
2 | Sharma | 1 |
3 | Kumar | 1 |
Using Aggregate Functions with GROUP BY
You can use aggregate functions to perform calculations on grouped data.
Examples
-
SUM: To calculate the total salary for each department:
SELECT department_id, SUM(salary) AS total_salary FROM employees GROUP BY department_id;
Output
department_id total_salary 1 105000.00 2 122000.00 3 50000.00 -
AVG: To calculate the average salary for each department:
SELECT department_id, AVG(salary) AS avg_salary FROM employees GROUP BY department_id;
Output
department_id avg_salary 1 52500.00 2 61000.00 3 50000.00 -
MAX: To find the highest salary in each department:
SELECT department_id, MAX(salary) AS highest_salary FROM employees GROUP BY department_id;
Output
department_id highest_salary 1 55000.00 2 62000.00 3 50000.00 -
MIN: To find the lowest salary in each department:
SELECT department_id, MIN(salary) AS lowest_salary FROM employees GROUP BY department_id;
Output
department_id lowest_salary 1 50000.00 2 60000.00 3 50000.00
Using HAVING Clause with GROUP BY
The HAVING
clause is used to filter groups based on a condition. It is similar to the WHERE
clause but is used for groups.
Example
To find departments with more than one employee:
SELECT department_id, COUNT(*) AS num_employees
FROM employees
GROUP BY department_id
HAVING COUNT(*) > 1;
Output
department_id | num_employees |
---|---|
1 | 2 |
2 | 2 |
Conclusion
The GROUP BY
clause is used for summarizing data in SQL. This chapter covered the basic syntax, grouping by single and multiple columns, using aggregate functions, and filtering groups with the HAVING
clause. Understanding how to use the GROUP BY
clause effectively will enhance your ability to analyze and summarize your database data.
In the next chapter, you will learn more about HAVING
clause.
Comments
Post a Comment
Leave Comment