MySQL分组统计你会吗?
分组统计在日常工作中经常用到,今天分享使用sum,case when进行分组统计。
表结构如下:
CREATE TABLE `staff` (
`id` bigint(20) NOT NULL AUTO_INCREMENT COMMENT 'ID',
`name` varchar(50) NOT NULL COMMENT '姓名',
`dept_id` varchar(10) DEFAULT NULL COMMENT '部门id',
`salary` double DEFAULT NULL COMMENT '工资',
`sex` int(11) DEFAULT NULL COMMENT '性别,1是男生,0是女生',
`created_time` datetime DEFAULT NULL COMMENT '创建时间',
`updated_time` datetime DEFAULT NULL COMMENT '更新时间',
PRIMARY KEY (`id`)
) ENGINE=InnoDB AUTO_INCREMENT=11 DEFAULT CHARSET=utf8 COMMENT='职员表'
现有记录:
mysql> select * from staff ;
+----+-------+---------+--------+------+---------------------+---------------------+
| id | name | dept_id | salary | sex | created_time | updated_time |
+----+-------+---------+--------+------+---------------------+---------------------+
| 1 | Tom | 101 | 2500 | 1 | 2020-03-08 18:37:55 | 2020-03-08 18:37:55 |
| 2 | Tonny | 101 | 3500 | 1 | 2020-03-08 18:37:55 | 2020-03-08 18:37:55 |
| 3 | Bob | 101 | 3500 | 0 | 2020-03-08 18:37:55 | 2020-03-08 18:37:55 |
| 4 | Rob | 101 | 4500 | NULL | 2020-03-08 18:37:55 | 2020-03-08 18:37:55 |
| 5 | Sudey | 101 | 5500 | 0 | 2020-03-08 18:37:55 | 2020-03-08 18:37:55 |
| 6 | Sunny | 102 | 5500 | 1 | 2020-03-08 18:37:55 | 2020-03-08 18:37:55 |
| 7 | Sedey | 102 | 3500 | 0 | 2020-03-08 18:37:55 | 2020-03-08 18:37:55 |
| 8 | Bobby | 102 | 4500 | NULL | 2020-03-08 18:37:55 | 2020-03-08 18:37:55 |
| 9 | Weedy | 103 | 4500 | 1 | 2020-03-08 18:37:55 | 2020-03-08 18:37:55 |
| 10 | Jimme | 104 | 5500 | 0 | 2020-03-08 18:37:55 | 2020-03-08 18:37:55 |
+----+-------+---------+--------+------+---------------------+---------------------+
10 rows in set (0.00 sec)
1.查询每个部门男生、女生分别多少人?
使用sum能正确查出,如下语句1符合期望:
select
dept_id,
count(*) as total_num,
sum(
case
when sex = 1 then 1 else 0 end
) as female_num
,
sum(
case
when sex = 0 then 1 else 0 end
) as male_num
,
sum(
case
when (sex<>1 and sex<>0) or sex is null then 1 else 0 end
) as unknown_num
from staff
group by dept_id;
+---------+-----------+------------+----------+-------------+
| dept_id | total_num | female_num | male_num | unknown_num |
+---------+-----------+------------+----------+-------------+
| 101 | 5 | 2 | 2 | 1 |
| 102 | 3 | 1 | 1 | 1 |
| 103 | 1 | 1 | 0 | 0 |
| 104 | 1 | 0 | 1 | 0 |
+---------+-----------+------------+----------+-------------+
4 rows in set (0.00 sec)
有些人会问使用count(case when......)可以吗?如下语句2不符合期望:
select
dept_id,
count(*) as total_num,
count(
case
when sex = 1 then 1 else 0 end
) as female_num
,
count(
case
when sex = 0 then 1 else 0 end
) as male_num
,
count(
case
when (sex<>1 and sex<>0) or sex is null then 1 else 0 end
) as unknown_num
from staff
group by dept_id;
+---------+-----------+------------+----------+-------------+
| dept_id | total_num | female_num | male_num | unknown_num |
+---------+-----------+------------+----------+-------------+
| 101 | 5 | 5 | 5 | 5 |
| 102 | 3 | 3 | 3 | 3 |
| 103 | 1 | 1 | 1 | 1 |
| 104 | 1 | 1 | 1 | 1 |
+---------+-----------+------------+----------+-------------+
4 rows in set (0.00 sec)
可见如上语句执行不报错,但是结果不对,这就是sum和count的区别。sum(c1)意思是对c1字段的所有值求和,可能是小数、整数、负数等等。count(c1)的意思是返回该表的总记录数,不是字段值求和,肯定是0或正整数。
2.查询 101部门salary>3000,102部门salary>4000, 其他部门salary>5000的男生、女生分别多少人?
可见,部门不同,条件也不同,怎么写呢?where条件中使用case when ,如下语句3符合期望:
select
dept_id,
count(*) as total_num,
sum(
case
when sex = 1 then 1 else 0 end
) as female_num
,
sum(
case
when sex = 0 then 1 else 0 end
) as male_num
,
sum(
case
when (sex<>1 and sex<>0) or sex is null then 1 else 0 end
) as unknown_num
from staff
where salary >
(case
when dept_id = '101' then 3000
when dept_id = '102' then 4000
else 5000
end)
group by dept_id;
+---------+-----------+------------+----------+-------------+
| dept_id | total_num | female_num | male_num | unknown_num |
+---------+-----------+------------+----------+-------------+
| 101 | 4 | 1 | 2 | 1 |
| 102 | 2 | 1 | 0 | 1 |
| 104 | 1 | 0 | 1 | 0 |
+---------+-----------+------------+----------+-------------+
3 rows in set (0.00 sec)
where条件中使用case when ,使用另一种语法,如下语句4符合期望:
select
dept_id,
count(*) as total_num,
sum(
case
when sex = 1 then 1 else 0 end
) as female_num
,
sum(
case
when sex = 0 then 1 else 0 end
) as male_num
,
sum(
case
when (sex<>1 and sex<>0) or sex is null then 1 else 0 end
) as unknown_num
from staff
where -- 换一种case when的语法
(case
when dept_id = '101' then salary > 3000
when dept_id = '102' then salary > 4000
else salary > 5000
end)
group by dept_id;
使用if,where条件中使用case when ,如下语句5符合期望:
select
dept_id,
count(*) as total_num,
sum(if(sex=1,1,0)) as female_num,
sum(if(sex=0,1,0)) as male_num,
sum(if((sex<>1 and sex<>0) or sex is null,1,0)) as unknown_num
from staff
where ......
细心的朋友会发现,条件中为啥还要带有"or sex is null"呢?
MySQL在执行where c1<>v1时会把c1为NUL的记录过滤掉!
mysql> select id,name,sex from staff where sex <> 1 ;
-- 没有sex为NULL的记录
+----+-------+------+
| id | name | sex |
+----+-------+------+
| 3 | Bob | 0 |
| 5 | Sudey | 0 |
| 7 | Sedey | 0 |
| 10 | Jimme | 0 |
+----+-------+------+
4 rows in set (0.00 sec)
mysql> select id,name,sex from staff where sex <> 1 or sex is null ;
-- 带上 or sex is null 后才正确
+----+-------+------+
| id | name | sex |
+----+-------+------+
| 3 | Bob | 0 |
| 4 | Rob | NULL |
| 5 | Sudey | 0 |
| 7 | Sedey | 0 |
| 8 | Bobby | NULL |
| 10 | Jimme | 0 |
+----+-------+------+
6 rows in set (0.00 sec)
如果需要使用多重case when 怎么办呢 ?如下:
select
salary,
case
when salary >5000 then -- 这里的then别忘了写
case
when salary <7000 and salary >=6000 then 'T6'
when salary <6000 and salary >=5000 then 'T5'
end
when salary >3000 then
case
when salary <5000 and salary >=4000 then 'T4'
when salary <4000 and salary >=3000 then 'T3'
end
when salary >1000 then
case
when salary <3000 and salary >=2000 then 'T2'
when salary <2000 and salary >=1000 then 'T1'
end
else 'T0'
end as salary_level
from staff ;
+--------+--------------+
| salary | salary_level |
+--------+--------------+
| 2500 | T2 |
| 3500 | T3 |
| 3500 | T3 |
| 4500 | T4 |
| 5500 | T5 |
| 5500 | T5 |
| 3500 | T3 |
| 4500 | T4 |
| 4500 | T4 |
| 5500 | T5 |
+--------+--------------+
10 rows in set (0.00 sec
历史文章:
「欢迎关注,一起学习,一起进步」