vlambda博客
学习文章列表

MySQL分组统计你会吗?

分组统计在日常工作中经常用到,今天分享使用sum,case when进行分组统计。

表结构如下:

CREATE TABLE `staff` ( `id` bigint(20) NOT NULL AUTO_INCREMENT COMMENT 'ID', `name` varchar(50) NOT NULL COMMENT '姓名', `dept_id` varchar(10) DEFAULT NULL COMMENT '部门id', `salary` double DEFAULT NULL COMMENT '工资', `sex` int(11) DEFAULT NULL COMMENT '性别,1是男生,0是女生', `created_time` datetime DEFAULT NULL COMMENT '创建时间', `updated_time` datetime DEFAULT NULL COMMENT '更新时间', PRIMARY KEY (`id`)) ENGINE=InnoDB AUTO_INCREMENT=11 DEFAULT CHARSET=utf8 COMMENT='职员表'

现有记录:

mysql> select * from staff ;+----+-------+---------+--------+------+---------------------+---------------------+| id | name | dept_id | salary | sex | created_time | updated_time |+----+-------+---------+--------+------+---------------------+---------------------+| 1 | Tom | 101 | 2500 | 1 | 2020-03-08 18:37:55 | 2020-03-08 18:37:55 || 2 | Tonny | 101 | 3500 | 1 | 2020-03-08 18:37:55 | 2020-03-08 18:37:55 || 3 | Bob | 101 | 3500 | 0 | 2020-03-08 18:37:55 | 2020-03-08 18:37:55 || 4 | Rob | 101 | 4500 | NULL | 2020-03-08 18:37:55 | 2020-03-08 18:37:55 || 5 | Sudey | 101 | 5500 | 0 | 2020-03-08 18:37:55 | 2020-03-08 18:37:55 || 6 | Sunny | 102 | 5500 | 1 | 2020-03-08 18:37:55 | 2020-03-08 18:37:55 || 7 | Sedey | 102 | 3500 | 0 | 2020-03-08 18:37:55 | 2020-03-08 18:37:55 || 8 | Bobby | 102 | 4500 | NULL | 2020-03-08 18:37:55 | 2020-03-08 18:37:55 || 9 | Weedy | 103 | 4500 | 1 | 2020-03-08 18:37:55 | 2020-03-08 18:37:55 || 10 | Jimme | 104 | 5500 | 0 | 2020-03-08 18:37:55 | 2020-03-08 18:37:55 |+----+-------+---------+--------+------+---------------------+---------------------+10 rows in set (0.00 sec)

1.查询每个部门男生、女生分别多少人?

使用sum能正确查出,如下语句1符合期望:

selectdept_id,count(*) as total_num,sum(case  when sex = 1 then 1 else 0 end) as female_num ,sum(case  when sex = 0 then 1 else 0 end) as male_num,sum(case  when (sex<>1 and sex<>0) or sex is null then 1 else 0 end) as unknown_numfrom staff group by dept_id;
+---------+-----------+------------+----------+-------------+| dept_id | total_num | female_num | male_num | unknown_num |+---------+-----------+------------+----------+-------------+| 101 | 5 | 2 | 2 | 1 || 102 | 3 | 1 | 1 | 1 || 103 | 1 | 1 | 0 | 0 || 104 | 1 | 0 | 1 | 0 |+---------+-----------+------------+----------+-------------+4 rows in set (0.00 sec)

有些人会问使用count(case when......)可以吗?如下语句2不符合期望:

selectdept_id,count(*) as total_num,count(case  when sex = 1 then 1 else 0 end) as female_num ,count(case  when sex = 0 then 1 else 0 end) as male_num,count(case  when (sex<>1 and sex<>0) or sex is null then 1 else 0 end) as unknown_numfrom staff group by dept_id;
+---------+-----------+------------+----------+-------------+| dept_id | total_num | female_num | male_num | unknown_num |+---------+-----------+------------+----------+-------------+| 101 | 5 | 5 | 5 | 5 || 102 | 3 | 3 | 3 | 3 || 103 | 1 | 1 | 1 | 1 || 104 | 1 | 1 | 1 | 1 |+---------+-----------+------------+----------+-------------+4 rows in set (0.00 sec)

可见如上语句执行不报错,但是结果不对,这就是sum和count的区别。sum(c1)意思是对c1字段的所有值求和,可能是小数、整数、负数等等。count(c1)的意思是返回该表的总记录数,不是字段值求和,肯定是0或正整数。



2.查询 101部门salary>3000,102部门salary>4000, 其他部门salary>5000的男生、女生分别多少人?

可见,部门不同,条件也不同,怎么写呢?where条件中使用case when ,如下语句3符合期望:

selectdept_id,count(*) as total_num,sum(case  when sex = 1 then 1 else 0 end) as female_num ,sum(case  when sex = 0 then 1 else 0 end) as male_num,sum(case  when (sex<>1 and sex<>0) or sex is null then 1 else 0 end) as unknown_numfrom staff where salary > (case  when dept_id = '101' then 3000 when dept_id = '102' then 4000 else 5000 end)group by dept_id;
+---------+-----------+------------+----------+-------------+| dept_id | total_num | female_num | male_num | unknown_num |+---------+-----------+------------+----------+-------------+| 101 | 4 | 1 | 2 | 1 || 102 | 2 | 1 | 0 | 1 || 104 | 1 | 0 | 1 | 0 |+---------+-----------+------------+----------+-------------+3 rows in set (0.00 sec)

where条件中使用case when ,使用另一种语法,如下语句4符合期望:

selectdept_id,count(*) as total_num,sum(case  when sex = 1 then 1 else 0 end) as female_num ,sum(case  when sex = 0 then 1 else 0 end) as male_num,sum(case  when (sex<>1 and sex<>0) or sex is null then 1 else 0 end) as unknown_numfrom staff where -- 换一种case when的语法(case  when dept_id = '101' then salary > 3000 when dept_id = '102' then salary > 4000 else salary > 5000 end)group by dept_id;

使用if,where条件中使用case when ,如下语句5符合期望:

selectdept_id,count(*) as total_num,sum(if(sex=1,1,0)) as female_num,sum(if(sex=0,1,0)) as male_num,sum(if((sex<>1 and sex<>0) or sex is null,1,0)) as unknown_numfrom staff where ......


细心的朋友会发现,条件中为啥还要带有"or sex is null"呢?

MySQL在执行where c1<>v1时会把c1为NUL的记录过滤掉!

mysql> select id,name,sex from staff where sex <> 1 ; -- 没有sex为NULL的记录+----+-------+------+| id | name | sex |+----+-------+------+| 3 | Bob | 0 || 5 | Sudey | 0 || 7 | Sedey | 0 || 10 | Jimme | 0 |+----+-------+------+4 rows in set (0.00 sec)
mysql> select id,name,sex from staff where sex <> 1 or sex is null ;-- 带上 or sex is null 后才正确+----+-------+------+| id | name | sex |+----+-------+------+| 3 | Bob | 0 || 4 | Rob | NULL || 5 | Sudey | 0 || 7 | Sedey | 0 || 8 | Bobby | NULL || 10 | Jimme | 0 |+----+-------+------+6 rows in set (0.00 sec)



如果需要使用多重case when 怎么办呢 ?如下:

selectsalary,case  when salary >5000 then -- 这里的then别忘了写 case  when salary <7000 and salary >=6000 then 'T6' when salary <6000 and salary >=5000 then 'T5' end when salary >3000 then case  when salary <5000 and salary >=4000 then 'T4' when salary <4000 and salary >=3000 then 'T3' end when salary >1000 then case  when salary <3000 and salary >=2000 then 'T2' when salary <2000 and salary >=1000 then 'T1' end else 'T0'end as salary_levelfrom staff ;
+--------+--------------+| salary | salary_level |+--------+--------------+|   2500 | T2           || 3500 | T3 ||   3500 | T3           || 4500 | T4 ||   5500 | T5           || 5500 | T5 ||   3500 | T3           || 4500 | T4 ||   4500 | T4           || 5500 | T5 |+--------+--------------+10 rows in set (0.00 sec

历史文章:






「欢迎关注,一起学习,一起进步」