mysql实现分组top记录

mysql row_number

hive,oracle等数据库支持row_number() over(partition by x order by y)语法，即实现分组分别排序，mysql不支持这种语法，可以通过变量实现。创建一个部门，员工，工资关系表，返回各个部门员工工资从高到低的排序

create table emp (emp_id int ,dep_id int ,salary decimal(10,2) );

写入测试数据

insert into emp values
(1,10,5500.00),
(2,10,4500.00),
(3,20,1900.00),
(4,20,4800.00),
(5,40,6500.00),
(6,40,14500.00),
(7,40,44500.00),
(8,50,6500.00),
(9,50,7500.00);

首先查询看下当前数据

mysql> select * from emp;
+--------+--------+----------+
| emp_id | dep_id | salary   |
+--------+--------+----------+
|      1 |     10 |  5500.00 |
|      2 |     10 |  4500.00 |
|      3 |     20 |  1900.00 |
|      4 |     20 |  4800.00 |
|      5 |     40 |  6500.00 |
|      6 |     40 | 14500.00 |
|      7 |     40 | 44500.00 |
|      8 |     50 |  6500.00 |
|      9 |     50 |  7500.00 |
+--------+--------+----------+
9 rows in set (0.00 sec)

通过下面的sql返回每个员工在所在部门的工资排序

select emp_id,dep_id,salary, 
@order_idx:=IF(@cur_row = dep_id, @order_idx + 1, 1) AS order_idx, 
@cur_row:=dep_id 
from emp order by dep_id asc,salary desc;

这里关键是通过order_idx和cur_row两个变量实现分组排序。首先数据按dep_id排序，然后按salary排序，这样确保输出数据是符合要求的。然后通过控制cur_row和order_idx实现每个分组内的计数，查询每个分组第一行数据时 IF语句不满足，此时order_idx会赋值为默认的行号1，从第二行开始会在此基础上累加。输出结果为

mysql> select emp_id,dep_id,salary,
    -> @order_idx:=IF(@cur_row = dep_id, @order_idx + 1, 1) AS order_idx,
    -> @cur_row:=dep_id
    -> from emp order by dep_id asc,salary desc;
+--------+--------+----------+-----------+------------------+
| emp_id | dep_id | salary   | order_idx | @cur_row:=dep_id |
+--------+--------+----------+-----------+------------------+
|      1 |     10 |  5500.00 |         1 |               10 |
|      2 |     10 |  4500.00 |         2 |               10 |
|      4 |     20 |  4800.00 |         1 |               20 |
|      3 |     20 |  1900.00 |         2 |               20 |
|      7 |     40 | 44500.00 |         1 |               40 |
|      6 |     40 | 14500.00 |         2 |               40 |
|      5 |     40 |  6500.00 |         3 |               40 |
|      9 |     50 |  7500.00 |         1 |               50 |
|      8 |     50 |  6500.00 |         2 |               50 |
+--------+--------+----------+-----------+------------------+
9 rows in set (0.00 sec)

在这个输出结果上再进行order_idx<=n过滤就能实现分组取top n的需求

本文由至简创作，采用知识共享署名4.0 国际许可协议进行许可
本站文章除注明转载/出处外，均为本站原创或翻译，转载前请务必署名
最后编辑时间为: 2021/09/22 13:24