Mysql group by 分组内排序

假设有一张评论表,记录 post 的每一条评论:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
CREATE TABLE `comments` (
`id` INT(11) NOT NULL AUTO_INCREMENT PRIMARY KEY,
`post_id` INT(11) NOT NULL,
`content` varchar(1024) default null,
`created` DATETIME default null,
`modified` DATETIME default null,

key `i_post_id` (`post_id`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8;

insert into `comments` values
(null, 1, 'aaa', '2017-07-01 00:00:00'),
(null, 2, 'bbb', '2017-07-02 00:00:00'),
(null, 1, 'ccc', '2017-07-03 00:00:00'),
(null, 3, 'ddd', '2017-07-04 00:00:00'),
(null, 1, 'eee', '2017-07-05 00:00:00'),
(null, 2, 'fff', '2017-07-06 00:00:00');
1
2
3
4
5
6
7
8
9
10
11
MariaDB [test]> select * from comments;
+----+---------+---------+---------------------+
| id | post_id | content | created |
+----+---------+---------+---------------------+
| 1 | 1 | aaa | 2017-07-01 00:00:00 |
| 2 | 2 | bbb | 2017-07-02 00:00:00 |
| 3 | 1 | ccc | 2017-07-03 00:00:00 |
| 4 | 1 | eee | 2017-07-05 00:00:00 |
| 5 | 2 | fff | 2017-07-06 00:00:00 |
+----+---------+---------+---------------------+
5 rows in set (0.01 sec)

需求:取得每一篇 post 的最后评论时间。根据上面的数据,用肉眼可以看出应该是 id 为 4 和 5 的记录。

1
2
3
4
5
6
7
8
9
MariaDB [test]> select id, post_id, created from comments
-> group by post_id order by created desc;
+----+---------+---------------------+
| id | post_id | created |
+----+---------+---------------------+
| 2 | 2 | 2017-07-02 00:00:00 |
| 1 | 1 | 2017-07-01 00:00:00 |
+----+---------+---------------------+
2 rows in set (0.00 sec)

显然这个结果不对。原因是 Mysql 关键字的执行顺序 group by > order by,即先根据 post_id 分组,默认都取得第一笔资料,然后再根据 created 字段进行排序,所以得到上面的结果。

这里我们先假设 id 字段和 created 字段是一起增加的,即 id 越大,created 越新。

1
2
3
4
5
6
7
8
9
10
11
MariaDB [test]> select id, post_id, created from comments c1
-> inner join (
-> select max(id) as max_id from comments group by post_id
-> ) as c2 on c2.max_id = c1.id;
+----+---------+---------------------+
| id | post_id | created |
+----+---------+---------------------+
| 4 | 1 | 2017-07-05 00:00:00 |
| 5 | 2 | 2017-07-06 00:00:00 |
+----+---------+---------------------+
2 rows in set (0.00 sec)

结果符合预期。

那么如果 id 和 created 不是一起增加的呢。假设以下数据:

1
2
3
4
5
6
insert into `comments` values 
(null, 1, 'aaa', '2017-07-11 00:00:00'),
(null, 2, 'bbb', '2017-07-02 00:00:00'),
(null, 1, 'ccc', '2017-07-03 00:00:00'),
(null, 1, 'eee', '2017-07-05 00:00:00'),
(null, 2, 'fff', '2017-07-06 00:00:00');
1
2
3
4
5
6
7
8
9
10
11
MariaDB [test]> select * from comments;
+----+---------+---------+---------------------+
| id | post_id | content | created |
+----+---------+---------+---------------------+
| 1 | 1 | aaa | 2017-07-11 00:00:00 |
| 2 | 2 | bbb | 2017-07-02 00:00:00 |
| 3 | 1 | ccc | 2017-07-03 00:00:00 |
| 4 | 1 | eee | 2017-07-05 00:00:00 |
| 5 | 2 | fff | 2017-07-06 00:00:00 |
+----+---------+---------+---------------------+
5 rows in set (0.00 sec)

可以看出,符合结果的记录应该 id 为 1 和 5。

1
2
3
4
5
6
7
8
9
10
11
MariaDB [zireael]> select id, post_id, created from comments c1
-> inner join (
-> select max(id) as max_id from comments group by post_id
-> ) as c2 on c2.max_id = c1.id;
+----+---------+---------------------+
| id | post_id | created |
+----+---------+---------------------+
| 4 | 1 | 2017-07-05 00:00:00 |
| 5 | 2 | 2017-07-06 00:00:00 |
+----+---------+---------------------+
2 rows in set (0.00 sec)

所以这个时候继续用 max(id) 来解决是不对的。

1
2
3
4
5
6
7
8
9
10
11
MariaDB [zireael]> select c1.id, c1.post_id, c1.created from comments c1
-> inner join (
-> select post_id, max(created) as max_created from comments group by post_id
-> ) as c2 on c2.post_id = c1.post_id and c2.max_created = c1.created;
+----+---------+---------------------+
| id | post_id | created |
+----+---------+---------------------+
| 1 | 1 | 2017-07-11 00:00:00 |
| 5 | 2 | 2017-07-06 00:00:00 |
+----+---------+---------------------+
2 rows in set (0.00 sec)

这个时候的结果就对了。