MySQL: Select the two highest ids of each kind

I have a table that contains multiple rows of each kind, each with a different id. (There are many kinds. Ids are unique. Both columns are indexed.)
Now I need to select the two highest ids of each kind.
Here is what I do:
select max(c.id), max(d.id) from theTable c left join
theTable d on c.id > d.id and c.kind=d.kind
where c.id > constant group by c.kind;
However, the query above doesn't perform very well, which is no big surprise.
I've figured out a faster version of it...
select c.id, max(d.id) from (select max(id) id, kind from theTable
where id > constant group by kind) c left join
theTable d on c.id > d.id and c.kind=d.kind group by c.kind;
... but it is still not fast enough.
Is there a more efficient way to achieve the same result?
Thanks!
Edit:
theTable is a history table, so my task is to get the current and the previous values for each kind, compare them as part of an expression (logical operations, coalesces, ifs, etc.) and determine whether the expression results differ.
here is an example resultset:
+-----------+-----------+
| max(c.id) | max(d.id) |
+-----------+-----------+
| 1747 | NULL |
| 1701 | 1432 |
| 1703 | 1434 |
| 1706 | 1437 |
| 1707 | 1438 |
| 1751 | NULL |
| 1713 | 1444 |
| 1750 | NULL |
| 1709 | 1440 |
| 1742 | 1741 |
| 1711 | 1442 |
| 1746 | 1745 |
| 1708 | 1439 |
| 1719 | 1450 |
| 1725 | 1456 |
| 1723 | 1454 |
| 1740 | 1733 |
| 1705 | 1436 |
| 1702 | 1433 |
| 1749 | 1748 |
| 1712 | 1443 |
| 1718 | 1449 |
| 1722 | 1453 |
| 1728 | 1459 |
| 1721 | 1452 |
| 1739 | 1731 |
| 1714 | 1445 |
| 1717 | 1448 |
| 1716 | 1447 |
| 1724 | 1455 |
| 1710 | 1441 |
| 1727 | 1458 |
| 1720 | 1451 |
| 1738 | NULL |
| 1715 | 1446 |
| 1704 | 1435 |
| 1726 | 1457 |
| 1758 | 1757 |
+-----------+-----------+

What if instead of producing (kind, id, id) tuples with one row for each kind, your result set was (kind, id) with two rows per kind? I'm not sure if this will be more performant without running it myself, though.
SELECT x.kind, x.id
FROM (SELECT a.kind, a.id
FROM theTable a
LEFT OUTER JOIN theTable b
ON a.kind = b.kind
AND a.id < b.id
GROUP BY a.id
HAVING COUNT(*) < 2
ORDER BY b.id) x
WHERE x.id > constant
ORDER BY x.kind;
The last ORDER BY clause is just to make it easier for you to verify results, so omit it when evaluating performance. Note that some kinds may only have one id exceeding your constant, so you'll only have one (kind, id) row for that kind.

The following may perform pretty well:
select kind, max(id) as maxid,
(select id from t t2 where t2.kind = t.kind and t2.id < max(t.id) order by id desc limit 1) as secondId
from t
group by kind
This will work well if you have a composite index on (kind, id).
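To make the idea above concrete, here is a minimal sketch that runs a close variant of the query through Python's built-in sqlite3 module. SQLite stands in for MySQL here, and the table contents, the index name idx_kind_id and the column aliases max_id and second_id are made up for illustration. The variant first computes the highest id per kind in a derived table and then looks up the next lower id with a correlated subquery, which the (kind, id) index can serve.

```python
import sqlite3

# In-memory SQLite database standing in for the MySQL table (assumption).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (id INTEGER PRIMARY KEY, kind TEXT NOT NULL)")
# Composite index so the per-kind lookups can be served from the index.
conn.execute("CREATE INDEX idx_kind_id ON t (kind, id)")
conn.executemany(
    "INSERT INTO t (id, kind) VALUES (?, ?)",
    [(1, "a"), (2, "b"), (3, "a"), (4, "a"), (5, "c"), (6, "b")],
)

rows = conn.execute(
    """
    SELECT m.kind,
           m.max_id,
           -- highest id of the same kind that is below the maximum
           (SELECT MAX(t2.id)
              FROM t AS t2
             WHERE t2.kind = m.kind AND t2.id < m.max_id) AS second_id
      FROM (SELECT kind, MAX(id) AS max_id FROM t GROUP BY kind) AS m
     ORDER BY m.kind
    """
).fetchall()

for row in rows:
    print(row)
```

A kind with only one row gets NULL (None in Python) as its second id, which matches the NULL entries in the result set shown in the question.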

Related

MySQL query using limit and offset does not return expected ordered results

I am trying to get the count of entries by users grouped by year, month and user name from a table which has 45M entries. The query result has around 4M records which I wasn't able to get in one go so I decided to use limit and offset.
To retrieve the first 1M records I've written the query below:
select SQL_BIG_RESULT uis.nick, uis.user_id, CONCAT(t.year, '-', LPAD(t.month, 2, 0)) AS DATE, t.count
from (select SQL_BIG_RESULT e.user_id, YEAR(e.created_at) as year, MONTH(e.created_at) as month, COUNT(*) AS count
from entries e
group by YEAR(e.created_at), MONTH(e.created_at), e.user_id
limit 1000000
) t
inner join users u on u.id = t.user_id
inner join user_infos ui on ui.user_id = u.id
inner join user_identifiers uis on uis.user_info_id = ui.id
order by t.year, t.month, uis.nick;
To retrieve the second 1M records I've set an offset of 999998 so I would have 2 overlapping rows so that I could double check that it's correct, hence this query below:
select SQL_BIG_RESULT uis.nick, uis.user_id, CONCAT(t.year, '-', LPAD(t.month, 2, 0)) AS DATE, t.count
from (select SQL_BIG_RESULT e.user_id, YEAR(e.created_at) as year, MONTH(e.created_at) as month, COUNT(*) AS count
from entries e
group by YEAR(e.created_at), MONTH(e.created_at), e.user_id
limit 999998, 1000000
) t
inner join users u on u.id = t.user_id
inner join user_infos ui on ui.user_id = u.id
inner join user_identifiers uis on uis.user_info_id = ui.id
order by t.year, t.month, uis.nick;
Then to compare the results and double check, I've got the tail of the first 1M records and the head of the second 1M records. There should be 2 overlapping records in my understanding -since I've used an offset of 999998- but there is something wrong.
It's also evident that there is something wrong with the query because the first file ends with zzzzz but then the second file starts with 0 3 kalem ucu which should not be after z in alphabetical order.
$ tail entry_counts_by_users_1_1m.csv
| user_nick | user_id | date | entry_count |
|-------------|---------|---------|-------------|
| zskal | 493395 | 2013-05 | 8 |
| zuhanzee | 397659 | 2013-05 | 2 |
| zulmet | 446672 | 2013-05 | 74 |
| zuluuuuuu | 1240043 | 2013-05 | 9 |
| zverkov | 502616 | 2013-05 | 2 |
| zvezdite | 750458 | 2013-05 | 1 |
| zx | 249598 | 2013-05 | 15 |
| zyprexa 5mg | 779519 | 2013-05 | 16 |
| zzgx | 584985 | 2013-05 | 2 |
| zzzzz | 22730 | 2013-05 | 1 |
$ head entry_counts_by_users_1m_2m.csv
| nick | user_id | DATE | count |
|---------------|---------|---------|-------|
| 0 3 kalem ucu | 624699 | 2013-05 | 4 |
| 0132 | 995914 | 2013-05 | 3 |
| 03072010 | 960606 | 2013-05 | 9 |
| 0312020008 | 804486 | 2013-05 | 2 |
| 0326 | 446816 | 2013-05 | 1 |
| 05 | 575534 | 2013-05 | 1 |
| 05012009 | 1171153 | 2013-05 | 6 |
| 0904 | 514964 | 2013-05 | 2 |
| 0kmzeka | 777191 | 2013-05 | 4 |
Could you help me understand what I am doing wrong here?
+-----------+
| @@version |
+-----------+
| 8.0.19 |
+-----------+
UPDATE
These are the results I get after using ORDER BY in my subquery:
select SQL_BIG_RESULT uis.nick, uis.user_id, CONCAT(t.year, '-', LPAD(t.month, 2, 0)) AS DATE, t.count
from (select SQL_BIG_RESULT e.user_id, YEAR(e.created_at) as year, MONTH(e.created_at) as month, COUNT(*) AS count
from entries e
group by YEAR(e.created_at), MONTH(e.created_at), e.user_id
order by year, month, user_id
limit 1000000) t
inner join users u on u.id = t.user_id
inner join user_infos ui on ui.user_id = u.id
inner join user_identifiers uis on uis.user_info_id = ui.id
For the first 1M records:
$ tail entry_counts_by_users_1_1m.csv
| user_name | user_id | date | entry_count |
|----------------------------|---------|---------|-------------|
| statistic er | 667546 | 2012-06 | 1 |
| mula | 612905 | 2013-02 | 1 |
| sisman cirkin bi de kezban | 1327434 | 2013-02 | 2 |
| tyra34 | 1329280 | 2013-03 | 1 |
| ecemazkan | 1332628 | 2013-02 | 1 |
| susamlicubuk | 1333079 | 2013-02 | 1 |
| hemenhemenherterim | 631784 | 2011-04 | 1 |
| umursamaz tavrin hastasi | 1060158 | 2012-09 | 2 |
| uslucocuk | 1254758 | 2012-09 | 1 |
| dharamsala | 956110 | 2012-09 | 1 |
select SQL_BIG_RESULT uis.nick, uis.user_id, CONCAT(t.year, '-', LPAD(t.month, 2, 0)) AS DATE, t.count
from (select SQL_BIG_RESULT e.user_id, YEAR(e.created_at) as year, MONTH(e.created_at) as month, COUNT(*) AS count
from entries e
group by YEAR(e.created_at), MONTH(e.created_at), e.user_id
order by year, month, user_id
limit 999998, 1000000) t
inner join users u on u.id = t.user_id
inner join user_infos ui on ui.user_id = u.id
inner join user_identifiers uis on uis.user_info_id = ui.id
For the second 1M records:
$ head entry_counts_by_users_1m_2m.csv
| user_name | user_id | date | entry_count |
|-----------|---------|---------|-------------|
| ssg | 8097 | 2013-06 | 101 |
| ssg | 8097 | 2013-07 | 73 |
| ssg | 8097 | 2013-08 | 100 |
| ssg | 8097 | 2013-09 | 88 |
| ssg | 8097 | 2013-10 | 84 |
| ssg | 8097 | 2013-11 | 54 |
| ssg | 8097 | 2013-12 | 64 |
| ssg | 8097 | 2014-01 | 78 |
| ssg | 8097 | 2014-02 | 31 |
I still don't get what I am doing wrong.
Starting in MySQL 8.0.13, implicit ordering for GROUP BY has been removed:
Incompatible Change: The deprecated ASC or DESC qualifiers for GROUP BY clauses have been removed. Queries that previously relied on GROUP BY sorting may produce results that differ from previous MySQL versions. To produce a given sort order, provide an ORDER BY clause.
The implicit ordering has been deprecated since 5.6, so there has been some warning.
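Your subquery is using GROUP BY with no ORDER BY. The ordering of its result set is therefore not specified and might change from one run to the next. To produce a stable result, use an ORDER BY before the LIMIT. Note that the joins in the outer query do not have to preserve the subquery's order, so keep the outer ORDER BY as well if the final output must be sorted.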
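As a small illustration of the point above, the following sketch pages through a grouped result with an explicit ORDER BY before the LIMIT. It uses Python's sqlite3 module instead of MySQL, a tiny made-up entries table, and a hypothetical helper named fetch_page; only the combination of ORDER BY, LIMIT and OFFSET is the technique being shown.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE entries (user_id INTEGER, created_at TEXT)")
conn.executemany(
    "INSERT INTO entries VALUES (?, ?)",
    [
        (3, "2013-05-02"), (1, "2013-05-10"), (2, "2013-06-01"),
        (1, "2013-05-11"), (3, "2013-06-20"), (2, "2013-05-07"),
        (1, "2013-06-15"),
    ],
)

# The ORDER BY columns (month, user_id) are unique per group, so the
# order is total and every page is cut from the same sequence.
PAGE_SQL = """
    SELECT strftime('%Y-%m', created_at) AS ym, user_id, COUNT(*) AS cnt
      FROM entries
     GROUP BY ym, user_id
     ORDER BY ym, user_id
     LIMIT ? OFFSET ?
"""

def fetch_page(limit, offset):
    """Return one page of the grouped, ordered result (hypothetical helper)."""
    return conn.execute(PAGE_SQL, (limit, offset)).fetchall()

first = fetch_page(3, 0)   # rows 1..3
second = fetch_page(3, 2)  # rows 3..5, overlapping the first page by one row

print(first)
print(second)
```

Because the order is fully specified, the last row of the first page and the first row of the second page are the same group, which is the kind of overlap check the question was trying to do.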

Selecting conditions only when both rows (not either) are met

Part of the table that I am looking into is as follows:
table: store_inventories
+---------+----------+-------+----------+
| stor_id | title_id | qty | minStock |
+---------+----------+-------+----------+
| 8042 | TC7777 | 630 | 630 |
| 8042 | TH1217 | 0 | 630 |
| 9012 | AK1231 | -100 | 13 |
| 9012 | AK4153 | 5 | 1 |
| 9012 | BU2075 | 39 | 7 |
| 7131 | AW1234 | 10277 | 2055 |
| 7131 | AW5678 | 13150 | 2630 |
| 7131 | BU1032 | 545 | 109 |
| 7131 | BU2075 | 35 | 7 |
How can I select the title_id values that appear under both (not either) stor_id 9012 and stor_id 7131?
The result should be
+----------+
| title_id |
+----------+
| BU2075 |
I tried an inner join and an AND condition, but they return either the wrong result or an empty set.
Use a WHERE clause to filter the stor_ids, and HAVING to count the rows per title_id that pass the WHERE clause.
SELECT title_id
FROM store_inventories
WHERE stor_id IN (9012, 7131)
GROUP BY title_id
HAVING COUNT(*) = 2
Use DISTINCT if a title can have multiple rows for the same store:
HAVING COUNT(DISTINCT stor_id) = 2
You can try with a Join of subqueries:
SELECT a.title_id
FROM (select title_id
from store_inventories
where stor_id=9012) a
JOIN (select title_id
from store_inventories
where stor_id=7131) b
ON (a.title_id=b.title_id)
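Both approaches from this thread can be checked against the sample rows from the question. The sketch below loads them into an in-memory SQLite database through Python's sqlite3 module (an assumption made only so the example is runnable; the SQL is the same shape as the MySQL answers) and runs the GROUP BY/HAVING version and a self-join version side by side. The variable names grouped and joined are illustrative.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE store_inventories "
    "(stor_id INTEGER, title_id TEXT, qty INTEGER, minStock INTEGER)"
)
conn.executemany(
    "INSERT INTO store_inventories VALUES (?, ?, ?, ?)",
    [
        (8042, "TC7777", 630, 630), (8042, "TH1217", 0, 630),
        (9012, "AK1231", -100, 13), (9012, "AK4153", 5, 1),
        (9012, "BU2075", 39, 7), (7131, "AW1234", 10277, 2055),
        (7131, "AW5678", 13150, 2630), (7131, "BU1032", 545, 109),
        (7131, "BU2075", 35, 7),
    ],
)

# Approach 1: keep only the two stores, then require both per title.
grouped = [r[0] for r in conn.execute(
    """
    SELECT title_id
      FROM store_inventories
     WHERE stor_id IN (9012, 7131)
     GROUP BY title_id
    HAVING COUNT(DISTINCT stor_id) = 2
    """
)]

# Approach 2: join the rows of one store to the rows of the other.
joined = [r[0] for r in conn.execute(
    """
    SELECT a.title_id
      FROM store_inventories AS a
      JOIN store_inventories AS b ON a.title_id = b.title_id
     WHERE a.stor_id = 9012 AND b.stor_id = 7131
    """
)]

print(grouped, joined)
```

The GROUP BY form extends more easily to three or more stores (change the IN list and the count), while the join form needs one extra join per store.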

msaccess join most recent matching record from one table to another

The result I want.
+--------+-------------+------------+--------+
| Tag | most_recent | Comment | Author |
+--------+-------------+------------+--------+
| TAG001 | 2015-07-23 | Something3 | AM |
| TAG002 | 2015-07-25 | Something5 | BN |
+--------+-------------+------------+--------+
The tables I have:
Status Table
+--------+-------------+------------+
| Tag | Status | DateStatus |
+--------+-------------+------------+
| TAG001 | Not Started | |
| TAG002 | Complete | 2015-07-23 |
+--------+-------------+------------+
Comments Table
+----+--------+-------------+------------+--------+
| ID | Tag | DateCreated | Comment | Author |
+----+--------+-------------+------------+--------+
| 1 | TAG001 | 2015-07-22 | Something1 | JS |
| 2 | TAG002 | 2015-07-23 | Something2 | JS |
| 3 | TAG001 | 2015-07-23 | Something3 | AM |
| 4 | TAG002 | 2015-07-23 | Something4 | AS |
| 5 | TAG002 | 2015-07-25 | Something5 | BN |
+----+--------+-------------+------------+--------+
I've tried 4 different queries, each getting progressively more complicated, but still not working.
The queries I have tried:
Query 1)
SELECT Comments.[Tag], Max(Comments.[DateCreated]) AS most_recent
FROM Comments
GROUP BY Comments.[Tag];
Result 1)
+--------+-------------+
| Tag | most_recent |
+--------+-------------+
| TAG001 | 2015-07-23 |
| TAG002 | 2015-07-25 |
+--------+-------------+
Just gives me the most recent date, but no values.
Query 2)
SELECT Comments.[Tag], Max(Comments.[DateCreated]) AS most_recent
FROM Comments
GROUP BY Comments.[Tag];
Result 2)
+--------+-------------+------------+
| Tag | most_recent | Comment |
+--------+-------------+------------+
| TAG001 | 2015-07-22 | Something1 |
| TAG001 | 2015-07-23 | Something3 |
| TAG002 | 2015-07-23 | Something2 |
| TAG002 | 2015-07-23 | Something4 |
| TAG002 | 2015-07-25 | Something5 |
+--------+-------------+------------+
Now I see all the information I want, but I cannot filter for the most recent.
I tried DISTINCT, but it didn't work.
Query 3)
Modified from here: MYSQL - Join most recent matching record from one table to another
SELECT Status.*, Comments.*
FROM Status S
LEFT JOIN Comments C ON S.tag = C.tag
JOIN(SELECT x.tag, MAX(x.DateCreated) AS MaxCommentDate FROM Comments x
GROUP BY x.tag) y ON y.tag = x.tag AND y.MaxCommentDate = x.DateCreated
Result: Syntax error (missing operator) in query expression
Query 4)
Modified from here:
Left Join to most recent record
SELECT
Status.*, Comments.*
FROM Status S
LEFT JOIN
(
Comments C
INNER JOIN
(
SELECT
x.tag, MAX(x.DateCreated) AS MaxCommentDate
FROM
Comments x
GROUP BY
x.tag
)
y
ON y.tag = x.tag
AND y.MaxCommentDate = x.DateCreated
)
ON S.tag = C.tag;
Result: Syntax Error on JOIN
Not having much luck... thanks in advance.
The following seems to work for me in Access 2010:
SELECT c.Tag, c.DateCreated AS most_recent, c.Comment, c.Author
FROM
(
SELECT Tag, MAX(DateCreated) AS MaxDate
FROM Comments
GROUP BY Tag
) AS md
INNER JOIN
Comments AS c
ON c.Tag = md.Tag AND c.DateCreated = md.MaxDate
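The join-to-the-maximum pattern in this answer is not specific to Access. As a rough check, the sketch below runs the same query shape against the Comments rows from the question, using Python's sqlite3 module as a stand-in engine (an assumption made for runnability; dates are stored as ISO text so that MAX compares them correctly). The ORDER BY is added only to make the printed output deterministic.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Comments "
    "(ID INTEGER PRIMARY KEY, Tag TEXT, DateCreated TEXT, Comment TEXT, Author TEXT)"
)
conn.executemany(
    "INSERT INTO Comments VALUES (?, ?, ?, ?, ?)",
    [
        (1, "TAG001", "2015-07-22", "Something1", "JS"),
        (2, "TAG002", "2015-07-23", "Something2", "JS"),
        (3, "TAG001", "2015-07-23", "Something3", "AM"),
        (4, "TAG002", "2015-07-23", "Something4", "AS"),
        (5, "TAG002", "2015-07-25", "Something5", "BN"),
    ],
)

latest = conn.execute(
    """
    SELECT c.Tag, c.DateCreated AS most_recent, c.Comment, c.Author
      FROM (SELECT Tag, MAX(DateCreated) AS MaxDate
              FROM Comments
             GROUP BY Tag) AS md
      JOIN Comments AS c
        ON c.Tag = md.Tag AND c.DateCreated = md.MaxDate
     ORDER BY c.Tag
    """
).fetchall()

for row in latest:
    print(row)
```

If two comments for the same tag shared the latest date, this pattern would return both of them; a tie-breaker such as the highest ID would then be needed.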

Get MAX row for GROUP in MySQL

I have the following data:
+---------+----------+----------+--------+
| id | someId | number | data |
+---------+----------+----------+--------+
| 27 | 123 | 1 | abcde1 |
| 28 | 123 | 3 | abcde2 |
| 29 | 123 | 1 | abcde3 |
| 30 | 123 | 5 | abcde4 |
| 31 | 124 | 4 | abcde1 |
| 32 | 124 | 8 | abcde2 |
| 33 | 124 | 1 | abcde3 |
| 34 | 124 | 2 | abcde4 |
| 35 | 123 | 16 | abcde1 |
| 245 | 123 | 3 | abcde2 |
| 250 | 125 | 0 | abcde3 |
| 251 | 125 | 1 | abcde4 |
| 252 | 125 | 7 | abcde1 |
| 264 | 125 | 0 | abcde2 |
| 294 | 123 | 0 | abcde3 |
| 295 | 126 | 0 | abcde4 |
| 296 | 126 | 0 | abcde1 |
| 376 | 126 | 0 | abcde2 |
+---------+----------+----------+--------+
And I want to get a MySQL query that gets me the data of the row with the highest number for each someId. Note that id is unique, but number isn't
SELECT someid, highest_number, data
FROM test_1
INNER JOIN (SELECT someid sid, max(number) highest_number
FROM test_1
GROUP BY someid) t
ON (someid=sid and number=highest_number)
Unfortunately, this does not look very efficient. In Oracle it would be possible to use an OVER clause without subqueries, but MySQL…
Update 1
If there are several instances of the highest number, this will also return several data values for each pair of someid and number.
To get only one row per someid, we should pre-aggregate the source table so that the pairs of someid and number are unique (see the t1 subquery):
SELECT someid, highest_number, data
FROM
(SELECT someid, number, MIN(data) data
FROM test_1
GROUP BY
someid, number) t1
INNER JOIN
(SELECT someid sid, max(number) highest_number
FROM test_1
GROUP BY someid) t2
ON (someid=sid and number=highest_number)
Update 2
It is possible to simplify the previous solution:
SELECT someid, highest_number,
(select min(data)
from test_1
where someid=t1.someid and number=highest_number) AS data
FROM
(SELECT someid, max(number) highest_number
FROM test_1
GROUP BY someid) t1
If we materialize unique pairs of someid and number, then it is possible to use a correlated subquery. Unlike a JOIN, it would not produce additional rows if the highest value of number is repeated several times.
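The correlated-subquery version can be tried on the sample data from the question. The sketch below uses Python's sqlite3 module rather than MySQL (an assumption made so the example is runnable) and keeps the table name test_1 from the answer; the inner alias t and the variable name result are illustrative. someId 126 has three rows tied at the highest number 0, which is the case where MIN(data) acts as the tie-breaker.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE test_1 (id INTEGER PRIMARY KEY, someId INTEGER, number INTEGER, data TEXT)"
)
conn.executemany(
    "INSERT INTO test_1 VALUES (?, ?, ?, ?)",
    [
        (27, 123, 1, "abcde1"), (28, 123, 3, "abcde2"), (29, 123, 1, "abcde3"),
        (30, 123, 5, "abcde4"), (31, 124, 4, "abcde1"), (32, 124, 8, "abcde2"),
        (33, 124, 1, "abcde3"), (34, 124, 2, "abcde4"), (35, 123, 16, "abcde1"),
        (245, 123, 3, "abcde2"), (250, 125, 0, "abcde3"), (251, 125, 1, "abcde4"),
        (252, 125, 7, "abcde1"), (264, 125, 0, "abcde2"), (294, 123, 0, "abcde3"),
        (295, 126, 0, "abcde4"), (296, 126, 0, "abcde1"), (376, 126, 0, "abcde2"),
    ],
)

result = conn.execute(
    """
    SELECT t1.someId,
           t1.highest_number,
           -- one data value per group, even when the maximum is tied
           (SELECT MIN(t.data)
              FROM test_1 AS t
             WHERE t.someId = t1.someId
               AND t.number = t1.highest_number) AS data
      FROM (SELECT someId, MAX(number) AS highest_number
              FROM test_1
             GROUP BY someId) AS t1
     ORDER BY t1.someId
    """
).fetchall()

for row in result:
    print(row)
```

Each someId appears exactly once in the output, including 126, where a plain join on the maximum would have produced three rows.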
A slight tweak to Naeel's answer: to return just a single data result for any someId even if there's a tie, you should add a GROUP BY:
SELECT t1.someid, t1.number, t1.data
FROM Table1 t1
INNER JOIN (SELECT someId sid, max(number) max_number
FROM Table1
GROUP BY someId) t2
ON (someId = sid AND number = max_number)
GROUP BY t1.someId

How do I create a table without single records in MySQL

For example, I have the following table (in MySQL):
| a | 1002 |
| b | 1002 |
| c | 1015 |
| a | 1005 |
| b | 1016 |
| a | 1106 |
| d | 1006 |
| a | 1026 |
| f | 1106 |
I want to select the objects that are duplicates.
| a | 1002 |
| a | 1106 |
| a | 1026 |
| a | 1005 |
| b | 1002 |
| b | 1016 |
Thank you
If I understand the question, you want to select rows where the number column is duplicated. One way to do it is to join against a subquery that returns a list of number values that occur more than once.
SELECT A.letter, A.number
FROM myTable A
INNER JOIN (
SELECT number
FROM myTable
GROUP BY number
HAVING COUNT(*) > 1
) B ON A.number = B.number
As an alternative, if you want the list of all values where there are duplicates, you can use group_concat:
select col1, group_concat(col2)
from t
group by col1
having count(*) > 1
This does not return the exact format you want. Instead it would return:
| a | 1002,1106,1026,1005 |
| b | 1002,1016 |
But you might find it useful.
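The expected output in the question appears to keep every row whose letter occurs more than once, so the same join-against-a-subquery idea can be applied to the letter column instead. The sketch below is one way to do that, run through Python's sqlite3 module as a stand-in for MySQL; the table name myTable and the column names letter and number are borrowed from the first answer and are assumptions about the real schema.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE myTable (letter TEXT, number INTEGER)")
conn.executemany(
    "INSERT INTO myTable VALUES (?, ?)",
    [
        ("a", 1002), ("b", 1002), ("c", 1015), ("a", 1005), ("b", 1016),
        ("a", 1106), ("d", 1006), ("a", 1026), ("f", 1106),
    ],
)

duplicates = conn.execute(
    """
    SELECT A.letter, A.number
      FROM myTable AS A
      JOIN (SELECT letter
              FROM myTable
             GROUP BY letter
            HAVING COUNT(*) > 1) AS B   -- letters that occur more than once
        ON A.letter = B.letter
     ORDER BY A.letter, A.number
    """
).fetchall()

for row in duplicates:
    print(row)
```

This returns the four rows for a and the two rows for b, and drops c, d and f, which each occur only once.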