Get rid of slow subquery - mysql

I have a database of connections to my server, currently with 2,300,000 rows.
I run the following query to find connections by a user, removing duplicates if the nick/ip/client_id are the same.
SELECT
`nick`,
INET_NTOA(`ip`) as `ip`,
HEX(`client_id`) as `client_id`,
UNIX_TIMESTAMP(`date`) as `date`
FROM
(SELECT * FROM `joins` ORDER BY `date` DESC) as `sub`
WHERE
`nick` LIKE '%nick%'
-- Can also be things like this:
-- `ip` & INET_ATON('255.255.0.0') = INET_ATON('123.123.0.0')
GROUP BY
`nick`,
`ip`,
`client_id`
ORDER BY
`date` DESC
LIMIT 500
Why do I use a subquery in the first place? To get the latest date value when using GROUP BY.

I think you've misunderstood the role of ORDER BY and GROUP BY in this query. To get the latest date per nick, ip, and client_id, you would write the query as follows:
SELECT
`nick`,
INET_NTOA(`ip`) as `ip`,
HEX(`client_id`) as `client_id`,
MAX(UNIX_TIMESTAMP(`date`)) as `date`
FROM
`joins`
WHERE
`nick` LIKE '%nick%'
-- Can also be things like this:
-- `ip` & INET_ATON('255.255.0.0') = INET_ATON('123.123.0.0')
GROUP BY
`nick`,
`ip`,
`client_id`
ORDER BY
`date` DESC
LIMIT 500
There is no need for a subquery at all. This code groups the data and then returns the maximum value of UNIX_TIMESTAMP(`date`) as `date`.
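To see the grouping behaviour in isolation, here is a minimal sketch in Python. It assumes SQLite as a stand-in for MySQL, a cut-down hypothetical `joins` table with integer dates, and made-up rows; the alias `latest` is also my own choice, used to avoid clashing with the `date` column.

```python
import sqlite3

# Hypothetical miniature of the `joins` table; SQLite stands in for MySQL here.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE joins (nick TEXT, ip INTEGER, client_id TEXT, date INTEGER)")
conn.executemany(
    "INSERT INTO joins VALUES (?, ?, ?, ?)",
    [
        ("nick1", 1, "aa", 100),
        ("nick1", 1, "aa", 300),  # same nick/ip/client_id, later date
        ("nick1", 2, "aa", 200),
        ("other", 1, "bb", 400),  # does not match the LIKE filter
    ],
)

# MAX() picks the latest date inside each (nick, ip, client_id) group,
# so no pre-sorted subquery is needed.
latest_rows = conn.execute(
    """
    SELECT nick, ip, client_id, MAX(date) AS latest
    FROM joins
    WHERE nick LIKE '%nick%'
    GROUP BY nick, ip, client_id
    ORDER BY latest DESC
    LIMIT 500
    """
).fetchall()

for row in latest_rows:
    print(row)
```

Each duplicate nick/ip/client_id combination collapses to one row carrying its most recent date.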

Related

how to implement two aggregate functions on the same column mysql

SELECT max(sum(`orderquantity`)), `medicinename`
FROM `orerdetails`
WHERE `OID`=
(
SELECT `OrderID`
FROM `order`
where `VID` = 5 AND `OrerResponse` = 1
)
GROUP BY `medicinename`
I want to get the max of the result (the sum of the order quantity), but it gives an error. Is there any solution for this?
You don't need MAX() here. Instead, sort your result set by SUM(`orderquantity`) descending and take the first record returned:
SELECT sum(`orderquantity`) as sumoforderqty, `medicinename`
FROM `orerdetails`
WHERE `OID`=
(
SELECT `OrderID`
FROM `order`
where `VID` = 5 AND `OrerResponse` = 1
)
GROUP BY `medicinename`
ORDER BY sumoforderqty DESC
LIMIT 1
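A small runnable sketch of the same idea, assuming SQLite in place of MySQL and a hypothetical, simplified table `details_demo` with invented rows (the WHERE filter on the order ID is left out to keep it short):

```python
import sqlite3

# Hypothetical simplified table; SQLite stands in for MySQL here.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE details_demo (medicinename TEXT, orderquantity INTEGER)")
conn.executemany(
    "INSERT INTO details_demo VALUES (?, ?)",
    [("aspirin", 5), ("aspirin", 7), ("ibuprofen", 10), ("ibuprofen", 1)],
)

# Sum per medicine, sort the sums descending, keep only the largest one.
top_medicine = conn.execute(
    """
    SELECT SUM(orderquantity) AS sumoforderqty, medicinename
    FROM details_demo
    GROUP BY medicinename
    ORDER BY sumoforderqty DESC
    LIMIT 1
    """
).fetchone()

print(top_medicine)
```

Note that if two medicines tie for the largest sum, LIMIT 1 returns only one of them.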

Limit the sum only on the top 10 amount

I have a MySQL table called transactions with the following fields: user (varchar), amount (float).
I want to make a group by like this
select `user`
, sum(`amount`) as s
from (
select *
from `transactions`
order by `amount` desc
) t group by `user`, s
but I want to limit the sum only on the top 10 amounts.
Is it possible to do that with plain sql?
Yes, use limit and don't group by sum:
select `user`
, sum(`amount`) as s
from (
select *
from `transactions`
order by `amount` desc
limit 10
) t group by `user`
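Here is a quick way to check what that query does, sketched in Python with SQLite standing in for MySQL. The rows are invented, and I use integer amounts (the question says float) so the sums are exact. Keep in mind that this takes the 10 largest amounts overall and then sums them per user; it is not a top 10 per user.

```python
import sqlite3

# Invented sample data; SQLite stands in for MySQL here.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE transactions (user TEXT, amount INTEGER)")
sample = [("a", v) for v in (100, 90, 80, 70, 60, 50)]
sample += [("b", v) for v in (95, 85, 75, 65, 55, 45)]
conn.executemany("INSERT INTO transactions VALUES (?, ?)", sample)

# The inner query keeps the 10 largest amounts overall;
# the outer query sums whatever survived, per user.
sums = dict(
    conn.execute(
        """
        SELECT user, SUM(amount) AS s
        FROM (
            SELECT * FROM transactions ORDER BY amount DESC LIMIT 10
        ) AS t
        GROUP BY user
        """
    ).fetchall()
)

print(sums)
```

The two smallest amounts (50 and 45) fall outside the top 10 and do not contribute to either sum.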

Get rid of the subqueries for the sake of sorting grouped data

Tables
CREATE TABLE `aircrafts_in` (
`id` int(11) NOT NULL AUTO_INCREMENT,
`city_from` int(11) NOT NULL COMMENT 'From',
`city_to` int(11) NOT NULL COMMENT 'To',
PRIMARY KEY (`id`)
) ENGINE=InnoDB AUTO_INCREMENT=91 DEFAULT CHARSET=utf8 COMMENT='Aircraft by route'
CREATE TABLE `aircrafts_in_parsed_data` (
`id` int(11) NOT NULL AUTO_INCREMENT,
`price` int(11) NOT NULL COMMENT 'Price',
`airline` varchar(255) NOT NULL COMMENT 'Airline',
`date` date NOT NULL COMMENT 'Departure date',
`info_id` int(11) NOT NULL,
PRIMARY KEY (`id`),
KEY `info_id` (`info_id`),
KEY `price` (`price`),
KEY `date` (`date`)
) ENGINE=InnoDB AUTO_INCREMENT=940682 DEFAULT CHARSET=utf8
date - departure date
CREATE TABLE `aircrafts_in_parsed_info` (
`id` int(11) NOT NULL AUTO_INCREMENT,
`status` enum('success','error') DEFAULT NULL,
`type` enum('roundtrip','oneway') NOT NULL,
`date` datetime NOT NULL COMMENT 'Parse date',
`aircrafts_in_id` int(11) DEFAULT NULL COMMENT 'Route ID',
PRIMARY KEY (`id`),
KEY `aircrafts_in_id` (`aircrafts_in_id`)
) ENGINE=InnoDB AUTO_INCREMENT=577759 DEFAULT CHARSET=utf8
date - creation date (when the data was parsed)
Task
Get the lowest ticket price and the date of departure for each month. Note that the minimum price has to be relevant (current), not just the lowest value. If several dates share the minimum price, we need the first one.
My solution
I think there's something not quite right with it.
I don't like using subqueries for grouping. How can I solve this problem?
select *
from (
select * from (
select airline,
price,
pdata.`date` as `date`
from aircrafts_in_parsed_data `pdata`
inner join aircrafts_in_parsed_info `pinfo`
on pdata.`info_id` = pinfo.`id`
where pinfo.`aircrafts_in_id` = {$id}
and pinfo.status = 'success'
and pinfo.`type` = 'roundtrip'
and `price` <> 0
group by pdata.`date`, year(pinfo.`date`) desc, month(pinfo.`date`) desc, day(pinfo.`date`) desc
) base
group by `date`
order by price, year(`date`) desc, month(`date`) desc, day(`date`) asc
) minpriceperdate
group by year(`date`) desc, month(`date`) desc
It takes 0.015 s without cache; the table sizes can be seen from the AUTO_INCREMENT values.
SELECT MIN(price) AS min_price,
LEFT(date, 7) AS yyyy_mm
FROM aircrafts_in_parsed_data
GROUP BY LEFT(date, 7)
will get the lowest price for each month. But it can't say 'first'.
From my groupwise-max cheat-sheet, I derive this:
SELECT
yyyy_mm, date, price, airline -- The desired columns
FROM
( SELECT @prev := '' ) init
JOIN
( SELECT LEFT(date, 7) != @prev AS first,
@prev := LEFT(date, 7),
LEFT(date, 7) AS yyyy_mm, date, price, airline
FROM aircrafts_in_parsed_data
ORDER BY
LEFT(date, 7), -- The 'GROUP BY'
price ASC, -- ASC to do "MIN()"
date -- To get the 'first' if there are dup prices for a month
) x
WHERE first -- extract only the first of the lowest price for each month
ORDER BY yyyy_mm; -- Whatever you like
Sorry, but subqueries are necessary. (I avoided YEAR(), MONTH(), and DAY().)
You are right, your query is not correct.
Let's start with the innermost query: You group by pdata.date + pinfo.date, so you get one result row per date combination. As you don't specify which price or airline you are interested in for each date combination (such as MAX(airline) and MIN(price)), you get one airline arbitrarily chosen for a date combination and one price also arbitrarily chosen. These don't even have to belong to the same record in the table; the DBMS is free to choose one airline and one price matching the dates. Well, maybe the date combination of pdata.date and pinfo.date is already unique, but then you wouldn't have to group by at all. So however we look at this, this isn't proper.
In the next query you group by pdata.date only, thus again getting arbitrary matches for airline and price. You could have done that in the innermost query already. It makes no sense to say: "give me a randomly picked price per pdata.date and pinfo.date and from these give me a randomly picked price per pdata.date"; you could just as well say it directly: "give me a randomly picked price per pdata.date". Then you order your result rows. This is completely useless, as you are using the results as a subquery (derived table) again, and a derived table is considered an unordered set. So the ORDER BY gives the DBMS more work to do, but is in no way guaranteed to influence the main query's results.
In your main query then you group by year and month, again resulting in arbitrarily picked values.
Here is the same query a tad shorter and cleaner:
select
pdata.airline, -- some arbitrarily chosen airline matching year and month
pdata.price, -- some arbitrarily chosen price matching year and month
pdata.date -- some arbitrarily chosen date matching year and month
from aircrafts_in_parsed_data pdata
inner join aircrafts_in_parsed_info pinfo on pdata.info_id = pinfo.id
where pinfo.aircrafts_in_id = {$id}
and pinfo.status = 'success'
and pinfo.type = 'roundtrip'
and pdata.price <> 0
group by year(pdata.date), month(pdata.date)
order by year(pdata.date) desc, month(pdata.date) desc
As to the original task (as far as I understand it): Find the records with the lowest price per month. Per month means GROUP BY month. The lowest price is MIN(price).
select
min_price_record.departure_year,
min_price_record.departure_month,
min_price_record.min_price,
full_record.departure_date,
full_record.airline
from
(
select
year(`date`) as departure_year,
month(`date`) as departure_month,
min(price) as min_price
from aircrafts_in_parsed_data
where price <> 0
and info_id in
(
select id
from aircrafts_in_parsed_info
where aircrafts_in_id = {$id}
and status = 'success'
and type = 'roundtrip'
)
group by year(`date`), month(`date`)
) min_price_record
join
(
select
`date` as departure_date,
year(`date`) as departure_year,
month(`date`) as departure_month,
price,
airline
from aircrafts_in_parsed_data
where price <> 0
and info_id in
(
select id
from aircrafts_in_parsed_info
where aircrafts_in_id = {$id}
and status = 'success'
and type = 'roundtrip'
)
) full_record on full_record.departure_year = min_price_record.departure_year
and full_record.departure_month = min_price_record.departure_month
and full_record.price = min_price_record.min_price
order by
min_price_record.departure_year desc,
min_price_record.departure_month desc;
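The join above returns every record that ties for the monthly minimum. One possible way to keep only the first date per month is to group the joined rows again and take MIN(date). Below is a minimal sketch of that variation, not the query from the answer: it assumes SQLite instead of MySQL, a hypothetical cut-down table `parsed_data` with ISO date strings, invented rows, and no join to the info table. The airline column is left out of the result, since picking the matching airline would need one more join.

```python
import sqlite3

# Hypothetical cut-down table with invented rows; SQLite stands in for MySQL.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE parsed_data (price INTEGER, airline TEXT, date TEXT)")
conn.executemany(
    "INSERT INTO parsed_data VALUES (?, ?, ?)",
    [
        (300, "A", "2016-01-05"),
        (200, "B", "2016-01-20"),
        (200, "C", "2016-01-10"),  # ties with the row above; earlier date wins
        (150, "A", "2016-02-03"),
        (400, "B", "2016-02-01"),
    ],
)

# Step 1 (derived table m): lowest non-zero price per month.
# Step 2: join back to find the rows with that price, then keep the earliest date.
cheapest_per_month = conn.execute(
    """
    SELECT m.yyyy_mm, m.min_price, MIN(d.date) AS first_date
    FROM (
        SELECT substr(date, 1, 7) AS yyyy_mm, MIN(price) AS min_price
        FROM parsed_data
        WHERE price <> 0
        GROUP BY substr(date, 1, 7)
    ) AS m
    JOIN parsed_data AS d
      ON substr(d.date, 1, 7) = m.yyyy_mm
     AND d.price = m.min_price
    GROUP BY m.yyyy_mm, m.min_price
    ORDER BY m.yyyy_mm DESC
    """
).fetchall()

for row in cheapest_per_month:
    print(row)
```

In MySQL, LEFT(date, 7) would play the role that substr(date, 1, 7) plays here.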

MYSQL Query : How to get values per category?

I have a huge table with millions of records that stores stock values by timestamp. The structure is as below:
Stock, timestamp, value
goog,1112345,200.4
goog,112346,220.4
Apple,112343,505
Apple,112346,550
I would like to query this table by timestamp. If the timestamp matches, all corresponding stock records should be returned; if there is no record for a stock at that timestamp, the immediately preceding one should be returned. In the above example, if I query by timestamp=1112345, the query should return 2 records:
goog,1112345,200.4
Apple,112343,505 (immediate previous record)
I have tried several different ways to write this query but with no success, and I'm sure I'm missing something. Can someone help, please?
SELECT `Stock`, `timestamp`, `value`
FROM `myTable`
WHERE `timestamp` = 1112345
UNION ALL
SELECT `Stock`, `timestamp`, `value`
FROM `myTable`
WHERE `timestamp` < 1112345
ORDER BY `timestamp` DESC
LIMIT 1
select Stock, timestamp, value from thisTbl where timestamp = ? and fill in timestamp to whatever it should be? Your demo query is available on this fiddle
I don't think there is an easy way to do this query. Here is one approach:
select tprev.*
from (select s.stock,
(select t.timestamp from t where t.stock = s.stock and t.timestamp <= <whatever> order by t.timestamp desc limit 1
) as prevtimestamp
from (select distinct stock
from t
) s
) s join
t tprev
on s.prevtimestamp = tprev.timestamp and s.stock = tprev.stock
This is getting the previous or equal timestamp for the record and then joining it back in. If you have indexes on (stock, timestamp) then this may be rather fast.
Another phrasing of it uses group by:
select tprev.*
from (select t.stock,
max(timestamp) as prevtimestamp
from t
where timestamp <= YOURTIMESTAMP
group by t.stock
) s join
t tprev
on s.prevtimestamp = tprev.timestamp and s.stock = tprev.stock
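The group-by phrasing can be tried out with a small Python sketch. It assumes SQLite in place of MySQL, and the timestamps and values below are invented for illustration rather than taken from the question.

```python
import sqlite3

# Invented sample rows; SQLite stands in for MySQL here.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (stock TEXT, timestamp INTEGER, value REAL)")
conn.executemany(
    "INSERT INTO t VALUES (?, ?, ?)",
    [
        ("goog", 100, 200.4),
        ("goog", 105, 220.4),
        ("apple", 98, 505.0),
        ("apple", 105, 550.0),
    ],
)

query_ts = 100  # the timestamp being looked up

# For each stock, find the latest timestamp at or before query_ts,
# then join back to fetch the full row for that timestamp.
rows = conn.execute(
    """
    SELECT tprev.stock, tprev.timestamp, tprev.value
    FROM (
        SELECT stock, MAX(timestamp) AS prevtimestamp
        FROM t
        WHERE timestamp <= ?
        GROUP BY stock
    ) AS s
    JOIN t AS tprev
      ON tprev.stock = s.stock
     AND tprev.timestamp = s.prevtimestamp
    ORDER BY tprev.stock
    """,
    (query_ts,),
).fetchall()

for row in rows:
    print(row)
```

Here goog has an exact match at 100, while apple falls back to its previous record at 98.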

Every derived table must have its own alias - error from combination descending MySQL

I want to order one mysql table by two strtotime timestamps from two different columns. I've got the following mysql command:
SELECT * FROM (
(SELECT '1' AS `table`, `vid_req_timestamp` AS `timestamp`, `title` FROM `movies` WHERE `vid_req` = '1')
UNION
(SELECT '2' AS `table`, `ost_req_timestamp` AS `timestamp`, `title` FROM `movies` WHERE `ost_req` = '1')
)
ORDER BY `timestamp` DESC
This gives me an error:
#1248 - Every derived table must have its own alias
I want to combine vid_req_timestamp and ost_req_timestamp and make those descending. And it's important to know where the timestamp came from (somehow).
In this case, the derived table that requires an alias is the one that you are SELECTing * from.
Indentation helps make that clearer.
SELECT * FROM
(
(SELECT '1' AS `table`, `vid_req_timestamp` AS `timestamp`, `title` FROM `movies` WHERE `vid_req` = '1')
UNION
(SELECT '2' AS `table`, `ost_req_timestamp` AS `timestamp`, `title` FROM `movies` WHERE `ost_req` = '1')
) AS `some_table_name_lol_this_is_an_alias`
ORDER BY `timestamp` DESC