I was wondering if there is an easy way to sort a table view, so you only get every latest different test run, sorted by date.
I have a table with a date, engine (It's like a project name for a test, and it changes depending on tests), and the ID of every run.
SELECT
`testjob`.`id` AS `Testjobid`,
`testjob`.`engine` AS `Engine`,
`testjob`.`StartTime`
FROM TestRun
ORDER BY `testjob`.`StartTime` DESC
So after the SQL is run, this table will be shown:
But actually I only need this:
NOTE: This table updates every day, and the engine names will also change over time, so I cant just type the three engine names and get the latest data that way.
The only solution I can think of, would be making an SQL to get every engine name, and then make a loop to run through the table for every engine, but I hope there is a better way than this?
You can use correlated subquery :
select tr.*
from TestRun tr
where tr.StartTime = (select max(tr1.StartTime) from TestRun tr1 where tr1.engine = tr.engine);
In newer version, you can use row_number() :
select tr.*
from (select tr.*, row_number() over (partition by tr.engine order by tr.StartTime desc) as seq
from TestRun tr
) tr
where tr.seq = 1;
If you have a ties with starttime then sub-query will need to modify with limit 1.
Related
I'm working on a table counting around 40,000,000 rows, and I'm trying to extract first entry for each "subscription_id" (foreign key from another table), here is my acutal request:
SELECT * FROM billing bill WHERE bill.billing_value not like 'not_ok%'
AND
(SELECT bill2.billing_id
FROM billing bill2
WHERE bill2.subscription_id = bill.subscription_id
ORDER BY bill2.billing_id ASC LIMIT 1
)= bill.billing_id;
This request is working correctly, when I put a small limit on it, but I cannot seem to process it for all the database.
Is there a way I could optimise it somehow ? Or do things in an other way ?
Table indexes and structure:
Indexes:
This is an example of the ROW_NUMBER() solution mentioned in the comments above.
select *
from (
select *, row_number() over (partition by subscription_id order by billing_id) as rownum
from billing
where billing_value not like 'not_ok%'
) t
where rownum = 1;
The ROW_NUMBER() function is available in MySQL 8.0, so if you haven't upgraded yet, you must do so to use this function.
Unfortunately, this won't be much of an improvement, because the NOT LIKE causes a table-scan regardless of the pattern you search for.
I believe it requires a virtual column with an index to optimize that condition:
alter table billing
add column ok as tinyint(1) as (billing_value not like 'not_ok%'),
add index (ok);
select *
from (
select *, row_number() over (partition by subscription_id order by billing_id) as rownum
from billing
where ok = true
) t
where rownum = 1;
Now it will use the index on the ok virtual column to reduce the set of examined rows.
This still might be a costly query on a 40 million row table, because the derived table subquery creates a large temporary table. If it's not fast enough, you'll have to really reconsider how you store and query this data.
For example, adding a column first_ok with an index, which is true only on the rows you need to fetch (the first row per subscriber_id without 'not_ok' as the billing value). But you must maintain this new column manually, and risk it being wrong if you don't do that. This is a denormalized design, but tailored to the query you want to run.
I haven't tried it, because I don't have an MySQL DB at hand, but this query seems much simpler:
select *
from billing
where billing_id in (select min(billing_id)
from billing
group by subscription_id)
and billing_value not like 'not_ok%';
The inner select get the minimum billing_id for all subscriptions. The outer gets the rest of the billing record.
If performance is an issue, I'd add the billing_id field in the third index, so you get an index with (subscription_id,billing_id). This will help for the inner query.
Assume a houses table with lot's of fields, related images tables, and 3 other related tables. I have an expensive query that retrieves all houses data, with all data from the related tables. Do I need to run the same expensive MySql query twice in the case of pagination: once for current result page and once to get the total number of records?
I'm using server-side pagination with Limit 0,10, and need to return the total number of houses along with the data. It doesn't make sense to me to run the same expensive query with the count(*) function, just because I'm limiting the result-set for pagination.
Is there another way to instruct MySQL to count the whole query, but bring back only the current pagination data?
I hope my question is clear...
thanks
I don't know MySql but for many dbs, I think you'll find that the cost of running it twice isn't as high as you'd suspect - if you do it in such a way that the db's optimization engine sees the two queries as having a lot in common.
Running
select count(1) from (
select some_fields, row_number over (order by field) as rownum
from some_table
)
and then
select * from (
select some_fields, row_number over (order by field) as rownum
from some_table
)
where rownum between :startRow and :endRow
order by row_number
This also has the advantage of you being able to maintain the query in just one place with two different wrappers around it, 1 for paging and 1 for getting the total count.
Just as a side note, the best optimization you can do is make sure you send the exact same query to the db every time. In other words, if the user can change the sort or change what fields they can query on, bake it all into the same query. E.g:
select some_fields,
case
when :sortField = 'ID' and :sortType = 'asc'
then row_number over (order by id)
when :sortField = 'ID' and :sortType = 'desc'
then row_number over (order by id desc)
end as rownum
from some_table
where (:searchType = 'name'
and last_name like :lastName and first_name like :firstName)
or (:searchType = 'customerType'
and customer_type = :customer_type)
cfquery has a recordcount variable that might be useful. You can also use the startrow and maxrows attributes of cfoutput to control how many records get displayed. Finally, you can cache the query results in coldfusion so you don't have to run it against the database each time.
I have the following query I use and it works great:
SELECT * FROM
(
SELECT * FROM `Transactions` ORDER BY DATE DESC
) AS tmpTable
GROUP BY Machine
ORDER BY Machine ASC
What's not great, is when I try to create a view from it. It says that subqueries cannot be used in a view, which is fine - I've searched here and on Google and most people say to break this down into multiple views. Ok.
I created a view that orders by date, and then a view that just uses that view to group by and order by machines - the results however, are not the same. It seems to have taken the date ordering and thrown it out the window.
Any and all help will be appreciated, thanks.
This ended up being the solution, after hours of trying, apparently you can use a subquery on a WHERE but not FROM?
CREATE VIEW something AS
SELECT * FROM Transactions AS t
WHERE Date =
(
SELECT MAX(Date)
FROM Transactions
WHERE Machine = t.Machine
)
You don't need a subquery here. You want to have the latest date in the group of machines, right?
So just do
SELECT
t.*, MAX(date)
FROM Transactions t
GROUP BY Machine
ORDER BY Machine ASC /*this line is obsolete by the way, since in MySQL a group by automatically does sort, when you don't specify another sort column or direction*/
A GROUP BY is used together with a aggregate function (in your case MAX()) anyway.
Alternatively you can also specify multiple columns in the ORDER BY clause.
SELECT
*
FROM
Transactions
GROUP BY Machine
ORDER BY Date DESC, Machine ASC
should give you also what you want to achieve. But using the MAX() function is definitely the better way to go here.
Actually I have never used a GROUP BY without an aggregate function.
I need to get the last (newest) row in a table (using MySQL's natural order - i.e. what I get without any kind of ORDER BY clause), however there is no key I can ORDER BY on!
The only 'key' in the table is an indexed MD5 field, so I can't really ORDER BY on that. There's no timestamp, autoincrement value, or any other field that I could easily ORDER on either. This is why I'm left with only the natural sort order as my indicator of 'newest'.
And, unfortunately, changing the table structure to add a proper auto_increment is out of the question. :(
Anyone have any ideas on how this can be done w/ plain SQL, or am I SOL?
If it's MyISAM you can do it in two queries
SELECT COUNT(*) FROM yourTable;
SELECT * FROM yourTable LIMIT useTheCountHere - 1,1;
This is unreliable however because
It assumes rows are only added to this table and never deleted.
It assumes no other writes are performed to this table in the meantime (you can lock the table)
MyISAM tables can be reordered using ALTER TABLE, so taht the insert order is no longer preserved.
It's not reliable at all in InnoDB, since this engine can reorder the table at will.
Can I ask why you need to do this?
In oracle, possibly the same for MySQL too but the optimiser will choose the quickest record / order to return you results. So there is potential if your data was static to run the same query twice and get a different answer.
You can assign row numbers using the ROW_NUMBER function and then sort by this value using the ORDER BY clause.
SELECT *,
ROW_NUMBER() OVER() AS rn
FROM table
ORDER BY rn DESC
LIMIT 1;
Basically, you can't do that.
Normally I'd suggest adding a surrogate primary key with auto-incrememt and ORDER BY that:
SELECT *
FROM yourtable
ORDER BY id DESC
LIMIT 1
But in your question you write...
changing the table structure to add a proper auto_increment is out of the question.
So another less pleasant option I can think of is using a simulated ROW_NUMBER using variables:
SELECT * FROM
(
SELECT T1.*, #rownum := #rownum + 1 AS rn
FROM yourtable T1, (SELECT #rownum := 0) T2
) T3
ORDER BY rn DESC
LIMIT 1
Please note that this has serious performance implications: it requires a full scan and the results are not guaranteed to be returned in any particular order in the subquery - you might get them in sort order, but then again you might not - when you dont' specify the order the server is free to choose any order it likes. Now it probably will choose the order they are stored on disk in order to do as little work as possible, but relying on this is unwise.
Without an order by clause you have no guarantee of the order in which you will get your result. The SQL engine is free to choose any order.
But if for some reason you still want to rely on this order, then the following will indeed return the last record from the result (MySql only):
select *
from (select *,
#rn := #rn + 1 rn
from mytable,
(select #rn := 0) init
) numbered
where rn = #rn
In the sub query the records are retrieved without order by, and are given a sequential number. The outer query then selects only the one that got the last attributed number.
We can use the having for that kind of problem-
SELECT MAX(id) as last_id,column1,column2 FROM table HAVING id=last_id;
I'm dealing with a legacy database table that has no insertion date column or unique id column, however the natural order of insertion is still valid when examined with a simple SELECT * showing oldest to newest.
I'd like like to fetch that data with pagination but reverse the order as if it was ORDER BY date DESC
I've thought about wrapping the query, assigning a numeric id to the resulting rows and then do an ORDER BY on the result but wow that seems crazy.
Is there a more simple solution I am overlooking?
I cannot add columns to the existing table, I have to work with it as is.
Thanks for any ideas!
Use #rownum in your query to number each row and then order by the #rownum desc. Here's an example.
select #rownum:=#rownum+1 ‘rank’, p.* from player p, (SELECT #rownum:=0) r order by score desc limit 10;
Finally, beware that relying on the current order being returned long-term isn't recommended.
If you're writing an application to process the data, another approach might be to run your current query, then iterate over the returned records from last to first.
If you have too many records, then you may wish to instead use a view. This is a Database object which can be used to combine data from different tabls, or present a modified view of a single table, amongst other things. In this case, you could try creating a view of your table and add a generated ID column. You could then run SELECT statements against this view ordering by the new column you have added.
However be aware of the advice from another poster above: the order in which rows are returned without an ORDER BY clause is arbitrary and may change without notification. It would be best to amend your table if at all possible.
mySQL CREATE VIEW syntax