I have the following query I use and it works great:
SELECT * FROM
(
SELECT * FROM `Transactions` ORDER BY DATE DESC
) AS tmpTable
GROUP BY Machine
ORDER BY Machine ASC
What's not great, is when I try to create a view from it. It says that subqueries cannot be used in a view, which is fine - I've searched here and on Google and most people say to break this down into multiple views. Ok.
I created a view that orders by date, and then a view that just uses that view to group by and order by machines - the results however, are not the same. It seems to have taken the date ordering and thrown it out the window.
Any and all help will be appreciated, thanks.
This ended up being the solution after hours of trying. Apparently you can use a subquery in the WHERE clause of a view, but not in the FROM clause?
CREATE VIEW something AS
SELECT * FROM Transactions AS t
WHERE Date =
(
SELECT MAX(Date)
FROM Transactions
WHERE Machine = t.Machine
)
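For anyone who wants to try the pattern out, here is a minimal sketch of the same view, run against an in-memory SQLite database from Python rather than MySQL. The Amount column, the sample rows and the view name LatestTransactions are made up for illustration; only Machine and Date come from the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical schema: only Machine and Date appear in the question.
    CREATE TABLE Transactions (Machine TEXT, Date TEXT, Amount INTEGER);
    INSERT INTO Transactions VALUES
        ('A', '2020-01-01', 10),
        ('A', '2020-01-03', 30),
        ('B', '2020-01-02', 20),
        ('B', '2020-01-01', 5);

    -- The correlated subquery sits in WHERE, not in FROM.
    CREATE VIEW LatestTransactions AS
    SELECT * FROM Transactions AS t
    WHERE Date = (
        SELECT MAX(Date)
        FROM Transactions
        WHERE Machine = t.Machine
    );
""")

latest_rows = conn.execute(
    "SELECT Machine, Date, Amount FROM LatestTransactions ORDER BY Machine"
).fetchall()
print(latest_rows)
```

Note that if two rows for the same machine share the latest date, the view returns both of them.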
You don't need a subquery here. You want to have the latest date in the group of machines, right?
So just do
SELECT
t.*, MAX(date)
FROM Transactions t
GROUP BY Machine
ORDER BY Machine ASC /* this line is redundant in MySQL before 8.0, where a GROUP BY sorts implicitly when you don't specify another sort column or direction; MySQL 8.0 removed that implicit sort */
A GROUP BY is used together with an aggregate function (in your case MAX()) anyway.
Alternatively you can also specify multiple columns in the ORDER BY clause.
SELECT
*
FROM
Transactions
GROUP BY Machine
ORDER BY Date DESC, Machine ASC
should also give you what you want to achieve. But using the MAX() function is definitely the better way to go here.
Actually I have never used a GROUP BY without an aggregate function.
I need your help to resolve a problem.
Before, my website used MySQL 5.5.
Now it seems to use MariaDB 10.0.
I found no difference, but...
This request (I have simplified the request for a better understanding)
select * from ( select * from MYTABLE ) tmpTable ORDER BY tmpTable.id DESC
This request WORKS on Mysql and MariaDB
BUT...
select * from ( select * from MYTABLE ORDER BY tmpTable.id DESC) tmpTable
I think that if my ORDER BY is inside my second select, it is not considered.
This request DOESN'T WORK! The result set is right, but the ORDER BY has no effect: the rows come back in ASCENDING order and not DESCENDING as I specified in my second select.
Does anyone understand why? Is it a difference between MySQL and MariaDB?
Thanks a lot !
Have a good day
In SQL, rows of a table have no pre-defined order. You need order by to sort a record set.
What happens with the second query is that the subquery creates a derived table that is then used in the outer query. The fact that you order the rows in the subquery does not make a difference: from the perspective of the outer query, rows of the derived table have no inherent ordering.
In other words there is no guarantee that the inner sort propagates to the outer scope. If you want the resultset to be consistently sorted, use order by in the outer scope.
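To illustrate the last point, here is a small sketch using an in-memory SQLite database from Python (not MySQL or MariaDB, so it only shows the portable form). The name column and the three sample rows are invented for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical contents for MYTABLE.
    CREATE TABLE MYTABLE (id INTEGER PRIMARY KEY, name TEXT);
    INSERT INTO MYTABLE VALUES (1, 'a'), (2, 'b'), (3, 'c');
""")

# ORDER BY in the outer query: the only ordering the result is guaranteed to have.
ids_desc = [row[0] for row in conn.execute(
    "SELECT * FROM (SELECT * FROM MYTABLE) AS tmpTable "
    "ORDER BY tmpTable.id DESC"
)]
print(ids_desc)
```

Whatever the engine does with an ORDER BY inside the derived table, the outer ORDER BY is what decides the order of the rows you get back.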
Why am I getting an error on the last line?
SELECT
*
FROM
datawarehouse.shipments
WHERE
ControlBranch = 'SFO'
AND ShipmentCreateDateUTC >= '2020-03-01'
ORDER BY id DESC
GROUP BY ShipmentNumber;
What I'm trying to accomplish is order by id Desc, Group by ShipmentNumber, and then pull all the fields.
Well, first of all the query won't work because you are grouping something, and then showing everything with an * syntax. You need to show the column that you are grouping and then use aggregating functions with the rest (SUM, COUNT, AVG, etc) so it shows the value grouped.
After that, you need to check the clause order that SQL Server uses. ORDER BY always goes at the end of the statement, and after a GROUP BY it can only refer to grouped columns or aggregates. So, the correct order of your query is like this:
SELECT ShipmentNumber, MAX(id) AS id, COUNT(*) -- The count is an example
FROM datawarehouse.shipments
WHERE
ControlBranch = 'SFO'
AND ShipmentCreateDateUTC >= '2020-03-01'
GROUP BY ShipmentNumber
ORDER BY MAX(id) DESC;
Also, even if it's not relevant, you can have a better understanding of how the engine works behind the curtain (the order of execution) checking this link:
https://blog.sqlauthority.com/2009/04/06/sql-server-logical-query-processing-phases-order-of-statement-execution/
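As a rough illustration of the clause order, here is a sketch that runs against an in-memory SQLite database from Python. It is not SQL Server, the table is not schema-qualified, and the sample rows and the latest_id and n aliases are made up, but the parser enforces the same GROUP BY before ORDER BY rule.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical sample data; column names follow the question.
    CREATE TABLE shipments (
        id INTEGER PRIMARY KEY,
        ShipmentNumber TEXT,
        ControlBranch TEXT,
        ShipmentCreateDateUTC TEXT
    );
    INSERT INTO shipments VALUES
        (1, 'S1', 'SFO', '2020-03-02'),
        (2, 'S1', 'SFO', '2020-03-05'),
        (3, 'S2', 'SFO', '2020-03-04'),
        (4, 'S3', 'LAX', '2020-03-06');
""")

where = "WHERE ControlBranch = 'SFO' AND ShipmentCreateDateUTC >= '2020-03-01'"

# ORDER BY placed before GROUP BY is rejected as a syntax error.
try:
    conn.execute(
        f"SELECT ShipmentNumber FROM shipments {where} "
        "ORDER BY id DESC GROUP BY ShipmentNumber"
    )
    wrong_order_failed = False
except sqlite3.OperationalError:
    wrong_order_failed = True

# GROUP BY first, then ORDER BY on an aggregate of each group.
grouped = conn.execute(
    f"SELECT ShipmentNumber, MAX(id) AS latest_id, COUNT(*) AS n "
    f"FROM shipments {where} "
    "GROUP BY ShipmentNumber ORDER BY latest_id DESC"
).fetchall()
print(wrong_order_failed, grouped)
```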
I have what should be a simple query for any database and which always runs in MySQL but not in SQL Server
select
tagalerts.id,
ts,
assetid,
node.zonename,
battlevel
from tagalerts, node
where
ack=0 and
tagalerts.nodeid=node.id
group by assetid
order by ts desc
The error is:
column tagalerts.id is invalid in the select list because it is not contained in either an aggregate function or the group by clause.
It is not a simple case of adding tagalerts.id to the group by clause because the error repeats for ts and for assetid etc, implying that all the selects need to be in a group or in aggregate functions... either of which will result in a meaningless and inaccurate result.
Splitting the select into a subquery to sort and group correctly (which again works fine with MySQL, as you would expect) makes matters worse
SELECT * from
(select
tagalerts.id,
ts,
assetid,
node.zonename,
battlevel
from tagalerts, node
where
ack=0 and
tagalerts.nodeid=node.id
order by ts desc
)T1
group by assetid
the order by clause is invalid in views, inline functions, derived tables and expressions unless TOP etc is used
the 'correct output' should be
id ts assetid zonename battlevel
1234 a datetime 1569 Reception 0
3182 another datetime 1572 Reception 0
Either I am reading SQL Server's rules entirely wrong or this is a major flaw with that database.
How can I write this to work on both systems?
In most databases you can't just include columns that aren't in the GROUP BY without using an aggregate function.
MySql is an exception to that. But MS SQL Server isn't.
So you could keep that GROUP BY with only the "assetid".
But then use the appropriate aggregate functions for all the other columns.
Also, use the JOIN syntax for heaven's pudding sake.
A SQL like select * from table1, table2 where table1.id2 = table2.id is using a syntax from the previous century.
SELECT
MAX(ta.id) AS id,
MAX(ta.ts) AS ts,
ta.assetid,
MAX(node.zonename) AS zonename,
MAX(ta.battlevel) AS battlevel
FROM tagalerts AS ta
JOIN node ON node.id = ta.nodeid
WHERE ta.ack = 0
GROUP BY ta.assetid
ORDER BY MAX(ta.ts) DESC;
Another trick to use in MS SQL Server is the window function ROW_NUMBER.
But this is probably not what you need.
Example:
SELECT id, ts, assetid, zonename, battlevel
FROM
(
SELECT
ta.id,
ta.ts,
ta.assetid,
node.zonename,
ta.battlevel,
ROW_NUMBER() OVER (PARTITION BY ta.assetid ORDER BY ta.ts DESC) AS rn
FROM tagalerts AS ta
JOIN node ON node.id = ta.nodeid
WHERE ta.ack = 0
) q
WHERE rn = 1
ORDER BY ts DESC;
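If you want to see what the ROW_NUMBER approach returns, here is a sketch against an in-memory SQLite database from Python (window functions need SQLite 3.25 or later). The schema and rows are guesses based on the question, and the sketch assumes id is meant to be the alert's own id, as in the question's select list.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical schema inferred from the column names in the question.
    CREATE TABLE node (id INTEGER PRIMARY KEY, zonename TEXT);
    CREATE TABLE tagalerts (
        id INTEGER PRIMARY KEY, ts TEXT, assetid INTEGER,
        nodeid INTEGER, battlevel INTEGER, ack INTEGER
    );
    INSERT INTO node VALUES (1, 'Reception'), (2, 'Warehouse');
    INSERT INTO tagalerts VALUES
        (10, '2020-01-01 08:00', 1569, 2, 1, 0),
        (11, '2020-01-02 09:00', 1569, 1, 0, 0),
        (12, '2020-01-03 10:00', 1572, 1, 0, 0),
        (13, '2020-01-04 11:00', 1572, 2, 1, 1);
""")

# Number the unacknowledged alerts of each asset from newest to oldest,
# then keep only the newest one (rn = 1), so every column comes from one row.
latest_alerts = conn.execute("""
    SELECT id, ts, assetid, zonename, battlevel
    FROM (
        SELECT ta.id, ta.ts, ta.assetid, node.zonename, ta.battlevel,
               ROW_NUMBER() OVER (
                   PARTITION BY ta.assetid ORDER BY ta.ts DESC
               ) AS rn
        FROM tagalerts AS ta
        JOIN node ON node.id = ta.nodeid
        WHERE ta.ack = 0
    ) AS q
    WHERE rn = 1
    ORDER BY ts DESC
""").fetchall()
print(latest_alerts)
```

Unlike wrapping every column in MAX(), this keeps all values of a result row tied to the same underlying alert.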
I strongly suspect this query is WRONG even in MySql.
We're missing a lot of details (sample data, and we don't know which table all of the columns belong to), but what I do know is you're grouping by assetid, where it looks like one assetid value could have more than one ts (timestamp) value in the group. It also looks like you're counting on the order by ts desc to ensure both that you see recent timestamps in the results first and that each assetid group uses the most recent possible ts timestamp for that group.
MySql only guarantees the former, not the latter. Nothing in this query guarantees that each assetid is using the most recent timestamp available. You could be seeing the wrong timestamps, and then also using those wrong timestamps for the order by. This is the problem the Sql Server rule is there to stop. MySql violates the SQL standard to allow you to write that wrong query.
Instead, you need to look at each column and either add it to the group by (best when all of the values are known to be the same, anyway) or wrap it in an aggregate function like MAX(), MIN(), AVG(), etc, so there is a deterministic result for which value from the group is used.
If all of the values for a column in a group are the same, then there's no problem adding it to the group by. If the values are different, you want to be precise about which one is chosen for the result set.
While I'm here, the tagalerts, node join syntax has been obsolete for more than 20 years now. It's also good practice to use an alias with every table and prefix every column with the alias. I mention these to explain why I changed it for my code sample below, though I only prefix columns where I am confident in which table the column belongs to.
This query should run on both databases:
SELECT ta.assetid, MAX(ta.id) "id", MAX(ta.ts) "ts",
MAX(n.zonename) "zonename", MAX(battlevel) "battlevel"
FROM tagalerts ta
INNER JOIN node n ON ta.nodeid = n.id
WHERE ack = 0
GROUP BY ta.assetid
ORDER BY ts DESC
There is also a concern here that the results may combine values from different records in the joined node table. So if battlevel is part of the node table, you might see a result that matches a zonename with a battlevel that never occurs in any record in the data. In Sql Server, this is easily fixed by using APPLY to match only one node record to each tagalert. MySql doesn't support APPLY (its equivalent, LATERAL, only arrived in MySql 8.0.14, while every other major database has had one since at least 2012), but you can simulate it in this case with two JOINs, where the first join is a subquery that uses GROUP BY to determine the values that will uniquely identify the needed node record, and the second join is to the node table to actually produce that record. Unfortunately, we need to know more about the tables in question to actually write this code for you.
SELECT alert,
(select created_at from alerts
WHERE alert = #ALERT ORDER BY created_at desc LIMIT 1)
AS latest FROM alerts GROUP BY alert;
I am having an issue with the above query where I would like to pass in each alert into the subquery so that I have a column called latest which displays the latest alert for each group of alerts. How should I do this?
This is called a correlated subquery. To make it work, you need table aliases:
SELECT a1.alert,
(select a2.created_at
from alerts a2
WHERE a2.alert = a1.alert
ORDER BY a2.created_at desc
LIMIT 1
) AS latest
FROM alerts a1
GROUP BY a1.alert;
Table aliases are a good habit to get into, because they often make the SQL more readable. It is also a good idea to use table aliases with column references, so you easily know where the column is coming from.
EDIT:
If you really want the latest, you can get it by simply doing:
select alert, max(created_at)
from alerts
group by alert;
If you are trying to get the latest created_at date for each group of alerts, there is a simpler way.
SELECT
alert,
MAX(created_at) AS latest
FROM
alerts
GROUP BY
alert;
I would do the following
SELECT
alert_group_name,
MAX(created_at) AS latest
FROM
alerts A
GROUP BY
alert_group_name;
For a correlated subquery, you need to reference an expression from the outer query.
The best way to do that is to assign an alias to the table on the outer query, and then reference that in the inner query. Best practice is to assign an alias to EVERY table reference, and qualify EVERY column reference.
All that needs to be done to "fix" your query is to replace the reference to "#ALERT" with a reference to the alert column from the table on the outer query.
In our shop, that statement would be formatted something like this:
SELECT a.alert
, (SELECT l.created_at
FROM alerts l
WHERE l.alert = a.alert
ORDER BY l.created_at DESC
LIMIT 1
) AS latest
FROM alerts a
GROUP
BY a.alert
Not because it's easier to write that way, but because, more importantly, it's easier to read and understand what the statement is doing.
The correlated subquery approach can be efficient when only a small number of rows is returned (a very restrictive WHERE clause on the outermost query). But in general, correlated subqueries in the SELECT list can make for what we refer to in our shop as an LDQ, a "light dimming query".
In our shop, if we needed the resultset returned by that query, that statement would likely be rewritten as:
SELECT a.alert
, MAX(a.created_at) AS latest
FROM alerts a
GROUP
BY a.alert
And we'd definitely have an index defined ON alerts(alert,created_at) (or an index with additional columns after those first two.)
(I don't anticipate any cases where this statement would return a different result.)
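As a quick sanity check of that claim, here is a sketch that runs both forms against an in-memory SQLite database from Python and compares the results. The sample rows and the index name alerts_ix are made up, and SQLite stands in for MySQL here, so treat it as an illustration rather than a proof.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical sample data and index name.
    CREATE TABLE alerts (alert TEXT, created_at TEXT);
    CREATE INDEX alerts_ix ON alerts (alert, created_at);
    INSERT INTO alerts VALUES
        ('disk', '2020-01-01'), ('disk', '2020-01-05'),
        ('cpu',  '2020-01-03'), ('cpu',  '2020-01-02');
""")

# Correlated subquery in the SELECT list, evaluated once per group.
correlated = conn.execute("""
    SELECT a.alert,
           (SELECT l.created_at
              FROM alerts l
             WHERE l.alert = a.alert
             ORDER BY l.created_at DESC
             LIMIT 1) AS latest
      FROM alerts a
     GROUP BY a.alert
     ORDER BY a.alert
""").fetchall()

# Plain aggregate over the same groups.
aggregated = conn.execute("""
    SELECT a.alert, MAX(a.created_at) AS latest
      FROM alerts a
     GROUP BY a.alert
     ORDER BY a.alert
""").fetchall()

print(correlated == aggregated, aggregated)
```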
This simple SQL problem is giving me a very hard time. Either because I'm seeing the problem the wrong way or because I'm not that familiar with SQL. Or both.
What I'm trying to do: I have a table with several columns and I only need two of them: the datetime when the entry was created and the id of the entry. Note that the hours/minutes/seconds part is important here.
However, I want to group my selection according to the DATE part only. Otherwise all groups will most likely have 1 element.
Here's my query:
SELECT MyDate as DateCr, COUNT(Id) as Occur
FROM MyTable tb WITH(NOLOCK)
GROUP BY CAST(tb.MyDate as Date)
ORDER BY DateCr ASC
However I get the following error from it:
Column "MyTable.MyDate" is invalid in the select list because it is not contained in either an aggregate function or the GROUP BY clause.
If I don't do the cast in the GROUP BY, everything is fine. If I cast MyDate to DATE in the SELECT and keep the CAST in the GROUP BY, everything is fine once more. Apparently it wants the same DATE or DATETIME expression in the GROUP BY as in the SELECT.
My approach can be completely wrong so I am not necessarily looking to fix the above query, but to find the proper way to do it.
LE: I get the above error on line 1.
LE2: On a second look, my question indeed is not very explicit. You can ignore the above approach if it is completely wrong. Below is a sample scenario
Let me tell you what I need: I want to retrieve (1) the DateTime when each entry was created. So if I have 20 entries, then I want to get 20 DateTimes. Then if I have multiple entries created on the same DAY, I want the number of those entries. For example, let's say I created 3 entries on Monday, 1 on Tuesday and 2 today. Then from my table I need the datetimes of these 6 entries + the number of entries which were created on each day (3 for 19/03/2012, 1 for 20/03/2012 and 2 for 21/03/2012).
I'm not sure why you're objecting to performing the CONVERT in both the SELECT and the GROUP BY. This seems like a perfectly logical way to do this:
SELECT
DateCr = CONVERT(DATE, MyDate),
Occur = COUNT(Id)
FROM dbo.MyTable
GROUP BY CONVERT(DATE, MyDate)
ORDER BY DateCr;
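Here is a sketch of the same idea on an in-memory SQLite database from Python. SQLite has no CONVERT(DATE, ...), so the sketch swaps in SQLite's date() function; the six sample rows mirror the 3/1/2 scenario from the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical rows: 3 on 19/03/2012, 1 on 20/03/2012, 2 on 21/03/2012.
    CREATE TABLE MyTable (Id INTEGER PRIMARY KEY, MyDate TEXT);
    INSERT INTO MyTable (MyDate) VALUES
        ('2012-03-19 09:15:00'), ('2012-03-19 11:30:00'),
        ('2012-03-19 17:45:00'), ('2012-03-20 08:00:00'),
        ('2012-03-21 10:00:00'), ('2012-03-21 16:20:00');
""")

# The same date-only expression appears in both SELECT and GROUP BY.
per_day = conn.execute("""
    SELECT date(MyDate) AS DateCr, COUNT(Id) AS Occur
    FROM MyTable
    GROUP BY date(MyDate)
    ORDER BY DateCr
""").fetchall()
print(per_day)
```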
If you want to keep the time portion of MyDate in the SELECT list, why are you bothering to group? Or how do you expect the results to look? You'll have a row for every individual date/time value, where the grouping seems to indicate you want a row for each day. Maybe you could clarify what you want with some sample data and example desired results.
Also, why are you using NOLOCK? Are you willing to trade accuracy for a haphazard turbo button?
EDIT adding a version for the mixed requirements:
;WITH d(DateCr,d,Id) AS
(
SELECT MyDate, d = CONVERT(DATE, MyDate), Id
FROM dbo.MyTable)
SELECT DateCr, Occur = (SELECT COUNT(Id) FROM d AS d2 WHERE d2.d = d.d)
FROM d
ORDER BY DateCr;
Even though this is an old post, I thought I would answer it. The solution below will work with SQL Server 2008 and above. It uses the over clause, so that the individual lines will be returned, but will also count the rows grouped by the date (without time).
SELECT MyDate as DateCr,
COUNT(Id) OVER(PARTITION BY CAST(tb.MyDate as Date)) as Occur
FROM MyTable tb WITH(NOLOCK)
ORDER BY DateCr ASC
Darren White
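The OVER clause approach above can be sketched on an in-memory SQLite database from Python (SQLite 3.25 or later for window functions). SQLite's date() function stands in for CAST(... AS Date), the NOLOCK hint is dropped because SQLite has no such thing, and the sample rows are invented to match the scenario in the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical rows: 3 on 19/03/2012, 1 on 20/03/2012, 2 on 21/03/2012.
    CREATE TABLE MyTable (Id INTEGER PRIMARY KEY, MyDate TEXT);
    INSERT INTO MyTable (MyDate) VALUES
        ('2012-03-19 09:15:00'), ('2012-03-19 11:30:00'),
        ('2012-03-19 17:45:00'), ('2012-03-20 08:00:00'),
        ('2012-03-21 10:00:00'), ('2012-03-21 16:20:00');
""")

# Every row is kept with its full datetime; the window count is per calendar day.
rows = conn.execute("""
    SELECT MyDate AS DateCr,
           COUNT(Id) OVER (PARTITION BY date(MyDate)) AS Occur
    FROM MyTable
    ORDER BY DateCr ASC
""").fetchall()
occurrences = [occur for _, occur in rows]
print(rows)
```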