MySql group average and group last value in one query - mysql

I have a table named Amounts with columns RowId, CounterId and Amount. It is easy to group by CounterId and get each counter's average value, but I also want the last value of Amount in each group, so I can tell whether it is bigger or smaller than the average. Simply including Amount in the query gives me the first value of Amount in the group, which is useless. Maybe it is easy to do, but I have not found an answer that works with just one table. I found how to get only the last Amount in a group with the help of RowId, but how to get both the average and the last value in one result is a mystery to me. Thanks in advance.
Thanks to Ram Bath I built what I needed and result is here:
SELECT Kliendid.Id AS Id,
Kliendid.Nimi AS Klient,
MIN(X.Tarbimine) AS Piseim,
AVG(X.Tarbimine) AS Keskmine,
MAX(X.Tarbimine) AS Suureim,
COUNT(X.Tarbimine) AS Kuid,
(
Select Tarbimine
from Naidud A
where A.Id=MAX(X.Id)
) as Viimane
FROM Naidud X
INNER JOIN Kliendid ON Kliendid.ID=X.Klient
INNER JOIN Mooturid ON Mooturid.ID=X.Mootur
WHERE X.Tarbimine>0
AND X.Aeg>'2015-12-31'
AND Mooturid.Kasutusel=1
GROUP BY X.Mootur
HAVING Kuid>5
AND (Viimane=Piseim OR Viimane=Suureim)
As you can see, my question was simplified: I use Estonian for table and column names, and it would have been much harder to help if I had shared the real code from the beginning. Thanks again to all of you.

Here I am assuming your RowId is unique or is the primary key.
SELECT X.CounterId, AVG(X.Amount) AS Average_Amount, (SELECT a.Amount FROM Amounts a WHERE a.RowId = MAX(X.RowId)) AS LastAmount FROM Amounts X GROUP BY X.CounterId;
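An alternative that avoids the correlated subquery is to aggregate once per counter and then join back on the highest RowId. The sketch below runs that idea from Python against an in-memory SQLite database as a stand-in for MySQL. The sample rows are invented for illustration, and it assumes RowId increases with insertion order, so the highest RowId in a group is the last value.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Amounts (RowId INTEGER PRIMARY KEY, CounterId INTEGER, Amount REAL);
    -- Invented sample data: two counters with a few readings each.
    INSERT INTO Amounts (RowId, CounterId, Amount) VALUES
        (1, 1, 10), (2, 1, 20), (3, 1, 60),
        (4, 2, 8),  (5, 2, 4);
""")

query = """
    SELECT g.CounterId, g.AvgAmount, a.Amount AS LastAmount
    FROM (
        -- One row per counter: its average and the RowId of its last reading.
        SELECT CounterId, AVG(Amount) AS AvgAmount, MAX(RowId) AS LastRowId
        FROM Amounts
        GROUP BY CounterId
    ) AS g
    JOIN Amounts AS a ON a.RowId = g.LastRowId
    ORDER BY g.CounterId
"""
rows = conn.execute(query).fetchall()

for counter_id, avg_amount, last_amount in rows:
    trend = "above average" if last_amount > avg_amount else "not above average"
    print(counter_id, avg_amount, last_amount, trend)
```

The derived table keeps the aggregation in one place, and the join only looks up one row per group by primary key.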

Related

Do we have a workaround to use alias with 'where' in sql

Sales :
Q1) Return the name of the agent who had the highest increase in sales compared to the previous year
A) Initially I wrote the following query
Select name, (sales_2018-sales_2017) as increase
from sales
where increase= (select max(sales_2018-sales_2017)
from sales)
I got an error saying I cannot use increase with the keyword where because "increase" is not a column but an alias
So I changed the query to the following :
Select name, (sales_2018-sales_2017) as increase
from sales
where (sales_2018-sales_2017)= (select max(sales_2018-sales_2017)
from sales)
This query did work, but I feel there should be a better way to write it, i.e. without repeating (sales_2018-sales_2017) in the where clause. So I was wondering if there is a workaround for using an alias with where.
Q2) Suppose the table is as follows, and we are asked to return the EmpId and name of employees who got rating A for 3 consecutive years:
I wrote the following query, and it works:
select id,name
from ratings
where rating_2017='A' and rating_2018='A' and rating_2019='A'
Chaining 3 columns (rating_2017, rating_2018, rating_2019) with AND is easy. I want to know if there is a better way to chain columns with AND when, say, we want to find an employee who has rating 'A' for 10 consecutive years.
Q3) Last but not the least, I'm really interested in learning to write intermediate-complex SQL queries and take my sql skills to next level. Is there a website out there that can help me in this regard ?
1) A WHERE clause cannot reference a column alias defined in the same SELECT, so you need to define the expression first (using an inline view or a CTE for increase). After that you can refer to it in the outer query.
Eg:
select *
from ( select name, (sales_2018-sales_2017) as increase
from sales
)x
where x.increase= (select max(sales_2018-sales_2017)
from sales)
Another option would be to use analytic (window) functions to get your desired results, if you are on MySQL 8.0:
select *
from ( select name
,(sales_2018-sales_2017) as increase
,max(sales_2018-sales_2017) over () as max_increase
from sales
)x
where x.increase=x.max_increase
Q2) There are alternative ways to write this, but the basic issue is the table design, which stores each rating year in a new column. Had each year been a row, it would have been easier.
Here is another way
select id,name
from ratings
where length(concat(rating_2017,rating_2018,rating_2019))-
length(replace(concat(rating_2017,rating_2018,rating_2019),'A',''))=3
Q3) Check out some example of problems from hackerrank or https://msbiskills.com/tsql-puzzles-asked-in-interview-over-the-years/. You can also search for the questions and answers from stackoverflow to get solutions to tough problems people faced
Q1 : you can simply order and limit the query results (hence no subquery is necessary) ; also, column aliases are allowed in the ORDER BY clause
SELECT
name,
sales_2018-sales_2017 as increase
FROM sales
ORDER BY increase DESC
LIMIT 1
Q2 : your query is fine ; other options exist, but they will not make it faster or easier to maintain.
Finally, please note that your best option overall would be to modify your database layout : you want yearly data in rows, not in columns ; there should be a single column storing the year instead of one column per year. That would make your queries simpler to write and to maintain (and you wouldn't need to create a new column every year...)
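To illustrate the layout suggested above, here is a minimal sketch with one row per employee per year. It is run from Python against an in-memory SQLite database as a stand-in for MySQL. The table name ratings_by_year, its columns, the helper all_a_between and the sample rows are all hypothetical. With this layout, "rating A for N consecutive years" becomes a count over a year range, so the query is the same for 3 years or for 10.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical normalized layout: one row per employee per year.
    CREATE TABLE ratings_by_year (emp_id INTEGER, name TEXT, year INTEGER, rating TEXT);
    INSERT INTO ratings_by_year VALUES
        (1, 'Ann', 2017, 'A'), (1, 'Ann', 2018, 'A'), (1, 'Ann', 2019, 'A'),
        (2, 'Bob', 2017, 'A'), (2, 'Bob', 2018, 'B'), (2, 'Bob', 2019, 'A');
""")

def all_a_between(first_year, last_year):
    """Return employees rated 'A' in every year from first_year to last_year."""
    query = """
        SELECT emp_id, name
        FROM ratings_by_year
        WHERE year BETWEEN ? AND ? AND rating = 'A'
        GROUP BY emp_id, name
        HAVING COUNT(DISTINCT year) = ?
        ORDER BY emp_id
    """
    span = last_year - first_year + 1  # number of years that must all be 'A'
    return conn.execute(query, (first_year, last_year, span)).fetchall()

print(all_a_between(2017, 2019))
```

Only the two year parameters change when the window grows. No new column and no longer AND chain is needed.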

I want to extract a random id from a MYSQL database

I am trying to extract a random article that has a picture from a database.
SELECT FLOOR(MAX(id) * RAND()) FROM `table` WHERE `picture` IS NOT NULL
My table is 33 MB and has 1,006,394 articles, but just 816 with pictures.
My problem is that this query takes 0.4640 sec.
I need this to be much faster.
Any idea is welcome.
P.S.
1. Of course I have an index on id.
2. There is no index on the picture field. Should I add one?
3. The product name is unique, as is the product number, but that is outside this question.
RESULT OF TESTING SESSION.
#cHao's solution is faster when I use it to select a random entry with a picture (less than 0.1 sec).
But it is slower if I try the opposite, selecting a random article without a picture: 2 to 3 sec.
#Kickstart's solution is a bit slower when finding an entry with a picture, but is almost the same speed when finding an entry without a picture: 0.149 sec on average.
#bob-kruithof's solution doesn't work for me.
When trying to find an entry with a picture, it selects an entry without a picture.
And #ganesh-bora, yes, you are right: in my case the speed difference is about 5 to 15 times.
I want to thank you all for your help; I decided on #Kickstart's solution.
You need to get a range of values with matching records and then find a matching record within that range.
Something like this:-
SELECT r1.id
FROM `table` AS r1
INNER JOIN (
SELECT RAND( ) * ( MAX( id ) - MIN( id ) ) + MIN( id ) AS id
FROM `table`
WHERE `picture` IS NOT NULL
) AS r2
ON r1.id >= r2.id
WHERE `picture` IS NOT NULL
ORDER BY r1.id ASC
LIMIT 1
However, for any hope of efficiency you need an index on the field it is checking (i.e. picture in your example).
Just an explanation of how this works.
The sub select finds a random id from the table which is between the min and max ids for records for a picture. This random id may or may not be for a picture.
The resulting id from this sub select is joined back against the main table, but using >= and with a WHERE clause specifying that the record is a picture record. Hence it joins against all picture records where the id is greater than or equal to the random id. The highest random id will be the one for the picture record with the highest id, so it will always find a record (if there are any picture records). The ORDER BY / LIMIT is then used to bring back that single id.
Note that there is an obvious flaw to this, but most of the time it will be irrelevant. The record retrieved may not be entirely random. The picture with the lowest id is unlikely to be returned (will only be returned if the RAND() returns exactly 0), but if this is important this is easy enough to fix by rounding the resulting random id. The other flaw is that if the ids are not vaguely equally distributed in the full range of ids then some will be returned more often than others. For example, take the situation where the first 1000 ids were pictures, then no more until the last (33 millionth) record. The random id could be any of those 33 million, but unless it is less than or equal to 1000 then it will be the 33 millionth record that will be returned.
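The skew described above can be seen in a small simulation. This is only a sketch, run from Python against an in-memory SQLite database as a stand-in for MySQL. The table name articles, the helper functions and the data (1000 ids, with pictures only on ids 1, 2 and 1000) are invented for illustration. The random threshold is drawn in Python rather than with RAND().

```python
import random
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE articles (id INTEGER PRIMARY KEY, picture TEXT)")
# Invented data: ids 1..1000, pictures only on ids 1, 2 and 1000.
conn.executemany(
    "INSERT INTO articles VALUES (?, ?)",
    [(i, "p.jpg" if i in (1, 2, 1000) else None) for i in range(1, 1001)],
)

def first_picture_at_or_after(threshold):
    """Same idea as the join on r1.id >= r2.id with ORDER BY id LIMIT 1."""
    row = conn.execute(
        "SELECT id FROM articles WHERE picture IS NOT NULL AND id >= ? "
        "ORDER BY id LIMIT 1",
        (threshold,),
    ).fetchone()
    return row[0] if row else None

def random_picture_id(rng):
    """Draw a threshold between the min and max picture ids, then look it up."""
    lo, hi = conn.execute(
        "SELECT MIN(id), MAX(id) FROM articles WHERE picture IS NOT NULL"
    ).fetchone()
    return first_picture_at_or_after(rng.random() * (hi - lo) + lo)

rng = random.Random(0)  # seeded so repeated runs draw the same thresholds
counts = {}
for _ in range(2000):
    picked = random_picture_id(rng)
    counts[picked] = counts.get(picked, 0) + 1
print(counts)
```

Because almost the whole id range lies between 2 and 1000, nearly every draw lands on id 1000. This is the uneven distribution the answer warns about.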
You might try attaching a random number to each row, then sorting by that. The row with the lowest number will be at the top.
SELECT `table`.`id`, RAND() as `order`
FROM `table`
WHERE `picture` IS NOT NULL
ORDER BY `order`
LIMIT 1;
This is of course slower than just magicking up an ID with RAND(), but (1) it'll always give you a valid ID (as long as there's a record with a non-null picture field in the table, anyway), and (2) the WTF ratio is pretty low; most people can tell what's going on here. :) Its performance rivals Kickstart's solution with a decently indexed table, when the number of items to select from is relatively small (around 1%). Definitely don't try to select from a whole huge table like this; limit it first with a WHERE clause on some indexed field(s).
Performancewise, if you have a long-running app (ie: not PHP; i'm talking about Java, .net, etc where the app is alive even between requests), you might try to keep a list of all the IDs of items with pictures, select a random ID from that list, and load the article. You could do that in PHP too, if you wanted. It might not work as well when you have to query all the IDs each time, but it could be very useful if you can cache the list of IDs in APC or something.
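A minimal sketch of that caching idea, in Python with an in-memory SQLite database standing in for MySQL. The class PictureIdCache, the articles table and the sample data are hypothetical. A real application would also need to decide when to call refresh(), for example after articles are added or removed.

```python
import random
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE articles (id INTEGER PRIMARY KEY, title TEXT, picture TEXT)")
# Invented data: 1000 articles, every 100th one has a picture.
conn.executemany(
    "INSERT INTO articles VALUES (?, ?, ?)",
    [(i, f"article {i}", "p.jpg" if i % 100 == 0 else None) for i in range(1, 1001)],
)

class PictureIdCache:
    """Keeps the ids of articles with pictures in memory between requests."""

    def __init__(self, connection):
        self.connection = connection
        self.ids = []
        self.refresh()

    def refresh(self):
        # The only query that scans for pictures; run it rarely.
        rows = self.connection.execute(
            "SELECT id FROM articles WHERE picture IS NOT NULL"
        ).fetchall()
        self.ids = [row[0] for row in rows]

    def random_article(self, rng=random):
        if not self.ids:
            return None
        article_id = rng.choice(self.ids)  # uniform over cached ids
        return self.connection.execute(
            "SELECT id, title FROM articles WHERE id = ?", (article_id,)
        ).fetchone()

cache = PictureIdCache(conn)
print(cache.random_article())
```

Each request then costs one primary-key lookup. Unlike the range-based approach, every picture is equally likely to be picked.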
For performance, you can first add an index on the picture column so that the 816 records with pictures can be found quickly when executing the query, and then run your query.
How has someone else solved the problem?
I would suggest looking at this article about different possible ways of selecting random rows in MySQL.
Modified example from the article
SELECT name
FROM random JOIN
( SELECT CEIL( RAND() * (
SELECT MAX( id ) FROM random WHERE picture IS NOT NULL
) ) AS id ) AS r2 USING ( id );
This might work in your case.
Efficiency
As user Kickstart mentioned: Do you have an index on the column picture? This might help getting you the results a bit faster.
Are your tables optimized?

MySQL adding the difference between time values to find the avg difference.

I have a column that is Time formatted; it needs to be sorted newest to oldest. What I would like to do is find the difference in time between each pair of adjacent records. The tricky part is that I need to sum all of the time differences and then divide by the count of time records minus 1. Can this be done in MySQL?
I'm sorry if I am being a bit too wordy, but I can't quite glean your level of MySQL experience.
Also, apologies if I don't understand your question. But here goes...
First of all, you don't need to sum and divide; MySQL has an average function for you called AVG(). See here for details:
http://dev.mysql.com/doc/refman/5.0/en/group-by-functions.html
What you want can be done with subqueries, I think. For more info on subqueries look here:
http://dev.mysql.com/doc/refman/5.0/en/select.html
Basically, you want to first create a result set that sorts the column.
SELECT someid, time
FROM table
ORDER BY TIME
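Use that in a subquery that joins the table with itself but with a shifted index (to get the time before and the time after)
SELECT *
FROM table AS t1 INNER JOIN table AS t2 ON t2.someid = t1.someid + 1
And use AVG on that, without a GROUP BY, so that one average is computed over all pairs (TIME_TO_SEC converts each TIME value to seconds; this assumes someid is consecutive in time order)
SELECT AVG(TIME_TO_SEC(t2.time) - TIME_TO_SEC(t1.time))
FROM table AS t1 INNER JOIN table AS t2 ON t2.someid = t1.someid + 1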
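One observation that may simplify this: the sum of the differences between adjacent records telescopes, so the average gap equals (latest time minus earliest time) divided by (count minus 1). The sketch below checks that against the pairwise calculation. It is run from Python against an in-memory SQLite database as a stand-in for MySQL. The events table and its rows are invented, the times are assumed to be distinct and on the same day, and strftime('%s', ...) plays the role that TIME_TO_SEC() would play in MySQL.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE events (id INTEGER PRIMARY KEY, t TEXT);
    -- Invented sample times, stored as zero-padded HH:MM:SS text.
    INSERT INTO events (t) VALUES ('10:00:00'), ('10:05:00'), ('10:06:00'), ('10:30:00');
""")

# Pairwise: for each row, the gap in seconds to the next later time.
pairwise = conn.execute("""
    SELECT AVG(gap) FROM (
        SELECT strftime('%s', (SELECT MIN(e2.t) FROM events e2 WHERE e2.t > e1.t))
               - strftime('%s', e1.t) AS gap
        FROM events e1
    )
    WHERE gap IS NOT NULL  -- the latest row has no successor
""").fetchone()[0]

# Shortcut: the adjacent gaps sum to (max - min), so no self-join is needed.
shortcut = conn.execute("""
    SELECT (strftime('%s', MAX(t)) - strftime('%s', MIN(t))) * 1.0 / (COUNT(*) - 1)
    FROM events
""").fetchone()[0]

print(pairwise, shortcut)
```

The shortcut needs at least two rows, since with a single row the divisor is zero.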

MySql Query Grouping by Time

I am trying to create a report to understand the time of day that orders are being placed, so I need to sum and group them by time. For example, I would like a sum of all orders placed between 1:00 and 1:59, then the next row listing the sum of all orders between 2:00 and 2:59, etc. The field is a datetime variable, but for the life of me I haven't been able to find the right query to do this. Any suggestions sending me down the right path would be greatly appreciated.
Thanks
If by luck it is mysql and by sum of orders you mean the number of orders and not the value amount:
select date_format(date_field, '%Y-%m-%d %H') as the_hour, count(*)
from my_table
group by the_hour
order by the_hour
This kind of grouping (using a calculated field) will not scale well over time. If you really need to execute this specific GROUP BY/ORDER BY frequently, you should create an extra field (an UNSIGNED TINYINT field will suffice) storing the hour and place an INDEX on that column.
That matters only if your table is becoming quite big. If it is small, you probably won't notice much difference in performance. What counts as small cannot be stated as a mere number of records, because it also depends on server configuration and capabilities.
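A minimal sketch of the stored-hour approach, run from Python against an in-memory SQLite database as a stand-in for MySQL. The table and column names (orders, placed_at, order_hour) and the sample timestamps are hypothetical. SQLite's strftime('%H', ...) is used here where MySQL would use HOUR(). The hour is computed once at insert time, so the report groups on a plain indexed column.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- order_hour holds 0..23, filled in when the row is inserted.
    CREATE TABLE orders (id INTEGER PRIMARY KEY, placed_at TEXT, order_hour INTEGER);
    CREATE INDEX idx_orders_hour ON orders (order_hour);
""")

# Invented sample orders spread over several days.
timestamps = [
    "2016-01-04 01:10:00",
    "2016-01-05 01:45:00",
    "2016-01-05 02:30:00",
    "2016-01-06 14:05:00",
]
for ts in timestamps:
    conn.execute(
        "INSERT INTO orders (placed_at, order_hour) "
        "VALUES (?, CAST(strftime('%H', ?) AS INTEGER))",
        (ts, ts),
    )

# Orders per hour of day, regardless of date.
per_hour = conn.execute(
    "SELECT order_hour, COUNT(*) FROM orders GROUP BY order_hour ORDER BY order_hour"
).fetchall()
print(per_hour)
```

Grouping on the hour alone merges all days together, which matches a time-of-day report. Grouping on the date plus the hour, as in the query above, gives one row per hour of each day instead.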

Will grouping an ordered table always return the first row? MYSQL

I'm writing a query where I group a selection of rows to find the MIN value for one of the columns.
I'd also like to return the other column values associated with the MIN row returned.
e.g
ID QTY PRODUCT TYPE
--------------------
1 2 Orange Fruit
2 4 Banana Fruit
3 3 Apple Fruit
If I GROUP this table by the column 'TYPE' and select the MIN qty, it won't return the corresponding product for the MIN row which in the case above is 'Apple'.
Adding an ORDER BY clause before grouping seems to solve the problem. However, before I go ahead and include this query in my application, I'd just like to know whether this method will always return the correct value. Is this the correct approach? I've seen some examples where subqueries are used, but I have also read that this is inefficient.
Thanks in advance.
No, this is not the correct approach.
I believe you are talking about a query like this:
SELECT product.*, MIN(qty)
FROM product
GROUP BY
type
ORDER BY
qty
What you are doing here is using MySQL's extension that allows you to select unaggregated/ungrouped columns in a GROUP BY query.
This is mostly used in the queries containing both a JOIN and a GROUP BY on a PRIMARY KEY, like this:
SELECT `order`.id, `order`.customer, SUM(price)
FROM `order`
JOIN orderline
ON orderline.order_id = `order`.id
GROUP BY
`order`.id
Here, order.customer is neither grouped nor aggregated, but since you are grouping on order.id, it is guaranteed to have the same value within each group.
In your case, qty and product have different values within the group.
It is not guaranteed from which record within the group the engine will take the value.
You should do this:
SELECT p.*
FROM (
SELECT DISTINCT type
FROM product p
) pd
JOIN product p
ON p.id =
(
SELECT pi.id
FROM product pi
WHERE pi.type = pd.type
ORDER BY
type, qty, id
LIMIT 1
)
If you create an index on product (type, qty, id), this query will work fast.
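To check the pattern end to end, here is a small sketch run from Python against an in-memory SQLite database as a stand-in for MySQL. The column layout follows the question's table, but the rows are invented for illustration. For each type, the correlated subquery picks the id of the row with the lowest qty, breaking ties by id, and the join returns that whole row.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE product (id INTEGER PRIMARY KEY, qty INTEGER, product TEXT, type TEXT);
    -- Invented sample rows: two types, several products each.
    INSERT INTO product VALUES
        (1, 4, 'Banana', 'Fruit'),
        (2, 2, 'Orange', 'Fruit'),
        (3, 3, 'Apple',  'Fruit'),
        (4, 7, 'Carrot', 'Vegetable'),
        (5, 5, 'Leek',   'Vegetable');
    -- Index suggested in the answer, so the per-type lookup can use it.
    CREATE INDEX idx_product_type_qty_id ON product (type, qty, id);
""")

query = """
    SELECT p.*
    FROM (SELECT DISTINCT type FROM product) AS pd
    JOIN product AS p
      ON p.id = (
          -- Row with the smallest qty for this type; id breaks ties.
          SELECT pi.id
          FROM product AS pi
          WHERE pi.type = pd.type
          ORDER BY pi.qty, pi.id
          LIMIT 1
      )
    ORDER BY p.type
"""
min_rows = conn.execute(query).fetchall()
print(min_rows)
```

Every column in the result comes from the same row, unlike the GROUP BY version, where the product name may come from a different row than MIN(qty).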
It's difficult to follow you properly without an example of the query you tried.
From your comments I guess your query is something like:
SELECT ID, COUNT(*) AS QTY, PRODUCT_TYPE
FROM PRODUCTS
GROUP BY PRODUCT_TYPE
ORDER BY COUNT(*) DESC;
My advice: group by the concept (in this case PRODUCT_TYPE) and order by the number of times it appears, COUNT(*). The query above would do what you want.
Subqueries are mostly for sorting or for dismissing rows that are not of interest.
The MIN you are looking for is not exactly a MIN; it is a number of occurrences, and you want to see first the type with the fewest occurrences (meaning it appears the fewest times, I guess).
Cheers,