MySQL wrong results with GROUP BY and ORDER BY - mysql

I have a table user_comission_configuration_history and I need to select the latest commission configuration for each comission_id of a given user_id.
Tuples:
I have tried many queries, but the results are wrong. My latest SQL:
SELECT *
FROM(
SELECT * FROM user_comission_configuration_history
ORDER BY on_date DESC
) AS ordered_history
WHERE user_id = 408002
GROUP BY comission_id
The result of above query is:
But, the correct result is:
id user_id comission_id value type on_date
24 408002 12 0,01 PERCENTUAL 2014-07-23 10:45:42
23 408002 4 0,03 CURRENCY 2014-07-23 10:45:41
21 408002 6 0,015 PERCENTUAL 2014-07-23 10:45:18
What is wrong in my SQL?

This is your query:
SELECT *
FROM (SELECT *
FROM user_comission_configuration_history
ORDER BY on_date DESC
) AS ordered_history
WHERE user_id = 408002
GROUP BY comission_id;
One major problem with your query is that it uses a MySQL extension to GROUP BY that MySQL explicitly warns against. The extension is the use of columns in the SELECT list that are neither in the GROUP BY clause nor inside aggregate functions. The MySQL documentation warns:
MySQL extends the use of GROUP BY so that the select list can refer to
nonaggregated columns not named in the GROUP BY clause. This means
that the preceding query is legal in MySQL. You can use this feature
to get better performance by avoiding unnecessary column sorting and
grouping. However, this is useful primarily when all values in each
nonaggregated column not named in the GROUP BY are the same for each
group. The server is free to choose any value from each group, so
unless they are the same, the values chosen are indeterminate.
So, the values returned in the columns are indeterminate.
Here is a pretty efficient way to get what you want (with "comission" spelled correctly in English):
SELECT *
FROM user_commission_configuration_history cch
WHERE NOT EXISTS (select 1
from user_commission_configuration_history cch2
where cch2.user_id = cch.user_id and
cch2.commission_id = cch.commission_id and
cch2.on_date > cch.on_date
) AND
cch.user_id = 408002;
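To make the NOT EXISTS approach easy to try, here is a minimal sketch that runs the same query shape in Python with the standard-library sqlite3 module instead of MySQL. It keeps the table and column spelling from the question (single "m"). The three newest rows are the expected result quoted in the question; the three older rows are hypothetical and exist only so that there is something to filter out.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE user_comission_configuration_history (
        id INTEGER PRIMARY KEY,
        user_id INTEGER,
        comission_id INTEGER,
        value TEXT,
        type TEXT,
        on_date TEXT
    )
""")
rows = [
    # Older rows: hypothetical, invented for illustration only.
    (18, 408002, 12, "0,02", "PERCENTUAL", "2014-07-22 09:00:00"),
    (19, 408002, 4, "0,05", "CURRENCY", "2014-07-22 09:00:01"),
    (20, 408002, 6, "0,010", "PERCENTUAL", "2014-07-22 09:00:02"),
    # Latest rows: the expected result quoted in the question.
    (21, 408002, 6, "0,015", "PERCENTUAL", "2014-07-23 10:45:18"),
    (23, 408002, 4, "0,03", "CURRENCY", "2014-07-23 10:45:41"),
    (24, 408002, 12, "0,01", "PERCENTUAL", "2014-07-23 10:45:42"),
]
conn.executemany(
    "INSERT INTO user_comission_configuration_history VALUES (?, ?, ?, ?, ?, ?)",
    rows,
)

# Keep a row only if no newer row exists for the same user and commission.
latest = conn.execute("""
    SELECT cch.id, cch.comission_id, cch.on_date
    FROM user_comission_configuration_history cch
    WHERE cch.user_id = 408002
      AND NOT EXISTS (
          SELECT 1
          FROM user_comission_configuration_history cch2
          WHERE cch2.user_id = cch.user_id
            AND cch2.comission_id = cch.comission_id
            AND cch2.on_date > cch.on_date
      )
    ORDER BY cch.on_date DESC
""").fetchall()

latest_ids = [row[0] for row in latest]
print(latest_ids)
```

Because the dates are stored as ISO-style text, comparing them as strings gives the same ordering as comparing them as timestamps, which is why the sketch can use a TEXT column.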

Here's one way to do what you're trying. It gets the max date for each User_ID and Comission_Id and then joins this back to the base table to limit the results to just the max date for each Comission_Id.
SELECT *
FROM user_comission_configuration_history A
INNER JOIN (
SELECT User_ID, Comission_Id, max(on_Date) mOn_Date
FROM user_comission_configuration_history
GROUP BY User_ID, Comission_Id
) B
on B.User_ID = A.User_ID
and B.Comission_Id = A.Comission_Id
and B.mOn_Date = A.on_Date
WHERE A.User_ID = 408002
ORDER BY A.on_Date desc;

Related

How to sum and group MySQL?

I have the following table (see pic.):
I need to sum(AT_amount) and group by AT_balance_tax_id .
But at the same time I need to get all AT_balance_tax_type_id values for the grouped rows, without duplicates.
How to do that, I tried:
SELECT t.*
, i.*
, ABS(SUM(AT_amount)) amount
FROM account_transactions t
JOIN balance_tax_invoices i
ON i.id = t.AT_balance_tax_id
WHERE AT_createuser = 15
GROUP
BY AT_balance_tax_id
, AT_balance_tax_type_id
ORDER
BY AT_transactiondatetime DESC
It does not return all AT_balance_tax_type_id values for the grouped rows.
Result is:
I expect this data for AT_balance_tax_id:
AT_amount AT_balance_tax_id AT_balance_tax_type_id
33000 9 1, 1, 3, 3
Delete duplicates:
AT_amount AT_balance_tax_id AT_balance_tax_type_id
33000 9 1, 3
Don't use * in the select clause. MySQL lets the select list contain columns that are not in the GROUP BY clause, but their values are then picked arbitrarily from each group.
It also seems you need the GROUP_CONCAT function with the DISTINCT clause. You can modify your query to:
SELECT at.`AT_balance_tax_id`
,GROUP_CONCAT(DISTINCT at.`AT_balance_tax_type_id`) AS tax_type_ids
,ABS(SUM(AT_amount)) AS amount
FROM `account_transactions` at
INNER JOIN `balance_tax_invoices` bi ON bi.`id` = at.`AT_balance_tax_id`
WHERE `AT_createuser` = 15
GROUP BY at.`AT_balance_tax_id`
ORDER BY MAX(at.`AT_transactiondatetime`) DESC
Also I have used aliases to increase the readability of query.
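As a quick check of the GROUP_CONCAT(DISTINCT ...) idea, here is a small sketch using Python's sqlite3 module, whose group_concat also accepts DISTINCT with a single argument. It is not the asker's schema: the join to balance_tax_invoices and the AT_createuser filter are left out, and the four individual amounts are hypothetical values chosen only so that they add up to the 33000 shown in the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE account_transactions (
        AT_amount INTEGER,
        AT_balance_tax_id INTEGER,
        AT_balance_tax_type_id INTEGER
    )
""")
# Hypothetical amounts; only the type ids (1, 1, 3, 3) and the total
# of 33000 for tax id 9 come from the question.
conn.executemany(
    "INSERT INTO account_transactions VALUES (?, ?, ?)",
    [(10000, 9, 1), (5000, 9, 1), (12000, 9, 3), (6000, 9, 3)],
)

row = conn.execute("""
    SELECT AT_balance_tax_id,
           GROUP_CONCAT(DISTINCT AT_balance_tax_type_id) AS type_ids,
           ABS(SUM(AT_amount)) AS amount
    FROM account_transactions
    GROUP BY AT_balance_tax_id
""").fetchone()

tax_id, type_ids, amount = row
# The order of the concatenated values is not guaranteed, so sort them.
print(tax_id, sorted(type_ids.split(",")), amount)
```

Grouping only by AT_balance_tax_id is what collapses the four rows into one; adding AT_balance_tax_type_id to the GROUP BY, as in the question, splits them back into one row per type.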

Retrieving last row inserted in table for each "parameter"

I have a table, currently about 1.3M rows, which stores measured data points for a number of different parameters. There are about 30 parameters.
Table:
* id
* station_id (int)
* comp_id (int)
* unit_id (int)
* p_id (int)
* timestamp
* value
I have a UNIQUE index on: (station_id, comp_id, unit_id, p_id, timestamp)
Because the timestamp differs for every parameter, I have difficulty sorting by timestamp (I have to use a GROUP BY).
So today I select the last value for each parameter by this query:
select p_id, timestamp, value
from (select p_id, timestamp, value
from table
where station_id = 3 and comp_id = 9112 and unit_id = 1 and
p_id in (1,2,3,4,5,6,7,8,9,10)
order by timestamp desc
) table_x
group by p_id;
This query takes about 3 seconds to execute.
Even though I have the index mentioned above, the optimizer uses a filesort to find the values.
Querying for only 1 specific parameter:
select p_id, timestamp, value from table where station_id = 3 and comp_id = 9112 and unit_id = 1 and p_id =1 order by timestamp desc limit 1;
Takes no time (0.00).
I've also tried joining against a table in which I store the parameter IDs, without luck.
So, is there a simple (and fast) way to ask for the latest values of several parameters at once?
Running a procedure that loops and queries each parameter individually seems much faster than asking for all of them at once, which I don't think is how a database should be used.
Your query is incorrect. You are aggregating by p_id, but including other columns. These come from indeterminate rows, and the documentation is quite clear:
MySQL extends the use of GROUP BY so that the select list can refer to
nonaggregated columns not named in the GROUP BY clause. This means
that the preceding query is legal in MySQL. You can use this feature
to get better performance by avoiding unnecessary column sorting and
grouping. However, this is useful primarily when all values in each
nonaggregated column not named in the GROUP BY are the same for each
group. The server is free to choose any value from each group, so
unless they are the same, the values chosen are indeterminate.
Furthermore, the selection of values from each group cannot be
influenced by adding an ORDER BY clause.
The following should work:
select t.p_id, t.timestamp, t.value
from table t join
(select p_id, max(timestamp) as maxts
from table
where station_id = 3 and comp_id = 9112 and unit_id = 1 and
p_id in (1,2,3,4,5,6,7,8,9,10)
group by p_id
) tt
on tt.p_id = t.p_id and tt.maxts = t.timestamp
where t.station_id = 3 and t.comp_id = 9112 and t.unit_id = 1;
The best index for this query is a composite index on table(station_id, comp_id, unit_id, p_id, timestamp).

Aggregate function in BETWEEN and AND

I have joined 3 tables in my query. In my inventory database, price is taken from table c and quantity is taken from table b. How can I list the users whose order total is between a given value and the maximum value of the column?
I am using the query below in MySQL to retrieve the records. As expected, it raises an error. Any help will be highly appreciated.
SELECT .... GROUP BY userid HAVING SUM(c.`price` * b.`quantity`) BETWEEN 9000 AND MAX(SUM(c.`price` * b.`quantity`))
If I understand correctly you don't need BETWEEN. Try it this way
SELECT ....
GROUP BY userid
HAVING SUM(c.`price` * b.`quantity`) >= 9000
In case you wondered, you can't nest aggregate functions. And even if you could, it wouldn't make sense, because you group by userid but are trying to get the MAX of the SUM across all users. For this to work you would need a subquery to get the max value, e.g.
SELECT ....
GROUP BY userid
HAVING SUM(c.`price` * b.`quantity`) =
(
SELECT MAX(total) total
FROM
(
SELECT SUM(c.`price` * b.`quantity`) total
GROUP BY userid
) q
)
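The two HAVING variants above can be tried end to end with the sketch below, written in Python with the standard sqlite3 module. The original query elides its FROM clause, so the schema here is entirely hypothetical: a table b holding userid, item_id and quantity, a table c holding item_id and price, and made-up rows. Only the HAVING conditions mirror the answer.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical schema and data; the original tables are not shown.
conn.executescript("""
    CREATE TABLE b (userid INTEGER, item_id INTEGER, quantity INTEGER);
    CREATE TABLE c (item_id INTEGER PRIMARY KEY, price INTEGER);
    INSERT INTO c VALUES (1, 3000), (2, 500);
    INSERT INTO b VALUES (1, 1, 2), (2, 1, 3), (2, 2, 1), (3, 1, 4);
""")

# Users whose order total is at least 9000 (no BETWEEN needed).
at_least = conn.execute("""
    SELECT b.userid, SUM(c.price * b.quantity) AS total
    FROM b JOIN c ON c.item_id = b.item_id
    GROUP BY b.userid
    HAVING SUM(c.price * b.quantity) >= 9000
    ORDER BY b.userid
""").fetchall()

# Users whose total equals the maximum total, computed in a subquery
# because aggregate functions cannot be nested directly.
top = conn.execute("""
    SELECT b.userid, SUM(c.price * b.quantity) AS total
    FROM b JOIN c ON c.item_id = b.item_id
    GROUP BY b.userid
    HAVING SUM(c.price * b.quantity) = (
        SELECT MAX(total)
        FROM (
            SELECT SUM(c.price * b.quantity) AS total
            FROM b JOIN c ON c.item_id = b.item_id
            GROUP BY b.userid
        ) q
    )
""").fetchall()

print(at_least)
print(top)
```

With this data the per-user totals are 6000, 9500 and 12000, so the first query keeps users 2 and 3 and the second keeps only user 3.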

Why is 'ORDER BY' needed to get correct result from MySQL join?

I have the following query:
SELECT t.ID, t.caseID, time
FROM tbl_test t
INNER JOIN (
SELECT ID, MAX( TIME )
FROM tbl_test
WHERE TIME <=1353143351
GROUP BY caseID
ORDER BY caseID DESC -- ERROR HERE!
) s
USING (ID)
It seems that I only get the correct result if I use the ORDER BY in the inner join. Why is that? I am using the ID for the join, so the order should have no effect.
If I remove the order by, I get too old entries from the database.
ID is the primary key, the caseID is a kind of object with multiple entries with different timestamps.
This query is ambiguous:
SELECT ID, MAX( TIME )
FROM tbl_test
WHERE TIME <=1353143351
GROUP BY caseID
It's ambiguous because it does not guarantee that it returns the ID of the row where the MAX(TIME) occurs. It returns the MAX(TIME) for each distinct value of caseID, but the value of other columns (like ID) is chosen arbitrarily from members of the group.
In practice, MySQL chooses the row that it finds first in the group as it scans rows in storage order.
Example:
caseID ID time
1 10 15:00
1 12 18:00
1 14 13:00
The max time is 18:00, which is the row with ID 12. But the query will return ID 10, simply because it's the first one in the group. If you were to reverse the order with ORDER BY, it would return ID 14. Still not the row where the max time is found, but it's from the other end of the group of rows.
Your query works with ORDER BY caseID DESC because, by coincidence, your Time values increase with the increasing ID.
This sort of query is actually an error in standard SQL and most other brands of SQL database. MySQL permits it, trusting that you know how to form an unambiguous query.
The fix is to use columns in the select-list only if they are unambiguous. Columns named in the GROUP BY clause qualify, because each group is guaranteed to have only one distinct value for them:
SELECT caseID, MAX( TIME )
FROM tbl_test
WHERE TIME <=1353143351
GROUP BY caseID;
Then join that result back to the table to find the row that holds the max time for each caseID:
SELECT t.ID, t.caseID, t.time
FROM tbl_test t
INNER JOIN (
SELECT caseID, MAX( TIME ) maxtime
FROM tbl_test
WHERE TIME <=1353143351
GROUP BY caseID
) s
ON t.caseID = s.caseID and t.time = s.maxtime;
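The three-row example from this answer can be run through the join on caseID and max time with the following sketch, which uses Python's sqlite3 module rather than MySQL. The times are stored as HH:MM text, as in the example table, and the WHERE TIME <= ... filter is dropped because the sample values are not Unix timestamps; both are simplifications for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tbl_test (ID INTEGER PRIMARY KEY, caseID INTEGER, time TEXT)")
# The three example rows from the answer.
conn.executemany(
    "INSERT INTO tbl_test VALUES (?, ?, ?)",
    [(10, 1, "15:00"), (12, 1, "18:00"), (14, 1, "13:00")],
)

# Find the max time per caseID, then join back on (caseID, time)
# to recover the ID of the row that actually holds that time.
result = conn.execute("""
    SELECT t.ID, t.caseID, t.time
    FROM tbl_test t
    INNER JOIN (
        SELECT caseID, MAX(time) AS maxtime
        FROM tbl_test
        GROUP BY caseID
    ) s
    ON t.caseID = s.caseID AND t.time = s.maxtime
""").fetchall()

print(result)
```

The join returns ID 12, the row where 18:00 occurs, regardless of the order in which rows are stored or scanned. If two rows of the same caseID shared the max time, both would be returned.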
You are seeing that issue because you are getting the MAX(TIME) per caseID, but since you are grouping by caseID and NOT ID, you are getting an arbitrary ID. That happens because when you use an aggregate function, like MAX, you must, for every non-grouped field in the select specify how you want to aggregate it. That means, if it's in the SELECT and NOT in the GROUP BY, you have to tell MySQL how to aggregate. If you don't then you get a RANDOM row (well, not random per se, but it's not going to be in an order that you necessarily expect).
The reason ORDER BY is working for you, is that it kind of tricks the query optimizer into sorting the results before grouping, which just so happens to produce the result you want, but be warned, that will not always be the case.
What you want is the ID that has the MAX(TIME) given a caseID. Which means your INNER join needs to connect by caseID (not ID) and time (which will give you 1 row per each 1 row in the outer table).
Barmar beat me to the actual query, but that's the way you want to go.

Find most frequent value in SQL column

How can I find the most frequent value in a given column in an SQL table?
For example, for this table it should return two since it is the most frequent value:
one
two
two
three
SELECT
<column_name>,
COUNT(<column_name>) AS `value_occurrence`
FROM
<my_table>
GROUP BY
<column_name>
ORDER BY
`value_occurrence` DESC
LIMIT 1;
Replace <column_name> and <my_table>. Increase the LIMIT value from 1 to N if you want to see the N most common values of the column.
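For a concrete run of the GROUP BY / ORDER BY COUNT / LIMIT pattern, here is a minimal sketch in Python with the standard sqlite3 module, loaded with the four sample values from the question. The table name my_table and column name val are placeholders chosen for this example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE my_table (val TEXT)")  # placeholder names
conn.executemany(
    "INSERT INTO my_table VALUES (?)",
    [("one",), ("two",), ("two",), ("three",)],
)

# Count each distinct value, sort by count descending, keep the top row.
most_frequent = conn.execute("""
    SELECT val, COUNT(val) AS value_occurrence
    FROM my_table
    GROUP BY val
    ORDER BY value_occurrence DESC
    LIMIT 1
""").fetchone()

print(most_frequent)
```

Note that LIMIT 1 returns a single row even when several values share the highest count, so this form is only unambiguous when there is a clear winner.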
Try something like:
SELECT `column`
FROM `your_table`
GROUP BY `column`
ORDER BY COUNT(*) DESC
LIMIT 1;
Let us consider table name as tblperson and column name as city. I want to retrieve the most repeated city from the city column:
select city,count(*) as nor from tblperson
group by city
having count(*) =(select max(nor) from
(select city,count(*) as nor from tblperson group by city) tblperson)
Here nor is an alias name.
Below query seems to work good for me in SQL Server database:
select column, COUNT(column) AS MOST_FREQUENT
from TABLE_NAME
GROUP BY column
ORDER BY COUNT(column) DESC
Result:
column MOST_FREQUENT
item1 highest count
item2 second highest
item3 third highest
..
..
For use with SQL Server, which does not support LIMIT.
You can use TOP 1 to find the most frequently occurring value in a particular column (here, value):
SELECT TOP 1
[value],
COUNT([value]) AS value_occurrence
FROM
my_table
GROUP BY
[value]
ORDER BY
value_occurrence DESC;
Assuming Table is 'SalesLT.Customer' and the Column you are trying to figure out is 'CompanyName' and AggCompanyName is an Alias.
Select CompanyName, Count(CompanyName) as AggCompanyName from SalesLT.Customer
group by CompanyName
Order By Count(CompanyName) Desc;
If you can't use LIMIT, or LIMIT is not an option in your query tool, you can use ROWNUM instead (an Oracle pseudocolumn), but you will need a subquery:
SELECT FIELD_1, ALIAS1
FROM(SELECT FIELD_1, COUNT(FIELD_1) ALIAS1
FROM TABLENAME
GROUP BY FIELD_1
ORDER BY COUNT(FIELD_1) DESC)
WHERE ROWNUM = 1
If you have an ID column and you want to find the most frequent category in another column for each ID, you can use the query below:
Table:
Query:
SELECT ID, CATEGORY, COUNT(*) AS FREQ
FROM TABLE
GROUP BY 1,2
QUALIFY ROW_NUMBER() OVER(PARTITION BY ID ORDER BY FREQ DESC) = 1;
Result:
Return all most frequent rows in case of tie
Find the most frequent value in mysql,display all in case of a tie gives two possible approaches:
Scalar subquery:
SELECT
"country",
COUNT(country) AS "cnt"
FROM "Sales"
GROUP BY "country"
HAVING
COUNT("country") = (
SELECT COUNT("country") AS "cnt"
FROM "Sales"
GROUP BY "country"
ORDER BY "cnt" DESC
LIMIT 1
)
ORDER BY "country" ASC
With the RANK window function, available since MySQL 8:
SELECT "country", "cnt"
FROM (
SELECT
"country",
COUNT("country") AS "cnt",
RANK() OVER (ORDER BY COUNT(*) DESC) "rnk"
FROM "Sales"
GROUP BY "country"
) AS "sub"
WHERE "rnk" = 1
ORDER BY "country" ASC
This method might save a second recount compared to the first one.
RANK works by ranking all rows, such that if two rows are at the top, both get rank 1. So it basically directly solves this type of use case.
RANK is also available in SQLite and PostgreSQL; it is part of the SQL standard.
In the above queries I also sorted by country to have more deterministic results.
Tested on SQLite 3.34.0, PostgreSQL 14.3, GitHub upstream.
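The difference between RANK and a plain LIMIT 1 in the presence of ties can be seen with the sketch below, written in Python with the standard sqlite3 module. It assumes the bundled SQLite is version 3.25 or newer, since that is when window functions were added. The Sales rows are hypothetical and deliberately contain a two-way tie.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Sales (country TEXT)")
# Hypothetical data: FR and US are tied at two rows each.
conn.executemany(
    "INSERT INTO Sales VALUES (?)",
    [("FR",), ("FR",), ("US",), ("US",), ("DE",)],
)

# RANK gives every country with the top count the same rank of 1.
tied = conn.execute("""
    SELECT country, cnt
    FROM (
        SELECT country,
               COUNT(country) AS cnt,
               RANK() OVER (ORDER BY COUNT(*) DESC) AS rnk
        FROM Sales
        GROUP BY country
    ) AS sub
    WHERE rnk = 1
    ORDER BY country ASC
""").fetchall()

# LIMIT 1 keeps only one of the tied rows.
top_one = conn.execute("""
    SELECT country, COUNT(country) AS cnt
    FROM Sales
    GROUP BY country
    ORDER BY cnt DESC
    LIMIT 1
""").fetchall()

print(tied)
print(len(top_one))
```

The RANK query returns both FR and US, while the LIMIT 1 query returns just one of them, and which one is not guaranteed.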
Most frequent for each GROUP BY group
MySQL: MySQL SELECT most frequent by group
PostgreSQL:
Get most common value for each value of another column in SQL
https://dba.stackexchange.com/questions/193307/find-most-frequent-values-for-a-given-column
SQLite: SQL query for finding the most frequent value of a grouped by value
SELECT TOP 20 WITH TIES COUNT(Counted_Column) AS Count, OtherColumn1,
OtherColumn2, OtherColumn3, OtherColumn4
FROM Table_or_View_Name
WHERE
(Date_Column >= '01/01/2023') AND
(Date_Column <= '03/01/2023') AND
(Counted_Column = 'Desired_Text')
GROUP BY OtherColumn1, OtherColumn2, OtherColumn3, OtherColumn4
ORDER BY COUNT(Counted_Column) DESC
20 can be changed to any desired number
WITH TIES allows all ties in the count to be displayed
Date range used if date/time column exists and can be modified to search a date range as desired
Counted_Column 'Desired_Text' can be modified to only count certain entries in that column
Works in INSQL for my instance
One way I like to use is:
select <given_column>, COUNT(<given_column>) as VAR1 from Table_Name
group by <given_column>
order by VAR1 desc
limit 1