MySQL GROUP BY condition - mysql

I want to apply a condition to GROUP BY.
When city_id != 0, the rows should be grouped; otherwise they should be listed individually.
I used this query for that:
(
SELECT city_id, sum(sales) as counts
FROM product_sales
WHERE city_id !=0
GROUP BY city_id
)
UNION
(
SELECT city_id, sales
FROM product_sales
WHERE city_id =0
ORDER BY sales_id
)
Can anyone help me avoid the UNION and get the list in a single query?

One idea: group by city_id when it is not zero; otherwise use UUID() to generate a unique value to group on, so that rows with city_id = 0 are never grouped together.
select city_id, sum(sales)
from product_sales
group by
case when city_id = 0
then UUID()
else city_id
end
From the MySQL documentation on UUID():
A UUID is designed as a number that is globally unique in space and
time. Two calls to UUID() are expected to generate two different
values, even if these calls are performed on two separate computers
that are not connected to each other.
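As a rough, runnable illustration of "group only when city_id is not zero", here is a minimal sketch in Python against an in-memory SQLite database. This is a stand-in, not MySQL: SQLite has no UUID(), so the sketch swaps in a different technique and adds the row's own key as a second grouping expression for the city_id = 0 rows. It assumes sales_id (taken from the ORDER BY in the question) is unique per row; the function name grouped_sales and the sample rows are made up for the example.

```python
import sqlite3


def grouped_sales(rows):
    """rows: iterable of (sales_id, city_id, sales) tuples (sample data)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE product_sales ("
        "sales_id INTEGER PRIMARY KEY, city_id INTEGER, sales INTEGER)"
    )
    conn.executemany("INSERT INTO product_sales VALUES (?, ?, ?)", rows)
    result = conn.execute(
        """
        SELECT city_id, SUM(sales) AS counts
        FROM product_sales
        -- The second grouping key is NULL for city_id <> 0 (so those rows
        -- collapse per city) and unique for city_id = 0 (one group per row).
        GROUP BY city_id,
                 CASE WHEN city_id = 0 THEN sales_id END
        ORDER BY city_id, MIN(sales_id)
        """
    ).fetchall()
    conn.close()
    return result


sample_sales = [(1, 1, 10), (2, 1, 5), (3, 0, 7), (4, 0, 9), (5, 2, 3)]
print(grouped_sales(sample_sales))
```

The same GROUP BY expression should also be valid in MySQL, and unlike UUID() it is deterministic, but that is an assumption worth checking against your own schema.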

Related

Sql query max, group by

I am trying to get all students grouped by class_id, student_id, teacher_id.
This is the query I mean:
Select id,class_id, student_id,teacher_id, max(active)
FROM student_classes
GROUP BY class_id, student_id, teacher_id
But this is what I get
Actually what I want as a result is:
114 137 1 47 1
108 138 2 49 0
113 197 3 47 1
So basically the problem is in the third row. Instead of id = 113 I get id = 111.
What should I do in this case? Can you please help me with the query?
As mentioned in the comments, MySQL allows something against the SQL standard, letting you include a non-aggregated column (in this case id) in the select list of a query that includes a group by. As far as I know, it will arbitrarily pick one row in each grouping and display the id value from that row.
If you have a specific rule about which id value you want to see, you need to express that in your query.
By the way, your desired output appears to have multiple typos (e.g. 197, which doesn't appear in your data at all).
From your comment (which you should edit into your original question), and your desired output, I think the rule you want for the id column is:
If there are any rows with active=1 in the group, choose the maximum id value from those rows
If all rows in the group have active=0, choose the minimum id value. (You didn't say this specifically; I'm assuming it based on the presence of 108 on the second row of your desired output.)
I think that this query will produce those results. (And also eliminate the non-standard MySQL behavior.)
SELECT
COALESCE(
MAX(CASE WHEN active=1 THEN id ELSE NULL END),
MIN(id)
) AS some_id,
class_id, student_id, teacher_id, max(active)
FROM student_classes
GROUP BY class_id, student_id, teacher_id
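To see the COALESCE(MAX(CASE ...), MIN(id)) rule in action, here is a minimal sketch using Python's sqlite3 module as a stand-in for MySQL. The rows are invented for illustration (the data from the question is not available here), and pick_ids is a hypothetical helper name.

```python
import sqlite3


def pick_ids(rows):
    """rows: (id, class_id, student_id, teacher_id, active) tuples."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE student_classes (id INTEGER PRIMARY KEY, "
        "class_id INTEGER, student_id INTEGER, teacher_id INTEGER, "
        "active INTEGER)"
    )
    conn.executemany("INSERT INTO student_classes VALUES (?, ?, ?, ?, ?)", rows)
    result = conn.execute(
        """
        SELECT COALESCE(
                 MAX(CASE WHEN active = 1 THEN id END),  -- highest active id
                 MIN(id)                                 -- else lowest id
               ) AS some_id,
               class_id, student_id, teacher_id, MAX(active) AS max_active
        FROM student_classes
        GROUP BY class_id, student_id, teacher_id
        ORDER BY class_id, student_id, teacher_id
        """
    ).fetchall()
    conn.close()
    return result


# Invented sample rows: one all-inactive group, one mixed group, one single row.
sample_classes = [
    (108, 138, 2, 49, 0), (109, 138, 2, 49, 0),
    (111, 139, 3, 47, 1), (112, 139, 3, 47, 0), (113, 139, 3, 47, 1),
    (114, 137, 1, 47, 1),
]
print(pick_ids(sample_classes))
```

With these rows the mixed group yields id 113 (the highest active id) and the all-inactive group yields 108 (the lowest id), which matches the rule described above.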
MySQL 5.5 and 5.6 accept the query as you wrote it, but the result is not well defined. In version 5.7 and higher, where ONLY_FULL_GROUP_BY is enabled by default, it throws an error along the lines of "SELECT list is not in GROUP BY clause and contains nonaggregated column 'student_classes.id'..."
So it seems your DB version is old, and this query may give you what you want:
select
min(x.id) as id, -- lowest id among the rows that carry the group's max(active)
x.class_id,
x.student_id,
x.teacher_id,
x.active
from student_classes x
inner join (select
class_id,
student_id,
teacher_id,
max(active) max_active -- highest active flag per group
from student_classes
group by class_id, student_id, teacher_id
) y
on x.class_id = y.class_id and
x.student_id = y.student_id and
x.teacher_id = y.teacher_id and
x.active = y.max_active
group by x.class_id, x.student_id, x.teacher_id, x.active
order by id, class_id, student_id
;
You don't want an aggregation actually, but rather pick particular rows. The rule for picking a row is: Per class_id, student_id, teacher_id get the one with the maximum active and in case of a tie the lowest id. This is a ranking of rows.
As of MySQL 8 you can use a window function like ROW_NUMBER to rank rows:
select *
from
(
select
sc.*,
row_number() over (partition by class_id, student_id, teacher_id
order by active desc, id) as rn
from student_classes sc
) with_wanted_id
where rn = 1;
In older versions you could use NOT EXISTS to exclude rows for which a better row exists:
select *
from student_classes sc1
where not exists
(
select null
from student_classes sc2
where sc2.class_id = sc1.class_id
and sc2.student_id = sc1.student_id
and sc2.teacher_id = sc1.teacher_id
and
(
sc2.active > sc1.active
or
(sc2.active = sc1.active and sc2.id < sc1.id)
)
);
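The NOT EXISTS variant can be tried out with a minimal sketch like the one below, which runs it through Python's sqlite3 module (a stand-in for MySQL; the query text is otherwise the same). The sample rows and the helper name best_rows are assumptions made for the example.

```python
import sqlite3


def best_rows(rows):
    """rows: (id, class_id, student_id, teacher_id, active) tuples."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE student_classes (id INTEGER PRIMARY KEY, "
        "class_id INTEGER, student_id INTEGER, teacher_id INTEGER, "
        "active INTEGER)"
    )
    conn.executemany("INSERT INTO student_classes VALUES (?, ?, ?, ?, ?)", rows)
    result = conn.execute(
        """
        SELECT *
        FROM student_classes sc1
        WHERE NOT EXISTS (
            SELECT NULL
            FROM student_classes sc2
            WHERE sc2.class_id = sc1.class_id
              AND sc2.student_id = sc1.student_id
              AND sc2.teacher_id = sc1.teacher_id
              -- a "better" row: higher active, or same active and lower id
              AND (sc2.active > sc1.active
                   OR (sc2.active = sc1.active AND sc2.id < sc1.id))
        )
        ORDER BY sc1.id
        """
    ).fetchall()
    conn.close()
    return result


# Invented sample rows, not the data from the question.
ranking_sample = [
    (108, 138, 2, 49, 0), (109, 138, 2, 49, 0),
    (111, 139, 3, 47, 1), (112, 139, 3, 47, 0), (113, 139, 3, 47, 1),
    (114, 137, 1, 47, 1),
]
print(best_rows(ranking_sample))
```

Note that this rule breaks ties on the lowest id, so for the group with two active rows it keeps 111 rather than 113; reverse the id comparison if you want the highest id instead.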

Group by - do not Group NULL and control record displayed

Similar to this question: group by not-null values
I'm trying to only group my records that have the column groupID not null:
+--+-------+------+------+-----+----------+
|id|groupId|isMain|name  |stars|created   |
+--+-------+------+------+-----+----------+
|1 |abcd   |1     |john  |5    |2018-06-01|
|2 |NULL   |0     |albert|3    |2018-05-01|
|3 |abcd   |0     |clara |1    |2018-06-01|
|4 |NULL   |0     |steph |2    |2018-07-01|
+--+-------+------+------+-----+----------+
With this query I'm able to group only those records where groupId is not null:
SELECT *, SUM(stars) as stars
FROM table
GROUP BY (case when `groupId` is null then id else `groupId` end)
ORDER BY created DESC
This gives me the result:
4..NULL....0.....steph...2...2018-07-01
3..NULL....0.....clara...6...2018-06-01
2..NULL....0.....albert..3...2018-05-01
For the grouped records, I want to display the row where isMain is 1, but I have no clue how to achieve that.
I've tried playing with HAVING, but that gives me a totally different result.
You could use CASE or IFNULL,
but you should also use proper aggregate functions and list the non-aggregated columns in the GROUP BY clause, e.g.:
select ifnull(groupID, id), name, sum(stars), max(created) as my_create
from table
group by ifnull(groupID, id), name
order by my_create
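One possible way to show the isMain row's name for each group is conditional aggregation. The sketch below is an assumption-laden illustration, not the answerer's query: it uses Python's sqlite3 module instead of MySQL, a hypothetical table name records (since table is a reserved word), COALESCE in place of IFNULL, and the four rows from the question. It also assumes each group has exactly one isMain = 1 row and that no groupId value collides with an id.

```python
import sqlite3


def group_with_main(rows):
    """rows: (id, groupId, isMain, name, stars, created) tuples."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE records (id INTEGER PRIMARY KEY, groupId TEXT, "
        "isMain INTEGER, name TEXT, stars INTEGER, created TEXT)"
    )
    conn.executemany("INSERT INTO records VALUES (?, ?, ?, ?, ?, ?)", rows)
    result = conn.execute(
        """
        SELECT -- name of the main row; ungrouped rows keep their own name
               MAX(CASE WHEN groupId IS NULL OR isMain = 1 THEN name END)
                   AS name,
               SUM(stars) AS total_stars,
               MAX(created) AS last_created
        FROM records
        GROUP BY COALESCE(groupId, id)  -- NULL groupId: one group per row
        ORDER BY last_created DESC
        """
    ).fetchall()
    conn.close()
    return result


question_rows = [
    (1, "abcd", 1, "john", 5, "2018-06-01"),
    (2, None, 0, "albert", 3, "2018-05-01"),
    (3, "abcd", 0, "clara", 1, "2018-06-01"),
    (4, None, 0, "steph", 2, "2018-07-01"),
]
print(group_with_main(question_rows))
```

Here the abcd group is reported under john (its isMain row) with 6 stars, while albert and steph stay as individual rows.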

How can I get a column value from another table outside of UNION ALL?

In my SQL I am getting transactions relating to a user and a business. However, I also need to get the name of the business. It is found in column business_name under table Businesses. In my example SQL, I would want to get the business name for business_id=1. My current code works aside from not getting the business name.
(SELECT TRUNCATE(code_reward_amount, 2) AS amount, UNIX_TIMESTAMP(code_redeemed_date) AS date, 0 AS action_number
FROM CodesRedeemed
WHERE code_redeemed_by_user_id=191 AND code_business_id=1)
UNION ALL
(SELECT TRUNCATE(action_amount, 2) AS amount, UNIX_TIMESTAMP(action_date) AS date, action_number
FROM BusinessAccountActions
WHERE action_user_id=191 AND action_business_id=1)
ORDER BY date DESC
LIMIT 100
In my second code attempt, it does get the business name, however, it is not efficient to do the select in every row since the business name would be the same for each row. How can I do it once and apply it to each row? Perhaps somewhere outside of the UNION ALL? Here is my working code, however, I would like to optimize it so it doesn't SELECT from Businesses for the business_name in every single row (since the business_name is guaranteed to be the same for all rows since they share the same business_id).
(SELECT TRUNCATE(code_reward_amount, 2) AS amount, UNIX_TIMESTAMP(code_redeemed_date) AS date, 0 AS action_number, (SELECT business_name FROM Businesses WHERE business_id=1) AS business_name
FROM CodesRedeemed
WHERE code_redeemed_by_user_id=191 AND code_business_id=1)
UNION ALL
(SELECT TRUNCATE(action_amount, 2) AS amount, UNIX_TIMESTAMP(action_date) AS date, action_number, (SELECT business_name FROM Businesses WHERE business_id=1) AS business_name
FROM BusinessAccountActions
WHERE action_user_id=191 AND action_business_id=1)
ORDER BY date DESC
LIMIT 100
business_id would change depending on the business. I am just testing it for business_id 1 right now. How would I optimize (mainly not checking for business_name in every single row)? Thank you.
Use a JOIN.
SELECT u.amount, u.date, b.business_name, u.action_number
FROM (
(SELECT TRUNCATE(code_reward_amount, 2) AS amount, UNIX_TIMESTAMP(code_redeemed_date) AS date, 0 AS action_number
FROM CodesRedeemed
WHERE code_redeemed_by_user_id=191 AND code_business_id=1)
UNION ALL
(SELECT TRUNCATE(action_amount, 2) AS amount, UNIX_TIMESTAMP(action_date) AS date, action_number
FROM BusinessAccountActions
WHERE action_user_id=191 AND action_business_id=1)
ORDER BY date DESC
LIMIT 100) AS u
CROSS JOIN Businesses AS b
WHERE b.business_id = 1
Using a JOIN as Bamar suggested is a perfectly acceptable way to do it, and is how I would most likely do it.
However, you could store the name in a user-defined variable and use that in place of the additional select.
SELECT business_name FROM Businesses WHERE business_id=1 LIMIT 1 INTO @bname;
(SELECT TRUNCATE(code_reward_amount, 2) AS amount, UNIX_TIMESTAMP(code_redeemed_date) AS date, 0 AS action_number, @bname AS business_name
FROM CodesRedeemed
WHERE code_redeemed_by_user_id=191 AND code_business_id=1)
UNION ALL
(SELECT TRUNCATE(action_amount, 2) AS amount, UNIX_TIMESTAMP(action_date) AS date, action_number, @bname AS business_name
FROM BusinessAccountActions
WHERE action_user_id=191 AND action_business_id=1)
ORDER BY date DESC
LIMIT 100

MySQL -- Finding % of orders with a transaction failure

I have an order_transactions table with 3 relevant columns. id (unique id for the transaction attempt), order_id (the id of the order for which the attempt is being made), and success an int which is 0 if failed, and 1 if successful.
There can be 0 or more failed transactions before a successful transaction, for each order_id.
The question is, how do I find:
The number of orders which never had a successful transaction
The number of orders which had a transaction with a failure (eventually successful or not)
The number of orders which never had a failed transaction (success only)
I realize this is some combination of distinct, group by, maybe a subselect, etc, I'm just not well versed in this enough. Thanks.
To get the number of orders which never had a successful transaction you can use:
SELECT COUNT(*)
FROM (
SELECT order_id
FROM transactions
GROUP BY order_id
HAVING COUNT(CASE WHEN success = 1 THEN 1 END) = 0) AS t
The number of orders which had a transaction with a failure (eventually successful or not) can be obtained using the query:
SELECT COUNT(*)
FROM (
SELECT order_id
FROM transactions
GROUP BY order_id
HAVING COUNT(CASE WHEN success = 0 THEN 1 END) > 0) AS t
Finally, to get the number of orders which never had a failed transaction (success only):
SELECT COUNT(*)
FROM (
SELECT order_id
FROM transactions
GROUP BY order_id
HAVING COUNT(CASE WHEN success = 0 THEN 1 END) = 0) AS t
You want "counts" of orders that meet specific conditions over multiple rows, so I'd start with a GROUP BY order_id
SELECT ...
FROM mytable t
GROUP BY t.order_id
To find out if a particular order ever had a failed transaction, etc. we can use aggregates on expressions that "test" for conditions.
For example:
SELECT MAX(t.success=1) AS succeeded
, MAX(t.success=0) AS failed
, IF(MAX(t.success=1),0,1) AS never_succeeded
FROM mytable t
GROUP BY t.order_id
The expressions in the SELECT list of that query are MySQL shorthand. We could use longer expressions (MySQL IF() function or ANSI CASE expressions) to achieve an equivalent result, e.g.
CASE WHEN t.success = 1 THEN 1 ELSE 0 END
We could include the `order_id` column in the SELECT list for testing. We can compare the results for each order_id to the rows in the original table, to verify that the results returned meet the specification.
To get "counts" of orders, we can reference the query as an inline view, and use aggregate expressions in the SELECT list.
For example:
SELECT SUM(r.succeeded) AS cnt_succeeded
, SUM(r.failed) AS cnt_failed
, SUM(r.never_succeeded) AS cnt_never_succeeded
FROM (
SELECT MAX(t.success=1) AS succeeded
, MAX(t.success=0) AS failed
, IF(MAX(t.success=1),0,1) AS never_succeeded
FROM mytable t
GROUP BY t.order_id
) r
Since the expressions in the SELECT list return either 0, 1 or NULL, we can use the SUM() aggregate to get a count. To make use of a COUNT() aggregate, we would need to return NULL in place of a 0 (FALSE) value.
SELECT COUNT(IF(r.succeeded,1,NULL)) AS cnt_succeeded
, COUNT(IF(r.failed,1,NULL)) AS cnt_failed
, COUNT(IF(r.never_succeeded,1,NULL)) AS cnt_never_succeeded
FROM (
SELECT MAX(t.success=1) AS succeeded
, MAX(t.success=0) AS failed
, IF(MAX(t.success=1),0,1) AS never_succeeded
FROM mytable t
GROUP BY t.order_id
) r
If you want a count of all order_id values, add a COUNT(1) or SUM(1) expression in the outer query. If you need percentages, do the division and multiply by 100.
For example:
SELECT SUM(r.succeeded) AS cnt_succeeded
, SUM(r.failed) AS cnt_failed
, SUM(r.never_succeeded) AS cnt_never_succeeded
, SUM(1) AS cnt_all_orders
, SUM(r.failed)/SUM(1)*100.0 AS pct_with_a_failure
, SUM(r.succeeded)/SUM(1)*100.0 AS pct_succeeded
, SUM(r.never_succeeded)/SUM(1)*100.0 AS pct_never_succeeded
FROM (
SELECT MAX(t.success=1) AS succeeded
, MAX(t.success=0) AS failed
, IF(MAX(t.success=1),0,1) AS never_succeeded
FROM mytable t
GROUP BY t.order_id
) r
(The percentages here are relative to the count of distinct order_id values, not to the total number of rows in the table.)
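The per-order flags and the outer counts can be checked end to end with a small sketch. It uses Python's sqlite3 module as a stand-in for MySQL and spells the flags out with CASE expressions rather than the MySQL boolean shorthand. The table and column names follow the question; the function name order_stats and the sample rows are invented for illustration.

```python
import sqlite3


def order_stats(rows):
    """rows: (id, order_id, success) tuples; returns counts per category."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE order_transactions ("
        "id INTEGER PRIMARY KEY, order_id INTEGER, success INTEGER)"
    )
    conn.executemany("INSERT INTO order_transactions VALUES (?, ?, ?)", rows)
    total, never_succeeded, had_failure, never_failed = conn.execute(
        """
        SELECT COUNT(*),
               COALESCE(SUM(r.never_succeeded), 0),
               COALESCE(SUM(r.had_failure), 0),
               COALESCE(SUM(r.never_failed), 0)
        FROM (
            -- one row per order, with 0/1 flags for each condition
            SELECT order_id,
                   1 - MAX(CASE WHEN success = 1 THEN 1 ELSE 0 END)
                       AS never_succeeded,
                   MAX(CASE WHEN success = 0 THEN 1 ELSE 0 END)
                       AS had_failure,
                   1 - MAX(CASE WHEN success = 0 THEN 1 ELSE 0 END)
                       AS never_failed
            FROM order_transactions
            GROUP BY order_id
        ) AS r
        """
    ).fetchone()
    conn.close()
    return {
        "orders": total,
        "never_succeeded": never_succeeded,
        "had_failure": had_failure,
        "never_failed": never_failed,
        # percentage of orders with at least one failed attempt
        "pct_with_failure": 100.0 * had_failure / total if total else None,
    }


# Order 1: fail then success; order 2: two failures; orders 3 and 4: success.
sample_attempts = [(1, 1, 0), (2, 1, 1), (3, 2, 0), (4, 2, 0), (5, 3, 1), (6, 4, 1)]
print(order_stats(sample_attempts))
```

Computing the percentage in application code avoids a division by zero on an empty table; doing it in SQL as shown above works equally well when the table is known to be non-empty.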
Orders with a successful transaction:
select count(*) from
( select distinct order_id from my_table where success = 1 ) as t;
Orders with a failed transaction:
select count(*) from
( select distinct order_id from my_table where success = 0 ) as t;
Orders that never had a failed transaction:
select count(*) from
( select distinct order_id from my_table where order_id not in
(select distinct order_id from my_table where success = 0) ) as t;

Retrieving last row inserted in table for each "parameter"

I have a table, currently about 1.3M rows, which stores measured data points for a number of different parameters. There are about 30 parameters.
Table:
* id
* station_id (int)
* comp_id (int)
* unit_id (int)
* p_id (int)
* timestamp
* value
I have a UNIQUE index on: (station_id, comp_id, unit_id, p_id, timestamp)
Because the timestamp differs for every parameter, I have difficulty sorting by timestamp (I have to use a GROUP BY).
So today I select the last value for each parameter with this query:
select p_id, timestamp, value
from (select p_id, timestamp, value
from table
where station_id = 3 and comp_id = 9112 and unit_id = 1 and
p_id in (1,2,3,4,5,6,7,8,9,10)
order by timestamp desc
) table_x
group by p_id;
This query takes about 3 seconds to execute.
Even though I have the index mentioned above, the optimizer uses a filesort to find the values.
Querying for only 1 specific parameter:
select p_id, timestamp, value from table where station_id = 3 and comp_id = 9112 and unit_id = 1 and p_id =1 order by timestamp desc limit 1;
Takes no time (0.00).
I've also tried joining against a table in which I store the parameter IDs, without luck.
So, is there a simple (and fast) way to ask for the latest values of several different parameters at once?
A procedure that loops and queries each parameter individually seems much faster than asking for all of them at once, which I don't think is the way a database should be used.
Your query is incorrect. You are aggregating by p_id, but including other columns. These come from indeterminate rows, and the documentation is quite clear:
MySQL extends the use of GROUP BY so that the select list can refer to
nonaggregated columns not named in the GROUP BY clause. This means
that the preceding query is legal in MySQL. You can use this feature
to get better performance by avoiding unnecessary column sorting and
grouping. However, this is useful primarily when all values in each
nonaggregated column not named in the GROUP BY are the same for each
group. The server is free to choose any value from each group, so
unless they are the same, the values chosen are indeterminate.
Furthermore, the selection of values from each group cannot be
influenced by adding an ORDER BY clause.
The following should work:
select t.p_id, t.timestamp, t.value
from table t join
(select p_id, max(timestamp) as maxts
from table
where station_id = 3 and comp_id = 9112 and unit_id = 1 and
p_id in (1,2,3,4,5,6,7,8,9,10)
group by p_id
) tt
on tt.p_id = t.p_id and tt.maxts = t.timestamp
where t.station_id = 3 and t.comp_id = 9112 and t.unit_id = 1;
The best index for this query is a composite index on table(station_id, comp_id, unit_id, p_id, timestamp).
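The "join back on the per-group maximum timestamp" pattern can be exercised with a minimal sketch like the following. It uses Python's sqlite3 module as a stand-in for MySQL, a hypothetical table name measurements (the question only calls it table), and invented sample rows; latest_values is likewise a made-up helper name.

```python
import sqlite3


def latest_values(rows, station_id, comp_id, unit_id, p_ids):
    """rows: (station_id, comp_id, unit_id, p_id, timestamp, value)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE measurements (id INTEGER PRIMARY KEY, "
        "station_id INTEGER, comp_id INTEGER, unit_id INTEGER, "
        "p_id INTEGER, timestamp TEXT, value REAL, "
        "UNIQUE (station_id, comp_id, unit_id, p_id, timestamp))"
    )
    conn.executemany(
        "INSERT INTO measurements "
        "(station_id, comp_id, unit_id, p_id, timestamp, value) "
        "VALUES (?, ?, ?, ?, ?, ?)",
        rows,
    )
    placeholders = ",".join("?" * len(p_ids))
    query = f"""
        SELECT t.p_id, t.timestamp, t.value
        FROM measurements AS t
        JOIN (
            -- newest timestamp per parameter for the requested unit
            SELECT p_id, MAX(timestamp) AS maxts
            FROM measurements
            WHERE station_id = ? AND comp_id = ? AND unit_id = ?
              AND p_id IN ({placeholders})
            GROUP BY p_id
        ) AS tt
          ON tt.p_id = t.p_id AND tt.maxts = t.timestamp
        WHERE t.station_id = ? AND t.comp_id = ? AND t.unit_id = ?
        ORDER BY t.p_id
    """
    key = [station_id, comp_id, unit_id]
    result = conn.execute(query, key + list(p_ids) + key).fetchall()
    conn.close()
    return result


sample_points = [
    (3, 9112, 1, 1, "2020-01-01 00:00:00", 1.0),
    (3, 9112, 1, 1, "2020-01-02 00:00:00", 2.0),
    (3, 9112, 1, 2, "2020-01-01 06:00:00", 4.0),
    (3, 9112, 1, 2, "2020-01-01 12:00:00", 5.0),
    (4, 9112, 1, 1, "2020-01-03 00:00:00", 9.0),  # different station
]
print(latest_values(sample_points, 3, 9112, 1, [1, 2]))
```

Repeating the station, component and unit filter on the outer table keeps a row from another station with the same p_id and timestamp from slipping into the result.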