mysql select ordernumber by group - mysql

I'm trying to do something like 'select groupwise maximum', but I'm looking for groupwise order number.
so with a table like this
briefs
----------
id_brief | id_case | date
1 | 1 | 06/07/2010
2 | 1 | 04/07/2010
3 | 1 | 03/07/2010
4 | 2 | 18/05/2010
5 | 2 | 17/05/2010
6 | 2 | 19/05/2010
I want a result like this
briefs result
----------
id_brief | id_case | dateOrder
1 | 1 | 3
2 | 1 | 2
3 | 1 | 1
4 | 2 | 2
5 | 2 | 1
6 | 2 | 3
I think I want to do something like described here MySQL - Get row number on select, but I don't know how I would reset the variable for each id_case.

For each row, this counts how many records share its id_case value and have a date less than or equal to its date.
SELECT t1.id_brief,
t1.id_case,
COUNT(t2.id_brief) AS dateOrder
FROM yourtable AS t1
LEFT JOIN yourtable AS t2 ON t2.id_case = t1.id_case AND t2.date <= t1.date
GROUP BY t1.id_brief
MySQL is permissive about which columns can be selected in a query that uses GROUP BY. With a stricter DBMS you may need GROUP BY t1.id_brief, t1.id_case.
I strongly advise you to have the right index on the table:
CREATE INDEX filter1 ON yourtable (id_case, date)
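To see the self-join count in action, here is a minimal sketch that runs it against the sample data from the question. It uses Python's built-in sqlite3 module as a stand-in for MySQL, and it assumes the dates are stored in ISO format (YYYY-MM-DD) so that `<=` compares them chronologically. The function name `groupwise_date_order` and the table name `briefs` are chosen for illustration only.

```python
import sqlite3


def groupwise_date_order():
    """Rank each brief by date within its case using a self-join count."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE briefs (id_brief INTEGER PRIMARY KEY, id_case INTEGER, date TEXT)"
    )
    # Sample rows from the question, with dates rewritten as ISO strings (assumption).
    conn.executemany(
        "INSERT INTO briefs VALUES (?, ?, ?)",
        [
            (1, 1, "2010-07-06"), (2, 1, "2010-07-04"), (3, 1, "2010-07-03"),
            (4, 2, "2010-05-18"), (5, 2, "2010-05-17"), (6, 2, "2010-05-19"),
        ],
    )
    conn.execute("CREATE INDEX filter1 ON briefs (id_case, date)")
    rows = conn.execute(
        """
        SELECT t1.id_brief, t1.id_case, COUNT(t2.id_brief) AS dateOrder
        FROM briefs AS t1
        LEFT JOIN briefs AS t2
               ON t2.id_case = t1.id_case AND t2.date <= t1.date
        GROUP BY t1.id_brief, t1.id_case
        ORDER BY t1.id_brief
        """
    ).fetchall()
    conn.close()
    return rows


for row in groupwise_date_order():
    print(row)
```

Note that two briefs with the same id_case and the same date would both receive the higher count, so this behaves like a rank with ties rather than a strict row number.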

Related

Using LIMIT in a subquery based on another field in MySQL

Is it possible to use LIMIT based on another column inside a subquery in MySQL? Here is a working query of what I mean.
SELECT id, name,
(SELECT AVG(value) FROM t2 WHERE t1id = t1.id ORDER BY value DESC LIMIT 4) as average
FROM t1
However I'd like to replace the "4" to a field inside t1.
Something like this where table t1 has fields id, name, size:
SELECT id, name,
(SELECT AVG(value) FROM t2 WHERE t1id = t1.id ORDER BY value DESC LIMIT t1.size) as average
FROM t1
I could join t1 and t2, but I'm not sure that works for this. Does it?
Edit:
Here's some sample data to show what I mean:
Table t1
| id | name | Size |
|----|------|------|
| 1 | Bob | 4 |
| 2 | Joe | 3 |
| 3 | Sam | 4 |
Table t2
| t1id | value |
|------|-------|
| 1 | 16 |
| 1 | 14 |
| 1 | 12 |
| 1 | 10 |
| 1 | 8 |
| 2 | 10 |
| 2 | 8 |
| 2 | 6 |
| 2 | 4 |
| 3 | 20 |
| 3 | 15 |
| 3 | 10 |
| 3 | 5 |
| 3 | 2 |
Expected result:
| id | name | avg |
|----|------|------|
| 1 | Bob | 13 |
| 2 | Joe | 8 |
| 3 | Sam | 12.5 |
Notice that the average is the average of only the top t1.size values. For example the average for Bob is 13 and not 12 (based on 4 values and not 5) and the average for Joe is 8 and not 7 (based on 3 values and not 4).
In MySQL, you have little choice other than LEFT JOIN and aggregation (ROW_NUMBER() requires MySQL 8.0 or later):
SELECT t1.id, t1.name, AVG(t2.value) as average
FROM t1 LEFT JOIN
(SELECT t2.*,
ROW_NUMBER() OVER (PARTITION BY t1id ORDER BY VALUE desc) as seqnum
FROM t2
) t2
on t2.t1id = t1.id AND seqnum <= t1.size
GROUP BY t1.id, t1.name;
No, you cannot use a column reference in a LIMIT clause.
https://dev.mysql.com/doc/refman/8.0/en/select.html has detailed documentation about MySQL's SELECT statement including all its clauses.
It says:
The LIMIT clause can be used to constrain the number of rows returned by the SELECT statement. LIMIT takes one or two numeric arguments, which must both be nonnegative integer constants, with these exceptions:
Within prepared statements, LIMIT parameters can be specified using ? placeholder markers.
Within stored programs, LIMIT parameters can be specified using integer-valued routine parameters or local variables.
Expressions, including subqueries, are not mentioned as legal arguments in the LIMIT clause.
A simple solution would be to do your task in two queries: the first to get the size and then use that value as a constant value in the second query that includes the LIMIT.
Not every task needs to be done in a single SQL statement.
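The two-query approach can be sketched as follows. This is only an illustration using Python's built-in sqlite3 module as a stand-in for MySQL: the first query reads each row's size, and the second binds that value to a `?` placeholder in LIMIT, which is the prepared-statement exception quoted above. The helper names `build_sample_db` and `top_n_averages` are hypothetical, and the derived table is needed so that LIMIT is applied before AVG rather than after it.

```python
import sqlite3


def build_sample_db():
    """Create an in-memory database holding the question's sample data."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE t1 (id INTEGER PRIMARY KEY, name TEXT, size INTEGER)")
    conn.execute("CREATE TABLE t2 (t1id INTEGER, value INTEGER)")
    conn.executemany(
        "INSERT INTO t1 VALUES (?, ?, ?)",
        [(1, "Bob", 4), (2, "Joe", 3), (3, "Sam", 4)],
    )
    conn.executemany(
        "INSERT INTO t2 VALUES (?, ?)",
        [(1, 16), (1, 14), (1, 12), (1, 10), (1, 8),
         (2, 10), (2, 8), (2, 6), (2, 4),
         (3, 20), (3, 15), (3, 10), (3, 5), (3, 2)],
    )
    return conn


def top_n_averages(conn):
    """Average the top `size` values of t2 for every row of t1."""
    results = []
    # Query 1: fetch the per-row limit.
    sizes = conn.execute("SELECT id, name, size FROM t1 ORDER BY id").fetchall()
    for t1_id, name, size in sizes:
        # Query 2: bind the limit as a parameter instead of a column reference.
        (average,) = conn.execute(
            """
            SELECT AVG(value)
            FROM (SELECT value FROM t2
                  WHERE t1id = ?
                  ORDER BY value DESC
                  LIMIT ?) AS top_values
            """,
            (t1_id, size),
        ).fetchone()
        results.append((t1_id, name, average))
    return results


print(top_n_averages(build_sample_db()))
```

This issues one extra query per row of t1, so it suits small result sets; for larger ones the ROW_NUMBER() join shown earlier does the same work in a single statement.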

Select count of rows matching a condition grouped by increments of Id in MySQL

I have a table that has an autoincremented numeric primary key. I'm trying to get a count of rows that match a condition grouped by increments of their primary key. Given the data:
| id | value |
|----|-------|
| 1 | a |
| 2 | b |
| 3 | a |
| 4 | a |
| 5 | b |
| 6 | a |
| 7 | b |
| 8 | a |
| 9 | b |
| 10 | b |
| 11 | a |
| 12 | b |
If I wanted to know how many rows matched value = 'a' for every five rows, the result should be:
| count(0) |
|----------|
| 3 |
| 2 |
| 1 |
I can nest a series of subqueries in the SELECT statement, like so:
SELECT (SELECT count(0)
FROM table
WHERE value = 'a'
AND id > 0
AND id <= 5) AS `1-5`,
(SELECT count(0)
FROM table
WHERE value = 'a'
AND id > 5
AND id <=10) AS `6-10`,
...
But is there a way to do this with a GROUP BY statement or something similar where I don't have to manually write out the increments? If not, is there a more time efficient method than a series of subqueries in the SELECT statement as in the above example?
You could divide the ID by 5 and then ceil the result. Adding 1 to the lower bound labels the groups 1-5, 6-10, and so on:
SELECT CONCAT((CEIL(id / 5.0) - 1) * 5 + 1, '-', CEIL(id / 5.0) * 5), COUNT(*)
FROM mytable
WHERE value = 'a'
GROUP BY CEIL(id / 5.0)
The following aggregate query should do the trick (the table is called mytable here because TABLE is a reserved word):
SELECT CEIL(id/5), COUNT(*)
FROM mytable
WHERE value = 'a'
GROUP BY CEIL(id/5)
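As a quick check of the bucketing idea, the sketch below runs it on the question's twelve rows with Python's built-in sqlite3 module standing in for MySQL. Because CEIL is not available in every SQLite build, the bucket number is computed with integer arithmetic, `(id + n - 1) / n`, which yields the same result as `CEIL(id / n)` for positive integer ids; in MySQL the equivalent integer division would be written with `DIV`. The function name `count_a_per_bucket` is an assumption made for illustration.

```python
import sqlite3


def count_a_per_bucket(bucket_size=5):
    """Count rows with value = 'a' in consecutive id ranges of `bucket_size`."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE mytable (id INTEGER PRIMARY KEY, value TEXT)")
    # The question's data: ids 1..12 with these values in order.
    values = "abaabababbab"
    conn.executemany(
        "INSERT INTO mytable VALUES (?, ?)", list(enumerate(values, start=1))
    )
    rows = conn.execute(
        """
        SELECT (id + :n - 1) / :n AS bucket, COUNT(*) AS matches
        FROM mytable
        WHERE value = 'a'
        GROUP BY bucket
        ORDER BY bucket
        """,
        {"n": bucket_size},
    ).fetchall()
    conn.close()
    # Turn bucket numbers into readable range labels such as '1-5'.
    return [
        (f"{(bucket - 1) * bucket_size + 1}-{bucket * bucket_size}", matches)
        for bucket, matches in rows
    ]


print(count_a_per_bucket())
```

A bucket with no matching rows does not appear in the output at all, since the WHERE clause removes its rows before grouping.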

I need to get the average for every 3 records in one table and update column in separate table

Table Mytable1
Id | Actual
1 | 10020
2 | 12203
3 | 12312
4 | 12453
5 | 13211
6 | 12838
7 | 10129
Using the following syntax:
SELECT AVG(Actual), CEIL((@rank:=@rank+1)/3) AS rank FROM mytable1 Group BY rank;
Produces the following type of result:
| AVG(Actual) | rank |
+-------------+------+
| 12835.5455 | 1 |
| 12523.1818 | 2 |
| 12343.3636 | 3 |
I would like to take AVG(Actual) column and UPDATE a second existing table Mytable2
Id | Predict |
1 | 11133
2 | 12312
3 | 13221
I would like to get the following where the Actual value matches the ID as RANK
Id | Predict | Actual
1 | 11133 | 12835.5455
2 | 12312 | 12523.1818
3 | 13221 | 12343.3636
IMPORTANT REQUIREMENT
I need to set an offset much like the following syntax:
SELECT @rank := @rank + 1 AS Id , Mytable2.Actual FROM Mytable LIMIT 3 OFFSET 4;
PLEASE NOTE THE AVERAGE NUMBER ARE MADE UP IN EXAMPLES
You can join your existing query in the UPDATE statement:
UPDATE Mytable2 T2
JOIN (
SELECT AVG(Actual) as AverageValue,
CEIL((@rank:=@rank+1)/3) AS rank
FROM Mytable1, (select @rank:=0) t
Group BY rank )T1
on T2.id = T1.rank
SET T2.Actual = T1.AverageValue
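For comparison, here is a hedged sketch of the same idea without user variables, using ROW_NUMBER() to number the rows and integer arithmetic to form groups of three. It runs on Python's built-in sqlite3 module as a stand-in for MySQL 8.0 or later, and it applies the averages with a separate parameterized UPDATE rather than an UPDATE ... JOIN. The function name `update_with_group_averages` is hypothetical, the seventh Actual value is taken to be 10129, and the offset requirement from the question is not covered.

```python
import sqlite3


def update_with_group_averages(group_size=3):
    """Average Mytable1.Actual per `group_size` rows and store it in Mytable2."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE Mytable1 (Id INTEGER PRIMARY KEY, Actual REAL)")
    conn.execute(
        "CREATE TABLE Mytable2 (Id INTEGER PRIMARY KEY, Predict INTEGER, Actual REAL)"
    )
    conn.executemany(
        "INSERT INTO Mytable1 VALUES (?, ?)",
        [(1, 10020), (2, 12203), (3, 12312), (4, 12453),
         (5, 13211), (6, 12838), (7, 10129)],  # last value is an assumption
    )
    conn.executemany(
        "INSERT INTO Mytable2 (Id, Predict) VALUES (?, ?)",
        [(1, 11133), (2, 12312), (3, 13221)],
    )
    # Number the rows by Id, then map row numbers 1..3 to group 1, 4..6 to group 2, ...
    averages = conn.execute(
        """
        SELECT grp, AVG(Actual)
        FROM (SELECT Actual,
                     (ROW_NUMBER() OVER (ORDER BY Id) + :n - 1) / :n AS grp
              FROM Mytable1) AS numbered
        GROUP BY grp
        """,
        {"n": group_size},
    ).fetchall()
    # Write each group's average to the Mytable2 row whose Id equals the group number.
    conn.executemany(
        "UPDATE Mytable2 SET Actual = ? WHERE Id = ?",
        [(average, grp) for grp, average in averages],
    )
    rows = conn.execute(
        "SELECT Id, Predict, Actual FROM Mytable2 ORDER BY Id"
    ).fetchall()
    conn.close()
    return rows


for row in update_with_group_averages():
    print(row)
```

Ordering the numbering by Id makes the grouping deterministic, which the user-variable version does not guarantee without an explicit ORDER BY.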

Remove duplicates from one column keeping whole rows

id | userid | total_points_spent
1 | 1 | 10
2 | 2 | 15
3 | 2 | 50
4 | 3 | 5
5 | 1 | 15
With the above table, I would first like to remove duplicates of userid keeping the rows with the largest total_points_spent, like so:
id | userid | total_points_spent
3 | 2 | 50
4 | 3 | 5
5 | 1 | 15
And then I would like to sum the values of total_points_spent, which would be the easy part, resulting in 70.
I am not sure whether by "remove" you mean deleting the rows or just leaving them out of the result. Here is a query that selects only the row with the maximum total_points_spent for each userid.
SELECT tblA.*
FROM ( SELECT userid, MAX(total_points_spent) AS maxtotal
FROM tblA
GROUP BY userid ) AS dt
INNER JOIN tblA
ON tblA.userid = dt.userid
AND tblA.total_points_spent = dt.maxtotal
ORDER BY tblA.userid
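The second half of the question, summing the surviving rows, can be sketched by wrapping the per-user maximum in an outer SUM. The example below uses Python's built-in sqlite3 module as a stand-in for MySQL; the table name `tblA` follows the answer above, and the function name `sum_of_user_maxima` is an assumption for illustration.

```python
import sqlite3


def sum_of_user_maxima():
    """Sum the largest total_points_spent of each user."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE tblA (id INTEGER PRIMARY KEY, userid INTEGER, total_points_spent INTEGER)"
    )
    conn.executemany(
        "INSERT INTO tblA VALUES (?, ?, ?)",
        [(1, 1, 10), (2, 2, 15), (3, 2, 50), (4, 3, 5), (5, 1, 15)],
    )
    (total,) = conn.execute(
        """
        SELECT SUM(maxtotal)
        FROM (SELECT MAX(total_points_spent) AS maxtotal
              FROM tblA
              GROUP BY userid) AS dt
        """
    ).fetchone()
    conn.close()
    return total


print(sum_of_user_maxima())
```

Summing the grouped maxima directly avoids the join back to tblA, which would double-count a user who has two rows tied at the maximum.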

Complex Join - involving date ranges and sum

I have two tables that I need to join... I want to join table1 and table2 on 'id' - however in table two id is not unique. I only want one value returned for table two, and this value represents the sum of a column called 'total_sold' - within a specified date range (say one month), however I want more than one date range at the same time...
SELECT ta.id, sum(tb.total_sold) as total_sold_this_week, sum(tc.total_sold) as total_sold_this_month
FROM table_a as ta
LEFT JOIN table_b as tb ON ta.id=tb.id AND tb.date_sold BETWEEN ADDDATE(NOW(),INTERVAL -1 WEEK) AND NOW()
LEFT JOIN table_b as tc ON ta.id=tc.id AND tc.date_sold BETWEEN ADDDATE(NOW(),INTERVAL -1 MONTH) AND NOW()
GROUP BY ta.id
this works but does not SUM the rows - only returning one row for each id... how do I get the sum from table b instead of only one row???
Please criticise if format of question could use more work - I can rewrite and provide sample data if required - this is a trivialised version of a much larger problem.
-Thanks
Using Subqueries
One way to solve this would be to use subqueries. LEFT JOIN creates a new "result" for each match in the right table, so using two LEFT JOINs is creating more ROWS than you want. You could just sub select the value you want, but this can be slow:
SELECT ta.id,
(SELECT SUM(total_sold) as total_sold
FROM table_b
WHERE date_sold BETWEEN ADDDATE(NOW(), INTERVAL -1 WEEK) AND NOW()
AND id=ta.id) as total_sold_this_week,
(SELECT SUM(total_sold) as total_sold
FROM table_b
WHERE date_sold BETWEEN ADDDATE(NOW(), INTERVAL -1 MONTH) AND NOW()
AND id = ta.id) as total_sold_this_month
FROM table_a ta;
Result:
+----+----------------------+-----------------------+
| id | total_sold_this_week | total_sold_this_month |
+----+----------------------+-----------------------+
| 1 | 3 | 7 |
| 2 | 4 | 4 |
| 3 | NULL | NULL |
+----+----------------------+-----------------------+
3 rows in set (0.04 sec)
Using SUM(CASE ...)
This method doesn't use subqueries (and will likely be faster on larger data sets). We want to join the table_a and table_b together once, using our "biggest" date range, and then use a SUM() based on a CASE to calculate the "smaller range".
SELECT ta.*,
SUM(total_sold) as total_sold_last_month,
SUM(CASE
WHEN date_sold BETWEEN NOW() - INTERVAL 1 WEEK AND NOW()
THEN total_sold
ELSE 0
END) as total_sold_last_week
FROM table_a AS ta
LEFT JOIN table_b AS tb
ON ta.id=tb.id AND tb.date_sold BETWEEN ADDDATE(NOW(),INTERVAL -1 MONTH) AND NOW()
GROUP BY ta.id;
This returns nearly the same resultset as the subquery example:
+----+-----------------------+----------------------+
| id | total_sold_last_month | total_sold_last_week |
+----+-----------------------+----------------------+
| 1 | 7 | 3 |
| 2 | 4 | 4 |
| 3 | NULL | 0 |
+----+-----------------------+----------------------+
3 rows in set (0.00 sec)
The only difference is the 0 instead of NULL. You could summarize as many date ranges as you'd like using this method, but it's still probably best to limit the rows returned to the largest range in the ON clause.
Just to show how it works: removing the GROUP BY and SUM() calls, and adding date_sold to the SELECT returns this:
+----+------------+-----------------------+----------------------+
| id | date_sold | total_sold_last_month | total_sold_last_week |
+----+------------+-----------------------+----------------------+
| 1 | 2010-04-30 | 2 | 2 |
| 1 | 2010-04-24 | 2 | 0 |
| 1 | 2010-04-24 | 2 | 0 |
| 1 | 2010-05-03 | 1 | 1 |
| 2 | 2010-05-03 | 4 | 4 |
| 3 | NULL | NULL | 0 |
+----+------------+-----------------------+----------------------+
6 rows in set (0.00 sec)
Now when you GROUP BY id, and SUM() the two total_sold columns you have your results!
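The SUM(CASE ...) approach can be reproduced on the test data with the sketch below. It uses Python's built-in sqlite3 module as a stand-in for MySQL, so SQLite's `date(:now, '-7 days')` and `date(:now, '-1 month')` replace `NOW() - INTERVAL 1 WEEK` and `ADDDATE(NOW(), INTERVAL -1 MONTH)`, and the current date is passed in as a parameter fixed at 2010-05-03 to match the test data. The function name `sales_totals` is hypothetical.

```python
import sqlite3


def sales_totals(now="2010-05-03"):
    """Return (id, total_sold_last_month, total_sold_last_week) per table_a row."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE table_a (id INTEGER PRIMARY KEY)")
    conn.execute(
        "CREATE TABLE table_b (id INTEGER, date_sold TEXT, total_sold INTEGER)"
    )
    conn.executemany("INSERT INTO table_a VALUES (?)", [(1,), (2,), (3,)])
    conn.executemany(
        "INSERT INTO table_b VALUES (?, ?, ?)",
        [(1, "2010-04-24", 2), (1, "2010-04-24", 2), (1, "2010-04-30", 2),
         (1, "2010-05-03", 1), (2, "2010-05-03", 4)],
    )
    rows = conn.execute(
        """
        SELECT ta.id,
               SUM(tb.total_sold) AS total_sold_last_month,
               SUM(CASE
                     WHEN tb.date_sold BETWEEN date(:now, '-7 days') AND :now
                     THEN tb.total_sold
                     ELSE 0
                   END) AS total_sold_last_week
        FROM table_a AS ta
        LEFT JOIN table_b AS tb
               ON ta.id = tb.id
              AND tb.date_sold BETWEEN date(:now, '-1 month') AND :now
        GROUP BY ta.id
        ORDER BY ta.id
        """,
        {"now": now},
    ).fetchall()
    conn.close()
    return rows


for row in sales_totals():
    print(row)
```

Joining table_b only once is what keeps the sums correct: each sale appears in exactly one joined row, so neither total is multiplied by the row count of the other range.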
Old Advice
Before you brought the two different date ranges into the mix, you could use GROUP BY to group using the table id on table1, and the SUM() aggregate function to add up the rows returned.
SELECT ta.id, SUM(tb.total_sold) as total_sold_this_week
FROM table_a as ta
LEFT JOIN table_b as tb
ON ta.id=tb.id AND tb.date_sold BETWEEN ADDDATE(NOW(),INTERVAL -3 WEEK) AND NOW()
GROUP BY ta.id
+----+----------------------+
| id | total_sold_this_week |
+----+----------------------+
| 1 | 7 |
| 2 | 4 |
| 3 | NULL |
+----+----------------------+
3 rows in set (0.00 sec)
The test data
NOW() is 2010-05-03
mysql> select * from table_a; select * from table_b;
+----+
| id |
+----+
| 1 |
| 2 |
| 3 |
+----+
3 rows in set (0.00 sec)
+----+------------+------------+
| id | date_sold | total_sold |
+----+------------+------------+
| 1 | 2010-04-24 | 2 |
| 1 | 2010-04-24 | 2 |
| 1 | 2010-04-30 | 2 |
| 1 | 2010-05-03 | 1 |
| 2 | 2010-05-03 | 4 |
+----+------------+------------+
5 rows in set (0.00 sec)