MySQL recursive subtracting and multiplying grouped values - mysql

Couldn't really explain my problem with words, but with an example I can show it clearly:
I have a table like this:
id  num  val  flag
0   3    10   1
1   5    12   2
2   7    12   1
3   11   15   2
I want to go through all the rows, calculate the increase in "num" since the previous row with the same "flag" (starting from 0), and multiply that difference by the "val" value. Then I want to add these results together, grouped by the "flag" values.
This is the mathematical equation, that I want to run on the table:
Result_1 = (3-0)*10 + (7-3)*12
Result_2 = (5-0)*12 + (11-5)*15
78 = Result_1
150 = Result_2
Thank you.

Interesting question. Unfortunately MySQL before 8.0 doesn't support recursive queries, so you'll need to be a little creative here. Something like this, using user variables to remember the previous row, could work:
select flag,
       sum(calc)
from (
    select flag,
           (num - if(@prevflag = flag, @prevnum, 0)) * val calc,
           @prevnum := num prevnum,
           @prevflag := flag prevflag
    from yourtable
    join (select @prevnum := 0, @prevflag := 0) t
    order by flag, num
) t
group by flag
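To check the arithmetic outside the database, here is a minimal Python sketch of the same "remember the previous row" idea. The function name `grouped_increase_sum` and the tuple layout are my own assumptions for illustration; the rows are the ones from the question.

```python
# Rows mirror the question's table: (id, num, val, flag).
rows = [(0, 3, 10, 1), (1, 5, 12, 2), (2, 7, 12, 1), (3, 11, 15, 2)]


def grouped_increase_sum(rows):
    """Sum (num - previous num within the same flag) * val for each flag."""
    totals = {}
    prev_num = {}  # last num seen per flag; a flag not seen yet counts as 0
    # Sort by flag, then num, so each row follows its predecessor within the flag.
    for _id, num, val, flag in sorted(rows, key=lambda r: (r[3], r[1])):
        increase = num - prev_num.get(flag, 0)
        totals[flag] = totals.get(flag, 0) + increase * val
        prev_num[flag] = num
    return totals


result = grouped_increase_sum(rows)
print(result)
```

For the sample rows this reproduces Result_1 and Result_2 from the question, one entry per flag.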

Related

SQL to club records in sequence

I have data in MySQL table, my data looks like
Key, value
A 1
A 2
A 3
A 6
A 7
A 8
A 9
B 1
B 2
and I want to group it based on the continuous sequence. Data is sorted in the table.
Key, min, max
A 1 3
A 6 9
B 1 2
I tried googling it but couldn't find any solution. Can someone please help me with this?
This is way easier with a modern DBMS that supports window functions, but you can find the upper bounds by checking that there is no successor. In the same way you can find the lower bounds via the absence of a predecessor. By combining each lower bound with the lowest upper bound that is not below it, we get the intervals.
select low.keyx, low.valx, min(high.valx)
from (
    select t1.keyx, t1.valx from t t1
    where not exists (
        select 1 from t t2
        where t1.keyx = t2.keyx
          and t1.valx = t2.valx + 1
    )
) as low
join (
    select t3.keyx, t3.valx from t t3
    where not exists (
        select 1 from t t4
        where t3.keyx = t4.keyx
          and t3.valx = t4.valx - 1
    )
) as high
    on low.keyx = high.keyx
   and low.valx <= high.valx
group by low.keyx, low.valx;
I changed your identifiers, since key is a reserved word (and value is a keyword) in MySQL.
Using a window function is way more compact and efficient. If at all possible, consider upgrading to MySQL 8+, it is superior to 5.7 in so many aspects.
We can create a group by looking at the difference between valx and an enumeration of the vals, if there is a gap the difference increases. Then, we simply pick min and max for each group:
select keyx, min(valx), max(valx)
from (
    select keyx, valx
         , valx - row_number() over (partition by keyx order by valx) as grp
    from t
) as tt
group by keyx, grp;
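The "value minus row number" trick is easier to see with the numbers written out, so here is a small Python sketch of it. The helper name `islands` is an assumption of mine, and it assumes the values are distinct within each key, as in the sample data.

```python
from itertools import groupby

# (keyx, valx) pairs from the question.
rows = [("A", 1), ("A", 2), ("A", 3), ("A", 6), ("A", 7), ("A", 8), ("A", 9),
        ("B", 1), ("B", 2)]


def islands(rows):
    """Return (key, min, max) for each run of consecutive values per key."""
    ranges = []
    for key, key_rows in groupby(sorted(rows), key=lambda r: r[0]):
        # valx - row_number stays constant inside a run and grows at every gap.
        numbered = [(val - n, val)
                    for n, (_key, val) in enumerate(key_rows, start=1)]
        for _grp, members in groupby(numbered, key=lambda pair: pair[0]):
            vals = [val for _g, val in members]
            ranges.append((key, min(vals), max(vals)))
    return ranges


result = islands(rows)
print(result)
```

For key A the differences come out as 0, 0, 0, 2, 2, 2, 2, which is exactly the grp column the window-function query groups on.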

Reorganizing mysql aggregate row into single piece rows

Consider the following mysql table:
ID WeightS AmountS WeightM AmountM WeightL AmountL Someothercolumnshere
1 6 3 10 2 18 2 ...
I need to reorganize this data into a pivot-friendly table, where each piece in the amount columns should be one result row. E.g. from the first two columns, WeightS and AmountS, the SELECT should produce 3 result rows, each having a weight of 2 kgs (=6 kgs total). So the full result table should be like this:
Weight Someothercolumnshere
2 ...
2 ...
2 ...
5 ...
5 ...
9 ...
9 ...
I don't even know if there's a SQL syntax which is able to do this kind of operation? I've never had a request like this before. Worst case scenario, I have to do it in php instead, but I think MYSQL is a lot more fun :p
I've built the schema on sqlfiddle, but I'm afraid that's all I've got.
You need a Tally table for a task like this. Create as many rows in it as the largest amount you expect.
Create table Tally(`N` int);
insert into Tally( `N`) values(1),(2),(3),(4),(5);
Then
(select `ID`, `WeightS`/`AmountS` as `Weight`, `Someothercolumnshere`
from Catches
join Tally on Catches.`AmountS` >= Tally.`N`
)
UNION ALL
(select `ID`, `WeightM`/`AmountM`, `Someothercolumnshere`
from Catches
join Tally on Catches.`AmountM` >= Tally.`N`
)
UNION ALL
(select `ID`, `WeightL`/`AmountL`, `Someothercolumnshere`
from Catches
join Tally on Catches.`AmountL` >= Tally.`N`
)
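If you want to try the tally join without a MySQL server, the following sketch runs the same idea against an in-memory SQLite database from Python. SQLite is only a stand-in here (my assumption, not part of the original setup), and the placeholder column Someothercolumnshere is left out. Note that SQLite does integer division on these INTEGER columns, while MySQL's / operator returns a decimal.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Catches (ID INTEGER,
                          WeightS INTEGER, AmountS INTEGER,
                          WeightM INTEGER, AmountM INTEGER,
                          WeightL INTEGER, AmountL INTEGER);
    INSERT INTO Catches VALUES (1, 6, 3, 10, 2, 18, 2);
    CREATE TABLE Tally (N INTEGER);
    INSERT INTO Tally (N) VALUES (1), (2), (3), (4), (5);
""")

# Each join repeats a Catches row once per tally number up to its amount.
query = """
    SELECT ID, WeightS / AmountS AS Weight
    FROM Catches JOIN Tally ON Catches.AmountS >= Tally.N
    UNION ALL
    SELECT ID, WeightM / AmountM
    FROM Catches JOIN Tally ON Catches.AmountM >= Tally.N
    UNION ALL
    SELECT ID, WeightL / AmountL
    FROM Catches JOIN Tally ON Catches.AmountL >= Tally.N
    ORDER BY Weight
"""
weights = [weight for _id, weight in conn.execute(query)]
print(weights)
```

The Tally table must hold at least as many rows as the largest amount, otherwise pieces are silently dropped.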

Mysql single column result to multiple column result

I have a problem with a MySQL query, the problem is I have the following table:
id rep val dates
1 rep1 200 06/01/2014
2 rep2 300 06/01/2014
3 rep3 400 06/01/2014
4 rep4 500 06/01/2014
5 rep5 100 06/01/2014
6 rep1 200 02/06/2014
7 rep2 300 02/06/2014
8 rep3 900 02/06/2014
9 rep4 700 02/06/2014
10 rep5 600 02/06/2014
and I want a result like this:
rep 01/06/2014 02/06/2014
rep1 200 200
rep2 300 300
rep3 400 900
rep4 500 700
rep5 100 600
thank you very much!
You seem to want the most recent row for each rep. Here is an approach that often performs well:
select t.*
from table t
where not exists (select 1
from table t2
where t2.repid = t.repid and
t2.id > t.id
);
This transforms the problem to: "Get me the rows in table t where there is no other row with the same repid and a larger id." That is the same logic as getting the last one, just convoluted a bit to help the database know what to do.
For performance reasons, an index on t(repid, id) is helpful.
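Here is a hedged sketch of that NOT EXISTS pattern, run from Python against in-memory SQLite as a stand-in for MySQL. The table name t and the column rep (instead of repid) are assumptions chosen to match the sample data in the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (id INTEGER, rep TEXT, val INTEGER)")
conn.executemany(
    "INSERT INTO t VALUES (?, ?, ?)",
    [(1, "rep1", 200), (2, "rep2", 300), (3, "rep3", 400), (4, "rep4", 500),
     (5, "rep5", 100), (6, "rep1", 200), (7, "rep2", 300), (8, "rep3", 900),
     (9, "rep4", 700), (10, "rep5", 600)],
)

# Keep a row only if no other row for the same rep has a larger id.
latest = conn.execute("""
    SELECT t1.id, t1.rep, t1.val
    FROM t AS t1
    WHERE NOT EXISTS (
        SELECT 1 FROM t AS t2
        WHERE t2.rep = t1.rep AND t2.id > t1.id
    )
    ORDER BY t1.id
""").fetchall()
print(latest)
```

This returns one row per rep (ids 6 to 10 for the sample data), which is a different shape from the two-column result the question asks for.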
You seem to want the val for each of the dates.
Assuming the dates you are interested in are fixed, you can do that as follows. For each output date column you check whether the row matches the date for that column. If so, you use the value of val; if not, you just use 0. Then you sum all the resulting values, grouping by rep. I have assumed a fixed date format.
SELECT rep, SUM(IF(dates='2014/06/01', val, 0)) AS '2014/06/01', SUM(IF(dates='2014/06/02', val, 0)) AS '2014/06/02'
FROM sometable
GROUP BY rep
Or if you just wanted the highest val for each day
SELECT rep, MAX(IF(dates='2014/06/01', val, 0)) AS '2014/06/01', MAX(IF(dates='2014/06/02', val, 0)) AS '2014/06/02'
FROM sometable
GROUP BY rep
If the number of dates is variable, then there is not really a direct way to do it (as the number of resulting columns would vary). It would be easiest to do this mainly in your calling script, based on the following query, which gives you one row per rep / possible date with the sum of val for that rep / date combination:
SELECT rep, sub0.dates, SUM(IF(sometable.dates=sub0.dates, val, 0))
FROM sometable
CROSS JOIN
(
SELECT DISTINCT dates
FROM sometable
) sub0
GROUP BY rep, sub0.dates
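As a rough illustration of doing the variable-column case from the calling script, here is a Python sketch that first reads the distinct dates and then builds one conditional SUM per date. It runs on in-memory SQLite as a stand-in for MySQL, so it uses CASE WHEN instead of IF(); the function name `pivot_by_date` and the 'YYYY/MM/DD' date strings are assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sometable (id INTEGER, rep TEXT, val INTEGER, dates TEXT)")
conn.executemany(
    "INSERT INTO sometable VALUES (?, ?, ?, ?)",
    [(1, "rep1", 200, "2014/06/01"), (2, "rep2", 300, "2014/06/01"),
     (3, "rep3", 400, "2014/06/01"), (4, "rep4", 500, "2014/06/01"),
     (5, "rep5", 100, "2014/06/01"), (6, "rep1", 200, "2014/06/02"),
     (7, "rep2", 300, "2014/06/02"), (8, "rep3", 900, "2014/06/02"),
     (9, "rep4", 700, "2014/06/02"), (10, "rep5", 600, "2014/06/02")],
)


def pivot_by_date(conn):
    """Return (dates, rows) with one summed val column per distinct date."""
    dates = [d for (d,) in conn.execute(
        "SELECT DISTINCT dates FROM sometable ORDER BY dates")]
    if not dates:
        return [], []
    # One placeholder per date keeps the date values out of the SQL text.
    columns = ", ".join(
        "SUM(CASE WHEN dates = ? THEN val ELSE 0 END)" for _ in dates)
    sql = f"SELECT rep, {columns} FROM sometable GROUP BY rep ORDER BY rep"
    return dates, conn.execute(sql, dates).fetchall()


dates, pivot_rows = pivot_by_date(conn)
print(dates)
for row in pivot_rows:
    print(row)
```

The script, not the database, decides how many columns there are, which is why this part has to live outside a single fixed query.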

how can I tell if the last x rows of 'state' = 1

I need help with a SQL query.
I have a table with a 'state' column. 0 means closed and 1 means opened.
Different users want to be notified after there have been x consecutive 1 events.
With an SQL query, how can I tell if the last x rows of 'state' = 1?
If, for example, you want to check whether the last 5 rows all have a state equal to 1, then here's how you could probably do it:
SELECT IF(SUM(x.state) = 5, 1, 0) AS is_consecutive
FROM (
SELECT state
FROM table
WHERE Processor = 3
ORDER BY Status_datetime DESC
LIMIT 5
) as x
If is_consecutive = 1, then yes, the last 5 rows all have state = 1.
Edit : As suggested in the comments, you'll have to use ORDER BY in your query, to get the last nth rows.
And for more accuracy, since you have a timestamp column, you should use Status_datetime to order the rows.
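For a variable x, the same check can be wrapped in a small function. This is only a sketch on in-memory SQLite as a stand-in for MySQL; the table name events, the column status_datetime and the function name `last_x_all_open` are assumptions. It also compares the row count, so a table with fewer than x rows does not pass.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE events (state INTEGER, status_datetime TEXT)")
conn.executemany(
    "INSERT INTO events VALUES (?, ?)",
    [(0, "2024-01-01 10:00:00"), (1, "2024-01-01 10:01:00"),
     (1, "2024-01-01 10:02:00"), (1, "2024-01-01 10:03:00")],
)


def last_x_all_open(conn, x):
    """True if at least x rows exist and the newest x all have state = 1."""
    count, open_count = conn.execute(
        """
        SELECT COUNT(*), COALESCE(SUM(state), 0)
        FROM (SELECT state FROM events
              ORDER BY status_datetime DESC
              LIMIT ?)
        """,
        (x,),
    ).fetchone()
    return count == x and open_count == x


print(last_x_all_open(conn, 3))
print(last_x_all_open(conn, 4))
```

With the four sample rows, the newest three are open but the fourth-newest is closed, so x = 3 passes and x = 4 does not.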
You should be able to use something like this (replace the number in the HAVING with the value of x you want to check for):
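SELECT Processor, SUM(Status) AS OpenCount
FROM
(
    -- TOP is SQL Server syntax; in MySQL drop "TOP 10" and add "LIMIT 10" after the ORDER BY
    SELECT TOP 10 Processor, Status
    FROM YourTable
    WHERE Processor = 3
    ORDER BY DateTime DESC
) AS LastRows
GROUP BY Processor
HAVING SUM(Status) >= 10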

Need Help streamlining a SQL query to avoid redundant math operations in the WHERE and SELECT

Hey everyone, I am working on a query and am unsure how to make it process as quickly as possible and with as little redundancy as possible. I am really hoping someone here can help me come up with a good way of doing this.
Thanks in advance for the help!
Okay, so here is what I have as best I can explain it. I have simplified the tables and math to just get across what I am trying to understand.
Basically I have a smallish table that never changes and will always only have 50k records like this:
Values_Table
ID Value1 Value2
1 2 7
2 2 7.2
3 3 7.5
4 33 10
….50000 44 17.2
And a couple tables that constantly change and are rather large, eg a potential of up to 5 million records:
Flags_Table
Index Flag1 Type
1 0 0
2 0 1
3 1 0
4 1 1
….5,000,000 1 1
Users_Table
Index Name ASSOCIATED_ID
1 John 1
2 John 1
3 Paul 3
4 Paul 3
….5,000,000 Richard 2
I need to tie all 3 tables together. The most results that are likely to ever be returned from the small table is somewhere in the neighborhood of 100 results. The large tables are joined on the index and these are then joined to the Values_Table ON Values_Table.ID = Users_Table.ASSOCIATED_ID …. That part is easy enough.
Where it gets tricky for me is that I need to return, as quickly as possible, a list limited to 10 results where value1 and value2 are mathematically operated on to return a new_value, where that new_value is less than 10, the result is sorted by that new_value, and any other WHERE conditions I need can be applied to the flags. I do need to be able to move along the limit, e.g. LIMIT 0,10 / 10,10 / 20,10 etc.
In a subsequent (or the same if possible) query I need to get the top 10 count of all types that matched that criteria before the limit was applied.
So for example I want to join all of these and return anything where Value1 + Value2 < 10 AND I also need the count.
So what I want is:
Index Name Flag1 New_Value
1 John 0 9
2 John 0 9
5000000 Richard 1 9.2
The second response would be:
ID (not index) Count
1 2
2 1
I tried this a few ways and ultimately came up with the following somewhat ugly query:
SELECT INDEX, NAME, Flag1, (Value1 * some_variable + Value2) as New_Value
FROM Values_Table
JOIN Users_Table ON ASSOCIATED_ID = ID
JOIN Flags_Table ON Flags_Table.Index = Users_Table.Index
WHERE (Value1 * some_variable + Value2) < 10
ORDER BY New_Value
LIMIT 0,10
And then for the count:
SELECT ID, COUNT(TYPE) as Count, (Value1 * some_variable + Value2) as New_Value
FROM Values_Table
JOIN Users_Table ON ASSOCIATED_ID = ID
JOIN Flags_Table ON Flags_Table.Index = Users_Table.Index
WHERE (Value1 * some_variable + Value2) < 10
GROUP BY TYPE
ORDER BY New_Value
LIMIT 0,10
Being able to filter on the different flags and such in my WHERE clause is important. That may sound stupid to comment on, but I mention it because, from what I could see, a quicker method would have been to use a HAVING clause, and I don't believe that will work in certain instances, depending on what I want my WHERE clause to filter against.
And when filtering using the flags table :
SELECT INDEX, NAME, Flag1, (Value1 * some_variable + Value2) as New_Value
FROM Values_Table
JOIN Users_Table ON ASSOCIATED_ID = ID
JOIN Flags_Table ON Flags_Table.Index = Users_Table.Index
WHERE (Value1 * some_variable + Value2) < 10 AND Flag1 = 0
ORDER BY New_Value
LIMIT 0,10
...filtered count:
SELECT ID, COUNT(TYPE) as Count, (Value1 * some_variable + Value2) as New_Value
FROM Values_Table
JOIN Users_Table ON ASSOCIATED_ID = ID
JOIN Flags_Table ON Flags_Table.Index = Users_Table.Index
WHERE (Value1 * some_variable + Value2) < 10 AND Flag1 = 0
GROUP BY TYPE
ORDER BY New_Value
LIMIT 0,10
That works fine but has to run the math multiple times for each row, and I get the nagging feeling that it is also running the math multiple times on the same row in the Values_table table. My thought was that I should just get only the valid responses from the Values_table first and then join those to the other tables to cut down on the processing; with how SQL optimizes things though I wasn't sure if it might not already be doing that. I know I could use a HAVING clause to only run the math once if I did it that way but I am uncertain how I would then best join things.
My questions are:
1. Can I avoid running that math twice and still make the query work? (Or, I suppose, if there is a good way to make the first one work as well, that would be great.)
2. What is the fastest way to do this, as this is something that will be running very often?
It seems like this should be painfully simple but I am just missing something stupid.
I contemplated pulling into a temp table then joining that table to itself but that seems like I would trade math for iterations against the table and still end up slow.
Thank you all for your help in this and please let me know if I need to clarify anything here!
** To clarify on a question, I can't use a 3rd column with the values pre-calculated because in reality the math is much more complex than addition; I just simplified it for illustration's sake.
Do you have a benchmark query to compare against? Usually it doesn't work to try to outsmart the optimizer. If you have acceptable performance from a starting query, then you can see where extra work is being expended (indicated by disk reads, cache consumption, etc.) and focus on that.
Avoid the temptation to break it into pieces and solve those. That's an antipattern. That includes temp tables especially.
Redundant math is usually ok - what hurts is disk activity. I've never seen a query that needed CPU work reduction on pure calculations.
Gather all matching rows in a temp table, so the expression is written only once:
CREATE TEMPORARY TABLE TempTable AS
SELECT *
FROM (
    SELECT Users_Table.`Index`, Name, Type, ID, Flag1, (Value1 + Value2) AS New_Value
    FROM Values_Table
    JOIN Users_Table ON ASSOCIATED_ID = ID
    JOIN Flags_Table ON Flags_Table.`Index` = Users_Table.`Index`
) AS calc
WHERE New_Value < 10;
Return the result for the first query (sort and page here, so the counts below still cover every match):
SELECT `Index`, Name, Flag1, New_Value
FROM TempTable
ORDER BY New_Value
LIMIT 0,10;
Return the counts per ID:
SELECT ID, COUNT(Type) AS `Count`
FROM TempTable
GROUP BY ID;
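If you would rather not materialise anything, a derived table also lets you write the expression once and filter on its alias. The sketch below runs on in-memory SQLite from Python as a stand-in for MySQL, with the sample rows from the question; the column name idx (replacing Index, which is a keyword) and the constant `MATCHES` are assumptions. Whether the engine actually evaluates the expression only once per row is up to its optimizer, so treat this as a readability fix first and measure before assuming a speed-up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Values_Table (ID INTEGER, Value1 REAL, Value2 REAL);
    CREATE TABLE Users_Table (idx INTEGER, Name TEXT, ASSOCIATED_ID INTEGER);
    CREATE TABLE Flags_Table (idx INTEGER, Flag1 INTEGER, Type INTEGER);
    INSERT INTO Values_Table VALUES (1, 2, 7), (2, 2, 7.2), (3, 3, 7.5), (4, 33, 10);
    INSERT INTO Users_Table VALUES (1, 'John', 1), (2, 'John', 1), (3, 'Paul', 3),
                                   (4, 'Paul', 3), (5, 'Richard', 2);
    INSERT INTO Flags_Table VALUES (1, 0, 0), (2, 0, 1), (3, 1, 0), (4, 1, 1), (5, 1, 1);
""")

# The expression appears once, inside the derived table v; the outer WHERE
# filters on its alias.
MATCHES = """
    SELECT v.ID, u.idx, u.Name, f.Flag1, v.New_Value
    FROM (SELECT ID, Value1 + Value2 AS New_Value FROM Values_Table) AS v
    JOIN Users_Table AS u ON u.ASSOCIATED_ID = v.ID
    JOIN Flags_Table AS f ON f.idx = u.idx
    WHERE v.New_Value < 10
"""

# First page of results, sorted by the computed value.
page = conn.execute(
    f"SELECT idx, Name, Flag1, New_Value FROM ({MATCHES}) AS m "
    "ORDER BY New_Value, idx LIMIT 10 OFFSET 0"
).fetchall()

# Counts per ID over every match, independent of the page limit.
counts = conn.execute(
    f"SELECT ID, COUNT(*) FROM ({MATCHES}) AS m GROUP BY ID ORDER BY ID"
).fetchall()

print(page)
print(counts)
```

Extra flag filters such as Flag1 = 0 can be appended to the WHERE inside `MATCHES` without touching the expression.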
Is there any chance that you can add a third column to the values_table with the pre-calculated value? Even if the result of your calculation is dependent on other variables, you could run the calculation for the whole table but only when those variables change.