MySQL same query result in multiple subqueries - mysql

I am trying to optimize the following query.
SELECT t3.*,
(SELECT SUM(t4.col_sum)
FROM (...) t4
WHERE t4.timestamp BETWEEN CONCAT(SUBSTR(t3.timestamp, 1, 11), "00:00:00") AND t3.timestamp)
AS cum_sum
FROM (...) t3
Here (...) is a placeholder for a long query. It returns 2 columns: timestamp and col_sum. I want to add a third column, a cumulative sum of col_sum.
The problem is that I am putting the same big query in two places (...).
Is there a way I can obtain the result once and reuse it in those two (or more) places (...)?

One method is to use a temporary table.
Probably a more efficient method is to use variables to calculate a cumulative sum. It would be something like:
select t.*,
(@c := if(@t = left(t.timestamp, 11), @c + t.col_sum,
if(@t := left(t.timestamp, 11), t.col_sum, t.col_sum)
)
) as cumesum
from (. . .) t cross join
(select @t := '', @c := 0) vars
order by t.timestamp;
The above query orders the rows by timestamp. The variable @t keeps track of the first 11 characters in the timestamp. As I read your logic, you want to do the cumulative sum only within a group where this is constant.
The variable @c keeps track of the cumulative sum, restarting from the current row's col_sum when a new "first 11 characters" value is encountered. The logic looks a bit complicated, but it is best to put all variable assignments in a single expression, because MySQL does not guarantee the order of evaluation of expressions.
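As an alternative to user variables, a common table expression lets the long query be written once, and a window function computes the per-day running total. The following is a minimal sketch, not a drop-in answer: it assumes MySQL 8.0 or later for the same SQL, it runs the SQL through Python's built-in sqlite3 module only so the example is self-contained, and the table name readings and its sample rows are hypothetical stand-ins for the (...) query.

```python
import sqlite3

# Hypothetical sample data standing in for the result of the long (...) query.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE readings (timestamp TEXT, col_sum INTEGER)")
conn.executemany(
    "INSERT INTO readings VALUES (?, ?)",
    [
        ("2021-01-01 08:00:00", 5),
        ("2021-01-01 09:00:00", 3),
        ("2021-01-02 08:00:00", 7),
        ("2021-01-02 10:00:00", 1),
    ],
)

QUERY = """
WITH big AS (                      -- the long query is written only once
    SELECT timestamp, col_sum FROM readings
)
SELECT timestamp,
       col_sum,
       SUM(col_sum) OVER (
           PARTITION BY substr(timestamp, 1, 10)   -- one group per calendar day
           ORDER BY timestamp                      -- running total in time order
       ) AS cum_sum
FROM big
ORDER BY timestamp
"""

rows = conn.execute(QUERY).fetchall()
for row in rows:
    print(row)
```

The running total restarts at the first row of each day, which matches the BETWEEN "00:00:00" AND t3.timestamp condition in the question.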

Related

How to easily get row number when using LIMIT in MySQL?

Suppose I have a database table with quite a few rows which I want to display to a user. It would make sense to LIMIT the output, and make pages of rows. In MySQL I would do this:
SELECT * FROM myTable ORDER BY myValue LIMIT 120,10
which will show 10 rows starting from row 120. So MySQL must use, internally, some kind of order, and has numbered the rows accordingly. I would like to display the row number with each row. How do I get access to these numbers, using only MySQL? To be clear, I am looking for something like this:
SELECT *,<LIMIT_ROWNO> FROM myTable ORDER BY myValue LIMIT 120,10
I looked online and in the manual, but I cannot find it. I would prefer something simple, without using variables, or functions. Isn't there a predefined expression for this?
I can solve this problem easily in PHP, but it would be more logical to get the row numbers from MySQL.
You can't do it without using variables, e.g.:
SELECT m.*, @rownum := @rownum + 1 as `num`
FROM myTable m, (SELECT @rownum := 120) a
ORDER BY myValue LIMIT 120,10;
set @rownum=120;
SELECT *,@rownum:=@rownum+1 as rn FROM myTable ORDER BY myValue LIMIT 120,10;
As of the end of 2021, why not:
SELECT
t1.*,
COUNT(*) OVER (PARTITION BY RowCounter) as TotalRecords
FROM (
SELECT a, b, c, 1 as RowCounter
FROM MyTable
) t1
LIMIT 120,10
Using a subquery that marks every row with the same value lets the PARTITION BY window count all rows sharing that value. Note that this returns the total number of records on every row, not a per-row number.
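For an actual per-row number, the ROW_NUMBER() window function is one option. The sketch below assumes MySQL 8.0 or later for the same SQL and uses Python's built-in sqlite3 module only to make it runnable; the 200 sample rows are made up, and LIMIT 10 OFFSET 120 is the long form of LIMIT 120,10.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE myTable (myValue INTEGER)")
# Hypothetical data: the values 1..200, inserted in reverse order on purpose.
conn.executemany("INSERT INTO myTable VALUES (?)", [(v,) for v in range(200, 0, -1)])

QUERY = """
SELECT ROW_NUMBER() OVER (ORDER BY myValue) AS rn,   -- numbered before LIMIT applies
       myValue
FROM myTable
ORDER BY myValue
LIMIT 10 OFFSET 120
"""

page = conn.execute(QUERY).fetchall()
for rn, value in page:
    print(rn, value)
```

Because window functions are evaluated before LIMIT, the first row of the page carries number 121 rather than 1.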

Numbering rows in groups with MySQL: how does it work?

There are several good posts on how to number rows within groups with MySQL, but how does the code actually work? I'm unclear on what MySQL evaluates first in the code below.
For instance, placing @yearqt := yearqt as bloc before the IF() call produces different results, and I'm unclear on the role of the s1 subquery in initializing the @ variables: when are they updated as MySQL runs through the data rows? Is the order by statement run before the select?
The code below selects three random records per yearqt group. There may be other ways to do this, but the question pertains to how the code works, not how I could do this differently or whether I can do this more efficiently. Thank you.
select * from (
select customer_id , yearqt , a ,
IF(@yearqt = yearqt , @rownum := @rownum + 1 , @rownum := 1) as rownum ,
@yearqt := yearqt as bloc
from
( select customer_id , yearqt , rand(123) as a from tbl
order by rand(123)
) a join ( select @rownum := 0 , @yearqt := '' ) s1
order by yearqt
) s2
where rownum <= 3
order by bloc
This question is related to how the engine retrieves SQL SELECT query results. The order is roughly the following:
Calculate explain plan
Calculate sets and join them using plan's directives (FROM / JOIN phase)
Apply WHERE clause
Apply GROUP BY/HAVING clause
Apply ORDER BY clause
Projection phase: every row returned is ordered and can now be 'displayed'.
So, with respect to the variables, you can now see why there's a subquery to initialize them. This subquery is evaluated only once, at the beginning of the process.
After that, the projection phase seems to treat each selected attribute in the order you wrote it, which is why moving @yearqt := yearqt as bloc up one attribute would change the outcome of the IF expression. Since each row is projected once, any work you do on the variables is done as many times as there are rows in the final result set.
The purpose of this
join ( select @rownum := 0 , @yearqt := '' ) s1
is to initialize the user-defined variables at the beginning of statement execution. Because this is a rowsource for the outer query (MySQL calls it a derived table) this will be executed BEFORE the outer query runs. We aren't really interested in what this query returns, except that it returns a single row, because of the JOIN operation.
So this inline view s1 could be omitted from the query and be replaced by a couple of SET statements that are executed immediately before the query:
SET @rownum := 0;
SET @yearqt := '';
But then we'd have three separate statements to run, and we'd get different output from the query if these weren't run, or if those variables had been set to some other value. By including this in the query itself, it's a single statement, and we remove the dependency on separate SET statements.
This is the query that's really doing the work, whittled down to just the two expressions that matter in this case:
SELECT IF(@yearqt = t.yearqt , @rownum := @rownum + 1 , @rownum := 1) as rownum
, @yearqt := t.yearqt as bloc
FROM ( ... ) t
ORDER BY t.yearqt
Some key points that make this "work":
MySQL processes the expressions in the SELECT list in the order that they appear in the SELECT list.
MySQL processes the rows in the order specified in the ORDER BY.
The references to user-defined variables are evaluated for each row, not once at the beginning of the statement.
Note that the MySQL Reference Manual points out that this behavior is not guaranteed. (So, it may change in a future release.)
So, the processing of that can be described as
for the first expression:
compare the value of the yearqt column from the current row with current value of @yearqt user-defined variable
set the value of @rownum user-defined variable
return the result of the IF() expression in the resultset
for the second expression:
set the value of the @yearqt user-defined variable to the value of the yearqt column from the current row
return the value of the yearqt column in the resultset
The net effect is that for each row processed, we're comparing the value in the yearqt column to the value from the "previously" processed row, and we're saving the current value to compare to the next row.
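The per-row processing described above can be imitated in plain Python. This is only an illustrative model of the observed behavior, not how MySQL is implemented; the function name number_rows, the assign_first flag, and the sample yearqt values are invented for the sketch.

```python
def number_rows(yearqt_values, assign_first=False):
    """Model the per-row evaluation of the two SELECT expressions.

    yearqt_values must already be sorted, mirroring ORDER BY yearqt.
    assign_first=True models placing '@yearqt := yearqt' before the IF().
    """
    rownum = 0   # plays the role of @rownum, initialized by the s1 subquery
    prev = ""    # plays the role of @yearqt, initialized by the s1 subquery
    result = []
    for current in yearqt_values:
        if assign_first:
            prev = current  # assignment happens before the comparison
        # IF(@yearqt = yearqt, @rownum := @rownum + 1, @rownum := 1)
        rownum = rownum + 1 if prev == current else 1
        # @yearqt := yearqt (remember this row for the next comparison)
        prev = current
        result.append((current, rownum))
    return result


data = ["2020Q1", "2020Q1", "2020Q2", "2020Q2", "2020Q2"]
in_order = number_rows(data)
swapped = number_rows(data, assign_first=True)
print(in_order)
print(swapped)
```

With the original expression order the counter restarts at each new yearqt; with the assignment moved first, the comparison is always true and the counter never restarts, which is the "different results" the question mentions.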

Getting the top N results of every category in a table

I'd like to extract the top 10 results of a certain category within a table, ordered by date. My table looks like
CREATE TABLE IF NOT EXISTS Table
( name VARCHAR(50)
, category VARCHAR(50)
, date TIMESTAMP DEFAULT CURRENT_TIMESTAMP
);
So far I've come up with SELECT category FROM Table GROUP BY category;, this will give me every category in store.
Next I need to run SELECT * FROM Table WHERE category=$categ ORDER BY date DESC LIMIT 10; in some kind of foreach loop for every $categ fed to me by the first instruction.
I'd like to do all of this in MySQL, if possible; I've come across several answers online but they all seem to involve two or more tables, or provide difficult examples that seem hard to understand... It would seem silly to me that something that can be dealt with so simply in server code (doesn't even create that much overhead, apart from the needless storage of the category names) is so difficult to translate into SQL code, but if nothing works that's what I'll end up doing, I guess.
You can use an inline view and user-defined variables to set a "row number" column, and then the outer query can filter based on the "row number" column. (Doing this, we can emulate a ROW_NUMBER analytic function.)
For large sets, this may not be the most efficient approach, but it works reasonably for small sets.
The outer query would look something like this:
SELECT q.*
FROM (
<view_query>
) q
WHERE q.row_num <= 10
ORDER
BY q.category, q.date DESC, q.name
The view query would be something like this:
SELECT IF(@cat = t.category,@i := @i + 1, @i := 1) AS row_num
, @cat := t.category AS category
, t.date
, t.name
FROM mytable t
CROSS
JOIN ( SELECT @i := 0, @cat := NULL ) i
ORDER BY t.category, t.date DESC
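Where window functions are available, the emulated row number can be replaced by ROW_NUMBER() with PARTITION BY. A minimal sketch, assuming MySQL 8.0 or later for the same SQL; it is run through Python's built-in sqlite3 module only to be self-contained, the sample rows are hypothetical, and TOP_N is set to 2 instead of 10 to keep the data small.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE mytable (name TEXT, category TEXT, date TEXT)")
# Hypothetical sample rows: three in category A, two in category B.
conn.executemany(
    "INSERT INTO mytable VALUES (?, ?, ?)",
    [
        ("a1", "A", "2021-01-01"),
        ("a2", "A", "2021-01-02"),
        ("a3", "A", "2021-01-03"),
        ("b1", "B", "2021-01-01"),
        ("b2", "B", "2021-01-02"),
    ],
)

TOP_N = 2  # the question asks for 10

QUERY = """
SELECT q.category, q.date, q.name, q.row_num
FROM (
    SELECT t.*,
           ROW_NUMBER() OVER (
               PARTITION BY t.category      -- numbering restarts per category
               ORDER BY t.date DESC         -- newest row gets number 1
           ) AS row_num
    FROM mytable t
) q
WHERE q.row_num <= ?
ORDER BY q.category, q.date DESC, q.name
"""

top = conn.execute(QUERY, (TOP_N,)).fetchall()
for row in top:
    print(row)
```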

Mysql how to sum column with previous column sum

In XLS I have two columns A and B.
A,B
1,1
2,3
2,5
1,6
5,11
2,13
Column A holds a value, and column B is calculated with the formula (A + (previous row's B value)).
How can I do this calculation in MySQL?
I'm trying to join the same table twice, and I can get the previous row's A column next to B.
I can sum them, but how can I sum them with this formula?
XLS formula looks like this:
H20 = H19+G20
This is my SQL created from suggestions.
SELECT
date, time, sum, @b := sum+@b as 'AccSum', count
FROM
(SELECT
t.date, t.time, t.sum, t.count
FROM TMP_DATA_CALC t
ORDER BY t.epoch) as tb
CROSS JOIN
(SELECT @b := 0) AS var
;
SELECT A, @b := A+@b AS B
FROM (SELECT A
FROM YourTable
ORDER BY id) AS t
CROSS JOIN
(SELECT @b := 0) AS var
The user variable @b holds the value of B from the previous row, allowing you to add the current row's A to it.
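The recurrence behind the spreadsheet formula, B(n) = B(n-1) + A(n), is a plain running total. As a quick check of the expected output, here is a small Python sketch using the column A values from the question; it only verifies the arithmetic and says nothing about how MySQL evaluates the query.

```python
from itertools import accumulate

# Column A from the question's sample sheet.
a_values = [1, 2, 2, 1, 5, 2]

# Each B is the previous B plus the current A, i.e. a running total.
b_values = list(accumulate(a_values))

for a, b in zip(a_values, b_values):
    print(f"{a},{b}")
```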
DEMO
http://sqlfiddle.com/#!2/74488/2/1 shows how to select the data.
SET @runtot:=0;
Select a,b, @runtot:=@runtot+a from b
However, there's an underlying problem: since you don't have a defined order, MySQL may process the rows in any order, so the running total can be unpredictable.
runtot = running total.
Before version 8.0, MySQL had no window functions such as the PARTITION BY that Oracle has. You can use a cursor to achieve your requirement. Or we can write a function that takes a row number as input, adds the two values, and returns the result to the query.
select b from xsl limit rownum-1,1 + select a from xsl limit rownum,1

Creating Temp Variables within Queries

I would like to be able to create a temp variable within a query--not a stored proc nor function-- which will not need to be declared and set so that I don't need to pass the query parameters when I call it.
Trying to work toward this:
Select field1,
tempvariable=2+2,
newlycreatedfield=tempvariable*existingfield
From
table
Away from this:
DECLARE @tempvariable
SET @tempvariable = 2+2
Select field1,
newlycreatedfield=@tempvariable*existingfield
From
table
Thank you for your time
I may have overcomplicated the example; more simply, the following gives the error "Invalid column name 'QID'":
Select
QID = 1+1,
THN = QID + 1
If this is housed in a query, is there a workaround?
You can avoid derived tables and subqueries if you do a "hidden" assignment as a part of a complex concat_ws expression
Since the assignment is part of the expression of the ultimate desired value for the column, as opposed to sitting in its own column, you don't have to worry about whether MySQL will evaluate it in the correct order. Needless to say, if you want to use the temp var in multiple columns, then all bets are off :-/
caveat: I did this in MySQL 5.1.73; things might have changed in later versions
I wrap everything in concat_ws because it coalesces null args to empty strings, whereas concat does not.
I wrap the assignment to the var @stamp in an if so that it is "consumed" instead of becoming an arg to be concatenated. As a side note, I have guaranteed elsewhere that u.status_timestamp is populated when the user record is first created. Then @stamp is used in two places in date_format, both as the date to be formatted and in the nested if to select which format to use. The final concat is an hour range "h-h" which I have guaranteed elsewhere to exist if the c record exists, otherwise its null return is coalesced by the outer concat_ws as mentioned above.
SELECT
concat_ws( '', if( @stamp := ifnull( cs.checkin_stamp, u.status_timestamp ), '', '' ),
date_format( @stamp, if( timestampdiff( day, @stamp, now() )<120, '%a %b %e', "%b %e %Y" )),
concat( ' ', time_format( cs.start, '%l' ), '-', time_format( cs.end, '%l' ))
) AS as_of
FROM dbi_user AS u LEFT JOIN
(SELECT c.u_id, c.checkin_stamp, s.start, s.end FROM dbi_claim AS c LEFT JOIN
dbi_shift AS s ON(c.shift_id=s.id) ORDER BY c.u_id, c.checkin_stamp DESC) AS cs
ON (cs.u_id=u.id) WHERE u.status='active' GROUP BY u.id ;
A final note: while I happen to be using a derived table in this example, it is only because of the requirement to get the latest claim record and its associated shift record for each user. You probably won't need a derived table if a complex join is not involved in the computation of your temp var. This can be demonstrated by going to the first fiddle in @Fabien TheSolution's answer and changing the right hand query to
Select field1, concat_ws( '', if(@tempvariable := 2+2,'','') ,
@tempvariable*existingfield ) as newlycreatedfield
from table1
Likewise the second fiddle (which appears to be broken) would have a right hand side of
SELECT concat_ws( '', if(@QID := 2+2,'',''), @QID + 1) AS THN
You can do this with subqueries:
Select field1, tempvariable,
(tempvariable*existingfield) as newlycreatedfield
from (select t.*, (2+2) as tempvariable
from table t
) t;
Unfortunately, MySQL has a tendency to actually instantiate (i.e. create) a derived table for the subquery. Most other databases are smart enough to avoid this.
You can gamble that the following will work:
Select field1, (@tempvariable := 2+2) as tempvariable,
(@tempvariable*existingfield) as newlycreatedfield
From table t;
This is a gamble, because MySQL does not guarantee that the second expression in the select list is evaluated before the third. It seems to work in practice, but it is not guaranteed.
Why not just:
SET @sum = 4 + 7;
SELECT @sum;
Output:
+------+
| @sum |
+------+
|   11 |
+------+
source
You can do something like this :
SELECT field1, tv.tempvariable,
(tv.tempvariable*existingfield) AS newlycreatedfield
FROM table1
INNER JOIN (SELECT 2+2 AS tempvariable) AS tv
See SQLFIDDLE : http://www.sqlfiddle.com/#!2/8b0724/8/0
And to refer at your simplified example :
SELECT var.QID,
(var.QID + 1) AS THN
FROM (SELECT 1+1 as QID) AS var
See SQLFIDDLE : http://www.sqlfiddle.com/#!2/d41d8/19140/0