What does `(SELECT @rn := 0) var_init` do? - mysql

I have seen some MySQL queries that use
(SELECT @rn := 0, @ct := NULL) var_init
I don't know what it does. I have searched for a long time and still don't have an answer, so any help is much appreciated.
It would also be much appreciated if someone could explain this query to me:
SELECT * FROM (
SELECT c.*, @rn := IF(`type` != @ct, 1, @rn + 1) AS rownumber, @ct := `type` FROM jb_company c,
(SELECT @rn := 0, @ct := NULL) var_init ORDER BY `type`
) c
WHERE rownumber <= 20
I am using the above query to fetch a limited number of rows (i.e. 20) of each type from the table (see the linked question below, where I needed this):
Mysql query to fetch limited rows of each type
But I still don't understand the query. Could someone please help?
Thanks in advance.

It simply initializes the variables (@rn := 0, @ct := NULL). The result is a derived table aliased var_init that contains one row, and the rest of the query is joined to it, so it has no effect on the rows themselves other than setting up the variables at the start.
This is often used to avoid needing multiple statements to set up the variables. That single query is equal to:
SET @rn := 0;
SET @ct := NULL;
SELECT * FROM (
SELECT c.*,
@rn := IF(`type` != @ct, 1, @rn + 1) AS rownumber,
@ct := `type`
FROM jb_company c
ORDER BY `type`
) c
WHERE rownumber <= 20
That version is multiple statements, so the single-query form is usually used because of API limitations in the calling code, or to make sure the variables start out as they should on shared connections.
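To see that var_init really is just a one-row table, the derived table can be run on its own. This is a minimal sketch against the jb_company table from the question; nothing else is assumed.

```sql
-- The derived table by itself: exactly one row, two columns.
SELECT * FROM (SELECT @rn := 0, @ct := NULL) var_init;

-- Joining a one-row table leaves the row count unchanged,
-- so both statements below return the same number.
SELECT COUNT(*) FROM jb_company;
SELECT COUNT(*) FROM jb_company c, (SELECT @rn := 0, @ct := NULL) var_init;
```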

This is a MySQL construct used to define variables that are then used in the rest of the query.
By default, MySQL treats variables as strings. So, to get a numeric variable, you want to assign a number to the variable the first time it is seen. This can be done in a SET statement. However, a "single" query would then consist of multiple statements.
Often, variables in MySQL are used to approximate the window/analytic functions available in most other databases. Other databases do not encourage the use of such variables in queries (although they are typically allowed and can be useful under some -- more limited -- circumstances).
The query that you mention would be expressed as the following in most databases:
SELECT *
FROM (SELECT c.* ,
row_number() over (partition by `type` order by `type`) as rownumber
FROM jb_company c
) c
WHERE rownumber <= 20;
The way the MySQL version works is by creating a derived table and using the variables to add rows with the appropriate values.
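On MySQL 8.0 or later (and MariaDB 10.2 or later) the window-function form runs directly. The sketch below orders each partition by a hypothetical id column, an assumption that is not in the original table description, so that which 20 rows are kept per type is deterministic.

```sql
-- Assumes jb_company has a unique id column (hypothetical).
SELECT *
FROM (SELECT c.*,
             ROW_NUMBER() OVER (PARTITION BY `type` ORDER BY id) AS rownumber
      FROM jb_company c
     ) ranked
WHERE rownumber <= 20;
```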

Related

MySQL: user-variable definition within SQL statement to create counter column

Is it possible to create a counter in MySQL/MariaDB in one single SELECT statement? I've tried the following, but it returns only the value 1 in the first column:
SELECT @rownr := IF(ISNULL(@rownr),0,@rownr)+1 AS rowNumber, * FROM table_x LIMIT 0,10
If I run the statement again in the same MySQL session, it continues counting from the last number: the second time it starts at 2, the third time at 12. This means that the variable is created, but it seems to be available for modification only when it was instantiated before the SQL statement was issued.
It is possible, but a bit tricky. First, you need to declare the variable outside of the select clause (in a separate set assignment, or in a derived table). Also, it is safer to sort the rows in a subquery first, and then compute the variable.
I would recommend:
set @rn := 0;
select t.*, @rn := @rn + 1 rowNumber
from (select t.* from mytable t order by id limit 10) t
Note that I added an order by clause to the inner query, otherwise it is undefined in which sequence rows will be ordered (I assumed id).
Alternatively, you can declare the variable in a derived table:
select t.*, @rn := @rn + 1 rowNumber
from (select t.* from mytable t order by id limit 10) t
cross join (select @rn := 0) x
Finally: if you are running MySQL 8.0, just use row_number():
select t.*, row_number() over(order by id) rn
from mytable t
order by id
limit 10;
You don't have an order by, so the ordering is indeterminate. But you can initialize the parameter in the statement itself:
SELECT @rownr := (@rownr + 1) AS rowNumber, x.*
FROM table_x x CROSS JOIN
(SELECT @rownr := 0) params
LIMIT 0, 10;
If you want a particular ordering, you should use an order by in a subquery.
Also note that starting in MySQL 8, variable assignments in SELECT are deprecated. You should be using window functions (row_number()) in more recent versions.

MSSQL: ROW_NUMBER() in Sub-Queries

The query below was originally written for MySQL, with @rownum defined as a user variable.
SELECT CONCAT( z.expected, IF(z.got-1>z.expected, CONCAT(' thru ',z.got-1), '')) AS `Missing Receipt ID`
FROM (SELECT @rownum \\:= @rownum+1 AS expected,
IF(@rownum=recpt_id, 0, @rownum \\:= recpt_id) AS got
FROM
(SELECT @rownum \\:= (SELECT MIN(CAST(recpt_id AS SIGNED))-1 FROM report_receipt
WHERE outlet_desc IN "+Branch+" )
) AS a
JOIN report_receipt r
ON r.outlet_desc IN "+Branch+"
ORDER BY CAST(recpt_id AS SIGNED)
) AS z
WHERE z.got!=0
I am trying to create the same query but for the SQL Server platform. I am aware that in MSSQL, we can use the ROW_NUMBER function and then have a OVER(ORDER ) clause at a place where the ROW_NUMBER is called. The query is also going to be used in native query format in our Java code.
So far, after some minor syntax adjustments, I have reached this state in MSSQL, but with some uncertainties:
SELECT CONCAT( z.expected, CASE WHEN z.got-1>z.expected THEN CONCAT(' thru ',z.got-1) ELSE '' END) AS [Missing Receipt ID]
FROM
(SELECT @rownum \:= @rownum+1 AS expected,
IF(@rownum=recpt_id, 0, @rownum \:= recpt_id) AS got
FROM (SELECT ROW_NUMBER() OVER(ORDER BY CAST(recpt_id AS INT))) \:= (SELECT MIN(CAST(recpt_id AS INT))-1 FROM report_receipt
WHERE outlet_desc IN "+Branch+" )
) AS a
JOIN report_receipt r
ON r.outlet_desc IN ('MY011')) AS z
WHERE z.got!=0
What I'm unsure about and looking for answers on:
If there is only one place where the ORDER BY clause goes, do we put it in the innermost query and use ROW_NUMBER() OVER(ORDER BY (SELECT NULL)) everywhere else?
If the query is considered to have unnecessary nesting, what is the alternative that avoids calling ROW_NUMBER() repeatedly?
Thanks in advance.
Note: the '\' is an escape character added for the Java code.
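One possible direction, offered as a sketch rather than a tested port: on SQL Server 2012 or later, LEAD() can compare each receipt ID with the next one, which removes the running variable entirely and needs only one ORDER BY, inside the OVER clause. The literal 'MY011' stands in for the Branch list, and the sketch assumes recpt_id always casts cleanly to INT.

```sql
-- For each receipt, look at the next receipt ID; a difference
-- greater than 1 means the IDs in between are missing.
SELECT CONCAT(z.cur_id + 1,
              CASE WHEN z.next_id - 1 > z.cur_id + 1
                   THEN CONCAT(' thru ', z.next_id - 1)
                   ELSE '' END) AS [Missing Receipt ID]
FROM (SELECT CAST(recpt_id AS INT) AS cur_id,
             LEAD(CAST(recpt_id AS INT))
                 OVER (ORDER BY CAST(recpt_id AS INT)) AS next_id
      FROM report_receipt
      WHERE outlet_desc IN ('MY011')
     ) AS z
WHERE z.next_id - z.cur_id > 1;
```

The last receipt has no next row, so its next_id is NULL and the WHERE clause drops it.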

referring select output as a table in where clause

I have a query and I want to perform an operation like this:
select *
from (query which i wrote) as x
where
(select count(*)
from x as y
where x.location=y.location
and x.count>=y.count)<=3;
When I try the query above, I get a "table doesn't exist" error. Instead of x, I could paste in the query I wrote, but it is quite long. Is there a way to perform the above operation? Kindly help me.
You cannot re-use a table alias like that. Instead, you need to copy the subquery. Or use variables:
select q.*
from (select q.*,
(@rn := if(@l = location, @rn + 1,
if(@l := location, 1, 1)
)
) as rn
from (query which i wrote) q cross join
(select @l := '', @rn := 0) params
order by location, count desc
) q
where rn <= 3;
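If the server is MySQL 8.0+ or MariaDB 10.2+, a common table expression is another option, because a CTE can be referenced more than once in the same statement. A sketch, with a hypothetical sales table and grouping standing in for the long query:

```sql
WITH x AS (
    -- Placeholder for the long query; sales and item are made-up names.
    SELECT location, item, COUNT(*) AS `count`
    FROM sales
    GROUP BY location, item
)
SELECT x.*
FROM x
WHERE (SELECT COUNT(*)
       FROM x AS y
       WHERE x.location = y.location
         AND x.`count` >= y.`count`) <= 3;
```

The outer condition is kept exactly as written in the question; only the way x is defined changes.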

SQL | insert into + select + variables with condition?

My task is to insert 3 random rows per ID from another table,
and I get a syntax error:
set @num := 0, @type := '', @stat := '';
INSERT INTO random
as
(
SELECT
*
FROM (
select userID,userNAME, chaID, chaNAME,goal,gender,
@num := if(@type = userID, @num +1,1) as row_number,
@type := userID as dummy,
@stat as status
from userchar
order by userID
) as x where x.row_number <= 3)
I'm going to put this code in the event scheduler to insert the new data daily.
1064 - You have an error in your SQL syntax; check the manual that
corresponds to your MariaDB server version for the right syntax to use
near 'INSERT INTO random as ( SELECT * FROM ( select userID,userNAME,
chaID, c' at line 2
Thank you so much for any suggestions.
I suspect the problem is trying to run multiple statements at the same time. You can fix this by initializing the variables in the query itself:
INSERT INTO random( . . . )
select u.*
from (select userID, userNAME, chaID, chaNAME, goal, gender,
(@num := if(@u = userID, @num + 1,
if(@u := userID, 1, 1)
)
) as row_number,
userID as dummy,
@stat as status
from userchar u cross join
(select @u := '', @num := 0, @stat := '') params
order by userID, rand()
) u
where u.row_number <= 3;
There are several other issues:
When using insert, always list the columns. This is particularly important if you are learning SQL, so you learn good habits.
You should not assign a variable value in one expression and use it in another. MySQL (and MariaDB) do not guarantee the order of evaluation of expressions in a select, so the expressions can be evaluated in either order.
If you want random rows, then use rand(). There is a difference between "indeterminate" and "random".
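On MariaDB 10.2+ or MySQL 8.0+, ROW_NUMBER() avoids the variables and the evaluation-order problem altogether. A sketch, assuming random has columns with the same names as the ones selected (the column list is a guess, not taken from the question):

```sql
-- Number the rows of each userID in random order, then keep the first 3.
INSERT INTO random (userID, userNAME, chaID, chaNAME, goal, gender)
SELECT userID, userNAME, chaID, chaNAME, goal, gender
FROM (SELECT u.*,
             ROW_NUMBER() OVER (PARTITION BY userID ORDER BY RAND()) AS rn
      FROM userchar u
     ) ranked
WHERE rn <= 3;
```

Because this is a single statement with no session variables, it can go into an event body as is.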

What are the subquery equivalents of SQL aggregate functions MAX/MIN/AVG/COUNT

Can someone show me how to represent the following SQL statements without the use of aggregate functions?
SELECT COUNT(column) FROM table;
SELECT AVG(column) FROM table;
SELECT MAX(column) FROM table;
SELECT MIN(column) FROM table;
MIN() and MAX() can be done with simple subqueries:
select (select column from table order by column is not null desc, column asc limit 1) as "MIN",
(select column from table order by column is not null desc, column desc limit 1) as "MAX"
COUNT() and AVG() require the use of variables, if you don't allow any aggregations:
select rn as "COUNT", sumcol / rna as "AVG"
from (select t.*
from (select t.*,
(@rn := @rn + 1) as rn,
(@rna := @rna + if(column is not null, 1, 0)) as rna,
(@sum := @sum + coalesce(column, 0)) as sumcol
from table t cross join
(select @rn := 0, @rna := 0, @sum := 0) const
order by column
) t
order by rn desc
limit 1
) t
This latter formulation only works in MySQL.
EDIT:
The empty table is a challenge. Let's do this with a left outer join:
select cast(coalesce(rn, 0) as int) as "COUNT",
(case when rna > 0 then sumcol / rna end) as "AVG"
from (select 1 as n
) n left outer join
(select t.*
from (select t.*,
(@rn := @rn + 1) as rn,
(@rna := @rna + if(column is not null, 1, 0)) as rna,
(@sum := @sum + coalesce(column, 0)) as sumcol
from table t cross join
(select @rn := 0, @rna := 0, @sum := 0) const
order by column
) t
order by rn desc
limit 1
) t
on n.n = 1;
Notes. This will return 0 for the count if the table is empty. That is correct. If the table is empty, it will return NULL for the average, and that is also correct.
If the table is not empty, but the values are all NULL, then it will also return NULL. The types for the count are always integers, so that should be ok. The type of the average is more problematic, but the variables will return some sort of generic numeric type, which seems compatible in spirit.
min/max can be replaced with something like this:
select t1.pk_column,
t1.some_column
from the_table t1
where t1.some_column < ALL (select t2.some_column
from the_table t2
where t2.pk_column <> t1.pk_column);
To get the max, replace < with >. pk_column is the primary key column of the table and is needed to avoid comparing each row to itself (it doesn't have to be a PK; it only needs to be unique).
I don't think there is an alternative for count() or avg() (at least I can't think of one).
I used some_column and the_table because column and table are reserved words.
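Two details are worth handling in the ALL approach: with a strict <, a minimum that occurs in more than one row matches nothing, and a NULL in the subquery makes the comparison unknown. A sketch that returns just the minimum value, using <= and filtering out NULLs (same placeholder names the_table and some_column):

```sql
-- <= lets a row match itself, so no primary-key exclusion is needed;
-- DISTINCT collapses ties into a single value.
SELECT DISTINCT t1.some_column AS min_value
FROM the_table t1
WHERE t1.some_column <= ALL (SELECT t2.some_column
                             FROM the_table t2
                             WHERE t2.some_column IS NOT NULL);
```

Unlike MIN(), this returns no row at all for an empty table.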
SET @t1=0, @t2=0, @t3=0, @T4=0;
COUNT:
Select @t1:=@t1+1 as CNT from table
order by @t1:=@t1+1 DESC
LIMIT 1
Similar methods could be put together for Avg and max/min using limits...
Still thinking about Min/Max...
Not to supersede the excellent answer from Gordon Linoff, but there's a little more work involved to accurately emulate the AVG(), COUNT(), and SUM() functions. (The answer for the MIN and MAX functions in Gordon's answer are spot on.)
There's a corner case when the table is empty. In order to emulate the SQL aggregate functions, we need our query to return a single row. But at the same time, we need a test of whether or not the table contains at least one row.
Here's a query that is a more precise emulation:
-- create an empty table
CREATE TABLE `foo` (col INT);
-- TRUNCATE TABLE `foo`;
SELECT IF(s.ne IS NULL,0,s.rn) AS `COUNT(*)`
, IF(s.cc>0,s.tc,NULL) AS `SUM(col)`
, IF(s.cc>0,s.tc/s.cc,NULL) AS `AVG(col)`
FROM ( SELECT v.rn
, v.cc
, v.tc
, e.ne
FROM ( SELECT @rn := @rn + 1 AS rn
, @cc := @cc + (t.col IS NOT NULL) AS cc
, @tc := @tc + IFNULL(t.col,0) AS tc
FROM (SELECT @rn := 0, @cc := 0, @tc := 0) c
LEFT
JOIN `foo` t
ON 1=1
) v
LEFT
JOIN (SELECT 1 AS ne FROM `foo` z LIMIT 1) e
ON 1=1
ORDER BY v.rn DESC
LIMIT 1
) s
NOTES:
The purpose of the inline view aliased as e is to give us a way to determine whether or not the table contains any rows. If the table contains at least one row, we'll get a value of 1 returned as column ne (not empty). If the table is empty, that query won't return a row, and e.ne will be NULL, which is something we can test in the outer query.
In order to return a row, so we can return a value, like a 0 for a COUNT, we need to ensure that we return at least one row from the inline view v. Since we are guaranteed exactly one row from the inline view aliased as c (which initializes our user-defined variables), we'll use that as the "driving" table for a LEFT [OUTER] JOIN operation.
But, if the table is empty, our row counter (@rn) coming out of v is going to have a value of 1. We can deal with that: we have e.ne, which we can check to know if the count should really be returned as 0.
In order to calculate the average, we can't divide by the row counter; we have to divide by the number of rows where col was not null. We use the @cc user-defined variable to keep track of the count of those rows.
Similarly, for the SUM (and the average) we need to accumulate only the non-NULL values. (If we were to add a NULL, it would turn the whole total to NULL, wiping out our accumulation.) So we use IFNULL(t.col,0) to add 0 in place of a NULL. Our accumulator is going to be 0 if there aren't any rows that are not null, but that's not a problem, because we check @cc to see if any rows were included. We need to check it anyway, to avoid a "divide by zero" issue.
To test, run against the empty table foo. It will return a count of 0, and NULL for SUM and AVG, equivalent to the result we get from:
SELECT COUNT(*), SUM(col), AVG(col) FROM foo;
We can also test the query against a table containing only NULL values for col:
INSERT INTO `foo` (col) VALUES (NULL);
As well as some non-NULL values:
INSERT INTO `foo` (col) VALUES (2),(3),(5),(7),(11),(13),(17),(19);
And compare the results of the two queries.
This is essentially the same as the answer from Gordon Linoff, with just a little more precision to work around the corner cases of NULL values and the empty table.