So I have a table, users, with user Balances and IDs.
With the below query, I get the table I need – which sorts the users by their balance.
SET @row_num=0; SELECT (@row_num:=@row_num+1) AS serial_num, ID, Balance FROM users ORDER BY Balance DESC; - which returns the following table:
Resulting MYSQL table
How would I find the serial_num of a specific user from the above table by ID?
I've tried SELECT * FROM ( the query above ) WHERE ID = "..."; but I must be getting something wrong with the syntax and I don't quite understand how I would implement a sub-query here.
Cheers
You actually had just one small mistake, which led to an uninitialized variable. Replace
SET @row_num=0;
with
SET @row_num:=0;
A little shorter version which can be run in one query would be:
SELECT *
FROM
(
SELECT ID, Balance, @row := @row + 1 AS serial_num
FROM users
CROSS JOIN (SELECT @row := 0) r
ORDER BY Balance DESC
) tmp
WHERE serial_num = 2
SQLFiddle demo
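To look a user up by ID rather than by position, the same derived-table pattern works with the filter on ID. Here is a minimal sketch, not the answer's exact query: it assumes SQLite (through Python's sqlite3) as a stand-in for MySQL, uses ROW_NUMBER() (which MySQL only offers from 8.0 on) instead of a user variable, and the sample rows are made up.

```python
import sqlite3


def serial_num_for(conn, user_id):
    """Return the 1-based position of user_id when users are sorted by Balance DESC."""
    row = conn.execute(
        """
        SELECT serial_num
        FROM (
            SELECT ID,
                   ROW_NUMBER() OVER (ORDER BY Balance DESC) AS serial_num
            FROM users
        ) AS ranked          -- a derived table must be given an alias
        WHERE ID = ?
        """,
        (user_id,),
    ).fetchone()
    return row[0] if row else None


# Hypothetical sample data, only for illustration.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (ID INTEGER PRIMARY KEY, Balance INTEGER)")
conn.executemany("INSERT INTO users VALUES (?, ?)", [(1, 50), (2, 300), (3, 120)])
print(serial_num_for(conn, 3))
```

The numbering has to happen in the inner query over all users; filtering by ID inside it would number only the one matching row.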
I wrote a SQL query (below) that selects the next 25 records after the record with postID 201. (You don't have to read it)
SELECT
title,
content,
table_with_rn.userID,
table_with_rn.username,
postID,
post_timestamp
FROM
(
SELECT
title,
content,
posts.userID,
users.username,
postID,
post_timestamp,
@rownum := @rownum + 1 AS row_number2
FROM
(
posts
INNER JOIN users ON posts.userID = users.userID
)
CROSS JOIN(
SELECT
@rownum := 0
) AS r
ORDER BY
post_timestamp
DESC
) AS table_with_rn
WHERE
row_number2 >(
SELECT
row_number
FROM
(
SELECT
postID,
@rownum := @rownum + 1 AS row_number
FROM
(
posts
INNER JOIN users ON posts.userID = users.userID
)
CROSS JOIN(
SELECT
@rownum := 0
) AS r
ORDER BY
post_timestamp
DESC
) AS twn
WHERE
postID = 201
)
LIMIT 25
It sorts the table and then creates a column that holds the row number of each row. It then selects the row number of the record with the specific postID, before selecting the records with greater row numbers from a duplicate table.
This query works fine, but it seems very complicated for a task that sounds rather simple. Is there a better/more efficient/simpler way of doing it?
Note: I realise I could skip the whole row_number thing and just use postID, since it is incremental, but I would like to keep my options open if I ever decide I don't want my pk to be an integer any more.
Note2: This is MySQL.
I am assuming that there is some column with which to determine whether a record is before or after the record with postID 201. From scanning your query, I'd say you have a timestamp column by which you want to order (to ignore the incremental nature of post ID).
If that is the case, one can employ a self join on the table (called some_table here for simplicity), comparing the timestamp columns of both table instances. One instance, however, is reduced to the single record with postID = 201.
In other words, our join condition is 'all records of the table which have a timestamp larger than the one of the record with postID 201' which is the condition OP specified.
The result set now only contains records whose timestamp is larger than the one of postID 201 which we limit to only contain 25 entries. To get the ones directly after postID 201, we order by timestamp again.
The query could look like this:
SELECT
larger.*
FROM
some_table smaller
JOIN
some_table larger
ON
smaller.timestamp < larger.timestamp
AND smaller.postID = 201
ORDER BY larger.timestamp ASC
LIMIT 25
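To try the self join out on a few rows, here is a small sketch. It assumes SQLite via Python's sqlite3 in place of MySQL, a table some_table with just the two columns postID and timestamp, and invented sample values; the helper name posts_after is hypothetical.

```python
import sqlite3


def posts_after(conn, post_id, limit=25):
    """Return the postIDs of the `limit` records that directly follow post_id by timestamp."""
    rows = conn.execute(
        """
        SELECT larger.postID
        FROM some_table AS smaller
        JOIN some_table AS larger
          ON smaller.timestamp < larger.timestamp
         AND smaller.postID = ?
        ORDER BY larger.timestamp ASC
        LIMIT ?
        """,
        (post_id, limit),
    ).fetchall()
    return [r[0] for r in rows]


# Hypothetical data: postIDs are deliberately not in timestamp order.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE some_table (postID INTEGER PRIMARY KEY, timestamp INTEGER)")
conn.executemany(
    "INSERT INTO some_table VALUES (?, ?)",
    [(201, 10), (7, 20), (350, 15), (12, 5), (90, 30)],
)
print(posts_after(conn, 201, limit=2))
```

Because the ordering comes from the timestamp column only, the query keeps working if postID stops being an incrementing integer.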
I know it is not possible directly.
But I want to achieve this by any indirect method if possible.
I wanted to turn the query below into a view, but it throws an error: subqueries are not allowed in a view.
select T1.Code,
T1.month,
T1.value,
IfNull(T2.Value,0)+IfNull(T3.value,0) as value_begin
from (select *,@rownum := @rownum + 1 as rownum
from Table1
Join (SELECT @rownum := 0) r) T1
left join (select *,@rownum1 := @rownum1 + 1 as rownum
from Table1
Join (SELECT @rownum1 := 0) r) T2
on T1.code = T2.code
and T1.rownum = T2.rownum + 1
left join (select *,@rownum2 := @rownum2 + 1 as rownum
from Table1
Join (SELECT @rownum2 := 0) r) T3
on T1.code = T3.code
and T1.rownum = T3.rownum + 2
Order by T1.Code,T1.rownum
So I thought I would make the subquery a separate view, but that throws another error: variables are not allowed in a view. Please help me overcome this.
Thanks in advance.
You could try the method of triangle join + count for assigning row numbers. It will likely not perform well on large datasets, but you should be able to implement everything with a couple of views (if you think there's no other way to do what you want to do than with a view). The idea is as follows:
The dataset is joined to itself on the condition of master.key >= secondary.key, where master is the instance where detail data will actually be pulled from, and secondary is the other instance of the same table used to provide the row numbers.
Based on that condition, the first* master row would be joined with one secondary row, the second one with two, the third one with three and so on.
At this point, you can group the result set by the master key column(s) as well as the columns that you need in the output (although in MySQL it would be enough to group by the master key only). Counting the rows in every group will give you the corresponding row numbers.
So, if there was a table like this:
CREATE TABLE SomeTable (
ID int,
Value int
);
the query to assign row numbers to the table could look like this:
SELECT m.ID, m.Value, COUNT(*) AS rownum
FROM SomeTable AS m
INNER JOIN SomeTable AS s ON m.ID >= s.ID
GROUP BY m.ID, m.Value
;
Since you appear to want to self-join the ranked rowset (and twice too), that would require using the above query as a derived table, and since you also want the entire thing to be a view (which doesn't allow subqueries in the FROM clause), you would probably need to define the ranking query as a separate view:
CREATE VIEW RankingView AS
SELECT m.ID, m.Value, COUNT(*) AS rownum
FROM SomeTable AS m
INNER JOIN SomeTable AS s ON m.ID >= s.ID
GROUP BY m.ID, m.Value
;
and subsequently refer to that view in the main query:
CREATE VIEW SomeOtherView AS
SELECT ...
FROM RankingView AS t1
LEFT JOIN RankingView AS t2 ON ...
...
This SQL Fiddle demo shows the method and its usage.
One note with regard to your particular situation. Your table probably needs row numbers to be assigned in partitions, i.e. every distinct Code row group needs its own row number set. That means that your ranking view should specify the joining condition as something like this:
ON m.Code = s.Code AND m.Month >= s.Month
Please note that months in this case are assumed to be unique per Code. If that is not the case, you may first need to create a view that groups the original dataset by Code, Month and rank that view instead of the original dataset.
* According to the order of key.
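Putting the pieces together for the partitioned case, the two views could be sketched roughly as below. This is only an illustration under assumptions: SQLite via Python's sqlite3 stands in for MySQL, Table1 is assumed to have exactly the columns Code, Month and Value with Month unique per Code, and the view name BeginValues and the sample rows are invented.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE Table1 (Code TEXT, Month INTEGER, Value INTEGER);
    INSERT INTO Table1 VALUES ('A', 1, 10), ('A', 2, 20), ('A', 3, 30), ('B', 1, 5);

    -- Triangle join: each row is matched with every row of the same Code
    -- whose Month is not later, so COUNT(*) is its row number within the Code.
    CREATE VIEW RankingView AS
    SELECT m.Code AS Code, m.Month AS Month, m.Value AS Value, COUNT(*) AS rownum
    FROM Table1 AS m
    INNER JOIN Table1 AS s ON m.Code = s.Code AND m.Month >= s.Month
    GROUP BY m.Code, m.Month, m.Value;

    -- value_begin adds up the values of the two preceding rows of the same Code.
    CREATE VIEW BeginValues AS
    SELECT t1.Code AS Code, t1.Month AS Month, t1.Value AS Value, t1.rownum AS rownum,
           IFNULL(t2.Value, 0) + IFNULL(t3.Value, 0) AS value_begin
    FROM RankingView AS t1
    LEFT JOIN RankingView AS t2 ON t1.Code = t2.Code AND t1.rownum = t2.rownum + 1
    LEFT JOIN RankingView AS t3 ON t1.Code = t3.Code AND t1.rownum = t3.rownum + 2;
    """
)

rows = conn.execute(
    "SELECT Code, Month, Value, value_begin FROM BeginValues ORDER BY Code, rownum"
).fetchall()
for row in rows:
    print(row)
```

Neither view uses a user variable or a subquery in its FROM clause, which is what the view restrictions mentioned in the question require.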
I am running following query to get rank of business in all categories in terms of total number of likes.
SET @rownum = 0;
SELECT b.*
, (
SELECT f4.rank from business as b2 INNER JOIN (
select count(*) count, @rownum:=@rownum + 1 as rank, f3.* from favourites as f3 GROUP BY f3.business_id ORDER BY count DESC ) as f4 ON b2.id = f4.business_id WHERE b2.id = 8 && f4.category_id=c.id
)
as rank FROM business as b, category c where b.id=8
rank gives NULL after the first row. What should I do to reset @rownum to 0 for the next row?
To reset the @rownum user variable, you could try including an inline view (i.e. a derived table) in the FROM clause.
It looks like you would need that within the inner correlated subquery. That correlated subquery should get re-executed for every row from category c, or at least every distinct value of c.id. (I'm going to assume that the id column in each table is the primary key.)
e.g.
FROM ...
JOIN (SELECT @rownum := 0) r
WHERE ...
BUT... I am hesitant to recommend this approach to you, because I am having difficulty unwinding your SQL statement. It's not clear what resultset you want returned. It looks like that query should be throwing an exception, if that subquery returns more than one row. I just don't see anything explicit or implied that would give you that guarantee.
An example of the desired output would go a long way toward getting some useful help.
I am pretty sure you want ROW_NUMBER, RANK or DENSE_RANK partitioned by business_ID but I cannot penetrate your SQL
Some inputs & outputs would be helpful.
select * from
business as f4 inner join
(
select business_id, count(*),rank() over (partition by business_id order by count desc ) as rank) as counts
on f4.business_id=counts.business_id
might be close
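If the goal is the rank of one business within each category by number of likes, a window-function version could look roughly like the sketch below. This is a guess at the intended result, not a fix of the original query: it assumes SQLite via Python's sqlite3 (MySQL has RANK() only from 8.0), a favourites table with one row per like holding business_id and category_id, and invented data.

```python
import sqlite3


def ranks_of_business(conn, business_id):
    """Return (category_id, rank) pairs for one business, ranked by likes per category."""
    return conn.execute(
        """
        SELECT category_id, rnk
        FROM (
            SELECT category_id, business_id,
                   RANK() OVER (PARTITION BY category_id ORDER BY likes DESC) AS rnk
            FROM (
                -- one row per (category, business) with its number of likes
                SELECT category_id, business_id, COUNT(*) AS likes
                FROM favourites
                GROUP BY category_id, business_id
            ) AS counts
        ) AS ranked
        WHERE business_id = ?
        ORDER BY category_id
        """,
        (business_id,),
    ).fetchall()


# Hypothetical likes: (business_id, category_id), one row per like.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE favourites (business_id INTEGER, category_id INTEGER)")
conn.executemany(
    "INSERT INTO favourites VALUES (?, ?)",
    [(9, 1), (9, 1), (9, 1), (8, 1), (8, 1), (8, 2), (8, 2), (5, 2)],
)
print(ranks_of_business(conn, 8))
```

The ranking is computed over all businesses first and filtered to the one business afterwards; filtering earlier would always yield rank 1.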
It seems to me that your code should increment @rownum for every row in the result because the first subquery and therefore the joined subquery should be executed once for every row.
In my opinion, your query is equivalent to the following:
SELECT b.*, @rownum:=@rownum + 1 AS rank
FROM business AS b, category c
WHERE b.id=8
Edit: If the problem is that you need to reset @rownum in a subquery but you're limited to a single column in the result, use something like this construct:
SELECT IF(@rownum:=0, NULL, f4.rank) AS rank FROM ...
The condition @rownum:=0 is always evaluated, resetting @rownum, and because it evaluates to 0, the value of f4.rank is always returned.
I have a table of users and a 'points' column. I would like to determine the number/place of the row across all the users ordered by 'points'.
I could just fetch all the users' data, loop over it, and stop when the id equals the user I need. But I believe there is a more efficient way to do that, because my table will contain ~100 000 rows.
Try this:
SET @rownum = 0;
Select sub.*, sub.rank as Rank
FROM
(
Select *, (@rownum:=@rownum+1) as rank
FROM users
ORDER BY points
) sub
WHERE rank = 15
You need the row number of your record
Here's a good way to do it in MySQL
It would look like this
SELECT rank_user.*
FROM
(
SELECT @rownum:=@rownum+1 AS 'rank', u.*
FROM user u, (SELECT @rownum:=0) r
ORDER BY points DESC
) rank_user
WHERE rank BETWEEN 2 AND 4;
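If only one user's place is needed, numbering every row can be skipped by counting how many users have more points. This is a different technique from the answers above, sketched here under assumptions: SQLite via Python's sqlite3 instead of MySQL, a users table with columns id and points, invented data, and ties sharing the same place.

```python
import sqlite3


def place_of(conn, user_id):
    """Return 1 + the number of users with strictly more points, or None if the id is unknown."""
    row = conn.execute(
        """
        SELECT (SELECT COUNT(*) FROM users AS o WHERE o.points > u.points) + 1
        FROM users AS u
        WHERE u.id = ?
        """,
        (user_id,),
    ).fetchone()
    return row[0] if row else None


# Hypothetical sample data; users 3 and 4 are tied.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, points INTEGER)")
conn.execute("CREATE INDEX idx_users_points ON users (points)")
conn.executemany("INSERT INTO users VALUES (?, ?)", [(1, 40), (2, 90), (3, 70), (4, 70)])
print(place_of(conn, 3))
```

With an index on points the count can usually be answered from the index, which should matter on a table of around 100 000 rows.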
I've got a couple of duplicates in a database that I want to inspect, so what I did to see which are duplicates, I did this:
SELECT relevant_field
FROM some_table
GROUP BY relevant_field
HAVING COUNT(*) > 1
This way, I will get all rows with relevant_field occurring more than once. This query takes milliseconds to execute.
Now, I wanted to inspect each of the duplicates, so I thought I could SELECT each row in some_table with a relevant_field in the above query, so I did like this:
SELECT *
FROM some_table
WHERE relevant_field IN
(
SELECT relevant_field
FROM some_table
GROUP BY relevant_field
HAVING COUNT(*) > 1
)
This turns out to be extreeeemely slow for some reason (it takes minutes). What exactly is going on here to make it that slow? relevant_field is indexed.
Eventually I tried creating a view "temp_view" from the first query (SELECT relevant_field FROM some_table GROUP BY relevant_field HAVING COUNT(*) > 1), and then making my second query like this instead:
SELECT *
FROM some_table
WHERE relevant_field IN
(
SELECT relevant_field
FROM temp_view
)
And that works just fine. MySQL does this in some milliseconds.
Any SQL experts here who can explain what's going on?
The subquery is being run for each row because MySQL treats it as a correlated (dependent) query. One can make it non-correlated by selecting everything from the subquery, like so:
SELECT * FROM
(
SELECT relevant_field
FROM some_table
GROUP BY relevant_field
HAVING COUNT(*) > 1
) AS subquery
The final query would look like this:
SELECT *
FROM some_table
WHERE relevant_field IN
(
SELECT * FROM
(
SELECT relevant_field
FROM some_table
GROUP BY relevant_field
HAVING COUNT(*) > 1
) AS subquery
)
Rewrite the query into this
SELECT st1.*, st2.relevant_field FROM sometable st1
INNER JOIN sometable st2 ON (st1.relevant_field = st2.relevant_field)
GROUP BY st1.id /* list a unique sometable field here*/
HAVING COUNT(*) > 1
I think st2.relevant_field must be in the select, because otherwise the having clause will give an error, but I'm not 100% sure
Avoid IN with a subquery where you can; in MySQL, especially in older versions, it can be notoriously slow.
Prefer IN with a fixed list of values.
More tips
If you want to make queries faster, don't do a SELECT *; only select the fields that you really need.
Make sure you have an index on relevant_field to speed up the equi-join.
Make sure to group by on the primary key.
If you are on InnoDB and you only select indexed fields (and things are not too complex) then MySQL will resolve your query using only the indexes, speeding things way up.
General solution for 90% of your IN (SELECT ...) queries
Use this code
SELECT * FROM sometable a WHERE EXISTS (
SELECT 1 FROM sometable b
WHERE a.relevant_field = b.relevant_field
GROUP BY b.relevant_field
HAVING count(*) > 1)
SELECT st1.*
FROM some_table st1
inner join
(
SELECT relevant_field
FROM some_table
GROUP BY relevant_field
HAVING COUNT(*) > 1
)st2 on st2.relevant_field = st1.relevant_field;
I've tried your query on one of my databases, and also tried it rewritten as a join to a sub-query.
This worked a lot faster, try it!
I have reformatted your slow sql query with www.prettysql.net
SELECT *
FROM some_table
WHERE
relevant_field in
(
SELECT relevant_field
FROM some_table
GROUP BY relevant_field
HAVING COUNT(*) > 1
);
When using a table in both the query and the subquery, you should always alias both, like this:
SELECT *
FROM some_table as t1
WHERE
t1.relevant_field in
(
SELECT t2.relevant_field
FROM some_table as t2
GROUP BY t2.relevant_field
HAVING COUNT(t2.relevant_field) > 1
);
Does that help?
Subqueries vs joins
http://www.scribd.com/doc/2546837/New-Subquery-Optimizations-In-MySQL-6
Try this
SELECT t1.*
FROM
some_table t1,
(SELECT relevant_field
FROM some_table
GROUP BY relevant_field
HAVING COUNT(*) > 1) t2
WHERE
t1.relevant_field = t2.relevant_field;
First, you can find the duplicate rows, count how many times each value is used, and number the rows within each group, like this:
SELECT q.id,q.name,q.password,q.NID,(select count(*) from UserInfo k where k.NID= q.NID) as Count,
(
CASE q.NID
WHEN @curCode THEN
@curRow := @curRow + 1
ELSE
@curRow := 1
AND @curCode := q.NID
END
) AS No
FROM UserInfo q,
(
SELECT
@curRow := 1,
@curCode := ''
) rt
WHERE q.NID IN
(
SELECT NID
FROM UserInfo
GROUP BY NID
HAVING COUNT(*) > 1
)
After that, create a table and insert the result into it.
create table CopyTable
SELECT q.id,q.name,q.password,q.NID,(select count(*) from UserInfo k where k.NID= q.NID) as Count,
(
CASE q.NID
WHEN @curCode THEN
@curRow := @curRow + 1
ELSE
@curRow := 1
AND @curCode := q.NID
END
) AS No
FROM UserInfo q,
(
SELECT
@curRow := 1,
@curCode := ''
) rt
WHERE q.NID IN
(
SELECT NID
FROM UserInfo
GROUP BY NID
HAVING COUNT(*) > 1
)
Finally, delete the duplicate rows. No starts at 0, so delete every row except the first one of each group.
delete from CopyTable where No!= 0;
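For comparison, the "keep the first row of each group" step can also be written without row numbers. This is a different technique from the one above, shown as a sketch under assumptions: SQLite via Python's sqlite3, a UserInfo table reduced to id, name and NID, invented rows, and "first" taken to mean the lowest id. MySQL refuses to select from the table being deleted from in a plain subquery, so there the inner SELECT would need to be wrapped in a derived table.

```python
import sqlite3


def delete_duplicates(conn):
    """Delete every UserInfo row except the lowest id per NID; return how many rows were removed."""
    cur = conn.execute(
        """
        DELETE FROM UserInfo
        WHERE id NOT IN (
            SELECT MIN(id)      -- the row to keep for each NID
            FROM UserInfo
            GROUP BY NID
        )
        """
    )
    conn.commit()
    return cur.rowcount


# Hypothetical sample data: NID 'N1' appears three times.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE UserInfo (id INTEGER PRIMARY KEY, name TEXT, NID TEXT)")
conn.executemany(
    "INSERT INTO UserInfo VALUES (?, ?, ?)",
    [(1, "a", "N1"), (2, "b", "N1"), (3, "c", "N2"), (4, "d", "N1")],
)
deleted = delete_duplicates(conn)
remaining = [r[0] for r in conn.execute("SELECT id FROM UserInfo ORDER BY id")]
print(deleted, remaining)
```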
Sometimes, as the data grows, MySQL's WHERE ... IN queries can become pretty slow because of how the query is optimized. Try using STRAIGHT_JOIN to tell MySQL to execute the query as written, e.g.
SELECT STRAIGHT_JOIN table.field FROM table WHERE table.id IN (...)
But beware: in most cases the MySQL optimizer works pretty well, so I would recommend using it only when you have this kind of problem.
This is similar to my case, where I have a table named tabel_buku_besar. What I need is:
1. Find the records in tabel_buku_besar that have account_code='101.100', companyarea='20000' and IDR as the currency.
2. Get all records from tabel_buku_besar that have the same account_code as in step 1 but whose transaction_number is in the step 1 result.
When using select ... from ... where ... transaction_number in (select transaction_number from ...), my query runs extremely slowly and sometimes causes a request timeout or makes my application stop responding.
I tried this combination and the result is not bad:
select DATE_FORMAT(L.TANGGAL_INPUT,'%d-%m-%y') AS TANGGAL,
L.TRANSACTION_NUMBER AS VOUCHER,
L.ACCOUNT_CODE,
C.DESCRIPTION,
L.DEBET,
L.KREDIT
from (select * from tabel_buku_besar A
where A.COMPANYAREA='$COMPANYAREA'
AND A.CURRENCY='$Currency'
AND A.ACCOUNT_CODE!='$ACCOUNT'
AND (A.TANGGAL_INPUT BETWEEN STR_TO_DATE('$StartDate','%d/%m/%Y') AND STR_TO_DATE('$EndDate','%d/%m/%Y'))) L
INNER JOIN (select * from tabel_buku_besar A
where A.COMPANYAREA='$COMPANYAREA'
AND A.CURRENCY='$Currency'
AND A.ACCOUNT_CODE='$ACCOUNT'
AND (A.TANGGAL_INPUT BETWEEN STR_TO_DATE('$StartDate','%d/%m/%Y') AND STR_TO_DATE('$EndDate','%d/%m/%Y'))) R ON R.TRANSACTION_NUMBER=L.TRANSACTION_NUMBER AND R.COMPANYAREA=L.COMPANYAREA
LEFT OUTER JOIN master_account C ON C.ACCOUNT_CODE=L.ACCOUNT_CODE AND C.COMPANYAREA=L.COMPANYAREA
ORDER BY L.TANGGAL_INPUT,L.TRANSACTION_NUMBER
I find this to be the most efficient way to find whether a value exists; the logic can easily be inverted to find whether a value doesn't exist (i.e. IS NULL):
SELECT * FROM primary_table st1
LEFT JOIN comparision_table st2 ON (st1.relevant_field = st2.relevant_field)
WHERE st2.primaryKey IS NOT NULL
*Replace relevant_field with the name of the value that you want to check exists in your table
*Replace primaryKey with the name of the primary key column on the comparison table.
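Both directions of that LEFT JOIN check can be tried on a couple of rows. A minimal sketch, assuming SQLite via Python's sqlite3 in place of MySQL; the table layouts, the id column on primary_table, the helper name and the data are all made up for illustration.

```python
import sqlite3


def primary_ids(conn, exists=True):
    """Return ids of primary_table rows that do (or do not) have a match in comparison_table."""
    # Only this fixed keyword is interpolated, never user input.
    null_test = "IS NOT NULL" if exists else "IS NULL"
    rows = conn.execute(
        f"""
        SELECT st1.id
        FROM primary_table AS st1
        LEFT JOIN comparison_table AS st2
               ON st1.relevant_field = st2.relevant_field
        WHERE st2.primaryKey {null_test}
        ORDER BY st1.id
        """
    ).fetchall()
    return [r[0] for r in rows]


# Hypothetical sample data: 'y' has no counterpart in comparison_table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE primary_table (id INTEGER PRIMARY KEY, relevant_field TEXT)")
conn.execute("CREATE TABLE comparison_table (primaryKey INTEGER PRIMARY KEY, relevant_field TEXT)")
conn.executemany("INSERT INTO primary_table VALUES (?, ?)", [(1, "x"), (2, "y"), (3, "z")])
conn.executemany("INSERT INTO comparison_table VALUES (?, ?)", [(10, "x"), (11, "z")])
print(primary_ids(conn, exists=True), primary_ids(conn, exists=False))
```

One caveat: if comparison_table holds several rows with the same relevant_field, the IS NOT NULL variant returns the primary row once per match.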
It's slow because your sub-query is executed once for every comparison between relevant_field and your IN clause's sub-query. You can avoid that like so:
SELECT *
FROM some_table T1 INNER JOIN
(
SELECT relevant_field
FROM some_table
GROUP BY relevant_field
HAVING COUNT(*) > 1
) T2
USING(relevant_field)
This creates a derived table (in memory unless it's too large to fit) as T2, then INNER JOINs it with T1. The JOIN happens one time, so the query is executed one time.
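The join against the derived table can be checked on a handful of rows. A minimal sketch, assuming SQLite via Python's sqlite3 rather than MySQL (so it shows the result, not the timing difference), a some_table with just id and relevant_field, and invented data.

```python
import sqlite3


def duplicate_rows(conn):
    """Return every (id, relevant_field) row whose relevant_field occurs more than once."""
    return conn.execute(
        """
        SELECT T1.id, relevant_field
        FROM some_table AS T1
        INNER JOIN (
            -- evaluated once: the set of values that occur more than once
            SELECT relevant_field
            FROM some_table
            GROUP BY relevant_field
            HAVING COUNT(*) > 1
        ) AS T2 USING (relevant_field)
        ORDER BY T1.id
        """
    ).fetchall()


# Hypothetical sample data: 'a' occurs three times, 'b' twice, 'c' once.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE some_table (id INTEGER PRIMARY KEY, relevant_field TEXT)")
conn.execute("CREATE INDEX idx_relevant ON some_table (relevant_field)")
conn.executemany(
    "INSERT INTO some_table VALUES (?, ?)",
    [(1, "a"), (2, "b"), (3, "a"), (4, "c"), (5, "b"), (6, "a")],
)
print(duplicate_rows(conn))
```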
I find this particularly handy for optimising cases where a pivot is used to associate a bulk data table with a more specific data table and you want to produce counts of the bulk table based on a subset of the more specific one's related rows. If you can narrow down the bulk rows to <5% then the resulting sparse accesses will generally be faster than a full table scan.
For example, you have a Users table (condition), an Orders table (pivot) and a LineItems table (bulk) which references counts of Products. You want the sum of Products grouped by User in PostCode '90210'. In this case the JOIN will be orders of magnitude smaller than when using WHERE relevant_field IN( SELECT * FROM (...) T2 ), and therefore much faster, especially if that JOIN is spilling to disk!