MySQL subquery takes a long time - mysql

I have the following MySQL query:
SELECT DISTINCT * FROM dom_small WHERE count>=0 AND dom NOT IN
(SELECT dom FROM dom_small WHERE was_blocked=1)
ORDER BY count DESC LIMIT 30 OFFSET 4702020
As I increase OFFSET, the query takes longer and longer to run.
With OFFSET 0 the query takes 0 sec, but with OFFSET 4702020 it takes 1 min 19.49 sec.
How can I solve this problem?

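If each dom appears in only one row of dom_small and was_blocked is always 0 or 1, you can get the same result without a subquery by keeping only the rows that are not blocked:
SELECT DISTINCT * FROM dom_small WHERE count>=0 AND was_blocked=0
ORDER BY count DESC LIMIT 30 OFFSET 4702020;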

Use following query:
SELECT DISTINCT * FROM dom_small WHERE count>=0 AND dom NOT IN
(select * from(SELECT dom FROM dom_small WHERE was_blocked=1) t1 )
ORDER BY count DESC LIMIT 30 OFFSET 4702020
Wrapping the subquery in a derived table can speed up the query by caching the subquery result. I have used this method before and it helped a lot.
But as others mentioned, using OFFSET with large numbers slows the query down.
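One commonly suggested way around the cost of a large OFFSET is keyset (seek) pagination. This is a different technique from the subquery rewrite above: instead of skipping N rows, each page starts right after the last row of the previous page. The sketch below is only an illustration under assumptions: it uses Python's sqlite3 module with an in-memory SQLite table as a stand-in for MySQL, made-up sample data, and a hypothetical unique id column as a tie-breaker (the question does not show the table's key).

```python
import sqlite3

# In-memory SQLite stand-in for the MySQL table.
# The `id` column and the sample data are assumptions for illustration.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE dom_small ("
    "id INTEGER PRIMARY KEY, dom TEXT, `count` INTEGER, was_blocked INTEGER)"
)
rows = [(i, f"d{i}.example", i % 50, 1 if i % 10 == 0 else 0) for i in range(1, 501)]
conn.executemany("INSERT INTO dom_small VALUES (?, ?, ?, ?)", rows)
conn.execute("CREATE INDEX idx_count_id ON dom_small (`count`, id)")

BASE = """
SELECT id, dom, `count` FROM dom_small
WHERE `count` >= 0
  AND dom NOT IN (SELECT dom FROM dom_small WHERE was_blocked = 1)
"""


def page_by_offset(limit, offset):
    """Classic LIMIT/OFFSET paging: the engine still walks all skipped rows."""
    sql = BASE + " ORDER BY `count` DESC, id DESC LIMIT ? OFFSET ?"
    return conn.execute(sql, (limit, offset)).fetchall()


def page_after(limit, last=None):
    """Keyset paging: continue after the last row of the previous page."""
    if last is None:
        return page_by_offset(limit, 0)
    last_id, _dom, last_count = last
    sql = (
        BASE
        + " AND (`count` < ? OR (`count` = ? AND id < ?))"
        + " ORDER BY `count` DESC, id DESC LIMIT ?"
    )
    return conn.execute(sql, (last_count, last_count, last_id, limit)).fetchall()


first = page_after(30)
second = page_after(30, first[-1])
```

The trade-off is that you can only walk from one page to the next; you cannot jump straight to an arbitrary offset such as 4702020.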

SELECT DISTINCT * FROM dom_small WHERE count>=0 AND dom NOT IN
(SELECT dom FROM dom_small WHERE was_blocked=1)
ORDER BY count DESC LIMIT 30 OFFSET 4702020
Although the other comments are accurate, the only thing I would point out is that your DISTINCT and your subquery both read from the same table, "dom_small". Also, you are not computing an aggregate count(*); count appears to be an actual column in your table.
That said, I would add an index to help optimize the query on
( dom, was_blocked, count )

Related

How to select random users

I have the following table with its fields. The table has more than 30K users,
and each user has more than 1000 records.
The user id column is named ANONID. I want to randomly select 100 users with all their records, using MySQL.
Thanks in advance.
The "rand()" function mentioned in earlier answers does not work for this in SQL Server 2012. For SQL Server, use the following query to get 100 random rows using the "newid()" function
("newid()" is a built-in SQL Server function)
select * from table
order by newid()
offset 0 rows
fetch next 100 rows only
For a table of 30,000 and a single sample, you can use:
select t.*
from t
order by rand()
limit 100;
This does exactly what you want. It will take a few seconds to return.
If performance is an issue, there are other more complicated methods for sampling the data. A simple method reduces the number of rows before the order by. So a 5% sample will speed the query and here is one method for doing that:
select t.*
from t
where rand() < 0.05
order by rand()
limit 100;
EDIT:
You want what is called a clustered sample or hierarchical sample. Use a subquery:
select t.*
from t join
(select userid
from (select distinct userid from t) t
order by rand()
limit 100
) tt
on t.userid = tt.userid;
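To make the clustered-sample idea concrete, here is a small runnable sketch. It is an illustration only: it uses Python's sqlite3 module and an in-memory SQLite table as a stand-in for MySQL (so random() replaces rand()), and the table contents, the payload column, and the SAMPLE_USERS name are made up for the example.

```python
import sqlite3

# In-memory SQLite stand-in; 300 hypothetical users with 5 records each.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (userid INTEGER, payload TEXT)")
conn.executemany(
    "INSERT INTO t VALUES (?, ?)",
    [(u, f"record-{r}") for u in range(1, 301) for r in range(5)],
)

SAMPLE_USERS = 100  # how many users to draw

# Pick the users first, then join back to fetch every record of each one.
rows = conn.execute(
    """
    SELECT t.*
    FROM t
    JOIN (SELECT userid
          FROM (SELECT DISTINCT userid FROM t) u
          ORDER BY random()
          LIMIT ?) tt
      ON t.userid = tt.userid
    """,
    (SAMPLE_USERS,),
).fetchall()

sampled_users = {userid for userid, _payload in rows}
```

Because the random ordering runs over the distinct user ids rather than over every record, the sort handles one row per user instead of one row per record.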
SELECT * FROM table
ORDER BY RAND()
LIMIT 100
It is slow, but it works.

How to avoid error "Incorrect usage/placement of SQL_CALC_FOUND_ROWS"?

We have in our application two queries, the first very long which takes 30 seconds, and then another to get the number of rows before the LIMIT:
SELECT DISTINCT SQL_CALC_FOUND_ROWS res.*,...GROUP BY...HAVING...ORDER BY fqa_value LIMIT 0,10;
SELECT FOUND_ROWS() as count
We can optimize the speed from 30 seconds down to 1 second if we take out the "ORDER BY fqa_value".
So I put everything in a subselect and then sort it:
select * from (
SELECT DISTINCT SQL_CALC_FOUND_ROWS res.*,...GROUP BY...HAVING...LIMIT 0,10
) as temptable order by fqa_value;
However this gives me the error: "Incorrect usage/placement of SQL_CALC_FOUND_ROWS".
If I take the SQL_CALC_FOUND_ROWS out, it works:
select * from (
SELECT DISTINCT res.*,...GROUP BY...HAVING...ORDER BY fqa_value LIMIT 0,10
) as temptable order by fqa_value;
But then I don't have the original number of rows that was selected before GROUP BY and HAVING.
How can I both (a) count the original rows, and (b) have a fast query? I'm looking preferably for a pure MySQL solution so we don't have to change the code if necessary.
SQL_CALC_FOUND_ROWS should go before distinct:
SELECT SQL_CALC_FOUND_ROWS DISTINCT res.*,...GROUP BY...HAVING...ORDER BY fqa_value LIMIT 0,10;
SELECT FOUND_ROWS() as count

How can I limit query's results without using LIMIT

I need to show 20 ordered records on my grid, but I can't use LIMIT because my generator (Scriptcase) uses LIMIT itself to control the lines per page. It's a bug in the generator, but I need to work around it for my project. Is it possible to show 20 ordered records from my table with a query?
As discussed in the comments, if you can't use LIMIT you can rank your results by some ordering and then, in an outer select, keep only the rows whose rank number is within the limit:
select * from (
select *
,@r:=@r + 1 as row_num
from your_table_name
cross join (select @r:=0)t
order by some_column asc /* or desc*/
) t1
where row_num <= 20
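The same rank-then-filter idea can also be written with a window function instead of a user variable. The sketch below is an assumption-laden illustration, not the answer's own method: it runs against an in-memory SQLite table through Python's sqlite3 module (SQLite 3.25 or later is needed for ROW_NUMBER(); MySQL 8.0 also supports it), and the table contents are made up.

```python
import sqlite3

# In-memory SQLite stand-in with 100 hypothetical rows.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE your_table_name (id INTEGER PRIMARY KEY, some_column INTEGER)"
)
# (i * 37) % 101 gives each row a distinct value between 1 and 100.
conn.executemany(
    "INSERT INTO your_table_name VALUES (?, ?)",
    [(i, (i * 37) % 101) for i in range(1, 101)],
)

# No LIMIT anywhere: the rank number does the limiting.
top20 = conn.execute(
    """
    SELECT id, some_column
    FROM (
        SELECT id, some_column,
               ROW_NUMBER() OVER (ORDER BY some_column ASC) AS row_num
        FROM your_table_name
    ) t1
    WHERE row_num <= 20
    ORDER BY row_num
    """
).fetchall()
```

Since the query itself contains no LIMIT, a generator that appends its own LIMIT for paging should not collide with it.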
Another hackish way is to use group_concat() with ORDER BY to build an ordered list of ids (asc or desc), substring_index() to keep only the ids you need (20 here), and then a join with the same table using find_in_set(). This solution is very expensive in terms of performance, and it runs into group_concat length limits if you need more than 20 records.
select t.*
from your_table_name t
join (
select
substring_index(group_concat(id order by some_column asc),',',20) ids_list
from your_table_name
) t1 on (find_in_set(t.id , t1.ids_list) > 0)
What about SELECT in SELECT:
SELECT *
FROM (
-- put your query here
-- with LIMIT 20
) q
This way the outer SELECT has no LIMIT, and your generator can add its own.
In a Scriptcase Grid, you CAN use Limit. This is a valid SQL query that selects only the first 20 records from a table. The grid is set to show only 10 records per page, so it will show 20 results split in a total of 2 pages:
SELECT
ProductID,
ProductName
FROM
Products
LIMIT 20
The nested query also works:
SELECT
ProductID,
ProductName
FROM
(SELECT
ProductID,
ProductName
FROM Products LIMIT 20) tmp

SQL Distinct - Get all values

Thanks for looking, I'm trying to get 20 entries from the database randomly and unique, so the same one doesn't appear twice. But I also have a questionGroup field, which should also not appear twice. I want to make that field distinct, but then get the ID of the field selected.
Below is my NOT WORKING script; it fails because it treats the ID as distinct too:
SELECT DISTINCT `questionGroup`,`id`
FROM `questions`
WHERE `area`='1'
ORDER BY rand() LIMIT 20
Any advice is greatly appreciated!
Thanks
Try doing the group by/distinct first in a subquery:
select *
from (select distinct `questionGroup`,`id`
from `questions`
where `area`='1'
) qc
order by rand()
limit 20
I see . . . What you want is to select a random row from each group, and then limit it to 20 groups. This is a harder problem. I'm not sure if you can do this accurately with a single query in mysql, not using variables or outside tables.
Here is an approximation:
select *
from (select q.`questionGroup`,
coalesce(max(case when rand()*num < 1 then id end), min(id)) as id
from `questions` q join
(select questionGroup, count(*) as num
from questions
group by questionGroup
) qg
on qg.questionGroup = q.questionGroup
where `area`='1'
group by q.`questionGroup`
) qc
order by rand()
limit 20
This uses rand() to flag ids within each grouping: with rand()*num < 1, each row is flagged with probability 1/num, so roughly one per grouping on average (but it is random, so sometimes 0, 1, 2, etc.). It chooses the max() of these. If none are flagged, it takes the minimum.
This will be slightly biased away from the maximum id (or minimum, if you switch the min's and max's in the equation). For most applications, I'm not sure that this bias would make a big difference. In other databases that support ranking functions, you can solve the problem directly.
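For comparison, here is what the direct solution with a ranking function can look like. This is a sketch under assumptions rather than part of the answer above: it uses Python's sqlite3 module with an in-memory SQLite table (SQLite 3.25 or later; MySQL 8.0 offers the same ROW_NUMBER() syntax with rand() in place of random()), and the sample rows are made up.

```python
import sqlite3

# In-memory SQLite stand-in: 200 hypothetical questions in 40 groups.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE questions (id INTEGER PRIMARY KEY, questionGroup INTEGER, area TEXT)"
)
conn.executemany(
    "INSERT INTO questions VALUES (?, ?, ?)",
    [(i, i % 40, "1") for i in range(1, 201)],
)

# Number the rows of each group in random order, keep the first row of
# every group, then take 20 of those groups at random.
picked = conn.execute(
    """
    SELECT questionGroup, id
    FROM (
        SELECT questionGroup, id,
               ROW_NUMBER() OVER (PARTITION BY questionGroup
                                  ORDER BY random()) AS rn
        FROM questions
        WHERE area = '1'
    ) ranked
    WHERE rn = 1
    ORDER BY random()
    LIMIT 20
    """
).fetchall()
```

Each group contributes exactly one uniformly chosen id, so this avoids the bias toward the minimum or maximum id described above.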
Something like this
SELECT DISTINCT *
FROM (
SELECT `questionGroup`,`id`
FROM `questions`
WHERE `area`='1'
ORDER BY rand()
) As q
LIMIT 20

How to SUM() from an offset through the end of the table?

If SELECT SUM(amount) FROM transactions ORDER BY order LIMIT 0, 50 sums the amount field for the first 50 records in a table, how do I sum all records after the first 50? In other words, I'd like to do something like SELECT SUM(amount) from transactions ORDER BY order LIMIT 50, *, but that doesn't work.
SELECT SUM(amount)
FROM (
SELECT amount
FROM transactions
ORDER BY
`order`
LIMIT 50, 1000000000000
) q
Note that your original query:
SELECT SUM(amount)
FROM transactions
ORDER BY
`order`
LIMIT 0, 50
does not do what you probably think it does. It is equivalent to this:
SELECT a_sum, `order`
FROM (
SELECT SUM(amount) AS a_sum, `order`
FROM transactions
) q
ORDER BY
`order`
LIMIT 0, 50
The inner query (which would normally fail in any other engine but works in MySQL due to its GROUP BY extension syntax) returns only one record.
ORDER BY and LIMIT are then applied to that one aggregated record, not to the records of transactions.
The documentation advises using a very large number as the second parameter to LIMIT:
To retrieve all rows from a certain offset up to the end of the result set, you can use some large number for the second parameter. This statement retrieves all rows from the 96th row to the last:
SELECT * FROM tbl LIMIT 95,18446744073709551615;
There is probably a more efficient way, but you could run a count query first, to retrieve total # of rows in your table:
SELECT count(*) FROM transactions
Stuff that into a variable and use that variable as your second argument for LIMIT. You could probably do this as a nested mysql query.
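The difference between limiting the aggregate and limiting the rows that feed it can be checked on a toy table. The sketch below is an illustration under assumptions: it uses Python's sqlite3 module with an in-memory SQLite table as a stand-in for MySQL (SQLite also accepts the LIMIT offset, count form), and the 100 sample rows are made up.

```python
import sqlite3

# In-memory SQLite stand-in: 100 hypothetical rows with amount = 1..100.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE transactions (`order` INTEGER PRIMARY KEY, amount INTEGER)")
conn.executemany(
    "INSERT INTO transactions VALUES (?, ?)", [(i, i) for i in range(1, 101)]
)

# LIMIT applied to the aggregated result: there is only one row to limit,
# so this still sums the whole table.
naive = conn.execute(
    "SELECT SUM(amount) FROM transactions ORDER BY `order` LIMIT 0, 50"
).fetchone()[0]

# LIMIT applied inside a subquery: skip the first 50 rows, sum the rest.
rest = conn.execute(
    """
    SELECT SUM(amount)
    FROM (
        SELECT amount
        FROM transactions
        ORDER BY `order`
        LIMIT 50, 1000000000000
    ) q
    """
).fetchone()[0]

print(naive, rest)  # 5050 3775
```

Here naive is the sum of all 100 amounts, while rest covers only rows 51 through 100.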