How can I optimize my query (rank query)? - mysql

For the last two days, I have been asking questions about rank queries in MySQL. So far, I have working queries to
query all the rows from a table and order them by rank
query ONLY one row with its rank
Here is a link for my question from last night
How to get a row rank?
As you might notice, btilly's query is pretty fast.
Here is a query for getting ONLY one row with its rank that I made based on btilly's query.
set @points = -1;
set @num = 0;
select * from (
SELECT id
, points
, @num := if(@points = points, @num, @num + 1) as point_rank
, @points := points as dummy
FROM points
ORDER BY points desc, id asc
) as test where test.id = 3
The above query uses a subquery, so I am worried about its performance.
Are there any faster queries that I can use?
Table points
id points
1 50
2 50
3 40
4 30
5 30
6 20

Don't get into a panic about subqueries. Subqueries aren't always slow - only in some situations. The problem with your query is that it requires a full scan.
Here's an alternative that should be faster:
SELECT COUNT(DISTINCT points) + 1
FROM points
WHERE points > (SELECT points FROM points WHERE id = 3)
Add an index on id (I'm guessing that you probably want a primary key here) and another index on points to make this query perform efficiently.
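To see what the COUNT(DISTINCT ...) query returns for the sample table, here is a minimal sketch in Python. It uses an in-memory SQLite database as a stand-in for MySQL, so the helper name dense_rank_of and the index name idx_points are my own assumptions, not part of the original question.

```python
import sqlite3

# In-memory SQLite database used as a stand-in for the MySQL table in the question.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE points (id INTEGER PRIMARY KEY, points INTEGER)")
conn.execute("CREATE INDEX idx_points ON points (points)")  # hypothetical index name
conn.executemany(
    "INSERT INTO points (id, points) VALUES (?, ?)",
    [(1, 50), (2, 50), (3, 40), (4, 30), (5, 30), (6, 20)],
)


def dense_rank_of(conn, row_id):
    """Return the rank of one row: 1 + the number of distinct higher scores."""
    sql = """
        SELECT COUNT(DISTINCT points) + 1
        FROM points
        WHERE points > (SELECT points FROM points WHERE id = ?)
    """
    return conn.execute(sql, (row_id,)).fetchone()[0]


# Rank every row one at a time, only to show that ties share a rank.
ranks = {row_id: dense_rank_of(conn, row_id) for row_id in range(1, 7)}
print(ranks)
```

Rows with the same points get the same rank, which matches what the variable-based query produces for this table.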

Related

Below Sql query is taking too much time. How to make this faster?

SELECT
call_id
,call_date
,call_no
,call_amountdue
,rechargesamount
,call_penalty
,callpayment_received
,calldiscount
FROM `call`
WHERE calltype = 'Regular'
AND callcode = 98
AND call_connect = 1
AND call_date < '2018-01-01'
ORDER BY
`call_date` DESC
,`call_id` DESC
limit 1
There are already indexes on call_date, callcode, calltype, callconnect.
The table has 10 million records, and the query takes 2 minutes.
How can I get results within 3 seconds?
INDEX (calltype, callcode, call_connect, -- in any order
call_date, -- next
call_id) -- last
This will make it possible to find the one row that is desired without having to step over other rows.
Since you seem to have INDEX(calltype), drop it; it will be in the way and is redundant anyway. The rest of the indexes you mentioned will be ignored.
More discussion in Index Cookbook
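As a rough illustration of the composite index, here is a small Python sketch. It runs against in-memory SQLite rather than MySQL, with a cut-down table, made-up rows, and a hypothetical index name (idx_call_lookup), so treat the printed query plan as indicative only; MySQL's optimizer may behave differently.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Cut-down version of the table: only the columns the WHERE/ORDER BY touch.
conn.execute("""
    CREATE TABLE "call" (
        call_id INTEGER PRIMARY KEY,
        call_date TEXT,
        calltype TEXT,
        callcode INTEGER,
        call_connect INTEGER
    )
""")
# Equality columns first, then the range/sort column, then the tie-breaker.
conn.execute("""
    CREATE INDEX idx_call_lookup
    ON "call" (calltype, callcode, call_connect, call_date, call_id)
""")
rows = [  # made-up sample data
    (1, "2017-11-30", "Regular", 98, 1),
    (2, "2017-12-31", "Regular", 98, 1),
    (3, "2017-12-31", "Regular", 98, 1),
    (4, "2018-01-05", "Regular", 98, 1),  # date is too late
    (5, "2017-12-31", "Special", 98, 1),  # wrong calltype
    (6, "2017-12-31", "Regular", 98, 0),  # call_connect is not 1
]
conn.executemany('INSERT INTO "call" VALUES (?, ?, ?, ?, ?)', rows)

query = """
    SELECT call_id, call_date
    FROM "call"
    WHERE calltype = 'Regular' AND callcode = 98 AND call_connect = 1
      AND call_date < '2018-01-01'
    ORDER BY call_date DESC, call_id DESC
    LIMIT 1
"""
latest = conn.execute(query).fetchone()
print(latest)

# Show how SQLite plans the query; the exact wording depends on the SQLite version.
for step in conn.execute("EXPLAIN QUERY PLAN " + query):
    print(step)
```

The idea is that the three equality columns narrow the index to one contiguous range that is already sorted by call_date and call_id, so the engine can read a single entry from the end of that range.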

Dividing the SQL result into two halves

The SQL query is :
Select ProductName from Products;
The above query returns 5000 rows.
How can the result of 5000 rows be divided into two result sets of 2500 rows each, i.e., one result set from 1 to 2500 and the other from 2501 to 5000?
Note:
Here ProductName is the primary key. No ProductID column is present in the table.
It can be done either in the back end or in the front end.
An approach that works for MySQL (based on this answer: https://stackoverflow.com/a/4741301/14015737):
Upper half
SELECT *
FROM (
SELECT test.*, @counter := @counter +1 counter
FROM (select @counter:=0) initvar, test
ORDER BY num
) X
WHERE counter <= round(50/100 * @counter)
ORDER BY num;
Lower half
Invert the sort order and remove the rounding
SELECT *
FROM (
SELECT test.*, @counter := @counter +1 counter
FROM (select @counter:=0) initvar, test
ORDER BY num DESC
) X
WHERE counter <= (50/100 * @counter)
ORDER BY num;
In case of an uneven number of records, the middle record is added to the upper half in this example. If you want it the other way around, move the round() to the other statement. If you don't want it at all, remove round().
Dbfiddle example: https://dbfiddle.uk/?rdbms=mysql_5.7&fiddle=fb70eae0f7f1434a24099b5bb19f0878
If you know the numbers that you want, just use limit:
select ProductName
from Products
order by id
And then either:
limit 2500 -- first half
limit 2500 offset 2500 -- second half
If you simply want the results split into half, then you can use:
select t.*
from (select t.*,
ntile(2) over (order by <primary key>) as tile
from t
) t
where tile = 1; -- or 2 for the other half
The easiest and probably fastest approach is to use the table's primary key if you are fine with getting the rows in its order.
Run
select productname, id from products order by id;
and fetch 2500 rows. Then with the last ID, say ID 3456, run
select productname, id from products where id > 3456 order by id;
and fetch 2500 rows again. Etc.
UPDATE: Seeing I got a downvote for this, I'd better explain :-)
The query returns 5000 rows now and the OP doesn't want that many rows, so they want to cut this in halves. But the query may well return 10000 rows next year. Will the OP suddenly be fine with getting 5000 rows at once? This doesn't seem likely. It is more likely that there is an amount of rows that shall not be surpassed. This is why I cut the amount into slices of 2500.
The other approach, numbering all rows and returning the first n rows, has a severe drawback: all rows must be read again. Even if you decide to cut the result into chunks of 100 each, every time all rows must be read, sorted, numbered, and fetched from. Reading and sorting all the rows of a table is a lot of work for a DBMS.
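The "remember the last ID and continue from there" idea can be sketched in Python like this. It is only an illustration under some assumptions: it uses in-memory SQLite instead of MySQL, a tiny made-up products table with an integer id column (as in the queries above), and a LIMIT clause instead of stopping the fetch on the client side. The function name fetch_chunks is hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE products (id INTEGER PRIMARY KEY, productname TEXT)")
conn.executemany(
    "INSERT INTO products (id, productname) VALUES (?, ?)",
    [(i, f"product-{i}") for i in range(1, 11)],  # made-up rows
)


def fetch_chunks(conn, chunk_size):
    """Yield chunks of rows, resuming each query after the last id seen."""
    last_id = 0  # assumes ids are positive integers
    while True:
        chunk = conn.execute(
            "SELECT productname, id FROM products "
            "WHERE id > ? ORDER BY id LIMIT ?",
            (last_id, chunk_size),
        ).fetchall()
        if not chunk:
            return
        yield chunk
        last_id = chunk[-1][1]  # id of the last row in this chunk


chunks = list(fetch_chunks(conn, 4))
print([len(c) for c in chunks])
```

Each query can seek straight to id > last_id through the primary key, so earlier rows are not read and numbered again.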

Optimizing SQL query with sub queries

I have got a SQL query that I tried to optimize and I could reduce through various means the time from over 5 seconds to about 1.3 seconds, but no further. I was wondering if anyone would be able to suggest further improvements.
The Explain diagram shows a full scan:
explain diagram
The Explain table will give you more details:
explain tabular
The query is simplified and shown below - just for reference, I'm using MySQL 5.6
select * from (
select
@row_num := if(@yacht_id = yacht_id and @charter_type = charter_type and @start_base_id = start_base_id and @end_base_id = end_base_id, @row_num +1, 1) as row_number,
@yacht_id := yacht_id as yacht_id,
@charter_type := charter_type as charter_type,
@start_base_id := start_base_id as start_base_id,
@end_base_id := end_base_id as end_base_id,
model, offer_type, instant, rating, reviews, loa, berths, cabins, currency, list_price, list_price_per_day,
discount, client_price, client_price_per_day, days, date_from, date_to, start_base_city, end_base_city, start_base_country, end_base_country,
service_binary, product_id, ext_yacht_id, main_image_url
from (
select
offer.yacht_id, offer.charter_type, yacht.model, offer.offer_type, offer.instant, yacht.rating, yacht.reviews, yacht.loa,
yacht.berths, yacht.cabins, offer.currency, offer.list_price, offer.list_price_per_day,
offer.discount, offer.client_price, offer.client_price_per_day, offer.days, date_from, date_to,
offer.start_base_city, offer.end_base_city, offer.start_base_country, offer.end_base_country,
offer.service_binary, offer.product_id, offer.start_base_id, offer.end_base_id,
yacht.ext_yacht_id, yacht.main_image_url
from website_offer as offer
join website_yacht as yacht
on offer.yacht_id = yacht.yacht_id,
(select @yacht_id:='') as init
where date_from > CURDATE()
and date_to <= CURDATE() + INTERVAL 3 MONTH
and days = 7
order by offer.yacht_id, charter_type, start_base_id, end_base_id, list_price_per_day asc, discount desc
) as filtered_offers
) as offers
where row_number=1;
Thanks,
goppi
UPDATE
I had to abandon some performance improvements and replaced the original select with the new one. The select query is actually built dynamically by the backend based on which filter criteria are set. As such, the where clause of the innermost select can expand quite a lot. However, this is the default select if no filter is set, and it is the version that takes significantly longer than 1 sec.
explain in text form - doesn't come out pretty as I couldn't figure out how to format a table, but here it is:
1 PRIMARY ref <auto_key0> <auto_key0> 9 const 10
2 DERIVED ALL 385967
3 DERIVED system 1 Using filesort
3 DERIVED offer ref idx_yachtid,idx_search,idx_dates idx_dates 5 const 385967 Using index condition; Using where
3 DERIVED yacht eq_ref PRIMARY,id_UNIQUE PRIMARY 4 yachtcharter.offer.yacht_id 1
4 DERIVED No tables used
Sub selects are never great.
You should sign up here: https://www.eversql.com/
Run that and it will give you all the right indexes and optimisations you need for this query.
There is still some optimization available. Since the subquery returns only 5000 rows, an index could help it.
First rephrase the predicate as:
select *
from website_offer
where date_from >= CURDATE() + INTERVAL 1 DAY -- rephrased here
and date(date_to) <= CURDATE() + INTERVAL 3 MONTH
and days = 7
order by yacht_id, charter_type, list_price_per_day asc, discount desc
limit 5000
Then, if you add the following index the performance could improve:
create index ix1 on website_offer (days, date_from, date_to);

SELECT a column then INSERT that column again randomly

I've got a table:
player_id|player_name|play_with_id|play_with_name|
I made this table for a game.
Everyone who wants to play can sign up for it.
When they sign up, the table stores player_id and player_name.
When the sign-up period expires, I want to assign every player_name to a play_with_name randomly.
So, for example, my table would look like this during the sign-up period:
player_id|player_name|play_with_id|play_with_name|
1 someone1
2 someone2
3 someone3
4 someone4
5 someone5
And this when the period expires:
player_id|player_name|play_with_id|play_with_name|
1 someone1 2 someone2
2 someone2 1 someone1
3 someone3 4 someone4
4 someone4 3 someone3
5 someone5 - -
I can't test this since I don't have a MySQL database handy and SQLFiddle seems to take forever to run anything, but this hopefully gets you there or at least close:
SET @row_num = 0;
SET @last_player_id = 0;
UPDATE P
SET
play_with_id =
CASE
WHEN P.player_id = SQ.player_id THEN SQ.last_player_id
ELSE player_id
END
FROM
Players P
LEFT OUTER JOIN
(
SELECT
@row_num := @row_num + 1 row_num,
@last_player_id last_player_id,
@last_player_id := player_id player_id
FROM
Players
WHERE
MOD(@row_num, 2) = 0
ORDER BY
RAND()
) SQ ON SQ.player_id = P.player_id OR SQ.last_player_id = P.player_id
The code (hopefully) sorts the players randomly then it pairs them based on that order. Every other player in the randomly sorted result is paired with the person right before them.
In MS SQL Server RAND() would only be evaluated once here and wouldn't end up affecting the ORDER BY, but I think that MySQL handles RAND() differently and generates a new value for each row in the result set.
I'm not sure why some client code isn't doing this as opposed to having this operation be done at the database level, but I suppose if you get the strategy for retrieving a randomized row set based on your DB from here, you could then write a stored procedure with a cursor or iterator to loop through the result set of something like:
select player_id, player_name from players order by RAND()
and then loop through all the table rows to update play_with_id and play_with_name, where the previously selected player_id <> play_with_id.
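If the pairing is done in client code instead, the strategy of "sort randomly, then pair neighbours" could look roughly like the Python sketch below. The function name pair_players is made up for illustration, and writing the result back to the table (one UPDATE per player) is left out.

```python
import random


def pair_players(player_ids, rng=random):
    """Shuffle the ids and pair neighbours; an odd player out maps to None."""
    shuffled = list(player_ids)
    rng.shuffle(shuffled)
    partner = {}
    # Walk the shuffled list two at a time and link each pair both ways.
    for i in range(0, len(shuffled) - 1, 2):
        a, b = shuffled[i], shuffled[i + 1]
        partner[a] = b
        partner[b] = a
    # With an odd number of players the last one stays unpaired.
    if len(shuffled) % 2 == 1:
        partner[shuffled[-1]] = None
    return partner


pairs = pair_players([1, 2, 3, 4, 5])
print(pairs)
```

Each key is a player_id and each value is the play_with_id to store; the play_with_name can then be looked up from the partner's row.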

select random value based on probability chance

How do I select a random row from the database based on the probability (chance) assigned to each row?
Example:
Make Chance Value
ALFA ROMEO 0.0024 20000
AUDI 0.0338 35000
BMW 0.0376 40000
CHEVROLET 0.0087 15000
CITROEN 0.016 15000
........
How do I select a random make name and its value based on the probability it has of being chosen?
Would a combination of rand() and ORDER BY work? If so what is the best way to do this?
You can do this by using rand() and then using a cumulative sum. Assuming they add up to 100%:
select t.*
from (select t.*, (@cumep := @cumep + chance) as cumep
from t cross join
(select @cumep := 0, @r := rand()) params
) t
where @r between cumep - chance and cumep
limit 1;
Notes:
rand() is called once in a subquery to initialize a variable. Multiple calls to rand() are not desirable.
There is a remote chance that the random number will be exactly on the boundary between two values. The limit 1 arbitrarily chooses 1.
This could be made more efficient by stopping the subquery when cumep > @r.
The values do not have to be in any particular order.
This can be modified to handle chances where the sum is not equal to 1, but that would be another question.
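The same cumulative-sum idea, written out in Python for illustration. This is only a sketch: the rows are made-up (make, chance, value) tuples whose chances add up to 1, not the real data from the question, and pick_weighted is a hypothetical helper name. The optional r parameter exists only so the behaviour can be checked with a fixed number.

```python
import random


def pick_weighted(rows, r=None):
    """Pick one (make, chance, value) row using a running sum of chances.

    Assumes the chances sum to 1, as the SQL above does.
    """
    if r is None:
        r = random.random()  # drawn once, like @r in the SQL
    cumulative = 0.0
    for row in rows:
        cumulative += row[1]
        if r < cumulative:  # stop as soon as the running sum passes r
            return row
    return rows[-1]  # guards against floating-point rounding at the top end


# Made-up sample rows; chances sum to 1.
cars = [("A", 0.2, 100), ("B", 0.5, 200), ("C", 0.3, 300)]
print(pick_weighted(cars, r=0.1))
print(pick_weighted(cars))
```

Because the loop returns as soon as the running sum passes r, it also shows the "stop early" optimisation mentioned in the notes, and the rows can be in any order.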