How to order data randomly - mysql

I have the following MySQL query and want to pull the data in random order.
Could anyone show me how to do it, please?
$Q = $this->db->query('SELECT P.*, C.Name AS CatName
FROM products AS P
LEFT JOIN categories C
ON C.id = P.category_id
WHERE C.Name = "Front bottom"
AND P.status = "active"
');

$Q = $this->db->query('SELECT P.*, C.Name AS CatName
FROM products AS P
LEFT JOIN categories C
ON C.id = P.category_id
WHERE C.Name = "Front bottom"
AND P.status = "active"
ORDER BY RAND()
');
You can use MySQL's RAND() function for that, but note that it performs very slowly on a large dataset (roughly 10k rows or more): MySQL generates a random number for every row of the table, which becomes a problem when the table is huge.
A safer method is to run SELECT COUNT(*) AS n FROM table, pick a random number r between 0 and n - 1, and then query with LIMIT r, 1 to fetch the row at that offset. That works if you need only one row, or if you don't mind getting consecutive rows in their stored order.
Beyond that, if you really need a completely random set, it is better to do it on the server side, in my opinion.
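The count-then-offset idea can be sketched as follows. This is only an illustration under stated assumptions: Python's built-in sqlite3 module stands in for MySQL so the snippet runs on its own, and the `products` table, its columns, and the `random_active_product` helper are hypothetical. In MySQL the final query would use `LIMIT offset, 1`.

```python
import random
import sqlite3

# In-memory SQLite stands in for MySQL; table and data are made up for the demo.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE products (id INTEGER PRIMARY KEY, name TEXT, status TEXT)")
conn.executemany(
    "INSERT INTO products (name, status) VALUES (?, ?)",
    [(f"product {i}", "active" if i % 2 else "inactive") for i in range(1, 101)],
)


def random_active_product(conn):
    """Return one random active row using COUNT(*) plus a random offset."""
    (n,) = conn.execute(
        "SELECT COUNT(*) FROM products WHERE status = 'active'"
    ).fetchone()
    if n == 0:
        return None
    offset = random.randrange(n)  # uniform in 0 .. n-1
    # Same idea as MySQL's "LIMIT offset, 1": skip `offset` rows, take one.
    return conn.execute(
        "SELECT id, name FROM products WHERE status = 'active' "
        "ORDER BY id LIMIT 1 OFFSET ?",
        (offset,),
    ).fetchone()


row = random_active_product(conn)
```

The database still has to walk past the skipped rows, so this is not free, but it avoids generating and sorting a random number for every row.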

You can try
ORDER BY RAND()

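The easiest way is ORDER BY RAND(), but its performance is miserable, especially for larger datasets (it requires a random number for every matching row).
Another way is to generate random ids, either in your code or with RAND() again: WHERE id IN (RAND(), RAND(), RAND(), RAND()) should work, but there is no guarantee. Note that RAND() alone returns a value between 0 and 1, so each call has to be scaled to the id range, for example FLOOR(1 + RAND() * max_id), where max_id stands for the largest id in the table. This gets problematic as soon as some ids don't exist.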
Here is an interesting article on the topic.
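One common way around missing ids is to pick a random number in the id range and take the first row at or above it. The sketch below is an assumption-laden illustration, not the linked article's method: sqlite3 stands in for MySQL, and the `products` table and `random_row_by_id` helper are hypothetical. Rows that follow a large gap are picked more often, so the result is not uniformly random.

```python
import random
import sqlite3

# In-memory SQLite stands in for MySQL; the ids deliberately contain gaps.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE products (id INTEGER PRIMARY KEY, name TEXT)")
existing_ids = [1, 2, 5, 9, 10, 42]
conn.executemany(
    "INSERT INTO products (id, name) VALUES (?, ?)",
    [(i, f"product {i}") for i in existing_ids],
)


def random_row_by_id(conn):
    """Pick a random id in [MIN(id), MAX(id)] and return the first row at or above it."""
    lo, hi = conn.execute("SELECT MIN(id), MAX(id) FROM products").fetchone()
    if lo is None:  # empty table
        return None
    target = random.randint(lo, hi)
    # "id >= target" tolerates gaps: a missing id falls through to the next existing one.
    return conn.execute(
        "SELECT id, name FROM products WHERE id >= ? ORDER BY id LIMIT 1",
        (target,),
    ).fetchone()


picked = random_row_by_id(conn)
```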

Related

Speeding up mysql query

I have a MySQL query that joins four tables. I thought joining the tables was the best approach, but now that the data is getting bigger, the query seems to make the application stop executing.
SELECT
`purchase_order`.`id`,
`purchase_order`.`po_date` AS po_date,
`purchase_order`.`po_number`,
`purchase_order`.`customer_id` AS customer_id ,
`customer`.`name` AS customer_name,
`purchase_order`.`status` AS po_status,
`purchase_order_items`.`product_id`,
`purchase_order_items`.`po_item_name`,
`product`.`weight` as product_weight,
`product`.`pending` as product_pending,
`product`.`company_owner` as company_owner,
`purchase_order_items`.`uom`,
`purchase_order_items`.`po_item_type`,
`purchase_order_items`.`order_sequence`,
`purchase_order_items`.`pending_balance`,
`purchase_order_items`.`quantity`,
`purchase_order_items`.`notes`,
`purchase_order_items`.`status` AS po_item_status,
`purchase_order_items`.`id` AS po_item_id
FROM `purchase_order`
INNER JOIN customer ON `customer`.`id` = `purchase_order`.`customer_id`
INNER JOIN purchase_order_items ON `purchase_order_items`.`po_id` = `purchase_order`.`id`
INNER JOIN product ON `purchase_order_items`.`product_id` = `product`.`id`
GROUP BY id ORDER BY `purchase_order`.`po_date` DESC LIMIT 0, 20
My real problem is that the query takes a long time to finish. Is there a way to speed it up, or to change it for faster retrieval of the data?
Here's the EXPLAIN EXTENDED as requested in the comments.
Thanks in advance. I really hope this is the right channel for me to ask; if not, please let me know.
Will this give you the correct list of ids?
SELECT id
FROM purchase_order
ORDER BY `po_date` DESC
LIMIT 0, 20
If so, then start with that before launching into the JOIN. You can also (I think) get rid of the GROUP BY that is causing an "explode-implode" of rows.
SELECT ...
FROM ( SELECT id ... (as above) ...) AS ids
JOIN purchase_order po ON po.id = ids.id
JOIN ... (the other tables)
GROUP BY ... -- (this may be problematic, especially with the LIMIT)
ORDER BY po.po_date DESC -- yes, this needs repeating
-- no LIMIT
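A minimal runnable sketch of this "pick the ids first, then join" pattern follows. It assumes a cut-down, hypothetical version of the schema (only `purchase_order` and `purchase_order_items`, with a few columns) and uses sqlite3 in place of MySQL; the SQL inside should carry over largely unchanged.

```python
import datetime
import sqlite3

# In-memory SQLite stands in for MySQL; schema is a simplified assumption.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE purchase_order (id INTEGER PRIMARY KEY, po_date TEXT);
CREATE TABLE purchase_order_items (
    id INTEGER PRIMARY KEY, po_id INTEGER, po_item_name TEXT
);
""")
start = datetime.date(2024, 1, 1)
for po_id in range(1, 51):
    po_date = (start + datetime.timedelta(days=po_id)).isoformat()
    conn.execute(
        "INSERT INTO purchase_order (id, po_date) VALUES (?, ?)", (po_id, po_date)
    )
    conn.executemany(
        "INSERT INTO purchase_order_items (po_id, po_item_name) VALUES (?, ?)",
        [(po_id, f"item {n} of order {po_id}") for n in range(1, 4)],
    )

# The derived table limits to the 20 newest orders BEFORE the join,
# so the join only touches items of those 20 orders.
rows = conn.execute("""
    SELECT po.id, po.po_date, poi.po_item_name
    FROM (SELECT id FROM purchase_order ORDER BY po_date DESC LIMIT 20) AS ids
    JOIN purchase_order po ON po.id = ids.id
    JOIN purchase_order_items poi ON poi.po_id = po.id
    ORDER BY po.po_date DESC, poi.id
""").fetchall()

order_ids = {r[0] for r in rows}
```

Here the result holds every item of the 20 newest orders, whereas a LIMIT on the outer query would cap the number of item rows instead.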
Something like this
SELECT
`purchase_order`.`id`,
`purchase_order`.`po_date` AS po_date,
`purchase_order`.`po_number`,
`purchase_order`.`customer_id` AS customer_id ,
`customer`.`name` AS customer_name,
`purchase_order`.`status` AS po_status,
`purchase_order_items`.`product_id`,
`purchase_order_items`.`po_item_name`,
`product`.`weight` as product_weight,
`product`.`pending` as product_pending,
`product`.`company_owner` as company_owner,
`purchase_order_items`.`uom`,
`purchase_order_items`.`po_item_type`,
`purchase_order_items`.`order_sequence`,
`purchase_order_items`.`pending_balance`,
`purchase_order_items`.`quantity`,
`purchase_order_items`.`notes`,
`purchase_order_items`.`status` AS po_item_status,
`purchase_order_items`.`id` AS po_item_id
FROM (SELECT id, po_date, po_number, customer_id, status
FROM purchase_order
ORDER BY `po_date` DESC
LIMIT 0, 5) as purchase_order
INNER JOIN customer ON `customer`.`id` = `purchase_order`.`customer_id`
INNER JOIN purchase_order_items
ON `purchase_order_items`.`po_id` = `purchase_order`.`id`
INNER JOIN product ON `purchase_order_items`.`product_id` = `product`.`id`
GROUP BY purchase_order.id
ORDER BY purchase_order.id DESC
LIMIT 0, 5
You need to be sure that purchase_order.po_date and all the id columns are indexed. You can check with the query below.
SHOW INDEX FROM yourtable;
Since you mentioned that the data is getting bigger, I would suggest sharding so that you can parallelize multiple queries. Please refer to the following article:
Parallel Query for MySQL with Shard-Query
First, I cleaned up the query for readability. You don't need tick marks around every table.column reference, and short aliases work well: "po" instead of "purchase_order", "poi" instead of "purchase_order_items". The only time I would use tick marks is around reserved words that might cause a problem.
Second, you don't have any aggregates (SUM, MIN, MAX, COUNT, AVG, etc.) in your query, so you should be able to strip the GROUP BY clause.
As for indexes, I have to assume your reference tables each have an index on their respective "id" key columns.
For the purchase_order table, I would have an index with "po_date" in the first field position, in case no existing index already starts with it. Since your ORDER BY is on that column, the engine can jump directly to the records in date order, and your descending order is resolved by the index.
SELECT
po.id,
po.po_date,
po.po_number,
po.customer_id,
c.`name` AS customer_name,
po.`status` AS po_status,
poi.product_id,
poi.po_item_name,
p.weight as product_weight,
p.pending as product_pending,
p.company_owner,
poi.uom,
poi.po_item_type,
poi.order_sequence,
poi.pending_balance,
poi.quantity,
poi.notes,
poi.`status` AS po_item_status,
poi.id AS po_item_id
FROM
purchase_order po
INNER JOIN customer c
ON po.customer_id = c.id
INNER JOIN purchase_order_items poi
ON po.id = poi.po_id
INNER JOIN product p
ON poi.product_id = p.id
ORDER BY
po.po_date DESC
LIMIT
0, 20
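To see the effect of such an index, one can create it and inspect the query plan. The sketch below uses sqlite3 and its `EXPLAIN QUERY PLAN` purely as a stand-in (MySQL has its own `CREATE INDEX` and `EXPLAIN`, with different output); the index name `idx_po_date` and the cut-down table are assumptions for illustration.

```python
import sqlite3

# In-memory SQLite stands in for MySQL; the table is a simplified assumption.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE purchase_order (id INTEGER PRIMARY KEY, po_date TEXT, po_number TEXT)"
)
# Hypothetical index name; po_date is the leading (here: only) column.
conn.execute("CREATE INDEX idx_po_date ON purchase_order (po_date)")

plan = conn.execute(
    "EXPLAIN QUERY PLAN "
    "SELECT id, po_date FROM purchase_order ORDER BY po_date DESC LIMIT 20"
).fetchall()

# The last column of each plan row is the human-readable detail text.
plan_text = " ".join(str(row[-1]) for row in plan)
```

If the plan text names the index, the engine is reading rows in index order rather than sorting the whole table before applying the LIMIT.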

Get random rows, but not with ORDER BY RAND() MariaDB

I have quite a big SQL query that needs to select random rows, but because the table is large, ORDER BY RAND() takes a really long time.
$getdata = $this->db->query("
SELECT DISTINCT property.id,property.unid,property.imported,property.userid,
CONCAT(user.firstname) as username,property.url,
IFNULL(user.thumbpic,'temp/misc/noimage.png') as profilepic,
property.bedrooms,property.beds,type.meta_val as type,property.accommodates,property.price,
IFNULL((select thumbimg from tblpropertyimages where pid=property.id limit 1),'temp/misc/noimage.png') as image,
property.name as propertyname,(select sum(rating) from tblreviews where pid=property.id) as totalrating,
(select count(id) from tblreviews where pid=property.id) as countratings,
location.name as cityname from tblproperty as property join tbluser as user on property.userid=user.id
join tblcommon as type on property.type=type.id
left join tblpropertyamenities as p_amenities on property.id=p_amenities.pid
join tbllocation as location on location.id=property.city
WHERE property.status='Active' and user.status='Active'
$home $q limit $limit offset $start");
What is the best solution for selecting random rows, for this specific query?
Depending on your detailed requirements, there are several faster approaches here. None is 'perfect', but each is probably 'good enough'.

mysql: subquery with returned many row

I have this query to load the user stream in my app. Is it too heavy if there are 10,000 matching rows in 'follow'?
SELECT *
FROM post
WHERE user_id
IN (SELECT follow_id
FROM follow
WHERE id='$some_id')
AND type='accepted'
ORDER BY id DESC LIMIT $page , 20
Syntactically your code looks correct; I don't see any errors. If you're asking about efficiency, I would join the tables and put the second filter in the JOIN condition:
SELECT p.*
FROM post p
JOIN follow f
ON f.follow_id = p.user_id
AND f.id = '$some_id'
WHERE p.type = 'accepted'
ORDER BY p.id DESC LIMIT $page , 20
MySQL handles large sets of data a lot better through a JOIN than with IN().
Think of it this way: because IN() can contain pretty much anything, MySQL has to check each row against everything the subquery returns, instead of matching once when you JOIN.
With that many rows returned, I have a feeling a JOIN might be more efficient:
SELECT *
FROM post p Join
follow f On p.user_id = f.follow_id
WHERE f.id='$some_id'
AND p.type='accepted'
ORDER BY p.id DESC LIMIT $page , 20
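The two forms can be compared side by side. The sketch below is an illustration only: sqlite3 replaces MySQL, the sample rows are made up, and bound parameters (`?`) replace the interpolated `$some_id` and `$page`. It also assumes each (id, follow_id) pair in `follow` is unique; with duplicate pairs the JOIN form would return duplicate posts while the IN form would not.

```python
import sqlite3

# In-memory SQLite stands in for MySQL; data is hypothetical.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE follow (id INTEGER, follow_id INTEGER, UNIQUE (id, follow_id));
CREATE TABLE post (id INTEGER PRIMARY KEY, user_id INTEGER, type TEXT);
""")
conn.executemany(
    "INSERT INTO follow (id, follow_id) VALUES (?, ?)",
    [(1, 10), (1, 11), (2, 12)],
)
conn.executemany(
    "INSERT INTO post (user_id, type) VALUES (?, ?)",
    [(10, "accepted"), (11, "accepted"), (11, "pending"),
     (12, "accepted"), (10, "accepted")],
)

IN_QUERY = """
    SELECT p.id FROM post p
    WHERE p.user_id IN (SELECT follow_id FROM follow WHERE id = ?)
      AND p.type = 'accepted'
    ORDER BY p.id DESC LIMIT 20 OFFSET ?
"""
JOIN_QUERY = """
    SELECT p.id FROM post p
    JOIN follow f ON f.follow_id = p.user_id AND f.id = ?
    WHERE p.type = 'accepted'
    ORDER BY p.id DESC LIMIT 20 OFFSET ?
"""

# Parameters are bound rather than pasted into the SQL string.
in_rows = conn.execute(IN_QUERY, (1, 0)).fetchall()
join_rows = conn.execute(JOIN_QUERY, (1, 0)).fetchall()
```

Both queries return the same posts here; which one is faster depends on the MySQL version and the available indexes, so comparing their EXPLAIN output on real data is the reliable check.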

mysql random least used

I need to select a record at random but not one already selected before unless all records have been selected.
Table Setup:
_word (id, nam)
_word_tm (id, word_id, tm)
Every time a word is used it is loaded into _word_tm. What I am wanting to do is make sure I use all the words before I reuse an already used word.
What I am really looking for is something like the query below; I am just trying to figure out how to make the pieces mesh.
select w.nam FROM _word w LEFT JOIN _word_tm wt ON w.id = wt.word_id ORDER BY count(wt.id) asc, rand() limit 1
First, find out how many times the least-used word has been used. Count _word_tm.id rather than *, so that a word that has never been used counts as 0 instead of 1:
select _word.id, count(_word_tm.id) c from _word
left join _word_tm on _word.id=_word_tm.word_id
group by _word.id order by c limit 1;
Store that value (from c) in a variable $least_used. Then pick one of the words used that many times, at random:
select _word.id, count(_word_tm.id) c from _word
left join _word_tm on _word.id=_word_tm.word_id
group by _word.id having c <= {$least_used}
order by rand() limit 1;
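The two steps can also be folded into a single query by ordering on the usage count first and a random value second. The sketch below is a hedged illustration: sqlite3 stands in for MySQL (so `RANDOM()` appears where MySQL would use `RAND()`), the sample words are made up, and `pick_least_used` is a hypothetical helper.

```python
import sqlite3

# In-memory SQLite stands in for MySQL; sample data is hypothetical.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE _word (id INTEGER PRIMARY KEY, nam TEXT);
CREATE TABLE _word_tm (id INTEGER PRIMARY KEY, word_id INTEGER, tm TEXT);
INSERT INTO _word (id, nam) VALUES (1, 'alpha'), (2, 'beta'), (3, 'gamma');
INSERT INTO _word_tm (word_id, tm) VALUES (1, 't1'), (1, 't2'), (2, 't3');
""")


def pick_least_used(conn):
    """Pick a random word among the least-used ones and record the use."""
    row = conn.execute("""
        SELECT w.id, w.nam
        FROM _word w
        LEFT JOIN _word_tm wt ON wt.word_id = w.id
        GROUP BY w.id, w.nam
        ORDER BY COUNT(wt.id) ASC, RANDOM()  -- RAND() in MySQL
        LIMIT 1
    """).fetchone()
    if row is not None:
        # Log the use so the word moves out of the least-used group.
        conn.execute(
            "INSERT INTO _word_tm (word_id, tm) VALUES (?, datetime('now'))",
            (row[0],),
        )
    return row


first = pick_least_used(conn)
```

With the sample data, 'gamma' has never been used, so it is the only candidate for the first pick; after that, 'beta' and 'gamma' are tied at one use each.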
You should be able to do something like this:
SELECT * FROM word_table WHERE word NOT IN (SELECT word FROM words_used) ORDER BY rand() LIMIT 1;
You will need a version of MySQL that supports subqueries (4.1 or later) for this to work. You would also need a line of code, before or after, to clear/reset your words_used table once it has the same contents as word_table.
You can try the following SQL to return a random row
select A.nam
from _word A
where A.id not in (select B.word_id from _word_tm B)
order by rand()
limit 1
If the above does not return any record, do a simple
select A.nam
from _word A
order by rand()
limit 1

Joining Tables: case statement for no matches?

I have this query:
SELECT p.text,se.name,s.sub_name,SUM((p.volume / (SELECT SUM(p.volume)
FROM phrase p
WHERE p.volume IS NOT NULL) * sp.position))
AS `index`
FROM phrase p
LEFT JOIN `position` sp ON sp.phrase_id = p.id
LEFT JOIN `engines` se ON se.id = sp.engine_id
LEFT JOIN item s ON s.id = sp.site_id
WHERE p.volume IS NOT NULL
AND s.ignored = 0
GROUP BY se.name,s.sub_name
ORDER BY se.name,s.sub_name
There are a few things I want to do with it:
1) At the end of the calculation for 'index', I multiply it all by sp.position, then get its SUM. If there is NO MATCH in the first LEFT JOIN 'position', I want to give sp.position a value of 200. So basically, if in the 'phrase' table I have an ID=2, but that does not exist in sp.phrase_id in the entire 'position' table, then sp.position=200 for the 'index' calculation; otherwise it will be whatever value is stored in the 'position' table. I hope that makes sense.
2) I do a GROUP BY se.name. I would like to actually SUM the entire 'index' values for similar se.name fields. So in the resultset as it stands now, if there were 20 p.text rows with the same se.name, I would like to SUM the index column for the same se.name(s).
I am more of a PHP guy, but trying to learn more MySQL. I have become a big believer in making the DB do as much of the work as possible instead of trying to manipulate the dataset after it's been returned.
I hope the questions were clear. Anyways, can both 1) and 2) be done? There's much more I want to modify this query to do, but I think if I need more help in the future on it, it would require a different question.
The position table has a engines_id, phrase_id, item_id which will make it a unique entry. The value I am trying to calculate is the sp.position value. But there are cases when there is no entry for these IDs combined. If there is no entry for the combo of 3 IDs I just listed, I would like to use sp.position=200 in my calculation.
How's this:
select x.name, sum(x.`index`) AS `index` from
(
SELECT p.text,se.name,s.sub_name,SUM((p.volume / (SELECT SUM(p.volume)
FROM phrase p
WHERE p.volume IS NOT NULL) * if(sp.position is null,200,sp.position)))
AS `index`
FROM phrase p
LEFT JOIN `position` sp ON sp.phrase_id = p.id
LEFT JOIN `engines` se ON se.id = sp.engine_id
LEFT JOIN item s ON s.id = sp.site_id
WHERE p.volume IS NOT NULL
AND s.ignored = 0
GROUP BY se.name,s.sub_name
ORDER BY se.name,s.sub_name
)x
GROUP BY x.name
Try the following:
1.) Use IFNULL(), in your case IFNULL(sp.position, 200)
2.) I am not entirely clear on this part, but it seems like you already have part of what you are asking.
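The default-on-no-match part can be checked with a small example. This is a sketch under stated assumptions: sqlite3 stands in for MySQL (it also provides IFNULL), only the `phrase` and `position` tables are modelled, and the sample values are made up. One caveat worth keeping in mind: a WHERE condition on a LEFT JOINed table, such as s.ignored = 0, discards the unmatched rows, so for the default to take effect such a filter would have to move into the ON clause.

```python
import sqlite3

# In-memory SQLite stands in for MySQL; phrase 2 has no row in position.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE phrase (id INTEGER PRIMARY KEY, text TEXT, volume INTEGER);
CREATE TABLE position (id INTEGER PRIMARY KEY, phrase_id INTEGER, position INTEGER);
INSERT INTO phrase (id, text, volume) VALUES (1, 'foo', 100), (2, 'bar', 300);
INSERT INTO position (phrase_id, position) VALUES (1, 4);
""")

rows = conn.execute("""
    SELECT p.text,
           IFNULL(sp.position, 200) AS effective_position,
           -- share of total volume, weighted by the (possibly defaulted) position
           p.volume * 1.0 / (SELECT SUM(volume) FROM phrase WHERE volume IS NOT NULL)
               * IFNULL(sp.position, 200) AS weighted
    FROM phrase p
    LEFT JOIN position sp ON sp.phrase_id = p.id
    WHERE p.volume IS NOT NULL
    ORDER BY p.id
""").fetchall()
```

Here 'foo' keeps its stored position of 4, while 'bar', which has no matching row, is weighted with the default of 200.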