I got quite a big sql query, which needs to select random rows, but because table is large, order by rand() is taking really long.
$getdata = $this->db->query("
SELECT DISTINCT property.id,property.unid,property.imported,property.userid,
CONCAT(user.firstname) as username,property.url,
IFNULL(user.thumbpic,'temp/misc/noimage.png') as profilepic,
property.bedrooms,property.beds,type.meta_val as type,property.accommodates,property.price,
IFNULL((select thumbimg from tblpropertyimages where pid=property.id limit 1),'temp/misc/noimage.png') as image,
property.name as propertyname,(select sum(rating) from tblreviews where pid=property.id) as totalrating,
(select count(id) from tblreviews where pid=property.id) as countratings,
location.name as cityname from tblproperty as property join tbluser as user on property.userid=user.id
join tblcommon as type on property.type=type.id
left join tblpropertyamenities as p_amenities on property.id=p_amenities.pid
join tbllocation as location on location.id=property.city
WHERE property.status='Active' and user.status='Active'
$home $q limit $limit offset $start");
What is the best solution for selecting random rows, for this specific query?
Depending on your detailed requirements, there are several faster approaches in here None is 'perfect', but each is probably 'good enough'.
Related
Please check my query and suggest me indexing value and how can I decide which columns will be in indexes. Query is very slow when where clause exist otherwise query is just fine. Offset value is also slow down query.
SELECT
attachment.attachment_id AS attachmentID,
attachment.data_item_id AS candidateID,
attachment.title AS title,
candidate.first_name AS firstName,
candidate.last_name AS lastName,
candidate.city AS city,
candidate.state AS state
FROM
attachment
LEFT JOIN candidate
ON attachment.data_item_id = candidate.candidate_id
where candidate.is_active = 1
ORDER BY
lastName ASC
LIMIT 92000, 20
Your query is basically:
SELECT . . .
FROM attachment a JOIN
candidate c
ON a.data_item_id = c.candidate_id
WHERE c.is_active = 1
ORDER BY c.last_name ASC
LIMIT 92000, 20;
Note that the WHERE clause turns the LEFT JOIN into an INNER JOIN anyway, so there is no reason to use LEFT JOIN.
I would recommend the following indexes:
candidate(is_active, candidate_id, last_name)
attachment(data_item_id)
You could expand the indexes to include all columns being selected.
Note that offsetting 92,000 rows takes a bit of effort so the query will never be lightning fast.
Server won't byte on this query, it takes too long to execute:
select prodavac.id, count(artikl.id) as brojartikala, count(poruceno.id) as brojporudzbina from prod_prodavac prodavac
inner join prod_artikl artikl
on prodavac.id=artikl.prodavacid
inner join prod_poruceno poruceno
on prodavac.id=poruceno.prodavacid
group by prodavac.id
On the other hand, both semi-queries run mega fast:
select prodavac.id, count(artikl.id) as brojartikala from prod_prodavac prodavac
inner join prod_artikl artikl
on prodavac.id=artikl.prodavacid
group by prodavac.id
Also the other one:
select prodavac.id, count(poruceno.id) as brojporudzbina from prod_prodavac prodavac
inner join prod_poruceno poruceno
on prodavac.id=poruceno.prodavacid
group by prodavac.id
order by prodavac.id asc
I would really like to do it in one query, so how to merge them correct way? All IDs are indexed integers.
Explain select shows this:
Depending on the relations between the tables and the data, your combined query might not even return the desired result. For a simple count of relations you can use correlated subqueries in the SELECT clause:
select prodavac.id, (
select count(*)
from prod_artikl artikl
where artikl.prodavacid = prodavac.id
) as brojartikala, (
select count(*)
from prod_poruceno poruceno
where poruceno.prodavacid = prodavac.id
) as brojporudzbina
from prod_prodavac prodavac
order by prodavac.id asc
I have a mysql query to join four tables and I thought that it was just best to join tables but now that mysql data is getting bigger the query seems to cause the application to stop execution.
SELECT
`purchase_order`.`id`,
`purchase_order`.`po_date` AS po_date,
`purchase_order`.`po_number`,
`purchase_order`.`customer_id` AS customer_id ,
`customer`.`name` AS customer_name,
`purchase_order`.`status` AS po_status,
`purchase_order_items`.`product_id`,
`purchase_order_items`.`po_item_name`,
`product`.`weight` as product_weight,
`product`.`pending` as product_pending,
`product`.`company_owner` as company_owner,
`purchase_order_items`.`uom`,
`purchase_order_items`.`po_item_type`,
`purchase_order_items`.`order_sequence`,
`purchase_order_items`.`pending_balance`,
`purchase_order_items`.`quantity`,
`purchase_order_items`.`notes`,
`purchase_order_items`.`status` AS po_item_status,
`purchase_order_items`.`id` AS po_item_id
FROM `purchase_order`
INNER JOIN customer ON `customer`.`id` = `purchase_order`.`customer_id`
INNER JOIN purchase_order_items ON `purchase_order_items`.`po_id` = `purchase_order`.`id`
INNER JOIN product ON `purchase_order_items`.`product_id` = `product`.`id`
GROUP BY id ORDER BY `purchase_order`.`po_date` DESC LIMIT 0, 20
my problem really is the query that takes a lot of time to finish. Is there a way to speed this query or to change this query for faster retrieval of the data?
heres the EXPLAIN EXTENED as requested in the comments.
Thanks in advance, I really hope this is the right channel for me to ask. If not please let me know.
Will this give you the correct list of ids?
SELECT id
FROM purchase_order
ORDER BY`po_date` DESC
LIMIT 0, 20
If so, then start with that before launching into the JOIN. You can also (I think) get rid of the GROUP BY that is causing an "explode-implode" of rows.
SELECT ...
FROM ( SELECT id ... (as above) ...) AS ids
JOIN purchase_order po ON po.id = ids.id
JOIN ... (the other tables)
GROUP BY ... -- (this may be problematic, especially with the LIMIT)
ORDER BY po.po_date DESC -- yes, this needs repeating
-- no LIMIT
Something like this
SELECT
`purchase_order`.`id`,
`purchase_order`.`po_date` AS po_date,
`purchase_order`.`po_number`,
`purchase_order`.`customer_id` AS customer_id ,
`customer`.`name` AS customer_name,
`purchase_order`.`status` AS po_status,
`purchase_order_items`.`product_id`,
`purchase_order_items`.`po_item_name`,
`product`.`weight` as product_weight,
`product`.`pending` as product_pending,
`product`.`company_owner` as company_owner,
`purchase_order_items`.`uom`,
`purchase_order_items`.`po_item_type`,
`purchase_order_items`.`order_sequence`,
`purchase_order_items`.`pending_balance`,
`purchase_order_items`.`quantity`,
`purchase_order_items`.`notes`,
`purchase_order_items`.`status` AS po_item_status,
`purchase_order_items`.`id` AS po_item_id
FROM (SELECT id, po_date, po_number, customer_id, status
FROM purchase_order
ORDER BY `po_date` DESC
LIMIT 0, 5) as purchase_order
INNER JOIN customer ON `customer`.`id` = `purchase_order`.`customer_id`
INNER JOIN purchase_order_items
ON `purchase_order_items`.`po_id` = `purchase_order`.`id`
INNER JOIN product ON `purchase_order_items`.`product_id` = `product`.`id`
GROUP BY purchase_order.id DESC
LIMIT 0, 5
You need to be sure that purchase_order.po_date and all id column are indexed. You can check it with below query.
SHOW INDEX FROM yourtable;
Since you mentioned that data is getting bigger. I would suggest doing sharding and then you can parallelize multiple queries. Please refer to the following article
Parallel Query for MySQL with Shard-Query
First, I cleaned up readability a bit. You don't need tick marks around every table.column reference. Also, for short-hand, using aliases works well. Ex: "po" instead of "purchase_order", "poi" instead of "purchase_order_items". The only time I would use tick marks is around reserved words that might cause a problem.
Second, you don't have any aggregations (sum, min, max, count, avg, etc.) in your query so you should be able to strip the GROUP BY clause.
As for indexes, I would have to assume you have an index on your reference tables on their respective "id" key columns.
For your Purchase Order table, I would have an index on that based on the "po_date" in the first index field position in case you already had an index using it. Since your Order by is on that, let the engine jump directly to those dated records first and you have your descending order resolved.
SELECT
po.id,
po.po_date,
po.po_number,
po.customer_id,
c.`name` AS customer_name,
po.`status` AS po_status,
poi.product_id,
poi.po_item_name,
p.weight as product_weight,
p.pending as product_pending,
p.company_owner,
poi.uom,
poi.po_item_type,
poi.order_sequence,
poi.pending_balance,
poi.quantity,
poi.notes,
poi.`status` AS po_item_status,
poi.id AS po_item_id
FROM
purchase_order po
INNER JOIN customer c
ON po.customer_id = c.id
INNER JOIN purchase_order_items poi
ON po.id = poi.po_id
INNER JOIN product p
ON poi.product_id = p.id
ORDER BY
po.po_date DESC
LIMIT
0, 20
I have a performance issue with the query below on MYSQL. The below query has 5 tables involved. When I apply the order by and limit, the results are retrieved in 0.3 secs. But without the order by and limit, I was able to get the results in 0.01 secs. I am tired changing the query but that did not work. Could someone please help me with this query so I can get the results in desired time (<0.3 secs).
Below are the details.
m_todos = 286579 (records)
m_pat = 214858 (records)
users = 119 (records)
m_programs = 26 (records)
role = 4 (records)
SELECT *
FROM (
SELECT t.*,
mp.name as A_name,
u.first_name, u.last_name,
p.first, p.last, p.zone, p.language,p.handling,
r.name,
u2.first_name AS created_first_name,
u2.last_name AS created_last_name
FROM m_todos t
INNER JOIN role r ON t.role_id=r.id
INNER JOIN m_pat p ON t.patient_id = p.id
LEFT JOIN users u2 ON t.created_id=u2.id
LEFT JOIN m_programs mp ON t.prog_id=mp.id
LEFT JOIN users u ON t.user_id=u.id
WHERE t.role_id !='9'
AND t.completed = '0000-00-00 00:00:00'
) C
ORDER BY priority DESC, due ASC
LIMIT 0,10
Get rid of the outer SELECT; move the ORDER BY and LIMIT in.
Indexes:
t: (completed)
t: (priority, due)
I assume priority and due are in t?? Please be explicit in the query. It could make a huge difference.
If the following works, it should speed things up a lot: Start by finding the t.id without all the JOINs:
SELECT id
FROM m_todos
WHERE role_id !='9'
AND completed = '0000-00-00 00:00:00'
ORDER BY priority DESC, due DESC
LIMIT 10
That will benefit from this covering composite index:
INDEX(completed, role_id, priority, due, id)
Debug that. Then use it in the rest:
SELECT t.*, the-other-stuff
FROM ( that-query ) AS t1
JOIN m_todos AS t USING(id)
then-the-rest-of-the-JOINs
ORDER BY priority DESC, due ASC -- yes, again
If you don't need all of t.*, it may be beneficial to spell out the actual columns needed.
The reason for this to run much faster is that the 10 rows are found efficiently by looking only at the one table. The original code was shoveling around a lot more rows than 10 and they included all the columns of t, plus columns from the other tables.
My version does only 10 lookups for all the extra stuff.
I have the following MySQL. I want to pull the data in random order.
Could anyonet teach me how to do it please.
$Q = $this->db->query('SELECT P.*, C.Name AS CatName
FROM products AS P
LEFT JOIN categories C
ON C.id = P.category_id
WHERE C.Name = "Front bottom"
AND p.status = "active"
');
$Q = $this->db->query('SELECT P.*, C.Name AS CatName
FROM products AS P
LEFT JOIN categories C
ON C.id = P.category_id
WHERE C.Name = "Front bottom"
AND p.status = "active"
ORDER BY RAND()
');
you can use the RAND function of MySQL to do that, to be noted that it would perform really slowly on huge dataset (~ about 10k). MySQL would pickup a random number for each row of the table which could lead to problem if the table is huge.
A safer method would be to do a SELECT count(*) as n FROM table and to pickup a random number and do a query with LIMIT 1,n to pickup the nth row. That would work if you need only 1, or you don't care having the result in same order.
After if you really need a complete random set better to do it on server side in my opinion.
You can try
ORDER BY RAND()
The easiest way is using ORDER BY RAND(), but it's performance is miserable, especially for larger datasets (requires a random number for all matching rows).
Another way is randomly creating ids(Either in your code or using RAND() again: WHERE id in (RAND(), RAND(), RAND(), RAND()) should work, but no guarantee). This gets problematic as soon as some IDs don't exist.
Here is an interesting article on the topic.