I have a bulk query with a subquery. It works fine on the development server, but on the live server it takes far too long to produce output. I think that is because the live server holds much more data. Can anyone help me index the tables in MySQL so that the query executes faster?
Here is my query:
SELECT prd.fldemployeeno AS Empno,
(SELECT fldemployeename FROM tblprofile prf WHERE prf.fldemployeeno = prd.fldemployeeno LIMIT 0,1) AS Empname,
'01' AS `Week`,
COUNT(DISTINCT isAud.fldid) AuditedFiles,
COUNT(qua.seqid) ErrorCount,
COUNT(DISTINCT qua.fldid) OrdersWithError
FROM tbldownloadITL dwn
INNER JOIN tblproductionITL prd
ON dwn.fldid = prd.fldglobalid
INNER JOIN (SELECT p.fldemployeeno,fldglobalid,p.fldstarttime,COALESCE(q.fldstarttime,p.fldstarttime) `AuditDate`
FROM tblproductionitl p
LEFT JOIN tblqualityaudit q
ON p.fldemployeeno=q.fldemployeeno
AND p.fldstarttime=q.fldprodstarttime
AND p.fldglobalid=q.fldid
WHERE p.fldprojectgroup='PROJGROUP') temp
ON prd.fldglobalid=temp.fldglobalid
AND prd.fldemployeeno=temp.fldemployeeno
AND prd.fldstarttime=temp.fldstarttime
INNER JOIN tblisauditedITL isAud
USING (fldid)
LEFT JOIN tblqualityaudit qua
ON qua.fldid = dwn.fldid
AND qua.fldbusunit = dwn.fldbusunit
AND qua.fldprojectGroup = dwn.fldprojectGroup
AND qua.fldemployeeno = prd.fldemployeeno
AND qua.fldprodstarttime = prd.fldstarttime
AND qua.flderrorstatus != 'NOT ERROR'
LEFT JOIN tblerrorcategory
USING (flderrorcategoryid)
LEFT JOIN tblerrortypes
USING (flderrortypeid)
WHERE dwn.fldbusunit = 'BUSUNIT'
AND dwn.fldprojectGroup = 'PROJGROUP'
AND temp.AuditDate BETWEEN '2011-07-29 00:00:00' AND '2011-07-29 23:59:59'
GROUP BY prd.fldemployeeno
ORDER BY Empname
Here is also the description of the query:
I would suggest installing Sphinx on your server if you have the access. That way you have an indexed resource at your fingertips for extremely fast searching. On top of that, you can run what is called a 'delta' index so that the search index picks up changes to your MySQL database in near real time. It is highly customizable. Hopefully this will help you out.
http://sphinxsearch.com/
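For the indexing part of the question itself, one possible starting point is to index the columns that the query joins and filters on. This is only a sketch: the index names are made up, it assumes none of these indexes (or an equivalent primary key) exist yet, and the best column order depends on the data, so each candidate is worth checking with EXPLAIN before keeping it.

```sql
-- Hypothetical index names; columns are taken from the JOIN and WHERE clauses of the query above.

-- Filter on fldbusunit / fldprojectGroup, then join on fldid.
ALTER TABLE tbldownloadITL
  ADD INDEX idx_dwn_unit_group_id (fldbusunit, fldprojectGroup, fldid);

-- Join columns used against dwn and the derived table temp, plus the project group filter.
ALTER TABLE tblproductionITL
  ADD INDEX idx_prd_global_emp_start (fldglobalid, fldemployeeno, fldstarttime),
  ADD INDEX idx_prd_projectgroup (fldprojectgroup);

-- Both joins to tblqualityaudit match on fldid, fldemployeeno and fldprodstarttime.
ALTER TABLE tblqualityaudit
  ADD INDEX idx_qua_id_emp_start (fldid, fldemployeeno, fldprodstarttime);

-- Correlated subquery that looks up the employee name.
ALTER TABLE tblprofile
  ADD INDEX idx_prf_empno (fldemployeeno);

-- Joined with USING (fldid).
ALTER TABLE tblisauditedITL
  ADD INDEX idx_isaud_id (fldid);
```

Prefixing the original SELECT with EXPLAIN before and after adding the indexes shows whether MySQL actually picks them up.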
I have the below query, which I appreciate probably isn't well written, but on my local PC with Xampp and MariaDB it executes in 0.1719 seconds, which is about the speed I would hope for.
However, on my development server with Plesk and MariaDB the same query with the same data takes over 12 seconds. Obviously that would be no use.
The query could probably be modified to make it better, but can somebody explain the performance difference? The server is a VPS and has no shortage of resources. It isn't live, so usage is almost nil, yet this query still takes 12+ seconds.
The query:
SELECT m.id AS match_id, e.event AS event1
FROM matches m
JOIN competitions co ON co.id = m.competition
JOIN clubs h ON h.id = m.hometeam
JOIN clubs a ON a.id = m.awayteam
LEFT JOIN match_events e ON e.match = m.id
AND e.player = '7138'
WHERE (m.hometeam = '1'
OR m.awayteam = '1'
)
AND m.season = '121'
Are you sure you need AND e.player = '7138' in the ON clause of a LEFT JOIN and not in the WHERE clause?
Better indexing
I recommend these composite, covering indexes:
m: (season, awayteam, hometeam, competition, id)
e: (player, match, event)
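Written out as DDL, the two recommendations could look like the following. The index names are hypothetical, and the sketch assumes `m` and `e` are the `matches` and `match_events` tables from the question.

```sql
-- Hypothetical index names.
ALTER TABLE matches
  ADD INDEX idx_matches_covering (season, awayteam, hometeam, competition, id);

-- `match` is backticked because MATCH is a reserved word in MySQL and MariaDB.
ALTER TABLE match_events
  ADD INDEX idx_events_player_match_event (player, `match`, `event`);
```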
Avoiding OR
OR optimizes poorly. A common trick is to turn it into a UNION. Something like this may work for your query:
SELECT ...
FROM matches JOIN ...
WHERE m.season = 121
AND m.hometeam = 1
UNION ALL
SELECT ...
FROM matches JOIN ...
WHERE m.season = 121
AND m.awayteam = 1
And have these two indexes:
INDEX(season, hometeam) -- will be used by one part of the UNION
INDEX(season, awayteam) -- will be used by the other
I chose UNION ALL because it is faster than UNION DISTINCT. But if you get unwanted dups, change it.
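Spelled out against the query from the question, the UNION approach might look like the sketch below. The index names are hypothetical, and the unquoted literals assume that `season`, `hometeam`, `awayteam` and `player` are numeric columns.

```sql
-- Hypothetical index names; one index per branch of the UNION.
ALTER TABLE matches
  ADD INDEX idx_season_home (season, hometeam),
  ADD INDEX idx_season_away (season, awayteam);

-- Branch 1: matches where club 1 is the home team.
SELECT m.id AS match_id, e.`event` AS event1
FROM matches m
JOIN competitions co ON co.id = m.competition
JOIN clubs h ON h.id = m.hometeam
JOIN clubs a ON a.id = m.awayteam
LEFT JOIN match_events e ON e.`match` = m.id
                        AND e.player = 7138
WHERE m.season = 121
  AND m.hometeam = 1
UNION ALL
-- Branch 2: matches where club 1 is the away team.
SELECT m.id AS match_id, e.`event` AS event1
FROM matches m
JOIN competitions co ON co.id = m.competition
JOIN clubs h ON h.id = m.hometeam
JOIN clubs a ON a.id = m.awayteam
LEFT JOIN match_events e ON e.`match` = m.id
                        AND e.player = 7138
WHERE m.season = 121
  AND m.awayteam = 1;
```

As long as a club never plays itself, no match can satisfy both branches, so UNION ALL should not introduce duplicate matches.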
I tried running a query with an inner join in Sequel Pro to get the most recent records/invoices using this:
SELECT tt.Hotel_Property, tt.Preferred_Hotel_Status
FROM hotel_detail tt
INNER JOIN
(SELECT Hotel_Property, MAX(STR_TO_DATE (`Invoice_Date`, '%m/%d/%Y')) AS MaxDateTime
FROM hotel_detail
GROUP BY Hotel_Property) groupedtt
ON tt.Hotel_Property = groupedtt.Hotel_Property
AND tt.Invoice_Date = groupedtt.MaxDateTime
But it's running the query for a long time and I'm not sure if it'll actually execute (cancelled it after waiting 14 mins). I know it's a lot of data to work through but wondered if anyone had suggestions to make it run faster?
*Ideally I want one record for each hotel property, giving the most recent invoice date and the status associated with that most recent invoice.
Thanks!
As far as I know, Sequel Pro is a client for MySQL, so you might not be able to use analytic (window) functions; they are only available from MySQL 8.0 onward. But try the following:
SELECT
Hotel_Property
, Preferred_Hotel_Status
, MAX(STR_TO_DATE (`Invoice_Date`, '%m/%d/%Y')) OVER(PARTITION BY hotel_property) AS MaxDateTime
FROM hotel_detail
If this doesn't work, then I'd suggest running the query in 'chunks' based on the date, for instance one day at a time by adding a WHERE clause. For example:
SELECT tt.Hotel_Property, tt.Preferred_Hotel_Status
FROM hotel_detail tt
INNER JOIN
(SELECT Hotel_Property, MAX(STR_TO_DATE (`Invoice_Date`, '%m/%d/%Y')) AS MaxDateTime
FROM hotel_detail
WHERE DATE = "the_date_you_want_to_run"
GROUP BY Hotel_Property) groupedtt
ON tt.Hotel_Property = groupedtt.Hotel_Property
AND tt.Invoice_Date = groupedtt.MaxDateTime
WHERE DATE = "the_date_you_want_to_run"
Then you can either look at the results for different days separately, or simply INSERT them into a new table where you can perform more analysis.
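If window functions are available (MySQL 8.0 or later), the window-function idea above can also be pushed one step further so that it returns exactly one row per property, which is what the question asks for. The sketch below is an assumption-laden variant, not the answer's own query: the aliases `invoice_dt`, `rn` and `ranked` are made up, and it assumes every `Invoice_Date` value parses with the '%m/%d/%Y' format.

```sql
-- Rank each property's invoices from newest to oldest, then keep only the newest.
SELECT Hotel_Property, Preferred_Hotel_Status, invoice_dt
FROM (
    SELECT Hotel_Property,
           Preferred_Hotel_Status,
           STR_TO_DATE(`Invoice_Date`, '%m/%d/%Y') AS invoice_dt,
           ROW_NUMBER() OVER (
               PARTITION BY Hotel_Property
               ORDER BY STR_TO_DATE(`Invoice_Date`, '%m/%d/%Y') DESC
           ) AS rn
    FROM hotel_detail
) AS ranked
WHERE rn = 1;
```

When two invoices for the same property share the latest date, ROW_NUMBER picks one of them arbitrarily; RANK would return both.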
Try using a correlated subquery:
select hd.*
from hotel_detail hd
where str_to_date(hd.invoice_date, '%m/%d/%Y') =
(select max(str_to_date(hd2.invoice_date, '%m/%d/%Y'))
from hotel_detail hd2
where hd2.hotel_property = hd.hotel_property
);
This can take advantage of an index on hotel_detail(hotel_property, invoice_date). The index would be more effective if you stored the date properly using the native SQL format of date or datetime.
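A rough sketch of what storing the date natively could look like follows. The column name `invoice_dt` and the index name are hypothetical, and the UPDATE assumes every existing `invoice_date` value parses with the '%m/%d/%Y' format (unparseable values may produce NULLs or errors, depending on the SQL mode).

```sql
-- Add a native DATE column next to the existing string column.
ALTER TABLE hotel_detail ADD COLUMN invoice_dt DATE NULL;

-- Populate it once from the string values.
UPDATE hotel_detail
SET invoice_dt = STR_TO_DATE(invoice_date, '%m/%d/%Y');

-- Index the lookup columns so MAX() per property can be answered from the index.
ALTER TABLE hotel_detail
  ADD INDEX idx_property_invoice_dt (hotel_property, invoice_dt);

-- Same correlated subquery, now comparing native dates with no per-row conversion.
SELECT hd.*
FROM hotel_detail hd
WHERE hd.invoice_dt =
      (SELECT MAX(hd2.invoice_dt)
       FROM hotel_detail hd2
       WHERE hd2.hotel_property = hd.hotel_property);
```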
The query below is very slow (it takes around 1 second), even though it only searches approximately 2500 records (plus the inner joined tables).
If I remove the ORDER BY, the query runs in much less time (0.05 seconds or less).
It also runs fast if I remove the nested select below "# used to select where no ProfilePhoto specified", but I need both of these included.
I have indexes (or a primary key) on: tPhoto_PhotoID, PhotoID, p.Enabled, CustomerID, tCustomer_CustomerID, ProfilePhoto (bool), u.UserName, e.PrivateEmail, m.tUser_UserID, Enabled, Active, m.tMemberStatuses_MemberStatusID, e.tCustomerMembership_MembershipID, e.DateCreated
(Do I have too many indexes? My understanding is that I should add them on any column I use in a WHERE or ON clause.)
The Query :
SELECT e.CustomerID,
e.CustomerName,
e.Location,
SUBSTRING_INDEX(e.CustomerProfile,' ', 25) AS Description,
IFNULL(p.PhotoURL, PhotoTable.PhotoURL) AS PhotoURL
FROM tCustomer e
LEFT JOIN (tCustomerPhoto ep INNER JOIN tPhoto p ON (ep.tPhoto_PhotoID = p.PhotoID AND p.Enabled=1))
ON e.CustomerID = ep.tCustomer_CustomerID AND ep.ProfilePhoto = 1
# used to select where no ProfilePhoto specified
LEFT JOIN ((SELECT pp.PhotoURL, epp.tCustomer_CustomerID
FROM tPhoto pp
LEFT JOIN tCustomerPhoto epp ON epp.tPhoto_PhotoID = pp.PhotoID
GROUP BY epp.tCustomer_CustomerID) AS PhotoTable) ON e.CustomerID = PhotoTable.tCustomer_CustomerID
INNER JOIN tUser u ON u.UserName = e.PrivateEmail
INNER JOIN tmembers m ON m.tUser_UserID = u.UserID
WHERE e.Enabled=1
AND e.Active=1
AND m.tMemberStatuses_MemberStatusID = 2
AND e.tCustomerMembership_MembershipID != 6
ORDER BY e.DateCreated DESC
LIMIT 12
I have similar queries, but they run much faster.
Any opinions would be gratefully received.
Until we get more clarity on how this query differs from the similar ones that run fast, try running EXPLAIN {YourSelectQuery} in the MySQL client and use its output to see where the performance can be improved.
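As a small illustration of the EXPLAIN suggestion, the statement below checks only the filter and sort on tCustomer from the question. It is a cut-down version of the original query, used here purely as an example.

```sql
-- Check how the filter and the ORDER BY on tCustomer are resolved.
EXPLAIN
SELECT e.CustomerID
FROM tCustomer e
WHERE e.Enabled = 1
  AND e.Active = 1
  AND e.tCustomerMembership_MembershipID != 6
ORDER BY e.DateCreated DESC
LIMIT 12;
```

In the output, `ALL` in the type column means a full table scan, and "Using temporary" or "Using filesort" in the Extra column means the grouping or sorting is done without help from an index. Running EXPLAIN on the full query shows the same information for every joined table, including the derived table.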
I am trying to optimize MySQL to decrease my server load. I have a complex query that will be run about 1k times per minute on a quad-core server with 8 GB of RAM, and my server is going down.
I have tried many ways to rewrite this query :
SELECT *
FROM (
SELECT a.id,
a.url
FROM surf a
LEFT JOIN users b
ON b.id = a.user
LEFT JOIN surfed c
ON c.user = 'asdf' AND c.site = a.id
WHERE a.active = '0'
AND (b.coins >= a.cpc AND a.cpc >= '2')
AND (c.site IS NULL AND a.user !='asdf')
ORDER BY a.cpc DESC, b.premium DESC
LIMIT 100) AS records
ORDER BY RAND()
LIMIT 1
But it didn't work. Can you help me rewrite the above query so that it does not waste resources?
Also, the tables used by this query don't have any indexes :(. Guidance on creating indexes for it would be very helpful.
The problem is most likely the inner sort.
You should have indexes on surfed(user, site), surf(active, cpc), and users(id, coins).
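You can also perhaps make a minor improvement by switching the join on users from an outer join to an inner join. The condition b.coins >= a.cpc in the WHERE clause already discards rows that have no matching user, so this won't affect the results. The join on surfed has to stay a LEFT JOIN, because the c.site IS NULL test relies on it to find sites the user has not surfed yet.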
But I don't think these changes will really help. The problem is the sort of the result set in the inner query. The outer sort by rand() is a minor issue, because you are only doing that on 100 rows.
If you are running this 1,000/minute, you will need to rethink your data structures and develop an approach that has the data you need more readily available.
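For reference, the indexing and join suggestions could be sketched as follows. The index names are hypothetical, the column lists are inferred only from the query in the question, and the anti-join is written with NOT EXISTS, which is an alternative spelling of the LEFT JOIN ... IS NULL test rather than something the answer prescribes.

```sql
-- Hypothetical index names; `user` is backticked only as a precaution.
ALTER TABLE surfed ADD INDEX idx_surfed_user_site (`user`, site);
ALTER TABLE surf   ADD INDEX idx_surf_active_cpc (active, cpc);

-- Inner query only: candidate sites the user 'asdf' has not surfed yet.
SELECT a.id, a.url
FROM surf a
JOIN users b ON b.id = a.`user`
WHERE a.active = '0'
  AND a.cpc >= '2'
  AND b.coins >= a.cpc
  AND a.`user` != 'asdf'
  AND NOT EXISTS (SELECT 1
                  FROM surfed c
                  WHERE c.`user` = 'asdf'
                    AND c.site = a.id)
ORDER BY a.cpc DESC, b.premium DESC
LIMIT 100;
```

The sort on b.premium still cannot come from an index on surf, so this mainly narrows the set of rows that has to be sorted.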
I'm trying to generate a report using CodeIgniter and Datatables.net.
I'm now trying to get the number of closed jobs (it's a human resources system). I used to query all jobs, loop over them with a foreach in PHP, and do the calculations there.
Because I want to use all the features of Datatables (sorting specifically), I'm trying to do all the calculations in MySQL.
The problem is that the second subquery is very, very slow.
SELECT
jobs.jobs_id, clients.nome_fantasia, concat_ws(' ', user_profiles.first_name, user_profiles.last_name) as fullname,
jobs.titulo_vaga, jobs.qtd_vagas, company.name as nome_company, jobs_status.name as status_name, DATEDIFF(NOW(), jobs.data_abertura) as date_idade,
(select count(job_cv.jobs_id) from job_cv where job_cv.jobs_id = jobs.jobs_id) as qtd_int,
(select count(distinct job_cv.user_id) from job_cv_history join job_cv on job_cv.job_cv_id = job_cv_history.job_cv_id where job_cv_history.status = '11' and job_cv.jobs_id = jobs.jobs_id ) as fechadas
FROM (jobs)
JOIN clients ON clients.clients_id=jobs.clients_id
JOIN user_profiles ON jobs.consultor_id=user_profiles.user_id
JOIN jobs_status ON jobs.status=jobs_status.jobs_status_id
JOIN company ON jobs.company_id=company.company_id
LIMIT 50
Can someone help me? I can provide more information if it's needed.
UPDATE
The idea of using a JOIN instead of a SELECT subquery works for the first subquery but not for the second one. Is there a way to pass a 'variable', such as the current jobs_id, into the subquery?
UPDATE AGAIN
This query works fine by itself, but inside the subquery it takes about a minute and returns wrong values.
SELECT job_cv.jobs_id,count(distinct job_cv.user_id) AS fechadas
FROM job_cv_history
JOIN job_cv
ON job_cv.job_cv_id = job_cv_history.job_cv_id
WHERE job_cv_history.status = '11'
GROUP BY job_cv.jobs_id
It is not the subquery itself that is slow. The problem is that you're executing these subqueries once for each row returned from the outer query. Move them into joins instead, and you should see an increase in performance.
SELECT
jobs.jobs_id, clients.nome_fantasia, concat_ws(' ', user_profiles.first_name, user_profiles.last_name) as fullname,
jobs.titulo_vaga, jobs.qtd_vagas, company.name as nome_company, jobs_status.name as status_name, DATEDIFF(NOW(), jobs.data_abertura) as date_idade,
-- COALESCE keeps the 0 that the original subqueries returned for jobs without rows
COALESCE(qtd.qtd_int, 0) AS qtd_int,
COALESCE(fechadas.fechadas, 0) AS fechadas
FROM (jobs)
JOIN clients ON clients.clients_id=jobs.clients_id
JOIN user_profiles ON jobs.consultor_id=user_profiles.user_id
JOIN jobs_status ON jobs.status=jobs_status.jobs_status_id
JOIN company ON jobs.company_id=company.company_id
-- LEFT JOIN so that jobs with no CVs are not dropped from the report
LEFT JOIN (
SELECT jobs_id, count(jobs_id) AS qtd_int FROM job_cv GROUP BY jobs_id
) AS qtd ON qtd.jobs_id = jobs.jobs_id
-- grouped by jobs_id (not user_id) so each job gets its own count of closed candidates
LEFT JOIN (
SELECT job_cv.jobs_id, count(distinct job_cv.user_id) AS fechadas
FROM job_cv_history
JOIN job_cv
ON job_cv.job_cv_id = job_cv_history.job_cv_id
WHERE job_cv_history.status = '11'
GROUP BY job_cv.jobs_id
) AS fechadas ON fechadas.jobs_id = jobs.jobs_id
LIMIT 50
You may try to create these indexes:
ALTER TABLE `job_cv` ADD INDEX `job_cv_cindex` (`job_cv_id` ASC, `jobs_id` ASC, `user_id` ASC);
ALTER TABLE `job_cv_history` ADD INDEX `job_cv_history_cindex` (`job_cv_id` ASC, `status` ASC);
Use joins instead of subqueries. In MySQL this can improve performance significantly.
In your case, try a LEFT JOIN and see whether performance improves.