I am fetching data from a MySQL view and a main table. I have created indexes and primary keys on the main table, but I cannot create indexes or primary keys on the view.
When I execute the query below, it takes around 10 seconds. I want to reduce its run time.
SELECT DISTINCT
`Emp_No`, `Name`
FROM
`ResLookup`
WHERE
`IsActive` = 1
AND `Department` IN ('SDG' , 'HDD', 'ENG', 'PDN')
AND (`Emp_No` IN (SELECT DISTINCT
ProjList.PM_No
FROM
ProjList
WHERE
ProjList.PM_No != 1749 UNION SELECT DISTINCT
ProjList.PL_No
FROM
ProjList
WHERE
ProjList.PL_No != 1749)
OR Emp_No IN (SELECT
MEMBER_ID
FROM
s_group_details
WHERE
GROUP_ID = 'GRP109'
AND MEMBERSHIP_LEVEL = 30));
Only the s_group_details table has indexes and a primary key. All the other tables are views.
Using Explain Query I have the below output
I don't know your exact requirements, but check whether the query below helps:
SELECT DISTINCT
`Emp_No`, `Name`
FROM
`ResLookup` inner join (SELECT DISTINCT
ProjList.PM_No ,ProjList.PL_No
FROM
ProjList
WHERE
ProjList.PM_No != 1749
or
ProjList.PL_No != 1749) a
on (ResLookup.Emp_No = a.PM_No and a.PM_No != 1749)
or (ResLookup.Emp_No = a.PL_No and a.PL_No != 1749)
OR Emp_No IN (SELECT
MEMBER_ID
FROM
s_group_details
WHERE
GROUP_ID = 'GRP109'
AND MEMBERSHIP_LEVEL = 30)
WHERE
`IsActive` = 1
AND `Department` IN ('SDG' , 'HDD', 'ENG', 'PDN');
It may be better to turn things somewhat inside-out:
SELECT r.`Emp_No`, r.`Name`
FROM
( -- the whole UNION must be parenthesized to form the derived table u
( SELECT PM_No AS Emp_No FROM ProjList WHERE PM_No != 1749 )
UNION DISTINCT
( SELECT PL_No FROM ProjList WHERE PL_No != 1749 )
UNION DISTINCT
( SELECT MEMBER_ID
FROM s_group_details AS d
WHERE d.GROUP_ID = 'GRP109'
AND d.MEMBERSHIP_LEVEL = 30
)
) AS u
JOIN `ResLookup` AS r ON u.Emp_No = r.Emp_No
WHERE r.`IsActive` = 1
AND r.`Department` IN ('SDG' , 'HDD', 'ENG', 'PDN');
Indexes needed:
ResLookup: (Emp_No, IsActive, Department)
s_group_details: (GROUP_ID, MEMBERSHIP_LEVEL, MEMBER_ID)
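To check that the inside-out form returns the same rows as the original IN/OR form, here is a small sketch in Python. It uses the standard sqlite3 module as a stand-in for MySQL, the sample rows are invented for illustration, and ResLookup and ProjList are created as plain tables rather than views.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical sample data; the column names follow the question.
conn.executescript("""
CREATE TABLE ResLookup (Emp_No INTEGER, Name TEXT, IsActive INTEGER, Department TEXT);
CREATE TABLE ProjList (PM_No INTEGER, PL_No INTEGER);
CREATE TABLE s_group_details (GROUP_ID TEXT, MEMBERSHIP_LEVEL INTEGER, MEMBER_ID INTEGER);
INSERT INTO ResLookup VALUES
  (1, 'Ann', 1, 'SDG'), (2, 'Bob', 1, 'HDD'), (3, 'Cy', 0, 'ENG'),
  (4, 'Dee', 1, 'ENG'), (5, 'Eve', 1, 'OPS'), (1749, 'Root', 1, 'SDG');
INSERT INTO ProjList VALUES (1, 2), (1749, 3), (5, 1749);
INSERT INTO s_group_details VALUES
  ('GRP109', 30, 4), ('GRP109', 10, 2), ('GRP200', 30, 1749);
""")

# Original shape: filter ResLookup with IN (...) OR IN (...).
original = """
SELECT DISTINCT Emp_No, Name FROM ResLookup
WHERE IsActive = 1 AND Department IN ('SDG', 'HDD', 'ENG', 'PDN')
  AND (Emp_No IN (SELECT PM_No FROM ProjList WHERE PM_No != 1749
                  UNION SELECT PL_No FROM ProjList WHERE PL_No != 1749)
       OR Emp_No IN (SELECT MEMBER_ID FROM s_group_details
                     WHERE GROUP_ID = 'GRP109' AND MEMBERSHIP_LEVEL = 30))
"""

# Inside-out shape: build the candidate ids first, then join to ResLookup.
inside_out = """
SELECT r.Emp_No, r.Name
FROM (SELECT PM_No AS Emp_No FROM ProjList WHERE PM_No != 1749
      UNION SELECT PL_No FROM ProjList WHERE PL_No != 1749
      UNION SELECT MEMBER_ID FROM s_group_details
            WHERE GROUP_ID = 'GRP109' AND MEMBERSHIP_LEVEL = 30) AS u
JOIN ResLookup AS r ON r.Emp_No = u.Emp_No
WHERE r.IsActive = 1 AND r.Department IN ('SDG', 'HDD', 'ENG', 'PDN')
"""

rows_original = sorted(conn.execute(original).fetchall())
rows_inside_out = sorted(conn.execute(inside_out).fetchall())
print(rows_original == rows_inside_out, rows_inside_out)
```

The two forms agree here because Emp_No is unique in the sample ResLookup; if the view can return duplicate Emp_No rows, the joined form would need a DISTINCT as well.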
I have two tables, contacts and calllist. contacts has multiple columns containing phone numbers. calllist has only one column, from_number, containing phone numbers. I'm trying to get all phone numbers from the column from_number which do not match any phone number in the table contacts.
Here is my working but probably very inefficient and slow SQL query:
SELECT from_number AS phone_number, COUNT(from_number) AS number_of_calls
FROM calllist
WHERE from_number NOT IN (
SELECT businessPhone1
FROM contacts
WHERE businessPhone1 IS NOT NULL
)
AND from_number NOT IN (
SELECT businessPhone2
FROM contacts
WHERE businessPhone2 IS NOT NULL
)
AND from_number NOT IN (
SELECT homePhone1
FROM contacts
WHERE homePhone1 IS NOT NULL
)
AND from_number NOT IN (
SELECT homePhone2
FROM contacts
WHERE homePhone2 IS NOT NULL
)
AND from_number NOT IN (
SELECT mobilePhone
FROM contacts
WHERE mobilePhone IS NOT NULL
)
AND (received_at BETWEEN '$startDate' AND DATE_ADD('$endDate', INTERVAL 1 DAY))
GROUP BY phone_number
ORDER BY number_of_calls DESC
LIMIT 10
How do I rewrite this SQL query to be faster? Any help would be much appreciated.
Try this:
SELECT from_number AS phone_number, COUNT(from_number) AS number_of_calls
FROM calllist
WHERE from_number NOT IN (
SELECT businessPhone1
FROM contacts
WHERE businessPhone1 IS NOT NULL
UNION
SELECT businessPhone2
FROM contacts
WHERE businessPhone2 IS NOT NULL
UNION
SELECT homePhone1
FROM contacts
WHERE homePhone1 IS NOT NULL
UNION
SELECT homePhone2
FROM contacts
WHERE homePhone2 IS NOT NULL
UNION
SELECT mobilePhone
FROM contacts
WHERE mobilePhone IS NOT NULL
)
AND (received_at BETWEEN '$startDate' AND DATE_ADD('$endDate', INTERVAL 1 DAY))
GROUP BY phone_number
ORDER BY number_of_calls DESC
LIMIT 10
I don't like the schema design. You have multiple columns holding 'identical' data, namely phone numbers. What if you later need a 6th phone number?
Instead, have a separate table of phone numbers, with a linkage (id) to JOIN back to contacts. That gets rid of all the slow NOT IN ( SELECT... ), avoids a messy UNION ALL, etc.
If you desire, the new table could have a 3rd column that says which type of phone it is.
ENUM('unknown', 'company', 'home', 'mobile')
The simplified query goes something like
SELECT cl.from_number AS phone_number,
COUNT(*) AS number_of_calls
FROM calllist AS cl
LEFT JOIN phonenums AS pn ON pn.number = cl.from_number
WHERE cl.received_at >= '$startDate'
AND cl.received_at < '$endDate' + INTERVAL 1 DAY
AND pn.number IS NULL -- not found in phonenums
GROUP BY phone_number
ORDER BY number_of_calls DESC
LIMIT 10
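As a rough sketch of how the normalized design behaves, the following Python snippet runs the LEFT JOIN ... IS NULL anti-join against a small in-memory SQLite database. The phonenums layout (contact_id, number, kind), the sample rows, and the helper name unknown_callers are all assumptions made for illustration, and SQLite's date() stands in for MySQL's INTERVAL arithmetic.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical schema: one row per phone number instead of five columns.
conn.executescript("""
CREATE TABLE phonenums (contact_id INTEGER, number TEXT, kind TEXT);
CREATE INDEX phonenums_number ON phonenums (number);
CREATE TABLE calllist (from_number TEXT, received_at TEXT);
INSERT INTO phonenums VALUES
  (1, '555-0100', 'home'), (1, '555-0101', 'mobile'), (2, '555-0102', 'company');
INSERT INTO calllist VALUES
  ('555-0100', '2024-01-02 10:00:00'),
  ('555-0199', '2024-01-02 11:00:00'),
  ('555-0199', '2024-01-03 09:00:00'),
  ('555-0188', '2024-01-03 12:00:00'),
  ('555-0188', '2024-02-01 12:00:00');
""")


def unknown_callers(conn, start_date, end_date, limit=10):
    """Count calls per from_number that matches no row in phonenums."""
    sql = """
    SELECT cl.from_number AS phone_number, COUNT(*) AS number_of_calls
    FROM calllist AS cl
    LEFT JOIN phonenums AS pn ON pn.number = cl.from_number
    WHERE cl.received_at >= ?
      AND cl.received_at < date(?, '+1 day')  -- half-open range, end day included
      AND pn.number IS NULL                   -- keep only unmatched callers
    GROUP BY cl.from_number
    ORDER BY number_of_calls DESC
    LIMIT ?
    """
    return conn.execute(sql, (start_date, end_date, limit)).fetchall()


result = unknown_callers(conn, "2024-01-01", "2024-01-31")
print(result)
```

The dates are passed as bound parameters rather than interpolated into the SQL string the way '$startDate' is; that is a design choice of this sketch, not something the original answer prescribes.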
I have written a query, and it works. But now that every table has about 100K rows, it returns too slowly. Can you please suggest how I can optimize it?
select *
from tbl_xray_information X
WHERE locationCode = (SELECT t.id
from tbl_location t
where CODE = '202')
AND ( communicate_with_pt is NULL || communicate_with_pt='')
AND x.patientID NOT IN (SELECT patientID
FROM tbl_gxp_information
WHERE center_id = '202')
order by insertedON desc LIMIT 2000
Please note here 'patientID' is varchar.
This may run faster:
select *
from tbl_xray_information AS x
WHERE locationCode =
( SELECT t.id
from tbl_location t
where CODE = '202'
)
AND ( x.communicate_with_pt is NULL
OR x.communicate_with_pt = '' )
AND NOT EXISTS ( SELECT 1 FROM tbl_gxp_information
WHERE x.patientID = patientID
AND center_id = '202' )
order by insertedON desc
LIMIT 2000
These indexes may help:
tbl_location: INDEX(CODE)
tbl_gxp_information: INDEX(center_id, patientID) -- (either order)
Since OR is poorly optimized, it may be better to pick either NULL or empty-string for communicate_with_pt (to avoid testing for both).
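Besides speed, NOT IN and NOT EXISTS can also return different rows when the subquery column contains NULL. The sketch below shows this with Python's sqlite3 module and invented sample rows; whether patientID in tbl_gxp_information can actually be NULL is not stated in the question, so treat that as an assumption.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical rows; one tbl_gxp_information row has a NULL patientID.
conn.executescript("""
CREATE TABLE tbl_xray_information (patientID TEXT);
CREATE TABLE tbl_gxp_information (patientID TEXT, center_id TEXT);
INSERT INTO tbl_xray_information VALUES ('P1'), ('P2'), ('P3');
INSERT INTO tbl_gxp_information VALUES ('P1', '202'), (NULL, '202'), ('P3', '999');
""")

# NOT IN: comparing against a list that contains NULL yields NULL (unknown),
# so no row can pass the filter.
not_in = conn.execute("""
SELECT x.patientID
FROM tbl_xray_information AS x
WHERE x.patientID NOT IN (SELECT patientID
                          FROM tbl_gxp_information
                          WHERE center_id = '202')
ORDER BY x.patientID
""").fetchall()

# NOT EXISTS: only asks whether a matching row exists, so NULLs are harmless.
not_exists = conn.execute("""
SELECT x.patientID
FROM tbl_xray_information AS x
WHERE NOT EXISTS (SELECT 1
                  FROM tbl_gxp_information AS g
                  WHERE g.patientID = x.patientID
                    AND g.center_id = '202')
ORDER BY x.patientID
""").fetchall()

print(not_in, not_exists)
```

If the column is declared NOT NULL, the two forms select the same rows and only the execution plan differs.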
I am currently a bit stuck on a query that needs optimizing. I am looking for a way to optimize the following query, if that is possible (I have no idea what to do here at the moment):
SELECT count(distinct(pj.id)) as qty
FROM `project_jobs` `pj`
JOIN `projects` `p` ON pj.project_id = p.id AND p.status NOT IN ("CANCELED","DELETED","ARCHIVED")
WHERE
(
(
(pj.job_type_service_id IN (SELECT id FROM job_type_services WHERE job_type_id IN (4,2,3)))
AND
(pj.new_status_id IN ("wip","completed","delivered"))
)
AND (pj.status<>'DELETED' AND pj.status<>'CANCELED')
)
AND
(pj.due_date >= '2010-04-01 00:00:00' AND pj.due_date <= '2018-05-09 23:59:59')
and exists
(SELECT * FROM project_job_parents pjp
WHERE pjp.project_job_id IN
(SELECT id FROM project_jobs WHERE job_type_id IN (1,24,7,8,32,34,33))
and
pjp.parent_id = pj.id
)
EXPLAIN gives the following info:
Is there anything that can be done here to optimize and speed up the query?
EXISTS subqueries often perform better than IN subqueries, particularly in older MySQL versions, so I transformed the query accordingly (see below).
You didn't provide the table structures, so it is hard to tell which indexes exist and which columns they contain. So, I'll just specify which indexes you should have.
Indexes to add:
ALTER TABLE `job_type_services` ADD INDEX `job_type_services_idx_id_id` (`job_type_id`,`id`);
ALTER TABLE `project_job_parents` ADD INDEX `project_job_parents_idx_parent_id` (`parent_id`);
ALTER TABLE `project_job_parents` ADD INDEX `project_job_parents_idx_project_job_id` (`project_job_id`);
ALTER TABLE `project_jobs` ADD INDEX `project_jobs_idx_id_id_status_id` (`new_status_id`,`project_id`,`status`,`id`);
ALTER TABLE `project_jobs` ADD INDEX `project_jobs_idx_job_type_service_id` (`job_type_service_id`);
ALTER TABLE `project_jobs` ADD INDEX `project_jobs_idx_id` (`id`); -- redundant if `id` is already the primary key
ALTER TABLE `project_jobs` ADD INDEX `project_jobs_idx_id_id` (`job_type_id`,`id`);
ALTER TABLE `projects` ADD INDEX `projects_idx_status_id` (`status`,`id`);
Transformed query:
SELECT
count(DISTINCT (`pj`.id)) AS qty
FROM
`project_jobs` `pj`
JOIN
`projects` `p`
ON `pj`.project_id = `p`.id
AND `p`.status NOT IN (
'CANCELED',
'DELETED',
'ARCHIVED')
WHERE
(
(
(
EXISTS (
SELECT
1
FROM
job_type_services
WHERE
job_type_services.job_type_id IN (
4, 2, 3
)
AND `pj`.job_type_service_id = job_type_services.id
)
)
AND (
`pj`.new_status_id IN (
'wip', 'completed', 'delivered'
)
)
)
AND (
`pj`.status <> 'DELETED'
AND `pj`.status <> 'CANCELED'
)
)
AND (
`pj`.due_date >= '2010-04-01 00:00:00'
AND `pj`.due_date <= '2018-05-09 23:59:59'
)
AND EXISTS (
SELECT
*
FROM
project_job_parents pjp
WHERE
EXISTS (
SELECT
1
FROM
project_jobs
WHERE
project_jobs.job_type_id IN (
1, 24, 7, 8, 32, 34, 33
)
AND pjp.project_job_id = project_jobs.id
)
AND pjp.parent_id = `pj`.id
)
I have a parent-child relation for the following tables:
CREATE TABLE `pages` (
id INT NOT NULL AUTO_INCREMENT,
name VARCHAR(100) NULL,
PRIMARY KEY ( id )
);
CREATE TABLE `pageObjects` (
id INT NOT NULL AUTO_INCREMENT,
object TEXT NULL,
lastChanged TIMESTAMP on update CURRENT_TIMESTAMP NOT NULL,
fkPageId int NOT NULL,
PRIMARY KEY ( id )
);
The pages have a one:many relation with pageObjects.
Whenever a single page has more than 10 pageObjects records, only the 10 most recently changed records should be kept and all older ones must be deleted.
I wanted to do this in a single query, but I can't seem to figure this out...
This is how far I've gotten:
DELETE
FROM pageObjects
WHERE id NOT IN (
SELECT po.id, po.fkPageId FROM (
SELECT objects.fkPageId FROM (
SELECT COUNT(*) as count, fkPageId
FROM pageObjects
GROUP BY fkPageId
) objects
WHERE count > 10
) AS page
JOIN pageObjects po
ON page.fkPageId = po.fkPageId
AND po.lastChanged < (
SELECT MIN(lastChanged )
FROM pageObjects
WHERE fkPageId = po.fkPageId
GROUP BY fkPageId
ORDER BY lastChanged DESC
LIMIT 10
)
)
Sadly, the LIMIT bit in the bottom sub-query is not working the way I want to, because the MIN() function should be applied AFTER the LIMIT is applied.
So I tried that:
DELETE
FROM pageObjects
WHERE id NOT IN (
SELECT po.id, po.fkPageId FROM (
SELECT objects.fkPageId FROM (
SELECT COUNT(*) as count, fkPageId
FROM pageObjects
GROUP BY fkPageId
) objects
WHERE count > 10
) AS page
JOIN pageObjects po
ON page.fkPageId = po.fkPageId
AND po.lastChanged < (
SELECT MIN(lastChanged)
FROM (
SELECT lastChanged
FROM pageObjects
WHERE fkPageId = po.fkPageId
GROUP BY fkPageId
ORDER BY lastChanged DESC
LIMIT 10
)
)
)
But this is not possible, because po.fkPageId is not visible inside the sub-query of the sub-query.
Is there any way to do this in a single query?
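You can do this quite simply by counting the number of later entries for each row and deleting every row that has 10 or more newer rows on the same page. MySQL refuses to select directly from the table it is deleting from (error 1093), so the inner SELECT is wrapped in a derived table:
DELETE FROM pageObjects
WHERE id IN (
SELECT id FROM (
SELECT po.id FROM pageObjects po
WHERE (
SELECT count(*)
FROM pageObjects po2
WHERE po2.fkPageId = po.fkPageId
AND po2.lastChanged > po.lastChanged
) >= 10
) AS stale
)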
Check out what the select returns here:
http://www.sqlfiddle.com/#!9/f5218f/1/0
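The counting approach can be tried out locally with a short Python sketch. It uses the standard sqlite3 module instead of MySQL, stores lastChanged as a plain integer for simplicity, and passes the number of rows to keep as a parameter; the helper name prune is hypothetical. A row is removed when at least `keep` rows of the same page are newer than it.

```python
import sqlite3


def prune(conn, keep):
    """Delete every row that has `keep` or more newer rows on the same page."""
    conn.execute(
        """
        DELETE FROM pageObjects
        WHERE id IN (
            SELECT po.id
            FROM pageObjects AS po
            WHERE (SELECT COUNT(*)
                   FROM pageObjects AS po2
                   WHERE po2.fkPageId = po.fkPageId
                     AND po2.lastChanged > po.lastChanged) >= ?
        )
        """,
        (keep,),
    )


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE pageObjects ("
    "id INTEGER PRIMARY KEY, lastChanged INTEGER, fkPageId INTEGER)"
)
# Page 1 gets 13 objects (lastChanged 1..13), page 2 gets 4 (lastChanged 1..4).
rows = [(None, t, 1) for t in range(1, 14)] + [(None, t, 2) for t in range(1, 5)]
conn.executemany("INSERT INTO pageObjects VALUES (?, ?, ?)", rows)

prune(conn, 10)

remaining = conn.execute(
    "SELECT fkPageId, COUNT(*), MIN(lastChanged) "
    "FROM pageObjects GROUP BY fkPageId ORDER BY fkPageId"
).fetchall()
print(remaining)
```

Rows with identical lastChanged values count as neither newer nor older than each other, so ties at the cut-off can leave slightly more than `keep` rows on a page.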
I have the following tables in my game's database:
rankedUp (image_id, user_id, created_at)
globalRank (image_id, rank )
matchups (user_id, image_id1, image_id2)
All image_ids in globalRank table are assigned a rank which is a float from 0 to 1
Assuming I have the current logged in user's "user_id" value, I'm looking for a query that will return a pair of image ids (imageid1, imageid2) such that:
1. imageid1 has a lower rank than imageid2 and is the closest rank below imageid2
2. the matchups table doesn't contain (user_id, imageid1, imageid2) or (user_id, imageid2, imageid1)
3. the rankedUp table doesn't contain (user_id, imageid1), or if it does, its created_at value is older than X hours
What I have so far for requirement 1 is this:
SELECT lowerImages.image_id AS lower_image, higherImages.image_id AS higher_image
FROM global_rank AS lowerImages, global_rank AS higherImages
WHERE lowerImages.rank < higherImages.rank
AND lowerImages.image_id = (
SELECT image_id
FROM (
SELECT image_id
FROM global_rank
WHERE rank < higherImages.rank
ORDER BY rank DESC
LIMIT 1 , 1
) AS tmp
)
But it doesn't work, because I can't reference higherImages.rank in the subquery.
Does anyone know how I could satisfy all of those requirements in one query?
Thanks for your help
EDIT:
I now have this query but I don't know about the efficiency and I need to test it for correctness:
SELECT lowerImages.image_id AS lower_image,
max(higherImages.image_id) AS higher_image
FROM global_rank AS lowerImages, global_rank AS higherImages
WHERE lowerImages.rank < higherImages.rank
AND 1 NOT IN (select 1 from ranked_up where
lowerImages.image_id = ranked_up.image_id
AND ranked_up.user_id = $user_id
AND ranked_up.created_at > DATE_SUB(NOW(), INTERVAL 1 DAY))
AND 1 NOT IN (
SELECT 1 from matchups where user_id = $user_id
AND lower_image_id = lowerImages.image_id
AND higher_image_id = higherImages.image_id
UNION
SELECT 1 from matchups where user_id = $user_id
AND lower_image_id = higherImages.image_id
AND higher_image_id = lowerImages.image_id
)
GROUP BY 1
The columns used in the "not in" subqueries are all indexed, so they should run fast. The efficiency problem I have is with the GROUP BY and the self-join of the global_rank table.
This question is a revision of Pretty Complex SQL Query, which should no longer be answered.
select
(
select globalRank.image_id from
rankedup inner join globalRank
on rankedup.image_id = globalRank.image_id
where rankedup.user_id = XXX
order by globalRank.rank desc
limit 0, 1
) as highest,
(
select globalRank.image_id from
rankedup inner join globalRank
on rankedup.image_id = globalRank.image_id
where rankedup.user_id = XXX
order by globalRank.rank desc
limit 1, 1
) as secondhighest
I normally use SQL Server, but I think this is the MySQL translation :) Note that a scalar subquery may return only one column, so each one selects just the image_id.
This should do the trick:
SELECT lowerImages.*, higherImages.*
FROM globalrank AS lowerImages, globalrank AS higherImages
WHERE lowerImages.rank = (
-- closest rank below higherImages; a directly correlated subquery works,
-- whereas a nested derived table cannot see higherImages.rank
SELECT MAX(g.rank)
FROM globalrank AS g
WHERE g.rank < higherImages.rank
)
AND NOT EXISTS (
SELECT * FROM matchups
WHERE user_id = $user_id
AND ((image_id1 = lowerImages.image_id AND image_id2 = higherImages.image_id)
OR (image_id2 = lowerImages.image_id AND image_id1 = higherImages.image_id))
)
AND NOT EXISTS (
-- skip images this user ranked up within the last day
SELECT * FROM rankedup
WHERE user_id = $user_id
AND image_id = lowerImages.image_id
AND created_at > DATE_SUB(NOW(), INTERVAL 1 DAY)
)
ORDER BY higherImages.rank
I'm assuming the PKs of matchups and rankedup include all columns in those tables. This would allow the last two sub-queries to use the PK indexes. You would probably want an index on globalrank.rank to speed up the first sub-query.
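The three requirements from the question can be exercised on a toy data set with the Python sketch below. It runs on SQLite rather than MySQL, the rows are invented, the cutoff timestamp is passed in as a parameter instead of being computed from NOW(), and the column name is written as "rank" in quotes because RANK is a reserved word in MySQL 8.0 (where backticks would be used instead).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical sample data for user 7.
conn.executescript("""
CREATE TABLE globalrank (image_id INTEGER PRIMARY KEY, "rank" REAL);
CREATE TABLE matchups (user_id INTEGER, image_id1 INTEGER, image_id2 INTEGER);
CREATE TABLE rankedup (image_id INTEGER, user_id INTEGER, created_at TEXT);
INSERT INTO globalrank VALUES (10, 0.1), (20, 0.3), (30, 0.5), (40, 0.9), (50, 0.95);
INSERT INTO matchups VALUES (7, 30, 20);       -- pair (20, 30) was already played
INSERT INTO rankedup VALUES
  (30, 7, '2024-01-10 06:00:00'),              -- recent: blocks 30 as lower image
  (10, 7, '2024-01-01 00:00:00'),              -- old enough: 10 is allowed again
  (40, 8, '2024-01-10 06:00:00');              -- another user: irrelevant
""")

sql = """
SELECT lo.image_id AS lower_image, hi.image_id AS higher_image
FROM globalrank AS hi
JOIN globalrank AS lo
  ON lo."rank" = (SELECT MAX(g."rank")         -- requirement 1: next rank below
                  FROM globalrank AS g
                  WHERE g."rank" < hi."rank")
WHERE NOT EXISTS (                              -- requirement 2: pair not played
        SELECT 1 FROM matchups AS m
        WHERE m.user_id = :uid
          AND ((m.image_id1 = lo.image_id AND m.image_id2 = hi.image_id)
            OR (m.image_id1 = hi.image_id AND m.image_id2 = lo.image_id)))
  AND NOT EXISTS (                              -- requirement 3: no recent rank-up
        SELECT 1 FROM rankedup AS r
        WHERE r.user_id = :uid
          AND r.image_id = lo.image_id
          AND r.created_at > :cutoff)
ORDER BY hi."rank"
"""

pairs = conn.execute(sql, {"uid": 7, "cutoff": "2024-01-10 00:00:00"}).fetchall()
print(pairs)
```

Matching the lower image by rank value means that images sharing exactly the same rank would each be paired with the next higher image; whether ranks are unique is not stated in the question.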