This query I have takes a whooping 45 seconds to execute. I have indexes on all fields that are being search.
SELECT SQL_CALC_FOUND_ROWS g.app_group_id, g.id as g_id, p.`id` as form_id,
a.`user` as activity_user,a.`activity` as app_act ,a.*
FROM grouped g
INNER JOIN
(SELECT max(id) as id, app_group_id FROM grouped GROUP BY app_group_id) g1
ON g1.app_group_id = g.app_group_id AND g.id = g1.id
INNER JOIN form p
on p.id = g.id
INNER JOIN
(SELECT a.id, a.date_time, a.user, a.activity FROM log a) a
ON g.id = a.id
WHERE p.agname like '%blahblah%' and p.`save4later` != 'y'
and a.activity = 'APP Submitted' or a.activity = 'InstaQUOTE'
ORDER BY app_group_id DESC limit 0, 100
In my explain it shows im using Using temporary; Using filesort
Indexes are:
activity table: PRIMARY activity_id INDEX date_time INDEX id INDEX activity INDEX user
form table: PRIMARY id INDEX id_md5 INDEX dateadd INDEX dateu INDEX agent_or_underwriter INDEX
grouped table: UNIQUE id INDEX app_group_id INDEX agent_or_underwriter save4later
Any advice is much appreciated
Thank you very much
To start with, try this one:
SELECT
g.app_group_id,
g.id AS g_id,
p.id AS form_id,
a.user AS activity_user,
a.activity AS app_act,
a.id,
a.date_time
FROM grouped g
INNER JOIN
(SELECT MAX(id) AS id, app_group_id FROM grouped GROUP BY app_group_id) g1
ON g1.app_group_id = g.app_group_id AND g.id = g1.id
INNER JOIN form p
ON p.id = g.id
INNER JOIN log a
ON g.id = a.id
WHERE p.agname LIKE '%blahblah%'
AND p.save4later != 'y'
AND a.activity IN('APP Submitted', 'InstaQUOTE')
LIMIT 0, 100
I removed an unnecessary subquery. Also removed ORDER BY. I guess you could do without sorting, and that must speed the query up a lot.
I also removed SQL_CALC_FOUND_ROWS, because, as I mentioned earlier, it should be faster to issue a separate COUNT(*) query.
Your where clause has an "Or" on the "a.Activity". Without ( ) around both activity, it is going through EVERYTHING, all P, G, G1 aliases. I am guessing that might be your bigger issue.
Additionally, I would ensure you have an index on your "Form" table with the ( Save4Later ) column indexed
I would update the query like this:
SELECT STRAIGHT_JOIN SQL_CALC_FOUND_ROWS
g.app_group_id,
g.id as g_id,
QualifyPages.Form_id,
a.`user` as activity_user,
a.`activity` as app_act,
a.*
FROM
( select p.ID as Form_ID
from FORM p
WHERE p.`save4later` != 'y'
AND p.agname like '%blahblah%' ) QualifyPages
JOIN Grouped g
on QualifyPages.Form_ID = g.ID
INNER JOIN
( SELECT app_group_id,
max(id) as MaxIDPerGroup
FROM
grouped
GROUP BY
app_group_id ) g1
ON g.app_group_id = g1.app_group_id
AND g.id = g1.MaxIDPerGroup
INNER JOIN
( SELECT a.id, a.date_time, a.user, a.activity
FROM log a
WHERE a.activity = 'APP Submitted'
or a.activity = 'InstaQUOTE' ) a
ON g.id = a.id
ORDER BY
g.app_group_id DESC
limit
0, 100
Related
I have three tables, users(which includes user details), territory_categories(which has different territory names and their ids), user_territory(a junction table between the two)
I want to filter users based on selected territory from the list.
query for that is --
SELECT u.id,account_id_fk,first_name,last_name,email_address,password,mobile_number,gender,user_accuracy,check_in_radius,report_to,role,allow_timeout,
active,last_logged_on,last_known_location_time,last_known_location,u.created_on,u.created_by,
(SELECT COUNT(DISTINCT u2.id) FROM fieldsense.users u2
INNER JOIN user_territory t1 ON u2.id=t1.user_id_fk
INNER JOIN territory_categories c on c.id=t1.teritory_id
**WHERE c.category_name LIKE** "Gujarat" AND(u2.account_id_fk=1 AND u2.role!=0) ) AS usersCount,
IFNULL(a.id,0) attendanceId,IFNULL(a.punch_date,'1111-11-11') punchInDate,IFNULL(a.punch_in,0) punchIntime,IFNULL(a.punch_out_date,'1111-11-11') punchOutDate, IFNULL(a.punch_out,0) punchOutTime
FROM fieldsense.users as u
LEFT OUTER JOIN attendances as a ON u.id=a.user_id_fk AND a.id=(select max(id) from attendances att where att.user_id_fk=u.id)
INNER JOIN user_territory t1 ON u.id=t1.user_id_fk
INNER JOIN territory_categories c on c.id=t1.teritory_id
**WHERE c.category_name LIKE "Gujarat"**
GROUP BY u.id **limit 10**
OUTPUT :
usersCount: 136
attendanceId: 0
punchInDate: 1111-11-11
punchIntime: 0
punchOutDate: 1111-11-11
punchOutTime: 0
10 rows in set (2.06 sec)
And without where clause when it loads it takes almost 1 minute to display the data
OUTPUT without where clause :
usersCount: 144
attendanceId: 0
punchInDate: 1111-11-11
punchIntime: 0
punchOutDate: 1111-11-11
punchOutTime: 0
10 rows in set (54.45 sec)
I am getting the desired output, but query is taking almost 1 minute to load, which i want to optimize. how can i do this ?
Try this out. I just narrowed down your inner join on table user_territory, so it will not join whole user_territory table but will only join distinct user_ids in it. Hope it reduces the performance time,
SELECT u.id,
account_id_fk,
first_name,
last_name,
email_address,
PASSWORD,
mobile_number,
gender,
user_accuracy,
check_in_radius,
report_to,
role,
allow_timeout,
active,
last_logged_on,
last_known_location_time,
last_known_location,
u.created_on,
u.created_by,
(
SELECT
COUNT(u2.id)
FROM
fieldsense.users u2
INNER JOIN (SELECT DISTINCT user_id_fk as id, territory_id FROM user_territory ) t1 ON u2.id = t1.id
INNER JOIN territory_categories c ON c.id = t1.teritory_id
WHERE
c.category_name LIKE "Gujarat"
AND (
u2.account_id_fk = 1
AND u2.role != 0
)
) AS usersCount,
IFNULL(a.id, 0) attendanceId,
IFNULL(a.punch_date, '1111-11-11') punchInDate,
IFNULL(a.punch_in, 0) punchIntime,
IFNULL(
a.punch_out_date,
'1111-11-11'
) punchOutDate,
IFNULL(a.punch_out, 0) punchOutTime
FROM
fieldsense.users AS u
LEFT OUTER JOIN attendances AS a ON u.id = a.user_id_fk
AND a.id = (
SELECT
max(id)
FROM
attendances att
WHERE
att.user_id_fk = u.id
)
INNER JOIN (SELECT DISTINCT user_id_fk as id, territory_id FROM user_territory ) t1 ON u2.id = t1.id
INNER JOIN territory_categories c ON c.id = t1.teritory_id
WHERE
c.category_name LIKE "Gujarat"
GROUP BY
u.id
LIMIT 10
I have this query
SELECT
s.account_number,
a.id AS 'ASPIRION ID',
a.patient_first_name,
a.patient_last_name,
s.admission_date,
s.total_charge,
astat.name AS 'STATUS',
astat.definition,
latest_note.content AS 'LAST NOTE',
a.insurance_company
FROM
accounts a
INNER JOIN
services s ON a.id = s.account_id
INNER JOIN
facilities f ON f.id = a.facility_id
INNER JOIN
account_statuses astat ON astat.id = a.account_status_id
INNER JOIN
(SELECT
account_id, MAX(content) content, MAX(created)
FROM
notes
GROUP BY account_id) latest_note ON latest_note.account_id = a.id
WHERE
a.facility_id = 56
My problem comes from
(SELECT
account_id, MAX(content) content, MAX(created)
FROM
notes
GROUP BY account_id)
Content is a varchar field and I am needed to get the most recent record. I now understand that MAX will not work on a varchar field the way that I want it. I am not sure how to be able to get the corresponding content with the MAX id and group that by account id on in this join.
What would be the best way to do this?
My notes table looks like this...
id account_id content created
1 1 This is a test 2011-03-16 02:06:40
2 1 More test 2012-03-16 02:06:40
Here are two choices. If your content is not very long and don't have funky characters, you can use the substring_index()/group_concat() trick:
(SELECT account_id,
SUBSTRING_INDEX(GROUP_CONCAT(content ORDER BY created desc SEPARATOR '|'
), 1, '|') as content
FROM notes
GROUP BY account_id
) latest_note
ON latest_note.account_id = a.id
Given the names of the columns and tables, that is likely not to work. Then you need an additional join or a correlated subquery in the from clause. I think that might be easiest in this case:
select . . .,
(select n.content
from notes n
where n.account_id = a.id
order by created desc
limit 1
) as latest_note
from . . .
The advantage to this method is that it only gets the notes for the rows you need. And, you don't need a left join to keep all the rows. For performance, you want an index on notes(account_id, created).
SELECT
s.account_number,
a.id AS 'ASPIRION ID',
a.patient_first_name,
a.patient_last_name,
s.admission_date,
s.total_charge,
astat.name AS 'STATUS',
astat.definition,
latest_note.content AS 'LAST NOTE',
a.insurance_company
FROM
accounts a
INNER JOIN services s ON a.id = s.account_id
INNER JOIN facilities f ON f.id = a.facility_id
INNER JOIN account_statuses astat ON astat.id = a.account_status_id
INNER JOIN
(SELECT account_id, MAX(created) mxcreated
FROM notes GROUP BY account_id) latest_note ON latest_note.account_id = a.id and
latest_note.mxcreated = --datetime column from any of the other tables being used
WHERE a.facility_id = 56
You have to join on the max(created) which would give the latest content.
Or you can change the query to
SELECT account_id, content, MAX(created) mxcreated
FROM notes GROUP BY account_id
as mysql allows you even if you don't include all non-aggregated columns in group by clause. However, unless you join on the max date you wouldn't get the correct results.
The last created record is the one for which does not exist a newer one. Hence:
SELECT
s.account_number,
a.id AS "ASPIRION ID",
a.patient_first_name,
a.patient_last_name,
s.admission_date,
s.total_charge,
astat.name AS "STATUS",
astat.definition,
latest_note.content AS "LAST NOTE",
a.insurance_company
FROM accounts a
INNER JOIN services s ON a.id = s.account_id
INNER JOIN facilities f ON f.id = a.facility_id
INNER JOIN account_statuses astat ON astat.id = a.account_status_id
INNER JOIN
(
SELECT account_id, content
FROM notes
WHERE NOT EXISTS
(
SELECT *
FROM notes newer
WHERE newer.account_id = notes.account_id
AND newer.created > notes.created
)
) latest_note ON latest_note.account_id = a.id
WHERE a.facility_id = 56;
I have the following:
SELECT DISTINCT s.username, COUNT( v.id ) AS cnt
FROM `instagram_item_viewer` v
INNER JOIN `instagram_shop_picture` p ON v.item_id = p.id
INNER JOIN `instagram_shop` s ON p.shop_id = s.id
AND s.expirydate IS NULL
AND s.isLocked =0
AND v.created >= '2014-08-01'
GROUP BY (
s.id
)
ORDER BY cnt DESC
Basically I have an instagram_item_viewer with the following structure:
id viewer_id item_id created
It tracks when a user has viewed an item and what time. So basically I wanted to find shops that has the most items viewed. I tried the query above and it executed fine, however it doesn't seem to give the appropriate data, it should have more count than what it is. What am I doing wrong?
First, with a group by statement, you don't need the DISTINCT clause. The grouping takes care of making your records distinct.
You may want to reconsider the order of your tables. Since you are interested in the shops, start there.
Select s.username, count(v.id)
From instagram_shop s
INNER JOIN instagram_shop_picture p ON p.shop_id = s.shop_id
INNER JOIN instagram_item_viewer v ON v.item_id = p.id
AND v.created >= '2014-08-01'
WHERE s.expirydate IS NULL
AND s.isLocked = 0
GROUP BY s.username
Give thata shot.
As mentioned by #Lennart, if you have a sample data it would be helpful. Because otherwise there will be assumptions.
Try run this to debug (this is not the answer yet)
SELECT s.username, p.id, COUNT( v.id ) AS cnt
FROM `instagram_item_viewer` v
INNER JOIN `instagram_shop_picture` p ON v.item_id = p.id
INNER JOIN `instagram_shop` s ON p.shop_id = s.id
AND s.expirydate IS NULL
AND s.isLocked =0
AND v.created >= '2014-08-01'
GROUP BY (
s.username, p.id
)
ORDER BY cnt DESC
The problem here is the store and item viewer is too far apart (i.e. bridged via shop_picture). Thus shop_picture needs to be in the SELECT statement.
Your original query only gets the first shop_picture count for that store that is why it is less than expected
Ultimately if you still want to achieve your goal, you can expand my SQL above to
SELECT x.username, SUM(x.cnt) -- or COUNT(x.cnt) depending on what you want
FROM
(
SELECT s.username, p.id, COUNT( v.id ) AS cnt
FROM `instagram_item_viewer` v
INNER JOIN `instagram_shop_picture` p ON v.item_id = p.id
INNER JOIN `instagram_shop` s ON p.shop_id = s.id
AND s.expirydate IS NULL
AND s.isLocked =0
AND v.created >= '2014-08-01'
GROUP BY (
s.username, p.id
)
ORDER BY cnt DESC
) x
GROUP BY x.username
I don't know much about query optimization but I know the order in which queries get executed
FROM clause
WHERE clause
GROUP BY clause
HAVING clause
SELECT clause
ORDER BY clause
This the query I had written
SELECT
`main_table`.forum_id,
my_topics.topic_id,
(
SELECT MAX(my_posts.post_id) FROM my_posts WHERE my_topics.topic_id = my_posts.topic_id
) AS `maxpostid`,
(
SELECT my_posts.admin_user_id FROM my_posts WHERE my_topics.topic_id = my_posts.topic_id ORDER BY my_posts.post_id DESC LIMIT 1
) AS `admin_user_id`,
(
SELECT my_posts.user_id FROM my_posts WHERE my_topics.topic_id = my_posts.topic_id ORDER BY my_posts.post_id DESC LIMIT 1
) AS `user_id`,
(
SELECT COUNT(my_topics.topic_id) FROM my_topics WHERE my_topics.forum_id = main_table.forum_id ORDER BY my_topics.forum_id DESC LIMIT 1
) AS `topicscount`,
(
SELECT COUNT(my_posts.post_id) FROM my_posts WHERE my_topics.topic_id = my_posts.topic_id ORDER BY my_topics.topic_id DESC LIMIT 1
) AS `postcount`,
(
SELECT CONCAT(admin_user.firstname,' ',admin_user.lastname) FROM admin_user INNER JOIN my_posts ON my_posts.admin_user_id = admin_user.user_id WHERE my_posts.post_id = maxpostid ORDER BY my_posts.post_id DESC LIMIT 1
) AS `adminname`,
(
SELECT forum_user.nick_name FROM forum_user INNER JOIN my_posts ON my_posts.user_id = forum_user.user_id WHERE my_posts.post_id = maxpostid ORDER BY my_posts.post_id DESC LIMIT 1
) AS `nickname`,
(
SELECT CONCAT(ce1.value,' ',ce2.value) AS fullname FROM my_posts INNER JOIN customer_entity_varchar AS ce1 ON ce1.entity_id = my_posts.user_id INNER JOIN customer_entity_varchar AS ce2 ON ce2.entity_id=my_posts.user_id WHERE (ce1.attribute_id = 1) AND (ce2.attribute_id = 2) AND my_posts.post_id = maxpostid ORDER BY my_posts.post_id DESC LIMIT 1
) AS `fullname`
FROM `my_forums` AS `main_table`
LEFT JOIN `my_topics` ON main_table.forum_id = my_topics.forum_id
WHERE (forum_status = '1')
And now I want to know if there is any way to optimize it ? Because all the logic is written in Select section not From, but I don't know how to write the same logic in From section of the query ?
Does it make any difference or both are same ?
Thanks
Correlated subqueries should really be a last resort, they often end up being executed RBAR, and given that a number of your subqueries are very similar, trying to get the same result using joins is going to result in a lot less table scans.
The first thing I note is that all of your subqueries include the table my_posts, and most contain ORDER BY my_posts.post_id DESC LIMIT 1, those that don't have a count with no group by so the order and limit are redundant anyway, so my first step would be to join to my_posts:
SELECT *
FROM my_forums AS f
LEFT JOIN my_topics AS t
ON f.forum_id = t.forum_id
LEFT JOIN
( SELECT topic_id, MAX(post_id) AS post_id
FROM my_posts
GROUP BY topic_id
) AS Maxp
ON Maxp.topic_id = t.topic_id
LEFT JOIN my_posts AS p
ON p.post_id = Maxp.post_id
WHERE forum_status = '1';
Here the subquery just ensures you get the latest post per topic_id. I have shortened your table aliases here for my convenience, I am not sure why you would use a table alias that is longer than the actual table name?
Now you have the bulk of your query you can start adding in your columns, in order to get the post count, I have added a count to the subquery Maxp, I have also had to add a few more joins to get some of the detail out, such as names:
SELECT f.forum_id,
t.topic_id,
p.post_id AS `maxpostid`,
p.admin_user_id,
p.user_id,
t2.topicscount,
maxp.postcount,
CONCAT(au.firstname,' ',au.lastname) AS adminname,
fu.nick_name AS nickname
CONCAT(ce1.value,' ',ce2.value) AS fullname
FROM my_forums AS f
LEFT JOIN my_topics AS t
ON f.forum_id = t.forum_id
LEFT JOIN
( SELECT topic_id,
MAX(post_id) AS post_id,
COUNT(*) AS postcount
FROM my_posts
GROUP BY topic_id
) AS Maxp
ON Maxp.topic_id = t.topic_id
LEFT JOIN my_posts AS p
ON p.post_id = Maxp.post_id
LEFT JOIN admin_user AS au
ON au.admin_user_id = p.admin_user_id
LEFT JOIN forum_user AS fu
ON fu.user_id = p.user_id
LEFT JOIN customer_entity_varchar AS ce1
ON ce1.entity_id = p.user_id
AND ce1.attribute_id = 1
LEFT JOIN customer_entity_varchar AS ce2
ON ce2.entity_id = p.user_id
AND ce2.attribute_id = 2
LEFT JOIN
( SELECT forum_id, COUNT(*) AS topicscount
FROM my_topics
GROUP BY forum_id
) AS t2
ON t2.forum_id = f.forum_id
WHERE forum_status = '1';
I am not familiar with your schema so the above may need some tweaking, but the principal remains - use JOINs over sub-selects.
The next stage of optimisation I would do is to get rid of your customer_entity_varchar table, or at least stop using it to store things as basic as first name and last name. The Entity-Attribute-Value model is an SQL antipattern, if you added two columns, FirstName and LastName to your forum_user table you would immediately lose two joins from your query. I won't get too involved in the EAV vs Relational debate as this has been extensively discussed a number of times, and I have nothing more to add.
The final stage would be to add appropriate indexes, you are in the best decision to decide what is appropriate, I'd suggest you probably want indexes on at least the foreign keys in each table, possibly more.
EDIT
To get one row per forum_id you would need to use the following:
SELECT f.forum_id,
t.topic_id,
p.post_id AS `maxpostid`,
p.admin_user_id,
p.user_id,
MaxT.topicscount,
maxp.postcount,
CONCAT(au.firstname,' ',au.lastname) AS adminname,
fu.nick_name AS nickname
CONCAT(ce1.value,' ',ce2.value) AS fullname
FROM my_forums AS f
LEFT JOIN
( SELECT t.forum_id,
COUNT(DISTINCT t.topic_id) AS topicscount,
COUNT(*) AS postCount,
MAX(t.topic_ID) AS topic_id
FROM my_topics AS t
INNER JOIN my_posts AS p
ON p.topic_id = p.topic_id
GROUP BY t.forum_id
) AS MaxT
ON MaxT.forum_id = f.forum_id
LEFT JOIN my_topics AS t
ON t.topic_ID = Maxt.topic_ID
LEFT JOIN
( SELECT topic_id, MAX(post_id) AS post_id
FROM my_posts
GROUP BY topic_id
) AS Maxp
ON Maxp.topic_id = t.topic_id
LEFT JOIN my_posts AS p
ON p.post_id = Maxp.post_id
LEFT JOIN admin_user AS au
ON au.admin_user_id = p.admin_user_id
LEFT JOIN forum_user AS fu
ON fu.user_id = p.user_id
LEFT JOIN customer_entity_varchar AS ce1
ON ce1.entity_id = p.user_id
AND ce1.attribute_id = 1
LEFT JOIN customer_entity_varchar AS ce2
ON ce2.entity_id = p.user_id
AND ce2.attribute_id = 2
WHERE forum_status = '1';
Here is a sample SQL dump: https://gist.github.com/JREAM/99287d033320b2978728
I have a SELECT that grabs a bundle of users.
I then do a foreach loop to attach all the associated tree_processes to that user.
So I end up doing X Queries: users * tree.
Wouldn't it be much more efficient to fetch the two together?
I've thought about doing a LEFT JOIN Subselect, but I'm having a hard time getting it correct.
Below I've done a query to select the correct data in the SELECT, however I would have to do this for all 15 rows and it seems like a TERRIBLE waste of memory.
This is my dirty Ateempt:
-
SELECT
s.id,
s.firstname,
s.lastname,
s.email,
(
SELECT tp.id FROM tree_processes AS tp
JOIN tree AS t ON (
t.id = tp.tree_id
)
WHERE subscribers_id = s.id
ORDER BY tp.id DESC
LIMIT 1
) AS newest_tree_id,
#
# Don't want to have to do this below for every row
(
SELECT t.type FROM tree_processes AS tp
JOIN tree AS t ON (
t.id = tp.tree_id
)
WHERE subscribers_id = s.id
ORDER BY tp.id DESC
LIMIT 1
) AS tree_type
FROM subscribers AS s
INNER JOIN scenario_subscriptions AS ss ON (
ss.subscribers_id = s.id
)
WHERE ss.scenarios_id = 1
AND ss.completed != 1
AND ss.purchased_exit != 1
AND deleted != 1
GROUP BY s.id
LIMIT 0, 100
This is my LEFT JOIN attempt, but I am having trouble getting the SELECT values
SELECT
s.id,
s.firstname,
s.lastname,
s.email,
freshness.id,
# freshness.subscribers_id < -- Cant get multiples out of the LEFT join
FROM subscribers AS s
INNER JOIN scenario_subscriptions AS ss ON (
ss.subscribers_id = s.id
)
LEFT JOIN ( SELECT tp.id, tp.subscribers_id AS tp FROM tree_processes AS tp
JOIN tree AS t ON (
t.id = tp.tree_id
)
ORDER BY tp.id DESC
LIMIT 1 ) AS freshness
ON (
s.id = subscribers_id
)
WHERE ss.scenarios_id = 1
AND ss.completed != 1
AND ss.purchased_exit != 1
AND deleted != 1
GROUP BY s.id
LIMIT 0, 100
In the LEFT JOIN you are using 'freshness' as the table alias. This in you select you need to additionally state what column(s) you want from it. Since there is only one column (id) you need to add:
freshness.id
to the select clause.
Your ON clause of the left join looks pretty dodgy too. Maybe freshness.id = ss.subscribers_id?
Cheers -