Counting rows from a big MySQL query (Zend)

I am developing in Zend and have a rather large MySQL query. The query works fine and I get the list I expect. I am building it using Select->Where.... The query is below.
SELECT DISTINCT `d`.* FROM `deliverable` AS `d` INNER JOIN `groups` AS `g1` ON d.id = g1.deliverable_id INNER JOIN `groupmembers` AS `gm1` ON g1.id = gm1.group_id LEFT JOIN `connection` AS `c` ON d.id = c.downstreamnode_id LEFT JOIN `deliverable` AS `d1` ON c.upstreamnode_id = d1.id INNER JOIN `deliverable` AS `d2` ON CASE WHEN d1.id IS NULL THEN d.id ELSE d1.id END = d2.id INNER JOIN `groups` AS `g` ON d2.id = g.deliverable_id INNER JOIN `groupmembers` AS `gm` ON g.id = gm.group_id WHERE (g1.group_type = 100) AND (gm1.member_id = 1) AND (c.downstreamnode_id IS NULL OR d.restrict_access = 1) AND (g.group_type = 100 OR g.group_type = 110) AND (gm.member_id = 1) AND (d.deliverable_type = 110 OR d.deliverable_type = 100) GROUP BY CASE WHEN c.downstreamnode_id IS NULL THEN d.id ELSE c.downstreamnode_id END
The only problem is that when I try to count the rows with a MySQL query, I only get 1 returned. Below is the query.
SELECT DISTINCT count(*) AS `rowCount` FROM `deliverable` AS `d` INNER JOIN `groups` AS `g1` ON d.id = g1.deliverable_id INNER JOIN `groupmembers` AS `gm1` ON g1.id = gm1.group_id LEFT JOIN `connection` AS `c` ON d.id = c.downstreamnode_id LEFT JOIN `deliverable` AS `d1` ON c.upstreamnode_id = d1.id INNER JOIN `deliverable` AS `d2` ON CASE WHEN d1.id IS NULL THEN d.id ELSE d1.id END = d2.id INNER JOIN `groups` AS `g` ON d2.id = g.deliverable_id INNER JOIN `groupmembers` AS `gm` ON g.id = gm.group_id WHERE (g1.group_type = 100) AND (gm1.member_id = 1) AND (c.downstreamnode_id IS NULL OR d.restrict_access = 1) AND (g.group_type = 100 OR g.group_type = 110) AND (gm.member_id = 1) AND (d.deliverable_type = 110 OR d.deliverable_type = 100) GROUP BY CASE WHEN c.downstreamnode_id IS NULL THEN d.id ELSE c.downstreamnode_id END
I generate this by using the same 'select' that generated the first query, but I reset the columns and add the count in.
$this->getAdapter()->setFetchMode(Zend_Db::FETCH_ASSOC);
$select
->reset( Zend_Db_Select::COLUMNS)
->columns(array('count('.$column.') as rowCount'));
$rowCount = $this->getAdapter()->fetchOne($select);
This method works fine for all my other queries; this is the only one I am having trouble with. I suspect it has something to do with the 'CASE' I have in there, but it is strange because I am getting the correct rows from the first query. Any ideas? Thanks.
FYI below are two queries that I have working successfully.
SELECT DISTINCT `po`.* FROM `post` AS `po` INNER JOIN `postinfo` AS `p` ON po.postinfo_id = p.id WHERE (p.creator_id = 1) ORDER BY `p`.`date_created` DESC
SELECT DISTINCT count(*) AS `rowCount` FROM `post` AS `po` INNER JOIN `postinfo` AS `p` ON po.postinfo_id = p.id WHERE (p.creator_id = 1) ORDER BY `p`.`date_created` DESC
In this one I get 4 rows returned by the first query and 'int 4' returned by the second one. Does anyone know why it doesn't work for the big query?

Move your DISTINCT inside the COUNT. COUNT(DISTINCT ...) needs a column expression rather than `*`, for example:
SELECT COUNT(DISTINCT `d`.`id`) AS `rowCount` ...

OK, figured it out. It was the GROUP BY that was causing only 1 to be returned. Thanks, Interrobang, for your help; I am sure that using DISTINCT incorrectly would have caused me a headache in the future.
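A minimal sketch of why the GROUP BY produces that result, assuming Python's sqlite3 module as a stand-in for MySQL and two hypothetical tables (`item`, `membership`) in place of the real schema. With a GROUP BY in place, COUNT(*) is computed once per group, so the count query returns one row per group, and a call like fetchOne, which reads only the first column of the first row, sees just the first group's count. Counting the grouped query as a derived table is one way to get the number of rows instead.

```python
import sqlite3

# Toy stand-in for the real schema: SQLite instead of MySQL, hypothetical tables.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE item (id INTEGER PRIMARY KEY);
    CREATE TABLE membership (id INTEGER PRIMARY KEY, item_id INTEGER);
    INSERT INTO item VALUES (1), (2), (3);
    INSERT INTO membership VALUES (10, 1), (11, 2), (12, 2), (13, 3);
""")

# Shared FROM / JOIN / GROUP BY, as when the same select object is reused.
tail = """
    FROM item AS i
    INNER JOIN membership AS m ON m.item_id = i.id
    GROUP BY i.id
    ORDER BY i.id
"""

# The listing query: one row per group.
rows = conn.execute("SELECT i.id" + tail).fetchall()

# Swapping the columns for COUNT(*) keeps the GROUP BY, so the count is
# computed per group and the result still has one row per group.
per_group = conn.execute("SELECT COUNT(*) AS rowCount" + tail).fetchall()
first_value = per_group[0][0]  # a fetchOne-style call reads only this value

# Counting the grouped result as a derived table gives the number of rows.
total = conn.execute(
    "SELECT COUNT(*) FROM (SELECT i.id" + tail + ") AS grouped"
).fetchone()[0]

print(len(rows), per_group, first_value, total)
```

Here the listing query returns 3 rows, the per-group counts are 1, 2 and 1, and only the derived-table count yields 3.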

Try using SQL_CALC_FOUND_ROWS in your query?
http://dev.mysql.com/doc/refman/5.0/en/information-functions.html#function_found-rows
Using SQL_CALC_FOUND_ROWS is MySQL-specific, but it's pretty nice for getting a full record count even when your initial query contains a LIMIT. Once you have the count, don't include SQL_CALC_FOUND_ROWS in subsequent queries for extra records, since that adds extra load to each query.
Your initial query would be:
SELECT SQL_CALC_FOUND_ROWS DISTINCT `d`.* FROM `deliverable` AS `d` INNER JOIN `groups` ...
You'll have to do a subsequent call after your initial query executes to get the count by doing a SELECT FOUND_ROWS().
If you do a little searching, you'll find someone who extended Zend_Db_Select to include this ability.

Related

use case statement in where clause conditional

Using the same query, I am trying to list the notices which have not been sent. I am close to a working query but stuck on how to apply some conditions in the WHERE clause only when a certain condition holds.
I have tried the following query.
SELECT
vtn.*,
vn.v_notice_datetime
FROM
v_templates vt
JOIN v_template_notices vtn ON (vtn.v_template_id = vt.id)
JOIN violations v ON( v.v_template_id = vt.id )
LEFT JOIN vnotices vn ON(vn.vtemplate_notice_id = vtn.id)
WHERE
v.id = 1
AND vn.v_notice_datetime IS NULL
AND vtn.id > (
SELECT max(vn.vtemplate_notice_id)
FROM vnotices vn
WHERE vn.vnotice_datetime IS NOT NULL )
I want to concatenate the following SQL code when vn.id IS NOT NULL:
AND vtn.id > ( SELECT max(vn.v_template_notice_id)
FROM v_notices vn WHERE vn.v_notice_datetime IS NOT NULL)
Is a CASE statement a good option, or is there an alternative? In my research I found that CASE statements degrade performance, but I'm not sure how to execute conditional statements in PostgreSQL / MySQL.
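Just a suggestion:
You reference a column of the LEFT JOINed table in the WHERE clause. That condition is applied after the join, which can make the LEFT JOIN behave like an inner join.
You should move the condition on the left-joined table from the WHERE clause to the ON condition: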
SELECT
vtn.*,
vn.v_notice_datetime
FROM v_templates vt
JOIN v_template_notices vtn ON (vtn.v_template_id = vt.id)
JOIN violations v ON( v.v_template_id = vt.id )
LEFT JOIN vnotices vn ON(vn.vtemplate_notice_id = vtn.id) AND vn.v_notice_datetime IS NULL
WHERE v.id = 1
AND vtn.id > (
SELECT max(vn.vtemplate_notice_id)
FROM vnotices vn
WHERE vn.vnotice_datetime IS NOT NULL )
;

How to Make This SQL Query More Efficient?

I'm not sure how to make the following SQL query more efficient. Right now, the query is taking 8 - 12 seconds on a pretty fast server, but that's not close to fast enough for a Website when users are trying to load a page with this code on it. It's looking through tables with many rows, for instance the "Post" table has 717,873 rows. Basically, the query lists all Posts related to what the user is following (newest to oldest).
Is there a way to make it faster by only getting the last 20 results total based on PostTimeOrder?
Any help would be much appreciated or insight on anything that can be done to improve this situation. Thank you.
Here's the full SQL query (lots of nesting):
SELECT DISTINCT p.Id, UNIX_TIMESTAMP(p.PostCreationTime) AS PostCreationTime, p.Content AS Content, p.Bu AS Bu, p.Se AS Se, UNIX_TIMESTAMP(p.PostCreationTime) AS PostTimeOrder
FROM Post p
WHERE (p.Id IN (SELECT pc.PostId
FROM PostCreator pc
WHERE (pc.UserId IN (SELECT uf.FollowedId
FROM UserFollowing uf
WHERE uf.FollowingId = '100')
OR pc.UserId = '100')
))
OR (p.Id IN (SELECT pum.PostId
FROM PostUserMentions pum
WHERE (pum.UserId IN (SELECT uf.FollowedId
FROM UserFollowing uf
WHERE uf.FollowingId = '100')
OR pum.UserId = '100')
))
OR (p.Id IN (SELECT ssp.PostId
FROM SStreamPost ssp
WHERE (ssp.SStreamId IN (SELECT ssf.SStreamId
FROM SStreamFollowing ssf
WHERE ssf.UserId = '100'))
))
OR (p.Id IN (SELECT psm.PostId
FROM PostSMentions psm
WHERE (psm.StockId IN (SELECT sf.StockId
FROM StockFollowing sf
WHERE sf.UserId = '100' ))
))
UNION ALL
SELECT DISTINCT p.Id AS Id, UNIX_TIMESTAMP(p.PostCreationTime) AS PostCreationTime, p.Content AS Content, p.Bu AS Bu, p.Se AS Se, UNIX_TIMESTAMP(upe.PostEchoTime) AS PostTimeOrder
FROM Post p
INNER JOIN UserPostE upe
on p.Id = upe.PostId
INNER JOIN UserFollowing uf
on (upe.UserId = uf.FollowedId AND (uf.FollowingId = '100' OR upe.UserId = '100'))
ORDER BY PostTimeOrder DESC;
Changing your p.Id IN (...) predicates to existence predicates with correlated subqueries may help. Also, since both halves of your UNION ALL query pull from the Post table and possibly return nearly identical records, you might be able to combine the two into one query by left outer joining to UserPostE and adding upe.PostId IS NOT NULL as an OR condition in the WHERE clause. UserFollowing will still inner join to UserPostE. If you want the same Post record twice, once with upe.PostEchoTime and once with p.PostCreationTime as the PostTimeOrder, you'll need to keep the UNION ALL.
SELECT
DISTINCT -- <<=- May not be needed
p.Id
, UNIX_TIMESTAMP(p.PostCreationTime) AS PostCreationTime
, p.Content AS Content
, p.Bu AS Bu
, p.Se AS Se
, UNIX_TIMESTAMP(coalesce( upe.PostEchoTime
, p.PostCreationTime)) AS PostTimeOrder
FROM Post p
LEFT JOIN UserPostE upe
INNER JOIN UserFollowing uf
on (upe.UserId = uf.FollowedId AND
(uf.FollowingId = '100' OR
upe.UserId = '100'))
on p.Id = upe.PostId
WHERE upe.PostID is not null
or exists (SELECT 1
FROM PostCreator pc
WHERE pc.PostId = p.ID
and (pc.UserId = '100'
or exists (SELECT 1
FROM UserFollowing uf
WHERE uf.FollowedId = pc.UserID
and uf.FollowingId = '100'))
)
OR exists (SELECT 1
FROM PostUserMentions pum
WHERE pum.PostId = p.ID
and (pum.UserId = '100'
or exists (SELECT 1
FROM UserFollowing uf
WHERE uf.FollowedId = pum.UserId
and uf.FollowingId = '100'))
)
OR exists (SELECT 1
FROM SStreamPost ssp
WHERE ssp.PostId = p.ID
and exists (SELECT 1
FROM SStreamFollowing ssf
WHERE ssf.SStreamId = ssp.SStreamId
and ssf.UserId = '100')
)
OR exists (SELECT 1
FROM PostSMentions psm
WHERE psm.PostId = p.ID
and exists (SELECT 1
FROM StockFollowing sf
WHERE sf.StockId = psm.StockId
and sf.UserId = '100' )
)
ORDER BY PostTimeOrder DESC
The from section could alternatively be rewritten to also use an existence clause with a correlated sub query:
FROM Post p
LEFT JOIN UserPostE upe
on p.Id = upe.PostId
and ( upe.UserId = '100'
or exists (select 1
from UserFollowing uf
where uf.FollowedId = upe.UserId
and uf.FollowingId = '100'))
Turn IN ( SELECT ... ) into a JOIN .. ON ... (see below)
Turn OR into UNION (see below)
Are some of the tables many:many mappings, such as SStreamFollowing? If so, follow the tips in http://mysql.rjweb.org/doc.php/index_cookbook_mysql#many_to_many_mapping_table
Example of IN:
SELECT ssp.PostId
FROM SStreamPost ssp
WHERE (ssp.SStreamId IN (
SELECT ssf.SStreamId
FROM SStreamFollowing ssf
WHERE ssf.UserId = '100' ))
-->
SELECT ssp.PostId
FROM SStreamPost ssp
JOIN SStreamFollowing ssf ON ssp.SStreamId = ssf.SStreamId
WHERE ssf.UserId = '100'
The big WHERE with all the INs becomes something like
JOIN ( ( SELECT pc.PostId AS id ... )
UNION ( SELECT pum.PostId ... )
UNION ( SELECT ssp.PostId ... )
UNION ( SELECT psm.PostId ... ) )
Get what you can out of those suggestions, then come back for more advice if you still need it. And bring SHOW CREATE TABLE with you.
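The IN-to-JOIN rewrite from this answer can be checked on toy data. This is only a sketch: it uses Python's sqlite3 module rather than MySQL, the rows are made up, and the composite primary key on SStreamFollowing is an assumption (the real table definitions were not posted). With that key in place the two forms return the same rows; without it, the JOIN could return extra duplicates.

```python
import sqlite3

# SQLite stands in for MySQL; the rows and the primary key are assumptions.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE SStreamPost (PostId INTEGER, SStreamId INTEGER);
    CREATE TABLE SStreamFollowing (
        UserId INTEGER,
        SStreamId INTEGER,
        PRIMARY KEY (UserId, SStreamId)
    );
    INSERT INTO SStreamPost VALUES (10, 1), (11, 2), (12, 3), (10, 2);
    INSERT INTO SStreamFollowing VALUES (100, 1), (100, 2), (200, 3);
""")

# Original form: IN with a subquery.
in_form = conn.execute("""
    SELECT ssp.PostId
    FROM SStreamPost ssp
    WHERE ssp.SStreamId IN (SELECT ssf.SStreamId
                            FROM SStreamFollowing ssf
                            WHERE ssf.UserId = 100)
    ORDER BY ssp.PostId
""").fetchall()

# Rewritten form: JOIN .. ON.
join_form = conn.execute("""
    SELECT ssp.PostId
    FROM SStreamPost ssp
    JOIN SStreamFollowing ssf ON ssp.SStreamId = ssf.SStreamId
    WHERE ssf.UserId = 100
    ORDER BY ssp.PostId
""").fetchall()

# Post 10 sits in two followed streams, so it appears twice in both forms.
# DISTINCT (or UNION rather than UNION ALL) collapses it to a single id.
distinct_ids = conn.execute("""
    SELECT DISTINCT ssp.PostId
    FROM SStreamPost ssp
    JOIN SStreamFollowing ssf ON ssp.SStreamId = ssf.SStreamId
    WHERE ssf.UserId = 100
    ORDER BY ssp.PostId
""").fetchall()

print(in_form, join_form, distinct_ids)
```

Because a post can sit in more than one followed stream, the list of ids still needs de-duplicating before it is joined back to Post, which plain UNION does in the combined form.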

MySQL Query, (access mainquery from subquery)

My problem is that I need a WHERE clause in a subquery that depends on the results of the main query; otherwise my results are wrong and the query takes too long or cannot be executed.
I need this query to create a view for a search server, so I cannot split it into two queries or process it dynamically with a script.
The problem occurs with the following query:
SELECT `s`.`id` AS `seminar_id`, (SUM( `sub`.`seminar_rate` ) / COUNT( `sub`.`seminar_id` )) AS `total_rate`
FROM
(
SELECT (SUM( value ) / COUNT( * )) AS `seminar_rate` , `r`.`seminar_id`
FROM `rating` r
INNER JOIN `rating_item` ri ON `r`.`id` = `ri`.`rating_id`
WHERE `r`.`seminar_id` = `s`.`id`/* <- Here is my problem, this is inacessible */
GROUP BY `r`.`seminar_id`
) AS sub,
`seminar` s
INNER JOIN `date` d
ON `s`.`id` = `d`.`seminar_id`
INNER JOIN `date_unit` du
ON `d`.`id` = `du`.`date_id`
LEFT JOIN `seminar_subject` su
ON `s`.`id` = `su`.`seminar_id`
LEFT JOIN `subject` suj
ON `su`.`subject_id` = `suj`.`id`
INNER JOIN `user` u
ON `s`.`user_id` = `u`.`id`
INNER JOIN `company` c
ON `u`.`company_id` = `c`.`id`
GROUP BY `du`.`date_id`, `sub`.`seminar_id`
This query should calculate a total rate out of ratings for each Seminar.
However my ratings are stored in my "rating" table and should be processed live.
(Side note: if you wonder about all the joins, this query has a lot more SELECTed fields. I removed them because they are not necessary to solve the problem, and to make the query look less complicated [I know it still is >.>]...)
The reason is that I want these results to be sortable by my search engine later, depending on the user's sort parameters; that's why I need it inside this query.
The problem itself is pretty obvious:
ERROR 1054 (42S22): Unknown column 's.id' in 'where clause'
The subselect doesn't know about the results of the main query. Is there a way around this?
Could someone give me a hint to get this working?
Thanks in advance.
Using your subquery in the JOIN you can eliminate the WHERE clause and achieve nearly the same result. Here is your modified query. Hope this solves your problem.
SELECT `s`.`id` AS `seminar_id`, (SUM( `sub`.`seminar_rate` ) / COUNT( `sub`.`seminar_id` )) AS `total_rate`
FROM `seminar` s
INNER JOIN
(
SELECT (SUM( value ) / COUNT( * )) AS `seminar_rate` , `r`.`seminar_id`
FROM `rating` r
INNER JOIN `rating_item` ri ON `r`.`id` = `ri`.`rating_id`
/*WHERE `r`.`seminar_id` = `s`.`id` <- Here is my problem, this is inacessible */
GROUP BY `r`.`seminar_id`
) AS sub ON s.id = sub.`seminar_id`
INNER JOIN `date` d
ON `s`.`id` = `d`.`seminar_id`
INNER JOIN `date_unit` du
ON `d`.`id` = `du`.`date_id`
LEFT JOIN `seminar_subject` su
ON `s`.`id` = `su`.`seminar_id`
LEFT JOIN `subject` suj
ON `su`.`subject_id` = `suj`.`id`
INNER JOIN `user` u
ON `s`.`user_id` = `u`.`id`
INNER JOIN `company` c
ON `u`.`company_id` = `c`.`id`
GROUP BY `du`.`date_id`, `sub`.`seminar_id`
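A minimal sketch of the derived-table join on toy data, assuming Python's sqlite3 module in place of MySQL, made-up rows, and only the three tables needed for the rating. AVG(ri.value) is used as a stand-in for SUM(value) / COUNT(*). The sketch also shows where "nearly the same result" comes from: with INNER JOIN, a seminar without ratings drops out, while a LEFT JOIN keeps it with a NULL rate.

```python
import sqlite3

# SQLite stands in for MySQL; the rows below are made up for illustration.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE seminar (id INTEGER PRIMARY KEY);
    CREATE TABLE rating (id INTEGER PRIMARY KEY, seminar_id INTEGER);
    CREATE TABLE rating_item (rating_id INTEGER, value INTEGER);
    INSERT INTO seminar VALUES (1), (2), (3);
    INSERT INTO rating VALUES (1, 1), (2, 1), (3, 2);
    INSERT INTO rating_item VALUES (1, 4), (1, 2), (2, 3), (3, 5);
""")


def seminar_rates(join_kind):
    """Join each seminar to its aggregated rating using the given join type."""
    query = f"""
        SELECT s.id AS seminar_id, sub.seminar_rate
        FROM seminar s
        {join_kind} JOIN (
            -- Aggregated once for all seminars; no reference to s is needed here.
            SELECT r.seminar_id, AVG(ri.value) AS seminar_rate
            FROM rating r
            INNER JOIN rating_item ri ON r.id = ri.rating_id
            GROUP BY r.seminar_id
        ) AS sub ON s.id = sub.seminar_id
        ORDER BY s.id
    """
    return conn.execute(query).fetchall()


rated_only = seminar_rates("INNER")     # seminar 3 has no ratings and drops out
all_seminars = seminar_rates("LEFT")    # seminar 3 is kept with a NULL rate

print(rated_only)
print(all_seminars)
```

The correlation moves from the subquery's WHERE clause into the ON condition, so the subquery no longer has to see `s`.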

Combine conditions with AND in Mysql ON clause

I have the following query, that I use to filter rows based on software_id and level.
I've put the conditions in the ON-Clause since I still want rows returned, where there are no corresponding rows in the JobadvertsSoftware Table.
SELECT `Jobadvert`.`id` FROM `jobadverts` AS `Jobadvert`
LEFT JOIN `users` AS `User` ON (`Jobadvert`.`user_id` = `User`.`id`)
LEFT JOIN `jobadverts_softwares` AS `JobadvertsSoftware_0` ON
(`Jobadvert`.`id` = 'JobadvertsSoftware_0.jobadvert_id' AND
(`JobadvertsSoftware_0`.`software_id` = '32' AND
`JobadvertsSoftware_0`.`level` IN ('1', 4)))
WHERE `Jobadvert`.`active` = 1 AND `User`.`premium` = '1' AND
Jobadvert`.`department_id` = (5)
GROUP BY `Jobadvert`.`id`
The problem is that it also returns JobadvertsSoftware rows where level is, e.g., 2.
Again, if I put that in the WHERE clause, it will filter out the rows that have no JobadvertsSoftware, which it shouldn't do.
How can I tell MySQL to return all rows of Jobadvert where the given software_id AND level match or are NULL?
Try this:
SELECT `Jobadvert`.`id`, `JobadvertsSoftware_0`.`level`
FROM `jobadverts` AS `Jobadvert`
LEFT JOIN `users` AS `User` ON (`Jobadvert`.`user_id` = `User`.`id`)
INNER JOIN `jobadverts_softwares` AS `JobadvertsSoftware_0` ON
(`Jobadvert`.`id` = `JobadvertsSoftware_0`.`jobadvert_id` AND
(`JobadvertsSoftware_0`.`software_id` = '32' AND
`JobadvertsSoftware_0`.`level` IN ('1', 4)))
WHERE `Jobadvert`.`active` = 1 AND `User`.`premium` = '1' AND
`Jobadvert`.`department_id` = (5)
GROUP BY `Jobadvert`.`id`
Saludos!
Try this (it's a bit unclear whether some fields are numeric or string, so it might need correcting):
SELECT distinct(`Jobadvert`.`id`) FROM `jobadverts` AS `Jobadvert`
LEFT JOIN `users` AS `User` ON (`Jobadvert`.`user_id` = `User`.`id`)
LEFT JOIN `jobadverts_softwares` AS `JobadvertsSoftware_0`
ON `Jobadvert`.`id` = `JobadvertsSoftware_0`.`jobadvert_id`
WHERE
`Jobadvert`.`active` = 1
AND `User`.`premium` = '1'
AND `Jobadvert`.`department_id` = (5)
AND `JobadvertsSoftware_0`.`software_id` = '32'
AND (`JobadvertsSoftware_0`.`level` IN (1, 4) OR `JobadvertsSoftware_0`.`level` is NULL)
Assuming the level parameters in your ON clause are not needed for the join, you can do a nested SELECT on your Software table to clear out the data you do not need first:
SELECT * FROM jobadverts_softwares
WHERE
(`software_id` = 32 OR `software_id` IS NULL) -- Select all software_id that are 32 or null
AND
`level` IN (1, 4)
Then you can incorporate this as a nested statement in your main SQL query so you only join on the data which is filtered in your LEFT JOIN but keep any null values that you needed:
SELECT `Jobadvert`.`id`
FROM `jobadverts` AS `Jobadvert`
LEFT JOIN `users` AS `User`
ON `Jobadvert`.`user_id` = `User`.`id`
LEFT JOIN
( -- Software Subquery
SELECT `jobadvert_id`, `level` FROM jobadverts_softwares
WHERE
(`software_id` = 32 OR `software_id` IS NULL) -- Select all software_id that are 32 or null
AND
`level` IN (1, 4)
) AS `software_subquery`
ON `Jobadvert`.`id` = `software_subquery`.`jobadvert_id`
WHERE
`Jobadvert`.`active` = 1
AND
`User`.`premium` = '1'
AND
`Jobadvert`.`department_id` = 5
ORDER BY `Jobadvert`.`id` -- Changed GROUP BY to ORDER BY as not needed
This is untested but try it out and see if this will help.
Try this:
SELECT j.id
FROM jobadverts j
LEFT JOIN users u ON (j.user_id = u.id)
LEFT JOIN jobadverts_softwares AS js ON
(j.id = js.jobadvert_id)
WHERE j.active = 1
AND u.premium = '1'
AND j.department_id = (5)
AND js.software_id = '32'
AND js.level IN ('1', 4)
You won't need a GROUP BY unless summing data in some way.
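The answers above differ mainly in where the software_id and level conditions sit. A minimal sketch of what that changes, assuming Python's sqlite3 module instead of MySQL and made-up rows with only the columns involved: a condition in the ON clause limits which right-hand rows can match but keeps every left-hand row, while the same condition in the WHERE clause also removes the left-hand rows that found no match.

```python
import sqlite3

# SQLite stands in for MySQL; the rows below are made up for illustration.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE jobadverts (id INTEGER PRIMARY KEY);
    CREATE TABLE jobadverts_softwares (
        jobadvert_id INTEGER,
        software_id INTEGER,
        level INTEGER
    );
    INSERT INTO jobadverts VALUES (1), (2), (3);
    -- Advert 1 matches, advert 2 has the wrong level, advert 3 has no software rows.
    INSERT INTO jobadverts_softwares VALUES (1, 32, 1), (2, 32, 2);
""")

# Conditions in ON: every advert is kept; non-matching ones get a NULL level.
filter_in_on = conn.execute("""
    SELECT j.id, js.level
    FROM jobadverts j
    LEFT JOIN jobadverts_softwares js
           ON j.id = js.jobadvert_id
          AND js.software_id = 32
          AND js.level IN (1, 4)
    ORDER BY j.id
""").fetchall()

# Conditions in WHERE: rows with NULL software columns fail the test and are
# dropped, so the LEFT JOIN ends up behaving like an INNER JOIN.
filter_in_where = conn.execute("""
    SELECT j.id, js.level
    FROM jobadverts j
    LEFT JOIN jobadverts_softwares js ON j.id = js.jobadvert_id
    WHERE js.software_id = 32
      AND js.level IN (1, 4)
    ORDER BY j.id
""").fetchall()

print(filter_in_on)
print(filter_in_where)
```

Note that the ON version never returns a software row with level 2; advert 2 is listed, but with a NULL level. The join condition also has to compare two columns here, not a column with a quoted string.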

MySQL - LEFT JOIN and NOT IN

I try to avoid subqueries due to the fact they usually have much lower performance than a proper join.
This is my current NOT working query:
SELECT
a.`email_list_id`, a.`category_id`, a.`name`
FROM
`email_lists`AS a
LEFT JOIN `email_autoresponders` AS b
ON ( a.`website_id` = b.`website_id` )
WHERE
a.`website_id` = [...]
AND a.`category_id` <> 0
AND a.`email_list_id` <> b.`email_list_id`
GROUP BY
a.`email_list_id`
ORDER BY a.`name`
This query works:
SELECT
`email_list_id`, `category_id`, `name`
FROM
`email_lists`
WHERE
`website_id` = [...]
AND `category_id` <> 0
AND `email_list_id` NOT IN (
SELECT
`email_list_id`
FROM
`email_autoresponders`
WHERE `website_id` = [...]
)
GROUP BY
`email_list_id`
ORDER BY
`name`
Is there any way to do this with a left join? I've tried a number of different options.
After rethinking it a bit, I believe this might work:
SELECT
a.`email_list_id`, a.`category_id`, a.`name`
FROM
`email_lists`AS a
LEFT JOIN `email_autoresponders` AS b
ON ( a.`website_id` = b.`website_id` and a.`email_list_id` = b.`email_list_id` )
WHERE
a.`website_id` = [...]
AND a.`category_id` <> 0
AND b.`email_list_id` is NULL
GROUP BY
a.`email_list_id`
ORDER BY a.`name`
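A minimal sketch comparing the NOT IN form with the LEFT JOIN ... IS NULL form above, assuming Python's sqlite3 module instead of MySQL and made-up rows. The website id 7 is a hypothetical placeholder for the value elided as [...] in the question.

```python
import sqlite3

# SQLite stands in for MySQL; rows and the website id (7) are made up.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE email_lists (
        email_list_id INTEGER PRIMARY KEY,
        website_id INTEGER,
        category_id INTEGER,
        name TEXT
    );
    CREATE TABLE email_autoresponders (website_id INTEGER, email_list_id INTEGER);
    INSERT INTO email_lists VALUES
        (1, 7, 2, 'alpha'), (2, 7, 3, 'beta'), (3, 7, 0, 'gamma'), (4, 8, 2, 'delta');
    INSERT INTO email_autoresponders VALUES (7, 1), (8, 4);
""")

website_id = 7  # hypothetical value

# Subquery form: lists that have no autoresponder on this website.
not_in_form = conn.execute("""
    SELECT email_list_id, category_id, name
    FROM email_lists
    WHERE website_id = ?
      AND category_id <> 0
      AND email_list_id NOT IN (SELECT email_list_id
                                FROM email_autoresponders
                                WHERE website_id = ?)
    ORDER BY name
""", (website_id, website_id)).fetchall()

# Join form: match on both columns, then keep only the rows with no match.
left_join_form = conn.execute("""
    SELECT a.email_list_id, a.category_id, a.name
    FROM email_lists AS a
    LEFT JOIN email_autoresponders AS b
           ON a.website_id = b.website_id
          AND a.email_list_id = b.email_list_id
    WHERE a.website_id = ?
      AND a.category_id <> 0
      AND b.email_list_id IS NULL
    ORDER BY a.name
""", (website_id,)).fetchall()

print(not_in_form, left_join_form)
```

Both forms return only the 'beta' list here. One difference to keep in mind: if the subquery can return a NULL email_list_id, NOT IN yields no rows at all, while the LEFT JOIN form is unaffected.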
For starters, add the inequality check on email_list_id to the join criteria, instead of having it in your WHERE clause:
LEFT JOIN `email_autoresponders` AS b
ON ( a.`website_id` = b.`website_id`
AND a.`email_list_id` <> b.`email_list_id` )
Though I'm not sure the scenario you mention calls for vast optimizations, this is a way to use a join rather than a subquery...