I have a funny MySQL query that needs to pull a subquery from another table, I'm wondering if this is even possible to get mysql to evaluate the subquery.
example:
(I had to replace some brackets with 'gte' & 'lte' cause they were screwing up the post format)
select a.id,a.alloyname,a.label,a.symbol, g.grade,
if(a.id = 1,(
(((select avg(cost/2204.6) as averageCost from nas_cost where cost != '0' and `date` lte '2011-03-01' and `date` gte '2011-03-31') - t.value) * (astm.astm/100) * 1.2)
),(a.formulae)) as thisValue
from nas_alloys a
left join nas_triggers t on t.alloyid = a.id
left join nas_astm astm on astm.alloyid = a.id
left join nas_estimatedprice ep on ep.alloyid = a.id
left join nas_grades g on g.id = astm.gradeid
where a.id = '1' or a.id = '2'
order by g.grade;
So when the IF statement is not = '1' then the (a.formulae) is the value in the nas_alloys table which is:
((ep.estPrice - t.value) * (astm.astm/100) * 0.012)
Basically I want this query to run as:
select a.id,a.alloyname,a.label,a.symbol, g.grade,
if(a.id = 1,(
(((select avg(cost/2204.6) as averageCost from nas_cost where cost != '0' and `date` gte '2011-03-01' and `date` lte '2011-03-31') - t.value) * (astm.astm/100) * 1.2)
),((ep.estPrice - t.value) * (astm.astm/100) * 0.012)) as thisValue
from nas_alloys a
left join nas_triggers t on t.alloyid = a.id
left join nas_astm astm on astm.alloyid = a.id
left join nas_estimatedprice ep on ep.alloyid = a.id
left join nas_grades g on g.id = astm.gradeid
where a.id = '1' or a.id = '2'
order by g.grade;
When a.id != '1', btw, there are about 30 different possibilities for a.formulae, and they change frequently, so hard banging in multiple if statements is not really an option. [redesigning the business logic is more likely than that!]
Anyway, any thoughts? Will this even work?
-thanks
-sean
Create a Stored Function to compute that value for you, and pass the params you will decide later on. When your business logic changes, you just have to update the Stored Function.
Related
I have written a query but it's taking a lot of time. I want to know if there exists any solution to optimize it without making a temp table in MYSQL. Is there a way to optimize the subquery part since AccessLog2019 is huge so it's taking forever)
Here is my query
SELECT distinct l.ListingID,l.City,l.ListingStatus,l.Price,l.Bedrooms,l.FullBathrooms, gc.Latitude,gc.Longitude , count(distinct s.AccessLogID) AS access_count, s.LBID , lb.CurrentListingID
from lockbox.Listings l
JOIN lockbox.GeoCoordinates gc ON l.ListingID = gc.ID
LEFT JOIN lockbox.LockBox lb ON l.ListingID = lb.CurrentListingID
LEFT JOIN
(SELECT * FROM lockbox.AccessLog2019 ac where ac.AccessType not in('1DayCodeGen','BluCodeGen','SmartMACGen') AND DATEDIFF(NOW(), ac.UTCAccessedDT ) < 1 ) s
ON lb.LBID = s.LBID
WHERE l.AssocID = 'AS00000000CC' AND (gc.Confidence <> '5 - Unmatchable' OR gc.Confidence IS NULL OR gc.Confidence = ' ')
group BY l.ListingID
Thanks
If you can avoid the outer group by, that is a big win. I am thinking:
SELECT l.ListingID, l.City, l.ListingStatus, l.Price, l.Bedrooms, l.FullBathrooms,
gc.Latitude, gc.Longitude,
(select count(*)
from lockbox.LockBox lb join
lockbox.AccessLog2019 ac
on lb.LBID = ac.LBID
where l.ListingID = lb.CurrentListingID and
ac.AccessType not in ('1DayCodeGen', 'BluCodeGen', 'SmartMACGen') and
DATEDIFF(NOW(), ac.UTCAccessedDT) < 1
) as cnt
from lockbox.Listings l JOIN
lockbox.GeoCoordinates gc
ON l.ListingID = gc.ID
WHERE l.AssocID = 'AS00000000CC' AND
(gc.Confidence <> '5 - Unmatchable' OR
gc.Confidence IS NULL OR
gc.Confidence = ' '
)
Note: This does not select s.LBID or lb.CurrentListingID because these don't make sense in your query. If I understand correctly, these could have different values on different rows.
You could try breaking out the subquery to the JOIN clause.
It might give a hint to the optimizer that it can use the LBID field first, and then test the AccessType later (in case the optimizer doesn't figure that out when you have the sub-select).
SELECT distinct l.ListingID,l.City,l.ListingStatus,l.Price,l.Bedrooms,l.FullBathrooms, gc.Latitude,gc.Longitude , count(distinct s.AccessLogID) AS access_count, s.LBID , lb.CurrentListingID
from lockbox.Listings l
JOIN lockbox.GeoCoordinates gc ON l.ListingID = gc.ID
LEFT JOIN lockbox.LockBox lb ON l.ListingID = lb.CurrentListingID
LEFT JOIN AccessLog2019 s
ON lb.LBID = s.LBID
AND s.AccessType not in('1DayCodeGen','BluCodeGen','SmartMACGen')
AND DATEDIFF(NOW(), s.UTCAccessedDT ) < 1
WHERE l.AssocID = 'AS00000000CC' AND (gc.Confidence <> '5 - Unmatchable' OR gc.Confidence IS NULL OR gc.Confidence = ' ')
group BY l.ListingID
Note that this is one of those cases where conditions in the JOIN clause gives different behavior than using a WHERE clause. If you just had lb.LBID = s.LBID and then had the conditions I wrote in the WHERE of the outer query the results would be different. They would exclude the records matching lb.LBID = s.LBID. But in the JOIN clause, it is part of the conditions of the outer join.
SELECT * --> Select only the columns needed.
SELECT DISTINCT ... GROUP BY -- Do one or the other, not both.
Need composite INDEX(AssocID, ListingID) (in that order)
DATEDIFF(NOW(), ac.UTCAccessedDT ) < 1 --> ac.UTCAccessedDT > NOW() - INTERVAL 1 DAY (or whatever your intent was. Then add INDEX(UTCAccessedDT)
OR is hard to optimize; consider cleansing the data so that Confidence does not have 3 values that mean the same thing.
I'm not sure how to make the following SQL query more efficient. Right now, the query is taking 8 - 12 seconds on a pretty fast server, but that's not close to fast enough for a Website when users are trying to load a page with this code on it. It's looking through tables with many rows, for instance the "Post" table has 717,873 rows. Basically, the query lists all Posts related to what the user is following (newest to oldest).
Is there a way to make it faster by only getting the last 20 results total based on PostTimeOrder?
Any help would be much appreciated or insight on anything that can be done to improve this situation. Thank you.
Here's the full SQL query (lots of nesting):
SELECT DISTINCT p.Id, UNIX_TIMESTAMP(p.PostCreationTime) AS PostCreationTime, p.Content AS Content, p.Bu AS Bu, p.Se AS Se, UNIX_TIMESTAMP(p.PostCreationTime) AS PostTimeOrder
FROM Post p
WHERE (p.Id IN (SELECT pc.PostId
FROM PostCreator pc
WHERE (pc.UserId IN (SELECT uf.FollowedId
FROM UserFollowing uf
WHERE uf.FollowingId = '100')
OR pc.UserId = '100')
))
OR (p.Id IN (SELECT pum.PostId
FROM PostUserMentions pum
WHERE (pum.UserId IN (SELECT uf.FollowedId
FROM UserFollowing uf
WHERE uf.FollowingId = '100')
OR pum.UserId = '100')
))
OR (p.Id IN (SELECT ssp.PostId
FROM SStreamPost ssp
WHERE (ssp.SStreamId IN (SELECT ssf.SStreamId
FROM SStreamFollowing ssf
WHERE ssf.UserId = '100'))
))
OR (p.Id IN (SELECT psm.PostId
FROM PostSMentions psm
WHERE (psm.StockId IN (SELECT sf.StockId
FROM StockFollowing sf
WHERE sf.UserId = '100' ))
))
UNION ALL
SELECT DISTINCT p.Id AS Id, UNIX_TIMESTAMP(p.PostCreationTime) AS PostCreationTime, p.Content AS Content, p.Bu AS Bu, p.Se AS Se, UNIX_TIMESTAMP(upe.PostEchoTime) AS PostTimeOrder
FROM Post p
INNER JOIN UserPostE upe
on p.Id = upe.PostId
INNER JOIN UserFollowing uf
on (upe.UserId = uf.FollowedId AND (uf.FollowingId = '100' OR upe.UserId = '100'))
ORDER BY PostTimeOrder DESC;
Changing your p.ID in (...) predicates to existence predicates with correlated subqueries may help. Also since both halves of your union all query are pulling from the Post table and possibly returning nearly identical records you might be able to combine the two into one query by left outer joining to UserPostE and adding upe.PostID is not null as an OR condition in the WHERE clause. UserFollowing will still inner join to UPE. If you want the same Post record twice once with upe.PostEchoTime and once with p.PostCreationTime as the PostTimeOrder you'll need keep the UNION ALL
SELECT
DISTINCT -- <<=- May not be needed
p.Id
, UNIX_TIMESTAMP(p.PostCreationTime) AS PostCreationTime
, p.Content AS Content
, p.Bu AS Bu
, p.Se AS Se
, UNIX_TIMESTAMP(coalesce( upe.PostEchoTime
, p.PostCreationTime)) AS PostTimeOrder
FROM Post p
LEFT JOIN UserPostE upe
INNER JOIN UserFollowing uf
on (upe.UserId = uf.FollowedId AND
(uf.FollowingId = '100' OR
upe.UserId = '100'))
on p.Id = upe.PostId
WHERE upe.PostID is not null
or exists (SELECT 1
FROM PostCreator pc
WHERE pc.PostId = p.ID
and pc.UserId = '100'
or exists (SELECT 1
FROM UserFollowing uf
WHERE uf.FollowedId = pc.UserID
and uf.FollowingId = '100')
)
OR exists (SELECT 1
FROM PostUserMentions pum
WHERE pum.PostId = p.ID
and pum.UserId = '100'
or exists (SELECT 1
FROM UserFollowing uf
WHERE uf.FollowedId = pum.UserId
and uf.FollowingId = '100')
)
OR exists (SELECT 1
FROM SStreamPost ssp
WHERE ssp.PostId = p.ID
and exists (SELECT 1
FROM SStreamFollowing ssf
WHERE ssf.SStreamId = ssp.SStreamId
and ssf.UserId = '100')
)
OR exists (SELECT 1
FROM PostSMentions psm
WHERE psm.PostId = p.ID
and exists (SELECT
FROM StockFollowing sf
WHERE sf.StockId = psm.StockId
and sf.UserId = '100' )
)
ORDER BY PostTimeOrder DESC
The from section could alternatively be rewritten to also use an existence clause with a correlated sub query:
FROM Post p
LEFT JOIN UserPostE upe
on p.Id = upe.PostId
and ( upe.UserId = '100'
or exists (select 1
from UserFollowing uf
where uf.FollwedID = upe.UserID
and uf.FollowingId = '100'))
Turn IN ( SELECT ... ) into a JOIN .. ON ... (see below)
Turn OR into UNION (see below)
Some the tables are many:many mappings? Such as SStreamFollowing? Follow the tips in http://mysql.rjweb.org/doc.php/index_cookbook_mysql#many_to_many_mapping_table
Example of IN:
SELECT ssp.PostId
FROM SStreamPost ssp
WHERE (ssp.SStreamId IN (
SELECT ssf.SStreamId
FROM SStreamFollowing ssf
WHERE ssf.UserId = '100' ))
-->
SELECT ssp.PostId
FROM SStreamPost ssp
JOIN SStreamFollowing ssf ON ssp.SStreamId = ssf.SStreamId
WHERE ssf.UserId = '100'
The big WHERE with all the INs becomes something like
JOIN ( ( SELECT pc.PostId AS id ... )
UNION ( SELECT pum.PostId ... )
UNION ( SELECT ssp.PostId ... )
UNION ( SELECT psm.PostId ... ) )
Get what you can done of that those suggestions, then come back for more advice if you still need it. And bring SHOW CREATE TABLE with you.
In relation to the answer I accepted for this post, SQL Group By and Limit issue, I need to figure out how to create that query using SQLAlchemy. For reference, the query I need to run is:
SELECT t.id, t.creation_time, c.id, c.creation_time
FROM (SELECT id, creation_time
FROM thread
ORDER BY creation_time DESC
LIMIT 5
) t
LEFT OUTER JOIN comment c ON c.thread_id = t.id
WHERE 3 >= (SELECT COUNT(1)
FROM comment c2
WHERE c.thread_id = c2.thread_id
AND c.creation_time <= c2.creation_time
)
I have the first half of the query, but I am struggling with the syntax for the WHERE clause and how to combine it with the JOIN. Any one have any suggestions?
Thanks!
EDIT: First attempt seems to mess up around the .filter() call:
c = aliased(Comment)
c2 = aliased(Comment)
subq = db.session.query(Thread.id).filter_by(topic_id=122098).order_by(Thread.creation_time.desc()).limit(2).offset(2).subquery('t')
subq2 = db.session.query(func.count(1).label("count")).filter(c.id==c2.id).subquery('z')
q = db.session.query(subq.c.id, c.id).outerjoin(c, c.thread_id==subq.c.id).filter(3 >= subq2.c.count)
this generates the following SQL:
SELECT t.id AS t_id, comment_1.id AS comment_1_id
FROM (SELECT count(1) AS count
FROM comment AS comment_1, comment AS comment_2
WHERE comment_1.id = comment_2.id) AS z, (SELECT thread.id AS id
FROM thread
WHERE thread.topic_id = :topic_id ORDER BY thread.creation_time DESC
LIMIT 2 OFFSET 2) AS t LEFT OUTER JOIN comment AS comment_1 ON comment_1.thread_id = t.id
WHERE z.count <= 3
Notice the sub-query ordering is incorrect, and subq2 somehow is selecting from comment twice. Manually fixing that gives the right results, I am just unsure of how to get SQLAlchemy to get it right.
Try this:
c = db.aliased(Comment, name='c')
c2 = db.aliased(Comment, name='c2')
sq = (db.session
.query(Thread.id, Thread.creation_time)
.order_by(Thread.creation_time.desc())
.limit(5)
).subquery(name='t')
sq2 = (
db.session.query(db.func.count(1))
.select_from(c2)
.filter(c.thread_id == c2.thread_id)
.filter(c.creation_time <= c2.creation_time)
.correlate(c)
.as_scalar()
)
q = (db.session
.query(
sq.c.id, sq.c.creation_time,
c.id, c.creation_time,
)
.outerjoin(c, c.thread_id == sq.c.id)
.filter(3 >= sq2)
)
I have this QUERY:
select
a.*
from
mt_proyecto a,
mt_mockup b,
mt_diseno c,
mt_modulo d
where
a.estado = 'A' and
(
(b.encargado = '1' and b.idproyecto = a.idmtproyecto) or
(c.encargado = '1' and c.idproyecto = a.idmtproyecto) or
(d.encargado = '1' and d.idproyecto = a.idmtproyecto)
)
group by
a.idmtproyecto
order by a.finalizado asc, a.feccrea desc
Result:
Then, I run the same code on server with the same database:
Is there any problem with the query?
It seems that the query is running correctly on your server, and returning no rows. Please make sure you have the same table contents on your local machine and your server.
Other things:
Pro tip: never use SELECT * or SELECT table.* in software. Always enumerate the columns you want in your result set.
Unless you use GROUP BY with aggregate functions like SUM() or `COUNT(), and naming the correct columns from the result set, it returns unpredictable results. Read this. http://dev.mysql.com/doc/refman/5.1/en/group-by-extensions.html
I solved with this QUERY:
select
a.*
from
(
mt_proyecto a
left join mt_mockup b on
b.idproyecto = a.idmtproyecto
left join mt_diseno c on
c.idproyecto = a.idmtproyecto
left join mt_modulo d on
d.idproyecto = a.idmtproyecto
left join mt_integracion e on
e.idproyecto = a.idmtproyecto
left join mt_pruebas_internas f on
f.idproyecto = a.idmtproyecto
)
where
a.estado = 'A' and
(
(a.idmtproyecto = b.idproyecto and
b.encargado = '1' ) or
(a.idmtproyecto = c.idproyecto and
c.encargado = '1' ) or
(a.idmtproyecto = d.idproyecto and
d.encargado = '1' ) or
(a.idmtproyecto = e.idproyecto and
e.encargado = '1' ) or
(a.idmtproyecto = f.idproyecto =
f.encargado = '1' )
)
group by a.idmtproyecto
order by
a.finalizado asc, a.feccrea desc
Thanks everyone for answer me. I'll do your suggestions .
I a developing in zend and have a rather large mysql query. The query works fine and i get the list I expect. I am doing this using Select->Where.... below is the query.
SELECT DISTINCT `d`.* FROM `deliverable` AS `d` INNER JOIN `groups` AS `g1` ON d.id = g1.deliverable_id INNER JOIN `groupmembers` AS `gm1` ON g1.id = gm1.group_id LEFT JOIN `connection` AS `c` ON d.id = c.downstreamnode_id LEFT JOIN `deliverable` AS `d1` ON c.upstreamnode_id = d1.id INNER JOIN `deliverable` AS `d2` ON CASE WHEN d1.id IS NULL THEN d.id ELSE d1.id END = d2.id INNER JOIN `groups` AS `g` ON d2.id = g.deliverable_id INNER JOIN `groupmembers` AS `gm` ON g.id = gm.group_id WHERE (g1.group_type = 100) AND (gm1.member_id = 1) AND (c.downstreamnode_id IS NULL OR d.restrict_access = 1) AND (g.group_type = 100 OR g.group_type = 110) AND (gm.member_id = 1) AND (d.deliverable_type = 110 OR d.deliverable_type = 100) GROUP BY CASE WHEN c.downstreamnode_id IS NULL THEN d.id ELSE c.downstreamnode_id END
Only problem is when I try to count the rows in a mysql query I only get 1 returned. below is the query
SELECT DISTINCT count(*) AS `rowCount` FROM `deliverable` AS `d` INNER JOIN `groups` AS `g1` ON d.id = g1.deliverable_id INNER JOIN `groupmembers` AS `gm1` ON g1.id = gm1.group_id LEFT JOIN `connection` AS `c` ON d.id = c.downstreamnode_id LEFT JOIN `deliverable` AS `d1` ON c.upstreamnode_id = d1.id INNER JOIN `deliverable` AS `d2` ON CASE WHEN d1.id IS NULL THEN d.id ELSE d1.id END = d2.id INNER JOIN `groups` AS `g` ON d2.id = g.deliverable_id INNER JOIN `groupmembers` AS `gm` ON g.id = gm.group_id WHERE (g1.group_type = 100) AND (gm1.member_id = 1) AND (c.downstreamnode_id IS NULL OR d.restrict_access = 1) AND (g.group_type = 100 OR g.group_type = 110) AND (gm.member_id = 1) AND (d.deliverable_type = 110 OR d.deliverable_type = 100) GROUP BY CASE WHEN c.downstreamnode_id IS NULL THEN d.id ELSE c.downstreamnode_id END
i generate this from by using the same 'select' that generated the first query but I reset the columns and add count in.
$this->getAdapter()->setFetchMode(Zend_Db::FETCH_ASSOC);
$select
->reset( Zend_Db_Select::COLUMNS)
->columns(array('count('.$column.') as rowCount'));
$rowCount = $this->getAdapter()->fetchOne($select);
This method works fine for all my other queries only this one i am having trouble with. I suspect it has something to do the 'CASE' I have in there but it is strange because I am getting the correct rows the the first query. Any ideas. Thanks.
FYI below are two queries that I have working successfully.
SELECT DISTINCT `po`.* FROM `post` AS `po` INNER JOIN `postinfo` AS `p` ON po.postinfo_id = p.id WHERE (p.creator_id = 1) ORDER BY `p`.`date_created` DESC
SELECT DISTINCT count(*) AS `rowCount` FROM `post` AS `po` INNER JOIN `postinfo` AS `p` ON po.postinfo_id = p.id WHERE (p.creator_id = 1) ORDER BY `p`.`date_created` DESC
In this one I have 4 rows returned in the first query and 'int 4' returned for the second one. Does anyone know why it doesnt work for the big query?
Move your DISTINCT.
SELECT COUNT(DISTINCT `po`.*) AS `rowCount` ...
Ok figured it out It was the GROUP BY that was causing only 1 result to be returned. Thanks Interrobang for you help I am sure that using DISTINCT incorrectly will have caused me a headache in the future.
Try using SQL_CALC_FOUND_ROWS in your query?
http://dev.mysql.com/doc/refman/5.0/en/information-functions.html#function_found-rows
Using SQL_CALC_FOUND_ROWS is mysql-specific, but it's pretty nice for getting a full record count even when your initial query contains a limit. Once you get the count, don't include SQL_CALC_FOUND_ROWS in subsequent queries for extra records since that will cause extra load on your query.
Your initial query would be:
SELECT SQL_CALC_FOUND_ROWS DISTINCT `d`.* FROM `deliverable` AS `d` INNER JOIN `groups` ...
You'll have to do a subsequent call after your initial query executes to get the count by doing a SELECT FOUND_ROWS().
If you do a little searching, you'll find someone who extended Zend_Db_Select to include this ability.