I'm working through some old code (not mine) and I need to optimise the following query because it is taking a long time to complete. My guess is that the subquery is the cause:
UPDATE topic a, cycle c
SET a.cycleId = c.id
WHERE a.id = 1
AND ((c.year * 100) + c.sequence) = (
SELECT MIN((`year` * 100) + sequence)
FROM cycle c2
WHERE c2.groupId = a.groupId)
I was thinking of selecting the cycleId (c.id) in a separate query before the UPDATE statement, but I am having trouble separating it out. So far I have the following, but I haven't accounted for the (c.year * 100) + c.sequence expression and, to be honest, I'm not sure what it is doing!
SELECT c.id
FROM cycle c
LEFT JOIN topic a ON c.groupId = a.groupId
WHERE a.id = 1;
This is my workaround for the time being. Get the result from:
SELECT MIN((`year` * 100) + sequence)
FROM cycle c
INNER JOIN topic a ON c.groupId = a.groupId
WHERE a.id = 1;
and use it in the main query:
UPDATE topic a, cycle c
SET a.cycleId = c.id
WHERE a.id = 1
AND ((c.year * 100) + c.sequence) = [result]
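For reference, (c.year * 100) + c.sequence appears to pack the year and the sequence into one sortable number (year 2011, sequence 3 becomes 201103), so the MIN picks the earliest cycle in the topic's group. Below is a minimal sketch of the two-step idea, using Python's built-in sqlite3 purely as a stand-in for MySQL. The sample rows and the helper name earliest_cycle_id are invented for illustration, and it assumes sequence always stays below 100. Note that the sketch also restricts the cycle to the topic's own group, which the original UPDATE does not do explicitly.

```python
import sqlite3


def earliest_cycle_id(conn, topic_id):
    """Return the id of the earliest cycle in the topic's group, or None."""
    row = conn.execute(
        """
        SELECT c.id
        FROM cycle c
        JOIN topic a ON a.groupId = c.groupId
        WHERE a.id = ?
        ORDER BY c.year, c.sequence   -- same order as year * 100 + sequence
        LIMIT 1
        """,
        (topic_id,),
    ).fetchone()
    return row[0] if row else None


# Invented sample data: group 7 has three cycles, group 8 has one.
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE cycle (id INTEGER PRIMARY KEY, groupId INTEGER,
                        year INTEGER, sequence INTEGER);
    CREATE TABLE topic (id INTEGER PRIMARY KEY, groupId INTEGER,
                        cycleId INTEGER);
    INSERT INTO cycle VALUES (10, 7, 2012, 1), (11, 7, 2011, 3),
                             (12, 7, 2011, 2), (13, 8, 2010, 1);
    INSERT INTO topic VALUES (1, 7, NULL);
    """
)

# Step 1: look up the cycle id; step 2: a plain single-table UPDATE.
cycle_id = earliest_cycle_id(conn, 1)
conn.execute("UPDATE topic SET cycleId = ? WHERE id = ?", (cycle_id, 1))
```

An index on cycle(groupId, year, sequence) would probably let the lookup run without scanning the whole table, though that is untested against the real schema.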
The first picture is the table; the second picture is the expected output.
The conditions are: 1. The refids should be the same. 2. For all rows with the same refid, compare (a.start, a.end and b.start, b.end) in the current and previous row. 3. Calculate the time difference, which should be greater than or equal to one day.
You want pairs of rows that match certain conditions. You can perform a join to identify the pairs.
You don't say which version of MySQL you are using, but in MySQL 8.x you can do:
with
x as (
select a.id
from my_table a
join my_table b on b.id = a.id + 1
and b.refid = a.refid
and (a.detail = 'a.end' and b.detail = 'a.start'
or a.detail = 'b.end' and b.detail = 'b.start')
)
select t.*
from my_table t
join x on t.id = x.id or t.id = x.id + 1
For MySQL 5.x you can do:
select t.*
from my_table t
join (
select a.id
from my_table a
join my_table b on b.id = a.id + 1
and b.refid = a.refid
and (a.detail = 'a.end' and b.detail = 'a.start'
or a.detail = 'b.end' and b.detail = 'b.start')
) x on t.id = x.id or t.id = x.id + 1
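To see what the self-join picks out, here is a small sketch that runs the 5.x form of the query on invented rows, again using Python's sqlite3 as a stand-in for MySQL. The table layout (id, refid, detail) is taken from the query itself; the data is made up, and the one-day time-difference condition from the question is not covered because the timestamp column is not shown.

```python
import sqlite3

PAIR_QUERY = """
SELECT t.id, t.refid, t.detail
FROM my_table t
JOIN (
    SELECT a.id
    FROM my_table a
    JOIN my_table b ON b.id = a.id + 1
                   AND b.refid = a.refid
                   AND ((a.detail = 'a.end' AND b.detail = 'a.start')
                     OR (a.detail = 'b.end' AND b.detail = 'b.start'))
) x ON t.id = x.id OR t.id = x.id + 1
ORDER BY t.id
"""

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE my_table (id INTEGER PRIMARY KEY, refid INTEGER, detail TEXT);
    INSERT INTO my_table VALUES
      (1, 100, 'a.start'),
      (2, 100, 'a.end'),    -- pairs with row 3: same refid, end then start
      (3, 100, 'a.start'),
      (4, 200, 'b.end'),    -- no pair: row 5 has a different refid
      (5, 300, 'b.start');
    """
)

# Each matching pair contributes both of its rows to the result.
pairs = conn.execute(PAIR_QUERY).fetchall()
```

The join on b.id = a.id + 1 only works if ids are consecutive with no gaps, which is an assumption the answer's query already makes.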
I have run into a problem while trying to check whether the results differ between different tests. The inner select statement returns around 5000 rows, but the join doesn't finish within one minute. I expect the output to be around 10 rows. Is there any reason the join is so slow?
select * from(
select *
from R inner join C
on R.i = C.j
where C.j in (2343,3423,4222,1124,2344)
) AS A,(
select *
from R inner join C
on R.i = C.j
where C.j in (2343,3423,4222,1124,2344)
) AS B
where A.x = B.x and
A.y = B.y and
A.result <> B.result
I think you can do what you want with aggregation:
select x, y, group_concat(distinct result) as results
from R inner join
C
on R.i = C.j
where C.j in (2343, 3423, 4222, 1124, 2344)
group by x, y
having count(distinct result) > 1;
For this query, indexes on C(j) and R(i) would be very helpful. I would add x and y to the appropriate index as well, but I don't know which table they are coming from.
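A minimal sketch of the aggregation approach with the suggested indexes, run through Python's sqlite3 as a stand-in for the real database. It assumes x, y and result all live in R (the question does not say), and the rows and index names are invented.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE R (i INTEGER, x INTEGER, y INTEGER, result TEXT);
    CREATE TABLE C (j INTEGER);
    CREATE INDEX idx_r_i ON R (i);   -- hypothetical index names
    CREATE INDEX idx_c_j ON C (j);
    INSERT INTO C VALUES (2343), (3423), (9999);
    INSERT INTO R VALUES
      (2343, 1, 1, 'pass'),
      (3423, 1, 1, 'fail'),   -- (1, 1) has two different results
      (2343, 2, 2, 'pass'),
      (3423, 2, 2, 'pass'),   -- (2, 2) agrees across tests
      (9999, 3, 3, 'pass'),
      (9999, 3, 3, 'fail');   -- excluded: 9999 is not in the IN list
    """
)

# One pass over the joined rows: keep (x, y) groups with more than one result.
conflicts = conn.execute(
    """
    SELECT R.x, R.y, COUNT(DISTINCT R.result) AS n_results
    FROM R
    JOIN C ON R.i = C.j
    WHERE C.j IN (2343, 3423, 4222, 1124, 2344)
    GROUP BY R.x, R.y
    HAVING COUNT(DISTINCT R.result) > 1
    """
).fetchall()
```

Grouping reads the joined rows once, whereas the original self-join compares the 5000-row result against itself row by row.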
In relation to the answer I accepted for this post, SQL Group By and Limit issue, I need to figure out how to create that query using SQLAlchemy. For reference, the query I need to run is:
SELECT t.id, t.creation_time, c.id, c.creation_time
FROM (SELECT id, creation_time
FROM thread
ORDER BY creation_time DESC
LIMIT 5
) t
LEFT OUTER JOIN comment c ON c.thread_id = t.id
WHERE 3 >= (SELECT COUNT(1)
FROM comment c2
WHERE c.thread_id = c2.thread_id
AND c.creation_time <= c2.creation_time
)
I have the first half of the query, but I am struggling with the syntax for the WHERE clause and how to combine it with the JOIN. Does anyone have any suggestions?
Thanks!
EDIT: First attempt seems to mess up around the .filter() call:
c = aliased(Comment)
c2 = aliased(Comment)
subq = db.session.query(Thread.id).filter_by(topic_id=122098).order_by(Thread.creation_time.desc()).limit(2).offset(2).subquery('t')
subq2 = db.session.query(func.count(1).label("count")).filter(c.id==c2.id).subquery('z')
q = db.session.query(subq.c.id, c.id).outerjoin(c, c.thread_id==subq.c.id).filter(3 >= subq2.c.count)
this generates the following SQL:
SELECT t.id AS t_id, comment_1.id AS comment_1_id
FROM (SELECT count(1) AS count
FROM comment AS comment_1, comment AS comment_2
WHERE comment_1.id = comment_2.id) AS z, (SELECT thread.id AS id
FROM thread
WHERE thread.topic_id = :topic_id ORDER BY thread.creation_time DESC
LIMIT 2 OFFSET 2) AS t LEFT OUTER JOIN comment AS comment_1 ON comment_1.thread_id = t.id
WHERE z.count <= 3
Notice that the sub-query ordering is incorrect, and subq2 is somehow selecting from comment twice. Manually fixing that gives the right results; I am just unsure how to get SQLAlchemy to generate it correctly.
Try this:
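# Name the aliases so the generated SQL matches the hand-written query.
c = db.aliased(Comment, name='c')
c2 = db.aliased(Comment, name='c2')
# Derived table t: the five most recent threads.
sq = (db.session
      .query(Thread.id, Thread.creation_time)
      .order_by(Thread.creation_time.desc())
      .limit(5)
      ).subquery(name='t')
# Correlated scalar subquery: how many comments in c's thread are at least as new as c.
sq2 = (
    db.session.query(db.func.count(1))
    .select_from(c2)
    .filter(c.thread_id == c2.thread_id)
    .filter(c.creation_time <= c2.creation_time)
    .correlate(c)
    .as_scalar()  # on SQLAlchemy 1.4+, scalar_subquery() replaces the deprecated as_scalar()
)
# Keep only comments ranked among the newest three of their thread.
q = (db.session
     .query(
         sq.c.id, sq.c.creation_time,
         c.id, c.creation_time,
     )
     .outerjoin(c, c.thread_id == sq.c.id)
     .filter(3 >= sq2)
     )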
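Before comparing the SQL that SQLAlchemy emits, it can help to confirm what the target query from the question returns on a few rows. The sketch below runs that raw SQL through Python's sqlite3 (a stand-in for the real database) on invented data; it trims the select list to the two ids and adds an ORDER BY only to make the output deterministic.

```python
import sqlite3

TARGET_SQL = """
SELECT t.id, c.id
FROM (SELECT id, creation_time
      FROM thread
      ORDER BY creation_time DESC
      LIMIT 5) t
LEFT OUTER JOIN comment c ON c.thread_id = t.id
WHERE 3 >= (SELECT COUNT(1)
            FROM comment c2
            WHERE c.thread_id = c2.thread_id
              AND c.creation_time <= c2.creation_time)
ORDER BY t.id, c.creation_time DESC
"""

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE thread (id INTEGER PRIMARY KEY, creation_time INTEGER);
    CREATE TABLE comment (id INTEGER PRIMARY KEY, thread_id INTEGER,
                          creation_time INTEGER);
    INSERT INTO thread VALUES (1, 100), (2, 200);
    -- Thread 1 has five comments; thread 2 has none.
    INSERT INTO comment VALUES
      (11, 1, 101), (12, 1, 102), (13, 1, 103), (14, 1, 104), (15, 1, 105);
    """
)

# Expect the three newest comments of thread 1, plus thread 2 with no comment.
rows = conn.execute(TARGET_SQL).fetchall()
```

A thread without comments survives the filter because the correlated count is 0 for the NULL row produced by the outer join.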
I have the below query to find the row prior to the MAX row. I feel like I am missing something; can somebody please help with it? I am looking to get b.usercode_1 from the row immediately before the row that supplies a.usercode_1: not the min or any other random row, but the row prior to the MAX.
Please suggest.
Select distinct
c.ssn
, c.controlled_group_Status CG_status
, c.last_name || ' , '|| c.first_name FULL_NAME
, a.usercode_1 Current_REG
, a.eff_date effective_since1
, b.usercode_1 PRIOR_REG
, b.eff_date effective_since2
, d.term_eff_date
from employee_eff_date c
, emp_cg_data a
, emp_cg_data b
, emp_ben_elects d
where c.control_id = 'XYZ'
and c.controlled_group_Status <> 'D'
and c.eff_date = (select max( c1.eff_date)
from emp_cg_data c1
where c.control_id = c1.control_id
and c.ssn = c1.ssn)
and a.control_id = c.control_id
and a.ssn = c.ssn
and a.eff_date = (select max(a1.eff_date )
from emp_cg_data a1
where a.control_id = a1.control_id
and a.ssn = a1.ssn)
and a.usercode_1 = 'REG26'
and b.control_id = c.control_id
and b.ssn = c.ssn
and b.eff_date = (select max( b1.eff_date)
from emp_cg_data b1
where b.control_id = b1.control_id
and b.ssn = b1.ssn
and b1.eff_date < a.eff_date)
and b.usercode_1 like 'REG%'
and d.control_id = c.control_id
and d.ssn = c.ssn
and d.life_event_date = (select max( d1.life_event_date)
from emp_ben_elects d1
where d.control_id = d1.control_id
and d.ssn = d1.ssn)
and d.le_seq_no= (select max( d1.le_seq_no)
from emp_ben_elects d1
where d.control_id = d1.control_id
and d.ssn = d1.ssn
and d.life_event_date = d1.life_event_date)
and d.term_eff_date is null
;
NOTE: this is not a complete answer; it's a suggestion for where you should start.
You are doing a Cartesian product of the four tables, filtered by a WHERE clause, so something like this:
Implicit join: generally not good practice, as it can be very difficult to keep the WHERE filters apart from the join conditions.
SELECT *
FROM tableA a, TableB b
WHERE b.id = a.id
Another way to write a JOIN (the more generally accepted way):
SELECT *
FROM tableA a
JOIN tableB b ON b.id = a.id
Use the ON clause to join two tables together.
You should change your joins to this format so that others can read your query and understand it better.
Suggestion to solve your problem:
A fairly simple way to get the second-to-last row is to use a row counter (a MySQL user variable), so something like:
SELECT *, @row_count := @row_count + 1 AS row_num
FROM tableA a
JOIN tableB b ON b.id = a.id -- AND any other conditions for the join
CROSS JOIN (SELECT @row_count := 0) t
From here you can find the MAX row, whether that's by ID or something else, and then take the row whose row_num is one less, i.e. the previous row.
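The row-counter idea can also be expressed with the ROW_NUMBER() window function instead of a user variable; this is a swapped-in technique, not the answer's exact method. The sketch below uses Python's sqlite3 as a stand-in (it needs SQLite 3.25 or later for window functions), borrows the emp_cg_data column names from the question, leaves out control_id for brevity, and uses invented rows.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE emp_cg_data (ssn TEXT, eff_date TEXT, usercode_1 TEXT);
    INSERT INTO emp_cg_data VALUES
      ('111', '2019-01-01', 'REG10'),
      ('111', '2020-01-01', 'REG20'),
      ('111', '2021-01-01', 'REG26'),
      ('222', '2021-06-01', 'REG26');   -- only one row, so no prior row
    """
)

# Number each person's rows from newest (rn = 1) to oldest, then pair
# the newest row with the row numbered 2, i.e. the one just before the MAX.
rows = conn.execute(
    """
    WITH numbered AS (
        SELECT ssn, eff_date, usercode_1,
               ROW_NUMBER() OVER (PARTITION BY ssn
                                  ORDER BY eff_date DESC) AS rn
        FROM emp_cg_data
    )
    SELECT cur.ssn,
           cur.usercode_1  AS current_reg,
           prev.usercode_1 AS prior_reg
    FROM numbered cur
    LEFT JOIN numbered prev ON prev.ssn = cur.ssn AND prev.rn = 2
    WHERE cur.rn = 1
    ORDER BY cur.ssn
    """
).fetchall()
```

The LEFT JOIN keeps people who have only one row, with prior_reg left as NULL; switch to an inner join if those should be dropped.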
I have a funny MySQL query that needs to pull a subquery from another table, and I'm wondering whether it is even possible to get MySQL to evaluate the subquery.
example:
(I had to replace some brackets with 'gte' & 'lte' cause they were screwing up the post format)
select a.id,a.alloyname,a.label,a.symbol, g.grade,
if(a.id = 1,(
(((select avg(cost/2204.6) as averageCost from nas_cost where cost != '0' and `date` gte '2011-03-01' and `date` lte '2011-03-31') - t.value) * (astm.astm/100) * 1.2)
),(a.formulae)) as thisValue
from nas_alloys a
left join nas_triggers t on t.alloyid = a.id
left join nas_astm astm on astm.alloyid = a.id
left join nas_estimatedprice ep on ep.alloyid = a.id
left join nas_grades g on g.id = astm.gradeid
where a.id = '1' or a.id = '2'
order by g.grade;
So when a.id is not 1, the IF falls through to (a.formulae), which is the value stored in the nas_alloys table:
((ep.estPrice - t.value) * (astm.astm/100) * 0.012)
Basically I want this query to run as:
select a.id,a.alloyname,a.label,a.symbol, g.grade,
if(a.id = 1,(
(((select avg(cost/2204.6) as averageCost from nas_cost where cost != '0' and `date` gte '2011-03-01' and `date` lte '2011-03-31') - t.value) * (astm.astm/100) * 1.2)
),((ep.estPrice - t.value) * (astm.astm/100) * 0.012)) as thisValue
from nas_alloys a
left join nas_triggers t on t.alloyid = a.id
left join nas_astm astm on astm.alloyid = a.id
left join nas_estimatedprice ep on ep.alloyid = a.id
left join nas_grades g on g.id = astm.gradeid
where a.id = '1' or a.id = '2'
order by g.grade;
When a.id != '1', by the way, there are about 30 different possibilities for a.formulae, and they change frequently, so hard-coding multiple IF statements is not really an option. [Redesigning the business logic is more likely than that!]
Anyway, any thoughts? Will this even work?
-thanks
-sean
Create a stored function to compute that value for you, and pass in whatever parameters you decide on later. When your business logic changes, you only have to update the stored function.
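As a rough illustration of keeping the formula in one function, the sketch below registers a Python function with sqlite3 and calls it from the query. This is only an analogue of a MySQL stored function, not MySQL syntax. The function name alloy_value, its parameter list, and all sample values are hypothetical; the two formulas are the ones shown in the question, and the average cost is passed in as a plain parameter rather than computed from nas_cost.

```python
import sqlite3


def alloy_value(alloy_id, avg_cost, est_price, trigger_value, astm):
    """Hypothetical pricing rule: the single place to edit when logic changes."""
    if alloy_id == 1:
        return (avg_cost - trigger_value) * (astm / 100.0) * 1.2
    return (est_price - trigger_value) * (astm / 100.0) * 0.012


conn = sqlite3.connect(":memory:")
# Expose the Python function to SQL under the same (hypothetical) name.
conn.create_function("alloy_value", 5, alloy_value)
conn.executescript(
    """
    CREATE TABLE nas_alloys (id INTEGER PRIMARY KEY, alloyname TEXT);
    CREATE TABLE nas_triggers (alloyid INTEGER, value REAL);
    CREATE TABLE nas_astm (alloyid INTEGER, astm REAL);
    CREATE TABLE nas_estimatedprice (alloyid INTEGER, estPrice REAL);
    INSERT INTO nas_alloys VALUES (1, 'alloy one'), (2, 'alloy two');
    INSERT INTO nas_triggers VALUES (1, 2.0), (2, 100.0);
    INSERT INTO nas_astm VALUES (1, 50.0), (2, 10.0);
    INSERT INTO nas_estimatedprice VALUES (1, 0.0), (2, 600.0);
    """
)

# The query no longer embeds any formula; it only calls the function.
values = conn.execute(
    """
    SELECT a.id,
           alloy_value(a.id, ?, ep.estPrice, t.value, astm.astm) AS thisValue
    FROM nas_alloys a
    LEFT JOIN nas_triggers t ON t.alloyid = a.id
    LEFT JOIN nas_astm astm ON astm.alloyid = a.id
    LEFT JOIN nas_estimatedprice ep ON ep.alloyid = a.id
    ORDER BY a.id
    """,
    (12.0,),  # invented average cost
).fetchall()
```

With around 30 formulas that change often, the branching would live inside the function body, so the calling query stays untouched when a formula is edited.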