Need help with MySQL query - mysql

I have the following query (and it works fine):
SELECT cd.id AS card_id,
ct.id AS category_id,
COUNT(cc.user_id) AS cnt
FROM uiCards AS cd
JOIN uiCardCategories AS ct USING (project_id)
LEFT JOIN uiCategories2Cards AS cc ON (cc.card_id = cd.id AND cc.stack_id = ct.id)
WHERE cd.project_id = $projID
GROUP BY cd.id, ct.id
ORDER BY cd.id, ct.id
I also have a string of numbers:
$exclude = '100,122,345';
I need to modify the query to exclude results whose user_id appears in that string. So I added:
AND cc.user_id NOT IN ($exclude)
below the WHERE clause:
WHERE cd.project_id = $projID
AND cc.user_id NOT IN ($exclude)
It did not seem to work, so I tried to modify more, and the whole query collapsed on me.
UPDATE:
I got it! I added quotes:
AND (FIND_IN_SET(cc.user_id, '$exclude') = 0 OR FIND_IN_SET(cc.user_id, '$exclude') IS NULL)

The SQL IN clause doesn't allow a single variable to represent a list of values. The query, as-is, can only be run as dynamic SQL -- on any database. Even run dynamically, SQL will only interpret this example as a single string.
Secondly, because the query uses an OUTER JOIN (LEFT in this example), where you place the criteria can drastically affect the results returned. Criteria in the JOIN's ON clause are applied before the JOIN is made; criteria in the WHERE clause are applied after the JOIN, which can change which rows come back, for example by discarding the unmatched rows the LEFT JOIN would otherwise keep.
You could use the FIND_IN_SET function instead:
WHERE cd.project_id = $projID
AND (FIND_IN_SET(cc.user_id, '$exclude') = 0 OR
FIND_IN_SET(cc.user_id, '$exclude') IS NULL)
..vs in the LEFT JOIN criteria:
LEFT JOIN uiCategories2Cards AS cc ON cc.card_id = cd.id
AND cc.stack_id = ct.id
AND FIND_IN_SET(cc.user_id, '$exclude') = 0
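To make the ON-versus-WHERE difference concrete, here is a minimal sketch. It assumes Python's built-in sqlite3 module as a stand-in for MySQL, and the cards and votes tables are hypothetical, simplified versions of the tables in the question. SQLite has no FIND_IN_SET, so a literal NOT IN list is used; the placement effect should be the same either way.

```python
import sqlite3

# Hypothetical, simplified schema: cards and the users who voted on them.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE cards (id INTEGER PRIMARY KEY);
    CREATE TABLE votes (card_id INTEGER, user_id INTEGER);
    INSERT INTO cards (id) VALUES (1), (2), (3);
    -- card 1: one excluded voter (100) and one regular voter (7)
    -- card 2: only an excluded voter; card 3: no votes at all
    INSERT INTO votes (card_id, user_id) VALUES (1, 100), (1, 7), (2, 100);
""")

# Exclusion in WHERE: applied after the join, so rows whose user_id is
# NULL (unmatched cards) fail the test and disappear.
where_rows = conn.execute("""
    SELECT c.id, COUNT(v.user_id)
    FROM cards AS c
    LEFT JOIN votes AS v ON v.card_id = c.id
    WHERE v.user_id NOT IN (100)
    GROUP BY c.id
    ORDER BY c.id
""").fetchall()

# Exclusion in ON: applied while matching, so every card survives and
# cards with no remaining votes get a count of 0.
on_rows = conn.execute("""
    SELECT c.id, COUNT(v.user_id)
    FROM cards AS c
    LEFT JOIN votes AS v ON v.card_id = c.id AND v.user_id NOT IN (100)
    GROUP BY c.id
    ORDER BY c.id
""").fetchall()

print(where_rows)
print(on_rows)
```

In this sketch the WHERE version keeps only card 1, while the ON version keeps all three cards with zero counts for cards 2 and 3, which is usually what a per-card count is after.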

IN() requires a row set, but you are providing a string, so this won't work.
Use the function FIND_IN_SET() instead.
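If the list of IDs is available in application code, another option is to expand it into one placeholder per value instead of passing a single string. The sketch below is only an illustration under assumptions: it uses Python's sqlite3 module rather than PHP and MySQL, and the votes table and the excluded_filter helper are hypothetical names.

```python
import sqlite3

def excluded_filter(exclude_csv):
    """Turn '100,122,345' into ('?,?,?', [100, 122, 345])."""
    # int() rejects anything that is not a number, so no SQL can sneak in.
    ids = [int(part) for part in exclude_csv.split(",") if part.strip()]
    placeholders = ",".join("?" for _ in ids)
    return placeholders, ids

# Hypothetical table standing in for uiCategories2Cards.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE votes (card_id INTEGER, user_id INTEGER);
    INSERT INTO votes VALUES (1, 100), (1, 7), (2, 122), (2, 8), (3, 345);
""")

placeholders, ids = excluded_filter("100,122,345")
# Only '?' and ',' are interpolated into the SQL text; the values are bound.
# Note: an empty list would produce "NOT IN ()", which MySQL rejects, so a
# real implementation would need to skip the filter in that case.
sql = (
    "SELECT card_id, user_id FROM votes "
    f"WHERE user_id NOT IN ({placeholders}) ORDER BY card_id"
)
kept = conn.execute(sql, ids).fetchall()
print(kept)
```

Unlike FIND_IN_SET on a column, a plain IN list leaves the optimizer free to use an index on user_id where one exists.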

Related

SQL IF ELSE WITH MULTIPLE SELECT STATEMENT

I want to optimize these SQL queries using if-else, but how should I use it?
If the result of this query contains 'ALL':
SELECT
bdsubcategory.subcategoryID as ID,
bdsubcategory.subcategoryName as Name
FROM
phonebook.newsms_subscription
INNER JOIN bdsubcategory ON bdsubcategory.subcategoryID = newsms_subscription.subcategoryID
INNER JOIN newsms_client ON newsms_subscription.clientID =newsms_client.clientID
INNER JOIN newsms_person ON newsms_subscription.personID = newsms_person.personID
WHERE
newsms_subscription.isActive = 1 AND
newsms_person.personID = '856'
then I want to run this query:
SELECT
bdsubcategory.subcategoryID as ID,
bdsubcategory.subcategoryName as Name
FROM
phonebook.newsms_subscription
INNER JOIN bdsubcategory ON bdsubcategory.subcategoryID = newsms_subscription.subcategoryID
INNER JOIN newsms_person ON newsms_subscription.personID = newsms_person.personID
WHERE
newsms_subscription.isActive = 1
GROUP BY subcategoryName
ORDER BY subcategoryName
Otherwise, use the result of query1.
The problem is that, unless you refactor your project, you always have to evaluate query1 and check whether it contains ALL. If it does, then you need to evaluate query2 as well. This can hardly be optimized, but let's look at a few approaches:
Quickening query1
Since ALL might not be the last element evaluated, adding it to the filter and limiting the result is a good way to quicken query1:
SELECT
COUNT(*)
FROM
phonebook.newsms_subscription
INNER JOIN bdsubcategory ON bdsubcategory.subcategoryID = newsms_subscription.subcategoryID
INNER JOIN newsms_client ON newsms_subscription.clientID =newsms_client.clientID
INNER JOIN newsms_person ON newsms_subscription.personID = newsms_person.personID
WHERE
newsms_subscription.isActive = 1 AND
newsms_person.personID = '856' AND
bdsubcategory.subcategoryName = 'ALL'
LIMIT 0, 1
So, you could create a stored procedure that evaluates query1' (the quickened version of query1 shown above). If it returns a non-zero count, ALL is present and we need to execute query2; otherwise we need to execute query1. This way you still execute two queries, but the first one is cheap.
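The check-then-branch flow can be sketched in application code as well as in a stored procedure. The following is a rough illustration only: it assumes Python's sqlite3 in place of MySQL, a hypothetical two-table schema (subscription, subcategory) that flattens the tables in the question, and it swaps the COUNT(*) probe for SELECT EXISTS. The branching follows the rule stated in the question: if the person is subscribed to ALL, run query2, otherwise return query1's rows.

```python
import sqlite3

# Hypothetical, flattened schema for illustration.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE subcategory (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE subscription (
        person_id INTEGER, subcategory_id INTEGER, is_active INTEGER);
    INSERT INTO subcategory VALUES (1, 'ALL'), (2, 'News'), (3, 'Sport');
    INSERT INTO subscription VALUES (856, 1, 1), (900, 2, 1), (900, 3, 0);
""")

def subscribed_categories(conn, person_id):
    # Cheap probe: does this person have an active 'ALL' subscription?
    has_all = conn.execute("""
        SELECT EXISTS (
            SELECT 1
            FROM subscription s
            JOIN subcategory c ON c.id = s.subcategory_id
            WHERE s.is_active = 1 AND s.person_id = ? AND c.name = 'ALL')
    """, (person_id,)).fetchone()[0]

    if has_all:
        # query2: every category that has any active subscription.
        return conn.execute("""
            SELECT c.id, c.name
            FROM subscription s
            JOIN subcategory c ON c.id = s.subcategory_id
            WHERE s.is_active = 1
            GROUP BY c.id, c.name
            ORDER BY c.name
        """).fetchall()

    # query1: only this person's active categories.
    return conn.execute("""
        SELECT c.id, c.name
        FROM subscription s
        JOIN subcategory c ON c.id = s.subcategory_id
        WHERE s.is_active = 1 AND s.person_id = ?
        ORDER BY c.name
    """, (person_id,)).fetchall()

print(subscribed_categories(conn, 856))
print(subscribed_categories(conn, 900))
```

EXISTS can stop at the first matching row, which is the effect the LIMIT in query1' is aiming for.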
Refactoring
Note that the second query does not change. You could create a table that caches its results and refresh it with a periodic job. Then you could reduce the second query to
SELECT ID, Name
FROM MyNewTable;
without the many joins. You would also cache the results of the first query into a table where the items having ALL would be stored and query that table.
One option would be to use a CASE.
Change this:
newsms_person.personID = '856'
To this:
'Y' = CASE WHEN UPPER('856') = 'ALL' THEN 'Y'
WHEN newsms_person.personID = '856' THEN 'Y'
ELSE 'N' END
Alternatively, a stored procedure could be used to first validate whether the personID seems valid, then return the appropriate data.

How to optimize a query with inner join

My MySQL query is too slow and I don't know how to optimize it. My webapp can't load this query because it takes too long to run and the web server has a time limit for getting the result.
SELECT rc.trial_id,
rc.created,
rc.date_registration,
rc.agemin_value,
rc.agemin_unit,
rc.agemax_value,
rc.agemax_unit,
rc.exclusion_criteria,
rc.study_design,
rc.expanded_access_program,
rc.number_of_arms,
rc.enrollment_start_actual,
rc.target_sample_size,
(select name from repository_institution where id = rc.primary_sponsor_id) as
primary_sponsor,
(select label from vocabulary_studytype where id = rc.study_type_id) as study_type,
(select label from vocabulary_interventionassigment where id =
rc.intervention_assignment_id) as intervention_assignment,
(select label from vocabulary_studypurpose where id = rc.purpose_id) as study_purpose,
(select label from vocabulary_studymasking where id = rc.masking_id) as study_mask,
(select label from vocabulary_studyallocation where id = rc.allocation_id) as
study_allocation,
(select label from vocabulary_studyphase where id = rc.phase_id) as phase,
(select label from vocabulary_recruitmentstatus where id = rc.recruitment_status_id) as
recruitment_status,
GROUP_CONCAT(vi.label)
FROM
repository_clinicaltrial rc
inner JOIN repository_clinicaltrial_i_code rcic ON rcic.clinicaltrial_id = rc.id JOIN
vocabulary_interventioncode vi ON vi.id = rcic.interventioncode_id
GROUP BY rc.id;
Could using INNER JOIN instead of JOIN be a solution?
Replacing the per-row scalar subqueries with JOINs will definitely help. Also, since you are using MySQL, the keyword "STRAIGHT_JOIN" tells MySQL to join the tables in the order I listed them. Since your "rc" table is the primary one and all the others are lookups, this makes MySQL drive the query from "rc" rather than choosing some lookup table as the basis for the rest of the joins.
SELECT STRAIGHT_JOIN
rc.trial_id,
rc.created,
rc.date_registration,
rc.agemin_value,
rc.agemin_unit,
rc.agemax_value,
rc.agemax_unit,
rc.exclusion_criteria,
rc.study_design,
rc.expanded_access_program,
rc.number_of_arms,
rc.enrollment_start_actual,
rc.target_sample_size,
ri.name primary_sponsor,
st.label study_type,
via.label intervention_assignment,
vsp.label study_purpose,
vsm.label study_mask,
vsa.label study_allocation,
vsph.label phase,
vrs.label recruitment_status,
GROUP_CONCAT(vi.label)
FROM
repository_clinicaltrial rc
JOIN repository_clinicaltrial_i_code rcic
ON rc.id = rcic.clinicaltrial_id
JOIN vocabulary_interventioncode vi
ON rcic.interventioncode_id = vi.id
JOIN repository_institution ri
on rc.primary_sponsor_id = ri.id
JOIN vocabulary_studytype st
on rc.study_type_id = st.id
JOIN vocabulary_interventionassigment via
on rc.intervention_assignment_id = via.id
JOIN vocabulary_studypurpose vsp
ON rc.purpose_id = vsp.id
JOIN vocabulary_studymasking vsm
ON rc.masking_id = vsm.id
JOIN vocabulary_studyallocation vsa
ON rc.allocation_id = vsa.id
JOIN vocabulary_studyphase vsph
ON rc.phase_id = vsph.id
JOIN vocabulary_recruitmentstatus vrs
ON rc.recruitment_status_id = vrs.id
GROUP BY
rc.id;
One final note. You are using a GROUP BY together with GROUP_CONCAT(), which is OK. However, a proper GROUP BY lists all non-aggregate columns, which in this case is every other column in the select list. You may know this, and the lookups will indeed be the same for a given "rc" row, but omitting them is not good practice.
Your joins and subqueries are probably not the problem. Assuming you have correct indexes on the tables, then these are fast. "Correct indexes" means that the id column is the primary key -- a very reasonable assumption.
My guess is that the GROUP BY is the performance issue. So, I would suggest structuring the query with no GROUP BY:
select . . .
(select group_concat(vi.label)
from repository_clinicaltrial_i_code rcic join
vocabulary_interventioncode vi
on vi.id = rcic.interventioncode_id
where rcic.clinicaltrial_id = rc.id
)
from repository_clinicaltrial rc ;
For this, you want indexes on:
repository_clinicaltrial_i_code(clinicaltrial_id, interventioncode_id)
vocabulary_interventioncode(id, label)
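The sketch below compares the two shapes on toy data. It is an illustration under assumptions, not a benchmark: Python's sqlite3 stands in for MySQL (both provide GROUP_CONCAT), and the trial, code and trial_code tables are hypothetical stand-ins for the three tables in the question.

```python
import sqlite3

# Hypothetical miniature of repository_clinicaltrial, its link table,
# and vocabulary_interventioncode.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE trial (id INTEGER PRIMARY KEY, trial_id TEXT);
    CREATE TABLE code (id INTEGER PRIMARY KEY, label TEXT);
    CREATE TABLE trial_code (trial_pk INTEGER, code_id INTEGER);
    INSERT INTO trial VALUES (1, 'T-1'), (2, 'T-2'), (3, 'T-3');
    INSERT INTO code VALUES (10, 'drug'), (11, 'device');
    INSERT INTO trial_code VALUES (1, 10), (1, 11), (2, 10);
""")

# Shape 1: join everything, then GROUP BY the trial.
grouped = conn.execute("""
    SELECT t.trial_id, GROUP_CONCAT(c.label)
    FROM trial t
    JOIN trial_code tc ON tc.trial_pk = t.id
    JOIN code c ON c.id = tc.code_id
    GROUP BY t.id
""").fetchall()

# Shape 2: no outer GROUP BY; aggregate in a correlated subquery.
correlated = conn.execute("""
    SELECT t.trial_id,
           (SELECT GROUP_CONCAT(c.label)
            FROM trial_code tc
            JOIN code c ON c.id = tc.code_id
            WHERE tc.trial_pk = t.id)
    FROM trial t
""").fetchall()

def as_sets(rows):
    # Label order inside GROUP_CONCAT is not guaranteed, so compare sets.
    return {tid: set(labels.split(",")) if labels else set()
            for tid, labels in rows}

print(as_sets(grouped))
print(as_sets(correlated))
```

One behavioural difference is worth checking against your data: the correlated form also returns trials that have no intervention codes (with a NULL list), whereas the inner-join form drops them.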

MAX(Date) is giving empty result

I have a table with exchange rate like below
And I am using the max of the date to pick these values based on currency code. But the query returns no rows.
Select USDAMOUNT * dbo.EXCHANGERATEAMT
from dbo.Amount_monthly
Left Join dbo.EXCHANGERATE on dbo.Amount_monthly.Currencycode=dbo.EXCHANGERATE.fromcurrencycode
WHERE ValidToDateTime = (Select MAX(ValidToDateTime) from dbo.EXCHANGERATE)
AND dbo.EXCHANGERATE.EXCHANGERATETYPECODE = 'DAY'
Using this statement
CONVERT(DATE,ValidToDateTime) = CONVERT(DATE,GETDATE()-1)
instead of subquery is giving me expected result.
Can someone correct this.
thanks in advance.
If I understand correctly, you need two things. First, the condition for the max() needs to match the condition in the outer query. Second, if you really want a left join, then conditions on the second table need to go in the on clause.
The resulting query looks like:
Select . . .
from dbo.Amount_monthly am Left Join
dbo.EXCHANGERATE er
on am.Currencycode = er.fromcurrencycode and
er.ValidToDateTime = (Select max(er2.ValidToDateTime)
from dbo.EXCHANGERATE er2
where er2.EXCHANGERATETYPECODE = 'DAY'
) and
er.EXCHANGERATETYPECODE = 'DAY';
I would write this using window functions, but that is a separate issue.
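A small reproduction may help show why the MAX() subquery has to carry the same filter as the outer query. This is a sketch under assumptions: Python's sqlite3 stands in for SQL Server, and the amount and rate tables, their columns, and the sample rows are all made up for illustration.

```python
import sqlite3

# Hypothetical data: the newest row overall is a MONTH rate, not a DAY rate.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE amount (currency TEXT, usd_amount REAL);
    CREATE TABLE rate (currency TEXT, type_code TEXT, valid_to TEXT, rate REAL);
    INSERT INTO amount VALUES ('EUR', 100.0);
    INSERT INTO rate VALUES
        ('EUR', 'DAY',   '2024-01-30', 0.5),
        ('EUR', 'DAY',   '2024-01-31', 0.25),
        ('EUR', 'MONTH', '2024-02-29', 0.75);
""")

# The same query, with an optional filter inside the MAX() subquery.
QUERY = """
    SELECT a.currency, a.usd_amount * r.rate
    FROM amount a
    LEFT JOIN rate r
      ON r.currency = a.currency
     AND r.type_code = 'DAY'
     AND r.valid_to = (SELECT MAX(r2.valid_to) FROM rate r2 {subquery_filter})
"""

# Unfiltered MAX picks the MONTH row's date, which no DAY row has.
unfiltered = conn.execute(QUERY.format(subquery_filter="")).fetchall()

# Filtered MAX picks the latest DAY date, so the join finds a rate.
filtered = conn.execute(
    QUERY.format(subquery_filter="WHERE r2.type_code = 'DAY'")
).fetchall()

print(unfiltered)
print(filtered)
```

With the unfiltered subquery the LEFT JOIN finds no matching rate and the product is NULL; with the matching filter the latest DAY rate is applied.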
Try removing WHERE clause for ValidToDateTime and include it in the JOIN as AND condition
SELECT USDAMOUNT * dbo.EXCHANGERATEAMT
FROM dbo.Amount_monthly
LEFT JOIN dbo.EXCHANGERATE
ON dbo.Amount_monthly.Currencycode = dbo.EXCHANGERATE.fromcurrencycode
AND ValidToDateTime = (SELECT MAX(ValidToDateTime) --remove WHERE clause
FROM dbo.EXCHANGERATE)
AND dbo.EXCHANGERATE.EXCHANGERATETYPECODE = 'DAY';
I cleaned up your query a bit: as the other folks mentioned, you needed to close the parentheses around the MAX(Date) sub-query, and if you reference a LEFT JOINed table in the WHERE clause, it behaves like an INNER JOIN, so I changed it to an INNER JOIN. You also had "dbo" sprinkled in as a field prefix, but dbo is a schema name, so it qualifies a table, not a field. I added the IS NOT NULL check just to avoid SQL giving the "null values were eliminated" warning. I used the aliases "am" for the first table and "er" for the second, which makes the query more readable:
SELECT am.USDAMOUNT * er.EXCHANGERATEAMT
FROM dbo.Amount_monthly am
JOIN dbo.EXCHANGERATE er
ON am.Currencycode = er.fromcurrencycode
WHERE er.ValidToDateTime = (SELECT MAX(ValidToDateTime) FROM dbo.EXCHANGERATE WHERE ValidToDateTime IS NOT NULL)
AND er.EXCHANGERATETYPECODE = 'DAY'
If you're paranoid like I am, you might also want to make sure the exchange rate is not zero to avoid a divide-by-zero error.

MySQL Replacing IN and EXISTS with joins in sub sub queries

So, this query is currently used in a webshop to retrieve technical data about articles.
It has served its purpose fine, except that the number of products shown has increased lately, resulting in unacceptably long loading times for some categories.
For one of the worst pages, this query (and some others) gets requested about 80 times.
I only recently learned that MySQL does not optimize sub-queries without a dependent parameter so that they run only once.
So if someone could help me with one of the queries and explain how to replace the INs and EXISTSs with joins, I will probably be able to change the other ones myself.
select distinct criteria.cri_id, des_texts.tex_text, article_criteria.acr_value, article_criteria.acr_kv_des_id
from article_criteria, designations, des_texts, criteria, articles
where article_criteria.acr_cri_id = criteria.cri_id
and article_criteria.acr_art_id = articles.art_id
and articles.art_deliverystatus = 1
and criteria.cri_des_id = designations.des_id
and designations.des_lng_id = 9
and designations.des_tex_id = des_texts.tex_id
and criteria.cri_id = 328
and article_criteria.acr_art_id IN (Select distinct link_art.la_art_id
from link_art, link_la_typ
where link_art.la_id = link_la_typ.lat_la_id
and link_la_typ.lat_typ_id = 17484
and link_art.la_ga_id IN (Select distinct link_ga_str.lgs_ga_id
from link_ga_str, search_tree
where link_ga_str.lgs_str_id = search_tree.str_id
and search_tree.str_type = 1
and search_tree.str_id = 10132
and EXISTS (Select *
from link_la_typ
where link_la_typ.lat_typ_id = 17484
and link_ga_str.lgs_ga_id = link_la_typ.lat_ga_id)))
order by article_criteria.acr_value
I think this one is the main bad guy, with its sub-sub-sub-queries.
I just noticed I can remove the last EXISTS and still get the same results, but with no increase in speed. That's not part of the question though ;) I'll figure out myself whether I still need that part.
Any help or pointers are appreciated. If I left out some useful information, tell me as well.
I think this is equivalent:
SELECT DISTINCT c.cri_id, dt.tex_text, ac.acr_value, ac.acr_kv_des_id
FROM article_criteria AS ac
JOIN criteria AS c ON ac.acr_cri_id = c.cri_id
JOIN articles AS a ON ac.acr_art_id = a.art_id
JOIN designations AS d ON c.cri_des_id = d.des_id
JOIN des_texts AS dt ON dt.tex_id = d.des_tex_id
JOIN (SELECT distinct la.la_art_id
FROM link_art AS la
JOIN link_la_typ AS llt ON la.la_id = llt.lat_la_id
JOIN (SELECT DISTINCT lgs.lgs_ga_id
FROM link_ga_str AS lgs
JOIN search_tree AS st ON lgs.lgs_str_id = st.str_id
JOIN link_la_typ AS llt ON lgs.lgs_ga_id = llt.lat_ga_id
WHERE st.str_type = 1
AND st.str_id = 10132
AND llt.lat_typ_id = 17484) AS lgs
ON la.la_ga_id = lgs.lgs_ga_id
WHERE llt.lat_typ_id = 17484) AS la
ON ac.acr_art_id = la.la_art_id
WHERE a.art_deliverystatus = 1
AND d.des_lng_id = 9
AND c.cri_id = 328
ORDER BY ac.acr_value
All the IN <subquery> clauses can be replaced with JOIN <subquery>, where you then JOIN on the column being tested equaling the column returned by the subquery. And the EXISTS test is converted to a join with the table, moving the comparison in the subquery's WHERE clause into the ON clause of the JOIN.
It's probably possible to flatten the whole thing, instead of joining with subqueries. But I suspect performance will be poor, because this won't reduce the temporary tables using DISTINCT. So you'll get combinatorial explosion in the resulting cross product, which will then have to be reduced at the end with the DISTINCT at the top.
I've converted all the implicit joins to ANSI JOIN clauses, to make the structure clearer, and added table aliases to make things more readable.
In general, you can convert a FROM tab1 WHERE ... val IN (SELECT blah) to a join like this.
FROM tab1
JOIN (
SELECT tab1_id
FROM tab2
JOIN tab3 ON whatever = whatever
WHERE whatever
) AS sub1 ON tab1.id = sub1.tab1_id
The JOIN (an inner join) will drop the rows that don't match the ON condition from your query.
If your tab1_id values can come up duplicate from your inner query, use SELECT DISTINCT. But don't use SELECT DISTINCT unless you need to; it is costly to evaluate.
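Here is a small runnable illustration of that pattern and of why DISTINCT matters. It assumes Python's sqlite3 as a stand-in for MySQL; tab1 and tab2 follow the placeholder names above and the sample rows are made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE tab1 (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE tab2 (tab1_id INTEGER, tag TEXT);
    INSERT INTO tab1 VALUES (1, 'a'), (2, 'b'), (3, 'c');
    -- tab1 row 1 is referenced twice, row 2 never, row 3 once
    INSERT INTO tab2 VALUES (1, 'x'), (1, 'y'), (3, 'x');
""")

# Original shape: IN (subquery). Each tab1 row appears at most once.
with_in = conn.execute("""
    SELECT name FROM tab1
    WHERE id IN (SELECT tab1_id FROM tab2)
    ORDER BY name
""").fetchall()

# Naive rewrite: the duplicate tab1_id values multiply the outer rows.
plain_join = conn.execute("""
    SELECT name FROM tab1
    JOIN (SELECT tab1_id FROM tab2) AS sub1 ON tab1.id = sub1.tab1_id
    ORDER BY name
""").fetchall()

# Rewrite with DISTINCT in the derived table: same rows as the IN version.
distinct_join = conn.execute("""
    SELECT name FROM tab1
    JOIN (SELECT DISTINCT tab1_id FROM tab2) AS sub1 ON tab1.id = sub1.tab1_id
    ORDER BY name
""").fetchall()

print(with_in, plain_join, distinct_join)
```

The join without DISTINCT returns 'a' twice, which is why the rewritten query above keeps DISTINCT inside each derived table.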

MYSQL get other table data in a join

I am currently running this SQL
SELECT jm_recipe.name, jm_recipe.slug
FROM jm_recipe
LEFT JOIN jm_category_recipe ON jm_category_recipe.recipe_id = jm_recipe.id
WHERE jm_category_recipe.category_id = $cat
This returns the desired results, except that I also need to return the name of the category that the recipe is in. To do this I tried adding the field to my SELECT statement and the table to the FROM clause:
SELECT jm_recipe.name, jm_recipe.slug, jm_category_name
FROM jm_recipe, jm_category
LEFT JOIN jm_category_recipe ON jm_category_recipe.recipe_id = jm_recipe.id
WHERE jm_category_recipe.category_id = $cat
However, this just returns no results. What am I doing wrong?
You need to join both tables:
-- assumes jm_category has id and name columns (not shown in the question)
SELECT jm_recipe.name, jm_recipe.slug, jm_category.name AS category_name
FROM jm_recipe
INNER JOIN jm_category_recipe ON jm_category_recipe.recipe_id = jm_recipe.id
INNER JOIN jm_category ON jm_category.id = jm_category_recipe.category_id
WHERE jm_category_recipe.category_id = $cat
I've changed the joins to inner joins as well. You might want to make them both LEFT joins if you have NULLs and want them in the result.
Also, you're vulnerable to SQL Injection by simply copying over $cat.
Here's some PHP specific info for you (I'm assuming you're using PHP.)
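The general idea behind avoiding injection is the same in any language: send the value as a bound parameter rather than pasting it into the SQL text. The sketch below is an illustration only. It uses Python's sqlite3 instead of PHP and MySQL, and it assumes that jm_category has id and name columns, which the question does not show. PHP's PDO and mysqli prepared statements follow the same principle.

```python
import sqlite3

# Hypothetical data; the jm_category columns are an assumption.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE jm_recipe (id INTEGER PRIMARY KEY, name TEXT, slug TEXT);
    CREATE TABLE jm_category (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE jm_category_recipe (category_id INTEGER, recipe_id INTEGER);
    INSERT INTO jm_recipe VALUES (1, 'Pancakes', 'pancakes'), (2, 'Chili', 'chili');
    INSERT INTO jm_category VALUES (1, 'Breakfast'), (2, 'Dinner');
    INSERT INTO jm_category_recipe VALUES (1, 1), (2, 2);
""")

# The '?' placeholder keeps the category value out of the SQL text.
SQL = """
    SELECT r.name, r.slug, c.name
    FROM jm_recipe r
    JOIN jm_category_recipe cr ON cr.recipe_id = r.id
    JOIN jm_category c ON c.id = cr.category_id
    WHERE cr.category_id = ?
"""

def recipes_in_category(conn, cat):
    # cat is passed as data; it is never parsed as SQL.
    return conn.execute(SQL, (cat,)).fetchall()

safe = recipes_in_category(conn, 1)
# With string interpolation this input would have matched every row.
hostile = recipes_in_category(conn, "1 OR 1=1")

print(safe)
print(hostile)
```

In this SQLite sketch the hostile string is compared as an ordinary value and matches nothing, instead of rewriting the WHERE clause.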