MySQL WHERE IN not working as expected

I have the following query:
SELECT
`tests`.`id`,
`tests`.`created_at`,
`tests`.`updated_at`,
`tests`.`created_by`,
`tests`.`date_of_test`,
`tests`.`location`,
`tests`.`information`,
`tests`.`title`,
`tests`.`goals`,
`tests`.`deleted_at`,
`tests`.`status`,
`tests`.`tester`,
`tests`.`test_approach`
FROM `tests`
WHERE
`tests`.`id` IN (
SELECT `test_wobble`.`test_id`
FROM `project_wobble`
INNER JOIN `wobbles` ON `project_wobble`.`wobble_id` = `wobbles`.`id`
INNER JOIN `wobble_profiles` ON `wobble_profiles`.`wobble_id` = `wobbles`.`id`
INNER JOIN `wobble_profile_user` ON `wobble_profile_user`.`wobble_profile_id` = `wobble_profiles`.`id`
INNER JOIN `test_wobble` ON `test_wobble`.`wobble_id` = `wobbles`.`id`
WHERE `project_wobble`.`project_id` = '2' AND `wobble_profile_user`.`user_id` = '3'
GROUP BY `wobbles`.`id`
)
GROUP BY `tests`.`id`
ORDER BY tests.date_of_test DESC
If I run the IN subquery on its own, it returns 1 result with the value 13. The column is called "test_id".
When I run the whole query above, I get 2 results back from the tests table, with different ids: 13 and 14.
If I replace the IN subquery with the number 13, the query returns 1 result (the correct one).
What am I doing wrong here?

This query:
SELECT `test_wobble`.`test_id`
FROM `project_wobble`
INNER JOIN `wobbles` ON `project_wobble`.`wobble_id` = `wobbles`.`id`
INNER JOIN `wobble_profiles` ON `wobble_profiles`.`wobble_id` = `wobbles`.`id`
INNER JOIN `wobble_profile_user` ON `wobble_profile_user`.`wobble_profile_id` = `wobble_profiles`.`id`
INNER JOIN `test_wobble` ON `test_wobble`.`wobble_id` = `wobbles`.`id`
WHERE `project_wobble`.`project_id` = '2' AND `wobble_profile_user`.`user_id` = '3'
GROUP BY `wobbles`.`id`
groups by wobbles.id but returns test_wobble.test_id which is not a part of GROUP BY.
On each iteration, MySQL pushes the IN field into this query:
SELECT `test_wobble`.`test_id`
FROM `project_wobble`
INNER JOIN `wobbles` ON `project_wobble`.`wobble_id` = `wobbles`.`id`
INNER JOIN `wobble_profiles` ON `wobble_profiles`.`wobble_id` = `wobbles`.`id`
INNER JOIN `wobble_profile_user` ON `wobble_profile_user`.`wobble_profile_id` = `wobble_profiles`.`id`
INNER JOIN `test_wobble` ON `test_wobble`.`wobble_id` = `wobbles`.`id`
WHERE `project_wobble`.`project_id` = '2' AND `wobble_profile_user`.`user_id` = '3'
-- This is implicitly added by MySQL when optimizing
AND `test_wobble`.`test_id` = `tests`.`id`
GROUP BY `wobbles`.`id`
and then just checks if some value exists.
If you remove the GROUP BY from your IN query, you'll see that it contains both 13 and 14, but only one of those is returned when you run the query with GROUP BY.
You can also try running the second query, substituting 13 and 14 instead of tests.id and make sure the query returns something in both cases.
This might actually be considered a bug in MySQL. However, since the documentation does not specify which ungrouped and unaggregated expression will be returned from a grouped query, it's better to specify it explicitly, or side effects from the optimizer will kick in like they do in this case.
Could you please provide a sample of your data and outline what you are trying to achieve with the query?

It is a little bit hard to tell without knowing what the data is. But, you do have an issue in the subquery. This is your subquery:
SELECT `test_wobble`.`test_id`
FROM `project_wobble` INNER JOIN
`wobbles`
ON `project_wobble`.`wobble_id` = `wobbles`.`id` INNER JOIN
`wobble_profiles`
ON `wobble_profiles`.`wobble_id` = `wobbles`.`id` INNER JOIN
`wobble_profile_user`
ON `wobble_profile_user`.`wobble_profile_id` = `wobble_profiles`.`id` INNER JOIN
`test_wobble`
ON `test_wobble`.`wobble_id` = `wobbles`.`id`
WHERE `project_wobble`.`project_id` = '2' AND `wobble_profile_user`.`user_id` = '3'
GROUP BY `wobbles`.`id`
Note the select and group by. These have different variables:
`test_wobble`.`test_id`
`wobbles`.`id`
I'm not sure which one you really want. But MySQL returns an indeterminate value when you run the query -- and a value that can change from one run to the next. You should fix the select and group by so they match.

The inner query exposes undefined behaviour. It is explained in the documentation on the page MySQL Handling of GROUP BY.
According to the SQL standard, the inner query is invalid. To be valid, all the columns that appear in the SELECT, HAVING and ORDER BY clauses must satisfy one of the following:
they appear in the GROUP BY clause;
are used (in SELECT, HAVING or ORDER BY) only as parameters of aggregate functions;
are functionally dependent on the GROUP BY columns.
For example, using your tables, you can put in the SELECT clause:
wobbles.id - because it appears in the GROUP BY clause;
COUNT(DISTINCT project_wobble.project_id) - even if project_wobble.project_id does not appear in GROUP BY, it can be used as a parameter of the aggregate function COUNT(DISTINCT);
any column of table wobbles, given that the column id is its PK - all the columns of table wobbles are functionally dependent on wobbles.id (their values are uniquely determined by the value of wobbles.id).
Before version 5.7.5, MySQL accepts queries that do not follow the above requirements but, as the documentation states:
In this case, the server is free to choose any value from each group, so unless they are the same, the values chosen are indeterminate, which is probably not what you want.
Starting with version 5.7.5, MySQL implements detection of functional dependence as a configurable feature (which is turned ON by default).
On 5.7.5 and later, with the default SQL mode, your inner query will trigger an error and that's all; your query is invalid, so it doesn't run at all.
On previous versions (and also on 5.7.5 and later if the ONLY_FULL_GROUP_BY SQL mode is disabled), the query runs but its results are unpredictable. They can change from one execution to the next if, for example, a row is deleted and then re-inserted.
Because the MySQL query optimizer reorganizes your whole query to get a better execution plan, the subquery is not executed the same way when it is embedded in the larger query as when it is run standalone. This is another way you can observe its undefined behaviour.
How to fix your query
Extract the inner query, remove the GROUP BY clause, add more columns to the SELECT clause and look at what it produces:
SELECT DISTINCT `test_wobble`.`wobble_id`, `test_wobble`.`test_id`
FROM `project_wobble`
INNER JOIN `wobbles` ON `project_wobble`.`wobble_id` = `wobbles`.`id`
INNER JOIN `wobble_profiles` ON `wobble_profiles`.`wobble_id` = `wobbles`.`id`
INNER JOIN `wobble_profile_user` ON `wobble_profile_user`.`wobble_profile_id` = `wobble_profiles`.`id`
INNER JOIN `test_wobble` ON `test_wobble`.`wobble_id` = `wobbles`.`id`
WHERE `project_wobble`.`project_id` = '2' AND `wobble_profile_user`.`user_id` = '3'
If I'm not wrong, it will produce two rows having the same wobble_id and values 13 and 14 for column test_id.
If this result set is correct then you can remove test_wobble.wobble_id from SELECT, keep DISTINCT and put the query into the larger one.
There is no need for GROUP BY (because of the DISTINCT) and it should work faster without it.
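To check the DISTINCT suggestion on a small scale, here is a minimal sketch that uses Python's built-in sqlite3 module as a stand-in for MySQL. The schema is reduced to two tables, and the rows (including the wobble id 7 and the dates) are made-up assumptions, not the asker's data. It only illustrates that, without the GROUP BY, the subquery returns the same ids whether it runs standalone or inside IN.

```python
import sqlite3

# Hypothetical, simplified schema: only the two tables needed to show the effect.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE tests (id INTEGER PRIMARY KEY, date_of_test TEXT);
    CREATE TABLE test_wobble (test_id INTEGER, wobble_id INTEGER);
    INSERT INTO tests VALUES (13, '2015-01-01'), (14, '2015-02-01'), (15, '2015-03-01');
    -- Wobble 7 is linked to two tests, so one wobble "group" holds two test_id values.
    INSERT INTO test_wobble VALUES (13, 7), (14, 7);
""")

# Standalone subquery: DISTINCT (no GROUP BY) returns every matching test_id.
inner_ids = [row[0] for row in conn.execute(
    "SELECT DISTINCT test_id FROM test_wobble WHERE wobble_id = 7 ORDER BY test_id"
)]

# The same subquery embedded in IN: the outer query sees exactly those ids.
outer_ids = [row[0] for row in conn.execute("""
    SELECT id FROM tests
    WHERE id IN (SELECT DISTINCT test_id FROM test_wobble WHERE wobble_id = 7)
    ORDER BY date_of_test DESC
""")]

print(inner_ids)  # [13, 14]
print(outer_ids)  # [14, 13]
```

If only one of the two tests is really wanted, the condition that singles it out has to be stated in the subquery's WHERE clause (or through an aggregate such as MIN or MAX) rather than left to GROUP BY.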

Related

MySQL Sum even if records doesnt exist [duplicate]

I need to retrieve all default settings from the settings table, but also grab the character's setting if one exists for a given character.
But this query only retrieves the settings where the character is 1, not the default settings for which the user hasn't set a value.
SELECT `settings`.*, `character_settings`.`value`
FROM (`settings`)
LEFT JOIN `character_settings`
ON `character_settings`.`setting_id` = `settings`.`id`
WHERE `character_settings`.`character_id` = '1'
So I need something like this:
array(
'0' => array('somekey' => 'keyname', 'value' => 'thevalue'),
'1' => array('somekey2' => 'keyname2'),
'2' => array('somekey3' => 'keyname3')
)
Here keys 1 and 2 hold only the default values, while key 0 holds the default value together with the character's value.
The where clause is filtering away rows where the left join doesn't succeed. Move it to the join:
SELECT `settings`.*, `character_settings`.`value`
FROM `settings`
LEFT JOIN
`character_settings`
ON `character_settings`.`setting_id` = `settings`.`id`
AND `character_settings`.`character_id` = '1'
When making OUTER JOINs (ANSI-89 or ANSI-92), the location of a filter matters, because criteria specified in the ON clause are applied before the JOIN is made, while criteria against an OUTER JOINed table provided in the WHERE clause are applied after the JOIN is made. This can produce very different result sets. In comparison, it doesn't matter for INNER JOINs whether the criteria are provided in the ON or WHERE clause -- the result will be the same.
SELECT s.*,
cs.`value`
FROM SETTINGS s
LEFT JOIN CHARACTER_SETTINGS cs ON cs.setting_id = s.id
AND cs.character_id = 1
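The difference between filtering in ON and filtering in WHERE is easy to verify on a few rows. The following is a minimal sketch using Python's built-in sqlite3 module in place of MySQL; the setting names and values are invented sample data, and `run` is a hypothetical helper. For comparison it also runs a third variant that keeps the filter in WHERE but adds an IS NULL test.

```python
import sqlite3

# Hypothetical sample data; SQLite stands in for MySQL here.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE settings (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE character_settings (setting_id INTEGER, character_id INTEGER, value TEXT);
    INSERT INTO settings VALUES (1, 'volume'), (2, 'theme'), (3, 'language');
    -- 'volume' is set for character 1, 'theme' only for character 2.
    INSERT INTO character_settings VALUES (1, 1, 'loud'), (2, 2, 'dark');
""")

def run(on_extra="", where=""):
    """Run the LEFT JOIN with an optional extra ON condition and/or WHERE clause."""
    sql = f"""
        SELECT s.name, cs.value
        FROM settings s
        LEFT JOIN character_settings cs
               ON cs.setting_id = s.id {on_extra}
        {where}
        ORDER BY s.id
    """
    return conn.execute(sql).fetchall()

# Filter applied after the join: unmatched settings are thrown away.
filter_in_where = run(where="WHERE cs.character_id = 1")
# Filter applied while joining: every setting survives, value is NULL if unset.
filter_in_on = run(on_extra="AND cs.character_id = 1")
# WHERE plus IS NULL: loses settings that only other characters have set.
where_or_null = run(where="WHERE cs.character_id = 1 OR cs.character_id IS NULL")

print(filter_in_where)  # [('volume', 'loud')]
print(filter_in_on)     # [('volume', 'loud'), ('theme', None), ('language', None)]
print(where_or_null)    # [('volume', 'loud'), ('language', None)]
```

Only the ON variant returns all three settings. The IS NULL variant drops 'theme', because that setting has a row for a different character, so the join succeeds and the WHERE clause then rejects it.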
If I understand your question correctly, you want records from the settings table if they don't have a join across to the character_settings table, or if that joined record has character_id = 1.
You should therefore do
SELECT `settings`.*, `character_settings`.`value`
FROM (`settings`)
LEFT OUTER JOIN `character_settings`
ON `character_settings`.`setting_id` = `settings`.`id`
WHERE `character_settings`.`character_id` = '1' OR
`character_settings`.character_id is NULL
You might find it easier to understand by using a simple subquery
SELECT `settings`.*, (
SELECT `value` FROM `character_settings`
WHERE `character_settings`.`setting_id` = `settings`.`id`
AND `character_settings`.`character_id` = '1') AS cv_value
FROM `settings`
The subquery is allowed to return null, so you don't have to worry about JOIN/WHERE in the main query.
Sometimes, this works faster in MySQL, but compare it against the LEFT JOIN form to see what works best for you.
SELECT s.*, c.value
FROM settings s
LEFT JOIN character_settings c ON c.setting_id = s.id AND c.character_id = '1'
For this problem, as for many others involving non-trivial left joins such as left-joining on inner-joined tables, I find it convenient and somewhat more readable to split the query with a with clause. In your example,
with settings_for_char as (
select setting_id, value from character_settings where character_id = 1
)
select
settings.*,
settings_for_char.value
from
settings
left join settings_for_char on settings_for_char.setting_id = settings.id;
The way I finally understood the top answer was by realising (following the order of execution of a SQL query) that the WHERE clause is applied to the joined table, thereby filtering out rows of the joined (output) table that do not satisfy the WHERE condition. Moving the WHERE condition to the ON clause, however, applies it to the individual tables prior to joining. This lets the left join retain rows from the left table even when the entries from the right table for those rows do not satisfy the condition.
The result is correct for the SQL statement as written. A left join returns all rows from the left table, and only matching rows from the right table.
The ID and Name columns come from the left table, so they are always returned.
Score comes from the right table, and 30 is returned because that value relates to the Name "Flow". The Score for the other names is NULL because they do not relate to the Name "Flow".
The query below would return the result you were expecting:
SELECT a.*, b.Score
FROM #Table1 a
LEFT JOIN #Table2 b
ON a.ID = b.T1_ID
WHERE 1=1
AND a.Name = 'Flow'
It applies the filter to the left-hand table.

MySQL subquery in select with link to outer field

I am converting old software (which uses an MS Access MDB) to MySQL.
I have a query that takes a long time to run (actually, I cancelled it after waiting 5 minutes).
How can I rewrite it?
SELECT pa_ID, pa_PRODUCT_ID, pr_ID, pr_NAME, Sum(pa_KILOS) as IN_KILOS,
(select sum(pl_KILOS) from POLHSH
where POLHSH.pl_PRODUCT_ID = pa_PRODUCT_ID and POLHSH.pl_PARALABH_ID = pa_ID) as OUT_KILOS
From PARALABH, PRODUCTS
WHERE pa_company_id=1
GROUP BY pa_ID, pa_PRODUCT_ID, pr_ID, pr_NAME
HAVING pa_ID=241 and pr_id=pa_PRODUCT_ID
Thanks in advance
Consider avoiding the correlated subquery, which runs a SUM separately for each row, and instead use a join of two aggregate queries, each of which runs SUM once per group. Additionally, use explicit joins, the current SQL standard for joining tables/views.
Please adjust column aliases and names to actuals as assumptions were made below.
SELECT t1.*, t2.OUT_KILOS
FROM
(SELECT pa.pa_ID,
pa.pa_PRODUCT_ID,
pr.pr_ID,
pr.pr_NAME,
SUM(pa.pa_KILOS) AS IN_KILOS
FROM PARALABH pa
INNER JOIN PRODUCTS pr
ON pr.pr_id = pa.pa_PRODUCT_ID
WHERE pa.pa_company_id = 1
GROUP BY pa.pa_ID,
pa.pa_PRODUCT_ID,
pr.pr_ID,
pr.pr_NAME
HAVING pa.pa_ID = 241
) AS t1
INNER JOIN
(SELECT POLHSH.pl_PRODUCT_ID,
POLHSH.pl_PARALABH_ID,
SUM(pl_KILOS) As OUT_KILOS
FROM POLHSH
GROUP BY POLHSH.pl_PRODUCT_ID,
POLHSH.pl_PARALABH_ID
) AS t2
ON t2.pl_PRODUCT_ID = t1.pa_PRODUCT_ID
AND t2.pl_PARALABH_ID = t1.pa_ID
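As a sanity check that the join of two aggregate queries yields the same totals as the correlated subquery, here is a minimal sketch using Python's built-in sqlite3 module instead of MySQL. The tables are cut down to the columns involved in the sums, PRODUCTS is left out, and all rows are invented for illustration.

```python
import sqlite3

# Hypothetical, trimmed-down tables with made-up rows; SQLite stands in for MySQL.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE PARALABH (pa_ID INTEGER, pa_PRODUCT_ID INTEGER, pa_KILOS INTEGER);
    CREATE TABLE POLHSH (pl_PARALABH_ID INTEGER, pl_PRODUCT_ID INTEGER, pl_KILOS INTEGER);
    INSERT INTO PARALABH VALUES (241, 5, 100), (241, 5, 50), (242, 5, 70);
    INSERT INTO POLHSH VALUES (241, 5, 30), (241, 5, 20), (242, 5, 10);
""")

# Original shape: the inner SUM is evaluated once per output row.
correlated = conn.execute("""
    SELECT pa_ID, pa_PRODUCT_ID, SUM(pa_KILOS) AS IN_KILOS,
           (SELECT SUM(pl_KILOS) FROM POLHSH
             WHERE pl_PRODUCT_ID = pa_PRODUCT_ID
               AND pl_PARALABH_ID = pa_ID) AS OUT_KILOS
    FROM PARALABH
    GROUP BY pa_ID, pa_PRODUCT_ID
    ORDER BY pa_ID
""").fetchall()

# Suggested shape: aggregate each table once, then join the two small results.
pre_aggregated = conn.execute("""
    SELECT t1.pa_ID, t1.pa_PRODUCT_ID, t1.IN_KILOS, t2.OUT_KILOS
    FROM (SELECT pa_ID, pa_PRODUCT_ID, SUM(pa_KILOS) AS IN_KILOS
          FROM PARALABH
          GROUP BY pa_ID, pa_PRODUCT_ID) AS t1
    INNER JOIN (SELECT pl_PARALABH_ID, pl_PRODUCT_ID, SUM(pl_KILOS) AS OUT_KILOS
                FROM POLHSH
                GROUP BY pl_PARALABH_ID, pl_PRODUCT_ID) AS t2
            ON t2.pl_PRODUCT_ID = t1.pa_PRODUCT_ID
           AND t2.pl_PARALABH_ID = t1.pa_ID
    ORDER BY t1.pa_ID
""").fetchall()

print(correlated)      # [(241, 5, 150, 50), (242, 5, 70, 10)]
print(pre_aggregated)  # [(241, 5, 150, 50), (242, 5, 70, 10)]
```

One difference to keep in mind: with an INNER JOIN, a PARALABH group that has no POLHSH rows disappears from the result, whereas the correlated subquery returns it with a NULL OUT_KILOS. A LEFT JOIN to the second aggregate keeps such rows.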

MYSQL Statement Issues - INNER JOIN, LEFT JOIN WITH GROUP BY and MAX

I can't for the life of me get this statement to work.
SELECT max(pm.timestamp), pm.id, pm.p_media_user_id, pm.p_media_type,
pm.p_media_file, pm.wall_post, pm.p_media_location,pm.p_media_location_name,
pm.p_media_category, pa.p_source_alert_id, pa.post_id, pa.p_target_alert_id,
pu.fb_id, pu.username, pu.city, pu.sex, pu.main_image
FROM p_media as pm
INNER JOIN p_users as pu ON pm.p_media_user_id = pu.fb_id
LEFT JOIN p_alerts as pa ON pm.id = pa.post_id AND pa.p_source_alert_id ='3849084'
group by pm.p_media_user_id;
The only thing that I am having issues with is the max(pm.timestamp), after the grouping I would expect it to show the NEWEST rows in the p_media table, but to the contrary it's doing the exact opposite and showing the oldest rows. So, I need the newest rows from the p_media table grouped by the user id which Join the p_users table.
Thanks in advance, if anyone helps.
As others have already pointed out, you are aggregating by the p_media_user_id column but then selecting other non-aggregate columns. This either won't run at all, or it will run but give nondeterministic results. However, it looks like you just want the most recent record from the p_media table for each p_media_user_id.
If so, then this would seem to be the query you intended to run:
SELECT
pm1.timestamp, pm1.id, pm1.p_media_user_id, pm1.p_media_type, pm1.p_media_file,
pm1.wall_post, pm1.p_media_location, pm1.p_media_location_name,
pm1.p_media_category, pa.p_source_alert_id, pa.post_id, pa.p_target_alert_id,
pu.fb_id, pu.username, pu.city, pu.sex, pu.main_image
FROM p_media as pm1
INNER JOIN
(
SELECT p_media_user_id, MAX(timestamp) AS max_timestamp
FROM p_media
GROUP BY p_media_user_id
) pm2
ON pm1.p_media_user_id = pm2.p_media_user_id AND
pm1.timestamp = pm2.max_timestamp
INNER JOIN p_users AS pu
ON pm1.p_media_user_id = pu.fb_id
LEFT JOIN p_alerts AS pa
ON pm1.id = pa.post_id AND
pa.p_source_alert_id = '3849084';
Your query is not doing what you think it is doing. When you use GROUP BY, only the columns that appear in the GROUP BY clause can be used in the SELECT without an aggregate function. All columns that are not in the GROUP BY clause must be used in an aggregate function when you add them to the SELECT.
This is the standard, and every database that follows the standard will give you an error for your query. For some reason, MySQL decided not to follow the standard on this, and no error is returned. This is really bad, because your query will run, but the results cannot be predicted. So you will think that the query is fine and wonder why you get the wrong results, while in fact your query is invalid.
MySQL has finally addressed the problem: starting with MySQL 5.7.5, the ONLY_FULL_GROUP_BY SQL mode is enabled by default. The reason they gave is rather silly ("because GROUP BY processing has become more sophisticated to include detection of functional dependencies"), but at least they've changed the default, and starting with MySQL 5.7.5 it behaves like most other databases. For earlier versions, if you have access to change the settings, I recommend enabling ONLY_FULL_GROUP_BY so you get a clear error for such invalid queries.
In some cases, you really don't care which value is returned for the non-aggregate columns, for example when all the values in a group are exactly the same. To let the query pass while ONLY_FULL_GROUP_BY is enabled, use the ANY_VALUE() function on those columns. This is a better approach, as it clearly indicates your intention.
To learn how you can fix your query, you can read How do we select non-aggregate columns in a query with a GROUP BY clause. You need to self-join the p_media table with only the p_media_user_id and MAX(timestamp) selected on the grouping:
SELECT pm.timestamp, pm.id, pm.p_media_user_id, pm.p_media_type, pm.p_media_file,
pm.wall_post, pm.p_media_location, pm.p_media_location_name, pm.p_media_category,
pa.p_source_alert_id, pa.post_id, pa.p_target_alert_id,
pu.fb_id, pu.username, pu.city, pu.sex, pu.main_image
FROM p_media as pm
INNER JOIN (SELECT p_media_user_id, MAX(timestamp) AS max_time
FROM p_media
GROUP BY p_media_user_id
) pmm ON pm.p_media_user_id = pmm.p_media_user_id
AND pm.timestamp = pmm.max_time
INNER JOIN p_users AS pu ON pm.p_media_user_id = pu.fb_id
LEFT JOIN p_alerts AS pa ON pm.id = pa.post_id
AND pa.p_source_alert_id = '3849084';
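The self-join on MAX(timestamp) can be tried out on a handful of rows. Below is a minimal sketch using Python's built-in sqlite3 module rather than MySQL; p_media is reduced to four columns, the joins to p_users and p_alerts are omitted, and the rows are invented sample data.

```python
import sqlite3

# Hypothetical, reduced p_media table with made-up rows; SQLite stands in for MySQL.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE p_media (id INTEGER PRIMARY KEY, p_media_user_id INTEGER,
                          timestamp INTEGER, p_media_file TEXT);
    INSERT INTO p_media VALUES
        (1, 100, 10, 'old_a.jpg'),
        (2, 100, 30, 'new_a.jpg'),
        (3, 200, 20, 'old_b.jpg'),
        (4, 200, 40, 'new_b.jpg');
""")

# The derived table finds the newest timestamp per user; joining back on both
# the user id and that timestamp selects the complete newest row for each user.
latest = conn.execute("""
    SELECT pm.p_media_user_id, pm.timestamp, pm.p_media_file
    FROM p_media AS pm
    INNER JOIN (SELECT p_media_user_id, MAX(timestamp) AS max_time
                FROM p_media
                GROUP BY p_media_user_id) AS pmm
            ON pm.p_media_user_id = pmm.p_media_user_id
           AND pm.timestamp = pmm.max_time
    ORDER BY pm.p_media_user_id
""").fetchall()

print(latest)  # [(100, 30, 'new_a.jpg'), (200, 40, 'new_b.jpg')]
```

If two rows of the same user share the maximum timestamp, this pattern returns both of them, so a tie-breaker (for example the highest id) is needed when exactly one row per user is required.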
You should be able to add an ORDER BY after the grouping and tell SQL what column you want to sort by [ASC or DESC].
SELECT max(pm.timestamp), pm.id, pm.p_media_user_id, pm.p_media_type,
pm.p_media_file, pm.wall_post, pm.p_media_location,pm.p_media_location_name,
pm.p_media_category, pa.p_source_alert_id, pa.post_id, pa.p_target_alert_id,
pu.fb_id, pu.username, pu.city, pu.sex, pu.main_image
FROM p_media as pm
INNER JOIN p_users as pu ON pm.p_media_user_id = pu.fb_id
LEFT JOIN p_alerts as pa ON pm.id = pa.post_id AND pa.p_source_alert_id ='3849084'
group by pm.p_media_user_id
ORDER BY pm.p_media_user_id DESC;

Can I use COUNT(*) in multiple tables?

I have a select statement that uses inner joins on multiple tables, and I want to get COUNT() from one particular table, however my current statement is throwing an error:
Syntax error: unexpected 'COUNT' (count)
Helpful. I know. Gotta love MySQL's detailed and in-depth error messages.
Here is my select statement:
SELECT SE.SEId, SE.ParentME, SE.ParentSE, SE.Name, SE.Status, SE.Description,
UDC.UDCId, UDC.Code, UDC.Description,
TRM.COUNT(*)
FROM SubEquipment SE
INNER JOIN UserDefinedCode UDC ON UDC.ETId = SE.EquipmentType
INNER JOIN Terminal TRM ON TRM.SEId = SE.SEId
GROUP BY TRM.SEId
WHERE ParentME = #MEId;
What am I doing wrong? Is this possible?
You want to do the following:
SELECT SE.SEId, SE.ParentME, SE.ParentSE, SE.Name, SE.Status, SE.Description,
UDC.UDCId, UDC.Code, UDC.Description,
COUNT(DISTINCT TRM.SEID)
FROM SubEquipment SE
INNER JOIN UserDefinedCode UDC ON UDC.ETId = SE.EquipmentType
INNER JOIN Terminal TRM ON TRM.SEId = SE.SEId
WHERE ParentME = #MEId
GROUP BY 1,2,3,4,5,6,7,8,9
Because Count is an aggregate your single measures must be grouped. Plus the error you're seeing is because COUNT isn't a column in TRM. That's what it thinks you're asking for.
Try COUNT(DISTINCT [the TRM primary key field(s)]); it should count the distinct terminal "id" values, so even if the intermediate JOIN multiplies the rows, you'll still get the number of terminals.
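The row multiplication mentioned above can be seen in a small experiment. This is a minimal sketch using Python's built-in sqlite3 module in place of MySQL; the tables are stripped to their join columns, `TerminalId` is an assumed name for the Terminal primary key (the question does not show it), and the rows are invented.

```python
import sqlite3

# Hypothetical minimal tables; TerminalId is an assumed primary key name.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE SubEquipment (SEId INTEGER PRIMARY KEY, EquipmentType INTEGER);
    CREATE TABLE UserDefinedCode (UDCId INTEGER PRIMARY KEY, ETId INTEGER);
    CREATE TABLE Terminal (TerminalId INTEGER PRIMARY KEY, SEId INTEGER);
    INSERT INTO SubEquipment VALUES (1, 1);
    -- Two codes match the equipment type, so every terminal row appears twice.
    INSERT INTO UserDefinedCode VALUES (10, 1), (11, 1);
    INSERT INTO Terminal VALUES (100, 1), (101, 1), (102, 1);
""")

row = conn.execute("""
    SELECT SE.SEId,
           COUNT(*) AS joined_rows,                    -- counts the multiplied rows
           COUNT(DISTINCT TRM.TerminalId) AS terminals -- counts each terminal once
    FROM SubEquipment SE
    INNER JOIN UserDefinedCode UDC ON UDC.ETId = SE.EquipmentType
    INNER JOIN Terminal TRM ON TRM.SEId = SE.SEId
    GROUP BY SE.SEId
""").fetchone()

print(row)  # (1, 6, 3)
```

COUNT(*) reports 6 because the join produces 2 x 3 rows, while counting distinct primary key values gives the actual 3 terminals.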
In addition to FirebladeDan's answer, (as he suggested) a subquery also cleaned this issue up:
SELECT DISTINCT SE.SEId, SE.ParentME, SE.ParentSE, SE.Name, SE.Status, SE.Description,
UDC.UDCId, UDC.Code, UDC.Description,
-- Subquery to get the count (aliased as T so it does not refer to the outer TRM row)
(SELECT COUNT(*) FROM Terminal T WHERE T.SEId = SE.SEId) AS TerminalCount
FROM SubEquipment SE
INNER JOIN UserDefinedCode UDC ON UDC.ETId = SE.EquipmentType
LEFT JOIN Terminal TRM ON TRM.SEId = SE.SEId
WHERE ParentME = #MEId;
This got rid of the need for grouping the columns.
Subnote: I changed the INNER JOIN on the Terminal table to a LEFT JOIN, because if a SEId did not have any associated terminals, it would not return any information, which also called for a DISTINCT query.

How to fix a count() in a query with a "group by" clause?

I have a function that takes a SQL query, inserts a count field into it, and executes it to return the number of rows. The goal is to get the record count of dynamically generated SQL no matter what it contains: I use it in a record filter window, and I never know what query will be generated, because the user can add as many filters as he/she wants.
But when the query uses a GROUP BY clause, the result is wrong, because it counts the number of times each main record appears as a result of the many joins.
The code below should return a single row with one column containing 10, but instead I get a table whose first column holds 2 in the first row and 1 in every other row.
If I remove the GROUP BY clause I get 11 as the count, because the first record is counted twice.
What should I do to get a single row with the correct number?
SELECT
COUNT(*) QUERYRECORDCOUNT, -- this line appears only in the Count() function
ARTISTA.*,
CATEGORIA.NOME AS CATEGORIA,
ATIVIDADE.NOME AS ATIVIDADE,
LOCALIDADE.NOME AS CIDADE,
MATRICULA.NUMERO AS MAP
FROM
ARTISTA
LEFT JOIN PERFIL ON PERFIL.REGISTRO = ARTISTA.ARTISTA_ID
LEFT JOIN CATEGORIA ON CATEGORIA.CATEGORIA_ID = PERFIL.CATEGORIA
LEFT JOIN ATIVIDADE ON ATIVIDADE.ATIVIDADE_ID = PERFIL.ATIVIDADE
LEFT JOIN LOCALIDADE ON LOCALIDADE.LOCALIDADE_ID = ARTISTA.LOCAL_ATIV_CIDADE
LEFT JOIN MATRICULA ON MATRICULA.REGISTRO = ARTISTA.ARTISTA_ID
WHERE
((ARTISTA.SIT_PERFIL <> 'NORMAL') AND (ARTISTA.SIT_PERFIL <> 'PRIVADO'))
GROUP BY
ARTISTA.ARTISTA_ID
ORDER BY
ARTISTA.ARTISTA_ID;
This always gives you the number of rows for any query you have:
Select count(*) as rowcount from
(
Paste your query here
) as countquery
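Since the question is about counting the rows of arbitrary generated SQL, the wrapping idea can be packaged as a small helper. The following is a minimal sketch in Python with the built-in sqlite3 module standing in for MySQL; `count_rows` is a hypothetical helper name, and the two tables and their rows are invented, reduced versions of ARTISTA and PERFIL.

```python
import sqlite3

def count_rows(conn, query, params=()):
    """Return how many rows `query` produces by wrapping it as a derived table."""
    # Drop a trailing semicolon so the query can be nested inside parentheses.
    inner = query.rstrip().rstrip(";")
    wrapped = f"SELECT COUNT(*) FROM ({inner}) AS countquery"
    return conn.execute(wrapped, params).fetchone()[0]

# Hypothetical, reduced tables with made-up rows.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE ARTISTA (ARTISTA_ID INTEGER PRIMARY KEY);
    CREATE TABLE PERFIL (REGISTRO INTEGER);
    INSERT INTO ARTISTA VALUES (1), (2), (3);
    -- Artist 1 has two profiles, so the join returns that artist twice.
    INSERT INTO PERFIL VALUES (1), (1);
""")

joined = """
    SELECT ARTISTA.*
    FROM ARTISTA
    LEFT JOIN PERFIL ON PERFIL.REGISTRO = ARTISTA.ARTISTA_ID
"""
grouped = joined + " GROUP BY ARTISTA.ARTISTA_ID;"

ungrouped_count = count_rows(conn, joined)
grouped_count = count_rows(conn, grouped)

print(ungrouped_count)  # 4
print(grouped_count)    # 3
```

The helper never edits the SELECT list of the generated query, so it works whether or not the query contains joins or a GROUP BY. In MySQL the alias on the derived table is mandatory.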
Since you are grouping by ARTISTA.ARTISTA_ID, COUNT(*) QUERYRECORDCOUNT returns the record count for each ARTISTA.ARTISTA_ID value.
If you want the global count, you need to use a nested query:
SELECT COUNT(*) AS QUERYRECORDCOUNT
FROM (SELECT
ARTISTA.*,
CATEGORIA.NOME AS CATEGORIA,
ATIVIDADE.NOME AS ATIVIDADE,
LOCALIDADE.NOME AS CIDADE,
MATRICULA.NUMERO AS MAP
FROM
ARTISTA
LEFT JOIN PERFIL ON PERFIL.REGISTRO = ARTISTA.ARTISTA_ID
LEFT JOIN CATEGORIA ON CATEGORIA.CATEGORIA_ID = PERFIL.CATEGORIA
LEFT JOIN ATIVIDADE ON ATIVIDADE.ATIVIDADE_ID = PERFIL.ATIVIDADE
LEFT JOIN LOCALIDADE ON LOCALIDADE.LOCALIDADE_ID = ARTISTA.LOCAL_ATIV_CIDADE
LEFT JOIN MATRICULA ON MATRICULA.REGISTRO = ARTISTA.ARTISTA_ID
WHERE
((ARTISTA.SIT_PERFIL <> 'NORMAL') AND (ARTISTA.SIT_PERFIL <> 'PRIVADO'))
GROUP BY
ARTISTA.ARTISTA_ID
ORDER BY
ARTISTA.ARTISTA_ID) AS countquery;
In this case, you may not need to select that many columns.
If you need both the total record count and the details, it is better to use two separate queries.