WHERE clause, on joined table, with multiple rows - mysql

I have an incidents table which has a 1 to many relationship with a few tables - mainly, for the context of this question, people.
Basically, one incident may have many people (involved).
At the moment, I'm retrieving the incident details - plus a concatenated comma-delimited string of people's IDs using this query:
SELECT
i.`ID` AS `id`,
i.`Author_ID` AS `author_id`,
i.`Description` AS `description`,
i.`Date` AS `date`,
i.`Datetime_Created` AS `created`,
p.`Title` AS `period`,
GROUP_CONCAT(DISTINCT ip.`Person_ID` ORDER BY FIELD(ip.`Involvement`, 'V', 'P', 'W') ASC SEPARATOR ',') AS `people_ids`,
( SELECT COUNT(`ID`) FROM `reports` r WHERE r.Incident_ID = i.ID ) AS `reports`,
i.`Status` AS `status`
FROM `incidents` i
LEFT JOIN `reports` ir ON ir.Incident_ID = i.ID
LEFT JOIN `people` ip ON ip.Incident_ID = i.ID
LEFT JOIN `periods` p ON i.Period_ID = p.ID
WHERE 1 NOT IN ( SELECT Category_ID FROM `categories_link` WHERE `Incident_ID` = i.ID )
GROUP BY i.ID
ORDER BY i.`Date` DESC, p.`ID` DESC
This works fine, and produces data like:
What I'm trying to do now is filter these reports so that only incidents where one of the people involved is a student from a certain year group are returned.
This information can be found by joining their IDs to the students table. The students table contains their ID and a Year_Group field.
One of the complexities is that some of the IDs from the people_involved table may not relate just to students - they could be staff, parents or other members of our community.
I don't want to exclude reports which have other people involved, as long as there is a student from a specific year group involved too.
I've written a query which seems to partially work:
SELECT
i.`ID` AS `id`,
i.`Author_ID` AS `author_id`,
i.`Description` AS `description`,
i.`Date` AS `date`,
i.`Datetime_Created` AS `created`,
p.`Title` AS `period`,
GROUP_CONCAT(DISTINCT ip.`Person_ID` ORDER BY FIELD(ip.`Involvement`, 'V', 'P', 'W') ASC SEPARATOR ',') AS `people_ids`,
( SELECT COUNT(`ID`) FROM `reports` r WHERE r.Incident_ID = i.ID ) AS `reports`,
i.`Status` AS `status`
FROM `incidents` i
LEFT JOIN `reports` ir ON ir.Incident_ID = i.ID
LEFT JOIN `people` ip ON ip.Incident_ID = i.ID
<< LEFT JOIN `student` stu ON ip.Person_ID = stu.db_id >>
LEFT JOIN `periods` p ON i.Period_ID = p.ID
WHERE 1 NOT IN ( SELECT Category_ID FROM `categories_link` WHERE `Incident_ID` = i.ID )
<< AND `stu`.`Year_Group` = 11 >>
GROUP BY i.ID
ORDER BY i.`Date` DESC, p.`ID` DESC
But I just can't imagine that a single simple JOIN would be sufficient for the task I'm trying to achieve.
I think a subquery might do it, but I don't know where to begin with that.
The code I would use to access this information (for year 7 students) without all of the necessary incidents data would be (I think):
SELECT DISTINCT( p.`Incident_ID` )
FROM `people` p
LEFT JOIN `student` stu ON p.Person_ID = stu.db_id
WHERE stu.Year_Group = 7
How do I bundle that into this code?

To get only the incidents that involve a student from a specific year group, use the following query.
SELECT p.Incident_ID
FROM people p
JOIN student stu ON p.Person_ID = stu.db_id
WHERE stu.Year_Group = 11
group by p.Incident_ID
Your original query returns the incidents and the people involved in each, so filter its incidents against the query above. This way you get every incident in which a student from the specific year group is involved, along with any other people involved in it. I think this will solve your problem.
SELECT
i.`ID` AS `id`,
i.`Author_ID` AS `author_id`,
i.`Description` AS `description`,
i.`Date` AS `date`,
i.`Datetime_Created` AS `created`,
p.`Title` AS `period`,
GROUP_CONCAT(DISTINCT ip.`Person_ID` ORDER BY FIELD(ip.`Involvement`, 'V', 'P', 'W') ASC SEPARATOR ',') AS `people_ids`,
( SELECT COUNT(`ID`) FROM `reports` r WHERE r.Incident_ID = i.ID ) AS `reports`,
i.`Status` AS `status`
FROM `incidents` i
LEFT JOIN `reports` ir ON ir.Incident_ID = i.ID
LEFT JOIN `people` ip ON ip.Incident_ID = i.ID
LEFT JOIN `periods` p ON i.Period_ID = p.ID
WHERE 1 NOT IN ( SELECT Category_ID FROM `categories_link` WHERE `Incident_ID` = i.ID )
AND i.ID IN -- the query from above goes here
(
SELECT p.Incident_ID
FROM people p
JOIN student stu ON p.Person_ID = stu.db_id
WHERE stu.Year_Group = 11
group by p.Incident_ID
)
GROUP BY i.ID
ORDER BY i.`Date` DESC, p.`ID` DESC
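
As a rough check of the difference between filtering in a subquery (as above) and filtering on the joined student row (as in the question's attempt), here is a minimal sketch. It uses Python's sqlite3 module as a stand-in for MySQL, and the tiny tables and sample rows are made up for illustration; only the table and column names come from the question.

```python
import sqlite3

# Hypothetical miniature of the schema; SQLite stands in for MySQL here.
con = sqlite3.connect(":memory:")
con.executescript("""
CREATE TABLE incidents (ID INTEGER PRIMARY KEY);
CREATE TABLE people (Incident_ID INTEGER, Person_ID INTEGER);
CREATE TABLE student (db_id INTEGER PRIMARY KEY, Year_Group INTEGER);
INSERT INTO incidents VALUES (1), (2);
-- Incident 1: a year 11 student (10) and a staff member (99).
-- Incident 2: a year 7 student (20) only.
INSERT INTO people VALUES (1, 10), (1, 99), (2, 20);
INSERT INTO student VALUES (10, 11), (20, 7);
""")

# Filter in a subquery: the incident qualifies and every person stays listed.
subquery_rows = con.execute("""
    SELECT i.ID, COUNT(DISTINCT ip.Person_ID) AS people_count
    FROM incidents i
    LEFT JOIN people ip ON ip.Incident_ID = i.ID
    WHERE i.ID IN (
        SELECT p.Incident_ID
        FROM people p
        JOIN student stu ON p.Person_ID = stu.db_id
        WHERE stu.Year_Group = 11
    )
    GROUP BY i.ID
""").fetchall()

# Filter on the joined student row: people who are not year 11 students
# are removed before grouping, so they vanish from the aggregate.
join_rows = con.execute("""
    SELECT i.ID, COUNT(DISTINCT ip.Person_ID) AS people_count
    FROM incidents i
    LEFT JOIN people ip ON ip.Incident_ID = i.ID
    LEFT JOIN student stu ON ip.Person_ID = stu.db_id
    WHERE stu.Year_Group = 11
    GROUP BY i.ID
""").fetchall()

print(subquery_rows)  # incident 1 with both people counted
print(join_rows)      # incident 1 with only the year 11 student counted
```

Both variants return the right incidents, but only the subquery version keeps staff, parents and other people in the aggregated list, which is what the question asks for.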

It looks to me like you want an OUTER JOIN on students.
LEFT OUTER JOIN `student` stu ON ip.Person_ID = stu.db_id
That will include all the incidents. Then, in the WHERE clause, add the filter
WHERE 1 NOT IN ( SELECT Category_ID FROM `categories_link` WHERE `Incident_ID` = i.ID ) AND `stu`.`Year_Group` = 7

Related

Check if ID exist in another table if yes return TRUE

I have a grocery database. Grocery items belong to a category or a sub-category and can be part of a favourite list.
When I read the list of items in a category, I would like to include a boolean value showing whether each item is in the favourite list.
A favourite is stored as a combination of Product_ID and USER_ID, with a primary key id.
I have failed to get the left join for the favourite list working.
I would really appreciate your support.
Database with data
https://wetransfer.com/downloads/853bd65b90f3a36b4d9264c018bbda9720190409083930/7d620e
Select a.*, ifnull(Deriv1.Count , 0) as Count, ifnull(Total1.PCount, 0) as PCount FROM `categories` a
LEFT OUTER JOIN (SELECT `parent`, COUNT(*) AS Count FROM `categories` GROUP BY `parent`) Deriv1 ON a.`id` = Deriv1.`parent`
LEFT OUTER JOIN (SELECT `category_id`,COUNT(*) AS PCount,
JOIN (SELECT id From Favorite Where userID='1' ) FROM `products` GROUP BY `category_id`) Total1 ON a.`id` = Total1.`category_id`
WHERE a.`parent`=" . $parent
I don't understand why you retrieve the two count columns, but if you just need to know whether a product is a favourite of user 1, you can use the FavouriteOrNot column from this query (it holds the favourite's id, or 0 when the product is not a favourite):
SELECT p.product_id, a.id, f.userID, IFNULL(Deriv1.Count , 0) as Count, IFNULL(Total1.PCount, 0) as PCount, IFNULL(f.id, 0) as FavouriteOrNot
FROM
`products` p
INNER JOIN
`categories` a
ON
p.`category_id` = a.`id`
LEFT OUTER JOIN
(SELECT `parent`, COUNT(*) AS Count FROM `categories` GROUP BY `parent`) Deriv1
ON
a.`id` = Deriv1.`parent`
LEFT OUTER JOIN
(SELECT `category_id`,COUNT(*) AS PCount FROM `products` GROUP BY `category_id`)Total1
ON
a.`id` = Total1.`category_id`
LEFT OUTER JOIN
`Favorite` f
ON
f.`ProductID` = p.`product_id`
AND
f.`userID` = 1
WHERE a.`parent`= 1
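
To see the favourite flag in isolation, here is a minimal sketch run through Python's sqlite3 module (a stand-in for MySQL). The sample rows and the is_favourite alias are invented for illustration. The point is that the user condition sits in the ON clause, so products without a favourite row for that user are kept and simply get 0.

```python
import sqlite3

# Hypothetical sample data; SQLite stands in for MySQL here.
con = sqlite3.connect(":memory:")
con.executescript("""
CREATE TABLE products (product_id INTEGER PRIMARY KEY, category_id INTEGER);
CREATE TABLE Favorite (id INTEGER PRIMARY KEY, ProductID INTEGER, userID INTEGER);
INSERT INTO products VALUES (1, 5), (2, 5), (3, 5);
-- User 1 likes product 1; user 2 likes product 2.
INSERT INTO Favorite (ProductID, userID) VALUES (1, 1), (2, 2);
""")

rows = con.execute("""
    SELECT p.product_id,
           (f.id IS NOT NULL) AS is_favourite  -- 1 if user 1 has a row, else 0
    FROM products p
    LEFT JOIN Favorite f
           ON f.ProductID = p.product_id
          AND f.userID = 1                     -- filter in ON, not in WHERE
    WHERE p.category_id = 5
    ORDER BY p.product_id
""").fetchall()

print(rows)
```

Moving `f.userID = 1` into the WHERE clause would drop products 2 and 3 entirely, because their `f.userID` is either NULL or belongs to another user.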

MySQL UNION query return duplicate values even using GROUP BY

I have the following query:
SELECT *
FROM (
SELECT
m.id AS id,
reference_id,
title,
created_by,
publish_up,
state
FROM z_news_master m
LEFT JOIN z_news_english c ON m.id = c.reference_id
WHERE c.created_by = 17152
ORDER by c.id DESC
) AS A
UNION
SELECT * FROM (
SELECT
m.id AS id,
reference_id,
title,
created_by,
publish_up,
state
FROM z_news_master m
LEFT JOIN z_news_spanish c ON m.id = c.reference_id
WHERE c.created_by = 17152
ORDER by c.id DESC
) AS B
GROUP BY id
Basically, I have 3 tables (z_news_master, z_news_english, z_news_spanish), to store News in Spanish or English languages. The z_news_master table contains the generic news information, the z_news_english and z_news_spanish contain the news in its respective language.
I need to get a list of the news. If a news item is in both language tables it should be returned only once (not duplicated). The code above does the work, but if there is a news item in both English and Spanish, the record gets duplicated.
I'd also like to know why the GROUP BY id and the GROUP BY reference_id don't work?
Use a NOT EXISTS subquery to remove a fallback language row if a corresponding row for the preferred language exists. Assuming the preferred language is "english", the query would be:
SELECT
m.id AS id,
reference_id,
title,
created_by,
publish_up,
state
FROM z_news_master m
JOIN z_news_english c ON m.id = c.reference_id
WHERE c.created_by = 17152
UNION ALL
SELECT
m.id AS id,
reference_id,
title,
created_by,
publish_up,
state
FROM z_news_master m
JOIN z_news_spanish c ON m.id = c.reference_id
WHERE c.created_by = 17152
AND NOT EXISTS (
SELECT *
FROM z_news_english e
WHERE e.reference_id = m.id
AND e.created_by = c.created_by
)
ORDER by id DESC
Note that there is no need for GROUP BY. And a LEFT JOIN doesn't make sense because you have a WHERE condition on a column from the right table (which would convert the LEFT JOIN to INNER JOIN).
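
A minimal sketch of the preferred-language fallback, run through Python's sqlite3 module as a stand-in for MySQL. The table names follow the question; the reduced column set and the sample rows are assumptions made up for the example.

```python
import sqlite3

# Hypothetical sample data; SQLite stands in for MySQL here.
con = sqlite3.connect(":memory:")
con.executescript("""
CREATE TABLE z_news_master (id INTEGER PRIMARY KEY);
CREATE TABLE z_news_english (reference_id INTEGER, title TEXT, created_by INTEGER);
CREATE TABLE z_news_spanish (reference_id INTEGER, title TEXT, created_by INTEGER);
INSERT INTO z_news_master VALUES (1), (2);
-- News 1 exists in both languages, news 2 only in Spanish.
INSERT INTO z_news_english VALUES (1, 'Hello', 17152);
INSERT INTO z_news_spanish VALUES (1, 'Hola', 17152), (2, 'Adios', 17152);
""")

rows = con.execute("""
    SELECT m.id AS id, c.title AS title
    FROM z_news_master m
    JOIN z_news_english c ON m.id = c.reference_id
    WHERE c.created_by = 17152
    UNION ALL
    SELECT m.id AS id, c.title AS title
    FROM z_news_master m
    JOIN z_news_spanish c ON m.id = c.reference_id
    WHERE c.created_by = 17152
      AND NOT EXISTS (            -- skip Spanish rows that have an English twin
          SELECT 1
          FROM z_news_english e
          WHERE e.reference_id = m.id
            AND e.created_by = c.created_by
      )
    ORDER BY id DESC
""").fetchall()

print(rows)
```

Each news item appears once: the English row when it exists, otherwise the Spanish one.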
You can do this without using a union. The statement below gets all rows from the master table, and joins any existing rows from both the spanish and english tables.
If a row exists in the spanish table, it uses the values from that table. If a row exists in the english table, and not the spanish table, it uses the values from that table.
If no matching row exists in either the english or spanish table, it returns columns from the master table.
You can alter the priorities by changing the order of the WHEN's.
SELECT
CASE
WHEN NOT s.id IS NULL THEN s.id
WHEN NOT e.id IS NULL THEN e.id
ELSE m.id END AS `id`,
CASE
WHEN NOT s.reference_id IS NULL THEN s.reference_id
WHEN NOT e.reference_id IS NULL THEN e.reference_id
ELSE m.reference_id END AS `reference_id`,
CASE
WHEN NOT s.title IS NULL THEN s.title
WHEN NOT e.title IS NULL THEN e.title
ELSE m.title END AS `title`,
CASE
WHEN NOT s.created_by IS NULL THEN s.created_by
WHEN NOT e.created_by IS NULL THEN e.created_by
ELSE m.created_by END AS `created_by`,
CASE
WHEN NOT s.publish_up IS NULL THEN s.publish_up
WHEN NOT e.publish_up IS NULL THEN e.publish_up
ELSE m.publish_up END AS `publish_up`,
CASE
WHEN NOT s.state IS NULL THEN s.state
WHEN NOT e.state IS NULL THEN e.state
ELSE m.state END AS `state`
FROM z_news_master m
LEFT JOIN z_news_spanish s ON m.id = s.reference_id
LEFT JOIN z_news_english e ON m.id = e.reference_id
WHERE m.created_by = 17152
GROUP BY m.id
ORDER BY m.id DESC
EDIT
Per Paul Spiegel's comment here's an even shorter version:
SELECT
COALESCE(s.id, e.id, m.id) AS `id`,
COALESCE(s.reference_id, e.reference_id, m.reference_id) AS `reference_id`,
COALESCE(s.title, e.title, m.title) AS `title`,
COALESCE(s.created_by, e.created_by, m.created_by) AS `created_by`,
COALESCE(s.publish_up, e.publish_up, m.publish_up) AS `publish_up`,
COALESCE(s.state, e.state, m.state) AS `state`
FROM z_news_master m
LEFT JOIN z_news_spanish s ON m.id = s.reference_id
LEFT JOIN z_news_english e ON m.id = e.reference_id
WHERE m.created_by = 17152
GROUP BY m.id
ORDER BY m.id DESC
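
For a quick feel of how the COALESCE fallback behaves, here is a minimal sketch using Python's sqlite3 module as a stand-in for MySQL. The table names follow the question, while the single title column and the sample rows are assumptions for illustration.

```python
import sqlite3

# Hypothetical sample data; SQLite stands in for MySQL here.
con = sqlite3.connect(":memory:")
con.executescript("""
CREATE TABLE z_news_master (id INTEGER PRIMARY KEY);
CREATE TABLE z_news_english (reference_id INTEGER, title TEXT);
CREATE TABLE z_news_spanish (reference_id INTEGER, title TEXT);
INSERT INTO z_news_master VALUES (1), (2), (3);
-- News 1: both languages. News 2: English only. News 3: no translation.
INSERT INTO z_news_english VALUES (1, 'Hello'), (2, 'Bye');
INSERT INTO z_news_spanish VALUES (1, 'Hola');
""")

rows = con.execute("""
    SELECT m.id,
           COALESCE(s.title, e.title) AS title  -- Spanish first, then English
    FROM z_news_master m
    LEFT JOIN z_news_spanish s ON m.id = s.reference_id
    LEFT JOIN z_news_english e ON m.id = e.reference_id
    ORDER BY m.id
""").fetchall()

print(rows)
```

One thing to keep in mind: COALESCE picks a value per column, so if the preferred-language row exists but has a NULL in one column, that column falls through to the other language while the rest do not.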

this type of clause was previously parsed

I have the following query working fine.
SELECT d.customer_id,
d.fname,
d.lname,
m.lastDate,
(SELECT COUNT(order_id)
FROM `orders`
WHERE `customer_id`=d.customer_id
) AS 'total_orders',
d.isActive
FROM customers d
JOIN `orders` m ON m.order_id=
(SELECT order_id
FROM `orders`
WHERE customer_id=d.customer_id
ORDER BY order_id DESC LIMIT 1
)
WHERE d.user_id=382
AND d.customer_id NOT IN
(SELECT `customer_id`
FROM `orders`
WHERE `balance`>0
AND `isActive`=1
)
The above query works fine, but when I add a UNION query to also include customers that have not placed any orders, it does not work.
SELECT d.customer_id,
d.fname,
d.lname,
m.lastDate,
(SELECT COUNT(order_id)
FROM `orders`
WHERE `customer_id`=d.customer_id
) AS 'total_orders',
d.isActive
FROM customers d
JOIN `orders` m ON m.order_id=
(SELECT order_id
FROM `orders`
WHERE customer_id=d.customer_id
ORDER BY order_id DESC LIMIT 1
)
WHERE d.user_id=382
AND d.customer_id NOT IN
(SELECT `customer_id`
FROM `orders`
WHERE `balance`>0
AND `isActive`=1
)
UNION
#customer WITH NO ORDERS
SELECT `customer_id`,`fname`,`lname`,`state`,`city`,`isActive`
FROM `customers`
WHERE `user_id`=382
AND `isActive` >-1
AND `customer_id` NOT IN
(SELECT `customer_id`
FROM `orders`
)
It displays this error in phpMyAdmin:
This type of clause was previously parsed (near select)
Based on your comment, I don't believe you want to use union -- you want to use an outer join instead. Here's a slightly simplified version of your query utilizing joins instead of all those correlated subqueries.
SELECT d.customer_id, d.fname, d.lname, d.isactive,
o.lastdate,
-- the count comes from the per-customer derived table; counting o2 would
-- always give 0 because the WHERE clause only keeps rows where o2 is NULL
COALESCE(m.total_orders, 0) AS 'total_orders'
FROM customers d
LEFT JOIN (SELECT MAX(order_id) order_id, COUNT(*) total_orders, customer_id
FROM orders
GROUP BY customer_id) m on d.customer_id = m.customer_id
LEFT JOIN orders o on m.order_id = o.order_id
LEFT JOIN orders o2 on d.customer_id = o2.customer_id
AND o2.balance > 0 AND o2.isactive = 1
WHERE d.user_id = 382
AND o2.customer_id IS NULL
BTW -- Reviewing your edits, when using union statements, each select list must have the same number of columns, and corresponding columns should have compatible types. The state and city fields in the second query probably don't have the same data type as the lastDate and count fields from the first.
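
To illustrate how an outer join keeps customers without orders in the result, here is a minimal sketch using Python's sqlite3 module as a stand-in for MySQL. The reduced tables and the sample rows are assumptions made up for the example.

```python
import sqlite3

# Hypothetical sample data; SQLite stands in for MySQL here.
con = sqlite3.connect(":memory:")
con.executescript("""
CREATE TABLE customers (customer_id INTEGER PRIMARY KEY, fname TEXT);
CREATE TABLE orders (order_id INTEGER PRIMARY KEY, customer_id INTEGER, lastDate TEXT);
INSERT INTO customers VALUES (1, 'Ann'), (2, 'Bob');
-- Ann has two orders, Bob has none.
INSERT INTO orders VALUES (10, 1, '2019-01-01'), (11, 1, '2019-02-01');
""")

rows = con.execute("""
    SELECT d.customer_id,
           o.lastDate,
           COALESCE(m.total_orders, 0) AS total_orders
    FROM customers d
    LEFT JOIN (                       -- one row per customer that has orders
        SELECT customer_id,
               MAX(order_id) AS order_id,
               COUNT(*)      AS total_orders
        FROM orders
        GROUP BY customer_id
    ) m ON m.customer_id = d.customer_id
    LEFT JOIN orders o ON o.order_id = m.order_id   -- details of latest order
    ORDER BY d.customer_id
""").fetchall()

print(rows)
```

Bob still shows up, with a NULL last date and a count of 0, so no UNION branch for "customers with no orders" is needed.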

mysql query optimization steps or how to optimize query

I don't know much about query optimization but I know the order in which queries get executed
FROM clause
WHERE clause
GROUP BY clause
HAVING clause
SELECT clause
ORDER BY clause
This is the query I wrote:
SELECT
`main_table`.forum_id,
my_topics.topic_id,
(
SELECT MAX(my_posts.post_id) FROM my_posts WHERE my_topics.topic_id = my_posts.topic_id
) AS `maxpostid`,
(
SELECT my_posts.admin_user_id FROM my_posts WHERE my_topics.topic_id = my_posts.topic_id ORDER BY my_posts.post_id DESC LIMIT 1
) AS `admin_user_id`,
(
SELECT my_posts.user_id FROM my_posts WHERE my_topics.topic_id = my_posts.topic_id ORDER BY my_posts.post_id DESC LIMIT 1
) AS `user_id`,
(
SELECT COUNT(my_topics.topic_id) FROM my_topics WHERE my_topics.forum_id = main_table.forum_id ORDER BY my_topics.forum_id DESC LIMIT 1
) AS `topicscount`,
(
SELECT COUNT(my_posts.post_id) FROM my_posts WHERE my_topics.topic_id = my_posts.topic_id ORDER BY my_topics.topic_id DESC LIMIT 1
) AS `postcount`,
(
SELECT CONCAT(admin_user.firstname,' ',admin_user.lastname) FROM admin_user INNER JOIN my_posts ON my_posts.admin_user_id = admin_user.user_id WHERE my_posts.post_id = maxpostid ORDER BY my_posts.post_id DESC LIMIT 1
) AS `adminname`,
(
SELECT forum_user.nick_name FROM forum_user INNER JOIN my_posts ON my_posts.user_id = forum_user.user_id WHERE my_posts.post_id = maxpostid ORDER BY my_posts.post_id DESC LIMIT 1
) AS `nickname`,
(
SELECT CONCAT(ce1.value,' ',ce2.value) AS fullname FROM my_posts INNER JOIN customer_entity_varchar AS ce1 ON ce1.entity_id = my_posts.user_id INNER JOIN customer_entity_varchar AS ce2 ON ce2.entity_id=my_posts.user_id WHERE (ce1.attribute_id = 1) AND (ce2.attribute_id = 2) AND my_posts.post_id = maxpostid ORDER BY my_posts.post_id DESC LIMIT 1
) AS `fullname`
FROM `my_forums` AS `main_table`
LEFT JOIN `my_topics` ON main_table.forum_id = my_topics.forum_id
WHERE (forum_status = '1')
Now I want to know if there is any way to optimize it. All the logic is written in the SELECT section, not the FROM section, but I don't know how to write the same logic in the FROM section of the query.
Does it make any difference, or are both the same?
Thanks
Correlated subqueries should really be a last resort. They often end up being executed RBAR (row by agonizing row), and given that a number of your subqueries are very similar, getting the same result using joins is going to result in far fewer table scans.
The first thing I note is that almost all of your subqueries include the table my_posts, and most contain ORDER BY my_posts.post_id DESC LIMIT 1. Those that don't are counts with no GROUP BY, so the ORDER BY and LIMIT are redundant there anyway. My first step would therefore be to join to my_posts:
SELECT *
FROM my_forums AS f
LEFT JOIN my_topics AS t
ON f.forum_id = t.forum_id
LEFT JOIN
( SELECT topic_id, MAX(post_id) AS post_id
FROM my_posts
GROUP BY topic_id
) AS Maxp
ON Maxp.topic_id = t.topic_id
LEFT JOIN my_posts AS p
ON p.post_id = Maxp.post_id
WHERE forum_status = '1';
Here the subquery just ensures you get the latest post per topic_id. I have shortened your table aliases here for my convenience, I am not sure why you would use a table alias that is longer than the actual table name?
Now you have the bulk of your query you can start adding in your columns, in order to get the post count, I have added a count to the subquery Maxp, I have also had to add a few more joins to get some of the detail out, such as names:
SELECT f.forum_id,
t.topic_id,
p.post_id AS `maxpostid`,
p.admin_user_id,
p.user_id,
t2.topicscount,
Maxp.postcount,
CONCAT(au.firstname,' ',au.lastname) AS adminname,
fu.nick_name AS nickname,
CONCAT(ce1.value,' ',ce2.value) AS fullname
FROM my_forums AS f
LEFT JOIN my_topics AS t
ON f.forum_id = t.forum_id
LEFT JOIN
( SELECT topic_id,
MAX(post_id) AS post_id,
COUNT(*) AS postcount
FROM my_posts
GROUP BY topic_id
) AS Maxp
ON Maxp.topic_id = t.topic_id
LEFT JOIN my_posts AS p
ON p.post_id = Maxp.post_id
LEFT JOIN admin_user AS au
ON au.user_id = p.admin_user_id
LEFT JOIN forum_user AS fu
ON fu.user_id = p.user_id
LEFT JOIN customer_entity_varchar AS ce1
ON ce1.entity_id = p.user_id
AND ce1.attribute_id = 1
LEFT JOIN customer_entity_varchar AS ce2
ON ce2.entity_id = p.user_id
AND ce2.attribute_id = 2
LEFT JOIN
( SELECT forum_id, COUNT(*) AS topicscount
FROM my_topics
GROUP BY forum_id
) AS t2
ON t2.forum_id = f.forum_id
WHERE forum_status = '1';
I am not familiar with your schema so the above may need some tweaking, but the principle remains - use JOINs over sub-selects.
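
The core of the rewrite is the derived table that finds the latest post per topic once, instead of once per output row. A minimal sketch of just that part, run through Python's sqlite3 module as a stand-in for MySQL, with made-up sample rows and a reduced column set:

```python
import sqlite3

# Hypothetical sample data; SQLite stands in for MySQL here.
con = sqlite3.connect(":memory:")
con.executescript("""
CREATE TABLE my_topics (topic_id INTEGER PRIMARY KEY, forum_id INTEGER);
CREATE TABLE my_posts (post_id INTEGER PRIMARY KEY, topic_id INTEGER, user_id INTEGER);
INSERT INTO my_topics VALUES (1, 1), (2, 1);
-- Topic 1 has posts 1 and 2, topic 2 has post 3.
INSERT INTO my_posts VALUES (1, 1, 7), (2, 1, 8), (3, 2, 9);
""")

rows = con.execute("""
    SELECT t.topic_id,
           p.post_id AS maxpostid,
           p.user_id,
           Maxp.postcount
    FROM my_topics AS t
    LEFT JOIN (                     -- latest post id and post count per topic
        SELECT topic_id,
               MAX(post_id) AS post_id,
               COUNT(*)     AS postcount
        FROM my_posts
        GROUP BY topic_id
    ) AS Maxp ON Maxp.topic_id = t.topic_id
    LEFT JOIN my_posts AS p         -- full row of that latest post
           ON p.post_id = Maxp.post_id
    ORDER BY t.topic_id
""").fetchall()

print(rows)
```

Every further detail (admin name, nickname and so on) can then hang off `p` with ordinary joins, as in the full query above.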
The next stage of optimisation I would do is to get rid of your customer_entity_varchar table, or at least stop using it to store things as basic as first name and last name. The Entity-Attribute-Value model is an SQL antipattern, if you added two columns, FirstName and LastName to your forum_user table you would immediately lose two joins from your query. I won't get too involved in the EAV vs Relational debate as this has been extensively discussed a number of times, and I have nothing more to add.
The final stage would be to add appropriate indexes. You are in the best position to decide what is appropriate; I'd suggest you probably want indexes on at least the foreign keys in each table, possibly more.
EDIT
To get one row per forum_id you would need to use the following:
SELECT f.forum_id,
t.topic_id,
p.post_id AS `maxpostid`,
p.admin_user_id,
p.user_id,
MaxT.topicscount,
MaxT.postcount,
CONCAT(au.firstname,' ',au.lastname) AS adminname,
fu.nick_name AS nickname,
CONCAT(ce1.value,' ',ce2.value) AS fullname
FROM my_forums AS f
LEFT JOIN
( SELECT t.forum_id,
COUNT(DISTINCT t.topic_id) AS topicscount,
COUNT(*) AS postCount,
MAX(t.topic_ID) AS topic_id
FROM my_topics AS t
INNER JOIN my_posts AS p
ON p.topic_id = t.topic_id
GROUP BY t.forum_id
) AS MaxT
ON MaxT.forum_id = f.forum_id
LEFT JOIN my_topics AS t
ON t.topic_ID = MaxT.topic_ID
LEFT JOIN
( SELECT topic_id, MAX(post_id) AS post_id
FROM my_posts
GROUP BY topic_id
) AS Maxp
ON Maxp.topic_id = t.topic_id
LEFT JOIN my_posts AS p
ON p.post_id = Maxp.post_id
LEFT JOIN admin_user AS au
ON au.user_id = p.admin_user_id
LEFT JOIN forum_user AS fu
ON fu.user_id = p.user_id
LEFT JOIN customer_entity_varchar AS ce1
ON ce1.entity_id = p.user_id
AND ce1.attribute_id = 1
LEFT JOIN customer_entity_varchar AS ce2
ON ce2.entity_id = p.user_id
AND ce2.attribute_id = 2
WHERE forum_status = '1';

getting data from multiple tables in mysql

My goal is to get from the following tables - the user's unique group names and ids, the latest comments for the user's groups, the latest "done" article for the user's groups, the SUM of done articles, and total articles. Basically what is presented in the bottom sheet.
So far I've managed to get the data from the groups table and from the articles, but I can't get the latest comment.
Here is my query
SELECT `groups`.`name` , `groups`.`id` , (
SELECT MAX( `articles`.`written` )
FROM `articles`
WHERE `group` = `groups`.`id`
AND `articles`.`done` = '1'
) AS latestArt, (
SELECT MAX( `comments`.`date_added` )
FROM `comments`
WHERE `comments`.`article_id` = `a`.`id`
AND `comments`.`active` = '1'
) AS latestComm, SUM( `a`.`done` = '1' ) articlesAchieved, COUNT( `a`.`id` ) AS totalArticles
FROM `groups`
LEFT JOIN `articles` AS `a` ON `a`.`group` = `groups`.`id`
LEFT JOIN `comments` AS `c` ON `c`.`note_id` = `a`.`id`
WHERE `groups`.`user_id` = '6'
AND `n`.`active` = '1'
GROUP BY `groups`.`id`
I've also tried to get the data by joining everything to the article table but I wasn't successful with that either :(
UPDATED Your query might look like this
SELECT g.id group_id, g.name group_name,
a.last_written, a.total_articles, a.total_done,
c.last_comment
FROM `groups` g LEFT JOIN
(
SELECT `group`,
MAX(CASE WHEN done = 1 THEN written END) last_written,
COUNT(*) total_articles,
SUM(done) total_done
FROM articles
WHERE active = 1
AND user_id = 1
GROUP BY `group`
) a
ON g.id = a.`group` LEFT JOIN
(
SELECT a.`group`,
MAX(date_added) last_comment
FROM comments c JOIN articles a
ON c.article_id = a.id
WHERE a.active = 1
AND a.user_id = 1
GROUP BY a.`group`
) c
ON g.id = c.`group`
WHERE g.user_id = 1
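
The reason for aggregating articles and comments in separate derived tables is that joining both directly multiplies rows and inflates the counts. A minimal sketch of that effect, run through Python's sqlite3 module as a stand-in for MySQL, with made-up sample rows and a reduced column set:

```python
import sqlite3

# Hypothetical sample data; SQLite stands in for MySQL here.
con = sqlite3.connect(":memory:")
con.executescript("""
CREATE TABLE `groups` (id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE articles (id INTEGER PRIMARY KEY, `group` INTEGER, done INTEGER);
CREATE TABLE comments (id INTEGER PRIMARY KEY, article_id INTEGER, date_added TEXT);
INSERT INTO `groups` VALUES (1, 'news');
INSERT INTO articles VALUES (1, 1, 1), (2, 1, 0);
-- Article 1 has three comments, article 2 has none.
INSERT INTO comments VALUES
    (1, 1, '2013-01-01'), (2, 1, '2013-01-02'), (3, 1, '2013-01-03');
""")

# Joining everything first: article 1 is repeated once per comment.
naive = con.execute("""
    SELECT g.id, COUNT(a.id) AS total_articles
    FROM `groups` g
    LEFT JOIN articles a ON a.`group` = g.id
    LEFT JOIN comments c ON c.article_id = a.id
    GROUP BY g.id
""").fetchall()

# Aggregating each child table on its own, then joining one row per group.
pre_aggregated = con.execute("""
    SELECT g.id, a.total_articles, c.last_comment
    FROM `groups` g
    LEFT JOIN (
        SELECT `group`, COUNT(*) AS total_articles
        FROM articles
        GROUP BY `group`
    ) a ON a.`group` = g.id
    LEFT JOIN (
        SELECT a.`group`, MAX(c.date_added) AS last_comment
        FROM comments c
        JOIN articles a ON c.article_id = a.id
        GROUP BY a.`group`
    ) c ON c.`group` = g.id
""").fetchall()

print(naive)           # article count inflated by the comment rows
print(pre_aggregated)  # correct article count plus the latest comment date
```

The naive version reports 4 articles for a group that has 2, while the pre-aggregated version returns the correct count together with the latest comment date.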