Can't concat values with GROUP_CONCAT on joins - mysql

Could anybody help me?
With this query I am getting the ids, but it is not putting the separators when subscriber_data.fieldid is null. For example instead of coming 2,,12 it comes 2,12 when the value for 4 is null...
I think the problem is on the Join with subquery, but i couldn't make it with two left joins in the main query also...
This is the query im using:
SELECT
list_subscribers.emailaddress,
(SELECT
GROUP_CONCAT(IFNULL(customfields.fieldid,'') SEPARATOR '","')
FROM customfields
LEFT JOIN subscribers_data
ON subscribers_data.fieldid = customfields.fieldid
WHERE
customfields.fieldid IN (2,4,12,13,14,15,17,19,20,21,22,23,16,26,27)
AND
list_subscribers.subscriberid = subscribers_data.subscriberid
) AS data FROM list_subscribers
Thanks everyone.

The left join is useless. Because you have a condition on subscriber_data in the WHERE clause, that subquery will not return those rows for which there is no matching subscriber_data, so it effectively works as if you used INNER JOIN. You should add that condition to the left join condition, but it is impossible in this query layout. Values from the outer query are not allowed in join conditions in the inner query.
You could change it, but apparently you need to join three tables, where the middle table, subscriber_data, that links them all together, is optional. That doesn't really make sense.
Or maybe customfields is the table that is optional, but in that case, you should have reversed the table or used a RIGHT JOIN.
In conclusion, I think you meant to write this:
select
s.emailaddress,
GROUPCONCAT(IFNULL(f.fieldid, '') SEPARATOR '","')
from
list_subscribers s
inner join subscribers_data d on d.subscriberid = s.subscriberid
left join customfields f on f.fieldid = d.fieldid
where
d.fieldid in (2,4,12,13,14,15,17,19,20,21,22,23,16,26,27)
group by
s.emailaddress
Or do you want to list the id's of the fields that are filled for the subscriber(s)? In that case, it would be:
select
s.emailaddress,
GROUPCONCAT(IFNULL(d.fieldid, '') SEPARATOR '","')
from
list_subscribers s
cross join customfields f
left join subscribers_data d on
d.subscriberid = s.subscriberid and
d.fieldid = f.fieldid
where
f.fieldid in (2,4,12,13,14,15,17,19,20,21,22,23,16,26,27)
group by
s.emailaddress

Related

FULL table scan in LEFT JOIN using OR

How can this query be optimized to avoid the full table scan described below?
I've got a slow query that's taking approximately 15 seconds to return.
Let's get this part out of the way - I've confirmed all indexes are there.
When I run EXPLAIN, it shows that there is a FULL TABLE scan ran on the crosswalk table (the index for fromQuestionCategoryJoinID is not used, even if I attempt to force) - if I remove either of the fields and the OR, the index is used and query completes in milliseconds.
SELECT c.id, c.name, GROUP_CONCAT(DISTINCT tags.externalDisplayID SEPARATOR ', ') AS tags
FROM checklist c
LEFT JOIN questionchecklistjoin qcheckj on qcheckj.checklistID = c.id
LEFT JOIN questioncategoryjoin qcatj ON qcatj.questionID = qcheckj.questionID
LEFT JOIN questioncategoryjoin qcatjsub on qcatjsub.parentQuestionID = qcatj.questionID
LEFT JOIN crosswalk cw on (cw.fromQuestionCategoryJoinID = qcatj.id OR cw.fromQuestionCategoryJoinID = qcatjsub.id)
-- index used if I remove OR, eg.: LEFT JOIN crosswalk cw on (cw.fromQuestionCategoryJoinID = qcatj.id)
LEFT JOIN questioncategoryjoin qcj1 on qcj1.id = cw.toQuestionCategoryJoinID
LEFT JOIN question tags on tags.id = qcj1.questionID
GROUP BY c.id
ORDER BY c.name, tags.externalDisplayID;
Split the query into two queries for each part of the OR. Then combine them with UNION.
SELECT id, name, GROUP_CONCAT(DISTINCT externalDisplayID SEPARATOR ', ') AS tags
FROM (
SELECT c.id, c.name, tags.externalDisplayID
FROM checklist c
LEFT JOIN questionchecklistjoin qcheckj on qcheckj.checklistID = c.id
LEFT JOIN questioncategoryjoin qcatj ON qcatj.questionID = qcheckj.questionID
LEFT JOIN crosswalk cw on cw.fromQuestionCategoryJoinID = qcatj.id
LEFT JOIN questioncategoryjoin qcj1 on qcj1.id = cw.toQuestionCategoryJoinID
LEFT JOIN question tags on tags.id = qcj1.questionID
UNION ALL
SELECT c.id, c.name, tags.externalDisplayID
FROM checklist c
LEFT JOIN questionchecklistjoin qcheckj on qcheckj.checklistID = c.id
LEFT JOIN questioncategoryjoin qcatj ON qcatj.questionID = qcheckj.questionID
LEFT JOIN questioncategoryjoin qcatjsub on qcatjsub.parentQuestionID = qcatj.questionID
LEFT JOIN crosswalk cw on cw.fromQuestionCategoryJoinID = qcatjsub.id
LEFT JOIN questioncategoryjoin qcj1 on qcj1.id = cw.toQuestionCategoryJoinID
LEFT JOIN question tags on tags.id = qcj1.questionID
) AS x
GROUP BY x.id
ORDER BY x.name
Also, it doesn't make sense to include externalDisplayID in ORDER BY, because that will order by its value from a random row in the group. You could put ORDER BY externalDisplayID in the GROUP_CONCAT() arguments if that's what you want.
There is a second inefficiency going on here. I call it "explode-implode". First a bunch of JOINs (potentially) expand the number of rows in an intermediate table, then GROUP BY c.id collapses the number of rows back to what you started with (one row of output per row of checkpoint).
Before trying to help with that, please answer:
Is LEFT really needed?
How many rows in each table? (Especially in cw)
Can you get rid of DISTINCT?
Barmar's answer can possibly be improved upon by delaying the JOINs to qcj1andtagsuntil after theUNION`:
SELECT ...
FROM ( SELECT ...
FROM first few tables
UNION ALL
SELECT ...
FROM first few tables
) AS u
[LEFT] JOIN qcj1
[LEFT] JOIN tags
GROUP BY ...
ORDER BY ...
Another optimization (again building on Barmar's)
GROUP BY x.id
ORDER BY x.name
-->
GROUP BY x.name, x.id
ORDER BY x.name, x.id
When the items in GROUP BY and ORDER BY are the "same", they can be done in a single action, thereby saving (at least) a sort.
x.name, x.id is deterministic, where as x.name might put two rows with the same name in a different order, depending (perhaps) on the phase of the moon.
These indexes may help:
qcheckj: INDEX(checklistID, questionID)
qcatj: INDEX(questionID, id)
qcatjsub: INDEX(parentQuestionID, id)
cw: INDEX(fromQuestionCategoryJoinID, toQuestionCategoryJoinID)

Not getting the right results using GROUP_CONCAT in query

I have 7 tables to work with inside a query:
tb_post, tb_spots, users, td_sports, tb_spot_types, tb_users_sports, tb_post_media
This is the query I am using:
SELECT po.id_post AS id_post,
po.description_post as description_post,
sp.id_spot as id_spot,
po.date_post as date_post,
u.id AS userid,
u.user_type As tipousuario,
u.username AS username,
spo.id_sport AS sportid,
spo.sport_icon as sporticon,
st.logo_spot_type as spottypelogo,
sp.city_spot AS city_spot,
sp.country_spot AS country_spot,
sp.latitud_spot as latitudspot,
sp.longitud_spot as longitudspot,
sp.short_name AS spotshortname,
sp.verified_spot AS spotverificado,
u.profile_image AS profile_image,
sp.verified_spot_by as spotverificadopor,
uv.id AS spotverificador,
uv.user_type AS spotverificadornivel,
pm.media_type AS mediatype,
pm.media_file AS mediafile,
GROUP_CONCAT(tus.user_sport_sport) sportsdelusuario,
GROUP_CONCAT(logosp.sport_icon) sportsdelusuariologos,
GROUP_CONCAT(pm.media_file) mediapost,
GROUP_CONCAT(pm.media_type) mediaposttype
FROM tb_posts po
LEFT JOIN tb_spots sp ON po.spot_post = sp.id_spot
LEFT JOIN users u ON po.uploaded_by_post = u.id
LEFT JOIN tb_sports spo ON sp.sport_spot = spo.id_sport
LEFT JOIN tb_spot_types st ON sp.type_spot = st.id_spot_type
LEFT JOIN users uv ON sp.verified_spot_by = uv.id
LEFT JOIN tb_users_sports tus ON tus.user_sport_user = u.id
LEFT JOIN tb_sports logosp ON logosp.id_sport = tus.user_sport_sport
LEFT JOIN tb_post_media pm ON pm.media_post = po.id_post
WHERE po.status = 1
GROUP BY po.id_post,uv.id
I am having problems with some of the GROUP_CONCAT groups:
GROUP_CONCAT(tus.user_sport_sport) sportsdelusuario is giving me the right items but repeated, all items twice
GROUP_CONCAT(logosp.sport_icon) sportsdelusuariologos is giving me the right items but repeated, all items twice
GROUP_CONCAT(pm.media_file) mediapost is giving me the right items but repeated four times
GROUP_CONCAT(pm.media_type) mediaposttype s giving me the right items but repeated four times
I can put here all tables structures if you need them.
Multiple one-to-many relations JOINed in a query have a multiplicative affect on aggregation results; the standard solution is subqueries:
You can change
GROUP_CONCAT(pm.media_type) mediaposttype
...
LEFT JOIN tb_post_media pm ON pm.media_post = po.id_post
to
pm.mediaposttype
...
LEFT JOIN (
SELECT media_post, GROUP_CONCAT(media_type) AS mediaposttype
FROM tb_post_media
GROUP BY media_post
) AS pm ON pm.media_post = po.id_post
If tb_post_media is very big, and the po.status = 1 condition in the outer query would significantly reduce the results of the subquery, it can be worth replicating the original join within the subquery to filter down it's results.
Similarly, the correlated version I mentioned in the comments can also be more performant if the outer query has relatively few results. (Calculating the GROUP_CONCAT() for each individually can cost less than calculating it for all once if you would only actually using very few of the results of the latter).
or just add DISTINCT to all the group_concat, e.g., GROUP_CONCAT(DISTINCT pm.media_type)

how to join SQL with 3 tables

I'm trying to study SQL.
I have a problem with JOIN
I want to display ref_id, pro_name, class_name but I couldn't.
I find EFFICIENT solution.
MY QUERY (DOESN'T WORK)
SELECT
ref_id, pro_name, class_name
FROM
RC, RP, PP, LP
WHERE
RC.ref_id = RP.ref_id
Avoid using commas be CROSS JOIN
You could use JOIN to instead of commas
like this.
SELECT
RP.ref_id, PP.pro_name, LP.class_name
FROM
RP
LEFT JOIN RC ON RC.ref_id = RP.ref_id
LEFT JOIN PP ON PP.pro_id = RP.pro_id
LEFT JOIN LP ON LP.lec_id = RP.lec_id
Never use commas in the FROM clause. Always use proper, explicit, standard JOIN syntax.
You would seem to want:
select rp.pro_id, pp.pro_name, lp.class_name
from rp left join
pp
on rp.pro_id = pp.pro_id left join
lp
on rp.lec_id = lp.lec_id;
Note the use of left join. This ensure that all rows are in the result set, even when one or the other joins doesn't find a matching record.
From what I can see, the table rc is not needed to answer this specific question.

Query with joins

I am running a query:
select course.course,iars.id,
students.rollno,
students.name as name,
teachers.name as tname,
students.studentid,
attndata.studentid ,sum(attndata.obt) as obt
sum(attndata.benefits) as ben , (sum(attndata.max)) as abc
from groups, students
left join iars
on iars.id
left join str
on str.studentid=students.studentid
left join course
on course.c_id=students.course
left join teachers
on teachers.id=iars.teacherid
join sgm
on sgm.studentid=students.studentid
left join attndata
on attndata.studentid=students.studentid and iars.id=attndata.iarsid
left join sps
on sps.studentid=students.studentid and iars.paperid=sps.paperid
left join semdef
on semdef.semesterid=str.semesterid
where students.course='1'
and students.status='regular'
and sps.paperid='5'
and iars.courseid=students.course
and iars.semester=str.semesterid
and semdef.month=9
and iars.paperid='5'
and str.semesterid='1'
and str.sessionid='12'
and groups.id=sgm.groupid
group by sps.studentid,
teachers.id,
semdef.month
order by
students.name
In this query whenever I am having left join on semdef.id=attndata.mon, I am getting zero result when the value of semdef.id=null but I want all the results, irrespective of semdef, but I want to use it. As in it should fetch result, if the values are null. Can you please help it out.
It's probably because your where clause is saying
and semdef.month=9
and you probably want
and (semdef.month=9 OR semdef.id IS NULL)
or something similar.
It's because your where clause has statements relating to the semdef table. Add these to the join clause as putting these in the where is implying an inner join.
Eg:
Left join semdef on xxx and semdef.id = attndata.min

MySQL - LEFT JOIN failing when WHERE clause added

I have 4 tables as follows; SCHEDULES, SCHEDULE_OVERRIDE, SCHEDULE_LOCATION_OVERRIDES and LOCATION
I need to return ALL rows from all tables so running this query works fine, adding NULL values for any values that are not present:
SELECT.....
FROM (schedule s LEFT JOIN schedule_override so ON so.schedule_id = s.id)
LEFT JOIN schedule_location_override slo ON slo.schedule_override_id = so.id
LEFT JOIN location l ON slo.location_id = l.id
ORDER BY s.id, so.id, slo.id, l.id
I then need to restict results on the schedule_override end_date field. My problem is, as soon as I do this, no results for the SCHEDULE table are returned at all. I need all schedules to be returned, even if the overrides end_date criteria is not met.
Heres what I am using:
SELECT.....
FROM (schedule s LEFT JOIN schedule_override so ON so.schedule_id = s.id)
LEFT JOIN schedule_location_override slo ON slo.schedule_override_id = so.id
LEFT JOIN location l ON slo.location_id = l.id
WHERE so.end_date > '2011-01-30' OR so.end_date IS NULL
ORDER BY s.id, so.id, slo.id, l.id
Appreciate any thoughts/comments.
Best regards, Ben.
Have you tried putting it in the ON clause?
SELECT.....
FROM (schedule s LEFT JOIN schedule_override so ON so.schedule_id = s.id AND (so.end_date > '2011-01-30' OR so.end_date IS NULL))
LEFT JOIN schedule_location_override slo ON slo.schedule_override_id = so.id
LEFT JOIN location l ON slo.location_id = l.id
ORDER BY s.id, so.id, slo.id, l.id
That's a quite common mistake with outer Joins.
You need to put everything that limits the Join into the "ON" part for that table, otherwise you are effectively transforming the join to an inner one.
So move the WHERE clause in this case into the ON-part of the schedule_override and you should be fine.
Yes, when you left join, it could be that a row is not found, and the field is NULL in the result. When you add a condition in the WHERE clause, the value must match that condition, which it won't if it's NULL.
That shouldn't be a problem, because you explicitly check for NULL, so I don't really know why this condition fails, unless it does return a date, but that date is befor 2011-01-30.
Anyway, you could try to move the condition to the join. It will eliminate the need to check for NULL, although it shouldn't make a difference really.
SELECT.....
FROM
schedule s
LEFT JOIN schedule_override so
ON so.schedule_id = s.id
AND so.end_date > '2011-01-30'
...