MySQL: count matching rows in second table

I want to list all teams and count how many times each team appears in my second table. Some teams are not in the second table, so their count should be zero. The problem is that when I use the count function, only teams that appear in the second table are listed. How do I count, and show 0 for teams that don't appear in the second table?
$query = "SELECT t.id as id, t.t_name as name, t.t_city as city, (count(pd.rs)) as pd FROM #__bl_regions as r, #__bl_teams as t, #__bl_paid as pd WHERE t.id != 0 AND t.id != 1 AND (t.id IN($teams)) AND r.id = ".$t_id." AND pd.rs = 1 AND pd.t_id = ".$t_id." ORDER BY t.t_name";
$db->setQuery($query);
$players = $db->loadObjectList();
Tried Left Join
OK, so because I am including 3 tables I believe I have to use 2 queries. The same thing still happens: only schools that have a count are listed. #__bl_paid is the table I want to count, and #__bl_teams is the table I want to list in full.
$query = "SELECT t.id as id FROM #__bl_regions as r, #__bl_teams as t WHERE t.id != 0 AND t.id != 1 AND (t.id IN($teams)) AND r.id = ".$t_id." ORDER BY t.t_name";
$db->setQuery($query);
$players1 = $db->loadResultArray();
if ($players1){
$players2 = implode(",",$players1);
}else{
$players2 = 0;
}
$query = "SELECT t.id as id, t.t_name as name, t.t_city as city, coalesce((count(pd.rs)),0) as pdc FROM #__bl_paid as pd LEFT JOIN #__bl_teams as t ON pd.t_id = t.id WHERE (t.id IN($players2)) ORDER BY t.t_name";
$db->setQuery($query);
$players = $db->loadObjectList();

You need two pieces to get what you want:
an outer join -- left join is the typical MySQL version used
a way to detect if a column is null, and if so, supply a different value. I often use coalesce
An inner join drops rows that don't have matches in the other table; a left join is similar to an inner join, but preserves all the rows in the left table, and supplies columns with null if there's no matching row in the right table.
Here's an example:
select column1, coalesce(column2, 0) as `newcolumn2`
from lefttable
left join righttable
on lefttable.something = righttable.something
What this will do: whenever column2 is null, it will be replaced with 0.
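To make this concrete, here is a minimal sketch that uses Python's built-in sqlite3 module as a stand-in for MySQL (LEFT JOIN, COUNT and COALESCE behave the same way for this case). The `teams` and `paid` tables and their rows are hypothetical simplifications of `#__bl_teams` and `#__bl_paid`, not the asker's real schema. The table you want listed in full goes on the left side of the join.

```python
import sqlite3

# In-memory database with hypothetical sample data.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE teams (id INTEGER PRIMARY KEY, t_name TEXT);
    CREATE TABLE paid  (t_id INTEGER, rs INTEGER);
    INSERT INTO teams VALUES (2, 'Lions'), (3, 'Tigers'), (4, 'Bears');
    INSERT INTO paid  VALUES (2, 1), (2, 1), (4, 1);
""")

# teams is the left table, so every team is kept even without paid rows.
# The pd.rs filter lives in the ON clause; in WHERE it would discard
# the NULL-extended rows and hide the teams with no matches again.
rows = conn.execute("""
    SELECT t.id, t.t_name, COALESCE(COUNT(pd.rs), 0) AS pdc
    FROM teams AS t
    LEFT JOIN paid AS pd ON pd.t_id = t.id AND pd.rs = 1
    GROUP BY t.id, t.t_name
    ORDER BY t.t_name
""").fetchall()

for row in rows:
    print(row)
```

COUNT(column) only counts non-NULL values, so it already returns 0 for a team with no matches; the COALESCE is harmless here and becomes useful with aggregates such as SUM, which return NULL for an empty group.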

You should use a LEFT JOIN instead of an INNER JOIN.

Related

mysql query takes 3 hours to run and process

I have a query that is run by a cron job late at night. The result is then processed through a generator, as it has to populate another database, and I do some additional processing and checks before it is sent to the other DB.
I am wondering whether there is any way to speed up this query, hopefully keeping it as a single query. Or will I be forced to create other queries and join the data within PHP? This queries the main Mautic database.
SELECT c.id as "campaign_id",
c.created_by_user,
c.name,
c.date_added,
c.date_modified,
(SELECT DISTINCT COUNT(cl.lead_id)) as number_of_leads,
GROUP_CONCAT(lt.tag) as tags,
cat.title as category_name,
GROUP_CONCAT(ll.name) as segment_name,
GROUP_CONCAT(emails.name) as email_name,
CASE WHEN c.is_published = 1 THEN "Yes" ELSE "No" END AS "published",
CASE WHEN c.publish_down > now() THEN "Yes"
WHEN c.publish_down > now() AND c.is_published = 0 THEN "Yes"
ELSE "No" END AS "expired"
FROM campaigns c
LEFT JOIN campaign_leads cl ON cl.campaign_id = c.id
LEFT JOIN lead_tags_xref ltx on cl.lead_id = ltx.lead_id
LEFT JOIN lead_tags lt on ltx.tag_id = lt.id
LEFT JOIN categories cat on c.category_id = cat.id
LEFT JOIN lead_lists_leads llist on cl.lead_id = llist.lead_id
LEFT JOIN lead_lists ll on llist.leadlist_id = ll.id
LEFT JOIN email_list_xref el on ll.id = el.leadlist_id
LEFT JOIN emails on el.email_id = emails.id
GROUP BY c.id;
Here is an image of the EXPLAIN output:
https://prnt.sc/qQtUaLK3FIpQ
Definitions
Campaign Table:
https://prnt.sc/6JXRGyMsWpcd
Campaign_leads table
https://prnt.sc/pOq0_SxW2spe
lead_tags_xref table
https://prnt.sc/oKYn92O82gHL
lead_tags table
https://prnt.sc/ImH81ECF6Ly1
categories table
https://prnt.sc/azQj_Xwq3dw9
lead_lists_lead table
https://prnt.sc/x5C5fiBFP2N7
lead_lists table
https://prnt.sc/bltkM0f3XeaH
email_list_xref table
https://prnt.sc/kXABVJSYWEUI
emails table
https://prnt.sc/7fZcBir1a6QT
I only expect 871 rows to be returned, but I have identified that the joins can be very large, in the tens of thousands of rows.

It seems you have a useless SELECT DISTINCT; you are probably looking for COUNT(DISTINCT ...).
This way you avoid a nested select for each row of the main select:
SELECT c.id as "campaign_id",
c.created_by_user,
c.name,
c.date_added,
c.date_modified,
COUNT(DISTINCT cl.lead_id) as number_of_leads,
GROUP_CONCAT(lt.tag) as tags,
cat.title as category_name,
GROUP_CONCAT(ll.name) as segment_name,
GROUP_CONCAT(emails.name) as email_name,
CASE WHEN c.is_published = 1 THEN "Yes" ELSE "No" END AS "published",
CASE WHEN c.publish_down > now() THEN "Yes"
WHEN c.publish_down > now() AND c.is_published = 0 THEN "Yes"
ELSE "No" END AS "expired"
FROM campaigns c
LEFT JOIN campaign_leads cl ON cl.campaign_id = c.id
LEFT JOIN lead_tags_xref ltx on cl.lead_id = ltx.lead_id
LEFT JOIN lead_tags lt on ltx.tag_id = lt.id
LEFT JOIN categories cat on c.category_id = cat.id
LEFT JOIN lead_lists_leads llist on cl.lead_id = llist.lead_id
LEFT JOIN lead_lists ll on llist.leadlist_id = ll.id
LEFT JOIN email_list_xref el on ll.id = el.leadlist_id
LEFT JOIN emails on el.email_id = emails.id
GROUP BY c.id;
In any case, make sure you have a proper composite index on:
table campaign_leads columns campaign_id, lead_id
table lead_tags_xref columns lead_id, tag_id
table lead_lists_leads columns lead_id, leadlist_id
table email_list_xref columns leadlist_id, email_id
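The reason COUNT(DISTINCT ...) matters here is that every extra join multiplies the rows per campaign. The sketch below shows the effect on a tiny, hypothetical data set, using Python's sqlite3 module as a stand-in for MySQL; only three of the tables are modelled, and the index names are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE campaigns (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE campaign_leads (campaign_id INTEGER, lead_id INTEGER);
    CREATE TABLE lead_tags_xref (lead_id INTEGER, tag_id INTEGER);
    -- Composite indexes on the join columns (hypothetical index names).
    CREATE INDEX idx_cl  ON campaign_leads (campaign_id, lead_id);
    CREATE INDEX idx_ltx ON lead_tags_xref (lead_id, tag_id);
    INSERT INTO campaigns VALUES (1, 'Spring'), (2, 'Empty');
    INSERT INTO campaign_leads VALUES (1, 10), (1, 11);
    INSERT INTO lead_tags_xref VALUES (10, 100), (10, 101), (10, 102), (11, 100);
""")

# Lead 10 has three tags, so it appears three times after the second join.
rows = conn.execute("""
    SELECT c.id,
           COUNT(cl.lead_id)          AS joined_rows,
           COUNT(DISTINCT cl.lead_id) AS number_of_leads
    FROM campaigns c
    LEFT JOIN campaign_leads cl  ON cl.campaign_id = c.id
    LEFT JOIN lead_tags_xref ltx ON ltx.lead_id = cl.lead_id
    GROUP BY c.id
    ORDER BY c.id
""").fetchall()

for row in rows:
    print(row)
```

The same multiplication affects the GROUP_CONCAT columns in the original query; MySQL accepts GROUP_CONCAT(DISTINCT ...) if the repeated values are unwanted.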

Query where column value equals count MySQL

I have 3 tables, "negocio" , "paquete" , "posts".
"negocio" has one "paquete", and "negocio" may have one or more "posts".
I want to fetch all negocios whose number of "posts" records (counting only those with the value "Post" in the "posts.tipo_post" column) equals the value of their respective "paquete.no_posts".
I was doing something like this, but it returns an empty set.
SELECT DISTINCT negocio.id, negocio.nombre FROM negocio
INNER JOIN posts ON negocio.id = posts.id_negocio
INNER JOIN paquete ON paquete.id_negocio = negocio.id
WHERE paquete.no_posts = (SELECT COUNT(*) FROM negocio INNER JOIN posts
ON posts.id_negocio = negocio.id WHERE posts.tipo_post = 'Post'
AND posts.estado_post = 'Disenador')

Try giving the tables in your count(*) subquery aliases:
SELECT COUNT(*) FROM negocio N INNER JOIN posts P
ON P.id_negocio = N.id WHERE P.tipo_post = 'Post'
AND P.estado_post = 'Disenador'
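Aliases matter most when the subquery has to refer to the row of the outer query. The sketch below is one possible reading of the requirement, not a confirmed fix: it ties the count to each outer negocio so that every business is compared with its own number of posts. It uses Python's sqlite3 module as a stand-in for MySQL, the rows are hypothetical, and the estado_post filter is left out for brevity.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE negocio (id INTEGER PRIMARY KEY, nombre TEXT);
    CREATE TABLE paquete (id_negocio INTEGER, no_posts INTEGER);
    CREATE TABLE posts   (id_negocio INTEGER, tipo_post TEXT);
    INSERT INTO negocio VALUES (1, 'Cafe'), (2, 'Taller');
    INSERT INTO paquete VALUES (1, 2), (2, 3);
    INSERT INTO posts VALUES (1, 'Post'), (1, 'Post'), (1, 'Video'), (2, 'Post');
""")

# The inner query counts posts for the current outer row only (p.id_negocio = n.id).
rows = conn.execute("""
    SELECT n.id, n.nombre
    FROM negocio n
    INNER JOIN paquete pq ON pq.id_negocio = n.id
    WHERE pq.no_posts = (SELECT COUNT(*)
                         FROM posts p
                         WHERE p.id_negocio = n.id
                           AND p.tipo_post = 'Post')
""").fetchall()

print(rows)
```

Without the `p.id_negocio = n.id` condition the subquery counts the posts of all businesses at once, which would rarely equal any single package size.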

How to remove duplicates from a SQL query based on a single column

There's a query I need to modify. The query currently returns search results (ads) based on ad title and ad description. If any of the search words are found in either the ad title or the ad description, it returns those results.
I want to modify the query so that each ad title appears in the search results only once. So if 5 ads with the same ad title were found for the given search words, it should return only 1 ad for that ad title.
$sql = "SELECT a.*, UNIX_TIMESTAMP(a.createdon) AS timestamp, ct.cityname,
COUNT(*) AS piccount, p.picfile,
scat.subcatname, cat.catid, cat.catname $xfieldsql
FROM t_ads a
INNER JOIN t_cities ct ON a.cityid = ct.cityid
INNER JOIN t_subcats scat ON a.subcatid = scat.subcatid
INNER JOIN t_cats cat ON scat.catid = cat.catid
LEFT OUTER JOIN t_adxfields axf ON a.adid = axf.adid
LEFT OUTER JOIN t_adpics p ON a.adid = p.adid AND p.isevent = '0'
LEFT OUTER JOIN t_featured feat ON a.adid = feat.adid AND feat.adtype = 'A'
WHERE $where
AND $visibility_condn
AND (feat.adid IS NULL OR feat.featuredtill < NOW())
$loc_condn
GROUP BY a.adid
ORDER BY a.createdon DESC
LIMIT $offset, $ads_per_page";
Edit: $where contains the search expression. If regular expression search is turned on it uses a regex, otherwise it uses LIKE. $searchsql contains the search words that were input by the user.
if ($regex_search) {
$where = "(a.adtitle RLIKE '[[:<:]]{$searchsql}[[:>:]]' OR a.addesc RLIKE '[[:<:]]{$searchsql}[[:>:]]')";
} else {
$where = "(a.adtitle LIKE '$searchsql' OR a.addesc LIKE '$searchsql')";
}

The "proper" way to do this would be to tackle the root cause by working out why the duplicates are appearing in the first place. It will be something to do with the JOINs, but without looking at the data I'm unable to answer that. If, however, you'd like a quick(ish) and dirty way to remove duplicates, you could try something like the query below.
Disclaimer: This is completely untested, so there is likely to be a mistake or two in here, but hopefully no dealbreaker.
SELECT a2.*, UNIX_TIMESTAMP(a2.createdon) AS timestamp, ct2.cityname,
COUNT(*) AS piccount, p2.picfile,
scat2.subcatname, cat2.catid, cat2.catname $xfieldsql
FROM
(SELECT subq1.adtitle, MIN(subq1.adid) AS adid
FROM
(SELECT a.*, UNIX_TIMESTAMP(a.createdon) AS timestamp, ct.cityname,
COUNT(*) AS piccount, p.picfile,
scat.subcatname, cat.catid, cat.catname
FROM t_ads a
INNER JOIN t_cities ct ON a.cityid = ct.cityid
INNER JOIN t_subcats scat ON a.subcatid = scat.subcatid
INNER JOIN t_cats cat ON scat.catid = cat.catid
LEFT OUTER JOIN t_adxfields axf ON a.adid = axf.adid
LEFT OUTER JOIN t_adpics p ON a.adid = p.adid AND p.isevent = '0'
LEFT OUTER JOIN t_featured feat ON a.adid = feat.adid AND feat.adtype = 'A'
WHERE $where
AND $visibility_condn
AND (feat.adid IS NULL OR feat.featuredtill < NOW())
$loc_condn
GROUP BY a.adid) subq1
GROUP BY subq1.adtitle) subq2
INNER JOIN t_ads a2 ON a2.adid = subq2.adid
INNER JOIN t_cities ct2 ON a2.cityid = ct2.cityid
INNER JOIN t_subcats scat2 ON a2.subcatid = scat2.subcatid
INNER JOIN t_cats cat2 ON scat2.catid = cat2.catid
LEFT OUTER JOIN t_adxfields axf2 ON a2.adid = axf2.adid
LEFT OUTER JOIN t_adpics p2 ON a2.adid = p2.adid AND p2.isevent = '0'
LEFT OUTER JOIN t_featured feat2 ON a2.adid = feat2.adid AND feat2.adtype = 'A'
GROUP BY a2.adid
ORDER BY a2.createdon DESC
LIMIT $offset, $ads_per_page
This could be massively simplified and tidied up, e.g. by removing some of the columns and joins from the subquery, but I am just giving the general idea to (hopefully) get you up and running.
Explanation
subq2 simply groups by title and picks out one adid from each group (I chose MIN here, but MAX would work just as well).
subq1 is the original query but with ordering and limits removed since these are applied by the outer query.
The outer query joins back on the de-duped IDs and joins back to the ads and other tables (giving them different aliases) in order to select the fields from your original query.
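Stripped of the extra joins, the idea can be sketched as follows. This is a simplified illustration rather than the full query: it uses Python's sqlite3 module as a stand-in for MySQL, a hypothetical four-row `t_ads` table, and a plain LIKE search on the title only. The search term is bound as a parameter instead of being interpolated into the SQL string.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE t_ads (adid INTEGER PRIMARY KEY, adtitle TEXT, createdon TEXT);
    INSERT INTO t_ads VALUES
        (1, 'Bike for sale', '2020-01-01'),
        (2, 'Bike for sale', '2020-01-05'),
        (3, 'Red bike',      '2020-01-03'),
        (4, 'Bike for sale', '2020-01-09');
""")

# Inner query: one adid (the smallest) per matching title.
# Outer query: join back to t_ads to fetch the full row for each chosen adid.
rows = conn.execute("""
    SELECT a2.adid, a2.adtitle
    FROM (SELECT a.adtitle, MIN(a.adid) AS adid
          FROM t_ads a
          WHERE a.adtitle LIKE ?
          GROUP BY a.adtitle) AS subq2
    INNER JOIN t_ads a2 ON a2.adid = subq2.adid
    ORDER BY a2.createdon DESC
""", ("%bike%",)).fetchall()

print(rows)
```

Three ads share the title 'Bike for sale', and only the one with the lowest adid survives; swapping MIN for MAX would keep the most recently inserted one instead.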

SQL query wrong result

I have this query:
SELECT `completed`.`ID` AS `ID`,`completed`.`level` AS `level`,`completed`.`completed_in` AS `completed_in`, COUNT(1) AS `right_answers_num`
FROM `completed`
INNER JOIN `history` ON `history`.`ID` = `completed`.`ID`
INNER JOIN `questions` ON `questions`.`ID` = `history`.`question`
WHERE `completed`.`student_id` = '1' AND `questions`.`answer` = `history`.`answer`
GROUP BY `completed`.`ID`
ORDER BY `completed`.`completed_in` DESC
What I need is the info for each test in the completed table (id, level, completed_in, right_answers_num).
The problem with this query is that if a test has no right answers at all (history.answer = questions.answer), the row is not returned. It should still return the row (id, level, completed_in), with right_answers_num (the counter) set to zero.
Please help, thanks in advance.

SELECT
completed.ID AS ID,
completed.level AS level,
completed.completed_in AS completed_in,
COUNT(questions.answer) AS right_answers_num
FROM completed
INNER JOIN history ON history.ID = completed.ID
LEFT JOIN questions ON questions.ID = history.question AND questions.answer = history.answer
WHERE
completed.student_id = '1'
GROUP BY
completed.ID
ORDER BY completed.completed_in DESC

Use a LEFT OUTER JOIN instead of an INNER JOIN.

The second inner join, combined with the answer comparison in the WHERE clause, is what causes tests with no right answers to be omitted. An inner join only returns rows that have matching data in all joined tables. Change the second inner join to a left join and move the answer comparison into its ON clause, like so:
SELECT
completed.ID AS ID,
completed.level AS level,
completed.completed_in AS completed_in,
COUNT(questions.answer) AS right_answers_num
FROM completed
INNER JOIN history ON history.ID = completed.ID
LEFT JOIN questions ON questions.ID = history.question AND questions.answer = history.answer
WHERE completed.student_id = 1
GROUP BY completed.ID
ORDER BY completed.completed_in DESC
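Where the answer comparison is placed makes the difference between getting a zero and losing the row. The sketch below compares the two placements on hypothetical data, using Python's sqlite3 module as a stand-in for MySQL; the tables are trimmed to the columns needed for the demonstration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE completed (ID INTEGER PRIMARY KEY, student_id INTEGER);
    CREATE TABLE history   (ID INTEGER, question INTEGER, answer TEXT);
    CREATE TABLE questions (ID INTEGER PRIMARY KEY, answer TEXT);
    INSERT INTO completed VALUES (1, 1), (2, 1);
    INSERT INTO questions VALUES (10, 'a'), (11, 'b');
    -- Test 1: one right answer, one wrong. Test 2: both wrong.
    INSERT INTO history VALUES (1, 10, 'a'), (1, 11, 'x'), (2, 10, 'x'), (2, 11, 'y');
""")

# Comparison in the ON clause: wrong answers become NULL rows, which COUNT skips.
in_on = conn.execute("""
    SELECT completed.ID, COUNT(questions.answer) AS right_answers_num
    FROM completed
    INNER JOIN history ON history.ID = completed.ID
    LEFT JOIN questions ON questions.ID = history.question
                       AND questions.answer = history.answer
    WHERE completed.student_id = 1
    GROUP BY completed.ID
    ORDER BY completed.ID
""").fetchall()

# Comparison in the WHERE clause: the NULL rows are filtered out after the join,
# so a test without any right answer disappears from the result.
in_where = conn.execute("""
    SELECT completed.ID, COUNT(questions.answer) AS right_answers_num
    FROM completed
    INNER JOIN history ON history.ID = completed.ID
    LEFT JOIN questions ON questions.ID = history.question
    WHERE completed.student_id = 1
      AND questions.answer = history.answer
    GROUP BY completed.ID
    ORDER BY completed.ID
""").fetchall()

print(in_on)
print(in_where)
```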

SQL - Multiple many-to-many relations filtering SELECT

These are my tables:
Cadastros (id, nome)
Convenios (id, nome)
Especialidades (id, nome)
Facilidades (id, nome)
And the join tables:
cadastros_convenios
cadastros_especialidades
cadastros_facilidades
The table I'm querying for: Cadastros
I'm using MySQL.
The system will allow the user to select multiple "Convenios", "Especialidades" and "Facilidades". Think of each of these tables as a different type of "tag". The user will be able to select multiple "tags" of each type.
What I want is to select only the results in Cadastros table that are related with ALL the "tags" from the 3 different tables provided. Please note it's not an "OR" relation. It should only return the row from Cadastros if it has a matching link table row for EVERY "tag" provided.
Here is what I have so far:
SELECT Cadastro.*, Convenio.* FROM Cadastros AS Cadastro
INNER JOIN cadastros_convenios AS CadastrosConvenio ON(Cadastro.id = CadastrosConvenio.cadastro_id)
INNER JOIN Convenios AS Convenio ON (CadastrosConvenio.convenio_id = Convenio.id AND Convenio.id IN(2,3))
INNER JOIN cadastros_especialidades AS CadastrosEspecialidade ON (Cadastro.id = CadastrosEspecialidade.cadastro_id)
INNER JOIN Especialidades AS Especialidade ON(CadastrosEspecialidade.especialidade_id = Especialidade.id AND Especialidade.id IN(1))
INNER JOIN cadastros_facilidades AS CadastrosFacilidade ON (Cadastro.id = CadastrosFacilidade.cadastro_id)
INNER JOIN Facilidades AS Facilidade ON(CadastrosFacilidade.facilidade_id = Facilidade.id AND Facilidade.id IN(1,2))
GROUP BY Cadastro.id
HAVING COUNT(*) = 5;
I'm using the HAVING clause to try to filter the results based on the number of times it shows (meaning the number of times it has been successfully "INNER JOINED"). So in every case, the count should be equal to the number of different filters I added. So if I add 3 different "tags", the count should be 3. If I add 5 different tags, the count should be 5 and so on. It works fine for a single relation (a single pair of inner joins). When I add the other 2 relations it starts to lose control.
EDIT
Here is something that I believe is working (thanks @Tomalak for pointing out the solution with sub-queries):
SELECT Cadastro.*, Convenio.*, Especialidade.*, Facilidade.* FROM Cadastros AS Cadastro
INNER JOIN cadastros_convenios AS CadastrosConvenio ON(Cadastro.id = CadastrosConvenio.cadastro_id)
INNER JOIN Convenios AS Convenio ON (CadastrosConvenio.convenio_id = Convenio.id)
INNER JOIN cadastros_especialidades AS CadastrosEspecialidade ON (Cadastro.id = CadastrosEspecialidade.cadastro_id)
INNER JOIN Especialidades AS Especialidade ON(CadastrosEspecialidade.especialidade_id = Especialidade.id)
INNER JOIN cadastros_facilidades AS CadastrosFacilidade ON (Cadastro.id = CadastrosFacilidade.cadastro_id)
INNER JOIN Facilidades AS Facilidade ON(CadastrosFacilidade.facilidade_id = Facilidade.id)
WHERE
(SELECT COUNT(*) FROM cadastros_convenios WHERE cadastro_id = Cadastro.id AND convenio_id IN(1, 2, 3)) = 3
AND
(SELECT COUNT(*) FROM cadastros_especialidades WHERE cadastro_id = Cadastro.id AND especialidade_id IN(3)) = 1
AND
(SELECT COUNT(*) FROM cadastros_facilidades WHERE cadastro_id = Cadastro.id AND facilidade_id IN(2, 3)) = 2
GROUP BY Cadastro.id
But I'm concerned about performance. It looks like these 3 sub-queries in the WHERE clause are gonna be over-executed...
Another solution
It joins subsequent tables only if the previous joins succeeded (if no rows match one of the joins, the next joins will be joining an empty result set). Thanks @DRapp for this one.
SELECT STRAIGHT_JOIN
Cadastro.*
FROM
( SELECT Qualify1.cadastro_id
from
( SELECT cc1.cadastro_id
FROM cadastros_convenios cc1
WHERE cc1.convenio_id IN (1, 2, 3)
GROUP by cc1.cadastro_id
having COUNT(*) = 3 ) Qualify1
JOIN
( SELECT ce1.cadastro_id
FROM cadastros_especialidades ce1
WHERE ce1.especialidade_id IN( 3 )
GROUP by ce1.cadastro_id
having COUNT(*) = 1 ) Qualify2
ON (Qualify1.cadastro_id = Qualify2.cadastro_id)
JOIN
( SELECT cf1.cadastro_id
FROM cadastros_facilidades cf1
WHERE cf1.facilidade_id IN (2, 3)
GROUP BY cf1.cadastro_id
having COUNT(*) = 2 ) Qualify3
ON (Qualify2.cadastro_id = Qualify3.cadastro_id) ) FullSet
JOIN Cadastros AS Cadastro
ON FullSet.cadastro_id = Cadastro.id
INNER JOIN cadastros_convenios AS CC
ON (Cadastro.id = CC.cadastro_id)
INNER JOIN Convenios AS Convenio
ON (CC.convenio_id = Convenio.id)
INNER JOIN cadastros_especialidades AS CE
ON (Cadastro.id = CE.cadastro_id)
INNER JOIN Especialidades AS Especialidade
ON (CE.especialidade_id = Especialidade.id)
INNER JOIN cadastros_facilidades AS CF
ON (Cadastro.id = CF.cadastro_id)
INNER JOIN Facilidades AS Facilidade
ON (CF.facilidade_id = Facilidade.id)
GROUP BY Cadastro.id

Emphasis mine
"It should only return the row from Cadastros if it has a matching row for EVERY "tag" provided."
"where there is a matching row"-problems are easily solved with EXISTS.
EDIT After some clarification, I see that using EXISTS is not enough. Comparing the actual row counts is necessary:
SELECT
*
FROM
Cadastros c
WHERE
(SELECT COUNT(*) FROM cadastros_convenios WHERE cadastro_id = c.id AND convenio_id IN (2,3)) = 2
AND
(SELECT COUNT(*) FROM cadastros_especialidades WHERE cadastro_id = c.id AND especialidade_id IN (1)) = 1
AND
(SELECT COUNT(*) FROM cadastros_facilidades WHERE cadastro_id = c.id AND facilidade_id IN (1,2)) = 2
The indexes on the link tables should be (cadastro_id, convenio_id), (cadastro_id, especialidade_id) and (cadastro_id, facilidade_id) respectively for this query.
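As a small runnable check of the count-comparison idea, the sketch below uses Python's sqlite3 module as a stand-in for MySQL, with hypothetical rows and only two of the three link tables. It assumes each (cadastro_id, tag id) pair appears at most once in a link table; if duplicates are possible, COUNT(DISTINCT ...) on the tag column would be the safer comparison.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Cadastros (id INTEGER PRIMARY KEY, nome TEXT);
    CREATE TABLE cadastros_convenios (cadastro_id INTEGER, convenio_id INTEGER);
    CREATE TABLE cadastros_facilidades (cadastro_id INTEGER, facilidade_id INTEGER);
    INSERT INTO Cadastros VALUES (1, 'A'), (2, 'B'), (3, 'C');
    INSERT INTO cadastros_convenios VALUES (1, 2), (1, 3), (2, 2), (3, 2), (3, 3);
    INSERT INTO cadastros_facilidades VALUES (1, 1), (1, 2), (2, 1), (2, 2), (3, 1);
""")

# A cadastro qualifies only if it is linked to ALL requested tags of each type:
# the number of matching link rows must equal the number of tags asked for.
rows = conn.execute("""
    SELECT c.id, c.nome
    FROM Cadastros c
    WHERE (SELECT COUNT(*) FROM cadastros_convenios
           WHERE cadastro_id = c.id AND convenio_id IN (2, 3)) = 2
      AND (SELECT COUNT(*) FROM cadastros_facilidades
           WHERE cadastro_id = c.id AND facilidade_id IN (1, 2)) = 2
    ORDER BY c.id
""").fetchall()

print(rows)
```

Cadastro B is missing one convenio and cadastro C is missing one facilidade, so only A is returned.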

Depending on the size of the tables (number of records), WHERE-based subqueries that run a test on every row CAN SIGNIFICANTLY hit performance. I have restructured the query in a way that MIGHT help, but only you would be able to confirm. The premise is to build the first set from the distinct IDs that meet the first criteria, join THAT set to the next qualifying criteria, and so on down to the FINAL set. Once that has been determined, use THAT to join to your main table and its subsequent links to get the details you are expecting. You also had an overall GROUP BY on the ID, which will eliminate all other nested entries found in the supporting detail tables.
All that said, let's take a look at this scenario. Start with the table that would be EXPECTED TO HAVE THE LOWEST RESULT SET and join it to the next and the next. If the cadastros_convenios rows that match all the criteria cover IDs 1-100, great: we know that at MOST we'll have 100 IDs.
Now, these 100 entries are immediately JOINED to the 2nd qualifying criteria, which, say, only matches every other one. For simplicity, we are now matched on 50 of the 100.
Finally, JOIN to the 3rd qualifier based on the 50 that qualified and you get 30 entries. So, within these 3 queries you are now filtered down to 30 entries with all the qualifying criteria handled up front. NOW, join to Cadastros and then the subsequent tables for the details, based ONLY on the 30 that qualified.
Since your original query would eventually TRY EVERY "ID" for the criteria, why not pre-qualify it up front with ONE query, get just those that hit, and then move on?
SELECT STRAIGHT_JOIN
Cadastro.*,
C.*,
E.*,
F.*
FROM
( SELECT Qualify1.cadastro_id
from
( SELECT cc1.cadastro_id
FROM cadastros_convenios cc1
WHERE cc1.convenio_id IN (1, 2, 3)
GROUP by cc1.cadastro_id
having COUNT(*) = 3 ) Qualify1
JOIN
( SELECT ce1.cadastro_id
FROM cadastros_especialidades ce1
WHERE ce1.especialidade_id IN( 3 )
GROUP by ce1.cadastro_id
having COUNT(*) = 1 ) Qualify2
ON Qualify1.cadastro_id = Qualify2.cadastro_id
JOIN
( SELECT cf1.cadastro_id
FROM cadastros_facilidades cf1
WHERE cf1.facilidade_id IN (2, 3)
GROUP BY cf1.cadastro_id
having COUNT(*) = 2 ) Qualify3
ON Qualify2.cadastro_id = Qualify3.cadastro_id ) FullSet
JOIN Cadastros AS Cadastro
ON FullSet.cadastro_id = Cadastro.id
INNER JOIN cadastros_convenios AS CC
ON Cadastro.id = CC.cadastro_id
INNER JOIN Convenios AS C
ON CC.convenio_id = C.id
INNER JOIN cadastros_especialidades AS CE
ON Cadastro.id = CE.cadastro_id
INNER JOIN Especialidades AS E
ON CE.especialidade_id = E.id
INNER JOIN cadastros_facilidades AS CF
ON Cadastro.id = CF.cadastro_id
INNER JOIN Facilidades AS F
ON CF.facilidade_id = F.id
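The pre-qualification step can be tried out in isolation. The sketch below is a reduced version with hypothetical rows and only two qualifiers, run through Python's sqlite3 module as a stand-in for MySQL. STRAIGHT_JOIN is a MySQL-specific hint and is left out here, and the detail joins after Cadastros are omitted.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Cadastros (id INTEGER PRIMARY KEY, nome TEXT);
    CREATE TABLE cadastros_convenios (cadastro_id INTEGER, convenio_id INTEGER);
    CREATE TABLE cadastros_facilidades (cadastro_id INTEGER, facilidade_id INTEGER);
    INSERT INTO Cadastros VALUES (1, 'A'), (2, 'B'), (3, 'C');
    INSERT INTO cadastros_convenios VALUES (1, 2), (1, 3), (2, 2), (3, 2), (3, 3);
    INSERT INTO cadastros_facilidades VALUES (1, 1), (1, 2), (2, 1), (2, 2), (3, 1);
""")

# Each derived table returns only the cadastro_ids that have ALL requested tags
# of one type; joining them keeps the ids that pass every qualifier.
rows = conn.execute("""
    SELECT Cadastro.id, Cadastro.nome
    FROM (SELECT cc1.cadastro_id
          FROM cadastros_convenios cc1
          WHERE cc1.convenio_id IN (2, 3)
          GROUP BY cc1.cadastro_id
          HAVING COUNT(*) = 2) Qualify1
    JOIN (SELECT cf1.cadastro_id
          FROM cadastros_facilidades cf1
          WHERE cf1.facilidade_id IN (1, 2)
          GROUP BY cf1.cadastro_id
          HAVING COUNT(*) = 2) Qualify2
      ON Qualify1.cadastro_id = Qualify2.cadastro_id
    JOIN Cadastros AS Cadastro ON Cadastro.id = Qualify1.cadastro_id
    ORDER BY Cadastro.id
""").fetchall()

print(rows)
```

Here Qualify1 yields cadastros 1 and 3, Qualify2 yields 1 and 2, and only cadastro 1 survives the join, so the detail tables would be joined for a single row.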
