very complex SELECT - mysql

I want to do a complex SELECT between more tables (4+) that will order and count items.
So far this is what my line is :
SELECT
myl_u.id,
myl_u.label_real_address,
myl_u.ext,
COUNT(myc_c.contact_id),
COUNT(myl_r_c.release_id)
FROM
myl_users myl_u
LEFT JOIN myc_contacts myc_c ON myc_c.contact_type='l' AND myc_c.contact_id=myl_u.id
LEFT JOIN myl_releases myl_r ON myl_r.id=myl_u.id
LEFT JOIN myl_r_comments myl_r_c ON myl_r.release_id=myl_r_c.release_id
GROUP BY myl_u.label_real_address
ORDER BY COUNT(myc_c.contact_id) DESC
It's half working, but when I add the latter part of the SQL, it shows unexpected values and it doubles them too somehow.
Basically I have myl_users (a collection of record labels)
myc_contacts (how many favourites does a user have, contact_type='l' means it's about myl_users and not other table)
myl_releases contains music releases (like EP, album, with unique id's
and myl_r_comments contains comments regular users do to these releases.
I managed to ORDER by how many favourites a record label has (15, 14, 10, 8..) - the COUNT(myc_c.contact_id) clause
but when I add the next clause and make the query bigger to order by the total comments the releases from labels have, unexpected appears.
Can someone pinpoint me to the right way ?
I will comment and adapt / clarify the question by your needs.
thanks,
have a happy new year

The problem is that you are summing along multiple dimensions, so you are getting a cross product. The best way is to summarize along each dimension independently:
SELECT myl_u.id, myl_u.label_real_address, myl_u.ext,
sum(myc_c.cnt),
sum(myl_rc.cnt)
FROM myl_users myl_u LEFT JOIN
(select contact_id, count(*) as cnt
from myc_contacts myc_c
where myc_c.contact_type='l'
group by contact_id
) myc_c
ON myc_c.contact_id=myl_u.id LEFT JOIN
(select myl_r.id, count(*) as cnt
from myl_releases myl_r LEFT JOIN
myl_r_comments myl_r_c
ON myl_r.release_id=myl_r_c.release_id
gropu by myl_r.id
) myl_rc
ON myl_rc.id=myl_u.id
GROUP BY myl_u.id, myl_u.label_real_address, myl_u.ext
ORDER BY 4 DESC
It is not clear from the question whether the final group by is necessary. If there are no duplicates in the myl_u table, then you don't need the outside aggregation at all.

At least one problem that I spot is that you need a WHERE clause if you want to restrict the rows. The JOINs should only include the considtions of the JOINs.
SELECT
myl_u.id,
myl_u.label_real_address,
myl_u.ext,
COUNT(myc_c.contact_id),
COUNT(myl_r_c.release_id)
FROM
myl_users myl_u
LEFT JOIN myc_contacts myc_c ON myc_c.contact_id=myl_u.id
LEFT JOIN myl_releases myl_r ON myl_r.id=myl_u.id
LEFT JOIN myl_r_comments myl_r_c ON myl_r.release_id=myl_r_c.release_id
WHERE
myc_c.contact_types = '1'
GROUP BY myl_u.label_real_address
ORDER BY COUNT(myc_c.contact_id) DESC
Also, are you sure its a left join that you want? That returns all rows from the "left" table even if no matching values on the right. Try changing LEFT to INNER and see if you get what you are expecting.

Related

how to combine two tables and count their values as one?

I want to get the total numbers of committed crimes when combining the two tables.
But I want to count the numbers for each crime being committed and also display the values of those that has not being committed as 0, How can i achieve this using mysql?
my code:
SELECT count(offense_id)
AS totalnumber,(
select offense_description
from offense
where offense.offense_id = case_crime.offense_id
)as crimeName
from case_crime
group by offense_id
To get a list of all your offenses and the count of registered crimes, you need to use the LEFT JOIN between the two tables .... something like this SQL:
SELECT a.offense_id, a.offense_description, count(b.crime_caseid) as total
FROM offense a LEFT JOIN case_crime b
ON a.offense_id = b.offense_id
group by a.offense_id;
Here is a SQLFiddle to play around with :)
You are looking for a LEFT OUTER JOIN
Something like
SELECT offense.offense_id
, offense_description
, count(case_crime.case_id) as Total_Number
from offense
LEFT OUTER JOIN case_crime ON offense.offense_id = case_crime.offense_id
group by offense.offense_id
I'll admit, I'm a TSQL guy, so I would handle the COUNT(*) returning null when there are no cases with the specified offense using: ISNULL(COUNT(case_crime),0)
Other SQL variants might use COALESCE( COUNT(case_crime), 0)

Count, Group By, Subquery, Left Join not working as expected

This is puzzling me and no amount of the Google is helping me, hoping someone can point me in the right direction.
Please note that I have omitted some fields from the tables that don't relate to the question just to simplify things.
contacts
contact_id
name
email
contact_uuids
uuid
contact_id
visitor_activity
uuid
event
contact_communications
comm_id
contact_id
employee_id
Query
SELECT
`c`.*,
(SELECT COUNT(`log_id`) FROM `contact_communications` `cc` WHERE `cc`.`contact_id` = `c`.`contact_id`) as `num_comms`,
(SELECT MAX(`date`) FROM `contact_communications` `cc` WHERE `cc`.`contact_id` = `c`.`contact_id`) as `latest_date`,
(SELECT MIN(`date`) FROM `contact_communications` `cc` WHERE `cc`.`contact_id` = `c`.`contact_id`) as `first_date`,
(SELECT COUNT(`vaid`) FROM `visitor_activity` `va` WHERE `va`.`uuid` = `cu`.`uuid`) as `num_act`
FROM `contacts` `c`
LEFT JOIN `contact_uuids` `cu` ON `c`.`contact_id` = `cu`.`contact_id`
GROUP BY `c`.`contact_id`
ORDER BY `c`.`name` ASC
Some contacts have multiple UUIDs (upwards of 20 or 30).
When I perform the query WITHOUT the GROUP BY statement, I get the results I expect - a row returned for each UUID that exists for that contact, with the correct "num_comms" and "num_act" numbers.
However when I add the GROUP BY statement, the "num_comms" is a lot smaller then expected and the "num_act" returns only the value from the first row without the GROUP BY statement.
I tried doing a "WHERE NOT IN" in the subquery, however that simply crashed the server as it was far too intense.
So - how do I get this to add up all the COUNT values from the LEFT JOIN and not just return the first value?
Also if anyone can help me optimize this that would be great.
Two problems:
GROUP BY c.contact_id does not include all the non-aggregate columns. This is a MySQL extension. What you get is random values for the rows other than contact_id
The JOIN adds confusion. Your only use for visitor_activity is the COUNT(*) one it. But that does not make sense since it is limited to one UUID, whereas the row is limited to one contact_id. Rethink the purpose of that.
Remove this line:
(SELECT COUNT(`vaid`) FROM `visitor_activity` `va` WHERE `va`.`uuid` = `cu`.`uuid`) as `num_act`
and the rest may work ok.
I will continue with the assumption that you want the COUNT of all rows in visitor_activity for all the uuids associated with the one contact_id.
See if this:
( SELECT COUNT(*)
FROM `contacts` c2
JOIN `visitor_activity` USING(uuid)
WHERE c2.contact_id = c.contact_id as `num_act` ) AS num_act
will work for the last subquery. At the same time, remove the JOIN:
LEFT JOIN `contact_uuids` `cu` ON `c`.`contact_id` = `cu`.`contact_id`
Now, back to the other problem (the non-standard usage of GROUP BY). Assuming that contact_id is the PRIMARY KEY, then simply remove the
GROUP BY `c`.`contact_id`

Creating a subquery in Access

I am attempting to create a subquery in Access but I am receiving an error stating that one record can be returned by this subquery. I am wanting to find the top 10 companies that have the most pets then I want to know the name of those pets. I have never created a subquery before so I am not sure where I am going wrong. Here is what I have:
SELECT TOP 10 dbo_tGovenrnmentRegulatoryAgency.GovernmentRegulatoryAgency
(SELECT dbo_tPet.Pet
FROM dbo_tPet)
FROM dbo_tPet INNER JOIN dbo_tGovenrnmentRegulatoryAgency ON
dbo_tPet.GovernmentRegulatoryAgencyID =
dbo_tGovenrnmentRegulatoryAgency.GovernmentRegulatoryAgencyID
GROUP BY dbo_tGovenrnmentRegulatoryAgency.GovernmentRegulatoryAgency
ORDER BY Count(dbo_tPet.PetID) DESC;
Consider this solution, requiring a subquery in the WHERE IN () clause:
SELECT t1.GovernmentRegulatoryAgency, dbo_tPet.Pet,
FROM dbo_tPet
INNER JOIN dbo_tGovenrnmentRegulatoryAgency t1 ON
dbo_tPet.GovernmentRegulatoryAgencyID = t1.GovernmentRegulatoryAgencyID
WHERE t1.GovernmentRegulatoryAgency IN
(SELECT TOP 10 t2.GovernmentRegulatoryAgency
FROM dbo_tPet
INNER JOIN dbo_tGovenrnmentRegulatoryAgency t2 ON
dbo_tPet.GovernmentRegulatoryAgencyID = t2.GovernmentRegulatoryAgencyID
GROUP BY t2.GovernmentRegulatoryAgency
ORDER BY Count(dbo_tPet.Pet) DESC);
Table aliases are not needed but I include them for demonstration.
This should hopefully do it:
SELECT a.GovernmentRegulatoryAgency, t.NumOfPets
FROM dbo_tGovenrnmentRegulatoryAgency a
INNER JOIN (
SELECT TOP 10 p.GovernmentRegulatoryAgencyID, COUNT(p.PetID) AS NumOfPets
FROM dbo_tPet p
GROUP BY p.GovernmentRegulatoryAgencyID
ORDER BY COUNT(p.PetID) DESC
) t
ON a.GovernmentRegulatoryAgencyID = t.GovernmentRegulatoryAgencyID
In a nutshell, first get the nested query sorted, identifying what the relevant agencies are, then inner join back to the agency table to get the detail of the agencies so picked.

MySql query runs very slow(actually never gives output) without where clause

I have a mysql query and it works fine when i use where clause, but when i donot use
where clause it gone and never gives the output and finally timeout.
Actually i have used Explain command to check the performance of the query and in both cases the Explain gives the same number of rows used in joining.
I have attached the image of output got with Explain command.
Below is the query.
I couldn't figure whats the problem here.
Any help is highly appreciated.
Thanks.
SELECT
MCI.CLIENT_ID AS CLIENT_ID, MCI.NAME AS CLIENT_NAME, MCI.PRIMARY_CONTACT AS CLIENT_PRIMARY_CONTACT,
MCI.ADDED_BY AS SP_ID, CONCAT(MUD_SP.FIRST_NAME, ' ', MUD_SP.LAST_NAME) AS SP_NAME,
MCI.FK_PROSPECT_ID AS PROSPECT_ID, MCI.DATE_ADDED AS ADDED_ON,
(SELECT GROUP_CONCAT(LT.TAG_TEXT SEPARATOR ', ')
FROM LK_TAG LT
INNER JOIN M_OBJECT_TAG_MAPPING MOTM
ON LT.PK_ID = MOTM.FK_TAG_ID
WHERE MOTM.FK_OBJECT_ID = MCI.FK_PROSPECT_ID
AND MOTM.OBJECT_TYPE = 1
AND MOTM.IS_ACTIVE = 1
) AS TAGS,
IFNULL(SUM(GET_DIGITS(MMR.RCP_AMOUNT)), 0) AS REVENUE_SO_FAR,
IFNULL(SUM(GET_DIGITS(MMR.RCP_RUPEES)), 0) AS REVENUE_INR,
COUNT(DISTINCT PMI_MONTHLY.PROJECT_ID) AS MONTHLY,
COUNT(DISTINCT PMI_FIXED.PROJECT_ID) AS FIXED,
COUNT(DISTINCT PMI_HOURLY.PROJECT_ID) AS HOURLY,
COUNT(DISTINCT PMI_ANNUAL.PROJECT_ID) AS ANNUAL,
COUNT(DISTINCT PMI_CURRENTLY_RUNNING.PROJECT_ID) AS CURRENTLY_RUNNING_PROJECTS,
COUNT(DISTINCT PMI_YET_TO_START.PROJECT_ID) AS YET_TO_START_PROJECTS,
COUNT(DISTINCT PMI_TECH_SALES_CLOSED.PROJECT_ID) AS TECH_SALES_CLOSED_PROJECTS
FROM
M_CLIENT_INFO MCI
INNER JOIN M_USER_DETAILS MUD_SP
ON MCI.ADDED_BY = MUD_SP.PK_ID
LEFT OUTER JOIN M_MONTH_RECEIPT MMR
ON MMR.CLIENT_ID = MCI.CLIENT_ID
LEFT OUTER JOIN M_PROJECT_INFO PMI_FIXED
ON PMI_FIXED.CLIENT_ID = MCI.CLIENT_ID AND PMI_FIXED.PROJECT_TYPE = 1
LEFT OUTER JOIN M_PROJECT_INFO PMI_MONTHLY
ON PMI_MONTHLY.CLIENT_ID = MCI.CLIENT_ID AND PMI_MONTHLY.PROJECT_TYPE = 2
LEFT OUTER JOIN M_PROJECT_INFO PMI_HOURLY
ON PMI_HOURLY.CLIENT_ID = MCI.CLIENT_ID AND PMI_HOURLY.PROJECT_TYPE = 3
LEFT OUTER JOIN M_PROJECT_INFO PMI_ANNUAL
ON PMI_ANNUAL.CLIENT_ID = MCI.CLIENT_ID AND PMI_ANNUAL.PROJECT_TYPE = 4
LEFT OUTER JOIN M_PROJECT_INFO PMI_CURRENTLY_RUNNING
ON PMI_CURRENTLY_RUNNING.CLIENT_ID = MCI.CLIENT_ID AND PMI_CURRENTLY_RUNNING.STATUS = 4
LEFT OUTER JOIN M_PROJECT_INFO PMI_YET_TO_START
ON PMI_YET_TO_START.CLIENT_ID = MCI.CLIENT_ID AND PMI_YET_TO_START.STATUS < 4
LEFT OUTER JOIN M_PROJECT_INFO PMI_TECH_SALES_CLOSED
ON PMI_TECH_SALES_CLOSED.CLIENT_ID = MCI.CLIENT_ID AND PMI_TECH_SALES_CLOSED.STATUS > 4
WHERE YEAR(MCI.DATE_ADDED) = '2012'
GROUP BY MCI.CLIENT_ID ORDER BY CLIENT_NAME ASC
Yes, as many people have said, the key is that when you have the where clause, mysql engine filters the table M_CLIENT_INFO --probably drammatically--.
A similar result as removing the where clause is to to add this where clause:
where 1 = 1
You will see that the performance is degraded also because mysql will try to get all the data.
Remove the where clause and all columns from select and add a count to see how many records you get. If it is reasonable, say up to 10k, then do the following,
put back the select columns related to M_CLIENT_INFO
do not include the nested one "TAGS"
remove all your joins
run your query without where clause and gradually include the joins
this way you'll find out when the timeout is caused.
I would try the following. First, MySQL has a keyword "STRAIGHT_JOIN" which tells the optimizer to do the query in the table order you've specified. Since all you left-joins are child-related (like a lookup table), you don't want MySQL to try and interpret one of those as a primary basis of the query.
SELECT STRAIGHT_JOIN ... rest of query.
Next, your M_PROJECT_INFO table, I dont know how many columns of data are out there, but you appear to be concentrating on just a few columns on your DISTINCT aggregates. I would make sure you have a covering index on these elements to help the query via an index on
( Client_ID, Project_Type, Status, Project_ID )
This way the engine can apply the criteria and get the distinct all out of the index instead of having to go back to the raw data pages for the query.
Third, your M_CLIENT_INFO table. Ensure that has an index on both your criteria, group by AND your Order By, and change your order by from the aliased "CLIENT_NAME" to the actual column of the SQL table so it matches the index
( Date_Added, Client_ID, Name )
I have "name" in ticks as it is also a reserved word and helps clarify the column, not the keyword.
Next, the WHERE clause. Whenever you apply a function to an indexed column name, it doesn't work the greatest, especially on date/time fields... You might want to change your where clause to
WHERE MCI.Date_Added between '2012-01-01' and '2012-12-31 23:59:59'
so the BETWEEN range is showing the entire year and the index can better be utilized.
Finally, if the above do not help, I would consider splitting your query some. The GROUP_CONCACT inline select for the TAGS might be a bit of a killer for you. You might want to have all the distinct elements first for the grouping per client, THEN get those details.... Something like
select
PQ.*,
group_concat(...) tags
from
( the entire primary part of the query ) as PQ
Left join yourGroupConcatTableBasis on key columns

MySQL include zero rows when using COUNT with GROUP BY

I am trying to perform a query which groups a set of data by an attribute called type_id.
SELECT
vt.id AS voucher_type,
COALESCE(COUNT(v.id), 0) AS vouchers_remaining
FROM
vouchers v
INNER JOIN voucher_types vt
ON vt.id = v.type_id
WHERE
v.sold = 0
GROUP BY vt.id
What I want in the result is the type_id and the number of unsold products remaining for each type. This is working OK provided that there is at least one left, however if there is a zero count row, it is not returned in the result set.
How can I set up a dummy row for those types which do not have any corresponding rows to count?
Any advice would be greatly appreciated.
Thanks
You'll have to use a LEFT JOIN instead of an INNER JOIN. You start by selecting all voucher_types and then left join to find the count.
SELECT
voucher_types.id AS voucher_type,
IFNULL(vouchers_count.vouchers_remaining, 0) AS vouchers_remaining
FROM
voucher_types
LEFT JOIN
(
SELECT
v.type_id AS voucher_type,
COUNT(v.id) AS vouchers_remaining
FROM
vouchers v
WHERE
v.sold = 0
GROUP BY v.type_id
) AS vouchers_count
ON vouchers_count.voucher_type = voucher_types.id
You want an OUTER JOIN (or LEFT JOIN, same difference) instead of an INNER JOIN. That should already do the trick.
Because you're doing an INNER JOIN you automatically exclude types with no corresponding vouchers. You need a RIGHT OUTER JOIN.
Also, as far as I can remember, COUNT will always give you an integer, so there is no need for the COALESCE.
Good luck,
Alin