MySQL Left Joins - mysql

EDIT: OK, think I need to be clearer - I'd like the result to show all the 'names' that appear in the table acme, against the counts (if any) from the results table. Hope that makes sense?
Having a huge issue and my brain isn't working as it should.
All I want to do is, in a single statement via a join, count the number of rows for a common field.
SELECT name, COUNT(name) as Count FROM acme
SELECT name, COUNT(name) as Total FROM results
I'm sure it should be something like this...
SELECT acme.name, COUNT(acme.name) As Count,
COUNT(results.name) as Total
FROM acme
LEFT JOIN results ON acme.name = results.name
GROUP BY acme.name
ORDER BY acme.name
But it doesn't bring back the correct counts.
Thoughts, where am I going wrong...this, I know, will be very very obvious.
H.

From your feedback, this will get what you want. You need to get the unique names and their counts from the ACME table FIRST, THEN join that to the results table to count its records. Otherwise you end up with a Cartesian product per name: if ACME had name "X" 5 times and Results had "X" 20 times, your total would be 100. The query below returns a single row showing "X", 5, 20, which is what it appears you are looking for (one row for each name that exists in ACME).
I've used a LEFT join so that names in the ACME table that DO NOT exist in the RESULTS table aren't dropped from your final answer. Counting r.Name rather than * makes those names show 0 instead of 1.
select
JustACME.Name,
JustACME.NameCount,
COUNT( r.Name ) as CountFromResultsTable
from
( select a.Name,
count(*) as NameCount
from
acme a
group by
a.Name ) JustACME
LEFT JOIN results r
on JustACME.Name = r.Name
group by
JustACME.Name,
JustACME.NameCount
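To check the 5 x 20 = 100 example, here is a small sketch that runs both the original join and a pre-aggregated version side by side. It is written in Python against an in-memory SQLite database only so that it runs standalone; SQLite is standing in for MySQL, the alias names are mine, and the extra name "Y" (present only in acme) is made-up test data.

```python
import sqlite3

# SQLite (in memory) stands in for MySQL here; the rows are invented test data.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE acme (name TEXT);
    CREATE TABLE results (name TEXT);
""")
conn.executemany("INSERT INTO acme VALUES (?)", [("X",)] * 5 + [("Y",)] * 2)
conn.executemany("INSERT INTO results VALUES (?)", [("X",)] * 20)

# Naive join: every acme "X" row pairs with every results "X" row.
naive = conn.execute("""
    SELECT acme.name, COUNT(acme.name), COUNT(results.name)
    FROM acme
    LEFT JOIN results ON acme.name = results.name
    GROUP BY acme.name
    ORDER BY acme.name
""").fetchall()

# Aggregate acme first, so each name contributes one row on the left side.
fixed = conn.execute("""
    SELECT j.name, j.name_count, COUNT(r.name)
    FROM (SELECT name, COUNT(*) AS name_count FROM acme GROUP BY name) AS j
    LEFT JOIN results AS r ON j.name = r.name
    GROUP BY j.name, j.name_count
    ORDER BY j.name
""").fetchall()

print(naive)  # [('X', 100, 100), ('Y', 2, 0)]
print(fixed)  # [('X', 5, 20), ('Y', 2, 0)]
```

"Y" comes back with 0 because COUNT(r.name) skips the NULL that the LEFT JOIN produces for an unmatched row.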

It looks like it's because of the join, it's screwing with your counts. Try running the join with SELECT * FROM... and look at the resulting table. The problem should be obvious from there. =D

Yes, your join (inner or outer, doesn't matter) is messing with your results.
In fact, it is likely returning the product of rows with the same name, rather than the sum.
What you want to do is sum the rows from the first table, sum the rows from the second table, and join that.
Like this:
Select name, a.cnt as Count, coalesce(r.cnt, 0) as Total
From (select name, count(*) as cnt from acme group by name) a
Left join (select name, count(*) as cnt from results group by name) r using (name)
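For anyone who wants to try the "count each table separately, then join" idea, here is a rough sketch using Python's built-in sqlite3 module as a stand-in for MySQL. The rows and the alias names (cnt, acme_count, results_count) are invented for illustration.

```python
import sqlite3

# In-memory SQLite database standing in for MySQL; the data is made up.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE acme (name TEXT);
    CREATE TABLE results (name TEXT);
    INSERT INTO acme VALUES ('a'), ('a'), ('b');
    INSERT INTO results VALUES ('a'), ('a'), ('a'), ('c');
""")

# Each derived table has one row per name, so the join cannot multiply rows.
rows = conn.execute("""
    SELECT name, a.cnt AS acme_count, COALESCE(r.cnt, 0) AS results_count
    FROM (SELECT name, COUNT(*) AS cnt FROM acme GROUP BY name) AS a
    LEFT JOIN (SELECT name, COUNT(*) AS cnt FROM results GROUP BY name) AS r
        USING (name)
    ORDER BY name
""").fetchall()

print(rows)  # [('a', 2, 3), ('b', 1, 0)]
```

Name 'c' exists only in results, so the LEFT JOIN from acme leaves it out, which matches the requirement to list the names that appear in acme.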

I do not see why you rule out using two statements; that just complicates everything.
The only reason I can see is to get the two results into one answer.
I do not know whether this would work, but I would try:
SET @acount = (SELECT count(DISTINCT name) FROM acme);
SELECT count(DISTINCT name) as Total, @acount as Count FROM results;
I would send this as one batch and (hopefully) get back the correct results. Note that it is not clear from your question whether you want to know how often each name repeats or whether you want to count unique names.

Related

MySQL Left-Join vs Join with subquery return different results

I wrote 2 queries expecting them to yield same results, yet they turned out to be different.
I would like to ask why they return different results?
I am more confident that the 1st query returns what I want, so how should I amend the 2nd query? Thx!
1st SQL query:
SELECT
Product.*,
Status.*,
Price.*
FROM Product
LEFT JOIN Status
ON Product.MarketplaceId = Status.ListingId
LEFT JOIN Price
ON Product.ProductId = Price.Id
LIMIT 15;
2nd SQL query:
SELECT
Product.*,
Status.*,
Price.*
FROM Product
LEFT JOIN Status
ON Product.MarketplaceId IN
(
SELECT ListingId FROM Status
)
LEFT JOIN Price
ON Product.ProductId IN
(
SELECT Id FROM Price
)
LIMIT 15;
Without seeing the data, I can't say exactly why the results differ. However, if you intend to get only one row per product, I would change the second query to use DISTINCT. If the join condition matches multiple rows, you get that many rows back even for a single product.
Don't use IN ( SELECT ... ) if there is an alternative; it is often slower.
Don't use LEFT JOIN if the matching row in the 'right' table will always be there. It confuses readers.
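One reason for the different results is the lack of ORDER BY (as Akina mentioned): without it, LIMIT 15 returns an arbitrary 15 rows. Note too that the ON clauses in the second query never reference the joined table's own columns, so every product with a match is paired with every Status row and every Price row; even without the LIMIT, the two queries will generally not return the same rows.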
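To see how the two ON conditions behave, here is a cut-down sketch with only Product and Status (Price is left out to keep it short). It uses Python's sqlite3 module as a stand-in for MySQL, and the State column and all rows are invented for the example.

```python
import sqlite3

# In-memory SQLite database standing in for MySQL; the data is made up.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Product (ProductId INTEGER, MarketplaceId INTEGER);
    CREATE TABLE Status (ListingId INTEGER, State TEXT);
    INSERT INTO Product VALUES (1, 10), (2, 20), (3, 30);
    INSERT INTO Status VALUES (10, 'live'), (20, 'paused');
""")

# Equality in ON: each product meets only its own Status row.
equality = conn.execute("""
    SELECT Product.ProductId, Status.ListingId
    FROM Product
    LEFT JOIN Status ON Product.MarketplaceId = Status.ListingId
    ORDER BY Product.ProductId, Status.ListingId
""").fetchall()

# IN (subquery) in ON: the condition ignores the joined Status row, so a
# product that has any match is paired with every Status row.
in_subquery = conn.execute("""
    SELECT Product.ProductId, Status.ListingId
    FROM Product
    LEFT JOIN Status ON Product.MarketplaceId IN (SELECT ListingId FROM Status)
    ORDER BY Product.ProductId, Status.ListingId
""").fetchall()

print(equality)     # [(1, 10), (2, 20), (3, None)]
print(in_subquery)  # [(1, 10), (1, 20), (2, 10), (2, 20), (3, None)]
```

With the real tables the second form would multiply again for Price, which is one way the first 15 rows can end up looking quite different.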

MySQL COUNT on multiple relations

I am trying to figure out how to return the counts of multiple different items in separate tables related to the table I am joining to.
I'm quite new to joins so I'm not sure if I'm using the correct join. Hopefully you can help me!
the tables would be like this:
staff_type table
id type
1 doctor
2 nurse
3 surgeon
staff table
id type_id name
1 1 bob
2 1 jane
3 2 phil
4 2 esther
5 3 michael jackson
I'm trying to construct a statement that will return the COUNT of the various staff types, as in how many doctors, how many nurses etc. I also want the query to bring back the data from the staff_type table.
I haven't much ideas on how to construct this query, but it may look something like this:
SELECT staff_type.*, COUNT(Staff.type_id = staff_type.id)
INNER JOIN staff AS Staff ON (staff_type.id = Staff.type_id)
I know this is nothing like what it's supposed to be, but hopefully some of you can point me in the right direction. Other posts on this topic are hard for me to understand and look like they are trying to do something slightly different.
thanks for any help!
You can use something like this as an example:
SELECT t.id
, t.type
, COUNT(s.id) AS count_staff
FROM staff_type t
LEFT
JOIN staff s
ON s.type_id = t.id
GROUP
BY t.id
, t.type
To understand what that's doing, you can remove the GROUP BY clause and the aggregate function (COUNT) from the SELECT list, and see the rows returned by the JOIN operation.
For example:
SELECT t.id AS `t.id`
, t.type AS `t.type`
, s.id AS `s.id`
, s.name AS `s.name`
, s.type_id AS `s.type_id`
FROM staff_type t
LEFT
JOIN staff s
ON s.type_id = t.id
ORDER
BY t.id
, s.id
Note that the LEFT keyword indicates an "outer join". This is going to return all the rows from the table on the left side, even if there aren't any matching rows on the right side. (This will let us get a "zero" count for a staff_type that doesn't have any related staff.)
When we add the GROUP BY clause, that says to "collapse" all the rows that have the same values for the expressions or columns in the GROUP BY list.
We can use an aggregate function, such as COUNT(), SUM(), MAX(), MIN() to perform an operation on all of the rows that are collapsed into a group.
The COUNT() aggregate starts at zero, and increments by one for every non-NULL value. So, we use an expression, COUNT(s.id) that we are guaranteed will be non-NULL if there is a matching row from staff, and will be NULL if there isn't a matching row.
(I hope this helps clear up some of your confusion.)
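Here is a small sketch of the COUNT(s.id) versus COUNT(*) difference, run through Python's sqlite3 module as a stand-in for MySQL. The data is the sample from the question plus one invented staff_type row ("porter", with no staff) so that there is a zero to show.

```python
import sqlite3

# In-memory SQLite database standing in for MySQL.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE staff_type (id INTEGER PRIMARY KEY, type TEXT);
    CREATE TABLE staff (id INTEGER PRIMARY KEY, type_id INTEGER, name TEXT);
    -- 'porter' is a hypothetical extra type with no staff rows.
    INSERT INTO staff_type VALUES
        (1, 'doctor'), (2, 'nurse'), (3, 'surgeon'), (4, 'porter');
    INSERT INTO staff VALUES
        (1, 1, 'bob'), (2, 1, 'jane'), (3, 2, 'phil'),
        (4, 2, 'esther'), (5, 3, 'michael jackson');
""")

# COUNT(s.id) skips the NULL from an unmatched LEFT JOIN row; COUNT(*) does not.
rows = conn.execute("""
    SELECT t.id, t.type, COUNT(s.id) AS count_staff, COUNT(*) AS count_rows
    FROM staff_type t
    LEFT JOIN staff s ON s.type_id = t.id
    GROUP BY t.id, t.type
    ORDER BY t.id
""").fetchall()

for row in rows:
    print(row)
# (1, 'doctor', 2, 2)
# (2, 'nurse', 2, 2)
# (3, 'surgeon', 1, 1)
# (4, 'porter', 0, 1)
```

The last row is the interesting one: the type still appears because of the LEFT JOIN, and only COUNT(s.id) reports it as having zero staff.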
The query will be,
select staff_type.*, count(staff.id) as count from staff_type left join staff on (staff.type_id = staff_type.id) group by staff_type.id

MySQL avg() and count() in one statement with group by

today I'm fighting with MySQL: I've got two tables, that contain records like that (actually there are more columns, but I don't think it's relevant):
Table Metering:
id, value
1000, 0.117
1000, 0.689
1001, 0.050
...
Table Res (there is no more than one record per id in this table):
id, number_residents
1001, 2
...
I try to get results in the following format:
number_residents, avg, count(id)
2, 0.1234, 456
3, 0.5678, 567
...
In words: I try to find out the average of the value-fields with the same number_residents. The id-field is the connection between the two tables. The count(id)-column should show how many ids have been found with that number_residents. The query I could come up with was the following:
select number_residents,count(distinct Metering.id),avg(value)
from Metering, Res
where Metering.id = Res.id
group by number_residents;
The results look like what I was after, but when I tried to validate them I became unsure. I tried it without the distinct at first, but that leads to values that are too high in the count column of the results.
Is my statement right to get what I want? I thought it might have something to do with the order of execution, as asked here, but I actually can't find any official documentation on that...
Thanks for helping!
Judging by the table names, Res is the "parent" table and Metering is the "child" table - that is, there are 0-n meterings for each residence.
You have used "old school" joins (and I mean old - the join syntax has been around for 25 years now), which are inner joins, meaning residences without meterings won't participate in the results.
Use an outer join:
select
number_residents,
count(distinct r.id) residences_count,
avg(value) average_value
from Res r
left join Metering m on m.id = r.id
group by number_residents
Although Metering.id = Res.id for matched rows, with a left join counting one or the other may produce different results. I've changed the count to count residences, which for a left join means residences that don't have meterings still count.
Now, nulls (which are what you get from a left-joined table that doesn't have a matching row) don't participate in avg(), in either the numerator or the denominator. If you want residences without any meterings to count when calculating the average (as if they had a single zero metering for the purposes of dividing the total value), use this query:
select
number_residents,
count(distinct r.id) residences_count,
sum(value) / count(r.id) average_value
from Res r
left join Metering m on m.id = r.id
group by number_residents
Because res.id is never null, count(r.id) counts the number of meterings plus 1 for every residence without any meterings.
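A tiny sketch of the two averages side by side may help. It uses Python's sqlite3 module as a stand-in for MySQL, and the rows are invented: residence 1001 has two residents and no meterings at all.

```python
import sqlite3

# In-memory SQLite database standing in for MySQL; the data is made up.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Res (id INTEGER PRIMARY KEY, number_residents INTEGER);
    CREATE TABLE Metering (id INTEGER, value REAL);
    INSERT INTO Res VALUES (1000, 2), (1001, 2), (1002, 3);
    INSERT INTO Metering VALUES (1000, 1.0), (1000, 5.0), (1002, 4.0);
""")

rows = conn.execute("""
    SELECT r.number_residents,
           COUNT(DISTINCT r.id)       AS residences_count,
           AVG(m.value)               AS avg_ignoring_nulls,
           SUM(m.value) / COUNT(r.id) AS avg_with_zero_fill
    FROM Res r
    LEFT JOIN Metering m ON m.id = r.id
    GROUP BY r.number_residents
    ORDER BY r.number_residents
""").fetchall()

for row in rows:
    print(row)
# (2, 2, 3.0, 2.0)
# (3, 1, 4.0, 4.0)
```

For two residents, AVG() divides 6.0 by the two non-NULL values, while SUM/COUNT(r.id) divides by three rows because the unmatched residence 1001 still contributes one row.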

Get top 5 most popular values in mysql with where clause

I'm working on a project and I have a problem. I have a table named friends with three columns: id, from_email and to_email (it's a social networking site and "from_email" is the person that follows the "to_email"). I want a query to return the top 5 friends I follow according to the number of their followers. I know that the query for top 5 is:
SELECT
to_email,
COUNT(*) AS friendsnumber
FROM
friends
GROUP BY
to_email
ORDER BY
friendsnumber DESC
LIMIT 5
Any ideas?
I would also like to return friends with the same number of followers ordered by their name. Is it possible?
You should use COUNT(from_email) instead of COUNT(*), because you want to count the followers, which are represented by from_email (the two only differ if from_email can be NULL).
Thus, your select clause would be something like:
SELECT to_email, COUNT(from_email) as magnitude
As for getting the most popular people that you follow, you could use an IN clause:
WHERE to_email IN (SELECT to_email FROM friends WHERE from_email='MY_EMAIL');
As for the name, you would join this query with the other table that contains the name value.
Since you've got the essentials now, I hope you can try to compose the full query on your own =)
Join again to the table for the 2nd tier count:
SELECT f1.to_email
FROM friends f1
JOIN friends f2 on f2.to_email = f1.to_email
WHERE f1.from_email = 'myemail'
GROUP BY 1
ORDER BY count(*) DESC
LIMIT 5
If an index is defined on to_email, this should perform well.
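Here is a quick sketch of the self-join on a handful of made-up rows, using Python's sqlite3 module as a stand-in for MySQL. The example.com addresses are invented, the tie-break on to_email is my assumption (names live in another table that isn't shown), and it assumes each (from_email, to_email) pair appears only once.

```python
import sqlite3

# In-memory SQLite database standing in for MySQL; all rows are invented.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE friends (id INTEGER PRIMARY KEY, from_email TEXT, to_email TEXT)"
)
follows = [
    ("me@example.com", "a@example.com"),
    ("x@example.com", "a@example.com"),
    ("y@example.com", "a@example.com"),
    ("me@example.com", "b@example.com"),
    ("me@example.com", "c@example.com"),
    ("x@example.com", "c@example.com"),
    # d has the most followers, but "me" does not follow d.
    ("x@example.com", "d@example.com"),
    ("y@example.com", "d@example.com"),
    ("z@example.com", "d@example.com"),
    ("w@example.com", "d@example.com"),
]
conn.executemany("INSERT INTO friends (from_email, to_email) VALUES (?, ?)", follows)

# f1 picks the people I follow; f2 supplies one row per follower of each of them.
top = conn.execute("""
    SELECT f1.to_email, COUNT(*) AS followers
    FROM friends f1
    JOIN friends f2 ON f2.to_email = f1.to_email
    WHERE f1.from_email = ?
    GROUP BY f1.to_email
    ORDER BY followers DESC, f1.to_email
    LIMIT 5
""", ("me@example.com",)).fetchall()

print(top)
# [('a@example.com', 3), ('c@example.com', 2), ('b@example.com', 1)]
```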

inner join and sum()

I have two tables:
trips_data, which has tripid, userid, species (int), killcount
masterspecies, which has species_id and species (string)
I am trying to retrieve a list of all species seen on a trip.
I am hoping to get
sum(killcount) : tripid :species (string):species (int)
57 300 rabbit 1
2 300 foxes 2
1 300 squirels 8
and so on
I have the query below, which returns everything I want except that the sum(killcount) is about 8000 when it should be 57.
Any help would be hugely appreciated.
SELECT sum(trips_data.killcount),
trips_data.species,trips_data.spceces,
masterspecies.species
from trips_data
join masterspecies
WHERE tripid=$tripid
AND userid=1
AND NOT killcount=0
You need to tell the database how to join; otherwise you're getting every possible combination. It looks like trips_data.species should match masterspecies.species_id; is that right? You also need to group the results by species.
SELECT sum(trips_data.killcount), trips_data.species, masterspecies.species
from trips_data join masterspecies
WHERE tripid=$tripid AND userid=1 and trips_data.species=masterspecies.species_id
group by trips_data.species, masterspecies.species;
This is a cartesian join:
from trips_data join masterspecies
This will return a record for every combination of records from the two tables. That is usually not the intention. Join conditions look something like this:
from trips_data
join masterspecies
on masterspecies.species_id = trips_data.species
This will match the records up and only return matching records, so there is a chance your sum will come out correctly.
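To make the inflation concrete, here is a small sketch comparing the join without a condition against the join with ON plus GROUP BY. It runs through Python's sqlite3 module as a stand-in for MySQL; the rows are invented to match the expected output in the question, and a bound parameter replaces the $tripid interpolation.

```python
import sqlite3

# In-memory SQLite database standing in for MySQL; the data is made up.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE trips_data (tripid INTEGER, userid INTEGER,
                             species INTEGER, killcount INTEGER);
    CREATE TABLE masterspecies (species_id INTEGER PRIMARY KEY, species TEXT);
    INSERT INTO masterspecies VALUES (1, 'rabbit'), (2, 'foxes'), (8, 'squirrels');
    INSERT INTO trips_data VALUES
        (300, 1, 1, 50), (300, 1, 1, 7), (300, 1, 2, 2),
        (300, 1, 8, 1), (301, 1, 1, 9);
""")

trip_id = 300

# No join condition: each trip row is repeated once per masterspecies row.
inflated = conn.execute("""
    SELECT SUM(killcount)
    FROM trips_data
    JOIN masterspecies
    WHERE tripid = ? AND userid = 1 AND killcount <> 0
""", (trip_id,)).fetchone()[0]

# With ON and GROUP BY: one total per species.
grouped = conn.execute("""
    SELECT SUM(t.killcount), t.tripid, m.species, t.species
    FROM trips_data t
    JOIN masterspecies m ON m.species_id = t.species
    WHERE t.tripid = ? AND t.userid = 1 AND t.killcount <> 0
    GROUP BY t.tripid, t.species, m.species
    ORDER BY t.species
""", (trip_id,)).fetchall()

print(inflated)  # 180, i.e. (57 + 2 + 1) * 3 species rows
print(grouped)   # [(57, 300, 'rabbit', 1), (2, 300, 'foxes', 2), (1, 300, 'squirrels', 8)]
```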