MySQL: differences in counts caused by joins

I have a problem in MySQL where I use the COUNT function in conditions. However, when I combine this with joins, the COUNT values include ALL rows, even the ones filtered out, although I use grouping.
I'm providing a minimal working example, which may not make much practical sense or be smartly designed.
So assume I have 3 tables:
products with fields: productId, name, active (boolean)
teams with fields: teamId, name
rel_production with fields: teamId, productId
So basically I have products and teams with ids and names. Products can be active (let's say that means they are still in production).
Then I have a relation that records which team is working on which product.
To illustrate the problem, assume the tables contain the following minimal data:
products
teams
rel_production
Now the query I want to run is, in plain English: "I want all teams that are working on exactly 2 products, at least one of which must be active."
The query works in general and looks like this in MySQL:
SELECT
teams.*,
"r_count:",
r_count.*,
COUNT(r_count.productId),
"r_active:",
r_active.*,
"p_active:",
p_active.*
FROM teams
INNER JOIN rel_production r_active ON r_active.teamId = teams.teamId
INNER JOIN products p_active ON p_active.productId = r_active.productId AND p_active.active
INNER JOIN rel_production r_count ON r_count.teamId = teams.teamId
GROUP BY teams.teamId, r_active.teamId
HAVING COUNT(r_count.productId) = 2 #4 is the problem!!!!!!!!!!!!!
Now the problem is with team 1. Because it is working on 2 active products, COUNT(r_count.productId) will be 4 and not 2, so my query will filter it out.
Here is the screenshot with the result without the HAVING clause:
I see why this happens: the two inner joins on rel_production generate 4 rows, which the GROUP BY then merges into one. So what I need is the COUNT after the GROUP BY and not before.
How can I fix this?

Perform the filtering on teams in a separate subquery, and then join to that:
SELECT
t.teamId,
t.name
FROM teams t
INNER JOIN
(
SELECT r.teamId
FROM rel_production r
INNER JOIN products p
ON r.productId = p.productId
GROUP BY r.teamId
HAVING COUNT(DISTINCT r.productId) = 2 AND SUM(p.active) > 0
) f
ON t.teamId = f.teamId;
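To see the row multiplication and the subquery fix side by side, here is a minimal sketch that runs both queries from Python. It makes several assumptions: SQLite (via the standard sqlite3 module) stands in for MySQL, and the sample rows are made up, since the original table contents are not shown. The derived-table alias f is also just an illustrative name.

```python
import sqlite3

# Assumption: SQLite stands in for MySQL; the rows below are made-up sample data.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE products (productId INTEGER PRIMARY KEY, name TEXT, active INTEGER);
CREATE TABLE teams (teamId INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE rel_production (teamId INTEGER, productId INTEGER);
INSERT INTO products VALUES (1, 'p1', 1), (2, 'p2', 1), (3, 'p3', 0);
INSERT INTO teams VALUES (1, 'team1'), (2, 'team2'), (3, 'team3');
-- team1: two active products; team2: one active, one inactive; team3: one inactive
INSERT INTO rel_production VALUES (1, 1), (1, 2), (2, 1), (2, 3), (3, 3);
""")

# Double join: every "active" row of a team is paired with every
# rel_production row of that team, so COUNT sees the multiplied rows.
fan_out = conn.execute("""
SELECT teams.teamId,
       COUNT(r_count.productId),
       COUNT(DISTINCT r_count.productId)
FROM teams
INNER JOIN rel_production r_active ON r_active.teamId = teams.teamId
INNER JOIN products p_active
        ON p_active.productId = r_active.productId AND p_active.active
INNER JOIN rel_production r_count ON r_count.teamId = teams.teamId
GROUP BY teams.teamId
ORDER BY teams.teamId
""").fetchall()

# Subquery approach: aggregate once per team, then join the result to teams.
matching = conn.execute("""
SELECT t.teamId, t.name
FROM teams t
INNER JOIN (
    SELECT r.teamId
    FROM rel_production r
    INNER JOIN products p ON r.productId = p.productId
    GROUP BY r.teamId
    HAVING COUNT(DISTINCT r.productId) = 2 AND SUM(p.active) > 0
) f ON t.teamId = f.teamId
ORDER BY t.teamId
""").fetchall()

print(fan_out)   # (teamId, COUNT, COUNT DISTINCT) per team
print(matching)  # teams with exactly 2 products, at least one active
```

With this sample data, team 1 gets a plain COUNT of 4 but a COUNT(DISTINCT ...) of 2, which is why the subquery version keeps it.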

Related

MySQL - Join 2 tables and count number of entries

I'm trying to join 2 tables and count the number of entries for each unique value in one of the columns. In this case I'm trying to join patients and trials (patients has a FK to trials) and count the number of patients that show up in each trial. This is the code I have so far:
SELECT patients.trial_id, trials.title
FROM trials
JOIN(SELECT patients, COUNT(id) AS Num_Enrolled
FROM patients
GROUP BY trials) AS Trial_Name;
The outcome I'm trying to achieve is:
Trial_Name Num_Patients
Bushtucker 5
Tribulations 7
I'm completely new to sql and have been struggling with the syntax compared to scripting languages.
The column names aren't 100% clear from your question, but you are after a basic aggregation. Adjust the column names if necessary:
select t.title Trial_Name, Count(*) Num_Patients
from Trials t
join Patients p on p.Trial_Id = t.Id
group by t.title;
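As a quick check of the join-then-aggregate pattern, here is a small sketch that runs it from Python. The assumptions: SQLite (standard sqlite3 module) is used in place of MySQL, the key columns are named trials.id and patients.trial_id, and the rows are invented to match the counts in the desired outcome.

```python
import sqlite3

# Assumption: SQLite stands in for MySQL; column names are guesses.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE trials (id INTEGER PRIMARY KEY, title TEXT);
CREATE TABLE patients (
    id INTEGER PRIMARY KEY,
    trial_id INTEGER REFERENCES trials(id)
);
INSERT INTO trials VALUES (1, 'Bushtucker'), (2, 'Tribulations');
""")

# Five patients enrolled in trial 1 and seven in trial 2 (made-up rows).
conn.executemany(
    "INSERT INTO patients (trial_id) VALUES (?)",
    [(1,)] * 5 + [(2,)] * 7,
)

# One output row per trial title; COUNT(*) counts the joined patient rows.
rows = conn.execute("""
SELECT t.title AS Trial_Name, COUNT(*) AS Num_Patients
FROM trials t
JOIN patients p ON p.trial_id = t.id
GROUP BY t.title
ORDER BY t.title
""").fetchall()

print(rows)
```

Note that an inner join drops trials with no patients; a LEFT JOIN with COUNT(p.id) would list them with a count of 0.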
Building on Stu-'s answer: the column names are guesses, but you can write the query with logic like this. Group by the trial title only, because also grouping by the patient id would return one row per patient, each with a count of 1.
SELECT t.title AS Trial_Name, COUNT(p.id) AS Num_Patients
FROM trials AS t
INNER JOIN patients AS p
ON p.trial_id = t.id
GROUP BY t.title;

MySQL inner join query select from same table multiple times

I'm not sure I have worded the question title correctly. As an example, I have summarized my query below.
I have an order table which saves order details such as customer id, address, product ids and quantity ordered, with one row per order. So multiple inventory/product ids are saved in a single row.
My query looks like this (it is a summarized version; I have omitted various other fields for an easier explanation):
SELECT customer.name,customer.address,tbl_order.order_date,tbl_order.product1_id,tbl_order.product2_id,inventory.product1_name,inventory.product2_name
FROM tbl_order
INNER JOIN customer ON tbl_order.customer_id = customer.id
INNER JOIN inventory on tbl_order.product1_id = inventory.id
INNER JOIN inventory on tbl_order.product2_id = inventory.id
where YEAR(tbl_order.order_date)='$year'
So my question is how to get the inventory details from the inventory table for each product id in tbl_order. I am running a while loop to show all the data for a year:
while($row=mysqli_fetch_assoc($sql1))
I could split this into two queries and run the inventory query separately, but then how do I combine the while loops? An inventory query can also come back empty when a product is not in the order (not every order contains every product), so this doesn't work:
while($row=mysqli_fetch_assoc($sql1)) and ($row1=mysqli_fetch_assoc($inv1)) and ($row2=mysqli_fetch_assoc($inv2))
and so on for 10 products.
First of all, you have a bad DB design, and I kindly advise you to normalize your DB.
Second, if you cannot redesign the DB, you can join the same table multiple times using aliases, like:
SELECT
customer.name, customer.address, tbl_order.order_date,
tbl_order.product1_id, inv1.product1_name,
tbl_order.product2_id, inv2.product2_name
FROM tbl_order
INNER JOIN customer ON tbl_order.customer_id = customer.id
INNER JOIN inventory AS inv1 ON tbl_order.product1_id = inv1.id
INNER JOIN inventory AS inv2 ON tbl_order.product2_id = inv2.id
WHERE YEAR(tbl_order.order_date)='$year'
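Here is a runnable sketch of the aliased self-join, driven from Python. It departs from the query above in a few labeled ways, all of them assumptions: SQLite (standard sqlite3 module) stands in for MySQL, the inventory name column is called product_name, LEFT JOIN is used instead of INNER JOIN so that orders with an empty product slot still appear, and the year is passed as a bound parameter through a date range because SQLite has no YEAR() function.

```python
import sqlite3

# Assumption: SQLite stands in for MySQL; schema and rows are illustrative.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE customer (id INTEGER PRIMARY KEY, name TEXT, address TEXT);
CREATE TABLE inventory (id INTEGER PRIMARY KEY, product_name TEXT);
CREATE TABLE tbl_order (
    id INTEGER PRIMARY KEY,
    customer_id INTEGER,
    order_date TEXT,
    product1_id INTEGER,
    product2_id INTEGER
);
INSERT INTO customer VALUES (1, 'Ann', '1 Main St');
INSERT INTO inventory VALUES (10, 'bolt'), (20, 'nut');
INSERT INTO tbl_order VALUES
    (1, 1, '2020-03-01', 10, 20),
    (2, 1, '2020-06-15', 20, NULL),   -- second product slot is empty
    (3, 1, '2021-01-05', 10, 20);     -- different year, filtered out
""")

year = 2020

# inventory is joined once per product column, each time under its own alias.
# LEFT JOIN keeps the order row even when a product slot is NULL.
rows = conn.execute("""
SELECT c.name, o.order_date,
       inv1.product_name AS product1_name,
       inv2.product_name AS product2_name
FROM tbl_order o
INNER JOIN customer c ON o.customer_id = c.id
LEFT JOIN inventory inv1 ON o.product1_id = inv1.id
LEFT JOIN inventory inv2 ON o.product2_id = inv2.id
WHERE o.order_date >= ? AND o.order_date < ?
ORDER BY o.order_date
""", (f"{year}-01-01", f"{year + 1}-01-01")).fetchall()

for row in rows:
    print(row)
```

Because everything comes back in one result set, a single fetch loop is enough; an empty product slot simply shows up as a NULL (None in Python) name.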

MySQL View in place of subquery does not return the same result

The query below grabs some information about a category of toys and shows the most recent sale price for three levels of condition (e.g., Brand New, Used, Refurbished). The price for each sale is almost always different. One other thing: the sales table row ids are not necessarily in chronological order (e.g., a sale with an id of 5 could have happened later than a sale with an id of 10).
This query works but is not performant. It runs in a manageable amount of time, usually about 1s. However, I need to add yet another left join to include some more data, which causes the query time to balloon up to about 9s, no bueno.
Here is the working but nonperformant query:
SELECT b.brand_name, t.toy_id, t.toy_name, t.toy_number, tt.toy_type_name, cp.catalog_product_id, s.date_sold, s.condition_id, s.sold_price FROM brands AS b
LEFT JOIN toys AS t ON t.brand_id = b.brand_id
JOIN toy_types AS tt ON t.toy_type_id = tt.toy_type_id
LEFT JOIN catalog_products AS cp ON cp.toy_id = t.toy_id
LEFT JOIN toy_category AS tc ON tc.toy_category_id = t.toy_category_id
LEFT JOIN (
SELECT date_sold, sold_price, catalog_product_id, condition_id
FROM sales
WHERE invalid = 0 AND condition_id <= 3
ORDER BY date_sold DESC
) AS s ON s.catalog_product_id = cp.catalog_product_id
WHERE tc.toy_category_id = 1
GROUP BY t.toy_id, s.condition_id
ORDER BY t.toy_id ASC, s.condition_id ASC
But like I said it's slow. The sales table has about 200k rows.
What I tried to do was create the subquery as a view, e.g.,
CREATE VIEW sales_view AS
SELECT date_sold, sold_price, catalog_product_id, condition_id
FROM sales
WHERE invalid = 0 AND condition_id <= 3
ORDER BY date_sold DESC
Then replace the subquery with the view, like
SELECT b.brand_name, t.toy_id, t.toy_name, t.toy_number, tt.toy_type_name, cp.catalog_product_id, s.date_sold, s.condition_id, s.sold_price FROM brands AS b
LEFT JOIN toys AS t ON t.brand_id = b.brand_id
JOIN toy_types AS tt ON t.toy_type_id = tt.toy_type_id
LEFT JOIN catalog_products AS cp ON cp.toy_id = t.toy_id
LEFT JOIN toy_category AS tc ON tc.toy_category_id = t.toy_category_id
LEFT JOIN sales_view AS s ON s.catalog_product_id = cp.catalog_product_id
WHERE tc.toy_category_id = 1
GROUP BY t.toy_id, s.condition_id
ORDER BY t.toy_id ASC, s.condition_id ASC
Unfortunately, this change causes the query to no longer grab the most recent sale, and the sales price it returns is no longer the most recent.
Why is it that the table view doesn't return the same result as the same select as a subquery?
After reading just about every top-n-per-group stackoverflow question and blog article I could find, getting a query that actually worked was fantastic. But now that I need to extend the query one more step I'm running into performance issues. If anybody wants to sidestep the above question and offer some ways to optimize the original query, I'm all ears!
Thanks for any and all help.
The solution to the subquery performance issue was to use the answer provided here: Groupwise maximum
I thought that this approach could only be used when querying a single table, but indeed it works even when you've joined many other tables. You just have to left join the same table twice using the s.date_sold < s2.date_sold join condition and make sure the where clause looks for the null value in the second table's id column.
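The left self-join described above can be sketched as follows, reduced to the sales table alone. This is an illustration under assumptions, not the exact query I ended up with: SQLite (standard sqlite3 module) stands in for MySQL, the id column is called sale_id, and the rows are made up.

```python
import sqlite3

# Assumption: SQLite stands in for MySQL; sale_id and the rows are illustrative.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE sales (
    sale_id INTEGER PRIMARY KEY,
    catalog_product_id INTEGER,
    condition_id INTEGER,
    date_sold TEXT,
    sold_price REAL,
    invalid INTEGER
);
INSERT INTO sales VALUES
    (5,  1, 1, '2020-05-01', 12.0, 0),  -- lower id, but the later sale
    (10, 1, 1, '2020-01-01',  9.0, 0),
    (11, 1, 2, '2020-02-01',  7.0, 0),
    (12, 1, 2, '2020-03-01', 99.0, 1),  -- invalid, must be ignored
    (13, 2, 1, '2020-04-01', 20.0, 0);
""")

# For each sale s, look for a later valid sale s2 of the same product and
# condition. If none exists (s2.sale_id IS NULL), s is the most recent one.
rows = conn.execute("""
SELECT s.catalog_product_id, s.condition_id, s.date_sold, s.sold_price
FROM sales s
LEFT JOIN sales s2
       ON s2.catalog_product_id = s.catalog_product_id
      AND s2.condition_id = s.condition_id
      AND s2.invalid = 0
      AND s.date_sold < s2.date_sold
WHERE s.invalid = 0
  AND s.condition_id <= 3
  AND s2.sale_id IS NULL
ORDER BY s.catalog_product_id, s.condition_id
""").fetchall()

for row in rows:
    print(row)
```

The filter on invalid has to apply to s2 inside the ON clause as well, otherwise an invalid later sale would hide the latest valid one. Two sales with the same date_sold for one product and condition would both be returned.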

using JOIN and subquery in mysql

I posted a question about 2 weeks ago about 'one to many' relation between SQL tables. Now I have a bit of a different scenario. Basically, there are two tables - coffee_users and coffee_product_registrations. The latter is connected to coffee_users table with 'uid' column. So basically coffee_users.uid = coffee_product_registrations.uid
A single user can have multiple products registered.
What I want to do is to display some product information (from coffee_product_registrations) along with some user information (from coffee_users), BUT retrieve only those rows that have more than 1 product registrations.
So to simplify, here are the steps I need to take:
Join two tables
Select users that have multiple products registered
Display all their products along with their names and stuff
My current SQL query looks like this:
SELECT c.uid, c.name, cpr.model
FROM coffee_users c
JOIN coffee_product_registrations cpr on c.uid = cpr.uid
GROUP BY c.uid
HAVING COUNT(cpr.uid) > 1
This joins the two tables on 'uid' column but displays only 1 row for each user. It selects just users that have multiple products registered.
Now I need to take these IDs and select ALL the products from coffee_product_registrations based on them.
I cannot figure out how to put this in one query.
Replace cpr.*, c.* with the columns you want to extract from the query.
Try this:
SELECT cpr.*, c.*
FROM coffee_product_registrations cpr
INNER JOIN coffee_users c ON c.uid = cpr.uid
INNER JOIN (SELECT cpr.uid
FROM coffee_product_registrations cpr
GROUP BY cpr.uid
HAVING COUNT(DISTINCT cpr.productId) > 1
) AS A ON c.uid = A.uid;
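A small runnable sketch of the same idea, driven from Python. The assumptions: SQLite (standard sqlite3 module) stands in for MySQL, the rows are made up, and because the question only shows a model column, the inner query counts rows with COUNT(*) instead of COUNT(DISTINCT cpr.productId). The alias multi is just an illustrative name.

```python
import sqlite3

# Assumption: SQLite stands in for MySQL; the rows are made-up sample data.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE coffee_users (uid INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE coffee_product_registrations (uid INTEGER, model TEXT);
INSERT INTO coffee_users VALUES (1, 'Ann'), (2, 'Bob'), (3, 'Cy');
INSERT INTO coffee_product_registrations VALUES
    (1, 'A100'), (1, 'B200'),
    (2, 'A100'),
    (3, 'C300'), (3, 'D400'), (3, 'E500');
""")

# The derived table picks the uids with more than one registration;
# joining back to the registrations returns every product of those users.
rows = conn.execute("""
SELECT c.uid, c.name, cpr.model
FROM coffee_product_registrations cpr
INNER JOIN coffee_users c ON c.uid = cpr.uid
INNER JOIN (
    SELECT uid
    FROM coffee_product_registrations
    GROUP BY uid
    HAVING COUNT(*) > 1
) multi ON multi.uid = c.uid
ORDER BY c.uid, cpr.model
""").fetchall()

for row in rows:
    print(row)
```

Bob has a single registration, so he is dropped, while Ann and Cy each appear once per registered model.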

Checking whole array over a multiple JOIN

To filter the output of selected entries from a single table, I need something like a multiple JOIN across several tables.
I want to filter a table of people by a particular column in the table. Let's say this column is "tasks". Tasks is also another table with the column "people", and the values of the two tables are connected through an existing "join" table in the database, which matches several IDs of one table to each ID of the other table.
If it were as simple as that, I could just filter with an INNER JOIN and a condition. The problem is that the entries of the table "tasks" are in turn connected to another table via a "join" table. To simplify things, let's say it is "settings". So each "task" consists of several "settings", which are connected by their IDs via a join table.
So what is the input?
I have an array of IDs representing the settings ids I do not want to be shown.
What should be the output?
As already said, I want a filtered output of "people", where the filter is "settings".
I want the SQL query to return each entry of the table "people" whose joined tasks do not join any of the setting ids from the array.
I hope you can help me with that.
Thanks in advance!
Example
Settings-Table:
1. Is in progress
2. Is important
3. Has unsolved issues
Tasks-Table: (settings.tasks is the join table between many tasks to many settings)
1. Task from 01.01.2012 - JOINS Settings in 1 and 3 (In progress + unsolved issues)
2. Task from 02.01.2012 - JOINS Settings in 2 (Is important)
3. Task from 03.01.2012 - JOINS Settings in 1 and 2 (...)
People-Table: (people.tasks is the join table between many people to many tasks)
1. Guy - JOINS Tasks in 1, 2, 3 (Has been assigned to all 3 tasks)
2. Dude - JOINS Tasks in 1 (Has been assigned to the Task from 01.01.2012)
3. Girl - JOINS Tasks in 2, 3 (...)
Now an array is passed to the SQL query:
[2,3] should return no one, because every person is assigned to a task that was either important or had unsolved issues!
[3] should return only the person "Girl", because she is the only one whose assigned tasks (2 and 3) had no unsolved issues.
I hope it is clear now. :)
SELECT DISTINCT PEOPLE.*
FROM PEOPLE INNER JOIN PEOPLE_TASKS ON PEOPLE.PERSON_ID = PEOPLE_TASKS.PERSON_ID
WHERE TASK_ID NOT IN (SELECT DISTINCT TASK_ID
FROM TASK_SETTINGS
WHERE SETTING_ID = <Id you don't want>)
EDIT (for supplying multiple setting ids you don't want)
SELECT DISTINCT PEOPLE.*
FROM PEOPLE INNER JOIN PEOPLE_TASKS ON PEOPLE.PERSON_ID = PEOPLE_TASKS.PERSON_ID
WHERE TASK_ID NOT IN (SELECT DISTINCT TASK_ID
FROM TASK_SETTINGS
WHERE SETTING_ID IN (<Id you don't want>))
First you have to join table people and table tasks with the join table, let's call it people_tasks.
select distinct p.* from people p
inner join people_tasks pt on p.people_id = pt.people_id
inner join tasks t on t.tasks_id = pt.tasks_id
Then you have to join table tasks and table settings with the join table, let's call it tasks_settings. You have to join them in the current select.
select distinct p.* from people p
inner join people_tasks pt on p.people_id = pt.people_id
inner join tasks t on t.tasks_id = pt.tasks_id
inner join tasks_settings ts on t.tasks_id = ts.tasks_id
inner join settings s on s.settings_id = ts.settings_id
Now you have all people connected with their tasks and settings. Finally you need the restriction: select the people who have any of the given settings, and then choose everyone else, like this:
select distinct p.people_id from people p
inner join people_tasks pt on p.people_id = pt.people_id
where p.people_id not in (
select distinct p2.people_id from people p2
inner join people_tasks pt2 on p2.people_id = pt2.people_id
inner join tasks t2 on t2.tasks_id = pt2.tasks_id
inner join tasks_settings ts2 on t2.tasks_id = ts2.tasks_id
inner join settings s2 on s2.settings_id = ts2.settings_id
where s2.settings_id in (list of ids)
)
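To check the NOT IN restriction against the example from the question (Guy, Dude, Girl), here is a minimal sketch in Python. The assumptions: SQLite (standard sqlite3 module) stands in for MySQL, the join tables are named people_tasks and tasks_settings as above, the helper people_without is a hypothetical name, and the inner query is shortened to the two join tables since only the ids are needed.

```python
import sqlite3

# Assumption: SQLite stands in for MySQL; data mirrors the question's example.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE people (people_id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE people_tasks (people_id INTEGER, tasks_id INTEGER);
CREATE TABLE tasks_settings (tasks_id INTEGER, settings_id INTEGER);
INSERT INTO people VALUES (1, 'Guy'), (2, 'Dude'), (3, 'Girl');
-- Guy: tasks 1, 2, 3; Dude: task 1; Girl: tasks 2, 3
INSERT INTO people_tasks VALUES (1, 1), (1, 2), (1, 3), (2, 1), (3, 2), (3, 3);
-- task 1: settings 1 and 3; task 2: setting 2; task 3: settings 1 and 2
INSERT INTO tasks_settings VALUES (1, 1), (1, 3), (2, 2), (3, 1), (3, 2);
""")


def people_without(excluded):
    """Names of people with tasks, none of which has an excluded setting.

    Hypothetical helper; expects a non-empty list of setting ids.
    """
    # One "?" per id, so the ids are bound as parameters, not pasted in.
    placeholders = ", ".join("?" for _ in excluded)
    sql = f"""
    SELECT DISTINCT p.name
    FROM people p
    INNER JOIN people_tasks pt ON p.people_id = pt.people_id
    WHERE p.people_id NOT IN (
        SELECT pt2.people_id
        FROM people_tasks pt2
        INNER JOIN tasks_settings ts2 ON ts2.tasks_id = pt2.tasks_id
        WHERE ts2.settings_id IN ({placeholders})
    )
    ORDER BY p.name
    """
    return [row[0] for row in conn.execute(sql, list(excluded))]


print(people_without([3]))
print(people_without([2, 3]))
```

The exclusion works on the person level: a single task with an unwanted setting removes the person, which matches the expected results for [3] and [2,3] in the question.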