MySQL query returns unexpected values - mysql

I need to generate a list of courses and, for each course, count:
all questions
unanswered questions
answered but unchecked questions
My database structure looks like this:
https://docs.google.com/open?id=0B9ExyO6ktYcOenZ1WlBwdlY2R3c
Explanation for some of tables:
answer_chk_results - checked answers table. So if some answer doesn't exist on this table, it means it's unchecked
lesson_questions - lesson <-> question associations (by id) table
courses-lessons - courses <-> lessons associations (by id) table
Executing
SELECT
c.ID,
c.NAME,
COUNT(lq.id) AS Questions,
COUNT(
CASE
WHEN a.id IS NULL THEN
lq.id
END
) AS UnAnswered,
COUNT(
CASE
WHEN cr.id IS NULL THEN
lq.id
END
) AS UnChecked
FROM
courses c
LEFT JOIN `courses-lessons` cl ON cl.cid = c.id
LEFT JOIN `lesson_questions` lq ON lq.lid = cl.lid
LEFT JOIN answers a ON a.qid = lq.qid
LEFT JOIN answer_chk_results cr ON cr.aid = a.id
GROUP BY
c.ID
I tested it first on SQL Fiddle with sample data. (The real data is huge, so I can't put it on SQL Fiddle.) It returned some values, so I thought it worked well. But when I tested it with real data, I saw that it returns wrong values. For example, when I count manually, the total question count must be 25, but the query returns 27. Maybe I'm doing something wrong.
Note: the MySQL server is running on my local machine, so I can give you a TeamViewer ID and password if you want to connect to my desktop remotely and test the query with real data.

I suspect the problem is that different joins are resulting in a multiplication of rows. The best way to fix this is by using subqueries along each dimension. The following is a more practical way. Replace the COUNTs in the select with COUNT DISTINCT:
SELECT c.ID, c.NAME,
COUNT(distinct lq.id) AS Questions,
COUNT(distinct CASE WHEN a.id IS NULL THEN lq.id END) AS UnAnswered,
COUNT(distinct CASE WHEN cr.id IS NULL THEN lq.id END) AS UnChecked
Compared to COUNT, COUNT DISTINCT is a resource hog (it has to remove duplicates). However, it will probably work fine for your purposes.
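To make the row multiplication concrete, here is a minimal sketch. It uses Python's built-in sqlite3 module as a stand-in for MySQL, and the tiny data set (one course, two questions, one of which has two answers) is invented for illustration; the table courses_lessons is a renamed stand-in for `courses-lessons`.

```python
import sqlite3

# Hypothetical miniature of the schema: one course, one lesson, two questions.
# Question 10 has two answers, so the join emits its lesson_questions row twice.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE courses (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE courses_lessons (cid INTEGER, lid INTEGER);
    CREATE TABLE lesson_questions (id INTEGER PRIMARY KEY, lid INTEGER, qid INTEGER);
    CREATE TABLE answers (id INTEGER PRIMARY KEY, qid INTEGER);
    INSERT INTO courses VALUES (1, 'SQL basics');
    INSERT INTO courses_lessons VALUES (1, 1);
    INSERT INTO lesson_questions VALUES (1, 1, 10), (2, 1, 20);
    INSERT INTO answers VALUES (1, 10), (2, 10);
""")

row = conn.execute("""
    SELECT COUNT(lq.id)                                          AS plain_count,
           COUNT(DISTINCT lq.id)                                 AS distinct_count,
           COUNT(DISTINCT CASE WHEN a.id IS NULL THEN lq.id END) AS unanswered
    FROM courses c
    LEFT JOIN courses_lessons  cl ON cl.cid = c.id
    LEFT JOIN lesson_questions lq ON lq.lid = cl.lid
    LEFT JOIN answers          a  ON a.qid  = lq.qid
    GROUP BY c.id
""").fetchone()

plain_count, distinct_count, unanswered = row
print(plain_count, distinct_count, unanswered)  # 3 2 1
```

The plain COUNT reports three questions because the twice-answered question appears in two joined rows, while COUNT(DISTINCT ...) reports the two questions that actually exist.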

Use this query
SELECT
c.ID,
c.NAME,
COUNT(lq.id) AS Questions,
COUNT(IF(a.id IS NULL, lq.id, NULL)) AS UnAnswered,
COUNT(IF(cr.id IS NULL, lq.id, NULL)) AS UnChecked
FROM courses c
LEFT JOIN `courses-lessons` cl ON cl.cid = c.id
LEFT JOIN `lesson_questions` AS lq ON lq.lid = cl.lid
LEFT JOIN answers a ON a.qid = lq.qid
LEFT JOIN answer_chk_results cr ON cr.aid = a.id
GROUP BY c.ID

Related

How is joining with a subquery different from joining without a subquery? Looking for difference between two similar queries

I want to see which user created floor equipment for which customer -- both of these queries do what I want. The second query, however, results with 700 more rows than the first. Could you please explain the difference?
I ran another query that found the difference between the two sets -- sure enough, this query yielded 700 rows. Therefore, the data output is the same, but somehow the second query catches more results. I tried looking at the additional 700 rows, but they all seemed normal and similar to the other results. I can't find the difference by looking at the code, which is what I'm hoping someone can help me with
First query
SELECT customer.name, user.name, floor_equipment.id
FROM customer, user, floor_equipment, floor, building, site
WHERE (floor_equipment.floorID = floor.ID AND floor.buildingID = building.id AND
building.siteID = site.id AND floor_equipment.created_by = user.id)
Second Query
SELECT newTable.custName, newTable.userName, newTable.equipID
FROM (SELECT customer.name as "custName", user.name as "userName",
floor_equipment.id as "equipID", floor_equipment.created_by as "creatorID"
FROM customer, floor_equipment, floor, building, site
WHERE (floor_equipment.floorID = floor.ID AND floor.buildingID = building.id AND
building.siteID = site.id AND site.customerID = customer.ID)) as newTable, user
WHERE user.id = newTable.creatorID
I would expect both of these queries to have the same result, however the second query yields 700 more rows than the first. Aside from the extra rows, both queries result in the same data. The 700 additional rows seem to be normal and similar to the other rows.
NOTE: There is a seemingly pointless subquery in the second query. The purpose of this was for optimization. I am running these queries within Domo, a business intelligence webapp. I wrote the subquery in hopes that it would run faster. Because of the way Domo works, the former took 2 hours whereas the latter took 45 seconds.
Ignoring (or perhaps rectifying) the syntax errors, your first query can be written as follows:
SELECT c.name
, u.name
, fe.id
FROM customer c
CROSS
JOIN user u
JOIN floor_equipment fe
ON fe.created_by = u.id
JOIN floor f
ON f.ID = fe.floorID
JOIN building b
ON b.id = f.buildingID
JOIN site s
ON s.id = b.siteID
Likewise, written a little more coherently, your second query is as follows:
SELECT x.custName
, u.name userName
, x.equipID
FROM
( SELECT c.name custName
, fe.id equipID
, fe.created_by creatorID
FROM customer c
JOIN site s
ON s.customerID = c.ID
JOIN building b
ON b.siteID = s.id
JOIN floor f
ON f.buildingID = b.id
JOIN floor_equipment fe
ON fe.floorID = f.ID
) x
JOIN user u
ON u.id = x.creatorID
Again, we can omit the subquery and write it thus...
SELECT c.name custName
, u.name userName
, fe.id equipID
, fe.created_by creatorID
FROM customer c
JOIN site s
ON s.customerID = c.ID
JOIN building b
ON b.siteID = s.id
JOIN floor f
ON f.buildingID = b.id
JOIN floor_equipment fe
ON fe.floorID = f.ID
JOIN user u
ON u.id = fe.created_by
...so we can see that the first query had a cartesian product (CROSS JOIN), whereas the second query does not.
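The effect of the missing join condition can be reproduced on a toy data set. The sketch below uses Python's sqlite3 module rather than MySQL, and it shortens the schema (equipment links straight to a site, skipping floor and building); both the shortened schema and the rows are assumptions made purely for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical, shortened schema: floor and building are left out.
    CREATE TABLE customer (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE site (id INTEGER PRIMARY KEY, customerID INTEGER);
    CREATE TABLE floor_equipment (id INTEGER PRIMARY KEY, siteID INTEGER);
    INSERT INTO customer VALUES (1, 'Acme'), (2, 'Globex'), (3, 'Initech');
    INSERT INTO site VALUES (1, 1), (2, 2);
    INSERT INTO floor_equipment VALUES (1, 1), (2, 1), (3, 2);
""")

# customer appears in FROM but no condition ties it to the other tables,
# so every equipment row is paired with every customer.
unconstrained = conn.execute("""
    SELECT COUNT(*)
    FROM customer c, floor_equipment fe, site s
    WHERE fe.siteID = s.id
""").fetchone()[0]

# Adding the missing condition removes the Cartesian product.
constrained = conn.execute("""
    SELECT COUNT(*)
    FROM customer c
    JOIN site s             ON s.customerID = c.id
    JOIN floor_equipment fe ON fe.siteID = s.id
""").fetchone()[0]

print(unconstrained, constrained)  # 9 3
```

With three customers and three pieces of equipment, the unconstrained form yields 3 x 3 = 9 rows, while the fully joined form yields one row per piece of equipment.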
Your code is a Cartesian product between the tables:
customer, user, floor_equipment, floor, building, site
and your WHERE condition is not a join condition but just a tuple of Boolean values
floor_equipment.floorID = floor.ID,
floor.buildingID = building.id,
building.siteID = site.id,
floor_equipment.created_by = user.id
( boolean, boolean, boolean, boolean)
Each Boolean is the result of the corresponding comparison, e.g.:
floor_equipment.floorID = floor.ID
so in practice the query returns every row of any table that has no matching counterpart in these conditions.
In the second query, your first Cartesian product is expanded by the join between the first result and the matching rows for user.id and newTable.creatorID. Looking at your code, it could be that you need explicit JOIN syntax and a proper ON condition.

Alternative way to improve my mysql query?

Is there a better way to write this query? It works just fine in Workbench, but when I run it from JS, it doesn't return the right value.
I want to show users a list of all the items that match their filter settings (based on the selected category's material and design).
Query:
SELECT COUNT(A.id)
FROM tbl_product A
JOIN tbl_product_details B ON A.id = B.prod_id
JOIN tbl_category C ON A.id = C.prod_id
JOIN tbl_material D ON A.id = D.prod_id
JOIN tbl_design E ON A.id = E.prod_id
WHERE C.category_id IN (6) AND (D.material_id IN (15) OR E.design_id IN (39));
I expect the output to be (workbench result):
COUNT(A.id): 42
instead, it's giving me:
COUNT(A.id): 1582
I am guessing that you want:
SELECT COUNT(DISTINCT A.id)
There are probably other ways to phrase the query (notably, using EXISTS), but this is the simplest modification.
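As a rough sketch of the two options mentioned here, the following compares the inflated join count, COUNT(DISTINCT ...), and an EXISTS phrasing. It runs on Python's sqlite3 module instead of MySQL, leaves out tbl_product_details for brevity, and uses invented rows; all of that is an assumption for illustration only.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE tbl_product  (id INTEGER PRIMARY KEY);
    CREATE TABLE tbl_category (prod_id INTEGER, category_id INTEGER);
    CREATE TABLE tbl_material (prod_id INTEGER, material_id INTEGER);
    CREATE TABLE tbl_design   (prod_id INTEGER, design_id INTEGER);
    -- Invented data: product 1 matches the filter, product 2 does not.
    INSERT INTO tbl_product  VALUES (1), (2);
    INSERT INTO tbl_category VALUES (1, 6), (2, 6);
    INSERT INTO tbl_material VALUES (1, 15), (1, 16), (2, 99);
    INSERT INTO tbl_design   VALUES (1, 39), (1, 40), (2, 1);
""")

join_sql = """
    SELECT {agg}
    FROM tbl_product A
    JOIN tbl_category C ON A.id = C.prod_id
    JOIN tbl_material D ON A.id = D.prod_id
    JOIN tbl_design   E ON A.id = E.prod_id
    WHERE C.category_id IN (6) AND (D.material_id IN (15) OR E.design_id IN (39))
"""
plain = conn.execute(join_sql.format(agg="COUNT(A.id)")).fetchone()[0]
distinct = conn.execute(join_sql.format(agg="COUNT(DISTINCT A.id)")).fetchone()[0]

# EXISTS only asks whether a matching row exists, so it never multiplies rows.
via_exists = conn.execute("""
    SELECT COUNT(*)
    FROM tbl_product A
    WHERE EXISTS (SELECT 1 FROM tbl_category C
                  WHERE C.prod_id = A.id AND C.category_id IN (6))
      AND (EXISTS (SELECT 1 FROM tbl_material D
                   WHERE D.prod_id = A.id AND D.material_id IN (15))
        OR EXISTS (SELECT 1 FROM tbl_design E
                   WHERE E.prod_id = A.id AND E.design_id IN (39)))
""").fetchone()[0]

print(plain, distinct, via_exists)  # 3 1 1
```

Note that the EXISTS form is not strictly equivalent to the inner joins: the joined query drops products that have no row at all in one of the joined tables, whereas the EXISTS version only checks the conditions it names.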

Running a check within a SQL query (Maybe a subquery?)

I have a simple laptop testing booking system with 6 laptops, named Laptop01 to 06 that each have three allocated time slots.
A user is able to select these time slots if the slot is not booked or if the booking has been cancelled/declined.
While I have working code, I've realised a fatal error that causes a cancelled/declined slot to duplicate.
Let me explain...
event_information - Holds the booking event information (only ID is needed for this example)
event_machine_time - Holds all the laptops, with three rows per laptop with the unique timings available to choose from
event_booking - Holds the actual booking, which then links to another candidate database, not included here
I then run a simple query that joins everything together and (I thought) identifies the booked events:
SELECT machine_laptop, machine_name, B.id AS m_id, C.id AS c_id, C.confirmed AS c_confirmed, C.live AS c_live,
(C.id IS NOT NULL AND C.confirmed !=2 AND C.live !=0) AS booked
FROM event_information A
INNER JOIN event_machine_time B ON ( 1 =1 )
LEFT JOIN event_booking C on (B.id = C.machine_time_id and A.id = C.information_id )
WHERE A.id = :id
ORDER BY `B`.`id` DESC
booked is checking if confirmed isn't 2 - which means the booking has been cancelled/declined (0 - not confirmed, 1 - confirmed) and live is checking for deletion (0 - deleted, 1 - not deleted).
However if a person either gets deleted (live - 0) or cancels/declines (confirmed - 2) then in my front end slot selector dropdown it will add an extra slot as the booked column is still 0, as shown below:
This allows the user to then choose from two slots at the same time, meaning double bookings occur.
I now know that using a Join is the wrong thing to do, and I'm presuming that I need to run a subquery, but I'm not an SQL expert and I would love some help to find examples of similar 'second queries' that I can learn from.
Also apologies if my terminology is wrong.
EDIT:
As requested I've included the output:
Second edit and conclusion:
In the end I managed to craft a solution together using a sub query to remove the cancelled/declined bookings before the output, then use a Group By to only display one of each timing. This most likely isn't the best way, but it worked for me.
SELECT machine_laptop, machine_name, B.id AS m_id, C.id AS c_id, C.confirmed AS c_confirmed, C.live AS c_live, B.start_time AS b_start_time, (
C.id IS NOT NULL
AND C.confirmed !=2
AND C.live !=0
) AS booked
FROM event_information A
INNER JOIN event_machine_time B ON (1=1)
LEFT JOIN (SELECT * FROM event_booking WHERE confirmed <> '2' AND live <> '0') AS C ON ( B.id = C.machine_time_id AND A.id = C.information_id )
WHERE A.id = :id
GROUP BY m_id
ORDER BY machine_name ASC, b_start_time ASC
Thank you for all your input.
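The idea behind that conclusion (drop cancelled or deleted bookings before the LEFT JOIN sees them) can be sketched as follows. This is an illustration only: it runs on Python's sqlite3 module rather than MySQL, omits event_information (a single event is assumed), puts the filter in the ON clause instead of a derived table, and uses made-up rows.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE event_machine_time (id INTEGER PRIMARY KEY, machine_name TEXT);
    CREATE TABLE event_booking (
        id INTEGER PRIMARY KEY, machine_time_id INTEGER,
        confirmed INTEGER, live INTEGER);
    INSERT INTO event_machine_time VALUES (1, 'Laptop01'), (2, 'Laptop01'), (3, 'Laptop01');
    -- Slot 1: one cancelled booking and one active booking. Slot 2: cancelled only.
    INSERT INTO event_booking VALUES (1, 1, 2, 1), (2, 1, 1, 1), (3, 2, 2, 1);
""")

# Joining every booking: slot 1 appears twice (cancelled + active).
rows_plain = conn.execute("""
    SELECT B.id, C.id
    FROM event_machine_time B
    LEFT JOIN event_booking C ON B.id = C.machine_time_id
    ORDER BY B.id, C.id
""").fetchall()

# Filtering inside the ON clause: cancelled/deleted bookings never match,
# so each slot appears once and booked is simply "a live booking exists".
rows_filtered = conn.execute("""
    SELECT B.id, C.id IS NOT NULL AS booked
    FROM event_machine_time B
    LEFT JOIN event_booking C
           ON B.id = C.machine_time_id
          AND C.confirmed <> 2
          AND C.live <> 0
    ORDER BY B.id
""").fetchall()

print(len(rows_plain), rows_filtered)  # 4 [(1, 1), (2, 0), (3, 0)]
```

Keeping the filter in the ON clause (or in a derived table, as in the query above) preserves unbooked slots; moving it to WHERE would turn the LEFT JOIN into an inner join and hide them.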
Try the query below:
SELECT machine_laptop, machine_name, B.id AS m_id, C.id AS c_id, C.confirmed
AS c_confirmed, C.live AS c_live,
(C.id IS NOT NULL AND C.confirmed !=2 AND C.live !=0) AS booked
FROM event_information A
LEFT JOIN event_booking C ON A.id = C.information_id
RIGHT JOIN event_machine_time B ON B.id = C.machine_time_id
WHERE A.id = :id
ORDER BY `B`.`id` DESC
If you make event_booking (C) the starting point for your query, there's no need to pull all rows and columns from A and B. Instead, you can join on matching rows directly. But as I can't quite grasp what your query is trying to achieve, I have a couple of questions first.
While this may work, it's something that's neither under your control nor defined by you. A stricter mode would politely tell you to specify which aliased table you're referring to in your SELECT, as this
SELECT machine_laptop, machine_name -- combined with
FROM event_information A
actually doesn't make sense, and the only reason it works is that you're relying on MySQL's optimisations. In addition, you're doing table joins in a mixed mode (meaning that you use both JOIN and WHERE tA.colX=tB.colY methods), which makes the query really difficult to follow.
INNER JOIN event_machine_time B ON ( 1 =1 )
Um? What exactly is the purpose of this? As far as I can tell, this will only cause it to JOIN both full tables, only to filter the result later using WHERE.
Furthermore, are you even using primary keys? Your condition includes C.id IS NOT NULL, while primary keys can't contain NULLs. (NULL is the third boolean state in SQL land: there is True, False, and Null, meaning Undefined. It obviously can't be used in a primary key, since a primary key must be unique and an Undefined value can be anything or nothing, which violates the uniqueness requirement.) So I'm assuming you're using this NULL check because the temp table during the JOIN seems to contain them?
EDIT:
Try to split this into two parts, where you first join 2 tables, and then join third table with the result.
I suggest you briefly go over What is the difference between "INNER JOIN" and "OUTER JOIN"? It's a pretty great post and clarifies many aspects.
For starters, I'd go with something like:
SELECT
<i.cols>,
<b.cols>,
<mt.cols>,
IF(b.confirmed != 2 AND b.live != 0, True, False) AS booked
FROM
event_booking b
LEFT JOIN
event_information i ON b.information_id = i.id
LEFT JOIN
event_machine_time mt ON b.machine_time_id = mt.id
WHERE <conditions>
Later I'd change LEFT JOIN into something more appropriate. However, bear in mind that INNER JOIN is only useful if you're 100% sure that the rows returned from the joined table's columns are unique.
Can there even be a 1:n or n:1 relationship between the i and b tables? I'd assume there couldn't be multiple bookings for the same event info (n:1), nor the same event information for multiple events (1:n)?

My SQL query is returning results but they are repeated ~50 times. I don't understand why

The query I'm using calls on a few tables in the database and works fine. However, when I add line 10 to the mix it returns 50 or more repeated results. I'm still somewhat new to SQL and Sequel Pro so I'm sure the solution isn't too complicated but I am truly stumped right now.
Here is the code:
SELECT c.first_name, c.last_name, ca.company, ca.city, ca.state, ct.certificate_number, ct.certificate_date
FROM customer c, customer_type ctype, cust_address ca, certification ct, cust_prof_cert cp
WHERE ca.id_customer = c.id_customer LIKE cp.prof_cert_id_prof_cert
AND c.customer_type_id_customer_type = ctype.id_customer_type
AND ct.customer_id_customer = c.id_customer
AND ca.id_customer = c.id_customer
AND ctype.customer_type IN('CIRA','CIRA, CDBV')
AND ct.course_type_id_course_type = 1
AND ct.certificate_number IS NOT NULL
AND cp.prof_cert_id_prof_cert = "1"
ORDER BY ct.certificate_number ASC, c.last_name ASC;
Thank you for your time.
By writing your SQL like that, you are not relating the data, just selecting it. I would recommend changing your SQL to use JOINs, for example:
SELECT Orders.OrderID, Customers.CustomerName, Orders.OrderDate
FROM Orders
INNER JOIN Customers
ON Orders.CustomerID=Customers.CustomerID;
Here is an article that might be able to help you a bit: w3schools, Joins
Here's your query using the SQL92 syntax for joins. You should use this syntax instead of the SQL89 "comma-style" joins.
SELECT c.first_name, c.last_name, ca.company, ca.city, ca.state,
ct.certificate_number, ct.certificate_date
FROM customer AS c
INNER JOIN customer_type AS ctype ON c.customer_type_id_customer_type = ctype.id_customer_type
INNER JOIN cust_address AS ca ON ca.id_customer = c.id_customer
INNER JOIN certification AS ct ON ct.customer_id_customer = c.id_customer
INNER JOIN cust_prof_cert AS cp -- what's this join condition?
WHERE ca.id_customer = c.id_customer LIKE cp.prof_cert_id_prof_cert
AND ctype.customer_type IN('CIRA','CIRA, CDBV')
AND ct.course_type_id_course_type = 1
AND ct.certificate_number IS NOT NULL
AND cp.prof_cert_id_prof_cert = '1'
ORDER BY ct.certificate_number ASC, c.last_name ASC;
A few weird things I notice in this query:
The first term in the WHERE clause is strange. You should know that LIKE has higher precedence than = so this might not be doing what you think it's doing. It's as if you wrote
WHERE ca.id_customer = (c.id_customer LIKE cp.prof_cert_id_prof_cert)
Which means evaluate the LIKE and produce a 0 or a 1 to represent the boolean condition. Then look for a ca.id_customer matching that 0 or 1.
Given that strange term, I can find no other join condition for the cp table. The default join if you give no restriction for it is that every row matches every row in the joined tables. So if you have 50 rows where cp.prof_cert_id_prof_cert = 1, then it will effectively multiply the results from the rest of the joined tables by 50.
This is called a Cartesian product, or in MySQL parlance it's counted in SHOW STATUS as a Full join.
ctype.customer_type IN('CIRA','CIRA, CDBV') You have quoted the second and third strings together. Basically, this means you are trying to match the column against two strings, one of which happens to contain a comma.
You probably meant to write ctype.customer_type IN('CIRA','CIRA','CDBV') so the column may match any of these three values.
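The difference between the two IN lists is easy to check on a toy table. The sketch below uses Python's sqlite3 module instead of MySQL, and the three stored values are invented; whether the real table actually contains a combined 'CIRA, CDBV' value is unknown.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customer_type (id_customer_type INTEGER PRIMARY KEY, customer_type TEXT);
    -- Invented values: two single types and one combined string.
    INSERT INTO customer_type VALUES (1, 'CIRA'), (2, 'CDBV'), (3, 'CIRA, CDBV');
""")

def matches(in_list_sql):
    """Return the customer_type values matched by the given IN (...) list."""
    sql = ("SELECT customer_type FROM customer_type "
           "WHERE customer_type IN (" + in_list_sql + ") "
           "ORDER BY id_customer_type")
    return [row[0] for row in conn.execute(sql)]

# Two strings: the second one is a single value that contains a comma.
as_written = matches("'CIRA', 'CIRA, CDBV'")
# Separately quoted strings: each one is its own candidate value.
separately_quoted = matches("'CIRA', 'CDBV'")

print(as_written)         # ['CIRA', 'CIRA, CDBV']
print(separately_quoted)  # ['CIRA', 'CDBV']
```

If the column really stores the combined string 'CIRA, CDBV', the original list is correct as written; the point is only that the comma inside the quotes does not separate values.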
I would suggest not querying multiple tables in your FROM clause, I believe this is the cause of your duplicate rows. If you separate out the tables into separate inner or left joins, (whichever you need) you should be able to match which ever keys in each table manually, instead of having SQL attempt to automatically do this.

Problem using MySQL Join

I have a MySQL SELECT query which fetches data from 6 tables using MySQL JOIN. Here is the query I am using:
SELECT
u.id,u.password,
u.registerDate,
u.lastVisitDate,
u.lastVisitIp,
u.activationString,
u.active,
u.block,
u.gender,
u.contact_id,
c.name,
c.email,
c.pPhone,
c.sPhone,
c.area_id,
a.name as areaName,
a.city_id,
ct.name as cityName,
ct.state_id,
s.name as stateName,
s.country_id,
cn.name as countryName
FROM users u
LEFT JOIN contacts c ON (u.contact_id = c.id)
LEFT JOIN areas a ON (c.area_id = a.id)
LEFT JOIN cities ct ON (a.city_id = ct.id)
LEFT JOIN states s ON (ct.state_id = s.id)
LEFT JOIN countries cn ON (s.country_id = c.id)
Although the query works perfectly fine, it sometimes returns duplicate results if it finds any duplicate values when using LEFT JOIN. For example, in the contacts table there are two rows with area id '2', which results in another duplicated row being returned. How do I write a query that selects only the required result without any duplicate rows? Is there a different type of MySQL join I should be using?
Thank you.
UPDATE:
Here is the contacts table; the column area_id may have several duplicate values.
ANSWER:
There was an error in the condition of my last LEFT JOIN, where I used (s.country_id = c.id); it should be (s.country_id = cn.id). After splitting the query and testing the parts individually, I tracked the error down. Thank you for your responses. It works perfectly fine now.
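Why the wrong alias produces extra rows can be shown with a small sketch. It uses Python's sqlite3 module in place of MySQL and a shortened, hypothetical schema in which contacts point straight at a state (areas and cities are skipped); the rows are invented.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical shortened schema: contacts reference a state directly.
    CREATE TABLE contacts  (id INTEGER PRIMARY KEY, state_id INTEGER);
    CREATE TABLE states    (id INTEGER PRIMARY KEY, country_id INTEGER);
    CREATE TABLE countries (id INTEGER PRIMARY KEY, name TEXT);
    INSERT INTO contacts  VALUES (1, 1);
    INSERT INTO states    VALUES (1, 1);
    INSERT INTO countries VALUES (1, 'A'), (2, 'B'), (3, 'C');
""")

query = """
    SELECT COUNT(*)
    FROM contacts c
    LEFT JOIN states s     ON c.state_id = s.id
    LEFT JOIN countries cn ON s.country_id = {right_side}
"""

# The ON clause never mentions cn, so whenever s.country_id happens to
# equal c.id, every row of countries is joined to that contact.
wrong = conn.execute(query.format(right_side="c.id")).fetchone()[0]

# Comparing against cn.id joins exactly the one matching country.
right = conn.execute(query.format(right_side="cn.id")).fetchone()[0]

print(wrong, right)  # 3 1
```

When s.country_id and c.id do not happen to be equal, the faulty condition instead matches no country at all and cn.name comes back NULL, which is why the symptom can look intermittent.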
Duplicated rows like the ones you mention seem to indicate a data problem.
If users is your most granular table, this shouldn't happen.
I'd guess, then, that it's possible for a single user to have multiple entries in contacts.
You could use DISTINCT as mentioned by @dxprog, but I think GROUP BY is more appropriate here. GROUP BY whichever data point could potentially be duplicated.
After all, if a user has several corresponding contact records, which one are you intending to JOIN to?
You must specify this if you want to remove "duplicates" because, as far as the RDBMS is concerned, the two rows matching
LEFT JOIN contacts c ON (u.contact_id = c.id)
are, in fact, distinct already.
I think a DISTINCT may be what you're looking for:
SELECT DISTINCT
u.id,u.password,
u.registerDate,
u.lastVisitDate,
u.lastVisitIp,
u.activationString,
u.active,
u.block,
u.gender,
u.contact_id,
c.name,
c.email,
c.pPhone,
c.sPhone,
c.area_id,
a.name as areaName,
a.city_id,
ct.name as cityName,
ct.state_id,
s.name as stateName,
s.country_id,
cn.name as countryName
FROM users u
LEFT JOIN contacts c ON (u.contact_id = c.id)
LEFT JOIN areas a ON (c.area_id = a.id)
LEFT JOIN cities ct ON (a.city_id = ct.id)
LEFT JOIN states s ON (ct.state_id = s.id)
LEFT JOIN countries cn ON (s.country_id = cn.id)
SELECT DISTINCT compares every selected column, not just the user ID, so it only removes rows that are exact copies of each other; rows that differ in any joined column will still appear.