Retrieving unique records with UNION in MySQL

I have two tables, usersin and usersout. (I cannot change the schema; otherwise a lot of the system's PHP code would have to change.) I need to get all user records in one query and mark each one as in or out. A user may have both an in record and an out record; in that case I should show only the out record, not the in record.
I have created tables with sample data in SQL Fiddle: http://www.sqlfiddle.com/#!9/ac99a/1/0
Can you help me remove the duplicate user records in this union query?

If you want all entries that appear in either the in table or the out table, but not in both of them, then a full outer join, filtered down to its unmatched rows, would be your friend.
Since MySQL does not support FULL OUTER JOIN, you can emulate it by combining a left outer join and a right outer join with UNION, like so:
SELECT
ui.id, ui.user, 'i'
FROM
usersIN ui
LEFT OUTER JOIN
usersOUT uo ON ui.user = uo.user
WHERE uo.id IS NULL
UNION
SELECT
uo.id, uo.user, 'o'
FROM
usersIN ui
RIGHT OUTER JOIN
usersOUT uo ON ui.user = uo.user
WHERE ui.id IS NULL;
This should give you the right output.
A good visual explanation of joins can be found here
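The query above can be tried out with a small script. This is only a sketch: it uses Python's built-in sqlite3 module as a stand-in for MySQL, the sample rows are invented (the real data is in the fiddle), and the RIGHT OUTER JOIN is rewritten as a LEFT JOIN with the tables swapped because older SQLite versions lack RIGHT JOIN. The second query is an assumption about what the question asks for (show the out record whenever one exists, otherwise the in record), not part of the answer above.

```python
import sqlite3

# Hypothetical sample data: alice is only "in", carol is only "out",
# bob has both an in record and an out record.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE usersin  (id INTEGER PRIMARY KEY, user TEXT);
    CREATE TABLE usersout (id INTEGER PRIMARY KEY, user TEXT);
    INSERT INTO usersin  (id, user) VALUES (1, 'alice'), (2, 'bob');
    INSERT INTO usersout (id, user) VALUES (1, 'bob'), (2, 'carol');
""")

# Same idea as the answer's query: keep users present on exactly one side.
only_one_side = conn.execute("""
    SELECT ui.id, ui.user, 'i' AS direction
    FROM usersin ui
    LEFT JOIN usersout uo ON ui.user = uo.user
    WHERE uo.id IS NULL
    UNION
    SELECT uo.id, uo.user, 'o'
    FROM usersout uo
    LEFT JOIN usersin ui ON ui.user = uo.user
    WHERE ui.id IS NULL
    ORDER BY 2
""").fetchall()
# only_one_side == [(1, 'alice', 'i'), (2, 'carol', 'o')]

# Assumed reading of the question: every out record, plus the in records
# of users who have no out record at all.
out_wins = conn.execute("""
    SELECT uo.id, uo.user, 'o' AS direction
    FROM usersout uo
    UNION ALL
    SELECT ui.id, ui.user, 'i'
    FROM usersin ui
    LEFT JOIN usersout uo ON ui.user = uo.user
    WHERE uo.id IS NULL
    ORDER BY 2
""").fetchall()
# out_wins == [(1, 'alice', 'i'), (1, 'bob', 'o'), (2, 'carol', 'o')]
```

Note that bob disappears entirely from the first result, while the second one keeps him once, marked 'o'.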

Related

MySQL - Why does WHERE ignore RIGHT JOIN?

I have the following MySQL query:
SELECT inv.inventory_id, inv.item_id, item.description, inv.quantity, item.class_id, class.description AS class,
class.is_spool, inv.location_id, location.description AS location, location.division_id, division.name AS division,
inv.service_date, inv.reel_number, inv.original_length, inv.current_length, inv.outside_sequential,
inv.inside_sequential, inv.next_sequential, inv.notes, inv.last_modified, inv.modified_by
FROM reel_inventory AS inv
INNER JOIN reel_items AS item ON inv.item_id = item.item_id
INNER JOIN reel_locations AS location ON inv.location_id = location.location_id
INNER JOIN locations AS division ON location.division_id = division.location_id
RIGHT JOIN reel_classes AS class on item.class_id = class.class_id;
The query works exactly as expected as is. What I was trying to do was add a WHERE clause to this query with one qualifier. For example:
RIGHT JOIN reel_classes AS class ON item.class_id = class.class_id
WHERE inv.current_length > 0;
When I do this, the rows that the RIGHT JOIN would add are no longer included in the result. I've not had a ton of experience with advanced queries, but could someone explain why the RIGHT JOIN is excluded from the result set when a WHERE is used, and how to properly write the query to include the RIGHT JOIN information?
Thanks in advance.
What you want is:
RIGHT JOIN reel_classes AS class
ON item.class_id = class.class_id AND
inv.current_length > 0;
Your question is why the RIGHT JOIN turns into an INNER JOIN with the WHERE clause.
The reason is simple. For the non-matching rows, inv.current_length is NULL, so the comparison inv.current_length > 0 evaluates to NULL rather than true, and the WHERE clause discards those rows.
I would also suggest that you use LEFT JOIN, starting with the table where you want to keep all the rows. Most people find it much easier to understand logic that is "keep all rows in the first table" rather than "keep all rows in some table whose name will come up".
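The difference between filtering in WHERE and filtering in ON can be seen in a small experiment. This is a sketch under assumptions: it runs on Python's built-in sqlite3 rather than MySQL, the two tables (classes, inventory) and their rows are hypothetical and much simpler than the original schema, and it uses LEFT JOIN starting from the table whose rows should all be kept, as suggested above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical, simplified stand-ins for reel_classes and reel_inventory.
    CREATE TABLE classes   (class_id INTEGER PRIMARY KEY, description TEXT);
    CREATE TABLE inventory (inventory_id INTEGER PRIMARY KEY,
                            class_id INTEGER, current_length INTEGER);
    INSERT INTO classes   VALUES (1, 'copper'), (2, 'fiber'), (3, 'coax');
    INSERT INTO inventory VALUES (10, 1, 500), (11, 2, 0);
""")

# Filter in WHERE: unmatched classes have current_length = NULL,
# NULL > 0 is not true, so those rows are dropped.
filter_in_where = conn.execute("""
    SELECT c.description, i.inventory_id
    FROM classes c
    LEFT JOIN inventory i ON i.class_id = c.class_id
    WHERE i.current_length > 0
    ORDER BY c.class_id
""").fetchall()
# filter_in_where == [('copper', 10)]

# Filter in ON: the condition only decides which inventory rows match;
# every class is still returned.
filter_in_on = conn.execute("""
    SELECT c.description, i.inventory_id
    FROM classes c
    LEFT JOIN inventory i
           ON i.class_id = c.class_id AND i.current_length > 0
    ORDER BY c.class_id
""").fetchall()
# filter_in_on == [('copper', 10), ('fiber', None), ('coax', None)]
```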

How to use left outer join in mysql to get my desirable results?

I have made an sql fiddle. Users with id 100 and 118 should have 0 records for assigned_scopes, assigned_qa, failed_qa and assigned_canvass. But it does not display that data; it just shows the record for the user with id 210. What I really need is to display all the users, with 0 in each column if they have nothing. How can I do this?
What I have tried since is in this fiddle. It works according to my requirements, but there is a problem with query optimization: its execution time is not what I want. Because of the huge data, it took more than 33 seconds to load the page (which is not good at all) when I used the second fiddle query. The query in the first fiddle executed in 2 seconds even with huge data. How can I correct the first query to give the results of the second query while (hopefully) staying fast?
The "ten second explanation" for why most queries with LEFT JOIN do not display rows that you expect is that you've put one of the left joined tables into the WHERE clause, demanding a value. This automatically makes any left join behave like an inner join.
You either have to say WHERE leftjoinedtable.column = value OR leftjoinedtable.column IS NULL or, preferably, put the condition in the ON clause (which doesn't need a qualifying 'OR xxx IS NULL', making your query simpler and more readable).
Move all the conditions in your WHERE clause out, and into the respective ON conditions, so your query no longer has a WHERE clause at all. Throw your latter attempt away; it has a Cartesian join and subselects in the main select list. In a production system it could be generating millions more rows than needed, selecting extra data based on them, and then discarding them. That is a very difficult query style to optimise, and usually unnecessary.
Edit: I had another look at your fiddle and noticed you also GROUP BY a column that could be null. When it is, those rows disappear from the output.
In terms of how you should write queries, consider first which tables will always have all the values. In your case these are the users and teams tables. These should be inner joined together first; then other tables which may not always have a matching row should be added with LEFT JOIN. RIGHT JOIN is seldom needed.
Alter the order of your tables so it's teams, inner join users, then left join the two tables you're counting on. When you group by, group on the user id from the users table, not the stats table.
I did this with your fiddles but couldn't find a way to save them as a new fiddle on the iPad, and couldn't copy the text, but here's a screenshot of a query that does what you require. I'll leave the typing as an exercise for the reader :) Don't forget to adjust the GROUP BY.
Note, now the users table is inner joined, the =15 filter could go in the where clause... I'll leave the general advice though, to encourage and remind you that a) it works just as well in the ON clause and b) to prefer putting these things in the ON because it means left joins work as you expect
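The advice above can be sketched on a toy schema. Everything here is assumed for illustration: the tables (users, assignments), their columns and the rows are hypothetical, much simpler than the fiddle, and the script runs on Python's built-in sqlite3 rather than MySQL. It contrasts a filter in WHERE with grouping on the joined table against a filter in ON with grouping on the users table.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical tables: only user 210 has any assignments.
    CREATE TABLE users       (user_id INTEGER PRIMARY KEY, role_id INTEGER);
    CREATE TABLE assignments (assignment_id INTEGER PRIMARY KEY,
                              tech_id INTEGER, status INTEGER);
    INSERT INTO users       VALUES (100, 15), (118, 15), (210, 15);
    INSERT INTO assignments VALUES (1, 210, 2), (2, 210, 2), (3, 210, 5);
""")

# Problem shape: the left joined table is filtered in WHERE and used in
# GROUP BY, so users without assignments vanish.
missing_users = conn.execute("""
    SELECT a.tech_id, COUNT(*) AS assigned
    FROM users u
    LEFT JOIN assignments a ON a.tech_id = u.user_id
    WHERE a.status = 2
    GROUP BY a.tech_id
""").fetchall()
# missing_users == [(210, 2)]

# Suggested shape: the filter on the joined table moves into ON, the
# grouping uses the users table, and COUNT(column) skips the NULLs.
all_users = conn.execute("""
    SELECT u.user_id, COUNT(a.assignment_id) AS assigned
    FROM users u
    LEFT JOIN assignments a
           ON a.tech_id = u.user_id AND a.status = 2
    WHERE u.role_id = 15
    GROUP BY u.user_id
    ORDER BY u.user_id
""").fetchall()
# all_users == [(100, 0), (118, 0), (210, 2)]
```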
Read a definition of left join on. It returns the rows that inner join on does, plus unmatched rows in the left table extended by nulls.
Re your 1st link query: If you want all the records from user then it must be the leftmost table in your left joins. You must not use tests in the where that require non-null values, because they remove the rows that might have null values. But why would you, i.e. why do you? The following returns rows only for users with role 15, per the 2nd link query; you don't explain why 15 is tested for in the 1st.
select u.user_id,
ut.name as team_name,
/* ... */
count(case when a.status = 2 AND a.qc_id = 0 and o.class_id= 3 then 1 else null end)
AS assigned_canvass
from am_user u
left join user_team ut
on u.user_team_id = ut.user_team_id
left join am_ts_assignment a
on u.user_id = a.tech_id
left join am_ts_order o
on o.assignment_id = a.assignment_id
where u.user_role_id = 15
group by u.user_id
order by u.user_id asc
Compose your queries incrementally and test as you add joins & columns.
(This is the same query I gave on your last post re essentially your 1st link query except it returns a row for every am_user instead of every am_ts_assignment, per each post. The where might have been an and in the first left join with am_user; either will do here.)

Difference between FROM and JOIN tables

I'm working through the JOIN tutorial on SQL zoo.
Let's say I'm about to execute the code below:
SELECT a.stadium, COUNT(g.matchid)
FROM game a
JOIN goal g
ON g.matchid = a.id
GROUP BY a.stadium
As it happens, it produces the same output as the code below:
SELECT a.stadium, COUNT(g.matchid)
FROM goal g
JOIN game a
ON g.matchid = a.id
GROUP BY a.stadium
So then, when does it matter which table you assign at FROM and which one you assign at JOIN?
When you are using an INNER JOIN like you are here, the order doesn't matter. That is because you are connecting two tables on a common index, so the order in which you use them is up to you. You should pick an order that is most logical to you, and easiest to read. A habit of mine is to put the table I'm selecting from first. In your case, you're selecting information about a stadium, which comes from the game table, so my preference would be to put that first.
In other joins, however, such as LEFT OUTER JOIN and RIGHT OUTER JOIN the order will matter. That is because these joins will select all rows from one table. Consider for example I have a table for Students and a table for Projects. They can exist independently, some students may have an associated project, but not all will.
If I want to get all students and project information while still seeing students without projects, I need a LEFT JOIN:
SELECT s.name, p.project
FROM student s
LEFT JOIN project p ON p.student_id = s.id;
Note here that the LEFT JOIN refers to the table in the FROM clause, so that means all students are selected. This also means that p.project will be null for some rows. Order matters here.
If I took the same concept with a RIGHT JOIN, it will select all rows from the table in the join clause. So if I changed the query to this:
SELECT s.name, p.project
FROM student s
RIGHT JOIN project p ON p.student_id = s.id;
This will return all rows from the project table, regardless of whether or not it has a match for students. This means that in some rows, s.name will be null. Similar to the first example, because I've made project the outer joined table, p.project will never be null (assuming it isn't in the original table). In the first example, s.name should never be null.
In the case of outer joins, order will matter. Thankfully, you can think intuitively with LEFT and RIGHT joins. A left join will return all rows in the table to the left of that statement, while a right join returns all rows from the right of that statement. Take this as a rule of thumb, but be careful. You might want to develop a pattern to be consistent with yourself, as I mentioned earlier, so these queries are easier for you to understand later on.
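To see the effect of table order, here is a small sketch using the student and project tables from the example. The sample rows and the project.id column are made up, and the script uses Python's built-in sqlite3 rather than MySQL. Because older SQLite versions lack RIGHT JOIN, the right join is expressed as a LEFT JOIN with the two tables swapped, which keeps the same rows.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical rows: Ben has no project, 'Orphan' has no student.
    CREATE TABLE student (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE project (id INTEGER PRIMARY KEY, project TEXT,
                          student_id INTEGER);
    INSERT INTO student VALUES (1, 'Ann'), (2, 'Ben');
    INSERT INTO project VALUES (1, 'Robot', 1), (2, 'Orphan', NULL);
""")

def run(sql):
    return conn.execute(sql).fetchall()

# All students, with NULL where there is no project.
students_first = run("""
    SELECT s.name, p.project
    FROM student s
    LEFT JOIN project p ON p.student_id = s.id
    ORDER BY s.id
""")
# students_first == [('Ann', 'Robot'), ('Ben', None)]

# All projects, with NULL where there is no student
# (same rows as student RIGHT JOIN project).
projects_first = run("""
    SELECT s.name, p.project
    FROM project p
    LEFT JOIN student s ON p.student_id = s.id
    ORDER BY p.id
""")
# projects_first == [('Ann', 'Robot'), (None, 'Orphan')]

# With an inner join the order of the tables makes no difference.
inner_a = run("SELECT s.name, p.project FROM student s "
              "JOIN project p ON p.student_id = s.id")
inner_b = run("SELECT s.name, p.project FROM project p "
              "JOIN student s ON p.student_id = s.id")
# inner_a == inner_b == [('Ann', 'Robot')]
```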
When you only JOIN 2 tables, usually the order does not matter: MySQL scans the tables in the optimal order.
When you join more than 2 tables, the order could matter:
SELECT ...
FROM a
JOIN b ON ...
JOIN c ON ...
Also, MySQL tries to join the tables in the order its optimizer estimates to be cheapest. But if a join is slow, it is possible that MySQL is scanning them in a non-optimal order. You can verify this with EXPLAIN. In this case, you can force the join order by adding the STRAIGHT_JOIN keyword.
The order doesn't always matter. I usually just order the tables in a way that makes sense to someone reading the query.
Sometimes order does matter. Try it with LEFT JOIN and RIGHT JOIN.
In this instance you are using an INNER JOIN. If you're expecting a match on a common ID or foreign key, it probably doesn't matter too much.
You would however need to specify the tables the correct way round if you were performing an OUTER JOIN, as not all records in this type of join are guaranteed to match via the same field.
Yes, it will matter when you use another kind of join, such as LEFT JOIN or RIGHT JOIN.
Currently you are using a plain JOIN, which is an INNER JOIN: it returns only the rows that match in both tables, so a row with no match in the joined table is excluded from the result.
If you use a LEFT / RIGHT {OUTER} join then the result will be different; follow this link for more detail

MYSQL SELECT: check if rowdata exists

In my SQL query I'm checking on different parameters. Nothing strange happens when there is data in each of the tables for the inserted tripcode. But when one table has no data in it, I don't get any data at all, even if the other tables have data. So I need to be able to check if the table has data in it and, if it has, select it.
SELECT roadtrip_tblgeneral.*,
GROUP_CONCAT(distinct roadtrip_tblhotels.hotel) as hotels,
GROUP_CONCAT(distinct roadtrip_tbllocations.location) as locations,
GROUP_CONCAT(distinct roadtrip_tbltransports.transport) as transports
FROM roadtrip_tblgeneral
INNER JOIN roadtrip_tblhotels
ON roadtrip_tblgeneral.id = roadtrip_tblhotels.tripid
INNER JOIN roadtrip_tbllocations
ON roadtrip_tblgeneral.id = roadtrip_tbllocations.tripid
INNER JOIN roadtrip_tbltransports
ON roadtrip_tblgeneral.id = roadtrip_tbltransports.tripid
WHERE roadtrip_tblgeneral.tripcode = :tripcode
GROUP BY roadtrip_tblgeneral.id
Only the tables with the GROUP_CONCAT in front need the check. I already tried with the keyword EXISTS in front of it.
Thanks in advance.
The INNER JOIN keyword returns rows only when there is at least one match in both tables. You can't have a match if there is no data, so perhaps you want to use a LEFT JOIN or a FULL JOIN (note that MySQL does not support FULL JOIN, so LEFT JOIN is the practical choice here).
A LEFT JOIN should be used, as it returns all the rows from the table on the left even if there are no matching rows in the right table.
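A cut-down version of the question's query shows the difference. This is a sketch with assumptions: the table names (trips, hotels, locations) are shortened hypothetical stand-ins for the roadtrip_tbl* tables, the rows are invented, and it runs on Python's built-in sqlite3, whose GROUP_CONCAT(DISTINCT ...) behaves like MySQL's for this purpose.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical data: trip 'ABC' has hotels but no locations at all.
    CREATE TABLE trips     (id INTEGER PRIMARY KEY, tripcode TEXT);
    CREATE TABLE hotels    (tripid INTEGER, hotel TEXT);
    CREATE TABLE locations (tripid INTEGER, location TEXT);
    INSERT INTO trips  VALUES (1, 'ABC');
    INSERT INTO hotels VALUES (1, 'Ritz'), (1, 'Ritz');
""")

# The same query text is run twice, once per join type.
QUERY = """
    SELECT t.tripcode,
           GROUP_CONCAT(DISTINCT h.hotel)    AS hotels,
           GROUP_CONCAT(DISTINCT l.location) AS locations
    FROM trips t
    {join} hotels h    ON t.id = h.tripid
    {join} locations l ON t.id = l.tripid
    WHERE t.tripcode = :tripcode
    GROUP BY t.id
"""
params = {"tripcode": "ABC"}

# INNER JOIN: the empty locations table wipes out the whole result.
inner_rows = conn.execute(QUERY.format(join="INNER JOIN"), params).fetchall()
# inner_rows == []

# LEFT JOIN: the trip is returned, with NULL for the missing locations.
left_rows = conn.execute(QUERY.format(join="LEFT JOIN"), params).fetchall()
# left_rows == [('ABC', 'Ritz', None)]
```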

How do I decide when to use right joins/left joins or inner joins Or how to determine which table is on which side?

I know the usage of joins, but sometimes I come across such a situation when I am not able to decide which join will be suitable, a left or right.
Here is the query where I am stuck.
SELECT count(ImageId) as [IndividualRemaining],
userMaster.empName AS ID#,
CONVERT(DATETIME, folderDetails.folderName, 101) AS FolderDate,
batchDetails.batchName AS Batch#,
Client=@ClientName,
TotalInloaded = IsNull(@TotalInloaded,0),
PendingUnassigned = @PendingUnassigned,
InloadedAssigned = IsNull(@TotalAssigned,0),
TotalProcessed = @TotalProcessed,
Remaining = @Remaining
FROM
batchDetails
Left JOIN folderDetails ON batchDetails.folderId = folderDetails.folderId
Left JOIN imageDetails ON batchDetails.batchId = imageDetails.batchId
Left JOIN userMaster ON imageDetails.assignedToUser = userMaster.userId
WHERE folderDetails.ClientId = @ClientID and verifyflag='n'
and folderDetails.FolderName IN (SELECT convert(VARCHAR,Value) FROM dbo.Split(@Output,','))
and userMaster.empName <> 'unused'
GROUP BY userMaster.empName, folderDetails.folderName, batchDetails.batchName
Order BY folderDetails.Foldername asc
Yes, it depends on the situation you are in.
Why use SQL JOIN?
Answer: Use the SQL JOIN whenever multiple tables must be accessed through an SQL SELECT statement and no results should be returned if there is not a match between the JOINed tables.
Reading this original article on The Code Project will help you a lot: Visual Representation of SQL Joins.
Also check this post: SQL SERVER – Better Performance – LEFT JOIN or NOT IN?.
Find original one at: Difference between JOIN and OUTER JOIN in MySQL.
In two sets:
Use a full outer join when you want all the results from both sets.
Use an inner join when you want only the results that appear in both sets.
Use a left outer join when you want all the results from set a, but if set b has data relevant to some of set a's records, then you also want to use that data in the same query too.
Please refer to the following image:
I think what you're looking for is to do a LEFT JOIN starting from the main-table to return all records from the main table regardless if they have valid data in the joined ones (as indicated by the top left 2 circles in the graphic)
JOINs happen in succession, so if you have 4 tables to join, and you always want all the records from your main table, you need to continue using LEFT JOIN throughout, for example:
SELECT * FROM main_table
LEFT JOIN sub_table ON main_table.ID = sub_table.main_table_ID
LEFT JOIN sub_sub_table on main_table.ID = sub_sub_table.main_table_ID
If you INNER JOIN the sub_sub_table, it will immediately shrink your result set down even if you did a LEFT JOIN on the sub_table.
Remember, when doing a LEFT JOIN, you need to account for NULL values being returned: if no record can be joined with the main_table, a LEFT JOIN still returns the row, and the joined table's fields contain NULL. An INNER JOIN will just "throw away" the row instead, because there's no valid link between the two (no corresponding record based on the IDs you've joined).
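The shrinking effect described above can be checked with a short sketch. The three table names come from the example query; the label columns and the rows are hypothetical, and the script uses Python's built-in sqlite3 rather than MySQL.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical rows: main row 1 has both children, row 2 only a
    -- sub_table row, row 3 has no children at all.
    CREATE TABLE main_table    (ID INTEGER PRIMARY KEY);
    CREATE TABLE sub_table     (main_table_ID INTEGER, label TEXT);
    CREATE TABLE sub_sub_table (main_table_ID INTEGER, label TEXT);
    INSERT INTO main_table    VALUES (1), (2), (3);
    INSERT INTO sub_table     VALUES (1, 's1'), (2, 's2');
    INSERT INTO sub_sub_table VALUES (1, 'ss1');
""")

# The last join type is filled in per run; everything else is identical.
QUERY = """
    SELECT main_table.ID, sub_table.label, sub_sub_table.label
    FROM main_table
    LEFT JOIN sub_table
           ON main_table.ID = sub_table.main_table_ID
    {last_join} sub_sub_table
           ON main_table.ID = sub_sub_table.main_table_ID
    ORDER BY main_table.ID
"""

# LEFT JOIN throughout: every main_table row survives, padded with NULLs.
all_left = conn.execute(QUERY.format(last_join="LEFT JOIN")).fetchall()
# all_left == [(1, 's1', 'ss1'), (2, 's2', None), (3, None, None)]

# INNER JOIN on the last table: rows without a sub_sub_table match are
# dropped, even though the first join was a LEFT JOIN.
inner_last = conn.execute(QUERY.format(last_join="INNER JOIN")).fetchall()
# inner_last == [(1, 's1', 'ss1')]
```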
However, you mention you have a where statement that filters out the rows you're looking for, so your question about the JOINs is moot, because that is not your real problem. (This is if I understand your comments correctly.)