Show multiple unmatched records in SQL - mysql

I have two tables whose IDs may not match. Even when the IDs do match, the names or addresses may differ. I need to filter not only on ID but also on first_name, last_name and street_1. I can JOIN on matching IDs, but sometimes the IDs match while other columns hold values that do not, and I need to show those records too.
In short: find IDs that do not match. If they do match, check whether any of the other fields do not match.
Here are my expected results:
id | first_name_2 | last_name_2 | street_1 | street_2
3 | Teresa | White | 834 Green Ridge Hill | 43 Arapahoe Park
6 | Rebecca | George | 39157 Nelson Hill | 7467 Acker Center
7 | Ann | Hawkins | 341 Tennessee Street | 8 Bunting Street
8 | Joyce | Moreno | 0277 Bunker Hill Drive | 6 Nancy Center
9 | Kimberly | Alvarez | 57332 Di Loreto Lane | 0437 Waubesa Avenue
IDs 3 and 6 are in the list because the last name does not match. ID 7 is there because both last name and street_1 differ. IDs 8 and 9 are there because the IDs themselves have no match.
Here is my sample data for reference: http://sqlfiddle.com#!9/928568/2

I would do the following: LEFT JOIN the tables and treat NULLs as empty strings. Note that if a column legitimately holds an empty string (street_2, for example), it will compare equal to a NULL on the other side, which may give misleading results:
SELECT *
FROM information I1
LEFT JOIN information_2 I2 ON I1.id = I2.id
WHERE ( I1.first_name_2 <> ifnull(I2.first_name_2, '')
OR I1.last_name_2 <> ifnull(I2.last_name_2, '')
OR I1.street_1 <> ifnull(I2.street_1, '')
OR I1.street_2 <> ifnull(I2.street_2, '')
);
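To see how the query above behaves, here is a minimal sketch that runs it against SQLite through Python's sqlite3 module. The rows are hypothetical stand-ins, not the data from the fiddle, and SQLite is used only so the example is self-contained; ifnull and LEFT JOIN behave the same way here as in MySQL.

```python
import sqlite3

# Hypothetical miniature versions of the two tables (not the fiddle data).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE information   (id INTEGER, first_name_2 TEXT, last_name_2 TEXT, street_1 TEXT, street_2 TEXT);
    CREATE TABLE information_2 (id INTEGER, first_name_2 TEXT, last_name_2 TEXT, street_1 TEXT, street_2 TEXT);
    INSERT INTO information VALUES
        (1, 'Ann',   'Lee',   '1 Oak St',  ''),
        (2, 'Bob',   'Stone', '2 Elm St',  'Apt 4'),
        (3, 'Carol', 'Diaz',  '3 Pine St', 'Unit 9');
    -- id 1: street_2 is NULL here but '' above; id 2: last name differs; id 3: absent
    INSERT INTO information_2 VALUES
        (1, 'Ann', 'Lee',   '1 Oak St', NULL),
        (2, 'Bob', 'Stein', '2 Elm St', 'Apt 4');
""")

query = """
    SELECT I1.id
    FROM information I1
    LEFT JOIN information_2 I2 ON I1.id = I2.id
    WHERE I1.first_name_2 <> ifnull(I2.first_name_2, '')
       OR I1.last_name_2  <> ifnull(I2.last_name_2, '')
       OR I1.street_1     <> ifnull(I2.street_1, '')
       OR I1.street_2     <> ifnull(I2.street_2, '')
    ORDER BY I1.id
"""
mismatched_ids = [row[0] for row in conn.execute(query)]
print(mismatched_ids)
```

ID 2 is reported because of the last name and ID 3 because it has no partner row. ID 1 is not reported, because its empty street_2 compares equal to the NULL that ifnull turned into an empty string.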

Hi, I went through the sample data and I think your requirement is to find all the tuples that have no exact copy in the second table.
You can use the following SQL. I tested it on your fiddle and it gives the expected result:
SELECT
i.id, i.first_name_2, i.last_name_2, i.street_1, i.street_2
FROM
information i
LEFT JOIN
information_2 i2
ON
i.id=i2.id AND i.first_name_2=i2.first_name_2 AND i.last_name_2=i2.last_name_2
AND i.street_1=i2.street_1 AND i.street_2 = i2.street_2
where
i2.id is null
There is also a simpler way to do this if your database supports the MINUS set operator (called EXCEPT in standard SQL). Just write:
SELECT * FROM information
MINUS
SELECT * FROM information_2
and you will get the same answer.
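As a rough check that the two approaches agree, the sketch below runs both against SQLite from Python. The data is hypothetical, and SQLite spells the set operator EXCEPT rather than MINUS, so treat this as an illustration of the idea rather than something to paste into MySQL.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE information   (id INTEGER, first_name_2 TEXT, last_name_2 TEXT, street_1 TEXT, street_2 TEXT);
    CREATE TABLE information_2 (id INTEGER, first_name_2 TEXT, last_name_2 TEXT, street_1 TEXT, street_2 TEXT);
    -- Hypothetical rows: id 1 is identical, id 2 differs in last name, id 3 is missing
    INSERT INTO information VALUES
        (1, 'Ann',   'Lee',   '1 Oak St',  'Apt 1'),
        (2, 'Bob',   'Stone', '2 Elm St',  'Apt 4'),
        (3, 'Carol', 'Diaz',  '3 Pine St', 'Unit 9');
    INSERT INTO information_2 VALUES
        (1, 'Ann', 'Lee',   '1 Oak St', 'Apt 1'),
        (2, 'Bob', 'Stein', '2 Elm St', 'Apt 4');
""")

# Anti-join: join on every column and keep the rows that found no partner.
anti_join = """
    SELECT i.id, i.first_name_2, i.last_name_2, i.street_1, i.street_2
    FROM information i
    LEFT JOIN information_2 i2
      ON i.id = i2.id AND i.first_name_2 = i2.first_name_2
     AND i.last_name_2 = i2.last_name_2
     AND i.street_1 = i2.street_1 AND i.street_2 = i2.street_2
    WHERE i2.id IS NULL
    ORDER BY i.id
"""
# Set difference: EXCEPT is the standard spelling of Oracle's MINUS.
set_difference = """
    SELECT * FROM information
    EXCEPT
    SELECT * FROM information_2
    ORDER BY 1
"""
anti_join_rows = conn.execute(anti_join).fetchall()
except_rows = conn.execute(set_difference).fetchall()
print(anti_join_rows == except_rows, [row[0] for row in anti_join_rows])
```

One difference to keep in mind: the set operator treats two NULLs as equal, while the `=` comparisons in the join do not, so the two queries can disagree on rows that contain NULLs.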

Related

Manipulate rows using multiple clauses in mysql

I have a table that goes something like this:
samples
sample_id | field | value | list_id
1 | country | US | 10
2 | state | tx | 10
3 | country | US | 11
4 | state | tx | 11
5 | emp_size | 100 | 11
I have a query that retrieves list_ids 10 and 11 using the following code:
select * from samples where (field='country' and value='US') OR (field='state' and value='tx')
However, I realized later on that this is not the behaviour I want. Let's say I add (field='emp_size' and value='100') to my clause because I want to get list_id 11 only. The query still includes list_id 10 because the conditions are combined with OR. Right now I'm not sure if there's a workaround for this using plain MySQL only, or if I should just manipulate the data using PHP.
Edit
For clarification, I want to get the list_ids based on the given parameters, say, I want US and TX, it should return list_ids 10 and 11. But if I add another parameter, say, emp_size, it should only return list_id 11.
You've got an EAV style data structure, so the best solution here is to self-join the table for each parameter/value combination that you are searching on.
SELECT s1.list_id
FROM samples s1
INNER JOIN samples s2
ON s1.list_id = s2.list_id
AND s2.field = 'state'
AND s2.value = 'tx'
INNER JOIN samples s3
ON s1.list_id = s3.list_id
AND s3.field = 'emp_size'
AND s3.value = '100'
WHERE s1.field = 'country'
AND s1.value = 'US';
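The self-join above can be tried out with a small sketch like the following, which loads the sample rows from the question into an in-memory SQLite database (used here only so the example runs standalone) and executes the query with and without the emp_size condition.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE samples (sample_id INTEGER, field TEXT, value TEXT, list_id INTEGER);
    INSERT INTO samples VALUES
        (1, 'country',  'US',  10),
        (2, 'state',    'tx',  10),
        (3, 'country',  'US',  11),
        (4, 'state',    'tx',  11),
        (5, 'emp_size', '100', 11);
""")

# Two parameters: one extra self-join for the state condition.
country_state_sql = """
    SELECT s1.list_id
    FROM samples s1
    INNER JOIN samples s2
       ON s1.list_id = s2.list_id AND s2.field = 'state' AND s2.value = 'tx'
    WHERE s1.field = 'country' AND s1.value = 'US'
    ORDER BY s1.list_id
"""
# Three parameters: a further self-join for the emp_size condition.
with_emp_size_sql = """
    SELECT s1.list_id
    FROM samples s1
    INNER JOIN samples s2
       ON s1.list_id = s2.list_id AND s2.field = 'state' AND s2.value = 'tx'
    INNER JOIN samples s3
       ON s1.list_id = s3.list_id AND s3.field = 'emp_size' AND s3.value = '100'
    WHERE s1.field = 'country' AND s1.value = 'US'
    ORDER BY s1.list_id
"""
country_state = [row[0] for row in conn.execute(country_state_sql)]
with_emp_size = [row[0] for row in conn.execute(with_emp_size_sql)]
print(country_state, with_emp_size)
```

Each additional search parameter adds one more INNER JOIN, so a list_id survives only if it has a matching row for every parameter.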

MySQL Query - Find_in_set on comma separated columns

I have an issue with a Query I'm conducting to do a search on a Database of events.
The purpose is about sports and the structure is:
id_event event_sport event_city
1 10 153
2 12 270
3 09 135
The table sports is like:
sport_id sport_name
1 Basketball
and the table cities is:
city_id city_name
1 NYC
So things get complicated, because my events table is like:
id_event event_sport event_city
1 10,12 153,270
2 7,14 135,271
3 8,12 143,80
and I have a multi-input search form, so that people can search for events in their city for multiple sports or for multiple cities. I'm using Chosen.
The search result from Chosen is, for example:
City = 153,270 (if user selected more than one city)
Sport = 12 (if user only selected one sport, can be "9,15")
So what I need is to search for multiple values on cities and sports in the same column, separated by commas, knowing that sometimes we can be searching only for one value, if user didn't input more than one.
My current query is:
SELECT * FROM events e
LEFT JOIN cities c ON e.event_city=c.city_id
LEFT JOIN sports s ON e.event_sport=s.sport_id
WHERE FIND_IN_SET('1CITY', e.event_city) AND FIND_IN_SET('1SPORT', e.event_sport)
;
This works when searching for one city, but if the user searches for two or more, I have no way to handle it.
Can you please help me?
Thanks in advance.
When the user inputs multiple cities and/or sports, split it on commas, and then the query should look like:
SELECT * FROM events e
LEFT JOIN cities c on e.event_city = c.city_id
LEFT JOIN sports s ON e.event_sport = s.sport_id
WHERE (FIND_IN_SET('$city[0]', e.event_city) OR FIND_IN_SET('$city[1]', e.event_city) OR ...)
AND (FIND_IN_SET('$sport[0]', e.event_sport) OR FIND_IN_SET('$sport[1]', e.event_sport) OR ...)
Using PHP you can build up those OR expressions with:
$city_list = implode(' OR ', array_map(function($x) { return "FIND_IN_SET('$x', e.event_city)"; }, explode(',', $_POST['cities'])));
Do the same to make $sport_list, and then your SQL string would contain:
WHERE ($city_list) AND ($sport_list)
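Interpolating user input straight into the SQL string, as the PHP line above does, leaves the query open to SQL injection, so it is worth building the same OR list with bound parameters instead. Below is a minimal sketch of that idea in Python. SQLite has no FIND_IN_SET, so a small stand-in function is registered purely to make the example runnable; the helper names find_in_set and or_clause are made up for this illustration, and each input list is assumed to be non-empty.

```python
import sqlite3

def find_in_set(needle, haystack):
    """Stand-in for MySQL's FIND_IN_SET: 1-based position, 0 if absent."""
    if needle is None or haystack is None:
        return None
    items = haystack.split(",")
    return items.index(needle) + 1 if needle in items else 0

def or_clause(column, values):
    # One "?" placeholder per value; `column` must be a trusted constant.
    return "(" + " OR ".join(f"FIND_IN_SET(?, {column})" for _ in values) + ")"

conn = sqlite3.connect(":memory:")
conn.create_function("FIND_IN_SET", 2, find_in_set)
conn.executescript("""
    CREATE TABLE events (id_event INTEGER, event_sport TEXT, event_city TEXT);
    INSERT INTO events VALUES
        (1, '10,12', '153,270'),
        (2, '7,14',  '135,271'),
        (3, '8,12',  '143,80');
""")

cities = "153,270".split(",")   # what the form would submit for cities
sports = "12".split(",")        # what the form would submit for sports
sql = (
    "SELECT id_event FROM events e "
    f"WHERE {or_clause('e.event_city', cities)} "
    f"AND {or_clause('e.event_sport', sports)} "
    "ORDER BY id_event"
)
matching_events = [row[0] for row in conn.execute(sql, cities + sports)]
print(matching_events)
```

Only the number of placeholders depends on the input; the values themselves travel separately from the SQL text.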
As you can see, this is really convoluted and inefficient. I recommend you normalize your schema, as suggested in the comments.
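For comparison, here is one way a normalized layout might look: one row per event/city pair and one row per event/sport pair, so the search becomes a plain IN lookup. The table names event_cities and event_sports are hypothetical, and the sketch uses SQLite from Python only so that it runs on its own.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE events (id_event INTEGER PRIMARY KEY);
    -- Hypothetical junction tables replacing the comma-separated columns
    CREATE TABLE event_cities (id_event INTEGER, city_id INTEGER);
    CREATE TABLE event_sports (id_event INTEGER, sport_id INTEGER);
    INSERT INTO events VALUES (1), (2), (3);
    INSERT INTO event_cities VALUES (1,153),(1,270),(2,135),(2,271),(3,143),(3,80);
    INSERT INTO event_sports VALUES (1,10),(1,12),(2,7),(2,14),(3,8),(3,12);
""")

def placeholders(values):
    # "?, ?, ?" with one marker per value (assumes a non-empty list)
    return ", ".join("?" for _ in values)

cities = [153, 270]
sports = [12]
sql = f"""
    SELECT DISTINCT e.id_event
    FROM events e
    JOIN event_cities ec ON ec.id_event = e.id_event
    JOIN event_sports es ON es.id_event = e.id_event
    WHERE ec.city_id IN ({placeholders(cities)})
      AND es.sport_id IN ({placeholders(sports)})
    ORDER BY e.id_event
"""
normalized_matches = [row[0] for row in conn.execute(sql, cities + sports)]
print(normalized_matches)
```

DISTINCT is needed because an event that matches two of the selected cities would otherwise appear twice. With this layout the city and sport columns can be indexed, which is not possible for values buried in a comma-separated string.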

Joining values of columns?

I have a query that fetches the list of user IDs and their corresponding user names on a board but from another table also gets a column that has a value (a name) on the row corresponding to the user ID if said user has changed their name. Using an outer join I got the three nicely displayed as in the following example of a few of the results:
member_id name dname_current
1 Blablabla1 blablabla2
2 Bla4444
3 RevZ
5 Herpaderp42
6 Lalalala
7 Kaboom
14 testtesttest21 Formula21
15 Alex Ethan
16 Bob Radio3
The SQL query to get the three columns is as follows:
SELECT
data_members.member_id,
data_members.name,
data_dnames_change.dname_current
FROM data_members LEFT OUTER JOIN data_dnames_change
ON data_members.member_id = data_dnames_change.dname_member_id
GROUP BY data_members.member_id
Is there a way to display this so that it merges the values which exist in the 'dname_current' column of that other table into the 'name' column, replacing any value that's already in the corresponding row of that column?
COALESCE() returns the first non-NULL value, so you can do the following to prefer dname_current over data_members.name unless it is NULL:
SELECT
data_members.member_id,
COALESCE(data_dnames_change.dname_current, data_members.name) AS name
FROM data_members LEFT OUTER JOIN data_dnames_change
ON data_members.member_id = data_dnames_change.dname_member_id
GROUP BY data_members.member_id
Should return:
member_id name
1 blablabla2
2 Bla4444
3 RevZ
5 Herpaderp42
6 Lalalala
7 Kaboom
14 Formula21
15 Ethan
16 Radio3
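A quick way to confirm this behaviour is to run the query on a few of the rows from the question. The sketch below does that with SQLite via Python, chosen only so it is self-contained; COALESCE works the same way in MySQL. Members without a row in data_dnames_change get NULL from the outer join and therefore fall back to their original name.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE data_members (member_id INTEGER, name TEXT);
    CREATE TABLE data_dnames_change (dname_member_id INTEGER, dname_current TEXT);
    -- A subset of the rows shown in the question
    INSERT INTO data_members VALUES (1, 'Blablabla1'), (2, 'Bla4444'), (15, 'Alex');
    INSERT INTO data_dnames_change VALUES (1, 'blablabla2'), (15, 'Ethan');
""")

query = """
    SELECT
        data_members.member_id,
        COALESCE(data_dnames_change.dname_current, data_members.name) AS name
    FROM data_members LEFT OUTER JOIN data_dnames_change
      ON data_members.member_id = data_dnames_change.dname_member_id
    GROUP BY data_members.member_id
    ORDER BY data_members.member_id
"""
merged_names = conn.execute(query).fetchall()
print(merged_names)
```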

GROUP BY does not remove duplicates

I have coded a watchlist system. In the overview of a user's watchlist they see a list of records, but the list shows duplicates even though the database holds exactly the correct number of rows.
I've tried GROUP BY watch.watch_id and GROUP BY rec.record_id, but none of the groupings I've tried removes the duplicates. I'm not sure what I'm doing wrong.
SELECT watch.watch_date,
rec.street_number,
rec.street_name,
rec.city,
rec.state,
rec.country,
usr.username
FROM
(
watchlist watch
LEFT OUTER JOIN records rec ON rec.record_id = watch.record_id
LEFT OUTER JOIN members usr ON rec.user_id = usr.user_id
)
WHERE watch.user_id = 1
GROUP BY watch.watch_id
LIMIT 0, 25
The watchlist table looks like this:
+----------+---------+-----------+------------+
| watch_id | user_id | record_id | watch_date |
+----------+---------+-----------+------------+
| 13 | 1 | 22 | 1314038274 |
| 14 | 1 | 25 | 1314038995 |
+----------+---------+-----------+------------+
GROUP BY does not "remove duplicates". GROUP BY allows for aggregation. If all you want is to combine duplicated rows, use SELECT DISTINCT.
If you need to combine rows that are duplicates in only some columns, use GROUP BY, but you need to specify what to do with the other columns. You can either omit them (by not listing them in the SELECT clause) or aggregate them (using functions like SUM, MIN, and AVG). For example:
SELECT watch.watch_id, COUNT(rec.street_number), MAX(watch.watch_date)
... GROUP BY watch.watch_id
EDIT
The OP asked for some clarification.
Consider the "view" -- all the data put together by the FROMs and JOINs and the WHEREs -- call that V. There are two things you might want to do.
First, you might have completely duplicate rows that you wish to combine:
a b c
- - -
1 2 3
1 2 3
3 4 5
Then simply use DISTINCT
SELECT DISTINCT * FROM V;
a b c
- - -
1 2 3
3 4 5
Or, you might have partially duplicate rows that you wish to combine:
a b c
- - -
1 2 3
1 2 6
3 4 5
Those first two rows are "the same" in some sense, but clearly different in another sense (in particular, they would not be combined by SELECT DISTINCT). You have to decide how to combine them. You could discard column c as unimportant:
SELECT DISTINCT a,b FROM V;
a b
- -
1 2
3 4
Or you could perform some kind of aggregation on them. You could add them up:
SELECT a,b, SUM(c) "tot" FROM V GROUP BY a,b;
a b tot
- - ---
1 2 9
3 4 5
You could pick the smallest value:
SELECT a,b, MIN(c) "first" FROM V GROUP BY a,b;
a b first
- - -----
1 2 3
3 4 5
Or you could take the mean (AVG), the standard deviation (STD), and any of a bunch of other functions that take a bunch of values for c and combine them into one.
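The three options for the partially duplicated rows can be reproduced with a short sketch. Here V is created as an ordinary table holding the three example rows, and SQLite is driven from Python only to keep the example self-contained.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE V (a INTEGER, b INTEGER, c INTEGER);
    INSERT INTO V VALUES (1, 2, 3), (1, 2, 6), (3, 4, 5);
""")

# Option 1: discard column c and collapse what remains.
distinct_ab = conn.execute(
    "SELECT DISTINCT a, b FROM V ORDER BY a").fetchall()

# Option 2: keep c, but add up its values within each (a, b) group.
summed = conn.execute(
    'SELECT a, b, SUM(c) AS "tot" FROM V GROUP BY a, b ORDER BY a').fetchall()

# Option 3: keep c, but take the smallest value within each group.
smallest = conn.execute(
    'SELECT a, b, MIN(c) AS "first" FROM V GROUP BY a, b ORDER BY a').fetchall()

print(distinct_ab, summed, smallest)
```

In every case the two rows that share a = 1 and b = 2 collapse into one; the only difference is what happens to c.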
What isn't really an option is just doing nothing. If you just list the ungrouped columns, the DBMS will either throw an error (Oracle does that -- the right choice, imo) or pick one value more or less at random (MySQL). But as Dr. Peart said, "When you choose not to decide, you still have made a choice."
While SELECT DISTINCT may indeed work in your case, it's important to note why what you have is not working.
You're selecting fields that are outside of the GROUP BY. Although MySQL allows this, which row's values it returns for the non-GROUP BY fields is undefined.
If you wanted to do this with a GROUP BY try something more like the following:
SELECT watch.watch_date,
rec.street_number,
rec.street_name,
rec.city,
rec.state,
rec.country,
usr.username
FROM
(
watchlist watch
LEFT OUTER JOIN records rec ON rec.record_id = watch.record_id
LEFT OUTER JOIN members usr ON rec.user_id = usr.user_id
)
WHERE watch.watch_id IN (
SELECT watch_id FROM watchlist WHERE user_id = 1
GROUP BY watch_id)
LIMIT 0, 25
I would never recommend using SELECT DISTINCT; it can be really slow on big datasets.
Try using things like EXISTS.
You are grouping by watch.watch_id and you have two results, which have different watch IDs, so naturally they would not be grouped.
Also, judging from the results displayed, they are different records, so that looks like a perfectly valid result. If you only want distinct values, then you don't want to GROUP; you want to select distinct values:
SELECT DISTINCT ...
If you say your watchlist table is unique, then one (or both) of the other tables either (a) has duplicates, or (b) is not unique by the key you are using.
To suppress duplicates in your results, either use DISTINCT as @Laykes says, or try
GROUP BY watch.watch_date,
rec.street_number,
rec.street_name,
rec.city,
rec.state,
rec.country,
usr.username
It sort of sounds like you expect all 3 tables to be unique by their keys, though. If that is the case, you are simply masking some other problem with your SQL by trying to retrieve distinct values.

SQL GROUP BY - Multiple results in one column?

I am trying to perform a SELECT query using a GROUP BY clause, however I also need to access data from multiple rows and somehow concatenate it into a single column.
Here's what I have so far:
SELECT
COUNT(v.id) AS quantity,
vt.name AS name,
vt.cost AS cost,
vt.postage_cost AS postage_cost
FROM vouchers v
INNER JOIN voucher_types vt
ON v.type_id = vt.id
WHERE
v.order_id = 1 AND
v.sold = 1
GROUP BY vt.id
Which gives me the first four columns I need in the following format.
quantity | name | cost | postage_cost
2 X 5 1
2 Y 6 1
However, I would also like a fifth column to be displayed, showing all of the codes associated with each line of the order like this:
code
ABCD, EFGH
IJKL, MNOP
Where the comma separated values are pulled from the voucher table.
Is this possible?
Any advice would be appreciated.
Thanks
This is what GROUP_CONCAT does.
Assuming the column is called code you would just add ,GROUP_CONCAT(v.code) As Codes to your select list.
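As a rough illustration, the sketch below adds the extra column to the query from the question. The voucher rows are hypothetical, made up to match the output shown above, and the vouchers column is assumed to be called code. SQLite (used here via Python so the example runs standalone) also provides group_concat with a comma as the default separator; in MySQL you can additionally write GROUP_CONCAT(v.code ORDER BY v.code SEPARATOR ', ') to control the order and get the comma-plus-space format.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE voucher_types (id INTEGER, name TEXT, cost INTEGER, postage_cost INTEGER);
    CREATE TABLE vouchers (id INTEGER, type_id INTEGER, order_id INTEGER, sold INTEGER, code TEXT);
    -- Hypothetical rows chosen to reproduce the output in the question
    INSERT INTO voucher_types VALUES (1, 'X', 5, 1), (2, 'Y', 6, 1);
    INSERT INTO vouchers VALUES
        (1, 1, 1, 1, 'ABCD'), (2, 1, 1, 1, 'EFGH'),
        (3, 2, 1, 1, 'IJKL'), (4, 2, 1, 1, 'MNOP');
""")

query = """
    SELECT
        COUNT(v.id) AS quantity,
        vt.name AS name,
        vt.cost AS cost,
        vt.postage_cost AS postage_cost,
        GROUP_CONCAT(v.code) AS codes
    FROM vouchers v
    INNER JOIN voucher_types vt
        ON v.type_id = vt.id
    WHERE v.order_id = 1 AND v.sold = 1
    GROUP BY vt.id
    ORDER BY vt.id
"""
voucher_rows = conn.execute(query).fetchall()
for quantity, name, cost, postage_cost, codes in voucher_rows:
    # The order of codes inside each group is not guaranteed without ORDER BY.
    print(quantity, name, cost, postage_cost, sorted(codes.split(",")))
```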