SQL deduping help? - mysql

I'm sure there are a ton of ways to do this, but right now I'm struggling to find the way that will work properly given the data.
I have a table containing duplicates, each with additional fields and source details, where some sources take priority over others. I added a "priority" field to my table and updated it based on source priority. I now need to select the distinct records to populate my "unique" records table (to which I'll then apply a unique key constraint on the required field, to prevent this from happening again).
So I have something like this:
Select phone, carrier, src, priority
from dbo.mytable
I need to pull distinct phone values in order of priority (1, 2, 3, 4, etc.), bringing the rest of each row's data along while still keeping phone UNIQUE.
I've tried a few things using sub-select from the same table with min(priority) value, but outcome still doesn't seem to make sense. Any help would be greatly appreciated. Thanks!
EDIT I need to dedupe from the same table, but I can populate a new table with the uniques if needed based on my select statement to pull the uniques. This is in MSSQL, but figured anyone with SQL knowledge could answer.
For example, let's say I have the following rows:
5556667777, ATT, source1, 1
5556667777, ATT, source2, 2
5556667777, ATT, source3, 3
I need to pull uniques based on priority 1 first. The problem is, I need to remove all other dupes from the table based on the priority order without ending up with the same phone number twice again. Make sense?

So you're saying the combination (phone, priority) is unique in the existing table, and you want to select the rows for which the priority is smallest?
SELECT mytable.phone, mytable.carrier, mytable.src
FROM mytable
INNER JOIN (
    SELECT phone, MIN(priority) AS minpriority
    FROM mytable
    GROUP BY phone
) AS minphone
    ON mytable.phone = minphone.phone
    AND mytable.priority = minphone.minpriority
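To see the MIN(priority) join at work on the sample rows from the question, here is a minimal sketch that runs the same query from Python against an in-memory SQLite database. SQLite is only a stand-in for the real MSSQL server, and the second phone number is a made-up row added to show a number that only exists at a low priority.

```python
import sqlite3

# In-memory SQLite stands in for the real server; names follow the question.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE mytable (phone TEXT, carrier TEXT, src TEXT, priority INTEGER)"
)
conn.executemany(
    "INSERT INTO mytable VALUES (?, ?, ?, ?)",
    [
        ("5556667777", "ATT", "source1", 1),
        ("5556667777", "ATT", "source2", 2),
        ("5556667777", "ATT", "source3", 3),
        # Hypothetical extra phone that only appears at priority 3.
        ("5550001111", "VZW", "source3", 3),
    ],
)

# For each phone, keep only the row whose priority is the smallest.
unique_rows = conn.execute(
    """
    SELECT m.phone, m.carrier, m.src
    FROM mytable AS m
    INNER JOIN (
        SELECT phone, MIN(priority) AS minpriority
        FROM mytable
        GROUP BY phone
    ) AS minphone
        ON m.phone = minphone.phone
        AND m.priority = minphone.minpriority
    ORDER BY m.phone
    """
).fetchall()

for row in unique_rows:
    print(row)
# ('5550001111', 'VZW', 'source3')
# ('5556667777', 'ATT', 'source1')
```

Note that this only yields one row per phone if (phone, priority) really is unique. Two rows sharing both the phone and its lowest priority would both come back.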

Related

How can I combine these two tables so that I can sort with information on each table, but not get duplicate answers?

I have two tables. The first is named master_list. It has these fields: master_id, item_id, name, img, item_code, and length. My second table is named types_join. It has these fields: master_id and type_id. (There is a third table, but it is not being used in the queries. It is more for reference.) I need to be able to combine these two tables so that I can sift the results to only show certain ones but part of the information to sift is on one table and the other part is on the other one. I don't want duplicate answers.
For example say I only want items that have a type_id of 3 and a length of 18.
When I use
SELECT * FROM master_list LEFT JOIN types_join ON master_list.master_id=types_join.master_id WHERE types_join.type_id = 3 AND master_list.length = 18
it finds the same thing twice.
How can I query this so I won't get duplicate answers?
Here are the samples from my tables and the result I am getting.
This is what I get with an INNER JOIN:
BTW, master_id and name both only have unique information on the master_list table. However, the types_join table does use the master_id multiple times later on, but not for Lye. That is why I know it is duplicating information.
If you want unique rows from master_list, use exists:
SELECT ml.*
FROM master_list ml
WHERE ml.length = 18 AND
      EXISTS (SELECT 1
              FROM types_join tj
              WHERE ml.master_id = tj.master_id AND tj.type_id = 3
             );
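The difference between joining and filtering with EXISTS can be sketched with a small in-memory SQLite database driven from Python. The data here is hypothetical: it assumes the pair (master_id 1, type_id 3) is stored twice in types_join, which is one way a join can double a row, and 'Salt' is an invented second item. The real tables may duplicate for a different reason.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE master_list (master_id INTEGER, name TEXT, length INTEGER);
    CREATE TABLE types_join (master_id INTEGER, type_id INTEGER);
    INSERT INTO master_list VALUES (1, 'Lye', 18), (2, 'Salt', 18);
    -- Hypothetical: the (1, 3) pair appears twice in types_join.
    INSERT INTO types_join VALUES (1, 3), (1, 3), (2, 5);
    """
)

# A join returns one output row per matching types_join row.
joined = conn.execute(
    """
    SELECT ml.name
    FROM master_list ml
    JOIN types_join tj ON ml.master_id = tj.master_id
    WHERE tj.type_id = 3 AND ml.length = 18
    """
).fetchall()

# EXISTS only asks whether at least one match exists, so each
# master_list row appears at most once.
filtered = conn.execute(
    """
    SELECT ml.name
    FROM master_list ml
    WHERE ml.length = 18
      AND EXISTS (SELECT 1
                  FROM types_join tj
                  WHERE tj.master_id = ml.master_id AND tj.type_id = 3)
    """
).fetchall()

print(joined)    # [('Lye',), ('Lye',)]
print(filtered)  # [('Lye',)]
```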
Any duplicates you get will be duplicates in master_list. If you want to remove them, you need to provide more information -- I would recommend a new question.
Thank you for the data. But as you can see, there is nothing wrong with your query.
Have you tried creating a unique index over master_id, just to make sure that you do not have duplicate rows?
CREATE UNIQUE INDEX MyMasterUnique
ON master_list(master_id);

Is there a way to add multiple values to 1 ID in Access

I have a table that has Act ID, and another table that has Act ID and percentage complete. The second table can have multiple entries for different days. I need the sum of the percentages added for each Act ID on the first table.
First table
Act ID | Percent | Date
ZA.108381.080 | |
ZA.108381.110 | Total from second table |
ZA.108381.120 | |
ZA.108476.020 | |
Second table
ZA.108381.110 | 25% | 5/25/19
ZA.108381.110 | 75 | 6/1/19
ZA.108381.120 | |
ZA.108476.020 | |
This would generally be considered bad practice. Your primary key should uniquely identify a row in that specific table, and any other data related to that key should be stored in separate columns.
However, since an answer is not a place for a lecture: if you want to store multiple values in your Act ID column, I would suggest changing your primary key to something more generic, such as "RowID", and then using VBA to insert multiple values into this field.
However, changing the primary key late in a database's life may cause a lot of issues or be difficult. So good luck.
Storing calculated values in a table has the disadvantage that these entries may become outdated as the other table is updated. It is preferable to query the tables on the fly so that you always get up-to-date results:
SELECT A.ActID, SUM(B.Percentage) AS SumPercent
FROM
    table1 A
    LEFT JOIN table2 B
        ON A.ActID = B.ActID
GROUP BY A.ActID
ORDER BY A.ActID
This query allows you to add additional columns from the first table. If you only need the ActID from table 1, then you can simplify the query, and instead take it from table 2:
SELECT ActID, SUM(Percentage) AS SumPercent
FROM table2
GROUP BY ActID
ORDER BY ActID
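As a rough check of the LEFT JOIN version, the sketch below loads the sample rows from the question into an in-memory SQLite database from Python. SQLite stands in for Access here, and the column name EntryDate and the table names table1/table2 are assumptions rather than the real schema.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE table1 (ActID TEXT);
    -- EntryDate is an assumed column name for the date of each entry.
    CREATE TABLE table2 (ActID TEXT, Percentage REAL, EntryDate TEXT);
    INSERT INTO table1 VALUES
        ('ZA.108381.110'), ('ZA.108381.120'), ('ZA.108476.020');
    INSERT INTO table2 VALUES
        ('ZA.108381.110', 25, '5/25/19'),
        ('ZA.108381.110', 75, '6/1/19'),
        ('ZA.108381.120', NULL, NULL),
        ('ZA.108476.020', NULL, NULL);
    """
)

# Sum the percentages per Act ID at query time instead of storing them.
totals = conn.execute(
    """
    SELECT A.ActID, SUM(B.Percentage) AS SumPercent
    FROM table1 A
    LEFT JOIN table2 B ON A.ActID = B.ActID
    GROUP BY A.ActID
    ORDER BY A.ActID
    """
).fetchall()

for act_id, sum_percent in totals:
    print(act_id, sum_percent)
# ZA.108381.110 100.0
# ZA.108381.120 None
# ZA.108476.020 None
```

Act IDs with no recorded percentage come back with a NULL sum (None in Python) rather than 0, so a display layer may want to substitute a default.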
If you have spaces or other special characters in a column or table name, you must escape the name with []. E.g. [Act ID].
Do not change the IDs in the table. If you want to have the result displayed as the ID merged with the sum, change the query to
SELECT A.ActID & "." & Format(SUM(B.Percentage), "0.000") AS Result
FROM ...
See also: SQL GROUP BY Statement (w3schools)

MYSQL checking if a record exists with the specified child-records

Okay, let's say I have a table called rooms:
It only has one column: ID
I also have another table called items_in_rooms with columns:
roomId, itemName, itemColor
Whenever a room-record is inserted, a bunch of records is also inserted into items_in_rooms linked to the room-record, specifying what items are in that room.
The problem is that when a room-record is inserted along with its items, I first need to verify that a room with those exact items doesn't already exist.
How can this be done?
One way of course would be to first fetch all room-records along with all their items then look through them until it has been verified there isn't already an exact copy in the database and then do insertion if it's unique.
But this sounds a bit inefficient to me, especially as the tables grow very large, so I was hoping there's a way to have MYSQL do the checking.
One way I came up with was to do something like this:
SELECT roomId FROM (
    SELECT rooms.id roomId, GROUP_CONCAT(
        CONCAT_WS(',',itemName,itemColor) ORDER BY itemName,itemColor SEPARATOR '/'
    ) roomContents
    FROM items_in_rooms
    JOIN rooms ON roomId=rooms.id
    WHERE snapshotDate='$dateString'
    GROUP BY roomId
) concatenatedRoomContents
WHERE roomContents='bed,white/carpet,red/chair,brown'
Essentially this will make MYSQL concatenate each room into a string, then compare them to the "input-string" in the WHERE-clause. Obviously the input-string would have to be ordered the same way as how MYSQL orders the rows before concatenating (itemName,itemColor).
While this worked for me, it felt very dirty. Also, it initially caused some problems when I added a decimal field, as MYSQL always includes every decimal digit when stringifying, so 1 for instance could be "1.000", while PHP (which I'm using) by default stringifies it to "1". I solved this using number_format(), making it include the right number of decimal digits.
Now I've noticed I've got some duplicates in the table again so there's some other gotcha I need to find, but I was just wondering if there's maybe a more clever way?
This is how it can be done. The following query returns the id of the room if such a room exists (it has exactly those items, no more, no less). It assumes a room never lists the same item and color twice.
SELECT roomId
FROM items_in_rooms
GROUP BY roomId
-- a room qualifies only if it has 3 items in total and all 3 are in the list
HAVING COUNT(*) = 3
   AND SUM((itemName, itemColor)
           IN (('bed','white'),('carpet','red'),('chair','brown'))) = 3
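For the "no more, no less" requirement, a room has to pass two checks: every wanted item is present, and the room holds nothing else. A minimal sketch of that check, written in Python against an in-memory SQLite database as a stand-in for MySQL, is below. The helper name find_rooms_with_exactly and the sample rooms are made up for illustration, and it assumes a room never lists the same (itemName, itemColor) pair twice.

```python
import sqlite3


def find_rooms_with_exactly(conn, wanted):
    """Return ids of rooms whose items are exactly the pairs in `wanted`.

    `wanted` is a list of (itemName, itemColor) tuples without duplicates.
    """
    if not wanted:
        return []
    # One "(itemName = ? AND itemColor = ?)" test per wanted pair.
    match = " OR ".join("(itemName = ? AND itemColor = ?)" for _ in wanted)
    sql = f"""
        SELECT roomId
        FROM items_in_rooms
        GROUP BY roomId
        HAVING SUM(CASE WHEN {match} THEN 1 ELSE 0 END) = ?
           AND COUNT(*) = ?
        ORDER BY roomId
    """
    # Parameters follow placeholder order: the pairs, then the two counts.
    params = [value for pair in wanted for value in pair]
    params += [len(wanted), len(wanted)]
    return [room_id for (room_id,) in conn.execute(sql, params)]


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE items_in_rooms (roomId INTEGER, itemName TEXT, itemColor TEXT)"
)
conn.executemany(
    "INSERT INTO items_in_rooms VALUES (?, ?, ?)",
    [
        (1, "bed", "white"), (1, "carpet", "red"), (1, "chair", "brown"),
        # Room 2 has the same three items plus an extra one.
        (2, "bed", "white"), (2, "carpet", "red"), (2, "chair", "brown"),
        (2, "lamp", "black"),
        # Room 3 is missing the chair.
        (3, "bed", "white"), (3, "carpet", "red"),
    ],
)

result = find_rooms_with_exactly(
    conn, [("bed", "white"), ("carpet", "red"), ("chair", "brown")]
)
print(result)  # [1]
```

Counting only the matched rows would also accept room 2. The extra COUNT(*) comparison is what rules out rooms holding additional items.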
Thanks, CBroe.

How to query 3 mysql tables and return matching results (with one to many relationships)?

I am trying to query a database to return some matching records and can't work out how to do it in the most efficient way. I have a TUsers table, a TJobsOffered table and a TJobsRequested table. The UserID is the primary key for the TUsers table and is used within the Job tables in a one to many relationship.
Ultimately I want to run a query that returns a list of all matching users based on a particular UserID (eg a matching user is one that has at least one matching record in both tables, eg if UserA has jobid 999 listed in TJobsOffered and UserB has jobid 999 listed in TJobsRequested then this is a match).
To try to get my head around it, I've simplified it down a lot and am trying to match the records based on the job IDs for the user in question, e.g.:
SELECT DISTINCT TJobsOffered.FUserID FROM TJobsOffered, TJobsRequested
WHERE TJobsOffered.FUserID=TJobsRequested.FUserID AND
(TJobsRequested.FJobID='12' OR TJobsRequested.FJobID='30') AND
(TJobsOffered.FJobID='86' OR TJobsOffered.FJobID='5')
This seems to work fine and returns the correct results however when I introduce the TUsers table (so I can access user information) it starts returning incorrect results. I can't work out why the following query doesn't return the same results as the one listed above as surely it's still matching the same information just with a different connector (or is the one above effectively many to many and the one below 2 sets of one to many comparisons)?
SELECT DISTINCT TUsers.Fid, TUsers.FName FROM TUsers, TJobsOffered, TJobsRequested
WHERE TUsers.Fid=TJobsRequested.FUserID AND TUsers.Fid=TJobsOffered.FUserID AND
(TJobsRequested.FJobID='12' OR TJobsRequested.FJobID='30') AND
(TJobsOffered.FJobID='86' OR TJobsOffered.FJobID='5')
If anyone could explain where I'm going wrong with the second query and how I should incorporate TUsers, that would be greatly appreciated, as I can't get my head around the join. If you are able to give me any pointers as to how I can do this all in one query by just passing the user id in, that would be massively appreciated as well! :)
Thanks so much,
Dave
Try this
SELECT DISTINCT TJobsOffered.FUserID, TUsers.FName
FROM TJobsOffered
INNER JOIN TJobsRequested ON TJobsOffered.FUserID = TJobsRequested.FUserID
LEFT JOIN TUsers ON TUsers.Fid = TJobsOffered.FUserID
WHERE
    TJobsRequested.FJobID IN (12, 30) AND
    TJobsOffered.FJobID IN (86, 5)
You need to add "AND TJobsOffered.FUserID=TJobsRequested.FUserID" to your where clause.

Query Construction Combining UNION and LEAST

I have an email subscription table and a user table. I need to combine the two to get all the emails, since it's possible to create an account without subscribing and vice versa. Easy enough so far:
SELECT email FROM emailcapture
UNION
SELECT email FROM cpnc_User
Now, this gets me the complete list of all emails. For each email on this combined list, I need to add an extra piece of information: the created date. Both emailcapture and cpnc_User tables have a "created" field. The created date should be the earlier of the two dates, if both dates exist, or, if only one exists and the other is NULL, it should just be the one that exists.
How can I change this query so that it returns this extra piece of information, the created date? Keep in mind that the new query I seek should return exactly the same number of rows as the query above.
Thanks,
Jonah
SELECT i.email, MIN(i.created) AS created
FROM (
    SELECT email, created FROM emailcapture
    UNION ALL
    SELECT email, created FROM cpnc_User
) AS i
GROUP BY i.email
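The UNION ALL plus MIN idea relies on MIN skipping NULLs, so an email with only one known date still gets that date. Here is a small sketch of it in Python with an in-memory SQLite database standing in for MySQL. It uses the "created" column named in the question, and the sample emails and ISO-formatted date strings are invented for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE emailcapture (email TEXT, created TEXT);
    CREATE TABLE cpnc_User (email TEXT, created TEXT);
    -- Hypothetical sample data; dates are ISO strings so they sort correctly.
    INSERT INTO emailcapture VALUES
        ('a@example.com', '2012-03-01'),
        ('b@example.com', NULL),
        ('c@example.com', '2012-05-05');
    INSERT INTO cpnc_User VALUES
        ('a@example.com', '2012-01-15'),
        ('b@example.com', '2012-04-10');
    """
)

# Stack both tables, then take the earliest non-NULL date per email.
rows = conn.execute(
    """
    SELECT i.email, MIN(i.created) AS created
    FROM (
        SELECT email, created FROM emailcapture
        UNION ALL
        SELECT email, created FROM cpnc_User
    ) AS i
    GROUP BY i.email
    ORDER BY i.email
    """
).fetchall()

for row in rows:
    print(row)
# ('a@example.com', '2012-01-15')
# ('b@example.com', '2012-04-10')
# ('c@example.com', '2012-05-05')
```

Because of the GROUP BY on email, the result has one row per distinct email, the same row count as the plain UNION in the question.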