Removing duplicates from a different table - Criteria based - ms-access

Morning
I need to remove duplicates based on 3 fields matching. I have put together a line of code based on one criterion but need to add more.
My code:
DoCmd.RunSQL ("DELETE tbl_Added.*, tbl_Added.[NUM_CUST]FROM tbl_Added WHERE (((tbl_Added.[NUM_CUST]) In (Select tbl_Removed.[NUM_CUST] from tbl_Removed)));")
Extra field names I am looking to add are:
- NeedType
- CrackID
Any help is appreciated
UPDATE:
I am also trying the below VBA to no avail
DoCmd.RunSQL ("DELETE tbl_Added.*, tbl_Added.[NUM_CUST],tbl_Added.[ID_CRAC] FROM tbl_Added WHERE (((tbl_Added.[NUM_CUST],tbl_Added.[ID_CRAC]) In (Select tbl_Removed.[NUM_CUST], tbl_Removed.[ID_CRAC] from tbl_Removed)));")

The following SQL will return the NUM_CUST field for records which are duplicated based on equality of the three fields you mention:
SELECT NUM_CUST
FROM tbl_added
GROUP BY NUM_CUST, NEEDTYPE, CRACKID
HAVING Count(NUM_CUST) > 1
From here you will need to decide whether you would like to delete all such records from your table, or delete only the duplicate records, retaining a single instance of each.
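If the goal is instead to delete rows from tbl_Added that have a match in tbl_Removed on all three fields, a correlated EXISTS is one way to compare several columns at once. The sketch below runs the idea against an in-memory SQLite database from Python; the sample rows are made up, the column names NeedType and CrackID are taken from the question, and Access SQL may need small syntax adjustments.

```python
import sqlite3

def delete_matching(conn):
    """Delete tbl_Added rows that match a tbl_Removed row on all three fields."""
    conn.execute(
        """
        DELETE FROM tbl_Added
        WHERE EXISTS (
            SELECT 1 FROM tbl_Removed AS r
            WHERE r.NUM_CUST = tbl_Added.NUM_CUST
              AND r.NeedType = tbl_Added.NeedType
              AND r.CrackID  = tbl_Added.CrackID
        )
        """
    )

conn = sqlite3.connect(":memory:")
# Hypothetical sample data, only for illustration.
conn.executescript("""
    CREATE TABLE tbl_Added   (NUM_CUST INTEGER, NeedType TEXT, CrackID INTEGER);
    CREATE TABLE tbl_Removed (NUM_CUST INTEGER, NeedType TEXT, CrackID INTEGER);
    INSERT INTO tbl_Added   VALUES (1, 'A', 10), (1, 'B', 10), (2, 'A', 20);
    INSERT INTO tbl_Removed VALUES (1, 'A', 10), (2, 'A', 99);
""")

delete_matching(conn)
remaining = conn.execute(
    "SELECT NUM_CUST, NeedType, CrackID FROM tbl_Added ORDER BY NUM_CUST, NeedType"
).fetchall()
print(remaining)
```

Only the row that agrees with tbl_Removed on all three columns is deleted; rows that share just NUM_CUST stay.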

Related

Joining Two Unrelated Tables to Find Record(s)

I am working on creating a query that looks to see renewed records from a previous record that still currently has an outstanding balance.
Now I have managed to make a query that has the child record look back at the parent record. Basically, this first query goes through the records that have been renewed and compares each one to the previous parent record's expiration date.
Select BL1.RECORDID as RECIID, BL2.EXPIRATIONDATE as ParentRecordExpirationDate,
(DATEDIFF(DD,GETDATE(),BL2.ExpirationDate)*-1)as DAYSOVER
from BLLICENSE BL1 
JOIN RECORD BL2 on BL1.RECORDPARENTID = BL2.RECORDID
Where CONVERT(date,BL2.EXPIRATIONDATE) < GETDATE()
AND BL1.RECORDSTATUSID IN('Renewed',
'Issued', 'In Review',
'On Hold', 'Submitted',
'Fees Due')
Now that I have that first checkbox taken care of, I am trying to figure out how to get the query to incorporate the query below, so that I can do the renewal check and also check whether there is an open invoice/balance due.
Select DISTINCT(BLL.RECORDID)
from CAINVOICE CAI 
JOIN CAINVOICEFEE CAIF on CAIF.CAINVOICEID = CAI.CAINVOICEID
JOIN CACOMPUTEDFEE CACF on CACF.CACOMPUTEDFEEID = CAIF.CACOMPUTEDFEEID
JOIN RECORD BLLF on BLLF.CACOMPUTEDFEEID = CACF.CACOMPUTEDFEEID
JOIN RECORD BLL on BLL.RECORDID = BLLF.RECORDID
AND CAI.CASTATUSID in (1,2,3,6,7,8)
I tried to use a Union of the two but sadly that did not work as "All queries combined using a UNION, INTERSECT or EXCEPT operator must have an equal number of expressions in their target lists."
I was going to try to join the two queries, but I suspect the DISTINCT in query #2 would cause that idea to fail.
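One possible direction, sketched here under heavy simplification: instead of a UNION, the second query can act as a filter on the first through an IN subquery, which also avoids duplicate rows when a record has several open invoices. The tables license, record and invoice below are hypothetical stand-ins for the real schema, the cut-off date is passed in as a parameter rather than taken from GETDATE(), and the example runs on SQLite from Python, not SQL Server.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical, simplified stand-ins for BLLICENSE / RECORD / CAINVOICE.
conn.executescript("""
    CREATE TABLE record  (recordid INTEGER, expirationdate TEXT);
    CREATE TABLE license (recordid INTEGER, recordparentid INTEGER);
    CREATE TABLE invoice (recordid INTEGER, statusid INTEGER);
    INSERT INTO record  VALUES (1, '2020-01-31'), (2, '2999-01-31');
    INSERT INTO license VALUES (10, 1), (11, 1), (12, 2);
    INSERT INTO invoice VALUES (10, 1), (10, 2), (12, 1), (11, 5);
""")

rows = conn.execute(
    """
    SELECT l.recordid, p.expirationdate
    FROM license AS l
    JOIN record AS p ON l.recordparentid = p.recordid
    WHERE p.expirationdate < ?              -- parent record has expired
      AND l.recordid IN (                   -- and an open invoice exists
          SELECT recordid FROM invoice
          WHERE statusid IN (1, 2, 3, 6, 7, 8)
      )
    ORDER BY l.recordid
    """,
    ("2024-01-01",),
).fetchall()
print(rows)
```

Record 10 has two open invoices but is returned once, because IN only tests for membership and does not multiply rows the way a join would.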

Access SQL Append/Update Distinct

I'm trying to update a query I have, Access SQL (for now). It is currently a make table, but I want to change it into an update query, but only if the information isn't already in the end table.
Here is the current "Make Table" Query:
SELECT DISTINCT dbo_Us_postal_codes.[City]
,dbo_Us_postal_codes.[State] INTO dbo_ActiveZipCodes
FROM dbo_General_Client_List INNER JOIN dbo_Us_postal_codes ON
dbo_General_Client_List.ZipCode = dbo_Us_postal_codes.[Zip Code]
WHERE (((dbo_General_Client_List.ActiveCompany)=Yes));
US Postal Codes is a list of all US zipcodes, and their corresponding City, State.
What I want to do, is as General_Client_List is updated, run this query and add to ActiveZipCodes if the city/state combo isn't already there.
Add dbo_ActiveZipCodes to your query window. Join the City and State fields from the source table to the matching fields in dbo_ActiveZipCodes with a LEFT JOIN. Set the criteria for both dbo_ActiveZipCodes fields to Is Null, so that only combinations not already in the table are returned.
Obviously, test it as a select query first.
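The LEFT JOIN plus Is Null approach described above can be sketched as follows. This is only an illustration run on SQLite from Python: the table names clients, postal_codes and active_zip_codes are shortened, hypothetical versions of the real dbo_ tables, the rows are sample data, and the Access query designer will generate somewhat different SQL.

```python
import sqlite3

APPEND_SQL = """
    INSERT INTO active_zip_codes (City, State)
    SELECT DISTINCT p.City, p.State
    FROM clients AS c
    JOIN postal_codes AS p ON c.ZipCode = p.ZipCode
    LEFT JOIN active_zip_codes AS a
           ON a.City = p.City AND a.State = p.State
    WHERE c.ActiveCompany = 1
      AND a.City IS NULL          -- no existing row for this city/state
"""

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE clients          (ZipCode TEXT, ActiveCompany INTEGER);
    CREATE TABLE postal_codes     (ZipCode TEXT, City TEXT, State TEXT);
    CREATE TABLE active_zip_codes (City TEXT, State TEXT);
    INSERT INTO postal_codes VALUES
        ('10001', 'New York', 'NY'), ('10002', 'New York', 'NY'),
        ('60601', 'Chicago', 'IL'),  ('94105', 'San Francisco', 'CA');
    INSERT INTO clients VALUES
        ('10001', 1), ('10002', 1), ('60601', 1), ('94105', 0);
    INSERT INTO active_zip_codes VALUES ('New York', 'NY');
""")

conn.execute(APPEND_SQL)
conn.execute(APPEND_SQL)  # running it again should add nothing new
active = conn.execute(
    "SELECT City, State FROM active_zip_codes ORDER BY City"
).fetchall()
print(active)
```

Running the append twice leaves a single row per city/state combination, which is the behaviour the question asks for.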

Find matches from 2 tables, change other field?

I have a database with two separate tables. One table (T1) has 400+ values in its only column, while the other (T2) has 14,000+ rows and multiple columns.
What I need to do is to compare the column in T1 to one column in T2. For every matching value, I need to update a different value in the same row in T2.
I know this is pretty easy and straight-forward, but I'm new to MySQL and trying to get this down before I go back to other things. Thanks a ton in advance!
EDIT: Here's what I've been trying to no avail..
UPDATE `apollo`.`Source`, `apollo`.`Bottom`
SET `Source`.`CaptureInterval` = '12'
WHERE `Bottom`.`URL` LIKE `Source`.`SourceID`
EDIT 2:
A little clarification:
apollo.Bottom and apollo.Source are the two tables.
apollo.Bottom is the table with one column and 400 records in that column.
I want to compare Bottom.URL to Source.SourceID. If they match, I want to update Source.CaptureInterval to 12.
You can use the following query to do the update. Performance will be much better if you index Bottom.URL and Source.SourceID, since both columns are used in the WHERE clause.
UPDATE `apollo`.`Source`, `apollo`.`Bottom`
SET `Source`.`CaptureInterval` = '12'
WHERE `Bottom`.`URL` = `Source`.`SourceID`
You can join the two tables together and do a multiple table update.
Start with something like this:
UPDATE `apollo`.`Source`
INNER JOIN `apollo`.`Bottom` ON `apollo`.`Bottom`.`URL` = `apollo`.`Source`.`SourceID`
SET `apollo`.`Source`.`CaptureInterval` = '12';
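For comparison, here is a small runnable sketch of the same update written with an IN subquery, which is a portable alternative to MySQL's multi-table UPDATE syntax. It runs on SQLite from Python, and the URLs and starting interval of 60 are invented sample values.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Source (SourceID TEXT, CaptureInterval INTEGER);
    CREATE TABLE Bottom (URL TEXT);
    INSERT INTO Source VALUES ('a.example', 60), ('b.example', 60), ('c.example', 60);
    INSERT INTO Bottom VALUES ('a.example'), ('c.example');
""")

# Update only the Source rows whose SourceID appears in Bottom.URL.
conn.execute(
    "UPDATE Source SET CaptureInterval = 12 "
    "WHERE SourceID IN (SELECT URL FROM Bottom)"
)

updated = conn.execute(
    "SELECT SourceID, CaptureInterval FROM Source ORDER BY SourceID"
).fetchall()
print(updated)
```

Rows without a match in Bottom keep their original CaptureInterval.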

Copy data from table to another table MySQL

I'm having a problem updating newly added records that don't have a timestamp to another identical table in the same database. Here is my query
INSERT INTO mlscopy
SELECT * FROM mls_cvrmls AS parent
LEFT JOIN mlscopy AS child
ON child.listing_listnum != parent.listing_listnum
The parent table is updated by a separate company every morning, and unfortunately there are no timestamps(datetime) to relate the newly added records.
My child table (the copy) is needed for Google geocoding, since their morning updates drop and create the parent table each morning.
I made a structure and data copy of the parent table, then deleted the last ten records to test my query. But I keep receiving the error Column count doesn't match value count at row 1.
Can't think of what I'm doing wrong here.
Here are the column table names
listing_listing
listing_listnum
listing_propertytype
listing_status
listing_listingpublicid
listing_agentname
listing_agentlist
listing_listingbrokercode
listing_officelist
listing_lo
listing_lo00
listing_lo01
listing_lo02
listing_lo03
listing_lo04
listing_lo05
listing_agentcolist
listing_agentcolist00
listing_officecolist
listing_area
listing_listdate
listing_listprice
listing_streetnumdisplay
listing_streetdirectional
listing_streetname
listing_streettype
listing_countyid
listing_zipcode
listing_zipplus4
listing_postoffice
listing_subdivision
listing_neighborhood
listing_schoolelem
listing_schooljunior
listing_schoolhigh
listing_pud
listing_lotdim
listing_acres
listing_zoning
listing_sqfttotal
listing_sqftunfinished
listing_rooms
listing_bedrooms
listing_stories
listing_basement
listing_garage
listing_garagecap
listing_fireplaces
listing_pool
listing_bathsfull
listing_bathshalf
listing_bathstotal
listing_bathsfullbsmt
listing_bathsfulllevel1
listing_bathsfulllevel2
listing_bathsfulllevel3
listing_bathshalfbsmt
listing_bathshalflevel1
listing_bathshalflevel2
listing_bathshalflevel3
listing_roombed2desc
listing_roombed2length
listing_roombed2level
listing_roombed2width
listing_roombed3desc
listing_roombed3length
listing_roombed3level
listing_roombed3width
listing_roombed4desc
listing_roombed4length
listing_roombed4level
listing_roombed4width
listing_roombed5desc
listing_roombed5length
listing_roombed5level
listing_roombed5width
listing_roomdiningdesc
listing_roomdininglength
listing_roomdininglevel
listing_roomdiningwidth
listing_roomfamilydesc
listing_roomfamilylength
listing_roomfamilylevel
listing_roomfamilywidth
listing_roomfloridadesc
listing_roomfloridalength
listing_roomfloridalevel
listing_roomfloridawidth
listing_roomfoyerdesc
listing_roomfoyerlength
listing_roomfoyerlevel
listing_roomfoyerwidth
listing_roomgreatdesc
listing_roomgreatlength
listing_roomgreatlevel
listing_roomgreatwidth
listing_roomkitchendesc
listing_roomkitchenlength
listing_roomkitchenlevel
listing_roomkitchenwidth
listing_roomlaundrydesc
listing_roomlaundrylength
listing_roomlaundrylevel
listing_roomlaundrywidth
listing_roomlivingdesc
listing_roomlivinglength
listing_roomlivinglevel
listing_roomlivingwidth
listing_roommasterbrdesc
listing_roommasterbrlength
listing_roommasterbrlevel
listing_roommasterbrwidth
listing_roomofficedesc
listing_roomofficelength
listing_roomofficelevel
listing_roomofficewidth
listing_roomother1desc
listing_roomother1length
listing_roomother1level
listing_roomother1width
listing_roomother1
listing_roomother2desc
listing_roomother2length
listing_roomother2level
listing_roomother2width
listing_roomother2
listing_roomrecdesc
listing_roomreclength
listing_roomreclevel
listing_roomrecwidth
listing_handicap
listing_yearbuilt
listing_lotdesc
listing_construction
listing_watertype
listing_roof
listing_attic
listing_style
listing_floors
listing_fireplacedesc
listing_structure
listing_walltype
listing_basedesc
listing_appliances
listing_interior
listing_exterior
listing_amenities
listing_pooldesc
listing_fence
listing_porch
listing_heatsrc
listing_heatsystem
listing_coolsystem
listing_waterheater
listing_watersewer
listing_parking
listing_garagedesc
listing_handicapdesc
listing_feedesc
listing_restrictions
listing_terms
listing_assocfeeincludes
listing_building
listing_possession
listing_farmtype
listing_ownerdesc
listing_irrigationsrc
listing_taxyear
listing_taxamount
listing_directions
listing_remarks
listing_virtualtourlink
listing_vowavmyn
listing_vowcommyn
listing_addressdisplayyn
listing_f174
listing_proptype
listing_lat
listing_lon
listing_photo1
listing_listofficename
listing_vtoururl
listing_multiphotoflag
id <- primary key
If you only run the SELECT statement from your INSERT you will see that your select returns all the columns of both mls_cvrmls AND mlscopy.
You probably need:
INSERT INTO mlscopy
SELECT parent.* FROM mls_cvrmls AS parent
LEFT JOIN mlscopy AS child
ON child.listing_listnum != parent.listing_listnum
EDIT
I am not sure your JOIN condition is correct. This kind of condition will probably return many records you did not wish for. Each record in mls_cvrmls has many (many!) records in mlscopy which satisfy the condition.
As an example, let's assume the 2 tables have 3 columns, and you want to add all records from parent to child, as long as they don't exist there yet.
INSERT INTO mlscopy (listing_listing, listing_listnum, listing_propertytype)
SELECT parent.listing_listing,
parent.listing_listnum,
parent.listing_propertytype -- (more columns...)
FROM mls_cvrmls AS parent
LEFT JOIN mlscopy AS child
ON child.listing_listnum = parent.listing_listnum
WHERE child.listing_listnum IS NULL
Couple of things here.
The error message is because "select *" gives you all columns from all tables in the query. That is, each row has all the columns from mls_cvrmls PLUS all the columns from mlscopy. This is not going to be suitable for inserting into mlscopy because it's going to have many extra columns. If the two tables have all the same columns, then they will all be doubled.
Your join condition is unlikely to be correct. It says that for every row in parent, you want all the rows in child that don't match. Think this through. Suppose parent has listing_listnum values of 1, 2, and 4, and child has values 1, 4, and 5. The pairs 1/1 and 4/4 will be excluded, but you'll get the pairs 1/4, 1/5, 2/1, 2/4, 2/5, 4/1, and 4/5. I think what you really want is just the records from parent that aren't found in child at all, which in this example is just 2. So what you probably want is a "not exists" query.
I'm not entirely clear from your description, but you say you want to "update newly added records", but then you do an INSERT. Do you want to update existing records, or do you want to insert new records?
So assuming that what you want to do is find records that are in mls_cvrmls but not in mlscopy and insert these records, I think the correct query would be more like -- and your field list is long so I'll just pick a few sample fields to make the point:
insert into mlscopy (listing_listing, listing_listnum, listing_propertytype, listing_status,
listing_listingpublicid, listing_agentname)
select listing_listing, listing_listnum, listing_propertytype, listing_status,
listing_listingpublicid, listing_agentname
from mls_cvrmls
where not exists (select 1 from mlscopy where mlscopy.listing_listnum=mls_cvrmls.listing_listnum)
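To make the difference concrete, the sketch below replays the 1, 2, 4 versus 1, 4, 5 example on SQLite from Python. The tables are cut down to the single listing_listnum column, which is an assumption made purely to keep the example short.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE mls_cvrmls (listing_listnum INTEGER);
    CREATE TABLE mlscopy    (listing_listnum INTEGER);
    INSERT INTO mls_cvrmls VALUES (1), (2), (4);
    INSERT INTO mlscopy    VALUES (1), (4), (5);
""")

# The original join: every parent row pairs with every child row that differs.
pairs = conn.execute("""
    SELECT parent.listing_listnum, child.listing_listnum
    FROM mls_cvrmls AS parent
    LEFT JOIN mlscopy AS child
           ON child.listing_listnum != parent.listing_listnum
    ORDER BY 1, 2
""").fetchall()

# The NOT EXISTS form: only parent rows with no counterpart in the copy.
missing = conn.execute("""
    SELECT listing_listnum
    FROM mls_cvrmls
    WHERE NOT EXISTS (
        SELECT 1 FROM mlscopy
        WHERE mlscopy.listing_listnum = mls_cvrmls.listing_listnum
    )
""").fetchall()

print(len(pairs), pairs)
print(missing)
```

The inequality join yields seven pairs from three parent rows, while NOT EXISTS yields only the one listing that is actually missing from the copy.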
As Icarus says, you should list all columns explicitly. Among the many reasons for this, even if the two tables have all the same fields, if they do not occur in the same order, "insert into mlscopy select *" will not work, because a SQL engine does not match names, it just takes the fields in each table in the order they occur. This may seem like a pain if the list is long, but trust me, after you've been burned a few times by mysterious problems, you'll want to list the fields explicitly.
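A tiny illustration of the column-order point, using two hypothetical tables src and dst that hold the same two columns in a different order. It runs on SQLite from Python, but the positional behaviour of "insert ... select *" is the same idea in MySQL.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE src (city TEXT, state TEXT);
    CREATE TABLE dst (state TEXT, city TEXT);   -- same columns, other order
    INSERT INTO src VALUES ('Austin', 'TX');
""")

# Positional copy: the first column of src lands in the first column of dst.
conn.execute("INSERT INTO dst SELECT * FROM src")
positional = conn.execute("SELECT state, city FROM dst").fetchall()

# Explicit column lists: values are matched by the names written out.
conn.execute("DELETE FROM dst")
conn.execute("INSERT INTO dst (city, state) SELECT city, state FROM src")
explicit = conn.execute("SELECT state, city FROM dst").fetchall()

print(positional)
print(explicit)
```

With "select *" the city ends up in the state column without any error being raised, which is exactly the kind of mysterious problem that explicit column lists prevent.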
And just a side note: Why do you prefix all the column names with "listing_" ? This just makes more to type every time you use the table. If you have another table that has names that would otherwise be the same and you need to distinguish, you can always prefix with the table name, like "mls_cvrmls.propertytype".
Get used to list-all-your-columns and you will save yourself some headaches like this and in the future your code won't break if they add more columns.
Change your sql statement to something like this
INSERT INTO mlscopy (col1,col2,col3...coln)
SELECT parent.col1,parent.col2,parent.col3...parent.coln FROM mls_cvrmls AS parent
LEFT JOIN mlscopy AS child
ON child.listing_listnum = parent.listing_listnum
WHERE child.listing_listnum IS NULL
The two tables have different structures, and you're not specifying WHICH fields would be copied across. If you must have different structures, you'll have to explicitly state WHICH fields should be copied. MySQL isn't smart enough to figure that sort of mismatch out on its own, so it complains and aborts.

Record-doubling problem on a simple left join

I'm running this query:
CREATE TABLE
SELECT people.*, Sheet1.department
FROM people LEFT JOIN Sheet1 ON people.depno = Sheet1.depno
On a set of tables detailing employee records.
The goal is to create a new table that has all the "people" data, plus a human-readable department name. Simple, right?
The problem is that each record in the resulting table appears to be duplicated exactly (with literally every field being the same), turning a roughly 23,000-record table into a roughly 46,000-record table. I say "roughly" because it's not an exact doubling -- there's a difference of about a hundred records.
Some details: The "people" table contains 15 fields, including the "depno" field, which is an integer indicating department.
The "Sheet1" table is, as one would guess, a table generated from an imported xls file containing two fields: the shared "depno" and a new "department" (the latter being a verbose department name corresponding to the depno in question). There are 44 records in the "Sheet1" table.
Thanks in advance for any pointers on this. Let me know what other information you can use from me.
Update: Here's the code I ended up using, from my response to Johan (thanks again to everyone who worked on this):
CREATE TABLE morebetter
SELECT people.*, Sheet1.department FROM people
LEFT JOIN Sheet1 ON people.depno = Sheet1.depno
GROUP BY id
Sounds like the Sheet1.depno field isn't unique?
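A quick way to check this suspicion, sketched on SQLite from Python with invented sample rows: look for depno values that occur more than once in Sheet1, and compare the row count of the plain join with a join against a de-duplicated Sheet1. The three-column people table is a simplification of the 15-field table in the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE people (id INTEGER, name TEXT, depno INTEGER);
    CREATE TABLE Sheet1 (depno INTEGER, department TEXT);
    INSERT INTO people VALUES (1, 'Ann', 10), (2, 'Bob', 10), (3, 'Cy', 20);
    -- depno 10 was imported twice, as can happen with a spreadsheet import
    INSERT INTO Sheet1 VALUES (10, 'Sales'), (10, 'Sales'), (20, 'Ops');
""")

# Which depno values are repeated in Sheet1?
repeated = conn.execute(
    "SELECT depno, COUNT(*) FROM Sheet1 GROUP BY depno HAVING COUNT(*) > 1"
).fetchall()

# Plain join: each person in a repeated department comes back twice.
plain = conn.execute("""
    SELECT COUNT(*) FROM people
    LEFT JOIN Sheet1 ON people.depno = Sheet1.depno
""").fetchone()[0]

# Join against a de-duplicated Sheet1: one row per person again.
deduped = conn.execute("""
    SELECT COUNT(*) FROM people
    LEFT JOIN (SELECT DISTINCT depno, department FROM Sheet1) AS s
           ON people.depno = s.depno
""").fetchone()[0]

print(repeated, plain, deduped)
```

If only some departments are repeated in Sheet1, the result is close to but not exactly double the original row count, which matches the symptom described in the question.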
Sheet1.depno is probably not unique: if a depno appears twice in Sheet1, every matching people row comes back twice. That's why you're getting the doubling.
Change the SELECT part to
SELECT DISTINCT people.*, Sheet1.department
FROM people LEFT JOIN Sheet1 ON people.depno = Sheet1.depno
This will eliminate duplicate rows.
In MySQL you can also write
SELECT people.*, Sheet1.department
FROM people LEFT JOIN Sheet1 ON people.depno = Sheet1.depno
GROUP BY people.id
which works slightly differently.
The first query eliminates rows with duplicate output; the second keeps one row per people.id, even if people.id does not appear in the output.
I like the second form, because it makes explicit which duplicate you're trying to eliminate and you don't need to tweak the output.
It may also be slightly faster to execute.
***Warning***
The group by version keeps one row for each people.id it finds, but if the other fields in the select are not identical across those rows it will just pick one arbitrarily!
In other words: if the outcome of the select distinct differs from the group by version, MySQL is silently dropping non-duplicate rows.
This may or may not be what you want!
To be safe, group by all the fields that you care about!
If the group by is on a unique key, then it's pointless to include further fields from the same table as that unique key.