I am playing around with ms-access (MS-Office Professional Plus 2013) trying to figure out if I have duplicate rows before I merge one table into another table. I want to collect the rows that are duplicates and give an error with the duplicates before the merge happens. I have two scenarios to cover. The first scenario is duplicates on a single column. The second scenario is duplicates on two columns. Any help on the first scenario would be appreciated.
Scenario 1:
The two tables have the exact same column structure, so to keep it simple I will use the following table structure. (I simply added two tables inside Access and ran the query to figure out the correct syntax.)
Duplicates based upon one column:
Table1    Table2
ID        ID
1         1
2         3
Running the query:
Select ID from Table1
Union ALL
Select ID from Table2
group by ID having count(*) > 1
The result set is always the records from the first select statement. In other words, it always returns ID=1 and ID=2. If I swap Table1 and Table2, the result set always comes from Table2. If I change "Union ALL" to "Union", I get the same results. I also tried changing the ID column names and changing the type to Number instead of AutoNumber. Any idea what I am doing wrong?
Scenario 2: I know what the value should be in the second column so it is hard-coded. I added this here to show access appears to work as expected in this scenario but not in scenario 1.
Duplicates based upon two columns:
Table1          Table2
ID  Field1      ID  Field1
1   abc         1   abc
2   bcd         3   abc
Running the query below works as expected. Only the row with ID=1 is returned.
select ID, Field1 from Table1 where Field1 = 'abc'
union all
select ID, Field1 from Table2 where Field1 = 'abc'
group by ID, Field1 having count(*) > 1
The GROUP BY is only being applied to the second table. You need to do the UNION ALL first, and then the GROUP BY and HAVING on a SELECT from the combined results.
Not Access specific, but something like this works:
SELECT id FROM
(
SELECT id FROM a
UNION ALL
SELECT id FROM b
) AS c
GROUP BY id HAVING COUNT(*) > 1
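To see the difference in practice, here is a minimal sketch that runs the combined query against the Scenario 1 data. It assumes SQLite (through Python's built-in sqlite3 module) as a stand-in for Access, so Access-specific syntax may differ slightly; the function name and the `combined` alias are illustrative only.

```python
import sqlite3


def find_cross_table_duplicates(conn):
    """Return IDs that appear more than once across Table1 and Table2 combined."""
    query = """
        SELECT ID FROM (
            SELECT ID FROM Table1
            UNION ALL              -- keep every row so COUNT(*) sees repeats
            SELECT ID FROM Table2
        ) AS combined
        GROUP BY ID
        HAVING COUNT(*) > 1
        ORDER BY ID
    """
    return [row[0] for row in conn.execute(query)]


# Rebuild the Scenario 1 sample data in an in-memory database.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Table1 (ID INTEGER);
    CREATE TABLE Table2 (ID INTEGER);
    INSERT INTO Table1 VALUES (1), (2);
    INSERT INTO Table2 VALUES (1), (3);
""")

duplicates = find_cross_table_duplicates(conn)
print(duplicates)  # [1]
```

Note that this also reports an ID that is repeated within a single table, which is probably what you want before a merge anyway.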
My preferred way to do things like that is to use the built-in Query Wizard:
Query Wizard, Find Duplicates Query Wizard
Let Access create the SQL statement for you and then you can modify it and/or move it into code.
Related
I have two tables, with 2 PKs. Table 1 has 478 records. Field 1 is a unique ID for that table only. Table 1 field 2 is a ID (shared with table 2) and 3rd field is a category field. IDs from field 2 can be repeated within a table, but I cannot have ID+category twice.
I have a 2nd table that contains 757 records. It has an ID column and a category column (like table 1), and I want to know which records from table 1 are included in table 2. For the moment I am just checking which IDs are included in both tables (I want to clean up the database so I can later use an AND condition to match on ID + category).
My SQL query does not return the desired result. When I do
SELECT DISTINCT(table1.field1) FROM table1, table2 WHERE table1.ID = table2.ID;
I get all the results that do match, but, when I do the opposite
SELECT table1.field1 FROM table1, table2 WHERE table1.ID != table2.ID;
SQL gives all the rows from table 1, when, the expected outcome would be
total rows from table 1 - IDs that do match with the ones at table 2
I've tried to invert the order in which the query is displayed as:
SELECT table1.field1 FROM table1, table2 WHERE table2.ID != table1.ID;
But then a loop occurs and I get 36000+ results which is, of course, impossible (I imagine that checking a bigger record table against a smaller one makes the small one loop over and over, and seeing that I get the full table all the time, the loop is Xtimes478, hence the 36000+ results).
I have checked this matched/unmatched query using R (just for testing) and I got 170 matches (that I can obtain in SQL) and 308 "not coincident" results (170+308=478, so I imagine it makes sense even if I am using R instead of a proper relational database system)
How can I search for unmatched IDs in a query rather than checking for matched ones and subtracting from the total? How do I get the 308 records that do not match?
If you want values in table 1 that are not in table 2, then use not exists or something similar:
select t1.*
from table1 t1
where not exists (select 1 from table2 t2 where t2.id = t1.id);
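A small sketch of why the `!=` join over-counts while `NOT EXISTS` does not. It assumes SQLite via Python's sqlite3 module and a tiny made-up data set (three rows per table); the column layout is a simplified guess at the tables described in the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE table1 (field1 INTEGER PRIMARY KEY, id INTEGER);
    CREATE TABLE table2 (id INTEGER);
    INSERT INTO table1 VALUES (10, 1), (11, 2), (12, 3);
    INSERT INTO table2 VALUES (1), (1), (4);
""")

# The comma join pairs every table1 row with every table2 row, so "!="
# keeps each pair whose ids differ, not each table1 row without a match.
pair_count = conn.execute(
    "SELECT COUNT(*) FROM table1, table2 WHERE table1.id != table2.id"
).fetchone()[0]

# NOT EXISTS keeps a table1 row only when no table2 row shares its id.
unmatched = [
    row[0]
    for row in conn.execute("""
        SELECT t1.field1
        FROM table1 t1
        WHERE NOT EXISTS (SELECT 1 FROM table2 t2 WHERE t2.id = t1.id)
        ORDER BY t1.field1
    """)
]

print(pair_count)  # 7 mismatching pairs out of 3 x 3 = 9
print(unmatched)   # [11, 12]
```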
I have two tables that have almost identical columns. The first table contains the "current" state of a particular record and the second table contains all the previous states of that record (it's a history table). The second table has a FK to the first table.
I'd like to query both tables so I get the record's entire history, including its current state, in one result. I don't think a JOIN is what I'm trying to do, as that "joins" multiple tables "horizontally" (one or more columns of one table combined with one or more columns of another table to produce a result that includes columns from both tables). Rather, I'm trying to "join"(???) the tables "vertically" (meaning no columns are added to the result; the rows from both tables simply fall under the same columns in the result set).
Not exactly sure if what I'm expressing makes sense -- or if it's possible in MySQL.
To accomplish this, you could use a UNION between two SELECT statements. I would also suggest selecting from a derived table in the following manner so that you can sort by columns in your result set. Suppose we wanted to combine results from the following two queries:
SELECT FieldA, FieldB FROM table1;
SELECT FieldX, FieldY FROM table2;
We could join these with a UNION statement as follows:
SELECT Field1, Field2 FROM (
SELECT FieldA AS `Field1`, FieldB AS `Field2` FROM table1
UNION SELECT FieldX AS `Field1`, FieldY AS `Field2` FROM table2)
AS `derived_table`
ORDER BY Field1 ASC, Field2 DESC
In this example, I have selected from table1 and table2 fields which are similar, but not identically named, sharing the same data type. They are matched up using aliases (e.g., FieldA in table1 and FieldX in table2 both map to Field1 in the result set, etc.).
If each table has the same column names, field aliasing is not required, and the query becomes simpler.
Note: In MySQL it is necessary to name derived tables, even if the name given is not intended to be used.
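The derived-table pattern can be tried out with a short script. This is only a sketch: it assumes SQLite (Python's sqlite3 module) rather than MySQL, drops the backticks, and uses invented sample rows; the table and field names follow the example above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE table1 (FieldA INTEGER, FieldB TEXT);   -- "current" rows
    CREATE TABLE table2 (FieldX INTEGER, FieldY TEXT);   -- "history" rows
    INSERT INTO table1 VALUES (2, 'current');
    INSERT INTO table2 VALUES (1, 'old'), (2, 'older');
""")

# Alias the differently named columns to shared names, stack the rows
# with UNION, then sort the combined result in the outer query.
rows = conn.execute("""
    SELECT Field1, Field2 FROM (
        SELECT FieldA AS Field1, FieldB AS Field2 FROM table1
        UNION
        SELECT FieldX AS Field1, FieldY AS Field2 FROM table2
    ) AS derived_table
    ORDER BY Field1 ASC, Field2 DESC
""").fetchall()

print(rows)  # [(1, 'old'), (2, 'older'), (2, 'current')]
```

UNION removes exact duplicate rows; if a history row can legitimately equal the current row and both should appear, UNION ALL is the variant to use.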
UNION.
Select colA, colB From TblA
UNION
Select colA, colB From TblB
You're after a left join on the first table. That will make the right side's id either a number (exists in both) or null (exists only in the left table).
You want
select lhs.* , rhs.id from lhs left join rhs using(Id)
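A quick sketch of what the left join returns, assuming SQLite via Python's sqlite3 module and two made-up tables named lhs and rhs. It spells the join condition out with ON, which should be equivalent to using(Id) for this purpose.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE lhs (id INTEGER, name TEXT);
    CREATE TABLE rhs (id INTEGER);
    INSERT INTO lhs VALUES (1, 'a'), (2, 'b');
    INSERT INTO rhs VALUES (1);
""")

# Every lhs row is kept; rhs_id is NULL (None in Python) when rhs has no match.
rows = conn.execute("""
    SELECT lhs.id, lhs.name, rhs.id AS rhs_id
    FROM lhs
    LEFT JOIN rhs ON rhs.id = lhs.id
    ORDER BY lhs.id
""").fetchall()

print(rows)  # [(1, 'a', 1), (2, 'b', None)]
```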
I have the following query:
SELECT DISTINCT field1, field2 FROM table1 WHERE something = 'x' ORDER BY time_of_insertion DESC LIMIT 2
What I want is to get the last two inserted rows of the table, one with a certain field1 ('1' for example) and another with another field1 ('2' for example). So, it's not really the last two rows, what I want is the last one from one certain field1 and the last one from a different field1. When I tried the query, DISTINCT was not respected. Any ideas on why and on how to solve this?
I think a UNION will do what you want. Select a single row with your exact criteria, and then combine its resultset with that of another SELECT statement that selected the other row you wanted.
It's hard to be concrete and definitive when your example is rather generalized, but something along the lines of this:
(SELECT field1, field2 FROM table1 WHERE something = 'x' ORDER BY time_of_insertion DESC LIMIT 1)
UNION
(SELECT field1, field2 FROM table1 WHERE something = 'y' ORDER BY time_of_insertion DESC LIMIT 1);
Notice it's one statement: only the second SELECT has a semi-colon terminating it.
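Here is a runnable sketch of the "latest row per condition" idea. It assumes SQLite via Python's sqlite3 module, where a SELECT that carries its own ORDER BY and LIMIT has to be wrapped in a subquery before it can take part in a UNION (MySQL expects such SELECTs to be parenthesized instead). The sample rows and the integer timestamps are invented.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE table1 (
        field1 TEXT, field2 TEXT, something TEXT, time_of_insertion INTEGER
    );
    INSERT INTO table1 VALUES
        ('1', 'a', 'x', 1),
        ('1', 'b', 'x', 2),
        ('2', 'c', 'y', 1),
        ('2', 'd', 'y', 3);
""")

# Each subquery picks the newest row for one condition; UNION ALL stacks them.
latest = conn.execute("""
    SELECT * FROM (
        SELECT field1, field2 FROM table1
        WHERE something = 'x'
        ORDER BY time_of_insertion DESC LIMIT 1
    )
    UNION ALL
    SELECT * FROM (
        SELECT field1, field2 FROM table1
        WHERE something = 'y'
        ORDER BY time_of_insertion DESC LIMIT 1
    )
""").fetchall()

print(sorted(latest))  # [('1', 'b'), ('2', 'd')]
```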
I have a PHP comma-separated string of ids like 1,2,3. I have a MySQL table which has an id column.
Table task_comments:
id
--
1
2
I want to get all the ids in the list which are not in the table. Here I would like to get 3 as the result.
Currently I am building a query like the following in PHP and it is working.
SELECT id FROM (
SELECT 1 id FROM DUAL
UNION ALL
SELECT 2 id FROM DUAL
UNION ALL
SELECT 3 id FROM DUAL
) a WHERE id NOT IN (SELECT id FROM task_comments);
I don't think this is a good way to do this. I want to know if there is a better method to do this, because if the list is big the union list will grow.
Thanks
PS: I can post the PHP code used to make the query also if needed.
PPS: I would like to know if there is better MySQL Query.
Your string separated values in PHP:
$my_ids = "1,2,3";
SQL query in PHP:
$query = "SELECT id FROM task_comments WHERE id IN ($my_ids)";
This will return those id values from your list (1, 2 or 3) that exist in the database.
Then you can simply compare it.
What you do is already the way to do it. There is no other way to create sets to reason over than the (pretty ugly) union construct. You can leave off the "FROM DUAL"s and replace the UNION ALLs with plain UNIONs to make it shorter, although with a very large list UNION ALL might be the more performant solution, as it does not have to sort in order to remove duplicates.
SELECT id FROM (
SELECT 1 id
UNION
SELECT 2 id
UNION
SELECT 3 id
) a WHERE id NOT IN (SELECT id FROM task_comments);
You might also want to have a look at temporary tables. That way you could create the set you need in a more natural way without hitting the limits of the large SQL involving unions.
CREATE TEMPORARY TABLE temp_table (id int);
INSERT INTO temp_table VALUES (1),(2),(3); -- or just repeat for as many values as you might have from your app (batch insert?)
SELECT id FROM temp_table
WHERE id NOT IN (SELECT id FROM task_comments);
You can do it like that: select your ids:
SELECT id FROM task_comments WHERE id IN (1,2,3)
(here (1,2,3) is built from your array data - for example, via implode() function)
Then, in a loop, fetch the returned ids into an array and use array_diff() to find the absent values.
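The same "query the ids that exist, then diff in application code" idea, sketched in Python rather than PHP. It assumes SQLite via the sqlite3 module in place of MySQL; `missing_ids` is a hypothetical helper name, and the set difference plays the role of array_diff().

```python
import sqlite3


def missing_ids(conn, candidate_ids):
    """Return the candidate ids that are not present in task_comments."""
    if not candidate_ids:
        return []
    # One "?" placeholder per id keeps the values out of the SQL text.
    placeholders = ",".join("?" for _ in candidate_ids)
    query = f"SELECT id FROM task_comments WHERE id IN ({placeholders})"
    present = {row[0] for row in conn.execute(query, list(candidate_ids))}
    # Preserve the caller's order while dropping ids the table already has.
    return [i for i in candidate_ids if i not in present]


conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE task_comments (id INTEGER PRIMARY KEY);
    INSERT INTO task_comments VALUES (1), (2);
""")

my_ids = [int(part) for part in "1,2,3".split(",")]
absent = missing_ids(conn, my_ids)
print(absent)  # [3]
```

Binding the ids as parameters, instead of interpolating the string into the query, also avoids SQL injection if the list ever comes from user input.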
Maybe you should first fetch all the distinct ids from the table that are present in your string of ids -
SELECT DISTINCT id FROM task_comments WHERE id IN (1,2,3..)
and then compare the two.
I have two tables with identical structures handling distinct data. I want to merge them, add a text field indicating where the data for that row came from, and order by a common field.
TABLE1
ID|NAME|YEAR
1,'peter',2008
2,'edward',2010
TABLE2
ID|NAME|YEAR
1,'compadre',2009
2,'vika',2011
Draft of the query (obviously erroneous):
select * from TABLE1 JOIN TABLE2 order by YEAR asc
expected result:
1,'peter','iamfromTABLE1',2008
1,'compadre','iamfromTABLE2',2009
2,'edward','iamfromTABLE1',2010
2,'vika','iamfromTABLE2',2011
I know I can do this using PHP/MySQL, but is there not a more elegant way like the "One Simple Query".
Use a Union query and literals:
SELECT ID, Name, 'iamfromTABLE1' as indicator, Year
FROM Table1
UNION
SELECT ID, Name, 'iamfromTABLE2' as indicator, Year
FROM Table2
ORDER BY Year
EDIT: as indicator added on recommendation of iim.hlk
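For reference, a sketch that runs this against the sample rows from the question. It assumes SQLite via Python's sqlite3 module instead of MySQL, and it uses UNION ALL, since the indicator literal already makes rows from the two tables distinct from each other.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE TABLE1 (ID INTEGER, NAME TEXT, YEAR INTEGER);
    CREATE TABLE TABLE2 (ID INTEGER, NAME TEXT, YEAR INTEGER);
    INSERT INTO TABLE1 VALUES (1, 'peter', 2008), (2, 'edward', 2010);
    INSERT INTO TABLE2 VALUES (1, 'compadre', 2009), (2, 'vika', 2011);
""")

# A string literal in each SELECT tags every row with its source table;
# the single ORDER BY at the end sorts the combined result.
merged = conn.execute("""
    SELECT ID, NAME, 'iamfromTABLE1' AS indicator, YEAR FROM TABLE1
    UNION ALL
    SELECT ID, NAME, 'iamfromTABLE2' AS indicator, YEAR FROM TABLE2
    ORDER BY YEAR
""").fetchall()

for row in merged:
    print(row)
# (1, 'peter', 'iamfromTABLE1', 2008)
# (1, 'compadre', 'iamfromTABLE2', 2009)
# (2, 'edward', 'iamfromTABLE1', 2010)
# (2, 'vika', 'iamfromTABLE2', 2011)
```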