I have two tables that are very similar. For example, let's say that each row has two ID numbers and a data value. The first ID number may occur once, twice, or not at all, and the second ID number is either 1 or -1. The data value is not important, but for the sake of this example, we'll say it's an integer. For each pair of ID numbers there can only be one data value, so if I have a data point where the IDs are 10 and 1, there won't be another 10 and 1 row with a different data value. Similarly, in the other table, the data point with IDs 10 and 1 will be the same as in the first table. I want to select the rows that exist in both tables so that I can change the data value in all of those rows. My command for MySQL so far is as follows:
SELECT DISTINCT * FROM schema.table1
WHERE EXISTS (SELECT * FROM schema.table2
WHERE schema.table1.ID1 = schema.table2.ID1
AND schema.table1.ID2 = schema.table2.ID2);
I want to be able to have this code select all the rows in table1 that are also in table2, but allow me to edit table1 values.
I understand that by joining the two tables, I can see the rows that exist in both, but would this allow me to change the actual data values if I changed the values in the joined result? For example, if I did:
SELECT DISTINCT * FROM schema.table1 INNER JOIN schema.table2
ON schema.table1.ID1 = schema.table2.ID1
AND schema.table1.ID2 = schema.table2.ID2;
If I call UPDATE on the rows that I get from this query, would the actual values in table1/table2 be changed, or is the joined result just created in memory, so that I would only be changing values that are discarded when the query is over?
Update as follows:
UPDATE schema.table1 SET data = whateverupdate
WHERE ID1 IN (SELECT ID1 FROM schema.table2
WHERE schema.table2.ID1 = schema.table1.ID1
AND schema.table2.ID2 = schema.table1.ID2);
In the inner select statement you cannot use select *; you have to select a particular column. This should work because the inner select finds the rows in question and feeds them to the update statement. That said, the inner select has to return exactly the rows you need; otherwise, the wrong rows will be updated. Hope this helps.
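To see that an UPDATE filtered this way changes the stored rows rather than a temporary result set, here is a minimal sketch. It assumes Python's built-in sqlite3 module as a stand-in for MySQL, and the sample rows and the value 99 are made up for illustration. It uses EXISTS on both ID columns, which is one way to express the same filter.

```python
import sqlite3

# In-memory SQLite stands in for MySQL here (an assumption of this sketch).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE table1 (ID1 INTEGER, ID2 INTEGER, data INTEGER, PRIMARY KEY (ID1, ID2));
    CREATE TABLE table2 (ID1 INTEGER, ID2 INTEGER, data INTEGER, PRIMARY KEY (ID1, ID2));
    -- Made-up sample rows: only the pair (10, 1) exists in both tables.
    INSERT INTO table1 VALUES (10, 1, 5), (10, -1, 6), (11, 1, 7);
    INSERT INTO table2 VALUES (10, 1, 5), (12, -1, 8);
""")

# Update only the table1 rows whose (ID1, ID2) pair also exists in table2.
conn.execute("""
    UPDATE table1
    SET data = 99
    WHERE EXISTS (SELECT 1 FROM table2
                  WHERE table2.ID1 = table1.ID1
                    AND table2.ID2 = table1.ID2)
""")
conn.commit()

# Read table1 back: the change is in the table itself, not in a query result.
rows_after_update = conn.execute(
    "SELECT ID1, ID2, data FROM table1 ORDER BY ID1, ID2"
).fetchall()
print(rows_after_update)
```

The point of the sketch is that the UPDATE targets table1 directly and uses table2 only as a filter, so nothing depends on editing the output of a SELECT.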
I have two tables, with 2 PKs. Table 1 has 478 records. Field 1 is a unique ID for that table only. Field 2 of table 1 is an ID (shared with table 2) and the 3rd field is a category field. IDs from field 2 can be repeated within a table, but I cannot have the same ID+category pair twice.
I have a 2nd table that contains 757 records. It has an ID column and a category column (like table 1), and I want to know which records from table 1 are included in table 2. For the moment I am just checking which IDs are included in both tables (I want to clean up the database so that I can later use an AND condition to match on ID + category).
My SQL query does not return the desired result. When I do
SELECT DISTINCT(table1.field1) FROM table1, table2 WHERE table1.ID = table2.ID;
I get all the results that do match, but, when I do the opposite
SELECT table1.field1 FROM table1, table2 WHERE table1.ID != table2.ID;
SQL gives all the rows from table 1, when the expected outcome would be
total rows from table 1 - IDs that do match with the ones at table 2
I've tried to invert the order in which the query is displayed as:
SELECT table1.field1 FROM table1, table2 WHERE table2.ID != table1.ID;
But then a loop occurs and I get 36000+ results which is, of course, impossible (I imagine that checking a bigger record table against a smaller one makes the small one loop over and over, and seeing that I get the full table all the time, the loop is Xtimes478, hence the 36000+ results).
I have checked this matched/unmatched query using R (just for testing) and I got 170 matches (that I can obtain in SQL) and 308 "not coincident" results (170+308=478, so I imagine it makes sense even if I am using R instead of a proper relational database system)
How can I search for unmatched IDs in a query rather than checking for matched ones and subtracting from the total? How do I get the 308 records that do not match?
If you want values in table 1 that are not in table 2, then use not exists or something similar:
select t1.*
from table1 t1
where not exists (select 1 from table2 t2 where t2.id = t1.id);
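A small sketch may help show why the != join in the question returns far too many rows while not exists returns only the unmatched ones. It assumes Python's built-in sqlite3 module in place of MySQL, and the three-row and two-row tables are invented for illustration.

```python
import sqlite3

# In-memory SQLite stands in for MySQL here (an assumption of this sketch).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE table1 (field1 INTEGER PRIMARY KEY, id INTEGER);
    CREATE TABLE table2 (id INTEGER);
    -- Made-up data: only id 100 appears in both tables.
    INSERT INTO table1 VALUES (1, 100), (2, 200), (3, 300);
    INSERT INTO table2 VALUES (100), (400);
""")

# The != join pairs each table1 row with EVERY table2 row that has a different id,
# so even the matching row (id 100) shows up, paired with id 400.
cross_pairs = conn.execute("""
    SELECT table1.field1 FROM table1, table2
    WHERE table1.id != table2.id
    ORDER BY table1.field1
""").fetchall()

# NOT EXISTS keeps a table1 row only when no table2 row has the same id.
unmatched = conn.execute("""
    SELECT t1.field1 FROM table1 t1
    WHERE NOT EXISTS (SELECT 1 FROM table2 t2 WHERE t2.id = t1.id)
    ORDER BY t1.field1
""").fetchall()

print(len(cross_pairs), unmatched)
```

So the large row count in the question is not a loop; it is the number of (table1 row, table2 row) pairs whose IDs differ.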
I have two identical tables. I want to compare them and collect a result from the comparison. The conditions are:
Each record in TABLE1, grouped by TID, is compared with all records in TABLE2, which are also grouped by TID.
If a grouped record from TABLE1 is found in TABLE2 (whose records are grouped by TID, too) as many as N times (N is a user input variable), that record is inserted into a new table.
For example, as in the screenshot below, ITEM C-F-A grouped by TID 2 has 3 occurrences in table2, so those records are inserted into the new table:
I've already written code for this in vb.net and it works, but the program takes a ridiculous amount of time to complete. The main cause is that I'm processing a huge database.
My method is to load the two tables into 2D arrays, assigning values to an array while comparing pairs of elements with an if clause.
Below is the 2d array that I've created:
But this method is really expensive. In my real database (shown in the picture above), the first 2D array has 2k records and the second has 800, and when I estimated the time needed for the run to complete, it came out to a fantastic number, about 16 hours!
So I was wondering whether this problem can be solved with a MySQL query,
or some other method that is more efficient than what I have done?
INSERT INTO tbl3
SELECT tbl1.TID, tbl1.ITEM
FROM tbl1
JOIN tbl2 ON tbl2.TID = tbl1.TID AND tbl2.ITEM = tbl1.ITEM
This will insert a record into tbl3 for each record in tbl1 that has a corresponding record in tbl2 identified by TID and ITEM.
This assumes that TID/ITEM is a unique index in both tbl1 and tbl2.
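As a quick check of what that statement inserts, here is a minimal sketch. It assumes Python's built-in sqlite3 module instead of MySQL, a two-column layout for tbl3, and made-up rows; the INSERT ... SELECT itself is the one shown above.

```python
import sqlite3

# In-memory SQLite stands in for MySQL here (an assumption of this sketch).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- TID/ITEM is unique in both source tables, as the answer assumes.
    CREATE TABLE tbl1 (TID INTEGER, ITEM TEXT, UNIQUE (TID, ITEM));
    CREATE TABLE tbl2 (TID INTEGER, ITEM TEXT, UNIQUE (TID, ITEM));
    CREATE TABLE tbl3 (TID INTEGER, ITEM TEXT);
    -- Made-up rows: (2, 'C') and (2, 'F') exist in both tables.
    INSERT INTO tbl1 VALUES (1, 'A'), (2, 'C'), (2, 'F');
    INSERT INTO tbl2 VALUES (2, 'C'), (2, 'F'), (3, 'A');

    INSERT INTO tbl3
    SELECT tbl1.TID, tbl1.ITEM
    FROM tbl1
    JOIN tbl2 ON tbl2.TID = tbl1.TID AND tbl2.ITEM = tbl1.ITEM;
""")

copied = conn.execute("SELECT TID, ITEM FROM tbl3 ORDER BY TID, ITEM").fetchall()
print(copied)
```

Without the uniqueness assumption, a tbl1 row with several matching tbl2 rows would be inserted once per match.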
Ok, here's a wild, untested, guess (WUG).
The approach goes like this:
You need a list of TIDs from table1. So you build a distinct list (innermost query).
You use that list in a where clause when selecting from table2, so that you only get rows that have TIDs in table1. You group that query, and use HAVING to then limit the rows to only those with a count > X.
Now you have a list of TIDs that match those in table1 and have more than X entries in table2. You select those rows.
Those are used as the source of an insert statement into table1.
The SQL might look something like:
insert into table1
select * from table2 where tid in
(select tid
from table2
where tid in (select distinct tid from table1)
group by tid
having count(*) > 10);
I doubt the syntax is correct (can't remember the exact syntax for an insert from a select), and I make no claim it will work off the bat, but it's what my first shot would be if I wanted to do it all in one query.
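The grouped approach described above can be tried out on toy data. This is only a sketch under several assumptions: Python's built-in sqlite3 module stands in for MySQL, the rows are invented, n is a hypothetical stand-in for the user-supplied threshold, and the result goes into a hypothetical new_table (as the question asks) rather than back into table1.

```python
import sqlite3

n = 2  # hypothetical stand-in for the user-supplied threshold

# In-memory SQLite stands in for MySQL here (an assumption of this sketch).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE table1 (tid INTEGER, item TEXT);
    CREATE TABLE table2 (tid INTEGER, item TEXT);
    CREATE TABLE new_table (tid INTEGER, item TEXT);  -- hypothetical target table
    -- Made-up rows: TID 2 has three rows in table2, TID 3 is absent from table1.
    INSERT INTO table1 VALUES (1, 'A'), (2, 'C');
    INSERT INTO table2 VALUES (1, 'A'), (2, 'C'), (2, 'F'), (2, 'A'),
                              (3, 'B'), (3, 'C'), (3, 'D');
""")

# Innermost query: TIDs present in table1.
# Middle query: of those, the TIDs with more than n rows in table2.
# Outer query: the table2 rows for those TIDs, inserted into new_table.
conn.execute("""
    INSERT INTO new_table
    SELECT tid, item FROM table2
    WHERE tid IN (SELECT tid FROM table2
                  WHERE tid IN (SELECT DISTINCT tid FROM table1)
                  GROUP BY tid
                  HAVING COUNT(*) > ?)
""", (n,))

inserted = conn.execute("SELECT tid, item FROM new_table ORDER BY tid, item").fetchall()
print(inserted)
```

Whether "more than n" or "at least n" is the right comparison depends on how N is meant in the question, so the HAVING condition may need adjusting.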
I have a requirement to check what table a record belongs in out of 2 tables and set a variable depending on the returned table.
e.g. I have 2 tables (tbl_registered_users, tbl_unregistered_users). If I search for an email address that existed in tbl_registered_users I would like the query to return 'tbl_registered_users' so I can set a variable $whatTable = ... (for example).
I know I could do this with 2 queries or even 1 if I can guarantee the record will exist in at least one table however I would potentially like to use the query on 3/4/5/10 tables and on records that may not exist in any.
Thanks
You can use a UNION for that with a subquery:
SELECT *
FROM (
SELECT 'Registered' WhichTable, Email
FROM tbl_registered_users
UNION
SELECT 'UnRegistered', Email
FROM tbl_unregistered_users
) t
WHERE Email = 'emailaddress'
Using UNION ALL would yield better performance, but it won't remove duplicates (in case you have duplicated data within either table).
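To tie this back to setting a variable such as $whatTable, here is a minimal sketch of a lookup that returns the table name, or nothing when the email is in neither table. It assumes Python's built-in sqlite3 module rather than MySQL and PHP; which_table is a hypothetical helper and the email addresses are invented.

```python
import sqlite3

# In-memory SQLite stands in for MySQL here (an assumption of this sketch).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE tbl_registered_users (Email TEXT);
    CREATE TABLE tbl_unregistered_users (Email TEXT);
    -- Made-up sample addresses.
    INSERT INTO tbl_registered_users VALUES ('a@example.com');
    INSERT INTO tbl_unregistered_users VALUES ('b@example.com');
""")

def which_table(email):
    """Hypothetical helper: name of the first table containing email, else None."""
    row = conn.execute("""
        SELECT WhichTable FROM (
            SELECT 'tbl_registered_users' AS WhichTable, Email FROM tbl_registered_users
            UNION ALL
            SELECT 'tbl_unregistered_users', Email FROM tbl_unregistered_users
        ) t
        WHERE Email = ?
        LIMIT 1
    """, (email,)).fetchone()
    return row[0] if row else None

print(which_table('a@example.com'), which_table('nobody@example.com'))
```

Adding more tables means adding one more SELECT branch per table, and an empty result covers the case where the record exists in none of them.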
I have a table that compares the competitiveness of airline routes in United States. So, some of the fields in the table are id, route_id1, route_id2, airline_id1, airline_id2, sources_airport_id, and destination_airport_id.
This table is the result of self joining the routes table which consists of route maps.
But as a result, the table has somewhat duplicate records.
For example,
route 1 is competitive with route2 because they have the same source_airport and destination_airport but different airline_id. But I have two records comparing route1 to route2 and route2 to route1. They are the same comparison, but just ordered differently.
I've tried to fetch the duplicates by self-joining:
SELECT t1.*
FROM routes AS t1, routes AS t2
WHERE t1.route_id1 = t2.route_id2 AND t1.route_id2 = t2.route_id1
But this query just gets the same number of records in the table.
How do I get rid of the "duplicate" data?
Thanks in advance.
The problem is that you have no condition to separate t1 and t2. First you'll get duplicates where t1 and t2 are swapped. Secondly, if any rows have route_id1 = route_id2, you'll get those rows too, in both t1 and t2 of the result set.
The simplest way to get around this would be:
SELECT t1.* FROM routes AS t1, routes AS t2
WHERE t1.route_id1 = t2.route_id2 AND t1.route_id2 = t2.route_id1
AND t2.id > t1.id
The added criterion is that one row must have a larger id than the other. This means that t1, as returned, will always be the row with the lower id. You can of course replace it with a < or swap the parameters to get the row with the upper id.
That will get rid of most of the duplicates. If you have proper duplicates too in the database, those will create some duplicate rows in the result set of the above query. The reason is that a "duplicate" might be detected as being a "duplicate" of two different corresponding rows, which in turn are actual duplicates of each other.
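Here is a small sketch of the id comparison at work, followed by one possible way to delete the mirrored rows. It assumes Python's built-in sqlite3 module instead of MySQL, a cut-down routes table with only three columns, and made-up rows.

```python
import sqlite3

# In-memory SQLite stands in for MySQL here (an assumption of this sketch).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE routes (id INTEGER PRIMARY KEY, route_id1 INTEGER, route_id2 INTEGER);
    -- Made-up rows: ids 1/2 and 3/4 are mirrored pairs, id 5 has no mirror.
    INSERT INTO routes VALUES (1, 10, 20), (2, 20, 10), (3, 30, 40), (4, 40, 30), (5, 50, 60);
""")

# With t2.id > t1.id, each mirrored pair is reported once, as the lower-id row.
kept = conn.execute("""
    SELECT t1.id, t1.route_id1, t1.route_id2
    FROM routes AS t1, routes AS t2
    WHERE t1.route_id1 = t2.route_id2 AND t1.route_id2 = t2.route_id1
      AND t2.id > t1.id
    ORDER BY t1.id
""").fetchall()

# One possible cleanup: delete the higher-id row of each mirrored pair.
# Note: MySQL may reject a subquery on the table being deleted from; wrapping
# the subquery in a derived table is a common workaround (not tested here).
conn.execute("""
    DELETE FROM routes WHERE id IN (
        SELECT t2.id FROM routes AS t1, routes AS t2
        WHERE t1.route_id1 = t2.route_id2 AND t1.route_id2 = t2.route_id1
          AND t2.id > t1.id)
""")
remaining_ids = [r[0] for r in conn.execute("SELECT id FROM routes ORDER BY id")]
print(kept, remaining_ids)
```

Rows without a mirror, like id 5 here, are untouched by the delete.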
In the select, use the actual names of the fields together with the DISTINCT clause instead of using t1.*.
In the list of fields, make sure you do not include the airline_id columns, as those are different and would keep your records from being treated as duplicates.
Have you tried using "SELECT DISTINCT t1.* FROM ..."?