Is there any way to always keep the same value in two fields of different tables?
You could use triggers so that if one of the fields is changed, the other is synchronized to match.
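For example, something along these lines (a minimal MySQL sketch; the table and column names are illustrative, and I'm assuming table1 carries a table2_id foreign key):

-- Whenever table1.foo changes, copy the new value into table2.bar.
CREATE TRIGGER sync_foo_to_bar
AFTER UPDATE ON table1
FOR EACH ROW
  UPDATE table2
  SET bar = NEW.foo
  WHERE table2.id = NEW.table2_id;

You'd need a similar trigger for inserts, and note that mirroring this in the other direction (a trigger on table2 writing back to table1) can fail in MySQL, because a trigger may not modify a table the triggering statement is already using.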
It's usually best not to store a value twice. Instead, store the value in just one of the tables; when you query, join the two tables on a foreign key so that you have access to values from both tables at the same time:
SELECT table1.foo, table2.bar
FROM table1
JOIN table2 ON table1.table2_id = table2.id
If you store the value twice it is called denormalization. This can lead to problems if the values ever get out-of-sync for one reason or another. Sometimes it is advantageous to denormalize to improve performance, but a single join is very fast so unless you have measured the performance and found it to be too slow, I'd advise against doing this.
Any reason why you couldn't normalize the design of your database so that you don't have the same data twice and don't have to worry about stuff like this anymore?
In case you can't change the design, take a look at triggers.
Why would you ever want to do this?
If one attribute of one entity is always the same as some attribute of another related entity, then you have a redundant data model.
Instead of trying to synchronize the attributes, refer to one attribute. Use a join to connect the first table to the second, then get the value of the attribute from one table. E.g., if you currently have this:
TableA.foo should always equal TableB.bar
drop column TableA.foo, and do this:
select A.*, B.bar as foo
from TableA A
join TableB B on (B.foreign_key = A.key);
INSERT INTO TableA (FieldInA) VALUES ('X')
INSERT INTO TableB (FieldInB) VALUES ('X')
Then simply never delete, nor update, these table rows, and voilà, you will always have the same value in two fields of different tables.
Related
I'm trying to do a lookup of a field from one table, to update values in another table. I know this can be done easily with a query, but is there a way to do it in a table?
Basically, all I'm trying to do is an Excel VLOOKUP, but in Access, where if I change the lookup value in my destination table, the returned value will be updated.
You need to join the tables in a query and then set the values of a field in one table to the field in the second table based on the join fields (hope that made sense).
So, for example, if you have:
Table1 with KeyField1 and DescriptionField1
Table2 with KeyField2 and DescriptionField2
If you want to update DescriptionField1 with the values in DescriptionField2 where the KeyField values match you use this SQL:
UPDATE Table1 INNER JOIN Table2 ON Table1.KeyField1 = Table2.KeyField2
SET Table1.DescriptionField1 = Table2.DescriptionField2
The other way is to use a lookup field - select Lookup Wizard in the Data Type.
If taking this route I'd advise the ten commandments of Access tables :)
Thou shalt never allow thy users to see or edit tables directly, but
only through forms and thou shalt abhor the use of "Lookup Fields"
which art the creation of the Evil One.
http://access.mvps.org/access/tencommandments.htm
I wonder if there is a way to have an SQL table update itself dynamically.
I have table1 and table2, and I need to create a table3 using UNION where both tables' ID columns (PK) match. The issue is that I don't want to keep recreating the same table3; instead, if I add a record to either table, it should automatically appear in table3.
Any advice on how this is done, if it's possible, or where I should look?
Thanks
Table3 shouldn't be a table, it should be a view.
From the perspective of any given SELECT query and any consuming application looking at the data, a view can be treated like any other table. The fact that it's not a table is entirely transparent in those cases.
What a view does is compile and store a query which examines other tables, and presents the results of that query in a table structure. So any time you select from the view, you're dynamically selecting from the current state of the tables it examines.
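For example, a sketch in MySQL (the column names are made up; I'm assuming both tables have an id primary key, as described in the question):

-- table3 is now a stored query, not a copy of the data.
CREATE VIEW table3 AS
SELECT t1.id, t1.col_from_1, t2.col_from_2
FROM table1 t1
JOIN table2 t2 ON t2.id = t1.id;

From then on, any record added to table1 or table2 with a matching id automatically shows up the next time you SELECT from table3.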
Here's my problem...
I need to be able to check which items in a list of about 1,000 items (the needles) are in a fairly large table containing ~500,000 rows (the haystack).
My question is, what's the best/fastest/most efficient way to do this?
I know that I can create a SQL statement like this:
SELECT id FROM haystack WHERE id IN (ID1, ID2, ID3, ..., IDn)
(assuming ID1, ID2, ID3, ..., IDn are the needles.)
However, I'm not sure how performant or wise that is if the needles list contains 1,000+ items.
I also know that, if my needles list were in a table of its own, I could join that table to the haystack table. However, the needles list isn't already in a table.
So - I guess another possible option is to put those 1,000 items into a temporary table and then join that to the haystack table. If that's the best option - then what's the best way to quickly load 1,000 items into a temporary table? (E.g., 1,000 individual INSERT statements? All rows in a single INSERT statement? Is there a limit on how long an INSERT statement can be?)
A third possible option - write the needles list to a text file, then use LOAD DATA INFILE to load that into a (temporary) table, then join the temp table to the haystack table. But, wow... that seems like a lot of overhead.
Is there another, better option?
For what it's worth, the context of this is PHP, and I'm getting the needles list from a JSON web-service response, and using MySQLi for the database interaction.
According to this benchmark, it is faster in your case to use a temporary table and the JOIN method.
I am not sure, though, that this isn't a premature optimisation. You should perform your own benchmark and determine whether the added complexity deserves the effort. I would recommend going with the simple IN method and only starting to optimise when you detect a performance issue.
Just remember that according to the manual:
The number of values in the IN list is only limited by the max_allowed_packet value.
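You can inspect, and if necessary raise, that limit on your server. For example:

SHOW VARIABLES LIKE 'max_allowed_packet';
-- Raise it to 16 MB; requires the SUPER privilege, and only
-- connections opened after the change pick up the new value.
SET GLOBAL max_allowed_packet = 16 * 1024 * 1024;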
I think your query SELECT id FROM haystack WHERE id IN (ID1, ID2, ID3, ..., IDn) would be fine. I have a very similar use case where I have millions of "needles" and I pass them to the IN clause in blocks of 10,000 via PDO with no issues.
I would add that the column you are checking should be indexed. In my case it is the primary key of the table.
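If the column you filter on isn't already the primary key, adding an index is a one-liner. For example, using the haystack/id names from the question:

ALTER TABLE haystack ADD INDEX idx_haystack_id (id);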
If the needles are going to be used to query the haystack frequently, you absolutely want to create a new table. For this example, I'm going to assume that the needles are int values and will label them as id in the table needle.
First, you need to create the table
CREATE TABLE needle (
id INT(11) PRIMARY KEY
)
Next, you need to insert the values
INSERT INTO needle (id)
VALUES (ID1),
(ID2),
...,
(IDn)
Now, you can query haystack using a join.
SELECT h.id
FROM haystack h
JOIN needle n
ON h.id = n.id
If this is an infrequent query and the number of needles won't grow beyond the 1,000, using the IN clause won't hurt your performance greatly.
1) My first question: I have two large tables which cannot be altered because of their size.
I have to join them on a common field and then compare a field that holds the same data in both tables, but one field's data type is int and the other's is varchar.
I know this can be done easily, but when the tables have millions of records, comparing two different data types slows things down. How can I make it fast?
2) My similar second question: I have to join two tables on a field like id, which has a different data type in each table, int in one and char in the other. How can I join these two tables? I cannot wait for many days.
(One solution I have tried is to create a new table as a copy of the old one via OUTFILE/INFILE, changing the column's data type from char to int in the new table's definition before loading the data back in.)
If anybody has another solution, please share.
Make sure the conversion happens on the first table in the join, that way:
the conversion happens only once per row
indexes can be used to join with the second table
for example:
select *
from table1
join table2 on table2.intcol = cast(table1.varcharcol as signed)
This sample query will use an index on table2.intcol (if one exists) to join the two tables.
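You can confirm the index is actually used by prefixing the query with EXPLAIN. For example:

EXPLAIN
SELECT *
FROM table1
JOIN table2 ON table2.intcol = CAST(table1.varcharcol AS SIGNED);

In the output row for table2, the key column should show the index on intcol; if it stays NULL, the join is falling back to a full scan.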
Yes, CAST can be used for changing data types:
select *
from table1
join table2 on table2.intcol = cast(table1.col as signed)
I got a table with a normal setup of auto inc. ids. Some of the rows have been deleted so the ID list could look something like this:
(1, 2, 3, 5, 8, ...)
Then, from another source (Edit: Another source = NOT in a database) I have this array:
(1, 3, 4, 5, 7, 8)
I'm looking for a query I can run on the database to get the list of IDs from my array that are NOT in the table. That would be:
(4, 7)
Does such a query exist? My solution right now is either creating a temporary table so that a "WHERE table.id IS NULL" check works, or, probably worse, using the PHP function array_diff to see what's missing after having retrieved all the ids from the table.
Since the list of ids is closing in on millions of rows, I'm eager to find the best solution.
Thank you!
/Thomas
Edit 2:
My main application is a rather simple table populated with a lot of rows. The application is administrated through a browser, and I'm using PHP as the interpreter for the code.
Everything in this table is to be exported to another system (a 3rd-party product), and there's as yet no way of doing this besides manually using the import function in that program. It's also possible to insert new rows in the other system, although the agreed routine is to never do this.
The problem is that my system cannot be 100% sure that the user did everything correctly when he/she pressed the "export" key, or that no rows have ever been created directly in the other system.
From the other system I can export a CSV file containing all the rows that system has. So, by comparing the CSV file and my table, I can see:
* whether any rows that should have been imported are missing in the other system
* whether someone has created rows in the other system
The problem isn't solving it; it's finding the best solution, since there is so much data in the rows.
Thanks again!
/Thomas
You can use MySQL's NOT IN option.
SELECT id
FROM table_one
WHERE id NOT IN ( SELECT id FROM table_two )
Edited
If you are getting the source from a CSV file, then you can simply put those values in directly, like this:
I am assuming that the CSV is like 1,2,3,...,n
SELECT id
FROM table_one
WHERE id NOT IN ( 1,2,3,...,n );
EDIT 2
Or, if you want to do it the other way around, you can use mysqlimport to import the data into a temporary table in the MySQL database, retrieve the result, and then delete the table.
Like:
Create table
CREATE TABLE my_temp_table(
ids INT
);
Load the .csv file
LOAD DATA LOCAL INFILE 'yourIDs.csv' INTO TABLE my_temp_table
FIELDS TERMINATED BY ','
LINES TERMINATED BY '\n'
(ids);
Select the records
SELECT ids FROM my_temp_table
WHERE ids NOT IN ( SELECT id FROM table_one )
Drop the table
DROP TABLE IF EXISTS my_temp_table
What about using a LEFT JOIN; something like this:
select second_table.id
from second_table
left join first_table on first_table.id = second_table.id
where first_table.id is null
You could also go with a sub-query; depending on the situation, it might, or might not, be faster, though:
select second_table.id
from second_table
where second_table.id not in (
select first_table.id
from first_table
)
Or with a NOT EXISTS:
select second_table.id
from second_table
where not exists (
select 1
from first_table
where first_table.id = second_table.id
)
The construct you are looking for is NOT IN (an alias for <> ALL).
The MySQL documentation:
http://dev.mysql.com/doc/refman/5.0/en/all-subqueries.html
An Example of its use:
http://www.roseindia.net/sql/mysql-example/not-in.shtml
Enjoy!
The problem is that T1 could have a million rows or ten million rows, and that number could change, so you don't know how many rows your comparison table T2 (the one that has no gaps) should have for a WHERE NOT EXISTS or a LEFT JOIN test for NULL.
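That said, if you do want the gapless side generated on the fly, on MySQL 8.0+ a recursive CTE can size itself from MAX(id); a sketch, with T1 as the table under test:

WITH RECURSIVE seq (id) AS (
  SELECT 1
  UNION ALL
  SELECT id + 1 FROM seq WHERE id < (SELECT MAX(id) FROM T1)
)
SELECT seq.id
FROM seq
LEFT JOIN T1 ON T1.id = seq.id
WHERE T1.id IS NULL;

Note that for millions of rows you would need to raise cte_max_recursion_depth, which defaults to 1000.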
But the question is, why do you care if there are missing values? I submit that, when an application is properly architected, it should not matter if there are gaps in an autoincrementing key sequence. Even an application where gaps do matter, such as a check register, should not be using an autoincrementing primary key as a synonym for the check number.
Care to elaborate on your application requirement?
OK, I've read your edits/elaboration. Synchronizing two databases where the second is not supposed to insert any new rows, but might do so, sounds like a problem waiting to happen.
Neither approach suggested above (WHERE NOT EXISTS or LEFT JOIN) is air-tight and neither is a way to guarantee logical integrity between the two systems. They will not let you know which system created a row in situations where both tables contain a row with the same id. You're focusing on gaps now, but another problem is duplicate ids.
For example, if both tables have a row with id 13887, you cannot assume that database1 created the row. It could have been inserted into database2, and then database1 could insert a new row using that same id. You would have to compare all column values to ascertain whether the rows are the same or not.
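A sketch of that comparison, with hypothetical names (my_table for the first database's table, import_table for the rows loaded from the CSV):

SELECT m.id
FROM my_table m
JOIN import_table i ON i.id = m.id
WHERE NOT (m.col_a <=> i.col_a)   -- <=> is MySQL's NULL-safe equality
   OR NOT (m.col_b <=> i.col_b);  -- repeat for every column that matters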
I'd suggest therefore that you also explore GUID as a replacement for autoincrementing integers. You cannot prevent database2 from inserting rows, but at least with GUIDs you won't run into a problem where the second database has inserted a row and assigned it a primary key value that your first database might also use, resulting in two different rows with the same id. CreationDateTime and LastUpdateDateTime columns would also be useful.
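In MySQL that could look something like this (a sketch; UUID() generates the key at insert time, and the other columns are illustrative):

CREATE TABLE my_table (
  id CHAR(36) PRIMARY KEY,   -- GUID instead of an autoincrementing integer
  data VARCHAR(255),
  CreationDateTime DATETIME,
  LastUpdateDateTime DATETIME
);

INSERT INTO my_table (id, data, CreationDateTime, LastUpdateDateTime)
VALUES (UUID(), 'example row', NOW(), NOW());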
However, a proper solution, if it is available to you, is to maintain just one database and give users remote access to it, for example, via a web interface. That would eliminate the mess and complication of replication/synchronization issues.
If a remote-access web-interface is not feasible, perhaps you could make one of the databases read-only? Or does database2 have to make updates to the rows? Perhaps you could deny insert privilege? What database engine are you using?
I have the same problem: I have a list of values from the user, and I want to find the subset that does not exist in another table. I solved it in Oracle by building a pseudo-table in the SELECT statement. Here's the Oracle version; try it in MySQL without the "from dual":
-- find ids from user (1,2,3) that *don't* exist in my person table
-- build a pseudo table and join it with my person table
select pseudo.id from (
select '1' as id from dual
union select '2' as id from dual
union select '3' as id from dual
) pseudo
left join person
on person.person_id = pseudo.id
where person.person_id is null