I have to import loads of files into a database, the problem is, with time it got more columns.
The files are all insert-lines from SQLite, but i need them in MySQL, SQLIte doesn't provide column-names in their sql files, so the MySQL-script crashes when there are more or less columns as in the insert statement.
Is there a solution for this? Maybe over a join?
The new added columns are in the end, so the first are ALWAYS the same.
Is there any possibility to insert the sql-file in a temporary table, then make a join on an empty table (or 1 ghost record) to get the right amount of columns, and then do a insert on each line from that table to the table i want to have the data in?
Files looks like:
INSERT into theTable Values (1,1,Text,2913, txt,);
And if columns were added the file is like
INSERT into theTable Values (1,1,Text,2913, txt,added-Text);
Related
We are using MySQL Workbench 6.2 to migrate data. Our source table and destination table have different numbers of columns. Say, the source table has 16 and destination table has 18.
When we migrate, Workbench says
Error wrong number of columns in target DB.
Do the source and destination table columns numbers to be same? Or is there some way to tell Workbench default or derived values for the destination table columns?
Maybe you are doing a SELECT INTO query like this in your migrate script/step, that depends on the number of columns in the database rather than the actual column names.
SELECT *
INTO newtable [IN externaldb]
FROM table1;
Or
insert into items_ver
select * from items where item_id=2;
Somewhere along the line, your databases have changed in the number of rows available, so you cant do either of those
You can either specicy the columns like this:
insert into items_ver(column1, column2, column3)
select column1, column2, column3 from items where item_id=2;
Or add the 2 missing columns from the one database into the other, and ensure that they all match up.
In MySQL Workbench migration wizard (or schema transfer wizard) there is no way to do that. All columns in source and target databases must match.
I have a MySQL table with data in it, called current. I import new data into a table called temp. Both these tables have auto_increment ID columns.
The table structure is not known in advance for the data import (there are various file structures that I need to import), event though the structure of current and temp will be the same.
Because of the unknown column configuration of the import files (tables created on the fly for each different file configuration), I cannot select specific columns, hence I would have to select all columns, less the ID column from table temp and import the result into table current.
I need to import into temp first, as the files can be large, and I need to do processing on the data before saving into the database, so I do not want to do any operations on the current table before I have imported the separate file first.
The ID column from the temp table prevents the insert into the current table due to duplicate key.
So I need something like this:
INSERT INTO `current`
(SELECT **ALL COLUMNS EXCEPT ID** FROM `temp`)
Any ideas on how to write the section ALL COLUMNS EXCEPT ID? Is this even possible?
There's no * except foo. You'll have to list all of the columns, except the ones you don't want.
SELECT field1, field2, ..., fieldN ...
You could do it via dynamic scripting, e.g. query information_schema for the field names, build up the field list as a string, prepare that string as query, execute it, etc...
I am trying to insert a large dump of custom CMS's news section into WordPress. Unfortunalety, columns doesn't match. Some of them - yeah, sure - like title, date or content. But WordPress required a lot of columns which this dump doesn't have. Is there a way to either omit this count on insert or filling it with dummy (preferably blank) data? Search and replace (even with regular expressions) won't do here, since it is really huge file and simple 'find' takes a lot of time.
You stated that you were given a "table." If it included a schema, create the table and insert the data. Otherwise, create the table based on the data columns and insert your data. This is considered your staging table. You can now write a SELECT statement to select the data from your staging table that will be inserted into your destination table. You will finally prepend an INSERT statement to insert your selected data. It should look something like this:
INSERT INTO destinationTable (fruits, animals, numbers, plants)
SELECT fruits, animals, numbers, '' FROM stagingTable
If you did not have plants in your staging table, you would simply SELECT '' or SELECT NULL for that column. You can then simply drop your staging table.
Assuming the answer to my clarification is yes, you can insert multiple rows by delimiting them with a column.
INSERT INTO table (col1, col2)
VALUES (row1val1, row1val2),
(row2val1,row2val2)
I got a table with a normal setup of auto inc. ids. Some of the rows have been deleted so the ID list could look something like this:
(1, 2, 3, 5, 8, ...)
Then, from another source (Edit: Another source = NOT in a database) I have this array:
(1, 3, 4, 5, 7, 8)
I'm looking for a query I can use on the database to get the list of ID:s NOT in the table from the array I have. Which would be:
(4, 7)
Does such exist? My solution right now is either creating a temporary table so the command "WHERE table.id IS NULL" works, or probably worse, using the PHP function array_diff to see what's missing after having retrieved all the ids from table.
Since the list of ids are closing in on millions or rows I'm eager to find the best solution.
Thank you!
/Thomas
Edit 2:
My main application is a rather easy table which is populated by a lot of rows. This application is administrated using a browser and I'm using PHP as the intepreter for the code.
Everything in this table is to be exported to another system (which is 3rd party product) and there's yet no way of doing this besides manually using the import function in that program. There's also possible to insert new rows in the other system, although the agreed routing is to never ever do this.
The problem is then that my system cannot be 100 % sure that the user did everything correct from when he/she pressed the "export" key. Or, that no rows has ever been created in the other system.
From the other system I can get a CSV-file out where all the rows that system has. So, by comparing the CSV file and my table I can see if:
* There are any rows missing in the other system that should have been imported
* If someone has created rows in the other system
The problem isn't "solving it". It's making the best solution to is since there are so much data in the rows.
Thanks again!
/Thomas
We can use MYSQL not in option.
SELECT id
FROM table_one
WHERE id NOT IN ( SELECT id FROM table_two )
Edited
If you are getting the source from a csv file then you can simply have to put these values directly like:
I am assuming that the CSV are like 1,2,3,...,n
SELECT id
FROM table_one
WHERE id NOT IN ( 1,2,3,...,n );
EDIT 2
Or If you want to select the other way around then you can use mysqlimport to import data in temporary table in MySQL Database and retrieve the result and delete the table.
Like:
Create table
CREATE TABLE my_temp_table(
ids INT,
);
load .csv file
LOAD DATA LOCAL INFILE 'yourIDs.csv' INTO TABLE my_temp_table
FIELDS TERMINATED BY ','
LINES TERMINATED BY '\n'
(ids);
Selecting records
SELECT ids FROM my_temp_table
WHERE ids NOT IN ( SELECT id FROM table_one )
dropping table
DROP TABLE IF EXISTS my_temp_table
What about using a left join ; something like this :
select second_table.id
from second_table
left join first_table on first_table.id = second_table.id
where first_table.is is null
You could also go with a sub-query ; depending on the situation, it might, or might not, be faster, though :
select second_table.id
from second_table
where second_table.id not in (
select first_table.id
from first_table
)
Or with a not exists :
select second_table.id
from second_table
where not exists (
select 1
from first_table
where first_table.id = second_table.id
)
The function you are looking for is NOT IN (an alias for <> ALL)
The MYSQL documentation:
http://dev.mysql.com/doc/refman/5.0/en/all-subqueries.html
An Example of its use:
http://www.roseindia.net/sql/mysql-example/not-in.shtml
Enjoy!
The problem is that T1 could have a million rows or ten million rows, and that number could change, so you don't know how many rows your comparison table, T2, the one that has no gaps, should have, for doing a WHERE NOT EXISTS or a LEFT JOIN testing for NULL.
But the question is, why do you care if there are missing values? I submit that, when an application is properly architected, it should not matter if there are gaps in an autoincrementing key sequence. Even an application where gaps do matter, such as a check-register, should not be using an autoincrenting primary key as a synonym for the check number.
Care to elaborate on your application requirement?
OK, I've read your edits/elaboration. Syncrhonizing two databases where the second is not supposed to insert any new rows, but might do so, sounds like a problem waiting to happen.
Neither approach suggested above (WHERE NOT EXISTS or LEFT JOIN) is air-tight and neither is a way to guarantee logical integrity between the two systems. They will not let you know which system created a row in situations where both tables contain a row with the same id. You're focusing on gaps now, but another problem is duplicate ids.
For example, if both tables have a row with id 13887, you cannot assume that database1 created the row. It could have been inserted into database2, and then database1 could insert a new row using that same id. You would have to compare all column values to ascertain that the rows are the same or not.
I'd suggest therefore that you also explore GUID as a replacement for autoincrementing integers. You cannot prevent database2 from inserting rows, but at least with GUIDs you won't run into a problem where the second database has inserted a row and assigned it a primary key value that your first database might also use, resulting in two different rows with the same id. CreationDateTime and LastUpdateDateTime columns would also be useful.
However, a proper solution, if it is available to you, is to maintain just one database and give users remote access to it, for example, via a web interface. That would eliminate the mess and complication of replication/synchronization issues.
If a remote-access web-interface is not feasible, perhaps you could make one of the databases read-only? Or does database2 have to make updates to the rows? Perhaps you could deny insert privilege? What database engine are you using?
I have the same problem: I have a list of values from the user, and I want to find the subset that does not exist in anther table. I did it in oracle by building a pseudo-table in the select statement Here's a way to do it in Oracle. Try it in MySQL without the "from dual":
-- find ids from user (1,2,3) that *don't* exist in my person table
-- build a pseudo table and join it with my person table
select pseudo.id from (
select '1' as id from dual
union select '2' as id from dual
union select '3' as id from dual
) pseudo
left join person
on person.person_id = pseudo.id
where person.person_id is null
I have a table containing about 500 000 rows. Once a day, I will try to synchronize this table with an external API. Most of the times, there are few- or no changes made since last update. My question is basically how should I construct my MySQL query for best performance? I have thought about using insert ignore, but it doesn't feel like the best way to go since only a few rows will be inserted and MySQL must loop through all rows in the table. I have also thought about using LOAD_DATA_INFILE to insert all rows in a temporary table and then select the rows not already in my original table, and then remove the temporary table. Maybe someone else has a better suggestion?
Thank you in advance!
I usually use a temporary table and the LOAD DATA INFILE bulk loader. The bulk loader is much more efficient that trying to insert records using a dynamically created query.
If you index your permanent tables with appropriate unique keys that relate to the keys in the API then you should find the the INSERT and UPDATE statements work pretty fast. An example of the type of INSERT query I use is as follows:
INSERT INTO keywords(api_adgroup_id, api_keyword_id, keyword_text, match_type, status)
SELECT a.api_id, a.keyword_text, a.match_type, a.status
FROM tmp_keywords a LEFT JOIN keywords b ON a.api_adgroup_id = b.api_adgroup_id AND a.api_keyword_id = b.api_keyword_id
WHERE b.api_keyword_id IS NULL
In this example, I perform an OUTER JOIN on the keywords table to check if it already exists. Only new rows in the temporary table where there isn't a match in the main table (the api_keyword_id in the keywords table is NULL) are inserted.
Also note that in this example I need to use both the ad group id AND the keyword id to uniquely identify the keyword because the AdWords API gives the same keyword/match type combination the same id when it exists in more than one ad group.