I want to merge data from CSV into an already existing table that has data:
The table I already have has 4 columns, 2 of which are NULL. My CSV file has 3 columns: one is the Primary Key to be used to match the table and the other 2 have information to fill my already-existing table.
I want to be able to match the CSV's Primary Key to my table's Primary Key, and insert the information from the CSV into the matched row.
My table goes something like:
+---------+-------+--------+------+
| CardID  | Topic | Number | Date |
+---------+-------+--------+------+
| pk00001 | name  | NULL   | NULL |
+---------+-------+--------+------+
Whereas the CSV goes like:
"ID","Number","Date"
"pk00001","100001","1999/01/01"
The table doesn't have any constraints: it's all VARCHAR(256)
However, my table has more rows than the CSV, and the operation shouldn't touch those rows.
Is it possible to create such a query in MySQL?
MySQL doesn't have a function to do exactly what you're asking. There may be database clients that can do this for you, but your best bet would be to do it the old-fashioned way.
First, load your CSV file into a new table in your database. Most database clients can do this for you, but if you want to do it from your shell, look at mysqlimport or LOAD DATA INFILE.
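For example, a minimal sketch using LOAD DATA INFILE; the staging table name imported_table and the file path are placeholders, and the column layout assumes the CSV shown above:
CREATE TABLE imported_table (
    ID     VARCHAR(256),
    Number VARCHAR(256),
    Date   VARCHAR(256),
    INDEX (ID)                          -- index the key to speed up the join
);

LOAD DATA INFILE '/path/to/file.csv'    -- add LOCAL after DATA if the file lives on the client machine
INTO TABLE imported_table
FIELDS TERMINATED BY ',' ENCLOSED BY '"'
LINES TERMINATED BY '\n'
IGNORE 1 LINES;                         -- skip the "ID","Number","Date" header row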
Once your data is in your table (you may want to index the key as well, depending on the size), you can do a simple update with a join to your new table.
UPDATE
    original_table orig
    JOIN imported_table imp
        ON (orig.CardID = imp.ID)
SET
    orig.Number = imp.Number,
    orig.Date = imp.Date
    -- ...and any other columns you need to update
;
Then drop the imported CSV table, and you're good to go.
A simple (but far more dangerous and less performant) alternative would be to open your CSV file in Excel and use a formula to generate the UPDATE statements for you. For example:
="UPDATE original_table SET Number='"&B1&"', Date='"&C1&"' WHERE CardID='"&A1&"'"
Then just fill each row with this formula, and you'll have an UPDATE statement for every row you want to modify.
Please be aware that this Excel method opens you up to SQL injection: both accidental (for example, if a text value contains an apostrophe, like "John O'Brien") and potentially malicious, if any of the CSV content comes from an external source.
The following is an example to better explain my scenario. My database table has the following columns:
Column 1: Operating_ID (which is the primary key)
Column 2: Name
Column 3: Phone
Column 4: Address
Column 5: Start Date
Column 6: End Date
The values for columns 1, 2, 3, and 4 come from an extract, and this extract is pushed to the database daily using an SSIS data flow task.
The values for columns 5 and 6 are entered by users in a web application and saved to the database.
Now, in the SSIS process, instead of throwing a primary key violation error, I need to update columns 2, 3, and 4 if the primary key, i.e. column 1, already exists.
First I considered a replace, but that deletes the user-entered data in columns 5 and 6.
I would like to keep the data in columns 5 and 6 and update columns 2, 3, and 4 when column 1 already exists.
Do a Lookup on Operating_ID. Change the lookup's no-match behavior from "Fail component" to "Redirect rows to no match output".
If no match is found, go to an INSERT.
If a match is found, go to an UPDATE. You can use an OLE DB Command to update row by row, but if it is a large set of data, you are better off putting the rows into a staging table and doing an UPDATE with a JOIN.
This is what I would do: I would put all the data in a staging table. Then I would use a data flow to insert the new records; the source of that data flow would be the staging table with a NOT EXISTS clause referencing the prod table.
Then I would use an Execute SQL task in the control flow to update the data for the existing rows.
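A sketch of the two statements involved, assuming the prod table is named operating and the staging table staging_operating (both names are placeholders for your actual schema); the first query would be the data flow source, the second goes in the Execute SQL task:
-- Data flow source: only rows whose key is not yet in the prod table
SELECT s.Operating_ID, s.Name, s.Phone, s.Address
FROM staging_operating AS s
WHERE NOT EXISTS (
    SELECT 1 FROM operating AS o WHERE o.Operating_ID = s.Operating_ID
);

-- Execute SQL task: refresh columns 2-4 for keys that already exist,
-- leaving the user-entered Start Date / End Date columns untouched
UPDATE o
SET o.Name    = s.Name,
    o.Phone   = s.Phone,
    o.Address = s.Address
FROM operating AS o
JOIN staging_operating AS s
    ON o.Operating_ID = s.Operating_ID;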
Hi everyone, my question is this: I have a file of roughly 3000 rows of data that is read in with LOAD DATA LOCAL INFILE. After that, a trigger on the target table copies three columns from the inserted data and two columns from a table that already exists in the database (if this is unclear, the structures are below). From there, only combinations with unique glNumbers are entered into the processed table. This normally takes over a minute and a half, which I find pretty long. Is this normal for what I'm doing (I can't believe it is), or is there a way to optimize the queries so it runs faster?
Tables that are inserted into are named after the first three letters of each month. Here is the default structure.
RawData Structure
| idjan | glNumber | journel | invoiceNumber | date | JT | debit | credit | descriptionDetail | totalDebit | totalCredit |
(Sorry for the poor formatting; there doesn't seem to be a really good way to do this.)
After Insert Trigger Query
delete from processedjan;
insert into processedjan(glNumber,debit,credit,bucket1,bucket2)
select a.glNumber, a.totalDebit, a.totalCredit, b.bucket1, b.bucket2
from jan a inner join bucketinformation b on a.glNumber = b.glNumber
group by a.glNumber;
Processed Datatable Structure
| glNumber | bucket1 | bucket2 | credit | debit |
Also, I guess it helps to know that bucket1 and bucket2 come from another table where they are matched against the glNumber. That table is roughly 800 rows, with three columns for the glNumber and the two buckets.
While PostgreSQL has statement-level triggers, MySQL only has row-level triggers. From the MySQL reference manual:
A trigger is defined to activate when a statement inserts, updates, or
deletes rows in the associated table. These row operations are trigger
events. For example, rows can be inserted by INSERT or LOAD DATA
statements, and an insert trigger activates for each inserted row.
So while you are managing to load 3000 rows in one operation, unfortunately 3000 more trigger executions happen behind it. And given the complex nature of your trigger body, you might actually be performing 2-3 queries per row. That's the real reason for the slowdown.
You can speed things up by removing the trigger (MySQL has no way to temporarily disable a trigger, so drop it and recreate it later if needed) and carrying out the INSERT ... SELECT once, after the LOAD DATA INFILE. You can automate this with a small script.
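A minimal sketch of that approach; the trigger name jan_after_insert and the file path are placeholders, and the INSERT ... SELECT is just the trigger body run once:
DROP TRIGGER IF EXISTS jan_after_insert;    -- hypothetical trigger name

LOAD DATA LOCAL INFILE '/path/to/jan.csv'
INTO TABLE jan
FIELDS TERMINATED BY ',' ENCLOSED BY '"'
LINES TERMINATED BY '\n';

-- Rebuild the processed table once, instead of once per inserted row
DELETE FROM processedjan;
INSERT INTO processedjan (glNumber, debit, credit, bucket1, bucket2)
SELECT a.glNumber, a.totalDebit, a.totalCredit, b.bucket1, b.bucket2
FROM jan a
INNER JOIN bucketinformation b ON a.glNumber = b.glNumber
GROUP BY a.glNumber;    -- as in the original trigger; with ONLY_FULL_GROUP_BY enabled,
                        -- wrap the other selected columns in ANY_VALUE() or list them here

-- Recreate the trigger afterwards if you still need it for other inserts.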
I'm in the process of migrating a Ruby on Rails application from MySQL to Postgres. Is there a recommended way to keep the deleted data, like all the deleted records (their IDs at least) from MySQL?
In testing a dump-and-restore didn't seem to keep deleted records.
Also, in the event that I manage to keep the records where they are, what will happen with the gaps (the deleted IDs) in Postgres? Will they be skipped over or reused?
Example
Say I have a user with an ID of 101 and I've deleted users up to 100. I need 101 to stay at 101.
So you don't want the IDs that were already generated for existing records to be reassigned during the migration.
That should be the default in any sane migration. When you copy the data rows over (say, exporting from MySQL with SELECT ... INTO OUTFILE and importing into PostgreSQL with COPY tablename FROM 'filename.csv' WITH (FORMAT CSV)), the IDs won't change.
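For example, a minimal sketch of that round trip using the users table from the example below; the file path is a placeholder, and a server-side COPY needs appropriate file permissions:
-- MySQL side: export the rows, keeping the generated id values
SELECT id, name
INTO OUTFILE '/tmp/users.csv'
FIELDS TERMINATED BY ',' OPTIONALLY ENCLOSED BY '"'
LINES TERMINATED BY '\n'
FROM users;

-- PostgreSQL side: load the same columns, in the same order
COPY users (id, name) FROM '/tmp/users.csv' WITH (FORMAT CSV);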
All you'll need to do is to set the next ID to be generated in the sequence on the PostgreSQL table afterwards. So, say you have the table:
CREATE TABLE users
(
id serial primary key,
name text not null,
...
);
and you've just copied a user with id = 101 into it.
You'll now just assign a new value to the key generation sequence for the table, e.g.:
SELECT setval('users_id_seq', (SELECT max(id) FROM users));
To learn more about sequences and key generation in PostgreSQL, see SERIAL in the numeric types documentation, the documentation for CREATE SEQUENCE, the docs for setval, etc. The default name for a key generation sequence is tablename_columnname_seq.
My need is very simple. How do I add a serial number or id (probably an auto_increment value, I guess) to each row being inserted into a MySQL table?
To be more specific, I have a CSV file whose field values I store in the database using LOAD DATA. All I need is this: if there are, say, 2000 rows being loaded from the CSV file, each row has to be automatically assigned a unique serial number like 1, 2, 3, etc.
Please help me with the exact query that I'll need rather than just the syntax.
Thanks in anticipation.
Add AUTO_INCREMENT to your id column (create it if you need to). It will then do this automatically for you:
ALTER TABLE tablename MODIFY id INTEGER NOT NULL AUTO_INCREMENT;
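If the id column doesn't exist yet, a minimal sketch; tablename, the file path, and the column names col1 ... col3 are placeholders for your actual table and CSV:
-- Add an auto-incrementing primary key to an existing table
ALTER TABLE tablename
    ADD COLUMN id INT NOT NULL AUTO_INCREMENT PRIMARY KEY FIRST;

-- When loading the CSV, name only the CSV columns; id is filled in automatically
LOAD DATA INFILE '/path/to/file.csv'
INTO TABLE tablename
FIELDS TERMINATED BY ',' ENCLOSED BY '"'
LINES TERMINATED BY '\n'
(col1, col2, col3);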
I have a table mileage_registrants which keeps users' registration data. It has a field department which is currently NULL for all users.
The difficult part is that I need to automatically match records by user_id and insert each user's department info from a .csv file into the department field of the table. There are about a thousand records, so it would be horrible to insert them by hand.
Is there any way for me to get this done quickly?
The easiest way is to use phpMyAdmin to import the CSV file; that will create a dummy table named something like TABLE 45. From there you add an index on the key column, and then write an UPDATE query that joins that table to the real table and updates the relevant column.
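A sketch of that update, assuming phpMyAdmin created the import as TABLE 45 with columns user_id and department; adjust the names to whatever the import actually produced:
ALTER TABLE `TABLE 45` ADD INDEX (user_id);    -- index the join key first

UPDATE mileage_registrants m
JOIN `TABLE 45` t ON m.user_id = t.user_id
SET m.department = t.department;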