Pentaho Kettle - How to produce update query based on result set? - mysql

I came across the INSERT statement generator in Pentaho Spoon, which writes input data to a text file in the form of a set of SQL statements.
I wonder if there is a similar method that generates UPDATE statements based on the input.

Well, if you need to update a table based on some key columns compared to your stream, you may use the Insert/Update step.
The downside is that it won't generate the statements in a file, it will execute the updates or inserts based on that comparison and that's all.
Can you give more details about your scenario? We may work things out together.
Why do you need a file with UPDATE statements?
Can't we connect to the database and run the updates right away?

Sure, use the "Dynamic SQL Row" step.
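If the goal really is a .sql file full of UPDATE statements rather than executing them, one hand-rolled workaround (not a Kettle step, just a sketch; the customers table and its columns are invented for illustration) is to let MySQL itself build the statement text with CONCAT/QUOTE:

    -- Generate UPDATE statements as text; QUOTE() escapes string values
    -- (and emits the word NULL unquoted for NULL values).
    SELECT CONCAT(
             'UPDATE customers SET email = ', QUOTE(email),
             ', status = ', QUOTE(status),
             ' WHERE id = ', id, ';'
           ) AS update_stmt
    FROM customers
    WHERE status = 'pending';

Running that query through the mysql command-line client with batch output, or through a Text file output step in Spoon, gives you a script of UPDATE statements you can review before applying.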

Related

using INFORMATION_SCHEMA.COLUMNS in a TRIGGER to log changes to _all_ columns?

TRIGGERs can be used to log changes to individual DB columns as described at https://stackoverflow.com/a/779250/569976, but that technique requires you to have an IF statement for each column. It's not a huge issue if you're just interested in changes to one column, BUT if you're interested in changes to all columns it becomes a bit unwieldy.
I can get all the column names of a table, dynamically, by querying the INFORMATION_SCHEMA.COLUMNS table. My question is... can I use that to dynamically reference the column names? Like in the TRIGGER you'd do OLD.columnName <> NEW.columnName but I don't think you can really make a column name dynamic like that.
In PHP you could use variable variables. eg. $obj->$var. But if MySQL has anything remotely similar that'd be news to me.
Any ideas? Or am I just going to go with the old-fashioned approach of writing an IF statement for each of the hundreds of columns this table has?
The trigger can only reference identifiers directly. You can't use a variable or an expression to name an identifier.
That would require dynamic SQL with PREPARE and EXECUTE so you could have the statement parsed at runtime from a string, but you can't PREPARE a new statement inside a trigger, because the trigger is already executing in the context of the currently executing statement.
The simplest solution is to write a trigger that references each column directly, with as many IF statements as there are columns in the table (I wonder why you have hundreds of columns in your table; that sounds like a different problem of bad design).
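For reference, the per-column trigger looks something like the sketch below (my_table, its columns, and the change_log audit table are all hypothetical). The null-safe <=> comparison also catches NULL-to-value changes that a plain <> would miss, and the repetitive IF blocks can themselves be generated with a query against INFORMATION_SCHEMA.COLUMNS, even though the trigger body you finally install is still static:

    DELIMITER //
    CREATE TRIGGER my_table_audit
    AFTER UPDATE ON my_table
    FOR EACH ROW
    BEGIN
      IF NOT (OLD.col_a <=> NEW.col_a) THEN
        INSERT INTO change_log (table_name, column_name, old_value, new_value, changed_at)
        VALUES ('my_table', 'col_a', OLD.col_a, NEW.col_a, NOW());
      END IF;
      IF NOT (OLD.col_b <=> NEW.col_b) THEN
        INSERT INTO change_log (table_name, column_name, old_value, new_value, changed_at)
        VALUES ('my_table', 'col_b', OLD.col_b, NEW.col_b, NOW());
      END IF;
      -- ...one IF block per column, repeated for every column of the table.
    END//
    DELIMITER ;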
The comments above mention a binary log parser. Debezium is an example of an open-source binlog parser.
MySQL also supports an audit plugin architecture, but frankly the existing implementations of audit plugins are pretty clumsy.
https://www.mysql.com/products/enterprise/audit.html
https://mariadb.com/resources/blog/introducing-the-mariadb-audit-plugin/
https://github.com/mcafee/mysql-audit

MySql filtering data from insert query

I'll start by saying I'm new to MySQL, at least at the level of my question. :)
I have a data logger with a high data output and I'm interested in saving the data to a database.
I've been wondering if it's possible to filter the INSERT query in the database itself, so it will only save the data if certain values appear in the query.
As @Akina mentioned, you can use a CHECK constraint and INSERT IGNORE. However, it is better not to try to insert problematic data in the first place, since it will slow down the insert operation.
You need to filter the data before the insert operation. You may want to consider writing a custom log shipper, or, if you have the option, you can use Logstash.
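A minimal sketch of the in-database filtering mentioned above, assuming MySQL 8.0.16+ (where CHECK constraints are actually enforced); the sensor_log table and its threshold are made up:

    CREATE TABLE sensor_log (
      id        BIGINT AUTO_INCREMENT PRIMARY KEY,
      reading   DECIMAL(10,2) NOT NULL,
      logged_at DATETIME NOT NULL,
      CONSTRAINT chk_reading CHECK (reading BETWEEN 0 AND 500)  -- reject out-of-range readings
    );

    -- With IGNORE, a row violating the CHECK constraint should be skipped with a
    -- warning instead of aborting the whole multi-row insert:
    INSERT IGNORE INTO sensor_log (reading, logged_at)
    VALUES (42.50, NOW()), (9999.00, NOW());  -- second row gets filtered out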

How to create a SQL INSERT query from a PHP SELECT query

My problem:
I am trying to delete some important rows from multiple tables (around 20 tables). I am afraid that deleting the rows might cause some problems (I am not the creator of this website), so before deleting the rows I am selecting them and writing them to a file. But I write them as an array.
Is there a way to write them as SQL INSERT statements to a file, so that it would be easy for me to restore the database if there is some problem?
For me it would be easier to store the information in a way that would allow me to understand the data. Then IF I need it, I could mutate the data into an INSERT statement.
I strongly encourage you, as a professional software engineer, not to try to solve a problem that you might encounter until you DO encounter it.
If you use phpMyAdmin you can run a query that selects those rows, then click the Export link under Query results operations.
In the next page, select Custom - display all possible options and SQL Format.
Then, further down the page, select data under Format specific options.
And then press Go. You will be prompted to Save or Open a file, which will include the appropriate INSERT statements to recreate the data from those rows.
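The exported file is just plain INSERT statements, schematically like the snippet below (table and values are invented), so re-running it against the database puts the deleted rows back:

    INSERT INTO `orders` (`id`, `customer_id`, `status`) VALUES
    (101, 7, 'shipped'),
    (102, 9, 'pending');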

Sybase to MySQL automatic exportation

I have two databases: Sybase and MySQL. I need to export records to MySQL when they are inserted in Sybase, or export them in some scheduled event.
I've tried the output statement, but it cannot be used in triggers or procedures.
Any suggestion to solve this problem?
(Disclaimer: I've done similar things previously, but by no means would I consider the answer below the state of the art, just one possible approach. Google around for something like 'cross-database replication' or 'cross rdbms replication' to see who's done this before.)
I would first of all see if you can't get an ETL tool to do the job without too much work. There are free open source ones, and even things like Microsoft SSIS might work on non-MS databases.
If not, I would split this into different steps.
1. Find an appropriate Sybase output command that exports a subset of rows from one or more tables. By subset I mean you need to be able to add a WHERE clause, not just do a full table dump.
2. Use an appropriate MySQL import script/command to load the data produced by step #1. You may need to cycle back and forth between the two until you have something that works manually.
3. Write a Sybase trigger to insert lookup keys into a to-export table (see the sketch at the end of this answer). You want to store at least the table name and the source Sybase table's keys for each inserted row. Use column names like key1_char, key2_char, not the actual column names; that makes it easier to extend to other source tables as needed. Keep trigger processing as light as possible. (What about updates, by the way?)
4. Write a scheduled batch on the Sybase side to run step #1 for the rows flagged in #3.
5. Write a scheduled batch on the MySQL side to import, via #2, the results of #4. Or kick it off from #4.
Another approach is to do the #3 flagging bit as needed, but use it to drive one scheduled batch that SELECTs data from Sybase and INSERTs it into MySQL directly.
You'll have to pick up the data from Sybase's SELECT and bind it manually to MySQL's INSERT, but you probably get finer control over what's going on and you don't have to juggle two batches. That's what I think a clever ETL tool would already be doing on your behalf. Any half-decent scripting language like PHP, Python, or Ruby ought to handle it easily. This is especially important if you have things like surrogate/auto-generated keys.
Keep in mind that in both cases you'll have to either delete the to-export rows that you've successfully inserted or flag them as done.
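As a rough illustration of step #3, here is what the to-export table and trigger might look like in Sybase ASE's Transact-SQL (the orders table, the order_id key, and all names are made up; adjust to your schema):

    -- Queue table holding only lookup keys, stringified into generic key columns.
    CREATE TABLE to_export (
        source_table varchar(64) NOT NULL,
        key1_char    varchar(64) NOT NULL,
        key2_char    varchar(64) NULL,
        queued_at    datetime    DEFAULT getdate(),
        exported     char(1)     DEFAULT 'N'
    )
    go

    -- Keep the trigger light: just record which rows to export, nothing else.
    CREATE TRIGGER trg_orders_export ON orders
    FOR INSERT
    AS
        INSERT INTO to_export (source_table, key1_char)
        SELECT 'orders', convert(varchar(64), i.order_id)
        FROM inserted i
    go

The scheduled batch then exports the rows whose keys are queued here and either flips exported to 'Y' or deletes the queue rows, as noted above.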

update the destination table

I want to pull data from a source into a destination table. How can I insert rows that are not already in the table and update rows that already exist?
We could use a Lookup on the target for the existing records: on a match, update; otherwise, insert into the target.
Another approach is to use the MERGE statement (sketched below).
thanks
prav
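A hedged sketch of that MERGE approach (SQL Server 2008+); dbo.DimCustomer as the target and staging.Customer as the source are invented names:

    MERGE dbo.DimCustomer AS tgt
    USING staging.Customer AS src
        ON tgt.CustomerKey = src.CustomerKey
    WHEN MATCHED THEN
        UPDATE SET tgt.Name  = src.Name,
                   tgt.Email = src.Email
    WHEN NOT MATCHED BY TARGET THEN
        INSERT (CustomerKey, Name, Email)
        VALUES (src.CustomerKey, src.Name, src.Email);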
Use a Slowly Changing Dimension transform; see http://msdn.microsoft.com/en-us/library/ms141715.aspx
I would recommend CozyRoc's TableDifference component. I have used the predecessor from SQLBI.EU and it's very good.
I also recommend that, instead of using a Command component to run individual updates for each row where an update is detected, you stream the updates to a table and then use a single UPDATE statement in a SQL task to perform the update.
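In other words, the data flow writes the detected changes to a staging table and an Execute SQL task then does one set-based update, something like this sketch (staging.CustomerUpdates and the column names are made up):

    -- One set-based update joining the staging table to the target,
    -- instead of firing a command per row in the data flow.
    UPDATE tgt
    SET    tgt.Name  = upd.Name,
           tgt.Email = upd.Email
    FROM   dbo.DimCustomer AS tgt
    JOIN   staging.CustomerUpdates AS upd
           ON upd.CustomerKey = tgt.CustomerKey;

    TRUNCATE TABLE staging.CustomerUpdates;  -- clear the staging table for the next run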
I found this webcast very helpful in learning some different methods of doing "upserts" with SSIS. You can download the samples referenced in the webcast and see working examples of exactly what you need. MSDN Architecture Webcast: Using SQL Server 2005 Integration Services to Populate a Kimball Method Data Warehouse (Level 200)