I'd like to send several queries to the Google Cloud SQL database from Google Apps Script in one go. For example:
insert into table_name (field_name) values ("prout");
select last_insert_id()
but for some reason I can't get it to work. Is the API limited to one query at a time? That is a pain, because sending a query takes time; it would be a lot more efficient to be able to send several things at once.
The reason you may not be able to send multiple queries at the same time is that each statement has a different kind of return value. For example, an "insert" statement gives you an integer indicating how many rows were affected (hopefully 1!), while a "select" statement returns a set of objects.
You can do batch commands using addBatch, but they must all be of the same type of query (for example, a lot of "insert"s).
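As a side note, not part of the answer above: if the only goal is to cut round trips for many inserts, MySQL also accepts a multi-row VALUES list, so a single statement can insert several rows at once (placeholder values shown):

insert into table_name (field_name) values ("a"), ("b"), ("c");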
Related
I am developing a high-load web application and trying to reduce the number of SQL queries. Quite often I need to update one row and get the results. I think it would be nice to be able to run a query and at the same time receive the values of the updated fields, without making two calls to the MySQL server. For example, I execute the following query:
update table set val=val+1 where id=1;
and function returns:
array("val"=>10)
Sure, I understand that I can write my own function which first runs the update, then a select, and returns the result. But the problem is that in that case the MySQL server has to seek the data and update it during the first query, and then the second query has to seek the same data again in order to return it. What I am after is a way in which MySQL seeks the data, updates it, and returns the updated data in one pass.
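For illustration only, since it is not part of the original thread: MySQL's documented LAST_INSERT_ID(expr) idiom comes close to this for a single-row update. The UPDATE stores the new value in the connection's LAST_INSERT_ID, so the follow-up SELECT returns it without touching the table again:

update table set val = last_insert_id(val + 1) where id = 1;
select last_insert_id();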
I have an SQL function, let's say theFunction(item_id). It takes an item id and computes one value as its return. I read one table from the DB, and for each row I am supposed to compute a new value to append, using this function with the item_id particular to that row. Which design block would do this for me with the following SQL (if it's not wrong)?
select theFunction(item_id);
I assume that the block gives me item_id of each row as a variable.
You can use another table input step, and have it accept fields from previous steps and execute for every row (both config options are at the bottom of the step's window).
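The query in that second table input step would then look something like the line below, assuming the step is configured to insert data from the previous step so that the ? placeholder is filled with each incoming row's item_id (the column alias is just illustrative):

select theFunction(?) as computed_value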
Beware that this is a rather slow implementation. Each query is executed separately and as such each row requires a round trip to the database.
Alternatively, you can use the Row SQL Script. I believe it allows you to pass all SQL statements in a single trip to the database.
An SQL function is probably much more efficient to run in the database, for all rows at once, instead of making a separate call into the database from PDI for each row to execute the function. So if performance is at all a relevant concern, I'd suggest a whole different strategy:
Write your rows to a table in the database. End your transformation here.
On the job level, first execute your transformation from above, then execute the function in an "Execute SQL script..." component, giving it an SQL command somewhat like "UPDATE my_temp_table set target_col = theFunction(item_id)".
Continue your job with the remaining steps in a new transformation, starting from that table as input.
This of course presupposes that you don't have too many other threads going on, but if your transformation is simple and linear (or at least if it can be made single-linear at this particular step), it may be possible to split it up into two parts, before and after this SQL call.
I have no idea whether this can be done or not, but basically, I have the following data flow:
Extracts the data from an XML file (works fine)
Simply splits the records based on an enclosed condition (works fine)
Had to add a derived column object due to some character set issues (might be better methods, but it works)
Now "Step 4" is where I'm running into a scenario where I'd only like to insert the values that have a corresponding match in my database, for instance, the XML has about 6000 records, and from those, I have maybe 10 of them that I need to match back against and insert them instead of inserting all 6000 of them and doing the compare after the fact (which I could also do, but was hoping there'd be another method). I was thinking that I might be able to perform a sql insert command within the OLE DB DESTINATION object where the ID value in the file matches, but that's what I'm not 100% clear on or if it's even possible for that matter. Should I simply go the temp table route and scrub the data after the fact, or can I do this directly in the destination piece? Any suggestions would be greatly appreciated.
EDIT
Thanks to the last comment from billinkc, I managed to get a bit closer: I can identify the matches and use that result set, but somehow it seems to be running the data flow twice, which is strange. I took the Lookup object out to see whether it was causing it, and it seems that it was. Any reason why it would run this entire flow twice with the addition of the Lookup? I should have a total of 8 matches, which I confirmed with the data viewer output, but then it seems to run a second time for the same file.
Is there a reason you can't use a Lookup transformation to find existing records? Configure it so that it routes non-matching records to the no-match output, and then only connect the "match found" connector to the "Navigator Staging Manager Funds".
I believe that answers what you've asked, but I wonder if you're expressing the right desire. My assumption is that the lookup would go against the existing destination, so the lookup returns the id 10 for a row. All of the out-of-the-box destinations in SSIS only perform inserts, so the row that found a match would now get doubled. Since you are looking for existing rows, that usually implies you want to perform an update to an existing row. If that's the case, there is a transformation specifically designed for it, the OLE DB Command; it is the component that allows for updates. That component has a performance problem: it issues a single update statement per row flowing through it. For 10 rows, I think it'd be fine.

Otherwise, the pattern you'd use is to write all the new rows (inserts) into your destination table, write all of your changed rows (updates) into a second staging-type table, and then, after the data flow is complete, use an Execute SQL Task to perform a set-based update statement, as sketched below.
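To make the two options concrete (the table and column names here are made up, not from the question): the OLE DB Command issues a parameterized statement once per row, while the staging pattern ends with a single set-based statement in an Execute SQL Task, for example:

-- per-row update issued by the OLE DB Command
update dbo.DestinationTable set SomeColumn = ? where Id = ?;

-- set-based update run once by the Execute SQL Task after the data flow completes
update d
set d.SomeColumn = s.SomeColumn
from dbo.DestinationTable as d
join dbo.StagingUpdates as s on s.Id = d.Id;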
There are third party options that handle combined upserts. I know Pragmatic Works has an option and there are probably others on the tasks and components site.
So I'm building a C program that connects to a MySQL database. Everything worked perfectly. Then, to save on the number of queries, I decided I would like to execute 10 statements at a time, so I set the CLIENT_MULTI_STATEMENTS flag on the connection and separated my statements with semicolons.
When I execute the first batch of 10 statements, it succeeds and mysql_real_query() returns a 0.
When I try the second batch though, it returns a "1" and doesn't work. Nowhere can I find what this "1" error code means, so I was hoping someone may have run into this problem before.
Please note that these are all UPDATE statements, and so I have no need for result sets, it's just several straight-up calls to mysql_real_query().
It's not clear from the documentation whether the errors this function can cause are returned or not, but it should be possible to obtain the actual error using mysql_error().
My guess is that you still have to loop through the result sets whether you're interested in them or not.
Are these prepared statements? If that's the case then you can't use CLIENT_MULTI_STATEMENTS.
Also, note (from http://dev.mysql.com/doc/refman/5.5/en/c-api-multiple-queries.html) that:
After handling the result from the first statement, it is necessary to check whether more results exist and process them in turn if so. To support multiple-result processing, the C API includes the mysql_more_results() and mysql_next_result() functions. These functions are used at the end of a loop that iterates as long as more results are available. Failure to process the result this way may result in a dropped connection to the server.
You have to walk over all the results, regardless of whether or not you care about the values.
So here is my situation: I have a vendor-supplied DB we cannot modify, and a custom DB that imports data from the vendor app and acts on it. Once records are imported from the vendor app, they cannot appear on the list of records to be imported. Also, we only want to display the 250 most recent records that have not been imported.
What I originally started with was selecting the list of ids that had already been imported from the custom DB, and then querying the vendor DB using that list of ids in a .Where(x => !idList.Contains(x.Id)) clause on the remote query.
This worked up until we passed 2100 records imported into the custom DB, as 2100 is the limit on the number of parameters that can be passed to SQL Server. After finding out this was the actual problem, and not the 'invalid buffer'/'severe error' that ADO.NET reported, my solution was to exclude the first 2000 ids in the remote query and then exclude the remaining records in the local query.
Having to pull back a large number of irrelevant records just to exclude them, so I can get the correct 250 records, seems very inelegant. Is there a better way to do this, short of a cross-database stored procedure?
Thanks in advance.
This might not be the best answer, depending on how many records you're dealing with, but you could force the SQL to execute and just deal with the results as in-memory objects. Calling the ToList() method executes the SQL and converts the results to an IEnumerable.
What I might suggest is to start by querying the vendor database first, ordering the results by some kind of criterion (perhaps a date field, oldest to most recent).
You could do a Skip().Take() to "skim" the results and then take each bulk set and insert them into the custom db where the ID doesn't already exist. That way you avoid the problem you have now.
If you have db-create access to the SQL Server that the vendor's db is running on (or if your custom db is on the same server), you could create a "has been imported" table in a different database on that same server, and then write a stored proc that does a cross-database join of that table against the vendor db, e.g.:
select top 250 * from vendordb.dbo.to_be_imported v
where not exists
  (select 1 from customdb.dbo.has_been_imported where idWasImported = v.idToBeImported)
order by whatever;
You might even be able to do this in Linq 2 SQL -- I've never tried adding objects from different databases into a single DataContext...