SQL Server 2008: insert into table in batches

I have a linked server (Sybase) set up in SQL Server from which I need to draw data. The Sybase server sits on the other side of the world, and connectivity is pretty shoddy. I would like to insert data into one of the SQL Server tables in manageable batches (e.g. 1000 records at a time). I.e. I want to do:
INSERT INTO [SQLServerTable] ([field])
SELECT [field] from [LinkedServer].[DbName].[dbo].[SybaseTable]
but I want to fetch 1000 records at a time and insert them.
Thanks
Karl

I typically use Python with the pyodbc module to perform batches like this against a SQL Server. Take a look and see if it is an option; if so, I can provide you an example.
You will need to modify a lot of this code to fit your particular situation, but you should be able to follow the logic. You can comment out the cnxn.commit() line to roll back the transactions until you get everything working.
import pyodbc

# This is an MS SQL 2008 connection string
conn = 'DRIVER={SQL Server};SERVER=SERVERNAME;DATABASE=DBNAME;UID=USERNAME;PWD=PWD'
cnxn = pyodbc.connect(conn)
cursor = cnxn.cursor()
rowCount = cursor.execute('SELECT COUNT(*) FROM RemoteTable').fetchone()[0]
cnxn.close()

count = 0
lastID = 0
while count < rowCount:
    # Close the previous connection and open a new one for each batch.
    # Otherwise the connection stays open the entire time, defeating the
    # purpose of performing the transactions in batches.
    cnxn = pyodbc.connect(conn)
    cursor = cnxn.cursor()
    # The ORDER BY is needed so TOP 1000 returns a deterministic batch;
    # the parameter placeholder (?) avoids quoting problems.
    rows = cursor.execute(
        'SELECT TOP 1000 ID, Field1, Field2 FROM INC WHERE ID > ? ORDER BY ID',
        lastID).fetchall()
    for row in rows:
        cursor.execute(
            'INSERT INTO LOCALTABLE (FIELD1, FIELD2) VALUES (?, ?)',
            row.Field1, row.Field2)
    cnxn.commit()
    cnxn.close()
    # The [0] assumes the ID is the first field in the SELECT statement.
    lastID = rows[-1][0]
    count += len(rows)
    # Pause after each batch to see if the user wants to continue.
    raw_input("%s down, %s to go! Press enter to continue." % (count, rowCount - count))
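For completeness, the same keyed-batch pattern can be sketched in pure T-SQL on the SQL Server side. This is only a sketch: it assumes the Sybase table exposes a numeric key column (called ID here, which is an assumption) that is also copied into [SQLServerTable].

-- Minimal sketch: pull 1000 rows per iteration, keyed on an assumed ID column
DECLARE @lastID INT = 0;
DECLARE @batch INT = 1;

WHILE @batch > 0
BEGIN
    INSERT INTO [SQLServerTable] ([ID], [field])
    SELECT TOP (1000) [ID], [field]
    FROM [LinkedServer].[DbName].[dbo].[SybaseTable]
    WHERE [ID] > @lastID
    ORDER BY [ID];

    SET @batch = @@ROWCOUNT;

    IF @batch > 0
        SELECT @lastID = MAX([ID]) FROM [SQLServerTable];
END

Each iteration commits on its own (there is no explicit transaction), so a dropped connection costs at most the current batch.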

Related

How to compare two tables' row counts in an SSIS package; if the counts match then OK, if not, restart the package

I have made an SSIS package with a data flow for incremental data. The source and destination server IPs are different. [Control flow and data flow diagrams omitted.]
The package is working fine.
In the Execute SQL Task: it controls the log table and starts the incremental task.
The query which I used is:
insert into audit_log (
    Packagename,
    process_date,
    start_datetime,
    end_datetime,
    Record_processed,
    status
) values ('CRM-TO-TRANSORGDB', null, GETDATE(), null, null, null);

select MAX(ID) as ID, MAX(process_date) as proc_date
from audit_log
where Packagename = 'CRM-TO-TRANSORGDB';
The ID and proc_date are stored in variables.
In the Execute SQL Task 1: it just updates the log table.
UPDATE audit_log
SET process_date = ?,
    end_datetime = GETDATE(),
    status = 'SUCCESS',
    record_processed = ?
WHERE (packagename = 'CRM-TO-TRANSORGDB') AND ID = ?;
This is the query we used to update the log table.
In the data flow we simply fetch all the records and put them into the destination table.
That is all I have done.
But my questions are:
1) How do I compare the total row counts of the source and destination tables in the SSIS package?
2) If they don't match, how can the task be restarted automatically?
@thomas, as per your instructions I have done the following:
1) I have made Execute SQL Tasks for the source and the destination.
2) I added an Execute Package Task with a condition for when the counts do not match, using the expression row_count_src != row_count_dest.
In Source_table_count I used the query below:
select count(SubOrderID) as row_count_src from fact_suborder_journey
WHERE Suborderdate between '2016-06-01' and GETDATE()-1;
In dest_table_count I used the query below:
select count(SubOrderID) as row_count_dest from fact_suborder_journey
WHERE Suborderdate between '2016-06-01' and GETDATE()-1;
I added the two variables as Int64 in the SSIS package and mapped them in the result sets. [Screenshot omitted.]
But after doing all this I am getting this error:
[Execute SQL Task] Error: An error occurred while assigning a value to variable "row_count_src": "The type of the value being assigned to variable "User::row_count_src" differs from the current variable type. Variables may not change type during execution. Variable types are strict, except for variables of type Object."
I haven't tested this completely, but you might be able to do something like this. It creates a loop of your package and will execute for as long as your count variables differ from each other.
What have I done?
First I have a Data Flow Task which moves data from source to destination.
Then I have an Execute SQL Task which counts all rows from TableA (the source table) and maps the result to variable count1.
Then I have an Execute SQL Task which counts all rows from TableB (the destination table) and maps the result to variable count2.
Then I create an Execute Package Task which references the package itself, and I put a precedence constraint on it with the expression count1 != count2.
Because if they are different, you want to restart the task. If they are equal, the final Execute Package Task will never be executed.
Hope that is something like what you meant?
If I understand your challenge correctly...
In the data flow task, use a Row Count transformation between source and destination to capture the number of rows written to the destination. This will be stored in a variable.
In the control flow, get the max row count available from the log table and store that in a second variable.
Create an Execute Package Task that executes this same package, and put a precedence constraint before it that checks whether the variable from step 1 <> the variable from step 2.
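A sketch of that control-flow query, reusing the audit_log table from the question (the column names are taken from the question; adjust them to the actual log schema):

-- Latest recorded row count for this package, to compare against the
-- variable filled by the Row Count transformation
SELECT TOP 1 Record_processed
FROM audit_log
WHERE Packagename = 'CRM-TO-TRANSORGDB'
ORDER BY ID DESC;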

Wordnet MySQL statement doesn't complete

I'm using the Wordnet SQL database from here: http://wnsqlbuilder.sourceforge.net
It's all built fine and users with appropriate privileges have been set.
I'm trying to find synonyms of words and have tried to use the two example statements at the bottom of this page: http://wnsqlbuilder.sourceforge.net/sql-links.html
SELECT synsetid, dest.lemma, SUBSTRING(src.definition FROM 1 FOR 60)
FROM wordsXsensesXsynsets AS src
INNER JOIN wordsXsensesXsynsets AS dest USING (synsetid)
WHERE src.lemma = 'option' AND dest.lemma <> 'option'

SELECT synsetid, lemma, SUBSTRING(definition FROM 1 FOR 60)
FROM wordsXsensesXsynsets
WHERE synsetid IN (
    SELECT synsetid FROM wordsXsensesXsynsets WHERE lemma = 'option'
) AND lemma <> 'option'
ORDER BY synsetid
However, they never complete, at least not in any reasonable amount of time, and I have had to cancel all of the queries. All other queries seem to work fine, and when I break up the second SQL example, I can get the individual parts to work and complete in reasonable times (about 0.40 seconds).
When I try to run the full statement, however, the MySQL command line client just hangs.
Is there a problem with this syntax? What is causing it to take so long?
EDIT:
[Output of "EXPLAIN SELECT ..." and of "EXPLAIN EXTENDED ...; SHOW WARNINGS;" omitted.]
I did more digging into the various statements used and found the problem was in the IN clause.
MySQL re-evaluates the subquery for every single row in the table. This is the cause of the hang, as it had to run through hundreds of thousands of records.
My remedy was to split the command into two separate database calls: first getting the synsets, and then dynamically creating a bound SQL string to look for the words in those synsets.
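If a single statement is preferred, a hedged alternative (assuming the usual culprit here, namely that older MySQL versions re-run an IN subquery per outer row) is to force the subquery to materialize once as a derived table:

SELECT w.synsetid, w.lemma, SUBSTRING(w.definition FROM 1 FOR 60)
FROM wordsXsensesXsynsets AS w
INNER JOIN (
    -- the derived table is materialized once, not re-evaluated per row
    SELECT DISTINCT synsetid FROM wordsXsensesXsynsets WHERE lemma = 'option'
) AS s ON s.synsetid = w.synsetid
WHERE w.lemma <> 'option'
ORDER BY w.synsetid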

Generate 10 queries to run in SSIS

I have a driver table, date_driver_table, that contains 10 dates: Jan 2014, Feb 2014, ..., Nov 2014.
I need to run a query
select * from records_Jan2014 where recdate='Jan 2014'
This is query 1. After it runs and puts the result set in a SQL Server table, query 2,
select * from records_Feb2014 where recdate='Feb 2014'
will then run and do the same insert into the SQL Server table, and then query 3, and so forth until no dates are left in the driver table.
So in SSIS I have an Execute SQL Task with "full result set" enabled that puts all the dates from the date driver table into a variable called date (type Object), which then feeds into a Foreach Loop with a variable called single_date (type String). A data flow has a source and a destination of a SQL Server table. The problem is how to set up the source to run query 1, put the results in the table, then run query 2, and so on.
I was thinking maybe of creating 10 files with the SQL and then using the OLE DB source with a file as the SQL to run, but surely there is a way to do this with the Foreach Loop. Can anyone point me to how to do this? The question is how to set up the Foreach Loop so it runs query 1, puts the results into the table, then runs query 2 and puts those into the table, and so on until all the records are done.
I used a SQL command from a variable on the ADO.NET source, set via an expression. The variable was fed from an Execute SQL Task which gave the list to process.
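As a sketch, the per-iteration SQL variable can be built with an SSIS expression along these lines (User::single_date is the loop variable described above; deriving the table-name suffix by stripping the space is an assumption about the naming):

"select * from records_" + REPLACE(@[User::single_date], " ", "") + " where recdate = '" + @[User::single_date] + "'"

On the first iteration this evaluates to query 1 as shown above.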

Substitute for INSERT query for more than 200,000 records in a MySQL DB table

I have to insert more than 200,000 records in one go into a MySQL DB table. The INSERT query is resulting in a performance issue; what could be the substitute for this?
Below is the code I am using
$xml = simplexml_load_file("247electrical.xml");

foreach ($xml->merchant as $merchant) {
    define('API', 'PS');
    require_once('constants.inc.php');
    require_once('classes/class.ClientFactory.php');
    $oClient = ClientFactory::getClient(API_USERNAME, API_PASSWORD, API_USER_TYPE);
    $merchattrs = $merchant->attributes();
    $aParams100 = array('iMerchantId' => array($merchattrs->id));
    $merchantinfo = $oClient->call('getMerchant', $aParams100);

    // Get products
    foreach ($xml->merchant->prod as $product) {
        $attrs = $product->attributes();
        // Insert products into DB
        mysql_query('INSERT INTO productstemp (merchant_id, merchant_name, aw_product_id, merchant_product_id, product_name, description, category_id, merchant_category, aw_deep_link, aw_image_url, search_price, delivery_cost, merchant_image_url, aw_thumb_url, brand_name, delivery_time, display_price, in_stock, merchant_thumb_url, model_number, pre_order, stock_quantity, store_price, valid_from, valid_to, web_offer, merchantimage, cleancompany) VALUES("'.$merchattrs->id.'","'.$merchattrs->name.'","'.$attrs->id.'"," ","'.$product->text->name.'","'.$product->text->desc.'","'.$product->cat->awCatId.'","'.$product->cat->mCat.'","'.$product->uri->awTrack.'","'.$product->uri->awImage.'","'.$product->price->buynow.'","'.$product->price->delivery.'","'.$product->uri->mImage.'","'.$product->uri->awThumb.'","'.$product->brand->brandName.'","'.$product->delTime.'","'.$product->price->buynow.'","'.$attrs->in_stock.'","'.$product->uri->mThumb.'","'.$product->modelNumber.'","'.$attrs->pre_order.'","'.$attrs->stock_quantity.'","'.$product->price->store.'","'.$product->valFrom.'","'.$product->valTo.'","'.$attrs->web_offer.'","'.$merchantinfo->oMerchant->sLogoUrl.'","247electrical" ) ')
            or die(mysql_error());
    }
}
Thanks
I don't think that the INSERT queries per se are the problem. 200,000 inserts aren't that much for MySQL, after all.
First, I guess reading the file is slow. SimpleXML is convenient, but for large files it results in a huge memory overhead. Think about a streaming XML reader like PHP's XMLReader.
Second, you are sending individual statements to the MySQL server, which is way slower than sending one huge statement. Also, your inserts should be wrapped in a transaction. What happens if you have processed and inserted 10,000 records and then your script dies, or the MySQL server dies, etc.? How do you safely start the script again without manual work (clearing the table, looking up which records were already processed, etc.)?
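A minimal sketch of the transaction point, assuming an InnoDB table (MyISAM tables ignore transactions); the two rows are placeholders:

START TRANSACTION;
INSERT INTO productstemp (merchant_id, merchant_name) VALUES ('1', 'A');
INSERT INTO productstemp (merchant_id, merchant_name) VALUES ('2', 'B');
-- ... remaining rows ...
COMMIT; -- if the script dies before this point, nothing is kept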
Apart from that, one single INSERT statement with many VALUES should be way faster. I would make your PHP script output the query so it looks in the end like this:
INSERT INTO table(field_1, field_2, field_3)
VALUES ('foo 1', 'bar 1', 'baz 1'),
       ('foo 2', 'bar 2', 'baz 2'),
       ...
And then import that file via:
$ mysql ... credentials options etc ... < output.sql
If that's still too slow… buying more hardware might help, too.

Is there any way to create multiple insert statements in a ms-access query?

I am using MS Access 2003. I want to run a lot of INSERT SQL statements in what is called a 'Query' in MS Access. Is there any easy (or indeed any) way to do it?
yes and no.
You can't do:
insert into foo (c1, c2, c3)
values ("v1a", "v2a", "v3a"),
("v1b", "v2b", "v3b"),
("v1c", "v2c", "v3c")
but you can do
insert into foo (c1, c2, c3)
select v1, v2, v3 from bar
What does that get you if you don't already have the data in a table? Well, you could craft a SELECT statement composed of a lot of unions of SELECTs with hard-coded results.
INSERT INTO foo (f1, f2, f3)
SELECT *
FROM (select top 1 "b1a" AS f1, "b2a" AS f2, "b3a" AS f3 from onerow
union all
select top 1 "b1b" AS f1, "b2b" AS f2, "b3b" AS f3 from onerow
union all
select top 1 "b1c" AS f1, "b2c" AS f2, "b3c" AS f3 from onerow)
Note: I also have to include some form of dummy table (e.g., onerow) to fool Access into allowing the union (it must have at least one row in it), and you need the "top 1" to ensure you don't get repeats for a table with more than one row. The dummy table can be set up once, as shown below.
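A one-time setup for that dummy table might look like this (the name onerow is from the answer; the column name is just an illustration):

CREATE TABLE onerow (dummy INTEGER);
INSERT INTO onerow (dummy) VALUES (1);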
But then again, it would probably be easier just to do three separate insert statements, especially if you are already building things up in a loop (unless, of course, the cost of doing the inserts is greater than the cost of your time to code it).
Personally, I'd create a VBA subroutine to do it, and connect to the database using some form of sql connection.
Off the top of my head, the code to do it should look something like:
Sub InsertLots()
    ' Assumes a reference to Microsoft ActiveX Data Objects (ADO)
    Dim SqlConn As New ADODB.Connection
    SqlConn.Open "your connection string"
    SqlConn.Execute "INSERT INTO <tablename> (column1, column2) VALUES (1, 2)"
    SqlConn.Execute "INSERT INTO <tablename> (column1, column2) VALUES (2, 3)"
    SqlConn.Close
End Sub
I think it's inadvisable to propose a particular data interface, as Jonathan does, when you haven't clarified the context in which the code is going to run.
If the data store is a Jet database, it makes little sense to use any form of ADO unless you're running your code from a scripting platform where it's the preferred choice. If you're in Access, this is definitely not the case, and DAO is the preferred interface.
MS Access does not allow multiple inserts from the same SQL window. If you want to insert, say, 10 rows into a table, say movie (mid, mname, mdirector, ...), you would need to:
open the SQL window,
type the 1st statement, execute the 1st statement, delete the 1st statement,
type the 2nd statement, execute the 2nd statement, delete the 2nd statement,
type the 3rd statement, execute the 3rd statement, delete the 3rd statement, ......
Very boring.
Instead you could import the lines from Excel by doing the following:
Right-click on the table name that you have already created
Import from Excel (the Import dialog box opens)
Browse to the Excel file containing the records to be imported into the table
Click on "Append a copy of the records to the table:"
Select the required table (in this example, movie)
Click on "OK"
Select the worksheet that contains the data in the spreadsheet
Click on Finish
The whole dataset in the Excel file has now been loaded into the table "MOVIE"
No - a query in Access is a single SQL statement. There is no way of creating a batch of several statements within one query object.
You could create multiple query objects and run them from a macro/module.
@Rik Garner: Not sure what you mean by 'batch', but the
INSERT INTO foo (f1, f2, f3)
SELECT *
FROM (select top 1 "b1a" AS f1, "b2a" AS f2, "b3a" AS f3 from onerow
union all
select top 1 "b1b" AS f1, "b2b" AS f2, "b3b" AS f3 from onerow
union all
select top 1 "b1c" AS f1, "b2c" AS f2, "b3c" AS f3 from onerow)
construct, although being a single SQL statement, will actually insert each row one at a time (rather than all at once), but in the same transaction: you can test this by adding a relevant constraint, e.g.
ALTER TABLE foo ADD
CONSTRAINT max_two_foo_rows
CHECK (2 >= (SELECT COUNT(*) FROM foo AS T2));
Assuming the table is empty, the above INSERT INTO..SELECT.. should work: the fact it doesn't is because the constraint was checked after the first row was inserted, rather than after all three were inserted (a violation of ANSI SQL-92, but that's MS Access for you); the fact the table remains empty shows that the internal transaction was rolled back.
@David W. Fenton: you may have a strong personal preference for DAO, but please do not be too hard on someone for choosing an alternative data access technology (in this case ADO), especially for a vanilla INSERT, and when they qualify their comments with "Off the top of my head, the code to do it should look something like…" After all, you can't use DAO to create a CHECK constraint :)
MS Access can also append data into a table from a simple text file. CSV the values (I simply used the Replace All box to delete all but the commas) and, under External Data, select Text File.
From this:
INSERT INTO CLASS VALUES('10012','ACCT-211','1','MWF 8:00-8:50 a.m.','BUS311','105');
INSERT INTO CLASS VALUES('10013','ACCT-211','2','MWF 9:00-9:50 a.m.','BUS200','105');
INSERT INTO CLASS VALUES('10014','ACCT-211','3','TTh 2:30-3:45 p.m.','BUS252','342');
To this:
10012,ACCT-211,1,MWF 8:00-8:50 a.m.,BUS311,105
10013,ACCT-211,2,MWF 9:00-9:50 a.m.,BUS200,105
10014,ACCT-211,3,TTh 2:30-3:45 p.m.,BUS252,342
Based on the VBA workaround from @Jonathan, here is a version for execution in the current Access database:
Public Sub InsertMinimalData()
CurrentDb.Execute "INSERT INTO FinancialYear (FinancialYearID) VALUES ('FY2019/2020');"
CurrentDb.Execute "INSERT INTO FinancialYear (FinancialYearID) VALUES ('FY2020/2021');"
End Sub