Table Renaming in an Explicit Transaction - sql-server-2008

I am extracting a subset of data from a backend system to load into a SQL table for querying by a number of local systems. I do not expect the dataset to ever be very large - no more than a few thousand records. The extract will run every two minutes on a SQL2008 server. The local systems are in use 24 x 7.
In my prototype, I extract the data into a staging table, then drop the live table and rename the staging table to become the live table in an explicit transaction.
SELECT fieldlist
INTO dbo.Temp_MyTable_Staging
FROM FOOBAR;
BEGIN TRANSACTION;
-- Drop the current live table (if present) and promote the staging table in its place.
IF OBJECT_ID('dbo.MyTable') IS NOT NULL
    DROP TABLE dbo.MyTable;
EXECUTE sp_rename N'dbo.Temp_MyTable_Staging', N'MyTable';
COMMIT TRANSACTION;
I have found lots of posts on the theory of transactions and locks, but none that explain what actually happens if a scheduled job tries to query the table in the few milliseconds while the drop/rename executes. Does the scheduled job just wait a few moments, or does it terminate?
Conversely, what happens if the rename starts while a scheduled job is selecting from the live table? Does the transaction fail to acquire a lock and therefore terminate?
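A minimal way to experiment with this, assuming two connections are available, is to hold the swap transaction open artificially and run the scheduled job's query from the other session (the WAITFOR is only there for the experiment):
-- Session 1: perform the swap but pause before committing so its locks can be observed.
BEGIN TRANSACTION;
IF OBJECT_ID('dbo.MyTable') IS NOT NULL
    DROP TABLE dbo.MyTable;
EXECUTE sp_rename N'dbo.Temp_MyTable_Staging', N'MyTable';
WAITFOR DELAY '00:00:10';   -- artificial pause, not part of the real job
COMMIT;
-- Session 2, run during the pause: under default settings this SELECT waits on the
-- schema locks held by session 1 and completes once it commits, rather than erroring out.
SELECT COUNT(*) FROM dbo.MyTable;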

Related

How can I prevent a stored procedure from running twice at the same time?

I'm using an Aurora DB (i.e. MySQL version 5.6.10) as a queue, and I'm using a stored procedure to pull records out of a table in batches. The sproc works with the following steps...
Select the next batch of data into a temptable
Write the IDs of the records from the temp table into a log table
Output the records
Once a record has been added to the log, the sproc won't select it again the next time it's called, so multiple servers can call this sproc and each deal with batches of data from the queue without stepping on each other's toes.
The sproc runs in a fraction of a second, but my company is now spinning up servers automatically, and these cloned servers are calling the sproc at exactly the same time, with the result that the same records are being selected twice.
Is there a way I can limit this sproc to one call at a time? Ideally, any additional calls should wait until the first call is finished, and then they can run.
Unfortunately, I have very little experience working with MySQL, so I'm not really sure where to start. I'd much appreciate it if anyone could point me in the right direction
This is a job for MySQL table locking. Try something like this. (You didn't show us your queries so there's a lot of guesswork here.)
SET autocommit = 0;
-- Every table the statements touch has to appear in the LOCK TABLES list; the
-- WRITE lock on logtable is what serializes concurrent callers.
LOCK TABLES logtable WRITE, whatevertable WRITE;
CREATE TEMPORARY TABLE temptable AS
SELECT whatever FROM whatevertable FOR UPDATE;
INSERT INTO logtable (id)
SELECT id FROM temptable;
COMMIT;
UNLOCK TABLES;
If more than one connection tries to run this sequence concurrently, one will wait for the other's UNLOCK TABLES before proceeding. You say your SP is quick, so probably nobody will notice the short wait.
Pro tip: When you have the same timed code running on lots of servers, it's best to put in a short random delay before running the job. That way the shared resources (like your MySQL database) won't get hammered by a whole lot of requests precisely timed to be simultaneous.
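For example, a minimal way to add that jitter from within MySQL itself (the 10-second upper bound is an arbitrary choice) would be:
-- Sleep a random 0-10 seconds so callers scheduled for the same instant spread out.
DO SLEEP(RAND() * 10);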

Copy some data from a database and keep referential integrity

The requirement is to extract some data from an active database (159 tables at the moment) into another database, such that the copied data has full referential integrity, whilst the data is in flux (it is a live database). This is not about dumping the entire database (approaching 50GB), just extracting some rows that we have identified from the whole database into a separate database.
We currently create a new DB based upon our initial schema and subsequent DDL migrations and repeatables (views, stored procedures, etc.), and then copy the appropriate rows. This normally takes more than 10 minutes, but less than 1 hour, depending upon the size of the set to be extracted.
Is there a way to tell MySQL that I want to ignore any transactions committed after I start running the extract, be they new rows added, rows deleted, or rows updated, while any other connection to the database just carries on working as normal, as if I weren't making any requests?
What I don't want to have happen is I copy data from table 1 and by the time I get to table 159, table 1 has changed and a row in table 159 refers to that new row in table 1.
Use mysqldump --single-transaction. This starts a repeatable-read transaction before it starts dumping data, so any concurrent transactions that happen while you are dumping data don't affect the data dumped by your transaction.
Re your updated question:
You can do your own custom queries in a transaction.
Start a transaction in repeatable-read mode before you begin running queries for your extraction. You can run many queries against many tables, and all the data you extract will be exactly what was currently committed as of the moment you started that transaction.
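A minimal sketch of that pattern, with placeholder table names and WHERE clauses standing in for the rows you have identified:
-- Open a consistent snapshot; every SELECT below sees the data exactly as it was
-- committed at this moment, while other connections carry on working normally.
SET SESSION TRANSACTION ISOLATION LEVEL REPEATABLE READ;
START TRANSACTION WITH CONSISTENT SNAPSHOT;
SELECT * FROM table_1 WHERE id IN (1, 2, 3);
-- ... repeat for the remaining tables, in any order ...
SELECT * FROM table_159 WHERE parent_id IN (1, 2, 3);
COMMIT;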
You might like to read https://dev.mysql.com/doc/refman/8.0/en/innodb-transaction-isolation-levels.html

How to achieve zero downtime in ETL

I have an ETL process which takes data from a transactional DB and, after processing, stores it in another DB. While storing the data we truncate the old data and insert the new data, for better performance, as an update takes much longer than a truncate-and-insert. So during this process we see counts of 0 or wrong data for a short time (2-3 minutes). We run the ETL every 8 hours.
So how can we avoid this problem? How can we achieve zero downtime?
One approach we used in the past was to prepare the production data in a table named temp. Then, once it was finished (and checked; that was the lengthy part of our process), drop prod and rename temp to prod.
This takes almost no time, and the process succeeded even when other users were locking the table.
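One way to sketch that swap, assuming MySQL (where a multi-table RENAME TABLE is atomic) and hypothetical table names, is to rename in a single statement rather than dropping first:
-- Build and verify the new data; this is the slow part and happens off to the side.
CREATE TABLE report_temp LIKE report_prod;
INSERT INTO report_temp SELECT * FROM staging_source;
-- Swap atomically, then clean up; readers never see an empty or missing table.
RENAME TABLE report_prod TO report_old, report_temp TO report_prod;
DROP TABLE report_old;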

Talend job truncates records when I kill the job and run it again

I am using the Talend Open Studio for Data Integration tool to transfer SQL Server table data to a MySQL database.
I have 40 million records in the table.
I created and ran the job, but after inserting approximately 20 million records, the connection failed.
When I tried to insert the data again, the Talend job first truncated the table and then started inserting from the beginning.
The question seems to be incomplete, but assuming that you want the table not to be truncated before each load, check the "Action on table" property. It should be set to "Default" or "Create Table if does not exist".
Now, if your question is about handling restartability of the job, where the job should resume from the 20 millionth row on the next run, there are multiple ways you could achieve this. In your case, since you are dealing with a high number of records, a mechanism like pagination would help: load the data in chunks (let's say 10,000 at a time) and loop, setting the commit interval to 10,000. After each successful load of 10,000 records into the database, make an entry in a log table with the timestamp or the incremental key in your data (to mark the checkpoint). Your job should look something like this:
tLoop--{read checkpoint from table}--tMSSqlInput--tMySqlOutput--{load new checkpoint in table}
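A rough sketch of the checkpoint queries that flow implies; the table and column names are hypothetical, and the literal 0 stands in for the Talend context variable that carries the checkpoint value:
-- Checkpoint table in the target MySQL database.
CREATE TABLE etl_checkpoint (
  job_name VARCHAR(64) PRIMARY KEY,
  last_id  BIGINT NOT NULL
);
-- Read where the previous run stopped (the {read checkpoint from table} step).
SELECT last_id FROM etl_checkpoint WHERE job_name = 'mssql_to_mysql';
-- Fetch one 10,000-row chunk past the checkpoint (tMSSqlInput, SQL Server syntax).
SELECT TOP (10000) *
FROM dbo.source_table
WHERE id > 0
ORDER BY id;
-- After tMySqlOutput commits the chunk, advance the checkpoint to the highest id loaded.
UPDATE etl_checkpoint SET last_id = 0 WHERE job_name = 'mssql_to_mysql';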
You can set a property in a context variable, say 'loadType', which will have the value 'initial' or 'incremental'.
Before truncating the table, add an 'if' link that checks the value of this variable: if it is 'initial', truncate the table; if it is 'incremental', run your subjob to load the data.

Interrupted bulk insert statement while on table lock

On a production server we've got a huge MyISAM table that collects visits, and once a month, say, we run a routine to move old data to an archive table and make the huge table lighter for backups, etc. The problem is that if the server crashes, or MySQL simply has to restart, while the routine is running, it may interrupt the insert into the archive table or the delete from the live table. As we cannot use transactions, because the archive table uses the ARCHIVE engine, is locking the tables a solution, or do we have to plan integrity checks manually?
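For context, a minimal sketch of the kind of routine described, with hypothetical table and column names; the fragile window is between the INSERT and the DELETE:
-- Copy old rows to the archive, then remove them from the live table. A crash
-- between (or during) the two statements leaves the rows duplicated or only
-- partially archived, since neither engine gives us a rollback here.
LOCK TABLES visits WRITE, visits_archive WRITE;
INSERT INTO visits_archive
SELECT * FROM visits WHERE visit_date < '2024-01-01';
DELETE FROM visits WHERE visit_date < '2024-01-01';
UNLOCK TABLES;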